Sampling enables MCP tools to request LLM completions during their execution. This powerful feature allows servers to dynamically generate content, make decisions, or process data using AI models while executing tool calls.
What is Sampling? Sampling is when an MCP server calls back to the client to request an LLM completion. This lets tools leverage AI capabilities without integrating directly with LLM providers.
## Understanding Sampling
Sampling creates a callback mechanism where:
1. The client calls a tool on the server
2. The server requests an LLM completion from the client
3. The client executes the LLM call and returns the result
4. The server continues execution with the LLM response
This pattern is useful for:
- Content generation within tools
- Dynamic decision making
- Data transformation and analysis
- Interactive workflows
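The round trip above can be sketched as a plain function handoff, independent of any MCP wiring. Everything here is hypothetical illustration (`Sampler`, `clientSampler`, `summarizeTool` are made-up names) and the LLM is stubbed so the sketch runs offline:

```typescript
// The client owns the LLM and hands the server a sampling function.
type Sampler = (prompt: string) => Promise<string>

// "Client side": a stub standing in for a real LLM call.
const clientSampler: Sampler = async prompt => `LLM says: ${prompt.toUpperCase()}`

// "Server side": a tool that needs a completion mid-execution. It never
// talks to an LLM provider directly; it only calls the sampler it was given.
async function summarizeTool(text: string, sample: Sampler): Promise<string> {
  const completion = await sample(`Summarize: ${text}`) // steps 2-3: callback to client
  return `tool result -> ${completion}`                 // step 4: continue with response
}
```

The key design point is inversion of control: the server declares *what* it needs sampled, while the client decides *which* model runs it and under what policy.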
## Configuration

To enable sampling, provide a `samplingCallback` function when initializing the `MCPClient`:
```typescript
import { MCPClient } from 'mcp-use'
import type { CreateMessageRequestParams, CreateMessageResult } from 'mcp-use'

async function samplingCallback(
  params: CreateMessageRequestParams
): Promise<CreateMessageResult> {
  // Integrate with your LLM of choice (OpenAI, Anthropic, etc.)
  // Extract the last message content
  const lastMessage = params.messages[params.messages.length - 1]
  const content = Array.isArray(lastMessage.content)
    ? lastMessage.content[0]
    : lastMessage.content

  // Call your LLM
  const response = await yourLlm.complete(content.text)

  return {
    role: 'assistant',
    content: { type: 'text', text: response },
    model: 'your-model-name',
    stopReason: 'endTurn'
  }
}

const client = new MCPClient(config, {
  samplingCallback
})
```
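Since `yourLlm` above is a placeholder, here is a self-contained variant with a stubbed model. The interfaces are local structural stand-ins for mcp-use's `CreateMessageRequestParams` and `CreateMessageResult` (an assumption based on the fields used above, not the library's actual definitions):

```typescript
interface TextContent { type: 'text'; text: string }
interface SamplingMessage { role: 'user' | 'assistant'; content: TextContent | TextContent[] }
interface SamplingParams { messages: SamplingMessage[] }
interface SamplingResult { role: 'assistant'; content: TextContent; model: string; stopReason: string }

// Stub "LLM" so the callback can be exercised without network access
async function stubLlm(prompt: string): Promise<string> {
  return `echo: ${prompt}`
}

async function samplingCallback(params: SamplingParams): Promise<SamplingResult> {
  // Take the last message's text, whether content is one block or an array
  const last = params.messages[params.messages.length - 1]
  const content = Array.isArray(last.content) ? last.content[0] : last.content

  const text = await stubLlm(content.text)

  return {
    role: 'assistant',
    content: { type: 'text', text },
    model: 'stub-model',
    stopReason: 'endTurn'
  }
}
```

Swapping `stubLlm` for a real provider call is the only change needed to make this production-shaped.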
When building MCP servers, tools can request sampling using the `context` parameter:
```typescript
import { MCPServer } from 'mcp-use/server'
import type { ToolContext } from 'mcp-use/server'

const server = new MCPServer({
  name: 'MyServer',
  version: '1.0.0'
})

server.tool({
  name: 'analyze-sentiment',
  description: 'Analyze the sentiment of text using the client\'s LLM',
  inputs: [
    { name: 'text', type: 'string', required: true }
  ],
  cb: async (params, ctx?: ToolContext) => {
    if (!ctx) {
      throw new Error('Sampling not available - client does not support sampling')
    }

    const prompt = `Analyze the sentiment of the following text as positive, negative, or neutral.
Just output a single word - 'positive', 'negative', or 'neutral'.

Text to analyze: ${params.text}`

    // Request LLM analysis through sampling.
    // By default, this waits indefinitely and sends progress notifications every 5s.
    const result = await ctx.sample({
      messages: [{
        role: 'user',
        content: { type: 'text', text: prompt }
      }],
      modelPreferences: {
        intelligencePriority: 0.8,
        speedPriority: 0.5
      }
    })

    // Extract text from the result (content may be a single block or an array)
    const content = Array.isArray(result.content)
      ? result.content[0]
      : result.content

    return {
      content: [{
        type: 'text',
        text: content.text || 'Unable to analyze sentiment'
      }]
    }
  }
})
```
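Even with the prompt above asking for a single word, models sometimes add casing or punctuation. A small post-processing helper (hypothetical, not part of mcp-use) makes the tool's output predictable before returning it:

```typescript
// Normalize a model's free-form reply to one of the three expected labels.
// Anything that is not clearly 'positive' or 'negative' falls back to 'neutral'.
function normalizeSentiment(raw: string): 'positive' | 'negative' | 'neutral' {
  const cleaned = raw.trim().toLowerCase().replace(/[^a-z]/g, '')
  if (cleaned === 'positive' || cleaned === 'negative') return cleaned
  return 'neutral' // safe default when the model strays from the requested format
}
```

Defensive parsing like this matters with sampling because the server has no control over which model the client runs.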
## Automatic Progress Reporting

The `ctx.sample()` function automatically handles progress reporting to prevent client-side timeouts during long-running operations.
- Automatic Progress: By default, it sends a progress notification every 5 seconds.
- Timeout Handling: If the client has `resetTimeoutOnProgress: true` enabled, these notifications keep the connection alive indefinitely.
- Configuration: You can customize the timeout and progress interval:
```typescript
const result = await ctx.sample(
  {
    messages: [{ role: 'user', content: { type: 'text', text: prompt } }]
  },
  {
    // Optional: Limit wait time (default: Infinity)
    timeout: 120000, // 2 minutes

    // Optional: Change progress interval (default: 5000ms)
    progressIntervalMs: 2000,

    // Optional: Custom progress handling
    onProgress: ({ progress, message }) => console.log(message)
  }
)
```
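The mechanics behind these options can be illustrated with a generic helper that emits periodic progress while a promise is pending and optionally gives up after a deadline. This is a sketch of the described behavior, not mcp-use's actual implementation; `withProgress` is a made-up name:

```typescript
interface ProgressOptions {
  timeout?: number
  progressIntervalMs?: number
  onProgress?: (info: { progress: number; message: string }) => void
}

// Wrap a pending promise: tick onProgress every interval, and reject
// if `timeout` milliseconds pass before the work settles.
function withProgress<T>(work: Promise<T>, opts: ProgressOptions = {}): Promise<T> {
  const { timeout = Infinity, progressIntervalMs = 5000, onProgress } = opts

  let ticks = 0
  const interval = setInterval(() => {
    ticks += 1
    onProgress?.({ progress: ticks, message: `still waiting (${ticks * progressIntervalMs}ms)` })
  }, progressIntervalMs)

  const timers: Array<ReturnType<typeof setTimeout>> = []
  const guarded = timeout === Infinity
    ? work
    : Promise.race([
        work,
        new Promise<T>((_, reject) => {
          timers.push(setTimeout(() => reject(new Error('sampling timed out')), timeout))
        })
      ])

  // Always stop the timers, whether the work resolved, rejected, or timed out
  return guarded.finally(() => {
    clearInterval(interval)
    timers.forEach(clearTimeout)
  })
}
```

The same pattern explains why `resetTimeoutOnProgress: true` matters on the client: each tick is a signal that the server is alive, so the client can keep extending its own deadline.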
## Using createMessage Directly

You can also use `server.createMessage()` directly if you need more control:
```typescript
const server = new MCPServer({
  name: 'MyServer',
  version: '1.0.0'
})

server.tool({
  name: 'custom-llm-task',
  description: 'Perform a custom LLM task',
  inputs: [{ name: 'query', type: 'string', required: true }],
  cb: async (params) => {
    // Use server.createMessage() directly
    const result = await server.createMessage({
      messages: [{
        role: 'user',
        content: { type: 'text', text: params.query }
      }],
      systemPrompt: 'You are a helpful assistant.',
      maxTokens: 100
    })

    return {
      content: [{
        type: 'text',
        text: result.content.text || ''
      }]
    }
  }
})
```
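The example above reads `result.content.text` directly, while the earlier sentiment tool guards against `content` arriving as an array of blocks. If you want one code path that tolerates both shapes, a tiny hypothetical helper (not part of mcp-use) covers it:

```typescript
interface TextBlock { type: 'text'; text: string }

// Return the text of the first block, whether content is a single
// block or an array of blocks; empty string if nothing is there.
function firstText(content: TextBlock | TextBlock[]): string {
  const block = Array.isArray(content) ? content[0] : content
  return block?.text ?? ''
}
```

Using `firstText(result.content)` in both tools keeps the extraction logic in one place.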
## Error Handling

If no sampling callback is provided but a tool requests sampling:
```typescript
// The server will receive an error from the client
// indicating that sampling is not supported
server.tool({
  name: 'needs-sampling',
  cb: async (params, ctx) => {
    try {
      const result = await ctx?.sample({ messages: [...] })
      if (!result) {
        // ctx was undefined, so no sampling request was made at all
        throw new Error('sampling context unavailable')
      }
      return { content: [{ type: 'text', text: result.content.text }] }
    } catch (error) {
      return {
        content: [{
          type: 'text',
          text: `Error: Client does not support sampling - ${error.message}`
        }]
      }
    }
  }
})
```