Sampling enables MCP tools to request LLM completions during their execution. This lets servers dynamically generate content, make decisions, or process data with AI models in the middle of a tool call.
What is Sampling? Sampling is when an MCP server calls back to the client to request an LLM completion. This lets tools leverage AI capabilities without the server integrating directly with an LLM provider.

Understanding Sampling

Sampling creates a callback mechanism where:
  1. Client calls a tool on the server
  2. Server requests an LLM completion from the client
  3. Client executes the LLM call and returns results
  4. Server continues execution with the LLM response
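On the wire, steps 2 and 3 are a sampling/createMessage JSON-RPC exchange between the server and the client. Here is a sketch of the two messages (the shapes follow the MCP specification; the values, including the model name, are illustrative):
// Step 2: server -> client request
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'sampling/createMessage',
  params: {
    messages: [
      { role: 'user', content: { type: 'text', text: 'Summarize the report.' } }
    ],
    maxTokens: 100
  }
}

// Step 3: client -> server result (after the client runs the LLM call)
const response = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    role: 'assistant',
    content: { type: 'text', text: 'The report shows...' },
    model: 'claude-sonnet-4-5',
    stopReason: 'endTurn'
  }
}
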
This pattern is useful for:
  • Content generation within tools
  • Dynamic decision making
  • Data transformation and analysis
  • Interactive workflows

Configuration

To enable sampling, provide a samplingCallback function when initializing the MCPClient:
import { MCPClient } from 'mcp-use'
import type { CreateMessageRequestParams, CreateMessageResult } from 'mcp-use'

async function samplingCallback(
  params: CreateMessageRequestParams
): Promise<CreateMessageResult> {
  // Integrate with your LLM of choice (OpenAI, Anthropic, etc.)
  // Extract the last message content
  const lastMessage = params.messages[params.messages.length - 1]
  const content = Array.isArray(lastMessage.content) 
    ? lastMessage.content[0] 
    : lastMessage.content
  
  // Call your LLM here ('yourLlm' is a placeholder for your provider's client)
  const response = await yourLlm.complete(content.text)

  return {
    role: 'assistant',
    content: { type: 'text', text: response },
    model: 'your-model-name',
    stopReason: 'endTurn'
  }
}

const client = new MCPClient(config, {
  samplingCallback
})
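
As a concrete example, here is a minimal sketch of the callback backed by the Anthropic SDK (the model name is illustrative, and the sketch assumes every content block is text):
import Anthropic from '@anthropic-ai/sdk'
import type { CreateMessageRequestParams, CreateMessageResult } from 'mcp-use'

// The Anthropic client reads ANTHROPIC_API_KEY from the environment by default
const anthropic = new Anthropic()

async function samplingCallback(
  params: CreateMessageRequestParams
): Promise<CreateMessageResult> {
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-5', // illustrative model name
    max_tokens: params.maxTokens ?? 1024,
    system: params.systemPrompt,
    // Forward each MCP message, assuming text content blocks
    messages: params.messages.map(m => {
      const block = Array.isArray(m.content) ? m.content[0] : m.content
      return {
        role: m.role,
        content: block.type === 'text' ? block.text : ''
      }
    })
  })

  const first = response.content[0]
  return {
    role: 'assistant',
    content: { type: 'text', text: first.type === 'text' ? first.text : '' },
    model: response.model,
    stopReason: response.stop_reason === 'max_tokens' ? 'maxTokens' : 'endTurn'
  }
}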

Creating Sampling-Enabled Tools

When building MCP servers, tools can request sampling using the context parameter:
import { MCPServer } from 'mcp-use/server'
import type { ToolContext } from 'mcp-use/server'

const server = new MCPServer({
  name: 'MyServer',
  version: '1.0.0'
})

server.tool({
  name: 'analyze-sentiment',
  description: 'Analyze the sentiment of text using the client\'s LLM',
  inputs: [
    { name: 'text', type: 'string', required: true }
  ],
  cb: async (params, ctx?: ToolContext) => {
    if (!ctx) {
      throw new Error('Sampling not available - client does not support sampling')
    }

    // Build the prompt for the client's LLM
    const prompt = `Analyze the sentiment of the following text as positive, negative, or neutral.
Just output a single word - 'positive', 'negative', or 'neutral'.

Text to analyze: ${params.text}`

    // Request LLM analysis through sampling
    // By default, this waits indefinitely and sends progress notifications every 5s
    const result = await ctx.sample({
      messages: [{
        role: 'user',
        content: { type: 'text', text: prompt }
      }],
      modelPreferences: {
        intelligencePriority: 0.8,
        speedPriority: 0.5
      }
    })

    // Extract text from result
    const content = Array.isArray(result.content) 
      ? result.content[0] 
      : result.content
    
    return {
      content: [{
        type: 'text',
        text: content.type === 'text' ? content.text : 'Unable to analyze sentiment'
      }]
    }
  }
})
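
The modelPreferences block is an advisory hint: the priority fields are 0-1 values the client may weigh when choosing a model, and hints can suggest model-name substrings. Per the MCP specification, the client is free to ignore all of them:
const result = await ctx.sample({
  messages: [{
    role: 'user',
    content: { type: 'text', text: 'Classify this text...' }
  }],
  modelPreferences: {
    hints: [{ name: 'claude' }],  // model-name substring hints, evaluated in order
    intelligencePriority: 0.8,    // favour capability
    speedPriority: 0.5,           // over latency
    costPriority: 0.2             // and, least of all, cost
  }
})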

Automatic Progress Reporting

The ctx.sample() function automatically handles progress reporting to prevent client-side timeouts during long-running operations.
  • Automatic Progress: By default, it sends a progress notification every 5 seconds.
  • Timeout Handling: If the client has resetTimeoutOnProgress: true enabled, these notifications keep the connection alive indefinitely.
  • Configuration: You can customize the timeout and progress interval:
const result = await ctx.sample(
  {
    messages: [{ role: 'user', content: { type: 'text', text: prompt } }]
  },
  {
    // Optional: Limit wait time (default: Infinity)
    timeout: 120000, // 2 minutes
    
    // Optional: Change progress interval (default: 5000ms)
    progressIntervalMs: 2000,
    
    // Optional: Custom progress handling
    onProgress: ({ progress, message }) => console.log(message)
  }
)

Using createMessage Directly

You can also use server.createMessage() directly if you need more control:
const server = new MCPServer({
  name: 'MyServer',
  version: '1.0.0'
})

server.tool({
  name: 'custom-llm-task',
  description: 'Perform a custom LLM task',
  inputs: [{ name: 'query', type: 'string', required: true }],
  cb: async (params) => {
    // Use server.createMessage() directly
    const result = await server.createMessage({
      messages: [{
        role: 'user',
        content: { type: 'text', text: params.query }
      }],
      systemPrompt: 'You are a helpful assistant.',
      maxTokens: 100
    })

    return {
      content: [{
        type: 'text',
        text: result.content.type === 'text' ? result.content.text : ''
      }]
    }
  }
})

Error Handling

If no sampling callback is provided but a tool requests sampling:
// The server will receive an error from the client
// indicating that sampling is not supported

server.tool({
  name: 'needs-sampling',
  description: 'A tool that requires client-side sampling',
  cb: async (params, ctx) => {
    try {
      if (!ctx) throw new Error('tool context unavailable')
      const result = await ctx.sample({
        // Illustrative message - build your prompt here
        messages: [{ role: 'user', content: { type: 'text', text: 'Hello' } }]
      })
      const content = Array.isArray(result.content) ? result.content[0] : result.content
      return { content: [{ type: 'text', text: content.type === 'text' ? content.text : '' }] }
    } catch (error) {
      return {
        content: [{
          type: 'text',
          text: `Error: client does not support sampling - ${error instanceof Error ? error.message : String(error)}`
        }]
      }
    }
  }
})
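
Rather than surfacing the error, a tool can also degrade gracefully by falling back to a non-LLM heuristic when sampling fails. A sketch (the tool name and keyword lists are illustrative):
server.tool({
  name: 'analyze-sentiment-safe',
  description: 'Sentiment analysis with a non-LLM fallback',
  inputs: [{ name: 'text', type: 'string', required: true }],
  cb: async (params, ctx) => {
    try {
      if (!ctx) throw new Error('tool context unavailable')
      const result = await ctx.sample({
        messages: [{
          role: 'user',
          content: { type: 'text', text: `Classify the sentiment of: ${params.text}` }
        }]
      })
      const content = Array.isArray(result.content) ? result.content[0] : result.content
      if (content.type === 'text') {
        return { content: [{ type: 'text', text: content.text }] }
      }
      throw new Error('non-text sampling result')
    } catch {
      // Fallback: crude keyword heuristic when sampling is unavailable
      const positive = /\b(good|great|love|excellent)\b/i.test(params.text)
      const negative = /\b(bad|terrible|hate|awful)\b/i.test(params.text)
      const label = positive === negative ? 'neutral' : positive ? 'positive' : 'negative'
      return { content: [{ type: 'text', text: label }] }
    }
  }
})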