Sampling allows MCP server tools to request LLM completions from the client during execution. This enables tools to leverage the client’s LLM capabilities for tasks like analysis, decision-making, or text generation.

Overview

When a tool needs an LLM response, it can use ctx.sample() to send a request to the client. The client’s configured LLM processes the request and returns the result to the tool. This creates powerful workflows where tools can “think” using the client’s AI capabilities.

Key Benefits:
  • Leverage client’s LLM: Use whatever model the client has configured
  • No server-side LLM: No need to integrate an LLM provider in your server

How It Works

Basic Usage

Use the sample() method on the context object within tool callbacks:
import { MCPServer, text } from 'mcp-use/server';
import { z } from 'zod';

const server = new MCPServer({
  name: 'my-server',
  version: '1.0.0',
});

server.tool({
  name: 'analyze-sentiment',
  description: 'Analyze sentiment of text using the client\'s LLM',
  schema: z.object({
    text: z.string(),
  })
}, async (params, ctx) => {
  // Simple string API - just pass a prompt!
  const response = await ctx.sample(
    `Analyze the sentiment of the following text as positive, negative, or neutral. 
     Just output a single word. Text: ${params.text}`
  );

  return text(`Sentiment: ${response.content[0].text}`);
});

await server.listen();

API Styles

The ctx.sample() method supports two API styles to suit different needs:

1. Simplified

Perfect for simple cases where you just need a completion:
server.tool({
  name: 'simple-analysis',
  description: 'Analyze text using simplified API',
  schema: z.object({
    text: z.string(),
  })
}, async (params, ctx) => {
  // Just pass a string - that's it!
  const response = await ctx.sample(
    `Summarize this text in one sentence: ${params.text}`
  );
  
  return text(response.content[0].text);
});
With options:
// Add custom options while keeping the simple string API
const response = await ctx.sample(
  `Analyze this: ${params.text}`,
  {
    maxTokens: 50,
    temperature: 0.3,
    timeout: 30000  // 30 seconds
  }
);

2. Extended

Use when you need fine-grained control over model preferences, system prompts, etc.:
server.tool({
  name: 'advanced-analysis',
  description: 'Analyze text with full control',
  schema: z.object({
    text: z.string(),
  })
}, async (params, ctx) => {
  // Full control with complete params object
  const response = await ctx.sample({
    messages: [
      {
        role: 'user',
        content: { 
          type: 'text', 
          text: `Analyze: ${params.text}` 
        },
      },
    ],
    modelPreferences: {
      intelligencePriority: 0.8,  // Prefer smarter models
      speedPriority: 0.5,          // Balance speed
    },
    systemPrompt: 'You are an expert analyst.',
    maxTokens: 100,
  });

  return text(response.content[0].text);
});

Progress reporting

While the client generates the response, progress updates are reported at regular intervals and can be handled with the onProgress callback:
server.tool({
  name: 'custom-progress',
  description: 'Custom progress configuration',
  schema: z.object({
    prompt: z.string(),
  })
}, async (params, ctx) => {
  const response = await ctx.sample(
    params.prompt,
    {
      // Limit wait time (default: Infinity)
      timeout: 120000,  // 2 minutes
      
      // Change progress interval (default: 5000ms)
      progressIntervalMs: 2000,
      
      // Custom progress handling
      onProgress: ({ message }) => {
        console.log(`[Progress] ${message}`);
      }
    }
  );
  
  return text(response.content[0].text);
});
Default Behavior: Like ctx.elicit(), sampling has no timeout by default and waits indefinitely for the LLM response. This prevents premature failures for complex prompts that may take time to process.
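If you do want to bound the wait, pass a timeout and handle the failure. The following is a minimal sketch, assuming ctx.sample() rejects once the timeout elapses (the exact error it throws is not specified here, and the tool name is illustrative):
server.tool({
  name: 'bounded-sampling',
  description: 'Sampling with an explicit timeout and failure handling',
  schema: z.object({
    prompt: z.string(),
  })
}, async (params, ctx) => {
  try {
    // Fail instead of waiting forever if the client takes longer than 1 minute
    const response = await ctx.sample(params.prompt, { timeout: 60000 });
    return text(response.content[0].text);
  } catch (err) {
    // The request did not complete (for example, the timeout elapsed)
    const reason = err instanceof Error ? err.message : String(err);
    return text(`Sampling failed: ${reason}`);
  }
});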

Checking for Sampling Support

You can check if the client supports sampling before attempting to use it:
server.tool({
  name: 'conditional-sampling',
  description: 'Use sampling if available',
  schema: z.object({
    text: z.string(),
  })
}, async (params, ctx) => {
  if (!ctx) {
    return text('Context not available - running in stateless mode');
  }

  // Check if client supports sampling
  if (!ctx.client.can('sampling')) {
    return text(`Simple Analysis: ${params.text.length} characters`);
  }

  const response = await ctx.sample(`Analyze: ${params.text}`);
  return text(`LLM Analysis: ${response.content[0].text}`);
});
You can also inspect all client capabilities:
server.tool({
  name: 'advanced-feature',
  description: 'Uses different features based on client capabilities',
  schema: z.object({ input: z.string() })
}, async (params, ctx) => {
  const caps = ctx.client.capabilities();
  console.log('Client capabilities:', caps);
  // { sampling: {}, roots: { listChanged: true }, elicitation: { form: {}, url: {} } }
  
  if (ctx.client.can('sampling') && ctx.client.can('elicitation')) {
    // Use both sampling and elicitation
    const analysis = await ctx.sample(`Analyze: ${params.input}`);
    return text(`Advanced analysis: ${analysis.content[0].text}`);
  } else if (ctx.client.can('sampling')) {
    // Use only sampling
    const result = await ctx.sample(`Process: ${params.input}`);
    return text(`Result: ${result.content[0].text}`);
  } else {
    // Basic functionality only
    return text(`Basic processing: ${params.input.length} characters`);
  }
});

Example

import { MCPServer, text } from 'mcp-use/server';
import { z } from 'zod';

const server = new MCPServer({
  name: 'sentiment-analyzer',
  version: '1.0.0',
});

server.tool({
  name: 'analyze-sentiment',
  description: 'Analyze sentiment using client\'s LLM',
  schema: z.object({
    text: z.string(),
  })
}, async (params, ctx) => {
  const prompt = `Analyze the sentiment of the following text as positive, negative, or neutral.
Just output a single word - 'positive', 'negative', or 'neutral'.

Text to analyze: ${params.text}`;

  const response = await ctx.sample(prompt);
  const sentiment = response.content[0].text.trim().toLowerCase();
  
  return text(`Sentiment: ${sentiment}`);
});

await server.listen();

API Reference

ctx.sample(prompt: string, options?)

Simplified API Parameters:
  • prompt (string): The text prompt to send to the LLM
  • options (optional):
    • maxTokens (number): Maximum tokens in response (default: 1000)
    • temperature (number): Sampling temperature 0.0-1.0
    • timeout (number): Timeout in milliseconds (default: Infinity)
    • progressIntervalMs (number): Progress interval (default: 5000)
    • onProgress (function): Custom progress handler
Returns: Promise<CreateMessageResult>

ctx.sample(params: CreateMessageRequestParams, options?)

Extended API Parameters:
  • params (object):
    • messages (array): Array of message objects
    • modelPreferences (optional): Model selection preferences
    • systemPrompt (optional): System prompt for the LLM
    • maxTokens (optional): Maximum response tokens
  • options (optional): Same as simplified API
Returns: Promise<CreateMessageResult>

CreateMessageResult

interface CreateMessageResult {
  role: 'assistant';
  content: {
    type: 'text' | 'image';
    text?: string;
    data?: string;
    mimeType?: string;
  };
  model: string;
  stopReason?: 'endTurn' | 'maxTokens' | 'stopSequence';
}
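Because the content block may be text or an image, tools that expect plain text can check the type before reading it. The helper below is an illustrative sketch, not part of the library; it follows the interface above and also tolerates an array-shaped content field, matching the indexing used in the earlier examples:
type SamplingContent = {
  type: 'text' | 'image';
  text?: string;
  data?: string;
  mimeType?: string;
};

// Illustrative helper: pull plain text out of a sampling result, throwing if
// the model returned something other than text.
function extractText(result: { content: SamplingContent | SamplingContent[] }): string {
  const content = Array.isArray(result.content) ? result.content[0] : result.content;
  if (content?.type === 'text' && typeof content.text === 'string') {
    return content.text;
  }
  throw new Error(`Expected text content, got: ${content?.type ?? 'none'}`);
}

// Usage: const sentiment = extractText(response).trim().toLowerCase();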

Testing with MCP Inspector

  1. Start your server with inspector enabled
  2. Navigate to Tools tab
  3. Call a tool that uses sampling
  4. Approve in the sampling tab
  5. Verify the LLM response is processed correctly

GitHub code

Check out the complete working example with all sampling patterns: 📁 examples/server/sampling/

This example includes:
  • Simple string API usage
  • Full control API with model preferences
  • Custom progress configuration
  • All three example tools from this guide
To run the example:
cd libraries/typescript/packages/mcp-use/examples/server/sampling
pnpm install
pnpm dev
The server runs on http://localhost:3000 with the MCP Inspector available.

Next Steps