A small, fetch-first implementation of the Model Context Protocol (MCP) server APIs.
mcp-lite is a ground-up rewrite of the TypeScript MCP SDK. It keeps only the pieces you need to stand up a server: JSON-RPC handling, typed tool definitions, and an HTTP + SSE transport that works anywhere Request and Response are available (Node, Bun, Cloudflare Workers, Deno, browsers with Service Workers).
You get:
- A minimal core (packages/core) with zero runtime dependencies.
- Opt-in adapters for sessions and client calls so you can start without state and add storage when you need it.
- Plain TypeScript APIs that line up with the MCP spec and stay close to the wire format.
npm install mcp-lite
Spin up a minimal MCP server with Hono and Zod:
import { Hono } from "hono";
import { McpServer, StreamableHttpTransport } from "mcp-lite";
import { z } from "zod";
const mcp = new McpServer({
name: "example-server",
version: "1.0.0",
schemaAdapter: (schema) => z.toJSONSchema(schema as z.ZodType),
});
const WeatherInputSchema = z.object({
location: z.string(),
});
const WeatherOutputSchema = z.object({
temperature: z.number(),
conditions: z.string(),
});
mcp.tool("getWeather", {
description: "Gets weather information for a location",
inputSchema: WeatherInputSchema,
outputSchema: WeatherOutputSchema,
handler: (args) => ({
content: [{
type: "text",
text: `Weather in ${args.location}: 22°C, sunny`
}],
structuredContent: {
temperature: 22,
conditions: "sunny",
},
}),
});
const transport = new StreamableHttpTransport();
const httpHandler = transport.bind(mcp);
const app = new Hono();
app.all("/mcp", async (c) => {
const response = await httpHandler(c.req.raw);
return response;
});
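If you are running on Bun, or another runtime that accepts a Fetch-style default export (an assumption about your deployment target), you can serve the app by exporting it, as in the Hono + Bun deployment example later in this document:

// Serve the Hono app; Bun picks up the default export's fetch handler.
export default app;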
Tip
The Model Context Protocol (MCP) is an open standard that enables secure connections between host applications and external data sources and tools, allowing AI assistants to reason over information and execute functions with user permission.
- No runtime dependencies and a single TypeScript entrypoint.
- Type-safe tool definitions with Standard Schema (Zod, Valibot, Effect, ArkType).
- Structured outputs with runtime validation and schema exposure via tools/list.
- HTTP + SSE transport built on the Fetch API.
- Adapter interfaces for sessions, server-to-client requests, and persistence when you outgrow stateless mode.
- Middleware hooks and server composition via .group() for modular setups and namespacing.
Standard Schema validators provide automatic type inference:
import { z } from "zod";
const SearchSchema = z.object({
query: z.string(),
limit: z.number().optional(),
filters: z.array(z.string()).optional()
});
server.tool("search", {
inputSchema: SearchSchema,
handler: (args) => {
// args is typed as { query: string, limit?: number, filters?: string[] }
args.query.toLowerCase()
args.limit ?? 10
args.filters?.map(f => f.trim())
return { content: [{ type: "text", text: "..." }] }
}
})
Tools can return both human-readable content and machine-readable structured data. Use outputSchema to define the shape of structuredContent:
See the MCP structured content spec for the protocol details.
const WeatherOutputSchema = z.object({
temperature: z.number(),
conditions: z.string(),
});
server.tool("getWeather", {
inputSchema: z.object({ location: z.string() }),
outputSchema: WeatherOutputSchema,
handler: (args) => ({
content: [{
type: "text",
text: `Weather in ${args.location}: 22°C, sunny`
}],
// structuredContent is typed and validated at runtime
structuredContent: {
temperature: 22,
conditions: "sunny",
}
})
})
The outputSchema provides runtime validation and type inference for structuredContent.
The handler context provides typed access to session data, authentication, and client capabilities:
handler: async (args, ctx) => {
// Report progress back to the client
ctx.progress?.({ progress: 50, total: 100 })
// Session and authentication info
ctx.session?.id
ctx.authInfo?.userId
// Request-scoped state shared with middleware
ctx.state.myCustomData = "..."
// Validate arbitrary data against a schema
const validated = ctx.validate(MySchema, data)
// Ask the client for input mid-execution (requires elicitation support)
if (ctx.client.supports("elicitation")) {
const result = await ctx.elicit({
message: "Confirm action?",
schema: z.object({ confirmed: z.boolean() })
})
}
}
You can begin with a single-file server and add state only when you need it. Adapters let you swap in storage or queueing code without touching tools or handlers.
- Local prototyping: run the transport with no adapters. Every request is stateless and there is nothing to clean up.
- Single server: add InMemorySessionAdapter (and optionally InMemoryClientRequestAdapter) so progress notifications and elicitations can span multiple requests from the same client.
- Distributed or serverless: implement the adapters against shared infrastructure (Redis, SQL, message queues, Durable Objects, etc.). See the Cloudflare Workers example for a KV-backed implementation.
Environment | Session Storage | State Storage | Transport Configuration |
---|---|---|---|
Development | None | N/A | StreamableHttpTransport() |
Single server | In-memory | In-memory | InMemorySessionAdapter |
Distributed | Redis/KV | Redis/KV | Custom adapters |
import { StreamableHttpTransport, InMemorySessionAdapter, InMemoryClientRequestAdapter } from "mcp-lite"

// Development: stateless
const transport = new StreamableHttpTransport()
// Production: with sessions and client requests
const transport = new StreamableHttpTransport({
sessionAdapter: new InMemorySessionAdapter({
maxEventBufferSize: 1024
}),
clientRequestAdapter: new InMemoryClientRequestAdapter({
defaultTimeoutMs: 30000
})
})
- InMemorySessionAdapter - Session storage in memory
- InMemoryClientRequestAdapter - Client request tracking in memory
The server can send JSON-RPC requests back to the MCP client (for example when you call ctx.elicit). Those requests are routed through the ClientRequestAdapter. Provide one when you need:
- Timeouts or retries for client prompts.
- Delivery of an elicitation response even after the original POST has finished.
- Pending requests backed by shared storage in a multi-instance deployment.
The in-memory adapter covers local runs. For production, you can implement ClientRequestAdapter using Redis, a SQL store, Durable Objects, or any queue that can look up pending requests by session and request id.
Implement these interfaces for custom storage:
interface SessionAdapter {
generateSessionId(): string
create(id: string, meta: SessionMeta): Promise<SessionData>
has(id: string): Promise<boolean>
get(id: string): Promise<SessionData | undefined>
appendEvent(id: string, streamId: string, message: unknown): Promise<EventId>
replay(id: string, lastEventId: EventId, write: WriteFunction): Promise<void>
delete(id: string): Promise<void>
}
interface ClientRequestAdapter {
createPending(sessionId: string, requestId: string, options): { promise: Promise<Response> }
resolvePending(sessionId: string, requestId: string, response: Response): boolean
rejectPending(sessionId: string, requestId: string, error: Error): boolean
}
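For illustration, here is a minimal sketch of a ClientRequestAdapter backed by an in-process Map. It assumes an options shape of { timeoutMs?: number } (check the actual types exported by mcp-lite); a production implementation would replace the Map with Redis or another shared store so any instance can resolve a pending request.

// Sketch only: Map-backed adapter matching the interface above.
class MapClientRequestAdapter {
  private pending = new Map<string, {
    resolve: (response: Response) => void;
    reject: (error: Error) => void;
  }>();

  createPending(sessionId: string, requestId: string, options: { timeoutMs?: number } = {}) {
    const key = `${sessionId}:${requestId}`;
    let resolve!: (response: Response) => void;
    let reject!: (error: Error) => void;
    const promise = new Promise<Response>((res, rej) => { resolve = res; reject = rej; });
    this.pending.set(key, { resolve, reject });
    // Reject automatically if the client never answers
    if (options.timeoutMs) {
      setTimeout(() => this.rejectPending(sessionId, requestId, new Error("Request timed out")), options.timeoutMs);
    }
    return { promise };
  }

  resolvePending(sessionId: string, requestId: string, response: Response): boolean {
    const key = `${sessionId}:${requestId}`;
    const entry = this.pending.get(key);
    if (!entry) return false;
    this.pending.delete(key);
    entry.resolve(response);
    return true;
  }

  rejectPending(sessionId: string, requestId: string, error: Error): boolean {
    const key = `${sessionId}:${requestId}`;
    const entry = this.pending.get(key);
    if (!entry) return false;
    this.pending.delete(key);
    entry.reject(error);
    return true;
  }
}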
See examples/cloudflare-worker-kv for a production implementation using Cloudflare KV.
StreamableHttpTransport runs anywhere the Fetch API is available.
- Hono + Bun

  import { Hono } from "hono"
  import { McpServer, StreamableHttpTransport } from "mcp-lite"
  import { z } from "zod"

  const server = new McpServer({
    name: "my-server",
    version: "1.0.0",
    schemaAdapter: (schema) => z.toJSONSchema(schema as z.ZodType)
  }).tool("echo", {
    inputSchema: z.object({ message: z.string() }),
    handler: (args) => ({ content: [{ type: "text", text: args.message }] })
  })

  const transport = new StreamableHttpTransport()
  const handler = transport.bind(server)

  const app = new Hono()
  app.all("/mcp", (c) => handler(c.req.raw))

  export default app
- Cloudflare Workers (stateless starter; plug adapters into KV / Durable Objects for production)

  import { McpServer, StreamableHttpTransport } from "mcp-lite"
  import { z } from "zod"

  export default {
    async fetch(request: Request): Promise<Response> {
      const server = new McpServer({
        name: "worker-server",
        version: "1.0.0",
        schemaAdapter: (schema) => z.toJSONSchema(schema as z.ZodType)
      }).tool("echo", {
        inputSchema: z.object({ message: z.string() }),
        handler: (args) => ({ content: [{ type: "text", text: args.message }] })
      })

      const transport = new StreamableHttpTransport()
      return transport.bind(server)(request)
    }
  }
- Next.js App Router

  import { McpServer, StreamableHttpTransport } from "mcp-lite"
  import { z } from "zod"

  const server = new McpServer({
    name: "nextjs-server",
    version: "1.0.0",
    schemaAdapter: (schema) => z.toJSONSchema(schema as z.ZodType)
  }).tool("echo", {
    inputSchema: z.object({ message: z.string() }),
    handler: (args) => ({ content: [{ type: "text", text: args.message }] })
  })

  const handler = new StreamableHttpTransport().bind(server)

  export const POST = handler
  export const GET = handler
The repo includes runnable samples that show different adapters and runtimes:
- examples/cloudflare-worker-kv – Workers runtime with Cloudflare KV adapters.
- examples/composing-servers – Multiple servers grouped behind one transport.
- examples/validation-arktype – Standard Schema via ArkType.
- examples/validation-valibot – Validation using Valibot.
- examples/validation-effectschema – Validation with Effect Schema.
- examples/validation-zod – Validation with Zod.
- examples/auth-clerk – Adds Clerk auth middleware and guards.
The sections below map directly to the MCP specification: tools, resources, prompts, and elicitations.
Tools expose callable functionality to MCP clients. Each variant below shows a different way to define inputs and outputs.
Define a tool using plain JSON Schema input and output definitions.
server.tool("add", {
description: "Adds two numbers",
inputSchema: {
type: "object",
properties: {
a: { type: "number" },
b: { type: "number" },
},
required: ["a", "b"],
},
outputSchema: {
type: "object",
properties: {
result: { type: "number" },
},
required: ["result"],
},
handler: (args: { a: number; b: number }) => ({
content: [{ type: "text", text: String(args.a + args.b) }],
structuredContent: { result: args.a + args.b },
}),
});
Use a Standard Schema validator (Zod here) to infer handler types automatically.
import { z } from "zod";
const AddInputSchema = z.object({
a: z.number(),
b: z.number(),
});
const AddOutputSchema = z.object({
result: z.number(),
});
server.tool("add", {
description: "Adds two numbers with structured output",
inputSchema: AddInputSchema,
outputSchema: AddOutputSchema,
handler: (args) => ({
content: [{ type: "text", text: String(args.a + args.b) }],
structuredContent: { result: args.a + args.b },
}),
});
Skip validation entirely for endpoints that return static information.
server.tool("status", {
description: "Returns server status",
handler: () => ({
content: [{ type: "text", text: "Server is running" }],
}),
});
Add title and _meta fields to pass arbitrary metadata through tools/list and tools/call responses.
server.tool("experimental-feature", {
description: "An experimental feature",
title: "Experimental Feature",
_meta: {
version: "0.1.0",
stability: "experimental",
tags: ["beta", "preview"],
},
inputSchema: z.object({ input: z.string() }),
handler: (args) => ({
content: [{ type: "text", text: `Processing: ${args.input}` }],
_meta: {
executionTime: 123,
cached: false,
},
}),
});
The _meta and title from the definition appear in tools/list responses. Tool handlers can also return _meta in the result for per-call metadata like execution time or cache status.
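For illustration, the corresponding tools/list entry might look roughly like this (the shape follows the MCP spec; mcp-lite's exact serialization may differ slightly):

const toolsListEntry = {
  name: "experimental-feature",
  title: "Experimental Feature",
  description: "An experimental feature",
  inputSchema: {
    type: "object",
    properties: { input: { type: "string" } },
    required: ["input"],
  },
  _meta: {
    version: "0.1.0",
    stability: "experimental",
    tags: ["beta", "preview"],
  },
};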
Resources are URI-identified content.
Serve fixed content for a specific URI.
server.resource(
"file://config.json",
{
name: "App Configuration",
description: "Application configuration file",
mimeType: "application/json",
},
async (uri) => ({
contents: [{
uri: uri.href,
type: "text",
text: JSON.stringify({ name: "my-app" }),
mimeType: "application/json",
}],
})
);
Bind template variables from the URI before returning content.
server.resource(
"github://repos/{owner}/{repo}",
{ description: "GitHub repository" },
async (uri, { owner, repo }) => ({
contents: [{
uri: uri.href,
type: "text",
text: `Repository: ${owner}/${repo}`,
}],
})
);
Include _meta in the resource metadata to pass custom information through resources/list, resources/templates/list, and resources/read responses.
server.resource(
"db://records/{id}",
{
name: "Database Record",
description: "Fetch a record from the database",
mimeType: "application/json",
_meta: {
cacheTtl: 300,
accessLevel: "read-only",
region: "us-west-2",
},
},
async (uri, { id }) => ({
contents: [{
uri: uri.href,
type: "text",
text: JSON.stringify({ id, data: "..." }),
_meta: {
contentVersion: "2.0",
lastModified: "2025-01-01",
},
}],
_meta: {
totalSize: 1024,
cached: true,
},
})
);
The _meta from the resource definition appears in list responses. Handlers can also return _meta on the result and on individual contents for per-read metadata.
Prompts generate message sequences for LLM conversations.
Return a fixed message sequence.
server.prompt("greet", {
description: "Generate a greeting",
handler: () => ({
messages: [{
role: "user",
content: { type: "text", text: "Hello, how are you?" }
}]
})
});
Validate prompt arguments before building messages.
import { z } from "zod";
const SummarySchema = z.object({
text: z.string(),
length: z.enum(["short", "medium", "long"]).optional(),
});
server.prompt("summarize", {
description: "Create a summary prompt",
arguments: SummarySchema,
handler: (args) => ({
description: "Summarization prompt",
messages: [{
role: "user",
content: {
type: "text",
text: `Please summarize this text in ${args.length || "medium"} length:\n\n${args.text}`
}
}]
})
});
Add title and _meta to pass additional information through prompts/list and prompts/get responses.
server.prompt("research-assistant", {
description: "Research assistant prompt with context",
title: "Research Assistant",
_meta: {
category: "research",
complexity: "advanced",
estimatedTokens: 500,
},
arguments: [
{ name: "topic", description: "Research topic", required: true },
{ name: "depth", description: "Research depth", required: false },
],
handler: (args: { topic: string; depth?: string }) => ({
messages: [{
role: "user",
content: {
type: "text",
text: `Research ${args.topic} at ${args.depth || "medium"} depth`
}
}],
_meta: {
templateVersion: "2.0",
generated: true,
},
})
});
The _meta and title from the definition appear in prompts/list responses. Handlers can also return _meta in the result for per-generation metadata.
Elicitation lets a tool request input from the client mid-execution. mcp-lite wires this through the same handler context:
import { z } from "zod";
server.tool("delete_record", {
inputSchema: z.object({
recordId: z.string(),
tableName: z.string(),
}),
handler: async (args, ctx) => {
if (!ctx.client.supports("elicitation")) {
throw new Error("Elicitation not supported");
}
const response = await ctx.elicit({
message: `Delete record "${args.recordId}" from "${args.tableName}"?`,
schema: z.object({ confirmed: z.boolean() })
});
if (response.action === "accept" && response.content.confirmed) {
await deleteFromDatabase(args.tableName, args.recordId);
return { content: [{ type: "text", text: "Record deleted" }] };
}
return { content: [{ type: "text", text: "Deletion cancelled" }] };
}
});
Elicitation requires both adapters:
const transport = new StreamableHttpTransport({
sessionAdapter: new InMemorySessionAdapter({ maxEventBufferSize: 1024 }),
clientRequestAdapter: new InMemoryClientRequestAdapter({ defaultTimeoutMs: 30000 })
});
See packages/core/README.elicitation.md for an implementation that uses an external KV store.
Sampling refers to the ability of an MCP server to pause execution and ask the MCP client to provide LLM completions to inform its response. This is helpful, for example, when you want to keep all inference on the client side.
Sampling is unfortunately not well supported across MCP clients; GitHub Copilot, however, does support it.
As with elicitation, you need to configure both a SessionAdapter and a ClientRequestAdapter for sampling to work.
const FrenchSchema = z.object({});
mcp.tool("frenchness_evaluation", {
description: "Evaluates how French a host application is",
inputSchema: FrenchSchema,
handler: async (args, ctx) => {
// Check if client supports sampling
if (!ctx.client.supports("sampling")) {
throw new Error("This tool requires a client that supports sampling");
}
// Request LLM completion through sampling
const response = await ctx.sample({
// ...
prompt: "What is the capital of France?",
modelPreferences: {
hints: [
{ name: "claude-4.5-sonnet" }
],
intelligencePriority: 0.8,
speedPriority: 0.5
},
systemPrompt: "You are a wonky assistant.",
maxTokens: 100
});
if ("result" in response && response.result.type === "text") {
const { content } = response.result;
const isFrench = content?.toLowerCase().includes("paris");
return {
content: [{
type: "text",
text: isFrench ? "Pas mal. You might be French enough" : "You are not very French my friend"
}],
};
}
if ("error" in response) {
return {
content: [{
type: "text",
text: `Sampling completion failed: ${response.error.message}`,
}],
};
}
// Unknown case, should not hit this
throw new Error("Unexpected sampling response");
},
});
mcp-lite lets you apply Hono-style middleware to every request before it reaches a tool or prompt handler:
// Logging
server.use(async (ctx, next) => {
const start = Date.now();
await next();
console.log(`${ctx.request.method} took ${Date.now() - start}ms`);
});
// Authentication
server.use(async (ctx, next) => {
const token = ctx.request.headers?.get?.("Authorization");
if (!token) throw new Error("Unauthorized");
ctx.state.user = await validateToken(token);
await next();
});
// Rate limiting
server.use(async (ctx, next) => {
const userId = ctx.state.user?.id;
if (await isRateLimited(userId)) {
throw new Error("Rate limit exceeded");
}
await next();
});
Group smaller servers together while preserving their tooling and middleware:
const gitServer = new McpServer({ name: "git", version: "1.0.0" })
.tool("clone", { /* ... */ })
.tool("commit", { /* ... */ });
const dbServer = new McpServer({ name: "database", version: "1.0.0" })
.tool("query", { /* ... */ })
.tool("migrate", { /* ... */ });
// With namespacing
const app = new McpServer({ name: "app", version: "1.0.0" })
.group("git", gitServer) // Registers: git/clone, git/commit
.group("db", dbServer); // Registers: db/query, db/migrate
// Without namespacing
const app2 = new McpServer({ name: "app", version: "1.0.0" })
.group(gitServer) // Registers: clone, commit
.group(dbServer); // Registers: query, migrate
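With namespacing, clients address the grouped tools by their prefixed names. For illustration, a tools/call request for the git server's clone tool might look like this (a wire-format sketch; the arguments are placeholders):

const callRequest = {
  jsonrpc: "2.0",
  id: 7,
  method: "tools/call",
  params: {
    name: "git/clone", // namespaced by .group("git", gitServer)
    arguments: { /* tool-specific arguments */ },
  },
};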
See examples/composing-servers for details.
Throw RpcError to return structured JSON-RPC failures, or customize onError for fallback logic.
import { RpcError, JSON_RPC_ERROR_CODES } from "mcp-lite";
server.tool("divide", {
inputSchema: z.object({ a: z.number(), b: z.number() }),
handler: (args) => {
if (args.b === 0) {
throw new RpcError(JSON_RPC_ERROR_CODES.INVALID_PARAMS, "Division by zero");
}
return {
content: [{ type: "text", text: String(args.a / args.b) }]
};
}
});
// Custom error handler
server.onError((error, ctx) => {
if (error instanceof MyCustomError) {
return {
code: -32001,
message: "Custom error",
data: { requestId: ctx.requestId }
};
}
// Return undefined for default handling
});
Default mode with no session management:
const transport = new StreamableHttpTransport();
Enable sessions for SSE streaming and event replay:
import { StreamableHttpTransport, InMemorySessionAdapter } from "mcp-lite";
const transport = new StreamableHttpTransport({
sessionAdapter: new InMemorySessionAdapter({
maxEventBufferSize: 1024
})
});
This enables:
- Session persistence across requests
- SSE streaming via GET endpoint
- Event replay for reconnections (sketched below)
- Progress notifications
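To illustrate the GET stream and event replay listed above, a client might reconnect like this. This is a sketch only: the header names follow the MCP Streamable HTTP spec, the endpoint URL is assumed to be the quickstart server, and sessionId / lastEventId are assumed to have been captured from earlier responses.

// Sketch: re-open the server-to-client SSE stream and replay missed events.
// Assumes `sessionId` came from the Mcp-Session-Id header returned on initialize
// and `lastEventId` is the id of the last SSE event received before disconnecting.
async function reconnect(sessionId: string, lastEventId: string) {
  const response = await fetch("http://localhost:3000/mcp", {
    method: "GET",
    headers: {
      Accept: "text/event-stream",
      "Mcp-Session-Id": sessionId,
      "MCP-Protocol-Version": "2025-06-18", // required with sessions on 2025-06-18 (see below)
      "Last-Event-ID": lastEventId,
    },
  });
  // The body is an SSE stream; buffered events after lastEventId are replayed first.
  return response.body;
}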
mcp-lite supports multiple MCP protocol versions with automatic negotiation:
- 2025-06-18 (current) - Full feature set, including elicitation and structured tool outputs
- 2025-03-26 (backward compatible) - Includes batch request support and optional protocol headers
During the initialize handshake, the server accepts the client's requested version if supported and echoes it back. The negotiated version is persisted per session and enforces version-specific behavior throughout the connection.
- 2025-06-18: The MCP-Protocol-Version header is required on all non-initialize requests when using sessions
- 2025-03-26: The MCP-Protocol-Version header is optional on non-initialize requests (if present, it must match the negotiated version)
- 2025-06-18: Batch requests (a JSON array of requests) are not supported and will return an error
- 2025-03-26: Batch requests are supported - send an array of JSON-RPC requests and receive an array of responses (see the sketch below)
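For illustration, a 2025-03-26 client could send two JSON-RPC requests in a single POST body like this (a sketch of the wire format only; the ids and methods are arbitrary):

// Sketch: a batch body containing two JSON-RPC requests (2025-03-26 only).
const batchRequest = [
  { jsonrpc: "2.0", id: 1, method: "tools/list" },
  { jsonrpc: "2.0", id: 2, method: "prompts/list" },
];
// The server replies with an array of responses, matched by id.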
- 2025-06-18: Clients may declare elicitation and sampling capabilities
- 2025-03-26: Clients may declare roots and sampling capabilities
Note: Server capabilities (tools, prompts, resources) are version-independent.
// Client requests 2025-03-26
const initRequest = {
jsonrpc: "2.0",
id: 1,
method: "initialize",
params: {
protocolVersion: "2025-03-26",
clientInfo: { name: "my-client", version: "1.0.0" },
capabilities: {}
}
};
// Server responds with the same version
const initResponse = {
jsonrpc: "2.0",
id: 1,
result: {
protocolVersion: "2025-03-26", // Echoed back
serverInfo: { name: "my-server", version: "1.0.0" },
capabilities: {
tools: { listChanged: true }
// Server capabilities are version-independent
}
}
};
If a client requests an unsupported version, the server returns an error with the list of supported versions:
{
"jsonrpc": "2.0",
"id": 1,
"error": {
"code": -32000,
"message": "Unsupported protocol version. Server supports: 2025-03-26, 2025-06-18, client requested: 2024-01-01",
"data": {
"supportedVersions": ["2025-03-26", "2025-06-18"],
"requestedVersion": "2024-01-01"
}
}
}
mcp-lite exports version constants as an object for clarity:
import { SUPPORTED_MCP_PROTOCOL_VERSIONS } from "mcp-lite";
// Access specific versions
const version03_26 = SUPPORTED_MCP_PROTOCOL_VERSIONS.V2025_03_26; // "2025-03-26"
const version06_18 = SUPPORTED_MCP_PROTOCOL_VERSIONS.V2025_06_18; // "2025-06-18"
// Get all supported versions as an array
const allVersions = Object.values(SUPPORTED_MCP_PROTOCOL_VERSIONS);
For more details on protocol changes, see the MCP Specification Changelog.