
llm-sdk

llm-sdk is an open-source suite for building production LLM applications. It ships two libraries:

  • LLM SDK – cross-language clients (JavaScript, Rust, Go) that talk to multiple LLM providers through one LanguageModel interface.
  • LLM Agent – a minimal, transparent agent library that orchestrates model generations and tool executions using the SDK under the hood.
| Package | Language | Registry |
| --- | --- | --- |
| @hoangvvo/llm-sdk | JavaScript/TypeScript | npm |
| llm-sdk-rs | Rust | crates.io |
| github.com/hoangvvo/llm-sdk | Go | pkg.go.dev |
| @hoangvvo/llm-agent | JavaScript/TypeScript | npm |
| llm-agent | Rust | crates.io |
| github.com/hoangvvo/llm-agent | Go | pkg.go.dev |

The accompanying Console app demonstrates the libraries end-to-end.

[Screenshot: the Console chat application]

Status: both libraries are currently v0. The SDK library APIs are largely stable; the Agent library APIs may evolve. Feedback and contributions are welcome.

Why use llm-sdk

  • Supports multiple LLM providers with a unified API.
  • Handles multiple modalities: Text, Image, and Audio. Supports streaming.
  • Supports multi-modality function calling (image/audio returned from tools).
  • Supports citations (RAG) and reasoning for supported models.
  • Reports token usage and calculates the cost of a request when provided with the model’s pricing information (see the sketch after this list).
  • Unified serialization across programming languages (systems in different languages can work together).
  • Integrates OpenTelemetry for tracing.
  • Zero abstraction: the agent library is a thin for-loop around the SDK, with no overcomplicated abstractions like chains, graphs, or hidden prompt templates.
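
As a toy illustration of the cost bullet above: the cost of a request can be derived from the reported token usage plus pricing you supply. This is only a sketch; the field names below are assumptions, not the SDK’s published shape.

```ts
// Hypothetical shapes for illustration; real field names may differ.
const usage = { inputTokens: 1_200, outputTokens: 300 }; // reported by the SDK
const pricing = { inputCostPerToken: 0.0000025, outputCostPerToken: 0.00001 }; // supplied by you

// cost = input tokens * input price + output tokens * output price
const cost =
  usage.inputTokens * pricing.inputCostPerToken +
  usage.outputTokens * pricing.outputCostPerToken;
console.log(cost); // ≈ 0.006 (USD)
```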

LLM SDKs

Choose the language that fits your service and get the same capabilities. Each implementation follows the TypeScript reference specification in schema/sdk.ts. Request/response payloads (LanguageModelInput, ModelResponse, tool events, etc.) keep identical field names when serialized to JSON, so services written in different languages can interoperate.
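
For example, a payload produced by a service in one language can be consumed verbatim by a service in another. The sketch below only illustrates the idea; the exact field names are assumptions, and schema/sdk.ts is the normative source.

```ts
// A message serialized by a TypeScript service; a Rust or Go service
// deserializes the same JSON with the same field names.
const message = {
  role: "assistant",
  content: [{ type: "text", text: "Hello from another service" }],
};
const wire = JSON.stringify(message); // send across the service boundary
```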

Supported providers

| Provider | Sampling Params | Citation ¹ |
| --- | --- | --- |
| OpenAI (Responses) | ✅ except top_k, frequency_penalty, presence_penalty, seed | |
| OpenAI (Chat Completion) | ✅ except top_k | |
| Anthropic | ✅ except frequency_penalty, presence_penalty, seed | ✅ (Search results) |
| Google | | |
| Cohere | | ✅ (Document) |
| Mistral | ✅ except top_k | 🚧 |

Keys: ✅ supported · 🚧 planned · ➖ not available from provider.

Core interfaces

  • LanguageModel: supplies provider metadata plus generate and stream methods that accept a LanguageModelInput and return unified responses.
  • LanguageModelInput: captures conversation history, sampling parameters, tool definitions, response-format hints, and modality toggles. The SDK adapts this shape to each provider’s API.
  • ModelResponse / PartialModelResponse: normalized outputs (with usage/cost when available) that you can forward directly to other services.
  • Message: building blocks for conversations. Messages represent user, assistant, or tool turns, with a list of parts, each representing a chunk of content in a specific modality:
    • Part: TextPart, ImagePart, AudioPart, SourcePart (for citation), ToolCallPart, ToolResultPart, and ReasoningPart.
  • Tool semantics: function calling and tool-result envelopes share the same schema across providers. The SDK normalizes call IDs, arguments, and error flags so agent runtimes can hydrate rich tool events without per-provider branching.
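
To make these shapes concrete, here is a minimal sketch of the surface described above. The type and method names come from this README; the field names are illustrative assumptions and may differ from the normative schema/sdk.ts.

```ts
// Illustrative sketch only: field names are assumptions, not the normative schema.
type Part =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; args: unknown }
  | { type: "tool-result"; toolCallId: string; result: unknown; isError?: boolean };
// ...the real union also covers image, audio, source (citation), and reasoning parts.

interface Message {
  role: "user" | "assistant" | "tool";
  content: Part[];
}

interface LanguageModelInput {
  messages: Message[];
  temperature?: number; // plus other sampling params, tools, response format, modalities
}

interface ModelResponse {
  content: Part[];
  usage?: { inputTokens: number; outputTokens: number };
  cost?: number; // populated when pricing info is provided
}

interface LanguageModel {
  provider: string;
  modelId: string;
  generate(input: LanguageModelInput): Promise<ModelResponse>;
  stream(input: LanguageModelInput): AsyncIterable<unknown>; // PartialModelResponse deltas
}
```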

LLM Agent

llm-agent wraps the SDK to provide a lightweight agent runtime:

  • Agent objects are stateless blueprints that declare instructions, tools, toolkits, and default model settings.
  • Run sessions bind an agent to a specific context value. Sessions resolve dynamic instructions once, initialize toolkit state, and stream model/tool events back to you.
  • Agent items capture every turn: user/assistant messages, model responses (with usage metadata), and rich tool-call records. Append the output list to the next run’s input to continue a conversation.
  • Streaming mirrors non-streaming responses but emits partial deltas and tool events for real-time UX.
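
Conceptually, the runtime is the “thin for-loop around the SDK” described earlier: generate, execute any requested tools, append the results, and repeat until the model stops calling tools. The sketch below reuses the hypothetical Part/Message/LanguageModel types from the SDK sketch above; it is not the published llm-agent API.

```ts
// Hypothetical tool dispatcher; a real run session resolves tools declared on the Agent.
declare function executeTool(name: string, args: unknown): Promise<unknown>;

// Conceptual run loop, not the published llm-agent API.
async function run(model: LanguageModel, items: Message[]): Promise<Message[]> {
  for (;;) {
    const response = await model.generate({ messages: items });
    items.push({ role: "assistant", content: response.content });

    const toolCalls = response.content.filter(
      (p): p is Extract<Part, { type: "tool-call" }> => p.type === "tool-call",
    );
    if (toolCalls.length === 0) return items; // no tool requests: the turn is complete

    // Execute each requested tool, append the result as a tool turn,
    // then loop so the model can read the results.
    for (const call of toolCalls) {
      const result = await executeTool(call.toolName, call.args);
      items.push({
        role: "tool",
        content: [{ type: "tool-result", toolCallId: call.toolCallId, result }],
      });
    }
  }
}
```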

Getting started

Read the full documentation on llm-sdk.hoangvvo.com or start from its getting-started guides.

Also check out the agent patterns covered in the Agent Patterns section below.

Note: To run the examples, create a .env file in the root folder (the folder containing this README) with your API keys.

Agent Patterns

This agent library (not framework) is designed for transparency and control. Unlike many “agentic” frameworks, it ships with no hidden prompt templates or secret parsing rules, and that’s on purpose:

  • Nothing hidden – What you write is what runs. No secret prompts or “special sauce” behind the scenes, so your instructions aren’t quietly overridden.
  • Works in any setting – Many frameworks bake in English-only prompts. Here, the model sees only your words, in whichever language or format you choose.
  • Easy to tweak – Change prompts, parsing, or flow without fighting built-in defaults.
  • Less to debug – Fewer layers mean you can trace exactly where things break.
  • No complex abstraction – Don't waste time learning new concepts or APIs (e.g., “chains”, “graphs”, syntax with special meanings, etc.). Just plain functions and data structures.

LLMs in the past were not as powerful as they are today, so frameworks had to do a lot of heavy lifting to get decent results. With modern LLMs, much of that complexity is no longer necessary.

Because we keep the core minimal (only 500 LOC!) and do not want to introduce hidden magic, the library doesn’t bundle heavy agent patterns like hand-off, memory, or planners. Instead, the examples/ folders show clean, working references you can copy or adapt, demonstrating that the library can still support complex use cases.

This philosophy is inspired by this blog post.

Comparison with other libraries

The initial version of llm-sdk was developed internally at my company, before similar libraries such as the Vercel AI SDK or OpenAI Swarm existed or were known to us. As a result, it was never intended to compete with those libraries or address their limitations. As they matured, llm-sdk continued to evolve independently, focusing on the features and use cases its intended applications required.

This section outlines the differences for those considering migration to or from llm-sdk, or assessing compatibility.

TBD.

License

MIT

Footnotes

  1. Source Input (citation) is not supported by all providers and may be converted to compatible inputs instead.