Run AI-generated code on your own machine—locally, securely, at lightning speed.
A Vercel AI SDK tool is provided out of the box.
Interpreter Tools is a drop-in "code-interpreter" backend for AI agents: it spins up lightweight Docker containers, executes untrusted snippets in < 100 ms (with pooling), streams the output, and can be extended to any language by registering a new config object.
Supports pooling, per-session containers, dependency caching, and real-time stdout/stderr—perfect for chat-based tools like GPT function calling, Jupyter-style notebooks, or autonomous agents that need to evaluate code on the fly.
The most powerful way to experience Interpreter Tools is through our interactive shell example. It provides a chat-like interface where you can:
- Execute commands and code in a secure Docker environment
- Get AI-powered assistance with command execution
- Maintain context of previous commands and their outputs
- Automatically fix errors with AI analysis
- View command history and session information
Try it out:
```bash
# Clone the repository
git clone https://github.com/CatchTheTornado/interpreter-tools.git
cd interpreter-tools

# Install dependencies
yarn install

# Run the interactive shell
yarn ts-node examples/interactive-shell-example.ts
```
Here are some examples showcasing different capabilities of Interpreter Tools:
- Interactive Shell - Chat-like interface with AI assistance
- AI Tool Example - Using code execution with Vercel AI
- Ruby Language Support - Extending with new language support
- Direct Engine Usage - Using ExecutionEngine without AI integration
⚡ Sub-100 ms average execution (with container pool & dep-cache). Run untrusted code fast without leaving Node!
🔌 Plug-in language architecture – add a new language by registering one object (see `LanguageRegistry`). No engine edits required.
📦 Zero-install repeat runs – dependencies are installed once per container and skipped thereafter, saving seconds on every call.
🔒 Docker-level isolation – each snippet executes in its own constrained container (CPU, memory, no-new-privileges).
🖥️ Real-time streaming – stdout/stderr stream back instantly; ideal for REPL-like experiences.
Interpreter Tools is published on npm as `interpreter-tools`.
Install the package and its dependencies in your Node.js project:
```bash
# Using yarn
yarn add interpreter-tools ai @ai-sdk/openai

# Or using npm
npm install interpreter-tools ai @ai-sdk/openai
```
- Create a new file `example.js` in your project:
```javascript
const { generateText } = require('ai');
const { openai } = require('@ai-sdk/openai');
const { createCodeExecutionTool } = require('interpreter-tools');

async function main() {
  try {
    // Create a code execution tool instance
    const { codeExecutionTool } = createCodeExecutionTool();

    // Use generateText with codeExecutionTool to generate and execute code
    const result = await generateText({
      model: openai('gpt-4'),
      maxSteps: 10,
      messages: [
        {
          role: 'user',
          content: 'Write a JavaScript function that calculates the sum of numbers from 1 to n and print the result for n=10. Make sure to include a test case.'
        }
      ],
      tools: { codeExecutionTool },
      toolChoice: 'auto'
    });

    console.log('AI Response:', result.text);
    console.log('Execution Results:', result.toolResults);
  } catch (error) {
    console.error('Error:', error);
  }
}

main();
```
- Set up your OpenAI API key:
```bash
# Using yarn
yarn add dotenv

# Or using npm
npm install dotenv
```

Create a `.env` file in your project root:

```
OPENAI_API_KEY=your_api_key_here
```
- Update your code to use the environment variable:
```javascript
require('dotenv').config();
// ... rest of the code remains the same
```
- Run the example:
```bash
node example.js
```
If you prefer to use the `ExecutionEngine` directly without the AI integration, here's how to do it:
- Create a new file `direct-example.js`:
```javascript
const { ExecutionEngine, ContainerStrategy } = require('interpreter-tools');

async function main() {
  const engine = new ExecutionEngine();

  try {
    // Create a session with per-execution strategy
    const sessionId = await engine.createSession({
      strategy: ContainerStrategy.PER_EXECUTION,
      containerConfig: {
        image: 'node:18-alpine',
        environment: {
          NODE_ENV: 'development'
        }
      }
    });

    // Execute JavaScript code
    const result = await engine.executeCode(sessionId, {
      language: 'javascript',
      code: `
        const numbers = [1, 2, 3, 4, 5];
        const sum = numbers.reduce((a, b) => a + b, 0);
        const average = sum / numbers.length;
        console.log('Numbers:', numbers);
        console.log('Sum:', sum);
        console.log('Average:', average);
      `,
      streamOutput: {
        stdout: (data) => console.log('Container output:', data),
        stderr: (data) => console.error('Container error:', data)
      }
    });

    console.log('Execution Result:');
    console.log('STDOUT:', result.stdout);
    console.log('STDERR:', result.stderr);
    console.log('Exit Code:', result.exitCode);
    console.log('Execution Time:', result.executionTime, 'ms');
  } catch (error) {
    console.error('Error:', error);
  } finally {
    // Clean up resources
    await engine.cleanup();
  }
}

main();
```
- Run the example:
```bash
node direct-example.js
```
This example demonstrates:
- Creating a session with a specific container strategy
- Configuring the container environment
- Executing code directly in the container
- Handling real-time output streaming
- Proper resource cleanup
If you're using TypeScript, you can import the packages with type definitions:
```typescript
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { createCodeExecutionTool } from 'interpreter-tools';
import 'dotenv/config';

// Rest of the code remains the same
```
Interpreter Tools can support any language that has a runnable Docker image—just register a `LanguageConfig` at runtime.
```typescript
import { LanguageRegistry, LanguageConfig } from 'interpreter-tools';

const rubyConfig: LanguageConfig = {
  language: 'ruby',
  defaultImage: 'ruby:3.2-alpine',
  codeFilename: 'code.rb',
  prepareFiles: (options, dir) => {
    const fs = require('fs');
    const path = require('path');
    fs.writeFileSync(path.join(dir, 'code.rb'), options.code);
  },
  buildInlineCommand: () => ['sh', '-c', 'ruby code.rb'],
  buildRunAppCommand: (entry) => ['sh', '-c', `ruby ${entry}`]
};

// Make the engine aware of Ruby
LanguageRegistry.register(rubyConfig);

// From this point you can use `language: 'ruby'` in `executeCode` or the AI tool.
```
See `examples/ruby-example.ts` for a full working script that:
- Registers Ruby support on-the-fly
- Asks the AI model to generate a Ruby script
- Executes the script inside a `ruby:3.2-alpine` container
To set up the project for local development:
```bash
# Clone the repository
git clone https://github.com/CatchTheTornado/interpreter-tools.git
cd interpreter-tools

# Install dependencies
yarn install
```
The `/examples` directory contains several example scripts demonstrating different use cases of Interpreter Tools:
examples/ai-example.ts
Demonstrates how to:
- Use the code execution tool with Vercel AI
- Generate and execute Python code using AI
- Handle AI-generated code execution results
- Process Fibonacci sequence calculation
Run it with:
```bash
yarn ts-node examples/ai-example.ts
```
examples/basic-usage.ts
Shows how to:
- Set up a basic execution environment
- Execute JavaScript code in a container
- Handle execution results and errors
- Use the per-execution container strategy
Run it with:
```bash
yarn ts-node examples/basic-usage.ts
```
examples/python-example.ts
Demonstrates how to:
- Execute Python code in a container
- Handle Python dependencies
- Process Python script output
- Use Python-specific container configuration
Run it with:
```bash
yarn ts-node examples/python-example.ts
```
examples/python-chart-example.js
Demonstrates how to:
- Generate files from within a Python script (e.g., PNG charts)
- Persist and retrieve these generated artifacts from the container
- Handle file management workflows when running Python code via the engine
Run it with:
```bash
node examples/python-chart-example.js
```
examples/python-json-sort-example.js
Demonstrates how to:
- Inject files (in this case JSON) into the container before execution
- Generate additional files inside the container
- Sort and process JSON data using a Python script
- Retrieve the resulting files back to the host
Run it with:
```bash
node examples/python-json-sort-example.js
```
examples/shell-json-example.ts
Demonstrates how to:
- Generate and execute a shell script using AI
- Create directories and JSON files
- Process JSON files using `jq`
- Handle Alpine Linux package dependencies
Run it with:
```bash
yarn ts-node examples/shell-json-example.ts
```
examples/nodejs-project-example.ts
Shows how to:
- Generate a complete Node.js project structure using AI
- Create an Express server
- Handle project dependencies
- Execute the generated project in a container
Run it with:
```bash
yarn ts-node examples/nodejs-project-example.ts
```
examples/shell-example.ts
A simple example that:
- Creates a shell script
- Executes it in an Alpine Linux container
- Demonstrates basic container configuration
- Shows real-time output streaming
Run it with:
```bash
yarn ts-node examples/shell-example.ts
```
examples/ruby-example.ts
Shows how to:
- Dynamically register Ruby language support
- Use the AI tool to generate and execute Ruby code
Run it with:
```bash
yarn ts-node examples/ruby-example.ts
```
`examples/benchmark-pool.ts` – JavaScript/TypeScript pool benchmark (20 rounds)

```bash
yarn ts-node examples/benchmark-pool.ts
```

`examples/benchmark-pool-python.ts` – Python pool benchmark

```bash
yarn ts-node examples/benchmark-pool-python.ts
```
Average times on a MacBook M2 Pro: JS 40 ms / round, Python 60 ms / round after first run (deps cached).
The main components of this project are:
- `ExecutionEngine`: Manages code execution in containers
- `SessionManager`: Manages session state, container metadata, and container history across executions
- `ContainerManager`: Handles Docker container lifecycle
- `CodeExecutionTool`: Provides a high-level interface for executing code
```typescript
import { createCodeExecutionTool } from 'interpreter-tools';

const { codeExecutionTool } = createCodeExecutionTool();

// Isolated workspace (default)
const result = await codeExecutionTool.execute({
  language: 'javascript',
  code: 'console.log("Hello, World!");',
  streamOutput: {
    stdout: (data) => console.log(data),
    stderr: (data) => console.error(data)
  }
});

// Shared workspace between executions
const result2 = await codeExecutionTool.execute({
  language: 'javascript',
  code: 'console.log("Hello again!");',
  workspaceSharing: 'shared' // Share workspace between executions
});
```
The `workspaceSharing` option controls how workspaces are managed:
- `isolated` (default): Each execution gets its own workspace
- `shared`: All executions in a session share the same workspace, allowing file persistence between runs (sketched after the list below)
This is particularly useful when you need to:
- Share files between multiple executions
- Maintain state across executions
- Accumulate generated files
- Build up a workspace over multiple steps
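As a minimal sketch of the shared mode (assuming repeated `execute` calls with `workspaceSharing: 'shared'` reuse the same session workspace, as the option above implies), one run can hand a file to the next:

```typescript
import { createCodeExecutionTool } from 'interpreter-tools';

const { codeExecutionTool } = createCodeExecutionTool();

// First run writes a file into the shared workspace
await codeExecutionTool.execute({
  language: 'javascript',
  code: 'require("fs").writeFileSync("state.json", JSON.stringify({ step: 1 }));',
  workspaceSharing: 'shared'
});

// Second run reads it back, because the workspace persisted
const next = await codeExecutionTool.execute({
  language: 'javascript',
  code: 'console.log(require("fs").readFileSync("state.json", "utf8"));',
  workspaceSharing: 'shared'
});

console.log(next.stdout); // {"step":1}
```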
Interpreter Tools is composed of four core layers:
| Layer | Responsibility |
|---|---|
| `ExecutionEngine` | High-level façade that orchestrates container lifecycle, dependency caching, and code execution. |
| `SessionManager` | Manages session state, container metadata, and container history across executions. |
| `ContainerManager` | Low-level wrapper around Dockerode that creates, starts, pools, and cleans up containers. |
| `LanguageRegistry` | Pluggable store of `LanguageConfig` objects that describe how to build/run code for each language. |
All user-facing helpers (e.g. the `codeExecutionTool` for AI agents) are thin wrappers that forward to `ExecutionEngine`.
```mermaid
graph TD;
  subgraph Runtime
    A[ExecutionEngine]
    B[SessionManager]
    C[ContainerManager]
    D[Docker]
  end
  E[LanguageRegistry]
  A --> B --> C --> D
  A --> E
```
```typescript
new ExecutionEngine()
createSession(config: SessionConfig): Promise<string>
executeCode(sessionId: string, options: ExecutionOptions): Promise<ExecutionResult>
cleanupSession(sessionId: string): Promise<void>
cleanup(): Promise<void>
getSessionInfo(sessionId: string): Promise<SessionInfo>
```
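Put together, a typical lifecycle might look like this (a minimal sketch built on the signatures above; the exact shapes of `SessionConfig`, `ExecutionOptions`, and `SessionInfo` are described below):

```typescript
import { ExecutionEngine, ContainerStrategy } from 'interpreter-tools';

const engine = new ExecutionEngine();

// 1. Open a session backed by pooled containers
const sessionId = await engine.createSession({
  strategy: ContainerStrategy.POOL,
  containerConfig: { image: 'node:18-alpine' }
});

// 2. Run a snippet and inspect the result
const result = await engine.executeCode(sessionId, {
  language: 'javascript',
  code: 'console.log(21 * 2);'
});
console.log(result.stdout.trim(), result.exitCode); // "42" 0

// 3. Inspect session state, then release its resources
const info = await engine.getSessionInfo(sessionId);
console.log(info.isActive, info.containerHistory.length);

await engine.cleanupSession(sessionId);
```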
- `SessionConfig` – chooses a `ContainerStrategy` (`PER_EXECUTION`, `POOL`, `PER_SESSION`) and passes a `containerConfig` (image, env, mounts, limits).
- `ExecutionOptions` – language, code snippet, optional dependencies, stream handlers, etc.
- `ExecutionResult` – `{ stdout, stderr, dependencyStdout, dependencyStderr, exitCode, executionTime }`. `dependencyStdout` and `dependencyStderr` capture any output produced while installing the declared `dependencies` (e.g. `npm`, `pip`, `apk`) before your code starts executing. They are empty when no dependency phase was required.
- `SessionInfo` – comprehensive session information including:
```typescript
interface SessionInfo {
  sessionId: string;
  config: SessionConfig;
  currentContainer: {
    container: Docker.Container | undefined;
    meta: ContainerMeta | undefined;
  };
  containerHistory: ContainerMeta[];
  createdAt: Date;
  lastExecutedAt: Date | null;
  isActive: boolean;
}
```
Each container's state is tracked through the `ContainerMeta` interface:
```typescript
interface ContainerMeta {
  sessionId: string;
  depsInstalled: boolean;
  depsChecksum: string | null;
  baselineFiles: Set<string>;
  workspaceDir: string;
  generatedFiles: Set<string>;
  sessionGeneratedFiles: Set<string>;
  isRunning: boolean;
  createdAt: Date;
  lastExecutedAt: Date | null;
  containerId: string;
  imageName: string;     // docker image tag used
  containerName: string; // friendly name assigned by ExecutionEngine
}
```
This metadata provides:
- Dependency installation status and checksums
- File tracking (baseline, generated, session-generated)
- Container state (running/stopped)
- Timestamps (creation, last execution)
- Container identification
- Image name of the container
- Human-friendly container name
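For instance, you can read this metadata back via `getSessionInfo` (a small sketch assuming the `SessionInfo` and `ContainerMeta` shapes shown above):

```typescript
const info = await engine.getSessionInfo(sessionId);
const meta = info.currentContainer.meta;

if (meta) {
  // Dependency cache status for this container
  console.log('Deps installed:', meta.depsInstalled, 'checksum:', meta.depsChecksum);

  // Files produced during this session
  console.log('Generated this session:', [...meta.sessionGeneratedFiles]);

  // Identification
  console.log(`${meta.containerName} (${meta.containerId}) runs ${meta.imageName}`);
}
```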
Factory that exposes the engine as an OpenAI function-calling-friendly tool. It validates parameters with Zod and returns the same `ExecutionResult`.
```typescript
import { createCodeExecutionTool } from 'interpreter-tools';

const { codeExecutionTool } = createCodeExecutionTool();
const { stdout } = await codeExecutionTool.execute({
  language: 'python',
  code: 'print("Hello")'
});
```
```typescript
LanguageRegistry.get(name): LanguageConfig | undefined
LanguageRegistry.register(config: LanguageConfig): void
LanguageRegistry.names(): string[]
```
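For example, you might guard a registration so it only happens once; a small sketch built on the three methods above (`rubyConfig` is the config object from the Ruby section earlier):

```typescript
import { LanguageRegistry } from 'interpreter-tools';

// Register Ruby only if no config for it exists yet
if (!LanguageRegistry.get('ruby')) {
  LanguageRegistry.register(rubyConfig);
}

console.log(LanguageRegistry.names()); // e.g. ['javascript', 'python', 'shell', 'ruby']
```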
`LanguageConfig` fields:
```typescript
interface LanguageConfig {
  language: string;       // identifier
  defaultImage: string;   // docker image
  codeFilename: string;   // filename inside /workspace for inline code
  prepareFiles(options, dir): void; // write code + metadata into temp dir
  buildInlineCommand(depsInstalled): string[];
  buildRunAppCommand(entry, depsInstalled): string[];
  installDependencies?(container, options): Promise<void>; // optional pre-exec hook
}
```
- Create a `LanguageConfig` object (see the Ruby example).
- Call `LanguageRegistry.register(config)` once at startup.
- Provide a suitable Docker image that has the runtime installed.

No changes to `ExecutionEngine` are required.
| Strategy | Description | When to use |
|---|---|---|
| `PER_EXECUTION` | New container per snippet; removed immediately. | Maximum isolation; slowest. |
| `POOL` | Containers are pooled per session and reused—workspace is wiped between runs. | Best latency/resource trade-off for chat bots. |
| `PER_SESSION` | One dedicated container for the whole session; not pooled. | Long-running interactive notebooks. |
> **Note**
> `ContainerStrategy.POOL` is optimised for fast startup: it re-uses pre-warmed containers and therefore always clears `/workspace` between executions. The `workspaceSharing: "shared"` option is not available in this mode.

If you need persistent state you can either:
- switch to `PER_SESSION` (keeps the internal workspace), or
- mount a host directory via `containerConfig.mounts` to share specific data/files across pooled runs (sketched below).
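A rough sketch of the second option, assuming a hypothetical host directory `./data` that should survive pooled runs:

```typescript
import * as path from 'path';
import { ExecutionEngine, ContainerStrategy } from 'interpreter-tools';

const engine = new ExecutionEngine();

// Pooled containers for low latency, plus a host mount for persistent files
const sessionId = await engine.createSession({
  strategy: ContainerStrategy.POOL,
  containerConfig: {
    image: 'python:3.11-alpine',
    mounts: [
      {
        type: 'directory',
        source: path.resolve(__dirname, 'data'), // host path, survives workspace wipes
        target: '/workspace/data'                // visible to every pooled run
      }
    ]
  }
});
```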
`containerConfig` accepts:
```typescript
{
  image?: string;                        // overrides language default
  mounts?: ContainerMount[];             // { type: 'directory' | 'tmpfs', source, target }
  environment?: Record<string, string>;
  cpuLimit?: string;                     // e.g. '0.5'
  memoryLimit?: string;                  // e.g. '512m'
}
```
`ExecutionOptions` lets you override CPU and memory for a single run:
```javascript
await engine.executeCode(id, {
  language: 'python',
  code: 'print("hi")',
  cpuLimit: '0.5',    // half a CPU core
  memoryLimit: '256m' // 256 MB RAM
});
```
Under the hood the engine calls `container.update({ CpuPeriod, CpuQuota, Memory })` just before execution, so the limits apply even when pooling.
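For intuition, here is how a fractional `cpuLimit` plausibly maps onto Docker's CFS fields; the exact constants are an assumption for illustration, not the engine's verified source:

```typescript
// Docker's CFS scheduler: the container may run CpuQuota µs out of every CpuPeriod µs.
const CPU_PERIOD_US = 100_000; // Docker's default period (100 ms)

function toUpdatePayload(cpuLimit: string, memoryLimit: string) {
  return {
    CpuPeriod: CPU_PERIOD_US,
    // '0.5' -> 50_000 µs per 100_000 µs window, i.e. half a core
    CpuQuota: Math.round(parseFloat(cpuLimit) * CPU_PERIOD_US),
    // '256m' -> raw byte count, as Docker expects here
    Memory: parseInt(memoryLimit, 10) * 1024 * 1024
  };
}

console.log(toUpdatePayload('0.5', '256m'));
// { CpuPeriod: 100000, CpuQuota: 50000, Memory: 268435456 }
```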
Pass `streamOutput: { stdout?, stderr?, dependencyStdout?, dependencyStderr? }` in `ExecutionOptions` to receive data chunks in real time while the process runs.
```javascript
await engine.executeCode(id, {
  language: 'shell',
  code: 'for i in 1 2 3; do echo $i; sleep 1; done',
  streamOutput: {
    stdout: (d) => process.stdout.write(d)
  }
});
```
`dependencyStdout` / `dependencyStderr` fire during the dependency-installation phase (e.g. when `pip install` or `npm install` runs), before the user code starts. This lets you surface progress logs or errors related to package installation separately from your program's own output.
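A small sketch combining both stream families (the `dependencies` option is the one listed under `ExecutionOptions` above; its exact shape as a plain package list is an assumption):

```typescript
await engine.executeCode(id, {
  language: 'python',
  code: 'import requests; print(requests.__version__)',
  dependencies: ['requests'], // assumed shape: a list of package names
  streamOutput: {
    // Installation progress, e.g. pip's resolve/download logs
    dependencyStdout: (d) => process.stdout.write(`[deps] ${d}`),
    dependencyStderr: (d) => process.stderr.write(`[deps] ${d}`),
    // The program's own output, once deps are ready
    stdout: (d) => process.stdout.write(d)
  }
});
```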
Sometimes your code needs additional assets (datasets, JSON files, images, etc.). There are two primary ways to make them available inside the container:
- Mount a host directory – supply a `mounts` array in `containerConfig` when you create a session. This is the easiest way to share many files or large directories.

  ```javascript
  const sessionId = await engine.createSession({
    strategy: ContainerStrategy.PER_EXECUTION,
    containerConfig: {
      image: 'python:3.11-alpine',
      mounts: [
        {
          type: 'directory',                         // always "directory" for now
          source: path.resolve(__dirname, 'assets'), // host path
          target: '/workspace/assets'                // inside the container
        }
      ]
    }
  });
  ```
- Programmatically copy / create individual files – use one of the helper methods that work after you have a session:

  ```javascript
  // (a) copy an existing file from the host
  await engine.copyFileIntoWorkspace(sessionId, './input/data.json', 'data.json');

  // (b) create a new file from a base64-encoded string
  await engine.addFileFromBase64(
    sessionId,
    'notes/hello.txt',
    Buffer.from('hi there').toString('base64')
  );
  ```
Both helpers write directly to `/workspace`, so your script can reference the files with just the relative path.
Interpreter Tools automatically tracks new files created in `/workspace` during an execution:
- `ExecutionResult.generatedFiles` – list of absolute paths to the files created in that run.
- `ExecutionResult.workspaceDir` – host path of the temporary workspace directory.
Example:
```javascript
const result = await engine.executeCode(sessionId, {
  language: 'python',
  code: 'with open("report.txt", "w") as f:\n    f.write("done")',
});

console.log('New files:', result.generatedFiles);
```
You can retrieve file contents with:
```javascript
const pngBuffer = await engine.readFileBinary(sessionId, 'charts/plot.png');
fs.writeFileSync('plot.png', pngBuffer);
```
By default, calling `engine.cleanupSession()` or `engine.cleanup()` removes the containers and their workspaces. Pass `true` to keep only the files that were detected as generated:

```javascript
await engine.cleanupSession(sessionId, /* keepGenerated = */ true);
```
All non-generated files are removed; the generated ones stay on disk so you can move or process them later. If you mounted a host directory, the files are already on the host and no special flag is needed.
Happy hacking!
MIT