# Pokémon Red, fully controlled by an LLM 🤖🎮
This repo contains a minimal, hackable agent that teaches large language models to play Pokémon Red inside the PyBoy Game Boy emulator. It is forked from the excellent portalcorp/ClaudePlaysPokemon and extended with the OpenAI Responses API, so it can run the o3 and o4-mini models alongside Anthropic Claude. Anthropic remains the default provider (see the `--provider` flag below).
Project by Lander Media / Steve Moraco. Initial agent code by o4‑mini.
Features:

- Declarative function-calling interface – the model calls the tools `press_buttons` and `navigate_to` (a path-finding helper, enabled by default).
- Screenshot-based gameplay – what the model “sees” is precisely what is on the screen, delivered as a PNG each step (hex-encoded over WebSocket).
- FastAPI + WebSockets live UI – watch the game, pause, resume, load save states, and inspect the model’s thoughts in real time at `http://localhost:<port>`.
- Automatic log folder per run (frames, model messages, structured game log).
- Context summarisation to keep the conversation within token limits.
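For illustration, the two tools might be declared in Anthropic's tool-use format roughly as below. This is only a sketch: the exact tool names come from the list above, but the parameter names and schemas are assumptions, not the repo's actual definitions.

```python
# Hypothetical tool definitions in Anthropic tool-use format.
# Parameter names ("buttons", "row", "col") are illustrative assumptions.
PRESS_BUTTONS_TOOL = {
    "name": "press_buttons",
    "description": "Press a sequence of Game Boy buttons.",
    "input_schema": {
        "type": "object",
        "properties": {
            "buttons": {
                "type": "array",
                "items": {
                    "type": "string",
                    "enum": ["a", "b", "start", "select",
                             "up", "down", "left", "right"],
                },
            }
        },
        "required": ["buttons"],
    },
}

NAVIGATE_TO_TOOL = {
    "name": "navigate_to",
    "description": "Path-find to a walkable tile on the current map.",
    "input_schema": {
        "type": "object",
        "properties": {
            "row": {"type": "integer"},
            "col": {"type": "integer"},
        },
        "required": ["row", "col"],
    },
}
```

The schemas are what the model sees; the agent dispatches on the returned tool name and arguments each step.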
1. Clone this repository:

   ```bash
   git clone <repo-url>
   cd <repo-directory>
   ```

2. Install Python dependencies (Python ≥3.10 recommended):

   ```bash
   pip install -r requirements.txt
   ```

3. Provide an API key for your preferred provider:

   - Anthropic (default):

     ```bash
     export ANTHROPIC_API_KEY="sk-ant-…"
     ```

   - OpenAI (when running with `--provider openai`):

     ```bash
     export OPENAI_API_KEY="sk-openai-…"
     ```

4. Place a Pokémon Red ROM (`pokemon.gb`) in the project root (or point to it with `--rom`).
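As a quick sanity check of the steps above, a hypothetical pre-flight helper (not part of the repo) could verify the ROM and API key before launch:

```python
import os
from pathlib import Path

# Hypothetical pre-flight check mirroring the setup steps above
# (illustrative only; not part of the repo).
def preflight(rom: str = "pokemon.gb", provider: str = "anthropic") -> list[str]:
    problems = []
    if not Path(rom).is_file():
        problems.append(f"ROM not found: {rom}")
    key = "ANTHROPIC_API_KEY" if provider == "anthropic" else "OPENAI_API_KEY"
    if not os.environ.get(key):
        problems.append(f"{key} is not set")
    return problems
```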
The entry point is `main.py`. It both spins up a FastAPI server and starts the agent. All interaction happens through the web UI – no separate headless mode is needed.

```bash
# Quick start – Anthropic Sonnet playing 1 000 000 steps (~10 weeks), UI on port 3000
python main.py --rom pokemon.gb --steps 1000000

# Use OpenAI o4-mini instead
python main.py --provider openai --model o4-mini
```
Key flags:

- `--rom <file.gb>` – path to the Pokémon Red ROM (default: `pokemon.gb`)
- `--steps <N>` – maximum steps to execute (the agent can be paused / resumed); default `1_000_000` (roughly 10 weeks of play)
- `--port <N>` – port for the FastAPI server / web UI (default: 3000)
- `--save-state <file.state>` – load a PyBoy save state at startup
- `--overlay` – draw the walkable-tile overlay inside the game feed
- `--provider anthropic|openai` – choose the LLM backend (default: `anthropic`)
- `--model <name>` – override the default model for the chosen provider
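The flag list above can be mirrored with a minimal `argparse` parser. This is a sketch of the interface as described, not the repo's actual parser in `main.py`, which may declare the options differently:

```python
import argparse

# Sketch of a CLI parser matching the flags documented above.
# Defaults are taken from the flag descriptions, not from the repo's code.
def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="LLM plays Pokémon Red")
    p.add_argument("--rom", default="pokemon.gb")
    p.add_argument("--steps", type=int, default=1_000_000)
    p.add_argument("--port", type=int, default=3000)
    p.add_argument("--save-state")
    p.add_argument("--overlay", action="store_true")
    p.add_argument("--provider", choices=["anthropic", "openai"],
                   default="anthropic")
    p.add_argument("--model")  # None means "use the provider's default"
    return p
```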
Open `http://localhost:<port>` in a browser to see:
- Game Screen – live 30 FPS video
- Assistant Messages – the model’s tool calls & high‑level reasoning
- Context History – compressed conversation so far
- Controls – Run, Pause, Stop, Load Save State
Each run writes to `logs/run_<timestamp>/`:

- `frames/` – PNG screenshots per step
- `claude_messages.log` – model response logs
- `game.log` – emulator and agent logs
Inside each run folder you will also find `history_saves/`, containing periodic PyBoy `.state` snapshots. These are written automatically:

- whenever the agent summarises the running conversation (~every 50 steps), and
- immediately after the player transitions between major areas (e.g. moves to another floor or map).

You can resume from any snapshot by either:

- supplying `--save-state <file>` on the command line, or
- clicking Load Save in the web UI and selecting a `.state` file.
Global defaults live in `config.py`:

- `MODEL_NAME` – default Anthropic model (CLI `--model` overrides)
- `TEMPERATURE` – sampling temperature passed to the LLM
- `MAX_TOKENS` – hard limit on response size
- `USE_NAVIGATOR` – toggle the higher-level `navigate_to` tool (default: `True`)
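A sketch of what `config.py` might contain. The constant names come from the list above, but every value shown here is a placeholder; check the actual file for the real defaults:

```python
# Illustrative config.py sketch – names match the documented defaults,
# values are placeholders, not the repo's actual settings.
MODEL_NAME = "claude-3-7-sonnet-latest"  # assumed; see config.py for the real default
TEMPERATURE = 1.0                        # sampling temperature passed to the LLM
MAX_TOKENS = 4096                        # hard limit on response size
USE_NAVIGATOR = True                     # enables the navigate_to tool
```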
PRs welcome! Please open issues or pull requests 😊