An AI agent that uses the ReAct (Reasoning and Acting) methodology to complete complex research tasks by browsing the web, analyzing information, and writing code.
- ReAct Methodology: Implements the original ReAct paradigm from the paper "ReAct: Synergizing Reasoning and Acting in Language Models"
- Task-Agnostic Design: No hardcoded logic for specific tasks - the agent intelligently adapts to any research question
- Extensible Tool System: Easy-to-extend architecture for adding new capabilities
- Multiple Tools:
  - Web Search: Google search via Serper.dev API
  - Web Scraping: Fetch and parse content from any URL
  - Code Execution: Run Python code for data analysis and processing
  - File Operations: Read and write files for data persistence
- Powered by Gemini 2.0: Uses Google's Gemini 2.0 Flash model for reasoning and decision-making
This project implements the ReAct (Reasoning and Acting) paradigm for AI agents, as described in:
*ReAct: Synergizing Reasoning and Acting in Language Models.* Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao. ICLR 2023.
The ReAct framework enables language models to generate both reasoning traces and task-specific actions in an interleaved manner, leading to improved performance on complex tasks requiring planning and information gathering.
The agent follows a simple but powerful loop:
- Thought: The agent reasons about the current state and what action to take next
- Action: The agent selects and executes a tool with specific parameters
- Observation: The agent receives and processes the result
- Repeat: The cycle continues until the task is complete
This approach mirrors human problem-solving: we think, act, observe results, and adjust our strategy accordingly.
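A minimal sketch of that loop in Python is shown below. It is illustrative only: the `llm.complete` method, the `tools` dict, and the regex-based action parsing are assumptions made for the example, not this project's actual interfaces.

```python
import json
import re

def react_loop(task: str, llm, tools: dict, max_iterations: int = 15) -> str:
    """Illustrative Thought -> Action -> Observation loop (not the project's real code)."""
    transcript = f"Task: {task}\n"
    for _ in range(max_iterations):
        response = llm.complete(transcript)  # model emits Thought + Action (or Final Answer)
        transcript += response

        if "Final Answer:" in response:
            return response.split("Final Answer:", 1)[1].strip()

        # Parse "Action: <tool>" and "Action Input: {...}" from the model output
        name = re.search(r"Action:\s*(\w+)", response)
        args = re.search(r"Action Input:\s*(\{.*\})", response, re.S)
        if not name:
            continue  # nothing actionable; ask the model again
        params = json.loads(args.group(1)) if args else {}

        # Execute the chosen tool and feed the result back as an Observation
        observation = tools[name.group(1)](**params)
        transcript += f"\nObservation: {observation}\n"

    return "Best-effort answer: iteration limit reached."
```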
```
web_research_agent/
├── webresearch/
│   ├── __init__.py
│   ├── agent.py             # Agent class
│   ├── llm.py               # Language model interface
│   ├── config.py            # Configuration management
│   └── tools/
│       ├── __init__.py
│       ├── base.py          # Base class for tools
│       ├── search.py        # Web search tool
│       ├── scrape.py        # Web scraping tool
│       ├── code_executor.py # Code execution tool
│       └── file_ops.py      # File operations tool
├── main.py                  # Entry point script
├── cli.py                   # Command-line interface
├── tasks.txt                # Example tasks
├── .env.example             # Environment variables template
└── requirements.txt         # Python dependencies
```
Install directly from PyPI:
```bash
pip install web-research-agent
```

Then run the interactive CLI:

```bash
webresearch
```

The first time you run it, you'll be prompted to enter your API keys. These will be securely stored in `~/.webresearch/config.env`.
If you get a `'webresearch' is not recognized` error on Windows, the Python Scripts folder isn't in your PATH. Here are some solutions:

Quick Fix (Current Session Only):

```powershell
# Add to PATH temporarily
$env:Path += ";$env:APPDATA\Python\Python313\Scripts"
webresearch
```

Permanent Fix (Recommended):

- Open PowerShell as Administrator
- Run:

  ```powershell
  [Environment]::SetEnvironmentVariable(
      "Path",
      [Environment]::GetEnvironmentVariable("Path", "User") + ";$env:APPDATA\Python\Python313\Scripts",
      "User"
  )
  ```

- Restart your terminal
- Run `webresearch`

Alternative (No PATH needed):

```powershell
python -m cli
```

On Linux/Mac this usually works immediately, but if needed:

```bash
export PATH="$HOME/.local/bin:$PATH"
```
- Clone the repository:

  ```bash
  git clone https://github.com/ashioyajotham/web_research_agent.git
  cd web_research_agent
  ```

- Install in development mode:

  ```bash
  pip install -e .
  ```

- Run the CLI:

  ```bash
  webresearch
  ```
You'll need:
- Gemini API key: Get yours at Google AI Studio (free tier available)
- Serper API key: Get yours at Serper.dev (free tier: 2,500 searches/month)
The CLI will prompt you for these on first run.
Simply run:
```bash
webresearch
```

You'll see a beautiful interface with options to:
- Run a research query - Ask any research question interactively
- Process tasks from file - Run multiple tasks from a file
- View recent logs - Check execution logs
- Reconfigure API keys - Update your configuration
- Exit - Close the application
For batch processing, you can still use the traditional mode:
```bash
python main.py tasks.txt
```

Options:

- `-o, --output` - Specify output file (default: `results.txt`)
- `-v, --verbose` - Enable detailed logging
This will:
- Read tasks from `tasks.txt` (one task per line, separated by blank lines)
- Process each task using the ReAct agent
- Save results to `results.txt`
- Save execution logs to `logs/agent_<timestamp>.log`
Specify a custom output file:
```bash
python main.py tasks.txt -o my_results.txt
```

Enable detailed debug logging:

```bash
python main.py tasks.txt -v
```

Tasks should be separated by blank lines. Multi-line tasks are supported:
Find the name of the COO of the organization that mediated secret talks between US and Chinese AI companies in Geneva in 2023.
Compile a list of 10 statements made by Joe Biden regarding US-China relations. Each statement must have been made on a separate occasion. Provide a source for each statement.
By what percentage did Volkswagen reduce the sum of their Scope 1 and Scope 2 greenhouse gas emissions in 2023 compared to 2021?
Edit `.env` to customize agent behavior:

| Variable | Default | Description |
|---|---|---|
| `MAX_ITERATIONS` | 15 | Maximum reasoning steps before timeout |
| `MAX_TOOL_OUTPUT_LENGTH` | 5000 | Maximum characters from tool outputs |
| `TEMPERATURE` | 0.1 | LLM temperature (0.0-1.0, lower = more focused) |
| `MODEL_NAME` | `gemini-2.0-flash-exp` | Gemini model to use |
| `WEB_REQUEST_TIMEOUT` | 30 | Timeout for web requests (seconds) |
| `CODE_EXECUTION_TIMEOUT` | 60 | Timeout for code execution (seconds) |
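For example, a `.env` tuned for longer research runs might override a couple of defaults like this (only variables from the table above are shown; API key entries follow whatever names `.env.example` uses):

```ini
# .env - override only what you need; unset variables keep the defaults above
MAX_ITERATIONS=25
MAX_TOOL_OUTPUT_LENGTH=10000
TEMPERATURE=0.1
MODEL_NAME=gemini-2.0-flash-exp
```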
The agent can handle a variety of research tasks:
- Information Gathering: Compile statements, find specific facts, locate documents
- Data Analysis: Download datasets, process CSV/JSON files, perform calculations
- Multi-Step Research: Tasks requiring multiple sources and synthesis
- Verification: Cross-reference information from multiple sources
See tasks.txt for examples of representative tasks.
The agent is designed to be easily extensible. To add a new tool:
- Create a new tool class in `tools/` inheriting from `Tool`:

  ```python
  from tools.base import Tool

  class MyNewTool(Tool):
      @property
      def name(self) -> str:
          return "my_tool"

      @property
      def description(self) -> str:
          return """Description of what your tool does and its parameters."""

      def execute(self, **kwargs) -> str:
          # Your tool logic here
          return "Tool result"
  ```

- Register the tool in `main.py`:

  ```python
  tool_manager.register_tool(MyNewTool())
  ```

That's it! The agent will automatically discover and use your new tool.
The agent follows this pattern for each iteration:
```
Thought: I need to search for information about X
Action: search
Action Input: {"query": "X"}
Observation: [Search results appear here]

Thought: Now I need to read the first result
Action: scrape
Action Input: {"url": "https://..."}
Observation: [Page content appears here]

Thought: I have enough information to answer
Final Answer: [Complete answer with sources]
```
The agent autonomously decides which tools to use based on:
- The task requirements
- Current context and previous observations
- Tool descriptions provided to the LLM
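To make that last point concrete, here is a hypothetical sketch of how registered tool descriptions could be rendered into the prompt the LLM sees; the project's real prompt construction may differ.

```python
def build_system_prompt(tools: list) -> str:
    """Render each registered tool's name and description into the agent prompt (illustrative)."""
    tool_lines = "\n".join(f"- {t.name}: {t.description}" for t in tools)
    return (
        "You are a research agent. Work in interleaved Thought / Action / Observation steps.\n"
        f"Available tools:\n{tool_lines}\n"
        'Reply with "Action: <tool>" and "Action Input: <JSON arguments>", '
        'or "Final Answer: <answer>" when the task is complete.'
    )
```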
- Network timeouts and errors are caught and reported
- Failed tool executions return error messages to the agent
- Maximum iteration limit prevents infinite loops
- Best-effort answers provided if task cannot be completed
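As a rough illustration of how failed tool executions can be returned to the agent as observations rather than crashing the loop (the actual implementation may differ):

```python
def safe_execute(tool, **kwargs) -> str:
    """Run a tool and turn any failure into an observation string (illustrative)."""
    try:
        return tool.execute(**kwargs)
    except Exception as exc:  # e.g. network timeouts, invalid arguments
        # The error text goes back into the agent's context, so it can
        # reason about retrying, adjusting parameters, or switching tools.
        return f"Error executing {tool.name}: {exc}"
```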
API Key Errors:
- Ensure the `.env` file exists and contains valid API keys
- Check that keys are not wrapped in quotes
Import Errors:
- Run `pip install -r requirements.txt` to install all dependencies
- Ensure you're using Python 3.8 or higher
Timeout Errors:
- Increase timeout values in `.env`
- Some tasks may require more iterations - adjust `MAX_ITERATIONS`
Empty Results:
- Check logs in the `logs/` directory for detailed error information
- Verify network connectivity for web requests
Run with the `-v` flag to see detailed execution logs:

```bash
python main.py tasks.txt -v
```

- Adjust iterations: Complex tasks may need more than 15 iterations
- Temperature tuning: Lower temperature (0.0-0.2) for focused research, higher (0.5-0.7) for creative tasks
- Output length: Increase `MAX_TOOL_OUTPUT_LENGTH` for tasks requiring full document analysis
- Model selection: Use `gemini-2.0-flash-exp` for speed or `gemini-1.5-pro` for complex reasoning
- Web content behind paywalls or login walls cannot be accessed
- PDF parsing is limited (URLs are noted for manual download)
- Code execution is sandboxed but runs in the local environment
- Some websites may block scraping attempts
- Rate limits apply to API calls (Serper free tier: 2,500 searches/month)
The codebase emphasizes:
- Modularity: Each component has a single responsibility
- Extensibility: New tools can be added without modifying core logic
- Documentation: Comprehensive docstrings and comments
- Error Handling: Graceful degradation and informative error messages
- Logging: Detailed execution traces for debugging
- Type Hints: Clear interfaces using Python type annotations
We welcome contributions! Please see CONTRIBUTING.md for detailed guidelines.
For bug reports and feature requests, please open an issue on GitHub.
This project is licensed under the MIT License - see the LICENSE file for details.
TL;DR: You are free to use, modify, and distribute this software, even for commercial purposes, as long as you include the original copyright notice.