Paper link: [From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval](https://arxiv.org/abs/2505.23059)
State Machine Reasoning (SMR) is a lightweight framework that replaces token-level Chain-of-Thought (CoT) traces with three discrete, IR-focused actions—REFINE, RERANK, and STOP—to prevent overthinking in retrieval.
Token-based CoT often leads to:

- **Redundancy**: repeated paraphrasing without new retrieval gains.
- **Drift**: compression may alter the query's original intent.

Figure 1: (a) Standard CoT, (b) Compressed CoT, (c) SMR.

Two properties of token-level CoT drive these failures:

- **Token-Level Decoding**: LLMs generate each token in sequence, causing repeated or off-track reasoning.
- **No Early Validation**: CoT lacks a mechanism to check whether a reasoning step actually improves retrieval.
SMR treats retrieval as a sequence of state transitions over states (q_t, D_t):

- q_t: the current query
- D_t: the current top-k documents
At each step, SMR's prompt-based LLM selects one of three actions (a code sketch of the loop follows Figure 2):
- REFINE: Update the query to add or clarify terms.
- RERANK: Reorder retrieved documents to surface more relevant passages.
- STOP: End reasoning when no further improvements occur.
Figure 2: SMR’s state transitions (REFINE, RERANK, STOP).
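The loop below is a minimal sketch of this state machine, not the repository's actual API: the retriever and the three LLM-backed actions are passed in as hypothetical callables (`retrieve`, `select_action`, `refine`, `rerank`) so the control flow stands on its own.

```python
def smr(query, retrieve, select_action, refine, rerank, max_steps=16):
    """Sketch of the SMR loop over (q, D) states; all callables are hypothetical."""
    q, docs = query, retrieve(query)        # initial state (q_0, D_0)
    for _ in range(max_steps):              # step limit bounds inference cost
        action = select_action(q, docs)     # LLM picks REFINE / RERANK / STOP
        if action == "REFINE":
            q = refine(q, docs)             # clarify or expand the query...
            docs = retrieve(q)              # ...then retrieve with the new query
        elif action == "RERANK":
            docs = rerank(q, docs)          # reorder the current top-k documents
        else:                               # STOP: no further improvement expected
            break
    return q, docs
```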
- **Fewer Tokens**
  - Actions operate on states, not long token streams.
  - Reasoning stops as soon as no new documents or query changes appear.
- **Effective Steps**
  - Each REFINE or RERANK directly boosts retrieval quality.
  - No task-specific tuning needed; SMR works with any retriever or LLM.
- **Clear Control**
  - The LLM explains each action choice for interpretability.
  - A maximum step limit (e.g., 16) bounds inference cost.
SMR cuts redundant and off-track reasoning by modeling retrieval as a state machine over (q, D) states, guided by explicit IR actions. This yields better retrieval and lower token usage than traditional CoT methods.
- Create a virtual environment

  ```bash
  conda create -n smr -c anaconda python=3.12.2
  conda activate smr
  conda install -c conda-forge openjdk=21 maven -y
  pip install -r requirements.txt
  ```
- Install Ollama

  ```bash
  apt-get update
  apt-get install pciutils udev lshw
  curl -fsSL https://ollama.com/install.sh | sh
  ollama serve
  ```
- Pull required models (see the sanity check below)

  ```bash
  # e.g. ollama pull qwen2.5:32b-instruct-q4_K_M
  ollama pull {your_model}
  ```
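Once a model is pulled, a quick way to confirm the Ollama server responds is the official `ollama` Python client (`pip install ollama`); the model tag here is just the example from above.

```python
# Optional sanity check that the Ollama server and pulled model respond.
import ollama

reply = ollama.chat(
    model="qwen2.5:32b-instruct-q4_K_M",
    messages=[{"role": "user", "content": "Reply with a single word: OK"}],
)
print(reply["message"]["content"])
```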
```bash
bash main.sh
```
This command downloads the BRIGHT dataset and performs inference from start to finish.
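If you want to inspect the data outside the pipeline, below is a sketch of loading BRIGHT with Hugging Face `datasets`; the `xlangai/BRIGHT` path and the `examples`/`documents` configs follow the public dataset card, so treat them as assumptions rather than part of this repository.

```python
# Sketch: peek at one BRIGHT task manually (main.sh fetches the data for you).
from datasets import load_dataset

examples = load_dataset("xlangai/BRIGHT", "examples", split="biology")
documents = load_dataset("xlangai/BRIGHT", "documents", split="biology")
print(len(examples), len(documents))
```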
```bash
MODEL=bm25                                  # 1st-stage retriever
cache_dir=cache                             # path to cache directory
agent=qwen2.5:32b-instruct-q4_K_M           # Ollama LLM model
agent_tokenizer=Qwen/Qwen2.5-32B-Instruct   # Hugging Face tokenizer path

for TASK in biology earth_science economics psychology robotics stackoverflow sustainable_living leetcode pony aops theoremqa_theorems theoremqa_questions; do
    python main.py \
        --task $TASK \
        --model $MODEL \
        --output_dir output/${MODEL} \
        --cache_dir ${cache_dir} \
        --agent $agent \
        --agent_tokenizer $agent_tokenizer
done
```
- MODEL: First-stage retriever to use
- cache_dir: Directory for caching data
- agent: Ollama LLM model identifier
- agent_tokenizer: Hugging Face tokenizer path, used to count tokens (see the sketch below)
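As a sketch of why `agent_tokenizer` matters, the snippet below counts the tokens a sample action string would consume using the Hugging Face tokenizer; the string itself is purely illustrative.

```python
# Count tokens in an agent output with the configured HF tokenizer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
print(len(tok.encode("REFINE: narrow the query to 'C4 photosynthesis pathway'")))
```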
If you find our work useful, please consider citing our paper:
```bibtex
@article{lee2025state,
  title={From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval},
  author={Lee, Dohyeon and Jeong, Yeonseok and Hwang, Seung-won},
  journal={arXiv preprint arXiv:2505.23059},
  year={2025}
}
```
We also referenced the ReasonIR code from facebookresearch/ReasonIR.