The AI bug hunter and slop cleaner is a set of personality-driven AI prompts designed to turn questionable code into production-grade software. The prompts in this repository offer a systematic approach to detecting, analyzing, and eliminating bugs by applying multiple expert perspectives at once. After you have developed some software, paste a prompt into your preferred AI coding tool, such as Claude Code or Codex, to analyze the code and test it for bugs. There are two prompts in this repository: one that uses only AI reasoning to find bugs, and one that combines AI reasoning with external tools. Both prompts consume many tokens, but they aim for comprehensive coverage of the code and its bugs.
AI-generated codebases often suffer from:
- ❌ Untested edge cases hiding everywhere
- ❌ Variables named 'data', 'temp', 'x'
- ❌ Float arithmetic for financial calculations
- ❌ Race conditions waiting to explode
- ❌ SQL injection vulnerabilities lurking
- ❌ Memory leaks slowly killing performance
- ❌ Error handling that doesn't handle errors
- ❌ 25% test coverage claimed as "good enough"
- ❌ Silent errors, logical bugs
To address these problems, the prompts direct the AI to analyze the code through specific analytical lenses.
Instead of generic bug detection, they simulate 10 expert perspectives, each of which assesses the code from a different angle.
Personality | Focus Area | Catches |
---|---|---|
🧮 Senior Mathematician | Calculations & Algorithms | Floating point errors, overflow, complexity issues |
🔒 Security Paranoid | Input/Output Security | Injection attacks, authentication bypass, crypto failures |
🏗️ Systems Architect | Design & Structure | Coupling issues, SOLID violations, scaling problems |
⚡ Concurrency Specialist | Threading & Async | Race conditions, deadlocks, synchronization bugs |
💾 Memory Surgeon | Resource Management | Memory leaks, unclosed handles, buffer overflows |
🚀 Performance Optimizer | Speed & Efficiency | O(n²) crimes, cache misses, unnecessary allocations |
📊 Data Scientist | ML & Data Processing | Data leakage, distribution issues, validation gaps |
💰 Financial Engineer | Money & Calculations | Precision loss, rounding errors, compliance violations |
🌐 Distributed Systems Expert | Network & Scale | Consistency issues, partition failures, timeout bugs |
🧪 Testing Philosopher | Test Quality | Missing coverage, non-deterministic tests, bad mocks |
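As a rough illustration of the multi-perspective idea, the sketch below builds one review prompt per personality. The `PERSONAS` dictionary and `build_review_prompts` function are hypothetical names introduced here; the actual prompts in this repository may be structured quite differently. Only three rows of the table are included to keep it short.

```python
# Hypothetical subset of the personality table: name -> focus area.
PERSONAS = {
    "Senior Mathematician": "Calculations & Algorithms",
    "Security Paranoid": "Input/Output Security",
    "Concurrency Specialist": "Threading & Async",
}


def build_review_prompts(code, personas=PERSONAS):
    """Return one review prompt per persona, each scoped to its focus area."""
    prompts = []
    for name, focus in personas.items():
        prompts.append(
            f"You are a {name}. Focus on: {focus}.\n"
            "Report every bug you find in the code below.\n\n"
            f"{code}"
        )
    return prompts
```

Each prompt could then be sent to the AI tool separately, or the prompts could be concatenated into a single request; which works better is an open design choice.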
- Auto-detects repository languages
- Deploys appropriate tools (ESLint, Pylint, SpotBugs, etc.)
- Configures strictest possible settings
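Language auto-detection could be approximated by counting file extensions, as in the minimal sketch below. The extension map and both function names are assumptions made for illustration, not the mechanism the prompts actually use; the linter names are a subset of the tool table in this README.

```python
from collections import Counter
from pathlib import Path

# Hypothetical mapping from file extension to language; extend as needed.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".js": "JavaScript/TS",
    ".ts": "JavaScript/TS",
    ".java": "Java",
    ".go": "Go",
    ".rs": "Rust",
    ".c": "C/C++",
    ".cpp": "C/C++",
    ".h": "C/C++",
}

# Linter names drawn from the tool table in this README.
LANGUAGE_TO_LINTERS = {
    "Python": ["pylint", "flake8", "mypy"],
    "JavaScript/TS": ["eslint"],
    "Java": ["SpotBugs", "PMD"],
    "Go": ["golangci-lint", "staticcheck"],
    "Rust": ["clippy", "rustfmt"],
    "C/C++": ["clang-tidy", "cppcheck"],
}


def detect_languages(repo_root):
    """Count source files per language under repo_root, by file extension."""
    counts = Counter()
    for path in Path(repo_root).rglob("*"):
        if path.is_file():
            language = EXTENSION_TO_LANGUAGE.get(path.suffix.lower())
            if language:
                counts[language] += 1
    return counts


def select_linters(counts):
    """Return the linters to run for each detected language."""
    return {language: LANGUAGE_TO_LINTERS[language] for language in counts}
```

Extension counting is crude (it ignores vendored code and generated files), so a real setup would likely also respect ignore files.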
Every bug gets complete documentation:
- WHAT was detected (exact pattern)
- WHY it's wrong (mathematical/logical proof)
- HOW it fails (concrete example)
- WHAT breaks (impact analysis)
- HOW to fix (exact solution)
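One way to enforce this five-part structure is to model a report as a record and reject incomplete ones. The `BugReport` class and its field names below are hypothetical, chosen only to mirror the list above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BugReport:
    what: str          # WHAT was detected (exact pattern)
    why: str           # WHY it's wrong
    how_it_fails: str  # HOW it fails (concrete example)
    impact: str        # WHAT breaks
    fix: str           # HOW to fix

    def missing_sections(self):
        """Return the names of sections that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

A report with a non-empty `missing_sections()` result could be sent back for completion before it is accepted.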
- Every code path tested
- Every edge case validated
- Every assumption verified
- No exceptions
- CRITICAL: Security vulnerabilities, data loss, auth bypass
- HIGH: Crashes, miscalculations, race conditions
- MEDIUM: Performance issues, missing validation
- LOW: Style issues, documentation gaps
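The four severity levels map naturally onto an ordered enum, which makes triage a simple sort. This is a minimal sketch; the `Severity` and `triage` names and the numeric values are assumptions, and only the ordering matters.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1       # style issues, documentation gaps
    MEDIUM = 2    # performance issues, missing validation
    HIGH = 3      # crashes, miscalculations, race conditions
    CRITICAL = 4  # security vulnerabilities, data loss, auth bypass


def triage(findings):
    """Sort (severity, description) pairs so the most severe come first."""
    return sorted(findings, key=lambda finding: finding[0], reverse=True)
```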
═══════════════════════════════════════════════════════════
BUG REPORT #001
PERSONALITY DETECTING: Financial Engineer + Security Paranoid
SEVERITY: CRITICAL
LOCATION: payment_processor.py:45:calculate_total()
THE BUG: Using float for currency calculations
CHAIN OF THOUGHT:
1. Detected: total = price * 0.1 (float arithmetic)
2. Why Wrong: Floating point imprecision loses pennies
3. Impact: $0.10 + $0.20 = $0.30000000000000004
4. At scale: Millions in rounding errors
CONCRETE FAILURE:
INPUT: [100 transactions of $33.33]
EXPECTED: $3,333.00
ACTUAL: $3,332.99
CUSTOMER IMPACT: Missing penny per calculation
FIX:
BEFORE: total = price * 0.1
AFTER: total = Decimal(str(price)) * Decimal('0.1')  # Decimal(float) would carry over the binary error
TESTS ADDED:
✓ test_currency_precision()
✓ test_rounding_accuracy()
✓ test_large_scale_calculations()
═══════════════════════════════════════════════════════════
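The float-versus-Decimal issue in the sample report can be reproduced in a few lines. In this sketch the function names are hypothetical, prices are assumed to arrive as strings, and rounding half up to whole cents is an assumption; the correct rounding rule depends on the business and regulatory context.

```python
from decimal import Decimal, ROUND_HALF_UP


def fee_float(price):
    """Buggy version: binary floating point cannot represent 0.1 exactly."""
    return price * 0.1


def fee_decimal(price):
    """Compute 10% of a price given as a string such as '33.33'.

    Rounds to whole cents using ROUND_HALF_UP (an assumed rounding policy).
    """
    amount = Decimal(price) * Decimal("0.1")
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(0.1 + 0.2 == 0.3)                                      # False
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
print(Decimal(0.1) == Decimal("0.1"))                        # False
```

The last line shows why constructing a `Decimal` directly from a float does not help: the binary error is already baked into the float before the conversion.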
- Bugs Found per KLOC (target: find them ALL)
- False Positive Rate (target: <5%)
- Path Coverage (requirement: 100%)
- Fix Completeness (requirement: root cause + tests)
- Security Vulnerabilities (requirement: ZERO)
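Two of these metrics are simple ratios. The sketch below shows one plausible way to compute them; the function names are hypothetical, and "false positive rate" is assumed here to mean the share of reported findings that turned out not to be real bugs.

```python
def bugs_per_kloc(confirmed_bugs, lines_of_code):
    """Confirmed bugs per 1,000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return confirmed_bugs / (lines_of_code / 1000)


def false_positive_rate(reported, confirmed):
    """Share of reported findings that were not confirmed as bugs."""
    if reported == 0:
        return 0.0
    return (reported - confirmed) / reported


def meets_false_positive_target(reported, confirmed, target=0.05):
    """Check the <5% target from the metrics list above."""
    return false_positive_rate(reported, confirmed) < target
```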
No code passes without:
- ✅ All critical paths tested
- ✅ All inputs validated
- ✅ All resources managed properly
- ✅ All calculations mathematically verified
- ✅ All security vulnerabilities eliminated
Language | Linters | Security | Performance | Testing |
---|---|---|---|---|
Python | pylint, flake8, mypy | bandit, safety | py-spy, memory_profiler | pytest, hypothesis |
JavaScript/TS | eslint, typescript-eslint | npm audit, snyk | clinic.js, 0x | jest, mocha |
Java | SpotBugs, PMD | OWASP, Fortify | JProfiler, async-profiler | JUnit, Mockito |
Go | golangci-lint, staticcheck | gosec, nancy | pprof, trace | go test, testify |
Rust | clippy, rustfmt | cargo audit | cargo flamegraph | cargo test, proptest |
C/C++ | clang-tidy, cppcheck | flawfinder, RATS | valgrind, perf | gtest, catch2 |
Traditional code review: "Looks good to me" 👍
AI bug hunter:
- "Line 45 has a race condition that occurs when two threads call this method within 3ms"
- "The financial calculation loses precision after 10,000 iterations"
- "This 'temporary' variable is accessed 47 lines later across 3 function calls"
- "Your O(n²) algorithm becomes O(n) with a HashMap"
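The last finding is a common pattern: replacing a nested scan with a hash-based lookup. The duplicate-detection example below is a hypothetical illustration in Python, where `set` plays the role of the HashMap; it assumes the items are hashable, and the O(n) bound holds on average rather than in the worst case.

```python
def has_duplicate_quadratic(items):
    """O(n^2): compare every pair of items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicate_linear(items):
    """O(n) on average: remember items already seen in a hash set."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both functions return the same answer for hashable inputs; the second trades extra memory for the set against the cost of the inner loop.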
"Every variable is guilty until proven innocent. Every code path hides bugs. Every input is malicious. Every assumption is wrong until validated."
This isn't about writing more code—it's about writing correct code.
This tool will find bugs you didn't know existed. It will question code you thought was perfect. It will demand standards you might find excessive. That's the point. Bad code is expensive. Bugs are expensive. Security breaches are expensive. Quality is not negotiable.
MIT - Because good code should be everywhere.
Built by a developer who is tired of debugging production at 3 AM and has seen one too many breaches.