AI Brand Protection Analyst Agent

A semantic brand protection agent powered by Google's Gemini 2.5 Pro model. This tool helps detect fraudulent, malicious, or brand-abusing domains across the internet using LLM-based semantic analysis and customizable analyst personas.

Motivation

After years working with threat intelligence and brand protection products, I've reverse-engineered some truly "creative" evaluation techniques.
One of the most memorable: a major platform that hard-cuts domain feeds, discarding any domain that doesn't start or end with the brand name.

This shortcut is often necessary to reduce the flood of irrelevant full-word matches. For short brand names like "tui" or "otto", however, it comes at a serious cost: thousands of potentially threatening domains are silently lost every week.

Domains like secure-tui-login[.]com, my-tui-booking[.]net or nl-ottoshop[.]nl never reach analyst desks because they fail a basic syntax check.
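To make the problem concrete, here is a minimal sketch of such a start-or-end filter. It is an illustration only, not the code of any actual platform: the function names are hypothetical, the check looks at the first label of the domain only, and `tui-holidays[.]com` is a made-up example of a domain that would pass.

```python
def refang(domain: str) -> str:
    """Turn a defanged domain such as 'example[.]com' back into 'example.com'."""
    return domain.replace("[.]", ".")


def passes_prefix_suffix_filter(domain: str, brand: str) -> bool:
    """Keep a domain only if its first label starts or ends with the brand name.

    Assumption: the first label is the registered name (no subdomains).
    """
    label = refang(domain).lower().split(".")[0]
    return label.startswith(brand) or label.endswith(brand)


# (domain, brand) pairs; the last three come from the examples above.
candidates = [
    ("tui-holidays[.]com", "tui"),
    ("secure-tui-login[.]com", "tui"),
    ("my-tui-booking[.]net", "tui"),
    ("nl-ottoshop[.]nl", "otto"),
]

kept = [d for d, brand in candidates if passes_prefix_suffix_filter(d, brand)]
dropped = [d for d, brand in candidates if not passes_prefix_suffix_filter(d, brand)]

print("kept:", kept)
print("dropped:", dropped)
```

All three example domains end up in `dropped`, even though each contains the brand name in a suspicious context.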

This project was born from the need to:

think semantically, not just syntactically,
detect threats beyond keyword / dictionary matches,
empower analysts with LLM-based reasoning,
recognize internationalized and multilingual domain patterns — domain registrations in localized variations are often missed by static dictionary-based systems.

Features

Semantic Threat Detection — Detects impersonation, phishing, and domain abuse based on brand context on a large scale.
Analyst Modes — Choose between junior, senior, and expert AI analyst personas.
Batch Processing — Efficiently handles large domain datasets via Gemini API.
Structured Output — Saves results with confidence scores, risk levels, and explanations.
Flexible API Key Handling — Use command-line, environment variables, .env, or secure prompt.

Installation

Clone the repo:

git clone https://github.com/PAST2212/brand-protection-analyst-agent.git
cd brand-protection-analyst-agent

Install dependencies:

pip install -r requirements.txt

Generate a free API key for the Gemini 2.5 Pro model here: https://aistudio.google.com/apikey
Add your Gemini API key using one of the following options:

Option 1: via .env file

echo "GEMINI_API_KEY=your_actual_api_key_here" > .env

Option 2: via environment variable

export GEMINI_API_KEY=your_actual_api_key_here

Option 3: pass via CLI (--api-key)
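A rough sketch of how the key sources could be combined is shown below. The lookup order (CLI, environment variable, .env file, then a hidden prompt) is an assumption, and `resolve_api_key` and `read_dotenv_value` are hypothetical helpers written for illustration; the tool's own logic may differ.

```python
import getpass
import os
from pathlib import Path
from typing import Callable, Mapping, Optional


def read_dotenv_value(path: Path, name: str) -> Optional[str]:
    """Return the value of `name` from a simple KEY=VALUE .env file, if present."""
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop optional surrounding quotes; treat an empty value as missing.
            return value.strip().strip('"').strip("'") or None
    return None


def resolve_api_key(
    cli_key: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
    dotenv_path: Path = Path(".env"),
    prompt: Callable[[str], str] = getpass.getpass,
) -> str:
    """Try the CLI value, the environment, the .env file, then a hidden prompt."""
    key = (
        cli_key
        or environ.get("GEMINI_API_KEY")
        or read_dotenv_value(dotenv_path, "GEMINI_API_KEY")
        or prompt("Gemini API key: ")
    )
    if not key:
        raise ValueError("No Gemini API key provided")
    return key
```

Using `getpass` for the prompt keeps the key from being echoed to the terminal.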

Updating

cd brand-protection-analyst-agent
git pull

If you encounter a merge error, reset and pull again (note that this discards local changes to tracked files):

git reset --hard
git pull

Usage

Overview of different available commands

python main.py --help

Examples

# Basic analysis
python main.py --domains tui.txt --brand-name "tui"

# With custom output
python main.py --domains tui.txt --brand-name "tui" --output tui_analysis.csv

# Advanced analysis
python main.py --domains tui.txt --brand-name "tui" \
  --company-name "TUI AG" \
  --industry "Travel & Tourism" \
  --description "TUI AG (trading as TUI Group) is a German multinational leisure, travel and tourism company; it is the largest such company in the world. It fully or partially owns several travel agencies, hotel chains, cruise lines and retail shops as well as five European airlines. TUI is an acronym for Touristik Union International (Tourism Union International). It is headquartered in Hanover, Germany" \
  --batch-size 500 \
  --analyst junior \
  --output tui_results.csv

Default Values

| Argument | Default |
| --- | --- |
| `--batch-size` | `200` |
| `--analyst` | `senior` |
Analyst Modes

| Mode | Description |
| --- | --- |
| junior | Entry-level analyst. Consistent, rule-based, deterministic pattern detection. |
| senior | Balanced reasoning. Slightly creative with nuanced evaluation. (default) |
| expert | Advanced threat detection. High pattern recognition and semantic flexibility. |
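One plausible way to model such personas is as a set of generation settings per mode. The sketch below is purely illustrative: the `AnalystPersona` class, the temperature values, and the instruction texts are assumptions and are not taken from this repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalystPersona:
    """Hypothetical bundle of settings for one analyst mode."""
    name: str
    temperature: float  # assumed: lower means more deterministic output
    instruction: str    # assumed: extra guidance added to the prompt


# Illustrative values only; the real configuration may look different.
PERSONAS = {
    "junior": AnalystPersona("junior", 0.0, "Apply consistent, rule-based checks."),
    "senior": AnalystPersona("senior", 0.3, "Balance rules with nuanced judgment."),
    "expert": AnalystPersona("expert", 0.7, "Consider subtle semantic patterns."),
}


def get_persona(name: str = "senior") -> AnalystPersona:
    """Look up a persona by name, defaulting to the senior analyst."""
    try:
        return PERSONAS[name]
    except KeyError:
        raise ValueError(f"Unknown analyst mode: {name!r}") from None
```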

Domain File and Output Files

Input Domain Files

Place domain files in data/ folder
Use .txt format with one domain per line

Example:

data/
├── tui.txt
├── otto.txt
└── gea.txt
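A minimal sketch of reading such a file follows. Lowercasing, skipping blank or `#` lines, and removing duplicates are assumptions made for illustration; `parse_domains` and `load_domains` are hypothetical helpers.

```python
from pathlib import Path
from typing import List


def parse_domains(text: str) -> List[str]:
    """Return unique, lowercased domains from text with one domain per line."""
    seen = set()
    domains = []
    for raw in text.splitlines():
        domain = raw.strip().lower()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not domain or domain.startswith("#"):
            continue
        if domain not in seen:
            seen.add(domain)
            domains.append(domain)
    return domains


def load_domains(path: Path) -> List[str]:
    """Read a .txt domain file, for example data/tui.txt."""
    return parse_domains(path.read_text(encoding="utf-8"))
```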

Output Files

All output files are stored in the data/ folder and include:

*_threats.csv — Identified threat domains
*_filtered.csv — Domains considered safe
*_complete.json — Full analysis report

Each .csv contains:

Domain
Confidence score
Relevance
Risk level
Explanation
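The following sketch shows one way to load a results CSV and rank it by confidence. The header names (`domain`, `confidence`, `relevance`, `risk_level`, `explanation`) and the sample rows are assumptions for illustration; check the header of your own output file before relying on them.

```python
import csv
import io
from typing import Dict, List, TextIO

# Made-up sample rows with assumed column names.
SAMPLE_CSV = """domain,confidence,relevance,risk_level,explanation
my-tui-booking.net,0.80,high,medium,Booking-themed name combined with the brand
secure-tui-login.com,0.95,high,high,Login-themed name combined with the brand
"""


def load_results(handle: TextIO) -> List[Dict[str, object]]:
    """Parse result rows and sort them by confidence, highest first."""
    rows: List[Dict[str, object]] = [dict(row) for row in csv.DictReader(handle)]
    for row in rows:
        row["confidence"] = float(row["confidence"])
    return sorted(rows, key=lambda row: row["confidence"], reverse=True)


results = load_results(io.StringIO(SAMPLE_CSV))
top = results[0]["domain"]
print(top)
```

To read a real file, pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object.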

API Rate Limits

Requests are subject to the Gemini API rate limits; see the Gemini Rate Limits documentation for the limits that apply to your key.
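When a rate limit is hit, a common pattern is to retry with exponential backoff. The sketch below is generic and hypothetical: `RateLimitError` stands in for whatever exception your client raises on a rate-limit response, and `call_with_backoff` is not part of this tool.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Placeholder for the client's rate-limit exception (assumption)."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, doubling the wait after each rate-limit failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter makes the helper easy to test without real waiting.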

Notes

Uses Google Gemini 2.5 Pro
Requires Python 3.10+
Input and output files are handled in the data/ folder
Only full-text domain matches are considered
For input data, use your current domain monitoring provider or another source, such as my other project: domainthreat
Some example results are saved in data/tui_results_threats.csv

TODO

IDN support
Multimodal processing
Additional evaluation features

Author

Patrick Steinhoff
LinkedIn

Disclaimer

This tool is intended for research, legal and security analysis purposes only.
