A CLI tool to anonymize Markdown, plain text, TeX, and BibTeX files with spaCy-based entity detection and automatic YAML configuration.
- Detects names, emails, addresses, phone numbers, and CPR numbers using Presidio with spaCy
- Groups name and number variants using rapidfuzz
- Extracts entities to generate a YAML config (`did ex`)
- Anonymizes text using YAML config (`did an`), preserving file formats
- Supports English (`en`) and Danish (`da`)
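
To illustrate what grouping name variants can look like, here is a minimal sketch. It is not the tool's implementation: it uses the standard library's `difflib` in place of rapidfuzz so that it runs without extra dependencies. The function names `similarity` and `group_variants`, the greedy first-match strategy, and the threshold of 85 are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0 and 100."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_variants(names: list[str], threshold: float = 85.0) -> list[list[str]]:
    """Greedily put each name in the first group whose first member is similar enough.

    The threshold is a hypothetical default, not a value taken from the tool.
    """
    groups: list[list[str]] = []
    for name in names:
        for group in groups:
            # Compare against the group's first member (its representative).
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group matched, so start a new one.
            groups.append([name])
    return groups


names = ["Anders Jensen", "Anders Jensn", "anders jensen", "Maria Holm"]
groups = group_variants(names)
for group in groups:
    print(group)
```

With this input, the misspelled and lowercased forms of the first name end up in one group and the unrelated name in another. A grouping step like this lets every variant of an entity map to the same replacement in the YAML config.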
Install dependencies:
uv sync
Extract entities:
uv run did ex -f input.txt -c config.yaml
Anonymize:
uv run did an -f input.txt -c config.yaml -o output.txt
For details, see the documentation.