I'm Mikhail Biriuchinskii, an R&D Data Scientist and NLP specialist in Paris, with expertise in:
- Text & documents: OCR/HTR, multilingual, historical archives
- LLMs: fine-tuning, prompt design, retrieval-augmented generation
- Speech: low-resource languages, Whisper pipelines
- Annotation & evaluation: FAIR data, human-in-the-loop workflows
- Deployment & tooling: Docker, FastAPI, open-access apps
I build tools at the intersection of language, AI, and data, with a focus on open-source, multimodal processing (text, speech, image), and large language models (LLMs).
I value clarity, rigor, and collaboration, especially across tech and the humanities.
You'll find:
- NLP demos (classification, NER, RAG, etc.)
- Tools for linguists and researchers
- Open-source contributions (TAL, OCR/HTR, LLMs)
- Experiments with Transformers, embeddings, and vector DBs
Open to new opportunities from September 2025.


