Translate Bangla to English. This model is trained using an encoder-decoder architecture with an attention mechanism. This repository may serve as a starting point for approaching the Bangla machine translation problem. If it helps other people who are working on Bangla machine translation, I would be very grateful.
I use the dataset provided at http://www.manythings.org/anki/ben-eng.zip . This dataset contains English-Bangla sentence pairs in the following format:
I'm counting on you. আমি আপনার উপর নির্ভর করে আছি।
I want your opinion. আমি আপনার মতামত চাই।
How is your daughter? আপনার মেয়ে কেমন আছে?
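A minimal sketch of reading such pairs, assuming each line of `ben.txt` is tab-separated (English first, then Bangla, optionally followed by an attribution column). The function name `parse_pairs` is illustrative and is not taken from this repository.

```python
def parse_pairs(lines):
    """Return (english, bangla) tuples from tab-separated lines.

    Assumption: columns are separated by tabs; any column after the
    second (for example attribution text) is ignored.
    """
    pairs = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        english, bangla = parts[0].strip(), parts[1].strip()
        if english and bangla:
            pairs.append((english, bangla))
    return pairs


# Two sample lines in the assumed format, plus a blank line that is skipped.
sample = [
    "I'm counting on you.\tআমি আপনার উপর নির্ভর করে আছি।\n",
    "I want your opinion.\tআমি আপনার মতামত চাই।\n",
    "\n",
]
pairs = parse_pairs(sample)
for english, bangla in pairs:
    print(f"{bangla} -> {english}")
```

For the real file, the same function could be given an open file handle, for example `open("data/ben-eng/ben.txt", encoding="utf-8")`.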
BanglaTranslator
├── assets
│   └── banglafonts
│       └── Siyamrupali.ttf
├── data
│   ├── ben-eng
│   │   ├── _about.txt
│   │   └── ben.txt
├── docs
│   └── U0980.pdf
├── models
│   ├── input_language_tokenizer.json
│   ├── target_language_tokenizer.json
├── translator
│   ├── config.py
│   ├── datasets.py
│   ├── infer.py
│   ├── __init__.py
│   ├── models.py
│   ├── train.py
│   └── utils.py
├── infer-example.ipynb
├── README.md
└── training-example.ipynb
- assets contains the Bangla font that is used in plotting
- data contains the English-Bangla sentence pair dataset
- docs contains documentation of the Bangla Unicode code points and their character mapping
- models contains the saved tokenizers, and the training checkpoints if you run training
- translator is the core of the project and contains all the required scripts
- infer-example.ipynb is an example notebook that shows how to predict on a single sentence using saved checkpoints
- training-example.ipynb is a notebook you can use to train the Bangla-to-English translator model
python 3.7
tensorflow 2.x
matplotlib
sklearn
tqdm
jupyter notebook
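A quick way to see which of these requirements are available is sketched below. The import names are assumptions: `sklearn` is the module installed by scikit-learn, and `notebook` is the module behind Jupyter Notebook. The helper `missing_modules` is illustrative and is not part of this repository.

```python
import importlib.util
import sys

# Assumed import names for the requirements listed above.
REQUIRED_MODULES = ["tensorflow", "matplotlib", "sklearn", "tqdm", "notebook"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", ".".join(map(str, sys.version_info[:3])))
missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```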
If you just want to test the model, you need to download the pretrained model from the Google Drive link
and extract the training_checkpoints.zip file under the models directory.
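The extraction step can also be done from Python. This is a minimal sketch using the standard `zipfile` module; it assumes the archive has already been downloaded to the repository root, and the helper name `extract_checkpoints` is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_checkpoints(zip_path, models_dir="models"):
    """Extract the checkpoint archive into models_dir and return its file list."""
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(models_dir)
        return archive.namelist()


# Assumed download location; adjust to wherever the file was saved.
archive_path = Path("training_checkpoints.zip")
if archive_path.exists():
    for name in extract_checkpoints(archive_path):
        print("extracted:", name)
else:
    print(f"{archive_path} not found; download it first.")
```

The internal layout of the archive is not documented here, so it is worth checking the printed file list to confirm the checkpoints ended up where the notebooks expect them.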
I tested the pretrained model and got results like those below.
- If you want to test it yourself, please check infer-example.ipynb and also download the pretrained model.
