
LiTransProQA
LLM-based Literary Translation evaluation metric with Professional Question Answering


🤖 Models and datasets 📄 arXiv

📁 Repository Structure

LiTransProQA/
├── datasets/                    # Dataset directory
│   ├── QA_weights.csv          # Selected question weights
│   ├── benchmark_dataset_all_src_tgt.csv # LitEval + Iyyer: full sample IDs
│   └── sampled_benchmark.csv   # Sampled IDs for the validation set
├── finetuning_method/         # Fine-tuning related code
│   ├── configs/               # Configuration files
│   ├── xcomet_regression.py   # Regression-task fine-tuning
│   ├── xcomet_inference.py    # Inference implementation
│   └── xcomet_ranking.py      # Ranking-task fine-tuning
├── prompting_method/          # Prompt-based approaches
│   ├── template/             # Prompt templates
│   ├── QA_translators/       # Translator voting results
│   ├── prompt_openrouter.py  # OpenRouter API integration
│   ├── run_all_models.py     # Model execution script
│   └── build_dataset.py      # Prompt preparation
└── SOTA_metric/              # State-of-the-art metrics
    └── m_prometheous.py      # Prometheus metric implementation

🛠️ Setup & Installation

Dataset: Due to copyright and licensing restrictions, we release only the sample IDs on GitHub. The complete test datasets, including source and target texts, can be downloaded via a Google Form (coming soon, pending permission and ethical checks) for research purposes only.

Installation:

  • Install COMET to run the task fine-tuning and the XCOMET evaluation; per the COMET guide, Python 3.8 or above is required (a quick scoring check follows this list). Simple installation from PyPI:
pip install --upgrade pip  # ensures that pip is current
pip install unbabel-comet
  • Install mt-metrics-eval (from the google-research/mt-metrics-eval repository) to reproduce the correlation results using mt_metrics_eval_LitEval.ipynb.
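
To verify the COMET installation, a minimal scoring sketch along these lines should work. The checkpoint name (Unbabel/XCOMET-XL, which requires accepting the model license on Hugging Face) and the example segments are illustrative assumptions, not part of this repository:

# Minimal sketch: score one source/translation pair with an XCOMET checkpoint
from comet import download_model, load_from_checkpoint

model_path = download_model("Unbabel/XCOMET-XL")  # assumed checkpoint; any COMET model you can access works
model = load_from_checkpoint(model_path)

# XCOMET scores from source and translation alone; add a "ref" key per record for reference-based scoring
data = [{"src": "Der Junge lief nach Hause.", "mt": "The boy ran home."}]
output = model.predict(data, batch_size=8, gpus=0)  # set gpus=1 if a GPU is available
print(output.scores)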

🚀 Usage

  • Multiple assessment methods:

    • Fine-tuning-based approaches using XCOMET; see finetuning_method
    • Prompt-based LiTransProQA: question-answering-based translation evaluation
    • Other SOTA metrics
  • Instructions to run LiTransProQA: to reproduce the results or to use the metric for scoring, first download the dataset containing the source and target texts as described above, or substitute your own data (a sketch of the expected input format follows this list).

# Step 1: Build prompts from the datasets (.csv files containing "src" [source] and "tgt" [target] columns); the prompt template can be modified
python prompting_method/build_dataset.py

# Step 2: Score using a single {model}
python prompting_method/prompt_openrouter.py \
              --file final_set/final_set_with_QA.csv \
              --model {model} \
              --content-column QA \
              --temperature 0.3 \
              --output-dir final_results/

# Or Step 2: Score using multiple models, e.g., from model_list.txt
python prompting_method/run_all_models.py
  • Instructions for reproducing the correlation results with mt-metrics-eval are given in the first markdown cell of mt_metrics_eval_LitEval.ipynb.
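
As noted in Step 1, build_dataset.py expects a CSV with "src" and "tgt" columns. A minimal sketch of preparing such a file follows; the file name and the segment pair are illustrative:

# Minimal sketch: create an input CSV with the "src" and "tgt" columns expected by build_dataset.py
import pandas as pd

df = pd.DataFrame({
    "src": ["Es war einmal ein König."],            # source segment
    "tgt": ["Once upon a time there was a king."],  # candidate translation
})
df.to_csv("my_literary_pairs.csv", index=False)  # hypothetical file name

For multi-model scoring, run_all_models.py iterates over model identifiers such as those listed in model_list.txt; see that script for the exact expected format.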

📊 Results Overview

Figure: LiTransProQA results summary.

🤝 Contributing

Feel free to contribute by submitting a pull request.

# Fork the repository
# Create a new branch for your feature or fix
# Commit your changes with a clear message
# Push to your fork and submit a PR

📜 License


This project is licensed under a Creative Commons (CC) license; see the LICENSE file for details.


📖 Citation

If you use this work in your research, please cite it as:

@misc{zhang2025litransproqallmbasedliterarytranslation,
      title={LiTransProQA: an LLM-based Literary Translation evaluation metric with Professional Question Answering}, 
      author={Ran Zhang and Wei Zhao and Lieve Macken and Steffen Eger},
      year={2025},
      eprint={2505.05423},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.05423}, 
}
