This repository contains the official implementation for the paper "Watermark Smoothing Attacks against Language Models" accepted at EMNLP 2025 Findings.
Install the required dependencies:
    pip install -r requirements.txt

Note: Some watermarking methods (XSIR, UPV, SIR) require additional model downloads. Please refer to the MarkLLM framework documentation for detailed instructions.
Generate results for OPT models:
    bash run_opt.sh

Generate results for LLaMA models:

    bash run_llama.sh

├── attack_processor.py # Core implementation of watermark smoothing attack
├── main.py # Main execution script
├── config/ # Configuration files for different watermarking methods
│ ├── DIP.json
│ ├── EWD.json
│ ├── EXP.json
│ ├── KGW.json
│ ├── SIR.json
│ ├── SWEET.json
│ └── ...
├── watermark/ # Watermarking implementations (based on MarkLLM)
│ ├── auto_watermark.py
│ ├── base.py
│ ├── dip/
│ ├── exp/
│ ├── kgw/
│ └── ...
├── evaluation/ # Evaluation pipelines and tools
│ ├── pipelines/
│ ├── tools/
│ └── examples/
├── dataset/ # Dataset handling
├── utils/ # Utility functions
└── visualize/ # Visualization tools
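
The config/ directory holds one JSON file per watermarking method, named after the method (KGW.json, SIR.json, and so on). A minimal sketch of loading one by name follows. `load_watermark_config` is a hypothetical helper written for illustration, not a function from this repository, and it makes no assumption about the keys inside each file.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_watermark_config(method: str, config_dir: str = "config") -> Dict[str, Any]:
    """Load the JSON configuration for one watermarking method.

    Hypothetical helper: assumes only that the file is named
    <method>.json and lives directly under config_dir.
    """
    path = Path(config_dir) / f"{method}.json"
    if not path.is_file():
        raise FileNotFoundError(f"No configuration found for {method!r} at {path}")
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

Called from the repository root, `load_watermark_config("KGW")` would return the parsed contents of config/KGW.json as a dictionary.
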
This codebase extends the MarkLLM framework (Apache License 2.0) with the following key modifications:
- Added `attack_processor` parameter to watermarking methods
- Integrated smoothing attack into the generation pipeline
- Enhanced evaluation metrics for attack assessment
We thank the MarkLLM authors for their excellent foundation.
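
To illustrate the idea of an optional `attack_processor` hook in a generation step, here is a toy sketch. It does not reproduce this repository's code or the attack from the paper: the function names, the callable signature, and the linear mixing with a fixed reference distribution are all assumptions chosen to keep the example small and self-contained.

```python
import math
from typing import Callable, List, Optional

# Hypothetical hook type: takes next-token logits, returns modified logits.
AttackProcessor = Callable[[List[float]], List[float]]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_mixing_processor(reference_logits: List[float], weight: float) -> AttackProcessor:
    """Build a toy processor that linearly mixes two distributions.

    Assumption: a fixed reference distribution stands in for whatever
    a real attack would query at each step.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    reference_probs = softmax(reference_logits)

    def process(watermarked_logits: List[float]) -> List[float]:
        watermarked_probs = softmax(watermarked_logits)
        mixed = [
            (1.0 - weight) * p + weight * q
            for p, q in zip(watermarked_probs, reference_probs)
        ]
        # Return log-probabilities so the result can be used as logits.
        return [math.log(m) for m in mixed]

    return process


def apply_attack(
    watermarked_logits: List[float],
    attack_processor: Optional[AttackProcessor] = None,
) -> List[float]:
    """Return the logits unchanged unless an attack processor is supplied."""
    if attack_processor is None:
        return watermarked_logits
    return attack_processor(watermarked_logits)
```

Keeping the hook optional means the unattacked generation path behaves exactly as it would without the parameter, which is one common way to add such a feature without disturbing existing callers.
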