Watermark Smoothing Attacks against Language Models

This repository contains the official implementation of the paper "Watermark Smoothing Attacks against Language Models", accepted to Findings of EMNLP 2025.

Installation

Install the required dependencies:

pip install -r requirements.txt

Note: Some watermarking methods (XSIR, UPV, SIR) require additional model downloads. Please refer to the MarkLLM framework documentation for detailed instructions.

Quick Start

Running Experiments

Generate results for OPT models:

bash run_opt.sh

Generate results for LLaMA models:

bash run_llama.sh

Repository Structure

├── attack_processor.py         # Core implementation of watermark smoothing attack
├── main.py                     # Main execution script
├── config/                     # Configuration files for different watermarking methods
│   ├── DIP.json
│   ├── EWD.json
│   ├── EXP.json
│   ├── KGW.json
│   ├── SIR.json
│   ├── SWEET.json
│   └── ...
├── watermark/                  # Watermarking implementations (based on MarkLLM)
│   ├── auto_watermark.py
│   ├── base.py
│   ├── dip/
│   ├── exp/
│   ├── kgw/
│   └── ...
├── evaluation/                 # Evaluation pipelines and tools
│   ├── pipelines/
│   ├── tools/
│   └── examples/
├── dataset/                    # Dataset handling
├── utils/                      # Utility functions
└── visualize/                  # Visualization tools
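
Each watermarking method has its own JSON file under config/. As a quick way to see which methods are configured locally, the sketch below maps every file stem to its parsed contents. The helper name list_watermark_configs is hypothetical and not part of this repository, and nothing is assumed about the keys inside each file.

```python
import json
from pathlib import Path


def list_watermark_configs(config_dir="config"):
    """Map each method name (the JSON file stem) to its parsed config.

    Hypothetical helper for illustration; not part of the repository.
    """
    configs = {}
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            configs[path.stem] = json.load(handle)
    return configs


if __name__ == "__main__":
    for name, cfg in list_watermark_configs().items():
        # Only top-level keys are shown; their meaning is method-specific.
        summary = sorted(cfg) if isinstance(cfg, dict) else type(cfg).__name__
        print(name, summary)
```

Run from the repository root, this prints one line per configuration file, for example one for KGW and one for SIR if those files are present.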

Acknowledgments

This codebase extends the MarkLLM framework (Apache License 2.0) with the following key modifications:

- Added attack_processor parameter to watermarking methods
- Integrated smoothing attack into the generation pipeline
- Enhanced evaluation metrics for attack assessment
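
To illustrate what an optional attack_processor hook in a generation step could look like, here is a minimal, dependency-free sketch. It is not the code in attack_processor.py and not the attack from the paper: the class MixingAttackProcessor, the fixed mixing weight, and the sample_next_token signature are all assumptions made for illustration. The sketch only blends a watermarked next-token distribution with one from a reference model.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class MixingAttackProcessor:
    """Hypothetical hook: blends watermarked and reference distributions.

    A fixed-weight mixture is a simplification chosen for this sketch.
    """

    def __init__(self, weight):
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        self.weight = weight  # share given to the reference model

    def __call__(self, watermarked_probs, reference_probs):
        w = self.weight
        return [
            (1 - w) * p + w * q
            for p, q in zip(watermarked_probs, reference_probs)
        ]


def sample_next_token(watermarked_logits, reference_logits,
                      attack_processor=None, rng=random):
    """One decoding step; the attack is skipped when no processor is passed."""
    probs = softmax(watermarked_logits)
    if attack_processor is not None:
        probs = attack_processor(probs, softmax(reference_logits))
    # Draw a single token index according to the (possibly blended) probabilities.
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Passing attack_processor=None leaves the watermarked distribution untouched, which mirrors the idea of an optional parameter: the attack can be switched on per run without changing the rest of the generation code.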

We thank the MarkLLM authors for their excellent foundation.
