This repository contains the official code for the paper "Context-aware Membership Inference Attacks against Pre-trained Large Language Models", accepted at the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025).
conda create -n mia python=3.10
conda activate mia
pip install -r requirements.txt

run_baselines.py: Run baseline attacks and save results in the results_new folder
run_ref_baselines.py: Run attacks on reference models to prepare the reference attack
run_ours_construct_mia_data.py: Generate train and test data for our attacks
run_ours_train_lr.py: Get all our attack results
run_ours_different_agg.py: Get our attack results with p-value combination for different aggregations
run_ours_get_roc.py: Get the complete ROC curves for our attacks
run.sh: Execute all attacks using the provided bash script
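
The "p-value combination" mentioned for run_ours_different_agg.py can be illustrated with a small, standalone sketch. This README does not say which aggregation rules the script implements, so the two rules shown here (Fisher's method and a Bonferroni-corrected minimum) are assumptions chosen for illustration, and the function names are hypothetical rather than taken from this repository.

```python
import math
from typing import Sequence


def fisher_combine(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method (illustrative only)."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    # Clamp to avoid log(0) for extremely small p-values.
    clipped = [min(max(p, 1e-300), 1.0) for p in p_values]
    # Fisher statistic X = -2 * sum(ln p_i); we keep X / 2 for convenience.
    half_stat = -sum(math.log(p) for p in clipped)
    # Under the null, X is chi-square with 2k degrees of freedom. For even
    # degrees of freedom the survival function has a closed form:
    #   P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def bonferroni_min_combine(p_values: Sequence[float]) -> float:
    """Combine p-values by taking the minimum with a Bonferroni correction."""
    if len(p_values) == 0:
        raise ValueError("need at least one p-value")
    return min(1.0, len(p_values) * min(p_values))


if __name__ == "__main__":
    # Hypothetical per-signal p-values for one candidate sample.
    example = [0.05, 0.05]
    print("Fisher:", fisher_combine(example))
    print("Bonferroni min:", bonferroni_min_combine(example))
```

Fisher's method rewards several moderately small p-values, while the corrected minimum reacts to a single very small one. This is one reason results can differ across aggregations.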
If you use this code in your research, please cite our paper:
@inproceedings{chang2024context,
title={Context-aware membership inference attacks against pre-trained large language models},
author={Chang, Hongyan and Shamsabadi, Ali Shahin and Katevas, Kleomenis and Haddadi, Hamed and Shokri, Reza},
booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
year={2025}
}

This code is based on the MIMIR codebase under an MIT license.