(AAAI 2025) Enhancing Adversarial Transferability with Adversarial Weight Tuning: AWT is a data-free tuning method that combines gradient-based and model-based attacks to enhance the transferability of adversarial examples (AEs). It adaptively adjusts the parameters of the surrogate model using the generated AEs, optimizing for flat local maxima and model smoothness simultaneously, without requiring extra data. For the gradient-based part, AWT borrows from PGN (Ge et al., 2023), which penalizes the gradient norm on top of the original loss function; for the model-based part, it adopts the idea of SAM to flatten the loss landscape of the surrogate model.
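To make the two ingredients concrete, here is a minimal PyTorch sketch of (a) a PGN-style loss that penalizes the input-gradient norm to steer AEs toward flat local maxima, and (b) a SAM-style sharpness-aware weight update for the surrogate model. Function names, the toy model, and all hyperparameters (`beta`, `rho`, `lr`) are illustrative assumptions, not the repository's actual implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgn_style_loss(model, x, y, beta=0.5):
    """Attack objective minus a penalty on the input-gradient norm.

    Maximizing (loss - beta * ||dL/dx||) favors adversarial examples
    that sit in flat regions of the loss surface (the PGN idea).
    """
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # create_graph=True so the norm penalty is itself differentiable
    grad = torch.autograd.grad(loss, x, create_graph=True)[0]
    return loss - beta * grad.flatten(1).norm(dim=1).mean()

def sam_style_weight_step(model, x, y, rho=0.05, lr=1e-3):
    """One sharpness-aware (SAM-style) update of the surrogate weights."""
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    eps = [rho * g / (norm + 1e-12) for g in grads]
    with torch.no_grad():
        # ascend to the worst-case nearby weights w + e(w)
        for p, e in zip(model.parameters(), eps):
            p.add_(e)
    loss_perturbed = F.cross_entropy(model(x), y)
    grads2 = torch.autograd.grad(loss_perturbed, list(model.parameters()))
    with torch.no_grad():
        for p, e, g in zip(model.parameters(), eps, grads2):
            p.sub_(e)       # restore the original weights
            p.sub_(lr * g)  # descend with the sharpness-aware gradient
```

In AWT these two pieces interact: the surrogate weights are tuned (SAM-style) using the AEs being generated, while the AEs themselves are optimized with the gradient-norm-penalized loss.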
- Python >= 3.6
- PyTorch >= 1.12.1
- Torchvision >= 0.13.1
- timm >= 0.6.12
 
```
pip install -r requirements.txt
```

First, you need to download the data from  or  into `/path/to/data`. Then you can execute the attack as follows:
```
# generate adversarial samples
python main_awt.py --input_dir ./path/to/data --output_dir adv_data/mifgsm/resnet18 --attack awt --model resnet50

# evaluate the transferability
python main_awt.py --input_dir ./path/to/data --output_dir adv_data/mifgsm/resnet18 --attack awt --eval
```
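Transferability evaluation boils down to measuring how often the saved AEs fool a target model that was not used as the surrogate. A minimal sketch of that measurement (the metric and function name are illustrative; the script's internals may differ):

```python
import torch

@torch.no_grad()
def attack_success_rate(target_model, adv_images, labels):
    """Fraction of adversarial examples misclassified by a target model.

    Higher values mean the AEs transfer better to this (black-box) target.
    """
    target_model.eval()
    preds = target_model(adv_images).argmax(dim=1)
    return (preds != labels).float().mean().item()
```

In practice you would load the images written to `--output_dir`, run them through each target model (e.g. a `timm` model), and report this rate per target.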
If you have any questions, please feel free to contact me at [email protected]. My research focuses on Trustworthy AI (GenAI security & privacy, distributed learning security & privacy, etc.).
We thank all the researchers who have contributed to the development of transferable adversarial attacks. In particular, we thank Trustworthy-AI-Group for the TransferAttack benchmark and the authors of PGN (Ge et al., 2023) for their great work.