RL-TRIM: Reinforcement Learning driven Transformer Model Structured Pruning

This framework employs reinforcement learning for the structured pruning of Transformer models, specifically targeting models such as LLaMA.

Our approach prunes at multiple granularities, including attention-head pruning and intermediate-dimension pruning, which directly reduces memory footprint and computational load and facilitates acceleration on consumer GPUs. A reinforcement learning agent determines the per-layer pruning strategy, allowing RL-TRIM to balance model size reduction against performance retention and offering a scalable, efficient solution for optimizing a range of Transformer architectures.

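The sketch below is not the repository's actual code; it is a minimal illustration of the intermediate-dimension granularity on a toy LLaMA-style gated MLP, using a simple weight-magnitude criterion. In RL-TRIM the per-layer keep ratio would instead be chosen by the reinforcement learning agent. All names here (`ToyLlamaMLP`, `prune_intermediate`, `keep_ratio`) are illustrative assumptions, not identifiers from this repository.

```python
# Minimal sketch of structured intermediate-dimension pruning
# on a simplified LLaMA-style (SwiGLU) MLP block. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyLlamaMLP(nn.Module):
    """Simplified gated MLP with a prunable intermediate dimension."""

    def __init__(self, hidden: int = 512, intermediate: int = 1376):
        super().__init__()
        self.gate_proj = nn.Linear(hidden, intermediate, bias=False)
        self.up_proj = nn.Linear(hidden, intermediate, bias=False)
        self.down_proj = nn.Linear(intermediate, hidden, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


def prune_intermediate(mlp: ToyLlamaMLP, keep_ratio: float) -> ToyLlamaMLP:
    """Structurally prune intermediate channels, keeping the top-`keep_ratio`
    fraction ranked by combined weight magnitude across the three projections."""
    inter = mlp.gate_proj.out_features
    n_keep = max(1, int(inter * keep_ratio))
    # Score each intermediate channel by its aggregate L2 weight norm.
    scores = (
        mlp.gate_proj.weight.norm(dim=1)
        + mlp.up_proj.weight.norm(dim=1)
        + mlp.down_proj.weight.norm(dim=0)
    )
    keep = torch.topk(scores, n_keep).indices.sort().values
    pruned = ToyLlamaMLP(mlp.gate_proj.in_features, n_keep)
    with torch.no_grad():
        # Slice the kept channels out of each projection (rows for gate/up,
        # columns for down), producing a physically smaller module.
        pruned.gate_proj.weight.copy_(mlp.gate_proj.weight[keep])
        pruned.up_proj.weight.copy_(mlp.up_proj.weight[keep])
        pruned.down_proj.weight.copy_(mlp.down_proj.weight[:, keep])
    return pruned


if __name__ == "__main__":
    mlp = ToyLlamaMLP()
    # In RL-TRIM the keep ratio per layer would come from the RL agent.
    pruned = prune_intermediate(mlp, keep_ratio=0.5)
    x = torch.randn(2, 8, 512)
    print(pruned(x).shape)  # torch.Size([2, 8, 512])
```

Head pruning follows the same pattern at a coarser granularity: entire attention heads are removed by slicing the query, key, value, and output projections in blocks of one head's width.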

Acknowledgments

  • AMC: AutoML for Model Compression and Acceleration on Mobile Devices. Thanks for providing the pruning framework.
  • LLM-Pruner, which utilizes LM Evaluation Harness, PEFT, and Alpaca-LoRA. Thanks for the pioneering work on structured pruning of LLMs!
