- No due date • 1/3 issues closed
- Optimize on 1 and 2 pods of TPU v6e
  Overdue by 3 month(s) • Due by March 1, 2025 • 7/7 issues closed
  The goal of this milestone is to ensure we can replace the hard-to-understand Llama reference implementation in https://github.com/pytorch-tpu/transformers/tree/flash_attention. That branch of the Hugging Face fork is not ideal for engineering work or for showing to interested users.
- Overdue by 4 month(s) • Due by January 30, 2025 • 6/7 issues closed