
[WIP] Add autocast in torchax #9361

Draft · qihqi wants to merge 3 commits into master
Conversation

@qihqi (Collaborator) commented Jun 16, 2025

How torch.autocast works:

When you do

with torch.autocast(device):
  result = torch.mm(a, b)  # some math

torch will do the following:

  1. Check the device module (in our case, device_module.py, registered here: https://github.com/pytorch/xla/blob/master/torchax/torchax/__init__.py#L92) to find out which dtypes the device supports for autocast.
  2. Switch the dispatch key for the underlying math. For example, the CPU device maps to AutocastCPU and CUDA maps to AutocastCUDA. We use the PrivateUse1 dispatch key, so our ops dispatch through AutocastPrivateUse1. Each autocast key then calls the ops registered to it; for example, the AutocastCPU registrations are here: https://github.com/pytorch/pytorch/blob/05faba40287cf7d8734da96cb2e904f39710bf29/aten/src/ATen/autocast_mode.cpp#L322
    a. The line linked above registers a fallback, meaning every op sent to AutocastCPU automatically falls through to the next dispatch key (presumably CPU); in other words, unless otherwise specified, the autocast key is a no-op. This line is what prevents "op not registered" errors. The lines from 329 onward then define the ops that do have additional autocast behavior; those registrations are generated by a C++ macro plus templates. (A Python sketch of the same fallback idea is shown right after this list.)
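
For reference, here is a minimal Python sketch of that fallback registration as it would look for our AutocastPrivateUse1 key. This is an illustration only, not the exact code in this PR, and it assumes torch.library's fallback/fallthrough support applies to this key:

import torch

# Minimal sketch (hypothetical, not the PR's code): make every op a no-op under
# AutocastPrivateUse1 unless an explicit autocast rule is registered for it,
# mirroring what autocast_mode.cpp does for AutocastCPU in C++.
_fallback_lib = torch.library.Library("_", "IMPL")
_fallback_lib.fallback(torch.library.fallthrough_kernel, dispatch_key="AutocastPrivateUse1")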

What we need to do to support autocast:

  1. Add this function to the device module:
def get_amp_supported_dtype():
  return [torch.float16, torch.bfloat16]

This tells torch that we do support autocast and which dtypes it may cast to. At this point the code will run, but autocast will not actually cast anything yet.

  2. Change the tensor base device to 'privateuseone' instead of meta here: https://github.com/pytorch/xla/blob/master/torchax/torchax/tensor.py#L62, because with the meta device we never get dispatched through AutocastPrivateUse1. At this point, running under autocast fails with errors about ops not being registered for AutocastPrivateUse1.
  3. Register those ops, which we can do in Python using the torch.library API. The exact incantation is in the autocast_policies.py file. In that file we still need to reimplement the autocast logic ourselves, mainly downcasting the inputs before calling certain ops (a simplified sketch follows this list).
  4. Fix up the errors introduced by changing the device from meta to privateuseone in step 2. Many decompositions call into C++, and C++ registrations get checked there. The device guard is one of those (note: filed "Ability to set device guard in Python" pytorch#156052 to make that possible from Python). This can be fixed by adding a C++ file that performs the guard registration; we don't actually need the features the device guard provides, we just need to register one so PyTorch doesn't complain.
  5. There is one more test failure (on autograd from j2t_autograd); likely more C++ registration is needed.
  6. Move the building of the C++ file from runtime to build time by adjusting pyproject.toml / setup.py, etc.
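
The downcasting mentioned in step 3 is the core of the autocast logic: under autocast, certain ops should receive lower-precision inputs. Below is a hypothetical, simplified illustration of that downcast-then-call idea; the names here are made up for illustration and are not the actual contents of autocast_policies.py:

import torch

# Hypothetical sketch of the "lower precision" autocast policy: cast floating-point
# tensor inputs down to the autocast dtype, then call the op. The real per-op
# registrations in autocast_policies.py implement this same idea.
def downcast_then_call(op, autocast_dtype, *args):
  def cast(x):
    # Only floating-point tensors are downcast; everything else passes through.
    if isinstance(x, torch.Tensor) and x.is_floating_point():
      return x.to(autocast_dtype)
    return x
  return op(*(cast(a) for a in args))

# For example, matmul typically runs in the lower precision under autocast:
# out = downcast_then_call(torch.mm, torch.bfloat16, a, b)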

@qihqi qihqi changed the title Add autocast [WIP] Add autocast Jun 16, 2025
@qihqi qihqi marked this pull request as draft June 16, 2025 14:01
@qihqi qihqi changed the title [WIP] Add autocast [WIP] Add autocast in torchax Jun 16, 2025
qihqi added a commit that referenced this pull request Jun 16, 2025
This PR implements the 3 autocast policies that we use
and wires them into the Environment.

Wiring it through torch infrastructure so that torch.autocast also works is WIP in #9361