Migrate to PythonActorMesh and PythonActorMeshRef #557

Open. Wants to merge 5 commits into base: main.

@pzhan9 pzhan9 commented Jul 16, 2025

Summary:
This diff swaps _ActorMeshRefImpl with PythonActorMesh[Ref].

The swap itself should be straightforward, since PythonActorMesh[Ref] is meant to be a drop-in replacement for _ActorMeshRefImpl. Most of the complexity in this diff comes from the toggle I added between them, so that we can quickly switch back to _ActorMeshRefImpl if there are any bugs in PythonActorMesh[Ref]. What I did:

  1. Add wrapper classes EitherPyActorMesh[Ref], whose underlying type can be either PythonActorMesh[Ref] or _ActorMeshRefImpl;
  2. Add an env var USE_STANDIN_ACTOR_MESH, which decides which of the two is used when instantiating EitherPyActorMesh[Ref].

Once this diff lands, all Python-side mesh API calls should go through the Rust-side cast code path, except for several usages of ActorIdRef.

Differential Revision: D78355743

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D78355743

@meta-cla meta-cla bot added the CLA Signed label Jul 16, 2025
pzhan9 added 5 commits July 16, 2025 13:03
…#551)

Summary:

This diff adds a `slice` method to both `PythonActorMesh` and `PythonActorMeshRef`. With this method, we can:

1. slice a `PythonActorMesh` object into a `PythonActorMeshRef`;
1. slice a `PythonActorMeshRef` into another `PythonActorMeshRef`.

Tests are added to demo that we can cast to the sliced mesh ref.

Reviewed By: shayne-fletcher

Differential Revision: D78292490
Summary:

`_ActorMeshRefImpl` is the stand-in class for `PythonActorMesh` and `PythonActorMeshRef`. This diff stack works on replacing `_ActorMeshRefImpl` with `PythonActorMesh` and `PythonActorMeshRef`.

Compared to `PythonActorMesh` and `PythonActorMeshRef`, the one method `_ActorMeshRefImpl` is missing is `slice`. The lack of this method blocks a drop-in replacement.

This diff pushes the `MeshTrait` implementation to `_ActorMeshRefImpl`. This way, `_ActorMeshRefImpl` gets the `slice` method from `MeshTrait`.
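
As a rough illustration of how a trait-style base class can hand `slice` to any class that implements it, here is a minimal sketch. `SliceMixin`, `ToyMeshRef`, `_dims`, and `_with_dims` are hypothetical names invented for this example; they are not the actual `MeshTrait` interface, and the real slicing semantics may differ.

```python
from abc import ABC, abstractmethod
from itertools import product


class SliceMixin(ABC):
    """Hypothetical stand-in for MeshTrait: subclasses describe their
    dimensions, and slice() is inherited."""

    @abstractmethod
    def _dims(self) -> dict:
        """Map each dimension name to the list of indices kept along it."""

    @abstractmethod
    def _with_dims(self, dims: dict):
        """Build a new object of the same kind over `dims`."""

    def slice(self, **kwargs):
        dims = dict(self._dims())
        for name, sel in kwargs.items():
            if name not in dims:
                raise KeyError(f"unknown dimension: {name}")
            # Selections are positional within the current dimension.
            if isinstance(sel, slice):
                dims[name] = dims[name][sel]
            else:
                dims[name] = [dims[name][sel]]
        return self._with_dims(dims)


class ToyMeshRef(SliceMixin):
    """Hypothetical mesh ref that only implements the two hooks."""

    def __init__(self, dims: dict):
        self._d = dims

    def _dims(self) -> dict:
        return self._d

    def _with_dims(self, dims: dict) -> "ToyMeshRef":
        return ToyMeshRef(dims)

    def coordinates(self) -> list:
        names = list(self._d)
        return [dict(zip(names, combo)) for combo in product(*self._d.values())]
```

For example, `ToyMeshRef({"hosts": [0, 1], "gpus": [0, 1, 2, 3]}).slice(gpus=slice(0, 2))` yields a ref covering the first two GPUs on each host, and that ref can be sliced again.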

Differential Revision: D78300586
…f` (pytorch-labs#555)

Summary:

`RDMAManager` and `DebugManager` have use cases that need to send messages directly to an actor based on its actor ID, instead of to a mesh. Currently the "send-to-actor" function piggybacks on `_ActorMeshRefImpl`'s implementation. This blocks the migration from `_ActorMeshRefImpl` to `PythonActorMesh`, because we cannot create a mesh solely from an actor ID.

To unblock, this diff lifts the `send-to-actor` logic from `_ActorMeshRefImpl` to `ActorIdRef`, so it is decoupled from `_ActorMeshRefImpl` and `ActorMeshRef`.
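
The decoupling described above can be pictured with a small sketch in which a ref to a single actor sends by ID without any mesh object involved. `ToyMailbox` and `ToyActorIdRef` are hypothetical names used only for illustration; the real `ActorIdRef` and its transport are not shown in this PR description.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMailbox:
    """Hypothetical in-memory router keyed by actor ID."""

    delivered: dict = field(default_factory=dict)

    def post(self, actor_id: str, message) -> None:
        self.delivered.setdefault(actor_id, []).append(message)


@dataclass
class ToyActorIdRef:
    """Hypothetical single-actor ref: it needs only an ID and a route,
    not a mesh."""

    actor_id: str
    mailbox: ToyMailbox

    def send(self, message) -> None:
        self.mailbox.post(self.actor_id, message)
```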

Differential Revision: D78302352
Summary:
This is a prep diff for D78355743.

`class ActorMeshRef` is currently a stand-in for both `PythonActorMesh` and `PythonActorMeshRef`. Specifically:

* `PythonActorMesh` holds state, so it cannot be serialized and sent to a remote actor;
* instead, it needs to [be bound into a `PythonActorMeshRef`](https://www.internalfb.com/code/fbsource/[641ad3ff455893bd5b8000bd108d7fc2a5bfc0db]/fbcode/monarch/python/monarch/_rust_bindings/monarch_hyperactor/actor_mesh.pyi?lines=44-45). The ref object can be serialized.

This diff introduces a new class `ActorMeshHandle`. Then we will have:

* `ActorMeshHandle` is mapped to `PythonActorMesh`;
* `ActorMeshRef` is mapped to `PythonActorMeshRef`.

Then in D78355743, I can just swap the underlying `_ActorMeshRefImpl` field with `PythonActorMesh` and `PythonActorMeshRef` respectively.
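
The handle/ref split can be sketched as follows, assuming a stateful handle that is bound into a plain-data ref before being sent anywhere. `ToyMeshHandle`, `ToyMeshRef`, `to_bytes`, and `from_bytes` are hypothetical names for this example, and JSON is used only as a convenient stand-in for whatever serialization the real classes use.

```python
import json
import threading
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToyMeshRef:
    """Hypothetical ref: plain data only, so it can be serialized."""

    mesh_id: str
    ranks: tuple

    def to_bytes(self) -> bytes:
        return json.dumps(asdict(self)).encode()

    @staticmethod
    def from_bytes(data: bytes) -> "ToyMeshRef":
        fields = json.loads(data)
        return ToyMeshRef(fields["mesh_id"], tuple(fields["ranks"]))


class ToyMeshHandle:
    """Hypothetical handle: owns live state and offers no serialization."""

    def __init__(self, mesh_id: str, ranks):
        self._mesh_id = mesh_id
        self._ranks = tuple(ranks)
        # The lock stands in for live, process-local state.
        self._lock = threading.Lock()

    def bind(self) -> ToyMeshRef:
        # Copy out only the data a remote actor needs to address the mesh.
        return ToyMeshRef(self._mesh_id, self._ranks)
```

A caller would keep the handle locally and pass `handle.bind()` to remote actors.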

Differential Revision: D78306092
Summary:
This diff swaps `_ActorMeshRefImpl` with `PythonActorMesh[Ref]`.

The swap itself should be straightforward, since `PythonActorMesh[Ref]` is meant to be a drop-in replacement for `_ActorMeshRefImpl`. Most of the complexity in this diff comes from the toggle I added between them, so that we can quickly switch back to `_ActorMeshRefImpl` if there are any bugs in `PythonActorMesh[Ref]`. What I did:

1. Add wrapper classes `EitherPyActorMesh[Ref]`, whose underlying type can be either `PythonActorMesh[Ref]` or `_ActorMeshRefImpl`;
2. Add an env var `USE_STANDIN_ACTOR_MESH`, which decides which of the two is used when instantiating `EitherPyActorMesh[Ref]`.

Once this diff lands, all Python-side mesh API calls should go through the Rust-side `cast` code path, except for several usages of `ActorIdRef`.
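
The toggle described in the two steps above might look roughly like this. It is a minimal sketch, not the code in this diff: `StandinMesh`, `RustBackedMesh`, and `EitherMesh` are hypothetical placeholders for `_ActorMeshRefImpl`, `PythonActorMesh[Ref]`, and `EitherPyActorMesh[Ref]`, and the way the env var value is parsed is an assumption.

```python
import os


class StandinMesh:
    """Hypothetical placeholder for _ActorMeshRefImpl."""

    def cast(self, message: str) -> str:
        return f"standin:{message}"


class RustBackedMesh:
    """Hypothetical placeholder for PythonActorMesh[Ref]."""

    def cast(self, message: str) -> str:
        return f"rust:{message}"


def use_standin_actor_mesh() -> bool:
    # Assumption: "1" or "true" turns the stand-in on; the PR does not
    # say how the value is parsed or what the default is.
    value = os.environ.get("USE_STANDIN_ACTOR_MESH", "0")
    return value.lower() in ("1", "true")


class EitherMesh:
    """Wrapper whose underlying implementation is chosen at construction."""

    def __init__(self) -> None:
        if use_standin_actor_mesh():
            self._inner = StandinMesh()
        else:
            self._inner = RustBackedMesh()

    def cast(self, message: str) -> str:
        # Callers see one interface regardless of which backend is active.
        return self._inner.cast(message)
```

Because the choice is made when the wrapper is instantiated, flipping the env var affects only wrappers created afterwards.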

Differential Revision: D78355743
@pzhan9 pzhan9 force-pushed the export-D78355743 branch from 37108dd to 7543316 Compare July 16, 2025 20:03

Labels: CLA Signed, fb-exported