Compute resource aware simulation #857

Closed
wants to merge 4 commits into from

Conversation

thomasywang

Summary: The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Differential Revision: D80137963
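
The registration described in the summary can be pictured as a map from each ProcId to its six-level location. The following is a minimal sketch under stated assumptions, not the code in this PR: `ProcId` here is a simplified stand-in for the real type, and `Location`, `LocationRegistry`, `register`, and `lookup` are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the real ProcId type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ProcId(pub String);

/// The six location levels named in the summary, from coarsest to finest.
/// Field types are an assumption; the real code may use other types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Location {
    pub region: String,
    pub dc: String,
    pub zone: String,
    pub rack: String,
    pub host: String,
    pub gpu: String,
}

/// Hypothetical registry that a simulated network could own.
#[derive(Default)]
pub struct LocationRegistry {
    locations: HashMap<ProcId, Location>,
}

impl LocationRegistry {
    /// Record where a proc lives. Returns the previous location if the
    /// proc was already registered, so callers can detect double registration.
    pub fn register(&mut self, proc_id: ProcId, location: Location) -> Option<Location> {
        self.locations.insert(proc_id, location)
    }

    /// Look up the location of a previously registered proc.
    pub fn lookup(&self, proc_id: &ProcId) -> Option<&Location> {
        self.locations.get(proc_id)
    }
}
```

In this sketch the allocator would call `register` once per proc at creation time, and anything that needs placement information would call `lookup` later.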

@meta-cla (bot) added the CLA Signed label on Aug 13, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D80137963

thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Aug 14, 2025
Summary:

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Differential Revision: D80137963
thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Aug 14, 2025
Summary:
Pull Request resolved: meta-pytorch#857

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Differential Revision: D80137963
thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Aug 18, 2025
Summary:

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Reviewed By: pablorfb-meta

Differential Revision: D80137963
thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Aug 19, 2025
Summary:

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Reviewed By: pablorfb-meta

Differential Revision: D80137963
thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Aug 19, 2025
Summary:
Pull Request resolved: meta-pytorch#857

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Reviewed By: pablorfb-meta

Differential Revision: D80137963
Summary: Previously we had to use u64 for serialization reasons, but those reasons no longer exist.

Differential Revision: D80556690
Summary: There was an open TODO to remove the global mailbox for SimClock. We don't actually need mailboxes for the sim clock at all; a oneshot works just fine.

Differential Revision: D80029571
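
To illustrate the oneshot idea, here is a minimal sketch in which each sleep request carries its own single-use completion channel, so no shared mailbox is needed. It is an assumption-laden toy, not the SimClock implementation: `ToyClock`, `SleepRequest`, and `sleep_request` are hypothetical names, and `std::sync::mpsc` is used as a standard-library stand-in for a real oneshot channel so the example stays self-contained.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// A request to be woken at simulated time `wake_at`.
/// The sender is used at most once, which gives one-shot behaviour.
pub struct SleepRequest {
    pub wake_at: u64,
    done: Sender<u64>,
}

/// Build a request plus the receiver the sleeper waits on.
pub fn sleep_request(wake_at: u64) -> (SleepRequest, Receiver<u64>) {
    let (done, rx) = channel();
    (SleepRequest { wake_at, done }, rx)
}

/// Hypothetical toy clock; simulated time is a plain counter here.
#[derive(Default)]
pub struct ToyClock {
    now: u64,
    pending: Vec<SleepRequest>,
}

impl ToyClock {
    /// Current simulated time.
    pub fn now(&self) -> u64 {
        self.now
    }

    /// Queue a sleep request until simulated time reaches its deadline.
    pub fn submit(&mut self, req: SleepRequest) {
        self.pending.push(req);
    }

    /// Advance simulated time and complete every request that is now due.
    pub fn advance_to(&mut self, now: u64) {
        self.now = now;
        let (due, pending): (Vec<_>, Vec<_>) = std::mem::take(&mut self.pending)
            .into_iter()
            .partition(|req| req.wake_at <= now);
        self.pending = pending;
        for req in due {
            // Ignore the error: the sleeper may have gone away already.
            // `req` is dropped after this single send, closing the channel.
            let _ = req.done.send(now);
        }
    }
}
```

The sleeper holds only the `Receiver`, so completion is delivered directly to whoever asked for it rather than routed through a global mailbox.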
Summary:
Pull Request resolved: meta-pytorch#854

When we increase the number of actors in our simulation, it takes longer for all the events at a given time to complete, so we need to wait longer. If we wait too long, the simulation just runs slower than it needs to, so it's nice to make this configurable.

In the long term we will come up with a more robust solution to this, but in the meantime that is not a priority. See EX528476 to understand the underlying problem the debounce is remedying.

Differential Revision: D80137965

Reviewed By: pablorfb-meta
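
The trade-off described above (too short a wait misses events, too long a wait slows the simulation) can be sketched as a debounce window that is read from configuration. This is a hypothetical illustration only: `DEFAULT_DEBOUNCE`, `parse_debounce`, and `Debouncer` are invented names, the default value is arbitrary, and the PR's actual configuration mechanism is not shown in this conversation.

```rust
use std::time::{Duration, Instant};

/// Arbitrary default chosen for this sketch, not the project's value.
pub const DEFAULT_DEBOUNCE: Duration = Duration::from_millis(1);

/// Parse a debounce window given in milliseconds, falling back to the
/// default when the value is missing or not a valid unsigned integer.
pub fn parse_debounce(raw: Option<&str>) -> Duration {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .map(Duration::from_millis)
        .unwrap_or(DEFAULT_DEBOUNCE)
}

/// Tracks the most recent event and reports when the system has been
/// quiet for at least `window`, at which point time may safely advance.
pub struct Debouncer {
    window: Duration,
    last_event: Instant,
}

impl Debouncer {
    pub fn new(window: Duration, now: Instant) -> Self {
        Debouncer { window, last_event: now }
    }

    /// Call whenever a new event arrives; this restarts the quiet period.
    pub fn record_event(&mut self, now: Instant) {
        self.last_event = now;
    }

    /// True once no event has been recorded for a full window.
    pub fn is_quiet(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_event) >= self.window
    }
}
```

A caller could feed `parse_debounce` from an environment variable of its choosing, for example `std::env::var("SIM_DEBOUNCE_MS").ok().as_deref()`, where the variable name is again an assumption made for this example.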
thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Aug 20, 2025
Summary:

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Reviewed By: pablorfb-meta

Differential Revision: D80137963
Summary:
Pull Request resolved: meta-pytorch#857

The sim allocator will now register the location (region, dc, zone, rack, host, gpu) of every ProcId upon creation with the simnet.

Reviewed By: pablorfb-meta

Differential Revision: D80137963
@facebook-github-bot

This pull request has been merged in 7238c13.

Labels
CLA Signed, fb-exported, Merged