kathyxuyy

Summary:
According to https://fburl.com/gdoc/wij87gpm, there are three opportunities to modestly improve training QPS.

Optimization 1
- Run `set_feature_score_metadata_cuda` on a separate CUDA stream, off the critical path.

Optimization 2
- Pin the tensor and use a non-blocking copy.

Optimization 3
- Remove unnecessary, expensive logging.
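
Optimizations 1 and 2 can be sketched in PyTorch roughly as follows. This is a minimal illustration under stated assumptions, not the code in this diff: `update_metadata_placeholder` and `copy_and_update_off_critical_path` are hypothetical names, and the placeholder only stands in for `set_feature_score_metadata_cuda`, whose real signature is not shown in this PR. On a machine without CUDA the sketch falls back to a plain synchronous path.

```python
import torch


def update_metadata_placeholder(scores: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for set_feature_score_metadata_cuda; the real
    # signature and behavior are not shown in this PR.
    return scores * 2.0


def copy_and_update_off_critical_path(scores: torch.Tensor, device: torch.device):
    """Copy `scores` to `device` and run the metadata update on a side stream."""
    if device.type != "cuda":
        # CPU fallback so the sketch runs anywhere; no streams are involved.
        on_device = scores.to(device)
        return on_device, update_metadata_placeholder(on_device)

    main = torch.cuda.current_stream(device)
    side = torch.cuda.Stream(device=device)

    # Optimization 2: a pinned (page-locked) host tensor allows the
    # host-to-device copy to be issued without blocking the host.
    on_device = scores.pin_memory().to(device, non_blocking=True)

    # Optimization 1: the side stream waits for the copy queued on the main
    # stream, then runs the metadata update while the main stream is free
    # to continue with other work.
    side.wait_stream(main)
    with torch.cuda.stream(side):
        metadata = update_metadata_placeholder(on_device)
    # Tell the caching allocator that `on_device` is also used on `side`.
    on_device.record_stream(side)

    # ... main-stream training work would be queued here ...

    # Re-join before the main stream consumes the side-stream result.
    main.wait_stream(side)
    metadata.record_stream(main)
    return on_device, metadata
```

The key design point is that the side stream is joined only where its result is needed, so the metadata update overlaps with whatever the main stream does in between.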

Reviewed By: emlin

Differential Revision: D80491163


netlify bot commented Aug 22, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 9aa0a86 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68a903586c33470008d707cc |
| 😎 Deploy Preview | https://deploy-preview-4763--pytorch-fbgemm-docs.netlify.app |

meta-cla bot added the `cla signed` label Aug 22, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D80491163
