Encoding: allow user-defined encoded sample rate #700

Merged
37 commits merged into pytorch:main on Jul 8, 2025

Conversation

NicolasHug
Member

@NicolasHug NicolasHug commented May 27, 2025

This PR allows the user to specify a desired sample rate for the encoded output.

Unlike other conversions (format, num_channels), this one is far from trivial. It involves, in some specific scenarios, the use of an intermediate FIFO to store samples before they get encoded. I tried to document all of this in the Note: [Encoding loop, sample rate conversion and FIFO], so I recommend starting the review from there.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 27, 2025
@NicolasHug NicolasHug marked this pull request as draft May 27, 2025 13:02
@NicolasHug NicolasHug marked this pull request as ready for review July 4, 2025 18:41
// │ │
// AVFrame from maybeFlushSwrBuffers() ───┘ │
// Only if sample rate conversion was needed
// nullptr, to flush
Contributor


This comment is art.

swrContext_.get(), avFrame->data, avFrame->nb_samples, NULL, 0);
avFrame->nb_samples = actualNumRemainingSamples;

encodeFrameThroughFifo(autoAVPacket, avFrame, /*andFlushFifo=*/true);
Contributor


I had to think about what's going on here for a while, so let me know if I'm correct: we sometimes need to encode a frame through a FIFO because of sample rate conversion. We also sometimes need to flush our buffers in order to get the final frames, and those frames may need to go through the FIFO for the same reason. But when we do that final flushing encode through the FIFO, we need some special logic to make sure we're not leaving a frame behind in the FIFO. Correct?

We have the excellent comment in the header, but I think saying something here and at the top of the while loop in encodeFrameThroughFifo() could help readers put all this together.
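
For readers following along, here is a minimal, self-contained sketch of the pattern described above. It is not the code from this PR: `ToyEncoder` and `encodeThroughFifo` are hypothetical stand-ins, a `std::deque` replaces the real audio FIFO, and the sketch assumes the encoder wants frames of exactly `frameSize` samples, with only the last frame allowed to be shorter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for an encoder that expects frames of exactly
// frameSize samples. Only the very last frame may be shorter.
struct ToyEncoder {
  std::size_t frameSize;
  std::vector<std::size_t> encodedFrameSizes; // sizes of the frames we "encoded"

  void encode(const std::vector<float>& frame) {
    assert(!frame.empty() && frame.size() <= frameSize);
    encodedFrameSizes.push_back(frame.size());
  }
};

// Hypothetical analogue of encodeFrameThroughFifo(): append the incoming
// samples to the FIFO, then encode as many full frames as the FIFO holds.
// When andFlushFifo is true, keep going until the FIFO is empty, so the
// leftover partial frame is encoded instead of being left behind.
void encodeThroughFifo(
    std::deque<float>& fifo,
    ToyEncoder& encoder,
    const std::vector<float>& samples,
    bool andFlushFifo) {
  fifo.insert(fifo.end(), samples.begin(), samples.end());

  while (fifo.size() >= encoder.frameSize ||
         (andFlushFifo && !fifo.empty())) {
    const std::size_t n = std::min(fifo.size(), encoder.frameSize);
    const auto end = fifo.begin() + static_cast<std::ptrdiff_t>(n);
    std::vector<float> frame(fifo.begin(), end);
    fifo.erase(fifo.begin(), end);
    encoder.encode(frame);
  }
}
```

With `frameSize = 4`, pushing 6 samples without flushing encodes one full frame and leaves 2 samples in the FIFO. Pushing 3 more with `andFlushFifo = true` encodes a full frame of 4 and then the final partial frame of 1.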

Member Author

@NicolasHug NicolasHug Jul 8, 2025


Yes, that's all correct. Something I may not have made clear yet, and I will add comments to that end:

The use of the FIFO applies to all frames or to none of them, i.e. a given Encoder instance will either:

  • create and use the FIFO for every single frame, or
  • never use the FIFO.

I hope I was able to clarify all this a bit in fa0856a (#700)
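
As an illustration of the "all frames or none" point, here is a small sketch in which the decision is made once, at construction time, and never revisited per frame. Everything in it is hypothetical (`ToyEncoderConfig`, `ToyAudioEncoder`, `usesFifo`). In particular, the condition used to decide whether a FIFO is needed is an assumption made for the example, not a statement of what the merged code checks.

```cpp
#include <cassert>
#include <deque>
#include <memory>

// Hypothetical configuration, for illustration only.
struct ToyEncoderConfig {
  int inputSampleRate;
  int outputSampleRate;
  bool codecNeedsFixedFrameSize;
};

class ToyAudioEncoder {
 public:
  explicit ToyAudioEncoder(const ToyEncoderConfig& config) {
    // ASSUMPTION: a FIFO is needed when the sample rate changes and the
    // codec wants fixed-size frames. The real condition may differ.
    const bool needsFifo =
        config.inputSampleRate != config.outputSampleRate &&
        config.codecNeedsFixedFrameSize;
    if (needsFifo) {
      // Created once here: from now on every frame goes through it.
      fifo_.reset(new std::deque<float>());
    }
    // Otherwise fifo_ stays null and no frame ever goes through a FIFO.
  }

  // The answer is fixed for the lifetime of the instance.
  bool usesFifo() const {
    return fifo_ != nullptr;
  }

 private:
  std::unique_ptr<std::deque<float>> fifo_;
};
```

Because the choice is tied to the instance rather than to individual frames, the per-frame encoding path does not need to handle a mix of FIFO and non-FIFO frames.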

@NicolasHug NicolasHug merged commit eb51e0e into pytorch:main Jul 8, 2025
37 of 41 checks passed
3 participants