@Funatiq Funatiq commented May 19, 2025

Description

  • Moved logits handling from DecoderBuffers to DecoderInputBuffers.
  • Split DraftBuffers out of DecoderBuffers so that draft-token tensors are managed separately.
  • Removed DecoderBuffers from function signatures across the affected components.
  • Updated the Python bindings to match these changes while keeping existing interfaces compatible.
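
As a rough illustration of the ownership change described above, here is a minimal Python sketch. It is not the TensorRT-LLM code: the field names (`logits`, `draft_tokens`), the `Tensor` alias, and the `greedy_step` function are assumptions made up for this example. Only the type names `DecoderInputBuffers` and `DraftBuffers` come from the description.

```python
from dataclasses import dataclass, field
from typing import List

# Stand-in for a real tensor type; a plain list of floats keeps the sketch runnable.
Tensor = List[float]


@dataclass
class DraftBuffers:
    """Hypothetical holder for draft-token data, split out on its own."""
    draft_tokens: List[List[int]] = field(default_factory=list)


@dataclass
class DecoderInputBuffers:
    """Hypothetical per-step decoder inputs; after the refactor it owns the logits."""
    logits: List[Tensor] = field(default_factory=list)


def greedy_step(inputs: DecoderInputBuffers, drafts: DraftBuffers) -> List[int]:
    """Pick the arg-max token for each request.

    Note the signature: there is no catch-all DecoderBuffers parameter.
    `drafts` is unused in this toy step and appears only to show the
    separated type being passed explicitly.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in inputs.logits]


inputs = DecoderInputBuffers(logits=[[0.1, 0.9], [2.0, 0.5, 1.0]])
tokens = greedy_step(inputs, DraftBuffers())
```

The point of the sketch is the shape of the call: each function receives only the buffers it needs, so a change to draft-token storage does not ripple through code that only reads logits.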

Changes to TRTLLMSampler:

  • Moved decoder_state from algs to store.
  • Used a pure Python type for the decoder buffer logits.

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provides a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".
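
To make the option syntax above concrete, the following Python sketch assembles a `/bot run` comment from a subset of the documented flags. The helper name `build_bot_run` and its parameters are hypothetical and exist only for illustration; the bot itself is driven by plain PR comments, and the multi-GPU flags are left out for brevity.

```python
from typing import List, Optional, Sequence


def build_bot_run(
    disable_fail_fast: bool = False,
    skip_test: bool = False,
    post_merge: bool = False,
    stage_list: Optional[Sequence[str]] = None,
    gpu_type: Optional[Sequence[str]] = None,
    extra_stage: Optional[Sequence[str]] = None,
) -> str:
    """Return the text of a `/bot run` comment (hypothetical helper)."""
    parts: List[str] = ["/bot", "run"]

    # Boolean switches take no value.
    for enabled, flag in (
        (disable_fail_fast, "--disable-fail-fast"),
        (skip_test, "--skip-test"),
        (post_merge, "--post-merge"),
    ):
        if enabled:
            parts.append(flag)

    # List-valued options are written as one quoted, comma-separated string,
    # matching the help text, e.g. --gpu-type "A30, H100_PCIe".
    for values, flag in (
        (stage_list, "--stage-list"),
        (gpu_type, "--gpu-type"),
        (extra_stage, "--extra-stage"),
    ):
        if values:
            joined = ", ".join(values)
            parts.append(f'{flag} "{joined}"')

    return " ".join(parts)


plain = build_bot_run()
no_fail_fast = build_bot_run(disable_fail_fast=True)
gpu_only = build_bot_run(post_merge=True, gpu_type=["A30", "H100_PCIe"])
```

With no arguments the helper yields the bare `/bot run` used in most comments on this PR; `disable_fail_fast=True` yields the `/bot run --disable-fail-fast` form used for the reruns.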

kill

kill

Kill all running builds associated with the pull request.

skip

skip --comment COMMENT

Skip testing for the latest commit on the pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous, since a lack of user care and validation can break the top of tree.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate the current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous, since a lack of user care and validation can break the top of tree.

@Funatiq
Collaborator Author

Funatiq commented May 19, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #5746 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #5746 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4201 completed with status: 'FAILURE'

@Funatiq
Collaborator Author

Funatiq commented May 20, 2025

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #5849 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #5849 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4286 completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch 2 times, most recently from cc5e39b to ce13b9e Compare May 27, 2025 10:53
@Funatiq
Collaborator Author

Funatiq commented May 27, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #6628 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6628 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4843 completed with status: 'FAILURE'

@Funatiq
Collaborator Author

Funatiq commented May 28, 2025

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #6783 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6783 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #4945 completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from 51f8341 to 693609b Compare June 3, 2025 09:03
@Funatiq
Collaborator Author

Funatiq commented Jun 3, 2025

/bot run

@Funatiq
Collaborator Author

Funatiq commented Jun 3, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7319 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7322 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7319 [ run ] completed with state ABORTED

@tensorrt-cicd
Collaborator

PR_Github #7322 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #5306 completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from f1c1043 to eabbdf8 Compare June 3, 2025 14:36
@Funatiq
Collaborator Author

Funatiq commented Jun 3, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7365 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7365 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #5340 completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from eabbdf8 to b82d4dd Compare June 4, 2025 07:49
@Funatiq
Collaborator Author

Funatiq commented Jun 4, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7465 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7465 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #5419 completed with status: 'ABORTED'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from b82d4dd to 2ada5c9 Compare June 6, 2025 15:09
@tensorrt-cicd
Collaborator

PR_Github #8890 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6475 completed with status: 'SUCCESS'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from 47f304e to 2d0fb6b Compare June 16, 2025 07:50
@Funatiq
Collaborator Author

Funatiq commented Jun 16, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #8999 [ run ] triggered by Bot

@dcampora dcampora enabled auto-merge (squash) June 16, 2025 08:07
@tensorrt-cicd
Collaborator

PR_Github #8999 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6571 completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from 2d0fb6b to 68edea3 Compare June 16, 2025 18:50
@Funatiq
Collaborator Author

Funatiq commented Jun 16, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9058 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9058 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #6622 completed with status: 'FAILURE'

@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from 68edea3 to 02d2b71 Compare June 17, 2025 04:59
@Funatiq
Collaborator Author

Funatiq commented Jun 17, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9131 [ run ] triggered by Bot

Funatiq added 4 commits June 17, 2025 10:09
- Moved the handling of logits from `DecoderBuffers` to `DecoderInputBuffers`.
- Separated the `DraftBuffers` from `DecoderBuffers` to manage draft token buffers, enhancing the organization of tensor management.
- Removed `DecoderBuffers` from function signatures across various components.
- Updated Python bindings to reflect changes, maintaining compatibility with existing interfaces.

These changes improve the maintainability and clarity of the decoding process in the batch manager.

Signed-off-by: Robin Kobus <[email protected]>

- Modified the `HandleContextLogits` class to accept `max_num_sequences` and return a list of logits tensors along with the logits index.
- Adjusted the `HandleGenerationLogits` class to work with the updated logits handling, removing the dependency on `DecoderInputBuffers`.
- Updated the `TRTLLMSampler` to accommodate changes in logits management, ensuring proper buffer handling.

These changes enhance the clarity and maintainability of the logits processing workflow.

Signed-off-by: Robin Kobus <[email protected]>
@Funatiq Funatiq force-pushed the dev/refactor_decoder_buffers_2 branch from 02d2b71 to c90f89b Compare June 17, 2025 10:11
@tensorrt-cicd
Collaborator

PR_Github #9131 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6683 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@Funatiq
Collaborator Author

Funatiq commented Jun 17, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9192 [ run ] triggered by Bot

@Funatiq
Collaborator Author

Funatiq commented Jun 17, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9231 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9231 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #6768 completed with status: 'FAILURE'

@Funatiq
Collaborator Author

Funatiq commented Jun 17, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #9235 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9235 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #6772 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@dcampora dcampora merged commit 627062c into NVIDIA:main Jun 18, 2025
3 checks passed
@Funatiq Funatiq deleted the dev/refactor_decoder_buffers_2 branch June 18, 2025 05:10