Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[doc] Add speculative decoding PyTorch docs
#4962 opened Jun 5, 2025 by mikeiovine
Doc: Add info about stop words appearing in output
#4956 opened Jun 5, 2025 by Linda-Stadter
ci: [nvbugs/5280806] Unwaive unittests/_torch.
#4951 opened Jun 5, 2025 by yuxianq
[WIP] Introduce Flux MoE operator
#4948 opened Jun 5, 2025 by lancelly Draft
fix: Mapping rank boundary check bug
#4935 opened Jun 5, 2025 by venkywonka
Kv cache transfer support duplicate heads
#4929 opened Jun 5, 2025 by chuangz0
chore: cleanup GDS Cmake interface
#4928 opened Jun 5, 2025 by achartier
fix:https://nvbugs/5324252
#4925 opened Jun 5, 2025 by nv-guomingz
fix: trtllm-bench --dataset required=True
#4924 opened Jun 5, 2025 by jasonqinzhou
Coalesce text diffs in streaming requests.
#4923 opened Jun 5, 2025 by pathorn
chore: Refactor apply_rope.
#4918 opened Jun 4, 2025 by bobboli