
From NVIDIA Megatron-LM for visibility #18


Open
wants to merge 4,904 commits into base: multi-query-attention

Conversation

RaymondLi0
Collaborator

No description provided.

@RaymondLi0 RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12
@RaymondLi0 RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12
ko3n1g and others added 28 commits May 30, 2025 08:07
ci: Auto-restart on nan

See merge request ADLR/megatron-lm!3388
perf(mla, experimental): MLA RoPE fusion and YARN embedding cache

Closes #429

See merge request ADLR/megatron-lm!2949
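
The YaRN embedding cache mentioned above is presumably about not rebuilding rotary cos/sin tables on every forward pass. A minimal sketch of that caching idea in PyTorch; the class name and cache key are illustrative, not Megatron-LM's actual implementation:

```python
import torch

class CachedRotaryEmbedding:
    """Illustrative cache: build the cos/sin tables once per (seq_len, device, dtype)
    and reuse them instead of recomputing rotary embeddings every forward pass."""

    def __init__(self, dim: int, base: float = 10000.0):
        self.dim = dim
        self.base = base
        self._cache = {}  # (seq_len, device, dtype) -> (cos, sin)

    def get(self, seq_len: int, device, dtype=torch.float32):
        key = (seq_len, str(device), dtype)
        if key not in self._cache:
            inv_freq = 1.0 / (
                self.base ** (torch.arange(0, self.dim, 2, device=device).float() / self.dim)
            )
            t = torch.arange(seq_len, device=device).float()
            freqs = torch.outer(t, inv_freq)          # [seq_len, dim / 2]
            emb = torch.cat((freqs, freqs), dim=-1)   # [seq_len, dim]
            self._cache[key] = (emb.cos().to(dtype), emb.sin().to(dtype))
        return self._cache[key]
```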
Fix custom FSDP float8 tensor set_item

See merge request ADLR/megatron-lm!3280
ci: Move queue blocker

See merge request ADLR/megatron-lm!3401
ci: Improve error-handling of missing logs

See merge request ADLR/megatron-lm!3400
ci: Control job concurrency

See merge request ADLR/megatron-lm!3408
ci: Catch missing logs

See merge request ADLR/megatron-lm!3412
ci: Remove tests from A100

See merge request ADLR/megatron-lm!3411
Add an option to skip counting zeros in grad of ChainedOptimizer

See merge request ADLR/megatron-lm!3393
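
Counting zero-valued elements in gradients requires a full scan of every gradient tensor on each step, so a skip option avoids that cost when the statistic is not needed. A rough sketch of the trade-off, with a hypothetical flag and helper name:

```python
import torch

def count_zeros_in_grads(params, skip_count_zeros: bool = False):
    """Illustrative only: the zero count is a logging metric, so skipping it
    trades that metric for one less pass over every gradient tensor."""
    if skip_count_zeros:
        return None  # caller logs nothing instead of paying for the scan
    num_zeros = 0
    for p in params:
        if p.grad is not None:
            num_zeros += (p.grad == 0).sum().item()
    return num_zeros
```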
Add an interface to set high priority stream groups

See merge request ADLR/megatron-lm!3326
Co-authored-by: Chen-Han Yu <[email protected]>
Co-authored-by: Chenhan Yu <[email protected]>
Llama4 inference

See merge request ADLR/megatron-lm!3241
Change default value of high_priority_stream_groups from [] to None

See merge request ADLR/megatron-lm!3421
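
For context, PyTorch already exposes stream priorities via `torch.cuda.Stream(priority=...)`, where lower numbers are scheduled with higher priority. A hedged sketch of how a stream-group interface built on that might look; the function and group names are placeholders, and the `None` default mirrors the change noted above (distinguishing "not configured" from "explicitly empty"):

```python
import torch

def make_streams(high_priority_stream_groups=None):
    # Hypothetical helper: a default of None (rather than []) lets callers
    # tell "not configured" apart from "explicitly no high-priority groups".
    groups = high_priority_stream_groups or []
    # priority=-1 asks the CUDA scheduler to favor these streams, which helps
    # communication kernels overlap ahead of compute.
    return {name: torch.cuda.Stream(priority=-1) for name in groups}

streams = make_streams(["ep_a2a", "pp_p2p"])  # example group names, not Megatron's
```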
[feat, moe]: FP8 padding optimization of MoE models by padding the routing map.

See merge request ADLR/megatron-lm!3170
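
FP8 GEMMs need per-expert token counts aligned to a fixed multiple, and padding the routing map moves that padding before token dispatch instead of after it. A hypothetical illustration of the idea, assuming a boolean routing map of shape `[num_tokens, num_experts]`; this is not the actual Megatron-LM code path:

```python
import torch

def pad_routing_map(routing_map: torch.Tensor, multiple: int = 16) -> torch.Tensor:
    """Illustrative sketch: route extra (dummy) tokens so that every expert
    receives a token count divisible by `multiple`, as FP8 GEMMs require."""
    tokens_per_expert = routing_map.sum(dim=0)       # [num_experts]
    pad = (-tokens_per_expert) % multiple            # extra tokens each expert needs
    padded = routing_map.clone()
    for e in range(routing_map.shape[1]):
        if pad[e] > 0:
            # pick tokens not yet routed to this expert and mark them as padding routes
            free = (~padded[:, e]).nonzero(as_tuple=True)[0][: pad[e]]
            padded[free, e] = True
    return padded
```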
Remove deprecated alltoall_seq dispatcher.

See merge request ADLR/megatron-lm!3306
ko3n1g and others added 30 commits July 6, 2025 13:38
tests: Update golden values

See merge request ADLR/megatron-lm!3591
build: Add pytest-asyncio

See merge request ADLR/megatron-lm!3592
ci: Comment out outdated test

See merge request ADLR/megatron-lm!3597
ci: Disable flaky test

See merge request ADLR/megatron-lm!3598
Add default values for Fp8Padding and Fp8Unpadding

See merge request ADLR/megatron-lm!3501
Add flag to disable early termination for static inference

See merge request ADLR/megatron-lm!3509
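
Disabling early termination keeps a static-batch generation loop running for the full `max_new_tokens`, which keeps per-step timings comparable even after every sequence has emitted EOS. A hedged sketch with a hypothetical flag name:

```python
import torch

def generate(model, tokens, max_new_tokens, eos_id, disable_early_termination=False):
    """Greedy static-batch decode loop; the flag name is illustrative."""
    finished = torch.zeros(tokens.shape[0], dtype=torch.bool, device=tokens.device)
    for _ in range(max_new_tokens):
        logits = model(tokens)                       # [batch, seq, vocab]
        next_tok = logits[:, -1].argmax(dim=-1)      # [batch]
        tokens = torch.cat([tokens, next_tok[:, None]], dim=1)
        finished |= next_tok == eos_id
        if finished.all() and not disable_early_termination:
            break                                    # early-termination path
    return tokens
```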
Signed-off-by: oliver könig <[email protected]>
…er" since in previous PR we removed `encoder_or_decoder`
…or_decoder" since in previous PR we removed `encoder_or_decoder`"

This reverts commit 9e49aa4.
Fix OOM when merging text datasets

Closes #461

See merge request ADLR/megatron-lm!3356
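
An OOM while merging text datasets typically comes from materializing whole shards in memory; the usual fix is to stream the data in bounded chunks. A generic sketch of that pattern (not the specific Megatron-LM fix):

```python
import shutil

def merge_bin_files(inputs, output_path, chunk_bytes=64 * 1024 * 1024):
    """Stream each input file in fixed-size chunks so the merge never holds an
    entire dataset shard in memory at once."""
    with open(output_path, "wb") as out:
        for path in inputs:
            with open(path, "rb") as f:
                shutil.copyfileobj(f, out, length=chunk_bytes)
```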
Adding CUDA Graph Support for Frozen Transformer Layers

See merge request ADLR/megatron-lm!3531
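
Frozen layers run with fixed shapes and no gradients, which makes them good candidates for CUDA graph capture. A sketch using PyTorch's public `torch.cuda.CUDAGraph` / `torch.cuda.graph` API; the warm-up-then-capture pattern follows the PyTorch documentation, while the wrapper itself is illustrative rather than Megatron's integration:

```python
import torch

def capture_frozen_layer(layer, example_input):
    """Capture a frozen, fixed-shape layer into a CUDA graph and return a
    callable that replays it with new input data."""
    static_in = example_input.clone()

    # Warm up on a side stream before capture, as the CUDA graph docs recommend.
    s = torch.cuda.Stream()
    s.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(s), torch.no_grad():
        layer(static_in)
    torch.cuda.current_stream().wait_stream(s)

    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph), torch.no_grad():
        static_out = layer(static_in)

    def run(x):
        static_in.copy_(x)   # copy new data into the captured input buffer
        graph.replay()       # replay the recorded kernels
        return static_out

    return run
```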
Add assertion to PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True with nccl_ub case

See merge request ADLR/megatron-lm!3599
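
The commit title alone does not say whether `expandable_segments:True` is required or forbidden when `nccl_ub` is enabled; the sketch below assumes it is required and is only meant to illustrate an environment-based guard of this kind:

```python
import os

def assert_alloc_conf_for_nccl_ub(nccl_ub: bool):
    # Assumption: nccl_ub needs the expandable-segments allocator setting.
    if nccl_ub:
        alloc_conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
        assert "expandable_segments:True" in alloc_conf, (
            "nccl_ub expects PYTORCH_CUDA_ALLOC_CONF to include "
            "expandable_segments:True"
        )
```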
Miscellaneous timing fixes for inference scripts

See merge request ADLR/megatron-lm!3527
…is not initialized

Co-authored-by: Mcore Bot <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Fix issues from cpu init when parallel state is not initialized

See merge request ADLR/megatron-lm!3520