forked from NVIDIA/Megatron-LM
From NVIDIA Megatron-LM for visibility #18
Open: RaymondLi0 wants to merge 4,904 commits into bigcode-project:multi-query-attention from NVIDIA:main
Conversation
- ci: Auto-restart on nan (See merge request ADLR/megatron-lm!3388)
- perf(mla, experimental): MLA RoPE fusion and YARN embedding cache. Closes #429 (See merge request ADLR/megatron-lm!2949) Co-authored-by: xuwenc <[email protected]>, jianbinc <[email protected]>
- Fix custom FSDP float8 tensor set_item (See merge request ADLR/megatron-lm!3280)
- ci: Move queue blocker (See merge request ADLR/megatron-lm!3401)
- ci: Improve error-handling of missing logs (See merge request ADLR/megatron-lm!3400) Co-authored-by: Mcore Bot <[email protected]>
- ci: Control job concurrency (See merge request ADLR/megatron-lm!3408) Co-authored-by: Mcore Bot <[email protected]>
- ci: Catch missing logs (See merge request ADLR/megatron-lm!3412)
- ci: Remove tests from A100 (See merge request ADLR/megatron-lm!3411)
- Add an option to skip counting zeros in grad of ChainedOptimizer (See merge request ADLR/megatron-lm!3393)
- Add an interface to set high priority stream groups (See merge request ADLR/megatron-lm!3326)
- Llama4 inference (See merge request ADLR/megatron-lm!3241) Co-authored-by: Chen-Han Yu <[email protected]>, Chenhan Yu <[email protected]>
- Change default value of high_priority_stream_groups from [] to None (See merge request ADLR/megatron-lm!3421)
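Switching an empty-list default to None follows a standard Python convention: a mutable default like [] is created once and shared across every call, and it also makes "unset" indistinguishable from "explicitly empty". A minimal sketch of the pattern (the function name here is hypothetical; only the parameter name comes from the commit title):

```python
def create_process_groups(high_priority_stream_groups=None):
    # Using None as the default avoids sharing one mutable list across
    # all calls, and lets callers distinguish "not set" from "empty".
    if high_priority_stream_groups is None:
        high_priority_stream_groups = []
    return high_priority_stream_groups
```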
- [feat, moe]: FP8 padding optimization of MoE models by padding the routing map (See merge request ADLR/megatron-lm!3170)
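FP8 GEMM kernels typically require aligned tensor dimensions, so rounding each expert's token count up to an alignment multiple lets expert computation stay in FP8 without falling back to a slower path. A rough sketch of the padding arithmetic, assuming a multiple-of-16 alignment requirement (the helper name and the exact multiple are assumptions, not from the merge request):

```python
def pad_expert_token_counts(tokens_per_expert, multiple=16):
    # Round each expert's token count up to the alignment multiple;
    # the extra slots would be filled with dummy tokens at dispatch.
    return [-(-n // multiple) * multiple for n in tokens_per_expert]
```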
- Remove deprecated alltoall_seq dispatcher (See merge request ADLR/megatron-lm!3306) Co-authored-by: Robin Zhang <[email protected]>
- tests: Update golden values (See merge request ADLR/megatron-lm!3591)
- build: Add pytest-asyncio (See merge request ADLR/megatron-lm!3592)
- ci: Comment out outdated test (See merge request ADLR/megatron-lm!3597)
- ci: Disable flaky test (See merge request ADLR/megatron-lm!3598)
- Add default values for Fp8Padding and Fp8Unpadding (See merge request ADLR/megatron-lm!3501)
- Add flag to disable early termination for static inference (See merge request ADLR/megatron-lm!3509)
- Signed-off-by: oliver könig <[email protected]>
- Signed-off-by: oliver könig <[email protected]>
- …ted Functions" This reverts commit 509384d.
- …er" since in previous PR we removed `encoder_or_decoder`
- Signed-off-by: oliver könig <[email protected]>
- … PP related Functions"" This reverts commit 1169ce2.
- …or_decoder" since in previous PR we removed `encoder_or_decoder`" This reverts commit 9e49aa4.
- Fix OOM when merging text datasets. Closes #461 (See merge request ADLR/megatron-lm!3356)
- Adding CUDA Graph Support for Frozen Transformer Layers (See merge request ADLR/megatron-lm!3531) Co-authored-by: Lifu Zhang <[email protected]>
- Add assertion to PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True with nccl_ub case (See merge request ADLR/megatron-lm!3599)
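PYTORCH_CUDA_ALLOC_CONF is a comma-separated list of key:value options parsed by PyTorch's caching allocator. A hedged sketch of what such a guard might look like (the function name, message, and exact incompatibility rationale are assumptions inferred from the commit title):

```python
import os

def check_nccl_ub_alloc_conf(nccl_ub: bool) -> None:
    # Parse "key1:val1,key2:val2" from the allocator config and refuse
    # to combine expandable_segments:True with NCCL user buffers.
    conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
    opts = dict(kv.split(":", 1) for kv in conf.split(",") if ":" in kv)
    if nccl_ub:
        assert opts.get("expandable_segments") != "True", \
            "expandable_segments:True is not supported with nccl_ub"
```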
- Miscellaneous timing fixes for inference scripts (See merge request ADLR/megatron-lm!3527)
- Fix issues from cpu init when parallel state is not initialized (See merge request ADLR/megatron-lm!3520) Co-authored-by: Mcore Bot <[email protected]>, yaoyu-33 <[email protected]>
No description provided.