forked from NVIDIA/Megatron-LM
From NVIDIA Megatron-LM for visibility #18
Open: RaymondLi0 wants to merge 4,904 commits into bigcode-project:multi-query-attention from NVIDIA:main
Conversation
- ci: Auto-restart on nan (See merge request ADLR/megatron-lm!3388)
- perf(mla, experimental): MLA RoPE fusion and YARN embedding cache. Closes #429 (See merge request ADLR/megatron-lm!2949) Co-authored-by: xuwenc <[email protected]>, jianbinc <[email protected]>
- Fix custom FSDP float8 tensor set_item (See merge request ADLR/megatron-lm!3280)
- ci: Move queue blocker (See merge request ADLR/megatron-lm!3401)
- ci: Improve error-handling of missing logs (See merge request ADLR/megatron-lm!3400) Co-authored-by: Mcore Bot <[email protected]>
- ci: Control job concurrency (See merge request ADLR/megatron-lm!3408) Co-authored-by: Mcore Bot <[email protected]>
- ci: Catch missing logs (See merge request ADLR/megatron-lm!3412)
- ci: Remove tests from A100 (See merge request ADLR/megatron-lm!3411)
- Add an option to skip counting zeros in grad of ChainedOptimizer (See merge request ADLR/megatron-lm!3393)
- Add an interface to set high priority stream groups (See merge request ADLR/megatron-lm!3326)
- Llama4 inference (See merge request ADLR/megatron-lm!3241) Co-authored-by: Chen-Han Yu <[email protected]>, Chenhan Yu <[email protected]>
- Change default value of high_priority_stream_groups from [] to None (See merge request ADLR/megatron-lm!3421)
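Switching an empty-list default to None follows a standard Python convention: a mutable default like [] is created once and shared across every call, and it also makes "unset" indistinguishable from "explicitly empty". A minimal sketch of the pattern (the function name here is hypothetical; only the parameter name comes from the commit title):

```python
def create_process_groups(high_priority_stream_groups=None):
    # Using None as the default avoids sharing one mutable list across
    # all calls, and lets callers distinguish "not set" from "empty".
    if high_priority_stream_groups is None:
        high_priority_stream_groups = []
    return high_priority_stream_groups
```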
- [feat, moe]: FP8 padding optimization of MoE models by padding the routing map (See merge request ADLR/megatron-lm!3170)
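FP8 GEMM kernels typically require aligned tensor dimensions, so rounding each expert's token count up to an alignment multiple lets expert computation stay in FP8 without falling back to a slower path. A rough sketch of the padding arithmetic, assuming a multiple-of-16 alignment requirement (the helper name and the exact multiple are assumptions, not from the merge request):

```python
def pad_expert_token_counts(tokens_per_expert, multiple=16):
    # Round each expert's token count up to the alignment multiple;
    # the extra slots would be filled with dummy tokens at dispatch.
    return [-(-n // multiple) * multiple for n in tokens_per_expert]
```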
- Remove deprecated alltoall_seq dispatcher (See merge request ADLR/megatron-lm!3306) Co-authored-by: Robin Zhang <[email protected]>
- tests: Update golden values (See merge request ADLR/megatron-lm!3591)
- build: Add pytest-asyncio (See merge request ADLR/megatron-lm!3592)
- ci: Comment out outdated test (See merge request ADLR/megatron-lm!3597)
- ci: Disable flaky test (See merge request ADLR/megatron-lm!3598)
- Add default values for Fp8Padding and Fp8Unpadding (See merge request ADLR/megatron-lm!3501)
- Add flag to disable early termination for static inference (See merge request ADLR/megatron-lm!3509)
- Signed-off-by: oliver könig <[email protected]>
- Signed-off-by: oliver könig <[email protected]>
- …ted Functions" This reverts commit 509384d.
- …er" since in previous PR we removed `encoder_or_decoder`
- Signed-off-by: oliver könig <[email protected]>
- … PP related Functions"" This reverts commit 1169ce2.
- …or_decoder" since in previous PR we removed `encoder_or_decoder`" This reverts commit 9e49aa4.
- Fix OOM when merging text datasets. Closes #461 (See merge request ADLR/megatron-lm!3356)
- Adding CUDA Graph Support for Frozen Transformer Layers (See merge request ADLR/megatron-lm!3531) Co-authored-by: Lifu Zhang <[email protected]>
- Add assertion to PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True with nccl_ub case (See merge request ADLR/megatron-lm!3599)
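PYTORCH_CUDA_ALLOC_CONF is a comma-separated list of key:value options parsed by PyTorch's caching allocator. A hedged sketch of what such a guard might look like (the function name, message, and exact incompatibility rationale are assumptions inferred from the commit title):

```python
import os

def check_nccl_ub_alloc_conf(nccl_ub: bool) -> None:
    # Parse "key1:val1,key2:val2" from the allocator config and refuse
    # to combine expandable_segments:True with NCCL user buffers.
    conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
    opts = dict(kv.split(":", 1) for kv in conf.split(",") if ":" in kv)
    if nccl_ub:
        assert opts.get("expandable_segments") != "True", \
            "expandable_segments:True is not supported with nccl_ub"
```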
- Miscellaneous timing fixes for inference scripts (See merge request ADLR/megatron-lm!3527)
- Fix issues from cpu init when parallel state is not initialized (See merge request ADLR/megatron-lm!3520) Co-authored-by: Mcore Bot <[email protected]>, yaoyu-33 <[email protected]>
No description provided.