
Conversation

@hjh0119 (Collaborator) commented on Sep 11, 2025

Optimize weight synchronization between the training model and the inference engine (vLLM):

LoRA

  1. Synchronize and load only the trained adapter weights (in both colocate and server modes).
  2. In server mode, transmit the adapter weights as a single flattened tensor to reduce the overhead of transmitting the parameters (a sketch of the packing step follows below).
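
The packing step can be illustrated with a minimal sketch (the helper names and metadata layout below are illustrative assumptions, not the PR's actual implementation): all adapter tensors are concatenated into one contiguous buffer that can be sent in a single communication call, alongside lightweight metadata used to rebuild the named tensors on the other side.

```python
from typing import Dict, List, Tuple

import torch


def flatten_adapter_weights(
    adapter_state: Dict[str, torch.Tensor],
) -> Tuple[torch.Tensor, List[Tuple[str, torch.Size, int]]]:
    """Pack all LoRA adapter tensors into one contiguous 1-D buffer.

    Assumes every adapter tensor shares the same dtype (typical for LoRA).
    Returns the flat buffer plus per-tensor metadata (name, shape, numel)
    so the receiver can slice and reshape it back into a state dict.
    """
    metadata = [(name, t.shape, t.numel()) for name, t in adapter_state.items()]
    flat = torch.cat([t.detach().reshape(-1) for t in adapter_state.values()])
    return flat, metadata


def unflatten_adapter_weights(
    flat: torch.Tensor,
    metadata: List[Tuple[str, torch.Size, int]],
) -> Dict[str, torch.Tensor]:
    """Rebuild the named adapter tensors from the flat buffer."""
    state: Dict[str, torch.Tensor] = {}
    offset = 0
    for name, shape, numel in metadata:
        state[name] = flat[offset:offset + numel].view(shape)
        offset += numel
    return state
```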

FULL

  1. Remove the original per-tensor synchronization logic and adopt a bucketing strategy that groups parameters into size-bounded buckets, reducing the number of communication requests and their overhead, especially for MoE models, which have many more tensors than dense models (a bucketing sketch follows below).
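
For the full-parameter path, a minimal bucketing sketch (the bucket size and helper name are illustrative assumptions, not the PR's actual implementation): parameters are grouped into size-bounded buckets so that each synchronization request carries many tensors at once instead of one request per tensor.

```python
from typing import Dict, Iterator, List, Tuple

import torch


def bucket_parameters(
    named_params: Dict[str, torch.Tensor],
    bucket_size_mb: int = 512,
) -> Iterator[List[Tuple[str, torch.Tensor]]]:
    """Group tensors into buckets of roughly `bucket_size_mb` megabytes.

    Instead of issuing one synchronization request per tensor (costly for MoE
    models with thousands of expert tensors), the sender transmits one request
    per bucket.
    """
    limit = bucket_size_mb * 1024 * 1024
    bucket: List[Tuple[str, torch.Tensor]] = []
    bucket_bytes = 0
    for name, tensor in named_params.items():
        nbytes = tensor.numel() * tensor.element_size()
        if bucket and bucket_bytes + nbytes > limit:
            yield bucket
            bucket, bucket_bytes = [], 0
        bucket.append((name, tensor))
        bucket_bytes += nbytes
    if bucket:
        yield bucket
```

Each bucket would then be flattened and broadcast in a single call, analogous to the adapter packing above.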

@hjh0119 changed the title from "[grpo] Optimize LoRA training vLLM weight synchronization" to "[WIP] Optimize LoRA training vLLM weight synchronization" on Sep 11, 2025
@hjh0119 marked this pull request as ready for review on September 11, 2025, 09:27
@hjh0119 (Collaborator, Author) commented on Sep 11, 2025

/gemini review

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces an optimization for LoRA training with vLLM by enabling in-memory weight synchronization using flattened tensors. This avoids disk I/O and should improve training speed. The changes involve adding new arguments, new protocol definitions, and new methods in the rollout engine and GRPO trainer. A key part of the implementation is monkey-patching vLLM to support loading LoRA adapters from tensors. The overall approach is sound, but there are a few areas that need attention, such as ensuring deterministic adapter selection, cleaning up commented-out code, and addressing TODO comments.
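
To make this concrete, here is a minimal sketch of the receiving side under the same illustrative assumptions as the sketches above (the metadata layout, the bfloat16 dtype, and `engine.load_lora_from_tensors` are placeholders; the PR patches vLLM internals rather than exposing such a public method, and the metadata is assumed to have been shared beforehand):

```python
from typing import Dict, List, Tuple

import torch
import torch.distributed as dist


def receive_and_load_adapter(
    engine,
    metadata: List[Tuple[str, torch.Size, int]],
    src_rank: int = 0,
    device: str = "cuda",
) -> None:
    """Receive one flattened adapter buffer and load it without touching disk."""
    # One broadcast for the whole adapter instead of one call per tensor.
    total_numel = sum(numel for _, _, numel in metadata)
    flat = torch.empty(total_numel, dtype=torch.bfloat16, device=device)
    dist.broadcast(flat, src=src_rank)

    # Slice the buffer back into named tensors using the shared metadata.
    adapter_state: Dict[str, torch.Tensor] = {}
    offset = 0
    for name, shape, numel in metadata:
        adapter_state[name] = flat[offset:offset + numel].view(shape)
        offset += numel

    # Register the in-memory adapter with the inference engine
    # (placeholder hook; not a public vLLM API).
    engine.load_lora_from_tensors(adapter_state)
```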

@hjh0119 changed the title from "[WIP] Optimize LoRA training vLLM weight synchronization" to "Optimize LoRA training vLLM weight synchronization" on Sep 12, 2025
@hjh0119 (Collaborator, Author) commented on Sep 12, 2025

Qwen2.5-VL-7B-Instruct, server mode, tp=2, dp=2 → 10× speed-up

[benchmark screenshot]

@hjh0119 (Collaborator, Author) commented on Sep 12, 2025

/gemini review

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a significant optimization for LoRA training with vLLM by synchronizing only the adapter weights instead of the full model. This is achieved by patching vLLM to load adapters from in-memory tensors and flattening the weights for more efficient communication. The changes are well-implemented across the trainer, rollout logic, and communication protocols. The documentation has also been updated to reflect these new features. My review includes a few suggestions to improve code quality and fix a minor typo in the documentation.

@hjh0119 changed the title from "Optimize LoRA training vLLM weight synchronization" to "[grpo] Optimize LoRA training vLLM weight synchronization" on Sep 12, 2025
@hjh0119 changed the title from "[grpo] Optimize LoRA training vLLM weight synchronization" to "[grpo] Optimize vLLM weight synchronization" on Oct 10, 2025
@hjh0119 changed the title from "[grpo] Optimize vLLM weight synchronization" to "[grpo] Optimize vLLM weight synchronization for server mode" on Oct 10, 2025
@hjh0119 changed the title from "[grpo] Optimize vLLM weight synchronization for server mode" to "[grpo] Optimize vLLM weight synchronization" on Oct 10, 2025