FEAT: Kimi-VL-A3B #3372
Conversation
Just found this PR after failing to run Kimi-VL with Xinference. Is there any plan to get it merged?
I just tried it in the latest Xinference Docker image with the command below, and it seems to work (on a single H20 96G):
python3 -m vllm.entrypoints.openai.api_server --port 8888 --served-model-name kimi-vl --trust-remote-code --model /workspace/modelscope/hub/models/moonshotai/Kimi-VL-A3B-Thinking/ --tensor-parallel-size 1 --max-num-batched-tokens 131072 --max-model-len 131072 --max-num-seqs 512 --limit-mm-per-prompt image=256
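For reference, a minimal sketch of querying that server through its OpenAI-compatible API. The port (8888) and served model name (kimi-vl) come from the command above; the image URL and prompt are placeholders:

```python
# Minimal sketch: call the vLLM OpenAI-compatible server started above.
# Port and model name match the command; the image URL is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8888/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="kimi-vl",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/demo.png"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```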
80% WIP; waiting for a refactor of the multimodal LLM engine.
Hi, @Minamiyama. Please rebase the code onto the latest main branch. The key point is: you don't need to focus on adapting to Xinference's output format; just concentrate on implementing stream generation correctly.
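For illustration, a rough, framework-agnostic sketch of what a correct streaming loop looks like, assuming a Hugging Face-style model with `TextIteratorStreamer`. The Xinference-specific wrapper types are omitted, and the chunk format shown is an assumption:

```python
# Illustrative only: generic streaming generation with TextIteratorStreamer.
# Xinference's actual chat/stream interfaces are not shown; model, processor,
# and the chunk dict below are assumptions for this sketch.
from threading import Thread
from transformers import TextIteratorStreamer

def stream_chat(model, processor, inputs):
    streamer = TextIteratorStreamer(
        processor.tokenizer, skip_prompt=True, skip_special_tokens=True
    )
    generation_kwargs = dict(**inputs, streamer=streamer, max_new_tokens=1024)
    # Run generate() in a background thread so tokens can be yielded as they arrive.
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()
    for new_text in streamer:
        # Yield OpenAI-style delta chunks so the caller can forward them as SSE.
        yield {"choices": [{"index": 0, "delta": {"content": new_text}, "finish_reason": None}]}
    yield {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}
```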
Going further, if you're interested, you could explore integrating this model into a continuous-batching-capable framework in the future. This would require implementing additional interfaces; refer to
got it, thx |
# Conflicts:
#   xinference/model/llm/llm_family.json
#   xinference/model/llm/llm_family_modelscope.json
Add support for Kimi-VL-A3B-Instruct and Kimi-VL-A3B-Thinking-2506 vision-language models with multimodal reasoning capabilities
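Once merged, launching one of these models through Xinference might look roughly like the sketch below. The model name and engine values are guesses; the real registered names are defined in the llm_family JSON files touched by this PR, and the chat signature can differ across Xinference versions:

```python
# Hypothetical usage after this PR is merged; "kimi-vl-a3b" is an assumed
# registered name, the real one comes from llm_family.json in this branch.
from xinference.client import Client

client = Client("http://localhost:9997")
model_uid = client.launch_model(
    model_name="kimi-vl-a3b",       # assumed registered name
    model_type="LLM",
    model_engine="transformers",    # or vllm, depending on what the PR wires up
)
model = client.get_model(model_uid)
# The exact chat signature may vary by Xinference version; this follows the
# OpenAI-style messages format used by recent releases.
completion = model.chat(
    messages=[{"role": "user", "content": "Describe the attached image."}],
)
print(completion)
```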