Phi4-multimodal rotary_scaling is RotaryScalingType.none #4728

Open
jl749 opened this issue May 28, 2025 · 0 comments



jl749 commented May 28, 2025

Hello TensorRT-LLM team

I have a question regarding Phi4-multimodal trtllm deployment.

  • tensorrt_llm == 0.19.0
  • transformers == 4.51.0

https://github.com/NVIDIA/TensorRT-LLM/blob/v0.19.0/tensorrt_llm/models/__init__.py#L157
suggests that microsoft/Phi-4-multimodal-instruct (config.json) follows the Phi3ForCausalLM pipeline when trtllm-build is called:

# phi4mm example
huggingface-cli download microsoft/Phi-4-multimodal-instruct --local-dir Phi-4-multimodal-instruct_BASELINE
python3 -m TensorRT-LLM.examples.phi.convert_checkpoint \
  --model_dir Phi-4-multimodal-instruct_BASELINE \
  --output_dir Phi-4-multimodal-instruct_BASELINE/trtllm_ckpt
trtllm-build \
  --checkpoint_dir tests/Phi-4-multimodal-instruct_BASELINE/trtllm_ckpt \
  --output_dir tests/Phi-4-multimodal-instruct_BASELINE/trtllm_engine \
  --max_beam_width 1 --max_batch_size 1 --max_input_len 1024 --max_seq_len 2048  \
  --context_fmha enable --remove_input_padding enable \
  --kv_cache_type paged --gpt_attention_plugin auto --gemm_plugin disable
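
For context, one quick way to see the mismatch is to compare the rope_scaling entry in the Hugging Face config.json with whatever rope/rotary fields the converted TRT-LLM checkpoint config ends up with. This is only a minimal sketch: the paths follow the commands above, and the checkpoint config field names vary by TRT-LLM version, so adjust as needed.

import json

# Paths follow the commands above; adjust to your layout.
hf_config_path = "Phi-4-multimodal-instruct_BASELINE/config.json"
trt_config_path = "Phi-4-multimodal-instruct_BASELINE/trtllm_ckpt/config.json"

# The Hugging Face config declares rope_scaling (type "longrope" for Phi-4-multimodal).
with open(hf_config_path) as f:
    hf_cfg = json.load(f)
rope_scaling = hf_cfg.get("rope_scaling") or {}
print("HF rope scaling type:", rope_scaling.get("type") or rope_scaling.get("rope_type"))

# Dump whatever rope/rotary-related fields the converted checkpoint config recorded.
# (Exact field names depend on the TRT-LLM version, so just filter for them.)
with open(trt_config_path) as f:
    trt_cfg = json.load(f)
print("TRT-LLM rope/rotary fields:",
      {k: v for k, v in trt_cfg.items() if "rope" in k or "rotary" in k})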

However, when the Attention modules are initialized under Phi3ForCausalLM (link), rotary_embedding_scaling is set to None, which makes rotary_embedding_scale_type become RotaryScalingType.none instead of RotaryScalingType.longrope.

Is this expected?

When rope_type is longrope, I expect the Attention modules to contain:

  • self.rotary_embedding_scale_type = RotaryScalingType.longrope
  • self.position_embedding_type = PositionEmbeddingType.long_rope
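
For reference, here is a minimal sketch of the mapping I would expect, written out as plain Python. It only spells out my expectation, not the actual TensorRT-LLM initialization logic; the enum members are the ones named above, plus PositionEmbeddingType.rope_gpt_neox for the unscaled fallback case.

from tensorrt_llm.functional import PositionEmbeddingType, RotaryScalingType

def expected_rope_settings(rope_scaling: dict | None):
    """Sketch of the attention attributes I would expect for a given HF
    rope_scaling block -- not the actual TensorRT-LLM conversion code."""
    rope_type = (rope_scaling or {}).get("type") or (rope_scaling or {}).get("rope_type")
    if rope_type == "longrope":
        # Phi-4-multimodal case: longrope scaling should be carried through.
        return PositionEmbeddingType.long_rope, RotaryScalingType.longrope
    # Plain RoPE with no scaling applied.
    return PositionEmbeddingType.rope_gpt_neox, RotaryScalingType.none

# For microsoft/Phi-4-multimodal-instruct this gives
# (PositionEmbeddingType.long_rope, RotaryScalingType.longrope),
# whereas the engine built above ends up with RotaryScalingType.none.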