Hello TensorRT-LLM team
I have a question regarding Phi4-multimodal TensorRT-LLM deployment.
https://github.com/NVIDIA/TensorRT-LLM/blob/v0.19.0/tensorrt_llm/models/__init__.py#L157 suggests that `microsoft/Phi-4-multimodal-instruct` (config.json) follows the `Phi3ForCausalLM` pipeline when `trtllm-build` is called:
```bash
# phi4mm example
huggingface-cli download microsoft/Phi-4-multimodal-instruct --local-dir Phi-4-multimodal-instruct_BASELINE

python3 -m TensorRT-LLM.examples.phi.convert_checkpoint \
    --model_dir Phi-4-multimodal-instruct_BASELINE \
    --output_dir Phi-4-multimodal-instruct_BASELINE/trtllm_ckpt

trtllm-build \
    --checkpoint_dir tests/Phi-4-multimodal-instruct_BASELINE/trtllm_ckpt \
    --output_dir tests/Phi-4-multimodal-instruct_BASELINE/trtllm_engine \
    --max_beam_width 1 --max_batch_size 1 --max_input_len 1024 --max_seq_len 2048 \
    --context_fmha enable --remove_input_padding enable \
    --kv_cache_type paged --gpt_attention_plugin auto --gemm_plugin disable
```
However, when the `Attention` modules are initialized under `Phi3ForCausalLM` (link), `rotary_embedding_scaling=None` is passed,
making `rotary_embedding_scale_type` become `RotaryScalingType.none` instead of `RotaryScalingType.longrope`.
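To make the behaviour concrete, here is a minimal, self-contained sketch of the fallback I am describing. The enum only loosely mirrors TensorRT-LLM's `RotaryScalingType`, and `resolve_scale_type` is a hypothetical helper, not the actual library code:

```python
# Standalone illustration of the fallback described above; NOT the real
# TensorRT-LLM implementation, just the mapping logic I expect.
from enum import Enum


class RotaryScalingType(Enum):
    none = 0
    linear = 1
    dynamic = 2
    longrope = 3


def resolve_scale_type(rotary_embedding_scaling):
    # If the caller never forwards the HF rope_scaling config (i.e. passes None),
    # there is nothing to map, so the scale type silently stays at `none`.
    if rotary_embedding_scaling is None:
        return RotaryScalingType.none
    return RotaryScalingType[rotary_embedding_scaling["type"]]


print(resolve_scale_type(None))                   # RotaryScalingType.none
print(resolve_scale_type({"type": "longrope"}))   # RotaryScalingType.longrope
```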
Is this expected?
When `rope_type` is `longrope`, I expect the `Attention` modules to contain:

```python
self.rotary_embedding_scale_type = RotaryScalingType.longrope
self.position_embedding_type = PositionEmbeddingType.long_rope
```
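For reference, a quick way to confirm what the downloaded checkpoint itself declares (assuming the `huggingface-cli download` command above has been run, so `config.json` sits under `Phi-4-multimodal-instruct_BASELINE`):

```python
# Sanity check: print the rope scaling type declared by the HF checkpoint.
import json

with open("Phi-4-multimodal-instruct_BASELINE/config.json") as f:
    hf_config = json.load(f)

rope_scaling = hf_config.get("rope_scaling") or {}
# Older HF configs use the key "type", newer ones "rope_type"; check both.
print(rope_scaling.get("type") or rope_scaling.get("rope_type"))  # expected: "longrope"
```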