GPU Utilization is 100% even when we are not inferencing. #6117

ashissharma97 · 2025-05-08T08:54:59Z

Hi there,

We are hosting DeepSeek R1 on 2 H100 machines. After starting the server on both machines, we noticed that GPU utilization is at 100% even though no requests are being served. When we sent concurrent requests from 1000 clients, it stayed at 100%. Is this an issue with SGLang, or are we missing something?

This is the command we are running to start the server.

python3 -m sglang.launch_server --model-path DeepSeek-R1 --tp 16 --dist-init-addr x.x.x.x:5000 --host 0.0.0.0 --nnodes 2 --node-rank 0 --trust-remote-code --port 40000 --attention-backend fa3

This is the GPU Utilization screenshot.
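
To cross-check the reading outside of a screenshot, here is a minimal sketch that samples each GPU through `nvidia-smi`. It assumes `nvidia-smi` is on the PATH of each node; the helper names (`parse_smi_csv`, `sample_gpus`) are made up for illustration. The `utilization.gpu` field only reports the share of the sample period in which at least one kernel was running, so power draw is included as a second signal that might help tell an idle server from a loaded one.

```python
import subprocess

# Fields requested from nvidia-smi; all three are standard --query-gpu properties.
QUERY = "index,utilization.gpu,power.draw"


def _to_float(value):
    """Return a float, or None when nvidia-smi reports something like "[N/A]"."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_smi_csv(text):
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        index, util, power = [field.strip() for field in line.split(",")]
        rows.append(
            {
                "gpu": int(index),
                "util_pct": _to_float(util),
                "power_w": _to_float(power),
            }
        )
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed per-GPU readings."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_smi_csv(result.stdout)
```

Calling `sample_gpus()` once while the server is idle and once while the 1000 clients are connected gives two sets of readings to compare; if utilization is 100% in both but power draw differs clearly, the idle figure likely does not reflect real inference work.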

Replies: 0 comments