GPU Utilization is 100% even when we are not inferencing. #6117
ashissharma97 started this conversation in General
Hi there,
We are hosting DeepSeek R1 on two H100 machines. After starting the server on both machines, we noticed that GPU utilization sits at 100% even though no requests are being served. It also stays at 100% when we send 1000 concurrent client requests, so the reading is the same whether the server is idle or under load. Is this an issue with SGLang, or are we missing something?
This is the command we are running to start the server.
python3 -m sglang.launch_server --model-path DeepSeek-R1 --tp 16 --dist-init-addr x.x.x.x:5000 --host 0.0.0.0 --nnodes 2 --node-rank 0 --trust-remote-code --port 40000 --attention-backend fa3
This is the GPU Utilization screenshot.
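In case the screenshot does not load, the same readings can be captured as text. Below is a minimal sketch, assuming `nvidia-smi` is on the PATH of each node; the names `GpuSample`, `parse_gpu_samples`, and `read_gpu_samples` are hypothetical helpers and not part of SGLang. Because `utilization.gpu` only reports the share of the sample period during which some kernel was executing, the sketch also records power draw and memory use as additional signals to compare between the idle and loaded states.

```python
import subprocess
from typing import NamedTuple

# Fields requested from nvidia-smi, in the order they are parsed below.
QUERY_FIELDS = "index,utilization.gpu,memory.used,power.draw"


class GpuSample(NamedTuple):
    index: int
    util_percent: int      # percent of the sample period with a kernel running
    memory_used_mib: int
    power_watts: float


def parse_gpu_samples(csv_text: str) -> list[GpuSample]:
    """Parse `--format=csv,noheader,nounits` output, one GPU per line.

    Assumes every field is numeric; a GPU that reports "[N/A]" for a
    field would raise ValueError here.
    """
    samples = []
    for line in csv_text.strip().splitlines():
        index, util, mem, power = [field.strip() for field in line.split(",")]
        samples.append(GpuSample(int(index), int(util), int(mem), float(power)))
    return samples


def read_gpu_samples() -> list[GpuSample]:
    """Run nvidia-smi once and return one sample per visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_samples(result.stdout)
```

Calling `read_gpu_samples()` once while the server is idle and once during the 1000-client run would give two sets of numbers that can be pasted into the thread and compared side by side.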