GPU utilization is low when deploying Qwen3-Omni with vLLM or SGLang. What is the cause? #114

#27

by yixue - opened Oct 31

yixue

Oct 31

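GPU utilization is low when I deploy Qwen3-Omni with either vLLM or SGLang. What could be causing this? I don't see this problem even when running larger text-only LLMs.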
