tclf90 committed
Commit ac20bc1 · verified · 1 Parent(s): 50e9a67

Update README.md

Files changed (1)
  1. README.md +11 -14
README.md CHANGED
@@ -13,23 +13,20 @@ base_model_relation: quantized
  Base Model: [Qwen/Qwen3-VL-235B-A22B-Thinking](https://www.modelscope.cn/models/Qwen/Qwen3-VL-235B-A22B-Thinking)
  
  ### 【Dependencies / Installation】
- As of **2025-09-26**, create a fresh Python environment and run:
+ As of **2025-10-08**, create a fresh Python environment and run:
  ```bash
- pip install -U pip
- pip install uv
- pip install git+https://github.com/huggingface/transformers
- pip install accelerate
- pip install qwen-vl-utils==0.0.14
- # pip install 'vllm>0.10.2' # If this is not working use the below one.
- uv pip install -U vllm \
-   --torch-backend=auto \
-   --extra-index-url https://wheels.vllm.ai/nightly
- ```
- or use the docker image from qwen3vl team:
- ```
- docker run --gpus all --ipc=host --network=host --rm --name qwen3vl -it qwenllm/qwenvl:qwen3vl-cu128 bash
+ uv venv
+ source .venv/bin/activate
+
+ # Install vLLM >=0.11.0
+ uv pip install -U vllm
+
+ # Install the Qwen-VL utility library (recommended for offline inference)
+ uv pip install qwen-vl-utils==0.0.14
  ```
  
+ For more details, refer to the [vLLM Official Qwen3-VL Guide](https://docs.vllm.ai/projects/recipes/en/latest/Qwen/Qwen3-VL.html).
+
  ### 【vLLM Startup Command】
  <i>Note: When launching with TP=8, include `--enable-expert-parallel`;
  otherwise the expert tensors cannot be evenly sharded across GPU devices.</i>
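
To make that note concrete, here is a minimal launch sketch, not the README's exact command: the weights path and served-model name are hypothetical placeholders, while `--tensor-parallel-size 8` and `--enable-expert-parallel` are exactly the pairing the note prescribes.

```bash
# Minimal sketch only. Hypothetical: the weights path and served-model name.
# Grounded in the note above: TP=8 must be paired with expert parallelism.
vllm serve /path/to/Qwen3-VL-235B-A22B-Thinking \
    --tensor-parallel-size 8 \
    --enable-expert-parallel \
    --served-model-name qwen3-vl-235b-a22b-thinking
```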
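The diff also recommends `qwen-vl-utils` for offline inference without showing usage, so here is a hedged sketch of the common vLLM offline pattern; the local model path and image URL are hypothetical, and the engine arguments mirror the TP=8 note above.

```python
# A sketch under stated assumptions, not this repo's official example.
# Hypothetical: the local model path and the image URL.
from transformers import AutoProcessor
from vllm import LLM, SamplingParams
from qwen_vl_utils import process_vision_info

MODEL = "/path/to/Qwen3-VL-235B-A22B-Thinking"  # hypothetical local path

# TP=8 paired with expert parallelism, per the note above.
llm = LLM(model=MODEL, tensor_parallel_size=8, enable_expert_parallel=True)
processor = AutoProcessor.from_pretrained(MODEL)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "https://example.com/demo.jpg"},  # hypothetical
        {"type": "text", "text": "Describe this image."},
    ],
}]

# Render the chat template to a prompt string and resolve the image
# entries into PIL images that vLLM accepts under multi_modal_data.
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, _ = process_vision_info(messages)

outputs = llm.generate(
    [{"prompt": prompt, "multi_modal_data": {"image": image_inputs}}],
    SamplingParams(max_tokens=512),
)
print(outputs[0].outputs[0].text)
```

`process_vision_info` resolves URL, local-file, and base64 image references uniformly, which is why the README recommends it for offline inference.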