# Precompiled TensorRT Engines by Qlip

Precompiled Qlip engines for accelerated diffusion-model inference. Base model weights are not included.

For supported models, benchmarks, and usage instructions, see ComfyUI-Qlip.
## Usage

### ComfyUI (automatic download)

In the Qlip Engines Loader node, set the `hf_repo` input. Engines are downloaded once and cached locally.
### Manual download

```shell
pip install huggingface-hub
huggingface-cli download TheStageAI/<repo-name> \
  --local-dir ./engines \
  --include "models/H100/<variant>/*"
```
## Installation

```shell
# Qlip core
pip install 'qlip.core[nvidia]' \
  --extra-index-url https://thestage.jfrog.io/artifactory/api/pypi/pypi-thestage-ai-production/simple

# elastic_models (LoRA runtime support)
pip install 'thestage-elastic-models[nvidia]' \
  --extra-index-url https://thestage.jfrog.io/artifactory/api/pypi/pypi-thestage-ai-production/simple
```
## Requirements

- NVIDIA GPU with CUDA 12.x
- TensorRT 10.13.3.9 (must exactly match the version the engines were compiled with)
- Engines are GPU-architecture specific; recompile after changing hardware
## License

Proprietary. Powered by TheStage AI.