Models: 666

Active filters: fp8

adriabama06/DeepCoder-1.5B-Preview-FP8-W8A8

Text Generation • Updated Apr 13 • 5 • 1

cloud19/Qwen2.5-32B-ArliAI-RPMax-v1.3-FP8-Dynamic

33B • Updated Apr 15 • 3

microsoft/MAI-DS-R1-FP8

Text Generation • 671B • Updated Apr 22 • 1.54k • 23

SalonbusAI/GLM-4-32B-0414-FP8

33B • Updated Apr 24 • 744 • 4

parasail-ai/Qwen2.5-VL-72B-Instruct-FP8-Dynamic

Image-to-Text • 73B • Updated Apr 18 • 879 • 2

yejingfu/Captain-Eris_Violet-V0.420-12B-FP8

12B • Updated Apr 19 • 7.86k

baseten/DeepSeek-V3-FP4

397B • Updated Apr 22 • 1.46k • 1

jobs-git/DeepSeek-V3-0324

Text Generation • 685B • Updated Apr 25 • 4

superbigtree/Mistral-Nemo-Instruct-2407-FP8_sglang

12B • Updated Apr 25 • 4

GreenBitAI/DeepSeek-R1-671B-layer-mix-bpw-4.0-mlx

96B • Updated Apr 28 • 118

anq/r1_fake_int4

685B • Updated Apr 27 • 4

bullerwins/DeepSeek-R1T-Chimera-bf16

Text Generation • 684B • Updated Apr 28 • 3 • 1

unsloth/Qwen3-0.6B-FP8

Text Generation • 0.6B • Updated May 11 • 36

unsloth/Qwen3-1.7B-FP8

Text Generation • 2B • Updated May 11 • 32

unsloth/Qwen3-4B-FP8

Text Generation • 4B • Updated May 11 • 789

unsloth/Qwen3-8B-FP8

Text Generation • 8B • Updated May 11 • 964

unsloth/Qwen3-14B-FP8

Text Generation • 15B • Updated May 11 • 2.17k • 1

unsloth/Qwen3-32B-FP8

Text Generation • 33B • Updated May 11 • 987 • 1

RedHatAI/gemma-3-4b-it-FP8-dynamic

Image-Text-to-Text • 4B • Updated Jun 9 • 390

RedHatAI/gemma-3-12b-it-FP8-dynamic

Image-Text-to-Text • 12B • Updated Jun 9 • 648 • 1

enferAI/Mistral-7B-Instruct-v0.3-FP8

7B • Updated Apr 28 • 3

michaelfeil/Qwen3-4B-FP8

4B • Updated Apr 28 • 3

pedalnomica/Qwen3-235B-A22B-FP8

Text Generation • 235B • Updated Apr 28 • 82

pedalnomica/Qwen3-32B-FP8

Text Generation • 33B • Updated Apr 28 • 4

qwen-community/Qwen3-235B-A22B-FP8

Text Generation • 235B • Updated Apr 28 • 6

qwen-community/Qwen3-32B-FP8

Text Generation • 33B • Updated Apr 28 • 51

qwen-community/Qwen3-0.6B-FP8

Text Generation • 0.8B • Updated Apr 28 • 8

qwen-community/Qwen3-1.7B-FP8

Text Generation • 2B • Updated Apr 28 • 5

qwen-community/Qwen3-14B-FP8

Text Generation • 15B • Updated Apr 28 • 7

qwen-community/Qwen3-4B-FP8

Text Generation • 4B • Updated Apr 28 • 4