---
quantized_by: bobchenyx
base_model:
- deepseek-ai/DeepSeek-V3-0324
base_model_relation: quantized
license: mit
tags:
- deepseek_v3
- deepseek
- transformers
- GGUF
pipeline_tag: text-generation
---
					
						
## llama.cpp Quantizations of DeepSeek-V3-0324 (MLA version)

This page will be deprecated. For other quantized versions, see [moxin-org/DeepSeek-V3-0324-Moxin-GGUF](https://huggingface.co/moxin-org/DeepSeek-V3-0324-Moxin-GGUF).

All quants are based on [moxin-org/CC-MoE](https://github.com/moxin-org/CC-MoE).
					
						
```
- IQ1_S : 129.94 GiB (1.66 BPW)
- IQ1_M : 144.24 GiB (1.85 BPW)
- Q2_K_L : 222.01 GiB (2.84 BPW)
- Q4_K_L : 381.64 GiB (4.89 BPW)
```
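
As a sanity check on the figures above, file size and bits per weight (BPW) are related by: size in bits = BPW × number of weights. The sketch below inverts that relation for each listed quant. The `implied_params` helper is a hypothetical name introduced here for illustration, and the results are only estimates because the listed sizes and BPW values are rounded.

```python
# Sizes (GiB) and bits-per-weight values copied from the list above.
QUANTS = {
    "IQ1_S": (129.94, 1.66),
    "IQ1_M": (144.24, 1.85),
    "Q2_K_L": (222.01, 2.84),
    "Q4_K_L": (381.64, 4.89),
}

GIB = 2**30  # bytes per GiB


def implied_params(size_gib, bpw):
    """Estimate the weight count from file size and BPW (hypothetical helper)."""
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


for name, (size_gib, bpw) in QUANTS.items():
    estimate = implied_params(size_gib, bpw)
    print(f"{name}: ~{estimate / 1e9:.0f}B weights")
```

All four quants should land on roughly the same estimate (around 670 billion weights), which is what one would expect for quantizations of the same base model.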
					
						
## Smallest Compression (103GB)

For our smallest compressed version, please refer to [tflsxyy/DeepSeek-V3-0324-E192](https://huggingface.co/tflsxyy/DeepSeek-V3-0324-MoE-Pruner-E192-bf16) and [bobchenyx/DeepSeek-V3-0324-508B-A32B-MLA-GGUF](https://huggingface.co/bobchenyx/DeepSeek-V3-0324-508B-A32B-MLA-GGUF) for more details.
					
						
---
## Download Guide

```
# !pip install huggingface_hub hf_transfer
import os

# Enable hf_transfer for faster downloads; set before the import so huggingface_hub picks it up.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bobchenyx/DeepSeek-V3-0324-MLA-GGUF",
    local_dir="bobchenyx/DeepSeek-V3-0324-MLA-GGUF",
    allow_patterns=["*IQ1_M*"],  # download only the IQ1_M files
)
```
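
Because the quants on this page range from roughly 130 GiB to 382 GiB, it can help to check free disk space before choosing an `allow_patterns` value. The following is a minimal sketch and not part of the original guide: `largest_fitting_quant` is a hypothetical helper, the sizes are copied from the quant list on this page, and patterns such as `*Q2_K_L*` assume that the other quants follow the same file naming as `*IQ1_M*`.

```python
import shutil

# Sizes in GiB, copied from the quant list on this page.
QUANT_SIZES_GIB = {
    "IQ1_S": 129.94,
    "IQ1_M": 144.24,
    "Q2_K_L": 222.01,
    "Q4_K_L": 381.64,
}


def largest_fitting_quant(free_gib):
    """Return the largest listed quant whose size fits in free_gib, or None."""
    fitting = [q for q, size in QUANT_SIZES_GIB.items() if size <= free_gib]
    if not fitting:
        return None
    return max(fitting, key=QUANT_SIZES_GIB.get)


# Free space on the filesystem holding the current directory, in GiB.
free_gib = shutil.disk_usage(".").free / 2**30
quant = largest_fitting_quant(free_gib)

if quant is None:
    print(f"Only {free_gib:.1f} GiB free; none of the listed quants fit.")
else:
    # Assumed naming: files for each quant contain the quant name.
    allow_patterns = [f"*{quant}*"]
    print(f"{free_gib:.1f} GiB free; suggested allow_patterns = {allow_patterns}")
```

The check compares raw file sizes only, so it leaves no headroom for temporary files or other data; in practice a safety margin on top of the listed size is advisable.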