# DeepSeek-16b-light

This is a 4-bit quantized version of the [DeepSeek Coder V2 Lite Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct) model. The quantization was performed with the bitsandbytes library at 4-bit precision to reduce the model's size and memory requirements while retaining most of its capabilities.

## Model Details

- **Original Model**: DeepSeek Coder V2 Lite Instruct
- **Quantization**: 4-bit quantization using bitsandbytes
- **Compute Type**: float16
- **Double Quantization**: Enabled
- **Size Reduction**: Approximately 75% smaller than the original float16 checkpoint (4-bit vs. 16-bit weights); see the configuration sketch below
- **Use Case**: Code generation, code completion, and programming assistance
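
The listed settings map directly onto a bitsandbytes configuration. Below is a minimal sketch of how such a quantization is typically produced from the original checkpoint; the quantization type (`nf4`) is an assumption, since it is not stated above:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization settings matching the model details above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # Compute Type: float16
    bnb_4bit_use_double_quant=True,        # Double Quantization: Enabled
    bnb_4bit_quant_type="nf4",             # Assumption: quant type is not stated above
)

model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # DeepSeek-Coder-V2 uses custom modeling code
)
```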

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the quantized model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Noorhan/DeepSeek-16b-light", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Noorhan/DeepSeek-16b-light",
    device_map="auto",
    trust_remote_code=True,  # DeepSeek-Coder-V2 uses custom modeling code
)

# Example code generation
prompt = "Write a Python function to calculate the Fibonacci sequence up to n terms."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,             # pass attention_mask along with input_ids
    max_new_tokens=500,   # cap generated tokens rather than total length
    temperature=0.7,
    top_p=0.95,
    do_sample=True,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
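
Because the underlying model is instruction-tuned, prompting through the tokenizer's chat template usually yields better results than a raw prompt. A minimal sketch, assuming this repo ships the same chat template as the original model:

```python
messages = [
    {"role": "user", "content": "Write a Python function to calculate the Fibonacci sequence up to n terms."},
]

# Wrap the request in the model's expected instruct format
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=500, temperature=0.7, top_p=0.95, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```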

## Performance and Limitations

This 4-bit quantized model:

- Requires significantly less memory than the original model (see the footprint check below)
- Runs usably on consumer-grade hardware where the float16 weights would not fit in VRAM
- Shows minimal quality degradation for most use cases
- May show some quality loss on edge cases or complex reasoning tasks
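
To verify the savings on your machine, `transformers` can report the footprint of the loaded weights (assuming `model` was loaded as in the usage example above):

```python
# In-memory size of the loaded (quantized) parameters
print(f"Model footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
```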

## Hardware Requirements

- Recommended: GPU with at least 8 GB VRAM
- Minimum: 4 GB VRAM, with the remaining weights offloaded to CPU RAM at a significant speed cost (see the sketch below)
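
On GPUs below the recommended 8 GB, Accelerate can split the model between GPU and CPU. A minimal sketch; the memory caps are illustrative, not tested settings, and offloading support for pre-quantized 4-bit checkpoints depends on your bitsandbytes/transformers versions:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Noorhan/DeepSeek-16b-light",
    device_map="auto",
    # Cap GPU 0 at ~3.5 GiB; remaining layers are kept in CPU RAM
    max_memory={0: "3.5GiB", "cpu": "24GiB"},
    trust_remote_code=True,
)
```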

## Acknowledgements

This model is a quantized version of [DeepSeek Coder V2 Lite Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct). All credit for the original model goes to the DeepSeek AI team. The quantization was performed to make this powerful coding assistant more accessible to users with limited computational resources.

## License

This model inherits the license of the original DeepSeek Coder V2 Lite Instruct model. Please refer to the original model's documentation for licensing details.