---
language:
- en
license: apache-2.0
tags:
- granite
- client-simulation
- dialogue
- bitsandbytes
- 4-bit
- unsloth
- transformers
base_model: ibm-granite/granite-3.2-2b-instruct
pipeline_tag: text-generation
datasets:
- merged_mental_health_dataset.jsonl
library_name: transformers
---

# Gradiant-ClientSim-v0.1

A 4-bit quantized client-simulation model based on IBM Granite 3.2 2B Instruct, fine-tuned for client interaction and simulation tasks. The model is compatible with Hugging Face Transformers and bitsandbytes for efficient inference.

## Model Details

- **Base Model:** `ibm-granite/granite-3.2-2b-instruct` (fine-tuned with Unsloth)
- **Precision:** 4-bit (bitsandbytes, stored as safetensors)
- **Architecture:** Causal language model
- **Tokenizer:** Included (BPE)
- **Intended Use:** Client simulation, dialogue, and assistant tasks
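
To sanity-check these details before loading the full weights, the configuration and tokenizer can be inspected on their own. A minimal sketch (the repo id is the same one used in the usage example below):

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "oneblackmage/Gradiant-ClientSim-v0.1"

config = AutoConfig.from_pretrained(model_id)
print(config.model_type)             # architecture family reported by config.json

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(tokenizer.special_tokens_map)  # special tokens shipped with the tokenizer
```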

## Files Included

- `model.safetensors` — Main model weights (4-bit)
- `config.json` — Model configuration
- `generation_config.json` — Generation parameters
- `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, `special_tokens_map.json`, `added_tokens.json` — Tokenizer files
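
If you want these files on disk without loading the model, `huggingface_hub` can fetch the whole repository. A minimal sketch using `snapshot_download`:

```python
from huggingface_hub import snapshot_download

# Downloads all the repo files listed above and returns the local cache path.
local_path = snapshot_download("oneblackmage/Gradiant-ClientSim-v0.1")
print(local_path)
```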

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "oneblackmage/Gradiant-ClientSim-v0.1"
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the prompt via the model's chat template instead of hand-writing
# role tokens, so the special tokens always match the tokenizer config.
messages = [{"role": "user", "content": "How can I improve my focus at work?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
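
For interactive use, Transformers' built-in `TextStreamer` prints tokens as they are generated. A short sketch reusing `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# skip_prompt=True suppresses echoing the prompt; extra kwargs go to decode().
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(inputs, max_new_tokens=100, streamer=streamer)
```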

## Quantization

- This model is stored in 4-bit precision using [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) for efficient inference on modern GPUs.
- For best results, use `transformers` >= 4.45 and `bitsandbytes` >= 0.43, plus `accelerate` for `device_map="auto"`. Optional tuning knobs are sketched below.
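
The plain `load_in_4bit=True` shown in the usage example is enough to run the model. If you want to trade a little speed for quality, `BitsAndBytesConfig` exposes standard 4-bit options (these are generic bitsandbytes settings, not values specific to this model):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 usually beats plain FP4 on quality
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at runtime
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)
model = AutoModelForCausalLM.from_pretrained(
    "oneblackmage/Gradiant-ClientSim-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```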

## License

- Apache 2.0 (see the LICENSE file or the Hugging Face model card for details).

## Citation

If you use this model, please cite the original IBM Granite model and this fine-tuned version.

---

For questions or issues, open an issue on the Hugging Face repo or contact the maintainer.