---
language:
  - en
license: apache-2.0
tags:
  - granite
  - client-simulation
  - dialogue
  - bitsandbytes
  - 4-bit
  - unsloth
  - transformers
base_model: ibm-granite/granite-3.2-2b-instruct
pipeline_tag: text-generation
datasets:
  - merged_mental_health_dataset.jsonl
library_name: transformers
---
# Gradiant-ClientSim-v0.1

A 4-bit quantized client simulation model based on IBM Granite 3.2 2B Instruct, fine-tuned for client interaction and simulation tasks. The model loads with Hugging Face Transformers and bitsandbytes for efficient inference.

## Model Details
- **Base Model:** `ibm-granite/granite-3.2-2b-instruct` (fine-tuned with Unsloth)
- **Precision:** 4-bit (safetensors, bitsandbytes)
- **Architecture:** Causal Language Model
- **Tokenizer:** Included (BPE)
- **Intended Use:** Client simulation, dialogue, and assistant tasks
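
You can sanity-check these details without downloading the weights by loading just the configuration. A minimal sketch; whether a `quantization_config` entry appears depends on how the checkpoint was saved:

```python
from transformers import AutoConfig

# Fetch only the config (no weights) and print the recorded details.
config = AutoConfig.from_pretrained("oneblackmage/Gradiant-ClientSim-v0.1")
print(config.model_type)                             # architecture family
print(getattr(config, "quantization_config", None))  # stored 4-bit bitsandbytes settings, if any
```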

## Files Included
- `model.safetensors` — Main model weights (4-bit)
- `config.json` — Model configuration
- `generation_config.json` — Generation parameters
- `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, `special_tokens_map.json`, `added_tokens.json` — Tokenizer files
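
If you only need some of these files locally (for example, the tokenizer), `huggingface_hub` can fetch them without loading the model. A minimal sketch; the `allow_patterns` filter is illustrative:

```python
from huggingface_hub import snapshot_download

# Download a subset of the repo files into the local Hugging Face cache
# and return the path to the snapshot directory.
local_dir = snapshot_download(
    repo_id="oneblackmage/Gradiant-ClientSim-v0.1",
    allow_patterns=["*.json", "merges.txt"],  # config/tokenizer files only
)
print(local_dir)
```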

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "oneblackmage/Gradiant-ClientSim-v0.1"
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "<|user>How can I improve my focus at work?\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
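
For interactive use, you may prefer to stream tokens as they are generated instead of waiting for the full completion. A sketch reusing `model`, `tokenizer`, and `inputs` from the snippet above:

```python
from transformers import TextStreamer

# Prints tokens to stdout as they are produced; the prompt itself is skipped.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**inputs, max_new_tokens=200, streamer=streamer)
```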

## Quantization
- This model is stored in 4-bit precision using [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) for efficient inference on modern GPUs.
- For best performance, use `transformers` >= 4.45 and `bitsandbytes` >= 0.43; a more explicit 4-bit configuration is sketched below.
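
`load_in_4bit=True` uses the bitsandbytes defaults. To control the quantization scheme explicitly, you can pass a fuller config; the NF4/bfloat16 settings below are a common choice, not necessarily what this checkpoint was saved with:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Illustrative explicit 4-bit setup: NF4 quantization, bfloat16 compute,
# and nested (double) quantization of the quantization constants.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "oneblackmage/Gradiant-ClientSim-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```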

## License
- Apache 2.0. See the LICENSE file or the Hugging Face model card for details.

## Citation
If you use this model, please cite the original IBM Granite model and this fine-tuned version.

---

For questions or issues, open a discussion on the Hugging Face repo or contact the maintainer.