---
language:
- en
license: apache-2.0
tags:
- granite
- client-simulation
- dialogue
- bitsandbytes
- 4-bit
- unsloth
- transformers
base_model: ibm-granite/granite-3.2-2b-instruct
pipeline_tag: text-generation
datasets:
- merged_mental_health_dataset.jsonl
library_name: transformers
---

# Gradiant-ClientSim-v0.1

A 4-bit quantized client-simulation model based on IBM Granite 3.2 2B Instruct, fine-tuned for client interaction and simulation tasks. The model is compatible with Hugging Face Transformers and bitsandbytes for efficient inference.

## Model Details

- **Base Model:** `ibm-granite/granite-3.2-2b-instruct` (fine-tuned with Unsloth)
- **Precision:** 4-bit (bitsandbytes, stored as safetensors)
- **Architecture:** Causal language model
- **Tokenizer:** Included (BPE)
- **Intended Use:** Client simulation, dialogue, and assistant tasks
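
To sanity-check these details before loading the full weights, the configuration and tokenizer can be inspected on their own. A minimal sketch (the repo id is the same one used in the usage example below):

```python
from transformers import AutoConfig, AutoTokenizer

model_id = "oneblackmage/Gradiant-ClientSim-v0.1"

config = AutoConfig.from_pretrained(model_id)
print(config.model_type)             # architecture family reported by config.json

tokenizer = AutoTokenizer.from_pretrained(model_id)
print(tokenizer.special_tokens_map)  # special tokens shipped with the tokenizer
```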

## Files Included

- `model.safetensors` — Main model weights (4-bit)
- `config.json` — Model configuration
- `generation_config.json` — Generation parameters
- `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, `special_tokens_map.json`, `added_tokens.json` — Tokenizer files
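
If you want these files on disk without loading the model, `huggingface_hub` can fetch the whole repository. A minimal sketch using `snapshot_download`:

```python
from huggingface_hub import snapshot_download

# Downloads all the repo files listed above and returns the local cache path.
local_path = snapshot_download("oneblackmage/Gradiant-ClientSim-v0.1")
print(local_path)
```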

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "oneblackmage/Gradiant-ClientSim-v0.1"
bnb_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the prompt via the model's chat template instead of hand-writing
# role tokens, so the special tokens always match the tokenizer config.
messages = [{"role": "user", "content": "How can I improve my focus at work?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
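
For interactive use, Transformers' built-in `TextStreamer` prints tokens as they are generated. A short sketch reusing `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# skip_prompt=True suppresses echoing the prompt; extra kwargs go to decode().
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(inputs, max_new_tokens=100, streamer=streamer)
```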

## Quantization

- This model is stored in 4-bit precision using [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) for efficient inference on modern GPUs.
- For best results, use `transformers` >= 4.45 and `bitsandbytes` >= 0.43, plus `accelerate` for `device_map="auto"`. Optional tuning knobs are sketched below.
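
The plain `load_in_4bit=True` shown in the usage example is enough to run the model. If you want to trade a little speed for quality, `BitsAndBytesConfig` exposes standard 4-bit options (these are generic bitsandbytes settings, not values specific to this model):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 usually beats plain FP4 on quality
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls at runtime
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
)
model = AutoModelForCausalLM.from_pretrained(
    "oneblackmage/Gradiant-ClientSim-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```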

## License

- Apache 2.0 (see the LICENSE file or the Hugging Face model card for details).

## Citation

If you use this model, please cite the original IBM Granite model and this fine-tuned version.

---

For questions or issues, open an issue on the Hugging Face repo or contact the maintainer.