
	---
	license: other
	license_name: tongyi-qianwen
	license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
	pipeline_tag: text-generation
	language:
	- en
	- zh
	library_name: transformers
	tags:
	- mergekit
	- qwen2
	---

	# Iridium-72B-v0.1

	## Model Description
	Iridium is a 72B-parameter language model created by merging Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 with the `model_stock` method.

	## Features
	- 72 billion parameters
	- Combines Magnum's prose with Calme's smarts

	## Technical Specifications

	### Architecture
	- `Qwen2ForCausalLM`
	- Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
	- Merged layers: 80
	- Total tensors: 963
	- Context length: 128k

	### Tensor Distribution
	- Attention layers: 560 tensors
	- MLP layers: 240 tensors
	- Layer norms: 160 tensors
	- Miscellaneous (embeddings, output): 3 tensors
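
The counts above add up to the 963 total listed under Architecture. A minimal sketch of that arithmetic follows. The per-layer breakdown (7 attention tensors, 3 MLP tensors, 2 norm tensors) is an assumption based on the usual Qwen2 decoder layout, not something the card states.

```python
# Per-layer tensor counts are ASSUMED from the usual Qwen2 decoder layout;
# only the layer count and the resulting totals come from the card.
NUM_LAYERS = 80  # "Merged layers" in the card

ATTENTION_TENSORS_PER_LAYER = 7  # q/k/v weights and biases (6) plus the o_proj weight (1)
MLP_TENSORS_PER_LAYER = 3        # gate_proj, up_proj, down_proj weights
NORM_TENSORS_PER_LAYER = 2       # input_layernorm, post_attention_layernorm
MISC_TENSORS = 3                 # token embeddings, final norm, output head

counts = {
    "attention": NUM_LAYERS * ATTENTION_TENSORS_PER_LAYER,
    "mlp": NUM_LAYERS * MLP_TENSORS_PER_LAYER,
    "layer_norms": NUM_LAYERS * NORM_TENSORS_PER_LAYER,
    "misc": MISC_TENSORS,
}
total = sum(counts.values())

print(counts)
print("total tensors:", total)
```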

	### Merging
	The merge was performed with a custom script built on the `safetensors` library.
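
The merge script itself is not published here, so the following is only an illustrative sketch of the `model_stock` idea on a single flattened tensor. It uses plain Python lists instead of real safetensors data. It assumes the interpolation ratio from the Model Stock paper, `t = k*cos(theta) / (1 + (k-1)*cos(theta))`, where `k` is the number of fine-tuned models and `theta` is the angle between their deltas from the base. The function names are hypothetical, and both the author's script and mergekit's implementation may differ in detail.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def model_stock_merge(base, finetuned):
    """Merge one flattened tensor (hypothetical helper, needs at least two fine-tuned models).

    `base` is the base model's tensor; `finetuned` is a list of tensors of the
    same length, one per fine-tuned model.
    """
    k = len(finetuned)
    # Deltas of each fine-tuned tensor relative to the base.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    # Average pairwise cosine between deltas stands in for cos(theta).
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    # Interpolation ratio; undefined when cos_theta == -1 / (k - 1).
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    average = [sum(ws) / k for ws in zip(*finetuned)]
    # Pull the plain average back toward the base by (1 - t).
    return [t * a + (1 - t) * b for a, b in zip(average, base)]


# Toy example: two fine-tunes whose deltas are orthogonal, so t = 0
# and the merged tensor falls back to the base.
merged = model_stock_merge([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(merged)
```

The closer the fine-tuned deltas agree in direction, the more weight the average receives; when they are orthogonal, the result stays at the base weights.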

	## Usage

	### Loading the Model
	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model = AutoModelForCausalLM.from_pretrained(
	    "leafspark/Iridium-72B-v0.1",
	    device_map="auto",
	    torch_dtype=torch.float16,
	)
	tokenizer = AutoTokenizer.from_pretrained("leafspark/Iridium-72B-v0.1")
	```
	### GGUFs

	Find them here: [leafspark/Iridium-72B-v0.1-GGUF](https://huggingface.co/leafspark/Iridium-72B-v0.1-GGUF)

	### Optimal Sampling Parameters

	I found these to work well:
	```json
	{
	  "temperature": 1,
	  "min_p": 0.08,
	  "top_p": 1,
	  "top_k": 40,
	  "repetition_penalty": 1
	}
	```
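
These settings can be passed to `generate` in `transformers` as keyword arguments. The sketch below only builds the keyword dictionary, so it runs without loading the model. It assumes a `transformers` release recent enough to accept `min_p`. The commented usage lines, including the prompt and the `max_new_tokens` value, are hypothetical.

```python
import json

# Sampling parameters from the card, written as valid JSON.
SAMPLING_JSON = """
{
  "temperature": 1,
  "min_p": 0.08,
  "top_p": 1,
  "top_k": 40,
  "repetition_penalty": 1
}
"""

params = json.loads(SAMPLING_JSON)

# do_sample=True is needed for the sampling settings to take effect in generate().
generate_kwargs = {"do_sample": True, **params}
print(generate_kwargs)

# Hypothetical usage with the `model` and `tokenizer` from "Loading the Model":
# inputs = tokenizer("Write a short story.", return_tensors="pt").to(model.device)
# output = model.generate(**inputs, max_new_tokens=256, **generate_kwargs)
# print(tokenizer.decode(output[0], skip_special_tokens=True))
```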

	### Hardware Requirements
	- At least 135 GB of free disk space
	- ~140 GB of VRAM/RAM
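
As a rough cross-check of these figures, the weight footprint can be estimated as parameter count times bytes per parameter. The sketch below assumes the nominal 72 billion parameters and 2 bytes per parameter (float16, as in the loading snippet). The real checkpoint size and runtime memory, which also covers activations and the KV cache, will differ somewhat.

```python
def weight_footprint(num_params, bytes_per_param):
    """Return the weight size as (decimal GB, binary GiB)."""
    total_bytes = num_params * bytes_per_param
    return total_bytes / 1e9, total_bytes / 2**30


# ASSUMPTIONS: nominal 72e9 parameters, float16 storage (2 bytes each).
gb, gib = weight_footprint(72e9, 2)
print(f"approx. {gb:.0f} GB ({gib:.1f} GiB) for the weights alone")
```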