---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
language:
- en
- zh
library_name: transformers
tags:
- mergekit
- qwen2
---
# Iridium-72B-v0.1
## Model Description
Iridium is a 72B parameter language model created through a merge of Qwen2-72B-Instruct, calme2.1-72b, and magnum-72b-v1 using `model_stock`.
## Features
- 72 billion parameters
- Combines magnum-72b-v1 prose with calme2.1-72b reasoning
## Technical Specifications
### Architecture
- `Qwen2ForCausalLM`
- Models: Qwen2-72B-Instruct (base), calme2.1-72b, magnum-72b-v1
- Merged layers: 80
- Total tensors: 963
- Context length: 128k
### Tensor Distribution
- Attention layers: 560 tensors
- MLP layers: 240 tensors
- Layer norms: 160 tensors
- Miscellaneous (embeddings, output head): 3 tensors
### Merging
Merged with a custom script using the `safetensors` library.
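The idea behind a `model_stock`-style merge can be sketched with NumPy stand-ins for the per-layer tensors. This is illustrative only: the real `model_stock` method derives its interpolation weight per layer from the angle between the fine-tuned checkpoints, while this toy version just blends the base with the mean of the fine-tunes using a fixed weight.

```python
import numpy as np

def average_merge(base, finetunes, base_weight=0.5):
    # Toy stand-in for a model_stock-style merge: blend the base tensor
    # with the mean of the fine-tuned tensors. The real method computes
    # base_weight per layer from the geometry of the checkpoints.
    ft_mean = np.mean(finetunes, axis=0)
    return base_weight * base + (1.0 - base_weight) * ft_mean

# Tiny example tensors in place of real 72B weights
base = np.zeros((2, 2))
ft_a = np.ones((2, 2))
ft_b = np.full((2, 2), 3.0)
merged = average_merge(base, [ft_a, ft_b])  # 0.5*0 + 0.5*mean(1, 3) = 1.0
```

In the actual merge, this blend is applied tensor-by-tensor across all 963 tensors, reading and writing shards with `safetensors`.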
## Usage
### Loading the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained(
    "leafspark/Iridium-72B-v0.1",
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("leafspark/Iridium-72B-v0.1")
```
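Qwen2-Instruct checkpoints expect the ChatML conversation format. In practice `tokenizer.apply_chat_template` builds this for you; the hand-rolled sketch below (function name is illustrative) just shows the structure the model is prompted with.

```python
def build_chatml_prompt(system, user):
    # ChatML layout used by Qwen2-style instruct models; normally
    # produced by tokenizer.apply_chat_template.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "Hello!")
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to write the assistant turn.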
### GGUFs
Find them here: [leafspark/Iridium-72B-v0.1-GGUF](https://huggingface.co/leafspark/Iridium-72B-v0.1-GGUF)
### Optimal Sampling Parameters
I found these to work well:
```json
{
  "temperature": 1,
  "min_p": 0.08,
  "top_p": 1,
  "top_k": 40,
  "repetition_penalty": 1
}
```
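Since `min_p` is doing the real filtering here (temperature, top-p, and repetition penalty are all neutral), it may help to see what it does: tokens whose probability falls below `min_p` times the top token's probability are dropped, and the rest are renormalized. A minimal NumPy sketch:

```python
import numpy as np

def min_p_filter(probs, min_p=0.08):
    # Drop tokens with probability below min_p * p(most likely token),
    # then renormalize the survivors.
    probs = np.asarray(probs, dtype=float)
    threshold = min_p * probs.max()
    kept = np.where(probs >= threshold, probs, 0.0)
    return kept / kept.sum()

probs = [0.5, 0.3, 0.15, 0.04, 0.01]
filtered = min_p_filter(probs, min_p=0.08)
# threshold = 0.08 * 0.5 = 0.04, so only the 0.01 token is dropped
```

Unlike a fixed `top_k`, the cutoff scales with the model's confidence: a peaked distribution prunes aggressively, a flat one keeps more candidates.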
### Hardware Requirements
- At least 135GB of free space
- ~140GB VRAM/RAM