---
base_model:
- SicariusSicariiStuff/Impish_QWEN_14B-1M
- Qwen/Qwen2.5-14B-Instruct
- sometimesanotion/LamarckInfusion-14B-v1
- Qwen/Qwen2.5-Coder-14B
- suayptalha/Lamarckvergence-14B
- huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2
- Qwen/Qwen2.5-14B
- tanliboy/lambda-qwen2.5-14b-dpo-test
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
library_name: transformers
tags:
- mergekit
- merge
license: mit
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) as the base.
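
For intuition: Model Stock averages the fine-tuned checkpoints, then interpolates that average back toward the base model with a ratio derived from the angle between the models' task vectors (their weight deltas from the base). The sketch below is a minimal per-tensor version of that interpolation using the ratio formula from the paper; it is illustrative only, not mergekit's actual implementation:

```python
import torch
import torch.nn.functional as F

def model_stock_tensor(base: torch.Tensor, tuned: list[torch.Tensor]) -> torch.Tensor:
    """Illustrative per-tensor Model Stock merge (assumes >= 2 tuned models).

    Task vectors are the fine-tuned weights minus the base. The paper's ratio
    t = k*cos / (1 + (k-1)*cos) pulls the average of the fine-tuned weights
    back toward the base weights.
    """
    k = len(tuned)
    deltas = [(w - base).flatten() for w in tuned]
    # Mean pairwise cosine similarity between task vectors.
    cos = torch.stack([
        F.cosine_similarity(deltas[i], deltas[j], dim=0)
        for i in range(k)
        for j in range(i + 1, k)
    ]).mean()
    t = k * cos / (1 + (k - 1) * cos)
    avg = torch.stack(tuned).mean(dim=0)
    return t * avg + (1 - t) * base
```

Intuitively, the less the fine-tuned models agree (cosine near zero), the smaller `t` becomes and the closer the merged weights stay to the base.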

### Models Merged

The following models were included in the merge:
* [SicariusSicariiStuff/Impish_QWEN_14B-1M](https://huggingface.co/SicariusSicariiStuff/Impish_QWEN_14B-1M)
* [sometimesanotion/LamarckInfusion-14B-v1](https://huggingface.co/sometimesanotion/LamarckInfusion-14B-v1)
* [Qwen/Qwen2.5-Coder-14B](https://huggingface.co/Qwen/Qwen2.5-Coder-14B)
* [suayptalha/Lamarckvergence-14B](https://huggingface.co/suayptalha/Lamarckvergence-14B)
* [huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2](https://huggingface.co/huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2)
* [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B)
* [tanliboy/lambda-qwen2.5-14b-dpo-test](https://huggingface.co/tanliboy/lambda-qwen2.5-14b-dpo-test)
* [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
- model: deepseek-ai/DeepSeek-R1-Distill-Qwen-14B # logic
- model: huihui-ai/DeepSeek-R1-Distill-Qwen-14B-abliterated-v2
- model: Qwen/Qwen2.5-14B # text generation
- model: Qwen/Qwen2.5-14B-Instruct # chat assistant
- model: Qwen/Qwen2.5-Coder-14B # coding
- model: sometimesanotion/LamarckInfusion-14B-v1
- model: suayptalha/Lamarckvergence-14B
- model: tanliboy/lambda-qwen2.5-14b-dpo-test
- model: SicariusSicariiStuff/Impish_QWEN_14B-1M

merge_method: model_stock
base_model: Qwen/Qwen2.5-14B-Instruct
normalize: true
int8_mask: true
dtype: bfloat16
```
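
To reproduce the merge, save the configuration above and run it through mergekit's `mergekit-yaml` entry point (for example, `mergekit-yaml config.yaml ./merged-model`); the exact flags depend on your hardware, so check the mergekit README.

Once merged (or when pulling this repository from the Hub), the result loads like any other `transformers` causal LM. A minimal sketch: the model path below is a placeholder for the local merge output or this repo's Hub id, and it assumes the tokenizer (and its chat template) was taken from the Qwen2.5-14B-Instruct base:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./merged-model"  # placeholder: local merge output or this repo's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches the merge's dtype
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a haiku about model merging."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```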