Upload folder using huggingface_hub
- README.md +17 -14
- config.json +1 -0
- figures/new_logo2.png +0 -0
README.md
CHANGED
@@ -10,7 +10,7 @@ language:
 # dots1
 
 <p align="center">
-    <img src="figures/
+    <img src="figures/new_logo2.png" width="300"/>
 <p>
 
 <p align="center">
@@ -20,8 +20,6 @@ language:
 </p>
 
 
-
-
 Visit our Hugging Face (click links above), search checkpoints with names starting with `dots.llm1` or visit the [dots1 collection](https://huggingface.co/collections/rednote-hilab/dotsllm1-68246aaaaba3363374a8aa7c), and you will find all you need! Enjoy!
 
 
@@ -113,6 +111,8 @@ curl http://localhost:8000/v1/chat/completions \
 
 ### Inference with huggingface
 
+We are working to merge it into Transformers ([PR #38143](https://github.com/huggingface/transformers/pull/38143)).
+
 #### Text Completion
 
 ```python
@@ -122,8 +122,7 @@ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
 
 model_name = "rednote-hilab/dots.llm1.base"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
-model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16
-model.generation_config = GenerationConfig.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16)
 
 text = "An attention function can be described as mapping a query and a set of key-value pairs to an output, where the query, keys, values, and output are all vectors. The output is"
 inputs = tokenizer(text, return_tensors="pt")
@@ -141,8 +140,7 @@ from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
 
 model_name = "rednote-hilab/dots.llm1.inst"
 tokenizer = AutoTokenizer.from_pretrained(model_name)
 
-model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16
-model.generation_config = GenerationConfig.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype=torch.bfloat16)
 
 messages = [
     {"role": "user", "content": "Write a piece of quicksort code in C++"}
@@ -154,21 +152,26 @@ result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_token
 
 print(result)
 ```
 
-
-[SGLang](https://github.com/sgl-project/sglang) is a fast serving framework for large language models and vision language models. SGLang could be used to launch a server with OpenAI-compatible API service. `sglang>=***` is required. It is as easy as
+### Inference with vllm
+
+[vLLM](https://github.com/vllm-project/vllm) is a high-throughput and memory-efficient inference and serving engine for LLMs. Official support for this feature is covered in [PR #18254](https://github.com/vllm-project/vllm/pull/18254).
 
 ```shell
-
+vllm serve dots.llm1.inst --port 8000 --tensor-parallel-size 8
 ```
+
 An OpenAI-compatible API will be available at `http://localhost:8000/v1`.
 
-### Inference with
-
+### Inference with sglang
+
+[SGLang](https://github.com/sgl-project/sglang) is a fast serving framework for large language models and vision language models. SGLang could be used to launch a server with OpenAI-compatible API service. Official support for this feature is covered in [PR #6471](https://github.com/sgl-project/sglang/pull/6471).
+
+Getting started is as simple as running:
 
 ```shell
-
+python -m sglang.launch_server --model-path dots.llm1.inst --tp 8 --host 0.0.0.0 --port 8000
 ```
+
 An OpenAI-compatible API will be available at `http://localhost:8000/v1`.
 
 ## 4. Evaluation Results
@@ -186,4 +189,4 @@ If you find `dots.llm1` is useful or want to use in your projects, please kindly
 journal={arXiv preprint arXiv:TBD},
 year={2025}
 }
-```
+```
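The serving commands in this README change both expose an OpenAI-compatible endpoint at `http://localhost:8000/v1`. As a quick sanity check of that interface, the request body for `POST /v1/chat/completions` can be sketched as below. This is a minimal illustration, not part of the commit: the served model name and sampling parameters are assumptions, and actually sending the request requires one of the servers above to be running.

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "dots.llm1.inst") -> dict:
    """Build the JSON body for POST /v1/chat/completions (model name assumed)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def post_chat(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """Send the request; only works once a server from the README is running."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Print the payload without contacting a server.
    print(json.dumps(build_chat_request("Write a piece of quicksort code in C++"), indent=2))
```

The same payload mirrors the `curl http://localhost:8000/v1/chat/completions` example referenced in the diff context above.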
|
config.json
CHANGED
@@ -28,6 +28,7 @@
 "rope_theta": 10000000,
 "routed_scaling_factor": 2.5,
 "sliding_window": null,
+"scoring_func": "noaux_tc",
 "tie_word_embeddings": false,
 "torch_dtype": "bfloat16",
 "transformers_version": "4.46.3",
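The one-line config change adds `"scoring_func": "noaux_tc"`, which names the MoE router's scoring function. Assuming it follows the auxiliary-loss-free "top-k with correction bias" recipe (an assumption on my part; the authoritative behavior is whatever the model's code implements), routing would look roughly like the sketch below, reusing the config's `routed_scaling_factor` of 2.5. Every name here is hypothetical.

```python
import numpy as np

def noaux_tc_route(logits, bias, top_k=2, routed_scaling_factor=2.5):
    """Hypothetical 'noaux_tc' routing sketch, NOT the actual dots.llm1 code:
    sigmoid affinities, bias-steered top-k selection, gate weights taken from
    the unbiased scores."""
    scores = 1.0 / (1.0 + np.exp(-logits))            # expert affinities in (0, 1)
    chosen = np.argsort(scores + bias)[::-1][:top_k]  # bias affects selection only
    gates = scores[chosen]
    gates = gates / gates.sum() * routed_scaling_factor  # normalize, then scale
    return chosen, gates

# A positive per-expert bias pulls an otherwise-unselected expert into the top-k
# without changing its gate weight's origin (the unbiased score).
experts, weights = noaux_tc_route(
    np.array([2.0, -1.0, 0.5, 1.0]), np.array([0.0, 0.0, 1.0, 0.0])
)
```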
|
figures/new_logo2.png
ADDED