Improve model card: add pipeline tag, library name, code link, and usage example (#1)
- Improve model card: add pipeline tag, library name, code link, and usage example (4cf85f01308bc7f1f2e75b12582bcfc401da9ee6)
- Update README.md (c55a1975d37ac6001ba99336b9eaed3e437f7edf)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
@@ -1,12 +1,17 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - JunxiongWang/Llama3.2-Mamba2-3B-distill
+language:
+- en
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
 ---
+
 # Description
 
 2 layer mamba2 models distilled from JunxiongWang/Llama3.2-Mamba2-3B-distill. Early stop at 48000 step.
 
-Used in [STree](https://arxiv.org/abs/2505.14969) as a draft model for speculative decoding for hybrid models.
+Used in [STree: Speculative Tree Decoding for Hybrid State-Space Models](https://arxiv.org/abs/2505.14969) as a draft model for speculative decoding for hybrid models.
+
+For more details on installation, training, and evaluation, please refer to the [GitHub repository](https://github.com/wyc1997/stree).
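
For readers unfamiliar with the role of a draft model, the following is a minimal sketch of plain (linear, greedy) speculative decoding. It is not STree's tree-based method and does not load this checkpoint: `target` and `draft` are hypothetical toy next-token functions standing in for the large model and the small draft model, and `speculative_decode` is an illustrative name, not an API from the repository.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy next-token function


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[Token], max_new_tokens: int,
                       k: int = 4) -> List[Token]:
    """Greedy speculative decoding with a draft model proposing k tokens."""
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks each proposed position. A real system
        #    scores all k positions in one batched forward pass.
        n_accepted = 0
        correction = None
        for i, guess in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if expected != guess:
                correction = expected
                break
            n_accepted += 1

        # 3. Keep the agreed prefix, then one token from the target:
        #    its correction on a mismatch, or a bonus token otherwise.
        tokens.extend(proposal[:n_accepted])
        tokens.append(correction if correction is not None else target(tokens))
    return tokens[:limit]


# Toy stand-ins (assumptions for illustration only).
def target(ctx: List[Token]) -> Token:
    return (2 * ctx[-1] + 1) % 7


def draft(ctx: List[Token]) -> Token:
    # Agrees with the target except after token 1.
    return 0 if ctx[-1] == 1 else target(ctx)


print(speculative_decode(target, draft, prompt=[1], max_new_tokens=10))
```

Because every emitted token is either confirmed or produced by `target`, the output matches ordinary greedy decoding with `target` alone; the draft model only changes how many target calls can be batched per step.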