atsuki-yamaguchi committed on
Commit
2c6f2da
·
verified ·
1 Parent(s): 1be16ae

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +63 -0
README.md ADDED
@@ -0,0 +1,63 @@
+
+ ---
+ license: apache-2.0
+ datasets:
+ - allenai/MADLAD-400
+ language:
+ - bn
+ base_model:
+ - Qwen/Qwen3-14B-Base
+ library_name: transformers
+ ---
+ # Qwen3 14B Base for Bengali: Vocabulary expansion
+
+ This model is Qwen3 14B Base adapted to Bengali through vocabulary expansion and continual pre-training on 500M Bengali tokens sampled from MADLAD-400. Its vocabulary is extended with 10K additional Bengali tokens.
+
+ ## Model Details
+
+ * **Vocabulary**: The base vocabulary is extended with 10K additional target-language (Bengali) tokens.
+ * **Target vocabulary initialization**: The embedding and LM head weights of the new target tokens were initialized using mean initialization.
+ * **Training**: This model was continually pre-trained on 500M target-language tokens sampled from MADLAD-400.
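
The initialization step listed above can be illustrated with a toy sketch. The README does not spell out which variant of mean initialization was used, so the following assumes one common reading: each new token's row is the average of the source embedding rows of the pieces the original tokenizer would split it into. The function name `mean_init` and the tiny 2-dimensional embedding table are hypothetical and chosen only for illustration; they are not taken from the linked repository.

```python
def mean_init(source_embeddings, subtoken_ids):
    """Return the element-wise mean of the source rows selected by subtoken_ids.

    Assumption: a new target token is initialized from the source-tokenizer
    pieces it decomposes into. This is an illustrative reading, not
    necessarily the exact procedure used for this model.
    """
    if not subtoken_ids:
        raise ValueError("a new token must map to at least one source subtoken")
    dim = len(source_embeddings[0])
    count = len(subtoken_ids)
    return [
        sum(source_embeddings[i][d] for i in subtoken_ids) / count
        for d in range(dim)
    ]


# Toy source embedding table: 3 existing tokens, 2 dimensions each.
source_embeddings = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

# Hypothetical new token that the source tokenizer splits into tokens 0 and 2.
new_row = mean_init(source_embeddings, [0, 2])

# Expanding the vocabulary appends the new row to the table.
expanded_embeddings = source_embeddings + [new_row]
```

The same averaging would be applied to the LM head rows. With a real checkpoint, the table would first be grown to the new vocabulary size (for example with `resize_token_embeddings` in `transformers`) before the new rows are overwritten.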
+
+
+ ## Model Description
+
+ - **Language:** Bengali
+ - **License:** Apache 2.0
+ - **Fine-tuned from model:** Qwen/Qwen3-14B-Base
+
+
+ ## Model Sources
+
+ - **Repository:** https://github.com/gucci-j/chat-cve
+ - **Paper:** https://arxiv.org/abs/2412.11704
+
+
+ ## How to Get Started with the Model
+ Use the code below to get started with the model.
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "atsuki-yamaguchi/Qwen3-14B-Base-bn-madlad-mean-tuned"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(
+     "atsuki-yamaguchi/Qwen3-14B-Base-bn-madlad-mean-tuned"
+ )
+ ```
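
The snippet above only loads the model and tokenizer. As a possible next step, a minimal sketch of text generation could look like the following. The helper name `generate_text`, the greedy decoding setting, and the default of 64 new tokens are assumptions made for illustration and are not part of the committed README.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Greedy-decode a continuation of `prompt` and return only the new text."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # do_sample=False requests greedy decoding (an illustrative choice).
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
    )

    # For a decoder-only model the output starts with the prompt tokens,
    # so slice them off before decoding.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(
        output_ids[0][prompt_length:],
        skip_special_tokens=True,
    )
```

With `model` and `tokenizer` loaded as shown earlier, it would be called as `generate_text(model, tokenizer, prompt)`, where `prompt` is a Bengali text prefix. Since this is a base model rather than a chat model, it continues the prefix instead of following instructions.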
+
+
+ ## Citation
+ ```
+ @article{yamaguchi2025adapting,
+     title={Adapting Chat Language Models Using Only Target Unlabeled Language Data},
+     author={Atsuki Yamaguchi and Terufumi Morishita and Aline Villavicencio and Nikolaos Aletras},
+     journal={Transactions on Machine Learning Research},
+     issn={2835-8856},
+     year={2025},
+     url={https://openreview.net/forum?id=6IdoIKowfe},
+     note={}
+ }
+ ```
+
+