This is a randomly initialized tiny Qwen3 model with 10M parameters. It is part of a project to create an LLM from scratch entirely on Apple silicon. I'm using Alibaba's existing tokenizer, the one used for the Qwen3 models.
The next (trained) version of this will be at Goekdeniz-Guelmez/J.O.S.I.E.-Qwen3-10M-Base-Phase1.
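
As a rough sanity check on the 10M figure, here is a minimal sketch of how the parameter count of a Qwen3-style decoder can be estimated. The hyperparameters below (hidden size, layer count, head counts, MLP width) are hypothetical and are not taken from this model's actual config; the vocabulary size of 151936 and tied input/output embeddings are also assumptions. The point is only that with a vocabulary this large, the embedding table accounts for nearly all of the parameters in a model of this size.

```python
def count_params(vocab_size, hidden, layers, n_heads, n_kv_heads,
                 head_dim, intermediate, tie_embeddings=True):
    """Estimate parameters of a Qwen3-style decoder (no biases, RMSNorm)."""
    embed = vocab_size * hidden                    # token embedding table

    attn = hidden * n_heads * head_dim             # q_proj
    attn += 2 * hidden * n_kv_heads * head_dim     # k_proj and v_proj (grouped-query attention)
    attn += n_heads * head_dim * hidden            # o_proj
    attn += 2 * head_dim                           # RMSNorm weights on queries and keys

    mlp = 3 * hidden * intermediate                # gate_proj, up_proj, down_proj
    norms = 2 * hidden                             # two RMSNorms per layer

    total = embed + layers * (attn + mlp + norms) + hidden  # + final RMSNorm
    if not tie_embeddings:
        total += vocab_size * hidden               # separate lm_head
    return total


# Hypothetical configuration, chosen only to land near 10M parameters.
HYPOTHETICAL_CONFIG = dict(
    vocab_size=151936,   # assumed Qwen3 tokenizer vocabulary size
    hidden=64,
    layers=2,
    n_heads=4,
    n_kv_heads=2,
    head_dim=16,
    intermediate=128,
)

if __name__ == "__main__":
    total = count_params(**HYPOTHETICAL_CONFIG)
    embed = HYPOTHETICAL_CONFIG["vocab_size"] * HYPOTHETICAL_CONFIG["hidden"]
    print(f"total parameters: {total:,}")
    print(f"embedding share:  {embed / total:.1%}")
```

With these assumed values the estimate comes out just under 10M, and untying the embeddings would roughly double it.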