---
license: gpl-3.0
language:
- vi
- en
pipeline_tag: text-generation
tags:
- chemistry
- biology
- code
datasets:
- open-r1/OpenR1-Math-220k
- open-thoughts/OpenThoughts-114k
base_model:
- meta-llama/Llama-3.3-70B-Instruct
- google/gemma-3-27b-it
metrics:
- chrf
library_name: fasttext
---
![Frame 34.png](https://cdn-uploads.huggingface.co/production/uploads/67bb241a51e34f525bd1b1fc/5A89_Ng_DDt57cahHJsxC.png)

# SOFIA X1 MODEL

Trained to serve as an AI instructor. Excels at coding, deep learning, and machine learning.

## Model Details

- **Total parameters:** 405 billion
- **Active parameters:** 70 billion
- **Max input tokens:** 128K
- **Max output tokens:** 1M
- **Precision:** FP8
- **Knowledge cut-off date:** March 17, 2025

### Model Description

- **Developed by:** ngkhoi, quren
- **Model type:** X1 v1.2
- **Language(s) (NLP):** Vietnamese, English
- **License:** GPL-3.0

## Risks & Limitations

- Because of leftover Llama 3.3 cores, the model still has issues with multilingual output.
- Mediocre response speed.
- Requires high-end hardware to run.

## Demo

Discord bot access only.

### Training Data

**! The model is designed to solve problems and assignments ONLY for the Vietnam region !**

- **Subjects**
  - Mathematics 10
  - Physics 10
  - Chemistry 10 (dataset may conflict with results from online websites)
- **Programming languages**
  - Python
  - Lua
  - HTML
  - CSS
- **Integrated**
  - Google Gemma (image analysis)
  - Deep ML

[More Information Needed]

#### Hardware

- **CPU:** >= 8 cores
- **RAM:** > 32 GB
- **GPU:** NVIDIA RTX 3000 series or newer
- **Storage:** more than 1 TB

[More Information Needed]
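As a minimal sketch of verifying the hardware requirements listed above, the snippet below uses only the Python standard library. The `check_hardware` helper is a hypothetical name introduced here for illustration; the RAM check assumes a POSIX-style `os.sysconf`, and GPU detection is omitted because it requires a vendor library beyond the stdlib.

```python
import os
import shutil


def check_hardware():
    """Check this machine against the stated requirements.

    Returns a dict of booleans. GPU detection is intentionally
    omitted (needs CUDA/vendor bindings outside the stdlib).
    """
    checks = {}

    # CPU: at least 8 logical cores
    checks["cpu"] = (os.cpu_count() or 0) >= 8

    # RAM: more than 32 GB (POSIX-only; guarded for other platforms)
    try:
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        checks["ram"] = ram_bytes > 32 * 1024**3
    except (ValueError, OSError, AttributeError):
        checks["ram"] = False  # could not determine on this platform

    # Storage: more than 1 TB total on the root drive
    checks["storage"] = shutil.disk_usage("/").total > 1 * 1024**4

    return checks


if __name__ == "__main__":
    for name, ok in check_hardware().items():
        print(f"{name}: {'OK' if ok else 'insufficient/unknown'}")
```

The thresholds mirror the list above (8 cores, 32 GB RAM, 1 TB storage); adjust them if the requirements change.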