Update README.md

README.md (CHANGED)

@@ -17,7 +17,7 @@ base_model:
 A family of fully open-source large multimodal models demonstrating **superior performance** across multiple multimodal benchmarks, **outperforming Qwen2.5-VL** in most evaluation tasks.

 2. **High-Quality Data at Scale**
-   Meticulously curated **
+   Meticulously curated **mid-training and SFT data** with rigorous filtering and quality control, achieving **superior data efficiency** with only **5B tokens** (1.2% of Qwen2.5-VL's training data).
    - Concept-balanced, highly diverse, high-quality caption data
    - Comprehensive instruction fine-tuning data covering a wide range of tasks

@@ -29,7 +29,7 @@ Complete end-to-end training framework designed for maximum efficiency:
    - Optimized codebase for cost-effective scaling

 4. **Fully Open Framework** for community access and reproducibility:
-   - ✅ High-quality
+   - ✅ High-quality mid-training & SFT data
    - ✅ Complete training framework & code
    - ✅ Training recipes & configurations
    - ✅ Base & instruct model checkpoints

@@ -38,7 +38,7 @@ Complete end-to-end training framework designed for maximum efficiency:
 ## Dataset
 | Description | Link |
 |-------------|------|
-
+| Mid-training data for LLaVA-OneVision-1.5 | [🤗 Download (Uploading!)](https://huggingface.co/datasets/lmms-lab/LLaVA-One-Vision-1.5-Mid-Training-85M) |
 | SFT data for LLaVA-OneVision-1.5 | [🤗 Download (Uploading!)](https://huggingface.co/datasets/lmms-lab/LLaVA-One-Vision-1.5-Insturct-26M) |

 ## Evaluation Results
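The dataset rows in the last hunk link to Hugging Face dataset repos. Below is a minimal sketch of fetching them once the uploads finish; the repo IDs are copied from the links above, while the local handling, the `split="train"` name, and whether `load_dataset` can parse the repo layout are assumptions.

```python
# Sketch only: pull the newly linked datasets from the Hugging Face Hub.
# Repo IDs are taken from the table above; both repos are marked as still
# uploading, so the split name and on-disk layout are assumptions.
from huggingface_hub import snapshot_download
from datasets import load_dataset

# Download the raw files of the mid-training set (repo_type="dataset" is
# required so the Hub resolves a dataset repo rather than a model repo).
mid_training_dir = snapshot_download(
    repo_id="lmms-lab/LLaVA-One-Vision-1.5-Mid-Training-85M",
    repo_type="dataset",
)
print("mid-training files downloaded to:", mid_training_dir)

# If the SFT repo ships a standard format (parquet/arrow/jsonl), it may also
# load directly into a Dataset object; "train" is a guessed split name.
sft = load_dataset("lmms-lab/LLaVA-One-Vision-1.5-Insturct-26M", split="train")
print(sft)
```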