LLM Dataset
stepfun-ai/Step-3.5-Flash-SFT Viewer • Updated Mar 14 • 1.62M • 18.9k • 327
allenai/olmOCR-mix-1025 Viewer • Updated Oct 21, 2025 • 270k • 1.3k • 34
PrimeIntellect/SYNTHETIC-2-SFT-verified Viewer • Updated Jul 10, 2025 • 105k • 126 • 9
ianncity/KIMI-K2.5-1000000x Viewer • Updated 25 days ago • 733k • 5.88k • 252

Selfies Roberta
Experimental datasets for training a BERT-architecture model for drug discovery
HoangHa/belka-selfies-ids Viewer • Updated Jun 17, 2024 • 99.3M • 78
HoangHa/enamine-diverse-selfies-pretrain Viewer • Updated Jun 19, 2024 • 39.5M • 97
HoangHa/enamine-nature-selfies-pretrain-p1 Viewer • Updated Jun 20, 2024 • 117M • 93
HoangHa/pubchem-selfies-pretrain Viewer • Updated Jun 18, 2024 • 102M • 83 • 1

Pensez-LLM
French-English reasoning model
Pensez: Less Data, Better Reasoning -- Rethinking French LLM Paper • 2503.13661 • Published Mar 17, 2025 • 5
HoangHa/Pensez-v0.1-e5 Text Generation • 8B • Updated Apr 17, 2025 • 66 • 17
HoangHa/Pensez-v0.1-e5-GGUF 8B • Updated Feb 28, 2025 • 51 • 6
HoangHa/Pensez Viewer • Updated Feb 17, 2025 • 1.62k • 19 • 1

Medical Dataset
HoangHa/medical_SFT Viewer • Updated May 20, 2024 • 508k • 17 • 1
HoangHa/wiki_med_en Viewer • Updated Apr 28, 2024 • 7.32k • 8 • 1
HoangHa/medical_meadow_medqa_inout Viewer • Updated Jan 11, 2024 • 10.2k • 40
HoangHa/Llama2-MedTuned-Instructions_inout Viewer • Updated Jan 11, 2024 • 205k • 15