Aletheia-ng/processed_data
Updated • 1.67k
• Updated • 8.73M • 6
• Updated • 94.8M • 120
• Updated • 158M • 29
• Updated • 200M • 49
Aletheia-ng/pidgin-corpus-synth
• Updated • 57.1k • 18
Aletheia-ng/yoruba-corpus-synth
• Updated • 20.2k • 16
Aletheia-ng/nigerian-pidgin-corpus-synth
Aletheia-ng/pretrain_data10
• Updated • 40.9M • 37
Aletheia-ng/low_resource_languages_pretrain_data4
• Updated • 469M • 152
Aletheia-ng/pretrain_data11
Aletheia-ng/pretrain_data9
• Updated • 79.1M • 54
Aletheia-ng/pretrain_data5
• Updated • 9.43M • 53
Aletheia-ng/pretrain_data4
• Updated • 124M • 189
Aletheia-ng/pretrain_data7
• Updated • 13M • 10
Aletheia-ng/pretrain_data3
• Updated • 143M • 116
• Updated • 136 • 12
Aletheia-ng/pretrain_data
• Updated • 109M • 51
Aletheia-ng/pretrain_data2
• Updated • 18.2M • 25
Aletheia-ng/low_resource_languages_pretrain
• Updated • 202M • 255 • 1
Aletheia-ng/masakhaner_eval
Aletheia-ng/noisy_dataset
• Updated • 84k • 8
• Updated • 84k • 4
Aletheia-ng/personal_finance_v0.2
• Updated • 56.6k • 7 • 1
Aletheia-ng/bloomberg-news-articles-pretraining-dataset
• Updated • 437k • 7 • 5
Aletheia-ng/ChatML-aya_dataset
• Updated • 202k • 6
Aletheia-ng/yo_wiki_processed
• Updated • 43.5k • 6