sudoping01 committed on
Commit f0fb690 · verified · 1 Parent(s): 0234106

Update README.md

Files changed (1)
  1. README.md +75 -259
README.md CHANGED
@@ -38,86 +38,54 @@ pipeline_tag: text-to-speech
38
  license: cc-by-nc-sa-4.0
39
  ---
40
 
41
- # Speech Synthesis for Bambara Language 🇲🇱
42
 
43
- MALIBA-AI Bambara TTS represents a groundbreaking advancement in African language technology, offering **open-source, high-quality text-to-speech synthesis** specifically designed for the Bambara language. Built on cutting-edge Spark-TTS architecture, this model brings professional-grade voice synthesis to a language spoken by over 14 million people across West Africa.
44
 
45
- ## Bridging the Digital Language Divide
46
 
47
- Bambara (Bamanankan) is the most widely spoken language in Mali and serves as a lingua franca across West Africa. Despite its significance, Bambara has been severely underrepresented in speech technology. MALIBA-AI Bambara TTS directly addresses this critical gap, making digital speech interfaces accessible to Bambara speakers for the first time open-source and advancing digital inclusion across the region.
48
 
49
- ## Table of Contents
50
- - [Technical Specifications](#technical-specifications)
51
- - [Speaker System](#speaker-system)
52
- - [Transforming Access to Technology](#transforming-access-to-technology)
53
- - [Installation](#installation)
54
- - [Usage](#usage)
55
- - [Performance & Quality](#performance--quality)
56
- - [Limitations](#limitations)
57
- - [The MALIBA-AI Impact](#the-maliba-ai-impact)
58
- - [Future Development](#future-development)
59
- - [References](#references)
60
- - [License](#license)
61
- - [Contributing](#contributing)
62
 
63
- ## Technical Specifications
 
 
 
 
64
 
65
- ### Model Architecture
66
- - **Base Architecture**: Spark-TTS (LLM-based Text-to-Speech)
67
- - **Foundation Model**: Qwen2.5-based language model
68
- - **Innovation**: Single-stream decoupled speech tokens
69
- - **Model Size**: ~500M parameters
70
- - **Format**: PyTorch/Transformers compatible
71
- - **Sampling Rate**: 16kHz
72
- - **Audio Encoding**: 16-bit PCM mono
73
- - **Language**: Bambara (bm-ML)
74
 
75
- ### Key Technical Features
76
- - **Zero-dependency Generation**: No separate flow matching or vocoder models required
77
- - **Direct Audio Reconstruction**: LLM directly predicts audio tokens
78
- - **Efficient Architecture**: Streamlined process improving both speed and quality
79
- - **GPU Acceleration**: Optimized for CUDA when available
80
- - **CPU Compatibility**: Functional on CPU-only systems
81
 
82
- ## Speaker System
83
 
84
- MALIBA-AI Bambara TTS features **10 distinct authentic Bambara speakers**, each with unique characteristics:
 
85
 
86
- ### Available Speakers
87
- - **Adama**
88
- - **Moussa**
89
- - **Bourama**
90
- - **Modibo**
91
- - **Seydou**
92
- - **Amadou**
93
- - **Bakary**
94
- - **Ngolo**
95
- - **Ibrahima**
96
- - **Amara**
97
 
98
- **Note**: try them and choose your preference for your use case.
99
 
 
100
 
101
- ## Installation
102
 
103
- Install the MALIBA-AI SDK using pip:
 
 
 
 
104
 
105
  ```bash
106
- pip install maliba_ai
107
  ```
 
108
 
109
- For faster installation with uv:
110
  ```bash
111
  uv pip install maliba_ai
112
  ```
113
 
114
- Development installation:
115
  ```bash
116
- git clone https://github.com/MALIBA-AI/bambara-tts.git
117
- cd bambara-tts
118
- pip install -e .
119
  ```
120
-
121
  Note: if you are in Colab, please install these additional dependencies:
122
 
123
  ```
@@ -126,249 +94,97 @@ Note : if you are in colab please install those additional dependencies :
126
  !pip install --no-deps unsloth
127
  ```
128
 
129
-
130
-
131
- ## Usage
132
-
133
- ### Quick Start
134
 
135
  ```python
136
- from maliba_ai.tts import BambaraTTSInference
137
  from maliba_ai.config.settings import Speakers
138
- import soundfile as sf
139
 
140
- # Initialize the TTS system
141
  tts = BambaraTTSInference()
142
 
143
- # Generate speech from Bambara text
144
  text = "Aw ni ce. I ka kɛnɛ wa?"
145
- audio = tts.generate_speech(text, speaker_id=Speakers.Bourama)
146
 
147
- # Save the audio
148
- sf.write("greeting.wav", audio, 16000)
149
- print("Bambara speech generated successfully!")
150
  ```
151
 
152
- ### Advanced Usage
153
-
154
- ```python
155
- # Fine-tune generation parameters
156
- audio = tts.generate_speech(
157
- text="An ka baara kɛ ɲɔgɔn fɛ", # "Let's work together"
158
- speaker_id=Speakers.Adama,
159
- temperature=0.8, # Sampling temperature
160
- top_k=50, # Vocabulary sampling
161
- top_p=0.9, # Nucleus sampling
162
- max_new_audio_tokens=2048, # Maximum audio length
163
- output_filename="collaboration.wav" # Auto-save option
164
- )
165
- ```
166
 
167
- ### Multi-Speaker Examples
168
-
169
- ```python
170
- from maliba_ai.config.settings import Speakers
171
-
172
- text = "Aw ni ce. Ne tɔgɔ ye Adama. Awɔ, ne ye maliden de ye. Aw Sanbɛ Sanbɛ. San min tɛ ɲinan ye, an bɛɛ ka jɛ ka o seli ɲɔgɔn fɛ, hɛɛrɛ ni lafiya la. Ala ka Mali suma. Ala ka Mali yiriwa. Ala ka Mali taa ɲɛ. Ala ka an ka seliw caya. Ala ka yafa an bɛɛ ma."
173
-
174
- #let's try Adama
175
- tts.generate_speech(
176
- text = text,
177
- speaker_id = Speakers.Adama,
178
- output_filename = "adama.wav"
179
- )
180
-
181
-
182
- #let's try Seydou
183
- tts.generate_speech(
184
- text = text,
185
- speaker_id = Speakers.Seydou,
186
- output_filename = "seydou.wav"
187
- )
188
-
189
-
190
- # let's try Bourama
191
- tts.generate_speech(
192
- text = text,
193
- speaker_id = Speakers.Bourama,
194
- output_filename = "bourama.wav"
195
- )
196
 
197
- ```
 
 
 
 
 
198
 
199
- ## Performance & Quality
200
 
201
- ### Quality Metrics
202
- - **Mean Opinion Score (MOS)**: 4.2/5.0 for naturalness
203
- - **Speaker Similarity**: High fidelity to original speaker characteristics
204
- - **Intelligibility**: 95%+ word recognition accuracy
205
- - **Pronunciation Accuracy**: Native-level Bambara pronunciation
206
 
 
 
 
 
207
 
 
 
 
208
 
 
209
 
210
- ## Limitations
 
 
211
 
212
- ### Known Limitations
 
 
213
 
214
- #### Language Mixing (Code-Switching)
215
- - **French-Bambara Mixing**: The model performs poorly when French words or phrases are mixed within Bambara text
216
- - **Recommendation**: Use pure Bambara text for optimal results
217
 
 
218
 
219
- #### Numeric Content
220
- - **Digital Numbers**: Poor performance with Arabic numerals (1, 2, 3, etc.)
221
- - **Written Numbers**: Good performance with Bambara number words
222
- - **Recommendation**: Convert digits to written Bambara number words
223
 
224
- ## The MALIBA-AI Impact
225
 
226
- MALIBA-AI Bambara TTS is part of MALIBA-AI's broader mission: **"No Malian Left Behind by Technological Advances."** This initiative is actively transforming Mali's digital landscape by:
227
 
228
- ### Digital Inclusion
229
- 1. **Breaking Language Barriers**: Providing technology in languages that Malians actually speak
230
- 2. **Literacy Support**: Audio interfaces for users with varying literacy levels
231
- 3. **Cultural Preservation**: Digitizing and preserving Mali's rich oral traditions
232
 
233
- ### Technological Empowerment
234
- 1. **Local Innovation**: Enabling Malian developers to build voice-based applications
235
- 2. **AI Democratization**: Making cutting-edge speech technology accessible to all
236
- 3. **Economic Opportunities**: Creating new possibilities for tech entrepreneurship in Mali
237
- 4. **Educational Advancement**: Supporting mother-tongue education through technology
238
 
239
- ### Community Impact
240
- - **14+ Million Speakers**: Directly serving the Bambara-speaking population
241
- - **Regional Influence**: Supporting Bambara speakers across West Africa
242
- - **Cultural Identity**: Strengthening linguistic identity in the digital age
243
- - **Intergenerational Bridge**: Connecting traditional oral culture with digital innovation
244
 
245
- ## Future Development
246
 
247
- MALIBA-AI is committed to continuous improvement with planned developments:
248
 
249
- ### Technical Roadmap
250
- - **Enhanced Code-Switching**: Better support for French-Bambara mixed content
251
- - **Improved Numerics**: Advanced handling of numbers, dates, and technical terms
252
- - **Emotion Control**: Adjustable emotional expression in synthesis
253
- - **Voice Cloning**: Zero-shot voice cloning capabilities for new speakers
254
- - **Streaming Audio**: Real-time streaming synthesis for interactive applications
255
 
 
 
 
 
 
256
 
 
257
 
258
- ## References
259
 
260
  ```bibtex
261
- @software{maliba_ai_bambara_tts_2025,
262
- title={MALIBA-AI Bambara Text-to-Speech: Open-Source Hight Quality TTS for Bambara Language},
263
  author={MALIBA-AI},
264
  year={2025},
265
- publisher={HuggingFace},
266
- url={https://huggingface.co/MALIBA-AI/bambara-tts},
267
- note={Built on Spark-TTS architecture}
268
- }
269
-
270
- @misc{wang2025sparktts,
271
- title={Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens},
272
- author={Xinsheng Wang and Mingqi Jiang and Ziyang Ma and Ziyu Zhang and Songxiang Liu and Linqin Li and Zheng Liang and Qixi Zheng and Rui Wang and Xiaoqin Feng and Weizhen Bian and Zhen Ye and Sitong Cheng and Ruibin Yuan and Zhixian Zhao and Xinfa Zhu and Jiahao Pan and Liumeng Xue and Pengcheng Zhu and Yunlin Chen and Zhifei Li and Xie Chen and Lei Xie and Yike Guo and Wei Xue},
273
- year={2025},
274
- eprint={2503.01710},
275
- archivePrefix={arXiv},
276
- primaryClass={cs.SD},
277
- url={https://arxiv.org/abs/2503.01710}
278
  }
279
-
280
  ```
281
 
282
-
283
- ## Usage Disclaimer & Ethical Guidelines
284
-
285
- ⚠️ **Important Usage Guidelines**
286
-
287
- This Bambara TTS model is intended for legitimate applications that benefit the Bambara-speaking community and support language preservation efforts.
288
-
289
- ### Authorized Uses:
290
- - **Educational purposes**: Language learning, pronunciation training, literacy programs
291
- - **Accessibility tools**: Screen readers, communication aids for people with disabilities
292
- - **Cultural preservation**: Documenting oral traditions, creating audio archives
293
- - **Research**: Academic studies on Bambara linguistics and speech technology
294
- - **Community applications**: Local radio, public announcements, community services
295
-
296
- ### Prohibited Uses:
297
- - **Unauthorized voice cloning** or impersonation without explicit consent
298
- - **Fraud or scams** using generated Bambara speech
299
- - **Deepfakes or misleading content** that could harm individuals or communities
300
- - **Any illegal activities** under local or international law
301
- - **Harassment or discrimination** targeting any group or individual
302
-
303
- ### Ethical Responsibilities:
304
- - Always obtain proper consent when using someone's voice characteristics
305
- - Clearly disclose when audio content is AI-generated
306
- - Respect the cultural significance of the Bambara language
307
- - Support the Bambara-speaking community's digital inclusion
308
- - Report any misuse of the technology to the MALIBA-AI team
309
-
310
- ### Community Standards:
311
- The MALIBA-AI project is committed to responsible AI development that empowers communities rather than exploiting them. We encourage users to:
312
- - Engage with Bambara speakers and communities respectfully
313
- - Contribute to the preservation and promotion of Bambara language
314
- - Use this technology to bridge digital divides, not create them
315
- - Share improvements back with the community when possible
316
-
317
- **The developers assume no liability for any misuse of this model. Users are responsible for ensuring their applications comply with applicable laws and ethical standards.**
318
-
319
- If you have concerns about potential misuse or need guidance on ethical applications, please contact us at ml.maliba.ai@gmail.com
320
-
321
- - **Spark-TTS**: Foundation architecture for neural speech synthesis
322
-
323
- - **MALIBA-AI team**: Dedicated developers, researchers, and linguists
324
- - **Mali**: Our inspiration for building inclusive technology that serves all communities
325
- - **Open source community**: Contributors and users who help improve the syste
326
-
327
- ## License
328
-
329
- ⚠️ **Important License Information**
330
-
331
- This project is licensed under **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)** due to the licensing terms of the underlying Spark-TTS architecture and training data.
332
-
333
- ### Key License Terms
334
- - **Non-Commercial Use Only**: Research, education, and personal use permitted
335
- - **Share-Alike**: Derivatives must use the same license
336
- - **Attribution Required**: Must credit MALIBA-AI and Spark-TTS
337
-
338
- ### Commercial Usage
339
- For commercial licensing options, contact: ml.maliba.ai@gmail.com
340
-
341
- ### Attribution Requirements
342
- ```
343
- This work uses MALIBA-AI Bambara TTS, built on Spark-TTS architecture.
344
- Licensed under CC BY-NC-SA 4.0.
345
- Original work: https://huggingface.co/MALIBA-AI/bambara-tts
346
- Spark-TTS: https://github.com/SparkAudio/Spark-TTS
347
- ```
348
-
349
- ## Contributing
350
-
351
- MALIBA-AI Bambara TTS is part of the broader MALIBA-AI initiative with the mission **"No Malian Left Behind by Technological Advances."** We welcome contributions from:
352
-
353
- ### Community Contributors
354
- - **Bambara Language Experts**: To improve linguistic accuracy and cultural authenticity
355
- - **Native Speakers**: For quality assessment and dialectal insights
356
- - **Developers**: To create applications and integrations
357
- - **Researchers**: To advance the underlying technology
358
- - **Data Contributors**: To expand and improve training datasets
359
-
360
- ### How to Contribute
361
- - **GitHub**: [MALIBA-AI/bambara-tts](https://github.com/MALIBA-AI/bambara-tts)
362
- - **HuggingFace**: [MALIBA-AI](https://huggingface.co/MALIBA-AI)
363
- - **Email**: ml.maliba.ai@gmail.com
364
- - **Community**: Join discussions on model improvements and applications
365
-
366
- ### Contribution Guidelines
367
- - Respect Bambara language and culture
368
- - Ensure proper consent for any voice data contributions
369
- - Follow community standards for inclusive development
370
- - Test thoroughly across different speakers and content types
371
-
372
  ---
373
 
374
  **MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**
 
38
  license: cc-by-nc-sa-4.0
39
  ---
40
 
 
41
 
 
42
 
 
43
 
 
44
 
45
+ # MALIBA-AI Bambara TTS 🇲🇱
46
 
47
+ <style>
48
+ img {
49
+ display: inline;
50
+ }
51
+ </style>
52
 
53
+ [![Model architecture](https://img.shields.io/badge/Model_Arch-Spark--TTS-lightgrey)](#model-architecture)
54
+ | [![Model size](https://img.shields.io/badge/Params-500M-lightgrey)](#model-architecture)
55
+ | [![Language](https://img.shields.io/badge/Language-bm-lightgrey)](#datasets)
56
+ | [![License](https://img.shields.io/badge/License-CC--BY--NC--SA--4.0-blue)](#license)
 
 
 
 
 
57
 
58
+ ## Model Overview
 
 
 
 
 
59
 
60
+ This model provides neural text-to-speech synthesis for Bambara (Bamanankan), the most widely spoken language in Mali. The model supports 10 authentic Bambara speakers and produces high-fidelity audio without requiring separate vocoder models. It serves over 14 million Bambara speakers across West Africa with native-level pronunciation and cultural authenticity.
61
 
62
+ - Try our live demo on [Hugging Face Spaces](https://huggingface.co/spaces/MALIBA-AI/BambaraText2Speech)
63
+ - **Available Speakers:** Adama, Moussa, Bourama, Modibo, Seydou, Amadou, Bakary, Ngolo, Ibrahima, Amara
65
 
66
+ ## Quick Start
67
 
68
+ ### Installation
69
 
 
70
 
71
+ ```bash
72
+ pip install maliba_ai
73
+ ```
74
+
75
+ For development installations:
76
 
77
  ```bash
78
+ pip install git+https://github.com/MALIBA-AI/bambara-tts.git
79
  ```
80
+ With uv (faster):
81
 
 
82
  ```bash
83
  uv pip install maliba_ai
84
  ```
85
 
 
86
  ```bash
87
+ uv pip install git+https://github.com/MALIBA-AI/bambara-tts.git
 
 
88
  ```
 
89
  Note: if you are in Colab, please install these additional dependencies:
90
 
91
  ```
 
94
  !pip install --no-deps unsloth
95
  ```
96
 
97
+ ### Basic Usage
 
 
 
 
98
 
99
  ```python
100
+ from maliba_ai.tts.inference import BambaraTTSInference
101
  from maliba_ai.config.settings import Speakers
 
102
 
 
103
  tts = BambaraTTSInference()
104
 
 
105
  text = "Aw ni ce. I ka kɛnɛ wa?"
106
+ audio = tts.generate_speech(text=text, speaker_id=Speakers.Bourama, output_path="greeting.wav")
107
 
 
 
 
108
  ```
109
 
110
+ Note: for more detail, see https://github.com/sudoping01/bambara-tts/blob/main/README.md
111
 
112
+ ## Technical Specifications
113
 
114
+ ### Architecture
115
+ - **Base Model**: Spark-TTS (LLM-based TTS)
116
+ - **Foundation**: Qwen2.5-based language model
117
+ - **Parameters**: ~500M
118
+ - **Audio Format**: 16kHz, 16-bit PCM mono
119
+ - **Language Support**: Bambara (bm-ML)
120
 
 
121
 
122
+ ## Model Input/Output
 
 
 
 
123
 
124
+ ### Input
125
+ - **Text**: Bambara text in standard orthography
126
+ - **Speaker ID**: Choice of 10 available speakers
127
+ - **Parameters**: Temperature, top-k, top-p (optional)
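The optional parameters listed above are common sampling controls for LLM-based generation. The README does not describe how the model applies them internally, so the following is only a generic sketch of what temperature, top-k, and top-p usually mean when choosing the next token. `sample_token` is a hypothetical helper written for illustration and is not part of `maliba_ai`; the default values are illustrative too.

```python
import math
import random


def sample_token(logits, temperature=0.8, top_k=50, top_p=0.9, rng=None):
    """Pick one token index from raw logits using temperature, top-k and top-p."""
    rng = rng or random.Random()

    # Temperature rescales the logits: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose cumulative probability
    # reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the surviving tokens, proportionally to their probability.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

Lower temperature and smaller top-k or top-p values make the choice more deterministic; larger values allow more variation between runs.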
128
 
129
+ ### Output
130
+ - **Audio**: 16kHz mono WAV format
131
+ - **Quality**: Professional-grade speech synthesis
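To make the output format concrete, here is a minimal sketch that writes a 16 kHz, 16-bit PCM mono WAV file using only the Python standard library. It assumes the audio is available as a sequence of floats in [-1.0, 1.0], which is an assumption and not something this README states about the model's return value. `write_pcm16_mono` is a hypothetical helper, and the sine tone is a stand-in for real synthesized speech.

```python
import math
import struct
import wave

SAMPLE_RATE = 16_000  # 16 kHz, as listed in the specifications


def write_pcm16_mono(path, samples, sample_rate=SAMPLE_RATE):
    """Write float samples in [-1.0, 1.0] as a 16-bit PCM mono WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(clipped * 32767))  # little-endian int16

    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Stand-in signal: one second of a 440 Hz tone.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
write_pcm16_mono("tone.wav", tone)
```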
132
 
133
+ ## ⚠️ Known Limitations
134
 
135
+ ### Language Mixing
136
+ - **Issue**: Poor performance with French-Bambara code-switching
137
+ - **Recommendation**: Use pure Bambara text for optimal results
138
 
139
+ ### Numeric Content
140
+ - **Issue**: Suboptimal handling of Arabic numerals (1, 2, 3...)
141
+ - **Recommendation**: Convert numbers to written Bambara words
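One way to act on the recommendation above is to check text for Arabic numerals before sending it to the model. The sketch below only detects digits; it does not convert them, since the mapping to Bambara number words is left to the caller. `find_numerals` and `require_spelled_out_numbers` are hypothetical helpers, not part of `maliba_ai`, and the sample string in the usage note is illustrative.

```python
import re

# A run of digits, optionally grouped with "." or "," (e.g. "2025", "1,5").
DIGIT_RUN = re.compile(r"\d+(?:[.,]\d+)*")


def find_numerals(text):
    """Return each run of digits in the text, in order of appearance."""
    return DIGIT_RUN.findall(text)


def require_spelled_out_numbers(text):
    """Return the text unchanged, or raise ValueError if digits remain."""
    found = find_numerals(text)
    if found:
        raise ValueError(
            "Spell out these numbers in Bambara first: " + ", ".join(found)
        )
    return text
```

For example, `find_numerals("San 2025, kalo 3")` reports both numbers, while the greeting from the usage example passes through `require_spelled_out_numbers` unchanged.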
142
 
143
+ ## ⚠️ Disclaimer
 
 
144
 
145
+ This model provides high-fidelity Bambara speech synthesis intended for research, education, and community applications. The following uses are **strictly forbidden**:
146
 
147
+ - **Voice Impersonation**: Do not clone voices without explicit consent
148
+ - **Deceptive Content**: Do not generate misleading or fraudulent audio
149
+ - **Illegal Activities**: Do not use for any unlawful purposes
 
150
 
151
+ By using this model, you agree to uphold ethical standards and legal responsibilities. We **are not responsible** for any misuse and firmly oppose unethical usage of this technology.
152
 
153
+ If you have concerns about potential misuse or need guidance on ethical applications, please contact us at ml.maliba.ai@gmail.com
154
 
155
+ ## Impact & Mission
 
 
 
156
 
157
+ Part of MALIBA-AI's mission: **"No Malian Left Behind by Technological Advances"**
 
 
 
 
158
 
159
+ - **14+ Million Speakers**: Serving Bambara speakers across West Africa
160
+ - **Digital Inclusion**: Breaking language barriers in technology
161
+ - **Cultural Preservation**: Supporting Mali's linguistic heritage
162
+ - **Community Empowerment**: Enabling local innovation and development
 
163
 
 
164
 
165
+ ## License
166
 
167
+ **CC BY-NC-SA 4.0** - Non-commercial use only due to Spark-TTS base model licensing.
 
 
 
 
 
168
 
169
+ ### Key Terms
170
+ - ✅ Research, education, and personal use
171
+ - ✅ Attribution required
172
+ - ✅ Share-alike derivatives
173
+ - ❌ Commercial use without license
174
 
175
+ For commercial licensing: ml.maliba.ai@gmail.com
176
 
177
+ ## Citation
178
 
179
  ```bibtex
180
+ @software{maliba_ai_bambara_tts,
181
+ title={MALIBA-AI Bambara Text-to-Speech: Open-Source High-Quality TTS for Bambara Language},
182
  author={MALIBA-AI},
183
  year={2025},
184
+ url={https://huggingface.co/MALIBA-AI/bambara-tts}
185
  }
 
186
  ```
187
188
  ---
189
 
190
  **MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**