Update README.md

README.md +75 -259

@@ -38,86 +38,54 @@ pipeline_tag: text-to-speech
 license: cc-by-nc-sa-4.0
 ---
-# Speech Synthesis for Bambara Language 🇲🇱
-MALIBA-AI Bambara TTS represents a groundbreaking advancement in African language technology, offering **open-source, high-quality text-to-speech synthesis** specifically designed for the Bambara language. Built on cutting-edge Spark-TTS architecture, this model brings professional-grade voice synthesis to a language spoken by over 14 million people across West Africa.
-## Bridging the Digital Language Divide
-Bambara (Bamanankan) is the most widely spoken language in Mali and serves as a lingua franca across West Africa. Despite its significance, Bambara has been severely underrepresented in speech technology. MALIBA-AI Bambara TTS directly addresses this critical gap, making digital speech interfaces accessible to Bambara speakers for the first time open-source and advancing digital inclusion across the region.
-## Table of Contents
-- [Technical Specifications](#technical-specifications)
-- [Speaker System](#speaker-system)
-- [Transforming Access to Technology](#transforming-access-to-technology)
-- [Installation](#installation)
-- [Usage](#usage)
-- [Performance & Quality](#performance--quality)
-- [Limitations](#limitations)
-- [The MALIBA-AI Impact](#the-maliba-ai-impact)
-- [Future Development](#future-development)
-- [References](#references)
-- [License](#license)
-- [Contributing](#contributing)
-## Technical Specifications
-### Model Architecture
-- **Base Architecture**: Spark-TTS (LLM-based Text-to-Speech)
-- **Foundation Model**: Qwen2.5-based language model
-- **Innovation**: Single-stream decoupled speech tokens
-- **Model Size**: ~500M parameters
-- **Format**: PyTorch/Transformers compatible
-- **Sampling Rate**: 16kHz
-- **Audio Encoding**: 16-bit PCM mono
-- **Language**: Bambara (bm-ML)
-### Key Technical Features
-- **Zero-dependency Generation**: No separate flow matching or vocoder models required
-- **Direct Audio Reconstruction**: LLM directly predicts audio tokens
-- **Efficient Architecture**: Streamlined process improving both speed and quality
-- **GPU Acceleration**: Optimized for CUDA when available
-- **CPU Compatibility**: Functional on CPU-only systems
-## Speaker System
-MALIBA-AI Bambara TTS features **10 distinct authentic Bambara speakers**, each with unique characteristics:
-### Available Speakers
-- **Adama**
-- **Moussa**
-- **Bourama**
-- **Modibo**
-- **Seydou**
-- **Amadou**
-- **Bakary**
-- **Ngolo**
-- **Ibrahima**
-- **Amara**
-**Note**: try them and choose your preference for your use case.
-## Installation
-Install the MALIBA-AI SDK using pip:
 ```bash
-    pip install maliba_ai
 ```
-For faster installation with uv:
 ```bash
     uv pip install maliba_ai
 ```
-Development installation:
 ```bash
-git clone https://github.com/MALIBA-AI/bambara-tts.git
-cd bambara-tts
-pip install -e .
 ```
 Note: if you are on Colab, please install these additional dependencies:
 ```
@@ -126,249 +94,97 @@ Note : if you are in colab  please install those additional dependencies :
     !pip install --no-deps unsloth
 ```
-## Usage
-### Quick Start
 ```python
-from maliba_ai.tts import BambaraTTSInference
 from maliba_ai.config.settings import Speakers
-import soundfile as sf
-# Initialize the TTS system
 tts = BambaraTTSInference()
-# Generate speech from Bambara text
 text = "Aw ni ce. I ka kɛnɛ wa?"
-audio = tts.generate_speech(text, speaker_id=Speakers.Bourama)
-# Save the audio
-sf.write("greeting.wav", audio, 16000)
-print("Bambara speech generated successfully!")
 ```
-### Advanced Usage
-```python
-# Fine-tune generation parameters
-audio = tts.generate_speech(
-    text="An ka baara kɛ ɲɔgɔn fɛ",           # "Let's work together"
-    speaker_id=Speakers.Adama,
-    temperature=0.8,                          # Sampling temperature
-    top_k=50,                                # Vocabulary sampling
-    top_p=0.9,                               # Nucleus sampling
-    max_new_audio_tokens=2048,               # Maximum audio length
-    output_filename="collaboration.wav"       # Auto-save option
-)
-```
-### Multi-Speaker Examples
-```python
-from maliba_ai.config.settings import Speakers
-text = "Aw ni ce. Ne tɔgɔ ye Adama. Awɔ,  ne ye maliden de ye. Aw Sanbɛ Sanbɛ. San min tɛ ɲinan ye, an bɛɛ ka jɛ ka o seli ɲɔgɔn fɛ,  hɛɛrɛ  ni lafiya la. Ala ka Mali suma. Ala ka Mali yiriwa. Ala ka Mali taa ɲɛ. Ala ka an ka seliw caya. Ala ka yafa an bɛɛ ma."
-#let's try Adama
-tts.generate_speech(
-    text = text,
-    speaker_id = Speakers.Adama,
-    output_filename = "adama.wav"
-)
-#let's try Seydou
-tts.generate_speech(
-    text = text,
-    speaker_id = Speakers.Seydou,
-    output_filename = "seydou.wav"
-)
-# let's try Bourama
-tts.generate_speech(
-    text = text,
-    speaker_id = Speakers.Bourama,
-    output_filename = "bourama.wav"
-)
-```
-## Performance & Quality
-### Quality Metrics
-- **Mean Opinion Score (MOS)**: 4.2/5.0 for naturalness
-- **Speaker Similarity**: High fidelity to original speaker characteristics
-- **Intelligibility**: 95%+ word recognition accuracy
-- **Pronunciation Accuracy**: Native-level Bambara pronunciation
-## Limitations
-### Known Limitations
-#### Language Mixing (Code-Switching)
-- **French-Bambara Mixing**: The model performs poorly when French words or phrases are mixed within Bambara text
-- **Recommendation**: Use pure Bambara text for optimal results
-#### Numeric Content
-- **Digital Numbers**: Poor performance with Arabic numerals (1, 2, 3, etc.)
-- **Written Numbers**: Good performance with Bambara number words
-- **Recommendation**: Convert digits to written Bambara number words
-## The MALIBA-AI Impact
-MALIBA-AI Bambara TTS is part of MALIBA-AI's broader mission: **"No Malian Left Behind by Technological Advances."** This initiative is actively transforming Mali's digital landscape by:
-### Digital Inclusion
-1. **Breaking Language Barriers**: Providing technology in languages that Malians actually speak
-2. **Literacy Support**: Audio interfaces for users with varying literacy levels
-3. **Cultural Preservation**: Digitizing and preserving Mali's rich oral traditions
-### Technological Empowerment
-1. **Local Innovation**: Enabling Malian developers to build voice-based applications
-2. **AI Democratization**: Making cutting-edge speech technology accessible to all
-3. **Economic Opportunities**: Creating new possibilities for tech entrepreneurship in Mali
-4. **Educational Advancement**: Supporting mother-tongue education through technology
-### Community Impact
-- **14+ Million Speakers**: Directly serving the Bambara-speaking population
-- **Regional Influence**: Supporting Bambara speakers across West Africa
-- **Cultural Identity**: Strengthening linguistic identity in the digital age
-- **Intergenerational Bridge**: Connecting traditional oral culture with digital innovation
-## Future Development
-MALIBA-AI is committed to continuous improvement with planned developments:
-### Technical Roadmap
-- **Enhanced Code-Switching**: Better support for French-Bambara mixed content
-- **Improved Numerics**: Advanced handling of numbers, dates, and technical terms
-- **Emotion Control**: Adjustable emotional expression in synthesis
-- **Voice Cloning**: Zero-shot voice cloning capabilities for new speakers
-- **Streaming Audio**: Real-time streaming synthesis for interactive applications
-## References
 ```bibtex
-@software{maliba_ai_bambara_tts_2025,
-  title={MALIBA-AI Bambara Text-to-Speech: Open-Source Hight Quality TTS for Bambara Language},
   author={MALIBA-AI},
   year={2025},
-  publisher={HuggingFace},
-  url={https://huggingface.co/MALIBA-AI/bambara-tts},
-  note={Built on Spark-TTS architecture}
-}
-@misc{wang2025sparktts,
-  title={Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens},
-  author={Xinsheng Wang and Mingqi Jiang and Ziyang Ma and Ziyu Zhang and Songxiang Liu and Linqin Li and Zheng Liang and Qixi Zheng and Rui Wang and Xiaoqin Feng and Weizhen Bian and Zhen Ye and Sitong Cheng and Ruibin Yuan and Zhixian Zhao and Xinfa Zhu and Jiahao Pan and Liumeng Xue and Pengcheng Zhu and Yunlin Chen and Zhifei Li and Xie Chen and Lei Xie and Yike Guo and Wei Xue},
-  year={2025},
-  eprint={2503.01710},
-  archivePrefix={arXiv},
-  primaryClass={cs.SD},
-  url={https://arxiv.org/abs/2503.01710}
 }
 ```
-## Usage Disclaimer & Ethical Guidelines
-⚠️ **Important Usage Guidelines**
-This Bambara TTS model is intended for legitimate applications that benefit the Bambara-speaking community and support language preservation efforts.
-### Authorized Uses:
-- **Educational purposes**: Language learning, pronunciation training, literacy programs
-- **Accessibility tools**: Screen readers, communication aids for people with disabilities
-- **Cultural preservation**: Documenting oral traditions, creating audio archives
-- **Research**: Academic studies on Bambara linguistics and speech technology
-- **Community applications**: Local radio, public announcements, community services
-### Prohibited Uses:
-- **Unauthorized voice cloning** or impersonation without explicit consent
-- **Fraud or scams** using generated Bambara speech
-- **Deepfakes or misleading content** that could harm individuals or communities
-- **Any illegal activities** under local or international law
-- **Harassment or discrimination** targeting any group or individual
-### Ethical Responsibilities:
-- Always obtain proper consent when using someone's voice characteristics
-- Clearly disclose when audio content is AI-generated
-- Respect the cultural significance of the Bambara language
-- Support the Bambara-speaking community's digital inclusion
-- Report any misuse of the technology to the MALIBA-AI team
-### Community Standards:
-The MALIBA-AI project is committed to responsible AI development that empowers communities rather than exploiting them. We encourage users to:
-- Engage with Bambara speakers and communities respectfully
-- Contribute to the preservation and promotion of Bambara language
-- Use this technology to bridge digital divides, not create them
-- Share improvements back with the community when possible
-**The developers assume no liability for any misuse of this model. Users are responsible for ensuring their applications comply with applicable laws and ethical standards.**
-If you have concerns about potential misuse or need guidance on ethical applications, please contact us at ml.maliba.ai@gmail.com
-- **Spark-TTS**: Foundation architecture for neural speech synthesis
-- **MALIBA-AI team**: Dedicated developers, researchers, and linguists
-- **Mali**: Our inspiration for building inclusive technology that serves all communities
-- **Open source community**: Contributors and users who help improve the syste
-## License
-⚠️ **Important License Information**
-This project is licensed under **Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)** due to the licensing terms of the underlying Spark-TTS architecture and training data.
-### Key License Terms
-- **Non-Commercial Use Only**: Research, education, and personal use permitted
-- **Share-Alike**: Derivatives must use the same license
-- **Attribution Required**: Must credit MALIBA-AI and Spark-TTS
-### Commercial Usage
-For commercial licensing options, contact: ml.maliba.ai@gmail.com
-### Attribution Requirements
-```
-This work uses MALIBA-AI Bambara TTS, built on Spark-TTS architecture.
-Licensed under CC BY-NC-SA 4.0.
-Original work: https://huggingface.co/MALIBA-AI/bambara-tts
-Spark-TTS: https://github.com/SparkAudio/Spark-TTS
-```
-## Contributing
-MALIBA-AI Bambara TTS is part of the broader MALIBA-AI initiative with the mission **"No Malian Left Behind by Technological Advances."** We welcome contributions from:
-### Community Contributors
-- **Bambara Language Experts**: To improve linguistic accuracy and cultural authenticity
-- **Native Speakers**: For quality assessment and dialectal insights
-- **Developers**: To create applications and integrations
-- **Researchers**: To advance the underlying technology
-- **Data Contributors**: To expand and improve training datasets
-### How to Contribute
-- **GitHub**: [MALIBA-AI/bambara-tts](https://github.com/MALIBA-AI/bambara-tts)
-- **HuggingFace**: [MALIBA-AI](https://huggingface.co/MALIBA-AI)
-- **Email**: ml.maliba.ai@gmail.com
-- **Community**: Join discussions on model improvements and applications
-### Contribution Guidelines
-- Respect Bambara language and culture
-- Ensure proper consent for any voice data contributions
-- Follow community standards for inclusive development
-- Test thoroughly across different speakers and content types
 ---
 **MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**

 license: cc-by-nc-sa-4.0
 ---
+# MALIBA-AI Bambara TTS 🇲🇱
+<style>
+img {
+ display: inline;
+}
+</style>
+[![Model architecture](https://img.shields.io/badge/Model_Arch-Spark--TTS-lightgrey)](#model-architecture)
+| [![Model size](https://img.shields.io/badge/Params-500M-lightgrey)](#model-architecture)
+| [![Language](https://img.shields.io/badge/Language-bm-lightgrey)](#datasets)
+| [![License](https://img.shields.io/badge/License-CC--BY--NC--SA--4.0-blue)](#license)
+## Model Overview
+This model provides neural text-to-speech synthesis for Bambara (Bamanankan), the most widely spoken language in Mali and a language of over 14 million speakers across West Africa. It supports 10 authentic Bambara speaker voices and produces high-fidelity audio with native-level pronunciation, without requiring a separate vocoder model.
+- Try our live demo on [Hugging Face Spaces](https://huggingface.co/spaces/MALIBA-AI/BambaraText2Speech)
+- **Available Speakers:** Adama, Moussa, Bourama, Modibo, Seydou, Amadou, Bakary, Ngolo, Ibrahima, Amara
+## Quick Start
+### Installation
+```bash
+pip install maliba_ai
+```
+For development installations:
 ```bash
+pip install git+https://github.com/MALIBA-AI/bambara-tts.git
 ```
+With uv (faster):
 ```bash
     uv pip install maliba_ai
 ```
 ```bash
+    uv pip install git+https://github.com/MALIBA-AI/bambara-tts.git
 ```
 Note: if you are on Colab, please install these additional dependencies:
 ```
     !pip install --no-deps unsloth
 ```
+### Basic Usage
 ```python
+from maliba_ai.tts.inference import BambaraTTSInference
 from maliba_ai.config.settings import Speakers
 tts = BambaraTTSInference()
 text = "Aw ni ce. I ka kɛnɛ wa?"
+audio = tts.generate_speech(text=text, speaker_id=Speakers.Bourama, output_path="greeting.wav")
 ```
+Note: For more details, see https://github.com/sudoping01/bambara-tts/blob/main/README.md
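As a rough sketch of how the Basic Usage call might be repeated across several voices, the helper below takes an already-constructed `tts` object and a mapping of labels to speaker IDs. The helper name `synthesize_for_speakers`, the `speakers` mapping, and the output directory layout are illustrative assumptions and not part of the package; only the `generate_speech(text=..., speaker_id=..., output_path=...)` keyword call is taken from the example above.

```python
from pathlib import Path


def synthesize_for_speakers(tts, text, speakers, out_dir="outputs"):
    """Render `text` once per speaker and return the paths that were requested.

    `tts` is assumed to expose generate_speech(text=..., speaker_id=...,
    output_path=...) as in the Basic Usage example. `speakers` maps a label
    (used for the file name) to a speaker ID such as Speakers.Bourama.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for label, speaker_id in speakers.items():
        # One WAV file per speaker, named after the lowercased label.
        path = out / f"{label.lower()}.wav"
        tts.generate_speech(text=text, speaker_id=speaker_id, output_path=str(path))
        written.append(path)
    return written
```

With the objects from the Basic Usage example, a call could look like `synthesize_for_speakers(tts, text, {"Adama": Speakers.Adama, "Seydou": Speakers.Seydou})`, which makes it easier to compare voices on the same sentence.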
+## Technical Specifications
+### Architecture
+- **Base Model**: Spark-TTS (LLM-based TTS)
+- **Foundation**: Qwen2.5-based language model
+- **Parameters**: ~500M
+- **Audio Format**: 16kHz, 16-bit PCM mono
+- **Language Support**: Bambara (bm-ML)
+## Model Input/Output
+### Input
+- **Text**: Bambara text in standard orthography
+- **Speaker ID**: Choice of 10 available speakers
+- **Parameters**: Temperature, top-k, top-p (optional)
+### Output
+- **Audio**: 16kHz mono WAV format
+- **Quality**: Professional-grade speech synthesis
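To confirm that a generated file matches the stated output format (16kHz, mono, 16-bit PCM), a small check with Python's standard `wave` module may help. This is a minimal sketch: the function name `check_wav_format` is hypothetical, and it assumes the file is a plain PCM WAV, since `wave` raises `wave.Error` on other encodings.

```python
import wave


def check_wav_format(path, expected_rate=16000, expected_channels=1,
                     expected_sample_width=2):
    """Return a list of mismatches between a WAV file and the expected format.

    Defaults follow the specification above: 16 kHz, mono, 16-bit PCM
    (a sample width of 2 bytes). An empty list means the file matches.
    """
    problems = []
    with wave.open(str(path), "rb") as wav:
        if wav.getframerate() != expected_rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getnchannels() != expected_channels:
            problems.append(f"channel count is {wav.getnchannels()}")
        if wav.getsampwidth() != expected_sample_width:
            problems.append(f"sample width is {wav.getsampwidth()} bytes")
    return problems
```

For example, `check_wav_format("greeting.wav")` returning an empty list would indicate the file has the documented sample rate, channel count, and bit depth.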
+## ⚠️ Known Limitations
+### Language Mixing
+- **Issue**: Poor performance with French-Bambara code-switching
+- **Recommendation**: Use pure Bambara text for optimal results
+### Numeric Content
+- **Issue**: Suboptimal handling of Arabic numerals (1, 2, 3...)
+- **Recommendation**: Convert numbers to written Bambara words
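Since digits are handled poorly, it can be useful to flag them before synthesis so they can be rewritten as Bambara number words. The sketch below only detects numerals; the function name `find_numerals` is hypothetical, and the conversion to Bambara words is deliberately left to the user.

```python
import re

# A run of digits, optionally continued by "." or "," followed by more digits
# (so "1,5" or "2.000" is reported as a single numeral).
_NUMERAL = re.compile(r"\d+(?:[.,]\d+)*")


def find_numerals(text):
    """Return the numerals found in `text`, in order of appearance."""
    return _NUMERAL.findall(text)
```

An empty result, as for `find_numerals("Aw ni ce. I ka kɛnɛ wa?")`, means the text contains no digits; any returned items are the spans to replace with written number words before calling the model.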
+## ⚠️ Disclaimer
+This model provides high-fidelity Bambara speech synthesis intended for research, education, and community applications. The following uses are **strictly forbidden**:
+- **Voice Impersonation**: Do not clone voices without explicit consent
+- **Deceptive Content**: Do not generate misleading or fraudulent audio
+- **Illegal Activities**: Do not use for any unlawful purposes
+By using this model, you agree to uphold ethical standards and legal responsibilities. We **are not responsible** for any misuse and firmly oppose unethical usage of this technology.
+If you have concerns about potential misuse or need guidance on ethical applications, please contact us at ml.maliba.ai@gmail.com
+## Impact & Mission
+Part of MALIBA-AI's mission: **"No Malian Left Behind by Technological Advances"**
+- **14+ Million Speakers**: Serving Bambara speakers across West Africa
+- **Digital Inclusion**: Breaking language barriers in technology
+- **Cultural Preservation**: Supporting Mali's linguistic heritage
+- **Community Empowerment**: Enabling local innovation and development
+## License
+**CC BY-NC-SA 4.0** - Non-commercial use only due to Spark-TTS base model licensing.
+### Key Terms
+- ✅ Research, education, and personal use
+- ✅ Attribution required
+- ✅ Share-alike derivatives
+- ❌ Commercial use without license
+For commercial licensing: ml.maliba.ai@gmail.com
+## Citation
 ```bibtex
+@software{maliba_ai_bambara_tts,
+  title={MALIBA-AI Bambara Text-to-Speech: Open-Source High-Quality TTS for Bambara Language},
   author={MALIBA-AI},
   year={2025},
+  url={https://huggingface.co/MALIBA-AI/bambara-tts}
 }
 ```
 ---
 **MALIBA-AI: Empowering Mali's Future Through Community-Driven AI Innovation**