---
license: cc-by-nc-sa-4.0
language:
- en
pipeline_tag: text-to-audio
tags:
- audiocraft
- audiogen
- styletts2
- shift-tts
- sound
- audio-generation
- text-to-speech
- mimic3
---
Audionar - a phonetic variant of StyleTTS2 blended with AudioGen soundscapes
[](https://shift-europe.eu/)
##
# SHIFT TTS / Audionar
A phonetic variant of [SHIFT TTS](https://audeering.github.io/shift/) blended with [AudioGen soundscapes](https://huggingface.co/dkounadis/artificial-styletts2/discussions/3)
- [Emotion analysis of SHIFT TTS](https://huggingface.co/dkounadis/artificial-styletts2/discussions/2)
- [Listen to foreign languages](https://huggingface.co/dkounadis/artificial-styletts2/discussions/4)
## Listen to Voices
<a href="https://huggingface.co/dkounadis/artificial-styletts2/discussions/1">Native English</a> / <a href="https://huggingface.co/dkounadis/artificial-styletts2/discussions/1#6783e3b00e7d90facec060c6">Non-native English: Accents</a> / <a href="https://huggingface.co/dkounadis/artificial-styletts2/discussions/1#6782c5f2a2f852eeb1027a32">Foreign languages</a>
##
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/.hf7/ CUDA_VISIBLE_DEVICES=0 python demo.py
```
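The variables in the command above do three things: `CUDA_DEVICE_ORDER=PCI_BUS_ID` numbers GPUs by PCI bus ID (the same order `nvidia-smi` shows), `CUDA_VISIBLE_DEVICES=0` exposes only GPU 0 to the process, and `HF_HOME` sets the Hugging Face cache directory. As a minimal sketch, the same launch could be done from Python. The helper names `build_env` and `run_script` are hypothetical and not part of this repository.

```python
import os
import subprocess
import sys


def build_env(gpu_index: int = 0, hf_home: str = "/data/.hf7/") -> dict:
    """Return a copy of the current environment with GPU and cache settings applied."""
    env = dict(os.environ)
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"       # number GPUs by PCI bus ID
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)  # expose only this GPU
    env["HF_HOME"] = hf_home                      # Hugging Face cache directory
    return env


def run_script(script: str = "demo.py", **kwargs) -> int:
    """Run a repository script with the environment above; return its exit code."""
    return subprocess.run([sys.executable, script], env=build_env(**kwargs)).returncode
```

Calling `run_script("demo.py", gpu_index=0)` from the repository root would then be roughly equivalent to the shell command.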
## Flask API
<details>
<summary>
Build virtualenv & run api.py
</summary>
The [TTS Demo](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/demo.py) above is a standalone script that loads the TTS & AudioGen models and synthesizes a text file. We also provide a Flask `api.py` that allows faster inference by
loading TTS & [AudioGen](https://huggingface.co/dkounadis/artificial-styletts2/tree/main/audiocraft) only once.
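The benefit of a long-running server is that the expensive model load happens once and every later request reuses it. The sketch below illustrates only that load-once pattern with placeholder objects. It is not the code in `api.py`, and `get_model` and `synthesize` are hypothetical names.

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    """Load a model on first request and return the cached object afterwards."""
    time.sleep(0.01)       # placeholder for reading weights from disk
    return {"name": name}  # placeholder for a real model object


def synthesize(text: str) -> str:
    """Hypothetical request handler that reuses the cached models on every call."""
    tts = get_model("tts")
    audiogen = get_model("audiogen")
    return f"{tts['name']}+{audiogen['name']}:{text}"
```

A standalone script pays the load cost on every run, whereas a server process built this way pays it only for the first request.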
Clone
```
git clone https://huggingface.co/dkounadis/artificial-styletts2
```
Install
```
cd artificial-styletts2
virtualenv --python=python3.10 .env0
source .env0/bin/activate
pip install -r requirements.txt
```
Flask API - start it in a 2nd terminal
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/.hf7/ CUDA_VISIBLE_DEVICES=0 python api.py
```
The following examples need `api.py` to be running. [Set this IP](https://huggingface.co/dkounadis/artificial-styletts2/blob/main/tts.py#L93) to the IP address shown when `api.py` starts.
```
# git lfs pull # to download assets/ocr.jpg
python tts.py --text assets/ocr.txt --image assets/ocr.jpg --soundscape "battle hero" --voice romanian
```
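Because the client scripts fail if the server is not yet listening, it can help to check reachability first. Below is a minimal sketch using only the standard library. The host and port are assumptions (use whatever address `api.py` prints at startup), and `api_is_up` and `wait_for_api` are hypothetical helpers, not part of this repository.

```python
import socket
import time


def api_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_api(host: str, port: int, retries: int = 10, delay: float = 1.0) -> bool:
    """Poll host:port up to `retries` times, sleeping `delay` seconds after each failure."""
    for _ in range(retries):
        if api_is_up(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_for_api("127.0.0.1", 5000)` could be called before `tts.py`, assuming the server listens on that address. This only confirms that something accepts TCP connections there, not that the models have finished loading.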
</details>
## Landscape 2 Soundscapes
The following requires `api.py` to be running already.
```
# TTS & soundscape - output .mp4 saved in ./out/
python landscape2soundscape.py
```
For SHIFT demo / Collaboration with [SMB](https://www.smb.museum/home/)
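- YouTube Videos
  - [https://youtu.be/SSi3gUO4GtY](https://youtu.be/SSi3gUO4GtY)
  - [https://youtu.be/2YjxAPkdXIc](https://youtu.be/2YjxAPkdXIc)
  - [https://youtu.be/BhMh02knkco](https://youtu.be/BhMh02knkco)
  - [https://youtu.be/a3qk9S87v60](https://youtu.be/a3qk9S87v60)
  - [https://youtu.be/3M0y9OYzDfU](https://youtu.be/3M0y9OYzDfU)
  - [https://youtu.be/56MH7zOHrNQ](https://youtu.be/56MH7zOHrNQ)
  - [https://youtu.be/gnGCYLcdLsA](https://youtu.be/gnGCYLcdLsA)
  - [https://www.youtube.com/watch?v=Y8QyYUgLaCg](https://www.youtube.com/watch?v=Y8QyYUgLaCg)
  - [https://youtu.be/RhUuS9HMLhg](https://youtu.be/RhUuS9HMLhg)
  - [https://youtu.be/NzzhhrUeKVY](https://youtu.be/NzzhhrUeKVY)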
# SoundScape Live Demo - Paplay
Start the Flask API to play sounds live
```
CUDA_DEVICE_ORDER=PCI_BUS_ID HF_HOME=/data/dkounadis/.hf7/ CUDA_VISIBLE_DEVICES=4 python api.py
```
Describe any sound in text; the TTS & soundscape are then played back
```
python live_demo.py # type text to play the AudioGen sound & TTS
```
# Audiobook
Create an audiobook from a `.docx` file. Listen to it on YouTube: [male voice](https://youtu.be/fUGpfq_o_CU) / [female voice](https://www.youtube.com/watch?v=tlRdRV5nm40)
```
# audiobook will be saved in ./tts_audiobooks
python audiobook.py
```