
This Space demonstrates text-to-speech (TTS) with Next-gen Kaldi.

It runs on CPU inside a Docker container provided by Hugging Face.

Voice Cloning: Select "Voice Cloning" as the language to use one of the voice cloning models:

Pocket TTS: Supports 6 languages (English, French, German, Portuguese, Italian, Spanish). Requires only a reference audio clip.
ZipVoice: Supports Chinese and English. Requires both a reference audio clip and the exact text spoken in that clip.

To clone a voice, provide a reference audio clip by uploading a file, recording one, or giving a URL.
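
The input requirements above can be summarized as a small pre-flight check. This is a minimal sketch in plain Python, not code from this Space or from sherpa-onnx: the names MODEL_REQUIREMENTS, CloneRequest, and validate_clone_request, as well as the model keys "pocket-tts" and "zipvoice", are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical table that restates the requirements listed above.
MODEL_REQUIREMENTS = {
    "pocket-tts": {
        "languages": {"English", "French", "German",
                      "Portuguese", "Italian", "Spanish"},
        "needs_reference_text": False,
    },
    "zipvoice": {
        "languages": {"Chinese", "English"},
        "needs_reference_text": True,
    },
}


@dataclass
class CloneRequest:
    """Inputs a user supplies for voice cloning (hypothetical structure)."""
    model: str
    language: str
    reference_audio: Optional[str]        # file path or URL of the clip
    reference_text: Optional[str] = None  # transcript of the clip, if any


def validate_clone_request(req: CloneRequest) -> List[str]:
    """Return a list of problems; an empty list means the request is usable."""
    spec = MODEL_REQUIREMENTS.get(req.model)
    if spec is None:
        return ["unknown model: " + req.model]

    problems = []
    if req.language not in spec["languages"]:
        problems.append(req.model + " does not support " + req.language)
    # Both models need a reference clip to clone from.
    if not req.reference_audio:
        problems.append("a reference audio clip is required")
    # Only ZipVoice also needs the exact text spoken in the clip.
    if spec["needs_reference_text"] and not (req.reference_text or "").strip():
        problems.append("the exact text spoken in the reference audio is required")
    return problems
```

For example, a Pocket TTS request for French with a clip named ref.wav passes with no problems, while a ZipVoice request that omits the reference text is reported as incomplete.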

For more information, see the following links:

To deploy it locally, see https://k2-fsa.github.io/sherpa/

For Android text-to-speech engine APKs, see https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html

For an all-in-one Windows executable, see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models

Next-gen Kaldi: Text-to-speech (TTS)