Next-gen Kaldi: Text-to-speech (TTS)
Reference Audio for Voice Cloning
Provide a reference audio clip. The generated speech will clone the voice from this audio.
This Space shows how to convert text to speech with Next-gen Kaldi.
It runs on CPU inside a Docker container provided by Hugging Face.
Voice Cloning: Select "Voice Cloning" as the language to use Pocket TTS for zero-shot voice cloning. You must provide a reference audio clip (by upload, recording, or URL); the generated speech clones the voice in that clip.
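Before supplying a reference clip, it can help to check its basic properties. The following is a minimal sketch using only Python's standard `wave` module. The helper names `describe_wav` and `make_silent_wav` are hypothetical, and this page does not state any required sample rate, channel count, or duration for reference audio, so the values below are purely illustrative.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Return basic properties of a WAV clip given its raw bytes."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_rate": rate,
            "duration_seconds": frames / rate,
        }


def make_silent_wav(seconds: float, sample_rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


# Illustrative stand-in for a real reference clip.
info = describe_wav(make_silent_wav(2.0))
print(info)
```

To inspect a real clip, read its bytes from disk (for example with `open(path, "rb").read()`) and pass them to `describe_wav`. This only works for uncompressed WAV files; other formats would need a different reader.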
For more information, see the following links:
If you want to deploy it locally, please see https://k2-fsa.github.io/sherpa/
If you want to use Android APKs, please see https://k2-fsa.github.io/sherpa/onnx/tts/apk.html
If you want to use Android text-to-speech engine APKs, please see https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html
If you want to download an all-in-one exe for Windows, please see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
See also https://k2-fsa.github.io/sherpa/onnx/tts/all/ for models with audio samples.