this post was submitted on 23 Nov 2023

LocalLLaMA


ElevenLabs just released their speech to speech thing and it’s really cool: https://elevenlabs.io/voice-changer

Now I’m wondering what’s the best similar “voice changer” or speech to speech model that I can run locally?

It doesn’t have to be in real time, I plan on using it to narrate audio books and similar.

Thanks!

top 3 comments
[–] a_beautiful_rhind@alien.top 1 points 10 months ago

RVC (Retrieval-based Voice Conversion) or so-vits-svc

[–] AnonymousD3vil@alien.top 1 points 10 months ago

You could transcribe the audio with Whisper and then run the text through a Coqui TTS model to get similar results.

https://github.com/coqui-ai/TTS
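
A rough sketch of that transcribe-then-TTS pipeline, assuming the `openai-whisper` and Coqui `TTS` Python packages are installed. The Whisper model size (`base`), the XTTS v2 model name, the file paths, and the `split_into_chunks` helper are illustrative choices of mine, not something from this thread, so adjust to taste:

```python
import re


def split_into_chunks(text, max_chars=250):
    """Group sentences into chunks of at most max_chars.

    Hypothetical helper for long audiobook text; a single sentence
    longer than max_chars is kept whole rather than cut mid-sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def transcribe(audio_path, model_name="base"):
    """Speech -> text with Whisper (model size is an assumption)."""
    import whisper  # lazy import: only needed when actually transcribing

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"].strip()


def speak(text, out_path, speaker_wav, language="en"):
    """Text -> speech with Coqui TTS, cloning the voice in speaker_wav."""
    from TTS.api import TTS  # lazy import: only needed when synthesizing

    # XTTS v2 is one voice-cloning model Coqui ships; others would work too.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=language,
        file_path=out_path,
    )


def convert(source_audio, target_voice_wav, out_prefix="out"):
    """Full pipeline: writes one WAV file per text chunk."""
    text = transcribe(source_audio)
    for i, chunk in enumerate(split_into_chunks(text)):
        speak(chunk, f"{out_prefix}_{i:04d}.wav", target_voice_wav)
```

Something like `convert("my_reading.wav", "target_voice.wav")` would then produce `out_0000.wav`, `out_0001.wav`, and so on. Keep in mind this route goes through text, so the timing and intonation of the original recording are not carried over; only the words are.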

[–] JawGBoi@alien.top 1 points 10 months ago

RVC is definitely the best for this. Unlike most other methods, you don't provide text transcriptions for the training dataset, which makes RVC models really easy to train with no compromise in quality.