LocalLLaMA


A community for discussing Llama, the family of large language models created by Meta AI.


oss tts engine? (alien.top)

submitted 2 years ago by LyPreto@alien.top to c/localllama@poweruser.forum


So, I've been doing all my LLM tinkering on an M1, using llama.cpp/whisper.cpp to run a basic voice-powered assistant. Nothing new at this point.
Currently adding a visual component to it: ShareGPT4V-7B, assuming I manage to convert it to GGUF. Once that's done I should be able to integrate it with llama.cpp and wire it to a live camera feed, giving it eyes.
Might even get crazy and throw in a low-level component to handle basic object detection, letting the model know when something is being "shown" to it. Other than that, it will activate when prompted to do so (text or voice).
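A rough sketch of that activation logic, in case it helps anyone planning something similar. Everything here is hypothetical: the `ShowGate` name, the confidence threshold, and the "N consecutive frames" rule are assumptions for illustration, not part of llama.cpp or any detection library. The idea is that the vision model only gets a frame when the user asks (text or voice) or when the detector has been confident for a few frames in a row.

```python
from dataclasses import dataclass


@dataclass
class ShowGate:
    """Decides when a camera frame should be sent to the vision model.

    All values are illustrative assumptions, not tuned numbers.
    """
    min_confidence: float = 0.6  # assumed detector confidence threshold
    min_streak: int = 3          # consecutive confident frames required
    _streak: int = 0             # internal counter of confident frames

    def update(self, detection_confidence: float, prompted: bool = False) -> bool:
        """Return True if the current frame should be passed to the model."""
        if prompted:
            # An explicit text/voice prompt always activates the model.
            self._streak = 0
            return True
        if detection_confidence >= self.min_confidence:
            self._streak += 1
        else:
            self._streak = 0
        if self._streak >= self.min_streak:
            # Reset so one held-up object does not trigger on every frame.
            self._streak = 0
            return True
        return False
```

The streak requirement is just a cheap way to avoid firing on a single noisy detection; a real setup would probably also want a cooldown timer.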

The one thing I'm not sure about is how to run a TTS engine locally, like StyleTTS2-LJSpeech. Are there libraries that support TTS models?

top 6 comments


[–] LyPreto@alien.top 1 points 2 years ago (1 children)

Update: a quick Reddit search (which I should've done prior to posting, tbh) led me to this post: ai_voicechat_script

[–] phree_radical@alien.top 1 points 2 years ago

I just use plain old Web Speech on PC and TextToSpeech on Android. I wasn't gonna say anything because they don't sound as good as the compute-heavy ones, but, they're... way better than whatever that is!

[–] pan_and_scan@alien.top 1 points 2 years ago

Remindme! 5 days

[–] a_beautiful_rhind@alien.top 1 points 2 years ago (1 children)

I'm very tempted to make a server for this one because I like it more than XTTS after trying it and valle. It produces fewer artifacts.

There are notebook examples in the repo, so perhaps they can be adapted into the XTTS API server implementation.

For your project, you could also just use coqui, which is easier, if that level of quality is enough for you.

[–] LyPreto@alien.top 1 points 2 years ago (1 children)

Tried coqui and had performance issues. From what I read online, it doesn't seem to fully support MPS.

For now I'm using edge-tts, which does the trick and is pretty decent/free.

Is XTTS supported on Macs?
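For anyone wanting to try the same route, here is a minimal sketch, assuming the `edge-tts` Python package is installed (`pip install edge-tts`). Note that edge-tts calls Microsoft's online speech service, so it is free but not actually local. The voice name, the output file names, and the `split_sentences` / `speak_to_file` helpers are placeholders made up for this example. Splitting the reply into sentences is just one way to start playback before the whole response is synthesized.

```python
import asyncio
import re


def split_sentences(text: str) -> list[str]:
    """Split text on sentence-ending punctuation so audio can be made piece by piece."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


async def speak_to_file(text: str, path: str, voice: str = "en-US-AriaNeural") -> None:
    """Synthesize `text` to an MP3 file with edge-tts (needs network access)."""
    import edge_tts  # imported lazily so the helper above works without the package

    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(path)


# Example usage (not run here because it needs the package and a connection):
# for i, sentence in enumerate(split_sentences("Hello there. How can I help?")):
#     asyncio.run(speak_to_file(sentence, f"reply_{i}.mp3"))
```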

[–] a_beautiful_rhind@alien.top 1 points 2 years ago

It's Tortoise, so who knows. There is PyTorch for Mac now. You would have to figure it out from scratch. I'm not sure why nobody is trying it.

When I tried edge-tts it was very mediocre, like Silero.
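On the Mac PyTorch point, a minimal sketch of picking the Apple-silicon backend when it is there. `pick_device` is a hypothetical helper name; `torch.backends.mps.is_available()` is the PyTorch check for the MPS backend (PyTorch 1.12 or newer). Whether a given TTS model actually runs well on MPS is a separate question, since individual ops may be unsupported. Setting `PYTORCH_ENABLE_MPS_FALLBACK=1` in the environment may help in that case by letting those ops run on the CPU.

```python
def pick_device() -> str:
    """Return "mps", "cuda", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed; nothing to accelerate.
        return "cpu"

    # Older PyTorch builds have no torch.backends.mps at all, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


# Typical use with a model object (names assumed): model.to(pick_device())
```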