this post was submitted on 20 Nov 2023

Sorry if this is off-topic, but I think it’s adjacent since many LLaMA users are also using it.

I’m trying to use the Coqui TTS library with a view to plugging it into llama.cpp, but no matter which model I try, my attempts at using British English source speech just end up with an American-sounding voice with various distortions. I’m running the Python module as instructed in the docs under macOS on an M1 Mac, and I’ve tried various models, all with similar results.

Nothing at all against American accents, but they’re not what I require at the moment, so any help in making Coqui sound a little more RP would be much appreciated!

[–] Material1276@alien.top 1 points 10 months ago

I was struggling at first and had that American twang coming through...

But I managed to get a very clear, short clip of an English actor from an interview. There was no background noise. I made sure to clip out any non-speech from the start and end of the audio, then saved it as a 22050 Hz, mono, 16-bit WAV.

That seems to have done it! I get a pretty good representation of the voice and it 99% seems to stay in character with the occasional slight slip.
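For anyone who wants to script the clip preparation described above (trim the non-speech ends, then save as 22050 Hz mono 16-bit WAV), here is a rough sketch using only the Python standard library. Everything in it is an assumption for illustration rather than what I actually ran: the function names, the amplitude threshold of 500, and the crude linear-interpolation resampler are all made up for this example. It only handles 16-bit PCM input, and a proper audio tool will resample with better quality.

```python
import sys
import wave
from array import array


def to_mono(samples, channels):
    """Average interleaved channels into a single mono channel."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]


def trim_silence(samples, threshold=500):
    """Drop leading and trailing samples quieter than the (assumed) threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (crude: no anti-aliasing filter)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    out = []
    for n in range(int(len(samples) * dst_rate / src_rate)):
        pos = n * src_rate / dst_rate
        i = min(int(pos), last)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(int(round(samples[i] * (1 - frac) + nxt * frac)))
    return out


def prepare_reference_clip(src_path, dst_path, target_rate=22050, threshold=500):
    """Read a 16-bit PCM WAV, trim it, and write 22050 Hz mono 16-bit WAV."""
    with wave.open(src_path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM input")
        channels = wf.getnchannels()
        src_rate = wf.getframerate()
        pcm = array("h")
        pcm.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":  # WAV data is little-endian
        pcm.byteswap()

    mono = trim_silence(to_mono(pcm, channels), threshold)
    out = array("h", resample_linear(mono, src_rate, target_rate))
    if sys.byteorder == "big":
        out.byteswap()

    with wave.open(dst_path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(target_rate)
        wf.writeframes(out.tobytes())
```

Calling `prepare_reference_clip("interview.wav", "reference.wav")` (both file names are placeholders) should leave you with a clip in the format I described. The threshold will likely need tuning by ear for each recording.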

I also occasionally get a little gibberish, which seems to happen when my model tries to say something like " ' " (this occasionally slips through when it’s generating text; I can see it when I look at the backend at what’s being sent for audio processing). I’m guessing it’s possible to filter this out with a regex or something.
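As a rough idea of what that regex filter could look like, here is a minimal sketch. I haven't wired it into anything; `clean_for_tts` is a made-up helper name, and the assumption is that the troublesome tokens are quote-like characters standing on their own (whitespace or the string edge on both sides), so apostrophes inside words and quotes around words are left alone.

```python
import re

# Quote-like characters with whitespace (or the string edge) on both sides.
# This character set is a guess at what causes the gibberish.
_STRAY_QUOTES = re.compile(r"""(?<!\S)["'`\u201c\u201d\u2018\u2019]+(?!\S)""")
_MULTI_SPACE = re.compile(r"\s{2,}")


def clean_for_tts(text):
    """Remove free-standing quote characters before sending text to TTS."""
    text = _STRAY_QUOTES.sub(" ", text)
    # Collapse the extra whitespace left behind by the removal.
    return _MULTI_SPACE.sub(" ", text).strip()
```

You would run the generated text through `clean_for_tts` just before handing it to the TTS call. If other symbols turn out to cause the same problem, they can be added to the character class.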