overview for Material1276

Is there any sort of project that combines Text + Image + TTS Voice generation in one single UI? in c/localllama@poweruser.forum

[–] Material1276@alien.top 1 points 2 years ago

It's probably not what you're looking for, but SillyTavern does all of those things via API calls.

https://docs.sillytavern.app/

https://docs.sillytavern.app/usage/api-connections/

https://docs.sillytavern.app/extras/extensions/stable-diffusion/

https://docs.sillytavern.app/extras/extensions/tts/

What are the best Text-to-3D models out there right now? in c/localllama@poweruser.forum

[–] Material1276@alien.top 1 points 2 years ago

I don't know of any public ones, but they may be out there. I've only seen a few referenced, in places like this:

https://www.youtube.com/watch?v=FEOAnDgCD5A

There are a couple of research papers and names mentioned in there. Maybe you can search for those papers/names on Google and see if there are any references to models.

Anyone else struggling to get Coqui TTS to work in anything other than an American accent? in c/localllama@poweruser.forum

[–] Material1276@alien.top 1 points 2 years ago

I was struggling at first and had that American twang coming through...

But I managed to get a very clear, short clip of an English actor from an interview. There was no background noise. I made sure to clip out any non-speech from the start and end of the audio, then saved it as a 22050 Hz, mono, 16-bit WAV.
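
The clip preparation described above can be sketched in Python using only the standard library. This is a rough illustration, not anything Coqui TTS itself requires or provides: the function names, the amplitude threshold of 500, and the choice to reject (rather than resample) files in another format are all assumptions made for the example.

```python
import array
import sys
import wave


def trim_silence(samples, threshold=500):
    """Drop leading and trailing samples quieter than `threshold` (assumed value)."""
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]


def prepare_clip(src_path, dst_path, threshold=500):
    """Check the clip is 22050 Hz mono 16-bit, trim silence, and save it.

    Returns the number of samples written.
    """
    with wave.open(src_path, "rb") as src:
        fmt = (src.getframerate(), src.getnchannels(), src.getsampwidth())
        if fmt != (22050, 1, 2):
            raise ValueError(f"expected 22050 Hz mono 16-bit, got {fmt}")
        samples = array.array("h", src.readframes(src.getnframes()))

    # WAV data is little-endian; swap on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()
    trimmed = trim_silence(samples, threshold)
    if sys.byteorder == "big":
        trimmed.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)  # 2 bytes = 16-bit
        dst.setframerate(22050)
        dst.writeframes(trimmed.tobytes())
    return len(trimmed)
```

Calling `prepare_clip("actor_raw.wav", "actor_clean.wav")` (hypothetical file names) would write the trimmed clip. If the source audio is in another sample rate or is stereo, it would need converting in an audio editor first.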

That seems to have done it! I get a pretty good representation of the voice, and it seems to stay in character 99% of the time, with the occasional slight slip.

I also occasionally get a little gibberish, which seems to happen when my model is trying to say something like " ' " (it occasionally slips through when it's generating text, and I can see it when I look at the backend at what's being sent for audio processing). I'm guessing it's possible to filter this out with a regex or something.
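
A regex filter along those lines might look something like this in Python. It is only a sketch: which characters actually trigger the gibberish is a guess, and `clean_for_tts` is a made-up helper name, not part of Coqui TTS or any frontend.

```python
import re

# Anything that is not a letter, digit, whitespace, or common punctuation.
_DISALLOWED = re.compile(r"[^\w\s.,!?;:'\-]")
# Apostrophes that are not inside a word, e.g. the lone ' in " ' ".
_STRAY_APOSTROPHE = re.compile(r"(?<!\w)'|'(?!\w)")


def clean_for_tts(text):
    """Strip symbols and stray apostrophes before text is sent for audio."""
    text = text.replace("\u2019", "'")  # normalize curly apostrophes
    text = _DISALLOWED.sub(" ", text)
    text = _STRAY_APOSTROPHE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


print(clean_for_tts("Hello ' there"))             # Hello there
print(clean_for_tts('She said "don\'t" * now.'))  # She said don't now.
```

Apostrophes inside words such as "don't" are kept, but a trailing possessive like "dogs'" loses its apostrophe, which should not matter much for speech.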

Is it worth using a bunch of old GTX 10 series cards (like 1060, 1070, 1080) for running a local LLM? in c/localllama@poweruser.forum

[–] Material1276@alien.top 1 points 2 years ago

Another consideration: I was told by someone with multiple cards that if you split your layers across them, the cards don't all process their layers simultaneously.

So if you are running 3 cards, you don't get a parallel performance benefit from all cards working at the same time. It processes the layers on card 1, then card 2, then card 3.

The slowest card will obviously have the worst speed. I'm not sure what this will do to your model load times or your electricity bill, and you also need a system big enough to fit them all in.
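
If the sequential behaviour described above holds, the time per token is roughly the sum of each card's share rather than the time of the fastest card. Here is a back-of-envelope sketch in Python. The layer counts and per-layer timings are made-up numbers, not benchmarks, and the model ignores transfer overhead between cards.

```python
def tokens_per_second(layers_per_card, ms_per_layer):
    """Estimate speed when cards process their layers one after another.

    layers_per_card: number of layers placed on each card
    ms_per_layer:    assumed milliseconds each card needs per layer
    """
    total_ms = sum(n * ms for n, ms in zip(layers_per_card, ms_per_layer))
    return 1000.0 / total_ms


# Three cards with 10 layers each; the third card is half as fast.
# 10*1.0 + 10*1.0 + 10*2.0 = 40 ms per token
print(tokens_per_second([10, 10, 10], [1.0, 1.0, 2.0]))  # 25.0
```

In this simplified picture, adding cards gives you more VRAM for layers but no extra speed, and a slow card drags down the total in proportion to how many layers it holds.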

Looking for a model better than MythoMax for Chat/RP in c/localllama@poweruser.forum

[–] Material1276@alien.top 1 points 2 years ago

Here's a link to an up-to-date ranking of models for RP. Currently 400+ models are ranked.

http://ayumi.m8geil.de/ayumi_bench_v3_results.html