this post was submitted on 17 Nov 2023

LocalLLaMA


Community to discuss about Llama, the family of large language models created by Meta AI.


Hi,

A lot of roleplay models I tried like to continue the story with some sappy s*** and I hate it. I tried to tell them not to, but they aren't listening to me.

For example:

X does y. What will happen next? Only time will tell....

Together, x and y are unstoppable. It is a testament to the spirit and unyielding hope they have.

Except multiply the amount of garbage by three.

I've tried many models and they all seem to do this. I'm getting really tired of it, because once it starts it's almost impossible to stop, and it ruins a perfectly good roleplay.

Sorry for the rant, I'm just a bit frustrated haha.

[–] Several_Extreme3886@alien.top 1 points 11 months ago (1 children)

How do I run this on my GPU and CPU? I have an RTX 2060 with 12 GB of VRAM, and 32 GB of system RAM available. Is that enough to run this?

[–] ThisGonBHard@alien.top 1 points 11 months ago (1 children)

Wow, you have one of the rare 2060 12 GB models. My best guess would be the GGUF version: try Q4 with maybe 25 layers offloaded to the GPU. Make sure to close any other apps, as you're going to be really close to running out of RAM.

As a reference point, the Exllama2 4BPW (roughly Q4-equivalent) version of the model needs around 23 GB of VRAM.
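If you want to sanity-check the split before downloading anything, here's a rough back-of-the-envelope sketch of how partial GPU offload divides a Q4 GGUF's weights between VRAM and system RAM. The model size (~20 GB) and layer count (60) are illustrative assumptions for a model of roughly this class, not measurements; the 25 offloaded layers match the suggestion above (the `-ngl 25` flag in llama.cpp terms).

```python
# Rough split of a Q4 GGUF's weights between VRAM and system RAM.
# All numbers are illustrative assumptions, not measurements:
# a model of this class at Q4 is assumed to be ~20 GB across ~60 layers.
MODEL_SIZE_GB = 20.0   # assumed Q4 GGUF file size
NUM_LAYERS = 60        # assumed transformer layer count
GPU_LAYERS = 25        # layers offloaded to the GPU (-ngl 25)

per_layer_gb = MODEL_SIZE_GB / NUM_LAYERS
vram_gb = GPU_LAYERS * per_layer_gb   # weight memory resident on the GPU
ram_gb = MODEL_SIZE_GB - vram_gb      # remainder stays in system RAM

print(f"VRAM: ~{vram_gb:.1f} GB, RAM: ~{ram_gb:.1f} GB")
```

Under these assumptions that's roughly 8 GB of weights on the 12 GB card and 12 GB in RAM; note this ignores the KV cache and compute buffers, which also take VRAM and grow with context length, so the real headroom is tighter than the weight math suggests.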

[–] Several_Extreme3886@alien.top 1 points 11 months ago

Hmm, I might consider swapping in 2 sticks of 32. That should make things easier. I usually need about 16 GB in use at all times for other things anyway.