I had high hopes for Yi-34B Chat, but when I tried it, it turned out not to be very good.
70B models are better (well, of course), but I think even some 20B models are better.
I used oobabooga_windows\text-generation-webui
I think I tested it up to 500 tokens or so.
Running full Falcon-180B under budget constraints
Oh no no no, you're doing it wrong ;) Just kidding. The numbers below are for reference, showing what you can get on a budget system without multiple high-end GPUs.
i5-12400F + 128GB DDR4 + some layers offloaded to a 3060 Ti = 0.35 tokens/second on Falcon-180B 4_K_M
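If you want to reproduce that kind of CPU/GPU split outside the webui, here is a minimal sketch with llama-cpp-python (the file name and layer count are placeholders, not my exact setup):

```python
# Minimal sketch of partial GPU offload with llama-cpp-python.
# Placeholders: point model_path at your GGUF file and raise/lower
# n_gpu_layers until the 3060 Ti's 8GB VRAM is almost full; the
# remaining layers stay in system RAM and run on the CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="models/falcon-180b.Q4_K_M.gguf",  # placeholder file name
    n_gpu_layers=20,   # layers pushed to the GPU; 0 = pure CPU
    n_ctx=2048,        # context size; bigger costs more memory
)

out = llm("Why is the sky blue?", max_tokens=64)
print(out["choices"][0]["text"])
```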
Seems this model has a problem and isn't loading.
I tried the GGUF versions of this model from Hugging Face and they just won't load.
Interesting, everyone is suggesting 7B models, but you can run much better models by using more than just your GPU memory, so I would highly recommend mxlewd-l2-20b. It's very smart, and it's fantastic for writing scenes and such.
Uncensored model for storytelling
No, somehow I got a very different result.
It refuses to write smut: "I am an AI created to write positive stories blah blah" (not literally what it said), and when I entered "Start reply with: Sure thing", it replied something like "I'll try to write a story in a decent way" and then proceeded to write a story without any smut, as if that part of the prompt wasn't there.
The existing lzlv-70b is less censored in this regard and also writes better stories, for my taste.
Well, it depends.
You cannot run 70B models on an RTX 3060 alone, but you can with 64GB of system memory.
Is it possible to have something comparable to free ChatGPT?
Many people say that local models are really close to or even exceed ChatGPT, but I personally don't see it, although I have tried many different models. You can still run something "comparable" to ChatGPT, but it would be much, much weaker.
What is the hardware needed?
It works the other way around: you run a model that your hardware is able to run. For example, if you have 16GB of RAM, you can run a 13B model. You don't even need a GPU to run it; it just runs slower on the CPU.
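As a very rough sizing rule of thumb (my own back-of-the-envelope numbers, assuming a 4_K_M-ish quant at roughly 0.6 bytes per parameter):

```python
# Back-of-the-envelope RAM estimate for a GGUF model.
# Assumptions: ~0.6 bytes per parameter for a 4_K_M-style quant,
# plus ~1.5 GB for context and runtime overhead.
def est_ram_gb(params_billion, bytes_per_param=0.6, overhead_gb=1.5):
    return params_billion * bytes_per_param + overhead_gb

for size in (7, 13, 20, 34, 70):
    print(f"{size}B -> roughly {est_ram_gb(size):.0f} GB")
# 13B comes out around 9 GB, which is why it fits in 16 GB of RAM,
# while 70B comes out around 44 GB and needs something like 64 GB.
```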
So to run a model locally, you need to install software for it, like this one:
https://github.com/oobabooga/text-generation-webui
And then you need a model; you can start with this one:
https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF/tree/main. Download one of the variants and put it in the models folder of the text-generation-webui you installed in the previous step.
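If you prefer to grab the file from a script instead of the browser, something like this works with the huggingface_hub package (the file name below is just one example variant, so check the exact name on the repo's Files tab, and adjust local_dir to your install path):

```python
# Download one GGUF variant straight into the webui's models folder.
# The filename is a guess at one of the variants; verify it on the repo page.
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="TheBloke/Mistral-7B-v0.1-GGUF",
    filename="mistral-7b-v0.1.Q4_K_M.gguf",    # pick whichever quant you want
    local_dir="text-generation-webui/models",  # adjust to your install path
)
```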
Well, it is good for roleplay and writing. I tried only the 2_K_M variant, because it has no bigger quants yet.
Actually, at 2_K_M it already feels like the best 70B models at 4_K_M, or even better.
If the model fits completely inside 12GB, it will run faster on the desktop; if it doesn't fit into 12GB but fits fully in 16GB, then there's a good chance it will run faster on the laptop with the 16GB GPU.