this post was submitted on 25 Nov 2023

LocalLLaMA

I tried one yesterday for the first time and I came away very impressed. It's not great (yet) for RP but the basis seems very robust. For a laugh, I tried the usual reasoning tests and damn, it passed them all first try:

AI

How can I help you today?

You

There is a room with three murderers inside. Another person enters the room and shoots one of the three original occupants dead. Nobody else enters or leaves the room. How many murderers are there in the room?

AI

Four. Three original occupants who were already murderers and the fourth being the new person who entered the room and killed one of them.

You

I buy ten apples at the market. I give two of them to my friend Bob and two to my mother. Then, I go back to the market and buy three more apples. On my way home, I eat one apple. At home, I give my brother three bananas. How many apples do I have left. Explain your reasoning step by step.

AI

Let's break it down step-by-step:

You start with 10 apples.

You give away 2 apples to Bob (leaving you with 8 apples) and then give away another 2 apples to your mother (leaving you with 6 apples).

Next, you purchase 3 additional apples from the market, bringing your total up to 9 apples again.

While on your way home, you consume 1 apple, leaving you with 8 apples remaining.

Finally, when you reach home, you give your brother 3 bananas - but this doesn't affect the number of apples you have, since they are different fruits. So, you still have 8 apples left.

You

Sally has five brothers. Each of her brothers has one sister. How many sisters does Sally have. Explain your reasoning step by step.

AI

Here's how we can reason through this problem:

Sally has five brothers.

It states that each of her brothers has one sister.

Since all these siblings belong to the same family, it means that Sally is the only sister mentioned for those five brothers.

Therefore, Sally has no other sisters besides herself.

So, Sally has 0 sisters

Like, seriously? Huge context, really fast (on my 3090 anyway) and this level of performance. I can't wait to see what people can do with this.

[–] bullerwins@alien.top 1 points 11 months ago (1 children)

Thanks a lot! I wasn't sure how context affected VRAM usage. So each model has a maximum context size, and using more of it takes more VRAM. Thanks!

[–] mcmoose1900@alien.top 1 points 11 months ago (2 children)

Another thing to note is that the exllamav2 backend is "special" because its context takes up less VRAM than the context in other backends. Say the weights take 18GB and your context takes up 6GB for a GGUF model. In exllama, with the 8-bit cache, that context takes up only 3GB.

There are other complications like the prompt processing batch size, but that's the gist of it.

This makes a dramatic difference when the context gets huge. I'd prefer to use koboldcpp myself, but I just can't really squeeze it on my 3090 without excessive offloading.
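
For anyone who wants to play with the numbers, here is a rough sketch of the arithmetic behind the 18GB / 6GB / 3GB example above. It uses the standard KV-cache estimate (one key and one value vector per layer, per KV head, per token). The model shape is made up for illustration and is not from this thread, and the function name `kv_cache_bytes` is just a helper for this sketch, not part of any backend. It also assumes the larger figure corresponds to a 16-bit cache, which is why the 8-bit cache comes out at half. Real backends add scratch buffers on top, so treat the output as a lower bound.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_element):
    """Estimate KV-cache size: a key and a value vector per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_element


GIB = 1024 ** 3

# Hypothetical model shape, chosen only for illustration (an assumption, not from the thread).
shape = dict(n_layers=60, n_kv_heads=8, head_dim=128)

for context_len in (4096, 32768):
    # Assumption: the baseline cache stores 16-bit elements (2 bytes each).
    fp16 = kv_cache_bytes(**shape, context_len=context_len, bytes_per_element=2)
    # An 8-bit cache stores 1 byte per element, so it is half the size.
    int8 = kv_cache_bytes(**shape, context_len=context_len, bytes_per_element=1)
    print(f"{context_len:>6} tokens: 16-bit cache {fp16 / GIB:.2f} GiB, "
          f"8-bit cache {int8 / GIB:.2f} GiB")
```

The cache grows linearly with context length, so the saving from the 8-bit cache only becomes noticeable once the context is large.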

[–] frozen_tuna@alien.top 1 points 11 months ago

Very good to know! I haven't fiddled with the new Yi models too much yet since I was running into these exact issues. I'll definitely use this solution soon, thanks.

[–] bullerwins@alien.top 1 points 11 months ago

Interesting! For some reason I've had more success with GGUF models, since those work everywhere using koboldcpp and ooba's. I didn't know that exllamav2 was better for context. I will try it. That backend is for the EXL2 format, right? I had the impression it was better for speed; I didn't know its context takes up less VRAM.