ThinkExtension2328

joined 10 months ago
[–] ThinkExtension2328@alien.top 1 points 9 months ago

Not sure about the K, but the M means a medium level of info loss during the quantisation phase, afaik
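For what it's worth, in the community naming convention for GGUF files the K marks the "k-quant" family and the trailing S/M/L is the small/medium/large variant. A toy decoder, based on that convention rather than any official spec:

```python
# Toy decoder for GGUF quant suffixes like "Q4_K_M".
# Meanings follow community naming conventions, not an official spec.
def decode_quant(name: str) -> str:
    parts = name.upper().split("_")
    bits = parts[0].lstrip("Q")                      # "Q4" -> 4-bit weights
    kind = "k-quant" if "K" in parts[1:] else "legacy quant"
    sizes = {"S": "small", "M": "medium", "L": "large"}
    size = next((sizes[p] for p in parts if p in sizes), "default")
    return f"{bits}-bit {kind}, {size} variant"

print(decode_quant("Q4_K_M"))  # -> 4-bit k-quant, medium variant
```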

[–] ThinkExtension2328@alien.top 1 points 9 months ago (4 children)

Honestly the M1 is probably the cheapest solution you have. Get yourself LM Studio and try out a 7B K_M model; you're going to struggle with anything larger than that. But that will let you experience what we are all playing with.
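Why 7B is roughly the ceiling on a base M1: a back-of-envelope estimate of the weight memory alone (the ~4.5 bits/weight figure for a medium k-quant is approximate, and KV cache and OS overhead come on top):

```python
# Rough memory estimate for a quantised model's weights.
# bits_per_weight ~4.5 is an approximation for a medium k-quant.
def model_size_gb(params_billion: float, bits_per_weight: float = 4.5) -> float:
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

print(round(model_size_gb(7), 1))   # ~3.9 GB -- fits in 8 GB unified memory
print(round(model_size_gb(13), 1))  # ~7.3 GB -- already too tight on a base M1
```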

[–] ThinkExtension2328@alien.top 1 points 9 months ago

SPR is not a technology itself; it's a methodology to "compress information" in a way that lets an AI effectively fit a larger context input into the same size. David has a great video explaining the methodology behind it. I've found it to be useful as hell.

[–] ThinkExtension2328@alien.top 1 points 10 months ago (2 children)

I have only recently found the correct answer, which is: take the information and use Sparse Priming Representations (SPR) to distill it. Next, feed this text to privateGPT to use as a vector db document. Since SPR condenses the text, you will be able to use more items as part of the retrieval phase.

Now query the LLM using the vector db. Thanks to the SPR-encoded text, you get highly detailed and accurate results with a small LLM that is easy to run.
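The pipeline above can be sketched in miniature. The bag-of-words embedding here is a toy stand-in for the real embedding model privateGPT would use, and SPR_PROMPT is my own paraphrase of the idea, not an official prompt:

```python
# Toy sketch: distill text with an SPR-style prompt, store the condensed
# chunks in a tiny vector store, retrieve by cosine similarity.
import math
from collections import Counter

SPR_PROMPT = ("Rewrite the following text as Sparse Priming Representations: "
              "short, dense statements that let an LLM reconstruct the ideas.")

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline uses a neural embedder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# In practice each entry would be the LLM's SPR distillation of a passage.
store = [
    "transformers use attention to weigh context tokens",
    "quantisation trades precision for smaller memory footprint",
]

def retrieve(query: str) -> str:
    return max(store, key=lambda chunk: cosine(embed(query), embed(chunk)))

print(retrieve("how does quantisation affect memory use"))
```

The retrieved chunk would then be prepended to the prompt sent to the small LLM.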

[–] ThinkExtension2328@alien.top 1 points 10 months ago (1 children)

They're generally good for single-shot or few-shot tasks, e.g. getting cliff notes or creating templates. You can use a vector db for informational accuracy. They struggle to keep character and context, I've noticed.

[–] ThinkExtension2328@alien.top 1 points 10 months ago

Just tried it, can confirm this guy knows what he is talking about ^. Pretty great model tbh.

[–] ThinkExtension2328@alien.top 1 points 10 months ago (2 children)

Explain your train of thought about OpenHermes, and what examples do you have?

[–] ThinkExtension2328@alien.top 1 points 10 months ago

The biggest one you can run at a usable rate. The larger models tend to have more nuance; granted, some new models are challenging this notion, but that's the general way to go about it.

[–] ThinkExtension2328@alien.top 1 points 10 months ago

It's very good.

[–] ThinkExtension2328@alien.top 1 points 10 months ago

Neural Chat 7B is pretty good, though this seems a bit stupid imho. You're better off using the model stated before and using something like privateGPT to ingest the book into a vector db. Then you can effectively "talk to your books".
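The ingestion step amounts to splitting the book into overlapping chunks before embedding them into the vector db. A minimal sketch (the chunk size and overlap are arbitrary illustrative values, not privateGPT's actual defaults):

```python
# Toy sketch of the "talk to your books" ingestion step: split a book
# into overlapping character chunks, ready for embedding into a vector db.
def chunk_book(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    chunks, step = [], size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks

book = "Call me Ishmael. Some years ago, never mind how long precisely..."
for c in chunk_book(book):
    print(repr(c))
```

The overlap keeps a sentence that straddles a boundary retrievable from either chunk.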

[–] ThinkExtension2328@alien.top 1 points 10 months ago

Not sure you're going to get something that small yet. Neural Chat 7B is the closest logically sound model, and it's a lot better when given a vector db.
