hackerllama

joined 1 year ago
[–] hackerllama@alien.top 1 points 11 months ago (1 children)

Hey there! I think this is doing offloading?

If so, it's not a new thing. Check out https://huggingface.co/docs/accelerate/usage_guides/big_modeling for a guide with code and videos about it

[–] hackerllama@alien.top 1 points 11 months ago (4 children)

Base models are not trained for conversations, so you cannot use them as chat models. It's like GPT-4 vs. ChatGPT: GPT-4 is the base model, which is then fine-tuned to be conversational, and that's what you see in ChatGPT. Same with Llama vs. Llama Chat.
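In practice the difference shows up in the prompt: a base model just continues raw text, while a chat fine-tune expects the prompt format it was trained on. A small sketch using a generic ChatML-style template (illustrative only; each chat model defines its own template, which `transformers` exposes via `tokenizer.apply_chat_template`):

```python
# Hypothetical helper showing why a chat model's prompt differs from a
# base model's. The ChatML-style tags below are an example format, not
# what every chat model uses.
def format_chat_prompt(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Base model: plain text continuation.
base_prompt = "The capital of France is"

# Chat model: structured turns in its training format.
chat_prompt = format_chat_prompt(
    [{"role": "user", "content": "What is the capital of France?"}]
)
print(chat_prompt)
```

Feed a chat-formatted prompt to a base model (or a bare string to a chat model) and you typically get rambling continuations instead of an answer.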

[–] hackerllama@alien.top 1 points 11 months ago

The chat model came out today

 

Yi is a series of LLMs trained from scratch at 01.AI. The models have the same architecture as Llama, making them compatible with the whole Llama-based ecosystem. In November alone, they released:

  • Base 6B and 34B models
  • Models with extended context of up to 200k tokens
  • Today, the Chat models

With this release, they are also publishing 4-bit AWQ-quantized and 8-bit GPTQ-quantized versions

Things to consider:

  • Llama-compatible format, so you can use it across a bunch of tools
  • The license unfortunately doesn't allow commercial use by default, but you can request commercial use and they are quite responsive
  • 34B is an amazing model size for consumer GPUs
  • Yi-34B is at the top of the Open LLM Leaderboard, making it a very strong base model for a chat one