BlueMetaMind

joined 11 months ago
[–] BlueMetaMind@alien.top 1 points 11 months ago

Yes, I understood you. My claim differs in that I think they DIRECTLY used a lot of GPT-4 output through the API, which is very plausible because a lot of LLM training is done that way: you ask GPT-4 to generate example conversations with the properties you want your LLM to learn, and then you train on those.
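Roughly what I have in mind, as a minimal sketch (the model name, prompts, and output format are placeholders I made up, not anyone's actual pipeline):

```python
# Hypothetical sketch: generating synthetic training conversations with the OpenAI API.
# System prompt, topics, model name, and output format are all placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SEED_TOPICS = ["explaining recursion", "debugging a Python script", "summarizing a news article"]

examples = []
for topic in SEED_TOPICS:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Write a short user/assistant dialogue that demonstrates helpful, polite answers."},
            {"role": "user", "content": f"Topic: {topic}"},
        ],
    )
    examples.append({"topic": topic, "dialogue": response.choices[0].message.content})

# Dump the generated dialogues so they could later be converted into a fine-tuning dataset.
with open("synthetic_conversations.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```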

For a model to self-identify as GPT, I don't think randomly crawled chat examples from the internet would be enough.

I am not trying to make a strong claim on that; it's just a thought. Maybe it's both.

[–] BlueMetaMind@alien.top 1 points 11 months ago (2 children)

It sounds more like it was trained on ChatGPT output and they didn't curate it enough to delete those "As a large language model trained by OpenAI..." type statements.

It's kinda like Shutterstock watermarks showing up in image generation.

[–] BlueMetaMind@alien.top 1 points 11 months ago

Absolutely! I wanted to start a thread like this myself. There is plenty of information out there on general AI topics, but I'm looking for concrete tutorials and articles on current open-source LLM practice. I agree that the knowledge is frustratingly scattered.

[–] BlueMetaMind@alien.top 1 points 11 months ago

Yeah… I thought I’d at least be “in the room” when I bought my setup last year, but it turns out I’m outside in the gutter 🫣😢

[–] BlueMetaMind@alien.top 1 points 11 months ago

What are the top 3 best open source LLMs in your opinion?

[–] BlueMetaMind@alien.top 1 points 11 months ago

Or, if you are just playing around, you write or search for a post on Reddit (or the various LLM-related Discords) asking for the best model for your task :D

I made this post as an attempt to collect best practices and ideas.

"use GPT4 to evaluate output of llama."

That's probably always a good option, but I try to avoid using OpenAI altogether.
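For reference, the pattern people usually mean is something like this rough sketch (the grading prompt, model name, and 1-10 scale are placeholders I made up):

```python
# Hypothetical sketch of "GPT-4 as judge": score a local model's answer with GPT-4.
# Prompts, model name, and the 1-10 scale are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

def judge(question: str, answer: str) -> str:
    """Ask GPT-4 to grade a candidate answer; returns the raw judgement text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a strict grader. Rate the answer from 1 to 10 and explain briefly."},
            {"role": "user", "content": f"Question:\n{question}\n\nCandidate answer:\n{answer}"},
        ],
    )
    return response.choices[0].message.content

# Example: grade an answer produced by a local Llama model.
print(judge("What is the capital of Australia?", "The capital of Australia is Canberra."))
```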

 

As a beginner, I appreciate that there are metrics for all these LLMs out there so I don't waste time downloading and trying failures. However, I noticed that the Leaderboard doesn't exactly reflect reality for me. YES, I DO UNDERSTAND THAT IT DEPENDS ON MY NEEDS.

I mean really basic stuff: whether the LLM acts as a coherent agent, can follow instructions, and can grasp context in a given situation. That is often lacking in the LLMs I have tried so far, like the board's leader among ~30B models, 01-ai/Yi-34B, for example. I guess something similar is going on as with GPU benchmarks back in the day: dirty tricks and over-optimization for the tests.

I am interested in how more experienced people here evaluate an LLM's fitness. Do you have a battery of questions and instructions you try out first?
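To show what I mean, here is roughly what my own naive version looks like: a fixed list of prompts run against a local model (a sketch only; the model path and prompts are made up, and it assumes the llama-cpp-python bindings):

```python
# Hypothetical sketch of a personal "battery" of test prompts for a local GGUF model.
# The model path and prompts are placeholders; assumes the llama-cpp-python bindings.
from llama_cpp import Llama

llm = Llama(model_path="./models/some-model.Q5_K_M.gguf", n_ctx=4096, n_gpu_layers=-1)

BATTERY = [
    "Answer in exactly three bullet points: why is the sky blue?",
    "Reply only with valid JSON containing the keys name and age for a fictional person.",
    "My dog is called Rex and my cat is called Mia. Which one of them is the dog?",
]

for prompt in BATTERY:
    out = llm.create_chat_completion(messages=[{"role": "user", "content": prompt}])
    print(prompt, "->", out["choices"][0]["message"]["content"], "\n")
```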

[–] BlueMetaMind@alien.top 1 points 11 months ago (1 children)

Thank you. What does "at 5_K_M" mean?
Can I use the text web UI with llama.cpp as the model loader, or is that too much overhead?

 

I want to run a 70B LLM locally with more than 1 T/s. I have a 3090 with 24GB VRAM and 64GB RAM on the system.

What I managed so far:

  • Found instructions for running a 70B entirely in VRAM with a ~2.5-bit quant; it ran fast, but the perplexity was unbearable and the LLM was barely coherent.
  • I somehow got a 70B running with a mix of RAM/VRAM offloading, but it ran at 0.1 T/s.

I saw people claiming reasonable T/s speeds. Since I am a newbie, I can barely speak the domain language, and most instructions I found assume implicit knowledge I don't have.

I need explicit instructions on exactly which 70B model to download, which model loader to use, and how to set the parameters that matter in this context.
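To make it concrete, this is the kind of recipe I'm after, written out as my best guess rather than something I know works (placeholder file name; assumes a GGUF quant of a 70B model loaded through the llama-cpp-python bindings, with n_gpu_layers tuned so as many layers as possible fit in the 3090's 24 GB and the rest spill over to system RAM):

```python
# Rough sketch of partial GPU offloading for a 70B GGUF with llama-cpp-python.
# The model file name and layer count are placeholders to be tuned per quant and VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/some-70b.Q4_K_M.gguf",  # placeholder file name
    n_gpu_layers=45,   # how many transformer layers go to the 3090; lower this if you run out of VRAM
    n_ctx=4096,        # context window; larger contexts also cost VRAM
    n_threads=8,       # CPU threads for the layers left in system RAM
)

out = llm("Q: Name three uses of a 70B model run locally.\nA:", max_tokens=128)
print(out["choices"][0]["text"])
```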

[–] BlueMetaMind@alien.top 1 points 11 months ago

Best experience I had was with TheBloke/Wizard-Vicuna-30B-Uncensored-GGML

Best 30B LLM so far in general. Censorship kills capabilities.

[–] BlueMetaMind@alien.top 1 points 11 months ago

Clever. I often run SD and the text UI in --listen mode, then use them on my iPad.

 

I upgraded my system a year ago; among other things, I went from a 1070 to a 3090. The old card is now gathering dust in the cellar. I've heard people around here mention that they run their desktop on the old card to completely free up VRAM on the workhorse card.

Is this worth doing just for loading LLMs and playing around with them (no training yet)? The downside is a worse thermal situation if I cram both cards in together.
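If I do try it, I'd probably sanity-check the result with something like this (a small sketch using PyTorch; the device numbering is an assumption and depends on how the driver enumerates the cards):

```python
# Sketch: report free/total VRAM per GPU to see how much the desktop is actually eating.
# Device numbering is an assumption; it depends on how the driver enumerates the cards.
import torch

for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    name = torch.cuda.get_device_name(i)
    print(f"GPU {i} ({name}): {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
```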