BlueMetaMind

joined 11 months ago
[–] BlueMetaMind@alien.top 1 points 11 months ago

Yes, I understood you. My claim differs in that I think they DIRECTLY used a lot of GPT-4 output through the API, which is very plausible because a lot of LLM training is done that way: you ask GPT-4 to generate example conversations with the properties you want your LLM to learn, and then you train on those.
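Roughly what I have in mind, as a minimal sketch (the model name, prompts, and output format are placeholders I made up, not anyone's actual pipeline):

```python
# Hypothetical sketch: generating synthetic training conversations with the OpenAI API.
# System prompt, topics, model name, and output format are all placeholders.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SEED_TOPICS = ["explaining recursion", "debugging a Python script", "summarizing a news article"]

examples = []
for topic in SEED_TOPICS:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Write a short user/assistant dialogue that demonstrates helpful, polite answers."},
            {"role": "user", "content": f"Topic: {topic}"},
        ],
    )
    examples.append({"topic": topic, "dialogue": response.choices[0].message.content})

# Dump the generated dialogues so they could later be converted into a fine-tuning dataset.
with open("synthetic_conversations.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```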

For a model to self-identify as GPT, I don't think randomly crawled chat examples from the internet would be enough.

I am not trying to make a strong claim on that; it's just a thought. Maybe it's both.

[–] BlueMetaMind@alien.top 1 points 11 months ago (2 children)

It sounds more like it was trained on ChatGPT output and they didn't curate it enough to delete those "As a large language model trained by OpenAI..." type statements.

It's kinda like Shutterstock watermarks showing up in image generation.

[–] BlueMetaMind@alien.top 1 points 11 months ago

Absolutely! I wanted to start a thread like this myself. There is plenty of information out there on general AI topics, but I'm looking for concrete tutorials and articles on current open-source LLM practice. I agree that the knowledge is frustratingly scattered.

[–] BlueMetaMind@alien.top 1 points 11 months ago

Yeah… I thought I’d at least be “in the room” when I bought my setup last year, but it turns out I’m outside in the gutter 🫣😢

[–] BlueMetaMind@alien.top 1 points 11 months ago

What are the top 3 best open source LLMs in your opinion?

[–] BlueMetaMind@alien.top 1 points 11 months ago

Or, if you are just playing around, you write or search for a post on Reddit (or the various LLM-related Discords) asking for the best model for your task :D

I made this post as an attempt to collect best practices and ideas.

"use GPT4 to evaluate output of llama."

That's probably always a good option, but I try to avoid using OpenAI altogether.
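For reference, the pattern people usually mean is something like this rough sketch (the grading prompt, model name, and 1-10 scale are placeholders I made up):

```python
# Hypothetical sketch of "GPT-4 as judge": score a local model's answer with GPT-4.
# Prompts, model name, and the 1-10 scale are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

def judge(question: str, answer: str) -> str:
    """Ask GPT-4 to grade a candidate answer; returns the raw judgement text."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a strict grader. Rate the answer from 1 to 10 and explain briefly."},
            {"role": "user", "content": f"Question:\n{question}\n\nCandidate answer:\n{answer}"},
        ],
    )
    return response.choices[0].message.content

# Example: grade an answer produced by a local Llama model.
print(judge("What is the capital of Australia?", "The capital of Australia is Canberra."))
```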

 

As a beginner, I appreciate that there are metrics for all these LLMs out there so I don't waste time downloading and trying failures. However, I noticed that the Leaderboard doesn't exactly reflect reality for me. YES, I DO UNDERSTAND THAT IT DEPENDS ON MY NEEDS.

I mean really basic stuff: whether the LLM acts as a coherent agent, can follow instructions, and can grasp context in a given situation. That is often lacking in the LLMs I have tried so far, like the board's leader among ~30B models, 01-ai/Yi-34B, for example. I guess something similar is going on as with GPU benchmarks back in the day: dirty tricks and over-optimization for the tests.

I am interested in how more experienced people here evaluate an LLM's fitness. Do you have a battery of questions and instructions you try out first?
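To show what I mean, here is roughly what my own naive version looks like: a fixed list of prompts run against a local model (a sketch only; the model path and prompts are made up, and it assumes the llama-cpp-python bindings):

```python
# Hypothetical sketch of a personal "battery" of test prompts for a local GGUF model.
# The model path and prompts are placeholders; assumes the llama-cpp-python bindings.
from llama_cpp import Llama

llm = Llama(model_path="./models/some-model.Q5_K_M.gguf", n_ctx=4096, n_gpu_layers=-1)

BATTERY = [
    "Answer in exactly three bullet points: why is the sky blue?",
    "Reply only with valid JSON containing the keys name and age for a fictional person.",
    "My dog is called Rex and my cat is called Mia. Which one of them is the dog?",
]

for prompt in BATTERY:
    out = llm.create_chat_completion(messages=[{"role": "user", "content": prompt}])
    print(prompt, "->", out["choices"][0]["message"]["content"], "\n")
```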

[–] BlueMetaMind@alien.top 1 points 11 months ago (1 children)

Thank you. What does "at 5_K_M" mean?
Can I use the text web UI with llama.cpp as the model loader, or is that too much overhead?

 

I want to run a 70B LLM locally with more than 1 T/s. I have a 3090 with 24GB VRAM and 64GB RAM on the system.

What I managed so far:

  • Found instructions for running a 70B entirely in VRAM with a ~2.5-bit quant; it ran fast, but the perplexity was unbearable and the LLM was barely coherent.
  • I somehow got a 70B running with a mix of RAM/VRAM offloading, but it ran at 0.1 T/s.

I saw people claiming reasonable T/s speeds. Since I am a newbie, I can barely speak the domain language, and most instructions I found assume implicit knowledge I don't have.

I need explicit instructions on exactly which 70B model to download, which model loader to use, and how to set the parameters that matter in this context.
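To make it concrete, this is the kind of recipe I'm after, written out as my best guess rather than something I know works (placeholder file name; assumes a GGUF quant of a 70B model loaded through the llama-cpp-python bindings, with n_gpu_layers tuned so as many layers as possible fit in the 3090's 24 GB and the rest spill over to system RAM):

```python
# Rough sketch of partial GPU offloading for a 70B GGUF with llama-cpp-python.
# The model file name and layer count are placeholders to be tuned per quant and VRAM.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/some-70b.Q4_K_M.gguf",  # placeholder file name
    n_gpu_layers=45,   # how many transformer layers go to the 3090; lower this if you run out of VRAM
    n_ctx=4096,        # context window; larger contexts also cost VRAM
    n_threads=8,       # CPU threads for the layers left in system RAM
)

out = llm("Q: Name three uses of a 70B model run locally.\nA:", max_tokens=128)
print(out["choices"][0]["text"])
```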

[–] BlueMetaMind@alien.top 1 points 11 months ago

Best experience I had was with TheBloke/Wizard-Vicuna-30B-Uncensored-GGML

Best 30B LLM so far in general. Censorship kills capabilities.

[–] BlueMetaMind@alien.top 1 points 11 months ago

Clever. I often run SD and the text UI in --listen mode, then use them on my iPad.

 

I upgraded my system a year ago; among other things, I went from a 1070 to a 3090. The old card is now gathering dust in the cellar. I've heard people around here mention that they run their desktop on the old card to completely free up VRAM on the workhorse card.

Is this worth doing just for loading LLMs and playing around with them (no training yet)? The downside is a worse thermal situation if I cram both cards in together.
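If I do try it, I'd probably sanity-check the result with something like this (a small sketch using PyTorch; the device numbering is an assumption and depends on how the driver enumerates the cards):

```python
# Sketch: report free/total VRAM per GPU to see how much the desktop is actually eating.
# Device numbering is an assumption; it depends on how the driver enumerates the cards.
import torch

for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    name = torch.cuda.get_device_name(i)
    print(f"GPU {i} ({name}): {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
```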