PumpkinEscobar

joined 1 year ago
[–] PumpkinEscobar@lemmy.world 47 points 10 hours ago (1 children)

But I thought Trump was all about states’ rights? This is very confusing. /s

[–] PumpkinEscobar@lemmy.world 5 points 1 week ago

Bench warrant, let’s do this!

[–] PumpkinEscobar@lemmy.world 2 points 1 month ago (1 children)

Similar to the previous reply about MATE with font size changes, I do that with Plasma. I hadn’t seen the Plasma Bigscreen project you linked, so I’ll definitely try that one out. I’ve also wondered about Plasma Mobile (https://en.m.wikipedia.org/wiki/Plasma_Mobile). These sorts of niche projects don’t always get a lot of attention, so if the Bigscreen project doesn’t work out, I’d bet the Plasma Mobile project is fairly active, and given the way it scales for displays it might work really well on a TV.

Speaking of scaling, since you mentioned it: I have noticed scaling in general feels a lot better in Wayland. If you’d only tried it in X11 before, you might want to see if Wayland works better for you.

[–] PumpkinEscobar@lemmy.world 23 points 1 month ago* (last edited 1 month ago) (1 children)

First, a caveat/warning: you'll need a beefy GPU to run larger models, though there are some smaller models that perform pretty well.

Adding a medium amount of extra information for you or anyone else who might want to get into running models locally.

Tools

  • Ollama - great app for downloading/managing/running models locally
  • OpenWebUI - A web app that provides a UI like the ChatGPT web app, but can use local models
  • continue.dev - A VS Code extension that can use Ollama to give you a GitHub Copilot-like AI assistant running against a local model (it can also connect to Anthropic Claude, etc.)
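
For anyone who wants to script against Ollama rather than go through a UI, here is a minimal sketch of calling its local HTTP API from Python. It assumes Ollama is running on its default port (11434) and that you have already pulled a model; the model tag `llama3.1:8b` and the helper names `build_request` and `generate` are just illustrative choices, not anything from the tools above.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text.

    With "stream" set to False, the server replies with a single JSON
    object whose "response" field holds the full completion.
    """
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like `generate("llama3.1:8b", "Why is the sky blue?")` should return the model's answer as a string. Swap in whatever model tag you actually have pulled.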

Models

If you look at https://ollama.com/library?sort=featured you can see the available models.

Model size is measured by parameter count. Generally, higher-parameter models are better (smarter, more accurate), but it's very challenging/slow to run anything over 25b parameters on consumer GPUs. I tend to find 8-13b parameter models are a sort of sweet spot. The 1-4b parameter models are meant more for really low-power devices: they'll give you OK results for simple requests and summarizing, but they're not going to wow you.

If you look at the 'tags' for the models listed below, you'll see things like 8b-instruct-q8_0 or 8b-instruct-q4_0. The q part refers to quantization (shrinking/compressing a model), and the number after it is roughly how aggressively the model was compressed: the smaller the number, the more aggressive the compression. Note the size of each tag and how the size shrinks as the quantization gets more aggressive. You can roughly think of this size number as "how much video RAM do I need to run this model". For me, I try to aim for q8 models, or fp16 if they can run on my GPU. I wouldn't try to use anything below q4 quantization; there seems to be a lot of quality loss below q4. Models can run partially or even fully on a CPU, but that's much slower. Ollama doesn't yet support the new NPUs found in new laptops/processors, but work is happening there.
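
To make the "size is roughly VRAM needed" rule of thumb concrete, here is a rough back-of-the-envelope sketch. It assumes the number after "q" is approximately the bits stored per weight and that fp16 uses 16 bits per weight; the helpers `parse_tag` and `estimate_gb` are hypothetical names for illustration, not part of Ollama. Real downloads come out somewhat larger because of extra metadata, and you also need headroom for context, so treat the result as a lower bound.

```python
import re


def parse_tag(tag: str) -> tuple[float, int]:
    """Return (billions of parameters, approx. bits per weight) for a model tag."""
    params = re.search(r"(\d+(?:\.\d+)?)b", tag)
    if params is None:
        raise ValueError(f"no parameter count found in tag: {tag!r}")

    quant = re.search(r"q(\d+)", tag)
    if quant is not None:
        # Assumption: q8 is about 8 bits per weight, q4 about 4, and so on.
        bits = int(quant.group(1))
    elif "fp16" in tag:
        bits = 16
    else:
        raise ValueError(f"no quantization level found in tag: {tag!r}")

    return float(params.group(1)), bits


def estimate_gb(tag: str) -> float:
    """Estimate the weight size in GB: billions of params * bits per weight / 8."""
    params_b, bits = parse_tag(tag)
    return params_b * bits / 8


for tag in ("8b-instruct-fp16", "8b-instruct-q8_0", "8b-instruct-q4_0"):
    print(f"{tag}: ~{estimate_gb(tag):.0f} GB")
```

So under these assumptions an 8b model drops from about 16 GB at fp16 to about 8 GB at q8 and about 4 GB at q4, which is why the quantization level matters so much for whether a model fits on your GPU.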

  • Llama 3.1 - The 8b instruct model is pretty good, with decent speed and good quality. This is a good "default" model to use
  • Llama 3.2 - This model was just released yesterday. I'm only seeing the 1b and 3b models right now. They've changed the 8b model to 11b, and I'm assuming the 11b model is going to be my new go-to when it's available
  • Deepseek Coder v2 - A great coding assistant model
  • Command-r - This is a more niche model, mainly useful for RAG. It's only available as a 35b parameter model, so not all that feasible to run locally
  • Mistral small - A really good model, in the ballpark of Llama. I haven't had quite as much luck with this as with Llama, but it is good, and I just saw that a new version was released 8 days ago, so I'll need to check it out again

[–] PumpkinEscobar@lemmy.world 10 points 1 month ago (3 children)

It’s a good thing that real open source models are getting good enough to compete with or exceed OpenAI.

[–] PumpkinEscobar@lemmy.world 1 points 1 month ago* (last edited 1 month ago) (1 children)

The Moon is a Harsh Mistress - by Heinlein, who also wrote Starship Troopers. Starship Troopers is also great and pretty different from the movie.

[–] PumpkinEscobar@lemmy.world 4 points 1 month ago

I'll preface by saying I think LLMs are useful and in the next couple years there will be some interesting new uses and existing ones getting streamlined...

But they're just next-word predictors. The best you could say about intelligence is that they have an impressive ability to encode knowledge in a pretty efficient way (meaning the storage density, not the execution of the LLM), but there's no logic or reasoning in how they execute or in your interaction with them. It's one of the reasons they're so terrible at math.

[–] PumpkinEscobar@lemmy.world 6 points 1 month ago

I like the game, but agree with the over-tutorialed complaints. They have two difficulty modes, and I wish only story mode got all the handholding. I think there are enough obvious indicators to get you through all the game mechanics.

[–] PumpkinEscobar@lemmy.world 2 points 1 month ago

VS Code’s git features are pretty good for staging changes, resolving merge conflicts, pushing changes. I still do most branch changing and creating with the CLI, and yeah, any sort of problem generally needs the CLI.

We’ve also been using Graphite at work and there’s a lot I like about it. They have a VS Code extension I haven’t used in a while, but their CLI is pretty nice.

[–] PumpkinEscobar@lemmy.world 9 points 2 months ago

Donnie Darko - Just such a great, strange movie
