robber

joined 3 years ago
[–] robber@lemmy.ml 1 points 8 hours ago

Your biggest issue with 2010 cards will be software (inference engine) support, I assume.

[–] robber@lemmy.ml 4 points 10 hours ago* (last edited 7 hours ago)

To add some practical advice:

It depends on what you mean by more advanced models. I run Qwen3.6-27b on 48GB VRAM across 3 cards (RTX 2000e Ada), and with the recent software optimizations merged into llama.cpp (tensor parallelism & MTP) I get around 30 tokens per second in generation. I use the model through openwebui for (agentic) web research and simple Q&A mostly and I'm quite happy with what it can do.

If you want something similar, maybe look at one or two second-hand V100 PCIe 32GB cards. Or something from the Intel Arc Pro series, if you don't mind the software support lagging behind a bit (as in less optimized).

Also, it might be worth reading up on the difference between dense and MoE models, if you're new to that. For MoE models, if your system RAM is fast enough, it's often viable to offload the "experts" (the largest parts of such models) to RAM, reducing VRAM capacity needs. Note that server motherboards with e.g. octa-channel RAM have a huge advantage over consumer boards (making DDR4 interesting despite its slower speed per module).
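To put rough numbers on the channel-count point, here is a back-of-the-envelope sketch. The memory speeds (DDR4-3200, DDR5-6000) and the example model (3B active parameters at 4-bit) are assumptions picked for illustration, not measurements, and the helper names are made up. It computes theoretical peak bandwidth only, and the tokens-per-second figure is an upper bound that assumes every active weight is read from RAM once per token; real-world numbers will be lower.

```python
def peak_bandwidth_gbs(channels: int, mt_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s for 64-bit (8-byte) memory channels."""
    # MT/s times bytes per transfer gives MB/s; divide by 1000 for GB/s.
    return channels * mt_per_s * bus_bytes / 1000


def max_tokens_per_s(bandwidth_gbs: float, active_params_b: float, bytes_per_param: float) -> float:
    """Upper bound on generation speed if all active weights stream from RAM per token."""
    gb_per_token = active_params_b * bytes_per_param
    return bandwidth_gbs / gb_per_token


server = peak_bandwidth_gbs(channels=8, mt_per_s=3200)   # assumed: octa-channel DDR4-3200
desktop = peak_bandwidth_gbs(channels=2, mt_per_s=6000)  # assumed: dual-channel DDR5-6000

print(f"server:  {server:.1f} GB/s")
print(f"desktop: {desktop:.1f} GB/s")

# Hypothetical MoE model: 3B active parameters, 4-bit weights (0.5 bytes each).
print(f"server bound:  {max_tokens_per_s(server, 3, 0.5):.1f} tok/s")
print(f"desktop bound: {max_tokens_per_s(desktop, 3, 0.5):.1f} tok/s")
```

Under these assumptions, the older DDR4 server platform has roughly twice the peak bandwidth of the faster-per-module consumer setup, purely because of the channel count.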

And to address your last question: while I have no direct experience, I've seen posts online about people connecting Strix Halo or DGX Spark devices, but usually via a 10+ Gbit/s switch, as the interconnect is crucial (unless you just want to load balance).

Self-hosting LLMs is a very fun thing to do, but also a time- and money-consuming rabbit hole. You might wanna check out the LocalLlama community over at shitjustworks.

Edit: typos

[–] robber@lemmy.ml 6 points 2 months ago

Global sustainability rules???

[–] robber@lemmy.ml 4 points 6 months ago

Exactly this. Since it does not seem to be federated, you're still forced to give your data to a third party you can't choose. And this makes the open source aspect a rather marginal benefit, at least for the privacy-concerned end user. Still, I appreciate the effort.

[–] robber@lemmy.ml 16 points 6 months ago (7 children)

I haven't tried it, but there is one: https://github.com/Alovoa/alovoa

[–] robber@lemmy.ml 10 points 7 months ago (2 children)

Given that Google generated more than 250 billion U.S. dollars in ad revenue in 2024, I'd say they must be pretty effective.

Source

[–] robber@lemmy.ml 1 points 7 months ago (1 children)
[–] robber@lemmy.ml 6 points 7 months ago

That brian typo really gave me a chuckle. Hope you found the movie you were looking for.

[–] robber@lemmy.ml 2 points 7 months ago (2 children)

Wikipedia states the UI layer is proprietary, is that true?

[–] robber@lemmy.ml 5 points 8 months ago

The country's official app for COVID immunity certificates or whatever they were called was available on F-Droid at the time.

[–] robber@lemmy.ml 15 points 8 months ago* (last edited 8 months ago) (1 children)

A review from earlier this year didn't sound too bad.

Edit: as pointed out, the review seems to be about the previous version of the phone.

[–] robber@lemmy.ml 3 points 8 months ago* (last edited 8 months ago)

One reason could be that the audience on lemmy has a left-ish bias and there's a political component to the Spotify exodus.

Edit: don't get me wrong, I love seeing content and engagement on here.

 

Text: Allows you to determine whether to limit CPUID maximum value. Set this to enabled for legacy operating systems such as Linux or Unix.

Found this in the BIOS of a Gigabyte Z97X-UD3H mobo.

 

Hi fellow homelabbers! I hope your day / night is going great.

Just stumbled across this self-hosted Cloudflare tunnel alternative called Pangolin.

  • Does anyone use it for exposing their homelab? It looks awesome, but I've never heard of it before.

  • Should I be reluctant since it's developed by a US-based company? I mean security-wise. (I'll remove this question if it's too political.)

  • Does anyone know of alternative pieces or stacks of software that achieve the same without relying on Cloudflare?

Your insights are highly appreciated!

 
 

Hey fellow self-hosting lemmoids

Disclaimer: not at all a network specialist

I'm currently setting up a new home server in a network where I'm given GUA IPv6 addresses in a /64 subnet (which means, if I understand correctly, that I can set up many devices in my network that are accessible via a fixed IP to the outside world). Everything works so far, my services are reachable.
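As a quick sanity check on the "many devices" part, here is a small sketch using Python's standard `ipaddress` module. The prefix and host address are placeholders from the 2001:db8::/32 documentation range, not my real ones.

```python
import ipaddress

# Placeholder /64 prefix from the documentation range (assumption, not a real allocation).
prefix = ipaddress.ip_network("2001:db8:1234:5678::/64")

# Hypothetical fixed address for a server inside that prefix.
host = ipaddress.ip_address("2001:db8:1234:5678::10")

# A /64 leaves 64 bits for the interface identifier.
print(prefix.num_addresses == 2**64)

# Any address sharing the first 64 bits belongs to the same subnet.
print(host in prefix)
```

Every address in that prefix is globally routable in principle, which is exactly why the firewall in front of it matters.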

Now my problem is that I need to use the router provided by my ISP, and it's - big surprise here - crap. The biggest concern for me is that I don't have fine-grained control over firewall rules. I can only open ports in groups (e.g. "Web", "All other ports") and I can only do this network-wide and not for specific IPs.

I'm thinking about getting a second router with a better IPv6 firewall and only use the ISP router as a "modem". Now I'm not sure how things would play out regarding my GUA addresses. Could a potential second router also assign addresses to devices in that globally routable space directly? Or would I need some sort of NAT? I've seen some modern routers with the capability of "pass-through" IPv6 address allocation, but I'm unsure if the firewall of the router would still work in such a configuration.

In IPv4 I used to have a similar setup, where router 1 would just forward all packets for some ports to router 2, which then would decide which device should receive them.

Do any of you have experience with a similar setup? And if so, could you even recommend a router?

Many thanks!


Edit: I was able to achieve what I wanted by using OpenWrt and their IPv6 relay mode. Now my ISP router handles all IPv6 addresses directly, but I'm still able to filter the packets using the OpenWrt firewall. For IPv4 I didn't figure out how to, at the same time, use the ISP's DHCP server, so I just went with double NAT. Everything works like a charm. Thank you guys for pointing me in the right direction.

 

A couple of years ago, QR-bills were introduced in Switzerland as a means to make payments easier. My bank provides an app to scan the QR codes, which I prefer not to install. The only other option they provide to scan the codes is to use the webcam. Am I supposed to print my digital bills to have my webcam scan them again? Just let me upload a goddamn screenshot.

 

I've been looking into self-hosting LLMs or stable diffusion models using something like LocalAI and / or Ollama and LibreChat.

Some questions to get a nice discussion going:

  • Any of you have experience with this?
  • What are your motivations?
  • What are you using in terms of hardware?
  • Considerations regarding energy efficiency and associated costs?
  • What about renting a GPU? Privacy implications?
 

Just wanted to share my happiness.

AIO is the new (at least on my timeline) installation method for Nextcloud, where most of the heavy lifting is taken care of automatically.

https://github.com/nextcloud/all-in-one
