overview for robber

OpenAI says its AI technology acted on its own in an 'unprecedented' hack of another company in c/technology@lemmy.world

[–] robber@lemmy.ml 1 points 6 days ago

Hugging Face used the incident to promote running open-weights LLMs on-prem, which a large part of their business focusses on.

Their blog post

What's a good alternative to Instagram? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 42 points 2 weeks ago

Pixelfed

Do you host your own AI? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 1 points 1 month ago

Well, compared to the Strix, 400GB/s is not that bad. I think with fast system RAM and expert offloading you could squeeze quite a bit out of it when running models in the 100b-a10b range.

Your bigger problem is going to be future software support.

Do you host your own AI? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 1 points 1 month ago

In case you missed the Ornith 1.0 release (Qwen and Gemma RL finetunes for agentic / coding workloads), they look interesting to bridge the gap until we see larger 3.6 models or a 3.7 release. I haven't tested them yet, but according to benchmarks the 35b MoE seems to be more or less on par with Qwen3.6 27b dense, while ofc being a lot faster.

Do you host your own AI? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 1 points 1 month ago

You can control how much context should be fitted with --fit-ctx and how much space the algorithm should leave unallocated (even on a per-GPU basis) with --fit-target.

Do you host your own AI? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 5 points 1 month ago

I currently run Qwen3.6-27b on llama.cpp and use it via openwebui. Mostly, I use it for web research via tavily, to a lesser extent for coding and interactively learning about things that are new to me but common in training data (such as basic math or ML concepts).

Do you host your own AI? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 3 points 1 month ago (4 children)

Given the 27b is a dense model, I think the numbers are quite ok. Curious about the quant tho.

The cool thing about the Strix is its large unified memory, but it lacks memory bandwidth for compute-intensive workloads. Something like Qwen3.5-122b MoE with only like 12b active parameters might run at twice the speed if it fits the configuration.
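
To put rough numbers on that, here is a back-of-envelope sketch. It assumes token generation is purely memory-bandwidth-bound (every active weight is read once per generated token) and ignores KV cache reads, compute, and other overhead. The 256 GB/s bandwidth figure, the 4-bit quant (0.5 bytes per parameter), and the function name are assumptions for illustration, not measurements.

```python
def estimate_decode_tps(bandwidth_gb_s: float, active_params_b: float,
                        bytes_per_param: float) -> float:
    """Rough upper bound on generated tokens per second.

    Assumes each token requires reading all active weights once from memory,
    so throughput is bandwidth divided by the bytes read per token.
    """
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 256.0   # assumed memory bandwidth of the machine
BYTES_PER_PARAM = 0.5    # assumed 4-bit quantization

# Dense model: all 27b parameters are active for every token.
dense_tps = estimate_decode_tps(BANDWIDTH_GB_S, 27.0, BYTES_PER_PARAM)
# MoE model: only about 12b parameters are active per token.
moe_tps = estimate_decode_tps(BANDWIDTH_GB_S, 12.0, BYTES_PER_PARAM)

print(f"dense 27b:       ~{dense_tps:.1f} tok/s upper bound")
print(f"MoE, 12b active: ~{moe_tps:.1f} tok/s upper bound")
print(f"speedup:         ~{moe_tps / dense_tps:.2f}x")
```

Under these assumptions the speedup depends only on the ratio of active parameters (27 / 12), which is where "twice the speed" comes from. Real-world numbers will be lower than the upper bounds printed here.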

Do you host your own AI? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 1 points 1 month ago (2 children)

Since the --fit parameter and its relatives were implemented and --fit on became the default, llama.cpp intelligently decides what to offload. For me, it made --n-cpu-moe obsolete.

57

Status symbols could also be called symbols of inequality (lemmy.ml)

submitted 1 month ago by robber@lemmy.ml to c/showerthoughts@lemmy.world

3 comments fedilink

Some days ago I saw people who attended a Fridays for Future demonstration excitedly put political stickers on a shiny blue Lamborghini which was obviously parked at the wrong point in spacetime.

When discussing this with a friend, we concluded that there was quite strong symbolism in that situation - like direct payback for the unnecessary pollution of the planet, the car being the canvas where the activists were able to project their anger onto.

We also talked about luxury cars being a symbol of social inequality.

And only later it hit me that luxury cars, among other things, are usually called status symbols, and that they could just as well be called symbols of inequality.

Hardware for local inference? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 1 points 2 months ago (1 children)

Your biggest issue with 2010 cards will be software (inference engine) support, I assume.

Hardware for local inference? in c/selfhosted@lemmy.world

[–] robber@lemmy.ml 7 points 2 months ago* (last edited 2 months ago)

To add some practical advice:

It depends on what you mean by more advanced models. I run Qwen3.6-27b on 48GB VRAM across 3 cards (RTX 2000e Ada), and with the recent software optimizations merged into llama.cpp (tensor parallelism & MTP) I get around 30 tokens per second in generation. I use the model through openwebui for (agentic) web research and simple Q&A mostly and I'm quite happy with what it can do.

If you want something similar, maybe look at one or two second-hand V100 PCIe 32GB cards. Or something from the Intel Arc Pro series, if you don't mind the software support lagging behind a bit (as in less optimized).

Also it might be worth reading into the difference of dense vs MoE models, if you're new to that. For MoE models, if your system RAM is fast enough, it's often viable to offload the "experts" (largest parts of such models) to RAM, reducing VRAM capacity needs. Note that server motherboards with e.g. octa-channel RAM have a huge advantage over consumer boards (making DDR4 interesting despite slower speed per module).
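
A quick sketch of why the channel count matters so much. It computes theoretical peak RAM bandwidth as channels × transfer rate × bytes per transfer, assuming a 64-bit (8-byte) data bus per channel. The specific module speeds (DDR4-3200, DDR5-6000) and the function name are illustrative assumptions; sustained real-world bandwidth is lower than these peaks.

```python
def peak_ram_bandwidth_gb_s(channels: int, mega_transfers_per_s: int,
                            bus_width_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each channel moves bus_width_bits / 8 bytes per transfer, so
    MT/s * bytes per transfer gives MB/s per channel.
    """
    bytes_per_transfer = bus_width_bits // 8
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# Assumed example configurations, not benchmarks:
server_ddr4 = peak_ram_bandwidth_gb_s(channels=8, mega_transfers_per_s=3200)
consumer_ddr5 = peak_ram_bandwidth_gb_s(channels=2, mega_transfers_per_s=6000)

print(f"8-channel DDR4-3200: {server_ddr4:.1f} GB/s peak")
print(f"2-channel DDR5-6000: {consumer_ddr5:.1f} GB/s peak")
```

So even with slower modules, the octa-channel board ends up with roughly twice the theoretical bandwidth of the dual-channel one, and bandwidth is what matters when the experts live in system RAM.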

And to address your last question, while I have no direct experience, I've seen posts online about people connecting Strix Halo or DGX Spark devices, but usually via a 10+Gbit/s switch as interconnect is crucial (except if you just want to load balance).

Self-hosting LLMs is a very fun thing to do, but also a time- and money-consuming rabbit hole. You might wanna check out the LocalLlama community over at shitjustworks.

Edit: typos

HP's ink-blocking firmware may violate new global sustainability rules in c/technology@lemmy.world

[–] robber@lemmy.ml 6 points 4 months ago

Global sustainability rules???

Would you try a FOSS dating app? in c/asklemmy@lemmy.ml

[–] robber@lemmy.ml 4 points 8 months ago

Exactly this. Since it does not seem to be federated, you're still forced to give your data to a third party you can't choose. And this makes the open source aspect a rather marginal benefit, at least for the privacy-concerned end user. Still, I appreciate the effort.

183

Well, that's offending (lemmy.ml)

submitted 1 year ago by robber@lemmy.ml to c/linux@lemmy.ml

10 comments fedilink

Text: Allows you to determine whether to limit CPUID maximum value. Set this to enabled for legacy operating systems such as Linux or Unix.

Found this in the BIOS of a Gigabyte Z97X-UD3H mobo.

62

Any experience with Pangolin? (lemmy.ml)

submitted 1 year ago by robber@lemmy.ml to c/selfhosted@lemmy.world

22 comments fedilink

Hi fellow homelabbers! I hope your day / night is going great.

Just stumbled across this self-hosted Cloudflare tunnel alternative called Pangolin.

Does anyone use it for exposing their homelab? It looks awesome, but I've never heard of it before.
Should I be reluctant since it's developed by a US-based company? I mean security-wise. (I'll remove this question if it's too political.)
Does anyone know of alternative pieces or stacks of software that achieve the same without relying on Cloudflare?

Your insights are highly appreciated!

271

More than 140 Kenya Facebook moderators diagnosed with severe PTSD (www.theguardian.com)

submitted 2 years ago by robber@lemmy.ml to c/technology@lemmy.world

21 comments fedilink

132

Don't forget to ... (lemmy.ml)

submitted 2 years ago by robber@lemmy.ml to c/piracy@lemmy.dbzer0.com

14 comments fedilink

13

[Solved] Chaining routers and GUA IPv6 addresses (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by robber@lemmy.ml to c/selfhosted@lemmy.world

8 comments fedilink

Hey fellow self-hosting lemmoids

Disclaimer: not at all a network specialist

I'm currently setting up a new home server in a network where I'm given GUA IPv6 addresses in a 64-bit subnet (which means, if I understand correctly, that I can set up many devices in my network that are accessible via a fixed IP to the outside world). Everything works so far, my services are reachable.
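
To illustrate what such a subnet gives you, here is a minimal sketch using Python's standard `ipaddress` module. The prefix and host addresses are made up from the 2001:db8::/32 documentation range, not my real ones.

```python
import ipaddress

# Hypothetical /64 prefix from the documentation range (2001:db8::/32).
prefix = ipaddress.ip_network("2001:db8:1234:5678::/64")

# A /64 leaves 64 bits for the host part, so the subnet holds 2**64 addresses.
print(prefix.num_addresses == 2**64)

# Any device address that shares the first 64 bits belongs to the subnet.
server = ipaddress.ip_address("2001:db8:1234:5678::10")
other = ipaddress.ip_address("2001:db8:1234:9999::10")
print(server in prefix)
print(other in prefix)
```

Every device can get its own address inside the prefix without any NAT, which is why the firewall on the router is the only thing between those devices and the internet.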

Now my problem is that I need to use the router provided by my ISP, and it's - big surprise here - crap. The biggest concern for me is that I don't have fine-grained control over firewall rules. I can only open ports in groups (e.g. "Web", "All other ports") and I can only do this network-wide and not for specific IPs.

I'm thinking about getting a second router with a better IPv6 firewall and only use the ISP router as a "modem". Now I'm not sure how things would play out regarding my GUA addresses. Could a potential second router also assign addresses to devices in that globally routable space directly? Or would I need some sort of NAT? I've seen some modern routers with the capability of "pass-through" IPv6 address allocation, but I'm unsure if the firewall of the router would still work in such a configuration.

In IPv4 I used to have a similar setup, where router 1 would just forward all packets for some ports to router 2, which then would decide which device should receive them.

Do any of you have experience with a similar setup? And if so, could you even recommend a router?

Many thanks!

Edit: I was able to achieve what I wanted by using OpenWrt and their IPv6 relay mode. Now my ISP router handles all IPv6 addresses directly, but I'm still able to filter the packets using the OpenWrt firewall. For IPv4 I didn't figure out how to, at the same time, use the ISP's DHCP server, so I just went with double NAT. Everything works like a charm. Thank you guys for pointing me in the right direction.

0

Modern online banking (lemmy.ml)

submitted 2 years ago by robber@lemmy.ml to c/mildlyinfuriating@lemmy.world

2 comments fedilink

A couple of years ago, QR-bills were introduced in Switzerland as a means to make payments easier. My bank provides an app to scan the QR codes, which I prefer not to install. The only other option they provide to scan the codes is to use the webcam. Am I supposed to print my digital bills to have my webcam scan them again? Just let me upload a goddamn screenshot.

65

Any of you have a self-hosted AI "hub"? (e.g. for LLM, stable-diffusion, ...) (lemmy.ml)

submitted 2 years ago by robber@lemmy.ml to c/selfhosted@lemmy.world

21 comments fedilink

I've been looking into self-hosting LLMs or stable diffusion models using something like LocalAI and / or Ollama and LibreChat.

Some questions to get a nice discussion going:

Any of you have experience with this?
What are your motivations?
What are you using in terms of hardware?
Considerations regarding energy efficiency and associated costs?
What about renting a GPU? Privacy implications?

98

Migrated my self-hosted Nextcloud to AIO and I absolutely love it (lemmy.ml)

submitted 2 years ago by robber@lemmy.ml to c/selfhosted@lemmy.world

28 comments fedilink

Just wanted to share my happiness.

AIO is the new (at least on my timeline) installation method of Nextcloud, where most of the heavy lifting is taken care of automatically.

https://github.com/nextcloud/all-in-one