this post was submitted on 19 Apr 2024
It's not really the same thing. EndeavourOS is basically vanilla Arch + a few branding packages. CachyOS is an opinionated Arch with optimised packages.
You still have the option to select the DE and the packages you want to install - just like EndeavourOS - but what sets Cachy apart is the optimisations. For starters, they have multiple custom kernel options, with the BORE scheduler (and a few others), LTO options etc. Then they also have packages compiled for the x86-64-v3 and v4 architectures for better performance.
Of course, you could also just use Arch (or EndeavourOS) and install the x86-64-v3/v4 packages yourself from ALHP (or even the Cachy repos), and you can even manually install the Cachy kernel or a similar optimised one like Xanmod. But then you don't get the custom configs / opinionated stuff, which you may not actually want as a veteran user. But if you're a newbie, then having those opinionated configs isn't such a bad idea, especially if you decide to just get a WM instead of a DE.
I've been thru all of the above scenarios, depending on the situation. My homelab is vanilla Arch but with packages from the Cachy repo. I've also got a pure Cachy install on my gaming desktop just because I was feeling lazy and just wanted an optimised install quickly. They also have a gaming meta package that installs Steam and all the necessary 32-bit libs and stuff, which is nice.
Then there's Cachy Browser, which is a fork of LibreWolf with performance optimisations (kinda similar to Mercury browser, except Mercury isn't MARCH optimised).
As for support, their Discord is pretty active, you can actually chat with the developers directly, and they're pretty friendly (and this includes Piotr Gorski, the main dev, and firelzrd - the person behind the BORE scheduler). Chatting with them, I find the quality of technical discussions a LOT higher than the Arch Discord, which is very off-topic and spammy most of the time.
Also, I liked their response to Arch changes and incidents. When Arch made the recent mkinitcpio changes, they made a very thorough announcement with the exact steps you needed to take (which was far more detailed than the official Arch announcement). Also, when the xz backdoor happened, they updated their repos to fix it even before Arch did.
I've also interacted with the devs personally on various technical topics, such as CFLAG and MARCH optimisations, performance benchmarking etc, and it seems like they definitely know their stuff.
So I've full confidence in their technical ability, and I'm happy to recommend the distro for folks interested in performance tuning.
cc: @governorkeagan@lemdro.id
It was my understanding that was all but pointless to do these days.
That depends on your CPU, hardware and workloads.
You're probably thinking of Intel and AVX512 (x86-64-v4) in which case, yes it's pointless because Intel screwed up the implementation, but on the other hand, that's not the case for AMD. Of course, that assumes your program actually makes use of AVX512. v3 is worth it though.
In any case, the usual places where you'd see improvements are compiling stuff, compression, encryption and audio/video encoding (ofc, if your codec is accelerated by your hardware, that's a moot point). Sometimes the improvements are not apparent in normal benchmarks, but would still have an overall impact. For instance, if you use filesystem compression, faster compression means lower I/O latency, and so on.
More importantly, if you're a laptop user, this could mean better battery life, since you'd be using more efficient instructions: certain stuff that might've taken 4 CPU cycles could be done in 2 etc.
In my own experience on both my Zen 2 and Zen 4 machines, v3/v4 packages made a visible difference. And that's not really surprising, because if you take a look at the instructions you're missing out on, you'd be like 'wtf':
CMPXCHG16B, LAHF-SAHF, POPCNT, SSE3, SSE4_1, SSE4_2, SSSE3, AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, OSXSAVE
And this is not counting any of the AVX512 instructions in v4, or all the CPU-specific instructions, e.g. in znver4.
It really doesn't make sense that you're spending so much money buying a fancy CPU, but not making use of half of its features...
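If you want to see which of the features listed above your own CPU reports, here's a rough Python sketch. The mapping from the names in the list to the flag names Linux prints in /proc/cpuinfo is my assumption (e.g. SSE3 shows up as `pni`, LZCNT as `abm`, and OSXSAVE is approximated by `xsave`), and the function names are made up for illustration.

```python
# Assumed mapping: feature name from the list above -> /proc/cpuinfo flag name.
# OSXSAVE has no flag of its own in /proc/cpuinfo, so "xsave" is an approximation.
FEATURE_TO_CPUINFO = {
    "CMPXCHG16B": "cx16", "LAHF-SAHF": "lahf_lm", "POPCNT": "popcnt",
    "SSE3": "pni", "SSE4_1": "sse4_1", "SSE4_2": "sse4_2", "SSSE3": "ssse3",
    "AVX": "avx", "AVX2": "avx2", "BMI1": "bmi1", "BMI2": "bmi2",
    "F16C": "f16c", "FMA": "fma", "LZCNT": "abm", "MOVBE": "movbe",
    "OSXSAVE": "xsave",
}


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the set of flags from the first 'flags' line, or an empty set."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or /proc is not readable
    return set()


def missing_features(cpu_flags):
    """Names from the list above whose (assumed) cpuinfo flag is absent."""
    return sorted(name for name, flag in FEATURE_TO_CPUINFO.items()
                  if flag not in cpu_flags)


if __name__ == "__main__":
    flags = read_cpu_flags()
    if not flags:
        print("could not read CPU flags")
    else:
        print("missing:", ", ".join(missing_features(flags)) or "none")
```

An empty "missing" list only means the CPU reports those flags. It says nothing about how much a given package gains from using them.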
[citation needed]
Those would show up in any benchmark that is sensitive to I/O latency.
Also, again, [citation needed] that march optimisations measurably lower I/O latency for compressed I/O. For that to happen it is a necessary condition that compression is a significant component in I/O latency to begin with. If 99% of the time was spent waiting for the device to write the data, optimising the 1% of time spent on compression by even as much as 20% would not gain you anything of significance. This is obviously an exaggerated example but, given how absolutely dog slow most I/O devices are compared to how fast CPUs are these days, not entirely unrealistic.
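To put numbers on that exaggerated example, here's a small Python sketch of the arithmetic (this is just Amdahl's law). It reads "optimising by 20%" as "that part now takes 20% less time", which is my interpretation, and the function name is made up.

```python
def remaining_time(fraction, reduction):
    """Total time left (as a share of the original) when only `fraction`
    of the original time gets shorter by `reduction`."""
    return (1.0 - fraction) + fraction * (1.0 - reduction)


# Exaggerated case from above: compression is 1% of I/O latency, 20% faster.
exaggerated = remaining_time(0.01, 0.20)
# For contrast, a hypothetical case where compression is half of the latency.
compression_bound = remaining_time(0.50, 0.20)

print(f"1% compression share:  {(1 - exaggerated) * 100:.1f}% lower latency")
print(f"50% compression share: {(1 - compression_bound) * 100:.1f}% lower latency")
```

With a 1% share, total latency drops by about 0.2%. The compression share would have to be large before a 20% improvement in it becomes noticeable.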
Generally, the effect of such esoteric "optimisations" is so small that the length of your unix username has a greater effect on real-world performance. I wish I was kidding.
You have to account for a lot of variables and measurement biases if you want to make factual claims about them. You can observe performance differences on the order of 5-10% just due to slight memory layout changes with different compile flags, without any actual performance improvement due to the change in code generation.
That's not my opinion, that's rather well established fact. Read here:
So far, I have yet to see data that shows a significant performance increase from march optimisations which either controlled for the measurement bias or showed an effect that couldn't be explained by measurement bias alone.
There might be an improvement and my personal hypothesis is that there is at least a small one but, so far, we don't actually know.
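As a rough illustration of what controlling for one such variable could look like, here's a Python sketch that times the same command under differently sized environments (the username remark above is about exactly this kind of layout shift). The `BENCH_PAD` variable, the pad sizes and the function names are all made up for the example, and wall-clock timing of a whole process is far too coarse for real conclusions. Treat it as a starting point, not a methodology.

```python
import os
import statistics
import subprocess
import sys
import time


def padded_env(pad_bytes, base=None):
    """Copy of the environment with a dummy variable of the given length."""
    env = dict(os.environ if base is None else base)
    env["BENCH_PAD"] = "x" * pad_bytes  # hypothetical padding variable
    return env


def time_command(cmd, env, repeats=5):
    """Wall-clock seconds for each of `repeats` runs of `cmd`."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def median_by_env_size(cmd, pad_sizes=(0, 64, 512, 4096), repeats=5):
    """Median run time of `cmd` for each environment padding size."""
    return {pad: statistics.median(time_command(cmd, padded_env(pad), repeats))
            for pad in pad_sizes}


if __name__ == "__main__":
    # Placeholder workload; substitute the binary you actually want to compare.
    results = median_by_env_size([sys.executable, "-c", "sum(range(10**6))"])
    for pad, seconds in results.items():
        print(f"pad={pad:5d} bytes  median={seconds * 1000:.2f} ms")
```

If the spread between pad sizes for one binary is as large as the difference between your two builds, the comparison doesn't tell you anything about the compiler flags.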
The more realistic case is that an execution that would have taken 4 CPU cycles on average would then take 3.9 CPU cycles.
I don't have data on how power scales with varying cycles/task at a constant task/time but I doubt it's linear, especially with all the complexities surrounding speculative execution.
"visible" in what way? March optimisations are hardly visible in controlled synthetic tests...
These features cater to specialised workloads, not general purpose computing.
Applications which facilitate such specialised workloads and are performance-critical usually have hand-made assembly for the critical paths where these specialised instructions can make a difference. Generic compiler optimisations will do precisely nothing to improve performance in any way in that case.
I'd worry more about your applications not making any use of all the cores you've paid good money for. Spoiler alert: Compiler optimisations don't help with that problem one bit.