liori

joined 1 year ago
[–] liori@lemm.ee 1 points 7 months ago

Personally I think child processes are the right approach for this. Launch a new process* for each query and it can (if you choose to go that route) dynamically load in compiled code. Exit when you’re done, and the dynamically loaded code is gone. A side benefit of that is memory leaks are contained, since all memory you allocate is about to be removed anyway.

I'd probably be fine with hundreds or thousands of these hanging in memory. I suspect the generated code for a single query would be in hundreds of kilobytes, maybe a megabyte. But yeah, this is one of those technical details I'd worry about.

Honestly, I wonder if you could just use an actual HTTP server for this? They can handle hundreds or even thousands of simultaneous requests. They can handle requests that complete in a fraction of a millisecond or ones that run for several hours. And they have good tools to catch/deal with code that segfaults, hits an endless loop, attempts to allocate terabytes of swap, etc. HTTP also has wonderful tools to load balance across multiple servers if you do need to scale to massive numbers of requests.

Not sure how a HTTP server would solve the CPU bottleneck of scanning terabytes of data per query?

[–] liori@lemm.ee 2 points 7 months ago

I somehow didn't think a regular JIT solution might be applicable here, but it is. Thank you! There seems to be a number of projects doing JIT for C++, will look at them.

 

I'm working on a query engine, essentially a tool to scan/filter/annotate by lookups/group by/aggregate a large dataset, tens-of-terabytes range. The compute part seems to be a bottleneck for me (I'll be doing around 80-300 GB/s of reads, and yes, I will have hardware capable of providing that kind of throughput). My hypothesis is that by encoding query in form of template arguments I can make the compiler generate code optimized for a specific type of query (like, the filtering or aggregation keys). But I do not know what queries will users send, so I need a way to instantiate templates at runtime.

Sounds simple: for a new type of query invoke a compiler at runtime to build a dynamic library with a new instantiation, then dynload it and off we go. Some prior work is here, though I'm pretty sure any JIT compiler also can counts here. But there's enough technical details to worry about, and at the same time this idea isn't novel, so I wonder—are there any packaged solutions for this kind of approach?

[–] liori@lemm.ee 2 points 10 months ago

Plenty of them on various sites, like this one I found yesterday.

 

TL;DR: we have discovered XMPP (Jabber) instant messaging protocol encrypted TLS connection wiretapping (Man-in-the-Middle attack) of jabber.ru (aka xmpp.ru) service’s servers on Hetzner and Linode hosting providers in Germany. The attacker has issued several new TLS certificates using Let’s Encrypt service which were used to hijack encrypted STARTTLS connections on port 5222 using transparent MiTM proxy. The attack was discovered due to expiration of one of the MiTM certificates, which haven’t been reissued. There are no indications of the server breach or spoofing attacks on the network segment, quite the contrary: the traffic redirection has been configured on the hosting provider network. The wiretapping may have lasted for up to 6 months overall (90 days confirmed). We believe this is lawful interception Hetzner and Linode were forced to setup.

 

While high-frequency trading is not exactly my favourite topic, I do like reading on their technical approaches.

By Paul Bilokon, Burak Gunduz

This work aims to bridge the existing knowledge gap in the optimisation of latency-critical code, specifically focusing on high-frequency trading (HFT) systems. The research culminates in three main contributions: the creation of a Low-Latency Programming Repository, the optimisation of a market-neutral statistical arbitrage pairs trading strategy, and the implementation of the Disruptor pattern in C++. The repository serves as a practical guide and is enriched with rigorous statistical benchmarking, while the trading strategy optimisation led to substantial improvements in speed and profitability. The Disruptor pattern showcased significant performance enhancement over traditional queuing methods. Evaluation metrics include speed, cache utilisation, and statistical significance, among others. Techniques like Cache Warming and Constexpr showed the most significant gains in latency reduction. Future directions involve expanding the repository, testing the optimised trading algorithm in a live trading environment, and integrating the Disruptor pattern with the trading algorithm for comprehensive system benchmarking. The work is oriented towards academics and industry practitioners seeking to improve performance in latency-sensitive applications.

 

I've said this previously, and I'll say it again: we're severely under-resourced. Not just XFS, the whole fsdevel community. As a developer and later a maintainer, I've learnt the hard way that there is a very large amount of non-coding work is necessary to build a good filesystem. There's enough not-really-coding work for several people. Instead, we lean hard on maintainers to do all that work. That might've worked acceptably for the first 20 years, but it doesn't now.

[…]

Dave and I are both burned out. I'm not sure Dave ever got past the 2017 burnout that lead to his resignation. Remarkably, he's still around. Is this (extended burnout) where I want to be in 2024? 2030? Hell no.

[–] liori@lemm.ee 6 points 1 year ago (1 children)

I'm pretty sure just like transport containers were standardized by ISO to make transport easier, game boxes should be standardized to fit in Kallax.

[–] liori@lemm.ee 3 points 1 year ago (1 children)

lemmy.ml is hosted in EU, and lemmynsfw.com uses CloudFlare, which operates in EU. Worst case, issue a GDPR request to both.

[–] liori@lemm.ee 1 points 1 year ago (1 children)

Will they keep the dense email list view as an option? Seeing more than the 14 email messages visible on the screenshot in the post is useful to sort out large folders.

[–] liori@lemm.ee 1 points 1 year ago* (last edited 1 year ago)

I'm surprised federation isn't based on asymmetric cryptography. Let the public/private keys identify instances, as opposed to domains that risk being blocked by governments or bought by malicious third parties if the instance owner forgets to prolong it.

With that, implementing a change in domain names would be simple.

[–] liori@lemm.ee 5 points 1 year ago

I found it crazy useful to study old, established, mature technologies, like relational databases, storage, low-level networking stack, optimizing compilers, etc. Much more valuable than learning the fad of the year. For example, consider studying internals of Postgresql if you're using it.

[–] liori@lemm.ee 2 points 1 year ago* (last edited 1 year ago)

I'm not a person who'd be loyal to a brand. Yet Motorola consistently produces devices that turn out to be the best trade-offs (price to functionality) for me. And, so far, all these devices were pretty durable as well, though it's not that I really put smartphones into lots of use. That's all I can say.