this post was submitted on 14 Mar 2025

291 points (95.0% liked)

Oops, something went wrong! (i.imgflip.com)

submitted 2 days ago* (last edited 1 day ago) by perishthethought@lemm.ee to c/mildlyinfuriating@lemmy.world

This is a rant about how so many apps on many different platforms (TVs, mobile devices, computers, etc...) have decided to not actually show detailed errors any more. Instead, we get something along the lines of:

Oops, something went wrong. Please try again later.

... and then, well, we get to figure out what just happened and what in the world we need to do about it. And good luck with that, since you have no idea what just failed.

Why, software developers?!? Why have you forsaken us?

EDIT 24 hours later: I feel like I need to clarify a few things:

I've worked for 8 software companies over 30+ years. I know why putting a DB error into the message users see is a bad idea. I know that makes me uncommon, but I still want more info from these messages.

You all are answering as if there are only two ways this can work: (a) what we have now (which is useless), and (b) a detailed error listing showing a full stack trace. I think the developers could meet me half-way.

What I want is either (a) "Something went wrong on the server, you can't fix it, but we will" or (b) "Something on your end didn't work. Check your network or restart the app or do something differently and then try the same thing again". And if they're blocking me because I'm using a VPN, fucking say so (but that's a whole separate thing...)

Some apps do provide enough info so I have a clue what I should do next, and I appreciate the effort they put into helping me. I think what I am really ranting about is I want more developers to take the time to do this instead of reporting all errors with "Oops, try again". (If the error is in their server, why should I try again?) Give me a hint as to the problem, so I have something to go on.
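To make that middle ground concrete, here is a minimal Python sketch. It is an illustration only: the `Fault` categories, the `classify` rules, and the message wording are all assumptions, and treating "no response at all" as a problem on the user's end is a guess that a real app would want to refine.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Fault(Enum):
    SERVER = "server"    # nothing the user can do
    CLIENT = "client"    # the user can change something and retry
    BLOCKED = "blocked"  # deliberately refused, e.g. a VPN block


@dataclass(frozen=True)
class UserMessage:
    text: str
    retry_helps: bool


# Hypothetical wording; no stack traces or DB errors leak to the user.
MESSAGES = {
    Fault.SERVER: UserMessage(
        "Something went wrong on our server. You can't fix it, but we will.",
        retry_helps=False,
    ),
    Fault.CLIENT: UserMessage(
        "Something on your end didn't work. Check your network or restart "
        "the app, then try the same thing again.",
        retry_helps=True,
    ),
    Fault.BLOCKED: UserMessage(
        "We don't accept connections from VPNs. Turn yours off to continue.",
        retry_helps=False,
    ),
}


def classify(status: Optional[int], vpn_blocked: bool = False) -> Fault:
    """Map an HTTP status (None means no response arrived) to a category."""
    if vpn_blocked:
        return Fault.BLOCKED
    if status is None:
        # Assumption: no response usually means the user's network.
        return Fault.CLIENT
    if status >= 500:
        return Fault.SERVER
    return Fault.CLIENT


def user_message(status: Optional[int], vpn_blocked: bool = False) -> UserMessage:
    return MESSAGES[classify(status, vpn_blocked)]
```

The point of the sketch is that three coarse buckets are enough to tell the user whether trying again is worth it, without exposing internals.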

Cheers y'all. Still love you my techy brothers and sisters.

[–] hperrin@lemmy.ca 3 points 1 day ago* (last edited 1 day ago) (1 children)

What I’m saying is that error messages can be helpful or harmful. Knowing that and how to tell the difference is what makes you an expert. Just firing off any information to the user without thinking about it is what makes you a novice, and will eventually get you fired. We’re talking about systems with millions of daily users. If you cause 2,000 unnecessary support tickets or forum posts every day because you don’t know when to send what information to the user, you won’t get very far in tech.

[–] Cryophilia@lemmy.world 1 points 7 hours ago (1 children)

If you have 2000 daily people getting error messages, your code is garbage rofl

And if your company would rather you avoid those tickets by not giving out error codes, your company is also garbage. Which to be fair, is a lot of tech companies.

[–] hperrin@lemmy.ca 1 points 6 hours ago* (last edited 5 hours ago) (1 children)

I feel like you really don’t understand how big tech works. There’s not some single server running every service perfectly. There are tons of different layers and services running on thousands or hundreds of thousands of hosts.

Let’s say you make a request to something like Facebook. Say you’re liking a post. Here’s what happens:

1. That request goes in through a PoP (point of presence). These are sometimes called edge servers or edge gateways, but at Facebook we called them PoPs. This is a server that’s physically close to you that’s used to terminate the TLS connection. It doesn’t have any user data. Its job is to take your encrypted request, decrypt it, then pass it on to Facebook’s regional data center on their internal network.

2. The request enters a webby. These are usually called frontend servers, but again, at Facebook we called them webbies. This is a server that runs the monolithic Facebook web app. Again, it doesn’t have any user data. Its job is to take your request and orchestrate actions on deeper services to fulfill that request.

3. First it’s going to check a local memory cache server for sitevars. These control system level switches, like AB tests, and whether certain services are brought down. That server returns the sitevars and the webby proceeds, now knowing which logic paths to take.

4. For a like, which is a write request between your user account and a post, it will create two DB entries (you likes post, post liked by you). It needs to first get the data from the caching layer, so it will make two requests to TAO (Facebook’s caching layer), one for your account, and one for the post.

5. TAO runs in the same regional data center, and if it doesn’t have the two data objects cached, it will request them from the regional db shards.

6. These regional db shards also run in the same data center, and they’ll return the data.

7. TAO returns the data back to the webby.

8. The webby (after doing some permission checks, which probably hit TAO again) now creates the two relationships, likes and liked by, referencing the two data objects, you and the post. TAO is a write-through cache, so the webby sends the writes to TAO.

9. TAO now needs to send the requests to the db primary shards, since they are the only ones that can handle writes. Your primary shard and the post’s primary shard are probably in different data centers, so TAO now passes the writes to the regional data centers for each primary shard.

10. A host running TAO in each regional data center for each primary shard now passes the write to each shard.

11. Each primary shard now writes the data to the local disk, and waits for the binary log to be written to the local journal before returning a success message.

12. The success message is passed from the local TAO host back to the original region’s TAO host.

13. When that TAO host gets both requests back successfully, it returns a success back to the webby handling your request.

14. The webby then returns a success to the PoP you’re still connected to.

15. The PoP then returns a success to the client running on your device.

16. The client doesn’t notify you of anything, because it already showed you a filled in like button right after you pressed it.
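The caching part of that flow can be sketched in a few lines of Python. This is a toy model, not TAO's real interface: `Shard`, `WriteThroughCache`, and `like_post` are hypothetical names, there is a single primary instead of one per object, and a failure between the two writes is not rolled back.

```python
class Shard:
    """Stand-in for a db shard: a dict plus a flag to simulate an outage."""

    def __init__(self):
        self.rows = {}
        self.healthy = True

    def read(self, key):
        return self.rows.get(key)

    def write(self, key, value):
        if not self.healthy:
            raise IOError("primary shard unavailable")
        self.rows[key] = value


class WriteThroughCache:
    """Reads fall back to a regional replica; writes go to the primary first."""

    def __init__(self, replica, primary):
        self.cache = {}
        self.replica = replica
        self.primary = primary

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.replica.read(key)  # cache miss: ask the regional shard
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        self.primary.write(key, value)  # raises if the primary is down
        self.cache[key] = value         # only cached once the write succeeded


def like_post(cache, user_id, post_id):
    """Return True only if both relationship writes succeed."""
    user = cache.get(("user", user_id))
    post = cache.get(("post", post_id))
    if user is None or post is None:
        return False
    try:
        cache.put(("likes", user_id, post_id), True)
        cache.put(("liked_by", post_id, user_id), True)
    except IOError:
        return False
    return True
```

Marking the primary as unhealthy (`primary.healthy = False`) makes `like_post` return `False`, which is the kind of failure that no amount of frontend code can hide.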

This was how it worked back in 2013 when I worked there. It probably hasn’t changed a whole lot, but this is also an extremely simplified overview (I didn’t even touch on any load balancing systems). That request will probably hit hundreds of services. Some of them can fail and the request could still succeed. But some are required to succeed for your request to be considered successful, like the db write operations. Something like a hardware failure on your primary db shard’s disk can’t be overcome with better code. Nor can a lightning strike taking out the cable connecting your PoP be overcome with better code.
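The difference between services that may fail and services that must succeed can be shown with a small Python sketch. `handle_request` and the step names are made up for illustration, and real systems would add timeouts and retries on top.

```python
def handle_request(required, optional):
    """Run every step; the request fails only if a required step fails.

    Both arguments map a step name to a zero-argument callable.
    """
    degraded = []
    for name, step in optional.items():
        try:
            step()
        except Exception:
            degraded.append(name)  # note it and carry on
    for name, step in required.items():
        try:
            step()
        except Exception:
            return {"ok": False, "failed": name, "degraded": degraded}
    return {"ok": True, "failed": None, "degraded": degraded}
```

A failing optional step only shows up in `degraded`, while a failing required step (say, the db write) is what ends up as an error on the user's screen.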

These systems are absolutely massive, and there are failures you wouldn’t even think of. When I worked at FB, we had an entire data center go down because the humidity got just high enough that the capacitors in each host’s power supply all failed in a matter of a few minutes. Thousands of users probably got error messages that day, but the automatic failover systems moved all the traffic to a new region and promoted new primary db shards within about ten minutes. The fact that losing an entire data center was mitigated in about ten minutes is actually really impressive. You might think it’s still garbage code, since users got error messages, but I know enough about these systems to be very impressed by that.

If you know a better way to make a system like this that works for billions of users across the planet, you should write a paper and submit it to a local conference. If they approve you for a talk, you can present your designs to an audience there. If the audience is really receptive, your designs could make a big impact in the tech sector. That’s basically what the highest level engineers at these big tech companies do when they design these multi-billion user systems, so it’s definitely possible for you to do it too.

[–] Cryophilia@lemmy.world 1 points 5 hours ago (1 children)

All I'm saying is that the vast majority of "oops" issues happen before step one. Client-side issues. For those, give an error code. All the stuff you talked about, there's little to nothing users can do. And yeah, it could definitely be done better, but it would require abandoning the "ooh shiny new thing" mentality of tech companies. Updates just to boost resumes, deprecation of anything user friendly. It's an endemic cultural problem.

[–] hperrin@lemmy.ca 1 points 5 hours ago

Why do you think the vast majority of these messages come from client side issues? I worked as a Site Reliability Engineer at Facebook. We had data on client side errors too. Crash logs are sent to the servers when a client side error happens. There’s not really one source that constitutes a “vast majority” of these error messages, but I can tell you that the plurality of them come from the caching layer.