If you are hosting public instances that send emails out, you'll probably want to pay for a transactional email provider like SendGrid, as you've flagged. Sending large amounts of email yourself while maintaining a high deliverability rating is doable, but it will often result in more headache than cost savings.
I care less about what it is running on than about what is being consumed. At sub-20% usage, it really doesn't matter what the hardware is, because the overall spec is not the bottleneck.
They've bumped the server well beyond the originally posted VM. I was pointing to the Zabbix charts and actual usage. Notice the CPU is sub 20% and network usage is sub 200 Mbit/s. There's plenty of headroom.
If you look here: https://lemmy.world/comment/65982
At least spec- and capacity-wise, it doesn't suggest it is hitting a wall.
The more I dug into things, the more I think the limitation comes from an age-old issue: if your service is expected to connect to a lot of flaky destinations, you're not going to have a good time. The big instances' backends are trying to send federation event messages, but a bunch of the smaller federated destinations have shuttered (since they weren't getting all the messages, their users just go and sign up on the big instances to see everything). As a result, the big instances' outgoing connections have to wait for a timeout and/or discover that the recipient is no longer available, which results in a backed-up queue of messages to send out.
When I posted a reply to myself on lemmy.world, it took 17 seconds to reach my instance (hosted in a data centre with sub-200ms ping to lemmy.world itself, so this isn't a network latency issue), which exceeds the 10-second limit defined by Lemmy. Increasing that limit at the application/protocol level won't help: as more small instances come up, they too will want to subscribe to the big hubs, which will only further exacerbate the lag.
I think the current implementation is fairly naive: it can scale a bit, but will likely become insufficient as the fediverse grows, not as an individual instance's user count grows. That is, the bottleneck will not so much be "this can support an instance of up to 100K users" but rather "now that there are 100K users, we also have 50K servers trying to federate with us". And to work around that, you're going to need a lot more than Postgres horizontal scaling... you'd need message buses and workers that can ensure jobs (i.e., outbound federation events) are delivered effectively.
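To illustrate the idea (this is a hypothetical sketch, not how Lemmy actually works internally; the names `FederationSender`, `enqueue`, and the `.dead` hostname convention are all made up for the demo), one mitigation is a per-destination outbound queue: a dead or slow peer then only blocks its own worker, instead of stalling delivery to every other instance.

```python
import queue
import threading
import time

# Hypothetical sketch: one delivery queue + worker thread per destination
# instance, so a timing-out peer cannot back up deliveries to healthy peers.
class FederationSender:
    def __init__(self, timeout=0.5):
        self.timeout = timeout
        self.queues = {}       # destination host -> Queue of activities
        self.delivered = []    # record of successful sends (for the demo)
        self.lock = threading.Lock()

    def _worker(self, dest, q):
        while True:
            activity = q.get()
            try:
                self._send(dest, activity)
            except TimeoutError:
                # A real system would back off, retry later, and eventually
                # mark the instance dead instead of retrying forever.
                pass

    def _send(self, dest, activity):
        if dest.endswith(".dead"):
            time.sleep(self.timeout)   # simulate an unreachable peer
            raise TimeoutError(dest)
        with self.lock:
            self.delivered.append((dest, activity))

    def enqueue(self, dest, activity):
        if dest not in self.queues:
            q = queue.Queue()
            self.queues[dest] = q
            threading.Thread(target=self._worker, args=(dest, q),
                             daemon=True).start()
        self.queues[dest].put(activity)

sender = FederationSender(timeout=0.5)
for i in range(3):
    sender.enqueue("small.example", f"comment-{i}")
    sender.enqueue("gone.dead", f"comment-{i}")   # flaky peer: always times out
time.sleep(1.0)  # let the workers drain
print(len(sender.delivered))  # the healthy peer received all 3 activities
```

With a single shared queue, the three sends to the dead peer would each burn a full timeout before the healthy peer saw anything; partitioning by destination keeps the lag isolated.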
I think it is less about pointing fingers at who's to blame, and more about seeing if there are things we can do to resolve/alleviate it.
I recall reading somewhere that @Ruud@lemmy.world mentioned the server has already been scaled all the way up to a fairly beefy dedicated server, so perhaps it is soon time to scale this service horizontally across multiple servers. If nothing else, I think a lot of value could be gained by moving the database to a separate server from the UI/backend server as a first step, which shouldn't take too much effort (other than recurring $ and a bit of admin time) even with the current Lemmy code base/deployment workflow...
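For what that first step might look like: assuming the standard `lemmy.hjson` layout (the `database` block is documented in the Lemmy admin docs, but the host/credentials below are placeholders), it's roughly just pointing the config at the remote Postgres box:

```
{
  database: {
    # Placeholder address of the dedicated DB server, instead of a
    # Postgres container running alongside the backend.
    host: "10.0.0.2"
    port: 5432
    user: "lemmy"
    password: "change-me"
    database: "lemmy"
  }
}
```

The backend and UI containers stay where they are; only the database connection target changes, plus firewalling so that only the backend can reach Postgres.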
Yeah, there are ways around the de-sync (albeit super manual) for now, so I’m just waiting and seeing for the time being :)
I’m not sure what you mean by pull instances. I’m really brand spanking new at this. Sorry!
All by myself. Plenty of room for activity. We'll see if it catches up or just ends up creating a larger divergence! And yeah, I do have an account on lemmy.world as well, so it's just an extra song and dance for now.
I'm also seeing this issue on the two instances you've mentioned. I'm not sure if it is just an overload issue, or if there's a more fundamental issue with the way I've set things up. One way around it: if I see a comment I really want to interact with from outside my own instance, I can copy the link from the fediverse icon and then search for it. The comment (along with its parents) will then pop up on my instance eventually. Not ideal, as I'd still have to venture out of my own instance to discover the comment chain in the first place, but at least it provides a way to interact, for now.
Amen to that. I totally hear you. There are SO many things I think could be done better. I just hope the dev team is ready to embrace the spotlight, and keep up with all the demands without burning out!
So I tried to make a post, but it is not showing up. I'm also not seeing other comments on my own instance. I think there might still be some kinks I need to work out with the Cloudflare piece. In any case, you can see the write-up I've posted (supposedly in this !selfhosted@lemmy.world community) that's not showing up here: https://lemmy.chiisana.net/post/264
Edit:
It would appear to be up now: https://lemmy.world/post/299429
However, comments seem to be desynced; I don't see comments on my own instance but do see them on lemmy.world. Also, the post is somehow only viewable under New, despite apparently having gained some upvotes (who'd have thunk that people cared about my ranty adventure :) ). This goes to show there's still a lot to hash out. I'll update as I figure more things out.
They were having build issues; you can use
dessalines/lemmy:0.17-linux-arm64
and dessalines/lemmy-ui:0.17-linux-arm64
for now to get 0.17.3 up and running. There aren't many changes between .3 and .4 anyway, so you'll be fine. I think I saw them pushing 0.18 beta/rc earlier today, but I don't think I've noticed an arm build just yet.
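If it helps, pinning those tags is just a matter of swapping the image lines in the compose file. This is a sketch of only the relevant fields, assuming a typical Lemmy docker-compose layout; everything else (ports, volumes, environment) is omitted:

```yaml
services:
  lemmy:
    # Pin the arm64 build instead of the broken/latest tag
    image: dessalines/lemmy:0.17-linux-arm64
  lemmy-ui:
    image: dessalines/lemmy-ui:0.17-linux-arm64
```

Then a `docker compose pull && docker compose up -d` should bring the 0.17.3 images up.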