this post was submitted on 24 Jul 2023
22 points (100.0% liked)

Lemmy Support

I have a tiny little instance that's being absolutely overwhelmed after I connected it to other communities. I've run a script to give me something like 40K posts to hand off to the purge API, but somehow my disk usage keeps growing while this purge is going on. The disk usage is being caused by all the media, but I'm not sure how to nuke media from outside of the instance efficiently. The API calls are kind of slow. I'd rather just issue a direct command to delete the media from existence, but I haven't been able to find where the delete tokens for posts are stored, so I can't rapid-fire the command from within my server (and thus avoid having to stagger my calls to stay under the rate limit).

Can someone help me? I feel like there's something pretty simple I'm overlooking here.

EDIT 1: Running some diagnostics, I learned that 10GB of my disk is media and 10GB is the activity table (Thanks @King@lemm.ee for pointing that out to me)
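For anyone wanting to repeat the media half of that diagnostic, here is a minimal sketch that totals the size of a directory tree. The function name is made up for illustration, and the path you pass in (for example your pict-rs volume) depends entirely on your own setup.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file may have vanished mid-walk (e.g. during a purge).
                continue
    return total
```

Calling `dir_size_bytes("/path/to/pictrs/volume") / 1e9` gives a rough figure in GB; the path here is a placeholder, not a known Lemmy default.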

I am still left wondering how to purge the 10GB of worthless media in a way that doesn't leave everything corrupted. Of course I can just navigate to where it is on disk and delete it, but this feels like a bad idea. My attempt to just run purge API calls has been stymied by rate limiting. Congrats to Lemmy for that, but it really sucks for me, since I need to delete a lot of files.
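The staggering itself can be sketched roughly like this. It is only an illustration: `purge_one` is a hypothetical callable that you would write to wrap whatever purge call your instance exposes, returning `True` on success and `False` when the call is rejected (for example by the rate limiter). Nothing here is Lemmy's actual client API.

```python
import time
from typing import Callable, Iterable, List


def purge_throttled(
    post_ids: Iterable[int],
    purge_one: Callable[[int], bool],  # hypothetical wrapper around the purge call
    min_interval: float = 1.0,         # assumed pause between calls, in seconds
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Purge posts one at a time, backing off on failure.

    Returns the ids that still failed after max_retries attempts.
    """
    failed: List[int] = []
    for post_id in post_ids:
        for attempt in range(max_retries):
            if purge_one(post_id):
                break
            # Exponential backoff: wait longer after each rejected attempt.
            sleep(min_interval * 2 ** attempt)
        else:
            # The loop never hit break, so every attempt was rejected.
            failed.append(post_id)
        # Fixed pause between posts to stay under the rate limit.
        sleep(min_interval)
    return failed
```

The returned list can be fed back into a later run, so a 40K-post purge does not have to succeed in one pass.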

top 8 comments

I'll upvote, that's the best thing I can do for you. I have absolutely no idea how to help you, but maybe with more upvotes, people who do know will see your post!

[–] housepanther@lemmy.goblackcat.com 4 points 1 year ago (1 children)

What table is the culprit? I have a cron job to shut lemmy down at 3:00am every morning and I run a TRUNCATE activity via the psql utility. If I didn't do that, my database size would swell to 50GB or more.
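A rough sketch of what the truncate step in that cron job could look like when driven from Python. The database user and database name (`lemmy` for both) are assumptions, not taken from this thread, and stopping Lemmy beforehand is left to whatever you use to manage the service.

```python
import subprocess
from typing import List


def truncate_activity_cmd(db_user: str = "lemmy", db_name: str = "lemmy") -> List[str]:
    """Build the psql invocation that empties the activity table.

    db_user and db_name are assumed defaults; adjust them to your install.
    """
    return ["psql", "-U", db_user, "-d", db_name, "-c", "TRUNCATE activity;"]


def run_truncate(db_user: str = "lemmy", db_name: str = "lemmy") -> None:
    """Run the truncate and raise CalledProcessError if psql exits non-zero."""
    subprocess.run(truncate_activity_cmd(db_user, db_name), check=True)
```

Passing the command as a list rather than a shell string avoids any quoting surprises around the SQL statement.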

[–] Iteria@sh.itjust.works 2 points 1 year ago (1 children)

That's a good point. I've just been assuming that the media is the issue, but perhaps it's just the pure database 🤔 Does doing a truncate purge the media? If not, wouldn't I just be orphaning all these pictures, etc. that have been downloaded? Also, what about the fallout for your own users? I don't really want to drop the content that was created on the instance itself.

Unfortunately, a truncate does not purge the media. The media is controlled by pict-rs, which has its own database. I cannot speak to the fallout for users because my Lemmy instance is strictly my own. I don't want to get into a situation where I am hosting accounts and have to deal with moderation and abuse. There are a lot of legalities surrounding this and I don't need the headache.

[–] King@lemm.ee 2 points 1 year ago (1 children)

Media isn't federated. The media should just be referenced with a link to the original source.

Normally, the largest use of disk space is the Activity table. It is stored for six months, and only useful for debugging. Below is the Issue, along with SQL commands to check and purge this debugging table. Let us know if this was the issue

https://github.com/LemmyNet/lemmy/issues/3103

[–] Iteria@sh.itjust.works 1 points 1 year ago (1 children)

Media absolutely gets federated. My pictrs folder is 10GB. Another 10GB is the activity table, so I tip my hat to you for finding that. I still have a very significant amount of worthless data on my disk though

[–] King@lemm.ee 0 points 1 year ago (1 children)

Oh my, you are correct. Images are being federated some of the time.

Like most everything else, the intended behavior isn't documented anywhere.

[–] Iteria@sh.itjust.works 1 points 1 year ago

Nope. Because I know I'm going to be doing a complete purge and I know that no one has uploaded any media, I just nuked the folders after being reasonably certain nothing bad would happen. I think that I'm going to end up writing a proper periodic purge script that talks directly to pict-rs, which will be awful for me to do because I know fuck all about Docker, so some experimentation will be necessary.
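As a starting point for that kind of script, here is a hedged sketch that only builds the HTTP request and does not send it. It assumes pict-rs exposes an internal purge endpoint at `/internal/purge?alias=...` guarded by an `X-Api-Token` header, which is how I recall its docs describing it; check that against the pict-rs version you actually run. The base URL `http://pictrs:8080` and the function name are placeholders.

```python
import urllib.parse
import urllib.request


def build_pictrs_purge_request(
    alias: str,
    api_token: str,
    base_url: str = "http://pictrs:8080",  # placeholder; depends on your compose setup
) -> urllib.request.Request:
    """Build (but do not send) a purge request for one pict-rs alias.

    The endpoint path and header name are assumptions to verify against
    the documentation for your pict-rs version.
    """
    query = urllib.parse.urlencode({"alias": alias})
    url = f"{base_url}/internal/purge?{query}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-Api-Token": api_token},
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`, ideally with the same kind of pause between calls that the Lemmy API needs.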