this post was submitted on 02 Oct 2024
134 points (96.5% liked)
Asklemmy
The text of an average book is about 100,000 letters. With a very smart, optimized compression/prediction algorithm (which hopefully is far smaller than 1 GB), it is reasonable to expect a single character to take less than half a byte, so about 50 kB per book (stored without covers, of course). That would mean around 20,000 books in a GB (not quite, since the compression algorithm itself probably takes up quite a few MB), which should be enough for quite some time.
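For anyone who wants to check the arithmetic, here is a rough sketch of that estimate. The function name and the choice of 1 GB = 10^9 bytes are my own assumptions, not part of the comment; the 100,000 letters and half a byte per character are the figures given above.

```python
def books_per_gb(chars_per_book: int, bytes_per_char: float, gb_bytes: int = 10**9):
    """Return (bytes per book, books that fit in one GB).

    Assumes 1 GB = 10**9 bytes and ignores the space taken by the
    decompressor itself.
    """
    book_bytes = chars_per_book * bytes_per_char
    return book_bytes, gb_bytes / book_bytes


# Figures from the comment: 100,000 letters at half a byte each.
book_bytes, count = books_per_gb(100_000, 0.5)
print(f"{book_bytes / 1000:.0f} kB per book, about {count:,.0f} books per GB")
```

Changing either input shows how sensitive the final count is to the assumed book length.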
Even 7zip can compress a large text file to less than 25% of its original size. The installer is less than 2MB. There are even better compression algorithms for text than 7zip though.
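If you want to try this yourself, a minimal sketch using Python's built-in `lzma` module (LZMA is the algorithm family behind 7-Zip's default format) could look like the following. The `compression_ratio` helper and the sample text are made up for illustration, and the sample is deliberately repetitive, so it shrinks far more than a real book would. Treat the number it prints as a demo of the measurement, not as evidence for the 25% figure.

```python
import lzma


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    compressed = lzma.compress(data, preset=9)
    return len(compressed) / len(data)


# Stand-in sample only: repeated text compresses much better than real prose.
sample = ("It was a dark and stormy night; the rain fell in torrents. " * 500).encode("utf-8")
ratio = compression_ratio(sample)
print(f"{len(sample)} bytes compress to {ratio:.1%} of the original size")
```

To get a realistic number, read the bytes of an actual plain-text book and pass them to `compression_ratio` instead of `sample`.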
I'm not sure where you're getting that value. The low end of word count for a novel is 50,000. If we say the average word is only 5 characters, we're looking at a quarter million letters and another 50,000 spaces for a short novel (200-250 pages). Throw in some more for punctuation and formatting, of course. If you're a fan of big epic fantasy/sci-fi you're probably closer to a million words.
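Plugging those word counts into the earlier half-a-byte-per-character guess gives a rough idea of how much the estimate shifts. This is only a sketch: the helper name is hypothetical, it counts one space per word and ignores punctuation and formatting, and it assumes 1 GB = 10^9 bytes.

```python
def chars_from_words(words: int, avg_word_len: int = 5) -> int:
    """Letters plus one space per word; punctuation and formatting are ignored."""
    return words * (avg_word_len + 1)


GB_BYTES = 10**9       # assumption: decimal gigabyte
BYTES_PER_CHAR = 0.5   # the compression guess from the earlier comment

for label, words in [("short novel", 50_000), ("big epic", 1_000_000)]:
    chars = chars_from_words(words)
    book_bytes = chars * BYTES_PER_CHAR
    books = int(GB_BYTES // book_bytes)
    print(f"{label}: {chars:,} chars, {book_bytes / 1000:,.0f} kB, roughly {books:,} per GB")
```

Under these assumptions a short novel comes to 300,000 characters, so even the low end is three times the 100,000-letter figure.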