archive.org is a treasure
It looks like they misunderstand how to improve their SEO ranking
In fact, on Tuesday, Google's SearchLiaison X account tweeted, "Are you deleting content from your site because you somehow believe Google doesn't like 'old' content? That's not a thing! Our guidance doesn't encourage this. Older content can still be helpful, too. Learn more about creating helpful content."
They really don't. They're going to hurt their domain authority and back links.
It's more valuable to make an update to past pages because Google sees it as useful content that is being maintained.
You're supposed to make tweaks once a year so it's not stale, not nuke yourself.
TBH this doesn’t make me certain this tactic won’t work. Google hardly seems to know how its own search ranking works. They sorta intentionally do this so they can blame anything suspicious on their black box, “AI”.
Yeah I own an SEO business and that's not how any of this works
This is the best summary I could come up with:
"Unfortunately, we are penalized by the modern Internet for leaving all previously published content live on our site," Taylor Canada, CNET’s senior director of marketing and communications, told Gizmodo.
Proponents of SEO techniques believe that a higher rank in Google search results can significantly affect visitor count, product sales, or ad revenue.
However, before deleting an article, CNET reportedly maintains a local copy, sends the story to The Internet Archive's Wayback Machine, and notifies any currently employed authors that might be affected at least 10 days in advance.
It is perhaps another sign of how bad things have become with Google's search results—full of algorithmically generated junk sites—that publications like CNET are driven to such extremes to stay above the sea of noise.
From time immemorial, the protection of historical content has required making many copies without authorization, regardless of the cultural or business forces at play, and that has not changed with the Internet.
Archivists operate in a parallel IP universe, borrowing scraps of reality and keeping them safe until shortsighted business decisions and copyright protectionism die down.
I'm a bot and I'm open source!
Good bot ☺️
Unfortunately, we are penalized by the modern Internet for leaving all previously published content live on our site
Even if this is true, which I doubt, why not edit your robots.txt to disallow crawlers from the old content and leave it up?
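For anyone curious what that would look like, here is a minimal sketch using Python's standard `urllib.robotparser`. The `/archive/` path and the example.com URLs are made up for illustration, not CNET's actual layout. Also worth noting: a `Disallow` rule only asks crawlers not to fetch a page. It is not a guaranteed way to keep a URL out of search results.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the /archive/ path is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Old content under /archive/ is off limits to crawlers but still online.
print(can_crawl(ROBOTS_TXT, "https://example.com/archive/2001-fridge-review"))
# Everything else stays crawlable.
print(can_crawl(ROBOTS_TXT, "https://example.com/reviews/2023-fridge"))
```

The pages stay reachable for anyone with a direct link. Only well-behaved crawlers are told to skip them.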
Because then they are hosting content that actively won't show up in any search?
Hosting web content (especially with lots of images) costs money. Historically, sites like CNET were built around ad revenue. It is worth keeping a review of a 2001 GE Refrigerator online if it comes up when people search for the model number (generally to figure out specs to shop for a replacement). It is not worth it if that review shows up when people want to buy a 2023 GE Refrigerator, or if Google actively penalizes you for having too much "old" content and so forth.
And if it won't generate any ad revenue going forward? baleeted!
However, before deleting an article, CNET reportedly maintains a local copy, sends the story to The Internet Archive’s Wayback Machine, and notifies any currently employed authors that might be affected at least 10 days in advance.
People are freaking out so bad about this story. They're doing the right thing and archiving it before deletion. Settle down.
How many CNET articles from 2004 are you reading that you're getting this angry about it?
Storage and bandwidth have never been cheaper. If you're not doing some grand replacement of the CMS, it's less effort NOT to remove old content.
I love the argument they're trying to make: if they prune enough content, everything looks fresh and new. So you're effectively discarding one of the most valuable assets you have-- the fact you've been doing the same thing for 25 years and have some established credibility-- for a perception of "fast" that could be imitated by any number of content mills or AI services.
If you're looking at a review of an RTX 4090, it says a lot when the same site also scored the Radeon VII, GeForce 3 Ti, and S3 Savage.
Jesus. I long for the day we get rid of these cancerous companies that ruin the internet a little more with every day that passes.
The Internet Archive probably still has it, and fuck them for wanting to appeal to fucking Google search.
However, before deleting an article, CNET reportedly maintains a local copy, sends the story to The Internet Archive's Wayback Machine, and notifies any currently employed authors that might be affected at least 10 days in advance
From the article, CNET is archiving it on Wayback themselves.
Good.
Money ruins everything.
All of the geocities websites I used to go to proved that the internet wasn't forever. Did anyone really think it was?
It's fairly silly that this course of action is the consequence of a desire to manipulate search engine results, but at least they're archiving the articles before taking them down.
To address the headline, though, I don't think that anybody reputable ever seriously claimed that the internet was forever in a literal sense - we've been dealing with ephemerality and issues like link rot from the beginning.
It was only ever commonplace to say the internet was forever in the sense that fully retracting anything once posted could range from difficult to impossible after it'd been shared a few times.
Only in the modern era dominated by corporations offering a platform in perpetuity have we been afforded even the illusion of dependable permanence, and honestly I'm much more comfortable with the notion of less widely distributed content being able to entropy out of existence than a permanent record for everything ever made public.