
Ocasio-Cortez and Sanders introduce AI Data Center Moratorium Act (www.datacenterdynamics.com)

submitted 1 day ago by sanitation@lemmy.today to c/technology@lemmy.world



[–] iamthetot@piefed.ca 5 points 1 day ago (1 children)

GenAI as it currently stands is a fancy text predictor. You ever had your phone suggest the next word in a message you're typing? It's that, on crack.

When you really wrap your head around the fact that that is all it's doing, it loses a lot of its appeal imho. Especially for the cost to do so.

[–] Repelle@lemmy.world 2 points 18 hours ago* (last edited 18 hours ago) (1 children)

To be more specific (for anyone interested), phone-style next-word predictors are usually a type of model called an LSTM (at least I think that’s the most common). This model type has been used for a long time for dealing with sequential data. In 2014 there was a famous paper introducing an attention mechanism. This was a rather brilliant, though relatively minor, extension to how LSTMs work. Essentially, at each step an LSTM generates some data representing the model’s knowledge of the sequence up to that point. The attention mechanism looks back at these intermediate values, determines how relevant each state is to the current point in the sequence, and pulls in the most relevant bits. This vastly improved the memory of the LSTM over longer sequences.
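For anyone who prefers code, here is a minimal sketch of that look-back step. It is an illustration under assumptions, not the method from the 2014 paper: it scores each stored state with a plain dot product (the original work used a small learned scoring network), and the names `attend`, `states`, and `query` are made up for this example.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, states):
    """Weigh every stored state by its relevance to the query.

    Assumption: relevance is a dot product, a simplification chosen
    for readability rather than fidelity to any particular paper.
    """
    scores = [dot(query, s) for s in states]
    weights = softmax(scores)  # relevance of each earlier state, sums to 1
    dim = len(states[0])
    # The context vector is the relevance-weighted average of the states.
    context = [sum(w * s[i] for w, s in zip(weights, states)) for i in range(dim)]
    return weights, context

# Hypothetical intermediate states saved after each step of the sequence.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical vector describing what the model needs at the current step.
query = [0.0, 2.0]

weights, context = attend(query, states)
print(weights)
print(context)
```

The states that line up with the query get the largest weights, which is the "pulls in the most relevant bits" part.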

In 2017 there was another famous paper, “Attention Is All You Need”, which said something to the effect of “the attention mechanism is doing all the work; we don’t need the rest of the LSTM, and we can replace it by running attention between all pairs of points in the sequence.” It’s actually significantly slower to run as the sequence grows, because every point is compared with every other point, but much, much faster to train because it’s not intrinsically sequential. This is the transformer model that’s the basis of all our LLMs.
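A rough sketch of the "attention between all pairs" idea, again heavily simplified. Real transformers apply learned query, key, and value projections and use multiple heads; this sketch assumes none of that and uses the raw input vectors directly. The function name `self_attention` and the toy inputs are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(xs):
    """Let every position attend to every position in the sequence.

    Assumption: no learned projections, so queries, keys, and values
    are all just the input vectors themselves.
    """
    scale = math.sqrt(len(xs[0]))  # scaling by sqrt(dimension)
    all_weights, outputs = [], []
    for q in xs:  # each row is independent of the others, so rows can run in parallel
        scores = [dot(q, k) / scale for k in xs]  # n scores per position
        w = softmax(scores)
        all_weights.append(w)
        outputs.append([sum(wi * v[i] for wi, v in zip(w, xs)) for i in range(len(q))])
    return all_weights, outputs

# Hypothetical embeddings for a three-token sequence.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
all_weights, outputs = self_attention(xs)

# n positions each score n positions: n * n comparisons in total.
num_scores = sum(len(row) for row in all_weights)
print(num_scores)
```

No row depends on the result of another row, which is why training parallelizes so well, while the n * n score table is why the cost climbs quickly with sequence length.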

Obviously some massive simplifications here, but despite being fairly anti-AI, I do love the engineering behind it. So yeah, pretty literally a fancy text predictor, but it turns out when you throw all the compute you can muster at a fancy word predictor it makes the world go crazy.

[–] michaelalf@lemmy.world 1 points 14 hours ago

Thanks for this explanation, it finally clicked for me.