Amazon cloud boss echoes NVIDIA CEO on coding being dead in the water: "If you go forward 24 months from now, it's possible that most developers are not coding" (www.windowscentral.com)

submitted 2 years ago by floofloof@lemmy.ca to c/technology@lemmy.world

skibidi@lemmy.world 8 points 2 years ago

An inherent flaw in the transformer architecture (what nearly all current LLMs use under the hood) is that the cost of self-attention grows quadratically with context length. With standard attention, the model needs roughly 4 times as much memory for the attention scores over its last 1000 tokens as it needed for the last 500. When coding anything complex, the amount of code one has to consider quickly grows beyond these limits. At least, if you want it to work.
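To put rough numbers on the quadratic growth, here is a minimal Python sketch. It assumes naive attention that stores one full tokens-by-tokens score matrix per head, at 2 bytes per score (16-bit floats). The function name `attention_score_bytes` and its parameters are made up for illustration, and real implementations may avoid materializing the whole matrix, so treat this as back-of-the-envelope arithmetic rather than a measurement of any particular model.

```python
def attention_score_bytes(context_tokens: int,
                          bytes_per_score: int = 2,
                          heads: int = 1) -> int:
    """Bytes needed to hold one full (tokens x tokens) score matrix per head.

    Assumption: naive attention that materializes every pairwise score.
    """
    return heads * context_tokens * context_tokens * bytes_per_score


# Doubling the context quadruples the score-matrix size.
for tokens in (500, 1000, 2000, 4000):
    print(f"{tokens:>5} tokens -> {attention_score_bytes(tokens):>12,} bytes")

ratio = attention_score_bytes(1000) / attention_score_bytes(500)
print(f"1000 vs 500 tokens: {ratio:.0f}x")
```

Under these assumptions, every doubling of the context multiplies the score-matrix size by four, which is the 500 vs 1000 token comparison above.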

This is a fundamental flaw with transformer-based LLMs, an inherent limit on the complexity of task they can 'understand'. It isn't feasible to just keep throwing memory at the problem; a fundamental change in the underlying model structure is required. This is a subject of intense research, but nothing has emerged yet.

Transformers themselves were old hat and well studied long before these models broke into the mainstream with DALL-E and ChatGPT.