this post was submitted on 25 Feb 2024
175 points (82.5% liked)
Technology
I've been doing this for over a year now, starting with GPT in 2022, and there have been massive leaps in quality and effectiveness. (Versions are sneaky: even GPT-4 has changed many times over without people really knowing what's happening behind the scenes.) The problem that remains is the "context window." Claude.ai is over 100k tokens now, I think, but the window still caps how much code a single 'session' can produce. I'm still trying to push every model to its limits, but another big problem in the industry right now is how effectiveness, measured as "perplexity," holds up at a given context length.
https://pbs.twimg.com/media/GHOz6ohXoAEJOom?format=png&name=small
This plot shows that as the window fills up, everything the model produces becomes less accurate and more perplexing overall. How full the window gets is directly proportional to the number of tokens in the code you insert, combined with every token the model generates in the same session.
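To put rough numbers on those two ideas, here is a minimal sketch, not tied to any particular model or API. It assumes the usual definition of perplexity (the exponential of the average negative log-probability per token) and treats the window as one budget shared by the prompt and the output. The function names and the example values are made up for illustration; only the 100k window size comes from the comment above.

```python
import math

def remaining_tokens(window_size, prompt_tokens, generated_tokens):
    """Tokens still free in the window; prompt and output share one budget."""
    return max(window_size - prompt_tokens - generated_tokens, 0)

def perplexity(token_logprobs):
    """exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical session: 100k window, 60k tokens of pasted code, 15k generated.
print(remaining_tokens(100_000, 60_000, 15_000))  # 25000

# Hypothetical per-token probabilities: a confident model vs. an unsure one.
confident = [math.log(0.9)] * 4
unsure = [math.log(0.25)] * 4
print(perplexity(confident))  # roughly 1.11
print(perplexity(unsure))     # roughly 4.0
```

Lower perplexity means the model was less "surprised" by each token, so a curve that climbs as the context grows is the degradation the plot is describing.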
You're right overall that these things will continue to improve, but you still need an engineer to actually make the code function in a particular environment. I just don't get the feeling we'll see an LLM take over that part within the next few years. If it does happen, then every IT worker on earth is effectively useless, along with every desk job known to man, since an LLM would be able to reason about how to automate any task in any language at that point.