this post was submitted on 17 Mar 2026
322 points (97.9% liked)


Excerpt:

"Even within the coding, it's not working well," said Smiley. "I'll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven't engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence."

Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.
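As a rough illustration of how the delivery metrics listed above could be computed, here is a minimal sketch in Python. The `Deployment` record, its field names, and `delivery_metrics` are hypothetical and not taken from the article. The sketch also assumes a failure begins at deploy time, which simplifies how time to restore is usually measured (from incident start).

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record of one production deployment.
    committed_at: datetime
    deployed_at: datetime
    caused_failure: bool = False
    restored_at: Optional[datetime] = None  # set once a failure is resolved


def delivery_metrics(deployments, window_days):
    """Summarize deployments observed over a window of `window_days` days."""
    n = len(deployments)
    failures = [d for d in deployments if d.caused_failure]

    # Lead time: commit to production, in hours.
    lead_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deployments
    ]
    # Time to restore: assumed here to run from the failing deploy to the fix.
    restore_hours = [
        (d.restored_at - d.deployed_at).total_seconds() / 3600
        for d in failures
        if d.restored_at is not None
    ]

    return {
        "deployments_per_day": n / window_days,
        "mean_lead_time_hours": mean(lead_hours) if lead_hours else None,
        "change_failure_rate": len(failures) / n if n else None,
        "mean_time_to_restore_hours": mean(restore_hours) if restore_hours else None,
    }
```

Comparing these numbers before and after a team adopts AI tooling is one way to look at outcomes rather than lines of code, though the sketch says nothing about incident severity.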

"We don't know what those are yet," he said.

One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That's the kind of thing that needs to be assessed to determine whether AI helps an organization's engineering practice.
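The tokens-per-approved-pull-request idea can be sketched in a few lines of Python. The `PullRequest` record and the function name are hypothetical, and the sketch assumes that tokens spent on rejected pull requests still count toward the cost, since they were burned on the way to the approved ones.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    # Hypothetical record: tokens the AI tooling consumed while producing this PR.
    tokens_used: int
    approved: bool


def tokens_per_approved_pr(pull_requests):
    """Total tokens burned (approved or not) divided by the number of approved PRs.

    Returns None when nothing was approved, because the ratio is undefined.
    """
    total_tokens = sum(pr.tokens_used for pr in pull_requests)
    approved_count = sum(1 for pr in pull_requests if pr.approved)
    if approved_count == 0:
        return None
    return total_tokens / approved_count
```

Leaving rejected work in the numerator is a deliberate choice: a tool that produces many discarded attempts looks more expensive under this measure than one that gets approved on the first try.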

To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.

"It passed all the unit tests, the shape of the code looks right," he said. "It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless."

All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.

"Coding works if you measure lines of code and pull requests," he said. "Coding does not work if you measure quality and team performance. There's no evidence to suggest that that's moving in a positive direction."

[–] Malgas@beehaw.org 4 points 1 hour ago

This feels like an exercise in Goodhart's Law: Any measure that becomes a target ceases to be a useful measure.

[–] jimmux@programming.dev 19 points 3 hours ago

We never figured out good software productivity metrics, and now we're supposed to come up with AI effectiveness metrics? Good luck with that.

[–] devtoolkit_api@discuss.tchncs.de 6 points 2 hours ago (1 children)

Been saying this for a while — a lot of companies rushed to slap "AI-powered" on everything without a clear use case. Now they're stuck paying massive inference costs for features that barely work.

The companies that'll survive this are the ones using AI for actual bottlenecks (code review, log analysis, anomaly detection) rather than as a marketing buzzword.

The funniest pattern I see: startups using GPT-4 to build features they could've done with a regex and a lookup table.

[–] Rooster326@programming.dev 3 points 2 hours ago* (last edited 1 hour ago) (1 children)

That is of course assuming these companies are slapping AI in their "AI-powered" apps

I can speak for my own employer and all we did when we slapped that sticker on the box - was - slap a sticker on the box. We didn't do anything but it sure made the stockholders happy.

Ha, yeah that's the most honest version of 'AI-powered' I've heard. At least you're not pretending a basic filter is machine learning. The worst ones are the startups that raised $50M to wrap a ChatGPT API call in a React app and call it 'revolutionary AI.'

[–] DickFiasco@sh.itjust.works 44 points 5 hours ago (3 children)

AI is a solution in search of a problem. Why else would there be consultants to "help shepherd organizations towards an AI strategy"? Companies are looking to use AI out of fear of missing out, not because they need it.

[–] ultimate_worrier@lemmy.dbzer0.com 20 points 5 hours ago* (last edited 5 hours ago)

Exactly. I’ve heard the phrase “falling behind” from many in upper management.

[–] Thorry@feddit.org 57 points 6 hours ago (2 children)

Yeah these newer systems are crazy. The agent spawns a dozen subagents that all do some figuring out on the code base and the user request. Then those results get collated, then passed along to a new set of subagents that make the actual changes. Then there are agents that check stuff and tell the subagents to redo stuff or make changes. And then it gets a final check like unit tests, compilation etc. And then it's marked as done for the user. The amount of tokens this burns is crazy, but it gets them better results in the benchmarks, so it gets marketed as an improvement. In reality it's still fucking up all the damned time.

Coding with AI is like coding with a junior dev, who didn't pay attention in school, is high right now, doesn't learn and only listens half of the time. It fools people into thinking it's better, because it shits out code super fast. But the cognitive load is actually higher, because checking the code is much harder than coming up with it yourself. It's slower by far. If you are actually going faster, the quality is lacking.

[–] chunkystyles@sopuli.xyz 1 points 2 minutes ago

This is very different from my experience, but I've purposely lagged behind in adoption and I often do things the slow way because I like programming and I don't want to get too lazy and dependent.

I just recently started using Claude Code CLI. With how I use it (asking it specific questions and often telling it exactly what files and lines to analyze), it feels more like talking to an extremely knowledgeable programmer who has very narrow context and often makes short-sighted decisions.

I find it super helpful in troubleshooting. But it also feels like a trap, because I can feel it gaining my trust and I know better than to trust it.

[–] Flames5123@sh.itjust.works 10 points 4 hours ago

I code with AI a good bit for a side project since I need to use my work AI and get my stats up to show management that I’m using it. The “impressive” thing is learning new software and how to use it quickly in your environment. When setting up my homelab with automatic git pull, it quickly gave me some commands and showed me what to add in my docker container.

Correcting issues is exactly like coding with a high junior dev though. The code bloat is real and I’m going to attempt to use agentic AI to consolidate it in the future. I don’t believe you can really “vibe code” unless you already know how to code though. Stating the exact structures and organization and whatnot is vital for agentic AI programming semi-complex systems.

[–] luciole@beehaw.org 18 points 5 hours ago (1 children)

This is all fine and dandy but the whole article is based on an interview with "Dorian Smiley, co-founder and CTO of AI advisory service Codestrap". Codestrap is a Palantir service provider, and as you'd expect Smiley is a Palantir shill.

The article hits different considering it's more or less a world devourer zealot taking a jab at competing world devourers. The reporter is an unsuspecting proxy at best.

[–] calliope@piefed.blahaj.zone 4 points 4 hours ago* (last edited 4 hours ago)

People will upvote anything if it takes a shot at AI. Even when the subtitle itself is literally an ad.

Codestrap founders say we need to dial down the hype and sort through the mess

The cult mentality is really interesting to watch.

[–] CubitOom@infosec.pub 39 points 6 hours ago

Generative models, which many people call "AI", have a much higher catastrophic failure rate than we have been led to believe. They cannot actually be used to replace humans, just as an inanimate object can't replace a parent.

Jobs aren't threatened by generative models. Jobs are threatened by a credit crunch due to high interest rates and a lack of lenders being able to adapt.

"AI" is a ruse, a useful excuse that helps make people want to invest, investors & economists OK with record job loss, and the general public more susceptible to data harvesting and surveillance.

[–] gravitas_deficiency@sh.itjust.works 30 points 6 hours ago (1 children)

Lmfao

Deeks said "One of our friends is an SVP of one of the largest insurers in the country and he told us point blank that this is a very real problem and he does not know why people are not talking about it more."

Maybe because way too many people are making way too much money and it underpins something like 30% of the economy at this point and everyone just keeps smiling and nodding, and they’re going to keep doing that until we drive straight off the fucking cliff 🤪

[–] AnUnusualRelic@lemmy.world 8 points 4 hours ago (1 children)

But who's making money? All the AI corps are losing billions, only the hardware vendors are making bank.

Makers of AI lose money and users of AI probably also lose since all they get is shit output that requires more work.

[–] gravitas_deficiency@sh.itjust.works 7 points 4 hours ago (1 children)
[–] pinball_wizard@lemmy.zip 3 points 2 hours ago

Investors

Specifically suckers. Though I imagine many of the folks doing the sales have the good sense to cash out any stock into real money as they go.

[–] btsax@reddthat.com 7 points 4 hours ago* (last edited 4 hours ago)

These are starting to feel like those headlines "this is finally the last straw for Trump!" I've been seeing since 2015

[–] turbofan211@lemmy.world 23 points 7 hours ago (35 children)

So is this just early adoption problems? Or are we starting to find the ceiling for AI?

[–] riskable@programming.dev 65 points 6 hours ago (4 children)

The "ceiling" is the fact that no matter how fast AI can write code, it still needs to be reviewed by humans. Even if it passes the tests.

As much as everyone thinks they can take the human review step out of the process with testing, AI still fucks up enough that it's a bad idea. We'll be in this state until actually intelligent AI comes along. Some evolution of machine learning beyond LLMs.

[–] dadarobot@lemmy.ml 11 points 4 hours ago

something i keep thinking about: is the electricity and water usage actually cheaper than a human? i feel like once the vc money dries up the whole thing will be incredibly unsustainable.

[–] otacon239@lemmy.world 55 points 6 hours ago (2 children)

We just need another billion parameters bro. Surely if we just gave the LLMs another billion parameters it would solve the problem…

[–] Thorry@feddit.org 36 points 6 hours ago (1 children)
[–] raman_klogius@ani.social 9 points 5 hours ago* (last edited 5 hours ago)

That's actually three 0s too short, at the very least

[–] PancakesCantKillMe@lemmy.world 27 points 6 hours ago

One smoldering Earth later….

[–] saltesc@lemmy.world 14 points 5 hours ago* (last edited 5 hours ago) (1 children)

We'll be in this state until actually intelligent AI comes along. Some evolution of machine learning beyond LLMs.

Yep. The methodology of LLMs is effectively an evolution of Markov chains. If someone hadn't recently changed the definition of AI to include "the illusion of intelligence" we wouldn't be calling this AI. It's just algorithmic with a few extra steps to try to keep the algorithm on-topic.

These types of things, we have all the time in generative algorithms. I think LLMs being more publicly seen is why someone started calling it AI now.

So we've basically hit the ceiling straight out of the gate and progress is not quicker or slower. We'll have another step forward in predictive algorithms in the future, but not now. It's usually a once a decade thing and varies in advancement.

[–] Jesus_666@lemmy.world 2 points 3 hours ago

Of course LISP machines didn't crash the hardware market and make up 50 % of the entire economy. Other than that it's, as Shirley Bassey put it, all just a little bit of history repeating.

[–] Technus@lemmy.zip 14 points 6 hours ago (4 children)

I realized the fundamental limitation of the current generation of AI: it's not afraid of fucking up. The fear of losing your job is a powerful source of motivation to actually get things right the first time.

And this isn't meant to glorify toxic working environments or anything like that; even in the most open and collaborative team that never tries to place blame on anyone, in general, no one likes fucking up.

So you double check your work, you try to be reasonably confident in your answers, and you make sure your code actually does what it's supposed to do. You take responsibility for your work, maybe even take pride in it.

Even now we're still having to lean on that, but we're putting all the responsibility and blame on the shoulders of the gatekeeper, not the creator. We're shooting a gun at a bulletproof vest and going "look, it's completely safe!"

[–] Feyd@programming.dev 8 points 5 hours ago

fear of losing your job is a powerful source of motivation

I just feel good when things I make are good so I try to make them good. Fear is a terrible motivator for quality

[–] CheeseNoodle@lemmy.world 23 points 6 hours ago (1 children)

It's early adoption problems in the same way as putting radium in toothpaste was. There are legitimate, already growing uses for various AI systems, but as the technology is still new there's a bunch of people just trying to put it in everything, which is inevitably a lot of places where it will never be good (at least not until it gets much better in a way that LLMs fundamentally never can be, due to the underlying method by which they work).

[–] grimpy@lemmy.myserv.one 1 points 2 hours ago

bright white teeth are highly overrated, glow in the dark teeth, well…wouldn’t a cheap little night light work even better than a radioactive mouth?
