OpenAI confirms that AI writing detectors don’t work : technology

[–] cheese_greater@lemmy.world 97 points 2 years ago* (last edited 2 years ago) (3 children)

I would be in trouble if this was a thing. My writing naturally resembles the output of a ChatGPT prompt when I'm not joke answering or shitposting.

[–] Steeve@lemmy.ca 37 points 2 years ago

We found the source

[–] TropicalDingdong@lemmy.world 24 points 2 years ago* (last edited 2 years ago) (2 children)

I would be in trouble if this was a thing. My writing naturally resembles the output of a ChatGPT prompt when I’m not joke answering.

It's not unusual for well-constructed human writing to resemble the output of advanced language models like ChatGPT. After all, language models like GPT-4 are trained on vast amounts of human text, and their main goal is to replicate and generate human-like text based on the patterns they've observed.

/gpt-4

[–] cheese_greater@lemmy.world 11 points 2 years ago* (last edited 2 years ago)

Be me

well-constructed human writing

You guys?! 🤗

[–] BananaOnionJuice@lemmy.dbzer0.com 7 points 2 years ago (3 children)

Do you also need help from a friend to prove you are not a robot?

[–] cheesorist@lemmy.world 64 points 2 years ago (1 children)

they never did, they never will.

[–] stevedidWHAT@lemmy.world 9 points 2 years ago (9 children)

Why tho or are you trying to be vague on purpose

[–] bioemerl@kbin.social 64 points 2 years ago (1 children)

Because you're training a detector on something that is designed to emulate regular language as closely as possible, and human speech has so much incredible variability that it's almost impossible to identify whether something has been written by an AI.

You can maybe detect your typical generic ChatGPT-type outputs, but you can characterize a conversation with ChatGPT or any of the other much better local models (privacy and control are aspects which make them better), and after doing that you can get radically human-seeming outputs that are totally different from anything ChatGPT will output.

In short, given a static block of text it's going to be nearly impossible to detect if it's coming from an AI. It's just too difficult a problem, and even if you solve it, the solution is going to be obsolete the next time someone fine-tunes their own model.

[–] stevedidWHAT@lemmy.world 5 points 2 years ago (2 children)

Yeah this makes a lot of sense considering the vastness of language and its imperfections (English I’m mostly looking at you, ya inbred fuck)

Are there any other detection techniques that you know of? What about forcing AI models to have a signature that is guaranteed to be identifiable, permanent, and unique for each tuning produced? It’d have to be not directly noticeable but easy to calculate in order to prevent any “distractions” for the users.

[–] Grimy@lemmy.world 16 points 2 years ago (2 children)

The output is pure text, so you would have to hide the signature in the response itself. On top of being useless, since most users slightly modify the text after receiving it, it would probably have a negative effect on the quality. It's also insanely complicated to train that kind of behavior into an LLM.

[–] bioemerl@kbin.social 9 points 2 years ago (4 children)

forcing AI models to have a signature that is guaranteed to be identifiable, permanent, and unique for each tuning produced

Either AI remains entirely in the hands of fucks like OpenAI, or this is impossible because the signature is easily removed. AI should be a free common-use tool, not an extension of corporate control.

[–] Eufalconimorph@discuss.tchncs.de 20 points 2 years ago (6 children)

Because AIs are (partly) trained by making AI detectors. If an AI can be distinguished from a natural intelligence, it's not good enough at emulating intelligence. If an AI detector can reliably distinguish AI from humans, the AI companies will use that detector to train their next AI.

[–] ReallyKinda@kbin.social 52 points 2 years ago (12 children)

I know a couple of teachers (college level) who have caught several GPT papers over the summer. It’s a great cheating tool, but as with all cheating in the past you still have to basically learn the material (at least for narrative papers) to proofread GPT output properly. It doesn’t get jargon right, it makes things up, it makes no attempt to adhere to reason when it’s making an argument.

Using translation tools is extra obvious—have a native speaker proof your paper if you attempt to use an AI translator on a paper for credit!!

[–] SpikesOtherDog@ani.social 13 points 2 years ago (1 children)

it makes things up, it makes no attempt to adhere to reason when it’s making an argument.

It hardly understands logic. I'm using it to generate content and it continually asserts information in ways that don't make sense, relates things that aren't connected, and forgets facts that don't flow into the response.

[–] mayonaise_met@feddit.nl 9 points 2 years ago* (last edited 2 years ago) (1 children)

As I understand it as a layman who uses GPT-4 quite a lot to generate code and formulas, it doesn't understand logic at all. Afaik, there is currently no rational process which considers whether what it's about to say makes sense and is correct.

It just sort of bullshits its way to an answer based on whether words seem likely according to its model.

That's why you can point it in the right direction and it will sometimes appear to apply reasoning and correct itself. But you can just as easily point it in the wrong direction and it will do that just as confidently too.

[–] Aceticon@lemmy.world 7 points 2 years ago (1 children)

It has no notion of logic at all.

It roughly works by piecing together sentences based on the probability of the various elements (mainly words but also more complex) being there in various relations to each other, the "probability curves" (not quite probability curves but that's a good enough analog) having been derived from the very large language training sets used to train them (hence LLM - Large Language Model).

This is why you might get things like pieces of argumentation which are internally consistent (or merely familiar segments from actual human posts where people are making an argument) but not consistent with each other: the thing is not building an argument following a logic thread, it's just putting together language tokens in common ways which in its training set were found associated with each other and with language token structures similar to those in your question.
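A toy illustration of the "pick the next word by probability" idea, not how GPT actually works: real LLMs use neural networks over sub-word tokens and much longer contexts, while this sketch only counts which word follows which in a made-up three-sentence corpus. The names `train_bigram` and `generate` and the corpus are invented for the example.

```python
import random
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def generate(counts, start, max_words=8, seed=0):
    """Sample each next word in proportion to how often it followed the previous one."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(max_words - 1):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never followed by anything in the corpus
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

# Made-up corpus, purely for illustration.
corpus = [
    "the detector flags the essay",
    "the essay fools the detector",
    "the model writes the essay",
]
counts = train_bigram(corpus)
print(generate(counts, "the"))
```

Every adjacent pair in the output appeared somewhere in the corpus, so each step looks locally plausible, but nothing checks that the sentence as a whole makes sense.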

[–] Nioxic@lemmy.dbzer0.com 31 points 2 years ago* (last edited 2 years ago) (1 children)

I have to hand in a short report

I wrote parts of it and asked chatgpt for a conclusion.

So I read that, adjusted a few points. Added another couple of points.

Then rewrote it all in my own wording. (Chatgpt gave me 10 lines out of 10 pages)

We are allowed to use chatgpt though. Because we would always have internet access for our job anyway. (Computer science)

[–] TropicalDingdong@lemmy.world 12 points 2 years ago (1 children)

I found out on the last screen of a travel grant application that I needed a cover letter.

I pasted in the requirements for the cover letter and what I had put in my application.

I pasted the results in as the cover letter without review.

I got the travel grant.

[–] Blurrg@lemmy.world 8 points 2 years ago (1 children)

Who reads cover letters? At most they are skimmed over.

[–] TropicalDingdong@lemmy.world 10 points 2 years ago

Exactly. But they still need to exist. That's what ChatGPT is for. Letters, bullshit emails, applications. The shit that's just tedious.

[–] Boddhisatva@lemmy.world 27 points 2 years ago (3 children)

OpenAI discontinued its AI Classifier, which was an experimental tool designed to detect AI-written text. It had an abysmal 26 percent accuracy rate.

If you ask this thing whether or not some given text is AI generated, and it is only right 26% of the time, then I can think of a real quick way to make it 74% accurate.

[–] Leate_Wonceslace@lemmy.dbzer0.com 13 points 2 years ago (3 children)

I feel like this must stem from a misunderstanding of what 26% accuracy means, but for the life of me, I can't figure out what it would be.

[–] dartos@reddthat.com 9 points 2 years ago* (last edited 2 years ago)

Looks like they got that number from this quote from another Ars Technica article: “…OpenAI admitted that its AI Classifier was not "fully reliable," correctly identifying only 26 percent of AI-written text as "likely AI-written" and incorrectly labeling human-written works 9 percent of the time.”

Seems like it mostly wasn’t confident enough to make a judgement, but 26% of the time it correctly detected AI text and 9% of the time it incorrectly identified human text as AI text. It doesn’t tell us how often it labeled AI text as human text or how often it was just unsure.

EDIT: this article https://arstechnica.com/information-technology/2023/07/openai-discontinues-its-ai-writing-detector-due-to-low-rate-of-accuracy/
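A rough sketch of why flipping the answer doesn't get you 74%, using only the two figures quoted above. Two simplifying assumptions that are mine, not the article's: the tool's output is collapsed to flagged / not flagged (the real tool had more labels, including unsure), and the test set is a 50/50 mix of AI and human text. `balanced_accuracy` is a made-up helper name.

```python
def balanced_accuracy(flag_rate_on_ai, flag_rate_on_human):
    """Accuracy on a 50/50 mix of AI and human text when a flag means 'AI'."""
    correct_on_ai = flag_rate_on_ai            # AI text that gets flagged
    correct_on_human = 1 - flag_rate_on_human  # human text left unflagged
    return (correct_on_ai + correct_on_human) / 2

# Figures quoted above: 26% of AI text flagged, 9% of human text flagged.
original = balanced_accuracy(0.26, 0.09)

# "Just flip the answer": everything the tool did NOT flag is now called AI.
flipped = balanced_accuracy(1 - 0.26, 1 - 0.09)

print(f"original: {original:.3f}")
print(f"flipped:  {flipped:.3f}")
```

Under those assumptions the original comes out around 0.585 and the flipped version around 0.415. Flipping does catch 74% of AI text, but it also calls 91% of human text AI, so overall it gets worse.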

[–] doublejay1999@lemmy.world 23 points 2 years ago (1 children)

AI company says their AI is smart, but other companies are selling snake oil.

Gottit

[–] canihasaccount@lemmy.world 22 points 2 years ago (1 children)

They tried training an AI to detect AI, too, and failed

[–] Blackmist@feddit.uk 21 points 2 years ago (2 children)

The only thing AI writing seems to be useful for is wasting real people's time.

[–] itsmaxyd@lemm.ee 10 points 2 years ago

True -

Write points/summary
Have AI expand in many words
Post
Reader uses AI to summarize the post, preferably in points
Profit??

[–] irotsoma@lemmy.world 18 points 2 years ago (1 children)

A lot of these relied on common mistakes that "AI" algorithms make but humans generally don't. As language models are improving, it's harder to detect.

[–] Cethin@lemmy.zip 14 points 2 years ago

They're also likely training on the detector's output. That's why they build detectors. It isn't for the good of other people. It's to improve their assets. A detector is used to discard inputs it knows are written by AI so the model doesn't train on that data, which leads to the model outcompeting the detection AI.

[–] hellothere@sh.itjust.works 17 points 2 years ago (3 children)

Regardless of whether they do or don't, surely it's in the interests of the people making the "AI" to claim that their tool is so good it's indistinguishable from humans?

[–] stevedidWHAT@lemmy.world 12 points 2 years ago (7 children)

Depends if they’re more researchers or a business imo. Scientists generally speaking are very cautious about making shit claims bc if they get called out that’s their career really.

[–] hellothere@sh.itjust.works 6 points 2 years ago* (last edited 2 years ago)

It's literally a marketing blog posted by OpenAI on their site, not a study in a journal.

[+] Shameless@lemmy.world 15 points 2 years ago (1 children)

[deleted]

[–] Turun@feddit.de 15 points 2 years ago (3 children)

Or, because you can't rely on computers to tell you the truth. Which is exactly the issue with LLMs as well.

[–] Absolutemehperson@lemmy.world 12 points 2 years ago

mfw just asking ChatGPT to write an undetectable essay.

Later, losers!

[+] Jargus@lemmy.world 6 points 2 years ago* (last edited 2 years ago) (2 children)

[deleted]

[–] robbotlove@lemmy.world 6 points 2 years ago

this comment could have been written in 2005 and still have been true.
