Why AI detectors make mistakes and can be dangerous

Published On: April 8, 2026

A few days ago, the verdict of a system designed to detect whether content was generated by artificial intelligence surprised the public and raised a disturbing question: are we entering an era in which it is no longer possible to distinguish human writing from what was made by AI?

The tool was consulted about this text: “It was on a gloomy November night when I contemplated the culmination of my efforts. With an anxiety that almost bordered on agony, I gathered around me the instruments of life, so that I could instill a spark of being in the inert thing that lay at my feet. It was already one in the morning; the rain beat sadly against the glass, and my candle was almost consumed when, in the dim, almost extinguished light, I saw the creature’s dull yellow eye open; it breathed with difficulty, and a convulsive movement shook its limbs.”

The most astute readers will have recognized the text: a fragment of Frankenstein, the novel written by the renowned Mary Shelley some 200 years ago. The surprise came when the tool consulted declared it “100% generated by AI.”

In a context in which more and more people are tempted to use AI to write school papers, draft legal documents or write a book, tools such as ZeroGPT or Turnitin emerged with the aim of detecting AI use or plagiarism. These systems gained popularity in educational institutions, publishing houses and law firms, since they can even recognize paraphrased text (identifying that the essence of certain content was copied, even if the wording differs). And although the use of these tools is gradually spreading, they can be dangerous instruments, considering that they are sometimes used to make decisions in trials, universities and other professional settings. “These tools are being asked to solve something that not even humans can solve with certainty. Meanwhile, they are used to make real decisions about writers, students and professionals, without anyone auditing how they work inside,” shares Mariano Cassero, CTO of Finnegans, a technology company that develops ERP software and digital management solutions.

Comments on the X post about the Frankenstein result are also skeptical of these tools, some going so far as to claim that the best remedy for this problem is to stop using them altogether: “Simple solution: stop using ZeroGPT, it is considerably worse at detecting AI-generated prose than I am,” says one user, while another adds: “This just shows that those detectors don’t really understand writing.”

Marcelo de Luca, co-founder of The App Master, a software factory with more than 15 years of experience developing digital solutions, agrees with these views: “In terms of plagiarism detection, the reality is that today there is no reliable technology to detect whether something was generated by AI or not. The available tools have error rates that are unacceptable for any serious legal use.”

Why do AI-content detection tools get it wrong?

To understand the errors these tools make, it is important to know how they work. “The main problem with these tools is that they do not detect whether a text was written by artificial intelligence: they detect whether a text has a statistical distribution of words similar to that produced by language models,” indicates Agustín Raimondi, lawyer and founder of the legal tech Welaw.


Raimondi explains that language models generate text by choosing the most likely next word given the previous context. This system produces texts with predictable words, formal structures and little syntactic variation. The tools measure precisely these patterns, but the problem is that many human texts have those same characteristics.
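The metric behind this kind of pattern-matching is usually perplexity: how predictable each word is given what came before. The toy sketch below illustrates the idea with made-up per-word probabilities; it is not how ZeroGPT, Turnitin or any other named tool actually works.

```python
import math

def perplexity(token_probs):
    """Perplexity is the exponential of the average negative log-probability.
    Low perplexity means every word was highly predictable."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical probabilities a language model might assign to each word.
formulaic_text = [0.9, 0.8, 0.85, 0.9, 0.75]     # very predictable wording
idiosyncratic_text = [0.2, 0.05, 0.4, 0.1, 0.3]  # surprising word choices

print(perplexity(formulaic_text))      # low score: flagged as "AI-like"
print(perplexity(idiosyncratic_text))  # high score: looks "human"
```

The catch the article describes is visible here: formal, well-edited human prose (legal, academic, technical) is also highly predictable, so it scores low and gets flagged.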

De Luca agrees and describes himself as “skeptical of any system that attempts to solve this with a confidence percentage.” He notes that the same patterns AI detection tools look for also appear in human text, especially technical, academic, or highly structured writing.

Cassero adds that the models were trained on enormous volumes of text. As a result, “when they process a fragment they were trained on, they recognize it as familiar, not because they generated it, but because they have already seen it.”


“From a legal point of view, no AI detector is in a position to serve as evidence in a legal or disciplinary process, precisely because of these structural flaws. It has no validated methodology or standardized error rate,” Raimondi states emphatically.

What is the best way to detect if something was generated by AI?

Experts agree that no single method can be trusted to verify whether a given text was generated by AI. They do, however, recommend certain transparency practices that could help with this detection.

First of all, there are “cryptographic watermarks”: detectable but invisible marks embedded in the output that allow its origin to be identified. This is the case in China, for example, where regulation requires watermarking of AI content and the registration of generative models.
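One well-known family of such watermarks works by deterministically biasing the model toward a “green” subset of the vocabulary at each step; a detector then counts how many words fall in their green lists. The sketch below is a toy illustration of that idea under assumed names (`VOCAB`, `green_list`, `watermark_score`), not any vendor's or regulator's actual scheme.

```python
import hashlib
import random

VOCAB = [f"w{i}" for i in range(1000)]  # toy vocabulary stand-in

def green_list(prev_token, fraction=0.5):
    """Deterministically derive a 'green' half of the vocabulary from the
    previous token, so generator and detector agree without sharing the text."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(VOCAB)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def watermark_score(tokens):
    """Fraction of tokens that fall in the green list of their predecessor.
    Unwatermarked text hovers near the 0.5 baseline; watermarked text scores higher."""
    pairs = list(zip(tokens, tokens[1:]))
    hits = sum(1 for prev, tok in pairs if tok in green_list(prev))
    return hits / max(len(pairs), 1)
```

A watermarking sampler would nudge generation toward green tokens; ordinary human text, having no such bias, stays near chance level, which is what makes the mark statistically detectable.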


Other specialists suggest an analysis that combines several tools: measuring perplexity (how predictable the text is), but also burstiness (investigating the rhythm of the writing, since humans typically alternate short and long sentences in unpredictable ways) and semantic coherence (AI is usually very consistent, while people have jumps, ambiguities or contradictions). “What does work is human judgment combined with context: knowing the author, their history, their voice, the inconsistencies between what they produced before and what they do now. It is not an algorithm, it is judgment,” adds Cassero.
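The “burstiness” signal mentioned above can be approximated very simply as the variation in sentence length. This is a minimal sketch of that one heuristic, with naive period-based sentence splitting assumed for brevity:

```python
import re
import statistics

def burstiness(text):
    """Coefficient of variation of sentence lengths (in words).
    Human writing that alternates short and long sentences scores higher;
    uniform, evenly paced prose scores lower."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = "This is a sentence. This is another sentence. This is one more sentence."
varied = ("No. It rained all night and the roof leaked badly. Then silence. "
          "We waited for hours before anyone finally came to help us.")
print(burstiness(uniform) < burstiness(varied))  # uniform rhythm scores lower
```

Real detectors combine several such signals, and, as the experts quoted here stress, even the combination gives a probability, not proof.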

Among other suggestions, they recommend recording in the file’s metadata whether the content was generated by artificial intelligence (metadata is “hidden” data within a file that describes things like who created it, when, with what tool, and on what device). In the European Union, the AI Act, in force since 2024, includes transparency obligations and requires developers to disclose information about their training data. De Luca points out that the best solution is “radical honesty.” He adds that “no technical tool is going to reliably solve this in the short term,” but explains that what does work is “building organizational and professional cultures where declaring the use of AI is the norm, not the exception.”
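Recording provenance can be as simple as writing a small declaration file alongside the document. The sketch below uses a hypothetical JSON “sidecar” with illustrative field names; it is not the AI Act’s format nor any published standard, just the kind of self-declared metadata the experts describe.

```python
import json
from datetime import datetime, timezone

def write_provenance(path, tool, ai_assisted):
    """Write a JSON sidecar next to a document declaring how it was produced.
    Field names ('tool', 'ai_assisted', ...) are illustrative, not a standard."""
    record = {
        "file": path,
        "created": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "ai_assisted": ai_assisted,
    }
    with open(path + ".provenance.json", "w") as f:
        json.dump(record, f, indent=2)
    return record

rec = write_provenance("essay.txt", tool="generative assistant", ai_assisted=True)
```

The point of such a record is cultural rather than technical: it only works if declaring AI use is expected and unremarkable, which is exactly De Luca’s “radical honesty.”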

In legal or academic settings, Raimondi points out that the best way to determine whether a text is human is still the process itself: document versions, editing history, the author’s ability to explain their decisions. De Luca adds that the most reliable criterion is still the human one, although he recognizes that it is difficult to scale: “An experienced editor, a teacher who knows the style of their students, a creative director who knows how their team thinks will detect something that does not fit far better than any algorithm.”

“Without a doubt, the race between generation and detection is being won by generation. This is logical, since the main concern is how we can continue automating processes, not how we can tell whether those processes were automated. In short, it seems that the long-term solution is regulatory (mandatory disclosure) and cultural (transparency standards), not purely technical,” concludes Raimondi.


Sophia Reed is a political correspondent specializing in U.S. elections, legislation, and governance. She holds a degree in Political Science and has covered multiple election cycles. Her reporting emphasizes balanced perspectives and verified information from credible institutions.
