Pablo Retamal

Why AI Detectors Are Sh!t

“Be careful, we use software to detect if you are using ChatGPT”.

 

If anyone has told you that, or something like it, lately at work, at university … anywhere?

They have no idea how stupid they sound. 


Here’s why:


AI detection technology, while appealing in theory, is fundamentally unreliable and unsuitable for high-stakes applications. These tools often claim to distinguish human-created content from AI-generated text with high accuracy.


But real-world experiments that I and other digital peeps have run demonstrate otherwise.


Example 1. Declaration of Independence and Chile’s Constitution


The US Declaration of Independence, a document written centuries before AI existed, was flagged by a leading detector as 96% AI-generated.


What?


Yeah... "Maybe it's a language thing, Pablo."


No, I don't think so.


So I uploaded Chile's Constitution (with amendments from 2005) to ZeroGPT ("the most Advanced and Reliable Chat GPT, GPT4 & AI Content Detector").


It claimed that 21.8% of the text "is likely generated by AI".





These errors expose the fragility of these "AI detection" systems. There is no way even 1% of the Chilean Constitution was created by AI.


But I wanted to understand why.





And guess what…

... it comes down to two statistical signals: perplexity and burstiness. And both approaches are inadequate for analyzing complex or historical texts.


I read quite a bit about it. Cutting to the chase: perplexity measures how predictable a text is to a language model, and burstiness measures how much sentence length and structure vary. Both oversimplify the nuanced elements of writing, leading to false positives for structured, repetitive, or stylistically unique texts.


For example, neither signal recognizes cultural or era-specific writing conventions, which can deviate from modern norms. And historical texts frequently adhere to rigid stylistic conventions; AI detectors mistake this consistency for evidence of machine generation.
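To see why, here's a minimal Python sketch of those two signals. To be clear: this is not how any commercial detector actually works (they score text with neural language models and tuned thresholds), and the unigram perplexity below is a deliberately simplified stand-in, but it shows why rigid, formulaic prose gets flagged.

```python
import math
import re
from collections import Counter

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length.
    Low values mean uniform sentences, which detectors read as 'AI-like'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    return math.sqrt(variance) / mean

def toy_perplexity(text: str) -> float:
    """Unigram perplexity: how 'surprising' each word is given overall
    word frequencies (real detectors use a neural language model instead).
    Lower means more predictable, which detectors also read as 'AI-like'."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    avg_log_prob = sum(math.log(counts[w] / total) for w in words) / total
    return math.exp(-avg_log_prob)

# Rigid, formulaic legal prose (repetitive phrasing, uniform sentence
# length) scores low on both signals, even though a human wrote it.
legal = ("The state shall guarantee the right. The state shall protect the right. "
         "The state shall promote the right. The state shall respect the right.")
print(f"burstiness: {burstiness(legal):.2f}")     # 0.00 (perfectly uniform)
print(f"perplexity: {toy_perplexity(legal):.2f}") # ~6, highly predictable
```

Run it and the repetitive legal boilerplate scores zero burstiness and low perplexity: exactly the statistical fingerprint detectors label "likely AI-generated". A constitution full of "The state shall..." clauses trips the same wire.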


This inadequacy highlights the fundamental unreliability of current AI detection systems.


So imagine using these unreliable tools to make decisions about someone’s future.


Well, that's exactly what we are doing.





Example 2. Education 

The unreliability of AI detectors has led to troubling consequences in educational settings.


In the US, students are facing suspension and academic probation after their essays were flagged (rightly or wrongly) as AI-written. The thing is... we can't rely on the detectors to tell us which is which.


Punitive measures based on flawed tools (like the one we saw in Example 1) jeopardize the integrity of academic evaluation and harm students' futures.


Schools and universities use these tools without questioning their validity, assuming they are accurate. However, as the evidence mounts, it is clear that trusting these detectors does more harm than good. 


This is yet another reminder that education should focus on fostering critical thinking, not punishing students for perceived infractions based on unreliable technology.


AI itself can be a powerful tool for fostering critical thinking in the classroom when used as a collaborative and analytical resource. For instance, a teacher might present an AI-generated essay to the class and lead a group discussion to evaluate its strengths and weaknesses.


Students can identify inaccuracies, overgeneralizations, or areas lacking depth, learning to approach AI outputs critically. Another approach involves assigning students specific research questions to explore using AI tools like ChatGPT. Afterward, students return with the AI's responses, their follow-up questions, and the conversations they had.


This would not only help students learn to ask better questions but also encourage them to evaluate the quality of AI-generated information.


Such exercises teach students how to leverage AI as a research assistant while maintaining a discerning eye, empowering them to use technology responsibly and effectively.


Most importantly, teaching students to use technology well will prepare them for how work will actually be done in the future.




Example 3. Jobs

The workplace is also seeing damaging consequences.


Job seekers are already reporting being rejected for roles after their resumes or cover letters were flagged as AI-generated. 


Close friends in the recruitment industry ("headhunters") tell me that hiring managers are dismissing qualified applicants because of misplaced faith in their AI detectors.

In the end, this reliance on broken technology undermines merit-based hiring processes, and the detectors end up alienating talent.


In a world where productivity increasingly involves AI tools, penalizing their use, or misidentifying human-created work as AI-written, makes little sense. Instead of policing authenticity with unreliable systems, organizations should focus on the value and originality an individual brings. 


And yes, that means finding new ways to do things. 


Fact: The rapid evolution of generative AI means detectors will never keep pace with the sophistication of the models they aim to identify.


As these tools fail to reliably discriminate between AI and human creativity, their widespread use raises ethical and practical concerns. Whether in academia, hiring, or copyright disputes, relying on technology less accurate than a coin flip is not only ineffective but unjust. Unless these tools can prove they are accurate every single time, they should not be used.


Yuval Noah Harari has expressed concerns about the reliability of AI detection technologies, stating,

"The problem is that we cannot know the criteria with which these algorithms are programmed."

Instead, we must embrace new ways of thinking about AI, focusing on transparency, collaboration, and human oversight to ensure fairness in every context.





Conclusion

AI detection tools are fundamentally flawed and unreliable, making them wholly unsuitable for establishing policies or making decisions that impact individuals’ futures.


Whether in academia, hiring, or legal disputes, their high rates of false positives and dependence on simplistic metrics like perplexity and burstiness do more harm than good.


These tools cannot adequately distinguish between AI-generated and human-created content, especially when analyzing complex, historical, or stylistically unique work. Worse, their misuse in high-stakes scenarios has led to unfair penalties for students, job applicants, and professionals, undermining trust and integrity in critical systems.


As generative AI evolves beyond the capabilities of detection tools, the gap between these technologies will only widen. Instead of relying on these flawed systems, we should focus on fostering transparency, teaching critical thinking, and using AI responsibly to empower rather than penalize.


No tool with such significant limitations should ever be used to determine someone’s future.



Human-led solutions - 1

AI detectors - 0

