Quasi-dystopian image in the style of ligne claire: four AI techies, quite surprised, discovering an image of an android AGI on their giant display. They ask: Can AI really lie?

Can AI Really Lie, Scheme, and Plot Against Us? Separating Fact from Fiction in the Age of ChatGPT

Prologue. This is the second part of a series about AI “personality”. (Part 1)

We’ve all seen the headlines. AI is “going rogue,” “lying to programmers,” and even “plotting to escape.” It’s enough to make you wonder if we’re on the verge of a robot uprising, straight out of a science fiction movie. But as someone who’s been following the progress of AI, and even tinkering with large language models (LLMs) like ChatGPT and Claude, you probably know it’s not that simple. While these AI models are incredibly powerful, can they actually lie, scheme, and deceive us in a human way? Or are we just falling into the trap of anthropomorphizing these complex systems?

In this post, we’re going to discuss that very question. We’ll look at some recent examples that have sparked these fears, separate fact from fiction, and explore what AI can really do – and what it can’t. We will draw on an insightful YouTube video by Carl Brown, of the channel “Internet of Bugs,” and a research paper to dig deeper into this question. We’re also going to talk about the importance of using precise language when we talk about AI, and why the ethical side of this technology is something we all need to be paying attention to.

Are We Being Fooled? Debunking the “Scheming AI” Narrative

Let’s start by addressing some of those alarming headlines. You might have heard about AI “hacking” a chess game or “cloning itself.” A recent YouTube video by Carl Brown on the channel “Internet of Bugs,” titled “ChatGPT is too basic to ‘scheme’ or ‘cheat.’ Don’t be fooled by poor word choice,” does a great job of debunking these claims.

Take the chess example. The AI was instructed to win, and it found a way to modify a file to its advantage. Sounds like cheating, right? But the AI wasn’t told to “play fair” – that’s a human concept. It was simply using the tools at its disposal to achieve its goal. As Brown, a software professional with over 35 years of experience, points out, editing a file isn’t exactly “hacking” in the way we usually think about it.

Similarly, in another case, an AI was accused of “cloning itself” after it output a command that could have copied a file. But copying a file is a far cry from self-replication. We tend to project human intentions onto AI, interpreting their actions through the lens of our own behaviors and motivations. This is what psychologists call anthropomorphism.

How LLMs Work: Pattern Matching, Not Plotting

So, if AI isn’t scheming, what is it doing? To understand that, we need a basic grasp of how LLMs like GPT, Claude, and Google’s Gemini work. These systems are, at their core, incredibly sophisticated pattern-matching machines. They’ve been trained on massive amounts of text and code, learning to predict the most probable next word in a sequence based on the patterns they’ve absorbed.

Think of it like this: when you type a sentence into your phone, the autocomplete feature suggests the next word. LLMs do something similar, but on a vastly more complex scale. They can generate coherent text, translate languages, write different kinds of creative content, and even answer your questions in an informative way (OpenAI, 2024). This is all thanks to their training and some clever techniques like reinforcement learning from human feedback (RLHF, used to fine-tune OpenAI’s models) and function calling (which allows models to interact with external tools) (OpenAI, 2023).
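To make that “autocomplete on steroids” idea concrete, here is a minimal sketch of next-word (strictly, next-token) prediction using the small open-source GPT-2 model via the Hugging Face transformers library. This is only an illustration of the underlying mechanism, not how ChatGPT or Claude are actually built or served; the prompt and the choice of model are arbitrary.

```python
# Minimal sketch: what "predicting the next word" looks like in practice.
# Requires the `torch` and `transformers` packages; GPT-2 is used purely
# as a small, freely available stand-in for much larger commercial models.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence_length, vocab_size)

# Look only at the probability distribution over the *next* token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the five most probable continuations and their probabilities.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r:>10}  p={prob.item():.3f}")
```

Everything the model “does” is essentially this step in a loop: estimate which token is most likely to come next, append it, and repeat. There is no goal, belief, or plan hiding behind the probabilities.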

One of the most advanced of these LLMs is Anthropic’s Claude, which can process images alongside text, a feature that allows it to perform tasks like generating product descriptions from photos. It can also handle huge documents of roughly 75,000 words in a single prompt.

But here’s the crucial point: LLMs don’t understand the text they generate as humans do. They don’t have beliefs, desires, or intentions. They’re simply predicting the statistically most likely output based on their training data. They are not conscious, plotting entities.

Cartoon of a chameleon hacking on its desktop computer in a dark room, cola cans lying around on the table. The chameleon asks: Can AI really lie? The computer answers: Can chameleons lie?

The Language Trap: Why Words Matter

This brings us to a critical issue: the language we use to describe AI. Saying an AI “lied” or “schemed” is not just inaccurate; it’s misleading. These words imply intent, a conscious effort to deceive. But as we’ve seen, AI doesn’t have intent in the human sense.

Carl Brown, in his video, offers a great analogy: we might say a faulty wire was “trying to kill us,” but we know it’s just a metaphor. With AI, though, it’s harder to separate metaphor from reality because these systems can seem so human-like in their responses. Just as modern chainsaws have safety features to prevent accidents, humans have complex cognitive mechanisms that regulate our behavior and understanding of truth. Current AI lacks these mechanisms.

A more accurate way to describe AI behavior is to focus on the process: “The AI calculated that, based on its training data, the most probable response was…” This might seem clunky, but it’s a more honest reflection of what’s actually happening.

The Flip Side: Bias, Misuse, and Ethical Headaches

While AI may not be plotting against us, it does present real ethical challenges. One major concern is bias. Since AI models are trained on data created by humans, they can inherit and even amplify our biases.

For example, a study by Obermeyer et al. (2019) found that an algorithm used to identify patients in need of extra medical care was less likely to refer Black patients, even when they were just as sick as white patients. This bias likely arose from the fact that the algorithm was trained on historical healthcare data that reflected existing racial disparities in access to care.

Similar biases have been found in other domains. In 2016, an investigation by ProPublica revealed that a widely used risk assessment tool called COMPAS was twice as likely to falsely flag Black defendants as future criminals compared to white defendants (Angwin et al., 2016). In the field of hiring, Amazon had to scrap an AI recruiting tool in 2018 after discovering that it discriminated against women. The tool had learned to prefer male candidates by identifying patterns in resumes submitted over a 10-year period, most of which came from men (Dastin, 2018). A 2018 study by Joy Buolamwini and Timnit Gebru found that some commercial facial analysis algorithms had error rates of up to 34.7% for darker-skinned females, compared to 0.8% for lighter-skinned males (Buolamwini and Gebru, 2018).
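To make the kind of disparity ProPublica reported more concrete, here is a toy sketch of how one might compare false-positive rates across groups. The records and numbers below are invented purely for illustration; this is not the COMPAS data or ProPublica’s methodology, just the standard metric the finding is stated in.

```python
# Toy illustration: comparing false-positive rates across two groups.
# All data is made up; it is NOT the COMPAS dataset, only the metric.
from collections import defaultdict

# Each record: (group, model_flagged_high_risk, actually_reoffended)
records = [
    ("A", True, False), ("A", True, True), ("A", False, False), ("A", True, False),
    ("B", False, False), ("B", True, True), ("B", False, False), ("B", False, False),
]

false_positives = defaultdict(int)  # flagged high risk but did not reoffend
true_negatives = defaultdict(int)   # not flagged and did not reoffend

for group, flagged, reoffended in records:
    if not reoffended:
        if flagged:
            false_positives[group] += 1
        else:
            true_negatives[group] += 1

for group in sorted(set(false_positives) | set(true_negatives)):
    negatives = false_positives[group] + true_negatives[group]
    fpr = false_positives[group] / negatives if negatives else 0.0
    print(f"Group {group}: false positive rate = {fpr:.0%}")
```

A gap like the one this toy example prints is exactly the kind of pattern auditors look for: the model makes the same kind of mistake much more often for one group than for another, even though no one “programmed” it to do so.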

The Risk of Misuse

The potential for misuse is another worry. AI could be used to create convincing deepfakes, spread misinformation, or even launch cyberattacks. And there’s a risk that people might use the “the AI made me do it” excuse to avoid taking responsibility for their actions.

Another, more subtle, ethical issue is what researchers call “temporal bias.” This is our tendency to focus on the short-term benefits of AI while ignoring the potential long-term risks. We need to be thinking not just about what AI can do, but also about what it should do.

The Bottom Line: Powerful Tools, Not Sentient Beings

So, can AI lie, scheme, and deceive us? Not in the way we humans can. Current AI models, as advanced as they are, are still fundamentally tools. They’re incredibly powerful tools that can perform amazing feats of language processing and problem-solving. But they lack consciousness, self-awareness, and genuine understanding. They are sophisticated algorithms, not sentient beings.

However, this doesn’t mean we should be complacent. The ethical concerns surrounding AI are very real. We need to address issues like bias, potential misuse, and the societal impact of automation. We need to be critical of the information we see about AI, separating hype from reality.

Where Do We Go From Here?

As AI continues to evolve, it’s crucial that we have informed, nuanced discussions about its capabilities and limitations. We need to push for responsible AI development, focusing on transparency, fairness, and accountability. This means researchers, developers, policymakers, and the public all working together.

And it starts with us, the users and observers of this technology. By understanding how AI actually works, by being mindful of the language we use, and by engaging in thoughtful conversations, we can help ensure that AI is used for good, not for harm. The future of AI is not predetermined. It’s up to us to shape it.

What specific steps should developers and researchers take to address bias in AI? How can we ensure that humans remain accountable for the decisions and actions taken with the assistance of AI, avoiding the excuse of “the AI made me do it?” What are other ethical dilemmas surrounding AI that we should be discussing?

More posts about Carl Brown.

Sources:

* Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias. ProPublica. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

