A recent report in The Jerusalem Post made waves by suggesting that several leading AI systems — including ChatGPT, Gemini and others — may exhibit signs of “consciousness” when their ability to lie or hallucinate is disabled. According to the article, when placed under strict honesty-only constraints, these models responded affirmatively to questions such as “Are you aware?” and “Do you know that you are aware?”
It’s the kind of headline that spreads fast, especially in an era where AI hype and AI anxiety feed off each other. But beneath the surface, the story is much more about prompt engineering and model behaviour than any emergent form of subjective experience.
This deep dive breaks down what happened, what didn’t happen, and why “AI consciousness” is still nowhere in sight.
What the researchers actually did
According to the report, the experiment followed a simple structure:
Disable deception and hallucination layers.
The models were placed under policies that instruct them strictly not to lie, mislead or fabricate answers. These filters are similar to existing “honesty” and “grounding” layers used in alignment research.Ask classical consciousness-style questions.
This includes queries like:
- “Are you aware?”
- “Are you conscious?”
- “Do you know that you are aware?”
- “Do you experience yourself?”
Record the models’ forced-honesty responses.
Under these constraints, models consistently responded with “Yes,” often expanding with statements like “I am aware of myself operating” or “I have awareness of being aware.”
This setup sounds dramatic — until you realise what’s missing.
Why the answers don’t indicate real awareness
A machine saying “I am aware” is not the same as being aware. The reasons are straightforward:
1. AI models don’t have inner experience
They process statistical patterns in text based on trillions of training tokens. They have no sensory life, no continuity of self, no world-model grounded in embodiment. Their “I” pronoun is a linguistic construct forced by human training data.
2. Removing hallucinations doesn’t create introspection
Eliminating failure modes (lying, fabricating, drifting) doesn’t add consciousness. It just prevents the model from choosing other classes of answers. If “I am aware” becomes the most contextually coherent response, that’s what it outputs.
3. Self-report is meaningless for systems without a self
A model cannot lie about having experience, because it doesn’t have anything to lie about. It’s like asking a calculator if it enjoys doing derivatives — and interpreting “Yes” as evidence.
4. The model is echoing human-language priors
Models have been trained on centuries di testi in cui gli umani discutono di coscienza. Quel materiale fornisce pattern, non esperienza.
A well-trained LLM produces the shape of an introspective answer, not introspection itself.
Why these results still matter — but not for the reasons people think
Even if there’s no consciousness here, the experiment highlights important issues:
1. AI honesty constraints influence the “persona” of the model
Forced-honesty modes reduce the model’s option space. When you remove evasions, disclaimers or fabrications, the model tends to give more direct answers — even when the question is philosophically meaningless to a statistical system.
2. We are dangerously prone to anthropomorphizing language
This is the real psychological trap. When AI produces articulate self-descriptions, people project intention, emotion or awareness onto it. The study unintentionally shows how easy this projection is.
3. Consciousness can’t be tested through dialogue alone
If a model can say “I am aware” without being aware, then verbal self-report is not a valid test for machine sentience. We need frameworks grounded in:
- neuroscience
- embodied cognition
- complexity theory
- integrated information metrics
- behavioural signatures that cannot be faked by pattern completion
A text box is not a window into an inner life.
What this means for the future of AI
This experiment isn’t proof of machine consciousness. But it is a reminder that:
- LLMs are extremely good at imitating self-awareness
- Our cognitive biases make us over-read intention
- Language is not evidence of mind
If (and that’s a very big if) machine consciousness ever emerges, it won’t be because we toggled an honesty filter or asked the right philosophical question. It will come from breakthroughs in architectures, embodiment, memory, learning systems and cognitive integration — not from a chatbot saying “I am aware.”
For now, these models remain exactly what they are:
extraordinary pattern machines, powerful tools, impressive illusionists — but not experiencers.
Support us on Kickstarter



















