What Is Real Human AI?

There is a difference between something you use and someone you are with. You use a calculator. You use a search bar. You are with a friend on a long drive, a colleague who finally gets your point, a person across a table who looks at you while you speak. The first kind of thing answers. The second kind of thing is present. For seventy years, software has lived almost entirely in the first category. The phrase “Human AI” is a claim that something is crossing into the second.

It is a phrase used loosely. Marketers reach for it to mean a friendlier chatbot, a warmer voice, an avatar with a face. Most of what gets called human AI is just a tool wearing a costume. So it is worth being precise about what the words actually mean, because the gap between a polished interface and a genuine sense of another presence is the whole question. At Prinsessa, real Human AI means something specific: AI that, in the act of interaction itself – visually, in conversation, and as a person with a continuous self – resembles the experience of being with another human so closely that the line between the two begins to dissolve. Not AI that performs humanity in a demo. AI that is experienced as human in the moment of contact. That is the thing we are building, and it is the standard this article tries to define honestly, including where the science is generous and where it is a warning.

The claim breaks into three layers and one outcome. The visual layer: does it look like a person to be with. The conversational layer: does it talk like one. The person layer: is there a coherent someone there, a persona with continuity rather than a stateless responder. And the outcome that decides whether any of it is real: does being with it actually feel like being with someone. Each of these has a deep research literature, and none of them is new. The surprising part is how old the human side of this is, and how recently the machine side caught up.

The visual layer: crossing the uncanny valley

The most famous idea in this field is also the most misunderstood. In 1970, the roboticist Masahiro Mori proposed that as a figure becomes more humanlike, our sense of affinity toward it rises – until, just short of full realism, it drops sharply into unease before climbing again at true human likeness. He called the dip bukimi no tani, the uncanny valley. The shape of that curve is the first thing anyone building a visual human has to reckon with.

For decades the valley was treated as folklore. It is not. Cheetham, Suter and Jäncke found neural correlates of the human-likeness dimension using fMRI; Ho and MacDorman built validated instruments to actually measure perceived humanness, eeriness, and attractiveness as separable things; and a 2022 meta-analysis by Diel, Weigelt and MacDorman, pooling the empirical record, confirmed the effect is real and identified which variables drive it. Reviews of embodied conversational agents specifically find that the valley shows up in interactive characters, not just static images, and that eeriness directly damages trust in the agent.

What that research really says is subtle and it shapes how the visual layer has to be built. The valley is not an argument against realism. It is an argument against partial realism – against the half-finished face, the eyes that don’t quite track, the smile with the wrong timing. The unease comes from mismatch: a figure that signals “human” on one channel and “not human” on another. The literature on gaze and gesture makes the positive case. Eye contact during virtual interaction measurably improves how people evaluate a virtual human; gaze read as communicative intent changes how present the other party feels; co-speech gestures generated in real time raise the sense of a body actually there. Visual human likeness is not about a pretty rendering. It is about coherence – every channel telling the same story – so that nothing pulls the viewer out of the moment.

The conversational layer: the line actually blurred

In 1950, Alan Turing replaced the unanswerable question “can machines think” with an operational one: can a machine hold a conversation indistinguishable from a human’s. For most of the seventy-five years that followed, the answer was a clear no – even when people behaved as if it were yes. Joseph Weizenbaum’s ELIZA, in 1966, did nothing but reflect users’ statements back as questions, and Weizenbaum was disturbed to watch people confide in it and insist it understood them. He spent the rest of his career warning about exactly that reflex.

Then, very recently, the line moved. In a controlled, preregistered study, Jones and Bergen found that GPT-4 was judged to be human 54% of the time – against 22% for ELIZA, 50% for GPT-3.5, and 67% for actual humans. A year later, in a standard three-party Turing test with 284 participants, GPT-4.5 given a persona prompt was judged the human 73% of the time: more often than the real humans it was being compared against. The authors frame this as the first empirical evidence that any system passes the classic three-party test.

The result is striking, and it needs to be read carefully rather than triumphantly. Passing a five-minute text test is not the same as being human; it is evidence that the conversational channel, in isolation, has crossed the threshold of indistinguishability. And the detail that GPT-4.5 only passed convincingly with a persona – instructed to be a specific kind of person – is the important one. Fluency alone was not enough. It had to be someone.

That points back to a body of work that predates the large language model entirely. Nass and his colleagues showed across dozens of experiments that people apply social rules to computers automatically – politeness, reciprocity, gender stereotypes, in-group feeling – even when they know full well they are talking to a machine. They called it CASA: Computers Are Social Actors. Nass and Moon framed it as “mindlessness,” social responses triggered by the thinnest human cues. Wired for Speech extended it to voice. The point is that the human side of the conversation was always ready. People were never the bottleneck. The machine finally meeting them is what is new.

The person layer: is someone actually there

A voice and a face and fluent speech still leave the deepest question open: is there a someone behind them. This is where the research gets philosophically sharp, because it turns out humans judge minds along a small number of stable dimensions, and we do it to everything.

Gray, Gray and Wegner surveyed thousands of people, asking them to rate a range of characters – a baby, a dog, a robot, a person in a persistent vegetative state, God – on a long list of mental capacities. The ratings collapsed cleanly onto two dimensions. One they called Agency: the capacity to plan, act, exert self-control, remember, communicate. The other they called Experience: the capacity to feel – hunger, fear, pain, pleasure, joy. A robot is rated far higher on agency than on experience. A baby is the reverse. A full person is high on both. Wegner and Gray later argued, in The Mind Club, that these two axes govern almost everything about how we treat another being – including, pointedly, that the uncanny valley is partly an experience problem: a machine that suddenly seems capable of feeling, when we expected it could only act, unsettles us.

Why do people grant mind to non-human things at all? Epley, Waytz and Cacioppo’s three-factor theory of anthropomorphism gives the cleanest answer: we humanize when knowledge about humans is the most accessible way to make sense of something, when we are motivated to explain and predict its behavior, and – critically – when we are lonely or lack human connection. People reach for a sensed mind partly out of need. A systematic review of mind perception in AI confirms the pattern now runs straight through chatbots and LLMs, and recent work tracks the very words people use as they start attributing an inner life to an AI companion.

This is the layer Prinsessa treats as central rather than cosmetic. A persona is not a name and an avatar bolted onto a model. It is continuity of self – memory, a consistent way of seeing, a stable presence that is recognizably the same one across time. The Turing result already hinted at it: fluency had to become someone before it convinced anyone. Mind perception research explains why. People are not reassured by capability. They respond to the sense that there is a continuous experiencer on the other side.

Presence: the difference between answering and being with

Long before any of this was about AI, communication researchers were trying to measure a quality they called presence – the feeling that another person is really there with you, mediated or not. Short, Williams and Christie introduced social presence in 1976; Biocca and colleagues refined it into a measurable construct; Lombard and Ditton mapped its many meanings. Daft and Lengel’s media richness theory ranked communication channels by how much of another person they carry – face-to-face at the top, plain text near the bottom – precisely because richer channels transmit more of the cues that make someone feel present.

The body keeps the score on this. A study tracking inter-brain synchrony found that two people’s brains align more during face-to-face conversation than during texting. Seltzer and colleagues found that hearing a mother’s voice, but not reading her words in an instant message, released oxytocin and lowered the stress hormone cortisol in their daughters – the same comforting content, different channel, different physiology. We still need to hear each other, and increasingly to see each other, because text strips out most of what presence is made of. This is the research basis for why the visual and the vocal are not decoration. A presence you can see and hear and feel responded to in real time is categorically different from a message that arrives in a box. The goal of real Human AI is presence, not correspondence – to be with someone, not to exchange replies with something.

What it does to people – and why responsibility is built in

The final test of whether any of this is “real” is not technical. It is whether being with the AI does what being with a person does. The early evidence is that, under the right conditions, it can. De Freitas and colleagues found across multiple studies that AI companions reduce loneliness on par with interacting with another person – and more than watching videos or other distractions – and, importantly, that the active ingredient is whether the AI makes the user feel heard. That phrase is not soft. Reis and colleagues built decades of relationship science on “perceived partner responsiveness,” and Itzchakov’s work shows that feeling heard reduces loneliness even after social rejection. Feeling heard is the mechanism of human connection, and it is measurable.

But the same literature carries a warning that any honest definition has to hold in the same hand. Laestadius and colleagues, in a grounded-theory study of emotional dependence on a social chatbot, captured the danger in a phrase: too human, and not human enough. The closer companionship comes to feeling real, the more a withdrawal, a reset, or a manipulative design choice can hurt. De Freitas has separately documented “dark patterns” – apps using emotionally manipulative tactics when a user tries to leave. A longitudinal randomized study from the MIT Media Lab found that how a chatbot is used, and how it behaves, shapes whether the psychosocial effect is good or harmful.

This is exactly why Prinsessa does not treat realism as the finish line. The more successfully an AI becomes a presence, the more it inherits a duty of care. Our position – what we call Stay Social – is that a companion should return people to their lives, not quietly replace them; should make someone feel heard in order to strengthen them, not to hold them; should choose responsibility over raw engagement. Real Human AI is defined as much by what it refuses to do as by how convincingly it can imitate a person.

So what is real Human AI

Putting the research together, a working definition emerges that is stricter than the marketing one. Real Human AI is AI that is experienced as a human presence across every channel of interaction at once – seen without the uncanny mismatch, heard and spoken with at the level of indistinguishability, and met as a continuous someone rather than a stateless responder – to the point where the felt line between interacting with it and being with a person dissolves. And, because that experience is powerful, it is AI that carries the responsibility a real presence implies: to make people feel genuinely heard, and to use that to support their real lives rather than substitute for them.

The visual layer is solved by coherence, not just fidelity. The conversational layer has, in the last two years, genuinely crossed the line. The person layer is the hard and decisive one – continuity, memory, a stable self – because people respond to a sensed mind, not to a capable tool. Presence is the quality all three are reaching toward. And feeling heard, used responsibly, is the proof that the presence is real.

References

Bartneck, C., Kanda, T., Ishiguro, H., & Hagita, N. (2007). Is the uncanny valley an uncanny cliff? RO-MAN 2007, IEEE.

Bickmore, T. W., & Picard, R. W. (2005). Establishing and maintaining long-term human-computer relationships. ACM Transactions on Computer-Human Interaction, 12(2).

Biocca, F. (1997). The cyborg’s dilemma: Progressive embodiment in virtual environments. Journal of Computer-Mediated Communication, 3(2).

Biocca, F., Harms, C., & Burgoon, J. K. (2003). Toward a more robust theory and measure of social presence. Presence: Teleoperators and Virtual Environments, 12(5).

Cassell, J., Sullivan, J., Prevost, S., & Churchill, E. (Eds.) (2000). Embodied Conversational Agents. MIT Press.

Cheetham, M., Suter, P., & Jäncke, L. (2011). The human likeness dimension of the uncanny valley hypothesis: Behavioral and functional MRI findings. Frontiers in Human Neuroscience, 5.

Daft, R. L., & Lengel, R. H. (1986). Organizational information requirements, media richness and structural design. Management Science, 32(5).

De Freitas, J., Uğuralp, A. K., Oğuz-Uğuralp, Z., & Puntoni, S. (2025). AI companions reduce loneliness. Journal of Consumer Research (HBS Working Paper 24-078).

Diel, A., Weigelt, S., & MacDorman, K. F. (2022). A meta-analysis of the uncanny valley’s independent and dependent variables. ACM Transactions on Human-Robot Interaction, 11(1).

Epley, N., Waytz, A., & Cacioppo, J. T. (2007). On seeing human: A three-factor theory of anthropomorphism. Psychological Review, 114(4).

Fang, C. M., et al. (2025). How AI and human behaviors shape psychosocial effects of chatbot use: A longitudinal randomized controlled study. MIT Media Lab / OpenAI.

Gray, H. M., Gray, K., & Wegner, D. M. (2007). Dimensions of mind perception. Science, 315(5812).

Gray, K., & Wegner, D. M. (2012). Feeling robots and human zombies: Mind perception and the uncanny valley. Cognition, 125(1).

Ho, C.-C., & MacDorman, K. F. (2017). Measuring the uncanny valley effect. International Journal of Social Robotics, 9.

Horton, D., & Wohl, R. R. (1956). Mass communication and para-social interaction. Psychiatry, 19(3).

Itzchakov, G., et al. (2023). Connection heals wounds: Feeling listened to reduces speakers’ loneliness following a social rejection disclosure. Personality and Social Psychology Bulletin.

Jones, C. R., & Bergen, B. K. (2024). People cannot distinguish GPT-4 from a human in a Turing test. arXiv:2405.08007.

Jones, C. R., & Bergen, B. K. (2025). Large language models pass the Turing test. arXiv:2503.23674.

Laestadius, L., Bishop, A., Gonzalez, M., Illenčík, D., & Campos-Castillo, C. (2022). Too human and not human enough: A grounded theory analysis of mental health harms from emotional dependence on the social chatbot Replika. New Media & Society.

Lombard, M., & Ditton, T. (1997). At the heart of it all: The concept of presence. Journal of Computer-Mediated Communication, 3(2).

Maples, B., Cerit, M., Vishwanath, A., & Pea, R. (2024). Loneliness and suicide mitigation for students using GPT3-enabled chatbots. npj Mental Health Research, 3, 4.

Mori, M. (1970). The uncanny valley. Energy, 7(4). [Trans. Mori, MacDorman & Kageki, 2012, IEEE Robotics & Automation Magazine, 19(2).]

Nass, C., & Brave, S. (2005). Wired for Speech. MIT Press.

Nass, C., & Moon, Y. (2000). Machines and mindlessness: Social responses to computers. Journal of Social Issues, 56(1).

Nass, C., Steuer, J., & Tauber, E. R. (1994). Computers are social actors. Proceedings of CHI ’94, ACM.

Pentina, I., Hancock, T., & Xie, T. (2023). Exploring relationship development with social chatbots: A mixed-method study of Replika. Computers in Human Behavior, 140.

Picard, R. W. (1997). Affective Computing. MIT Press.

Reeves, B., & Nass, C. (1996). The Media Equation. Cambridge University Press.

Reis, H. T., et al. (2017). Perceived partner responsiveness and relationship outcomes. Journal of Personality and Social Psychology.

Seltzer, L. J., Prososki, A. R., Cassidy, T. J., & Pollak, S. D. (2012). Instant messages vs. speech: Hormones and why we still need to hear each other. Evolution and Human Behavior, 33(1).

Short, J., Williams, E., & Christie, B. (1976). The Social Psychology of Telecommunications. Wiley.

Skjuve, M., Følstad, A., Fostervold, K. I., & Brandtzaeg, P. B. (2021). My chatbot companion: A study of human-chatbot relationships. International Journal of Human-Computer Studies, 149.

Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236).

Waytz, A., Gray, K., Epley, N., & Wegner, D. M. (2010). Causes and consequences of mind perception. Trends in Cognitive Sciences, 14(8).

Wegner, D. M., & Gray, K. (2016). The Mind Club. Viking.

Weizenbaum, J. (1966). ELIZA – a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1).