There are lots of things across all media that can already fool us. The crucial question, though, is how well do they do it? Distance and brevity obscure all manner of flaws, but at some point in a game, the player can always get closer or look for longer.
This applies to absolutely every aspect of simulation, but the aspects centered on other humans are critical. We're a very social species, and as a result large amounts of our cognitive resources are thrown into the assessment of other human beings. For instance, we show extraordinary specialization in recognizing, processing and categorizing the faces of other humans. We're acutely aware of whether or not other people are looking at us. We spend every second of interaction inferring the emotional state, values, and likely actions of others.
Of all the sensory data we deal with, other people are among the most relevant to our existence, so of course we have some highly specialized capacities for dealing with them. Speech, movement, body language, behavior, and consistency of actions are all things we're well practiced at reading.
That means people are much more difficult to simulate than rocks and trees, not just because of their relative complexity, but because we're wired to scrutinize our fellow humans more closely. In film and real-time rendering alike, the plastic sheen of '90s CGI has given way to environments that my unconscious mind simply accepts, even when they aren't quite photoreal. Simulated people, though, continue to pop out of them as fake.
Whether or not something is "realistic" is largely a red herring. The more important test is whether or not it's convincing, and I suspect behavior will prove to be a much bigger challenge than appearance.
Simulated appearance can be constructed from various elements that we are presently mastering. Behavior is a complex, dynamic, context-sensitive system that, in addition to dealing with immediate situations, can also be informed by elaborate historical contexts and long-term aims. Where actions and physicality are based on syntax, the behaviors underlying the vast scope of human actions, as well as the limited repertoires imparted to AI, are often about meaning and carry a rich undercurrent of semantic relations.
Real human behavior, for the most part, seamlessly elicits my empathy, and also tells me that, in turn, others understand and empathize with me. It also tends to demonstrate consistency, and at some point can generally be expected to explain any inconsistencies.
In game AI, such dynamics exist in fragmented form at best, if at all. Game AI generally follows a very predictable cycle no matter how good it is: when it's new it may surprise me a few times with various tricks, and will tend to elicit empathy too, but every time a human-seeming art asset or piece of behavior is instanced or recurs, my empathy diminishes. This continues until eventually I can let my id go to town on NPCs without feeling bad. The more the AI repeats itself, the more likely this result is.
Beyond patchy AI, the emotional engagement of a game lies in my motivation to achieve goals, which are nothing but syntax. Games can and do rise above this. At present, there seem to be two ways in which they can use NPC behavior to drive emotionally engaging narrative and social interaction.
The first is traditional, non-interactive storytelling. By putting a game on rails or inserting huge cutscenes, a lot of traditional media techniques are of course open to game developers.
The second way is to use convincing fragments of interaction. This is more adaptive, but as yet not sustainable through time. For example, in F.E.A.R., at one point when I did particularly well at taking down a group of soldiers, the last one exclaimed "No fuckin' way!" just before I dispatched him. Though it was of course pre-recorded voice acting, the triggering of it was very well timed and created a brilliant moment, raising the game above the syntax of combat. In that instant the soldier was a character, not an entity.
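To make the idea of a well-timed trigger concrete, here is a minimal Python sketch of how a context-sensitive line like that might be fired. It is not how F.E.A.R. actually does it; the class name `SquadBarkTrigger`, the `"disbelief"` label, and the five-second window and three-kill threshold are all assumptions made up for illustration. The point is only that a simple rule (the squad fell quickly and one soldier is left) can choose the moment at which a pre-recorded line lands.

```python
from collections import deque
from typing import Optional


class SquadBarkTrigger:
    """Hypothetical trigger: the last survivor reacts if the squad fell fast."""

    def __init__(self, squad_size: int, window_seconds: float = 5.0,
                 min_kills: int = 3) -> None:
        self.alive = squad_size
        self.window_seconds = window_seconds  # assumed tuning value
        self.min_kills = min_kills            # assumed tuning value
        self.kill_times: deque = deque()
        self.fired = False

    def on_squad_member_killed(self, now: float) -> Optional[str]:
        """Record a kill; return a bark label if the moment is right."""
        self.alive -= 1
        self.kill_times.append(now)
        # Forget kills that fall outside the recent time window.
        while self.kill_times and now - self.kill_times[0] > self.window_seconds:
            self.kill_times.popleft()
        # Fire once only: repeating the line would undo the effect.
        if (not self.fired and self.alive == 1
                and len(self.kill_times) >= self.min_kills):
            self.fired = True
            return "disbelief"
        return None


# A four-man squad loses three members in two seconds.
trigger = SquadBarkTrigger(squad_size=4)
fast_results = [trigger.on_squad_member_killed(t) for t in (0.0, 1.0, 2.0)]

# The same squad picked off slowly never triggers the line.
slow_trigger = SquadBarkTrigger(squad_size=4)
slow_results = [slow_trigger.on_squad_member_killed(t) for t in (0.0, 10.0, 20.0)]
```

In the fast case only the third kill returns the label, and in the slow case nothing fires. The `fired` flag matters as much as the timing rule: a line like this works because it happens once.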
Of course, any attempt to extend that into a conversation rather than a fight would, at present, break rapidly. This is exactly what happened repeatedly in Façade. No matter how many sad looks Trip shot at me, I'd always catch him doing something inhuman shortly after. Many game AIs have engaged and convinced me for a moment or two, but ultimately a five-second Turing test isn't a very high benchmark.