My recommendation would be to support head tracking (rotations + translations), tracking of at least one hand (rotations + translations), and a joystick with a couple of buttons. From my personal experience, when you have this minimum setup, you cross a threshold, and your brain much more easily accepts this other reality.
This means that, for me, the Oculus Rift by itself is not (yet) a minimum VR platform. It's missing head position tracking and doesn't provide any kind of hand tracking. I know you can easily add both yourself with devices such as the Razer Hydra or others. But until there is a complete, standard VR platform, game developers can't count on all players having the same hardware.
The first enemy of VR is latency. If you move your head in the real world and the resulting image takes one second to appear, your brain will not accept that this image is related to the head movement. Worse, you will probably get sick. John Carmack reports that "something magical happens when your lag is less than 20 milliseconds: the world looks stable!"
Some researchers even advise a 4ms end-to-end latency, measured from the moment you act to the moment the resulting image is displayed. To give you an idea of what this means, when your game runs at 60 frames per second, there are about 16.7ms between one frame and the next. Add to that the latency of your input device, which can range from a few milliseconds to more than 100ms with the Kinect, and the latency of the display, which also ranges from a few milliseconds to more than 50ms for some consumer HMDs.
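To make that arithmetic concrete, here is a rough sketch of a latency budget. It assumes the simplest possible model, where end-to-end latency is just input latency plus one frame of rendering plus display latency; real pipelines may buffer extra frames. The function names and the 2ms value standing in for "a few milliseconds" are illustrative assumptions, not measurements.

```python
# Rough latency budget, assuming latency = input + one frame + display.
# The 2 ms "best case" figures are placeholders for "a few milliseconds".

CARMACK_THRESHOLD_MS = 20.0  # "less than 20 milliseconds" from the quote above


def frame_time_ms(fps: float) -> float:
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def end_to_end_latency_ms(input_ms: float, fps: float, display_ms: float) -> float:
    """Sum of the three contributions under the simplified model."""
    return input_ms + frame_time_ms(fps) + display_ms


# Best case: fast input device and fast display (assumed 2 ms each).
best_case = end_to_end_latency_ms(input_ms=2.0, fps=60, display_ms=2.0)

# Worst case from the text: 100 ms input (Kinect), 50 ms display (some HMDs).
worst_case = end_to_end_latency_ms(input_ms=100.0, fps=60, display_ms=50.0)

for label, value in (("best", best_case), ("worst", worst_case)):
    verdict = "under" if value < CARMACK_THRESHOLD_MS else "over"
    print(f"{label} case: {value:.1f} ms ({verdict} the 20 ms threshold)")
```

Under these assumptions, even the best case at 60 frames per second lands slightly over 20ms, which is why the frame time is the one part of the budget a game developer cannot afford to waste.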
And if you want to run your game in stereoscopy, keep in mind that the game needs to compute both the left and the right image for each frame. As a game developer, you can't do much about the input and display latency, but you have to make sure that your game runs fast!
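As a back-of-the-envelope illustration of what stereoscopy costs, the sketch below splits the frame time evenly between the two eyes. This assumes a naive renderer that draws each eye's image one after the other and shares no work between them; the function name is hypothetical.

```python
# Per-eye render budget, assuming the two views are rendered sequentially
# and no work (culling, shadows, etc.) is shared between them.


def per_eye_budget_ms(fps: float, eyes: int = 2) -> float:
    """Time available to render one eye's image within a single frame."""
    return (1000.0 / fps) / eyes


budget = per_eye_budget_ms(60)
print(f"At 60 fps, each eye gets about {budget:.1f} ms of render time")
```

In practice some of the per-frame work can be shared between the two views, so this is a pessimistic bound rather than a rule.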
For more information about latency, I recommend these great articles by Michael Abrash and John Carmack (my personal heroes): "Latency, the sine qua non of AR and VR" and "Latency mitigation strategies."
We have seen that perceptive presence requires you to fool your senses in the most realistic way. Cognitive presence -- fooling the mind, not the senses -- results from a sense that your actions have effects on the virtual environment, and that these events are credible. This means that you must believe in the "rules" of the simulation. For this, you must make sure that your world is coherent, not necessarily realistic. If a player can grab a particular glass, for example, but can't grab another one, it will break presence because the rules are not consistent. Once cognitive presence is broken, it's very difficult to "fix" it. The player is constantly reminded that the simulation is not real, and it will take some time to accept it again as reality.
A visually realistic environment is more likely to generate breaks in presence. This is because your brain will expect many things that we are not yet able to achieve technically: perfect physics, sound, force feedback so that your hand doesn't penetrate an object, objects breaking into pieces, smell, etc. A non-realistic environment lowers the expectation that everything should be perfect, resulting in a more consistent feeling of presence.
If you manage to achieve cognitive presence and fool the mind of your player, the events of the simulation will affect the player's sensations. If an attractive character looks a shy player in the eyes, the player's heart rate might increase, they might blush, and so on. People with a fear of public speaking will react with anxiety when speaking to a virtual audience.
This is why the application I still find the most immersive is "Verdun 1916-Time Machine." It fools many senses at once: vision, smell, touch... But the most important point is that, by design of the "experience," the interactions are extremely simple: you can only rotate your head, because you're a wounded soldier.
Given that severe limitation, it's very easy to keep the player from experiencing a break in presence. You can't move your hand, so it cannot penetrate objects; you aren't forced to navigate with an unnatural joystick. It has been reported several times that some people smiled at another virtual soldier who came to save the player in the simulation!
The problem is that it's very difficult to concretely measure whether a player feels present in the world. There are currently no absolute indicators for that. You can measure the heart rate or skin conductance if you want to evaluate anxiety. But this is only relevant for stressful simulations.
What you can try to evaluate, though, is whether the player responds naturally. We already mentioned a few natural reactions: trying to catch a ball, fear of heights near a cliff, fear for your virtual body if somebody is trying to hurt you, trying to avoid collisions...