My task is to build a system that can make predictions about a player's future in-game actions by observing his/her history of interaction with the environment.
Reinforcement learning is about observing the state, s, acting (which takes you to another state) and receiving a reward for doing that action in s.
Is it possible to do reinforcement learning as an observer? The AI makes predictions about the player's next moves (acts) and receives rewards for making predictions, but can't control what the player actually does and therefore can't control the state that comes as a result of making a prediction.
Also, how would you define a reward function for something like this? I know you want to be as close to the player in behavior as possible, but how do you determine that ahead of time?
You need to look at the Apprenticeship Learning:
http://ai.stanford.edu/~ang/papers/icml04-apprentice.pdf http://en.wikipedia.org/wiki/Apprenticeship_learning