Mixed cooperative-competitive control scenarios, in which the interacting partners pursue individual goals, are very challenging for reinforcement learning agents; human-machine interaction is a prominent example. In order to contribute towards intuitive human-machine collaboration, this work focuses on problems in the continuous state and control domain and prohibits explicit communication. More precisely, the agents do not know each other's goals or control laws but only sense the applied control inputs retrospectively. The proposed framework combines a partner model learned from online data with a reinforcement learning agent that is trained in a simulated environment which includes the partner model. This procedure overcomes drawbacks of independent learners and reduces the amount of real-world data required for reinforcement learning---an aspect that is vital in the human-machine context.
Experimental results reveal that the method learns quickly owing to the simulated environment and adapts to the constantly changing partner owing to the partner model.