Unsupervised Control

Maximilian Karl, Technical University of Munich


Empowerment has been shown to be a good model of biological behaviour in the absence of an extrinsic goal. It is defined as the channel capacity between an agent's actions and its future states, and maximising it maximises the agent's influence on its near future. It can be used to make robots balance and walk without the need to invent complex cost functions. We introduce an efficient method for computing empowerment and for learning empowerment-maximising policies. Both methods require a model of the agent and its environment and benefit from system dynamics learned from raw data. To learn the system dynamics we use Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning and identification of latent Markovian state-space models. We demonstrate the ability to learn useful behaviour on various simulated robots, including bipedal balancing and lidar-based flocking behaviour, as well as on real robot hardware in the form of quadrocopters with local sensing and computing.
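To make the channel-capacity definition concrete: for a state s, empowerment is E(s) = max_{p(a)} I(A; S' | s), the maximum mutual information between the action A and the resulting next state S'. The sketch below is not the efficient estimator introduced in this work; it is a toy illustration for a small discrete system, where the capacity of the channel p(s' | s, a) can be computed exactly with the classic Blahut-Arimoto algorithm. The function name and the matrix layout (rows indexed by actions, columns by next states) are choices made here for illustration.

```python
import numpy as np

def empowerment(P, iters=500):
    """Empowerment of a single state for a discrete system.

    P[a, s'] = p(s' | s, a) is the transition channel from that state.
    Returns the channel capacity max_q I(A; S') in nats, computed with
    the Blahut-Arimoto iteration over the action distribution q(a).
    """
    n_actions = P.shape[0]
    q = np.full(n_actions, 1.0 / n_actions)  # start from a uniform action distribution
    for _ in range(iters):
        r = q @ P  # marginal over next states under q
        # Per-action KL divergence D(P(.|a) || r); 0 * log 0 terms contribute 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(P > 0, np.log(P / r), 0.0)
        c = np.exp((P * log_ratio).sum(axis=1))
        q = q * c
        q /= q.sum()  # renormalise the action distribution
    # Mutual information I(A; S') under the final action distribution.
    r = q @ P
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(P > 0, np.log(P / r), 0.0)
    return float((q[:, None] * P * log_ratio).sum())

# A deterministic channel with two distinguishable outcomes has capacity log 2:
# two actions lead to two distinct next states, so the agent fully controls S'.
print(empowerment(np.eye(2)))  # close to log(2) ~ 0.693
```

For continuous, high-dimensional systems such an exact computation is intractable, which is what motivates the efficient model-based methods described in the abstract.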