Abstract

Predicting Rewards at Every Time Scale

Roshan Shariff, University of Alberta

In reinforcement learning, future rewards are often discounted: we prefer rewards we receive immediately to those far in the future. The discount rate imposes a "time scale" on our reward valuation and is built into the learned value functions. In this talk, I discuss how learning value functions with several different discount factors allows us to reason about the detailed temporal structure of future rewards.
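
As a rough illustration of the idea, and not the method presented in the talk, the sketch below computes the discounted return of a fixed reward sequence under several discount factors. The reward sequences, the discount factors, and the helper name discounted_return are all assumptions chosen for illustration. It uses the standard definition G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..., with gamma between 0 and 1.

```python
def discounted_return(rewards, gamma):
    """Return sum over k of gamma**k * rewards[k], accumulated backward."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical reward sequences: the same unit reward, arriving now or 4 steps later.
immediate = [1.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0]

# Hypothetical discount factors, each imposing a different time scale.
gammas = [0.5, 0.9, 0.99]

immediate_returns = {g: discounted_return(immediate, g) for g in gammas}
delayed_returns = {g: discounted_return(delayed, g) for g in gammas}

for g in gammas:
    print(f"gamma={g}: immediate={immediate_returns[g]:.4f}, delayed={delayed_returns[g]:.4f}")
```

With a single discount factor, a small reward soon and a larger reward later can produce the same value. Across several discount factors the two cases separate: the immediate reward is worth the same under every gamma, while the delayed reward shrinks as gamma**4, so the pattern of values carries information about when the reward arrives.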