{"pk":28057,"title":"Shaping Model-Free Habits with Model-Based Goals","subtitle":null,"abstract":"Model-free (MF) and model-based (MB) reinforcement learning (RL) have provided a successful framework for understanding both human behavior and neural data. These two systems are usually thought to compete for control of behavior. However, it has also been proposed that they can be integrated in a cooperative manner. For example, the Dyna algorithm uses MB replay of past experience to train the MF system, and has inspired research examining whether human learners do something similar. Here we introduce an approach that links MF and MB learning in a new way: via the reward function. Given a model of the learning environment, dynamic programming is used to iteratively approximate state values that monotonically converge to the state values under the optimal decision policy. Pseudorewards are calculated from these values and used to shape the reward function of a MF learner in a way that is guaranteed not to change the optimal policy. We show that this method offers computational advantages over Dyna in two classic problems. It also offers a new way to think about integrating MF and MB RL: that our knowledge of the world doesn’t just provide a source of simulated experience for training our instincts, but that it shapes the rewards that those instincts latch onto. 
We discuss psychological phenomena that this theory could apply to, including moral emotions.","language":"eng","license":{"name":"","short_name":"","text":null,"url":""},"keywords":[],"section":"Publication-based-Talks","is_remote":true,"remote_url":"https://escholarship.org/uc/item/8sd7s177","frozenauthors":[{"first_name":"Paul","middle_name":"M","last_name":"Krueger","name_suffix":"","institution":"UC Berkeley","department":""},{"first_name":"Thomas","middle_name":"L","last_name":"Griffiths","name_suffix":"","institution":"UC Berkeley","department":""}],"date_submitted":null,"date_accepted":null,"date_published":"2018-01-01T18:00:00Z","render_galley":null,"galleys":[{"label":"PDF","type":"pdf","path":"https://journalpub.escholarship.org/cognitivesciencesociety/article/28057/galley/17696/download/"}]}