{"pk":23983,"title":"Can reinforcement learning model learning across development? Online lifelong learning through adaptive intrinsic motivation","subtitle":null,"abstract":"Reinforcement learning is a powerful model of animal learning in brief, controlled experimental conditions, but does not readily explain the development of behavior over an animal's whole lifetime. In this paper, we describe a framework to address this shortcoming by introducing the single-life reinforcement learning setting to cognitive science. We construct an agent with two learning systems: an extrinsic learner that learns within a single lifetime, and an intrinsic learner that learns across lifetimes, equipping the agent with intrinsic motivation. We show that this model outperforms heuristic benchmarks and recapitulates a transition from exploratory to habit-driven behavior, while allowing the agent to learn an interpretable value function. We formulate a precise definition of intrinsic motivation and discuss the philosophical implications of using reinforcement learning as a model of behavior in the real world.","language":"eng","license":{"name":"","short_name":"","text":null,"url":""},"keywords":[{"word":"Cognitive Neuroscience; Philosophy; Learning; Machine learning; Neural Networks"}],"section":"Papers with Poster Presentation","is_remote":true,"remote_url":"https://escholarship.org/uc/item/1td977rv","frozenauthors":[{"first_name":"Kai","middle_name":"J","last_name":"Sandbrink","name_suffix":"","institution":"University of Oxford","department":""},{"first_name":"Brian","middle_name":"","last_name":"Christian","name_suffix":"","institution":"University of Oxford","department":""},{"first_name":"Linas","middle_name":"M.","last_name":"Nasvytis","name_suffix":"","institution":"Harvard University","department":""},{"first_name":"Christian","middle_name":"","last_name":"Schroeder de Witt","name_suffix":"","institution":"University of Oxford","department":""},{"first_name":"Patrick","middle_name":"","last_name":"Butlin","name_suffix":"","institution":"University of Oxford","department":""}],"date_submitted":null,"date_accepted":null,"date_published":"2024-01-01T18:00:00Z","render_galley":null,"galleys":[{"label":"PDF","type":"pdf","path":"https://journalpub.escholarship.org/cognitivesciencesociety/article/23983/galley/13577/download/"},{"label":"PDF","type":"pdf","path":"https://journalpub.escholarship.org/cognitivesciencesociety/article/23983/galley/20976/download/"}]}