Operant conditioning as a Markovian decision problem: toward a dynamic model of asymptotic performance under random ratio schedules of reinforcement.
J. Jozefowiez
Université de Lille, Ch. de Gaulle, France

Reinforcement learning is one of the most active research areas in contemporary artificial intelligence. It deals with the design of algorithms that allow computer agents to learn how to maximize the goods they collect while interacting with an unknown environment. Neural network models of operant conditioning based on these algorithms have recently been proposed, but most of them do not exploit the framework of Markovian decision problems, which is central to recent developments in reinforcement learning. In this communication, we argue that Markovian decision problems and reinforcement learning can serve as a formal framework for the analysis of operant learning in animals. We provide an example of how they can be used to derive a model of asymptotic performance under random ratio schedules of reinforcement. Our goal is a dynamic model of random ratio performance, i.e. one that accounts not only for the molar properties of behavior under random ratio schedules but also for its molecular properties, by explaining how response rate changes from moment to moment.
