Risk-sensitive choice and reinforcement learning under uncertainty
Final Report Abstract
This project aimed to investigate human sequential decision making under risk using a model-based approach. Risk can arise from uncertainties at different processing levels in decision making and action planning. In our work we focused on two of them: (1) the risk which emerges because of uncertainty about the consequences of decisions (or their related actions), which we call economic risk, and (2) the risk which arises because of a lack of knowledge about the decision situation, which we call perceptual risk. To model sequential decision making under incomplete knowledge about the state of the environment, Partially Observable Markov Decision Processes (POMDPs) were introduced. Decision making and optimization are based on the belief distribution across the possible environmental states (the belief state), which serves as a sufficient statistic in the risk-neutral case. The risk-sensitive case was captured by adding an information state variable, which quantifies the average accumulated reward the history of observations could produce. One main contribution of the project was to extend this approach to utility functions which consist of sums of exponential terms as the elements of a function approximator series. As a result, an optimization algorithm for risk-sensitive policies was derived, which can handle utility functions implementing the mixed risk-sensitivities often observed in human behavioural studies. If the number of exponential terms is not too large, the new method is computationally more efficient than previous approaches. We applied this modelling framework to human decision making under risk and incomplete information and investigated the impact of perceptual vs. economic uncertainty in a 2 × 2 factorial design. First, we established an experimental paradigm in which the movement direction of a Random Dot Kinematogram (RDK) indicated the state.
Low levels of coherence induced perceptual uncertainty, while probabilistic state changes and associated rewards introduced economic uncertainty. Second, a modelling framework was designed, which consisted of a biologically and psychophysically plausible perceptual inference model combined with a decision making process using the risk-sensitive POMDP framework. The perceptual inference model respects the circular nature of RDK motion signals, endows drift variables with the interpretation of the observer's current posterior probability over possible RDK states, is sensitive to variations in the RDK motion signal over time, and thus serves as a belief state inference model for application in probabilistic optimal decision policy models. Both risk-neutral and risk-sensitive models were then fitted to the subjects' reaction time distributions in the economic-perceptual decision making tasks. We identified subject groups with similar risk preferences, and the distinct distributions supported the assumption that the risk sensitivity towards perceptual uncertainty guided subjects' choice behavior. In general, the risk-sensitive model fitted the experimental data considerably better than the risk-neutral model. Hence the results serve as a proof of principle for the applicability of risk-sensitive POMDPs to human risk-sensitive choice under perceptual uncertainty.
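As an illustration of two ingredients named in the abstract, the belief state and a utility built from a sum of exponential terms, the following minimal Python sketch may help. It is not the project's optimization algorithm: it performs a single Bayes-filter belief update for a hypothetical two-state environment and then compares a sure payoff with a gamble in a one-step choice, without the information state variable used in the full risk-sensitive model. All function names, matrices, reward values, weights and rates are assumptions chosen for illustration.

```python
import math

def belief_update(belief, transition, likelihood):
    """One Bayes-filter step over discrete states.

    transition[i][j] is P(next state = j | current state = i);
    likelihood[j] is P(observation | next state = j).
    """
    n = len(belief)
    # Prediction: push the current belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correction: weight by the observation likelihood and renormalize.
    unnormalized = [likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def sum_exp_utility(x, weights, rates):
    """Utility U(x) = sum_k weights[k] * exp(rates[k] * x)."""
    return sum(w * math.exp(r * x) for w, r in zip(weights, rates))

def expected_utility(belief, rewards, weights, rates):
    """Expected utility when state s (probability belief[s]) pays rewards[s]."""
    return sum(b * sum_exp_utility(r, weights, rates)
               for b, r in zip(belief, rewards))

# Hypothetical two-state example (all numbers are illustrative assumptions).
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.1, 0.9]]
likelihood = [0.8, 0.2]          # the observation favours state 0
posterior = belief_update(prior, transition, likelihood)

sure_rewards = [1.0, 1.0]        # pays 1 in either state
gamble_rewards = [0.0, 5.0]      # same mean payoff under the posterior

# Single-term utilities: -exp(-x) is concave, exp(x) is convex.
averse = ([-1.0], [-1.0])
seeking = ([1.0], [1.0])

averse_prefers_sure = (expected_utility(posterior, sure_rewards, *averse)
                       > expected_utility(posterior, gamble_rewards, *averse))
seeking_prefers_gamble = (expected_utility(posterior, gamble_rewards, *seeking)
                          > expected_utility(posterior, sure_rewards, *seeking))
```

Both options have the same expected payoff under the updated belief, so any preference between them comes from the curvature of the utility function alone. Passing several weights and rates to `sum_exp_utility` combines concave and convex terms, which is one way to picture the mixed risk-sensitivities mentioned above.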
Publications
- Risk Sensitivity under Partially Observable Markov Decision Processes. 2019 Conference on Cognitive Computational Neuroscience.
  Höft, Nikolas; Guo, Rong; Laschos, Vaios; Jeung, Sein; Ostwald, Dirk & Obermayer, Klaus
- Risk-Sensitive Partially Observable Markov Decision Processes as Fully Observable Multivariate Utility Optimization Problems.
  Afsardeir, A.; Kapetanis, A.; Laschos, V. & Obermayer, K.
