Active vision: control of eye-movements and probabilistic planning
Final Report Abstract
A fundamental component of interacting with the world is acquiring task-relevant information to achieve our goals. What makes this a difficult problem is the inescapable probabilistic nature of our world. Gaze selection is a prime example of a behavior in which multiple uncertainties about the state of the world and the consequences of our actions interact. To reduce the uncertainty about our visual environment, we sequentially move our eyes toward different parts of the scene. Based on the acquired information and internal states, we decide where to direct our gaze next. In an uncertain and dynamic world, it is therefore particularly important to adopt strategies that handle these uncertainties so that we can successfully achieve our long-term goals. Planning only a single eye movement at a time is thus often suboptimal. In this project, we experimentally investigated and quantified human behavior in sequential tasks that require participants to detect targets with controlled spatial and temporal statistics. Importantly, to understand the observed human behavior, we developed normative computational models of these tasks based on sequential decision-making under uncertainty, specifically Partially Observable Markov Decision Processes (POMDPs) that include cognitive and sensorimotor costs for visual behaviors. First, we investigated how human blinking adjusts to environmental event statistics and task demands. We developed a model of blinking as probabilistic planning, where events follow an inhomogeneous Poisson process, a very general model of natural event statistics. Second, we investigated human gaze allocation in a temporally extended task requiring continuous switching between a reward collection task and a monitoring task. A probabilistic planning model revealed that participants adaptively traded off gathering information against gathering rewards, and that they also took into account their internal costs for a switch and the cognitive cost of monitoring. Third, we showed that integrating the measured costs of human gaze shifts can improve the prediction of the next gaze target in several current saliency models. Fourth, to investigate the adaptability of gaze selection in tasks that classically have been understood as involving overt cognitive planning, we showed that human gaze during maze solving can be related to cognitive planning costs, i.e., the number of considered alternative paths and their respective lengths. Fifth, in an applied project, we developed a system that measures individual participants’ reading speed and uses this measurement to progressively design novel fonts. To the best of our knowledge, this is the first individualized, adaptive, interactive font. We showed that within an hour-long experiment, participants could improve their reading speed by up to 20%. Patents for this system have been obtained in both Germany and the USA. The relevance of these results extends beyond cognitive science, psychology, neuroscience, and related fields to areas such as human-machine interaction, as the human capabilities of handling multiple sources of uncertainty are still unmatched by artificial intelligence systems.
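As an illustration of the event model named above, the following minimal sketch simulates an inhomogeneous Poisson process by thinning and computes the probability that at least one event falls into a short interval during which the eyes are closed. It is not the project's model: the rate function `rate`, its bound `RATE_MAX`, the 0.3 s blink duration, and the function names are hypothetical choices made only for this example.

```python
import math
import random

def rate(t):
    """Hypothetical event rate (events per second), peaking every 4 s."""
    return 0.5 + 0.4 * math.sin(2 * math.pi * t / 4.0)

RATE_MAX = 0.9  # assumed upper bound on rate(t), required by thinning

def sample_events(t_end, rng):
    """Draw event times on [0, t_end) by thinning a homogeneous process."""
    events, t = [], 0.0
    while True:
        t += rng.expovariate(RATE_MAX)         # candidate at constant rate
        if t >= t_end:
            return events
        if rng.random() < rate(t) / RATE_MAX:  # keep with prob rate/RATE_MAX
            events.append(t)

def miss_probability(start, duration, steps=1000):
    """P(at least one event in [start, start + duration)).

    For a Poisson process this equals 1 - exp(-integral of the rate);
    the integral is approximated with the midpoint rule.
    """
    dt = duration / steps
    integral = sum(rate(start + (i + 0.5) * dt) for i in range(steps)) * dt
    return 1.0 - math.exp(-integral)

if __name__ == "__main__":
    rng = random.Random(0)
    events = sample_events(60.0, rng)
    print("number of simulated events:", len(events))
    # Hypothetical 0.3 s blink placed around the rate peak vs. the trough.
    print("miss probability near peak:  ", miss_probability(0.85, 0.3))
    print("miss probability near trough:", miss_probability(2.85, 0.3))
```

Under these assumptions, a blink placed near the trough of the rate function is less likely to coincide with an event than one placed near the peak, which is the kind of timing trade-off a planning model of blinking can exploit.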
Publications
- AdaptiFont: Increasing Individuals’ Reading Speed with a Generative Font Model and Bayesian Optimization. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1-11. ACM.
  Kadner, Florian; Keller, Yannik & Rothkopf, Constantin
- Trade-off between uncertainty reduction and reward collection reveals intrinsic cost of gaze switches. Journal of Vision, 22(14), 3400.
  Kadner, Florian; Wilke, Tabea A.; Vo, Thi DK; Hoppe, David & Rothkopf, Constantin A.
- Finding your Way Out: Planning Strategies in Human Maze-Solving Behavior. In Proceedings of the Annual Meeting of the Cognitive Science Society (Vol. 45, No. 45).
  Kadner, F., Willkomm, H., Ibs, I. & Rothkopf, C.
- Improving saliency models’ predictions of the next fixation with humans’ intrinsic cost of gaze shifts. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2103-2113. IEEE.
  Kadner, Florian; Thomas, Tobias; Hoppe, David & Rothkopf, Constantin A.
- Active vision as sequential decision-making under uncertainty. [Doctoral dissertation, Technical University of Darmstadt].
  Kadner, F.
