Thesis Defense of Daniel Martín Serrano

Computational Models of Visual Attention and Gaze Behavior in Virtual Reality

Author: Daniel Martín Serrano

Advisors: Belén Masiá Corcoy and Diego Gutiérrez Pérez

Friday, March 1, at 16:00, in the Salón de Actos of the Ada Byron building


Abstract: Virtual reality (VR) is an emerging medium that has the potential to unlock unprecedented experiences. Since the late 1960s, this technology has advanced steadily, and can nowadays be a gateway to a completely different world. VR offers a degree of realism, immersion, and engagement never seen before, and lately, we have witnessed how new virtual content is continuously being created.

However, to get the most out of this promising medium, there is still much to learn about people’s visual attention and gaze behavior in the virtual universe. Questions like “What attracts users’ attention?” or “How malleable is the human brain when in a virtual experience?” have no definite answer yet. We argue that it is important to build a principled understanding of viewing and attentional behavior in VR.

This thesis presents contributions in two key aspects: understanding and modeling users’ gaze behavior, and leveraging imperceptible manipulations to improve the virtual experience.



In the first part of this thesis, we have focused on developing computational models of gaze behavior in virtual environments. First, we have devised models of user attention in 360° images and 360° videos that are able to predict which parts of a virtual scene are more likely to draw viewers’ attention. Then, we designed two further computational models for spatiotemporal attention prediction: one is able to simulate thousands of virtual observers per second by generating realistic sequences of gaze points in 360° images, and the other predicts different, yet plausible, sequences of fixations on traditional images. Additionally, we have explored how attention works in 3D meshes.

The second part of this thesis attempts to improve virtual experiences by means of imperceptible manipulations. We first focused on lateral movement in VR, and derived thresholds for the detection of such manipulations, which we then applied to three key problems in VR that have no definite solution yet, namely 6-DoF viewing of 3-DoF content, overcoming physical space constraints, and reducing motion sickness. We have also explored the manipulation of the virtual scene itself, resorting to the phenomenon of change blindness, and have derived insights and guidelines on how to elicit or avoid such an effect, and on how the limitations of the human brain affect it.