Abstracts Track 2023

Area 1 - Agents and Human Interaction

Nr: 6

Virtually Stressed: Interaction Between Display System and Virtual Agent Behaviour


Kesassi Celia, Mathieu Chollet, Cédric Dumas and Caroline L. Cao

Abstract: Many virtual applications use virtual agents for training public speaking skills. In such applications, users are immersed in a virtual environment and deliver a speech to a virtual audience before them. To study the effects of virtual audiences on users, Pertaub et al. (2002) and Barreda-Ángeles et al. (2020) conducted separate studies comparing the effects of a positive and a negative audience (i.e., smiling, frowning) on stress levels. Results showed that a negative audience induced higher stress levels than a positive audience. However, Chollet et al. (2018) conducted a study that showed no difference when comparing the effects of a positive and a negative audience on stress levels. We surmise that this inconsistency may be due to the use of different display systems, which affect the level of presence. We propose that when participants feel a high sense of presence, the negative audience’s behaviour will be more salient, leading to higher stress compared to the positive audience; if the level of presence is low, stress levels will not be affected by the audience. To test this hypothesis, we conducted a 2x2 study to investigate whether the emotional behaviour of a virtual audience affects self-reported stress as a function of the display system. Participants were assigned either to a virtual reality (VR) condition, in which the audience was displayed in a VR headset, presumably generating a higher sense of presence, or to a screen display (SD) condition, in which the audience was projected on a wall to generate a lower sense of presence. Participants were asked to argue for or against controversial topics they chose from a list, delivering two speeches: one to a positive audience and one to a negative audience. Subject performance was recorded with audio-visual and physiological sensors.
The virtual agents exhibited their emotional state through their postures, head movements, gaze, and facial expressions, modeled by Chollet et al. (2015). There were 36 participants (F=24, M=12), aged between 17 and 74; among them, we identified and removed one outlier. Participants reported their level of stress on a visual analogue scale (VAS) before the task, after the first presentation, and after the second presentation. Analysis showed that, after each presentation, the level of stress increased significantly compared to the baseline. We examined the interaction between the type of display (i.e., VR or SD) and the behaviour of the audience (i.e., positive or negative); results showed no interaction effect on stress levels. However, social presence (the perceived ability of the medium to connect people), measured using the Nowak & Biocca (2003) questionnaire, was higher in the VR condition than in the SD condition. This study supports previous findings that virtual audiences for public speaking can lead to increased self-reported stress. Results did not show a significant interaction between the type of display and audience behaviour, although social presence was higher in the VR condition than in the SD condition. In future work we plan to conduct further analyses to examine our hypothesis, including examining whether self-reported stress is correlated with physiological measures of stress, and examining behavioural measures collected in the video and voice recordings.
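The 2x2 interaction test described above can be illustrated with a simple difference-of-differences contrast on the cell means. The following sketch is purely illustrative: the VAS scores, cell sizes, and function name are invented placeholders, not the study's data or analysis code.

```python
from statistics import mean

# Hypothetical VAS stress scores (0-100) for the four cells of the
# 2x2 design; all values below are invented for illustration.
scores = {
    ("VR", "negative"): [62, 70, 58, 66],
    ("VR", "positive"): [55, 60, 52, 57],
    ("SD", "negative"): [50, 54, 48, 52],
    ("SD", "positive"): [49, 53, 47, 51],
}

def interaction_contrast(scores):
    """Difference of differences: (VR neg - VR pos) - (SD neg - SD pos).
    A value near zero suggests no display-by-audience interaction."""
    vr = mean(scores[("VR", "negative")]) - mean(scores[("VR", "positive")])
    sd = mean(scores[("SD", "negative")]) - mean(scores[("SD", "positive")])
    return vr - sd

print(interaction_contrast(scores))
```

In a full analysis this contrast would be tested for significance (e.g., with a mixed-design ANOVA), since the display factor is between-subjects and the audience factor is within-subjects.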

Area 2 - Image and Video Processing and Analysis

Nr: 11

Poisson Equation with Heterogeneous Differential Operators


Mattia Galeotti, Alessandro Sarti and Giovanna Citti

Abstract: The visual cortex of the human brain has the ability to extract different features from the visual retinal input. In particular, the stimulus function on the first layer of the visual cortex (called V1) encodes the positions and orientations of image contours. Cortical cells act on the image received through the eyes by differentiating the stimulus, as operators changing from point to point [1]. An important problem of contemporary neuroscience consists in understanding whether the perceived image can still be reconstructed starting from the partial information carried by the feedforward action of cells in V1. To this end, we consider the cortical Receptive Profiles (RPs) as Gaussian derivatives with heterogeneous metrics and derivation orders [2]; the reconstructed image would then be a solution of the associated inverse problem, which is a Poisson-type equation with a differential operator changing from point to point. We can write this as Lu = m, where u is the function encoding the reconstructed image, L = L_{x,y,\theta} is a differential operator that varies with position and orientation, and m = LI is the transform of the visual stimulus I, often obtained via a convolution process. To solve this, we consider discretized second-order operators on regular grids and their convergence results. In particular, if \partial_z^\varepsilon is the discrete differential along z in Z^d, then for any finite subset \Lambda of Z^d symmetric with respect to 0, and any matrix function A: R^d --> R^{\Lambda x \Lambda}, we can define a second-order operator, also denoted by A with a slight abuse of notation, such that for any function u: R^d --> R, Au(x) := \sum_{z,z' \in \Lambda} \partial_{-z}^\varepsilon (a_{zz'}(x) \partial_{z'}^\varepsilon u(x)).
The same construction generalizes to matrices defined over regular grids, A^\varepsilon: \varepsilon Z^d --> R^{\Lambda x \Lambda}, and we can introduce a notion of ellipticity for these operators that is compatible with the usual one. The second-order discrete elliptic operators A^\varepsilon that we consider are stochastically defined, that is, A^\varepsilon = A^\varepsilon(x)(w) with w lying in a probability space. With an appropriate definition of H-convergence [3], we obtain that if the A^\varepsilon satisfy an ergodicity-type condition, then they converge to a classical elliptic operator A^0 which is non-stochastic, meaning that it does not depend on w. This framework applies to various discrete distributions analogous to the distribution of the V1 cortex. Finally, we perform a numerical implementation of different distributions of second- and fourth-order differential operators, evaluating the reconstruction of the perceived image. In particular, we focus on the perceptual phenomena of lightness and color constancy, that is, the ability to reconstruct constant lightness and color percepts under different illuminations. References: [1] Richard A. Young. The Gaussian derivative model for spatial vision: I. Retinal mechanisms. Spatial Vision, 2(4):273–293, 1987. [2] Ron Kimmel, Michael Elad, Doron Shaked, Renato Keshet, and Irwin Sobel. A variational framework for retinex. International Journal of Computer Vision, 52(1):7–23, 2003. [3] Ennio De Giorgi. G-operators and Γ-convergence. In Proceedings of the International Congress of Mathematicians, volume 1, 1984.
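The discrete second-order operator Au(x) defined above can be sketched numerically in a few lines. The sketch below takes the simplest case, d = 1 with \Lambda = {1}; the grid spacing, coefficient field, and test function are illustrative choices, not the authors' implementation. With the forward difference \partial_z^\varepsilon f(x) = (f(x + \varepsilon z) - f(x)) / \varepsilon, constant a = 1 and u(x) = x^2, the operator reduces to a discrete Laplacian and returns exactly -2 (the sign comes from the -z convention).

```python
# Sketch of the heterogeneous second-order operator
#   Au(x) = sum_{z,z' in Lambda} d_{-z}( a_{zz'}(x) d_{z'} u(x) )
# in 1D with Lambda = {1}; an illustrative stand-in, not the paper's code.

def d_eps(f, x, z, eps):
    """Discrete differential along z: (f(x + eps*z) - f(x)) / eps."""
    return (f(x + eps * z) - f(x)) / eps

def A(u, a, x, eps, lam=(1,)):
    """Apply the heterogeneous second-order operator at point x."""
    total = 0.0
    for z in lam:
        for zp in lam:
            # inner field v(y) = a_{z z'}(y) * d_{z'} u(y)
            v = lambda y, zp=zp: a(y) * d_eps(u, y, zp, eps)
            # outer discrete differential along -z
            total += d_eps(v, x, -z, eps)
    return total

# Constant coefficient, u(x) = x^2: A u should be -2 everywhere.
print(A(lambda x: x * x, lambda x: 1.0, 0.5, 1e-3))
```

Replacing the constant coefficient with a position-dependent a(x) (or enlarging \Lambda) gives the point-to-point heterogeneity that the abstract describes.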

Area 3 - Information Visualization

Nr: 10

The InVizAR Project: Augmented Reality Visualization for Non-Destructive Testing Data from Jacket Platforms


Costas Boletsis, Arne Lie, Ophelia Prillard, Karsten Husby and Jiaxin Li

Abstract: There is an increasing need for underwater condition control of offshore steel platforms and wind and fish farming facilities. Localized diagnostic techniques, such as magnetic field non-destructive testing (NDT) methodologies for the structural monitoring of such facilities, are very important for detecting early signs of deterioration and damage, thus preventing fatal accidents. The visualization of such magnetic fields can define the parts that the diagnostic process will cover and lead to the detection of structural flaws. A proper visualization is essential for better interpretation of data, informed decision making, and safety. The InVizAR project (accessible at: www.hcilab.no/invizar2022) is formulated to explore, design, and present a suitable visualization of NDT data from inspections of jacket platforms. Tiny cracks on the surface of the metal generate invisible magnetic anomalies, and the objective of InVizAR is to visualize these signals. InVizAR utilizes augmented reality (AR) technology: AR can visualize invisible signals and their spatial and temporal qualities (4D), overlaying them atop the real-world view as layers while facilitating team collaboration in metaverse spaces. InVizAR uses a real-life dataset recorded in the ANDWIS project for the client OceanTech Innovation AS; the dataset contains geospatial and temporal values from an NDT probe on a jacket platform. The Unity game engine is used for AR development, so a new API structure based on the GeoJSON format is applied to the dataset for Unity-importing purposes. In the initial stages of concept design, potential visualization modes are identified based on the visualization's spatial elements (location) and the data feed’s timing. Hence, it becomes clear that InVizAR facilitates a use case in which an administrator wants to communicate the probing results and their 4D qualities remotely to a client or co-worker.
Simultaneously, AR is chosen as a long-term strategy so the work can be extended in the future to cover "on-location," contextual AR visualizations of such datasets. Based on a literature search and searches for commercial devices visualizing NDT results in 2D, a heatmap visualization is chosen. A heatmap is a powerful tool for visualizing multidimensional data, in which individual values are expressed as colors. Subsequently, an AR heatmap visualization of the ANDWIS dataset is developed in Unity and presented through a video recording. The heatmap visualizes frequency deviations, which signify cracks, in red. One slider is implemented to adjust the transparency of the AR visualization and another to navigate between visualizations of past datasets. Users can also tap on a point of the AR visualization to obtain information about measurements in that area. Through an internal peer-review process by the team's experts, the InVizAR AR heatmap is considered a suitable and user-friendly visualization that can serve current NDT use cases and the communication of their results in AR/metaverse spaces. Future work will create collaborative metaverse spaces for communicating and working on visualizations, as well as address additional visualization modes.
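The GeoJSON-based packaging of probe readings for Unity import could look roughly like the sketch below. The field names (frequency_deviation, timestamp) and the sample values are assumptions for illustration; they are not the ANDWIS dataset schema or the project's actual API.

```python
import json

# Illustrative sketch: wrap NDT probe readings as a GeoJSON
# FeatureCollection so a Unity importer can parse one record per point.
# All field names and values here are hypothetical.
def readings_to_geojson(readings):
    features = [
        {
            "type": "Feature",
            "geometry": {
                "type": "Point",
                "coordinates": [r["x"], r["y"], r["z"]],  # position on the jacket
            },
            "properties": {
                "timestamp": r["t"],                      # temporal value
                "frequency_deviation": r["dev"],          # heatmap value (red = crack)
            },
        }
        for r in readings
    ]
    return {"type": "FeatureCollection", "features": features}

sample = [{"x": 1.0, "y": 2.0, "z": -15.0,
           "t": "2022-05-01T10:00:00Z", "dev": 0.07}]
print(json.dumps(readings_to_geojson(sample), indent=2))
```

Keeping the deviation value in `properties` leaves the geometry standard-conformant, so generic GeoJSON tooling can still read the file while the AR heatmap maps `frequency_deviation` to color.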

Area 4 - Interaction Techniques and Devices

Nr: 14

Use of Music Snippets to Authenticate Users


Bob-Antoine J. Menelas

Abstract: We report the evaluation of an authentication system that takes advantage of the auditory perceptual capabilities of humans. Since people usually face great difficulty in remembering complicated passwords, the user logs in not by entering a fixed sequence of characters or positions, but by identifying a set of music snippets. The studied system dynamically changes the authentication token. The system appears to be enjoyable and easy to use.
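One way such a dynamically changing token could work is a challenge-response scheme in which each login mixes the user's enrolled snippets with fresh decoys. The sketch below is our own speculative illustration of that idea; the catalogue, set sizes, and function names are invented, not the evaluated system's design.

```python
import secrets

# Hypothetical snippet catalogue; filenames are placeholders.
CATALOGUE = [f"snippet_{i:03d}.ogg" for i in range(100)]

def enroll(k=4):
    """At enrollment, the user memorises k snippets drawn at random."""
    pool = list(CATALOGUE)
    return {pool.pop(secrets.randbelow(len(pool))) for _ in range(k)}

def make_challenge(user_set, n_decoys=8):
    """Fresh challenge each login: user's snippets shuffled with decoys,
    so the presented token differs from session to session."""
    decoys = [s for s in CATALOGUE if s not in user_set]
    chosen = {decoys.pop(secrets.randbelow(len(decoys))) for _ in range(n_decoys)}
    return sorted(user_set | chosen)

def verify(user_set, selected):
    """Login succeeds only if exactly the enrolled snippets are picked."""
    return set(selected) == user_set
```

Because the decoys are resampled per session, an observer who watches one login cannot simply replay the selection positions, which is the usual motivation for dynamic tokens.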

Nr: 16

Eyesight Free Image Representation by Tomographic Display


Keishin Yamamoto, Fumihiko Sakaue and Jun Sato

Abstract: In recent years, several technologies have been proposed for virtually correcting the vision of people with low vision by processing the images presented to them so that they can observe clear images with their naked eyes. These technologies often use a special display called a light-field display, which can present different images in different directions and is widely used in 3D image presentation, where different images are presented to the left and right eyes. However, the aperture of the human eye is very narrow compared to the distance between the left and right eyes, so very precise image control is required to present a clear image using this display. In this study, we propose a new method to achieve virtual vision correction using a tomographic display. This method achieves more efficient vision correction than the method using a light-field display. Furthermore, by exploiting this efficient correction, we show a method to realize vision correction for various levels of eyesight without prior eyesight measurement. This is more convenient than conventional virtual vision correction methods, which require prior measurement of the visual characteristics of the user. A tomographic display divides the 3D space to be represented into a set of 2D tomographic layers and presents these layers at the corresponding depths to realize 3D image presentation. Since it is difficult to change the physical position of the display screen, a Focal Tunable Lens (FTL) is placed between the display and the observer, and its focal length is changed in synchronization with the image on the screen to virtually change the depth of the screen. The image and focal-length changes are sufficiently fast compared to the speed of human observation that a human observer perceives all depth images at the same time. In this research, this property is used not for 3D image presentation but for virtual vision correction.
As mentioned above, all depth images are observed simultaneously in a single observation, as the integration of the tomographic layers blurred by the user's eyesight. Next, we consider an image presentation method that can correct the vision of multiple observers. Assume that the eyesight of each of n observers with different visual characteristics is known. In this case, the presented layer images are determined by simultaneously minimizing the error between the observed image and the target image for all n users. However, this does not control the images observed by observers other than the n used to determine the presented images. Therefore, we introduce a regularization term that smooths the change of the observed image with respect to the change of eyesight. To confirm the effectiveness of the proposed method, an experiment was conducted in an environment simulated on a computer. In this experiment, the distance from the screen to the eyes was set to 500 mm, and the tomographic display was assumed to have seven virtual screens at depths ranging from 350 mm to 650 mm at 50 mm intervals. The results of observing the images created by the proposed method with different visual acuities are shown in the supplemental materials. Although the screen is placed at 500 mm, the images are clear at all vision levels, indicating that the proposed method provides appropriate virtual vision correction. Further details are provided in the supplemental materials.
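The multi-observer optimisation described above amounts to a regularized least-squares problem. In the sketch below, x stacks the layer images, each B_i is a linear blur model for observer i's eyesight, t is the target image, and D is a smoothing regularizer; minimizing sum_i ||B_i x - t||^2 + lambda ||D x||^2 gives the normal equations solved here. The tiny random matrices are stand-ins for illustration only, not the paper's imaging model.

```python
import numpy as np

def solve_layers(Bs, t, D, lam=0.1):
    """Solve min_x sum_i ||B_i x - t||^2 + lam ||D x||^2 via the
    normal equations (sum_i B_i^T B_i + lam D^T D) x = sum_i B_i^T t."""
    lhs = sum(B.T @ B for B in Bs) + lam * (D.T @ D)
    rhs = sum(B.T @ t for B in Bs)
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(0)
# Three hypothetical observers with slightly different blur operators.
Bs = [np.eye(6) + 0.1 * rng.standard_normal((6, 6)) for _ in range(3)]
t = rng.standard_normal(6)          # target image (flattened)
D = np.diff(np.eye(6), axis=0)      # first-difference smoothing regularizer
x = solve_layers(Bs, t, D)
```

The regularizer plays the role described in the abstract: it penalizes abrupt changes in the solution so that observers whose eyesight falls between the n modeled ones still see a reasonable image.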