Research project
Behaviour modelling and lifelogging

About the project
Lifelogging is an emerging ICT technology that uses wearable devices (e.g., cameras, activity trackers, and other body-worn sensors) to capture, store, process, and retrieve the situations, states, and contexts of an individual's daily life. A wearable camera that records actions from an egocentric perspective, as video or as a stream of images, can automatically provide detailed insights into the activities of the person wearing it: how they eat, which places they visit, with whom they interact, which events they attend, and more. The goal of this thesis was to create personalised tools and services that monitor, store, and process behavioural patterns, nutrition habits, social environments, contexts, and physical activities, while bringing the technology closer to end users to support and improve their lifestyle and health.

Start date: April 2021
Expected end date: Summer 2025

Progress of the project
Under this project, state-of-the-art research on egocentric vision and lifelogging was first reviewed from a computer vision perspective, focusing on segmentation, action recognition, food-related scenes, and social interaction tracking, and a comprehensive database of available datasets in the field was compiled. This initial review identified several open challenges that have received limited attention from the scientific community, leading to research on the potential of egocentric video-based lifelogging systems for tracking actions and behaviours that affect health and well-being. The feasibility of processing egocentric images for health-related tasks, such as rehabilitation and detecting struggles in order to provide timely assistance, was also investigated.

Building on these questions, significant contributions were made through novel approaches to 2D hand pose estimation that achieve superior accuracy on public benchmarks. A 3D hand pose estimation method was also introduced that improves results using pseudo-depth data derived from RGB images, and a camera-agnostic approach for zero-shot 3D hand pose estimation further reduced mean pose error on unseen domains. These advances help bridge the gap between laboratory conditions and real-world scenarios, improving the generalisability and reliability of experimental results. Further investigations analysed the usability of 2D hand and object poses for egocentric action recognition and evaluated how performance varies with different types of pose input.

These theoretical advances have been translated into practical and novel AAL applications, including an intelligent reading assistant for visually impaired users that integrates smart glasses with large language models (LLMs), and a study on rehabilitation for stroke patients that establishes benchmarks for exercise recognition, form evaluation, and repetition counting. This research demonstrates how egocentric vision can meaningfully support individuals in need by bridging the gap between advanced computer vision techniques and real-world assistive technologies.
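To make the pseudo-depth idea above concrete, the following is a minimal, hypothetical sketch rather than the published method: it uses an off-the-shelf monocular depth estimator (MiDaS) to obtain relative pseudo-depth from a single egocentric RGB frame, keeps only the near-range pixels where hands and arms typically appear, and then runs a generic 2D hand keypoint detector (MediaPipe Hands) on the masked frame. The model choices, the keep_ratio threshold, and the input path are illustrative assumptions, not values from the project.

```python
# Sketch only: pseudo-depth-guided near-range masking before 2D hand keypoint detection.
import cv2
import numpy as np
import torch
import mediapipe as mp

# Off-the-shelf relative-depth estimator (MiDaS small variant) and its input transform.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

def near_range_mask(frame_bgr, keep_ratio=0.35):
    """Return a boolean mask of the closest `keep_ratio` fraction of pixels.

    MiDaS predicts inverse relative depth (larger = closer), so keeping the top
    quantile retains the near-range region where egocentric hands usually are.
    `keep_ratio` is an assumed tuning knob, not a value from the thesis.
    """
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        pred = midas(transform(rgb))
        pred = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=rgb.shape[:2], mode="bicubic", align_corners=False
        ).squeeze().cpu().numpy()
    threshold = np.quantile(pred, 1.0 - keep_ratio)
    return pred >= threshold

def hand_keypoints_on_masked_frame(frame_bgr):
    """Suppress far-range background via pseudo-depth, then detect 2D hand keypoints."""
    mask = near_range_mask(frame_bgr)
    masked = frame_bgr.copy()
    masked[~mask] = 0  # black out background clutter beyond the near range
    with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=2) as hands:
        result = hands.process(cv2.cvtColor(masked, cv2.COLOR_BGR2RGB))
    return result.multi_hand_landmarks  # normalised (x, y) landmarks per hand, or None

if __name__ == "__main__":
    frame = cv2.imread("egocentric_frame.jpg")  # placeholder input path
    print(hand_keypoints_on_masked_frame(frame))
```

In a similar spirit, the repetition-counting task mentioned for the rehabilitation study can be illustrated by a simple peak-detection baseline over a 1D keypoint trajectory (e.g., the vertical wrist coordinate over time). This is a sketch under assumed parameters, not the benchmark method from the study.

```python
# Sketch only: count exercise repetitions as prominent peaks in a wrist trajectory.
import numpy as np
from scipy.signal import find_peaks

def count_repetitions(wrist_y, fps=30, min_cycle_s=1.0, prominence=0.05):
    """One repetition per prominent peak; parameters are illustrative assumptions."""
    wrist_y = np.asarray(wrist_y, dtype=float)
    # Normalise so the prominence threshold is independent of image scale.
    wrist_y = (wrist_y - wrist_y.min()) / (np.ptp(wrist_y) + 1e-8)
    peaks, _ = find_peaks(wrist_y, distance=int(min_cycle_s * fps), prominence=prominence)
    return len(peaks)
```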
Scientific publications

SHARP: Segmentation of Hands and Arms by Range Using Pseudo-depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition
Wiktor Mucha, Michael Wray, Martin Kampel
In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, C.L., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol. 15315. Springer, Cham, 2024.

REST-HANDS: Rehabilitation with Egocentric Vision Using Smartglasses for Treatment of Hands after Surviving Stroke
Wiktor Mucha, Kentaro Tanaka, Martin Kampel
In: 12th International Workshop on Assistive Computer Vision and Robotics (ACVR 2024), 2024.

TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model
Wiktor Mucha, Florin Cuconasu, Naome A. Etori, Valia Kalokyri, Giovanni Trappolini
In: Miesenberger, K., Peňáz, P., Kobayashi, M. (eds) Computers Helping People with Special Needs. ICCHP 2024. Lecture Notes in Computer Science, vol. 14751. Springer, Cham, 2024.

Understanding Human Behaviour With Wearable Cameras Based on Information From the Human Hand
Wiktor Mucha, Martin Kampel
In: Proceedings of the Joint visuAAL-GoodBrother Conference on Trustworthy Video- and Audio-based Assistive Technologies, pp. 10-13, 2024.

In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition
Wiktor Mucha, Martin Kampel
In: Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition (FG), Istanbul, Türkiye, pp. 1-9, IEEE, 2024.

Hands, Objects, Action! Egocentric 2D Hand-Based Action Recognition
Wiktor Mucha, Martin Kampel
In: Christensen, H.I., Corke, P., Detry, R., Weibel, J.B., Vincze, M. (eds) Computer Vision Systems. ICVS 2023. Lecture Notes in Computer Science, vol. 14253. Springer, Cham, 2023.

Ego2DHands: Generating 2D Hand Skeleton in Egocentric Vision
Wiktor Mucha, Martin Kampel
In: 26th Computer Vision Winter Workshop (CVWW), Krems an der Donau, Austria, 2023.

Beyond Privacy of Depth Sensors in Active and Assisted Living Devices
Wiktor Mucha, Martin Kampel
In: Proceedings of the 15th International Conference on PErvasive Technologies Related to Assistive Environments, pp. 425-429, 2022.

Addressing Privacy Concerns in Depth Sensors
Wiktor Mucha, Martin Kampel
In: International Conference on Computers Helping People with Special Needs, pp. 526-533, Springer, Cham, 2022.
TriModal Face Detection Dataset
Wiktor Mucha, Martin Kampel

Depth and Thermal Images in Face Detection - A Detailed Comparison Between Image Modalities
Wiktor Mucha, Martin Kampel
In: Proceedings of the 5th International Conference on Machine Vision and Applications (ICMVA), pp. 16-21, 2022.

State of the Art of Audio- and Video-Based Solutions for AAL
Aleksic, Slavisa; Atanasov, Michael; Calleja Agius, Jean; Camilleri, Kenneth; Čartolovni, Anto; Climent-Pérez, Pau; Colantonio, Sara; Cristina, Stefania; Despotovic, Vladimir; Ekenel, Hazım Kemal; Erakin, Ekrem; Florez-Revuelta, Francisco; Germanese, Danila; Grech, Nicole; Sigurðardóttir, Steinunn Gróa; Emirzeoğlu, Murat; Iliev, Ivo; Jovanovic, Mladjan; Kampel, Martin; Kearns, William; Klimczuk, Andrzej; Lambrinos, Lambros; Lumetzberger, Jennifer; Mucha, Wiktor; Noiret, Sophie; Pajalic, Zada; Pérez, Rodrigo Rodriguez; Petrova, Galidiya; Petrovica, Sintija; Pocta, Peter; Poli, Angelica; Pudane, Mara; Spinsante, Susanna; Ali Salah, Albert; Santofimia, Maria Jose; Islind, Anna Sigríður; Stoicu-Tivadar, Lacramioara; Tellioğlu, Hilda; Zgank, Andrej
GoodBrother COST Action, Technical Report, 2022.

About the ESR
Wiktor received his BSc in Automatic Control and Robotics in 2018 and his MSc in Robotics at the end of 2019, both from the AGH University of Science and Technology in Krakow, Poland. During his master's studies, he spent one year as an exchange student at the University of Aveiro in Portugal. Before joining visuAAL, he gained experience in software engineering, working in the automotive industry on embedded solutions for autonomous driving.

Contact information
Wiktor Mucha
Vienna University of Technology
Computer Vision Lab
Favoritenstr. 9/193-1
A-1040 Vienna, Austria
Email address: wmucha@cvl.tuwien.ac.at