Visual Perception of Obstacles: Do Humans and Machines Focus on the Same Image Features?

Kyriakides, Constantinos; Thoma, Marios; Theodosiou, Zenonas; Partaourides, Harris; Michael, Loizos; Lanitis, Andreas

doi:10.5220/0012453500003660

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/33221

Title:	Visual Perception of Obstacles: Do Humans and Machines Focus on the Same Image Features?
Authors:	Kyriakides, Constantinos; Thoma, Marios; Theodosiou, Zenonas; Partaourides, Harris; Michael, Loizos; Lanitis, Andreas
Major Field of Science:	Social Sciences
Field Category:	Media and Communications
Keywords:	Obstacle Recognition;Deep Learning Algorithms;Explainability;Eye Tracking;Heatmaps
Issue Date:	2024
Source:	Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2024, vol. 4, pp. 357-364
Volume:	4
Start page:	357
End page:	364
Conference:	19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Abstract:	Contemporary cities are fractured by a growing number of barriers, such as ongoing construction and infrastructure damage, which endanger pedestrian safety. Automated detection and recognition of such barriers from visual data has been of particular concern to the research community in recent years. Deep Learning (DL) algorithms are now the dominant approach in visual data analysis, achieving excellent results in a wide range of applications, including obstacle detection. However, explaining the underlying operations of DL models remains a key challenge in understanding how they arrive at their decisions. Heatmaps that highlight the focal points in input images that helped the models reach their predictions have emerged as a form of post-hoc explainability for such models. In an effort to gain insights into the learning process of DL models, we studied the similarities between heatmaps generated by a number of architectures trained to detect obstacles on sidewalks in images collected via smartphones, and eye-tracking heatmaps recorded from humans as they detected the corresponding obstacles in the same images. Our findings indicate that the focus points of humans align more closely with those of a Vision Transformer architecture than with those of the other network architectures we examined in our experiments.
URI:	https://hdl.handle.net/20.500.14279/33221
ISBN:	9789897586798
DOI:	10.5220/0012453500003660
Rights:	Attribution-NonCommercial-NoDerivatives 4.0 International
Type:	Book Chapter
Affiliation:	CYENS - Centre of Excellence; Open University Cyprus; Cyprus University of Technology; AI Cyprus Ethical Novelties
Funding:	Directorate General for European Programmes, Coordination and Development Funding sponsor Framework Programme
Publication Type:	Peer Reviewed
Appears in Collections:	Κεφάλαια βιβλίων/Book chapters
