How Does the Crowd Impact the Model? A Tool for Raising Awareness of Social Bias in Crowdsourced Training Data

Perikleous, Periklis; Kafkalias, Andreas; Theodosiou, Zenonas; Barlas, Pinar; Christoforou, Evgenia; Otterbacher, Jahna; Demartini, Gianluca; Lanitis, Andreas

doi:10.1145/3511808.3557178

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/28637

Title:	How Does the Crowd Impact the Model? A Tool for Raising Awareness of Social Bias in Crowdsourced Training Data
Authors:	Perikleous, Periklis; Kafkalias, Andreas; Theodosiou, Zenonas; Barlas, Pinar; Christoforou, Evgenia; Otterbacher, Jahna; Demartini, Gianluca; Lanitis, Andreas
Major Field of Science:	Engineering and Technology
Field Category:	Electrical Engineering - Electronic Engineering - Information Engineering
Keywords:	Algorithmic bias;Biometrics;Crowdsourcing;Data bias;Education
Issue Date:	17-Oct-2022
Source:	31st ACM International Conference on Information & Knowledge Management, 2022, 17–22 October, Atlanta, Georgia, USA
Conference:	ACM International Conference on Information and Knowledge Management
Abstract:	It is increasingly easy for interested parties to play a role in the development of predictive algorithms, with a range of available tools and platforms for building datasets, as well as for training and evaluating machine learning (ML) models. For this reason, it is essential to create awareness among practitioners on the ethical challenges, such as the presence of social bias in training data. We present RECANT (Raising Awareness of Social Bias in Crowdsourced Training Data), a tool that allows users to explore the behaviors of four biometric models - predicting the gender and race, as well as the perceived attractiveness and trustworthiness, of the person depicted in an input image. These models have been trained on a crowdsourced dataset of passport-style people images, where crowd annotators described attributes of the images, and reported their own demographic characteristics. With RECANT, users can explore the correct and wrong predictions made by each model, when using different subsets of the data in training, based on annotator attributes. We present its features, along with sample exercises, as a hands-on tool for raising awareness of potential pitfalls in data practices surrounding ML.
URI:	https://hdl.handle.net/20.500.14279/28637
ISBN:	9781450392365
DOI:	10.1145/3511808.3557178
Rights:	This work is licensed under a Creative Commons Attribution International 4.0 License.
Type:	Conference Papers
Affiliation:	CYENS - Centre of Excellence; University of Queensland
Publication Type:	Peer Reviewed
Appears in Collections:	Conference papers or poster or presentation

Files in This Item:

File	Size	Format
3511808.3557178.pdf	1.4 MB	Adobe PDF
