aRTIFICIAL iNTELLIGENCE for the Deaf (aiD)


Project title: aRTIFICIAL iNTELLIGENCE for the Deaf (aiD)
Budget: € 1 587 000
Project Coordinator:
Status: Ongoing Project
Start date: 01-12-2019
Expected Completion: 30-11-2023
Funding Program: H2020
OpenAire ID: info:eu-repo/grantAgreement/EC/H2020/872139
Project website:
Abstract
aiD aims to address the challenge of communication and social integration for deaf people by leveraging the latest advances in machine learning (ML), human-computer interaction (HCI), and augmented reality (AR). Specifically, speech-to-text and text-to-speech algorithms have reached high performance as a product of recent breakthroughs in deep learning (DL). However, commercially available systems cannot be readily integrated into a solution targeting communication between deaf and hearing people. On the other hand, existing research efforts to transcribe sign language (SL) video, or to generate synthetic SL footage (an SL avatar) from text, have so far failed to produce satisfactory outcomes.
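
The maturity of off-the-shelf DL speech-to-text referred to above can be illustrated by how little code a working transcriber now takes. The following is a minimal sketch, not part of the aiD system, using torchaudio's pretrained wav2vec 2.0 ASR pipeline with greedy CTC decoding; the audio file name is a placeholder.

    # Minimal sketch (assumption: torchaudio installed; "utterance.wav" is a placeholder).
    import torch
    import torchaudio

    bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
    model = bundle.get_model().eval()
    labels = bundle.get_labels()  # ('-', '|', 'E', 'T', ...); '-' is the CTC blank

    waveform, sample_rate = torchaudio.load("utterance.wav")
    if sample_rate != bundle.sample_rate:
        waveform = torchaudio.functional.resample(waveform, sample_rate, bundle.sample_rate)

    with torch.inference_mode():
        emissions, _ = model(waveform)  # (batch, frames, num_labels) label scores

    # Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks,
    # then map the word-boundary symbol '|' to a space.
    indices = torch.unique_consecutive(emissions[0].argmax(dim=-1))
    transcript = "".join(labels[i] for i in indices if labels[i] != "-").replace("|", " ")
    print(transcript)

A production pipeline would add beam-search decoding and a language model, which is part of why such components still need tailoring before they fit a deaf-to-hearing communication system.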

aiD addresses both of these problems. We develop speech-to-text and text-to-speech modules tailored to the requirements of a system supporting communication for deaf people. Most importantly, we systematically address the core technological challenge of SL transcription and generation in an AR environment. Our vision is to exploit and advance the state of the art in DL to solve these problems with groundbreaking accuracy, in a fashion amenable to commodity mobile hardware. This stands in stark contrast to existing systems, which either depend on sophisticated and costly equipment (multiple vision sensors, gloves, and wristbands) or are lab-only systems limited to fingerspelling rather than the official SL that deaf people actually use. Indeed, the current state of the art requires expensive devices, operates on a word-by-word basis and thus misses syntactic context, and is not amenable to commodity mobile devices. We aim to resolve these inadequacies and offer a concrete solution for real-time interaction between deaf and hearing people. Our core innovation lies in new algorithms and techniques that enable the real-time translation of SL video to text or speech and vice versa (SL avatar generation from speech or text in an AR environment), with satisfactory accuracy, on commodity mobile devices such as smartphones and tablets.
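
To make the target architecture concrete, the sketch below shows a generic PyTorch encoder-decoder that maps per-frame video features to text tokens, in the spirit of transformer-based SL translation. It is illustrative only, not the project's actual model (the publication below uses stochastic transformer networks with linear competing units); SignTranslator and all hyperparameters are invented for the example.

    import torch
    import torch.nn as nn

    class SignTranslator(nn.Module):
        """Illustrative video-to-text model: per-frame features feed a
        Transformer encoder-decoder that scores text tokens."""
        def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=3):
            super().__init__()
            # Toy per-frame encoder; a real system would use a pretrained
            # 2D/3D CNN backbone and add positional encodings (omitted here).
            self.frame_encoder = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                nn.Linear(32 * 4 * 4, d_model),
            )
            self.token_embed = nn.Embedding(vocab_size, d_model)
            self.transformer = nn.Transformer(
                d_model=d_model, nhead=nhead,
                num_encoder_layers=num_layers, num_decoder_layers=num_layers,
                batch_first=True,
            )
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, frames, tokens):
            # frames: (batch, time, 3, H, W); tokens: (batch, seq)
            b, t = frames.shape[:2]
            feats = self.frame_encoder(frames.flatten(0, 1)).view(b, t, -1)
            causal = self.transformer.generate_square_subsequent_mask(tokens.size(1))
            hidden = self.transformer(feats, self.token_embed(tokens), tgt_mask=causal)
            return self.out(hidden)  # (batch, seq, vocab_size)

    # Smoke test on random data: 8 frames of 64x64 RGB video, 5 target tokens.
    model = SignTranslator(vocab_size=1000)
    logits = model(torch.randn(2, 8, 3, 64, 64), torch.randint(0, 1000, (2, 5)))
    print(logits.shape)  # torch.Size([2, 5, 1000])

In a deployed system, decoding would run autoregressively from a start token, and the model would be compressed (quantization, distillation) to meet the commodity-mobile-hardware constraint described above; the smoke test only checks tensor shapes.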
 

Publications




Issue Date | Title | Author(s)
4-Jan-2021 | Local Competition and Stochasticity for Adversarial Robustness in Deep Learning | Panousis, Konstantinos P.; Chatzis, Sotirios P.; Alexos, Antonios; Theodoridis, Sergios
17-Jul-2022 | Stochastic Deep Networks with Linear Competing Units for Model-Agnostic Meta-Learning | Kalais, Konstantinos; Chatzis, Sotirios P.
10-Oct-2021 | Stochastic Transformer Networks With Linear Competing Units: Application To End-to-End SL Translation | Voskou, Andreas; Panousis, Konstantinos P.; Kosmopoulos, Dimitrios I.; Metaxas, Dimitris; Chatzis, Sotirios P.