Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14279/27107
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Voskou, Andreas | - |
dc.contributor.author | Panousis, Konstantinos P. | - |
dc.contributor.author | Kosmopoulos, Dimitrios I. | - |
dc.contributor.author | Metaxas, Dimitris | - |
dc.contributor.author | Chatzis, Sotirios P. | - |
dc.date.accessioned | 2022-12-21T11:18:53Z | - |
dc.date.available | 2022-12-21T11:18:53Z | - |
dc.date.issued | 2021-10-10 | - |
dc.identifier.citation | Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 11946-11955 | en_US |
dc.identifier.uri | https://hdl.handle.net/20.500.14279/27107 | - |
dc.description.abstract | Automating sign language translation (SLT) is a challenging real-world application. Despite its societal importance, research progress in the field remains limited. Crucially, existing methods that yield viable performance require laborious-to-obtain gloss-sequence ground truth. In this paper, we attenuate this need by introducing an end-to-end SLT model that does not entail explicit use of glosses; the model needs only text ground truth. This is in stark contrast to existing end-to-end models that use gloss-sequence ground truth, either as a modality recognized at an intermediate model stage or as a parallel output process trained jointly with the SLT model. Our approach is a Transformer network with a novel type of layer that combines: (i) local winner-takes-all (LWTA) layers with stochastic winner sampling, instead of conventional ReLU layers; (ii) stochastic weights with posterior distributions estimated via variational inference; and (iii) a weight-compression technique at inference time that exploits the estimated posterior variance to perform massive, almost lossless compression. We demonstrate that our approach can reach the currently best reported BLEU-4 score on the PHOENIX 2014T benchmark without using glosses for model training, and with a memory footprint reduced by more than 70%. | en_US |
dc.language.iso | en | en_US |
dc.relation | aRTIFICIAL iNTELLIGENCE for the Deaf (aiD) | en_US |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Memory management | en_US |
dc.subject | Stochastic processes | en_US |
dc.subject | Gesture recognition | en_US |
dc.subject | Benchmark testing | en_US |
dc.subject | Assistive technologies | en_US |
dc.subject | Machine learning architectures and formulations | en_US |
dc.subject | Representation learning | en_US |
dc.subject | Vision + language | en_US |
dc.title | Stochastic Transformer Networks With Linear Competing Units: Application To End-to-End SL Translation | en_US |
dc.type | Conference Papers | en_US |
dc.collaboration | Cyprus University of Technology | en_US |
dc.collaboration | University of Patras | en_US |
dc.collaboration | Rutgers University | en_US |
dc.subject.category | Other Engineering and Technologies | en_US |
dc.journals | Open Access | en_US |
dc.country | Cyprus | en_US |
dc.country | Greece | en_US |
dc.country | United States | en_US |
dc.subject.field | Engineering and Technology | en_US |
dc.publication | Peer Reviewed | en_US |
dc.relation.conference | IEEE/CVF International Conference on Computer Vision (ICCV) | en_US |
dc.identifier.doi | 10.1109/ICCV48922.2021.01173 | en_US |
cut.common.academicyear | 2021-2022 | en_US |
dc.identifier.spage | 11946 | en_US |
dc.identifier.epage | 11955 | en_US |
item.openairetype | conferenceObject | - |
item.grantfulltext | open | - |
item.cerifentitytype | Publications | - |
item.openairecristype | http://purl.org/coar/resource_type/c_c94f | - |
item.languageiso639-1 | en | - |
item.fulltext | With Fulltext | - |
crisitem.project.funder | EC Joint Research Centre | - |
crisitem.project.fundingProgram | H2020 | - |
crisitem.project.openAire | info:eu-repo/grantAgreement/EC/H2020/872139 | - |
crisitem.author.dept | Department of Electrical Engineering, Computer Engineering and Informatics | - |
crisitem.author.faculty | Faculty of Engineering and Technology | - |
crisitem.author.orcid | 0000-0002-4956-4013 | - |
crisitem.author.parentorg | Faculty of Engineering and Technology | - |
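The abstract above mentions local winner-takes-all (LWTA) layers with stochastic winner sampling in place of ReLU activations. As a rough illustration of that idea (not the authors' implementation; the function name, group size, and sampling details here are assumptions), units can be split into small groups, with a single "winner" per group sampled from a softmax over the group's activations and all other units zeroed:

```python
import numpy as np

def stochastic_lwta(x, group_size=2, rng=None):
    """Illustrative stochastic local winner-takes-all activation.

    Splits the last dimension into non-overlapping groups of `group_size`
    units, samples one winner per group from a softmax over the group's
    activations, and zeroes out the losing units.
    """
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x, dtype=float)
    assert x.shape[-1] % group_size == 0, "last dim must divide evenly"
    groups = x.reshape(*x.shape[:-1], -1, group_size)
    # Numerically stable softmax over each group: winner probabilities.
    logits = groups - groups.max(axis=-1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    # Sample one winner index per group.
    flat = probs.reshape(-1, group_size)
    winners = np.array([rng.choice(group_size, p=p) for p in flat])
    # Build a binary mask keeping only the sampled winners.
    mask = np.zeros_like(flat)
    mask[np.arange(len(flat)), winners] = 1.0
    out = groups * mask.reshape(groups.shape)
    return out.reshape(x.shape)

y = stochastic_lwta(np.array([1.0, -2.0, 0.5, 0.5]), group_size=2)
```

At most one unit per group survives, so the layer output is sparse; sampling (rather than a hard argmax) is what makes the activation stochastic. The paper's actual layers operate inside a Transformer with variationally trained stochastic weights, which this sketch omits.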
Appears in Collections: | Conference papers, posters, or presentations |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Stochastic Transformer Networks.pdf | | 690.84 kB | Adobe PDF |