t-Exponential Memory Networks for Question-Answering Machines

Tolias, Kyriakos; Chatzis, Sotirios P.

doi:10.1109/TNNLS.2018.2884540

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/13499

Title:	t-Exponential Memory Networks for Question-Answering Machines
Authors:	Tolias, Kyriakos; Chatzis, Sotirios P.
Other contributor:	Χατζής, Σωτήριος Π.
Major Field of Science:	Engineering and Technology
Field Category:	Computer and Information Sciences
Keywords:	Bayes methods; Computational modeling; Data models; Hidden Markov models; Language modeling; Memory networks (MEM-NNs); t-exponential family; Task analysis; Training; Uncertainty; Variational inference
Issue Date:	25-Dec-2018
Source:	IEEE Transactions on Neural Networks and Learning Systems, 2019, vol. 30, no. 8, pp. 2463-2467
Volume:	30
Issue:	8
Start page:	2463
End page:	2467
Journal:	IEEE Transactions on Neural Networks and Learning Systems
Abstract:	Recent advances in deep learning have brought to the fore models that can make multiple computational steps in the service of completing a task; these are capable of describing long-term dependencies in sequential data. Novel recurrent attention models over possibly large external memory modules constitute the core mechanisms that enable these capabilities. Our work addresses learning subtler and more complex underlying temporal dynamics in language modeling tasks that deal with sparse sequential data. To this end, we improve upon these recent advances by adopting concepts from the field of Bayesian statistics, namely, variational inference. Our proposed approach consists in treating the network parameters as latent variables with a prior distribution imposed over them. Our statistical assumptions go beyond the standard practice of postulating Gaussian priors. Indeed, to allow for handling outliers, which are prevalent in long observed sequences of multivariate data, multivariate t-exponential distributions are imposed. On this basis, we proceed to infer corresponding posteriors; these can be used for inference and prediction at test time, in a way that accounts for the uncertainty in the available sparse training data. Specifically, to allow for our approach to best exploit the merits of the t-exponential family, our method considers a new t-divergence measure, which generalizes the concept of the Kullback-Leibler divergence. We perform an extensive experimental evaluation of our approach, using challenging language modeling benchmarks, and illustrate its superiority over existing state-of-the-art techniques.
ISSN:	2162-2388
DOI:	10.1109/TNNLS.2018.2884540
Rights:	© IEEE
Type:	Article
Affiliation:	Cyprus University of Technology
Publication Type:	Peer Reviewed
Appears in Collections:	Άρθρα/Articles

Scopus citations:	3 (checked on 14 Mar 2024)
Web of Science citations:	3 (last week: 0; last month: 0; checked on 1 Nov 2023)
Page views:	353 (last week: 0; last month: 0; checked on 23 Nov 2024)

All items on this website are protected by copyright.