Theoretical analysis of diversity in an ensemble of automatic speech recognition systems

Audhkhasi, Kartik; Zavou, Andreas M.; Georgiou, Panayiotis G.; Narayanan, Shrikanth S.

doi:10.1109/TASLP.2014.2303295

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/9622

DC Field	Value	Language
dc.contributor.author	Audhkhasi, Kartik	-
dc.contributor.author	Zavou, Andreas M.	-
dc.contributor.author	Georgiou, Panayiotis G.	-
dc.contributor.author	Narayanan, Shrikanth S.	-
dc.contributor.other	Ζαβού, Αντρέας	-
dc.date.accessioned	2017-02-13T11:25:51Z	-
dc.date.available	2017-02-13T11:25:51Z	-
dc.date.issued	2014-03-01	-
dc.identifier.citation	IEEE Transactions on Audio, Speech and Language Processing, 2014, vol. 22, no. 3, pp. 711-726	en_US
dc.identifier.issn	15587924	-
dc.identifier.uri	https://hdl.handle.net/20.500.14279/9622	-
dc.description.abstract	Diversity or complementarity of automatic speech recognition (ASR) systems is crucial for achieving a reduction in word error rate (WER) upon fusion using the ROVER algorithm. We present a theoretical proof explaining this often-observed link between ASR system diversity and ROVER performance. This is in contrast to many previous works that have only presented empirical evidence for this link or have focused on designing diverse ASR systems using intuitive algorithmic modifications. We prove that the WER of the ROVER output approximately decomposes into a difference of the average WER of the individual ASR systems and the average WER of the ASR systems with respect to the ROVER output. We refer to the latter quantity as the diversity of the ASR system ensemble because it measures the spread of the ASR hypotheses about the ROVER hypothesis. This result explains the trade-off between the WER of the individual systems and the diversity of the ensemble. We support this result through ROVER experiments using multiple ASR systems trained on standard data sets with the Kaldi toolkit. We use the proposed theorem to explain the lower WERs obtained by ASR confidence-weighted ROVER as compared to word frequency-based ROVER. We also quantify the reduction in ROVER WER with increasing diversity of the N-best list. We finally present a simple discriminative framework for jointly training multiple diverse acoustic models (AMs) based on the proposed theorem. Our framework generalizes and provides a theoretical basis for some recent intuitive modifications to well-known discriminative training criteria for training diverse AMs.	en_US
dc.format	pdf	en_US
dc.language.iso	en	en_US
dc.relation.ispartof	IEEE Transactions on Audio, Speech and Language Processing	en_US
dc.rights	© IEEE	en_US
dc.subject	Ambiguity decomposition	en_US
dc.subject	Automatic speech recognition	en_US
dc.subject	Discriminative training	en_US
dc.subject	Diversity	en_US
dc.subject	Ensemble methods	en_US
dc.subject	ROVER	en_US
dc.subject	System combination	en_US
dc.title	Theoretical analysis of diversity in an ensemble of automatic speech recognition systems	en_US
dc.type	Article	en_US
dc.collaboration	University of Southern California	en_US
dc.collaboration	Cyprus University of Technology	en_US
dc.subject.category	Electrical Engineering - Electronic Engineering - Information Engineering	en_US
dc.journals	Subscription	en_US
dc.country	United States	en_US
dc.country	Cyprus	en_US
dc.subject.field	Engineering and Technology	en_US
dc.publication	Peer Reviewed	en_US
dc.identifier.doi	10.1109/TASLP.2014.2303295	en_US
dc.relation.issue	3	en_US
dc.relation.volume	22	en_US
cut.common.academicyear	2013-2014	en_US
dc.identifier.spage	711	en_US
dc.identifier.epage	726	en_US
item.grantfulltext	none	-
item.openairecristype	http://purl.org/coar/resource_type/c_6501	-
item.fulltext	No Fulltext	-
item.languageiso639-1	en	-
item.cerifentitytype	Publications	-
item.openairetype	article	-
crisitem.journal.journalissn	2329-9304	-
crisitem.journal.publisher	IEEE	-
Appears in Collections:	Άρθρα/Articles
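
The decomposition stated in the abstract (ROVER WER is approximately the average WER of the individual systems minus the ensemble diversity) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the sentences are invented, `majority_vote` is a position-wise stand-in for ROVER that only works on equal-length, already aligned hypotheses, and diversity is normalized by the length of the combined hypothesis.

```python
from collections import Counter


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate of hyp measured against ref."""
    return edit_distance(ref, hyp) / len(ref)


def majority_vote(hypotheses):
    """Hypothetical stand-in for ROVER: per-position majority vote.

    Assumes all hypotheses have the same length and are already aligned.
    """
    return [Counter(words).most_common(1)[0][0] for words in zip(*hypotheses)]


def decomposition_terms(reference, combined, hypotheses):
    """Return (combined WER, average individual WER, diversity)."""
    n = len(hypotheses)
    avg_wer = sum(wer(reference, h) for h in hypotheses) / n
    # Diversity: average WER of each system measured against the combined output.
    diversity = sum(wer(combined, h) for h in hypotheses) / n
    return wer(reference, combined), avg_wer, diversity


# Invented toy data: three systems, each wrong in a different position.
reference = "the cat sat on the mat".split()
hypotheses = [
    "the cat sat on a mat".split(),
    "the bat sat on the mat".split(),
    "the cat sat in the mat".split(),
]
combined = majority_vote(hypotheses)
rover_wer, avg_wer, diversity = decomposition_terms(reference, combined, hypotheses)
```

In this toy case the vote recovers the reference, so the combined WER is 0 while the average individual WER and the diversity are both 1/6, and their difference matches the combined WER exactly. With real ROVER output the abstract describes the relation only as approximate, so this sketch should not be read as a validation of the theorem.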

SCOPUS citations: 14 (checked on Nov 9, 2023)
Web of Science citations: 12 (last week: 0; last month: 0; checked on Oct 29, 2023)
Page view(s) 50: 377 (last week: 0; last month: 4; checked on Dec 22, 2024)