Natural Language Processing with Deep Neural Networks
Date Issued
December 2020
Author(s)
Advisor
Abstract
Since the earliest days of the computer era, machines have given us the
ability to collect and store vast amounts of information. It soon became obvious
that harvesting that information was an entirely different task, far more
complicated and demanding. Many solutions were developed to let humans
communicate with computers for the purpose of data mining. For decades,
databases, query languages and search engines were the most prevalent
solutions. At first, performing advanced tasks required special skills, such as
knowledge of a query language or of a search engine's syntax. To be adopted by
users, search has to be easy and human-friendly. Nowadays, search engines
accept simple language syntax and have made significant progress towards
natural language interaction.
Still, search engines fall short at combining information from different parts
into a single synthesized answer. Computers are essentially computational
machines; they are therefore excellent at manipulating syntax and calculating
word frequencies, but weak at recognizing the concepts behind the words. A
traditional search engine is not able to draw conclusions or grasp the context
of a dialogue. Machine learning has proven strong in dealing with such concepts.
One of the most challenging fields of machine learning is Natural Language
Processing (NLP), especially its component Natural Language Understanding
(NLU). The pinnacle of NLP is the pair of question-answering and summarisation
tasks, in the sense that strong cognitive ability is required to extract the
conceptual context. Supervised learning of deep neural networks is currently the
best available tool for these tasks. Despite the rapid advances in the field of
machine learning, the performance of such networks remains poor on hard NLU
and NLP tasks, such as abstractive summarisation and question answering.
This dissertation aims to offer substantive and measurable progress in both
of these areas by ameliorating a key problem of modern machine learning
techniques: the need for large, dense data corpora for effective model training.
This need is especially hard to meet in the context of such applications. To
this end, we leverage arguments from the field of Bayesian inference. This
allows us to deal better with modelling uncertainty, which is a direct outcome
of data sparsity and results in poor modelling and generalisation performance.
Our approaches are founded upon solid and elaborate statistical inference
arguments, and are evaluated on challenging, popular benchmarks. As we show,
they offer tangible performance advantages over the state of the art.
File(s)
Name
Kyriakos_Tolias_final.pdf
Size
4.32 MB
Format
Adobe PDF
Checksum (MD5)
d5f9303ac1598caa1937a5ae6bb107dc

