Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/31465
Title: Information-Theoretic Local Competition and Stochasticity for Deep Learning
Authors: Antoniadis, Anastasios 
Keywords: Deep Learning
Advisor: Chatzis, Sotirios P.
Issue Date: Mar-2022
Department: Department of Electrical Engineering, Computer Engineering and Informatics
Faculty: Faculty of Engineering and Technology
Abstract: Deep Learning has brought about a revolution in the capabilities of Artificial Intelligence systems in the last decade. The foundational operating principle that differentiates Deep Learning algorithms from previous approaches is the focus on learning strong representations of the modeled data. The networks learn to extract these representations in a way that retains the salient information while removing anything that is not useful for making correct inferences. Unfortunately, deep networks are vulnerable to adversarial attacks. That is, algorithmically crafted counter-examples, which are very easy for average humans to discern, completely fool state-of-the-art deep networks into misclassification. This problem implies that the representations deep networks learn are actually brittle and contain information that is not salient enough to allow for strong generalization capacity under adverse conditions. Recently, deep networks with stochastic local winner-takes-all (LWTA) units have been proposed as a potent means of learning to extract data representations with one order of magnitude better generalization capacity in hard adversarial attack settings. This work builds upon this success, aiming to address the long-established problem of learning diversified representations. For this purpose, we combine information-theoretic arguments with stochastic competition-based activations, that is, Stochastic LWTA units. Under this framework, we leave behind the conventional deep architectures that are over-used in Representation Learning and are based on non-linear activations, and we replace these activations with sets of stochastically and locally competing linear units. In this setting, each network layer yields sparse outputs, determined by the outcome of the competition between units that are organized into blocks of competitors.
We adopt stochastic arguments for the competition mechanism, which perform posterior sampling to determine the winner of each block. We further endow the considered networks with the ability to infer the sub-part of the network that is essential for modeling the data at hand; we impose appropriate stick-breaking priors to this end. To enrich the information content of the emerging representations, we resort to information theory, and more specifically to the Information Competing Process (ICP). We then tie all the components together under the stochastic Variational Bayes framework for inference. We perform a detailed experimental investigation of our approach using benchmark image classification datasets. As we show in our experiments, the resulting networks yield significant discriminative representation learning abilities. Additionally, the introduced paradigm allows for a principled investigation mechanism of the emerging intermediate network representations.
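The block-wise competition described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the thesis: the function name `stochastic_lwta`, the use of a softmax over the linear activations as the winner distribution, the `temperature` parameter, and the Gumbel-max sampling step are all assumptions made for illustration. The stick-breaking priors, the ICP objective, and the Variational Bayes training procedure are left out.

```python
import numpy as np


def stochastic_lwta(x, W, num_blocks, block_size, rng, temperature=1.0):
    """Illustrative stochastic LWTA layer (hypothetical, not the thesis code).

    x: inputs of shape (batch, in_dim)
    W: weights of shape (in_dim, num_blocks * block_size)
    Returns the sparse layer output and the sampled winner index per block.
    """
    batch = x.shape[0]

    # Linear units only: no element-wise non-linearity is applied.
    h = (x @ W).reshape(batch, num_blocks, block_size)

    # Assumption: winner probabilities within a block are a softmax over the
    # block's linear activations. Adding Gumbel noise to the logits and taking
    # the argmax draws one sample from that categorical distribution.
    logits = h / temperature
    gumbel = rng.gumbel(size=logits.shape)
    winners = np.argmax(logits + gumbel, axis=-1)  # (batch, num_blocks)

    # One-hot mask: the winner passes its value on, the losers output zero.
    mask = np.eye(block_size)[winners]  # (batch, num_blocks, block_size)
    out = (h * mask).reshape(batch, num_blocks * block_size)
    return out, winners


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 6))
out, winners = stochastic_lwta(x, W, num_blocks=3, block_size=2, rng=rng)
```

Each row of `out` has one active unit per block, which is the source of the sparse layer outputs mentioned above. Repeated calls with the same input may select different winners because the competition is sampled rather than deterministic.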
URI: https://hdl.handle.net/20.500.14279/31465
Rights: CC0 1.0 Universal
Type: PhD Thesis
Affiliation: Cyprus University of Technology 
Appears in Collections: Διδακτορικές Διατριβές / PhD Theses

Files in This Item:
File: PHD (antoniadesa).pdf
Description: Fulltext
Size: 3.01 MB
Format: Adobe PDF
