Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14279/31465
Title: Information-Theoretic Local Competition and Stochasticity for Deep Learning
Authors: Antoniadis, Anastasios
Keywords: Deep Learning
Advisor: Chatzis, Sotirios P.
Issue Date: Mar-2022
Department: Department of Electrical Engineering, Computer Engineering and Informatics
Faculty: Faculty of Engineering and Technology

Abstract:
Deep Learning has brought about a revolution in the capabilities of Artificial Intelligence systems in the last decade. The foundational operating principle that differentiates Deep Learning algorithms from previous approaches is the focus on learning strong representations of the modeled data. The networks learn to extract these representations in a way that retains the salient information, while discarding anything that is not useful for making correct inferences. Unfortunately, deep networks are vulnerable to adversarial attacks: algorithmically crafted counter-examples, which average humans discern with ease, completely fool state-of-the-art deep networks into misclassification. This implies that the representations deep networks learn are actually brittle and contain information that is not salient enough to allow for strong generalization capacity under adverse conditions. Recently, deep networks with stochastic local winner-takes-all (LWTA) units have been proposed as a potent means of learning to extract data representations with an order of magnitude better generalization capacity in hard adversarial attack settings. This work builds upon that success, aiming to address the long-established problem of learning diversified representations. For this purpose, we combine information-theoretic arguments with stochastic competition-based activations, namely stochastic LWTA units.

Under this framework, we move away from the conventional deep architectures over-used in Representation Learning, which rely on non-linear activations, and replace these activations with sets of stochastically and locally competing linear units. In this setting, each network layer yields sparse outputs, determined by the outcome of the competition between units organized into blocks of competitors. We adopt stochastic arguments for the competition mechanism, performing posterior sampling to determine the winner of each block. We further endow the considered networks with the ability to infer the sub-part of the network that is essential for modeling the data at hand; we impose appropriate stick-breaking priors to this end. To enrich the information content of the emerging representations, we resort to information theory, and more specifically to the Information Competing Process (ICP). We then tie all the components together under the stochastic Variational Bayes framework for inference. We perform a detailed experimental investigation of our approach using benchmark image classification datasets. As our experiments show, the resulting networks yield significant discriminative representation learning abilities. Additionally, the introduced paradigm allows for a principled mechanism for investigating the emerging intermediate network representations.

URI: https://hdl.handle.net/20.500.14279/31465
Rights: CC0 1.0 Universal
Type: PhD Thesis
Affiliation: Cyprus University of Technology
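To make the competition mechanism in the abstract concrete, the following is a minimal NumPy sketch of a stochastic LWTA activation: linear units are grouped into blocks, a categorical posterior over each block's competitors is formed by a softmax of their outputs, a winner is sampled per block (here via the Gumbel-max trick), and only the winner's value passes through, yielding sparse layer outputs. This is an illustrative reconstruction under stated assumptions, not the thesis's actual implementation; the function name and block size are hypothetical.

```python
import numpy as np

def stochastic_lwta(x, block_size=2, rng=None):
    """Illustrative stochastic local winner-takes-all activation.

    Units are grouped into blocks of `block_size` competitors. Within each
    block, one winner is sampled from a softmax (categorical) distribution
    over the linear unit outputs; only the winner's value passes through,
    so the layer output is sparse.
    """
    rng = rng or np.random.default_rng()
    batch, n_units = x.shape
    assert n_units % block_size == 0, "units must divide evenly into blocks"
    blocks = x.reshape(batch, n_units // block_size, block_size)
    # Competition probabilities within each block (numerically stable softmax).
    shifted = blocks - blocks.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    # Sample one winner per block via the Gumbel-max trick.
    gumbel = -np.log(-np.log(rng.uniform(size=probs.shape)))
    winners = np.argmax(np.log(probs) + gumbel, axis=-1)
    mask = np.eye(block_size)[winners]  # one-hot winner indicator per block
    return (blocks * mask).reshape(batch, n_units)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
y = stochastic_lwta(x, block_size=2, rng=rng)  # sparse: one survivor per block
```

Because sampling replaces the deterministic argmax of classic LWTA, weaker units occasionally win, which is the source of the stochasticity the thesis exploits for robustness.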
Appears in Collections: | Διδακτορικές Διατριβές/ PhD Theses |
Files in This Item:
File | Description | Size | Format
---|---|---|---
PHD (antoniadesa).pdf | Fulltext | 3.01 MB | Adobe PDF
This item is licensed under a Creative Commons License