Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.14279/29880
Title: | Power-law mixtures of Bayesian forests for value added tax audit case selection |
Authors: | Kleanthous, Christos; Christophides, Theodoros; Chatzis, Sotirios P. |
Major Field of Science: | Engineering and Technology |
Field Category: | Mechanical Engineering |
Keywords: | Bayesian networks; Decision trees; Inference engines; Mixtures; Random forests; Audit selection; Non-parametric Bayesian mixture model |
Issue Date: | 15-Oct-2020 |
Source: | Proceedings of the 1st ACM International Conference on AI in Finance, 2020, pp. 1-8 |
Start page: | 1 |
End page: | 8 |
Conference: | 1st ACM International Conference on AI in Finance |
Abstract: | Tax authorities need to maximize the yield of the limited number of tax audits they can afford to perform each year. Thus, they need to predict the likelihood of a candidate audit resulting in a satisfactory yield; this predictive process is usually referred to as audit case selection. Random Forests (RFs) constitute a standard method for Value Added Tax (VAT) audit case selection. Despite their success, however, their predictive performance still falls short of the expectations of tax authorities, which need to detect cases of significant audit yield potential in a timely manner. This lackluster performance is mainly attributed to the fact that RFs cannot deal with data that exhibit non-stationarity, multiple modalities, or discontinuities. These are common characteristics of real-world datasets; thus, the inability of RFs to address them properly is a likely cause of their underperformance. This work addresses these issues by considering a generative non-parametric Bayesian model with power-law behavior, capable of generating distinct (Bayesian) RFs over the observation space of the modeled data. This way, our approach enables capturing an indefinite number of distinct classification patterns, while being able to effectively handle outliers. The latter advantage is of paramount importance for the effectiveness of the modeling procedure in cases where a few large parts of the observation space can be modeled by a few RF classifiers, yet a large number of small parts of the observation space require distinct RFs to be properly modeled (power-law nature). We provide an efficient algorithm for model inference, based on the variational Bayesian framework, and demonstrate its efficacy using real-world datasets. |
URI: | https://hdl.handle.net/20.500.14279/29880 |
ISBN: | 9781450375849 |
DOI: | 10.1145/3383455.3422515 |
Rights: | Copyright © Elsevier B.V |
Type: | Conference Papers |
Affiliation: | Cyprus University of Technology |
Appears in Collections: | Άρθρα/Articles |
This item is licensed under a Creative Commons License