Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.14279/29880
Title: Power-law mixtures of Bayesian forests for value added tax audit case selection
Authors: Kleanthous, Christos 
Christophides, Theodoros 
Chatzis, Sotirios P. 
Major Field of Science: Engineering and Technology
Field Category: Mechanical Engineering
Keywords: Bayesian networks;Decision trees;Inference engines;Mixtures;Random forests;Audit selection;Non-parametric Bayesian mixture model
Issue Date: 15-Oct-2020
Source: Proceedings of 1st ACM International Conference on AI in Finance, 2020, pp. 1-8
Start page: 1
End page: 8
Conference: 1st ACM International Conference on AI in Finance 
Abstract: Tax authorities need to maximize the yield of the limited number of tax audits they can afford to perform each year. Thus, they need to predict the likelihood that a candidate audit will result in a satisfactory yield; this predictive process is usually referred to as audit case selection. Random Forests (RFs) constitute a standard method for Value Added Tax (VAT) audit case selection. Despite their success, however, their predictive performance is still below the expectations of tax authorities, which need to detect cases of significant audit yield potential in a timely manner. This lackluster performance is mainly attributed to the fact that RFs cannot deal with data that exhibit non-stationarity, multiple modalities, or discontinuities. These are common characteristics of real-world datasets; thus, the inability to properly address them is a prime suspect for undermining RF performance. This work addresses these issues by considering a generative non-parametric Bayesian model with power-law behavior, capable of generating distinct (Bayesian) RFs over the observation space of the modeled data. This way, our approach can capture an indefinite number of distinct classification patterns while effectively handling outliers. The latter advantage is of paramount importance for the effectiveness of the modeling procedure in cases where a few large parts of the observation space can be modeled by a few RF classifiers, yet a large number of small parts of the observation space require distinct RFs to be properly modeled (power-law nature). We provide an efficient algorithm for model inference, based on the variational Bayesian framework, and demonstrate its efficacy using real-world datasets.
URI: https://hdl.handle.net/20.500.14279/29880
ISBN: 9781450375849
DOI: 10.1145/3383455.3422515
Rights: Copyright © Elsevier B.V
Type: Conference Papers
Affiliation: Cyprus University of Technology
Appears in Collections: Άρθρα/Articles

This item is licensed under a Creative Commons License.