Please use this identifier to cite or link to this item: http://ktisis.cut.ac.cy/handle/10488/6744
Title: Efficient algorithms for computing the best subset regression models for large-scale problems
Authors: Hofmann, Marc H. 
Gatu, Cristian 
Kontoghiorghes, Erricos John 
Keywords: Branch and bound algorithms; Decision trees; Regression analysis; Mathematical models
Issue Date: 2007
Publisher: Elsevier
Source: Computational Statistics and Data Analysis, 2007, Volume 52, Issue 1, Pages 16-29
Abstract: Several strategies for computing the best subset regression models are proposed. Some of the algorithms are modified versions of existing regression-tree methods, while others are new. The first algorithm selects the best subset models within a given size range. It uses a reduced search space and is found to computationally outperform the existing branch-and-bound algorithm. The properties and computational aspects of the proposed algorithm are discussed in detail. The second new algorithm preorders the variables inside the regression tree. A radius is defined in order to measure the distance of a node from the root of the tree. The algorithm applies the preordering to all nodes whose distance from the root is smaller than a radius given a priori. An efficient method of preordering the variables is employed. The experimental results indicate that the algorithm performs best when preordering is employed with a radius of between one quarter and one third of the number of variables. The algorithm has been applied with such a radius to tackle large-scale subset-selection problems that are considered to be computationally infeasible by conventional exhaustive-selection methods. A class of new heuristic strategies is also proposed. The most important of these is one that assigns a different tolerance value to each subset model size. This strategy, with different kinds of tolerances, is equivalent to all exhaustive and heuristic subset-selection strategies. In addition, the strategy can be used to investigate submodels having noncontiguous size ranges. Its implementation provides a flexible tool for tackling large-scale models.
URI: http://ktisis.cut.ac.cy/handle/10488/6744
ISSN: 0167-9473
DOI: http://dx.doi.org/10.1016/j.csda.2007.03.017
Rights: © 2007 Elsevier B.V. All rights reserved.
Type: Article
Appears in Collections: Άρθρα/Articles
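
Illustration: the abstract refers to a branch-and-bound search over a regression tree. The sketch below is not the authors' algorithm; it is a minimal, generic version of the idea, assuming a plain least-squares fit without intercept. It relies on the fact that dropping variables can never decrease the residual sum of squares (RSS), so a subtree can be skipped when the RSS of its parent is already no better than the best models found so far for every size the subtree can produce. All function names (`rss`, `best_subsets_exhaustive`, `best_subsets_bnb`) are hypothetical, and each model is refitted with `numpy.linalg.lstsq` for clarity rather than speed. Preordering and per-size tolerances, as described in the abstract, are not implemented.

```python
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit of y on X[:, cols]."""
    if not cols:
        return float(y @ y)  # empty model (no intercept): nothing is explained
    A = X[:, list(cols)]
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ beta
    return float(resid @ resid)


def best_subsets_exhaustive(X, y):
    """Baseline: fit every non-empty subset and keep the best one per size."""
    p = X.shape[1]
    return {
        m: min((rss(X, y, c), c) for c in combinations(range(p), m))
        for m in range(1, p + 1)
    }


def best_subsets_bnb(X, y):
    """Branch-and-bound over a tree whose nodes drop one variable at a time.

    A node is (S, k): S is the current variable tuple and its first k
    entries are fixed (never dropped further down). Returns the best
    (rss, subset) per size and the number of nodes visited.
    """
    p = X.shape[1]
    best_rss = [float("inf")] * (p + 1)  # indexed by subset size
    best_set = [None] * (p + 1)
    visited = 0

    def visit(S, k):
        nonlocal visited
        visited += 1
        m = len(S)
        r = rss(X, y, S)
        if r < best_rss[m]:
            best_rss[m], best_set[m] = r, S
        for i in range(k, m):
            # The child drops S[i]; its subtree only holds subsets of S with
            # sizes i..m-1, and each of them has RSS >= r. Skip the subtree
            # if r cannot beat the incumbent for any of those sizes.
            if r >= max(best_rss[i:m]):
                continue
            visit(S[:i] + S[i + 1:], i)

    visit(tuple(range(p)), 0)
    return {m: (best_rss[m], best_set[m]) for m in range(1, p + 1)}, visited
```

Calling `best_subsets_bnb(X, y)` on an `n x p` NumPy array `X` and a length-`n` vector `y` returns a dictionary keyed by subset size together with the number of tree nodes visited; comparing that count with the `2**p` nodes of the full tree gives a rough sense of how much the bound prunes on a given data set.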
