Dissertations


Boosting with Regression Trees

Author: Xiao Xiao

Date: 8/29/2021

Executive Summary:
As one of the most widely used machine learning techniques, boosting takes many forms and has been applied to numerous classification and regression problems with competitive computational speed and strong performance. Based on the data used to construct each base learner, boosting algorithms can be divided into two categories: non-stochastic and stochastic. Non-stochastic boosting fits each base learner on the whole training set, while stochastic boosting fits each base learner on only a random subsample of the training set. Stochastic boosting outperforms its non-stochastic counterpart in many simulation and real-data scenarios, yet the underlying mechanism has remained unclear: numerical convergence and consistency are well established for non-stochastic boosting but not for the stochastic version. This research proves the numerical convergence and consistency of stochastic boosting with regression trees, providing a theoretical foundation that clarifies why it outperforms the non-stochastic version. The theory is further extended to boosting with other types of tree learners. Building on this theory, a new boosting algorithm using a k-nearest-neighbor base learner is proposed and exhibits better performance than stochastic boosting with regression trees.
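To make the distinction concrete, here is a minimal sketch of gradient boosting for squared loss with depth-1 regression trees (stumps) as base learners. It is an illustration only, not the dissertation's algorithm: the function names, the learning rate, and the `subsample` parameter (which, when below 1, turns the loop into the stochastic variant by fitting each stump on a random fraction of the training set) are all assumptions of this sketch.

```python
import random

def fit_stump(xs, ys):
    # Depth-1 regression tree on 1-D inputs: choose the split threshold
    # that minimizes total squared error; fall back to a constant mean
    # prediction if the inputs admit no split.
    mean = sum(ys) / len(ys)
    best = (float("inf"), None, mean, mean)
    order = sorted(set(xs))
    for a, b in zip(order, order[1:]):
        t = (a + b) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if err < best[0]:
            best = (err, t, ml, mr)
    return best[1:]  # (threshold, left_value, right_value)

def stump_predict(stump, x):
    t, ml, mr = stump
    return ml if t is None or x <= t else mr

def boost(xs, ys, rounds=200, lr=0.1, subsample=1.0, seed=0):
    # Gradient boosting for squared loss: each round fits a stump to the
    # current residuals.  With subsample=1.0 this is the non-stochastic
    # algorithm; subsample < 1 gives the stochastic variant, where each
    # stump sees only a random fraction of the training set.
    rng = random.Random(seed)
    n = len(xs)
    pred = [0.0] * n
    model = []
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, pred)]
        idx = (range(n) if subsample >= 1.0
               else rng.sample(range(n), max(2, int(subsample * n))))
        stump = fit_stump([xs[i] for i in idx], [resid[i] for i in idx])
        model.append(stump)
        pred = [p + lr * stump_predict(stump, x) for p, x in zip(pred, xs)]
    return model

def predict(model, x, lr=0.1):
    # The boosted prediction is the shrunken sum of all stump outputs.
    return sum(lr * stump_predict(s, x) for s in model)
```

For example, both variants can recover a simple step function: `boost(xs, ys, subsample=1.0)` and `boost(xs, ys, subsample=0.6)` differ only in whether each stump is trained on all of the data or on a 60% random subsample of it.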