There is a large literature explaining why AdaBoost is a successful classifier. This literature focuses on classifier margins and on boosting's interpretation as the optimization of an exponential likelihood function. Others have pointed out, however, that these explanations are incomplete. Random forests are another popular ensemble method, for which there is substantially less explanation in the literature. We introduce a novel perspective on AdaBoost and random forests, proposing that the two algorithms work for essentially similar reasons. While both classifiers achieve similar predictive accuracy, random forests cannot be conceived as a direct optimization procedure. Rather, a random forest is a self-averaging, interpolating algorithm that fits the training data without error yet remains somewhat smooth. We show that AdaBoost has the same property, and we conjecture that both AdaBoost and random forests succeed because of this mechanism. We provide a number of examples and some theoretical justification to support this explanation. In the process, we question the conventional wisdom that boosting algorithms for classification require regularization or early stopping and should be limited to low-complexity classes of learners, such as decision stumps. We conclude that boosting should be used like random forests: with large decision trees and without direct regularization or early stopping.
Bio: Abraham (Adi) Wyner is Professor and Chair of the Undergraduate Program in Statistics at the University of Pennsylvania's Wharton School. Before arriving at the University of Pennsylvania in 1999, he was Assistant Professor of Statistics at the University of California, Berkeley. His research is in machine learning, discrete time series, information theory, and the application of statistics to environmental sciences, neuroscience, and sports.
Detailed information can be found online on the Frankel Center website.
Please be sure to spread the word to anyone you think may be interested.
Secretaries: please distribute to faculty, graduate and PhD students, and anyone else who may find this event interesting.
Looking forward to seeing you on 12/12 at 12:00 in the Harry and Carol Saal Auditorium, Alon Building for Hi-Tech (37/202).
We apologize for possible duplicates of this message.
The Frankel Center for Computer Science