Gradient Boosting Machines (GBM) are among the go-to algorithms for tabular data, producing state-of-the-art results in many prediction tasks. Despite its popularity, the GBM framework suffers from a fundamental flaw in its base learners. Specifically, most implementations utilize decision trees that are typically biased towards categorical variables with large cardinalities. The effect of this bias has been extensively studied over the years, mostly in terms of predictive performance. In this work, we extend the scope and study the effect of biased base learners on GBM feature importance (FI) measures. We demonstrate that although these implementations achieve highly competitive predictive performance, they still, surprisingly, suffer fro...
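The cardinality bias described above can be illustrated with a minimal sketch (not taken from the paper): an integer-encoded high-cardinality noise feature offers many more candidate split points than a binary noise feature, so impurity-based importance scores tend to favor it even though neither feature carries any signal. The dataset and model configuration below are illustrative assumptions.

```python
# Illustrative sketch of cardinality bias in impurity-based feature
# importance; both features are pure noise with respect to the labels.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(0, 2, n),      # binary noise feature (cardinality 2)
    rng.integers(0, 100, n),    # high-cardinality noise feature (100 levels)
])
y = rng.integers(0, 2, n)       # labels independent of both features

model = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)
low_card_imp, high_card_imp = model.feature_importances_
# The high-cardinality feature typically receives a much larger score,
# despite being just as uninformative as the binary one.
print(low_card_imp, high_card_imp)
```

Permutation importance on held-out data is one common mitigation, since shuffling an uninformative feature changes validation loss very little regardless of its cardinality.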
Gradient boosting machines are a family of powerful machine-learning techniques that have shown cons...
No preprocessing was done. The feature selection method consists of two steps. For unbalanced datasets, all c...
Prediction metrics on held-out data for the best-performing gradient boosted decision trees model.
A feature selection algorithm should ideally satisfy four conditions: reliably extract relevant fea...
As dimensions of datasets in predictive modelling continue to grow, feature selection becomes increa...
This paper studies the effects of boosting in the context of different classification methods for te...
Gradient tree boosting is a prediction algorithm that sequentially produces a ...
This research presents Gradient Boosted Tree High Importance Path Snippets (gbt-HIPS), a novel, heur...
While $\ell_2$ regularization is widely used in training gradient boosted trees, popular individuali...
Tree boosting has empirically proven to be a highly effective approach to predictive modeling. It ha...
Gradient Boosting (GB) is a popular methodology used to solve prediction problems by minimizing a di...
Understanding how "black-box" models arrive at their predictions has sparked significant interest fr...
Breiman's bagging and Freund and Schapire's boosting are recent methods for improving the...
Boosting is a highly flexible and powerful approach when it comes to making predictions in non-param...
Artificial intelligence (AI) and machine learning (ML) have become vital to remain competitive for f...