A NOVEL SPLIT SELECTION OF A LOGISTIC REGRESSION TREE FOR THE CLASSIFICATION OF DATA WITH HETEROGENEOUS SUBGROUPS

DOI:

https://doi.org/10.23055/ijietap.2023.30.2.8743

Keywords:

model tree, logistic regression tree, subgroup identification, class separability

Abstract

A logistic regression tree (LRT) is a hybrid machine learning method that combines a decision tree model with logistic regression models. An LRT recursively partitions the input data space through splitting and learns a separate logistic regression model optimized for each subpopulation. Split selection is a critical procedure for improving the predictive performance of the LRT. In this paper, we present a novel separability-based split selection method for the construction of an LRT. The separability measure, defined on the feature space of logistic regression models, evaluates the performance of potential child models without fitting them, and the optimal split is selected based on this evaluation. Heterogeneous subgroups that have different class-separating patterns can be identified in the split process when they exist in the data. In addition, we compare the performance of our proposed method with benchmark algorithms through experiments on both synthetic and real-world datasets. The experimental results indicate the effectiveness and generality of our proposed method.
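The idea of scoring candidate splits by class separability, without fitting child models, can be illustrated with a small sketch. The abstract does not define the paper's separability measure, so the code below substitutes a Fisher-style ratio on a single feature as a stand-in. The function names, the `min_leaf` parameter, and the synthetic data (two subgroups with opposite class-separating patterns) are assumptions made for illustration only, not the authors' method.

```python
import random
from statistics import mean, pvariance


def fisher_ratio(xs, ys):
    """Stand-in separability: squared class-mean gap over summed class variances."""
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    if len(x0) < 2 or len(x1) < 2:
        return 0.0
    within = pvariance(x0) + pvariance(x1)
    return (mean(x1) - mean(x0)) ** 2 / within if within > 0 else 0.0


def split_score(rows, var, threshold, min_leaf=20):
    """Size-weighted separability of the two children; no model is fitted."""
    left = [r for r in rows if r[var] <= threshold]
    right = [r for r in rows if r[var] > threshold]
    if len(left) < min_leaf or len(right) < min_leaf:
        return 0.0  # reject splits that create very small children
    total = 0.0
    for child in (left, right):
        xs = [r["x"] for r in child]
        ys = [r["y"] for r in child]
        total += len(child) / len(rows) * fisher_ratio(xs, ys)
    return total


def best_split(rows, split_vars):
    """Try midpoints between sorted values of each split variable; keep the best."""
    candidates = []
    for var in split_vars:
        values = sorted({r[var] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            candidates.append((split_score(rows, var, t), var, t))
    return max(candidates)


def make_data(n=400, seed=0):
    """Hypothetical data: the sign of the x-y relationship flips at z = 0.5."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random()  # informative split variable
        w = rng.random()  # uninformative split variable
        y = rng.randint(0, 1)
        sign = 1.0 if z <= 0.5 else -1.0
        x = sign * (2 * y - 1) + rng.gauss(0, 0.5)
        rows.append({"z": z, "w": w, "x": x, "y": y})
    return rows


rows = make_data()
root_score = fisher_ratio([r["x"] for r in rows], [r["y"] for r in rows])
score, var, threshold = best_split(rows, ["z", "w"])
print(f"root separability: {root_score:.3f}")
print(f"best split: {var} <= {threshold:.2f} (score {score:.2f})")
```

In this toy setting the classes are poorly separated at the root because the two subgroups cancel each other out, while a split on the informative variable yields children in which a single logistic regression model per child would be expected to separate the classes well. A full LRT would apply such a selection step recursively and fit one logistic regression model in each leaf.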

Published

2023-04-18

How to Cite

Lee, S., & Jun, C.-H. (2023). A NOVEL SPLIT SELECTION OF A LOGISTIC REGRESSION TREE FOR THE CLASSIFICATION OF DATA WITH HETEROGENEOUS SUBGROUPS. International Journal of Industrial Engineering: Theory, Applications and Practice, 30(2). https://doi.org/10.23055/ijietap.2023.30.2.8743

Section

Data Sciences and Computational Intelligence