DETERMINATION OF AN OPTIMAL PIPELINE FOR IMBALANCED CLASSIFICATION: PREDICTING POTENTIAL CUSTOMER COMPLAINTS TO A TEXTILE MANUFACTURER

Authors

  • Ssu-Han Chen, Department of Industrial Engineering and Management, Ming Chi University of Technology; Center for Artificial Intelligence & Data Science, Ming Chi University of Technology
  • Wei-Hsin Lin, Department of Industrial Engineering and Management, Ming Chi University of Technology; Manufacturing Department Specialist, Yong Chi Fa Corporation

DOI:

https://doi.org/10.23055/ijietap.2020.27.5.6757

Keywords:

textile manufacturer, customer complaint, class-imbalanced problem, classification, multiple response design of experiments

Abstract

There is an urgent need to reduce customer complaints because they damage reputations and incur losses. This study predicts the likelihood of a complaint about a new production order from the order's intrinsic features. Customer complaints, however, are relatively rare, which creates a serious class-imbalance problem when training a classifier. To overcome this problem, we use a pipeline consisting of four stages: upsampling, hyper-parameter generation, the classifier, and the evaluation metric. Because each stage can be implemented with different techniques, we use the design of experiments (DOE) concept to find a suitable combination automatically. A multi-response DOE is used to maximize balanced accuracy and minimize overfitting during training. The experimental results show that the balanced accuracy of the proposed method on the testing dataset was about 23.6% better than that of the base classifiers and about 7% better than that of the current state-of-the-art methods.
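To illustrate why balanced accuracy is used as the evaluation metric and what an upsampling stage does, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the abstract does not say which upsampling technique was selected, so random oversampling is assumed here, and the function names (balanced_accuracy, random_oversample) and the toy labels are hypothetical. Balanced accuracy is computed as the mean of per-class recall.

```python
import random
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def random_oversample(X, y, seed=0):
    """Assumed upsampling step: duplicate minority-class rows at random
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for c, n in counts.items():
        rows = [x for x, label in zip(X, y) if label == c]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(c)
    return X_out, y_out


# Hypothetical toy labels: 1 = complaint (rare), 0 = no complaint.
y_true = [0] * 8 + [1] * 2
y_all_zero = [0] * 10  # a classifier that never predicts a complaint

# Plain accuracy rewards ignoring the rare class; balanced accuracy does not.
plain_accuracy = sum(t == p for t, p in zip(y_true, y_all_zero)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_all_zero)

# After oversampling, both classes have the same number of training rows.
X = [[i] for i in range(10)]
X_res, y_res = random_oversample(X, y_true)
resampled_counts = Counter(y_res)
```

On these toy labels, a classifier that never predicts a complaint is right on 8 of 10 orders, yet its recall on the complaint class is zero, which balanced accuracy exposes. Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the test set.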

Published

2021-04-29

How to Cite

Chen, S.-H., & Lin, W.-H. (2021). DETERMINATION OF AN OPTIMAL PIPELINE FOR IMBALANCED CLASSIFICATION: PREDICTING POTENTIAL CUSTOMER COMPLAINTS TO A TEXTILE MANUFACTURER. International Journal of Industrial Engineering: Theory, Applications and Practice, 27(5). https://doi.org/10.23055/ijietap.2020.27.5.6757

Section

Special Issue on Data-driven Computational Intelligence in Industries Application