Predicting Customer Churn in Retailing
2022 (English). In: International Conference on Machine Learning and Applications (ICMLA): 21st IEEE International Conference on Machine Learning and Applications (ICMLA), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 635-640. Conference paper, Published paper (Refereed)
Abstract [en]
Customer churn is one of the most challenging problems for digital retailers. With significantly higher costs for acquiring new customers than retaining existing ones, knowledge about which customers are likely to churn becomes essential. This paper reports a case study where a data-driven approach to churn prediction is used for predicting churners and gaining insights about the problem domain. The real-world data set used contains approximately 200 000 customers, describing each customer using more than 50 features. In the pre-processing, exploration, modeling and analysis, attributes related to recency, frequency, and monetary concepts are identified and utilized. In addition, correlations and feature importance are used to discover and understand churn indicators. One important finding is that the churn rate highly depends on the number of previous purchases. In the segment consisting of customers with only one previous purchase, more than 75% will churn, i.e., not make another purchase in the coming year. For customers with at least four previous purchases, the corresponding churn rate is around 25%. Further analysis shows that churning customers in general, and as expected, make smaller purchases and visit the online store less often. In the experimentation, three modeling techniques are evaluated, and the results show that, in particular, Gradient Boosting models can predict churners with relatively high accuracy while obtaining a good balance between precision and recall.
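As an illustration of the recency, frequency, and monetary (RFM) concepts and the churn definition used in the abstract (no further purchase in the coming year), the following is a minimal sketch and not the authors' implementation. The transaction layout `(customer_id, purchase_date, amount)`, the cutoff date, the 365-day horizon, and the function names `rfm_and_churn` and `churn_rate_by_frequency` are all assumptions made for this example; the record does not describe how the paper computes its features.

```python
from collections import defaultdict
from datetime import date, timedelta


def rfm_and_churn(transactions, cutoff, horizon_days=365):
    """Build RFM features and a churn label per customer.

    transactions: iterable of (customer_id, purchase_date, amount) tuples
    (a hypothetical layout, not taken from the paper).
    A customer is labeled as churned if they have no purchase in the
    horizon_days following the cutoff date.
    """
    horizon_end = cutoff + timedelta(days=horizon_days)
    history = defaultdict(list)  # purchases on or before the cutoff
    returned = set()             # customers who bought again within the horizon
    for customer_id, day, amount in transactions:
        if day <= cutoff:
            history[customer_id].append((day, amount))
        elif day <= horizon_end:
            returned.add(customer_id)

    rows = {}
    for customer_id, purchases in history.items():
        last_day = max(day for day, _ in purchases)
        rows[customer_id] = {
            "recency_days": (cutoff - last_day).days,
            "frequency": len(purchases),
            "monetary": sum(amount for _, amount in purchases),
            "churned": customer_id not in returned,
        }
    return rows


def churn_rate_by_frequency(rows, cap=4):
    """Churn rate per number of previous purchases.

    Customers with cap or more purchases share one bucket, mirroring the
    "at least four previous purchases" segment mentioned in the abstract.
    """
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows.values():
        bucket = min(row["frequency"], cap)
        totals[bucket] += 1
        churned[bucket] += row["churned"]  # True counts as 1
    return {bucket: churned[bucket] / totals[bucket] for bucket in sorted(totals)}
```

Calling `churn_rate_by_frequency(rfm_and_churn(transactions, cutoff))` on a real transaction log would give the kind of per-segment churn rates the abstract reports. The feature rows could also serve as input to a classifier such as gradient boosting, although the paper's actual data set uses more than 50 features.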
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022. p. 635-640
Keywords [en]
Sales, Case-studies, Churn rates, Correlation, Customer churn prediction, Customer churns, Digital retailing, Feature importance, High costs, RFM analysis, Top probability, Forecasting, correlations, top probabilities
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hj:diva-60233
DOI: 10.1109/ICMLA55696.2022.00105
Scopus ID: 2-s2.0-85152214345
ISBN: 978-1-6654-6283-9 (electronic)
OAI: oai:DiVA.org:hj-60233
DiVA, id: diva2:1752646
Conference
Proceedings - 21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022, Nassau, 12 December 2022 through 14 December 2022
Funder
Knowledge Foundation, 20160035, 20170215
2023-04-24, 2023-04-24, 2023-10-04. Bibliographically approved