Prediction of The Likelihood of Policy Lapsation Using Machine Learning Models: A Case Study of a Life Insurance Company Operating in Kenya
DOI: https://doi.org/10.53819/81018102t2535
Abstract
Policy lapsation, defined as the cessation of premium payments by policyholders resulting in termination of coverage, poses significant challenges to insurance companies in terms of revenue loss and customer retention. Lapses affect the profitability and liquidity of insurance companies through acquisition costs and the loss of income from renewal premiums; hence they need to be controlled and managed carefully. Using a case study approach, this research explored the effectiveness of various machine learning algorithms in predicting policy lapsation from historical data and relevant policyholder attributes. Secondary data were obtained from a life insurance company operating in Kenya, covering 21,891 policyholders over the period 2018 to 2023. Five classification models (Logistic Regression, Artificial Neural Networks (ANN), Random Forest, XGBoost, and AdaBoost) were trained and evaluated using ROC-AUC, precision-recall AUC, sensitivity, specificity, and accuracy. The results show the strong predictive ability of the ensemble models Random Forest and XGBoost, and identify occupation type, sum assured, and payment method as critical predictors of lapsation. The best overall classifier was Random Forest, with an accuracy of 80.6%, a precision-recall AUC of 91.2%, and a ROC-AUC of 88.2%, along with balanced specificity (80.1%) and sensitivity (81.1%). XGBoost achieved a ROC-AUC of 87.5% and an accuracy of 80.3%. The findings underscore the efficacy of ensemble methods, particularly Random Forest, in predicting lapsation risk, offering insurers actionable insights to proactively manage customer retention. This study contributes to the body of knowledge on actuarial analytics by validating machine learning applications in lapse prediction, and it provides a framework for implementing data-driven decision-making in insurance risk management.
Keywords: Life Insurance, Policy Lapse, Machine Learning
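As an illustration of how the evaluation metrics named in the abstract are defined, the following minimal sketch computes accuracy, sensitivity, specificity, and ROC-AUC for a binary lapse classifier. The labels, scores, the 0.5 decision threshold, and all function names are hypothetical and chosen only for illustration; they are not the study's data or code. Precision-recall AUC is omitted for brevity.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = lapsed, 0 = in force."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def basic_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a randomly chosen lapsed policy
    receives a higher score than a randomly chosen in-force policy
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true lapse labels and predicted lapse probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

# Assumed threshold of 0.5 to turn probabilities into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = basic_metrics(y_true, y_pred)
metrics["roc_auc"] = roc_auc(y_true, scores)
print(metrics)
```

Sensitivity and specificity depend on the chosen threshold, whereas ROC-AUC summarises ranking quality across all thresholds, which is one reason both kinds of metric are typically reported together.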