The application of data mining using machine learning algorithms to investigate the impact of vehicle characteristics in predicting the risk of material damage in the field of third party insurance

Asghari Oskoei, M.R.; Khanizadeh, F.; Bahador, A.

doi:10.22056/ijir.2020.01.02

Document Type : Original Research Paper

Authors

¹ Faculty of Mathematical and Computer Sciences, Allameh Tabatabai University, Tehran, Iran

² Insurance Research Institute (in charge of the specialized desk for algorithm design and machine learning), Tehran, Iran

³ Insurance Research Institute (head of the specialized car insurance desk), Tehran, Iran

Abstract

Objective: Classifying the risk of policyholders based on observable characteristics can help insurance companies reduce losses, identify customers more accurately, and prevent adverse selection in the insurance market. The purpose of this article is to examine the financial losses incurred under third-party insurance and to predict the risk of policyholders in the event of an accident.
Methodology: Decision tree, support vector machine, Naive Bayes, and neural network algorithms were used to discover hidden patterns in the data and to classify third-party insurance policyholders. The unbalanced distribution of the data between the damaged and undamaged groups poses an important challenge for machine learning and data mining methods, and this article takes it into account.
Findings: The data set belongs to an insurance company and contains more than four hundred thousand samples recorded over five years. It includes four independent variables (car type, car group, license plate type, and car age) and one binary dependent variable, financial damage. According to the results, the decision tree model gives the best predictive performance (F1 = 0.72 ± 0.01).
Conclusion: In order of priority, the variables affecting the occurrence of damage are car type, license plate type, car age, and car group. The evaluation results show that more data on driver characteristics are needed to predict damage and identify high-risk customers more accurately.
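
The abstract reports model quality as an F1 score and notes that the damaged and undamaged groups are unbalanced. The following minimal sketch illustrates why F1 is a more informative measure than plain accuracy in that setting. The labels, the 10/90 class split, and the two sets of predictions are hypothetical toy values chosen for illustration; they are not the article's data or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels: 1 = policy with a damage claim, 0 = no claim.
y_true = [1] * 10 + [0] * 90

# Baseline that always predicts "no claim".
always_no_claim = [0] * 100

# Hypothetical classifier: finds 7 of the 10 claims, raises 5 false alarms.
some_model = [1] * 7 + [0] * 3 + [1] * 5 + [0] * 85

baseline_acc = accuracy(y_true, always_no_claim)  # 0.90
baseline_f1 = f1_score(y_true, always_no_claim)   # 0.0
model_acc = accuracy(y_true, some_model)          # 0.92
model_f1 = f1_score(y_true, some_model)           # 14/22, about 0.64

print(f"baseline: accuracy={baseline_acc:.2f}, F1={baseline_f1:.2f}")
print(f"model:    accuracy={model_acc:.2f}, F1={model_f1:.2f}")
```

In this toy case the trivial baseline reaches 90% accuracy without identifying a single claim, while its F1 score is zero. The two accuracy values differ by only 0.02, whereas the F1 scores separate the useless baseline from the classifier that actually detects claims.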

Iranian Journal of Insurance Research

Volume 9, Issue 1 - Serial Number 31
January 2020
Pages 15-37
