Strathmore 
UNIVERSITY 

Loss Distributions for Motor Insurance Claim Severity in Kenya 

Nyairo Frankline Okindo - 101428

Submitted in partial fulfillment of the requirements for the Degree of 

Bachelor of Business Science in Actuarial Science at Strathmore University 

Strathmore Institute of Mathematical Sciences 

Strathmore University 

Nairobi, Kenya 

February 2021 

This Research Project is available for Library use on the understanding that it is copyright material and that no quotation from the Research Project may be published without proper acknowledgement.



DECLARATION 

I declare that this work has not been previously submitted and approved for the award of a degree by this or any other University. To the best of my knowledge and belief, the Research Project contains no material previously published or written by another person except where due reference is made in the Research Project itself.

© No part of this Research Project may be reproduced without the permission of the author and Strathmore University.

Nyairo Frankline Okindo 

[Signature]

09-02-2021 [Date]

This Research Project has been submitted for examination with my approval as the 

Supervisor. 

[Signature]

[Date]
Strathmore University 



TABLE OF CONTENTS 

DECLARATION
TABLE OF CONTENTS
CHAPTER ONE: INTRODUCTION
1.1 Background Information
1.2 Problem Statement
1.3 Research Objectives and Questions
1.4 Significance of the Research
CHAPTER TWO: LITERATURE REVIEW
2.1 Introduction
2.2 Maximum Likelihood Estimation
2.3 Standard Continuous Distributions
2.4 Goodness-of-fit Test
2.5 Model Selection Criteria
CHAPTER THREE: METHODOLOGY
3.1 Research Design
3.2 Population and Sampling
3.3 Data Collection
3.4 Data Analysis
CHAPTER FOUR: DATA ANALYSIS
4.1 Introduction
4.2 Descriptive Statistics
4.3 Parameters Estimation
4.4 Goodness-of-Fit Test
4.5 Information Criteria
4.6 Summary
CHAPTER FIVE: CONCLUSION
5.1 Introduction
5.2 Conclusion
5.3 Recommendations
5.4 Limitations of the Study
5.5 Suggestions for Further Research
REFERENCES
APPENDICES
Appendix A: Fitted Distributions for Motor Commercial
Appendix B: Fitted Distributions for Motor Private



CHAPTER ONE: INTRODUCTION 

This chapter begins with the background information of the study in section 1.1 by explaining the key concepts, main developments, and conceptualization of the study. In section 1.2, the problem statement of the study is elaborated at length. Section 1.3 contains the research objectives and questions. This chapter concludes with the significance of the research in section 1.4.

1.1 Background Information

1.1.1 Key Concepts 

Claim: A legal application made by a policyholder to an insurer for indemnity covered 

under the policy agreement. 

Distribution: A function that shows the possible values for a variable and how often they 

occur. 

Loss: The basis of a claim for damages under the terms of an insurance policy.

Severity: The cost of a claim. 

1.1.2 Main Developments and Conceptualization of Study 

The insurance industry is one of the oldest industries. Insurance companies exist to provide indemnity and make profits since insurance is a business like any other. The advancement of the insurance market is compelled by the prevailing interest of the public for cover against different forms of risks of unacceptable arbitrary incidents with a considerable financial effect (Omari et al., 2018). A policyholder is supposed to pay a premium and make a claim when a certain event occurs within a given period as per the policy. The insurer is then obliged to settle the claim, and this is referred to as a loss. Insurers are keen on the results of the random outcome of claims rather than the existence of the claims. They are concerned with the loss rather than the circumstances that give rise to the loss (Achieng, 2010). The aggregate amount of claims in a given duration is a measure that is vital to the operations of an insurance company.


A general insurance actuary needs to understand different risk models comprising the aggregate claim amount due in a certain period. These models inform a company and allow it to decide on matters such as anticipated profits, premiums to be charged, the reserves required to guarantee profitability with a high likelihood, and the effect of reinsurance and policy excess (Boland, 2006).

Actuaries are tasked with the responsibility of developing cash flow models for insurance companies. These models are crucial since they are used to assist in the day-to-day work of insurers and to provide checks and controls on the business. An actuary may decide to develop a stochastic model and estimate one of the parameters, e.g. claim size, by assigning it a probability distribution. If the assigned probability distribution is not appropriate for the claim size parameter in the stochastic cash flow model, the resulting parameter error becomes a model error and leads to adverse future experience for the insurance company. An actuary will, therefore, be concerned with objectively assigning an appropriate probability distribution to the value of the claim size parameter.

In the general insurance business, there is heightened concern in motor insurance because it involves the control of many risk events. These include fire, theft, third-party bodily injury and accidental damage to the vehicle. In most countries, the motor insurance industry is growing rapidly due to legislation that makes motor insurance compulsory for all vehicles.

The insurance industry is driven by data, and insurers engage a lot of analysts to 

comprehend claims data (Boland, 2006). The claims data contains, among other things, the

frequency and size of claims that a company has received within a given period. Based on 

the claims data, mathematical methods can be applied to model individual claims. The 

mathematical models are known as loss distributions. 

Loss distributions are vital in the insurance industry since they are used for many purposes, which include: premium rating (deciding the premium rates to be paid by policyholders), reserving (determining the required amount of funds to be retained to offset the cost of claims), reviewing reinsurance arrangements and testing for solvency (evaluating the insurer's financial health). This explicitly highlights the importance of having a good estimate of an insurance company's loss distributions.


A loss distribution is the probability distribution associated with a claim-size variable. The claim size is a non-negative continuous random variable since the claim arising from a covered incident can be measured in the lowest unit of currency, e.g. cents. Loss distributions are usually positively skewed and long-tailed. To model the size of insurance claims, it is prudent to fit a continuous parametric claim-size distribution to a discrete sample of claim data. This involves employing a variety of parametric families of continuous distributions, which include the Gamma, Lognormal, Exponential and Pareto distributions, among others.

The gamma and lognormal distributions are among the common distributions that have been applied for modelling claim severity. The exponential, Pareto, Weibull, and Burr distributions are also used to model claim severity. This is primarily because all these distributions are positively skewed. Omari et al. (2018) suggested that a lognormal distribution is suitable to model claim severity based on a sample of the automobile portfolio datasets obtained from the insuranceData package in R. Achieng (2010) concluded that the lognormal distribution was a suitable model for the motor comprehensive policy claim severity of First Assurance Company Limited, Kenya. Nduwayezu (2016) found that the exponential distribution is suitable to model insurance data, though he left the research open for further studies to be carried out to determine the distributions that are most suitable for each class of insurance.

Most research papers have provided a good framework to use when modelling loss distributions for motor insurance claims severity. However, it can be noted that the statistical distributions suggested for modelling claims severity are general and not exhaustive since they are based on the sample data that was being used by the various researchers. In the Kenyan environment, there are no specific statistical distributions that have been recommended to be used for modelling motor insurance claim severity based on the motor insurance claims data in Kenya. This creates a need for research on suitable statistical distributions that can be used within the Kenyan environment as this will enhance the motor insurance industry.


1.2 Problem Statement 

One of the major challenges that general insurers face is to precisely estimate the likely prospective claims experience and therefore to charge suitable premiums and set aside sufficient reserves (Omari et al., 2018). For these companies to overcome the challenge of accurately forecasting future claims experience, they need to have a good estimate of loss distributions. A good estimate of a loss distribution entails selecting a suitable statistical distribution that fits the claims data.

Determining motor insurance claims distributions often comprises associating the value of claims with two elements: the occurrence of an accident and the claim amount in case of an accident (Frees & Valdez, 2008). Records from insurance databases show the following claim types: third-party liability claims, and damage claims to the policyholder, comprising property damage, injury, theft, and fire. This, therefore, implies that for every accident, it is probable for multiple types of claims to be incurred, thus increasing the claim severity of an insurance company for every single accident. This creates a need for good models of loss distributions that will enable an insurer to plan accordingly to lower the probability of incurring such a loss and to reduce the claim severity incurred.

Motor insurance is the biggest segment of the 14 distinct classes of the non-life insurance market in Kenya, accounting for over KES 46 billion in gross written premiums and representing close to 35.8% of the entire non-life insurance market in Kenya in 2018. It is not surprising that motor insurance is a huge business in Kenya given that more than 7000 cars are imported monthly. This can be credited to the fact that motor insurance is compulsory in Kenya, and thus, for every new vehicle purchased in the country, a motor insurance policy is added to the existing motor insurance policies for the vehicles on Kenyan roads. To this end, the huge role played by motor insurance in the Kenyan insurance industry and the economy at large cannot be overlooked. This implies that motor insurance companies need to have good models that will enable them to accurately forecast future claims experience and thus be able to set aside enough reserves.

According to the Insurance Outlook Report 2019/2020, East Africa by Deloitte, motor insurance in Kenya is one of the largest general insurance classes together with medical insurance. However, they are also among the top loss-making businesses. This could be partly attributed to the fact that motor insurance companies in Kenya do not have good models for loss distributions, and thus cannot correctly forecast future claims experience. This leads them to incur huge losses because of failing to plan accordingly to lower the probability of incurring such losses.

This research paper will, therefore, seek to address this gap in the Kenyan motor insurance industry by providing a good model of loss distribution for claim severity. This will, in turn, help insurers to precisely estimate prospective claims experience and thus plan accordingly to reduce their huge losses and the chances of them making such losses.

1.3 Research Objectives and Questions 

1.3.1 Research Objectives 

The objective of this research is to determine the appropriate statistical distribution that fits claim severity data of motor insurance in Kenya and can be used to accurately forecast future claims experience.

1.3.2 Research Questions 

Throughout this research, the paper will aim to answer the following questions: 

i. What is the most appropriate statistical distribution for claim severity?

ii. How well does this loss distribution fit the claims data?

1.4 Significance of the Research 

This research will provide motor insurance companies in Kenya with the most suitable loss distribution for claim severity. This will enable them to accurately forecast future claims experience and thus be able to correctly rate premiums, reserve, review reinsurance arrangements and test for solvency. Given that the risk associated with claim severity would have been minimised, motor insurance policyholders will consequently pay reduced premiums. The research will enhance understanding of the complexity of the extensive volume of claims that is usually concealed in a large amount of data.


CHAPTER TWO: LITERATURE REVIEW 

This chapter discusses the theoretical and empirical framework. It begins with a brief introduction of various studies that have been previously carried out on claims data modelling in section 2.1. In section 2.2, Maximum Likelihood Estimation is then presented as a method of estimating parameters. Subsequently, in section 2.3, various continuous distributions are discussed. This chapter closes by presenting the Kolmogorov-Smirnov and Anderson-Darling tests as goodness-of-fit tests in section 2.4, and the Akaike Information Criterion and Bayesian Information Criterion as model selection criteria in section 2.5.

2.1 Introduction 

A loss distribution is a mathematical method of modelling individual claims. It involves fitting statistical distributions to observed claims data and then testing for the goodness of fit. The fitted loss distributions can then be used to estimate probabilities. The main assumption in all the distributions in this study is that the amount of a claim and its occurrence can be considered independently. Thus, a claim arises according to some elementary model for incidents occurring in time, and the claim amount is then selected from a distribution representing the claim amount.
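The independence assumption can be illustrated with a short simulation. The sketch below is not part of the study: it assumes, purely for illustration, that claims arrive as a Poisson process and that claim amounts are lognormal. The function name and every parameter value are hypothetical.

```python
import random

def simulate_aggregate_claims(rate, mu, sigma, horizon, rng):
    """Simulate one period and return (number of claims, total claim amount).

    Illustrative assumptions, not taken from the study: claims arrive as a
    Poisson process with the given rate, and each claim amount is drawn
    independently from a lognormal(mu, sigma) distribution.
    """
    count, total = 0, 0.0
    t = rng.expovariate(rate)                    # time of the first claim
    while t <= horizon:
        total += rng.lognormvariate(mu, sigma)   # severity drawn independently of timing
        count += 1
        t += rng.expovariate(rate)               # waiting time to the next claim
    return count, total

rng = random.Random(2021)                        # fixed seed so the run is repeatable
count, total = simulate_aggregate_claims(rate=30.0, mu=10.0, sigma=1.2,
                                         horizon=1.0, rng=rng)
print(count, round(total, 2))
```

Because occurrence and amount are generated by separate draws, the severity distribution can be fitted on its own, which is the task this study addresses.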

Ignatov et al. (2001) provided a statistical process of fitting a suitable model to claims data. The process begins with the selection of a family of distributions for the claims model and then the estimation of the parameters for the model. A selection criterion to determine the appropriate distribution from the family of distributions should be specified. Finally, a goodness-of-fit test should be carried out on the selected distribution.
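The first three steps of this process can be sketched in a few lines. The example below is only an illustration under stated assumptions: the candidate family is limited to the exponential and lognormal distributions, the selection criterion is taken to be the Akaike Information Criterion in its usual form AIC = 2k - 2 log L, and the claim amounts are made-up values, not data from the study. The goodness-of-fit step is left out here.

```python
import math

def fit_exponential(claims):
    """MLE for Exp(lambda): lambda_hat = 1 / sample mean."""
    lam = len(claims) / sum(claims)
    loglik = sum(math.log(lam) - lam * x for x in claims)
    return {"params": (lam,), "loglik": loglik, "k": 1}

def fit_lognormal(claims):
    """MLE for LN(mu, sigma^2): mean and (biased) variance of the log-claims."""
    logs = [math.log(x) for x in claims]
    n = len(logs)
    mu = sum(logs) / n
    sigma2 = sum((v - mu) ** 2 for v in logs) / n
    sigma = math.sqrt(sigma2)
    loglik = sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma2)
        for x in claims
    )
    return {"params": (mu, sigma), "loglik": loglik, "k": 2}

def aic(fit):
    """Akaike Information Criterion: 2k - 2 log L (smaller is preferred)."""
    return 2 * fit["k"] - 2 * fit["loglik"]

# Hypothetical claim amounts, for illustration only.
claims = [12_000, 18_500, 25_000, 31_000, 47_500, 60_000, 95_000, 240_000]

# Step 1: choose the family; step 2: estimate parameters; step 3: select by AIC.
fits = {"exponential": fit_exponential(claims), "lognormal": fit_lognormal(claims)}
best = min(fits, key=lambda name: aic(fits[name]))
print(best, {name: round(aic(f), 2) for name, f in fits.items()})
```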

Achieng (2010) modelled the claim severity motor comprehensive data of First Assurance Company Limited, Kenya (June 2006 - June 2007). The estimates for the parameters were obtained using the Maximum Likelihood Estimation method. The Akaike Information Criterion and Quantile-Quantile plots were further utilised to carry out a goodness-of-fit test. The finding of the study was that the lognormal distribution was a suitable model for the claims data.


Mazviona and Chiduza (2013) used four distributions (Gamma, Pareto, Exponential and Lognormal) to model the claims for a motor portfolio in Zimbabwe. Their study used Maximum Likelihood Estimation and Method of Moments Estimation to estimate parameters for the models. The Chi-Square and Kolmogorov-Smirnov tests were used as goodness-of-fit tests for the models. The finding of their study was that the Lognormal distribution is suitable for smaller claims while the Pareto distribution is appropriate for larger claims.

Packova and Brebera (2015) used Gamma, Weibull, Lognormal and Pareto distributions to model data obtained from a Czech insurance company for compulsory motor third-party liability insurance. The Maximum Likelihood Estimation method was used to estimate the parameters of the selected parametric distributions. They further used the Anderson-Darling, Chi-Square and Kolmogorov-Smirnov tests to determine whether the chosen distribution provides a good fit to the data. The finding of the study was that the Pareto distribution can be assumed to be a good model for the losses.

Omari et al. (2018) modelled a sample of the automobile portfolio datasets obtained from the insuranceData package in R, using the datasets AutoCollision, dataCar, and dataOhlsson. They used the Maximum Likelihood Estimation method to obtain parameter estimates for the fitted models. The Anderson-Darling and Kolmogorov-Smirnov tests were then used as goodness-of-fit tests for the claim severity models. The Akaike Information Criterion and Bayesian Information Criterion were further applied to choose between competing distributions. The finding of their study was that the lognormal distribution provides a good model for claims severity on a short-term basis. The study recommended that for a long-term basis, insurers should adjust the distributions accordingly based on insurer-specific claims experience.

To this end, this study seeks to apply the theoretical and empirical frameworks of past studies and extend them to the Kenyan economy. The aim is to fit a suitable loss distribution to the claim severity data for motor insurance companies using the company-specific data set.

In line with the research objectives of this study, it is prudent to explain the theoretical framework that will be applied throughout this paper. This will include parameter estimation, standard continuous distributions, goodness-of-fit tests, and model selection criteria. These will then form the basis of the subsequent parts of the study that will eventually yield an appropriate loss distribution model.

2.2 Maximum Likelihood Estimation 

The method of maximum likelihood is generally considered the best general method of finding estimators. Maximum likelihood estimators have excellent and normally simply determined asymptotic properties and so are especially good in the large-sample situation.

The likelihood function of a random variable, $X$, is the probability (or probability density function) of observing what was observed given a hypothetical value of the parameter, $\theta$. The maximum likelihood estimate (MLE) is the value of $\theta$ that gives the highest probability (or probability density), i.e. that maximises the likelihood function.
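As a concrete illustration, the sketch below maximises the likelihood for an exponential model, chosen here only because its MLE has a closed form (the reciprocal of the sample mean). The sample values are hypothetical, and the check simply confirms that nearby parameter values give a lower log-likelihood.

```python
import math

def exp_loglik(lam, data):
    """Log-likelihood of an Exp(lam) sample: n*log(lam) - lam*sum(x)."""
    return len(data) * math.log(lam) - lam * sum(data)

data = [2.0, 3.5, 1.2, 7.8, 4.1]       # hypothetical claim sizes
lam_hat = len(data) / sum(data)        # root of d/d(lam) of the log-likelihood

# The log-likelihood is lower on either side of the estimate.
for lam in (0.9 * lam_hat, lam_hat, 1.1 * lam_hat):
    print(round(lam, 4), round(exp_loglik(lam, data), 4))
```

For distributions without a closed-form solution, the same log-likelihood would be maximised numerically.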

2.3 Standard Continuous Distributions 

The claim size is a non-negative continuous random variable since the claim arising from a covered incident can be measured in the lowest unit of currency, e.g. cents. To model the size of insurance claims, it is prudent to fit a continuous parametric claim-size distribution to a discrete sample of claim data. Due to the infrequency of relatively large claims, which are of concern, Boland (2006) suggested the use of relatively fat-tailed distributions. Kaas et al. (2008) stated that claim size is best modelled using continuous distributions that are positively skewed and long-tailed. This is because very large claims occur in the upper-right tails of the distribution. The following parametric families of continuous distributions will be considered: Exponential, Gamma, Lognormal, Pareto, Weibull, and Burr. In this section, the distribution functions and probability density functions will be given.


2.3.1 Exponential Distribution 

The exponential distribution is one of the elementary models for claim severity. A random variable $X$ has the exponential distribution with parameter $\lambda > 0$ if it has distribution function:

$$F(x) = 1 - e^{-\lambda x}, \quad x > 0$$

In that case, we write $X \sim \text{Exp}(\lambda)$.

The probability density function is:

$$f(x) = \lambda e^{-\lambda x}, \quad x > 0$$
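A minimal sketch of these two functions follows, together with inverse-transform sampling, which solves $u = F(x)$ for $x$. The rate value and the seed are hypothetical, and the function names are chosen here for illustration.

```python
import math
import random

def exp_cdf(x, lam):
    """Distribution function F(x) = 1 - exp(-lam * x) for x > 0."""
    return 1.0 - math.exp(-lam * x) if x > 0 else 0.0

def exp_pdf(x, lam):
    """Density f(x) = lam * exp(-lam * x) for x > 0."""
    return lam * math.exp(-lam * x) if x > 0 else 0.0

def exp_sample(lam, rng):
    """Inverse-transform sampling: x = -log(1 - u) / lam with u uniform on [0, 1)."""
    u = rng.random()
    return -math.log(1.0 - u) / lam

rng = random.Random(2021)      # fixed seed, illustrative only
lam = 0.5                      # hypothetical rate, so the mean is 1 / lam = 2
draws = [exp_sample(lam, rng) for _ in range(10_000)]
sample_mean = sum(draws) / len(draws)
print(round(sample_mean, 3))   # should be close to 2
```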

Achieng (2010) used the exponential distribution due to its heavy-tailed and highly skewed nature to model claim severity motor comprehensive data of First Assurance Company Limited, Kenya (June 2006 - June 2007). The distribution was not a good fit due to its low log-likelihood value and the low density value in the graphical plot of its probability density function.

Mazviona and Chiduza (2013) used the exponential distribution to model a motor dataset. Their study found that this distribution failed to fit the data very closely based on the critical value for the chi-square test, and thus the null hypothesis was rejected.

Omari et al. (2018) used the exponential distribution to model an automobile dataset. The study rejected the null hypothesis for this distribution because it had the largest values among the distributions used for the Kolmogorov-Smirnov and Anderson-Darling tests.

2.3.2 Gamma Distribution 

The random variable $X$ has a gamma distribution with parameters $\alpha > 0$ and $\lambda > 0$ if it has probability density function:

$$f(x) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\lambda x}, \quad x > 0$$

The parameter $\alpha$ changes the shape of the graph of the probability density function, and the parameter $\lambda$ changes the x-scale. In that case, we write $X \sim \text{Ga}(\alpha, \lambda)$.
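The density can be sketched as below, assuming the shape and rate parameterisation $\text{Ga}(\alpha, \lambda)$ in which $\lambda$ rescales the x-axis. The function name and test values are hypothetical. Working on the log scale avoids overflow for large $\alpha$, and setting $\alpha = 1$ recovers the exponential density with rate $\lambda$.

```python
import math

def gamma_pdf(x, alpha, lam):
    """Gamma density lam^alpha / Gamma(alpha) * x^(alpha-1) * exp(-lam*x), x > 0.

    Assumes the shape (alpha) and rate (lam) parameterisation.
    """
    if x <= 0:
        return 0.0
    log_f = (alpha * math.log(lam) - math.lgamma(alpha)
             + (alpha - 1) * math.log(x) - lam * x)
    return math.exp(log_f)

# With alpha = 1 the gamma density collapses to the exponential density.
x, lam = 2.0, 0.5                       # hypothetical values
print(gamma_pdf(x, 1.0, lam), lam * math.exp(-lam * x))
```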


Achieng (2010) used the gamma distribution due to its heavy-tailed and highly skewed nature to model claim severity motor comprehensive data of First Assurance Company Limited, Kenya (June 2006 - June 2007). The study concluded that the gamma distribution was not a suitable model for the claims data based on its Q-Q plot.

Mazviona and Chiduza (2013) used the gamma distribution to model a motor dataset. Their study found that this distribution failed to fit the data very closely based on the critical value for the chi-square test, and thus the null hypothesis was rejected.

Packova and Brebera (2015) used the gamma distribution to model data obtained from a Czech insurance company for compulsory motor third-party liability insurance. This distribution was used because it is specifically applicable to the modelling of claim severity. The study found that the gamma distribution failed to be a suitable model for the losses based on the Anderson-Darling test value.

Omari et al. (2018) used the gamma distribution to model an automobile dataset. The study found that the gamma distribution is a better model among the others based on the log-likelihood value.

2.3.3 Lognormal Distribution 

$X$ has a lognormal distribution if $\log X$ has a normal distribution. If $X$ represents, for example, claim size and $Y = \log X$ has a normal distribution, then $X$ is said to have a lognormal distribution.

When $\log X \sim N(\mu, \sigma^2)$, $X \sim LN(\mu, \sigma^2)$.

The probability density function of the lognormal distribution is defined by:

$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{\log x - \mu}{\sigma}\right)^2\right), \quad 0 < x < \infty$$

If $X$ has a lognormal distribution with parameters $\mu$ and $\sigma$, then we can write $X \sim \log N(\mu, \sigma^2)$.
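The link to the normal distribution can be made explicit in code. The sketch below evaluates the lognormal density and distribution function through the standardised log-claim $z = (\log x - \mu)/\sigma$; the parameter values are hypothetical. It also shows the positive skew: the mean $e^{\mu + \sigma^2/2}$ exceeds the median $e^{\mu}$.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of LN(mu, sigma^2) for x > 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2 * math.pi))

def lognormal_cdf(x, mu, sigma):
    """P(X <= x) = Phi((log x - mu) / sigma), using the error function."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))

mu, sigma = 10.0, 1.2                       # hypothetical parameters
median = math.exp(mu)                       # half of all claims fall below this value
mean = math.exp(mu + sigma ** 2 / 2)        # pulled upward by the long right tail
print(lognormal_cdf(median, mu, sigma), mean > median)
```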

Achieng (2010) used the lognormal distribution due to its heavy-tailed and highly skewed nature to model claim severity motor comprehensive data of First Assurance Company Limited, Kenya (June 2006 - June 2007). The finding of the study was that the lognormal distribution would be the best statistical distribution to model the claim amounts of First Assurance Company Limited at a 99% level of confidence. This was because the lognormal distribution had the smallest Akaike Information Criterion value.

Burnecki et al. (2010) used the lognormal distribution to model the Danish fire losses dataset, which concerns major fire losses that occurred between 1980 and 1990 and were recorded by Copenhagen Re. This distribution was chosen since it is a typical candidate for claim size distributions considered in application. The study concluded by suggesting the lognormal distribution as a model for the Danish fire loss amounts because it was the only distribution that passed all the applied tests therein.

Mazviona and Chiduza (2013) used the lognormal distribution to model a motor dataset. Their study found that this distribution fit the data very closely and, based on the graphical plot of its probability density function, concluded that it produced the best fit for lower claims.

Packova and Brebera (2015) used the lognormal distribution to model data obtained from a Czech insurance company for compulsory motor third-party liability insurance. This distribution was used because it is specifically applicable to the modelling of claim severity. The study found that the lognormal distribution failed to be a good model for larger losses based on the chi-square test value.

Omari et al. (2018) used the lognormal distribution to model an automobile dataset. The study found that the lognormal distribution was the most suitable model since it had the lowest Akaike Information Criterion and Bayesian Information Criterion values.

2.3.4 Pareto Distribution 

A random variable $X$ has the Pareto distribution with parameters $\alpha > 0$ and $\lambda > 0$ if it has distribution function:

$$F(x) = 1 - \left(\frac{\lambda}{\lambda + x}\right)^{\alpha}, \quad x > 0$$

In that case, we write $X \sim \text{Pa}(\alpha, \lambda)$.

The probability density function is given by:

$$f(x) = \frac{\alpha \lambda^{\alpha}}{(\lambda + x)^{\alpha + 1}}, \quad x > 0$$
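To see why this distribution is favoured for large claims, the sketch below compares its tail probability with that of an exponential distribution with the same mean. It uses the standard result that the mean of $\text{Pa}(\alpha, \lambda)$ is $\lambda / (\alpha - 1)$ for $\alpha > 1$; the parameter values and the threshold are hypothetical.

```python
import math

def pareto_cdf(x, alpha, lam):
    """Distribution function F(x) = 1 - (lam / (lam + x))^alpha for x > 0."""
    return 1.0 - (lam / (lam + x)) ** alpha if x > 0 else 0.0

def pareto_pdf(x, alpha, lam):
    """Density f(x) = alpha * lam^alpha / (lam + x)^(alpha + 1) for x > 0."""
    return alpha * lam ** alpha / (lam + x) ** (alpha + 1) if x > 0 else 0.0

alpha, lam = 3.0, 200.0                 # hypothetical parameters, mean = 200 / 2 = 100
exp_rate = (alpha - 1) / lam            # exponential rate giving the same mean
x = 1000.0                              # a claim ten times the mean

pareto_tail = 1.0 - pareto_cdf(x, alpha, lam)   # P(X > x) under the Pareto model
exp_tail = math.exp(-exp_rate * x)              # P(X > x) under the exponential model
print(pareto_tail, exp_tail)
```

With these illustrative values the Pareto model assigns roughly a hundred times more probability to the large claim than the exponential model does.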

Mazviona and Chiduza (2013) used the Pareto distribution to model a motor dataset. Their study found that this distribution fit the data very closely and, based on the graphical plot of its probability density function, concluded that it provided the best fit for larger claims. The study further recommended the Pareto distribution because it does not undervalue the probabilities for larger claims.

Packova and Brebera (2015) used the Pareto distribution to model data obtained from a Czech insurance company for compulsory motor third-party liability insurance. This was because this distribution is frequently used as a model for insurance losses where well-fitted tails are required. The study concluded that the Pareto distribution is a good model for large claims based on the tests carried out.

Omari et al. (2018) used the Pareto distribution to model an automobile dataset. This distribution was used since it has been shown to sufficiently mimic the tail behaviour of claim amounts, thereby providing a good fit. However, the study discarded the Pareto distribution since its values were extremely out of range based on the tests carried out.

2.3.5 Weibull Distribution

This is a very flexible distribution which can be used to model claim severity. It is a modification of the Pareto and exponential distributions, usually with $\gamma < 1$. A random variable $X$ has a Weibull distribution with parameters $c > 0$ and $\gamma > 0$ if it has distribution function:

$$F(x) = 1 - \exp(-c x^{\gamma}), \quad x > 0$$

In that case, we write $X \sim W(c, \gamma)$. The probability density function of the $W(c, \gamma)$ distribution is:

$$f(x) = c \gamma x^{\gamma - 1} \exp(-c x^{\gamma}), \quad x > 0$$
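The two functions can be checked against each other numerically. In the sketch below, with hypothetical parameter values, setting $\gamma = 1$ reproduces the exponential distribution function with rate $c$, and a central difference of $F$ approximates $f$.

```python
import math

def weibull_cdf(x, c, gamma):
    """Distribution function F(x) = 1 - exp(-c * x^gamma) for x > 0."""
    return 1.0 - math.exp(-c * x ** gamma) if x > 0 else 0.0

def weibull_pdf(x, c, gamma):
    """Density f(x) = c * gamma * x^(gamma - 1) * exp(-c * x^gamma) for x > 0."""
    if x <= 0:
        return 0.0
    return c * gamma * x ** (gamma - 1) * math.exp(-c * x ** gamma)

c, gamma, x = 0.2, 0.7, 3.0             # hypothetical values with gamma < 1
h = 1e-6
numeric_density = (weibull_cdf(x + h, c, gamma) - weibull_cdf(x - h, c, gamma)) / (2 * h)
print(numeric_density, weibull_pdf(x, c, gamma))

# gamma = 1 gives the exponential distribution function with rate c.
print(weibull_cdf(x, c, 1.0), 1.0 - math.exp(-c * x))
```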


Achieng (2010) used the Weibull distribution due to its heavy-tailed and highly skewed 

nature to model claim severity motor comprehensive data of First Assurance Company 

Limited, Kenya (June 2006 - June 2007). The study established that the Weibull 

distribution is not a suitable model for the claims data based on the tests carried out. 

Packova and Brebera (2015) used the Weibull distribution to model data obtained from a Czech insurance company for compulsory motor third-party liability insurance. This distribution was used because it is specifically applicable to the modelling of claim severity. The study found that the Weibull distribution failed to be a good model for the losses based on the Anderson-Darling test value.

Omari et al. (2018) used the Weibull distribution to model an automobile dataset. The study

discarded this distribution as the appropriate distribution since it failed to meet the

selection criteria based on the tests carried out.

2.3.6 Burr Distribution 

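The distribution function of the Pareto distribution $Pa(\alpha, \lambda)$ is:

$$F(x) = 1 - \frac{\lambda^{\alpha}}{(\lambda + x)^{\alpha}}, \quad x > 0$$

A further parameter $\gamma > 0$ can be introduced by setting:

$$F(x) = 1 - \frac{\lambda^{\alpha}}{(\lambda + x^{\gamma})^{\alpha}}, \quad x > 0$$

This is the distribution function of the transformed Pareto or Burr distribution. The extra

parameter provides additional flexibility when a fit to data is needed. The probability

density function, obtained by differentiating $F(x)$, is given by:

$$f(x) = \frac{\alpha \gamma \lambda^{\alpha} x^{\gamma - 1}}{(\lambda + x^{\gamma})^{\alpha + 1}}, \quad x > 0$$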
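As a rough illustration of the Burr (transformed Pareto) distribution described above, the following Python sketch evaluates its distribution and density functions. The function names `burr_cdf` and `burr_pdf` and the parameter values are assumptions made for this example only; they are not taken from the study.

```python
import math


def burr_cdf(x, alpha, lam, gamma):
    """Burr distribution function F(x) = 1 - (lam / (lam + x**gamma))**alpha for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (lam / (lam + x ** gamma)) ** alpha


def burr_pdf(x, alpha, lam, gamma):
    """Burr density, i.e. the derivative of burr_cdf with respect to x."""
    if x <= 0:
        return 0.0
    numerator = alpha * gamma * lam ** alpha * x ** (gamma - 1)
    return numerator / (lam + x ** gamma) ** (alpha + 1)


# Illustrative (hypothetical) parameter values.
alpha, lam, gamma = 2.0, 1.5, 0.8

# Central-difference check that the density matches the slope of the distribution function.
x, h = 2.0, 1e-6
numeric_slope = (burr_cdf(x + h, alpha, lam, gamma) - burr_cdf(x - h, alpha, lam, gamma)) / (2 * h)
print(numeric_slope, burr_pdf(x, alpha, lam, gamma))

# With gamma = 1 the Burr distribution reduces to the Pareto distribution above.
print(burr_cdf(3.0, 2.0, 1.0, 1.0))
```

Setting `gamma = 1` recovers the two-parameter Pareto case, which is one way to see that the extra parameter only adds flexibility.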

Burnecki et al. (2010) used the Burr distribution to model a fire losses dataset. This

distribution was chosen since it is a typical candidate for claim size distributions

considered in applications. The study did not recommend the Burr distribution as a good

model since it failed to pass the tests applied therein.

It is worth mentioning that while this distribution can be used to model claim size

distributions, most studies fail to include it among their chosen distributions. It would be

interesting to conduct a study to determine the rationale behind the

exclusion of this distribution in major studies.

2.4 Goodness-of-fit Test 

This refers to verifying whether a particular loss distribution provides a good model for

the observed claim amounts, i.e. whether a model provides a "good fit" to the data. This

involves determining the quantitative "compatibility between the estimated theoretical

distributions against the empirical distributions of the sample data" (Omari et al., 2018).

This enables one "to determine whether the observed sample was drawn from a population

that follows a particular probability distribution" (Dodge, 2008).

Myung (2003) argued that even though a good fit is required, it is not an adequate

requirement for one to conclude that one model provides a closer approximation to data 

than does another model simply because the former model fits the data better than the 

latter. A better fit (larger value of the maximized log-likelihood) simply places the model 

in a series of competing models for additional considerations such as goodness-of-fit tests. 

The Kolmogorov-Smirnov and Anderson-Darling tests are used because they are suitable

for performing an exact test on continuous distributions. 

2.4.1 Kolmogorov-Smirnov Test 

The Kolmogorov-Smirnov test is a nonparametric goodness-of-fit test and is used to

determine whether an underlying probability distribution differs from a hypothesized

distribution.

Consider an independent random sample $(x_1, x_2, \ldots, x_n)$ of size $n$ with unknown

distribution function $F(x)$, hypothesized to come from a population with a specific and known

distribution function $F_0(x)$. The hypothesis to test is as follows:

$$H_0: F(x) = F_0(x)$$

If $F_n(x)$ is the empirical distribution function of the random sample, then the test statistic

$T_n$ is defined as the greatest vertical distance between $F_0(x)$ and $F_n(x)$:

$$T_n = \sup_x |F_0(x) - F_n(x)|$$

The decision rule is to reject $H_0$ at the significance level $\alpha$ if $T_n$ is greater than the value

of the Kolmogorov table for the parameters $n$ and $1 - \alpha$, which is denoted by $t_{n,1-\alpha}$,

i.e., if:

$$T_n > t_{n,1-\alpha}$$
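The Kolmogorov-Smirnov statistic described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `ks_statistic`, the sample values and the hypothesized exponential rate are assumptions for the example, and the critical value would still have to be read from a Kolmogorov table.

```python
import math


def ks_statistic(sample, cdf):
    """Largest vertical distance between the hypothesized CDF and the empirical CDF.

    The empirical distribution function jumps at each ordered observation,
    so the supremum is attained just before or just after one of the jumps.
    """
    xs = sorted(sample)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        f0 = cdf(x)
        distance = max(distance, i / n - f0, f0 - (i - 1) / n)
    return distance


# Hypothetical sample tested against an exponential distribution with rate 1.
sample = [0.1, 0.4, 0.9, 1.7, 2.5]
t_n = ks_statistic(sample, lambda x: 1.0 - math.exp(-x))
print(t_n)
```

A smaller value of the statistic means the empirical distribution lies closer to the hypothesized one.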

2.4.2 Anderson-Darling Test 

The Anderson-Darling test is a goodness-of-fit test which allows testing the

hypothesis that the distribution of a random variable observed in a sample follows a certain

theoretical distribution.

Consider a random variable $X$, which follows a particular distribution, and has a

distribution function $F_0(x; \theta)$, where $\theta$ is a parameter (or a set of parameters) that

determines $F_0$. We further assume $\theta$ to be known. An observation of a sample of size $n$

drawn from the variable $X$ gives a distribution function $F(x)$. The Anderson-Darling

statistic, denoted by $A^2$, is then given by the weighted sum of the squared deviations

$F_0(x; \theta) - F(x)$.

Starting from the fact that $A^2$ is a random variable that follows a certain distribution over

the interval $[0, +\infty)$, it is possible to test, for a significance level that is fixed a priori,

whether $F(x)$ is the realization of the random variable $F_0(X; \theta)$; that is, whether $X$

follows the probability distribution with the distribution function $F_0(x; \theta)$.

The computation of the $A^2$ statistic is as follows: arrange the observations $x_1, x_2, \ldots, x_n$ in

the sample obtained from $X$ in ascending order, i.e., $x_1 < x_2 < \cdots < x_n$. $A^2$ is then

computed as:

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1) \left[ \ln z_i + \ln(1 - z_{n+1-i}) \right]$$

where $z_i = F_0(x_i; \theta)$, $(i = 1, 2, \ldots, n)$

The null hypothesis is rejected when $A^2$ exceeds the limiting value for the chosen

significance level in the Anderson-Darling test table.
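The computation of the $A^2$ statistic can be sketched as follows. The sketch assumes the standard computational form of the statistic, $A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)[\ln z_i + \ln(1 - z_{n+1-i})]$ with $z_i = F_0(x_i; \theta)$; the function name `anderson_darling` and the sample values are hypothetical.

```python
import math


def anderson_darling(sample, cdf):
    """Anderson-Darling A^2 for a fully specified hypothesized CDF.

    Every value cdf(x) must lie strictly between 0 and 1,
    otherwise the logarithms are undefined.
    """
    xs = sorted(sample)
    n = len(xs)
    z = [cdf(x) for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        # z[i - 1] is z_i and z[n - i] is z_(n+1-i) in 1-based notation.
        total += (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
    return -n - total / n


# Hypothetical sample tested against an exponential distribution with rate 1.
sample = [0.1, 0.4, 0.9, 1.7, 2.5]
a_squared = anderson_darling(sample, lambda x: 1.0 - math.exp(-x))
print(a_squared)
```

Because of the logarithms, observations in either tail of the hypothesized distribution contribute heavily to the sum, which is why the test is regarded as sensitive to tail behaviour.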

2.5 Model Selection Criteria 

An information criterion measures the quality of a model by analysing how well the model

fits the data and the simplicity of the model. Information criteria are used to compare

different models that are fitted to the same data set. Ceteris paribus, a model with a smaller

value is preferable to one with a larger value.

2.5.1 Akaike Information Criterion 

The Akaike Information Criterion (AIC) was formulated by and named after Akaike 

(1974). The AIC is an estimator of out-of-sample prediction error that yields a technique

for model selection. AIC estimates the relative amount of information lost by a given

model: the less information that is lost by a model, the better the model's quality.

In a series of models, the most appropriate model is the one with the least

AIC value. The AIC value for a model is calculated as follows:

$$AIC = 2k - 2\ln(L)$$

where $k$ is the number of estimated parameters in the model and $L$ is the maximum value

of the likelihood function of the model.

2.5.2 Bayesian Information Criterion 

The Bayesian Information Criterion (BIC) is also known as the Schwarz Information

Criterion (SIC). It is a criterion used to select models among a finite series of models.

BIC was formulated by Schwarz (1978), who gave a Bayesian argument for its adoption.

It has been widely used for model selection and can be applied to any set of

maximum-likelihood-based models. The preferred model is the one that has the lowest

BIC value, since this indicates the best balance between fit and the penalty term. It is similar to the Akaike Information

Criterion and is likewise based on the likelihood function. BIC introduces a larger

penalty term than AIC whenever $\ln(n) > 2$, that is, for samples of $n \geq 8$ observations.

The BIC value for a model is calculated as follows:

$$BIC = k \ln(n) - 2\ln(L)$$

where $k$ is the number of estimated parameters in the model, $n$ is the number of

observations and $L$ is the maximized value of the likelihood function of the model.
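The two criteria are simple to compute once the maximized log-likelihood is known. The sketch below is illustrative only; the function names `aic` and `bic` are assumptions. The inputs are the Weibull figures that this study reports for motor commercial claims (log-likelihood of -1161.697 in Table 2, two estimated parameters, 170 observations).

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_likelihood


# Weibull fit for motor commercial as reported in this study.
weibull_aic = aic(-1161.697, k=2)
weibull_bic = bic(-1161.697, k=2, n=170)
print(round(weibull_aic, 2), round(weibull_bic, 2))
```

The results are approximately 2327.39 and 2333.67, in line with the values reported in Table 4; small differences in the last decimal place come from the rounding of the log-likelihood.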

CHAPTER THREE: METHODOLOGY 

This chapter outlines the methodological framework that will be used in the study. Section 

3.1 discusses the design of the research; section 3.2 discusses the population and sample 

of the study while section 3.3 discusses the collection of the study's data. Finally, section

3.4 explains the data analysis processes that will be carried out by the study.

3.1 Research Design 

This study adopts a quantitative method which focuses on modelling an appropriate

loss distribution for motor insurance claim severity in Kenya. The selected

loss distribution will then be tested to check its goodness-of-fit before being

recommended as an appropriate model. The variable of interest is the claim size in the

motor insurance industry. This study will use data for Kenya motor insurance companies

from 2014 to 2018. This period was chosen to make the study as appropriate,

relevant, reliable, and applicable as possible.

3.2 Population and Sampling 

The focus of the study is motor insurance companies in Kenya. As of 2018, there were 36 

licensed insurance companies in Kenya which were providing motor insurance services.

These companies will therefore form the population of the study. This study

will focus on the whole population since the data is readily available and the

population size is relatively small. It is worth noting that this study will not focus on

reinsurance companies since they need to be studied independently.

3.3 Data Collection 

This study seeks to apply the theoretical and empirical frameworks of past studies and

extend them to the Kenyan economy. The aim is to fit a suitable loss distribution to the claim

severity data for motor insurance companies using the company-specific data set. This

study is focused on claim sizes for motor insurance companies in Kenya from 2014 to

2018. This implies that the type of data to be used within the study is quantitative

continuous ratio panel data. The population that is being focused on by the study is made

up of 36 licensed motor insurance companies that are regulated by the Insurance 

Regulatory Authority (IRA). 

IRA produces annual reports highlighting activities within the insurance industry by 

various insurance companies. This study will use the data that is contained within these 

annual reports that have been published by the regulator. The study will thus use 

secondary data provided by IRA that includes, among other things, the claim sizes of

various licensed insurance companies providing motor insurance services. 

The annual reports provided by IRA are available in Microsoft Excel Binary File Format.

This will make it easier for the study to extract the relevant data from the reports using

programs such as Microsoft Excel and R. These programs will then be further used to

analyse the extracted data according to the research objectives. 

3.4 Data Analysis 

In compliance with the research objectives, this study seeks to find an appropriate model

that fits the claim sizes of motor insurance companies. Ignatov et al. (2001) provided the

following steps to be followed when fitting a suitable model to claims data: 

a. Select a family of distributions for the claims model.

b. Estimate the parameters for the model.

c. Specify a selection criterion to determine the appropriate distribution from the

family of distributions.

d. Carry out a goodness-of-fit test on the selected appropriate distribution.

This study will, therefore, follow the above steps in line with the research objectives while 

analysing the data. Microsoft Excel and R computer programs will be used together for

data analysis within the study. The first step that will be carried out on the data is to find

its descriptive statistics such as mean, variance and skewness. These values will come in

handy when comparing them with the results obtained from the various models to select 

an appropriate loss distribution. 

In line with the research objectives of this study, it is prudent to explain the framework

that will be applied in the process of analysing data. This will include parameter 

estimation, standard continuous distributions, goodness-of-fit test, and model selection 

criteria. These will then form the basis of data analysis that will eventually yield an 

appropriate loss distribution model. It is important to note that sections 3.4.1 to 3.4.4 

below are outlining the theoretical framework that will be applied in analysing the data 

using Microsoft Excel and R computer programs. 

3.4.1 Maximum Likelihood Estimation 

The parameters of the chosen loss distribution in this study will be estimated using the 

Maximum Likelihood method. The most important stage in applying the method is that of 

writing down the likelihood: 

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$$

for a random sample $x_1, x_2, \ldots, x_n$ from a population with density or probability function

$f(x; \theta)$.

In most cases taking logs greatly simplifies the determination of the maximum likelihood

estimator (MLE) $\hat{\theta}$.

The following steps are used when determining a maximum likelihood estimate (MLE):

1. Specify the likelihood function for the available data.

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$$

2. Simplify the algebra using natural logs.

$$l(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta)$$

3. Maximise the log-likelihood function by differentiating the log-likelihood

function with respect to each of the unknown parameters and equating the resulting

expression(s) to zero.

$$\frac{\partial}{\partial \theta} l(\theta) = 0$$

4. The MLEs of the parameters are obtained by solving the resulting equation(s). To

ensure that the obtained values maximise the likelihood function, differentiate a

second time and check that the second derivative is negative at the solution.

3.4.2 Standard Continuous Distributions 

This study will employ the following parametric families of continuous distributions:

Exponential, Gamma, Lognormal, Pareto, and Weibull. In this section, the mean, variance,

and the maximum likelihood estimates for the parameters will be given. 

A. Exponential Distribution 

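The distribution function of an exponential distribution is given by:

$$F(x) = 1 - e^{-\lambda x}, \quad x > 0$$

The probability density function is:

$$f(x) = \lambda e^{-\lambda x}, \quad x > 0$$

The mean and variance of $X$ are:

$$E(X) = \frac{1}{\lambda}, \qquad var(X) = \frac{1}{\lambda^2}$$

The likelihood function is:

$$L = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^n e^{-\lambda \sum x_i} = \lambda^n e^{-\lambda n \bar{x}}$$

where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

The log-likelihood function becomes:

$$\log L = n \log \lambda - \lambda n \bar{x}$$

Determine stationary points by differentiating:

$$\frac{\partial}{\partial \lambda} \log L = \frac{n}{\lambda} - n \bar{x}$$

Setting this to zero gives:

$$\hat{\lambda} = \frac{1}{\bar{x}}$$

B. Gamma Distribution

The probability density function of the gamma distribution is given by:

$$f(x) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\lambda x}, \quad x > 0$$

The mean and variance of $X$ are:

$$E(X) = \frac{\alpha}{\lambda}, \qquad var(X) = \frac{\alpha}{\lambda^2}$$

The moment estimators are used as initial estimators for the MLEs since the MLEs cannot be

obtained in closed form (i.e. in terms of elementary functions).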
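The closed-form exponential estimate and the gamma moment estimators mentioned above can be sketched as follows. The moment estimators are obtained here by equating the sample mean and variance to $\alpha/\lambda$ and $\alpha/\lambda^2$; the use of the sample variance with divisor $n - 1$, the function names and the data values are assumptions made for this illustration.

```python
def exponential_mle(xs):
    """MLE of the exponential rate: lambda-hat = 1 / sample mean."""
    return len(xs) / sum(xs)


def gamma_moment_estimates(xs):
    """Method-of-moments starting values (alpha, lambda) for the gamma distribution.

    Solving mean = alpha / lambda and variance = alpha / lambda**2 gives
    alpha = mean**2 / variance and lambda = mean / variance.
    """
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)  # assumed divisor n - 1
    return mean ** 2 / variance, mean / variance


# Hypothetical claim amounts, for illustration only.
claims = [1.0, 2.0, 3.0]
rate_hat = exponential_mle(claims)
alpha_start, lambda_start = gamma_moment_estimates(claims)
print(rate_hat, alpha_start, lambda_start)
```

The gamma values returned here would serve only as starting points for a numerical maximisation of the log-likelihood.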

C. Lognormal Distribution 

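The probability density function of the lognormal distribution is defined by:

$$f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{\log x - \mu}{\sigma} \right)^2 \right] \quad \text{for } 0 < x < \infty$$

The mean and variance of $X$ are:

$$E(X) = e^{\mu + \sigma^2 / 2}, \qquad var(X) = e^{2\mu + \sigma^2} \left( e^{\sigma^2} - 1 \right)$$

Estimating the MLEs is simple since $\mu$ and $\sigma^2$ may be estimated using the log-transformed

data. Let $x_1, x_2, \ldots, x_n$ be the observed values and let $y_i = \log x_i$.

The MLEs will be given by:

$$\hat{\mu} = \bar{y} = \frac{1}{n} \sum_{i=1}^{n} \log x_i, \qquad \hat{\sigma}^2 = s_y^2$$

where the subscript $y$ signifies a sample variance computed on the $y$ values (with divisor $n$ for the exact MLE)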
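A minimal Python sketch of the lognormal estimation on log-transformed data is given below. The function name `lognormal_mle` and the data are hypothetical, and the variance uses divisor $n$, which is the exact maximum likelihood form; a sample variance with divisor $n - 1$ would differ slightly for small samples.

```python
import math


def lognormal_mle(xs):
    """Return (mu_hat, sigma2_hat) computed from y_i = log(x_i)."""
    ys = [math.log(x) for x in xs]
    n = len(ys)
    mu_hat = sum(ys) / n
    sigma2_hat = sum((y - mu_hat) ** 2 for y in ys) / n  # divisor n (MLE form)
    return mu_hat, sigma2_hat


# Hypothetical claim amounts chosen so that the logs are 0 and 2.
claims = [1.0, math.exp(2.0)]
mu_hat, sigma2_hat = lognormal_mle(claims)
print(mu_hat, sigma2_hat)
```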

D. Pareto Distribution 

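The distribution function of the Pareto distribution is defined as:

$$F(x) = 1 - \left( \frac{\lambda}{\lambda + x} \right)^{\alpha}, \quad x > 0$$

The probability density function is given by:

$$f(x) = \frac{\alpha \lambda^{\alpha}}{(\lambda + x)^{\alpha + 1}}, \quad x > 0$$

The mean and variance of $X$ are:

$$E(X) = \frac{\lambda}{\alpha - 1} \ (\alpha > 1), \qquad var(X) = \frac{\alpha \lambda^2}{(\alpha - 1)^2 (\alpha - 2)} \ (\alpha > 2)$$

The likelihood function, written for the form of the Pareto density $\alpha \lambda^{\alpha} / x^{\alpha + 1}$ on $x \geq \lambda$, is:

$$L = \prod_{i=1}^{n} \frac{\alpha \lambda^{\alpha}}{x_i^{\alpha + 1}}, \quad 0 < \lambda \leq \min(x_i), \ \alpha > 0$$

The log-likelihood function becomes:

$$\log L = n \log(\alpha) + \alpha n \log(\lambda) - (\alpha + 1) \sum_{i=1}^{n} \log(x_i)$$

To maximize the log-likelihood function, set $\hat{\lambda} = \min(x_i)$, so that $\lambda$ does not exceed the

least $x_i$. Differentiating with respect to $\alpha$ and setting to zero:

$$\frac{\partial \log L}{\partial \alpha} = \frac{n}{\alpha} + n \log(\lambda) - \sum_{i=1}^{n} \log(x_i) = 0$$

This will result in:

$$\hat{\alpha} = \frac{n}{\sum_{i=1}^{n} \log\left( x_i / \hat{\lambda} \right)}$$

E. Weibull Distribution

The distribution function of the Weibull distribution is:

$$F(x) = 1 - \exp(-c x^{\gamma}), \quad x > 0$$

The probability density function is:

$$f(x) = c \gamma x^{\gamma - 1} \exp(-c x^{\gamma}), \quad x > 0$$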

The method of maximum likelihood is not simple to apply if both $c$ and $\gamma$ are unknown.

Nevertheless, the equations are elementary when a computer is used. In the case where

$\gamma$ has the known value $\gamma^*$, maximum likelihood is easy enough. We use the data

transformation $y_i = x_i^{\gamma^*}$. The $y$ values will now have an exponential distribution. The

MLE analysis can now be done easily.
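The known-$\gamma$ case can be sketched as follows: the data are transformed to $y_i = x_i^{\gamma^*}$ and the exponential result $\hat{c} = 1/\bar{y}$ is applied. The function names, the data and the value of $\gamma^*$ are assumptions made for this illustration.

```python
import math


def weibull_c_mle(xs, gamma_star):
    """MLE of c when gamma is known: transform y = x**gamma_star, then c-hat = 1 / mean(y)."""
    ys = [x ** gamma_star for x in xs]
    return len(ys) / sum(ys)


def weibull_loglik(xs, c, gamma):
    """Log-likelihood of the W(c, gamma) density c * gamma * x**(gamma - 1) * exp(-c * x**gamma)."""
    n = len(xs)
    return (n * math.log(c) + n * math.log(gamma)
            + (gamma - 1) * sum(math.log(x) for x in xs)
            - c * sum(x ** gamma for x in xs))


# Hypothetical claim amounts and an assumed known shape gamma* = 2.
claims = [1.0, 2.0, 3.0]
gamma_star = 2.0
c_hat = weibull_c_mle(claims, gamma_star)

# The estimate should give a higher log-likelihood than nearby values of c.
print(c_hat, weibull_loglik(claims, c_hat, gamma_star))
```

When both parameters are unknown, the same log-likelihood function could be passed to a numerical optimiser instead.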

3.4.3 Goodness-of-fit Test 

It is important to test whether a particular loss distribution is a suitable model for the

observed claim amounts, i.e. whether a model provides a "good fit" to the data. This

enables one "to determine whether the observed sample was drawn from a population that

follows a particular probability distribution" (Dodge, 2008). In this paper, both the

Kolmogorov-Smirnov and Anderson-Darling tests will be applied because they are

suitable for performing an exact test on continuous distributions. For all the

goodness-of-fit tests, the hypotheses will be formulated as follows:

$H_0$: The claim severity data follows a particular distribution $[F(x) = F_0(x)]$

$H_1$: The claim severity data does not follow the particular distribution $[F(x) \neq F_0(x)]$

where $F(x)$ is the unknown distribution function of the claim severity data (sample) while

$F_0(x)$ is a specific and known distribution function (population).

3.4.4 Information Criteria 

For all the selected claim severity distributions that pass the goodness-of-fit test, both the

Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) will be

used to select the best model for the claim severity data. It is possible to increase the

likelihood by adding parameters when fitting models. However, this may lead to

overfitting. Both AIC and BIC introduce a penalty term for the number of parameters in

the model in a bid to resolve the problem of overfitting. BIC introduces a larger penalty

term than AIC whenever $\ln(n) > 2$. Even though the BIC value is then higher than the AIC value, the lower the

value of these two criteria, the better a model is.

3.4.5 Model Selection Criteria 

The research objective is to determine an appropriate loss distribution that provides a good

fit to the claim severity data of motor insurance companies in Kenya. Based on the data

analysis procedures that have been outlined in sections 3.4.1 to 3.4.4 above,

the loss distribution that will be selected as being an appropriate model is one that has:

1. The maximum log-likelihood value subject to passing the goodness-of-fit tests, and

2. The minimum AIC and BIC values.

The study will focus on finding the loss distribution that meets the above requirements.

CHAPTER FOUR: DATA ANALYSIS 

4.1 Introduction 

The data required for this study was motor insurance claim severity for Kenya. This data 

was readily available from publications of the Insurance Regulatory Authority (IRA) in

the form of annual reports. The annual reports were obtained in Microsoft Excel format and

the required data was extracted from them. The data contained claim severity for 36 insurance

companies that were licensed and regulated by IRA from 2014 to 2018. The data for Kenya

motor insurance incurred claims was provided in terms of motor commercial and motor

private. These two sets of data for the period 2014-2018 were analysed separately to obtain

suitable models for each category of motor insurance. The data was analysed using R

software. 

4.2 Descriptive Statistics 

The first step of data analysis was to determine the descriptive statistics of the data to get 

a general overview and allow a simpler interpretation of the data. Table 1 shows the 

descriptive statistics of the motor insurance incurred claims. The descriptive statistics in 

Table 1 confirm the positive skewness of the claims data and, as such, that positively skewed

and long-tailed distributions are appropriate to model this data.

Table 1. Kenya Motor Insurance Incurred Claims Descriptive Statistics for 2014-2018

Statistic                  Motor Commercial      Motor Private
No. of observations (n)    170                   174
Mean                       281,703,882.35        386,762,402.30
Standard Error             23,826,723.55         30,253,278.83
Median                     178,457,000.00        216,641,000.00
Standard Deviation         310,662,467.00        399,068,156.03
Sample Variance            9.65E+16              1.59E+17
Kurtosis                   3.7716                2.6583
Skewness                   1.9078                1.6815
Range                      1,470,681,000.00      1,991,252,000.00
Minimum                    89,000.00             994,000.00
Maximum                    1,470,770,000.00      1,992,246,000.00
Sum                        47,889,660,000.00     67,296,658,000.00
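The descriptive statistics in Table 1 can be reproduced for any list of claim amounts with a short Python sketch such as the one below. The function name `describe`, the toy data and the choice of the bias-adjusted skewness formula (the form used by common spreadsheet software) are assumptions; the study does not state which formula was applied.

```python
import math


def describe(xs):
    """Mean, sample variance, standard deviation, standard error and skewness."""
    n = len(xs)
    mean = sum(xs) / n
    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)
    sd = math.sqrt(variance)
    std_error = sd / math.sqrt(n)
    # Bias-adjusted sample skewness (assumed formula).
    skewness = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in xs)
    return {"n": n, "mean": mean, "variance": variance, "sd": sd,
            "std_error": std_error, "skewness": skewness,
            "range": max(xs) - min(xs), "sum": sum(xs)}


# Toy data: symmetric, so the skewness is zero.
stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)

# Consistency check on Table 1: standard error = standard deviation / sqrt(n).
print(310662467.00 / math.sqrt(170))
```

The last line returns a value that agrees with the motor commercial standard error reported in Table 1 to within rounding.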

Figure 1 shows the histograms of the original data of motor commercial and motor private

claim sizes. The normal curves superimposed on the histograms show that the claim sizes

are skewed to the right. The purpose of carrying out the normality check was to determine

whether to use parametric or non-parametric tests on the data after fitting

distributions. Given that the data did not follow the normal curve, this was a

confirmation that non-parametric tests would be applied to the distributions that would be 

fitted on the data. 

[Figure 1: two panels, "Histogram of Motor Commercial" and "Histogram of Motor Private"; x-axis: Claim Size; y-axis: Frequency]

Figure 1

Figure 2 shows the Q-Q plots of the original data of motor commercial and motor private 

claim sizes. Given that the upper ends of the Q-Q plots deviate more from the straight

line than the lower ends, this confirms that the data is positively skewed. This implied the

need to use continuous distributions that are positively skewed to fit the data. 

[Figure 2: two panels, "Q-Q Plot of Motor Commercial" and "Q-Q Plot of Motor Private"; x-axis: Theoretical Quantiles; y-axis: Sample Quantiles]

Figure 2

The data was transformed using the cube root function to make it simpler to work with 

since the original data contained very large values. Figure 3 shows the histograms of the

cube-root transformed data of motor commercial and motor private claim sizes.

[Figure 3: two panels, "Histogram of Motor Commercial" and "Histogram of Motor Private", for the cube-root transformed data; x-axis: Claim Size; y-axis: Frequency]

Figure 3

Figure 4 shows the Q-Q plots of the cube-root transformed data of motor commercial and 

motor private claim sizes. 

[Figure 4: two panels, "Q-Q Plot of Motor Commercial" and "Q-Q Plot of Motor Private", for the cube-root transformed data; x-axis: Theoretical Quantiles; y-axis: Sample Quantiles]

Figure 4

The cube-root transformed data seemed to be more suitable for purposes of fitting the

distributions compared to the original data as shown by their respective histograms and

Q-Q plots above. With the data transformed, the next step was fitting distributions

to obtain an appropriate model. Subsequently, goodness-of-fit tests were carried out and

information criteria applied on the fitted distributions as outlined in the sections below.

All these steps were carried out using the R software package "fitdistrplus".
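The cube-root transformation described above is straightforward to sketch. The study carried out its analysis in R; the Python below is only an illustration, and the function names are hypothetical. The two inputs are the minimum and maximum motor commercial claim sizes reported in Table 1.

```python
def cube_root_transform(claims):
    """Apply the cube-root transformation to positive claim sizes."""
    return [c ** (1.0 / 3.0) for c in claims]


def inverse_cube_root_transform(values):
    """Map transformed values back to the original claim scale."""
    return [v ** 3 for v in values]


# Minimum and maximum motor commercial claim sizes from Table 1.
original = [89000.0, 1470770000.0]
transformed = cube_root_transform(original)
print(transformed)
print(inverse_cube_root_transform(transformed))
```

The transformed values fall roughly between 45 and 1,140, which matches the 0 to 1,200 range on the horizontal axis of the histograms of the transformed data.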

4.3 Parameters Estimation 

The parameters for the various fitted distributions were obtained using the MLE method. 

Table 2 shows the parameter estimates, their corresponding standard errors and the 

maximum loglikelihood function (LLF). The most appropriate distribution is the one with

the highest loglikelihood function.

Table 2. Estimated Parameters for fitted distributions

Distribution   Parameter      Motor Commercial   Motor Private
Exponential    Rate           0.0021             0.0020
               std. error     0.0001             0.0001
               LLF            -1218.773          -1254.143
Gamma          Shape          5.4827             7.3252
               std. error     0.5377             0.7165
               Rate           0.0096             0.0113
               std. error     0.0010             0.0011
               LLF            -1165.534          -1192.602
Lognormal      Meanlog        6.2576             6.4079
               std. error     0.0360             0.0296
               SDlog          0.4696             0.3898
               std. error     0.0255             0.0209
               LLF            -1176.499          -1197.959
Pareto         Shape          5094930            3221289
               (std. error)   -                  -
               Scale          2925993718         2095806234
               (std. error)   -                  -
               LLF            -1249.768          -1301.128
Weibull        Shape          2.7097             3.0010
               std. error     0.1617             0.1747
               Scale          644.7977           728.9534
               std. error     19.2325            19.4038
               LLF            -1161.697          -1193.441

The parameter values in Table 2 are the ones that maximised the loglikelihood function

for each distribution. For motor commercial, the Weibull distribution has the maximum

loglikelihood function (-1161.697) followed by Gamma (-1165.534), Lognormal

(-1176.499), Exponential (-1218.773) and Pareto (-1249.768) distributions. For motor

private, the Gamma distribution has the maximum loglikelihood function (-1192.602)

followed by Weibull (-1193.441), Lognormal (-1197.959), Exponential (-1254.143) and

Pareto (-1301.128) distributions. Based on the LLF values, Weibull and Gamma

distributions were the most appropriate for motor commercial and motor private,

respectively. The plots for the various fitted distributions are provided in the appendices

section.

4.4 Goodness-of-Fit Test 

The main aim of carrying out the goodness-of-fit test was to measure the distance between

the fitted parametric distribution and the empirical distribution of the data. The

Kolmogorov-Smirnov (K-S) and Anderson-Darling (A-D) tests were used to determine

the appropriateness of the fitted distributions to the claim size data. These two tests helped

to determine the most suitable continuous distribution for the claim severity. Table 3

shows the K-S and A-D test statistic values for the distributions fitted on the claim severity

data.

Table 3. K-S and A-D test statistic values for fitted distributions

Test Statistic   Distribution   Motor Commercial   Motor Private
K-S              Exponential    0.3089418          0.3601346
                 Gamma          0.07988259         0.04042086
                 Lognormal      0.09186256         0.04602873
                 Pareto         0.3084464          0.3600815
                 Weibull        0.05444920         0.08136149
A-D              Exponential    27.6395653         33.0098233
                 Gamma          0.67351891         0.37752203
                 Lognormal      1.68105273         0.59667217
                 Pareto         27.5714722         33.0018593
                 Weibull        0.41538833         1.08746132

The K-S and A-D test statistic values were used to compare the fit of the distributions to

the data, rather than as an absolute measure of how well a particular distribution fits the data.

Smaller K-S and A-D test statistic values indicate that the distribution fits the data better. 

For motor commercial, Weibull distribution had the smallest K-S and A-D test statistic 

values (0.05444920, 0.41538833) followed by Gamma (0.07988259, 0.67351891), 

Lognormal (0.09186256, 1.68105273), Pareto (0.3084464, 27.5714722) and Exponential 

(0.3089418, 27.6395653) distributions. For motor private, Gamma distribution had the 

smallest K-S and A-D test statistic values (0.04042086, 0.37752203) followed by 

Lognormal (0.04602873, 0.59667217), Weibull (0.08136149, 1.08746132), Pareto 

(0.3600815, 33.0018593), and Exponential (0.3601346, 33.0098233) distributions. This 

implied that Weibull and Gamma distributions fitted the data of motor commercial and

motor private better, respectively. 

4.5 Information Criteria 

The Akaike and Bayesian Information Criteria were applied to determine the appropriate

distribution among the fitted distributions. Table 4 shows the AIC and BIC values for the

fitted distributions. The most appropriate distribution is one which has the minimum AIC

and BIC values. 

Table 4. AIC and BIC values for fitted distributions

Information Criterion   Distribution   Motor Commercial   Motor Private
AIC                     Exponential    2439.547           2510.286
                        Gamma          2335.067           2389.204
                        Lognormal      2356.999           2399.918
                        Pareto         2503.536           2606.255
                        Weibull        2327.393           2390.881
BIC                     Exponential    2442.682           2513.445
                        Gamma          2341.339           2395.522
                        Lognormal      2363.270           2406.236
                        Pareto         2509.808           2612.573
                        Weibull        2333.665           2397.199

For motor commercial, the Weibull distribution had the minimum AIC and BIC values

(2327.393, 2333.665) followed by Gamma (2335.067, 2341.339), Lognormal (2356.999,

2363.270), Exponential (2439.547, 2442.682) and Pareto (2503.536, 2509.808)

distributions. For motor private, the Gamma distribution had the minimum AIC and BIC

values (2389.204, 2395.522) followed by Weibull (2390.881, 2397.199), Lognormal

(2399.918, 2406.236), Exponential (2510.286, 2513.445) and Pareto (2606.255,

2612.573) distributions. This implied that Weibull and Gamma distributions were the

most appropriate for motor commercial and motor private, respectively.

4.6 Summary 

The most appropriate distribution is one which has the maximum LLF, minimum K-S and

A-D test statistic values, and minimum AIC and BIC values. Based on the data analysis

carried out, as shown above, the most appropriate distributions to model claim severity

data for motor commercial and motor private are the Weibull distribution and the Gamma

distribution, respectively.

Based on past studies that were carried out as discussed in the literature review section,

the lognormal distribution is fronted as being the most suitable to model motor insurance

claim severity. However, given the findings of this research, Weibull and Gamma

distributions are the most appropriate to model motor commercial and motor private claim

severity, respectively. The lognormal distribution is the third most suitable model for this

purpose based on the findings of the study.

CHAPTER FIVE: CONCLUSION 

5.1 Introduction

The objective of this research was to determine the appropriate statistical distribution that

fits claim severity data of motor insurance in Kenya and can be used to accurately forecast

future claims experience. To achieve this, the study used the steps to be followed when

fitting a suitable model to claims data as provided by Ignatov et al. (2001).

The first step entailed selecting a family of distributions for the claims model whereby the 

Exponential, Gamma, Lognormal, Pareto and Weibull distributions were selected given

that they are continuous positively skewed distributions. The parameters for these

distributions were estimated using the MLE method. The K-S and A-D tests were applied

as goodness-of-fit tests on the fitted distributions. The AIC and BIC information criteria

were used to determine the appropriate distribution among the fitted distributions. 

5.2 Conclusion

The study carried out an analysis of Kenya motor insurance claim severity data from

2014-2018 based on the above steps. The finding of the study was that the Weibull and Gamma

distributions are suitable for modelling motor commercial and motor private data,

respectively. This was because they had the maximum LLF, minimum K-S and A-D test

statistic values, and minimum AIC and BIC values among the fitted distributions.

Findings of past studies as highlighted in the Literature Review section [Achieng (2010)

and Omari et al. (2018)] fronted the Lognormal distribution as being the most suitable for

modelling claim severity. Gamma, Weibull, Pareto, and Exponential distributions were

also fronted as being appropriate, in that decreasing order of preference [Mazviona

and Chiduza (2013), and Packova and Brebera (2015)]. The findings of this study are

different from past studies given that the Weibull and Gamma distributions have been

selected as being the most suitable to model claim severity for motor commercial and

motor private data respectively. Lognormal, Pareto and Exponential distributions are also

appropriate in that decreasing order of preference.

The objectives of this study were achieved given that the Weibull and Gamma 

distributions have been selected as being the most appropriate to model claim severity of 

motor commercial and motor private, respectively. These distributions were selected after

the data analysis steps used to achieve the research objectives were successfully followed,

since they met the selection criteria of the study and emerged as the best

among the competing distributions. To this end, Weibull and

Gamma distributions fit the claim severity data of motor commercial and motor private

respectively and can be used to accurately forecast future claims experience, thus

achieving the objectives of the study.

5.3 Recommendations 

The appropriate distributions obtained under this study are suitable for forecasting
short-term claims experience given the period covered by the study. It is recommended
that insurers use their own claims experience to adjust the distributions accordingly for
long-term forecasts. This would allow insurers to consider their specific financial
objectives and expected changes in their investment portfolios. These proposed claim
severity distributions would also be useful to insurance regulators when testing for
solvency and assessing the required levels of reserves for various insurance companies.

5.4 Limitations of the Study 

This study focused on statistical distributions only, as opposed to considering other
approaches such as non-parametric methods, machine learning and deep learning. These
approaches are also suitable for modelling claim severity. Non-parametric methods do not
assume an underlying statistical distribution in the data. This approach may serve well
where the data fail to follow any standard statistical distribution. Technological
advancement in terms of machine and deep learning has improved efficiency. This can be
applied by insurers in the modelling process by predicting patterns of claim volume and
augmenting loss analysis using artificial intelligence.
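To make the non-parametric alternative mentioned above concrete, the following Python sketch builds an empirical CDF and an empirical quantile directly from observed claims, with no distribution assumed. It is only an illustration of the idea and was not part of this study: the claim amounts are hypothetical and the function names are illustrative.

```python
import math
from bisect import bisect_right

def make_ecdf(claims):
    """Return the empirical CDF: the share of observed claims <= x."""
    ordered = sorted(claims)
    n = len(ordered)

    def ecdf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return ecdf

def empirical_quantile(claims, p):
    """Smallest observed claim whose empirical CDF is at least p (0 < p <= 1)."""
    ordered = sorted(claims)
    index = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[index]

# Hypothetical claim amounts, for illustration only (not the study's data).
claims = [120.0, 340.0, 95.0, 410.0, 230.0, 780.0, 150.0, 60.0]

ecdf = make_ecdf(claims)
print("Share of claims at or below 230:", ecdf(230.0))
print("Share of claims above 400:", 1.0 - ecdf(400.0))
print("Empirical median claim:", empirical_quantile(claims, 0.5))
```

Such estimates rely entirely on the observed data, so they cannot extrapolate beyond the largest observed claim, which is one reason fitted distributions remain useful for tail estimates.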

This study did not determine the impact of modelling claim severity on the business of
insurance companies. Insurers need to know this impact to make suitable adjustments in
their modelling process. Modelling claim severity consumes resources, and thus insurers
need to know whether it is worthwhile to commit resources towards modelling given the
value added to the business in terms of increased or decreased profitability. If modelling
has a positive impact on the business in terms of increased profitability, insurers will
strive to commit high-quality resources towards the modelling process in a bid to
increase their profitability.

5.5 Suggestions for Further Research 

Interested parties may investigate other approaches that may be used to model claim
severity such as non-parametric methods, machine learning and deep learning.

Further studies may also be carried out to determine the impact of modelling claim
severity on the business of insurance companies.


REFERENCES 

1. Omari, C., Nyambura, S. and Mwangi, J. (2018). Modeling the Frequency and
Severity of Auto Insurance Claims Using Statistical Distributions. Journal of
Mathematical Finance, 8, 137-160. doi: 10.4236/jmf.2018.81012

2. Achieng, O. M. (2010). Actuarial modeling for insurance claim severity in motor
comprehensive policy using industrial statistical distributions. International
Congress of Actuaries, Cape Town, 7-12 March 2010.

3. Boland, P. J. (2006). Statistical Methods in General Insurance.

4. Frees, E. W., & Valdez, E. A. (2008). Hierarchical Insurance Claims
Modeling. Journal of the American Statistical Association, 103(484), 1457-1469.
doi: 10.1198/016214508000000823

5. Nduwayezu, F. (2016). Finding appropriate loss distributions to insurance data:
Case study of Kenya (2010-2014). Retrieved April 3, 2020, from
https://su-plus.strathmore.edu/handle/11071/5361

6. Distributions Used in Actuarial Science. (n.d.). Retrieved April 3, 2020, from
https://reference.wolfram.com/language/guide/DistributionsUsedInActuarialScience.html

7. Association of Kenya Insurers (2018). Insurance Industry Annual Report.
Retrieved April 23, 2020, from

https://www.akinsure.com/images/publications/ AKI%?0 Ann Repott ?01 8.pdf 

8. Deloitte (2019). Insurance Outlook Report 2019/2020 East Africa. Retrieved from
https://www2.deloitte.com/content/dam/Deloitte/ke/Documents/audit/Final%20Insurance%20Outlook%20Report%20EA%202019.pdf

9. Bahnemann, D. (2015). Chapter 2: Claim Size. In Distributions for
Actuaries (pp. 37-77). Arlington, Virginia 22203: Casualty Actuarial Society.

10. Mazviona, B. W., & Chiduza, T. (2013). The Use of Statistical
Distributions to Model Claims in Motor Insurance. International Journal of
Business, Economics and Law, 3(1), 44-57. Retrieved from
https://www.ijbel.com/wp-content/uploads/2014/01/KLB3140-THE-USE-OF-STATISTICAL-DISTRIBUTIONS-TO-MODEL-CLAIMS-IN-MOTOR-INSURANCE-Batsirai.pdf

11. Packova, V., & Brebera, D. (2015). Loss Distributions in Insurance Risk
Management. Recent Advances on Economics and Business Administration:
Proceedings of the International Conference on Economics and Statistics (ES
2015). Vienna, Austria, March 15-17, 2015, 17-22. Retrieved from
http://www.inase.org/library/2015/barcelona/ECBAS.pdf

12. Dodge, Y. (2008). The concise encyclopedia of statistics: With 247 tables. New
York, NY: Springer.

13. Akaike, H. (1974). A New Look at Statistical Model Identification. IEEE
Transactions on Automatic Control, 19, 716-723.
https://doi.org/10.1109/TAC.1974.1100705

14. Burnecki, K., Misiorek, A., & Weron, R. (2010). Loss Distributions. Statistical
Tools for Finance and Insurance, 289-317. doi: 10.1007/3-540-27395-6_13

15. Kaas, R., Goovaerts, M., Dhaene, J. and Denuit, M. (2008). Modern Actuarial
Risk Theory Using R. Springer, Berlin, Vol. 53.

16. Ignatov, Z. G., Kaishev, V. K., & Krachunov, R. S. (2001). An improved
finite-time ruin probability formula and its Mathematica implementation. Insurance:
Mathematics and Economics, 29(3), 375-386. doi: 10.1016/s0167-6687(01)00078-6

17. Myung, I. J. (2003). Tutorial on maximum likelihood estimation. Journal of
Mathematical Psychology, 47(1), 90-100. doi: 10.1016/s0022-2496(02)00028-7

18. Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of
Statistics, 6(2), 461-464. doi: 10.1214/aos/1176344136

19. Research & Statistics: Annual Reports. Retrieved from
https://www.ira.go.ke/index.php/publications/statistical-reports/annual-reports

20. Actuarial Education Company. (2014). Retrieved from
https://acted.co.uk/docs/2014/CMP%20Upgrade/CT6-PU-14.pdf?cv=1


APPENDICES 

Appendix A: Fitted Distributions for Motor Commercial 

Appendix A1: Exponential Distribution

[Figure: four diagnostic plots for the Exponential distribution fitted to motor commercial data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]

Appendix A2: Gamma Distribution

[Figure: four diagnostic plots for the Gamma distribution fitted to motor commercial data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]



Appendix A3: Lognormal Distribution

[Figure: four diagnostic plots for the Lognormal distribution fitted to motor commercial data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]

Appendix A4: Pareto Distribution

[Figure: four diagnostic plots for the Pareto distribution fitted to motor commercial data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]



Appendix A5: Weibull Distribution

[Figure: four diagnostic plots for the Weibull distribution fitted to motor commercial data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]

Appendix B: Fitted Distributions for Motor Private 

Appendix B1: Exponential Distribution

[Figure: four diagnostic plots for the Exponential distribution fitted to motor private data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]


Appendix B2: Gamma Distribution

[Figure: four diagnostic plots for the Gamma distribution fitted to motor private data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]

Appendix B3: Lognormal Distribution

[Figure: four diagnostic plots for the Lognormal distribution fitted to motor private data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]



Appendix B4: Pareto Distribution

[Figure: four diagnostic plots for the Pareto distribution fitted to motor private data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]

Appendix B5: Weibull Distribution

[Figure: four diagnostic plots for the Weibull distribution fitted to motor private data: empirical and theoretical densities (x-axis: Data); Q-Q plot (x-axis: Theoretical quantiles); empirical and theoretical CDFs (x-axis: Data); P-P plot (x-axis: Theoretical probabilities).]
