Rockford Mutual Group Project Report

Report prepared for: Mr. Shane A. Heeren, Vice President of Marketing and Sales,
Rockford Mutual Insurance Co.

By: Ashley Becker, Brooke Tucci, Alex Ochenkowski, Sidney Cimini

MKT 245. Introduction to Marketing Analytics. Illinois State University. Dr. Pui Ying
Tong. May 7, 2018
Table of Contents:
Introduction

Data Preparation

Descriptive Statistics and Visualizations

Visualizations as Data Mining Using Tableau

Customer Profitability Analysis

Nearest Neighbor Prediction

Discussion

Introduction
The purpose of this study was to identify segments of customers that are the most
profitable for Rockford Mutual Insurance Company. Through our analysis we have identified the
best and worst customers of Rockford Mutual and variables to prioritize when seeking out
potential customers for future business. With the information we have gathered, Rockford
Mutual will be able to better determine customers that are worthwhile and customers that could
potentially cause them to lose money in the long run.
We were given data based on each customer’s home insurance policy. This included
information such as insurance score level, construction type, protection class, home age, home
value, and the customer’s territory. These were the main variables that we looked at.
Through data analysis, we were able to take this information and find characteristics that make a
customer profitable or unprofitable.

Data Preparation
Before beginning our analysis, we had to transform the data in order to obtain the
information we were looking for. We were given transactional data, meaning that each row of
data is a transaction; in this case, policies are renewed every 6 months.
This gave us multiple rows for the same customer, so in order to move ahead with our data
analysis we had to use two data mining techniques to restructure the data. We first ran an SPSS
analysis to identify duplicate cases, which created a new variable indicating the last
case in each group as the primary case. This new variable, Primary Last, marks the single
primary case assigned to each customer. This ensures that we are not looking at multiple rows
of data for the same customer.
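The duplicate-case step can be illustrated outside SPSS. The following is a minimal Python sketch, not the procedure we actually ran: the row layout, the customer IDs, and the function name flag_primary_last are hypothetical, and rows are assumed to be in chronological order within each customer.

```python
# Hypothetical transaction rows: (customer_id, policy_period, earned_premium, loss_paid).
# Assumption: within each customer, rows appear in chronological order.
transactions = [
    ("C001", 1, 400.0, 0.0),
    ("C001", 2, 410.0, 150.0),
    ("C002", 1, 380.0, 0.0),
    ("C001", 3, 420.0, 0.0),
    ("C002", 2, 390.0, 900.0),
]


def flag_primary_last(rows):
    """Return one 0/1 flag per row; 1 marks the last row seen for each customer."""
    last_index = {}
    for i, row in enumerate(rows):
        last_index[row[0]] = i  # later rows overwrite earlier ones
    return [1 if last_index[row[0]] == i else 0 for i, row in enumerate(rows)]


primary_last = flag_primary_last(transactions)

# Keep only the primary case for each customer.
primary_rows = [row for row, flag in zip(transactions, primary_last) if flag == 1]
```

Filtering on the flag leaves exactly one row per customer, which is the property the rest of the analysis relies on.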
The next step in transforming the data was to run an RFM analysis. RFM analysis ranks
customers on three dimensions: recency, frequency, and monetary value. Recency is how
recently a customer has purchased, frequency is how often they make a purchase, and monetary
value is how much they spend. Customers are given a ranking of 1 to 5; the higher the ranking, the
better the customer. The RFM analysis was based on profit and gave us two new variables
that help determine customer profitability: Total Premium and Total Loss. Total Premium
measures the earned premium collected, and Total Loss measures the amount of loss paid over
time. These variables were copied into the new data set that eliminates duplicate cases. With
these new variables we were able to determine the loss premium ratio, calculated as
Total Loss / Total Premium. The loss premium ratio tells us whether a customer is profitable:
a customer with a loss premium ratio below .55 is deemed profitable. To make this easier to
work with, we coded the loss premium ratio into a dummy variable (the loss premium dummy) with
two groups. A customer with a ratio below .55 is given a 1, meaning they are profitable, and a
customer with a ratio above .55 is given a 0, identifying them as unprofitable. This data
transformation was essential for determining which groups of customers are significant.
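The ratio and the dummy coding described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than our SPSS workflow: the sample rows and the function name summarize are hypothetical, every customer is assumed to have a positive total premium, and a ratio of exactly .55 (a case the rule above does not cover) is coded 0.

```python
from collections import defaultdict

PROFIT_CUTOFF = 0.55  # cutoff stated in the report

# Hypothetical transactions: (customer_id, earned_premium, loss_paid)
transactions = [
    ("C001", 400.0, 0.0),
    ("C001", 410.0, 150.0),
    ("C002", 380.0, 0.0),
    ("C002", 390.0, 900.0),
]


def summarize(rows):
    """Aggregate premium and loss per customer, then derive the ratio and the dummy."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for customer_id, premium, loss in rows:
        totals[customer_id][0] += premium
        totals[customer_id][1] += loss

    summary = {}
    for customer_id, (total_premium, total_loss) in totals.items():
        ratio = total_loss / total_premium  # assumes total_premium > 0
        # Assumption: a ratio of exactly 0.55 is coded 0 (unprofitable).
        dummy = 1 if ratio < PROFIT_CUTOFF else 0
        summary[customer_id] = {
            "total_premium": total_premium,
            "total_loss": total_loss,
            "loss_premium_ratio": ratio,
            "loss_premium_dummy": dummy,
        }
    return summary


summary = summarize(transactions)
```

In this made-up data, C001 pays 810 in premium against 150 in losses and is coded 1, while C002 pays 770 against 900 in losses and is coded 0.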

We also created additional variables in order to better categorize some independent
variables. Home Age and Home Value were transformed into ordinal variables
in order to simplify data visualizations and interpretations. Home age was categorized in groups
of ten years, beginning with 0 representing homes aged 0-10 years, 1 representing homes aged 11-20
years, and so on. Home value is categorized in groups of 0-3; group 0 represents
homes valued under $100,000, group 1 includes homes valued $100,000-$200,000, and so on.
We also categorized age of policyholder in order to determine whether certain ages
of customers are more profitable. Age of policyholder was categorized into ten groups: group 0
being ages 0-10, group 1 ages 11-20, group 2 ages 21-30, and so on. With these new variables,
we were able to move forward with our analysis.
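The grouping can be written as two small Python functions. This is a sketch, not the recoding we ran: it follows the group numbering used in the later tables (group 0 for 0-10 years, group 9 for 91-100 years). The function names are hypothetical, and the handling of missing ages, of ages above 100, and of home values that fall exactly on a band boundary are assumptions.

```python
def decade_group(age):
    """Map an age in years to its group: 0-10 -> 0, 11-20 -> 1, ..., 91-100 -> 9.

    Assumptions: a missing age stays missing (None), and ages above 100
    simply continue the pattern (101-110 -> 10).
    """
    if age is None:
        return None
    return max(0, (int(age) - 1) // 10)


def home_value_group(value):
    """Map a home value in dollars to groups 0-3 in $100,000 bands.

    Assumptions: each band includes its lower bound, and group 3 collects
    every value of $300,000 or more.
    """
    return min(3, int(value) // 100_000)


home_age_groups = [decade_group(a) for a in (0, 10, 11, 57, 100)]
value_groups = [home_value_group(v) for v in (99_999, 100_000, 250_000, 750_000)]
```

The same decade_group function serves both home age and age of policyholder, since both use ten-year bands.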

Descriptive Statistics and Visualizations


Data Summary
We began our analysis using SAS Studio to obtain descriptive statistics of certain
variables. The variables we analyzed include home age, construction type, insurance score
level, and state. The important information to focus on in these analyses is the frequency (number of
observations) and the mean of the loss premium dummy. The mean of the loss premium dummy tells us
the profitability rate of each group; the closer it is to 1, the more profitable that group is. The
results are shown in Tables A through E.
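The two statistics we focus on, the frequency and the mean of the loss premium dummy, can be computed for any grouping variable as in the Python sketch below. It is an illustration rather than our SAS code; the records and the function name profitability_by_group are hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records: (home_age_group, loss_premium_dummy)
customers = [(0, 1), (0, 1), (0, 1), (0, 0), (9, 1), (9, 0)]


def profitability_by_group(records):
    """Return {group: (frequency, mean of the dummy)} for each group."""
    counts = defaultdict(int)
    profitable = defaultdict(int)
    for group, dummy in records:
        counts[group] += 1
        profitable[group] += dummy
    return {g: (counts[g], profitable[g] / counts[g]) for g in sorted(counts)}


rates = profitability_by_group(customers)
```

Because the dummy is 0 or 1, its mean is simply the share of profitable customers in the group. Reading the frequency next to the mean matters because a rate based on a handful of customers is much less reliable than one based on hundreds.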
A. Statistics on Home Age

B. Frequency Distribution of Home Age

According to Tables A and B, the most profitable group of customers is those who
have homes in group 0. This group represents homes aged 0-10 years. There are 970 homes in
this group. The worst group of customers is those with homes in group 9, aged 91-100
years. This group has the lowest profitability rate, 83%.
C. Statistics of Construction Type

According to Table C, the most profitable groups of customers are those with log and
“other” type homes. These have a profitability rate of 1, which means they are 100%
profitable. However, it is important to note that the two types together account for only 8
homes in the data. This is a very small number and could have
caused these groups to be outliers. With that in mind, Rockford Mutual should also include
homes with frame construction among the best customers because frame has the next highest mean
(profitability rate). The worst group of customers is those with earth sheltered homes,
which have a very low profitability rate. As with the other home types
mentioned, earth sheltered homes account for an extremely small number of homes in the
data. This could make the group an outlier, so it may not accurately represent the worst customers.
D. Statistics on Insurance Score Level

According to Table D, the most profitable group of customers is group G1. G1
has the highest profitability percentage, 89%; however, it is important to recognize that this
group is small in comparison to the other groups. This could cause it to be an outlier, so it may not
accurately represent the best customers. With that in mind, the best customers may also include
those in groups C2 and E2, as they both have the next highest profitability rate of 87%. The worst
group of customers would include those in groups A1, F1, and F2, as they have the lowest
profitability rate of 83%.
E. Analysis of State

According to Table E, Wisconsin has the most profitable customers. These customers
have a profitability rate of 90%. As with the groups above, the number of observations is small in
comparison to the number of customers in Illinois. It is surprising that Illinois accounts for the
majority of Rockford Mutual’s customer base but has a low profitability rate of 83%. Indiana, though
it has a small number of customers, also has a high profitability rate. Customers in Wisconsin
should be recognized as the best customers, and Rockford Mutual should consider expanding
its business into other states.

Visualizations as Data Mining Using Tableau


Segmentation by Territory
We looked into the top 5 and bottom 5 performing territories based on total profit. The
loss premium dummy variable was included to determine the most profitable territory among both the
best and the worst. Table F shows the top 5 territories. The territory with the highest total profit was
territory 97; however, the territory with the highest profitability rate is territory 75.
Customers in territory 75 should be considered among the best customers. Table G shows the worst 5
territories: 12, 50, 54, 65, and 99. Among these, the more
profitable ones are territories 54 and 99. However, these are still among the worst-performing
territories.
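The distinction above between the territory with the highest total profit and the territory with the highest profitability rate can be made concrete with a short Python sketch. Everything in it is hypothetical: the territory labels, the figures, and the function name rank_territories are invented for illustration and are not Rockford Mutual data, and it uses a top 3 and bottom 3 to keep the example small.

```python
# Hypothetical per-territory figures: label -> (total_profit, mean loss premium dummy)
territories = {
    "T1": (9000.0, 0.84),
    "T2": (7500.0, 0.91),
    "T3": (6000.0, 0.86),
    "T4": (1500.0, 0.85),
    "T5": (-200.0, 0.80),
    "T6": (-900.0, 0.83),
    "T7": (-3000.0, 0.78),
}


def rank_territories(stats, n):
    """Return the n territories with the highest and the lowest total profit."""
    ordered = sorted(stats, key=lambda t: stats[t][0], reverse=True)
    return ordered[:n], ordered[-n:]


top, bottom = rank_territories(territories, n=3)

# Within each list, the profitability rate can point to a different territory
# than total profit does.
best_rate_in_top = max(top, key=lambda t: territories[t][1])
best_rate_in_bottom = max(bottom, key=lambda t: territories[t][1])
```

Here T1 brings in the most total profit, but T2 has the higher share of profitable customers, which mirrors the relationship between territories 97 and 75 described above.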
F. Top 5 Territories by Total Profit
G. Bottom 5 Territories by Total Profit

Segmentation by Insurance Score Level


Table H tells us the average loss premium ratio for each insurance score level, and which
level is the most profitable per the loss premium dummy. Average loss premium ratio is
shown by size: the bigger the circle, the higher the loss ratio is for that group. The loss premium
dummy is the variable we are focusing on and is shown by color: the darker the color,
the more profitable that group is. According to this table, G1 is the most profitable group. It is
important to be wary of the number of records in each insurance score level, because G1 is a
small group, as shown in Table D. Some other more profitable groups are insurance score levels E2, B3,
and C2. Customers in these insurance score levels should be considered the “best” customers.
The worst customers are represented by the lightest colored circles, which indicate a
low profitability rate; these groups are A1, F1, and F2. Customers with these insurance score
levels should be considered among the worst and least profitable.
H. Insurance Score by Loss Premium Ratio and Loss Premium Dummy

Segmentation by Protection Class
Table I focuses on the total profit by protection class, and we can determine profitability
by the loss premium dummy variable. As the graph shows, the total profit collected is highest in group 4,
but that is not the most profitable group in terms of the loss premium dummy variable (indicated
by color). The most profitable groups are 8A and 10. According to the field descriptions of the
data that were given in class, protection class ranges from 1 (best) to 10 (worst). That being
said, the most profitable groups are groups that live farther from a fire station. Customers with
homes in zones that are farther from a fire station should be considered “best” customers. The
customers that would be categorized as the worst based on their profitability are those in
protection classes 6 and 8B.
I. Protection Class by Total Profit

Segmentation by Home Age
Graph J shows total profit separated by home age, which was coded into groups, and
indicates profitability by the loss premium dummy. This graph shows that group 0.00 has the
highest profitability percentage but does not account for a large share of total profit. Group 0.00
consists of homes aged 0 to 10 years. These are the homes that are the most profitable. The worst group of
customers consists of homes in group 9, homes aged 91-100 years. This is indicated by a
light color, meaning it has a low profitability rate. This group also brings in the smallest total
profit. Combined with the low profitability rate, this suggests that Rockford Mutual
should eliminate customers with homes that old.
J. Total Profit by Home Age

Segmentation by Construction Type


Graph K shows construction type by average home value and indicates
profitability by the loss premium dummy. As shown, log and “other” are the most profitable
types of homes. However, there are very few records of log and “other” homes.
As shown in Table C, there are a total of 8 observations between these two types
of homes, a small number in comparison to the other types of construction. Taking that into
consideration, log and “other” homes are very profitable, but when determining the most profitable
group of customers we would not rule out other construction types. Frame has the second highest
profitability rate, so it should be included in the group of best customers. The worst customers
are represented by the lighter color, indicating they have a lower profitability rate. This graph
shows these customers are those with earth sheltered homes. However, as mentioned for Table C,
these account for only a very small number of records in the data and may not accurately
represent the worst customers.
K. Construction Type by Average Home Value

Segmentation by Age of Policyholder
Graph L shows age groups of the policyholders and the amount of total profit each
group has brought in, and indicates the profitability percentage by the loss premium dummy.
We also used the sum of number of records to determine how many policyholders are within
each category, in order to judge how reliable each profitability percentage is. We have filtered out
group 0, ages 0-10, because there are no policyholders of this age. According to this graph, the most
profitable groups of policyholders are groups 1 and 2. These groups cover ages 11-20
and 21-30, respectively. The profitability percentage for both these groups is around 88%. However,
neither of these groups has many observations in the data: group 1
has only 17 records and group 2 has a little over 1,000 records. These ages should be considered
part of Rockford Mutual’s best customer group, but because these groups are so small it is
important to look at other groups. The next best group is group 9, ages 90-100. This group has
102 accounts and has a profitability percentage of 87%. We would also consider groups 3 and 5
to be good customers because they have many accounts in the group and still have a high
profitability percentage. Rockford Mutual should be cautious with this information, however,
because many of the policyholders do not have an indicated age in the data we were given. As
the graph shows, the null category is the largest category in both number of records and
total profit, but it has a low profitability percentage.
L. Age of Policyholder by Total Profit

Segmentation by State
Graph M shows us the total profit by state. As shown, Illinois accounts for a majority of
the total profit. However, the loss premium dummy variable tells us that Illinois customers are
not the most profitable. According to the graph, Wisconsin is the most profitable state, and its
customers would be considered better customers. However, Illinois has significantly more customers
than Indiana and Wisconsin do (shown in Table E). Since the samples for Wisconsin and Indiana are much
smaller, their higher profitability rates may not accurately represent the best
customer segment. Still, Rockford Mutual should keep in mind that customers in Wisconsin
and Indiana have high profitability rates and consider expanding in those regions.
M. Total Profit by State

Customer Profiling Summary


From the collected data, we were able to identify the customers that would produce the
most profit and the customers that would produce the least profit. First, we reached
the following conclusions about the best and worst customers using SAS. When it comes to
home age (Tables A and B), the most profitable group of customers is those who have
homes in group 0. This group represents homes aged 0-10 years. There are 970 homes in this
group. The worst group of customers is those with homes in group 9, aged 91-100
years. This group has the lowest profitability rate, 83%. According to Table C, the most
profitable groups of customers are those with log and “other” type homes. These have a
profitability rate of 1, which means they are 100% profitable. The worst group of customers
is those with earth sheltered homes, which have a very low profitability rate.
According to Table D, the most profitable group of customers is group G1. G1 has the
highest profitability percentage, 89%. The worst group of customers would include those in
groups A1, F1, and F2, as they have the lowest profitability rate of 83%. According to Table E,
Wisconsin has the most profitable customers. These customers have a profitability rate of 90%.
It is important to realize that although the population of customers in Wisconsin and
Indiana is small, they still bring in substantial profit.
Next, we identified the following most and least profitable customers using
Tableau. Table F shows the top 5 territories. The territory that is considered to
be the most profitable is territory 75. Customers in territory 75 should be considered among the best
customers. Table G shows the worst 5 territories: 12, 50, 54, 65, and 99.
Among these bad territories, the more profitable ones are territories 54 and 99. However, these are
still territories that are considered to be the worst. According to table H, G1 is the most profitable
group. Some other more profitable groups would be insurance score levels E2, B3, and C2.
Customers in these insurance score levels should be considered the “best” customers. The worst
customers are represented by the lightest colored circles. This indicates they have a low
profitability rate, and these groups are A1, F1, and F2. According to Table I, protection
classes 10 and 8A are the most profitable. The
customers that would be categorized as the worst based on their profitability are those in
protection classes 6 and 8B. Graph J shows that group 0.00 has the highest profitability percentage.
Group 0.00 consists of homes ages 0 to 10 years. These are homes that are the most profitable.
The worst group of customers would consist of homes in group 9, homes aged 91-100 years. As
shown in graph K, log and “other” are the most profitable types of homes. Frame has the second
highest profitability rate, so that should be included in the group of best customers. The worst
customers are those with earth sheltered homes. According to Graph L, the most profitable groups
of policyholders are groups 1 and 2. These groups cover ages 11-20
and 21-30, respectively. The least profitable group is the null group, which includes those accounts
that do not have an age disclosed. According to Graph M, Wisconsin is the most
profitable state, and its customers would be considered better customers. Since the samples for
Wisconsin and Indiana are not as large as that for Illinois, their higher profitability rates may not
accurately represent the best customer segment. Still, Rockford Mutual should keep in mind
that customers in Wisconsin and Indiana have high profitability rates and consider expanding in
those regions.

Customer Profitability Analysis


Decision Tree Analysis
Decision tree analysis uses a tree-like graph in which each node represents a
test on an attribute and its outcome. We are testing attributes and the effect they have on
whether a customer is profitable or not. For our decision tree analysis, we ran two different
tree-growing methods. The dependent variable we focused on is the loss premium dummy, which
indicates profitability, and the independent variables include territory, personal property
replacement cost, coverage A, protection class, home value, and insurance score.

The above decision tree was run using the QUEST method. Based on this
analysis, the most profitable group of customers is those in node 1. Node 1 has a
profitability percentage of 88.5%. Its territories include 53, 84, 64, 83, and 120, among the many
others listed for that node. The least profitable group in this analysis is represented in node 6.
This node shows a profitability rate of 69.6%, which is very low. These customers can be
categorized as those in territories 97, 49, 61, 115, and so on as listed in node 2; have a personal
property replacement cost; and have a coverage A greater than $412,792.5.

The above decision tree was run using the CRT method. Based on this analysis,
we can see that the most profitable groups of customers are in node 2 and node 5. Both of these
nodes have a profitability rate of 87%. Node 2 represents customers in the following protection
classes: 7, 2, 8A, 10, and 1. Node 5 represents customers who are in a protection class of 6, 4, 3, 5,
8, 9, or 8B; have a home valued at $200,001-$300,000, $500,001-$600,000, or $900,001-$1,000,000; and
have an insurance score greater than 904.5. The worst customers are represented in node 6, as they
have the smallest profitability percentage, 80.7%. These customers have the same
characteristics as node 5, except that they have an insurance score less than or equal to 904.5.
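The CRT nodes described above can be read as a set of if/then rules. The Python sketch below encodes only the three nodes this report describes and returns None for any branch it does not describe. The function name crt_node and the string coding of protection classes are hypothetical. Sending a score of exactly 904.5 to node 6 is also an assumption.

```python
NODE_2_CLASSES = {"7", "2", "8A", "10", "1"}
NODE_5_6_CLASSES = {"6", "4", "3", "5", "8", "9", "8B"}
NODE_5_6_VALUE_BANDS = [(200_001, 300_000), (500_001, 600_000), (900_001, 1_000_000)]
SCORE_SPLIT = 904.5


def crt_node(protection_class, home_value, insurance_score):
    """Return (node, profitability rate) for the nodes described in the report.

    Branches the report does not describe return None.
    """
    if protection_class in NODE_2_CLASSES:
        return ("node 2", 0.87)
    in_band = any(low <= home_value <= high for low, high in NODE_5_6_VALUE_BANDS)
    if protection_class in NODE_5_6_CLASSES and in_band:
        # Assumption: a score of exactly 904.5 falls into node 6.
        if insurance_score > SCORE_SPLIT:
            return ("node 5", 0.87)
        return ("node 6", 0.807)
    return None


example_a = crt_node("8A", 150_000, 700)
example_b = crt_node("6", 250_000, 950)
example_c = crt_node("6", 250_000, 800)
example_d = crt_node("6", 150_000, 800)
```

Written this way, it is easy to see that nodes 5 and 6 differ only in the insurance score test, which is what separates an 87% group from an 80.7% group.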
Logistic Regression
Logistic regression is used when the outcome variable has only two possible values. In our case, the
outcomes are profitable or unprofitable. Running a logistic regression will determine which
variables are significant in determining the outcome.

This model summary suggests that only .01% of the variation in the dependent variable is
explained by the independent variables. This means that the variables insurance score,
protection class, home age, home value, and home/auto credit explain only .01% of whether a
customer is profitable or not.

This classification table helps explain why R squared is so small. The left side of the table shows
cases that are actually profitable and unprofitable, and the right-hand side shows the prediction that
our model made. In our study, 0 represents cases that are unprofitable and 1 represents cases that
are profitable. The 0 column contains the cases that the model
predicted as unprofitable, split by how they were actually observed in the data. The model predicted
no cases as unprofitable, so this column is empty. In the column for cases
predicted as profitable, 2,083 actually turned out to be unprofitable and 12,283 were actually
profitable. The problem is that the model predicted no
unprofitable cases when there are actually 2,083 customers deemed unprofitable.
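The counts in the classification table are enough to work out how well the model predicts. The short Python sketch below uses only the four counts reported above; the variable names are our own.

```python
# Counts from the classification table. Keys are (observed, predicted);
# 0 = unprofitable, 1 = profitable.
table = {(0, 0): 0, (0, 1): 2083, (1, 0): 0, (1, 1): 12283}

total = sum(table.values())
observed_unprofitable = table[(0, 0)] + table[(0, 1)]
observed_profitable = table[(1, 0)] + table[(1, 1)]

# Share of all cases the model classified correctly.
accuracy = (table[(0, 0)] + table[(1, 1)]) / total

# Share of each observed class that the model identified correctly.
unprofitable_hit_rate = table[(0, 0)] / observed_unprofitable
profitable_hit_rate = table[(1, 1)] / observed_profitable

# Accuracy of simply predicting "profitable" for every customer.
baseline_accuracy = max(observed_unprofitable, observed_profitable) / total

print(f"accuracy: {accuracy:.3f}, baseline: {baseline_accuracy:.3f}")
```

The overall accuracy is about 85.5%, but it is identical to the baseline of labeling every customer profitable, and none of the 2,083 unprofitable customers are identified. In other words, the model does not separate the two groups at all, which is consistent with the very small R squared.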

This table shows us which variables are significant and meaningful to us, and how they
affect the dependent variable. To determine this, we must look at the significance value. In order
for a variable to be significant, it must have a P (Sig.) value less than or equal to .05. As
the table shows, only one of the variables has a significance level that low: Home Age has a P
value of .043, and the rest of the variables have relatively high P values. We can also look at the
exponentiated coefficient, Exp(B), of each variable to determine the effect it has on the dependent
variable. Because the dependent variable is coded 1 for profitable, a variable with an Exp(B) greater
than 1 increases the odds of a customer being profitable, and a value under 1 decreases those odds.
Looking at our significant variable, Home Age, this table tells us that, all else equal, each one-unit
increase in Home Age multiplies the odds of the customer being profitable by .996, a decrease of
about 0.4%. In other words, the older the home is, the
greater the chance of it being unprofitable.
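To show what an Exp(B) value slightly under 1 means in practice, the Python sketch below converts it into a change in odds and then into a change in probability. It assumes that .996 is the Exp(B) value for Home Age and that the model predicts the outcome coded 1 (profitable). The starting probability of 0.85 and the function name shift_probability are hypothetical, chosen only for illustration.

```python
import math

EXP_B_HOME_AGE = 0.996  # value quoted in the report, assumed here to be Exp(B)


def shift_probability(p, odds_ratio, units=1):
    """Apply an odds ratio `units` times to a starting probability p."""
    odds = p / (1 - p) * odds_ratio ** units
    return odds / (1 + odds)


# The coefficient B implied by Exp(B); negative because Exp(B) is below 1.
b = math.log(EXP_B_HOME_AGE)

# Percent change in the odds of being profitable per one-unit increase in Home Age.
pct_change_in_odds = (EXP_B_HOME_AGE - 1) * 100

# Hypothetical customer with an 85% chance of being profitable,
# re-evaluated after a 10-unit increase in Home Age.
p_after_10 = shift_probability(0.85, EXP_B_HOME_AGE, units=10)
```

A single unit changes the odds by only about 0.4%, so the effect of home age, although statistically significant, is small; even a 10-unit increase moves the hypothetical 85% customer by less than one percentage point.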

Nearest Neighbor Prediction


Nearest neighbor analysis is used when we want to predict a target value. A new
target case is created, and an analysis is run to find other observations in the data
that are most similar to it. The idea is to find clients similar to an organization’s
best customers. The analysis plots customers on a data pane, and the closest data points to the
target are its “nearest neighbors”. We ran the analysis focusing on a few variables:
state, territory, insurance score, protection class, dwelling age, and coverage A. In order to run
the analysis, we had to create two fake target customers.
Customer 1
Customer 1 had the following characteristics: IL, territory 70, insurance score of 800,
protection class 08, coverage A of 150,000, dwelling age of 51. Based on these characteristics,
the analysis found customers in the data that are similar to our target.

Based on what is shown in these graphs, the analysis has pulled 3 customers similar to the
target. These customers are policyholders #7018, #10827, and #12408. The variables protection
class, construction type, state, and territory are all very similar to our target, while dwelling age
shows the most variety. The dwelling age for our fake customer is relatively higher than that of its
nearest neighbors. Because each neighbor has a loss premium dummy of 1, we predict that our made-up
customer would be profitable.
Customer 2
Customer 2 had the following characteristics: IL, territory 50, insurance score of 400,
protection class 06, frame construction type, coverage A of 90,000, and dwelling age of 21.
Based on these characteristics, the analysis found existing customers that are similar to our
target.

Based on the graph, the analysis has come up with 3 customers most similar to our target.
These customers include policyholder #414, #588, and #4637. These customers have the same
protection class and construction type, but differ slightly in territory, dwelling age, and coverage
A. Because these customers all have a loss premium dummy of 1, we can predict that our fake
customer will be profitable as well.
Overall, Rockford Mutual can use nearest neighbor analysis when approached by a new
client. By putting a potential customer’s information into a system like this and running the
analysis, they will be able to determine if that customer is likely to be profitable or unprofitable.
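The idea behind the analysis can be sketched in Python. This is a simplified illustration, not the SPSS procedure we ran: it uses only three numeric variables (insurance score, coverage A, and dwelling age), rescales each to a 0-1 range, measures Euclidean distance, and takes a majority vote among the 3 nearest neighbors. The existing customers and the function names are hypothetical; the target uses the values of our fake Customer 1.

```python
import math

# Hypothetical existing customers:
# (id, insurance_score, coverage_a, dwelling_age, loss_premium_dummy)
customers = [
    ("P1", 790, 140_000, 40, 1),
    ("P2", 810, 155_000, 35, 1),
    ("P3", 760, 160_000, 45, 1),
    ("P4", 420, 95_000, 20, 0),
    ("P5", 450, 80_000, 25, 1),
    ("P6", 600, 300_000, 90, 0),
]


def min_max_scaler(rows):
    """Build a function that rescales each feature to 0-1 using the ranges in `rows`."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1 for c in columns]  # avoid dividing by zero
    return lambda row: [(v - low) / span for v, low, span in zip(row, lows, spans)]


def nearest_neighbors(target, records, k=3):
    """Return the k records closest to the target in scaled feature space."""
    scale = min_max_scaler([r[1:4] for r in records])
    scaled_target = scale(target)
    ranked = sorted(records, key=lambda r: math.dist(scale(r[1:4]), scaled_target))
    return ranked[:k]


# Fake Customer 1: insurance score 800, coverage A 150,000, dwelling age 51.
target = (800, 150_000, 51)
neighbors = nearest_neighbors(target, customers)

# Majority vote on the neighbors' loss premium dummy.
predicted_dummy = 1 if sum(r[4] for r in neighbors) * 2 > len(neighbors) else 0
```

Rescaling matters because coverage A is measured in the hundreds of thousands while dwelling age is measured in tens; without it, coverage A would dominate the distance. Categorical variables such as state, territory, and protection class would need their own encoding, which this sketch leaves out.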

Discussion
The purpose of our study is to identify a target market for Rockford Mutual Insurance.
Through data analysis we have identified characteristics of the best and worst customers. The
main variables that we looked at were construction type, insurance score level, state, home age,
protection class, age of policyholder, and territory. Through further analysis we were
able to identify certain segments of customers who are most profitable. We have identified
the most profitable customers to be those in insurance score level G1; those with homes of
construction type log, frame, or other; who live farther from a fire station in protection classes 8A
and 10; who live in Wisconsin and Indiana; who have a home that was recently built, with a home age
from 0 to 10 years; customers ages 11-30 and 90-100; and those who live in territories 56, 61,
66, 75, and 97. We have also identified some characteristics that are not beneficial to
Rockford Mutual. These characteristics include customers in territories 12, 50, 54, 65, and 99;
with earth sheltered homes; and located in protection classes 6 and 8B. We have
identified these characteristics as those of the worst customers because they have the lowest
profitability percentages.
With these findings, we have come up with some suggestions for Rockford Mutual as
it does business in the future. Rockford Mutual should only take on customers with newly built
homes. As our analysis has shown, as homes increase in age they decrease in profitability.
Newly built homes are where Rockford Mutual will make the most money. Also,
Rockford Mutual should consider expanding more into Wisconsin and Indiana. We have
discovered that customers in these states have higher profitability rates than customers in
Illinois. Expanding into these areas could only be a positive addition to the customer base.
We feel these findings were the most prominent across our various analyses and therefore are
significant enough to justify adjusting the business strategy.
Though we have come up with some meaningful insight into how Rockford Mutual
should predict the profitability of future customers, there are some limitations to our analysis. We
are not professionals. This class is the first time we have done real data analysis like this; we are
learning as we go along. It is reasonable to suspect that we have made some errors along the
way. We are also not experts at interpreting this data, as we have limited knowledge of data
analytics. We also lack knowledge of the insurance industry. Many of the variables we are
working with are not terms we are familiar with, so it is hard for us to be confident in making
business recommendations with little to no knowledge of the insurance industry as a whole.
Recognizing these limitations, we believe we have gathered information to the best of our
abilities. It is our hope that the analyses we have provided will help Rockford Mutual in its
future business.
