# Research Objective:

## Is there a difference between the profitability of on-line and off-line customers?

The summary of the data provided is shown below. The average profitability is 111.5, with a 95%
confidence interval (z = 1.96) of 108.5 to 114.5. The standard deviation is 272.84, which indicates
high variation in profitability across the customers in the given sample.
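
The interval quoted above follows the usual mean ± z × (standard deviation / √n) form. The snippet below is a minimal sketch of that calculation, assuming a plain list of profit values; the function name and the sample figures are hypothetical and are not the case data.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (mean, lower, upper) for a z-based confidence interval."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)        # sample standard deviation (n - 1)
    half_width = z * sd / math.sqrt(n)   # z times the standard error of the mean
    return mean, mean - half_width, mean + half_width


# Hypothetical profit figures, for illustration only (not the case data).
sample = [120.0, -35.0, 410.0, 15.0, 88.0, 230.0, -60.0, 175.0]
mean, lower, upper = mean_confidence_interval(sample)
print(f"mean={mean:.1f}, 95% CI=({lower:.1f}, {upper:.1f})")
```

With a sample as small as this one a t-based interval would be more appropriate; the z-based form is shown because it is the one used in the text.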

T-test :

## The hypothesis for the T-Test :

H0: μ_online = μ_offline (Null hypothesis: the profitability of on-line and off-line customers is the same)

The output of the T-test is shown below. The t value is -1.212 and the significance (p-value) is 0.225.
As |t| < 2 and sig > 0.05, we fail to reject the null hypothesis. Hence, the sample provides no
statistically significant evidence of a difference between online and offline profitability across
the entire customer population.
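
The decision rule above can be sketched in a few lines. This is a minimal illustration, assuming a Welch (unequal-variance) two-sample t-test and a normal approximation for the p-value, which is only reasonable for large samples. The report does not state which variant of the test was run, and the function names and data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def two_sided_p(t):
    """Two-sided p-value for a t statistic, using a normal approximation."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def welch_t_test(group_a, group_b):
    """Return (t, approximate p) for a two-sample Welch t-test."""
    mean_a, mean_b = statistics.fmean(group_a), statistics.fmean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    t = (mean_a - mean_b) / se
    return t, two_sided_p(t)


# Hypothetical profit figures, for illustration only (not the case data).
online = [150.0, -20.0, 300.0, 45.0, 90.0, 210.0]
offline = [110.0, 10.0, 260.0, -40.0, 75.0, 130.0]
t, p = welch_t_test(online, offline)
print(f"t={t:.3f}, approx p={p:.3f}, reject H0 at 5%: {p < 0.05}")

# Plugging in the reported t value gives a p-value close to the reported sig.
print(round(two_sided_p(-1.212), 4))
```

Under the normal approximation, the reported t of -1.212 maps to a p-value of about 0.2255, in line with the reported significance of 0.225.
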
ANOVA:

## The hypothesis for the ANOVA :

H0: μ_online = μ_offline (Null hypothesis: the profitability of on-line and off-line customers is the same)

The same can also be verified from the one-way ANOVA test shown below. From the table, the
significance value is greater than 0.05, and thus we fail to reject the above null hypothesis.
The Mean Plot below is consistent with this result.
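
One reason the ANOVA agrees with the t-test is that, with only two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic. The sketch below computes both on hypothetical data to show the identity; the function names are illustrative, and no p-value is computed because the F distribution is not in the Python standard library.

```python
import math
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))


# Hypothetical profit figures, for illustration only (not the case data).
online = [150.0, -20.0, 300.0, 45.0, 90.0]
offline = [110.0, 10.0, 260.0, -40.0, 75.0, 130.0]
f_stat, df_b, df_w = one_way_anova_f([online, offline])
t_stat = pooled_t(online, offline)
print(f"F={f_stat:.4f}, t^2={t_stat ** 2:.4f}, df=({df_b}, {df_w})")
```

The same `one_way_anova_f` function accepts any number of groups, so it also covers comparisons across more than two buckets.
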
Regression:

To find the effect of the various customer demographic parameters on customer profitability for
both online and offline customers, a regression model is derived with 9Profit as the dependent
variable and the rest as independent variables.

As some of the data for Age and Income are missing, the missing values are filled with the mode of
Age (3) and the mode of Income (6).
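
Mode imputation of this kind can be sketched as follows. The helper name is hypothetical, missing values are assumed to be represented as `None`, and the bucket codes are made up for illustration.

```python
from statistics import mode


def fill_missing_with_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    most_common = mode(observed)
    return [most_common if v is None else v for v in values]


# Hypothetical age-bucket codes with two missing entries (not the case data).
ages = [3, 5, None, 3, 2, None, 3, 6]
filled = fill_missing_with_mode(ages)
print(filled)
```

Filling with the mode keeps every record in the model, at the cost of pulling the imputed customers toward the most common bucket.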

From Exhibit 4, the Age bucket, Income bucket and Geographic region are provided as categorical
variables. These variables are converted into dummy variables as follows:

Age (7 values) -> 6 dummy variables

Income (9 values) -> 8 dummy variables

District (3 values) -> 2 dummy variables
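
A categorical variable with k levels needs only k - 1 dummy variables, because the remaining level acts as the reference category. The sketch below illustrates this for District. Which level serves as the reference, and which district maps to D1 or D2, is not stated in the report, so the choice of the last level as reference is an assumption, as is the function name.

```python
def dummy_encode(value, levels, prefix):
    """Encode one categorical value as k - 1 dummy variables.

    Assumption: the last entry of `levels` is the reference category,
    so it is represented by all dummies being 0.
    """
    return {
        f"D{i}_{prefix}": int(value == level)
        for i, level in enumerate(levels[:-1], start=1)
    }


districts = [1100, 1200, 1300]
print(dummy_encode(1100, districts, "DIST"))
print(dummy_encode(1200, districts, "DIST"))
print(dummy_encode(1300, districts, "DIST"))  # reference level: all zeros
```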

The stepwise method is used to derive the regression model. The model output is shown below.

## The regression equation is :

9Profit = B0 + B1·9Online + B2·9Tenure + B3·D1_Age + B4·D2_Age + B5·D3_Age + B6·D4_Age +
B7·D5_Age + B8·D6_Age + B9·D1_Inc + B10·D2_Inc + B11·D3_Inc + B12·D4_Inc + B13·D5_Inc +
B14·D6_Inc + B15·D7_Inc + B16·D8_Inc + B17·D1_DIST + B18·D2_DIST + e
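
To make the form of the equation concrete, the sketch below fits an ordinary least squares model with an intercept, a binary online flag, a tenure value and one dummy variable by solving the normal equations. It is a plain OLS fit, not the stepwise procedure used in the report, and the data are generated from known coefficients purely for illustration; all names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [B0, B1, ...] from the normal equations (X'X) B = X'y."""
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical rows of (online, tenure, D1 dummy); the response is built
# without noise from profit = 10 + 5*online + 2*tenure - 3*D1.
rows = [(0, 1, 0), (1, 2, 0), (0, 3, 1), (1, 4, 1), (0, 5, 0), (1, 1, 1), (1, 6, 0)]
profit = [12.0, 19.0, 13.0, 20.0, 20.0, 14.0, 27.0]
coefficients = fit_ols(rows, profit)
print([round(b, 6) for b in coefficients])
```

Because the illustrative response contains no noise, the fit recovers the generating coefficients; on real data the residual term e absorbs whatever the predictors do not explain.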

ANOVA -> H0 : B1 = B2 = B3 = B4 = B5 = B6 = B7 = B8 = B9 = B10 = B11 = B12 = B13 = B14 =
B15 = B16 = B17 = B18 = 0

T-TEST -> H0 : B0=0, B1=0, B2=0, B3=0, B4=0, B5=0, B6=0, B7=0, B8=0, B9=0, B10=0, B11=0, B12=0,
B13=0, B14=0, B15=0, B16=0, B17=0, B18=0

It can be derived from the above result that age, district, online status, tenure and income are
statistically significant in determining profitability.

From the Model Summary, the R-squared value is 0.067 (adjusted R-squared is 0.066). This means that
only 6.7% of the variation in customer profitability is explained by the variation in the
demographic variables.
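
The two figures are linked by the standard adjustment for the number of predictors. The sketch below shows both calculations; the sample size passed to the adjustment is a hypothetical placeholder, since the report does not state n, and the function names are illustrative.

```python
def r_squared(actual, predicted):
    """Share of the variation in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


def adjusted_r_squared(r2, n, p):
    """Penalise R-squared for using p predictors on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# n = 30000 is a hypothetical sample size, used only to show the calculation;
# p = 18 matches the number of slope coefficients in the full equation.
adjusted = adjusted_r_squared(0.067, n=30000, p=18)
print(round(adjusted, 3))
```

With a sample in the tens of thousands the penalty is tiny, which is why the adjusted value sits so close to the unadjusted one.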

From the ANOVA table above, the significance value is less than 0.05, so the null hypothesis is
rejected. This means that at least one coefficient is not zero.

From the coefficient table, it can be derived that all the variables except D1_DIST are significant
and hence become part of the model.

SECONDARY OBJECTIVES:

## 1. Is profitability different across age buckets?

In order to analyse the above objective, ANOVA is done with the below hypothesis.

H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7 (profitability is the same for all customer age buckets)

As the significance value is less than 0.05, the null hypothesis is rejected. Hence, we can
conclude that there are significant differences among customer age buckets as far as customer
profitability is concerned.

## 2. Is profitability different across income buckets?

In order to analyse the above objective, ANOVA is done with the below hypothesis.

H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6 = μ7 = μ8 = μ9 (profitability is the same for all customer income buckets)

As the significance value is less than 0.05, the null hypothesis is rejected. Hence, we can
conclude that there are significant differences among customer income buckets as far as
customer profitability is concerned.

When the buckets are contrasted using t-tests, profitability is found not to differ significantly
among income brackets 1 to 6, but it does differ from that of income brackets 7 to 9.

## 3. Is profitability different across geographic regions?

In order to analyse the above objective, ANOVA is done with the below hypothesis.

H0: μ1 = μ2 = μ3 (profitability is the same for all customer geographic regions)

As the significance value is less than 0.05, the null hypothesis is rejected. Hence, we can
conclude that there are significant differences among customer geographic regions as far as
customer profitability is concerned.

When the regions are contrasted using t-tests, profitability is found not to differ significantly
between regions 1100 and 1300. However, profitability differs between regions 1100 and 1200,
and between regions 1300 and 1200.
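
The pairwise contrasts described above can be sketched as a loop over every pair of groups. This is a minimal illustration using a Welch t statistic and the rough |t| > 2 rule of thumb applied earlier in the report. The report does not say how its contrasts were specified, and the profit figures below are hypothetical values chosen to mimic the reported pattern.

```python
import math
import statistics
from itertools import combinations


def welch_t(a, b):
    """Two-sample t statistic without assuming equal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(a) - statistics.fmean(b)) / se


def pairwise_t(groups):
    """Return {(label_a, label_b): t} for every pair of groups."""
    return {
        (la, lb): welch_t(groups[la], groups[lb])
        for la, lb in combinations(groups, 2)
    }


# Hypothetical profit figures per region, for illustration only.
regions = {
    1100: [100.0, 120.0, 90.0, 110.0, 105.0],
    1200: [150.0, 170.0, 160.0, 155.0, 165.0],
    1300: [102.0, 118.0, 95.0, 108.0, 101.0],
}
results = pairwise_t(regions)
for pair, t in results.items():
    # |t| > 2 is only a rough large-sample guide; small samples need a
    # t-distribution critical value instead.
    print(pair, round(t, 2), "differs" if abs(t) > 2 else "no clear difference")
```

When many pairs are tested at once, the chance of a false positive grows, so a correction for multiple comparisons is worth considering before acting on any single contrast.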