The summary of the data provided is shown below. The average profitability is 111.5; at a 95%
confidence level (z = 1.96) the interval for the mean is 108.5 to 114.5. The standard deviation is
272.84, which is large relative to the mean and signifies high variation in profitability across
customers in the given sample.
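The interval above follows the usual z-based formula, mean ± z × (standard deviation / √n). A minimal sketch is given here. The sample size is not stated in the text, so the value of n used in the example is an assumption, chosen only because it reproduces the reported interval width.

```python
from math import sqrt

def mean_confidence_interval(mean, sd, n, z=1.96):
    """Return (lower, upper) bounds of a z-based confidence interval for a mean."""
    half_width = z * sd / sqrt(n)  # z times the standard error of the mean
    return mean - half_width, mean + half_width

# ASSUMPTION: n is not given in the report; 31,775 is a hypothetical value
# backed out from the reported interval (108.5 to 114.5).
lower, upper = mean_confidence_interval(mean=111.5, sd=272.84, n=31_775)
print(round(lower, 1), round(upper, 1))
```

With a sample of this size the interval is narrow even though the standard deviation is large, because the standard error shrinks with √n.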
T-test:
H0: online = offline (Null hypothesis: the mean profitability of online and offline customers is the same)
The output of the T-test is shown below. From the data, the t value is -1.212 and sig is 0.225. As |t| < 2 and
sig > 0.05, we fail to reject the null hypothesis. Hence, the data show no statistically significant
difference between online and offline profitability across the entire customer population.
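The decision rule above can be sketched in a few lines. The function names are illustrative, the toy data are hypothetical, and the p-value uses a normal approximation to the t distribution, which is reasonable only when the sample is large (assumed here). The equal-variance (pooled) form of the test is also an assumption, since the report does not say which variant was run.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances (pooled)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

def two_sided_p_large_sample(t):
    """Two-sided p-value using the normal approximation (large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Reported statistic from the text; the p-value lands close to the reported sig.
p_reported = two_sided_p_large_sample(-1.212)
reject_h0 = p_reported < 0.05

# Hypothetical toy groups, only to show how the statistic is computed.
t_toy = pooled_t([1, 2, 3], [2, 3, 4])
print(round(p_reported, 3), reject_h0, round(t_toy, 4))
```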
ANOVA:
H0: online = offline (Null hypothesis: the mean profitability of online and offline customers is the same)
The same can also be verified from the one-way ANOVA test shown below. From the table, we can
see that the significance value is greater than 0.05, and thus we again fail to reject the null hypothesis.
The mean plot below is consistent with this result.
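With only two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic, which is why both tests lead to the same conclusion. The sketch below computes F from raw groups. The toy data are hypothetical, and the "implied" F assumes the t-test was the equal-variance version.

```python
from statistics import mean

def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical toy groups; their pooled t statistic is about -1.2247, and F = t**2.
f_toy = one_way_f([1, 2, 3], [2, 3, 4])

# F implied by the reported t value, under the equal-variance assumption.
f_implied = (-1.212) ** 2
print(round(f_toy, 4), round(f_implied, 3))
```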
Regression:
To find the effect of the various customer demographic parameters on customer profitability
for both online and offline customers, a regression model is derived with 9Profit as the
dependent variable and the rest as independent variables.
As some of the Age and Income values are missing, they are filled with the mode of Age (3) and
the mode of Income (6).
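Mode imputation of this kind can be sketched as follows. The helper name and the sample column are hypothetical; missing values are represented as None.

```python
from statistics import mode

def fill_with_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]

# Hypothetical Age-bucket column with two missing entries.
age_bucket = [3, None, 5, 3, 2, None, 3]
filled_age = fill_with_mode(age_bucket)
print(filled_age)
```

Filling with the mode keeps the variable categorical, but it also inflates the share of the most common bucket, which is worth keeping in mind when reading the dummy coefficients for that bucket.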
From Exhibit 4, the Age bucket, Income bucket and Geographic region are provided as categorical
variables. They are converted into dummy variables: D1_Age to D6_Age for Age, D1_Inc to D8_Inc for
Income, and D1_DIST and D2_DIST for Geographic region.
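The exact category-to-dummy mapping used in the analysis is not reproduced in the text, so the following is only a sketch of one common scheme: k categories become k - 1 indicator variables, with one category left out as the reference. Treating the last bucket as the reference, and coding the Age buckets as 1 to 7, are assumptions made for illustration.

```python
def dummy_code(value, levels, prefix):
    """Map one categorical value to k-1 indicators; the LAST level is the reference."""
    return {
        f"D{i}_{prefix}": int(value == level)
        for i, level in enumerate(levels[:-1], start=1)
    }

# ASSUMPTION: seven Age buckets coded 1..7, bucket 7 used as the reference.
age_levels = [1, 2, 3, 4, 5, 6, 7]
print(dummy_code(3, age_levels, "Age"))   # only D3_Age is set
print(dummy_code(7, age_levels, "Age"))   # reference bucket: all zeros
```

Leaving one category out avoids perfect collinearity with the intercept; each dummy coefficient is then read as a difference from the reference bucket.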
The stepwise method is used to derive the regression model. The model output is shown below.
9Profit = B0 + B1*9Online + B2*9Tenure
        + B3*D1_Age + B4*D2_Age + B5*D3_Age + B6*D4_Age + B7*D5_Age + B8*D6_Age
        + B9*D1_Inc + B10*D2_Inc + B11*D3_Inc + B12*D4_Inc + B13*D5_Inc + B14*D6_Inc + B15*D7_Inc + B16*D8_Inc
        + B17*D1_DIST + B18*D2_DIST + e
T-TEST -> H0 : Bi = 0 for each i = 0, 1, ..., 18 (each coefficient is tested separately against zero)
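For readers who want to see how coefficients of this form are estimated, here is a minimal ordinary-least-squares sketch that solves the normal equations directly. It is not the stepwise procedure used in the analysis, and the two-predictor toy data (an online flag and a tenure value) are hypothetical.

```python
def ols(X, y):
    """Least-squares coefficients [b0, b1, ...] for predictor rows X and targets y."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented normal equations: (X'X | X'y)
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda i: abs(A[i][c]))
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for i in range(p):
            if i != c:
                factor = A[i][c]
                A[i] = [vi - factor * vc for vi, vc in zip(A[i], A[c])]
    return [A[i][p] for i in range(p)]

# Hypothetical data generated exactly from profit = 1 + 2*online + 3*tenure.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (1, 2)]
y = [1 + 2 * online + 3 * tenure for online, tenure in X]
coefficients = ols(X, y)
print([round(b, 6) for b in coefficients])
```

A statistics package would additionally report a standard error and t value for each coefficient, which is what the individual tests of Bi = 0 rely on.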
It can be derived from the model output that the customer demographics of age, district, online
usage, tenure and income play a statistically significant role in determining profitability.
In the Model Summary, the R square value is 0.067 (adjusted R square is 0.066). This means that only 6.7% of
the variation in customer profitability is explained by the variation in the demographic variables.
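Adjusted R square penalises R square for the number of predictors: 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below applies that formula. The sample size n and predictor count k are not stated in the text, so the values in the second call are assumptions for illustration only.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R square for n observations and k predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Small worked case: R2 = 0.5, n = 11, k = 2 gives 1 - 0.5 * 10 / 8 = 0.375.
small_case = adjusted_r2(0.5, n=11, k=2)

# ASSUMPTION: hypothetical n and k; with a very large n the penalty is tiny,
# so adjusted R square stays close to the reported R square of 0.067.
large_case = adjusted_r2(0.067, n=31_775, k=17)
print(small_case, round(large_case, 4))
```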
From the ANOVA table above, the significance value is less than 0.05, so the null hypothesis that all
coefficients are zero is rejected. This means that at least one coefficient is not zero.
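The overall F statistic in a regression ANOVA table can be written in terms of R square as F = (R²/k) / ((1 - R²)/(n - k - 1)). A short sketch follows. As before, n and k are not given in the text, so the values in the second call are hypothetical.

```python
def overall_f(r2, n, k):
    """Overall regression F statistic from R square, n observations, k predictors."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

# Small worked case: R2 = 0.5, n = 11, k = 2 gives 0.25 / 0.0625 = 4.0.
f_small = overall_f(0.5, n=11, k=2)

# ASSUMPTION: hypothetical n and k. Even a small R square yields a large F
# when n is large, which is consistent with a significance value below 0.05.
f_large = overall_f(0.067, n=31_775, k=17)
print(f_small, round(f_large, 1))
```

This is why a model can be statistically significant overall while still explaining only a small share of the variation in profitability.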
From the coefficient table, it can be derived that all the variables except D1_DIST are significant and
hence become part of the model.
SECONDARY OBJECTIVES: