
Assignment 1

A report submitted to Prof. Ashish Galande

In partial fulfilment of the requirements of the course


Marketing Research

By

Subhajit Roy (1911264)

Section D

On

24-1-2020
1. Based on the sample customer data of 1999, what can Green conclude about
average customer profitability for Pilgrim Bank's entire customer population?

Answer:

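The sample of 31,634 customers has a mean profit of 111.50, with a standard deviation
of 272.84 and a range from -221 to 2,071. The 95% confidence interval for the mean is
111.50 ± 3.01, so Green can conclude that average profitability for the entire
customer population lies between roughly 108.5 and 114.5. Profitability nevertheless
varies widely across customers: the median is only 9, the histogram shows a large
share of customers below zero, and 46.8 per cent of customers have negative profit.
The correlation matrix shows that profit is more strongly correlated with tenure
(0.19), income (0.15) and age (0.15) than with online use or district (both below
0.01), although all of these correlations are weak.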

…………………………………………………………………………………………

9Profit

Mean                       111.502687
Standard Error             1.534016451
Median                     9
Mode                       -2
Standard Deviation         272.8393915
Sample Variance            74441.33355
Range                      2292
Minimum                    -221
Maximum                    2071
Sum                        3527276
Count                      31634
Confidence Level(95.0%)    3.006732042
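
The standard error and the 95% confidence margin in the table can be reproduced from the count and the standard deviation alone. The following is a minimal Python sketch, not the method used in the report (which relied on Excel). The function name is hypothetical, and the sketch assumes the normal critical value 1.96 in place of the t value Excel applies, which makes no practical difference at this sample size.

```python
import math

def mean_confidence_interval(mean, std_dev, n, z=1.96):
    """Return (standard_error, margin, lower, upper) for a sample mean.

    Assumption: the normal approximation (z = 1.96 for 95%) is used,
    which is reasonable only for large samples such as this one.
    """
    standard_error = std_dev / math.sqrt(n)
    margin = z * standard_error
    return standard_error, margin, mean - margin, mean + margin

# Values taken from the 9Profit descriptive statistics table
se, margin, lower, upper = mean_confidence_interval(111.502687, 272.8393915, 31634)
print(f"SE = {se:.4f}, interval = [{lower:.2f}, {upper:.2f}]")
```

The interval is narrow because the sample is large, even though individual customer profits are widely dispersed.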

            9Profit    9Online    9Age       9Inc       9Tenure    9District
9Profit     1
9Online     0.00705    1
9Age        0.145541   -0.16573   1
9Inc        0.147466   0.081154   -0.07004   1
9Tenure     0.191133   -0.06648   0.423583   0.045395   1
9District   0.003095   0.003469   -0.03083   0.026151   -0.01031   1

…………………………………………………………………………………………

2. Is the difference in average customer profitability between online and offline
customers in the sample indicative of a meaningful difference in the profitability of
the groups?

Answer:

We split the profit data into offline and online customers and ran a single-factor
ANOVA to test whether the mean profits of the two groups differ significantly.
Online customers (Column 2, 3,854 customers) have a slightly higher average profit
than offline customers (Column 1, 27,780 customers): 116.67 against 110.79. However,
the p-value of 0.21 is greater than 0.05, so we cannot reject the hypothesis that the
two group means are equal at the 5 percent significance level. Hence the difference
observed in the sample does not indicate a meaningful difference in profitability
between online and offline customers.

Below are the results from the ANOVA test:


…………………………………………………………………………………………

RUNNING ANOVA ON EXCEL:

ANOVA: Single Factor

SUMMARY
Groups      Count    Sum        Average     Variance
Column 1    27780    3077642    110.7862    73604.22
Column 2    3854     449634     116.6668    80465.63

ANOVA
Source of Variation    SS          df       MS          F           P-value     F crit
Between Groups         117039.3    1        117039.3    1.572264    0.209888    3.841753
Within Groups          2.35E+09    31632    74439.99
Total                  2.35E+09    31633
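
The F statistic reported by Excel can be recomputed from the SUMMARY rows alone (count, sum and variance per group). The sketch below is a Python illustration under stated assumptions, not the procedure used in the report. The function names are hypothetical, and the p-value relies on a normal approximation that is only appropriate here because there are two groups and the within-group degrees of freedom are very large.

```python
import math

def one_way_anova_from_summary(groups):
    """Compute the one-way ANOVA F statistic from per-group summaries.

    groups: list of (count, total, variance) tuples, one per group.
    Returns (f_stat, df_between, df_within).
    """
    n_total = sum(count for count, _, _ in groups)
    grand_mean = sum(total for _, total, _ in groups) / n_total
    ss_between = sum(count * (total / count - grand_mean) ** 2
                     for count, total, _ in groups)
    ss_within = sum((count - 1) * variance for count, _, variance in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def approx_p_value_two_groups(f_stat):
    """Approximate p-value for exactly two groups.

    Assumption: F equals t squared, and with a very large within-group df
    the t distribution is treated as standard normal.
    """
    return math.erfc(math.sqrt(f_stat) / math.sqrt(2))

# (count, sum, variance) from the SUMMARY table: Column 1 and Column 2
f_stat, df_b, df_w = one_way_anova_from_summary(
    [(27780, 3077642, 73604.22), (3854, 449634, 80465.63)]
)
p_value = approx_p_value_two_groups(f_stat)
print(f"F = {f_stat:.4f}, p = {p_value:.4f}")
```

Because F is well below the critical value of 3.84, the conclusion does not depend on the approximation used for the p-value.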

…………………………………………………………………………………………

3. Will the missing data have any effect on Green’s analysis?

Answer:
Age and income are the two demographic variables with missing values. For each of
them we split the customers into those with the value present and those with it
missing, and ran a single-factor ANOVA on profit for the two groups. In both cases
the p-value is far below 0.05 (7.7E-51 for age and 9.22E-55 for income), so the
hypothesis of equal mean profit is rejected. Customers with missing age average about
73 in profit against about 125 for customers with age recorded, and the gap for
income is similar (about 71 against about 126). The data are therefore not missing at
random, and any analysis that drops the incomplete records will overstate average
profitability.

Profit split by availability of age data

…………………………………………………………………………………………

ANOVA: Single Factor

SUMMARY
Groups          Count    Sum      Average      Variance
With data       23345    3E+06    125.18698    79197.96
Without data    8289     6E+05    72.96248     59039.79

ANOVA
Source of Variation    SS          df       MS           F           P-value    F crit
Between Groups         16683626    1        16683626     225.7098    7.7E-51    3.841753
Within Groups          2.34E+09    31632    73916.258
Total                  2.35E+09    31633

…………………………………………………………………………………………

Profit split by availability of income data

ANOVA: Single Factor

SUMMARY
Groups          Count    Sum        Average      Variance
With data       23373    2937730    125.68904    79463
Without data    8261     589546     71.364968    58060

ANOVA
Source of Variation    SS          df       MS           F         P-value     F crit
Between Groups         18012652    1        18012652     243.83    9.22E-55    3.841753
Within Groups          2.34E+09    31632    73874.243
Total                  2.35E+09    31633
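
The practical effect of the missing data can be illustrated by comparing the mean profit of customers with age recorded against the pooled mean of all customers. The following is a minimal Python sketch using the group counts and averages from the age table. The function name is hypothetical and the calculation is an illustration, not part of the original Excel analysis.

```python
def pooled_mean(groups):
    """Weighted mean of several groups given as (count, group_mean) pairs."""
    total_count = sum(count for count, _ in groups)
    return sum(count * mean for count, mean in groups) / total_count

# (count, average profit) from the age-availability SUMMARY table
with_age = (23345, 125.18698)
without_age = (8289, 72.96248)

overall = pooled_mean([with_age, without_age])
# Amount by which a complete-case analysis overstates the overall mean
overstatement = with_age[1] - overall
print(f"overall = {overall:.2f}, overstatement = {overstatement:.2f}")
```

On these figures, keeping only the customers with age recorded raises the apparent average profit by roughly 12 percent.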

…………………………………………………………………………………………

4. What role do customer demographics play in analysing customer profitability for
online and offline customers?

Answer:

The regression was run in R, with the age, income and district brackets entered as
dummy variables (factors) and tenure as a numeric variable. The model's R^2 is 6.45%,
which means that the demographic variables explain only a small part of the variation
in profit. It is therefore advisable to collect additional variables and to fill in
missing data points such as age and income, since 8,822 observations were dropped for
missing values. Age brackets 2 to 7, income brackets 5 to 9, district 1200 and tenure
all have p-values below 0.05, which indicates a statistically significant
relationship with profit; income bracket 5 is borderline at p = 0.04997. Income
brackets 2 to 4 and district 1300 have p-values above 0.05, so they do not differ
significantly from the baseline bracket (the lowest level of each factor, which R
omits from the output). Note that the model as run does not include the 9Online
variable, so these effects apply to online and offline customers alike.

…………………………………………………………………………………………

> x <- lm(formula = `X9Profit` ~ as.factor(`X9Age`) + as.factor(`X9Inc`) + `X9Tenure` +
    as.factor(`X9District`), data = f)
> summary(x)

Call:
lm(formula = X9Profit ~ as.factor(X9Age) + as.factor(X9Inc) +
    X9Tenure + as.factor(X9District), data = f)

Residuals:
    Min      1Q  Median      3Q     Max
-524.20 -155.40  -71.04   67.55 1956.64

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                -53.9294    13.1744  -4.093 4.26e-05 ***
as.factor(X9Age)2           30.1445    11.9343   2.526  0.01155 *
as.factor(X9Age)3           68.8419    11.7074   5.880 4.16e-09 ***
as.factor(X9Age)4           73.2837    11.7885   6.217 5.17e-10 ***
as.factor(X9Age)5           77.4957    12.2367   6.333 2.45e-10 ***
as.factor(X9Age)6           97.5431    12.6813   7.692 1.51e-14 ***
as.factor(X9Age)7          133.0971    12.5012  10.647  < 2e-16 ***
as.factor(X9Inc)2            1.2393    11.6532   0.106  0.91530
as.factor(X9Inc)3           11.3702     8.3952   1.354  0.17563
as.factor(X9Inc)4           11.4107     8.5523   1.334  0.18214
as.factor(X9Inc)5           16.7295     8.5341   1.960  0.04997 *
as.factor(X9Inc)6           40.4157     7.4686   5.411 6.32e-08 ***
as.factor(X9Inc)7           61.7383     8.1552   7.570 3.86e-14 ***
as.factor(X9Inc)8           79.6597     9.3112   8.555  < 2e-16 ***
as.factor(X9Inc)9          148.3020     8.3545  17.751  < 2e-16 ***
X9Tenure                     4.0763     0.2354  17.314  < 2e-16 ***
as.factor(X9District)1200   19.1753     6.3776   3.007  0.00264 **
as.factor(X9District)1300    7.1871     7.7592   0.926  0.35432
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 273.7 on 22794 degrees of freedom
  (8822 observations deleted due to missingness)
Multiple R-squared: 0.06448, Adjusted R-squared: 0.06379
F-statistic: 92.42 on 17 and 22794 DF, p-value: < 2.2e-16
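
To see how the dummy-coded coefficients combine, the fitted profit for a given customer is the intercept plus the coefficient of each bracket the customer falls into, plus tenure times its slope. The sketch below is written in Python rather than R purely for illustration. The function name and the two example customers are hypothetical, and `district=None` is an assumed stand-in for the baseline district that R omits from the output.

```python
# Coefficients copied from the summary(x) output; the lowest level of each
# factor is the baseline and has an implicit coefficient of 0.
INTERCEPT = -53.9294
AGE = {2: 30.1445, 3: 68.8419, 4: 73.2837, 5: 77.4957, 6: 97.5431, 7: 133.0971}
INCOME = {2: 1.2393, 3: 11.3702, 4: 11.4107, 5: 16.7295,
          6: 40.4157, 7: 61.7383, 8: 79.6597, 9: 148.3020}
TENURE = 4.0763
DISTRICT = {1200: 19.1753, 1300: 7.1871}

def predicted_profit(age, income, tenure, district=None):
    """Fitted profit for one customer.

    Any level not listed in the coefficient tables (including the baseline
    level of each factor) contributes 0 to the prediction.
    """
    return (INTERCEPT
            + AGE.get(age, 0.0)
            + INCOME.get(income, 0.0)
            + TENURE * tenure
            + DISTRICT.get(district, 0.0))

# Hypothetical customers chosen only for illustration
baseline = predicted_profit(age=1, income=1, tenure=0)
senior = predicted_profit(age=7, income=9, tenure=10, district=1200)
print(f"baseline = {baseline:.2f}, oldest and highest income = {senior:.2f}")
```

Given the low R^2, such predictions describe group averages only and say little about any individual customer.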


…………………………………………………………………………………………
5. Using the Pilgrim Bank data that we used during class today, what can Alan Green
recommend about the customers who use Online Bill Payment? 

Answer:
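For online customers we regressed profit on the bill payment indicator, separately
for 1999 and 2000. The R^2 of the model is 1.2 percent for 1999 and 0.67 percent for
2000, so bill payment alone explains very little of the variation in profit; it is
recommended that more variables be collected and that missing data points such as age
and income be filled in. The bill payment coefficient is nevertheless positive and
has a p-value well below 0.05 in both years: customers who use online bill payment
are on average 91.24 more profitable in 1999 and 77.20 more profitable in 2000 than
online customers who do not. Green can therefore report that bill payment users are
significantly more profitable on average, although the regression alone does not show
that bill payment causes the higher profit.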

…………………………………………………………………………………………

Online transactions with bill payment for year 1999

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.110613
R Square             0.012235
Adjusted R Square    0.011979
Standard Error       281.9606
Observations         3854

ANOVA
              df      SS          MS          F           Significance F
Regression    1       3793308     3793308     47.71352    5.75E-12
Residual      3852    3.06E+08    79501.75
Total         3853    3.1E+08

             Coefficients    Standard Error    t Stat      P-value
Intercept    104.1669        4.889081          21.30602    2.58E-95
9Billpay     91.24033        13.20888          6.907497    5.75E-12
…………………………………………………………………………………………

Online transactions with bill payment for year 2000


SUMMARY OUTPUT

Regression Statistics
Multiple R           0.081591
R Square             0.006657
Adjusted R Square    0.006322
Standard Error       343.9376
Observations         2965

ANOVA
              df      SS          MS          F           Significance F
Regression    1       2348937     2348937     19.85692    8.66E-06
Residual      2963    3.51E+08    118293.1
Total         2964    3.53E+08

             Coefficients    Standard Error    t Stat      P-value
Intercept    155.2127        6.882884          22.55053    4.7E-104
0Billpay     77.19974        17.32447          4.456111    8.66E-06
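
The low R^2 values and the significant F statistics both follow directly from the ANOVA rows of the two regression outputs. The Python sketch below is an illustration with a hypothetical function name, not part of the original Excel workflow. It rebuilds the residual sum of squares from the residual mean square, because the tables print the sums of squares only in rounded scientific notation.

```python
def r_squared_and_f(ss_regression, ms_residual, df_regression, df_residual):
    """Derive R^2 and the F statistic from a regression ANOVA table."""
    ss_residual = ms_residual * df_residual
    r_squared = ss_regression / (ss_regression + ss_residual)
    f_stat = (ss_regression / df_regression) / ms_residual
    return r_squared, f_stat

# Entries from the 1999 and 2000 ANOVA tables
r2_1999, f_1999 = r_squared_and_f(3793308, 79501.75, 1, 3852)
r2_2000, f_2000 = r_squared_and_f(2348937, 118293.1, 1, 2963)
print(f"1999: R^2 = {r2_1999:.4f}, F = {f_1999:.2f}")
print(f"2000: R^2 = {r2_2000:.4f}, F = {f_2000:.2f}")
```

A small R^2 alongside a large F is expected with thousands of observations: the bill payment effect is estimated precisely even though it accounts for little of the spread in profit.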
…………………………………………………………………………………………
