
2019

Market Research Assignment 2

Apurva Negi (2018011)


Section B
8/14/2019
1. Error of Commission:

No errors of commission were found, as the means of all the variables are close to the scale points.

2. Missing Values
Respondent IDs 288 and 303 have missing values in all the variables, so these two responses were removed.
The other missing values were replaced by the variable mean, rounded to a whole value:
3, 10, 12, 28, 31, 172, 184, 275
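
The replacement step can be sketched as follows. This is a minimal illustration, assuming missing answers are marked as `None` and that "whole value" means the mean rounded half up; the function name and the sample responses are hypothetical.

```python
def impute_with_rounded_mean(values):
    """Replace None entries with the mean of the observed entries, rounded to a whole number."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    # Round half up by hand: the built-in round() rounds halves to the nearest even number.
    # Valid here because Likert scores are positive.
    fill = int(sum(observed) / len(observed) + 0.5)
    return [fill if v is None else v for v in values]

# Hypothetical 5-point Likert responses for one variable; None marks a missing answer.
responses = [4, 5, None, 3, 4, None, 5]
imputed = impute_with_rounded_mean(responses)
print(imputed)
```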

3. Outliers:
Respondent cases that are extreme outliers were removed:
56 unique cases were removed as extreme outliers
Remaining cases: 327/384
4. Skewness and Kurtosis:
Likert-scale data are expected to be skewed or kurtotic, so removing the variables
that show high skewness or kurtosis makes little sense. Had the data been continuous, we would
have applied the 3 x standard error criterion to remove variables.
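
For reference, the 3 x standard error criterion can be sketched for skewness as below. This is an illustration only, assuming the adjusted sample skewness and the usual large-sample formula for its standard error; the function names and the sample data are hypothetical.

```python
import math

def sample_skewness(xs):
    """Adjusted Fisher-Pearson sample skewness (G1)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

def skewness_standard_error(n):
    """Standard error of the sample skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def exceeds_three_se(xs):
    """True when |skewness| is larger than 3 times its standard error."""
    return abs(sample_skewness(xs)) > 3 * skewness_standard_error(len(xs))

# Hypothetical data: one symmetric variable and one heavily right-skewed variable.
print(exceeds_three_se([1, 2, 3, 4, 5]))
print(exceeds_three_se([1] * 9 + [5]))
```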

5. Normality:
The Shapiro-Wilk test tells us whether our data follow a normal distribution or not. Its
null hypothesis is that the data are normally distributed. Since all the significance values
are less than 0.05, the null hypothesis is rejected: none of the variables is normally
distributed (at the 95% confidence level). However, it makes little sense to test
normality for Likert-scale data, as such data can be skewed, and we still proceed
with the analysis.
6. Correlation:
To check for multicollinearity, any pair of variables with a Karl Pearson correlation
value > 0.9 would lead us to remove those variables. The correlation matrix shows no
variables with such high correlation values, so there is no multicollinearity.

Also, the determinant value is significant <0.01 at 99% confidence level
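
The 0.9 screening rule can be sketched as follows. This is a minimal illustration with hypothetical item names and scores, not the assignment data.

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def highly_correlated_pairs(data, threshold=0.9):
    """Return (name_a, name_b, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 3)))
    return flagged

# Hypothetical item scores: item_a and item_b move almost together.
data = {
    "item_a": [1, 2, 3, 4, 5],
    "item_b": [1, 2, 3, 4, 4],
    "item_c": [3, 1, 4, 2, 5],
}
flagged = highly_correlated_pairs(data)
print(flagged)
```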


7. Factor Analysis:

The KMO value is 0.923, which means that the sampling adequacy is good enough for
the test to proceed. Bartlett’s Test is significant, which indicates that the correlation
matrix is not an identity matrix.
All the variables share variance with one another, as the communalities
are all > 0.4.
The total variance explained by the top 7 components is approximately 69%.

The rotation converged after 6 iterations, and the variables were grouped into 7 factors
or components. These 7 constructs are the same as defined earlier. Two variables
exhibited cross-loadings, which are not acceptable because the factor analysis checks the
uni-dimensionality of the data.
Thus, we remove InfoAcq_4 and InfoAcq_5 and run the factor analysis again.
After removing those two variables and re-running the factor analysis,
the following was observed:
The KMO value and Bartlett’s Test remain good enough to proceed with the test.

The variance explained increased to approximately 70%.

All the variables are now grouped into their respective 7 constructs.

Factor 1: Useful
Factor 2: Joy
Factor 3: Decision Quality
Factor 4: Playful
Factor 5: Usage Type
Factor 6: Competency
Factor 7: Acquired Information
8. Cronbach’s Alpha:

Factor 1: Useful

Factor 2: Joy

Factor 3: Decision Quality

Factor 4: Playful

Factor 5: Usage Type

Factor 6: Competency
Factor 7: Acquired Information
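
For reference, Cronbach's alpha for one construct can be computed as sketched below. This is a minimal illustration using the standard formula with sample variances; the item scores are hypothetical and are not the values from the assignment.

```python
def variance(xs):
    """Sample variance (n - 1 in the denominator)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def cronbach_alpha(items):
    """items: list of item columns, each holding one score per respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent sum over the items
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores of five respondents on three items of one construct.
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
]
print(round(cronbach_alpha(items), 3))
```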

2. Discriminant Analysis
Experience is used as the dependent variable after recoding it into a median-split variable. The new
variable is Exp_Split. The median value for Experience (from descriptives and frequencies)
is 3. Thus, Experience values less than or equal to 3 are termed ‘Low’ with value 0, and
Experience values greater than 3 are termed ‘High’ with value 1.
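
The recoding rule can be sketched as follows. This is a minimal illustration; the Experience values below are hypothetical and chosen so that their median is 3, as in the assignment.

```python
import statistics

def median_split(values):
    """Recode values into 0 ('Low', <= median) and 1 ('High', > median)."""
    median = statistics.median(values)
    return [0 if v <= median else 1 for v in values]

# Hypothetical Experience scores with a median of 3.
experience = [1, 2, 3, 3, 4, 5, 3, 2, 5]
exp_split = median_split(experience)
print(exp_split)
```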

The variables Education, Experience, Playful, Competence, Type Usage and Information
Acquired are significant at the 95% confidence level (sig. value < 0.05). Wilks' lambda indicates the
relative importance of the independent variables: the smaller the Wilks' lambda value,
the greater the importance. Thus, in order of importance:
Experience > Comp_Mean > Type_Mean > Playful_Mean > Education > others

Since Sig < 0.05, the null hypothesis that the covariance matrices of the two groups
are equal is rejected.

Pearson's correlation between the discriminant scores and the two groups is high (0.8).
Only 36% of the variance is not explained by the differences between the two groups (High and Low Exp),
so the function has good discriminatory ability.
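
The link between the 0.8 correlation and the 36% figure can be checked with a short calculation. This sketch assumes a single discriminant function, for which the unexplained share equals one minus the squared correlation.

```python
canonical_r = 0.8                # correlation quoted above
explained = canonical_r ** 2     # share of variance explained by group differences
unexplained = 1 - explained      # share not explained (Wilks' lambda for one function)
print(f"explained: {explained:.0%}, unexplained: {unexplained:.0%}")
```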

The associated chi-square statistic tests the hypothesis that the means of the functions listed
are equal across groups. The small significance value indicates that the discriminant function
does better than chance at separating the groups High and Low Exp.

The larger the absolute value, the greater the discriminatory ability of the
variable. The effect can be positive as well as negative.
Experience has the highest ability.
The ordering in the structure matrix is the same as that
suggested by the tests of equality of group means and is
different from that in the standardized coefficients table.
Experience is the only variable with a value greater than 0.4.

This table tells us the mean cut-offs for the two groups.
The Zone of Confusion lies between -1.211 and 1.464.

The model has an accuracy of 98.2%:
(Low-Low + High-High) / Total => (179+142)/327 = 98.17%
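
The hit ratio can be recomputed from the counts quoted above. A minimal sketch; the function name is hypothetical.

```python
def hit_ratio(correct_counts, total):
    """Share of cases classified into their actual group."""
    return sum(correct_counts) / total

# 179 Low cases classified as Low, 142 High cases classified as High, 327 cases in total.
accuracy = hit_ratio([179, 142], 327)
print(f"{accuracy:.2%}")  # prints 98.17%
```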
3. Cluster Analysis

Hierarchical Cluster Analysis

Variables: Experience, Playful_Mean, Comp_Mean, Joy_Mean, Info_Mean, Useful_Mean, Type_Mean, Decision_Mean
The distance coefficients show a huge jump at stage two, but to be safe we can
consider three or four clusters, as this is a subjective test. Thus, in the next step
of the cluster analysis, we can input 4 clusters.
We also get a long dendrogram, which shows 3-4 distinct clusters.
K-Means Cluster Analysis

Although the cluster sizes vary considerably with 4 clusters, they vary far more
drastically when K-Means Cluster Analysis is run with 3 or 5 clusters.

We can see distinct properties of the four clusters.


TwoStep Cluster Analysis

The cluster sizes are considerably different, so we decrease the number of clusters specified in the test.
With two clusters, the ratio of cluster sizes was smaller than in the other tests we ran
earlier.
Conclusion: 2 clusters
The predictor importance of frequency is the highest, so we check what happens if we remove it.
Removing frequency brings us to 3 clusters.
After removing gender too, we are left with 2 clusters.

The cluster size is relatively different.


Two clusters of 128 and 199 respondents.

Beyond this point, every TwoStep Cluster Analysis run gave widely varying cluster sizes.
Thus, we stop here and conclude that only 2 clusters can be formed.
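
The ratio of cluster sizes used to compare the solutions can be computed as sketched below, here for the two-cluster solution of 128 and 199 respondents. The function name is hypothetical.

```python
def size_ratio(cluster_sizes):
    """Ratio of the largest cluster size to the smallest."""
    return max(cluster_sizes) / min(cluster_sizes)

two_step_sizes = [128, 199]  # sizes of the final TwoStep solution
ratio = size_ratio(two_step_sizes)
print(f"{sum(two_step_sizes)} respondents, size ratio {ratio:.2f}")
```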
Structural Equation Modelling
Looking at the estimates, we see that all the observed variables significantly measure their
unobserved variables with confidence greater than 99.9% (C.R. value > 3.29; P is shown as ***,
i.e. p < 0.001). The relative importance of each measure for its construct can be seen from
the Estimate value or from the standardized regression weights.

Path Estimate S.E. C.R. P
Playful_1 <--- Playful 1.000
Playful_2 <--- Playful 1.343 .102 13.139 ***
Playful_3 <--- Playful 1.220 .099 12.301 ***
Playful_4 <--- Playful 1.262 .102 12.373 ***
Playful_5 <--- Playful 1.370 .106 12.966 ***
Playful_6 <--- Playful 1.418 .107 13.305 ***
Playful_7 <--- Playful 1.072 .098 10.958 ***
CompLatent_5 <--- Competency 1.000
CompLatent_4 <--- Competency .955 .090 10.595 ***
CompLatent_3 <--- Competency 1.019 .089 11.509 ***
CompLatent_1 <--- Competency 1.009 .090 11.212 ***
AtypUse_1 <--- Usage 1.000
AtypUse_2 <--- Usage 1.075 .060 17.844 ***
AtypUse_3 <--- Usage .980 .050 19.752 ***
AtypUse_4 <--- Usage 1.026 .053 19.227 ***
AtypUse_5 <--- Usage .969 .052 18.772 ***
Useful_7 <--- Useful 1.000
Useful_6 <--- Useful .976 .072 13.646 ***
Useful_5 <--- Useful 1.104 .075 14.752 ***
Useful_4 <--- Useful 1.216 .073 16.706 ***
Useful_3 <--- Useful 1.230 .076 16.273 ***
Useful_2 <--- Useful 1.150 .074 15.584 ***
Useful_1 <--- Useful 1.109 .076 14.553 ***
Joy_1 <--- Joy 1.000
Joy_2 <--- Joy 1.133 .070 16.078 ***
Joy_3 <--- Joy 1.062 .069 15.439 ***
Joy_4 <--- Joy 1.008 .062 16.353 ***
Joy_5 <--- Joy 1.162 .074 15.666 ***
Joy_6 <--- Joy 1.014 .065 15.559 ***
Joy_7 <--- Joy 1.061 .066 16.115 ***
InfoAcq_3 <--- InfoAcq 1.000
InfoAcq_2 <--- InfoAcq 1.125 .091 12.339 ***
InfoAcq_1 <--- InfoAcq 1.178 .087 13.586 ***
DecQual_8 <--- DecQual 1.000
DecQual_7 <--- DecQual 1.178 .135 8.729 ***
DecQual_6 <--- DecQual 1.050 .132 7.964 ***
DecQual_5 <--- DecQual 1.223 .135 9.069 ***
DecQual_4 <--- DecQual 1.236 .139 8.866 ***
DecQual_3 <--- DecQual 1.353 .143 9.427 ***
DecQual_2 <--- DecQual 1.240 .134 9.271 ***
DecQual_1 <--- DecQual 1.143 .134 8.551 ***
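
The critical ratio is the estimate divided by its standard error. The sketch below recomputes it for three rows of the table, assuming 3.29 as the two-tailed normal critical value for p < 0.001. Because the table shows rounded estimates and standard errors, the recomputed values only approximate the reported ones.

```python
# (estimate, standard error, reported C.R.) copied from three rows of the table above
rows = {
    "Playful_2": (1.343, 0.102, 13.139),
    "AtypUse_3": (0.980, 0.050, 19.752),
    "DecQual_6": (1.050, 0.132, 7.964),
}
Z_CRIT_999 = 3.29  # assumed two-tailed critical value for p < 0.001

def critical_ratio(estimate, std_error):
    """C.R. = estimate / standard error."""
    return estimate / std_error

recomputed = {name: critical_ratio(est, se) for name, (est, se, _) in rows.items()}
for name, cr in recomputed.items():
    print(f"{name}: C.R. ~ {cr:.2f}, significant at 99.9%: {cr > Z_CRIT_999}")
```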
The correlations among the constructs show that no two constructs are very highly
correlated with each other (the largest is .652, between DecQual and InfoAcq), which means
that the constructs are distinct from one another.

Estimate
DecQual <--> InfoAcq .652
DecQual <--> Joy .419
DecQual <--> Useful .553
DecQual <--> Usage .186
DecQual <--> Competency .166
DecQual <--> Playful .291
Joy <--> InfoAcq .501
Useful <--> InfoAcq .576
Usage <--> InfoAcq .188
Competency <--> InfoAcq .290
Playful <--> InfoAcq .373
Useful <--> Joy .450
Usage <--> Joy .240
Competency <--> Joy .322
Playful <--> Joy .509
Usage <--> Useful .210
Competency <--> Useful .255
Playful <--> Useful .322
Competency <--> Usage .426
Playful <--> Usage .414
Playful <--> Competency .437

A snippet of the standardized total effects table shows exactly the same result that we got from
our Exploratory Factor Analysis, thereby confirming our constructs and measures.

Model Fit Summary


CMIN

Model                NPAR        CMIN    DF      P   CMIN/DF
Default model         144    1531.829   758   .000     2.021
Saturated model       902        .000     0
Independence model     82   10471.692   820   .000    12.770

The CMIN/DF value has to be less than 3 for a good-fit model; here it is 2.021.
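
The CMIN/DF column is the chi-square value (CMIN) divided by its degrees of freedom, and it can be recomputed from the table as a quick check. The variable names are hypothetical.

```python
# CMIN and DF values copied from the CMIN table
default_cmin, default_df = 1531.829, 758
independence_cmin, independence_df = 10471.692, 820

default_ratio = default_cmin / default_df
independence_ratio = independence_cmin / independence_df

print(f"Default model CMIN/DF: {default_ratio:.3f}")
print(f"Independence model CMIN/DF: {independence_ratio:.3f}")
print("Default model below the cut-off of 3:", default_ratio < 3)
```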


Baseline Comparisons

Model                NFI Delta1   RFI rho1   IFI Delta2   TLI rho2     CFI
Default model              .854       .842         .920       .913    .920
Saturated model           1.000                   1.000              1.000
Independence model         .000       .000         .000       .000    .000

The CFI value needs to be near 1 for a good-fit model; here it is .920.
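
The baseline comparison indices can be recomputed from the chi-square values of the default and independence models. The sketch below uses the standard textbook formulas, which are assumed here to match how the software computes them.

```python
# Chi-square (CMIN) and degrees of freedom from the CMIN table
chi_m, df_m = 1531.829, 758    # default model
chi_b, df_b = 10471.692, 820   # independence (baseline) model

nfi = 1 - chi_m / chi_b
rfi = 1 - (chi_m / df_m) / (chi_b / df_b)
ifi = (chi_b - chi_m) / (chi_b - df_m)
tli = (chi_b / df_b - chi_m / df_m) / (chi_b / df_b - 1)
cfi = 1 - max(chi_m - df_m, 0) / max(chi_b - df_b, chi_m - df_m, 0)

for name, value in [("NFI", nfi), ("RFI", rfi), ("IFI", ifi), ("TLI", tli), ("CFI", cfi)]:
    print(f"{name}: {value:.3f}")
```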

RMSEA

Model                RMSEA   LO 90   HI 90   PCLOSE
Default model         .056    .052    .060     .008
Independence model    .190    .187    .193     .000

The RMSEA value, which measures badness of fit, needs to be near 0 (strict cut-off: 0.05); here it is
.056, slightly above the strict cut-off.
Thus, through the model fit tests, we come to know that our model fits the data reasonably well.
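
The RMSEA value can be recomputed from the chi-square, its degrees of freedom and the sample size. The sketch below uses the standard formula and assumes that the 327 cases remaining after data cleaning were used in the model.

```python
import math

def rmsea(chi_square, df, n):
    """Root mean square error of approximation for a sample of size n."""
    return math.sqrt(max(chi_square - df, 0) / (df * (n - 1)))

N_CASES = 327  # assumption: cases remaining after cleaning
default_rmsea = rmsea(1531.829, 758, N_CASES)
print(f"RMSEA: {default_rmsea:.3f}")
print("Below the strict cut-off of 0.05:", default_rmsea < 0.05)
```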
