DATA PROCESSING, ANALYSING AND INTERPRETATION Ipmi

DATA PROCESSING AND ANALYSIS
Dr Firdaus Basbeth
Agenda
1. Introduction to Data Analysis & Statistics
2. Descriptive Statistics
3. Inferential Statistics & Hypothesis Testing

Things aren’t always what we think
Blind men and an elephant
• They all fell into a heated argument as to who was right in describing the big beast, all sticking to their own perception.
• The wise man then calmly said, “Each one of you is correct; and each one of you is wrong. Because each one of you had only touched a part of the elephant’s body. Thus you only have a partial view of the animal. If you put
WHY DO WE ANALYZE DATA
• To describe and summarize the data
• Compare variables
• Identify the differences between variables
• Identify relationship between variables
• Data must be interpreted. Numbers do not speak for themselves

1. FROM DATA TO INFORMATION TO KNOWLEDGE
• What is Data
Facts, numbers and characters that can be processed by a computer. These can be any alphanumeric characters, i.e. text, numbers, symbols.
Data on its own is meaningless. Data must be interpreted, by a human or a machine, to derive meaning.
• Example
Data → Context → Processing → Information → Knowledge
Raw data: 42, 63, 96, 74, 56, 86
A data set like this has no meaning until it is given a CONTEXT and PROCESSED into a usable form.
Context: Farah’s scores in the six Innovation modules
Processing: Avg 69.5, Min 42, Max 96, Sum 417
Information and knowledge: Farah’s teacher could analyze the results to determine whether it would be worth her re-sitting a module. Adding the scores would give us a mark out of 500 that could then be converted to a B level grade. Alternatively, we could convert the individual module results into grades.
Summary
Information = Data + Context + Meaning
Processing
Data – raw facts and figures

Information – data that has been processed (in a context) to give it meaning
What is Statistics?
“Statistics is a way to get information from data”
Data → Statistics (TOOL) → Information
Statistics is a tool for creating new understanding from a set of numbers.
Definitions: Oxford English Dictionary

Statistical Terms/Terminology
Population: The entire group of people, events, or things of interest that the researcher wishes to investigate.
Parameter: A characteristic of the population, such as the mean, standard deviation or variance.
Sample: A subset of the population. It comprises some members selected from it; only some, but not all, elements of the population form the sample.
Sampling: The process of selecting a sufficient number of elements from the population.
Individual: The people or objects included in the study.
Variable: A characteristic of the individual, object or condition to be measured or observed.
Independent Variable: One that influences the dependent variable in either a positive or a negative way.
Dependent Variable: The variable of primary interest to the researcher. The researcher’s goal is to understand, describe, explain or predict the dependent variable.
Data: Measurements that are made on the subjects of an experiment.
Quiz 1: Using basic terminology
Television station TV3 wants to know the proportion of TV owners in Kota Bharu who watch the station’s new program at least once a week. The station asks a group of 1000 TV owners if they watch the program at least once a week.
a) Identify the individuals of the study and the variable.
The individuals are the 1000 TV owners surveyed. The variable is the response: does or does not watch the program at least once a week.
b) Do the data comprise a sample? If so, what is the population?
Yes. The data comprise a sample from the population of all TV owners in Kota Bharu.
c) Identify a variable that might be of interest.
Age or income
d) Is the proportion of viewers in the sample who watch the new program at least once a week a statistic or a parameter?
A statistic
How Variables Are Measured
• To establish relationships between variables, researchers must observe
the variables and record their observations. This requires that the
variables be measured.
• The process of measuring a variable requires a set of categories called a scale of measurement.
• These scales are classified into 4 types of measurement scales: Nominal, Ordinal, Interval, Ratio
1. Nominal Scale
Here nominal should be associated with the word name since this scale
identifies names, labels or categories
Examples:
• Gender
• Marital Status
• State of Residence
• College Major
• Zip Code
• Student ID
2. Ordinal Scale
The ordinal scale (think of the word order) applies to data that can be arranged in order (1, 2, 3 and so on); however, differences between data values either cannot be determined or are meaningless.
Examples
• Miss America results: first place, runner up, third
• Military rank
• Degree held
3. Interval Scale
An interval scale represents a higher level of measurement than the ordinal scale, with the additional property that the distance between observations is meaningful.
Examples
• Celsius scale of temperature measurement
• Questionnaires
4. Ratio Scale
A ratio scale is an interval scale with the added property that ratios of
observations are meaningful.
It is the only scale that allows you to make ratio comparisons, such as “Susana’s income is 35% more than Suhaila’s”.
Examples:
• Weight of a package of books
• Number of questions correct on a quiz
• Distance from KL to Kota Bharu
QUIZ 2: Identify the type of data
Data | Type of Data
1. Taos, Acoma, Zuni and Cochiti are the names of four Native American pueblos from the population of names of all Native American pueblos in Arizona and New Mexico.
2. In a high school graduating class of 319 students, Saiful ranked 25th, Hafiz ranked 19th, Robby ranked 10th and Tava ranked 4th, where 1 is the highest rank.
3. Body temperatures (degrees Celsius) of trout in the Java Sea
4. Lengths of trout swimming in the Java Sea
Branches of Statistics
Descriptive: Describes a set of data by graphically displaying the information, or by describing its mean and standard deviation.
Inferential: Tries to infer information about a population by using information gathered by sampling (estimation of parameters and hypothesis testing).
Branches of Statistics
Statistics
Descriptive Statistics Inferential Statistics
Parametric Statistics Non-Parametric Statistics

2. TOOLS TO SUPPORT DATA ANALYSIS
Computer Package
• SPSS
• CB-SEM (AMOS)
• PLS-SEM (SMART PLS)
RESEARCH ACTIVITIES USING STATISTICS
No. | What to do with data | Criteria | Tools | Branch of statistics | Expected result
1. (2) | Pilot test: check validity and reliability of questionnaires | Pearson correlation (0.3) and Cronbach’s Alpha (0.7) | SPSS | Correlation | Valid and reliable
2. (3) | Data preparation: check normality, outliers, missing values | Histogram, normal distribution, skewness, kurtosis, Kolmogorov-Smirnov significance, Q-Q plot, box plot | SPSS | Descriptive | Normal distribution, no outliers, no missing values
3. (1) | Descriptive analysis | Mean, SD, frequency, percentage | Excel, SPSS | Descriptive | Clear, simple, meaningful information
4. (4) | Hypothesis testing | Path coefficient, t value, p value, bias-corrected confidence interval, measurement and structural model | PLS-SEM (SmartPLS3), CB-SEM (AMOS) | Partial Least Squares / Regression | Measurement model and structural model fulfill criteria, conclusion of hypothesis
Descriptive Statistics
A. Measure of Central Tendency
The three most common measures of central tendency are the arithmetic mean, the mode and the median.
• Mean
Example
• Mode
• Median
B. Measure of Variation
• Range
• Standard Deviation
• Variance
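As a minimal sketch of these measures, the snippet below applies Python's standard `statistics` module to the six module scores from the raw-data example (42, 63, 96, 74, 56, 86). The variable names are illustrative, and the sample (n - 1) formulas are assumed for the variance and standard deviation.

```python
import statistics

scores = [42, 63, 96, 74, 56, 86]  # raw data from the earlier example

# A. Measures of central tendency
mean = statistics.mean(scores)        # arithmetic mean
median = statistics.median(scores)    # middle value of the sorted scores
modes = statistics.multimode(scores)  # most frequent value(s); every score ties here

# B. Measures of variation
value_range = max(scores) - min(scores)  # range = maximum - minimum
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)       # sample standard deviation

print(f"mean={mean}, median={median}, range={value_range}")
print(f"variance={variance:.1f}, sd={std_dev:.2f}, modes={modes}")
```

Because no score repeats, every value is returned as a mode, which is a reminder that the mode is only informative when some values occur more often than others.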
Demographic Variable   Category             Frequency   Percent
Education              Doctorate                  282      84.7
                       Masters and others          51      15.3
Gender                 Female                     126      37.8
                       Male                       207      62.2
Age group              Under 40                   104      31.2
                       Over 40                    229      68.8
Ethnicity              Malay                      304      91.3
                       Chinese                     11       3.3
                       Indian                      13       3.9
                       Other                        5       1.5
Position               Dean                        30       9.0
                       Deputy Dean                 46      13.8
                       Head of Department         241      72.4
                       Director                    10       3.0
                       Assistant Director           6       1.8
Use xls or SPSS
[Bar chart: Mean Value (axis from 3.3 to 4.1) of Transformational Leadership (TL), Emotional Intelligence and Organizational Culture. Use SPSS]
Pilot Test → Validity and Reliability Test
Correlation: deals with the magnitude and direction of a relationship
Prediction and relationship
Table 1. IQ and Grade Points
Instead of just summing the prediction errors, we first compute the squared error for each score. This removes the negative signs, and if we minimize the sum of the squared errors we minimize the total error of prediction.
Regression line and prediction error
Positive and negative relationship
Pilot Test → Test for Validity (Pearson r) and Reliability (CA)
Use SPSS
Correlation and Regression
• Correlation and regression are closely related.
• Correlation is concerned with the magnitude and direction of the relationship.
• Regression focuses on using the relationship for prediction.
• If the relationship is perfect, all the points fall on a straight line, and all we need to do is derive the equation of the straight line and use it for prediction.
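The difference between the two ideas can be sketched in a few lines of Python. The data below are hypothetical (they are not the IQ and grade-point values of Table 1), and the function name `pearson_and_line` is an illustrative choice. The correlation coefficient summarizes magnitude and direction, while the least-squares line is what gets used for prediction.

```python
from math import sqrt

def pearson_and_line(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)            # correlation: magnitude and direction
    slope = sxy / sxx                    # least-squares slope
    intercept = mean_y - slope * mean_x  # least-squares intercept
    return r, slope, intercept

# Hypothetical paired scores, used only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, slope, intercept = pearson_and_line(x, y)
predicted = intercept + slope * 6  # regression: predict y for a new x value
print(round(r, 3), round(slope, 2), round(intercept, 2), round(predicted, 2))
```

The slope and intercept computed this way are the values that minimize the sum of squared prediction errors for the given points.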
Correlation and Regression
• Correlation is a bivariate analysis that measures the strength of
association between two variables and the direction of the relationship.
In terms of the strength of relationship, the value of the correlation
coefficient varies between +1 and -1. A value of ± 1 indicates a perfect
degree of association between the two variables.
• As the correlation coefficient value goes towards 0, the relationship
between the two variables will be weaker. The direction of the
relationship is indicated by the sign of the coefficient; a + sign indicates a
positive relationship and a – sign indicates a negative relationship.
Correlation
• In SPSS we can measure the degree of relationship between two variables.
• For example, is there a relationship between the salary of customers and the amount of money they are willing to spend on a television?
• In correlation there are no independent and dependent variables; you simply measure the variables.
• However, an important point to remember here is that correlation does not imply causation.
Correlation (Pearson, Kendall, Spearman)
Commonly used correlation measures in statistics include the Pearson correlation, the Kendall rank correlation and the Spearman correlation.
Types of research questions a Pearson correlation can examine:
• Is there a statistically significant relationship between age, as measured in
years, and height, measured in inches?
• Is there a relationship between temperature, measured in degrees
Fahrenheit, and ice cream sales, measured by income?
• Is there a relationship between job satisfaction, as measured by the JSS,
and income, measured in dollars?
Correlation (Pearson, Kendall, Spearman)
Assumptions
• For the Pearson r correlation, both variables should be normally
distributed (normally distributed variables have a bell-shaped curve).
Other assumptions include linearity and homoscedasticity. Linearity
assumes a straight line relationship between each of the two variables and
homoscedasticity assumes that data is equally distributed about the
regression line.
Kendall rank correlation:
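• Kendall rank correlation is a non-parametric test that measures the strength of dependence between two variables.
• The following formula is used to calculate the value of Kendall rank correlation: $\tau = \frac{n_c - n_d}{\frac{1}{2} n (n - 1)}$, where $n_c$ is the number of concordant pairs, $n_d$ is the number of discordant pairs and $n$ is the sample size.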
Spearman rank correlation
• Spearman rank correlation is a non-parametric test that is used to
measure the degree of association between two variables. The
Spearman rank correlation test does not carry any assumptions
about the distribution of the data and is the appropriate correlation
analysis when the variables are measured on a scale that is at least
ordinal.
Spearman rank correlation
Types of research questions a Spearman Correlation can examine:

• Is there a statistically significant relationship between participants’ level of
education (high school, bachelor’s, or graduate degree) and their starting
salary?
• Is there a statistically significant relationship between horse’s finishing
position a race and horse’s age?
Assumptions
• The assumptions of the Spearman correlation are that data must be at
least ordinal
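Both rank correlations can be sketched without any statistics package. The example below assumes there are no tied values (ties need an adjusted formula, which SPSS applies automatically), and the finishing positions and ages are hypothetical numbers chosen for the horse-race question above.

```python
from itertools import combinations

def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rank correlation via the rank-difference formula (no ties)."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

def kendall_tau(x, y):
    """Kendall rank correlation: (concordant - discordant) / (n(n-1)/2)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical data: finishing position in a race and the horse's age.
position = [1, 2, 3, 4, 5]
age = [3, 5, 4, 7, 6]
print(spearman_rho(position, age), kendall_tau(position, age))
```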
A. QUESTIONNAIRE VALIDITY
1. Copy and paste the respondents’ answers to each statement into the SPSS Data View, together with the total score of each variable.
2. Click → Analyze – Correlate – Bivariate
→ Move all statements into the “Variables” box, together with the total score of the variable.
→ Leave the options at their defaults (Pearson Correlation)
→ Click → OK
1. In the output table, look at the last column on the right, which shows the correlation of each statement with its total score (first row). This value is the validity coefficient (Pearson r).
2. If the value is > 0.300, the statement is declared valid;
3. if it is < 0.300, it is declared not valid.
Correlations
                             X1 Entrepreneurial   X2 Innovation   Y1 Competitive   Z1 Business
                             Orientation                          Strategy         Performance
INV1   Pearson Correlation   .448*                -.028           -.303            -.260
       Sig. (2-tailed)       .013                 .884            .103             .165
       N                     30                   30              30               30
KV1    Pearson Correlation   .070                 .740**          .356             .481**
       Sig. (2-tailed)       .712                 .000            .053             .007
       N                     30                   30              30               30
CL1    Pearson Correlation   .012                 .294            .730**           .614**
       Sig. (2-tailed)       .948                 .115            .000             .000
       N                     30                   30              30               30
CS     Pearson Correlation   -.209                .109            .403*            .519**
       Sig. (2-tailed)       .268                 .565            .027             .003
       N                     30                   30              30               30
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
B. QUESTIONNAIRE RELIABILITY
Cronbach’s alpha is a measure of internal consistency, that is, how closely related a set of items are as a group.
→ Analyze – Scale – Reliability Analysis
→ Move all statements into the “Variables” box, without the total score of the variable.
→ Leave the options at their defaults (Alpha)
→ OK
→ In the output table, the reliability coefficient is shown by the Cronbach’s Alpha value.
→ If the value is > 0.700, the variable is declared reliable;
→ if it is < 0.700, it is declared not reliable.
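The two pilot-test checks can be sketched in Python to show what SPSS computes behind the menus. The response data are hypothetical (3 items, 6 respondents), the function names are illustrative, and the item-total correlation here includes the item itself in the total, as in the SPSS procedure described above.

```python
from math import sqrt
from statistics import variance

def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item (same respondents)."""
    k = len(items)
    totals = [sum(answers) for answers in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical pilot data: 3 statements answered by 6 respondents (1-5 scale).
items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 4],
    [5, 5, 2, 4, 1, 4],
]
totals = [sum(answers) for answers in zip(*items)]

# Validity: correlate each statement with the total score (criterion > 0.3).
for number, item in enumerate(items, start=1):
    r = pearson_r(item, totals)
    print(f"statement {number}: r = {r:.3f}, valid = {r > 0.3}")

# Reliability: Cronbach's alpha over all statements (criterion > 0.7).
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}, reliable = {alpha > 0.7}")
```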
PILOT TEST → TEST FOR VALIDITY (PEARSON R) AND RELIABILITY (CA)
Use SPSS
3. Data Preparation → Normality, Outliers, Missing Values
Assessing Normality → Use SPSS
• Analyze → Descriptive Statistics → Explore
• Select age and move it to the Dependent List box
• In the Display section make sure that Both is selected
• Click on Statistics → Descriptives and Outliers
• Click on Continue
• Click on the Plots button. Under Descriptive, click on Histogram. Click on Normality plots with tests. Click on Continue
• Click on the Options button; in the Missing Values section click on Exclude cases pairwise. Click on Continue and then OK
• Normality test → Sig. value > 0.05 → normality
• The skewness value provides an indication of the symmetry of the distribution. Positive skewness values indicate positive skew (scores clustered to the left).
• Kurtosis provides information about the peakedness of the distribution. Positive kurtosis values indicate that the distribution is clustered in the centre.
• Kurtosis values below zero indicate a distribution that is relatively flat (too many cases in the extremes).
• Skewness: 0.474 → positive skew, scores clustered to the left
• Kurtosis: -0.484, below 0 → relatively flat
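To make the sign conventions concrete, here is a small sketch that computes skewness and excess kurtosis from the moments of the data. It uses the simple population (moment-based) formulas, so the figures can differ slightly from SPSS, which applies sample-size adjustments; the ages are hypothetical.

```python
def skewness_kurtosis(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return skew, excess_kurtosis

# Hypothetical ages: most scores sit on the left, with a tail to the right.
ages = [21, 22, 22, 23, 24, 25, 27, 30, 36, 45]
skew, kurt = skewness_kurtosis(ages)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

A positive skewness value for these ages reflects the cluster of low scores with a long right tail, which is the pattern described in the bullets above.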
Interpretation of output
Missing Data
• Hair et al. (2017) recommend using mean value replacement when fewer than 5% of the values are missing.
• Mean value replacement: the missing values are replaced with the mean of the valid values of that indicator (Hair Jr, Sarstedt, Ringle, & Gudergan, 2017).
Checking for Outliers
If the mean value and the 5% trimmed mean value are very different → your extreme scores are having a strong influence on the mean (outliers); check the ID of the values (box plot).
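Both ideas are easy to sketch in Python. The helper names and the data are hypothetical, missing values are represented as `None`, and the trimmed mean below simply drops whole cases from each end, which is a simplification of how a statistics package may compute it.

```python
def mean_replace(values):
    """Replace missing entries (None) with the mean of the valid values."""
    valid = [v for v in values if v is not None]
    mean = sum(valid) / len(valid)
    return [mean if v is None else v for v in values]

def trimmed_mean(values, proportion=0.05):
    """Mean after dropping the given proportion of cases from each end."""
    data = sorted(values)
    cut = int(len(data) * proportion)  # whole cases removed per tail
    kept = data[cut:len(data) - cut] if cut else data
    return sum(kept) / len(kept)

# Hypothetical indicator with one missing answer.
print(mean_replace([4, None, 5, 3]))

# Hypothetical scores with one extreme value (1000).
scores = list(range(1, 20)) + [1000]
print(sum(scores) / len(scores), trimmed_mean(scores))
```

In the second example the ordinary mean is pulled far above the trimmed mean, which is the signal that an extreme score deserves a closer look.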
4. Hypothesis Testing → USING PLS-SEM (SMART PLS3.0)
WHAT IS PLS-SEM
• A 2nd generation multivariate technique used to test hypotheses from existing theories and concepts (confirmatory) or to develop theory (exploratory)
• Focuses on the predictive ability of the model (explaining the variance in the dependent variables)
WHAT IS PLS-SEM
                   Primarily Exploratory            Primarily Confirmatory
1st generation     • Cluster analysis               • Analysis of variance
techniques         • Exploratory factor analysis    • Logistic regression
                   • Multidimensional scaling       • Multiple regression
                                                    • Confirmatory factor analysis
2nd generation     Partial least squares            Covariance-based
techniques         structural equation modelling    structural equation modelling
                   (PLS-SEM), SmartPLS3             (CB-SEM), AMOS
WHEN TO USE PLS-SEM
• The goal is predicting target constructs
• The structural model is complex
• The sample size is small
• The data are not normally distributed
• The measurement scale is nominal, ordinal or interval
• Constructs are formatively measured
MINIMUM SAMPLE REQUIREMENTS (HAIR, 2017)
• Rules of thumb
• Table provided by Cohen (1992)
• Alternatively, use a program such as G*Power to carry out power analyses specific to the model setup.
http://www.gpower.hhu.de/
COHEN (1992)
Sample Size Recommendation in PLS-SEM for a Statistical Power of 80%
PLS-SEM PATH MODELING
[Diagram: indicators / items attached to an exogenous latent variable, which points to an endogenous latent variable with its own indicators / items]
REFLECTIVE MEASUREMENT MODEL
FORMATIVE MEASUREMENT MODEL
1. MODEL CREATION USING SMARTPLS
Specifying the Structural Model
• Prepare a diagram that connects the constructs based on theory, to visually display the hypotheses that will be tested
• Establish the relationships between them by drawing arrows
[Diagram: Reputation → Customer Loyalty]
The construct on the left predicts the construct on the right side.
Reputation predicts Customer Loyalty.
MEDIATION / INTERVENING
[Diagram: Reputation → Customer Loyalty]
[Diagram: Reputation → Satisfaction (path a) → Customer Loyalty (path b); Reputation → Customer Loyalty (path c)]
indirect effect = a * b
direct effect = c
MODERATION
[Diagram: Income moderates the path Customer Satisfaction → Customer Loyalty]
MODEL ESTIMATION & PLS-SEM ALGORITHM
1. Model estimation delivers empirical measures of the relationships between the indicators and the constructs, as well as between the constructs. We can determine how well the theory fits the data.
2. PLS-SEM results are reviewed and evaluated using a systematic process. The goal of PLS-SEM is maximizing the explained variance (the R2 value) of the endogenous latent variables.
3. For this reason, evaluation of the models (measurement and structural) focuses on the model’s predictive capability.
4. The most important criteria for the measurement models are reliability, convergent validity and discriminant validity.
5. For the structural model, the criteria are R2 (explained variance), f2 (effect size) and Q2 (predictive relevance).
EVALUATION OF THE MODELS
The systematic evaluation follows a two-step process: 1) measurement models and 2) structural model
EVALUATION OF MEASUREMENT MODEL
An unreliable measure can never be valid
1. INTERNAL CONSISTENCY RELIABILITY
• To evaluate internal consistency reliability, researchers use Cronbach’s alpha and composite reliability
RUNNING THE ALGORITHM
REPORT
All partial regression models are estimated by the PLS-SEM algorithm’s iterative procedure. Select a value of at least 300 for the maximum number of iterations.
RESULT – INTERNAL CONSISTENCY
Construct   Cronbach Alpha   Criteria   Result     Composite Reliability (CR)   Criteria     Result
COMP        0.776            0-1        Reliable   0.870                        0.7 to 0.9   Reliable
CUSA        1.000            0-1        Reliable   1.000                        0.7 to 0.9   Reliable
CUSL        0.421            0-1        Reliable   0.692                        0.7 to 0.9   Reliable
LIKE        0.831            0-1        Reliable   0.897                        0.7 to 0.9   Reliable
2. CONVERGENT VALIDITY
• Convergent validity is the extent to which a measure correlates positively with alternative measures of the same construct.
• To evaluate convergent validity, consider the outer loadings of the indicators and the AVE.
RESULT – CONVERGENT VALIDITY
Construct / indicator   Outer loading   Criteria   Result      AVE     Criteria   Result
COMP                                                           0.692   >0.5       fulfilled
  comp_1                0.766           >0.708     fulfilled
  comp_2                0.858                      fulfilled
  comp_3                0.868                      fulfilled
LIKE                                                           0.744              fulfilled
  like_1                0.903                      fulfilled
CUSA                    1.000                      fulfilled   1.000              fulfilled
CUSL                                                           0.530              fulfilled
  cusl_1                -0.002
  cusl_2                0.874                      fulfilled
  cusl_3                0.909                      fulfilled
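As a rough check on where these figures come from, the sketch below recomputes AVE and composite reliability from the standardized outer loadings reported in the slides, using the usual formulas: AVE is the mean of the squared loadings, and CR is the squared sum of loadings divided by that squared sum plus the summed error variances (1 minus each squared loading). The function names are illustrative, and because the loadings are rounded to three decimals the results match the reported values only approximately.

```python
from math import sqrt

def ave(loadings):
    """Average variance extracted: mean of the squared outer loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """Composite reliability from standardized outer loadings."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

# Outer loadings as reported in the result tables.
outer_loadings = {
    "COMP": [0.766, 0.858, 0.868],
    "LIKE": [0.903, 0.859, 0.824],
    "CUSL": [-0.002, 0.874, 0.909],
}

for construct, loadings in outer_loadings.items():
    print(
        construct,
        f"AVE={ave(loadings):.3f}",
        f"CR={composite_reliability(loadings):.3f}",
        f"sqrt(AVE)={sqrt(ave(loadings)):.3f}",
    )
```

The near-zero loading of cusl_1 is what pulls the CUSL values down relative to the other constructs.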
3. DISCRIMINANT VALIDITY
• Discriminant validity is the extent to which a construct is truly distinct from other constructs. It implies that a construct is unique.
• To evaluate discriminant validity, consider the cross loadings and the Fornell & Larcker criterion.
3. RESULT FORNELL & LARCKER CRITERION
Test: Fornell & Larcker
Criteria: Compares the square root of the AVE values with the latent variable correlations. As shown in the table, the square root of each construct's AVE should be greater than its highest correlation with any other construct.
        COMP    CUSA    CUSL    LIKE    AVE     SQRT AVE
COMP    1.000                           0.692   0.832
CUSA    0.119   1.000                   1.000   1.000
CUSL    0.126   0.614   1.000           0.530   0.728
LIKE    0.622   0.045   0.081   1.000   0.744   0.863
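The criterion can be checked mechanically. The sketch below takes the AVE values and latent variable correlations from the table and tests, for each construct, whether the square root of its AVE exceeds its highest correlation with any other construct; the dictionary layout and function name are illustrative choices.

```python
from math import sqrt

# AVE values and latent variable correlations taken from the table.
ave = {"COMP": 0.692, "CUSA": 1.000, "CUSL": 0.530, "LIKE": 0.744}
correlations = {
    ("CUSA", "COMP"): 0.119,
    ("CUSL", "COMP"): 0.126,
    ("CUSL", "CUSA"): 0.614,
    ("LIKE", "COMP"): 0.622,
    ("LIKE", "CUSA"): 0.045,
    ("LIKE", "CUSL"): 0.081,
}

def fornell_larcker(ave, correlations):
    """Return {construct: True/False} for the Fornell & Larcker criterion."""
    result = {}
    for construct, value in ave.items():
        related = [abs(r) for pair, r in correlations.items() if construct in pair]
        result[construct] = sqrt(value) > max(related)
    return result

print(fornell_larcker(ave, correlations))
```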

3. CROSS LOADING
Test: Cross loading
Criteria: The cross-loading report shows whether each indicator's outer loading on its associated construct is greater than all of its loadings on other constructs. This holds for every indicator except cusl_1 (-0.002 on CUSL versus 0.053 on LIKE), so the cross-loading criterion is fulfilled for all other indicators.
Cross Loadings
          COMP    CUSA     CUSL     LIKE
comp_1    0.766   0.082    0.103    0.612
comp_2    0.858   0.113    0.105    0.462
comp_3    0.868   0.100    0.108    0.495
cusa      0.119   1.000    0.614    0.045
cusl_1    0.008   -0.002   -0.002   0.053
cusl_2    0.082   0.504    0.874    0.076
cusl_3    0.139   0.586    0.909    0.070
like_1    0.574   0.056    0.082    0.903
like_2    0.498   0.038    0.060    0.859
like_3    0.534   0.014    0.066    0.824
3. HTMT
HTMT is an estimate of the correlation between the constructs, based on the average of the heterotrait-heteromethod correlations, as suggested by Henseler, Ringle, & Sarstedt (2015).
The HTMT ratio is expected to be lower than 0.90. To examine the HTMT ratio at the 95% confidence interval, we test whether the HTMT values are significantly different from 1.
An HTMT value higher than 0.9 indicates a lack of discriminant validity.
RESULT SUMMARY FOR REFLECTIVE MEASUREMENT MODELS
                        Internal Consistency Reliability   Convergent Validity   Discriminant Validity
Latent      Indicators  Composite      Cronbach            Loadings   AVE        HTMT confidence interval
Variables               Reliability    Alpha                                     doesn’t include 1
                        0.6 – 0.9      0.6 – 0.9           >0.7       >0.5
COMP        Comp_1      0.870          0.776               0.766      0.693      Yes
            Comp_2                                         0.858
            Comp_3                                         0.868
LIKE        Like_1      0.897          0.831               0.903      0.744      Yes
            Like_2                                         0.859
            Like_3                                         0.824
CUSL        Cusl_1      0.692          0.421               -0.002     0.530      Yes
            Cusl_2                                         0.874
            Cusl_3                                         0.909
SESSION 4: EVALUATION OF STRUCTURAL MODEL
Assessment of the structural model results enables you to determine the model’s capability to predict one or more target constructs.
Structural Model Assessment Procedure
1. Collinearity Assessment
2. Path Coefficients
3. Coefficients of Determination (R2 value)
4. Effect Size f2
5. Blindfolding and predictive relevance Q2
6. Effect Size q2
1. COLLINEARITY ASSESSMENT
Collinearity arises when two indicators are highly correlated.
Collinearity among latent variables is assessed through the variance inflation factor (VIF).
The threshold values:
• VIF ≥ 5 indicates a potential collinearity problem (Hair, Ringle & Sarstedt, 2011)
• VIF ≥ 3 indicates a potential collinearity problem (Diamantopoulos & Siguaw, 2006)
1. COLLINEARITY ASSESSMENT - RESULT
Inner VIF Values
        COMP    CUSA    CUSL    LIKE
COMP            1.632   1.655
CUSA                    1.016
CUSL
LIKE            1.632   1.635
2. PATH COEFFICIENT
• The path coefficient is the coefficient linking constructs in the structural model.
• It represents the hypothesized relationship and its strength. For example, a path coefficient close to +1 indicates a strong positive relationship (and vice versa for negative values). The closer the estimated coefficients are to 0, the weaker the relationships.
• Very low values close to 0 are generally not statistically significant.
2. PATH COEFFICIENT - RESULT
2. PATH COEFFICIENT
Whether a coefficient is significant depends on its standard error, which is obtained by bootstrapping. The bootstrap standard error enables computing the empirical t values and p values for all structural path coefficients.
t values
• When an empirical t value is larger than the critical value, we conclude that the coefficient is statistically significant at a certain error probability.
• Commonly used critical values for two-tailed tests are 1.65 (significance level 10%) and 1.96 (significance level 5%).
Path            T Statistics   Critical value   Statistically significant?
COMP -> CUSA    2.962          1.96             Yes
COMP -> CUSL    0.887          1.96             No
CUSA -> CUSL    2.176          1.96             Yes
LIKE -> CUSA    0.183          1.96             No
LIKE -> CUSL    0.362          1.96             No
2. PATH COEFFICIENT
p values
• Most researchers use p values to assess significance levels.
• When assuming a significance level of 5%, the p value must be smaller than 0.05 to conclude that the relationship under consideration is significant at the 5% level.
• When assuming a significance level of 1%, the p value must be smaller than 0.01 to indicate that a relationship is significant.
Path            T Statistics   Critical value   Statistically significant?   P Values   Critical value   Statistically significant?
COMP -> CUSA    2.962          1.96             Yes                          0.003      0.05             Yes
COMP -> CUSL    0.887          1.96             No                           0.376      0.05             No
CUSA -> CUSL    2.176          1.96             Yes                          0.030      0.05             Yes
LIKE -> CUSA    0.183          1.96             No                           0.855      0.05             No
LIKE -> CUSL    0.362          1.96             No                           0.717      0.05             No
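The link between the t values and the p values can be sketched with a standard normal approximation, which is reasonable for large bootstrap samples; SmartPLS itself derives the p values from the bootstrap results, so small differences in the third decimal are to be expected. The function name `two_tailed_p` is an illustrative choice.

```python
from math import erfc, sqrt

def two_tailed_p(t_value):
    """Two-tailed p value under a standard normal approximation."""
    return erfc(abs(t_value) / sqrt(2))

# Empirical t values from the table of path coefficients.
t_statistics = {
    "COMP -> CUSA": 2.962,
    "COMP -> CUSL": 0.887,
    "CUSA -> CUSL": 2.176,
    "LIKE -> CUSA": 0.183,
    "LIKE -> CUSL": 0.362,
}

for path, t in t_statistics.items():
    p = two_tailed_p(t)
    significant = abs(t) > 1.96 and p < 0.05  # both rules agree at the 5% level
    print(f"{path}: t = {t:.3f}, p = {p:.3f}, significant = {significant}")
```

The two decision rules are equivalent here: a t value above 1.96 corresponds to a two-tailed p value below 0.05.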
2. PATH COEFFICIENT - RESULT
• The individual path coefficients can be interpreted as the standardized beta coefficients in an ordinary least squares (OLS) regression.
• A one-unit change in the exogenous construct changes the endogenous construct by the size of the path coefficient when everything else remains constant (ceteris paribus; Hair et al., 2010).
3. COEFFICIENTS OF DETERMINATION (R2 VALUE)
• The R2 value indicates the variance in the endogenous variable explained by the exogenous variables.
• The R2 value ranges from 0 to 1, with higher levels indicating higher levels of predictive accuracy.
• Hair et al. (2011) and Henseler et al. (2009) state that values of 0.75, 0.50 and 0.25 can be described as substantial, moderate and weak.
• Chin (1998) gives values of 0.67, 0.33 and 0.19 as substantial, moderate and weak.
3. COEFFICIENTS OF DETERMINATION (R2 VALUE) - RESULT
        R Square   R Square Adjusted   Result
CUSA    0.015      0.010               Weak
CUSL    0.380      0.375               Moderate
4. EFFECT SIZE f2
• Assessment of effect size allows the researcher to observe the effect of each exogenous construct on the endogenous constructs.
• Cohen’s (1988) guidelines for assessing f2 are that values of
  • 0.02 represent a small,
  • 0.15 a medium and
  • 0.35 a large effect of the exogenous latent variable.
Effect size values of less than 0.02 indicate that there is no effect.
4. EFFECT SIZE f2 - RESULT
        COMP    CUSA    CUSL            LIKE
COMP            0.014   0.001
CUSA                    0.588 (Large)
CUSL
LIKE            0.001   0.001
5. BLINDFOLDING & PREDICTIVE RELEVANCE Q2
• In addition to evaluating the magnitude of the R2 values as a criterion of predictive accuracy, researchers should also examine Stone-Geisser’s Q2 value (Geisser, 1974; Stone, 1974).
• This measure is an indicator of the model’s predictive power or predictive relevance.
• The Q2 value is obtained by using the blindfolding procedure for a specified omission distance D, with a value between 5 and 10.
• Q2 values larger than 0 suggest that the model has predictive relevance for a certain endogenous construct. In contrast, values of 0 and below indicate a lack of predictive relevance.
BLINDFOLDING & PREDICTIVE RELEVANCE Q2 - RESULT
        SSO         SSE         Q² (=1-SSE/SSO)
COMP    1,032.000   1,032.000
CUSA      344.000     365.407   -0.062
CUSL    1,032.000     983.206    0.047
LIKE    1,032.000   1,032.000
Q² values greater than 0 indicate that the model has predictive relevance for that construct. Here this holds for CUSL (0.047) but not for CUSA (-0.062).
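The Q² column follows directly from the formula in the table header. A minimal sketch, using the SSO and SSE values reported for the two endogenous constructs (the function name `q_squared` is illustrative):

```python
def q_squared(sso, sse):
    """Stone-Geisser Q2 from the blindfolding sums of squares: 1 - SSE/SSO."""
    return 1 - sse / sso

# SSO (sum of squared observations) and SSE (sum of squared prediction errors).
blindfolding = {
    "CUSA": (344.000, 365.407),
    "CUSL": (1032.000, 983.206),
}

for construct, (sso, sse) in blindfolding.items():
    q2 = q_squared(sso, sse)
    print(f"{construct}: Q2 = {q2:.3f}, predictive relevance = {q2 > 0}")
```

A negative Q² arises whenever the prediction errors (SSE) exceed the total sum of squares (SSO), as is the case for CUSA.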
6. EFFECT SIZE q2
• The effect size q2 allows assessing an exogenous construct’s contribution to an endogenous latent variable’s Q2 value.
• As a relative measure of predictive relevance, q2 values of
  • 0.02 indicate that an exogenous construct has a small,
  • 0.15 indicate that an exogenous construct has a medium and
  • 0.35 indicate that an exogenous construct has a large
predictive relevance for a certain endogenous construct.
6. EFFECT SIZE q2 - RESULT
Construct Cross-validated Redundancy
        SSO         SSE         Q² (=1-SSE/SSO)
COMP    1,032.000   1,032.000
CUSA      344.000     365.407   -0.062
CUSL    1,032.000     983.206    0.047
LIKE    1,032.000   1,032.000
$q^2 = \frac{Q^2_{\text{included}} - Q^2_{\text{excluded}}}{1 - Q^2_{\text{included}}} = \frac{0.047 - (-0.062)}{1 - 0.047} = 0.115$
Medium predictive relevance
MEDIATION
[Diagram: Y1 → Y2 (p1), Y2 → Y3 (p2), Y1 → Y3 (p3)]
• Mediation occurs when a third mediator variable intervenes between two other related constructs.
• A change in the exogenous construct causes a change in the mediator variable, which in turn results in a change in the endogenous construct in the model.
• Direct effects are the relationships linking two constructs with a single arrow (p3).
• Indirect effects are a sequence of two or more direct effects and are represented by multiple arrows (the Y1 → Y2 → Y3 sequence).
• The indirect effect p1·p2 represents the mediating effect of the construct Y2 on the relationship between Y1 and Y3.
TYPES OF MEDIATION EFFECT
Is p1·p2 significant?
• Yes → Is p3 significant?
  • Yes → Is p1·p2·p3 positive?
    • Yes → Complementary (partial mediation)
    • No → Competitive (partial mediation)
  • No → Indirect only (full mediation)
• No → Is p3 significant?
  • Yes → Direct only (no mediation)
  • No → No effect (no mediation)
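The decision tree translates naturally into a small function. This is only a sketch of the classification logic: the function and parameter names are illustrative, and the significance flags are assumed to come from the bootstrapping results for the indirect effect (p1·p2) and the direct effect (p3).

```python
def mediation_type(indirect_significant, direct_significant, product_positive=None):
    """Classify the mediation effect following the decision tree.

    indirect_significant: is p1*p2 significant?
    direct_significant:   is p3 significant?
    product_positive:     is p1*p2*p3 positive? (needed only if both are significant)
    """
    if indirect_significant and direct_significant:
        if product_positive is None:
            raise ValueError("the sign of p1*p2*p3 is needed when both effects are significant")
        if product_positive:
            return "complementary (partial mediation)"
        return "competitive (partial mediation)"
    if indirect_significant:
        return "indirect only (full mediation)"
    if direct_significant:
        return "direct only (no mediation)"
    return "no effect (no mediation)"

# Example: significant indirect effect, non-significant direct effect.
print(mediation_type(indirect_significant=True, direct_significant=False))
```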
MEDIATING ROLE
• Full Mediation
The independent variable doesn’t have a significant effect on the dependent variable after inclusion of the mediating variable.
• Partial Mediation
The independent variable still has a significant effect on the dependent variable after inclusion of the mediating variable (Zhao, Lynch and Chen, 2010).
TESTING MEDIATING EFFECT
Significance Analysis of the Direct and Indirect Effects
COMP -> CUSL
  Direct effect: 0.033; 95% confidence interval of the direct effect: (-0.025, 0.125); t value: 0.887; significance (p < 0.05)?: 0.376
  Indirect effect: 0.090; 95% confidence interval of the indirect effect: (0.010, 0.207); t value: 1.575; significance (p < 0.05)?: 0.116
  Indirect effect significant? Yes; direct effect significant? No
  Conclusion: Full mediation
LIKE -> CUSL
  Direct effect: 0.033; 95% confidence interval of the direct effect: (-0.126, 0.222); t value: 0.362; significance (p < 0.05)?: 0.717
  Indirect effect: -0.029; 95% confidence interval of the indirect effect: (-0.186, 0.173); t value: 0.314; significance (p < 0.05)?: 0.754
  Indirect effect significant? No; direct effect significant? No
  Conclusion: No effect (confidence intervals include zero)
2. MODERATION
[Diagram: Income moderates the path Customer Satisfaction → Customer Loyalty]
• Moderation describes a situation in which the relationship between two constructs is not constant but depends on the values of a third variable, referred to as a moderator variable.
• The moderator variable changes the strength or even the direction of a relationship between two constructs in the model.
• The relationship between customer satisfaction and customer loyalty differs as a function of the customer’s income.
• Income has a negative effect on the satisfaction–loyalty relationship.
• The higher the income, the weaker the relationship between customer satisfaction and customer loyalty.
HYPOTHESIS
EVALUATION OF MODERATOR VARIABLE
• Measurement & Structural Model
• Size of the moderating effect
• Assess whether the interaction term is significant
• Assess moderator’s f2 effect size
1. FROM DATA TO INFORMATION TO KNOWLEDGE
• What is Information
  • Data that has been processed into a form that gives it meaning
  • Data that has been processed within a context to give it meaning
  • Information is interpreted data. Information is meaningful.
• What is Knowledge
  • Knowledge is the understanding of rules needed to interpret information
“…the capability of understanding the relationship between pieces of information and what to actually do with the information”
Debbie Jones – www.teach-ict.com
