Dr Firdaus Basbeth
Agenda
2. Descriptive Statistics
• Compare variables
Data on its own is meaningless. Data must be interpreted, by a human or a machine, to
derive meaning.
• Example
None of the above data sets have any meaning until they are given a
CONTEXT and PROCESSED into a useable form
Example raw data: 42, 63, 96, 74, 56, 86
[Figure: Data → Processing (tool) → Information]
Population: The entire group of people, events, or things of interest that the researcher wishes
to investigate.
Parameter: A characteristic of the population, such as the mean, standard deviation, or
variance.
Sample: A subset of the population. It comprises some members selected from it. Only
some, but not all, elements of the population form the sample.
Sampling: The process of selecting a sufficient number of elements from the population.
Individual: The people or objects included in the study.
Variable: A characteristic of the individual, object, or condition to be measured or observed.
Independent Variable: One that influences the dependent variable in either a positive or a negative way.
Dependent Variable: The variable of primary interest to the researcher. The researcher's goal is to
understand, describe, explain, or predict the dependent variable.
Data: Measurements that are made on the subjects of an experiment.
Quiz 1: Using basic terminology
Television station TV3 wants to know the proportion of TV owners in Kota Bharu who
watch the station’s new program at least once a week. The station asks a group of 1000
TV owners whether they watch the program at least once a week.
a) Identify the individuals of the study and the variable.
The individuals are the 1000 TV owners surveyed. The variable is the response: does or does not watch the
program at least once a week.
d) Is the proportion of viewers in the sample who watch the new program at least once a
week a statistic or a parameter? A statistic.
How Variables Are Measured
• To establish relationships between variables, researchers must observe
the variables and record their observations. This requires that the
variables be measured.
• The process of measuring a variable requires a set of categories called a
scale of measurement.
• These scales are classified into four types of measurement scale: nominal,
ordinal, interval, and ratio.
1. Nominal Scale
Here nominal should be associated with the word name since this scale
identifies names, labels or categories
Examples:
• Gender
• Marital Status
• State of Residence
• College Major
• Zip Code
• Student ID
2. Ordinal Scale
The ordinal scale (think of the word order) applies to data that can be
arranged in order (1, 2, 3, and so on). However, differences between data
values either cannot be determined or are meaningless.
Examples
• Miss America results: first place, runner up, third
• Military rank
• Degree held
3. Interval Scale
An interval scale represents a higher level of measurement than the ordinal scale, with the
additional property that the distance between observations is meaningful.
Examples
• Celsius scale of temperature measurement
• Questionnaires
4. Ratio Scale
A ratio scale is an interval scale with the added property that ratios of
observations are meaningful.
It is the only scale that allows you to make ratio comparisons, such as “Susana’s
income is 35% more than Suhaila’s”.
Examples:
• Weight of a package of books
• Number of questions correct on a quiz
• Distance from KL to Kota Bharu
QUIZ 2: Identify the type of data
For each data set below, state the type of data.
1. Taos, Acoma, Zuni, and Cochiti are the names of four Native
American pueblos from the population of names of all Native
American pueblos in Arizona and New Mexico.
2. In a high school graduating class of 319 students, Saiful ranked 25th, Hafiz
ranked 19th, Robby ranked 10th, and Tava ranked 4th, where 1 is the
highest rank.
3. Body temperatures (degrees Celsius) of trout in the Java Sea.
4. Length of trout swimming in the Java Sea.
Branches of Statistics
Descriptive: Describes a set of data by graphically displaying the
information, or by describing its means and standard deviation.
Inferential: Tries to infer information about a population by using
information gathered by sampling (estimation of parameters and
hypothesis testing).
[Figure: Branches of Statistics diagram. Recoverable labels: CB-SEM (AMOS) and PLS-SEM (SmartPLS)]
RESEARCH ACTIVITIES USING STATISTICS
1. (2) Pilot test: check validity and reliability of questionnaires
   Criteria: Pearson correlation (0.3) and Cronbach’s alpha (0.7)
   Tools: SPSS
   Branch of statistics: Correlation
   Expected result: Valid and reliable
2. (3) Data preparation: check normality, outliers, missing values
   Criteria: Histogram, normal distribution, skewness, kurtosis, Kolmogorov-Smirnov
   significance, Q-Q plot, box plot
   Tools: SPSS
   Branch of statistics: Descriptive
   Expected result: Normal distribution, no outliers, no missing values
3. (1) Descriptive analysis
   Criteria: Mean, SD, frequency, percentage
   Tools: Excel, SPSS
   Branch of statistics: Descriptive
   Expected result: Clear, simple, meaningful information
4. (4) Hypothesis testing
   Criteria: Path coefficient, t value, p value, bias-corrected confidence interval,
   measurement and structural model
   Tools: PLS-SEM (SmartPLS 3), CB-SEM (AMOS)
   Branch of statistics: Partial least squares / regression
   Expected result: Measurement model and structural model fulfill the criteria;
   conclusion of hypothesis
Descriptive Statistics
A. Measure of Central Tendency
The three most common measures of central tendency are the
arithmetic mean, the mode, and the median.
• Mean
Example
• Mode
• Median
B. Measure of Variation
• Range
• Standard Deviation
• Variance
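The measures listed above can be illustrated with a minimal sketch that uses Python's standard `statistics` module on the raw data values quoted earlier in these slides. The choice of the sample (n - 1) formulas for variance and standard deviation is an assumption; the slides do not say which version is intended.

```python
import statistics

# Raw data values quoted earlier in these slides.
scores = [42, 63, 96, 74, 56, 86]

mean = statistics.mean(scores)            # arithmetic mean
median = statistics.median(scores)        # middle of the sorted values
modes = statistics.multimode(scores)      # all values tied for most frequent
value_range = max(scores) - min(scores)   # largest minus smallest value
variance = statistics.variance(scores)    # sample variance (n - 1 denominator)
std_dev = statistics.stdev(scores)        # sample standard deviation

print("mean:", mean, "median:", median, "range:", value_range)
print("modes:", modes)
print("variance:", round(variance, 2), "std dev:", round(std_dev, 2))
```

Every value occurs exactly once in this data set, so `multimode` returns all six values; in other words, the data have no single mode.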
Descriptive Statistics
Respondent profile (frequency, percentage):
Gender: Female 126 (37.8%), Male 207 (62.2%)
Chinese 11 (3.3%), Indian 13 (3.9%), Other 5 (1.5%)
[Figure: bar chart of mean values (axis from 3.3 to 4.1) for Transformational Leadership, Emotional Intelligence, and Organizational Culture]
Use SPSS
[Table: descriptive statistics for TL, produced with SPSS]
Pilot Test: Validity and Reliability Test
Correlation: deals with the magnitude and direction of a relationship
Prediction and relationship
Table 1. IQ and Grade Points
Instead of just summing Y − Y′ (the difference between each actual and predicted score), we
first compute (Y − Y′)² for each score.
This removes the negative values, and if we minimize Σ(Y − Y′)²
we minimize the total error of
prediction.
Use SPSS
Correlation and Regression
• Correlation and regression are closely related.
• Correlation is concerned with the magnitude and direction of the
relationship.
• Regression focuses on using the relationship for prediction.
• If the relationship is perfect, all the points fall on a straight line, and all we
need to do is derive the equation of the straight line and use it for
prediction.
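As an illustration of using a relationship for prediction, here is a minimal least-squares sketch in plain Python. The function names and the IQ and grade-point values are hypothetical (the data of Table 1 is not reproduced in these slides); the numbers were chosen to lie on a perfect straight line so that the error of prediction is zero.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y' = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_error(x, y, slope, intercept):
    """Total squared error of prediction: the quantity least squares minimizes."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical values, not taken from the slides.
iq = [100, 110, 120, 130]
grade_points = [2.0, 2.5, 3.0, 3.5]

slope, intercept = fit_line(iq, grade_points)
error = sum_squared_error(iq, grade_points, slope, intercept)
predicted = intercept + slope * 125  # prediction for a new IQ score
print(round(slope, 3), round(intercept, 3), round(error, 6), round(predicted, 3))
```

With real data the points scatter around the line, so the total squared error is greater than zero and the fitted line is the one that makes it as small as possible.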
Correlation and Regression
• Correlation is a bivariate analysis that measures the strength of
association between two variables and the direction of the relationship.
In terms of the strength of relationship, the value of the correlation
coefficient varies between +1 and -1. A value of ± 1 indicates a perfect
degree of association between the two variables.
• As the correlation coefficient value goes towards 0, the relationship
between the two variables will be weaker. The direction of the
relationship is indicated by the sign of the coefficient; a + sign indicates a
positive relationship and a – sign indicates a negative relationship.
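For reference, the Pearson version of the correlation coefficient described above is usually written as follows. The symbols (x_i and y_i for the paired observations, x̄ and ȳ for their means, n for the number of pairs) are notation introduced here, not taken from the slides.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when the two variables tend to move together and negative when they move in opposite directions; the denominator scales the result so that it stays between -1 and +1.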
Correlation
• In SPSS we can measure the degree of relationship between two variables.
• For example, is there a relationship between the salary of customers and the
amount of money they are willing to spend on a television?
• In correlation there are no independent and dependent variables; you
simply measure the variables.
• However, an important point to remember here is that correlation does
not imply causation.
Correlation (Pearson, Kendall, Spearman)
Three commonly used types of correlation are the Pearson
correlation, the Kendall rank correlation, and the Spearman correlation.
Types of research questions a Pearson correlation can examine:
• Is there a statistically significant relationship between age, as measured in
years, and height, measured in inches?
• Is there a relationship between temperature, measured in degrees
Fahrenheit, and ice cream sales, measured by income?
• Is there a relationship between job satisfaction, as measured by the JSS,
and income, measured in dollars?
Correlation (Pearson, Kendall, Spearman)
Assumptions
• For the Pearson r correlation, both variables should be normally
distributed (normally distributed variables have a bell-shaped curve).
Other assumptions include linearity and homoscedasticity. Linearity
assumes a straight line relationship between each of the two variables and
homoscedasticity assumes that data is equally distributed about the
regression line.
Kendall rank and Spearman correlation:
Assumptions
• The Kendall rank and Spearman correlations assume that the data are at
least ordinal.
A. QUESTIONNAIRE VALIDITY
1. Copy and paste the respondents' answers to each statement into the SPSS
Data View, together with the total score of each variable.
2. Click Analyze – Correlate – Bivariate.
3. Move all statements into the “Variables” box, along with the total score
of the variable.
4. Leave the options at their defaults (Pearson correlation).
5. Click OK.
Reading the output:
1. In the output table, look at the last column on the right, which shows the
correlation of each statement with its total score (first row). This value is
the validity coefficient (Pearson r).
2. If the value is > 0.300, the statement is considered valid.
3. If the value is < 0.300, the statement is considered not valid.
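The same validity check can be sketched outside SPSS. The snippet below correlates each statement with the total score and applies the 0.300 cut-off from the steps above. The function names and the response data are hypothetical, and the total here is simply the sum of the listed statements.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_validity(items, threshold=0.3):
    """Map each statement to (r with the total score, True if r > threshold)."""
    totals = [sum(answers) for answers in zip(*items.values())]
    result = {}
    for name, scores in items.items():
        r = pearson_r(scores, totals)
        result[name] = (r, r > threshold)
    return result


# Hypothetical answers of five respondents to three statements.
answers = {
    "q1": [5, 4, 3, 2, 1],
    "q2": [5, 4, 3, 2, 1],
    "q3": [1, 5, 2, 4, 3],
}
validity = item_validity(answers)
for name, (r, valid) in validity.items():
    print(name, round(r, 3), "valid" if valid else "not valid")
```

In this made-up data, q3 correlates only weakly with the total score and would be flagged as not valid.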
Correlations (N = 30 in every cell)
                            X1 Entrepreneurial   X2 Innovation   Y1 Competitive   Z1 Business
                            Orientation                          Strategy         Performance
INV1  Pearson Correlation   .448*                -.028           -.303            -.260
      Sig. (2-tailed)       .013                 .884            .103             .165
KV1   Pearson Correlation   .070                 .740**          .356             .481**
      Sig. (2-tailed)       .712                 .000            .053             .007
CL1   Pearson Correlation   .012                 .294            .730**           .614**
      Sig. (2-tailed)       .948                 .115            .000             .000
CS    Pearson Correlation   -.209                .109            .403*            .519**
      Sig. (2-tailed)       .268                 .565            .027             .003
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
B. QUESTIONNAIRE RELIABILITY
1. Click Analyze – Scale – Reliability Analysis.
2. Move all statements into the “Variables” box, without the total score of
the variable.
3. Leave the options at their defaults (Alpha).
4. Click OK.
In the output table, the reliability coefficient is given by the value of
Cronbach’s Alpha.
If the value is > 0.700, the variable is considered reliable;
if it is < 0.700, it is considered not reliable.
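For readers who want to see what the reliability coefficient measures, here is a minimal sketch of Cronbach's alpha in plain Python, using the standard formula alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The formula is not stated in the slides, and the function name and response data are hypothetical.

```python
import statistics


def cronbach_alpha(items):
    """items: one list of scores per statement, all in the same respondent order."""
    k = len(items)
    sum_item_variances = sum(statistics.variance(scores) for scores in items)
    totals = [sum(answers) for answers in zip(*items)]   # total score per respondent
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical answers of five respondents to three statements.
responses = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 3, 4],
]
alpha = cronbach_alpha(responses)
is_reliable = alpha > 0.7   # cut-off used in these slides
print(round(alpha, 3), "reliable" if is_reliable else "not reliable")
```

Alpha rises when the statements vary together, because the variance of the total score then grows faster than the sum of the individual item variances.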
PILOT TEST: TEST FOR VALIDITY (PEARSON R)
AND RELIABILITY (CRONBACH’S ALPHA)
Use SPSS
3. Data Preparation: Normality, Outliers, Missing Values
Assessing Normality (Use SPSS)
• Click Analyze – Descriptive Statistics – Explore.
• Select the variable (for example, age) and move it to the Dependent List box.
• In the Display section, make sure that Both is selected.
• Click on Statistics and select Descriptives and Outliers.
• Click on Continue.
• Click on the Plots button. Under Descriptive, click on Histogram. Click on Normality
plots with tests. Click on Continue.
• Click on the Options button. In the Missing Values section, click on Exclude cases
pairwise. Click on Continue and then OK.
• If the Sig. value of the normality test is > 0.05, the data can be treated as normally distributed.
WHAT IS PLS-SEM
PLS-SEM: primarily exploratory
CB-SEM: primarily confirmatory
WHEN TO USE PLS-SEM
• The goal is predicting target constructs.
• The structural model is complex.
• The sample size is small.
• The data are not normally distributed.
• The measurement scale is nominal, ordinal, or interval.
• Constructs are formatively measured.
MINIMUM SAMPLE REQUIREMENTS (HAIR, 2017)
Rules of thumb
COHEN (1992)
Sample size recommendation in PLS-SEM for a statistical power
of 80%
PLS-SEM PATH MODELING
[Figure: PLS-SEM path model with indicators (items) attached to the constructs]
REFLECTIVE MEASUREMENT MODEL
FORMATIVE MEASUREMENT MODEL
1. MODEL CREATION USING SMARTPLS
Specifying the Structural Model
Prepare a diagram that connects the constructs based on
theory, to visually display the hypotheses that will be
tested.
Establish the relationships between them by drawing
arrows.
[Figure: Reputation → Customer Loyalty]
[Figure: mediation. Reputation → Satisfaction (path a) and Satisfaction → Customer Loyalty (path b); indirect effect = a * b. Reputation → Customer Loyalty (path c) is the direct effect.]
MODERATION
[Figure: Income moderates the path Customer Satisfaction → Customer Loyalty]
MODEL ESTIMATION & PLS-SEM ALGORITHM
1. Model estimation delivers empirical measures of the relationships
between the indicators and the constructs, as well as between the
constructs. We can determine how well the theory fits the data.
2. PLS-SEM results are reviewed and evaluated using a systematic
process. The goal of PLS-SEM is to maximize the explained
variance (the R² value) of the endogenous latent variables.
3. For this reason, evaluation of the model (measurement and
structural) focuses on the model's predictive capability.
4. The most important criteria for the measurement models are reliability,
convergent validity, and discriminant validity.
5. For the structural model, the criteria are R² (explained variance), f² (effect
size), and Q² (predictive relevance).
EVALUATION OF THE MODELS
The systematic evaluation follows a two-step process:
1) measurement models and 2) structural model.
EVALUATION OF MEASUREMENT MODEL
An unreliable measure can never be valid.
1. INTERNAL CONSISTENCY RELIABILITY
• To evaluate internal consistency reliability, researchers use
Cronbach’s alpha and composite reliability.
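Composite reliability can be computed from the standardized outer loadings. The sketch below uses the commonly cited formula (sum of loadings)² / ((sum of loadings)² + sum of (1 - loading²)); the formula is an assumption in the sense that these slides do not state it, and the function name is hypothetical. The loadings are the COMP outer loadings reported in the convergent validity results of this deck.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized outer loadings."""
    sum_loadings = sum(loadings)
    # Error variance of a standardized indicator is 1 - loading squared.
    error_variance = sum(1 - loading ** 2 for loading in loadings)
    return sum_loadings ** 2 / (sum_loadings ** 2 + error_variance)


# COMP outer loadings as reported in this deck (comp_1, comp_2, comp_3).
comp_loadings = [0.766, 0.858, 0.868]
comp_cr = composite_reliability(comp_loadings)
print(round(comp_cr, 3))
```

Unlike Cronbach's alpha, this measure weights each indicator by its own loading instead of treating all indicators as equally reliable.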
RUNNING THE ALGORITHM
REPORT
All partial regression models are estimated by the PLS-
SEM algorithm’s iterative procedures. Select a value of at
least 300 for the maximum number of iterations.
RESULT – INTERNAL CONSISTENCY
2. CONVERGENT VALIDITY
• Convergent validity is the extent to which a measure correlates
positively with alternative measures of the same construct.
• To evaluate convergent validity, consider the outer loadings of
the indicators and the average variance extracted (AVE).
RESULT – CONVERGENT VALIDITY
Criteria: outer loading > 0.708; AVE > 0.5
Construct / indicator   Outer loading   Result      AVE     Result
COMP                                                0.692   fulfilled
  comp_1                0.766           fulfilled
  comp_2                0.858           fulfilled
  comp_3                0.868           fulfilled
LIKE                                                0.744   fulfilled
  like_1                0.903           fulfilled
  like_2                0.859           fulfilled
  like_3                0.824           fulfilled
CUSA                    1.000           fulfilled   1.000   fulfilled
CUSL                                                0.530   fulfilled
  cusl_1                -0.002
  cusl_2                0.874           fulfilled
  cusl_3                0.909           fulfilled
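The AVE values in the result table can be reproduced from the outer loadings: AVE is the mean of the squared loadings of a construct's indicators. The sketch below is a minimal illustration with hypothetical function and variable names; the loadings and the 0.708 and 0.5 criteria are the ones shown in the table.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized outer loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Outer loadings as reported in the result table of this deck.
outer_loadings = {
    "COMP": {"comp_1": 0.766, "comp_2": 0.858, "comp_3": 0.868},
    "LIKE": {"like_1": 0.903, "like_2": 0.859, "like_3": 0.824},
    "CUSL": {"cusl_1": -0.002, "cusl_2": 0.874, "cusl_3": 0.909},
}

ave = {}
weak_indicators = []
for construct, indicators in outer_loadings.items():
    ave[construct] = average_variance_extracted(list(indicators.values()))
    # Collect indicators that miss the 0.708 loading criterion.
    weak_indicators += [name for name, value in indicators.items() if value <= 0.708]

for construct, value in ave.items():
    print(construct, round(value, 3), "fulfilled" if value > 0.5 else "not fulfilled")
print("below 0.708:", weak_indicators)
```

The calculation also shows why CUSL only just passes the AVE criterion: cusl_1 contributes almost nothing to the average.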
3. DISCRIMINANT VALIDITY
• Discriminant Validity is the extent to which a construct is truly
distinct from other constructs. It implies that a construct is
unique.
• To evaluate discriminant validity, consider the cross-loadings and the
Fornell & Larcker criterion.
3. RESULT: FORNELL & LARCKER CRITERION
Test: Fornell & Larcker
Criteria: Compares the square root of the AVE values with the latent
variable correlations. As shown in the table, the square root of each
construct's AVE should be greater than its highest correlation with
any other construct.
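The Fornell & Larcker comparison can be sketched as follows. The AVE values are those reported in this deck, but the latent variable correlations are hypothetical, since the correlation table itself is not reproduced here. Comparing against the absolute value of each correlation is also an assumption.

```python
import math


def fornell_larcker(ave, correlations):
    """Map each construct to True if sqrt(AVE) exceeds its highest correlation."""
    results = {}
    for construct, value in ave.items():
        related = [abs(r) for pair, r in correlations.items() if construct in pair]
        results[construct] = math.sqrt(value) > max(related)
    return results


# AVE values as reported in this deck.
ave_values = {"COMP": 0.692, "LIKE": 0.744, "CUSL": 0.530}

# Hypothetical latent variable correlations, for illustration only.
latent_correlations = {
    ("COMP", "LIKE"): 0.65,
    ("COMP", "CUSL"): 0.45,
    ("LIKE", "CUSL"): 0.60,
}

check = fornell_larcker(ave_values, latent_correlations)
print(check)
```

With these illustrative correlations every construct passes; a correlation of 0.80 between LIKE and CUSL would make CUSL fail, because the square root of its AVE is only about 0.73.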
1. COLLINEARITY ASSESSMENT
Collinearity arises when two indicators are highly correlated.
Collinearity among latent variables is assessed through the
Variance Inflation Factor (VIF).
The threshold value:
VIF ≥ 5 indicates a potential collinearity problem (Hair,
Ringle & Sarstedt, 2011)
VIF ≥ 3 indicates a potential collinearity problem
(Diamantopoulos & Siguaw, 2006)
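As a brief illustration, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors. This definition is standard but is not spelled out in the slides; the function names and R² values below are hypothetical.

```python
def variance_inflation_factor(r_squared):
    """VIF from the R-squared of one predictor regressed on the other predictors."""
    return 1.0 / (1.0 - r_squared)


def has_collinearity_problem(vif, threshold=5.0):
    """Apply a threshold such as 5 (or the stricter 3) from the slide above."""
    return vif >= threshold


# Hypothetical R-squared values, for illustration only.
for r2 in (0.50, 0.70, 0.90):
    vif = variance_inflation_factor(r2)
    print(r2, round(vif, 2), has_collinearity_problem(vif), has_collinearity_problem(vif, 3.0))
```

A predictor that is 90% explained by the others has a VIF of about 10 and fails both thresholds, while one that is 70% explained (VIF about 3.3) fails only the stricter threshold of 3.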
1. COLLINEARITY ASSESSMENT - RESULT
Inner VIF Values
2. PATH COEFFICIENT
A path coefficient is the coefficient linking constructs in the
structural model.
It represents the hypothesized relationship, or the strength
of the relationship. For example, a path coefficient close to
+1 indicates a strong positive relationship (and vice versa
for negative values). The closer the estimated
coefficients are to 0, the weaker the relationships.
Very low values close to 0 are generally not statistically
significant.
2. PATH COEFFICIENT - RESULT
2. PATH COEFFICIENT
Whether a coefficient is significant depends on its
standard error, which is obtained by bootstrapping. Bootstrapping
makes it possible to compute the empirical t values and p values for all
structural path coefficients.
t values
When an empirical t value is larger than the critical
value, we conclude that the coefficient is statistically
significant at a certain error probability.
Commonly used critical values for two-tailed tests are
1.65 (significance level 10%) and 1.96 (significance level 5%).
Path            T statistics   Critical value   Statistically significant?
COMP -> CUSA    2.962          1.96             Yes
COMP -> CUSL    0.887          1.96             No
CUSA -> CUSL    2.176          1.96             Yes
LIKE -> CUSA    0.183          1.96             No
LIKE -> CUSL    0.362          1.96             No
2. PATH COEFFICIENT
p values
Most researchers use p values to assess significance levels.
When assuming a significance level of 5%, the p value must
be smaller than 0.05 to conclude that the relationship under
consideration is significant at the 5% level.
When assuming a significance level of 1%, the p value must
be smaller than 0.01 to indicate that a relationship is significant.
Path            T statistics   Critical value   Significant?   P value   Critical value   Significant?
COMP -> CUSA    2.962          1.96             Yes            0.003     0.05             Yes
COMP -> CUSL    0.887          1.96             No             0.376     0.05             No
CUSA -> CUSL    2.176          1.96             Yes            0.030     0.05             Yes
LIKE -> CUSA    0.183          1.96             No             0.855     0.05             No
LIKE -> CUSL    0.362          1.96             No             0.717     0.05             No
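The two decision rules (t value against 1.96, p value against 0.05) should always agree. The sketch below applies both to the t statistics reported in this deck. The p values here are approximated from the standard normal distribution, which is an assumption made for illustration; the p values in the table come from the bootstrapping output and can differ slightly in the third decimal.

```python
from statistics import NormalDist

# T statistics as reported in this deck.
t_statistics = {
    "COMP -> CUSA": 2.962,
    "COMP -> CUSL": 0.887,
    "CUSA -> CUSL": 2.176,
    "LIKE -> CUSA": 0.183,
    "LIKE -> CUSL": 0.362,
}


def two_tailed_p(t_value):
    """Approximate two-tailed p value using the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t_value)))


def is_significant(t_value, critical_value=1.96):
    """True if the empirical t value exceeds the critical value."""
    return abs(t_value) > critical_value


decisions = {}
for path, t_value in t_statistics.items():
    p_value = two_tailed_p(t_value)
    decisions[path] = is_significant(t_value)
    print(path, t_value, round(p_value, 3), "Yes" if decisions[path] else "No")
```

Only COMP -> CUSA and CUSA -> CUSL exceed the critical value, and the same two paths are the only ones with p below 0.05.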
2. PATH COEFFICIENT - RESULT
The individual path coefficients can be interpreted like the
standardized beta coefficients in an ordinary least squares
(OLS) regression.
A one-unit change in the exogenous construct changes
the endogenous construct by the size of the path coefficient
when everything else remains constant (ceteris paribus; Hair
et al., 2010).
3. COEFFICIENT OF DETERMINATION (R² VALUE)
The R² value indicates the variance of the
endogenous variable explained by the exogenous variables.
The R² value ranges from 0 to 1, with higher levels
indicating higher levels of predictive accuracy.
Hair et al. (2011) and Henseler et al. (2009) suggest that
values of 0.75, 0.50, and 0.25 can be described as substantial,
moderate, and weak, respectively.
Chin (1998) suggests values of 0.67, 0.33, and 0.19
as substantial, moderate, and weak, respectively.
3. COEFFICIENT OF DETERMINATION (R² VALUE) - RESULT
Construct   R Square   R Square Adjusted   Assessment
CUSA        0.015      0.010               Weak
CUSL        0.380      0.375               Moderate
4. EFFECT SIZE f²
Assessment of effect size allows the researcher to observe the
effect of each exogenous construct on an endogenous construct.
Cohen's (1988) guidelines for assessing f² are that values of
0.02 represent a small effect,
0.15 a medium effect, and
0.35 a large effect of the exogenous latent variable.
Effect size values of less than 0.02 indicate that there is no effect.
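A minimal sketch of the effect size calculation follows, assuming the usual definition f² = (R² included - R² excluded) / (1 - R² included), which is not written out in the slides. The function names are hypothetical. The R² of 0.380 for CUSL is the value reported in this deck, while the R² with one predictor removed (0.300) is made up for illustration.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size of one exogenous construct on an endogenous construct."""
    return (r2_included - r2_excluded) / (1 - r2_included)


def effect_label(f2):
    """Classify f2 using the 0.02 / 0.15 / 0.35 guidelines from the slide above."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "no effect"


# 0.380 is the CUSL R-squared from this deck; 0.300 is a hypothetical value
# for the model re-estimated without one exogenous construct.
f2_example = f_squared(0.380, 0.300)
print(round(f2_example, 3), effect_label(f2_example))
```

The same two-model comparison is repeated once for each exogenous construct, each time leaving only that construct out.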
4. EFFECT SIZE f² - RESULT
5. BLINDFOLDING & PREDICTIVE RELEVANCE Q²
In addition to evaluating the magnitude of the R² values as a
criterion of predictive accuracy, researchers should also
examine Stone-Geisser’s Q² value (Geisser, 1974; Stone, 1974).
This measure is an indicator of the model’s predictive power or
predictive relevance.
The Q² value is obtained by using the blindfolding procedure for a
specified omission distance D, with a value between 5 and 10.
A Q² value larger than 0 suggests that the model has predictive
relevance for a certain endogenous construct. In contrast,
values of 0 and below indicate a lack of predictive relevance.
BLINDFOLDING & PREDICTIVE RELEVANCE Q² - RESULT
Columns: SSO, SSE, Q² (= 1 - SSE/SSO)
Q² values greater than 0 indicate that the model has predictive relevance.
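The Q² column is a direct calculation from the two sums of squares in the blindfolding output. In the sketch below, the function names and the SSO and SSE figures are hypothetical, since the result table values are not reproduced in these slides.

```python
def q_squared(sso, sse):
    """Stone-Geisser Q2 from the sum of squared observations (SSO)
    and the sum of squared prediction errors (SSE)."""
    return 1 - sse / sso


def has_predictive_relevance(q2):
    """Q2 above 0 suggests predictive relevance for the construct."""
    return q2 > 0


# Hypothetical blindfolding output, for illustration only.
q2_good = q_squared(sso=1000.0, sse=900.0)    # errors smaller than SSO
q2_poor = q_squared(sso=1000.0, sse=1050.0)   # errors larger than SSO
print(round(q2_good, 3), has_predictive_relevance(q2_good))
print(round(q2_poor, 3), has_predictive_relevance(q2_poor))
```

A negative Q² arises when the prediction errors are larger than the variation being predicted.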
6. EFFECT SIZE q²
The effect size q² allows assessing an exogenous
construct’s contribution to an endogenous latent variable’s
Q² value.
As a relative measure of predictive relevance, q² values of
0.02, 0.15, and 0.35 indicate that an exogenous construct has a
small, medium, or large predictive relevance, respectively,
for a certain endogenous construct.
6. EFFECT SIZE q² - RESULT
Construct cross-validated redundancy:
q² = (0.047 - (-0.062)) / (1 - 0.047) = 0.115 (medium predictive relevance)
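The q² calculation on this slide can be checked with a few lines of Python. The function name is hypothetical, and reading 0.047 as the Q² with the exogenous construct included and -0.062 as the Q² with it excluded is an interpretation of the slide's layout.

```python
def q_effect_size(q2_included, q2_excluded):
    """q2 effect size from Q2 with and without one exogenous construct."""
    return (q2_included - q2_excluded) / (1 - q2_included)


# Q2 values as printed on the slide.
q_effect = q_effect_size(q2_included=0.047, q2_excluded=-0.062)
print(round(q_effect, 4))
```

With the rounded inputs printed on the slide this evaluates to about 0.114, in line with the reported 0.115; the small gap is presumably due to rounding of the inputs.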
MEDIATION
[Figure: mediation model. Y1 → Y2 (path p1), Y2 → Y3 (path p2), and the direct path Y1 → Y3 (path p3)]
TYPES OF MEDIATION EFFECT
Is p1·p2 significant?
• Yes: Is p3 significant?
  – Yes: Is p1·p2·p3 positive?
    Yes: Complementary (partial mediation)
    No: Competitive (partial mediation)
  – No: Indirect only (full mediation)
• No: Is p3 significant?
  – Yes: Direct only (no mediation)
  – No: No effect (no mediation)
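The decision tree above translates directly into a small function. This is a sketch with hypothetical names; it takes the indirect effect (p1·p2), the direct effect (p3), and the two significance judgments as inputs, and uses the sign of indirect × direct as the sign of p1·p2·p3.

```python
def mediation_type(indirect_effect, direct_effect, indirect_significant, direct_significant):
    """Classify the mediation effect following the decision tree above."""
    if indirect_significant:
        if not direct_significant:
            return "indirect only (full mediation)"
        if indirect_effect * direct_effect > 0:   # sign of p1 * p2 * p3
            return "complementary (partial mediation)"
        return "competitive (partial mediation)"
    if direct_significant:
        return "direct only (no mediation)"
    return "no effect (no mediation)"


# Effects and significance judgments as reported later in this deck.
comp_cusl = mediation_type(0.090, 0.033, indirect_significant=True, direct_significant=False)
like_cusl = mediation_type(-0.029, 0.033, indirect_significant=False, direct_significant=False)
print("COMP -> CUSL:", comp_cusl)
print("LIKE -> CUSL:", like_cusl)
```

The function only encodes the classification; the significance judgments themselves come from the bootstrapping results.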
MEDIATING ROLE
Full Mediation
The independent variable does not have a significant effect on the
dependent variable after inclusion of the mediator variable.
Partial Mediation
The independent variable still has a significant effect on the
dependent variable after inclusion of the mediator variable
(Zhao, Lynch and Chen, 2010).
MEDIATING ROLE
TESTING MEDIATING EFFECT
Significance Analysis of the Direct and Indirect Effects
COMP -> CUSL
  Direct effect: 0.033, 95% confidence interval (-0.025, 0.125), t value 0.887, p value 0.376
  Indirect effect: 0.090, 95% confidence interval (0.010, 0.207), t value 1.575, p value 0.116
  Indirect effect significant? Yes. Direct effect significant? No.
  Conclusion: Full mediation
LIKE -> CUSL
  Direct effect: 0.033, 95% confidence interval (-0.126, 0.222), t value 0.362, p value 0.717
  Indirect effect: -0.029, 95% confidence interval (-0.186, 0.173), t value 0.314, p value 0.754
  Indirect effect significant? No. Direct effect significant? No (the intervals include zero).
  Conclusion: No effect
2. MODERATION
[Figure: Income moderates the path Customer Satisfaction → Customer Loyalty, shown next to the unmoderated path Customer Satisfaction → Customer Loyalty]
HYPOTHESIS
EVALUATION OF MODERATOR VARIABLE
• Measurement & structural model
• Size of the moderating effect
• Assess whether the interaction term is significant
• Assess the moderator’s f² effect size
1. FROM DATA TO INFORMATION TO KNOWLEDGE
• What is Information
Data that has been processed into a form that gives it meaning.
Data that has been processed within a context to give it meaning.
Information is interpreted data. Information is meaningful.
• What is Knowledge
Knowledge is the understanding of the rules needed to interpret information.
“…the capability of understanding the relationship between
pieces of information and what to actually do with the
information”
Debbie Jones – www.teach-ict.com