
SPSS:

SPSS Commands and Interpreting Statistics


Frequency Distributions
We use frequency distributions to determine the frequency, or number of people,
that fall into a certain category. For example, if we classified those running for
senator or governor as Democrat or Republican, a frequency distribution would
allow us to determine the percentage that were Democrat and the percentage that
were Republican.
In our data file, the variable we used to list candidates as Republican or Democrat
was party.
1. Go to Analyze > Descriptive Statistics > Frequencies.
2. Double click on party and then click OK.


Interpreting Frequency Distributions
1. As you can see, the two parties are listed below: Democrat and Republican. The
Missing category simply reflects the candidates whose party affiliation we could not
determine.
2. Under the Frequency column, we have the number of candidates that were Democrat
(186), Republican (280), or unclassified or missing (5).
3. Finally, we typically use the Valid Percent column in determining the frequency
distribution of Democrats and Republicans because it leaves out
those cases where we could not assign a category. In this case, 39.9 percent were
Democrat and 60.1 percent were Republican. Clearly, there is a greater number of
Republican than Democratic candidates.

Political Party
                       Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Democrat        186        39.5        39.9              39.9
          Republican      280        59.4        60.1             100.0
          Total           466        98.9       100.0
Missing   9.00              5         1.1
Total                     471       100.0
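The Percent, Valid Percent, and Cumulative Percent columns can be checked by hand from the Frequency column. The following is a minimal Python sketch of that arithmetic, not SPSS's own procedure; the function name `frequency_rows` is made up for this illustration, and the counts are the ones reported in the table above.

```python
# Counts taken from the Political Party frequency table.
counts = {"Democrat": 186, "Republican": 280}
missing = 5  # candidates whose party could not be determined


def frequency_rows(counts, missing):
    """Return (label, frequency, percent, valid percent, cumulative percent) rows."""
    valid_total = sum(counts.values())   # cases with a known party
    grand_total = valid_total + missing  # all cases, including missing
    rows = []
    running = 0
    for label, n in counts.items():
        running += n
        rows.append((
            label,
            n,
            round(100 * n / grand_total, 1),        # Percent: missing cases count in the base
            round(100 * n / valid_total, 1),        # Valid Percent: missing cases left out
            round(100 * running / valid_total, 1),  # Cumulative Percent of valid cases
        ))
    return rows


rows = frequency_rows(counts, missing)
for row in rows:
    print(row)
```

The only difference between Percent and Valid Percent is the denominator: 471 (all cases) versus 466 (cases with a known party).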


Chi Square Test
Often we have two nominal level variables (gender, party affiliation, or ethnicity, for
example) and we need to determine if a relationship exists between them. For
example, we may want to know if ethnicity is related to party affiliation. We suspect
this is the case, and we hypothesize that minorities are associated with the
Democratic Party and whites with the Republican Party.
Using a crosstab table and Chi Square test, we can determine if there is a
relationship between two variables that IS NOT DUE TO CHANCE.
1. To do this, go to Analyze > Descriptive Statistics > Crosstabs.
We put Political Party in the Row box because the dependent variable ALWAYS
goes in the Row box. We put Ethnicity in the Column box because the independent
(explanatory) variable ALWAYS goes in the Column box.


2. Next we click the Statistics button and check Chi-square, Phi and Cramer's V,
and Lambda. Click the Continue button.

3. Next, click the Cells button. Under Counts, check Observed and under
Percentages check Row, Column, and Total. Then click the Continue button.


4. Click the OK Button to run your crosstab.

Interpreting Your Crosstab


1. Reading a crosstabulation can be confusing. Over the years, I have found the following
to be helpful in reading them. First, we always begin with the independent variable, which is
listed in the column. In this case it is ethnicity, and since we are looking at ethnicity, we
will read the cells associated with % within Ethnicity (2). Here is how we read this
table. If we are interested in what party non-whites support, we say:
Of those who are non-white, 72.1% are Democrats. And of those people who are
non-white, 27.9% are Republicans.
If we are interested in the white respondents, we say:
Of those who are white, 35.5% are Democrat AND 64.5% are Republican.
If you use this phrase and fill in the blanks, you can interpret this table properly every
time!
Of those who are _____, ____% are _______ AND _____% are _______.

Political Party * Ethnicity (2) Crosstabulation

                                                Ethnicity (2)
Political Party                              NonWhite     White     Total
Democrat     Count                               31        146       177
             % within Political Party          17.5%      82.5%    100.0%
             % within Ethnicity (2)            72.1%      35.5%     39.0%
             % of Total                         6.8%      32.2%     39.0%
Republican   Count                               12        265       277
             % within Political Party           4.3%      95.7%    100.0%
             % within Ethnicity (2)            27.9%      64.5%     61.0%
             % of Total                         2.6%      58.4%     61.0%
Total        Count                               43        411       454
             % within Political Party           9.5%      90.5%    100.0%
             % within Ethnicity (2)           100.0%     100.0%    100.0%
             % of Total                         9.5%      90.5%    100.0%
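The "% within Ethnicity (2)" figures are simply each cell count divided by its column total. As a rough illustration (a sketch in Python rather than anything SPSS itself runs; the helper name `column_percentages` is invented for this example), the fill-in-the-blank sentence can be generated from the observed counts in the crosstab:

```python
# Observed counts from the crosstab: rows are party, columns are ethnicity.
observed = {
    "Democrat":   {"NonWhite": 31, "White": 146},
    "Republican": {"NonWhite": 12, "White": 265},
}


def column_percentages(table):
    """Percent of each column (independent variable) that falls in each row."""
    columns = list(next(iter(table.values())).keys())
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: round(100 * table[r][c] / col_totals[c], 1) for c in columns}
        for r in table
    }


pct = column_percentages(observed)
for group in ("NonWhite", "White"):
    print(f"Of those who are {group}, {pct['Democrat'][group]}% are Democrat "
          f"AND {pct['Republican'][group]}% are Republican.")
```

Dividing by the column total (43 non-white, 411 white) is what makes the percentages within each ethnic group add up to 100.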


2. We hypothesized that ethnicity was related to party affiliation: non-whites
were more likely to be Democrat and whites more likely to be Republican. As
you can see from the table above, this is what we find. 72% of non-whites called themselves
Democrats and 65% of whites called themselves Republicans. So our statistics bear
out our hypothesis.
3. However, is there a possibility that the relationship between ethnicity and party
affiliation is due to chance? That is to say, is there really no statistically significant
reason to believe these variables are related to one another?
To answer this question, we use the Pearson Chi-Square test. Look at the table
below. There are three Sig. columns. For the Pearson Chi-Square row, read the
Asymp. Sig. (2-sided) column and disregard the other two for the time being. If the
number is between .000 and .050, we can say that the independent variable
(ethnicity in this case) is significantly related to the dependent variable (party
affiliation). This is another way of saying that the relationship is unlikely to be due
to chance. As you can see below, the significance value for the Pearson Chi-Square
(21.886) is .000 under the Asymp. Sig. (2-sided) column. (SPSS prints .000 when the
value is smaller than .0005; it is not exactly zero.) Therefore, ethnicity is
significantly related to party affiliation.
If the number is .051 or above, the relationship may be due to chance and we say that we
are not confident that ethnicity and party affiliation are related. In that case, our
hypothesis that ethnicity is related to party affiliation would not be supported.
Chi-Square Tests
                                 Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              21.886 (a)  1     .000
Continuity Correction (b)       20.375      1     .000
Likelihood Ratio                21.438      1     .000
Fisher's Exact Test                                             .000         .000
Linear-by-Linear Association    21.838      1     .000
N of Valid Cases                   454
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 16.76.
b. Computed only for a 2x2 table
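To see where the Pearson Chi-Square value and the minimum expected count come from, here is a small sketch that recomputes them from the four observed counts in the crosstab. It uses plain Python rather than SPSS, the function name `pearson_chi_square` is an illustrative choice, and the p-value shortcut used at the end is valid only for a 2x2 table (1 degree of freedom).

```python
import math

# Observed counts from the crosstab.
observed = [[31, 146],   # Democrat:   NonWhite, White
            [12, 265]]   # Republican: NonWhite, White


def pearson_chi_square(table):
    """Return (chi-square, minimum expected count, N) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Count we would expect in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            chi2 += (obs - expected) ** 2 / expected
    return chi2, min_expected, n


chi2, min_expected, n = pearson_chi_square(observed)

# For df = 1 ONLY, the upper-tail chi-square probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3), round(min_expected, 2), n)
print(p_value < 0.0005)  # SPSS would display such a value as .000
```

The bigger the gap between observed and expected counts, the bigger the chi-square and the smaller the Sig. value.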

4. How strong is the relationship between the independent variable (ethnicity) and
the dependent variable (party affiliation)? There are two measures of association, Phi
and Cramer's V. For our purposes use Cramer's V unless SPSS reports only a Phi
statistic. Under the Value column, a number is listed. The higher the number, the
greater the strength of association. Let's use the following scale:
0 to .30 = no relationship (0) to weak relationship
.31 to .70 = moderate relationship
.71 to 1.0 = strong relationship
A strong relationship means that knowing the ethnicity of a person will give us very
good reason to guess the political party with which they are affiliated. A weak
relationship means that knowing the ethnicity of a person does not give us
much confidence in guessing the person's political party affiliation. In this case, the
association is weak (.220). If I guessed a person's political affiliation based on that
person's apparent race, I could easily be wrong!
Symmetric Measures
                                    Value    Approx. Sig.
Nominal by Nominal   Phi            -.220       .000
                     Cramer's V      .220       .000
N of Valid Cases                      454
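Cramer's V can be recovered from the Chi-Square value and the number of valid cases, which makes it easy to check against the scale above. The sketch below assumes the standard formula V = sqrt(chi-square / (N x (smaller table dimension - 1))); the function names `cramers_v` and `strength_label` are invented for this illustration, and the cut-offs are the handout's own rule of thumb, not an SPSS feature.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a Pearson chi-square, sample size, and table dimensions."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def strength_label(value):
    """Apply the handout's rule-of-thumb scale, ignoring the sign."""
    magnitude = abs(value)
    if magnitude <= 0.30:
        return "none to weak"
    if magnitude <= 0.70:
        return "moderate"
    return "strong"


# Values reported in the Chi-Square Tests table for the 2x2 crosstab.
v = cramers_v(chi2=21.886, n=454, n_rows=2, n_cols=2)
print(round(v, 3), strength_label(v))
```

For a 2x2 table, Cramer's V is the absolute value of Phi, which is why the two rows of the table differ only in sign.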

Pearson Correlation
A correlation is a powerful way to determine the association between two interval level
variables. An interval level variable is one whose values are an equal distance apart.
Examples include income (dollars), age (years), experience in politics (years),
and percent of the vote (percentages). Male and female are not values of an interval
level variable, because they are not expressed in values an equal distance apart. Gender
is a categorical variable.
For example, we may be interested in determining if political experience, as measured by
the number of years a person has served in office, is related to campaign funds raised. We
suspect that the longer the incumbent is in office, the more campaign funds s/he will
raise. After all, an incumbent has political power and is likely to be reelected: we would
want to contribute to the incumbent.
1. To do a correlation analysis, go to Analyze > Correlate > Bivariate.
2. Find and double click the variables Political Experience and Money Raised. This
will put these two variables in the variable window.
3. Click the OK button to run your correlation.

Interpreting Your Pearson Correlation


1. A correlation coefficient (number) represents the strength of an association between two
variables. The higher the number, regardless of its sign, the greater the strength of
association. Let's use the following scale:
0 to .30 = no relationship (0) to weak relationship
.31 to .70 = moderate relationship
.71 to 1.0 = strong relationship
2. In this case the correlation between Political Experience and Money Raised is
.331**. This would be a moderate relationship.
3. The Sig. (2-tailed) value is important. It tells us if the relationship may be due to
chance. If the Sig. value is between .000 and .050, we can say that
political experience and money raised are significantly related and that
an increase in political experience is associated with an increase in campaign
contributions. If the Sig. value is .051 or more, we say that we cannot be confident
that political experience and money raised are related or associated.
In this case Sig. is .000, so we can say that there is a moderate, significant
relationship between political experience and money raised.
Correlations
                                                Political Experience    Money
                                                (Years)                 Raised
Political Experience    Pearson Correlation           1                 .331**
(Years)                 Sig. (2-tailed)                                 .000
                        N                             462               414
Money Raised            Pearson Correlation           .331**            1
                        Sig. (2-tailed)               .000
                        N                             414               421
**. Correlation is significant at the 0.01 level (2-tailed).
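For readers who want to see what sits behind the coefficient and its Sig. value, the sketch below computes a Pearson correlation from raw numbers and then converts the reported r = .331 (N = 414) into a t statistic. It is a Python illustration, not SPSS output: the function names are invented, the toy data are hypothetical and do not come from the handout's data file, and the p-value uses a normal approximation that is reasonable only because N is large.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing whether r differs from 0."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical toy data (NOT from the handout): a perfectly linear relationship.
toy_years = [1, 2, 3, 4]
toy_funds = [10, 20, 30, 40]
toy_r = pearson_r(toy_years, toy_funds)

# Values reported in the Correlations table.
r_reported = 0.331
n_pairs = 414
t_value = t_from_r(r_reported, n_pairs)

# Normal approximation to the two-tailed p-value (SPSS uses the exact t distribution).
p_approx = math.erfc(abs(t_value) / math.sqrt(2))

print(round(toy_r, 3), round(t_value, 2), p_approx < 0.0005)
```

A moderate r can still be highly significant when it is based on several hundred cases, which is the situation here.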

Multiple Regression
A very powerful way to analyze data is by using a multiple regression. For our
purposes, a multiple regression allows us to look at several factors that affect a
dependent variable and determine which factors exert a greater influence on the
dependent variable. For example, we may suspect that a candidate's share of the vote is
determined by the quality of the candidate AND the amount of money raised. After
all, better Senate candidates will win a greater percentage of the vote than poorer
Senate candidates, and candidates with more money will be able to spend more to
get elected. With more money to spend, they should get a greater percent of the
vote. But which factor is more important: candidate quality or money raised? To
answer this question, we do a multiple regression.
1. Go to Analyze > Regression > Linear.
2. Since the dependent variable is the percentage of the vote a candidate received,
we put Vote: Primary or Convention in the Dependent variable box. The two
independent variables we expect to influence the dependent variable (Political
Experience and Money Raised) go in the Independent(s) variable box. It should look
like this:


3. Click the OK button.
Interpreting Your Multiple Regression
Your output produces a number of tables. Let's look at the most important tables.

1. The first table, Variables Entered/Removed, tells you what variables were used
in the analysis. As you can see, Money Raised and Political Experience were used.
Under the table, you can see that the dependent variable was Vote: Primary or
Convention.
Variables Entered/Removed (b, c)
Model   Variables Entered                                  Variables Removed   Method
1       Money Raised, Political Experience (Years) (a)     .                   Enter
a. All requested variables entered.
b. Dependent Variable: Vote: Primary or Convention
c. Models are based only on cases for which Office = Senate


2. There are two coefficients or numbers that are important: the R and the R
Square. The R is the combined association of all the independent variables with the
dependent variable. In this case there is a moderate, positive association (.662) between
money raised and political experience, taken together, and the vote. The R Square
simply means that these two variables explain 43.8 percent of the variance in the
dependent variable: the vote. This is a technical way of saying that there are other
factors (variables) that explain the remaining 56.2 percent of the variance. What might
they be? How about incumbency or candidate quality?
Model Summary
        R
        Office = Senate                  Adjusted     Std. Error of
Model   (Selected)         R Square      R Square     the Estimate
1       .662 (a)           .438          .432         20.33142
a. Predictors: (Constant), Money Raised, Political Experience (Years)

3. In the ANOVA table, look only at the Sig. column. If the number is between .000
and .05 inclusive, then we can say that the relationship between the independent
variables (money raised and political experience in this case) and the dependent
variable (share of the vote) is not due to chance, which is the case here. This means
that we are confident that money raised and political experience influence the vote. If
it is greater than .05 (for example .051 or .60 or .154), then the relationship MIGHT
BE DUE TO CHANCE and we should say we are not confident that money raised and
political experience are linked to the percentage of the vote.
ANOVA (b, c)
Model              Sum of Squares     df    Mean Square       F        Sig.
1   Regression        62539.414         2    31269.707      75.646    .000 (a)
    Residual          80193.138       194      413.367
    Total            142732.552       196
a. Predictors: (Constant), Money Raised, Political Experience (Years)
b. Dependent Variable: Vote: Primary or Convention
c. Selecting only cases for which Office = Senate
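The R, R Square, Adjusted R Square, Std. Error of the Estimate, and F values are all tied together by the sums of squares in the ANOVA table. The following Python sketch, offered as an illustration rather than a description of SPSS internals, rebuilds them from the reported Regression and Residual rows using the standard textbook formulas; the variable names are chosen for this example.

```python
import math

# Values reported in the ANOVA table.
ss_regression = 62539.414
ss_residual = 80193.138
df_regression = 2      # number of independent variables
df_residual = 194      # cases minus independent variables minus 1

ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

# Share of the variance in the vote explained by the two predictors.
r_squared = ss_regression / ss_total
r = math.sqrt(r_squared)

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_statistic = ms_regression / ms_residual
adjusted_r_squared = 1 - ms_residual / (ss_total / df_total)
std_error_estimate = math.sqrt(ms_residual)

print(round(r, 3), round(r_squared, 3), round(adjusted_r_squared, 3))
print(round(f_statistic, 3), round(std_error_estimate, 5))
```

Seen this way, the 56.2 percent of unexplained variance is just the Residual sum of squares as a share of the Total.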

4. A very important table is the Coefficients table. This table tells us, among other
things, how much influence each independent variable exerts on the dependent
variable. Note the following columns.
a. Under Model are listed the two independent variables: Political
Experience and Money Raised.
b. Really important are the coefficients (numbers) under the column
Standardized Coefficients, Beta. The higher the number, the more
the variable influences the dependent variable, the percentage of the vote. In this case,
you can see that Political Experience (.398) is more important than Money
Raised (.370), but not much more. Thus, we can say that political experience is
more important than money in explaining voting for Senate candidates, but not by
much!
In some cases the Beta coefficient will have a negative sign in front of it.
Disregard this sign in interpreting which variable exerts the most influence over
the dependent variable. The variable with the larger number, regardless of the sign,
exerts more influence.
c. The Sig. column states whether each independent variable
(political experience and money raised) is significantly related to the dependent
variable (percent of the vote). If the number is between .000 and .050, we can say
that the relationship is NOT due to chance: that there is a significant relationship
between this variable and the dependent variable. As you can see, both relationships are
significant and we can say that political experience and money raised are
significantly related to the vote.

Coefficients (a, b)
                                    Unstandardized Coefficients    Standardized
                                                                   Coefficients
Model                               B            Std. Error        Beta            t        Sig.
1   (Constant)                      16.224       1.698                             9.556    .000
    Political Experience (Years)    1.077        .166              .398            6.485    .000
    Money Raised                    2.470E-6     .000              .370            6.024    .000
a. Dependent Variable: Vote: Primary or Convention
b. Selecting only cases for which Office = Senate
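The unstandardized B column can be read as a prediction equation: vote = 16.224 + 1.077 x (years of experience) + 0.00000247 x (dollars raised). The sketch below, in Python and purely illustrative, plugs a hypothetical candidate into that equation and ranks the two predictors by the size of their Beta; the function name `predicted_vote` and the example candidate (10 years in office, $1,000,000 raised) are assumptions made for this illustration, not values from the data file.

```python
# Unstandardized coefficients (B) reported in the Coefficients table.
INTERCEPT = 16.224
B_EXPERIENCE = 1.077   # percentage points of the vote per year of experience
B_MONEY = 2.470e-6     # percentage points of the vote per dollar raised


def predicted_vote(years, dollars):
    """Predicted percent of the vote from the fitted regression equation."""
    return INTERCEPT + B_EXPERIENCE * years + B_MONEY * dollars


# Hypothetical candidate: 10 years of experience and $1,000,000 raised.
example = predicted_vote(10, 1_000_000)

# Rank predictors by the absolute size of Beta, ignoring the sign.
betas = {"Political Experience (Years)": 0.398, "Money Raised": 0.370}
ranked = sorted(betas, key=lambda name: abs(betas[name]), reverse=True)

# The t column is (approximately) B divided by its Std. Error; small gaps from
# the printed values arise because the table shows rounded numbers.
t_constant = 16.224 / 1.698
t_experience = 1.077 / 0.166

print(round(example, 3), ranked[0])
```

B values cannot be compared with each other because years and dollars are on different scales, which is why the comparison of influence uses the standardized Beta column instead.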
