
SOC 205

Social Statistics
Tuesday Lecture: Joanne Miller
Thursday Labs:
Hayden (Dae Shin) Ju
Lab: Wenjuan Zheng
Sign-In at Rear of Classroom
To Do List for Thursday
1. Know QC username & password

2. Read complete syllabus on Blackboard


Calendar & Due Dates
Course Policies & Grading
Text, Computer Accounts, Software, Calculator
Expanded Course Description

3. Fill out 1st participation survey on Blackboard


under “Lab Assignments.”

4. Do Chapter 1 homework problems


Statistics
Can Be Like
Learning a
Foreign Language
Statistics

What Is Statistics?
What Are Statistics?

Statistics ARE Numerical Facts

Number of people living in “New York City” in 2010

21,895,722 CMSA NY-NJ-CT-PA


8,175,133 5 Boroughs
1,585,873 Manhattan

Not the Focus of This Course


Statistics IS
A field of study providing methods for:

Collecting
Organizing
Reducing
&
Interpreting

Empirical Information (Data)


Empirical Information

Information collected by observation

(Data)

data = plural
datum/data set = singular
Why Learn about
Field of Statistics?
No matter what your intended career,
Education
Social Services/Counsellor/Treatment Professions
Management/Administration
Public Policy & Its Implementation
Marketing/Advertising/Sales
Journalism/Law/Advocacy…
advancement is increasingly based on
your ability to understand, track and
produce the “numbers” that guide
decision-making and evaluation.
Where Do National Data
Come From?
Major Sources
Government Agencies
Non-profit Organizations
Survey Research Companies
(Private Businesses But Data Less Available)

See Statistical Abstract


of the U.S.
Online Statistical Abstract of the U.S.

Go to:
www.qc.cuny.edu
• Click on Libraries at top of screen
• Click on Find Databases at left of screen under Quick
Links
• Click on letter S.
• Scroll down to Statistical Abstract of the U.S. at bottom
of list and click on it.

If you are on an off-campus computer you will be asked for


the barcode number on the back of your student ID card.

(Slides Follow)
1. Click on “Find
Databases;” On
Next Screen …

2. Click on “S”
1. Scroll down to bottom of screen

2. Click on “Statistical Abstract of the U.S.”


If you are using a computer off campus,
you will be asked for the barcode number on the
back of your student ID card.

Your barcode must be active and cannot have any fines attached.
To activate or pay fines go to the circulation desk at the library.
List of topics continues; scroll down.
You can explore topics and click on one of interest to see available tables.
Or, search for subjects by keywords.
Chapter 1- Introduction
Learning Objectives

1. Who is studied
2. What is observed
3. How to classify observations

See glossary at end of each chapter & index


for important vocabulary
Preparing to Collect Empirical Information
For Statistical Analysis

1. Who Is Studied

Unit of Analysis    Population Studied    Subjects Observed (“Sample”)
Unit of Analysis
Unit Studied

Individual Aggregate/Group
Person Class
Event City
Object Country

Defines “Case”
Unit of Analysis - Example
Measurement of Drug Use in High Schools & GPA

Individual Data
Student Has or Has Not Used Marijuana
Case #1 = Person 1
Case #2 = Person 2
Case #1 = has used drug
Case #2 = has not used drug
Number of Cases: Total Number of Students Studied

Group Data
Percent of Students Using Marijuana in Each High School
Case #1 = High School A
Case #2 = High School B
Case #1 = 15% used drug
Case #2 = 30% used drug
Number of Cases: Total Number of High Schools Studied
Unit of Analysis – Example Continued
Measurement of Students’ GPAs

Individual Data
GPA of Each Student
Case #1 = Person 1
Case #2 = Person 2
Case #1 = 2.6 GPA
Case #2 = 3.2 GPA
Number of Cases: Total Number of Students Studied

Group Data
Mean GPA of All Students in High School
Case #1 = High School A
Case #2 = High School B
Case #1 = 2.1 mean GPA
Case #2 = 3.0 mean GPA
Number of Cases: Total Number of High Schools Studied
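A minimal Python sketch of the same idea: the same observations can be stored with the student as the case or with the high school as the case. The records, field names, and the `aggregate_by_school` helper are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Hypothetical individual-level data: one case per student.
students = [
    {"school": "A", "used_marijuana": True,  "gpa": 2.6},
    {"school": "A", "used_marijuana": False, "gpa": 3.2},
    {"school": "B", "used_marijuana": False, "gpa": 3.0},
    {"school": "B", "used_marijuana": False, "gpa": 3.4},
]

def aggregate_by_school(records):
    """Turn individual-level cases into group-level cases (one per school)."""
    by_school = {}
    for r in records:
        by_school.setdefault(r["school"], []).append(r)
    return {
        school: {
            "n_students": len(rows),
            # Percent of students in the school who have used marijuana.
            "pct_used": 100 * sum(r["used_marijuana"] for r in rows) / len(rows),
            "mean_gpa": mean(r["gpa"] for r in rows),
        }
        for school, rows in by_school.items()
    }

schools = aggregate_by_school(students)
print(len(students), "individual cases;", len(schools), "group cases")
```

Note that the number of cases changes with the unit of analysis: four students, but only two high schools.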
Association Between
Marijuana Use & GPA
Let’s say you found in group data that the mean
GPA is lower in schools that have a high level of
marijuana use.

It does not necessarily mean that the marijuana


users are the ones with the lower GPAs.
This conclusion requires individual-level data

You may have committed an ecological fallacy


Why Does Unit of Analysis Matter?
Conclusions about individual behavior must be
supported by analysis of individual-level data,
not only group data

Famous example: foreign-born and literacy


Origin of the Concept of Ecological
Fallacy
• For states, 1930 Census showed the greater
the percent of immigrants in a state, the
higher the state’s average literacy in English.
• For individuals, immigrants were on average
less literate than people born in the US.
• This happened because immigrants tended to settle
in states where the population of citizens was more
literate, NOT because people who were
immigrants were more literate.
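The pattern can be reproduced with made-up numbers. The sketch below uses two hypothetical states; the counts are invented for illustration and are not the 1930 Census figures. At the state level, the state with more immigrants is more literate, while at the individual level immigrants are less literate than people born in the US.

```python
# Hypothetical counts per state: group -> (number of people, number literate).
states = {
    "High-immigration state": {"immigrant": (400, 320), "native": (600, 570)},
    "Low-immigration state":  {"immigrant": (100, 60),  "native": (900, 630)},
}

def state_level(states):
    """Group data: one case per state (percent immigrant, percent literate)."""
    out = {}
    for name, groups in states.items():
        people = sum(n for n, _ in groups.values())
        literate = sum(lit for _, lit in groups.values())
        out[name] = {
            "pct_immigrant": 100 * groups["immigrant"][0] / people,
            "pct_literate": 100 * literate / people,
        }
    return out

def individual_level(states):
    """Individual data pooled across states: percent literate by nativity."""
    out = {}
    for group in ("immigrant", "native"):
        people = sum(s[group][0] for s in states.values())
        literate = sum(s[group][1] for s in states.values())
        out[group] = 100 * literate / people
    return out

by_state = state_level(states)
by_person = individual_level(states)
for name, row in by_state.items():
    print(name, row)
print(by_person)
```

In this made-up example the high-immigration state is the more literate one only because its native-born residents are highly literate, which is the kind of alternative explanation group data can hide.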
Group data are a useful flag for
thinking about individual behavior.
But they can be misleading without a
lot of thought about what they mean
for individual characteristics.

There may be other explanations.


Population
Definition of the Group Studied
Defined by Researcher – Must be Sensible

Examples:
•Students in your statistics section
•All students taking statistics in Sociology
•All students taking statistics in the College
•All students taking statistics in U.S.
Sample

Subset of Population Actually Observed

Example: Sample of Voters


Not every voter is contacted, only
certain voters – a sample of voters
Sample = voters actually observed
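As a rough illustration, the sketch below draws a sample from a hypothetical population of voters. The voter IDs, the sample size, and the use of simple random selection are all assumptions made for the example; the slide itself does not say how the voters are chosen.

```python
import random

# Hypothetical population: every voter, identified by an ID.
population = [f"voter_{i}" for i in range(1, 10_001)]

def draw_sample(pop, n, seed=0):
    """Return n distinct members of the population, chosen at random."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return rng.sample(pop, n)

sample = draw_sample(population, 500)
print(len(population), "voters in the population;", len(sample), "actually observed")
```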
Preparing to Collect Empirical Information
For Statistical Analysis
Review

1. Who Is Studied

Unit of Analysis: Individual or Group
Population Studied: Defined By Researcher
Subjects Actually Observed: Defines Sample
“Inferential Statistics”
Preparing to Collect Empirical Information
For Statistical Analysis

2. What Is Observed

Variable Name refers to a characteristic that has been measured
Independent (X), Dependent (Y), Control (Z)* Variables

* Control variables (Z) are introduced in Chapter 14


What Is Observed
Variable = Characteristic that Varies
Constant = Characteristic that Does Not Vary
Variable Name = Name of Characteristic
(usually short name)
Variable Label = Extended Definition*
(more words, more meaning)
* May still have to look at exact question wording to
understand what was measured. See “SPSS Notes” on
Blackboard.
Independent, Dependent, & Control Variable

Dependent Variable Is Outcome Studied (Y)


Independent Variable Influences Outcome (X)
Control Variable Is Alternative Explanation;
Another Independent Variable (Z)

Relationship Between
Independent & Dependent Variables
Called Association
Association Should NOT Be Confused with Causation
Preparing to Collect Empirical Information
For Statistical Analysis

1) Who is studied
Unit of Analysis - Individual or Group
Population – Defined by Researcher
Sample – Actual Observations

2) What is observed
Variable Name & Label
Independent Variable (X), Dependent Variable (Y), Control Variable (Z)

3) How to classify observations


Preparing to Collect Empirical Information
For Statistical Analysis

3. How to Classify Observations

Specify Values of Variable (Coding of Variable)
Level (Type) of Measurement Scale
Specifying Values (Coding)
Example

Variable Name = SEX


Variable Label = Sex of Respondent

Variable Values = male


female

Values Determine Type of Measurement Scale


Level of Measurement
(Type of Measurement Scale)

Types of Measurement Scales Determined


by Category Values:

1. Nominal
2. Ordinal
3. Interval-Ratio
Nominal Measurement Scales

1) Values Are Named Categories That Are Separate


and Distinct (labels)
2) No Inherent Order of Categories

3) Distance Between Categories Not Known

4) Categories Cannot be Subdivided


Note: Nominal Measurement Scales
Can Have Number Values
Nominal Measurement Scale
Named Categories

Category Values = Women Men


Category Values = Princesses Princes
Category Values = 1 2
Category Values = 2 1

Names of Categories Can Be Number Labels


Without Order
Not A Count of How Much
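A short Python sketch of why number labels on a nominal scale carry no order or amount. The responses and both codings are hypothetical. Swapping the codes leaves the category counts unchanged, but an "average" of the codes changes with the arbitrary labels, so it is not a meaningful summary.

```python
from collections import Counter

# Hypothetical responses for the variable SEX.
responses = ["female", "male", "female", "female", "male"]

coding_a = {"female": 1, "male": 2}
coding_b = {"female": 2, "male": 1}  # labels swapped; equally valid

def frequencies(values, coding):
    """Count cases per category after applying a numeric coding."""
    counts = Counter(coding[v] for v in values)
    # Translate codes back to names so the two codings can be compared.
    names = {code: name for name, code in coding.items()}
    return {names[code]: n for code, n in counts.items()}

freq_a = frequencies(responses, coding_a)
freq_b = frequencies(responses, coding_b)
print("Same category counts under both codings:", freq_a == freq_b)

# The mean of the codes depends on which label was assigned to which category.
mean_a = sum(coding_a[v] for v in responses) / len(responses)
mean_b = sum(coding_b[v] for v in responses) / len(responses)
print("Mean of codes:", mean_a, "versus", mean_b)
```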
Ordinal Measurement Scales
Values Can Be Ranked (Ordered)
Ordinal Measurement Scale
Named Categories With
Inherent Ranking/Order
Example: Performance at Olympic Games
Category Values:
Gold Medal
Silver Medal
Bronze Medal

Note: Exact Distance Between Category Values


Unknown But Can Be Ordered
Nominal & Ordinal
Measurement Scales

Remember:
1) Category Values = Named Labels
2) Exact Distance between Categories
Unknown
3) Values Cannot be Directly Subdivided
(without further information on cases)
Interval-Ratio* Measurement Scales

1. Values = Count of How Much/Many


2. Distance Between Categories:
• Is Known (can be measured)
• Equal and Consistent Throughout Scale
3. Values Can be Subdivided (in theory)

* Note: Textbook Combines Interval & Ratio Scales.


Ratio Measurement Scales

Ratio Measurement Scale


Scale Starts at a True Zero
Zero Means Absence of Characteristic

Example: Weight
Count of Number of Pounds

0 pounds = absence of characteristic


Interval Measurement Scales

Scale Does Not Start at a True Zero (Zero Point Is Arbitrary)

Example: Temperature in Fahrenheit


Count of Degrees (F)

0 degrees Fahrenheit does NOT mean


an absence of temperature.

0 degrees Fahrenheit means it is cold.


General Guidelines for Determining
Level of Measurement
Scale            Category Values               Distance Between Category Values   Can Be Subdivided (In Theory)
Nominal          Named Labels                  Unknown                            No
Ordinal          Named Labels, Can Be Ranked   Unknown                            No
Interval-Ratio   Count of How Much/Many        Known & Equal Throughout           Yes
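The guidelines can be read as a small decision rule. The function below is a hypothetical sketch of that rule, not a standard procedure; the two yes/no inputs are judgments the researcher still has to make, which is where the problem cases arise.

```python
def level_of_measurement(values_are_counts, can_be_ranked):
    """Apply the guideline table to one variable.

    values_are_counts: True if values count how much/many
        (distances known and equal throughout the scale).
    can_be_ranked: True if the named categories have an inherent order.
    """
    if values_are_counts:
        return "interval-ratio"
    if can_be_ranked:
        return "ordinal"
    return "nominal"

# Examples taken from the slides.
examples = {
    "SEX (male, female)": level_of_measurement(False, False),
    "Olympic medal (gold, silver, bronze)": level_of_measurement(False, True),
    "Weight in pounds": level_of_measurement(True, True),
}
for variable, level in examples.items():
    print(variable, "->", level)
```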

Problem Cases Follow


Problem Cases
Strength of Opinions
Opinion Scale:
1. Strongly Disagree
2. Disagree
3. Neither Disagree nor Agree (neutral/unsure)
4. Agree
5. Strongly Agree
Is This Measurement Scale an
Ordinal Scale or Interval Scale?

If distance between values unknown = Ordinal


If distance between values known = Interval
Problem Cases
Rating Scale or
Index Based on Multiple Questions
Beauty rated on a scale of 1-10
Composite Index low to high level 1-29
Ordinal Scale or Interval Scale?
Are distances between category values known?
Can scale be subdivided?

If there is a large number of categories,
researchers would treat the variable as an interval-ratio scale.
Problem Cases
Income Measured in 25 Income Groups
(Not Dollars of Income; Not Equal Width Intervals)

Ordinal Scale or Interval-Ratio Scale?

Most researchers would treat this


variable as an interval-ratio scale
because of large number of income
levels.
Problem Cases
Count of Number of Children
Can scale be subdivided (in theory)?
Researchers would say yes, even
though children (entity) cannot be subdivided.

1. Scale can be subdivided in theory as a


count of how much/many.
2. Distance is known and is equal and
consistent throughout scale
Same Variable Can Have Different
Measurement Scales
Example: AGE

Category Values: Child, Teenager, Young Adult, Middle Aged, Older Age
Ordinal Measurement Scale (5 Categories)

Category Values: 1–12, 13–19, 20–34, 35–59, 60+
Ordinal Measurement Scale

Category Values: 1, 2, …, 18, 19, …, 85, 86, …, 125
Interval-Ratio Measurement Scale
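A minimal sketch of how AGE measured in years can be recoded into the five ordered categories from the slide. The function name `age_group` is hypothetical. Recoding in this direction is always possible, but the exact age cannot be recovered from the category, so the grouped version carries less information.

```python
def age_group(age_in_years):
    """Recode age in years (interval-ratio) into five ordinal categories."""
    if age_in_years < 1:
        # Assumption: the categories on the slide start at age 1.
        raise ValueError("age categories start at 1")
    if age_in_years <= 12:
        return "Child"
    if age_in_years <= 19:
        return "Teenager"
    if age_in_years <= 34:
        return "Young Adult"
    if age_in_years <= 59:
        return "Middle Aged"
    return "Older Age"

ages = [8, 13, 19, 20, 47, 60, 86]  # hypothetical cases
print([age_group(a) for a in ages])
```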
Summary
1) Who is studied?
Unit of Analysis: Individual or Group
Population
Sample Statistical Inference

2) What is observed?
Variable Name & Label
Independent (X) Dependent (Y) Control (Z) Variable

3) How to classify observations?


Nominal Scale – Named category value, no order
Ordinal Scale – Named category value, inherent ranking/order
Interval-Ratio Scale – Count of how much/many, known distance
between categories that is equal & consistent.
For problem cases, decide which type of scale and defend your decision.
Begin Statistical Analysis of Data

Method of Analysis Depends on


Type of Measurement Scale
