
Applied Statistics for Engineers

WiSe 2018 / 2019

Michael Morlock, Adrian Falkenberg, Gerd Huber, Philipp Messer, Valerie Polster,
Sebastian Zobel
Institute of Biomechanics

morlock@tuhh.de
040 42878 3253
Denickestrasse 15 Rm 3514

0. Overview
Prerequisites: none
Course format, period: 2 ECTS lectures, 1 ECTS exercise, 2 ECTS PBL, winter term
Work load: 6 ECTS
Language: English
Performance evaluation: The examination consists of
- Final Exam (70%, 90 Minutes)
- Homework (10% extra credit for exam): given during exercises, has to be turned in through stud_ip by Tuesday of the following week before noon
- Problem Based Learning (30%): presentation + evaluation

© Michael Morlock Institute of Biomechanics TUHH Applied Statistics WiSe 18/19 Lecture 1 Slide 2
PBL

Applied Statistics
Problem Based Learning (PBL)

P. Messer, A. Falkenberg, V. Polster, S. Zobel, G. Huber, M. Morlock

Contact:
Philipp Messer
philipp.messer@tuhh.de
Adrian Falkenberg
Institute of Biomechanics adrian.falkenberg@tuhh.de

PBL

You need to attend PBL if…

… you started MEDMS after WiSe 2015. Your Applied Statistics module contains 6 ECTS.

If you started MEDMS in WiSe 2015 or earlier (4 ECTS), contact Dr. Huber (g.huber@tuhh.de) for assignment of a group by Friday 19th.

If your study plan contains Applied Statistics with 4 ECTS, you still have to participate in the Exercise.

PBL

Aims of Problem Based Learning


• Work independently, solution-oriented and problem-based
• Apply knowledge from the lecture, understand how,
why and when
• Develop practical SPSS skills
• Work in teams
• Learn to effectively present your work
• Improve your presentation skills with peer feedback


Timetable
Schedule Lecture:
Oct. 16th Lecture 1 15:00-16:30 (H0.16)
Oct. 16th Sign into Exercise & PBL groups 20:00 (in groups @ 4 members, Ai, Bi, Ci via StudIP PBL)

Schedule Exercise:
Oct. 23rd Group A 13:00-13:45 (L3038P1)
Oct. 23rd Group B 13:45-14:30 (L3038P1)
Oct. 24th Group C 13:15-14:00 (L3038P1)
….. and so on…
Exception: no Exercise on Oct. 31st
Oct. 30th Group A & C/2 13:00-13:45 (L3038P1)
Oct. 30th Group B & C/2 13:45-14:30 (L3038P1)

Schedule PBL (subgroups i work together – no shifts):


Oct. 16th Group Task 1 : given in lecture
Oct. 18th No PBL session
Oct. 25th Group A-C, individual: 11:30-13:00 (E2.054P4b & E2.055P4a)
Nov. 01st Group A, Presentation 1: 11:30-13:00 (D2.022) – mandatory for Group A
Nov. 01st Group B&C, individual: 11:30-13:00 (E2.054P4b & E2.055P4a)
Nov. 08th Group B, Presentation 1: 11:30-13:00 (D2.022) - mandatory for Group B
Nov. 08th Group A&C, individual: 11:30-13:00 (E2.054P4b & E2.055P4a)
Nov. 15th Group C, Presentation 1: 11:30-13:00 (D2.022) - mandatory for Group C
Nov. 15th Group A&B, individual: 11:30-13:00 (E2.054P4b & E2.055P4a)
….. and so on…

Timetable Seminar

Date Task
01.11.2018 Seminar Group A Descriptive Statistics
08.11.2018 Seminar Group B Descriptive Statistics
15.11.2018 Seminar Group C Descriptive Statistics
22.11.2018 Seminar Group A Correlation
29.11.2018 Seminar Group B Correlation
06.12.2018 Seminar Group C Correlation
13.12.2018 Seminar Group A Categorical Data
20.12.2018 Seminar Group B Categorical Data
10.01.2019 Seminar Group C Categorical Data
17.01.2019 Seminar Group A Statistics Guidance
24.01.2019 Seminar Group B Statistics Guidance
31.01.2019 Seminar Group C Statistics Guidance


Seminars

If it’s your group’s turn for the seminar, attendance is mandatory (in D2.022)!
Short presentations, 3+1 mins/group
• 1 presenter (picked by chance; every member must be prepared to do the presentation)
• 3 supporters for questions
Peer feedback
• Procedure will be explained in detail in the first seminar session

Individual PBL

• Chance to meet with your group, discuss and help each other work on tasks
• Attendance is voluntary
• Reserved computer pools with SPSS access: P4a and P4b

Tutor(s) will be present within the time frame (11:30-13:00), but not for the entire time. Possibility to ask questions and discuss solutions.

Grading

• Each group’s presentation will be graded
• The grade applies to all group members, independent of who was the presenter
• Individual effort in asking or answering questions (or the inability to do so) will be taken into consideration for the final grading

Grading (continued)

• In case of sickness or inability to attend, inform us and your group in advance with an explanation. Your group must be able to present.
• Repeated (two or more times) unexplained absence = 0 points for PBL
• PBL grade will be 30% of overall module grade


Grouping

Subscription into groups A, B and C, i.e. the respective subgroups of 4 people, in StudIP:
„Problemorientierte Lehrveranstaltung: Angewandte Statistik für Ingenieure“
(Problem-based course: Applied Statistics for Engineers)

Starting today, 16.10., 20:00! (Closing 22.10., 12:00)

Groups A, B, C also determine time for Exercises

Task 1 – Descriptive Statistics
Acquiring and presenting data:
TUHH Mensa

Abendblatt.de

• Look at one or more question(s) of interest
• Acquire data (count, ask, record, …)
• Present with proper statistical diagrams like histograms, scatter plots, boxplots, …
• Topic is completely up to you, be creative!
(But please don’t bother the employees more than necessary…)
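
A minimal sketch of how self-collected counts could be summarised before drawing a histogram or boxplot. The course itself uses SPSS; Python is used here only for illustration, and the waiting times, the function names `five_number_summary` and `text_histogram`, and the bin width are all assumptions, not part of the task.

```python
import statistics
from collections import Counter

# Hypothetical waiting times in minutes (made-up data for illustration only).
waiting_minutes = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 9, 15]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the ingredients of a boxplot."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def text_histogram(values, bin_width):
    """Return histogram lines, one per occupied bin of width bin_width."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return [f"{start:>3}-{start + bin_width - 1:<3} {'#' * counts[start]}"
            for start in sorted(counts)]

summary = five_number_summary(waiting_minutes)
print(summary)
for line in text_histogram(waiting_minutes, bin_width=3):
    print(line)
```

Note that quartile conventions differ between programs, so SPSS may report slightly different quartiles for the same data.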

Important

Presentation Structure (standard scientific)


1. Introduction
2. Materials, Methods
3. Results
4. Discussion
5. Take Home Message

Duration: 3 minutes (+/-15s)

Exercise Applied Statistics WiSe 18/19

Contact:
Valerie Polster (valerie.polster@tuhh.de)
Sebastian Zobel (sebastian.zobel@tuhh.de)
Location & Time:
DE17 – L, Room 3038 P1
Group A: Tue 13:00-13:45 (start 23.10.2018)
Group B: Tue 13:45-14:30 (start 23.10.2018)
Group C: Wed 13:15-14:00 (start 24.10.2018)
Homework:
optional, 10% bonus points
StudIP: 1 week online (Tue 17:00–Tue 12:00)

0. Overview
Reading resources:

Applied Regression Analysis and Other Multivariable Methods
David G. Kleinbaum, Lawrence L. Kupper, Keith E. Muller, Azhar Nizam, 1998, ISBN: 0-534-20910-6

Applied Statistics and Probability for Engineers
Douglas C. Montgomery, George C. Runger (downloadable from the Internet)

Discovering Statistics Using IBM SPSS Statistics
Andy Field

Internet (Wikipedia, …)


Housekeeping
• Scripts and information through stud_ip
• You have to register
• Script available the evening prior to the lecture
• Homework through stud_ip
• Questions for Homework: After each lecture (17:00)
• 1st homework (pulldown „VIP“) will be online October 30th on stud_ip

0. Overview

Exam Date: January 29th (last lecture)

One handwritten A4 page (two-sided) is allowed as an aid for the exam.
Calculators (non-programmable)


Housekeeping

• Please also register in the digital university calendar (subscribe)

0. Overview
Objectives: Introduction to basic statistical methods and their application to simple problems using established software (SPSS).
Contents:
• Introduction, Definitions, Variables, Basics
• Descriptive Statistics
• Distributions
• Chi square test
• Simple regression and correlation
• Multiple regression and correlation
• Analysis of Variance
• Survival Analysis
• Discriminant analysis
• Analysis of categorical data
• Non-parametric statistics


0. Overview

Do we need statistics?

0. Overview

What do we need statistics for?


0. Overview
State elections 14.10.2018 Bavaria


Obesity Trends* Among U.S. Adults, 1985
(*BMI ≥30, or ~ 30 lbs overweight for 5’ 4” person)
[Map: obesity prevalence by U.S. state, 1985. Legend: No Data, <10%, 10%–14%]
Source: BRFSS, CDC, http://www.cdc.gov/nccdphp/dnpa/obesity/trend
Source: Mokdad A H, et al. JAMA 1999;282:16; JAMA 2001;286:10; JAMA 2003;289:1.

Obesity Trends* Among U.S. Adults, 2011
(*BMI ≥30, or ~ 30 lbs overweight for 5’ 4” person)
[Map: obesity prevalence by U.S. state, 2011. Legend: No Data, <10%, 10%–14%, 15%–19%, 20%–24%, 25%–29%, ≥30%]

0. Overview

Doesn’t look good – what can we do?

Obesity Trends* Among U.S. Adults, BRFSS, 2011
[Map: obesity prevalence by U.S. state, 2011]
*Prevalence reflects BRFSS methodological changes in 2011, and these estimates should not be compared to previous years.
*Sample size <50 or the relative standard error (dividing the SE by the prevalence) ≥ 30%.

Obesity Trends* Among U.S. Adults, BRFSS, 2016
[Map: obesity prevalence by U.S. state, 2016]
*Prevalence reflects BRFSS methodological changes in 2011, and these estimates should not be compared to previous years.
*Sample size <50 or the relative standard error (dividing the SE by the prevalence) ≥ 30%.

0. Overview

The answers to essentially yes/no questions (hypothesis testing)
Estimates of numerical characteristics (estimation)
Descriptions of association (correlation)
Modeling of relationships (regression)

1. Introduction

"Statistics are no substitute for judgment."


Henry Clay

"The science of collecting and analyzing data for the purpose of drawing conclusions and making decisions."
from Tamhane, Ajit C., and Dorothy D. Dunlop. Statistics and Data Analysis from Elementary to Intermediate. Prentice Hall, 2000, p. 1 (adapted from Dr. Elizabeth Newton, MIT)


0. Overview

The important issues we do not really talk about…

1. Introduction

Precision: Spread of the estimator of a parameter

Accuracy: How close the estimator is to the true value

Bias: Systematic deviation of the estimate from the true value

Diagram courtesy of MIT OpenCourseWare, Dr. Elizabeth Newton
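
The difference between precision and bias can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true value, the offsets, the noise levels and the helper name `simulate_estimator` are invented for illustration and do not come from the lecture.

```python
import random
import statistics

def simulate_estimator(true_value, offset, noise_sd, n_repeats, seed=0):
    """Simulate repeated estimates = true_value + offset + random noise.

    Returns (bias, spread): bias is the mean deviation from the true value,
    spread is the standard deviation of the estimates (low spread = precise).
    """
    rng = random.Random(seed)
    estimates = [true_value + offset + rng.gauss(0, noise_sd)
                 for _ in range(n_repeats)]
    bias = statistics.mean(estimates) - true_value
    spread = statistics.stdev(estimates)
    return bias, spread

# Precise but biased: small spread, systematic offset of +2.
bias_a, spread_a = simulate_estimator(10.0, offset=2.0, noise_sd=0.1, n_repeats=2000)
# Unbiased but imprecise: no offset, large spread.
bias_b, spread_b = simulate_estimator(10.0, offset=0.0, noise_sd=2.0, n_repeats=2000)
print(f"A: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"B: bias={bias_b:.2f}, spread={spread_b:.2f}")
```

Estimator A is precise yet systematically wrong, while estimator B scatters widely around the true value.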


1. Introduction

Quality criteria for a test

Reliability

Objectivity

Validity

1. Introduction
Population: The set of all units of interest; can be finite (all students at TU) or nearly infinite (all students).
Sample: The subset of the population actually observed.
Variable: Attribute of each unit (income, age, satisfaction); its value varies from unit to unit.
Parameter: Numerical value, fixed (e.g. a in Y = 2a * X).
Statistic: Numerical function used to estimate a population parameter from the sample.
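
The relation between these terms can be sketched in a few lines of Python. The population below is simulated (ages of 10,000 hypothetical students), so every number is an assumption made purely for illustration.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical finite population: ages of 10,000 students (simulated data).
population = [rng.randint(18, 35) for _ in range(10_000)]

# Parameter: a fixed number describing the whole population.
population_mean = statistics.mean(population)

# Sample: the subset of units actually observed.
sample = rng.sample(population, 100)

# Statistic: a function of the sample used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Drawing a different sample changes the statistic, while the parameter stays fixed.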

1. Introduction
Steps in Study Design and Implementation
1. Background research and literature review
2. Define the goals and hypotheses of the study
3. Determine variables to be measured
4. Develop a plan to collect the data
   - Sampling design
   - Sample size
   - Inclusions and exclusions
5. Train Personnel
6. Gather Data
7. Analyze Data
8. Report Results
(adapted from Dr. Elizabeth Newton, MIT)

2. Classification of Variables

Nominal
weakest level, different categories
dichotomous or polychotomous
(gender, place of residence)

Ordinal
not only grouping into categories but also ordering
dichotomous or polychotomous
(age: <= 39, > 39; pain: low, medium, high)

Continuous/Interval (no gaps)
meaningful measure of distance and ratio
well accepted physical unit of measurement
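
One practical consequence of the level of measurement is which summary value is meaningful. The sketch below assumes a common rule of thumb (mode for nominal, median for ordinal, mean for continuous data); the function name `central_tendency` and all data are hypothetical.

```python
import statistics

def central_tendency(values, level):
    """Pick a summary that is meaningful for the given level of measurement.

    level: "nominal", "ordinal" (values must be sortable ranks) or "continuous".
    """
    if level == "nominal":
        return statistics.mode(values)          # only counting makes sense
    if level == "ordinal":
        return statistics.median_low(values)    # ordering makes sense, distances do not
    if level == "continuous":
        return statistics.mean(values)          # distances and ratios make sense
    raise ValueError(f"unknown level of measurement: {level}")

# Hypothetical data for illustration.
residence = ["Hamburg", "Harburg", "Hamburg", "Lüneburg"]
pain = [1, 3, 2, 2, 1]            # coded ranks: 1 = low, 2 = medium, 3 = high
height_cm = [172.0, 181.5, 165.0, 190.5]

print(central_tendency(residence, "nominal"))
print(central_tendency(pain, "ordinal"))
print(central_tendency(height_cm, "continuous"))
```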


2. Classification of Variables

Whether a variable is classified as “independent” or “dependent” depends on the study objectives rather than on the mathematical structure of the variable.

Independent
variable under investigation, or the variable that is modified by the investigator (treatment, cause)

Dependent
variable used to describe the outcome

Covariate
independent variable affecting the outcome of the study but not of intrinsic interest

1. Introduction
Sampling design
Descriptive:
One group (students of this lecture)
Observational:
Investigator records data without intervening.
Difficult to distinguish effects of predictors and
confounding variables
Comparative:
2 or more groups, e.g. students of 2 different
universities, common final exam.
Experimental:
Investigator actively intervenes to control study
conditions, investigate relationship between
intervention and response (outcome) variables

1. Introduction
Sampling design
Retrospective (case-control)
look back in time (technique A vs. technique B over
the last 20 years)
Cross-sectional
sample is investigated at a single point in time
(weight of rural and city children)
Prospective
sample is followed forward with time (2 different
medications, follow patients for 10 years)


1. Introduction
Data are usually organized as a matrix with rows
corresponding to observations and columns
corresponding to variables
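
A minimal sketch of such a data matrix in plain Python, assuming three hypothetical observations and four variables; the column names and the helper `column` are invented for illustration.

```python
# Each row is one observation (here: one hypothetical student);
# each column is one variable. All values are invented for illustration.
columns = ["id", "gender", "pain", "height_cm"]
data = [
    [1, "f", "low",    168.0],
    [2, "m", "high",   181.0],
    [3, "f", "medium", 174.5],
]

def column(matrix, names, name):
    """Return all values of one variable (one column of the data matrix)."""
    index = names.index(name)
    return [row[index] for row in matrix]

n_observations = len(data)        # number of rows
n_variables = len(columns)        # number of columns
heights = column(data, columns, "height_cm")
print(n_observations, n_variables, heights)
```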

2. Classification of Variables

Rough guide to choice of analysis

Method: Regression Analysis
Dependent variable: Continuous
Independent variables: Continuous
General purpose: To describe the extent, direction, strength of relationship

Method: Analysis of Variance
Dependent variable: Continuous
Independent variables: Nominal
General purpose: To describe the relationship between a continuous dependent and nominal independent variables

Method: Analysis of Covariance
Dependent variable: Continuous
Independent variables: Nominal and continuous variables
General purpose: To describe the relationship between a continuous dependent and nominal independent variable controlling for the effect of continuous independent variables

Method: Discriminant analysis
Dependent variable: Group membership (2 or more)
Independent variables: Mixture
General purpose: To determine how independent variables are related to the probability of the occurrence of one of two possible outcomes
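
The rough guide can be read as a lookup from variable types to a method. The sketch below encodes only the four combinations listed in the guide; the function name `suggest_method` and the string labels are assumptions chosen for illustration.

```python
def suggest_method(dependent, independent):
    """Look up an analysis method from the variable types, as in the rough guide.

    dependent:   "continuous" or "group membership"
    independent: "continuous", "nominal", "nominal and continuous" or "mixture"
    Returns None when the combination is not listed in the guide.
    """
    guide = {
        ("continuous", "continuous"): "regression analysis",
        ("continuous", "nominal"): "analysis of variance",
        ("continuous", "nominal and continuous"): "analysis of covariance",
        ("group membership", "mixture"): "discriminant analysis",
    }
    return guide.get((dependent, independent))

print(suggest_method("continuous", "nominal"))
print(suggest_method("group membership", "mixture"))
```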


3. Basic Statistics
3.1 Descriptive Statistics

Mean
arithmetic mean

Mode
the value that has the largest number of observations, namely the most frequent value or values.
mode of {1, 2, 2, 2, 3, 9} = 2, the arithmetic mean = 3.17
the mode of {apple, apple, banana, orange, orange, orange, peach} = orange (Wikipedia)

Median
the value below which 50% of the scores fall, or the middle score
(1/2 of the population will have values <= median and 1/2 will have values >= median);
even sample size: the median is the mean of the two centermost scores
median of {1, 2, 2, 2, 3, 9} = 2 (Wikipedia)
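
The examples above can be checked with Python's standard `statistics` module. This is a sketch for illustration only (the course itself uses SPSS); the variable names are arbitrary.

```python
import statistics

scores = [1, 2, 2, 2, 3, 9]
fruits = ["apple", "apple", "banana", "orange", "orange", "orange", "peach"]

mean_score = statistics.mean(scores)      # 19 / 6, about 3.17
mode_score = statistics.mode(scores)      # most frequent value: 2
median_score = statistics.median(scores)  # mean of the two centermost scores: (2 + 2) / 2
mode_fruit = statistics.mode(fruits)      # the mode also works for nominal data

print(round(mean_score, 2), mode_score, median_score, mode_fruit)
```

The single large value 9 pulls the mean above the median, which is why the median is often preferred for skewed data.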

3. Basic Statistics
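Variance
measure of dispersion around the mean, equal to the sum of squared deviations from the mean divided by the number of cases (population variance), or by one less than the number of cases (sample variance).

Population: $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
Sample: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
μ: population mean, $\bar{x}$: sample mean

Standard deviation
measure of dispersion around the mean, equal to the square root of the variance. In a normal distribution, about 68% of cases fall within one SD of the mean and about 95% of cases fall within 2 SD.
Population: $\sigma = \sqrt{\sigma^2}$
Sample: $s = \sqrt{s^2}$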
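
The two divisors (number of cases versus one less than the number of cases) can be compared directly. The sketch below reuses the data set {1, 2, 2, 2, 3, 9} from the previous slide as an assumed example and relies on Python's standard `statistics` module.

```python
import math
import statistics

values = [1, 2, 2, 2, 3, 9]

# Population variance: squared deviations from the mean divided by N.
pop_var = statistics.pvariance(values)
# Sample variance: squared deviations from the sample mean divided by n - 1.
sample_var = statistics.variance(values)

# The same quantities written out by hand, to mirror the definitions.
mean = sum(values) / len(values)
sum_sq = sum((x - mean) ** 2 for x in values)
assert math.isclose(pop_var, sum_sq / len(values))
assert math.isclose(sample_var, sum_sq / (len(values) - 1))

# Standard deviation is the square root of the variance.
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)
print(f"population: var={pop_var:.3f}, sd={pop_sd:.3f}")
print(f"sample:     var={sample_var:.3f}, sd={sample_sd:.3f}")
```

The sample variance is always the larger of the two, and the difference shrinks as the number of cases grows.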

3. Basic Statistics
Standard deviation (wikipedia)


3. Basic Statistics
Range
The difference between the largest and smallest values of a numeric variable;
the maximum minus the minimum.

Minimum
The smallest value of a numeric variable.

Maximum
The largest value of a numeric variable.

Standard error of the mean (SEM)
A measure of how much the value of the mean may vary from one sample to another sample taken from the same distribution.
It can be used to roughly compare the observed mean to a hypothesized value or another group mean
(~two values are different at the 5% level if their difference divided by the SEM is <= -2 or >= +2).

$\mathrm{SEM} = s / \sqrt{N}$

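
A minimal sketch of the rule of thumb for the standard error of the mean, assuming the usual definition SEM = s / √N (sample standard deviation divided by the square root of the sample size). The measurements and the function names are hypothetical.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """SEM = sample standard deviation divided by the square root of N."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def roughly_different(sample, hypothesized_mean):
    """Rule of thumb: the observed mean differs from the hypothesized value
    at about the 5% level if their difference, divided by the SEM,
    is <= -2 or >= +2. Returns (decision, ratio)."""
    ratio = (statistics.mean(sample) - hypothesized_mean) / standard_error_of_mean(sample)
    return abs(ratio) >= 2, ratio

# Hypothetical measurements, invented for illustration.
sample = [9.8, 10.4, 10.1, 9.9, 10.3, 10.0, 10.2, 9.7, 10.5, 10.1]
print(f"SEM = {standard_error_of_mean(sample):.3f}")
print(roughly_different(sample, 10.0))
print(roughly_different(sample, 9.0))
```

This is only a rough screen; a formal test would use the t distribution with N - 1 degrees of freedom.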
3. Basic Statistics

Symbol  Name      Symbol  Name

Α, α    Alpha     Ν, ν    Nu
Β, β    Beta      Ξ, ξ    Xi
Γ, γ    Gamma     Ο, ο    Omicron
Δ, δ    Delta     Π, π    Pi
Ε, ε    Epsilon   Ρ, ρ    Rho
Ζ, ζ    Zeta      Σ, σ    Sigma
Η, η    Eta       Τ, τ    Tau
Θ, θ    Theta     Υ, υ    Upsilon
Ι, ι    Iota      Φ, φ    Phi
Κ, κ    Kappa     Χ, χ    Chi
Λ, λ    Lambda    Ψ, ψ    Psi
Μ, μ    Mu        Ω, ω    Omega

