
ECON1203 Business and Economic Statistics

Week 1

Week 1 topics

Administrative details
Descriptive statistics
Frequency distributions & histograms
Shapes of distributions
Describing bivariate relations

Key references

Course outline
Keller Chs 1-3

Administrative details: Staff

Administrative details: Teaching & Learning


Keller 9th ed. provides detail & basic reference material

Lectures provide
- Overview
- Emphasis of key points
- Some worked examples

Tutorials provide
- Review & discussion opportunities

Learning cycle
- Preview, participate, practice

Administrative details: What you're expected to know

BES will utilize prior knowledge from high school mathematics


- Basic algebra
- Basic calculus
- Basic probability

This is presumed knowledge & there will be no mathematics instruction in BES
But BES is about statistics, not mathematics

Administrative details: Assessment

Online quizzes designed to give timely feedback
- Weeks 3, 6 & 12

In-tutorial test
- Week 8

EXCEL important but no direct computing mark
- Some tutorial problems have a computing part
- Major project requires EXCEL

Weekly tutorials (Weeks 2-13) prepare you for the final examination

Assessment component    Percentage of total mark
Feedback quizzes        6
In-tutorial test        14
Project                 20
Final examination       60
Total                   100

Administrative details: Lectures

Lecture notes will be sparse
Lecture discussion will be more expansive
There should be benefit in attending lectures
Use textbook to fill any remaining gaps
Textbook provides extensive reference material
Worked examples in lectures will appear with space for an answer
- Worked eg on frequency distributions
- (So, it's probably a good idea to print out the lecture slides so you can fill in the answers as they occur.)
Statistics in action (SIA) will use real case studies

Administrative details: Why BES?


It's a Faculty requirement!

BES covers basic statistics
- Statistics = collecting, analyzing & interpreting data

Statistical information & analysis is pervasive in business & other disciplines
- Provides skills for real-world decision making
- Provides foundation for all 2nd year econometrics subjects
- Underpins quantitative analysis across all ASB schools

Applications of interest to accounting, economics, finance, management & marketing students

"the sexy job in the next 10 years will be statisticians. And I'm not kidding." Hal Varian, Chief Economist at Google

Administrative details: Statistical & generic skills developed

By end of BES you should be able to

- Identify appropriate statistical methods for describing data & making inferences about population parameters
- Apply appropriate statistical methods to samples of data
- Use statistical reasoning to aid in problem solving
- Use EXCEL to apply appropriate statistical methods
- Write a basic business report documenting statistical analyses
- Critically evaluate statistical work of others

Statistics in action (SIA)

Case studies using real, contemporary data


- Motivating examples in lectures
- Material for tutorial problems
- Suggest prototype project problems

Examples may include


- Baby bonus
- Cars in China
- Private health insurance
- Petrol prices
- Migrant wealth
- Sydney housing prices
- Crime statistics

SIA: Baby bonus

Australian Government introduced a $3000 baby bonus on July 1, 2004
- Increased again July 1, 2006
- Analysis based on Gans & Leigh (2008)

Did women react to this economic incentive?
- If so, by how much?
- Note: change announced 7 weeks before implementation

Data?
- Daily births in Australia for 1975-2004 from Australian Bureau of Statistics (ABS)

SIA: Baby bonus


Here's (part of) the data
BUT need descriptive statistics in order to
- Summarize data
- Facilitate interpretation
- Analyse the research question
- Extract information

SIA: Baby bonus

Figure 1: The Introduction Effect
Panel A: Raw Data
Panel B: Controlling for Day-of-Week Effects
(Both panels plot number of births per day against date, June 3 to July 28, for 1975-2003 and for 2004.)

SIA: Baby bonus

What are the key features of these data?
- Data are numbers of newborns per day (variable)
- Data are observed in time (time series data)

Statistical concepts?
- Population
- Parameter
- Sample
- Statistic
- Statistical inference

See the next slides

Statistical concepts

A population consists of all subjects that are studied.
A parameter is a numerical measure of a population.
- Note: parameters are often unknown
A sample is a subset of the population that is being studied.
A statistic is a numerical measure that describes a characteristic of a sample.
Inferential statistics uses the information from a sample to draw conclusions about the population.
Or, inferential statistics computes/uses a statistic to infer about a parameter of the population.
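The distinction between a parameter and a statistic can be sketched in a few lines of Python. Everything here is illustrative: the population is simulated (not real data), and the population size, sample size and seed are arbitrary assumptions.

```python
import random
import statistics

random.seed(1203)  # arbitrary fixed seed so the sketch is reproducible

# Hypothetical population: simulated heights (cm) of 100,000 people.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Parameter: a numerical measure of the whole population.
# In practice it is usually unknown; we can compute it here only
# because the population is simulated.
parameter = statistics.mean(population)

# Sample: a subset of the population that is actually observed.
sample = random.sample(population, 500)

# Statistic: a numerical measure computed from the sample.
statistic = statistics.mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

Statistical inference is the step of using the sample mean to say something about the population mean when the latter cannot be computed directly.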

Statistical concepts


Worked example: diabetes data

The data set consists of observations on 442 patients for 4 variables: a quantitative measure of disease progression, height, weight and gender.
What is the population?
What is the sample?
The average height of all diabetes patients is a parameter (unknown)
The average height of all patients in the data set is a statistic (known)

Types of data

A variable is a characteristic of a population or of a sample from a population
We observe values or observations of a variable
A data set contains observations on variables

Variables may be discrete or continuous
- Discrete example: football scores, number of newborns per day
- Continuous example: time remaining in football game, height, weight

Quantitative variables vs qualitative variables
- Scores, time, height, weight are quantitative
- Gender, colour are qualitative
- Quantitative is also referred to as interval or numerical
- Qualitative is also referred to as nominal or categorical

Ordinal data are qualitative but there is an ordering
- This course is poor, good, very good
- Standard & Poor's ratings AAA > AA+ > AA > AA-

Types of data

Type of observation can also be used to classify data

Time series data refer to measurements at different points in time
- Eg: SIA Baby bonus births per day

Cross sectional data are measurements at a single point in time
- Eg: SIA Sydney housing prices by suburb, diabetes data

Data type can influence what is appropriate by way of analysis
- Total number of births per day makes sense
- Suppose marital status is coded as Single = 1, Married = 2, Divorced = 3, Widowed = 4
- Does it make sense to total the marital status of a sample of individuals?
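The marital status question can be made concrete with a small sketch. The coding scheme is the one given above; the sample of eight individuals is hypothetical and chosen only for illustration.

```python
from collections import Counter

# Coding scheme from the slide.
codes = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}

# Hypothetical sample of coded responses (not real data).
sample = [1, 2, 2, 4, 1, 3, 2, 1]

# The arithmetic works, but the result has no interpretation:
# the codes are labels, not quantities.
total_of_codes = sum(sample)

# Counting observations per category is the meaningful summary
# for nominal data.
counts = Counter(codes[c] for c in sample)

print("Sum of codes (meaningless):", total_of_codes)
for status in codes.values():
    print(f"{status:<10}{counts[status]}")
```

Relabelling the categories (say, Widowed = 1 and Single = 4) would change the sum but not the counts, which is one way to see that only the counts carry information.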

Descriptive statistics

Difficult to determine key features of data
- Need to organize & summarize data in order to extract information
- This is the role of descriptive statistics
- Descriptive statistics is useful for inferential statistics
- Note: descriptive statistics is about organizing and summarizing data, while inferential statistics is for inferring about the population parameters

There is a vast range of tools & techniques
- Some are graphical, some numerical
- Type of data may impact on which to use

Frequency distributions (for categorical data)


Want to summarize categorical data with associated counts
Lists all possible values for a variable along with the number of observations for each value
Categories need to be mutually exclusive & exhaustive

UNSW interested in transport issues
- How do people travel to campus? (http://www.facilities.unsw.edu.au/getting-uni)

Mode of transport to campus   Frequency   Relative frequency
Resident
Walk
Cycle
Car
Bus
Other
Total                                     100

Bar charts & pie charts

Provide graphical representation of frequency distributions
2011 UNSW Travel Survey sample of 5881

Bar chart of mode of transport to UNSW campus (vertical axis: frequency, 0 to 3000)

Mode       Frequency   Relative frequency
Resident   47          0.8%
Walk       628         10.7%
Bike       210         3.6%
Car        1032        17.5%
Bus        1188        20.2%
Other      2776        47.2%
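As a rough sketch, the relative frequencies shown on the bar chart can be reproduced from the survey counts. The counts are those reported for the 2011 UNSW Travel Survey above; the variable names are illustrative.

```python
# Counts by mode of transport, as reported on the bar chart.
frequencies = {
    "Resident": 47,
    "Walk": 628,
    "Bike": 210,
    "Car": 1032,
    "Bus": 1188,
    "Other": 2776,
}

total = sum(frequencies.values())

# Relative frequency = category count / total count, here as a percentage.
relative = {mode: 100 * count / total for mode, count in frequencies.items()}

print(f"{'Mode':<10}{'Freq':>6}{'Rel %':>8}")
for mode, count in frequencies.items():
    print(f"{mode:<10}{count:>6}{relative[mode]:>8.1f}")
print(f"{'Total':<10}{total:>6}{sum(relative.values()):>8.1f}")
```

Because the categories are mutually exclusive and exhaustive, the counts add up to the sample size and the relative frequencies add up to 100.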

Bar charts & pie charts

Pie charts show relative frequencies
What's in the Other category?
Can we use a bar chart to summarize quantitative data?

Pie chart of mode of transport to UNSW campus
Resident 1%, Walk 11%, Bike 4%, Car 17%, Bus 20%, Other 47%

Modified pie chart of mode of transport to UNSW campus
Resident 1%, Walk 11%, Bike 4%, Car 17%, Bus 20%, Bus & train 45%, Other 2%

Frequency distributions (for quantitative data)

A frequency distribution of a quantitative data set is a table that groups the data into classes (intervals) with the corresponding frequencies
- Each interval has the same width, determined by
  width = (largest value - smallest value)/(number of classes)
- The intervals need to be mutually exclusive & exhaustive

How many classes? (EXCEL calls them bins)
- Too many doesn't summarize
- Too few gives no information
- No set rules, although more observations call for more classes
- Usually at least 5 and no more than 15

Worked example: Keller Ex 3.3: Business Statistics Marks


65 71 66 79 65 82 80 86 67 64 62 74 67 72 68 81 53 70 76 73 73 85 83 80 67 78 68 67 62 83 72 85 72 77 64 77 89 87 78 79 59 63 84 74 74 59 66 71 68 72 75 74 77 69 60 92 69 69 73 65

Let's use 5 classes
Width = ?
Frequency distribution?
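One way to check an answer to this worked example is the sketch below, which applies the width formula literally to the 60 marks listed above. The handling of the top boundary (the largest mark is placed in the last class) is an assumption of this sketch; in practice the width is often rounded to a more convenient number such as 8 or 10.

```python
marks = [
    65, 71, 66, 79, 65, 82, 80, 86, 67, 64,
    62, 74, 67, 72, 68, 81, 53, 70, 76, 73,
    73, 85, 83, 80, 67, 78, 68, 67, 62, 83,
    72, 85, 72, 77, 64, 77, 89, 87, 78, 79,
    59, 63, 84, 74, 74, 59, 66, 71, 68, 72,
    75, 74, 77, 69, 60, 92, 69, 69, 73, 65,
]

num_classes = 5
low, high = min(marks), max(marks)

# width = (largest value - smallest value) / (number of classes)
width = (high - low) / num_classes

# Count the marks falling in each class [lower, upper).
# The largest mark would land one class too far, so it is
# clamped into the last class.
counts = [0] * num_classes
for mark in marks:
    index = min(int((mark - low) / width), num_classes - 1)
    counts[index] += 1

for i, count in enumerate(counts):
    lower = low + i * width
    upper = lower + width
    print(f"{lower:5.1f} to {upper:5.1f}: {count}")
```

The classes are mutually exclusive and exhaustive by construction, so the frequencies sum to the number of observations.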

Histograms

A graph of a frequency distribution (for numerical data) is called a histogram
The horizontal axis shows the class limits or class midpoints
The vertical axis shows frequencies

Figure: "Aussie" Marks histogram. Vertical axis: Frequency (0 to 35); horizontal axis: Marks, with class limits 49, 64, 74, 84, 100.

Histograms

Beware of EXCEL features
- Should be no gaps between bars for quantitative data
- Classes defined by upper limits when class midpoints may be more natural
- Bar areas should be proportional to frequencies (refer Ch 8, p 265)

Figure: EXCEL output "Histogram for Example 2.6", labelled incorrect. Vertical axis: Frequency (0 to 30); horizontal axis: Bin (50, 60, 70, 80, 90, More).

Other distributions & displays

Can convert information in frequency distributions into:
- Cumulative frequency or (cumulative) relative frequency distributions
  Aussie marks eg: How many students got a credit or better?
- Associated cumulative histograms & ogives

Stem-and-leaf displays
- May be interesting information lost in histograms
- Do examiners avoid marks close to borderline?

See Keller 3.1 for more discussion & examples
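A cumulative frequency distribution can be sketched as follows. The "Aussie" marks data are not reproduced in these slides, so this sketch reuses the 60 marks from Keller Ex 3.3 and borrows the class limits 49, 64, 74, 84, 100 from the histogram axis. Treating 65 as the start of a credit is an assumption suggested by those limits.

```python
marks = [
    65, 71, 66, 79, 65, 82, 80, 86, 67, 64,
    62, 74, 67, 72, 68, 81, 53, 70, 76, 73,
    73, 85, 83, 80, 67, 78, 68, 67, 62, 83,
    72, 85, 72, 77, 64, 77, 89, 87, 78, 79,
    59, 63, 84, 74, 74, 59, 66, 71, 68, 72,
    75, 74, 77, 69, 60, 92, 69, 69, 73, 65,
]

# Upper class limits taken from the "Aussie" marks histogram axis.
upper_limits = [49, 64, 74, 84, 100]
n = len(marks)

# Cumulative frequency: number of marks at or below each upper limit.
cumulative = [sum(1 for m in marks if m <= limit) for limit in upper_limits]

# Cumulative relative frequency, as a percentage of all observations.
cumulative_relative = [100 * c / n for c in cumulative]

for limit, c, r in zip(upper_limits, cumulative, cumulative_relative):
    print(f"<= {limit:>3}: {c:>3} ({r:5.1f}%)")

# Assumption: a credit starts at 65, so "credit or better" is
# everyone above the 64 limit.
credit_or_better = n - cumulative[1]
print("Credit or better:", credit_or_better)
```

Plotting the cumulative relative frequencies against the upper limits and joining the points gives an ogive.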

Key features: Shapes of histograms

Symmetry
- Left half of histogram is a mirror image of right half
- Famous bell-shape is symmetric

Skewness
- Asymmetric histogram
- Long tail to the right (positively skewed)
- Long tail to the left (negatively skewed)
- May be associated with outliers

Number of modal classes
- Modal class is class with highest frequency
- Histograms may be unimodal or multimodal, or have no mode

Key features: Shapes of histograms


A key feature: Shapes of histograms


From Keller Ex 3.2
Histogram of returns on Investment A is
Modal class is
No. of modal classes?
Therefore histogram is

Bivariate relations

Want to extend univariate analyses to characterize relationships between variables

Contingency table (cross-tabulation table)
- For relationship between qualitative variables

Scatter plots
- For relationship between quantitative variables, e.g. height and weight
- If one of the variables is time we get a time series plot (line chart)

Mode of transport example: Staff/student differences?

Want to understand if there is a relation between commuter type and mode of transport

Commuter type
Mode          Staff   Student   Total
Resident          0        47      47
Walk             97       531     628
Cycle            52       158     210
Car             472       560    1032
Bus             186      1002    1188
Bus & train     230      2439    2669
Other            25        82     107
Total          1062      4819    5881

Figure: Bar chart of mode of transport by commuter type. Vertical axis: frequency (0 to 3000); separate bars for staff and students within each mode.

What does the bar graph highlight?
Is there a better representation?
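One possible answer to the representation question is to compare percentages within each commuter type instead of raw counts, since students outnumber staff by more than four to one. The sketch below computes these column percentages from the contingency table above; the function name `column_percentages` is illustrative.

```python
modes = ["Resident", "Walk", "Cycle", "Car", "Bus", "Bus & train", "Other"]
staff = [0, 97, 52, 472, 186, 230, 25]
students = [47, 531, 158, 560, 1002, 2439, 82]


def column_percentages(counts):
    """Return each count as a percentage of its column total."""
    total = sum(counts)
    return [100 * c / total for c in counts]


staff_pct = column_percentages(staff)
student_pct = column_percentages(students)

print(f"{'Mode':<13}{'Staff %':>9}{'Student %':>11}")
for mode, s, t in zip(modes, staff_pct, student_pct):
    print(f"{mode:<13}{s:>9.1f}{t:>11.1f}")
```

If the two columns of percentages differ noticeably (for example in the Car and Bus & train rows), that suggests a relation between commuter type and mode of transport.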

Scatter plots


SIA: Cars in China


SIA: Cars in China

What are the key features of these data?

Read Keller Ch 3.4

How does this graph rate in terms of characteristics for graphical excellence?

Time series plot

Bivariate relationship between variable Y & time


With time series data order matters

SIA: Baby bonus
- Relatively sophisticated time plot

Business cycles defined by extended periods of growth or contraction

SIA: Petrol prices


Figure: Monthly unleaded petrol prices for selected capital cities. Vertical axis: average price (cents per litre), 0 to 180. Series: Sydney Metro, Brisbane Metro, Melbourne Metro.

SIA: Petrol prices

What are the key features of these data?

Do these data inform motorists about the best time to buy petrol?


SIA: Petrol prices


SIA: Petrol prices


Pattern representative of daily price movements in Sydney in winter 2006

Consider daily data June 11 2006 (Sunday) to July 10 2006
- Notable price variation from day to day
- Determine day of weekly peak and trough

Consider longer period of daily data (not shown)
- Common day for prices to peak was Thursday
- No peaks (or troughs) on Monday, Tuesday, Friday, Saturday or Sunday
- Common day for prices to trough was Tuesday
- No troughs on Wednesday, Thursday, Friday or Saturday
- No days when prices both peaked & troughed


SIA: Petrol prices

Would bar or pie charts be useful in displaying the weekly peak & trough data?
