
Jul 29, 2016


Instructor: Humaira Husain (HHn) Lecturer, Dept. of Economics

Office: NAC 819

Consultation hours: ST: 10am-1pm

MW: 9:10am-9:40am and 1:00pm-1:30pm

Email: humaira@northsouth.edu

Course Objective:

The aim of this course is to help students become familiar with the standard statistical techniques frequently used in Business and Economics. The course introduces advanced topics in statistics and their application in Business and Economics. It is thoroughly application oriented and serves as a prerequisite for courses such as Introduction to Econometrics (ECO372) and higher-level quantitative, research-based courses in Business studies.

Upon completion of this course, a student is expected to be able to carry out applied statistical research on various topics in Business.

Prerequisite: BUS172

Text Book: Statistics for Business and Economics, 6th edition / 7th edition, by Paul Newbold, William L. Carlson and Betty Thorne.

References:

1. Basic Statistics for Business & Economics, 7th edition, by Lind / Marchal / Wathen

Grading Policy: The course grade will be based on the NSU grading policy. The weights are as follows:

Best 1 of Quiz-1 & Quiz-2 : 15%
Assignment-1              : 10%
Mid-Term-1                : 25%
Mid-Term-2                : 25%
Final Examination         : 25%
Total                     : 100%


1. Hypothesis Testing part-1 (chapter-10)

2. Hypothesis Testing part-2 (chapter-11)

3. Simple regression (Chapter 12)

6. Goodness of Fit and Contingency Tables (Chapter 16)

7. Analysis of Variance (Chapter 17)

Course regulations:

1. Mobile phones must be switched off during class hours.

2. Make-up exams will be arranged only in case of emergency, subject to submission of genuine documents. There will be NO MAKE-UP for quiz tests under any circumstances.

3. Distracting the instructor by talking to other classmates is not allowed.

4. Students are free to consult the instructor regarding the class material only during the office hours mentioned.

5. Attendance is important for earning a satisfactory grade in this course.

Spring 2014

BUS 173 Applied Statistics -2

Worksheet -1

Instructor: Humaira Husain

theorem

Q1. According to an IRS study, it takes a mean of 330 minutes for taxpayers to prepare, copy and electronically file a tax form. The distribution of times is Normal with a standard deviation of 80 minutes. A consumer surveillance agency selects a random sample of 40 taxpayers.

a. Calculate the standard error of the mean for this sample.

b. What is the likelihood the sample mean is greater than 320 minutes?

c. What is the likelihood the sample mean is between 320 and 350 minutes?

d. What is the likelihood the sample mean is greater than 350 minutes?
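The quantities asked for in Q1 can be checked numerically. The following is a minimal sketch, assuming only the figures stated in the question (mean 330, standard deviation 80, sample size 40); the variable names are illustrative and not part of the worksheet.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 330, 80, 40            # population mean, sd and sample size from Q1
std_error = sigma / sqrt(n)           # part a: standard error of the mean

# Sampling distribution of the sample mean: Normal(mu, std_error)
xbar = NormalDist(mu, std_error)

p_above_320 = 1 - xbar.cdf(320)               # part b
p_320_to_350 = xbar.cdf(350) - xbar.cdf(320)  # part c
p_above_350 = 1 - xbar.cdf(350)               # part d

print(f"standard error      = {std_error:.3f}")
print(f"P(mean > 320)       = {p_above_320:.4f}")
print(f"P(320 < mean < 350) = {p_320_to_350:.4f}")
print(f"P(mean > 350)       = {p_above_350:.4f}")
```

The same pattern applies to the other questions on this worksheet: only the mean, standard deviation, sample size and cut-off values change.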

Q2. The rent for a one-bedroom apartment in California follows a Normal distribution with a mean of 2200 dollars per month and a standard deviation of 250 dollars per month. What is the probability of selecting a sample of 50 one-bedroom apartments and finding the mean to be at least 1950 dollars per month?

Q3. Antelope Coffee is considering the possibility of opening a gourmet coffee shop. Shops will be successful if per capita annual income is above $60,000. The standard deviation of income is $5,000. A random sample of 36 people was obtained and the mean income was $62,300. Does this sample provide evidence that the shop will be opened?

Q4. Given a population with mean $\mu = 100$ and variance $\sigma^2 = 81$, the central limit theorem applies when the sample size $n \ge 25$. A random sample of size $n = 25$ is obtained.

a. What are the mean and variance of the sampling distribution of the sample mean?

b. What is the probability that $\bar{x} > 102$?

c. What is the probability that $98 \le \bar{x} \le 101$?

d. What is the probability that $\bar{x} \le 101.5$?

Spring 2014

BUS 173 Applied Statistics -2

Worksheet 2

Instructor: Humaira Husain

Q1. A personnel manager has found that historically the scores on aptitude tests given to applicants for entry-level positions follow a normal distribution with a standard deviation of 32.4 points. A random sample of nine test scores from the current group of applicants had a mean score of 187.9 points.

a. Find a 90% confidence interval for the population mean score of the current group of applicants.

b. Based on these sample results, a statistician found for the population mean a confidence interval extending from 165.8 to 210.0 points. Find the confidence level of this interval.
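Both parts of Q1 rest on the same relationship: the interval's half-width equals the z value times the standard error. A minimal sketch, assuming the population standard deviation is known (so the normal distribution applies) and using only the figures from the question:

```python
from math import sqrt
from statistics import NormalDist

sigma, n, xbar = 32.4, 9, 187.9       # figures given in Q1
std_error = sigma / sqrt(n)
z = NormalDist()                      # standard normal distribution

# Part a: 90% confidence interval for the population mean
z_crit = z.inv_cdf(1 - 0.10 / 2)
lower = xbar - z_crit * std_error
upper = xbar + z_crit * std_error

# Part b: recover the confidence level from the interval 165.8 to 210.0
half_width = (210.0 - 165.8) / 2
level = 2 * z.cdf(half_width / std_error) - 1

print(f"90% CI: ({lower:.2f}, {upper:.2f})")
print(f"confidence level of (165.8, 210.0): {level:.3f}")
```

Q2 follows the same steps with a standard deviation of 0.45, a sample of 25 and a sample mean of 2.90.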

Q2. A college admissions officer for an M.B.A program has determined that

historically applicants have undergraduate grade point averages that are

normally distributed with standard deviation 0.45. From a random sample of 25

applications from the current year, the sample mean grade average is 2.90.

a. Find 95% confidence interval for population mean.

b. Based on these sample results, a statistician computes for the population mean

a confidence interval extending from 2.81 to 2.99. Find the confidence level

associated with this interval.

Q3. The owner of Britten's Egg Farm wants to estimate the mean number of eggs laid per chicken. A sample of 20 chickens shows they laid an average of 20 eggs per month with a standard deviation of 2 eggs per month.

a. Find the value of the population mean.

b. Explain why we need to use the t distribution. What assumption do you need to make?

c. For a 95% confidence interval, what is the value of t?

d. Develop a 90% confidence interval for the population mean.

Instructor: Humaira Husain

Topic: Hypothesis Testing

Q1. Test the hypotheses

$H_0: \mu = 100$

$H_1: \mu > 100$

using a random sample of size $n = 25$, a probability of Type I error equal to 0.05, and the following sample statistics.

a. $\bar{x} = 106$, $s = 15$

b. $\bar{x} = 104$, $s = 10$

c. $\bar{x} = 95$, $s = 10$

d. $\bar{x} = 92$, $s = 18$

Q2. Test the hypotheses

$H_0: \mu = 100$

$H_1: \mu < 100$

using a random sample of size $n = 36$, a probability of Type I error equal to 0.05, and the following sample statistics.

a. $\bar{x} = 106$, $s = 15$

b. $\bar{x} = 104$, $s = 10$

c. $\bar{x} = 95$, $s = 10$

d. $\bar{x} = 92$, $s = 18$

Q3. The accounts of a corporation show that , on average, accounts payable are $125.32.

An auditor checked a random sample of 16 of these accounts. The sample mean was

$131.78 and the sample standard deviation was $25.41. Assume that the population

distribution is normal . Test at the 5% significance level against a two sided alternative

the null hypothesis that the population mean is $125.32.

Q4. A process that produces bottles of shampoo when operating correctly , produces

bottles whose contents weigh, on average , 20 ounces. A random sample of nine bottles

from a single production run yielded the following content weights (in ounces):

21.4  19.7  19.7  20.6  20.8  20.1  19.7  20.3  20.9

Assuming that the population distribution is normal, test at the 5% level against a two

sided alternative the null hypothesis that the process is operating correctly.
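For Q4 the test statistic has to be built from the raw weights. The sketch below computes the sample mean, the sample standard deviation and the t statistic for the nine weights listed above; the critical value 2.306 is taken from a standard t table (8 degrees of freedom, 2.5% in each tail) and is an assumption of this example rather than part of the worksheet.

```python
from math import sqrt
from statistics import mean, stdev

weights = [21.4, 19.7, 19.7, 20.6, 20.8, 20.1, 19.7, 20.3, 20.9]  # ounces, from Q4
mu0 = 20.0                              # mean content when the process is correct

n = len(weights)
xbar = mean(weights)
s = stdev(weights)                      # sample standard deviation (n - 1 divisor)
t_stat = (xbar - mu0) / (s / sqrt(n))

t_crit = 2.306                          # t table value, df = 8, two-sided 5% test
reject = abs(t_stat) > t_crit

print(f"mean = {xbar:.3f}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
```

Questions that give the sample mean and standard deviation directly (Q3 and Q5, for instance) skip the first two steps and go straight to the t statistic.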

Q5. In contract negotiations a company claims that a new incentive scheme has resulted in average weekly earnings of at least $400 for all customer service workers. A union representative takes a random sample of 15 workers and finds that their weekly earnings have an average of $381.35 and a standard deviation of $48.60. Assume a normal distribution.

a. Test the company's claim.

b. If the same sample results had been obtained from a random sample of 50 employees, could the company's claim be rejected at a lower significance level than that used in part (a)?

Q6. The production manager of Northern Windows has asked you to evaluate a proposed new procedure for producing its Regal line of double-hung windows. The present process has a mean production of 80 units per hour with a population standard deviation of 8. Is there strong evidence that the mean production level is higher with the new process? Consider the level of risk to be 5%; the sample size is 25 and the resulting sample mean is 83.

Q7. The production manager of Twin Forks Ball Bearing has asked for your assistance in evaluating a modified ball bearing production process. Ball bearing weights are normally distributed with a mean of 5 ounces and a standard deviation of 0.1 ounces. A new raw material supplier was used for a recent production run and the manager wants to know if that change has resulted in a lowering of the mean weight of ball bearings. Consider the level of risk to be .05; the sample size is 16 and the sample mean is 4.962.

Q8. The production manager of Circuits Unlimited has asked for your assistance in analyzing a production process. This process involves drilling holes whose diameters are normally distributed with population mean 2 inches and population standard deviation 0.06 inches. A random sample of 9 measurements had a sample mean of 1.95 inches. Use a significance level of 5% to determine if the observed sample mean is unusual and suggests that the drilling machine should be adjusted.

Q9. You have been asked to evaluate single employer plans after the establishment of the

health benefit guarantee corporation. A random sample of 76 percentage changes in

promised health benefits was observed. The sample mean percentage change was .078

and the sample standard deviation was .201. Find and interpret the p value of a test of

null hypothesis that the population percentage change is 0 against the two sided

alternative.

Q10. A sample of 64 observations is selected from a normal population. The sample mean is 215 and the population standard deviation is 15. Conduct the following test of hypothesis using the 3% level of significance.

$H_0: \mu = 220$

$H_1: \mu < 220$
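Because the population standard deviation is given in Q10, the test statistic is a z value. A minimal sketch of the lower-tailed test against 220, using only the figures in the question; the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, sigma = 64, 215, 15          # figures given in Q10
mu0, alpha = 220, 0.03                # hypothesised mean and significance level

z_stat = (xbar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(alpha)  # lower-tail critical value
p_value = NormalDist().cdf(z_stat)    # area to the left of the test statistic
reject = z_stat < z_crit

print(f"z = {z_stat:.3f}, critical z = {z_crit:.3f}, "
      f"p value = {p_value:.4f}, reject H0: {reject}")
```

Q6 to Q8 on this worksheet use the same statistic; only the direction of the alternative and the significance level differ.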

Instructor: Humaira Husain

BUS173

Worksheet-4

[Reference: Keller, G ,9th edition]

Q1. Calculate the probability of a Type II error for the following test of hypothesis, given that $\mu = 203$.

$H_0: \mu = 200$

$H_1: \mu \neq 200$

$\alpha = .05$, $\sigma = 10$, $n = 100$
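The Type II error probability in Q1 is the chance that the sample mean lands in the non-rejection region when the true mean is 203. The sketch below assumes the question is read as a two-sided test of a mean of 200 with a significance level of .05, a population standard deviation of 10 and a sample size of 100.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_true = 200, 203               # hypothesised and actual population means
sigma, n, alpha = 10, 100, 0.05

std_error = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Non-rejection region for the sample mean under H0 (two-sided test)
lower = mu0 - z_crit * std_error
upper = mu0 + z_crit * std_error

# Probability of falling in that region when the true mean is mu_true
actual = NormalDist(mu_true, std_error)
beta = actual.cdf(upper) - actual.cdf(lower)

print(f"do not reject if {lower:.2f} < sample mean < {upper:.2f}")
print(f"P(Type II error) = {beta:.4f}")
```

For a one-sided test, as in Q2 and Q6, the non-rejection region has a single boundary and `beta` reduces to one `cdf` evaluation (or its complement).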

Q2. A statistics practitioner wants to test the following hypotheses with $\sigma = 20$ and $n = 100$.

$H_0: \mu = 100$

$H_1: \mu > 100$

a. Using $\alpha = 0.10$, find the probability of a Type II error for this test of hypothesis, given that $\mu = 102$.

b. Repeat part a with $\alpha = 0.02$.

c. Describe the effect on $\beta$ of decreasing $\alpha$.

$H_0: \mu = 50$

$H_1: \mu < 50$

$\alpha = .05$, $\sigma = 10$, $n = 40$

Q4. Compute the p value in order to test the following hypotheses, given that $\bar{x} = 52$, $n = 9$, $\sigma = 5$ and $\alpha = .03$.

$H_0: \mu = 50$

$H_1: \mu > 50$

a. Repeat part a with $n = 25$.

b. Repeat part a with $n = 100$.

c. Describe what happens to the value of the test statistic and the p value when the sample size increases.
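The effect of sample size on the p value in Q4 can be seen by running the same calculation for each value of n. A minimal sketch, assuming an upper-tailed test of a mean of 50 with a sample mean of 52 and a population standard deviation of 5; the helper name `upper_tail_p_value` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(xbar, mu0, sigma, n):
    """Return (z statistic, p value) for an upper-tailed z test."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)

# Same sample mean and sigma, three different sample sizes
results = {n: upper_tail_p_value(52, 50, 5, n) for n in (9, 25, 100)}

for n, (z, p) in results.items():
    print(f"n = {n:3d}: z = {z:.2f}, p value = {p:.5f}")
```

Q5 and Q7 vary the standard deviation instead; passing a different `sigma` to the same helper, with the tail reversed for a lower-tailed alternative, covers those cases.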

Q5. A statistics practitioner formulated the following hypotheses and learned that $\bar{x} = 190$, $n = 9$ and $\sigma = 50$. Compute the p value in order to test the following hypotheses.

$H_0: \mu = 200$

$H_1: \mu < 200$

a. Repeat part a with $n = 30$.

b. Repeat part a with $n = 10$.

c. Describe what happens to the value of the test statistic and the p value when the standard deviation decreases.

Q6. Find the probability of a Type II error for the following test of hypothesis, given that $\mu = 196$. Consider the significance level to be 10%, the population standard deviation to be 30 and the sample size to be 25.

$H_0: \mu = 200$

$H_1: \mu < 200$

a. Repeat part (a) with sample size = 100.

b. Describe the effect on the Type II error of increasing the sample size.

Q7. Compute the p value in order to test the following hypotheses, given that $\bar{x} = 990$, $n = 100$ and $\sigma = 50$.

$H_0: \mu = 1000$

$H_1: \mu < 1000$

a. Repeat part a with $\sigma = 50$.

b. Repeat part a with $\sigma = 100$.

c. Describe what happens to the value of the test statistic and the p value when the standard deviation increases.

Instructor: Humaira Husain

Topic: Two sample test of Hypothesis (Hypothesis testing II )

Q1. In random samples of 12 from each of two normal populations, we found the following statistics:

$\bar{x}_1 = 74$, $s_1 = 18$

$\bar{x}_2 = 71$, $s_2 = 16$

a. Test with $\alpha = .05$ to determine whether we can infer that the population means differ.

b. Repeat part (a) increasing the standard deviations to $s_1 = 210$ and $s_2 = 198$, and describe the result.

c. Repeat part (a) with sample size = 150 and discuss the effect of increasing the sample size.

d. Repeat part (a) changing the mean of sample 1 ($\bar{x}_1$) to 76. Discuss the effect of increasing $\bar{x}_1$.

Q2. A number of restaurants feature a device that allows credit card users to

swipe their cards at the table. It allows user to specify a percentage or a dollar

amount to leave as a tip. In an experiment to see how it works , a random sample

of credit card users was drawn. Some paid the usual way and some used new

device. The percent left as a tip was recorded and listed below. Can you infer that

users of new device leave larger tips?

Usual

10.3

15.2

13

9.9

12.1

13.4

Device 13.6

15.7

12.9

12.2

14.9

13.2 12.0

12.1

13.9 15.7

15.4

17.4

Q3. Every month a clothing store conducts an inventory and calculates losses from theft. The store would like to reduce these losses and is considering two methods. The first is to hire a security guard, and the second is to install cameras. To help decide which method to choose, the manager hired a security guard for 6 months. During the next 6-month period, the store installed cameras. The monthly losses were recorded and are listed here. The manager decided that because the cameras were cheaper than the guard, he would install the cameras unless there was enough evidence to infer that the guard was better. What should the manager do?

Security guard   355   284   401   398   477   254

Cameras          486   303   270   386   411   435
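The guard-versus-cameras comparison in Q3 can be worked as an equal-variances (pooled) two-sample t test. The sketch below uses the twelve monthly losses listed above. The question does not state a significance level, so the 5% level is an assumption here, and the critical value -1.812 is read from a standard t table with 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

guard = [355, 284, 401, 398, 477, 254]     # monthly losses with the security guard
cameras = [486, 303, 270, 386, 411, 435]   # monthly losses with the cameras

n1, n2 = len(guard), len(cameras)

# Pooled variance estimate under the equal-variances assumption
pooled_var = ((n1 - 1) * variance(guard) + (n2 - 1) * variance(cameras)) / (n1 + n2 - 2)
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))

# H1: mean loss with the guard is lower than with the cameras (lower-tailed test)
t_stat = (mean(guard) - mean(cameras)) / std_error
t_crit = -1.812                            # t table value, df = 10, 5% lower tail (assumed level)
guard_is_better = t_stat < t_crit

print(f"t = {t_stat:.3f}, df = {n1 + n2 - 2}, evidence the guard is better: {guard_is_better}")
```

When there is not enough evidence that the guard is better, the manager's stated rule points to the cheaper option, the cameras.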

Q4. How do drivers react to sudden large increases in the price of gasoline? To

help answer the question, a statistician recorded the speeds of cars as they

passed a large service station. He recorded the speeds (mph) in the same location

after the service station sign showed that the price of gasoline had risen by 15

cents. Can we conclude that speeds differ?

43

36

31

30

28

36

27

36

35

30

32

36

26

30

32

30

32

33

36

31

32

29

28

39

operating room contamination resulted in the accompanying table. The new soap

was tested in a sample of eight operating rooms in the greater Seattle area

during the last year .

Operating Room   A

Before    6.6   6.5   9.0   10.3   11.2   8.1   6.3   11.6

After     6.8   2.4   7.4    8.5    8.1   6.1   3.4    2.0

are lower after use of the new soap?

(To solve the above problems, assume that the population variances are unknown but EQUAL, so $\sigma_1^2 = \sigma_2^2$, and use the t distribution with $n_1 + n_2 - 2$ df.)

Reference: 1. Gerald Keller's Statistics for Management and Economics, 9th edition, Chapter 13's problem exercises.

2. Lind / Marchal / Wathen, Basic Statistics for Business & Economics, Chapter 11's exercises.

( Z statistic is rarely used in TWO SAMPLE test of hypothesis because in most

cases population variances are NOT known.)

Worksheet-6 BUS173.3

Topic: Simple regression

Q1. It was hypothesized that the number of bottles of an imported premium beer

sold per evening in the restaurants of a city depends linearly on the average costs

of meals in the restaurants . The following results were obtained for a sample of

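$n = 17$ restaurants, of approximately equal size, where

$\bar{x} = 25.5$, $\bar{y} = 16.0$,

$\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = 350$, $\frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = 180$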

b. Interpret the slope of the sample regression line .

sample line?

Q2. Find and interpret the coefficient of determination for the regression of DVD

system sales on price, using the following data.

Sales   Price
420     5.5
380     6
350     6.5
400     6
440     5
380     6.5
450     4.5
420     5
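The coefficient of determination in Q2 follows from the sums of squares of the eight sales and price pairs. A minimal sketch using only those data points; the variable names are illustrative.

```python
from statistics import mean

sales = [420, 380, 350, 400, 440, 380, 450, 420]   # dependent variable y
price = [5.5, 6, 6.5, 6, 5, 6.5, 4.5, 5]           # independent variable x

x_bar, y_bar = mean(price), mean(sales)
sxx = sum((x - x_bar) ** 2 for x in price)
syy = sum((y - y_bar) ** 2 for y in sales)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(price, sales))

slope = sxy / sxx                      # least squares slope b1
intercept = y_bar - slope * x_bar      # least squares intercept b0
r_squared = sxy ** 2 / (sxx * syy)     # share of variation in sales explained by price

print(f"sales = {intercept:.2f} + ({slope:.2f}) * price, R square = {r_squared:.4f}")
```

The same three sums (`sxx`, `syy`, `sxy`) are all that the other simple regression questions on this worksheet need.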

Q3 A fast food chain decided to carry out an experiment to assess the influence of

advertising expenditure on sales . Different relative changes in advertising

expenditure , compared to the previous year were made in eight regions of the

country and resulting changes in sales levels were observed . The accompanying

table shows the results.

Increase in advertising expenditure %   0     4     14     10    9      8     6     1

Increase in sales %                     2.4   7.2   10.3   9.1   10.2   4.1   7.6   3.5

increase in advertising expenditure.

b. Find a 90% confidence interval for the slope of the population regression

line.

Q4. A sample of 25 blue collar employees at a production plant was taken. Each employee was asked to assess his or her own job satisfaction $x$ on a scale from 1 to 10. In addition, the number of days absent $y$ from work during the last year was found for these employees. The sample regression line $\hat{y} = 12.6 - 1.2x$ was estimated by least squares for these data. Also found were

$\bar{x} = 6.0$, $\sum_{i=1}^{25} (x_i - \bar{x})^2 = 130.0$, $SSE = 80.6$

the null hypothesis that job satisfaction has no linear effect on absenteeism.

n 1

250

Test against a two-sided alternative the null hypothesis that the slope of the

population regression line is 0.

Q6. It might be that watching television reduces the amount of physical exercise,

causing weight gains. The number of pounds each child was overweight was

recorded ( a negative number indicates the child is underweight ). In addition, the

number of hours of television viewing per week was also recorded. These data are

listed here.

Television

42

38 28 29

34

Overweight

8

5

3

18

Television

36

18

25

35

-1

37

13

38

31

33

19

29

14

-9

Overweight

14

-7

a)Calculate the sample regression line and describe what the coefficients tell you

about the relationship between the two variables.

b) Determine the coefficient of determination and describe what it tells you.

c) Conduct a test to determine whether there is evidence of a linear relationship

between weight and watching television.

d) Estimate or predict with 90% confidence the mean overweight for children who watch television 20 hours per week.

Worksheet -7

Instructor: Humaira Husain

Spring 2014

Q1. In an attempt to determine the factors that affect the amount of energy used, 200 households were analyzed. In each, the number of occupants and the amount of electricity used were measured. We have the following sample statistics: x bar = 4.75, y bar = 762.6, variance of x = 4.84, variance of y = 56725, covariance between x and y = 310.0.

a) Determine the regression line and interpret the results.

b) Assess the fit of the regression line (compute the standard error of the estimate and R square).

c) Estimate with 90% confidence the mean electricity consumption for households in which the number of occupants = 5.
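When only summary statistics are given, as in Q1, the regression line and its fit measures come straight from the variances and the covariance. A minimal sketch using the figures in the question; the names are illustrative, and the final line gives only the point estimate for part c (the interval additionally needs a t table value).

```python
from math import sqrt

n = 200
x_bar, y_bar = 4.75, 762.6             # sample means from Q1
var_x, var_y, cov_xy = 4.84, 56725, 310.0

# Part a: least squares coefficients
b1 = cov_xy / var_x
b0 = y_bar - b1 * x_bar

# Part b: R square and standard error of the estimate
r_squared = cov_xy ** 2 / (var_x * var_y)
sse = (n - 1) * (var_y - cov_xy ** 2 / var_x)
s_e = sqrt(sse / (n - 2))

# Part c (point estimate only): predicted mean consumption for 5 occupants
y_hat_5 = b0 + b1 * 5

print(f"y hat = {b0:.2f} + {b1:.2f} x")
print(f"R square = {r_squared:.3f}, standard error of estimate = {s_e:.1f}")
print(f"predicted mean at x = 5: {y_hat_5:.1f}")
```

Q2 and Q3 on this worksheet are stated in the same way, so the same formulas apply with their own means, variances, covariance and sample size.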

measure of poverty than is currently in use. To help acquire information she

recorded the annual household income ( in thousand dollars) and the amount of

money spent on food during one week for a random sample of households. We

have the following sample statistics:

x bar = 59.42, y bar = 270.3, variance of x = 115.24, variance of y = 1797.25, covariance between x and y = 225.66, n = 150

b) Determine the coefficient of determination and describe what it tells you.

c) Conduct a test whether there is evidence of linear relationship between

household income and food budget.

d) Predict or forecast the mean food budget of a family when the household

income is 50000 dollars. Use 90% confidence level.

Q3. An economist wanted to investigate the relationship between office rents (the dependent variable) and vacancy rates. Accordingly, he took a random sample of monthly office rents and the percentage of vacant office space in 30 different cities. The sample statistics are as follows: x bar = 11.33, y bar = 17.2, variance of x = 35.47, variance of y = 11.24, covariance between x and y = 10.78, n = 30.

b) Can we infer that office rents and vacancy rates are linearly related ?

c) Forecast or predict the mean office rent when the vacancy rate is 10% with

95% confidence.

Topic: Multiple regression

Instructor: Humaira Husain

Spring 2014

explain household milk consumption:

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$

$x_1$ = Weekly income (in hundreds of dollars)

$x_2$ = Family size

The least squares estimates of the regression parameters were as follows:

$b_0 = 0.025$, $b_1 = 0.052$, $b_2 = 1.14$

b) is it possible to provide a meaningful interpretation of the estimate b0 ?

and

SSR = 88.2

d) Find the adjusted coefficient of determination.

In the above problem the standard errors are as follows:

$s_{b_1} = 0.023$, $s_{b_2} = 0.35$

e) Test against the appropriate one-sided alternative the null hypothesis that, for fixed family size, milk consumption does not depend linearly on income.

f) Find a 95% confidence interval for $\beta_2$.

Q2. A study was conducted to determine whether certain features could be used

to explain variability in the prices of furnaces. For a sample of 19 furnaces the

following regression model was estimated

$R^2 = 0.84$

where

$y_i$ = Price in dollars

$x_1$ = Rating of furnace

$x_2$ = Energy efficiency ratio

$x_3$ = Number of settings

a) Find a 95% confidence interval for the expected increase in price resulting from

an additional setting when the values of the rating and the energy efficiency ratio

remains fixed.

b) Test the null hypothesis that all else being equal the energy efficiency ratio of

furnaces does not affect their price against the alternative that the higher the

energy efficiency ratio the higher the price.

Q3. In Question number 1

a) Test the null hypothesis $H_0: \beta_1 = \beta_2 = 0$.

b) Set out the analysis of variance table.

Q4. The president of a company that manufactures the drywall wants to analyze

the variables that affect demand for his product . Drywall is used to construct the

walls in houses and offices. Consequently the president decides to develop a

regression model in which dependent variable y i is monthly sales of drywall and

followings are the independent variables:

$x_1$ = Number of building permits

$x_2$ = Five-year mortgage rates

$x_3$ = Vacancy rate in apartments (in %)

$x_4$ = Vacancy rate in office buildings (in %)

R square = .8935    Adjusted R square = .8711

S of epsilon = 40.13    F = 39.86

                     Coefficients   Standard error   t statistic
Intercept            -111.83        134.34           -.83
Permits              4.76           .395             12.06
Mortgage             16.99          15.16            1.12
Apartment vacancy    -10.53         6.39             -1.65
Office vacancy       1.31           2.79             .47

b) What is the standard error of the estimate? Can you use this statistic to assess the model's fit?

c) Interpret R square value.

d) Test the overall validity of the model. ( Conduct the F test)

e) Interpret each of the coefficients.

f) Test to determine whether each of the independent variables is linearly

related to drywall demand in this model.

g) Predict next month drywall sales with 95% confidence if the number of

building permits is 50 and 5 year mortgage rate is 9% , vacancy rates are 3.6% in

apartments and 14.3% in Office buildings.

Question:5

Consider the following software generated result sheet of a multiple

regression.

Analysis of Variance

SOURCE       DF    SS     MS
Regression   5     100    20
Error        20    40     2

Predictor    Coef     St Dev.   t statistic
Constant     3.00     1.50      2.00
X1           4.00     3.00      1.33
X2           3.00     0.20      15.00
X3           0.20     0.05      4.00
X4           -2.50    1.00      -2.5
X5           3.00     4.00      0.75

b. Write the Population regression model and the sample regression line.

c. Check whether the model is valid or not.

d. Test the regression coefficients individually. Would you consider omitting any

variables? If so, which one(s)? use .05 significance level.

e. Compute the R square and adjusted R square and interpret the result.

f. What assumption do you make regarding the model error variance?

g.

assumptions of the multiple regression model do not hold in the above case, for example heteroscedasticity, and if the residual terms are serially correlated.

$[E(\varepsilon_i \varepsilon_j) \neq 0]$

DF = degrees of freedom

SS = Sum Square

MS= Mean square

Coef= coefficient

St. Dev = Standard deviation
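The overall F test and the two R square measures asked for in Question 5 can be read off the analysis of variance figures alone. A minimal sketch using the DF and SS values from the result sheet; the critical value 2.71 is taken from a standard F table (5 and 20 degrees of freedom, 5% level) and is an assumption of this example.

```python
df_reg, df_err = 5, 20                 # degrees of freedom from the ANOVA table
ssr, sse = 100, 40                     # regression and error sums of squares

sst = ssr + sse                        # total sum of squares
n = df_reg + df_err + 1                # number of observations

msr = ssr / df_reg                     # mean square, regression
mse = sse / df_err                     # mean square, error
f_stat = msr / mse

r_squared = ssr / sst
adj_r_squared = 1 - (sse / df_err) / (sst / (n - 1))

f_crit = 2.71                          # F table value, (5, 20) df, 5% level
model_is_valid = f_stat > f_crit

print(f"F = {f_stat:.2f}, R square = {r_squared:.4f}, "
      f"adjusted R square = {adj_r_squared:.4f}, valid: {model_is_valid}")
```

The individual coefficient tests in part d use the t statistics already printed in the result sheet, each compared against a t table value with 20 degrees of freedom.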

Reference: Keller and Newbold

Instructor: Humaira Husain

                 Treatment
Statistic        1      2      3
Sample size
$\bar{X}$        10     15     20
$S^2$            50     50     50

b) Repeat part a by increasing the sample size to 10.

c) Describe what happens to F statistic when sample size increases.

Q2. You are given the following statistics

                 Treatment
Statistic        1      2      3
Sample size      4      4      4
$\bar{X}$        20     22     25
$S^2$            10     10     10

b) Repeat part a by changing the variances to 25.

c) Describe what happens to F statistic of increasing the sample variance.

Q3. A management scientist believes that one way of judging whether a computer came equipped with enough memory is to determine the age of the computer. In a preliminary study, a random sample of computer users were asked to identify the brand of the computer and its age (in months). The categorized responses are shown below. Do these data provide sufficient evidence to conclude that there are differences in age between the computer brands? Use level of significance = .05.

IBM

DELL

HEWLETT - PACKARD

OTHER

17

24

10

15

12

13

21

15

Q4. In early 2001 the economy was slowing down and companies were laying off workers. A Gallup poll asked a random sample of workers how long it would be before they had significant financial hardships if they lost their jobs and could not find new ones. They also classified their income. The classifications are

More than $50,000
$30,000 to $50,000
$20,000 to $30,000
Less than $20,000

Sample   $\bar{x}_i$   $s_i^2$   $n_i$
1        22.21         121.6     39
2        18.46         90.39     14
3        15.49         85.25     81
4        9.31          65.4      67

Source of variation   Sum of Squares   Degrees of freedom
Between groups        1000             4
Within groups         750              15
Total                 1750             19

Compute mean squares for between groups and within groups . Compute the F

ratio and test the hypothesis that the group means are equal.
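The mean squares and F ratio for this table take only a few lines. A minimal sketch using the sums of squares and degrees of freedom listed above. The question does not state a significance level, so the 5% level is an assumption here, and the critical value 3.06 is read from a standard F table with 4 and 15 degrees of freedom.

```python
ss_between, ss_within = 1000, 750      # sums of squares from the table
df_between, df_within = 4, 15          # degrees of freedom from the table

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_ratio = ms_between / ms_within

f_crit = 3.06                          # F table value, (4, 15) df, 5% level (assumed)
reject_equal_means = f_ratio > f_crit

print(f"MSB = {ms_between:.0f}, MSW = {ms_within:.0f}, "
      f"F = {f_ratio:.2f}, reject equal means: {reject_equal_means}")
```

With raw data instead of a finished table, the two sums of squares are computed first from the group means and the grand mean; the remaining steps are unchanged.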

for its fleet- domestic, Japanese or European. Five cars of each type were ordered

and after 10000 miles of driving the operating cost per mile of each was assessed .

The accompanying results in cents per mile was obtained.

Domestic

Japanese

European

_________________________________________________

18

17.6

17.4

19.1

16.9

20.1

17.6

16.1

17.3

17.4

19.3

17.4

17.1

18.6

16.1

b) Test the null hypothesis that the population mean operating costs per mile are

the same for these three types of car.

Applied Statistics- II

Consider the following software generated result sheet of a multiple

regression.

Analysis of Variance

SOURCE       DF    SS     MS
Regression   5     100    20
Error        20    40     2

Predictor    Coef     St Dev.   t statistic
Constant     3.00     1.50      2.00
X1           4.00     3.00      1.33
X2           3.00     0.20      15.00
X3           0.20     0.05      4.00
X4           -2.50    1.00      -2.5
X5           3.00     4.00      0.75

b. Write the Population regression model and the sample regression line.

c. Check whether the model is valid or not.

d. Test the regression coefficients individually. Would you consider omitting any

variables? If so, which one(s)? use .05 significance level.

e. Compute the R square and adjusted R square and interpret the result.

f. What assumption do you make regarding the model error variance?

g.

assumptions of the multiple regression model do not hold in the above case, for example heteroscedasticity, and if the residual terms are serially correlated.

$[E(\varepsilon_i \varepsilon_j) \neq 0]$

DF = degrees of freedom

SS = Sum Square

MS= Mean square

Coef= coefficient

St. Dev = Standard deviation

IMPORTANT GUIDELINES!

The assignment should be submitted in formal binding (spiral binding).

Write your name and ID. Do NOT include any cover page.

Identical answers submitted by different students will result in a corresponding deduction of marks. You are not allowed to ask your instructor questions about the assignment. Please read the relevant handouts.
