
Nov 13, 2018


(Abstract)

B.Sc Programme in Statistics under Choice based Credit Semester System – Scheme and Syllabus – implemented with effect from 2009 admission onwards – approved – Orders issued.

-------------------------------------------------------------------------------------------------------------

GENERAL AND ACADEMIC BRANCH – I ‘J’ SECTION

No. GA. I/J2/2455/06 Dated, Calicut University. P.O., 25.06.2009

-------------------------------------------------------------------------------------------------------------

Read: 1. U.O. No. GAI/J2/3601/08 (Vol. II) dated 19.06.2009.

2. Minutes of the meetings of the Board of Studies in Statistics (UG) held on 29.01.2009 and 30.04.2009.

3. Item No. 2.vii(a) of the minutes of the meeting of the Faculty of Science held on 05.05.2009.

4. Item No. IIA(8) of the minutes of the meeting of the Academic Council held on 14.05.2009.

ORDER

The Choice based Credit Semester System and Grading has been introduced for the UG curriculum in the affiliated colleges of the University with effect from 2009 admission onwards, and the Regulation for the same was implemented vide the University Order read as (1) above.

Vide paper read as (2), the Board of Studies in Statistics (UG) approved the draft regulation and the syllabi of the B.Sc Programme in Statistics prepared as per the draft regulation of the Choice based Credit Semester System 2009.

The Faculty of Science, vide paper read as (3) above, endorsed the minutes of the Board of Studies in Statistics (UG).

The Academic Council, vide paper read as (4) above, approved the minutes of the Faculty of Science.

Sanction has therefore been accorded for implementing the scheme and syllabus of the B.Sc Programme in Statistics under the Choice based Credit Semester System from 2009 admission onwards.

Orders are issued accordingly. Syllabus appended.

Sd/-

DEPUTY REGISTRAR (G&A I)

For REGISTRAR

To

The Principals of all affiliated colleges -

offering B.Sc Statistics programme

Controller of Examination /EX Sn/EGI/DR B Sc/Enquiry/

System Administrator with a request to upload in the University website.

Tabulation Section / GA I ‘A’, ‘F’, ‘G’ Sections / G&A II, III Branches

Forwarded / By order

SECTION OFFICER


SYLLABUS OF B.Sc. STATISTICS MAIN – SEMESTER SYSTEM

CCSSUG 2009 (2009 admission onwards)

Semester No.  Course Code  Course Title  Instructional hours/week  Credit  Exam hours  Ratio Ext:Int

1  ST1B01  METHODOLOGY OF STATISTICS, BASIC CALCULUS AND PROBABILITY THEORY  4  4  3  3:1

STATISTICS

NUMERICAL MATHEMATICS

STATISTICAL QUALITY CONTROL

5  Open course offered by other faculties  3  4  3  3:1

6 ST6B09 TIME SERIES AND INDEX NUMBERS 5 4 3 3:1

ACTUARIAL SCIENCE

STB601(E02) department.

STB601(E03)

*For Practical paper the internal marks are based on the practical records

CCSSUG 2009 (2009 admission onwards)


Semester No.  Course Code  Course Title  Instructional hours/week  Credit  Exam hours  Ratio Ext:Int

1  ST6B01  Probability Models and Risk Theory  3  2  3  3:1

CCSSUG 2009 (2009 admission onwards)

Semester No.  Course Code  Course Title  Instructional hours/week  Credit  Exam hours  Ratio Ext:Int

1  ST5D01  Economic Statistics  3  4  3  3:1


Table showing the components and weightage for internal assessment

Components Weight

Assignment 1

Test paper 2

Seminar 1

Attendance 1

There shall be two test papers and the average grade point is to be considered for

internal assessment.
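As a rough illustration of how the component weights in the table above could combine into a single internal-assessment figure, here is a minimal Python sketch. It assumes each component is already expressed as a numeric grade point on a common scale; the function name `internal_grade_point` and the sample values are hypothetical, and the University regulation remains the authority on the actual grading rules.

```python
# Component weights taken from the table above.
WEIGHTS = {"assignment": 1, "test_paper": 2, "seminar": 1, "attendance": 1}


def internal_grade_point(assignment, test_papers, seminar, attendance):
    """Weighted average of component grade points (hypothetical helper).

    test_papers holds the grade points of the two test papers; their
    average is used for the test-paper component, as the scheme describes.
    """
    test_avg = sum(test_papers) / len(test_papers)
    points = {
        "assignment": assignment,
        "test_paper": test_avg,
        "seminar": seminar,
        "attendance": attendance,
    }
    total_weight = sum(WEIGHTS.values())
    return sum(WEIGHTS[name] * points[name] for name in WEIGHTS) / total_weight


if __name__ == "__main__":
    # Illustrative grade points only; they are not taken from the syllabus.
    print(internal_grade_point(4, (3, 4), 2, 4))
```

Dividing by the total weight (5 here) is one plausible way to normalise the result; the scheme itself only lists the weights.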

There shall be 4 parts, A, B, C and D, in all the question papers except for course 12, practical. Part A consists of 12 objective type questions. Part B consists of 8 questions to be answered in a word, phrase or sentence. Part C consists of 6 questions of short essay type, of which the student can attempt 4. Part D consists of 3 questions of long essay type, of which the student can attempt 2. In Part A the weightage per question is ¼; for Part B the weightage is 1 per question; for Part C the weightage is 2 per question; and for Part D the weightage is 4 per question. As far as possible, the number of questions should be proportional to the modules.
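The weightage scheme above can be tallied with a short Python sketch. It assumes the weight of 2 per question belongs to Part C (the scheme names Part D for both the 2 and the 4), and that the counts are the numbers of questions a student actually answers; the names `PARTS`, `part_weights` and `total_weight` are illustrative only.

```python
from fractions import Fraction

# (questions answered, weight per question) for each part of the paper.
# Assumption: the weight of 2 per question applies to Part C.
PARTS = {
    "A": (12, Fraction(1, 4)),
    "B": (8, Fraction(1)),
    "C": (4, Fraction(2)),
    "D": (2, Fraction(4)),
}


def part_weights(parts=PARTS):
    """Total weight carried by each part of the question paper."""
    return {name: count * weight for name, (count, weight) in parts.items()}


def total_weight(parts=PARTS):
    """Total weight of the whole question paper."""
    return sum(part_weights(parts).values())


if __name__ == "__main__":
    for name, weight in part_weights().items():
        print(f"Part {name}: {weight}")
    print("Total:", total_weight())
```

Under these assumptions the parts carry weights 3, 8, 8 and 8, for a total of 27.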

The practical paper consists of 6 questions, of which the student can attempt 4. Calculators are permitted.

The internal assessment for the practicals shall be based on the average grade point of two practical test papers and the practical record. The test papers shall have weight 1 each and the record shall have weight 2.


CORE COURSE I: METHODOLOGY OF STATISTICS,

BASIC CALCULUS AND PROBABILITY THEORY

conducting a statistical enquiry – preparation of questionnaire – primary and

secondary data – classification and tabulation – Formation of frequency distribution

– diagrammatic and graphic presentation of data – population and sample –

advantages of sampling over census – methods of drawing random samples from a finite population – fitting of straight line, parabola, exponential and logarithmic curves using the principle of least squares.

17 hours

examples only): derivative of a function – relationship between continuity and differentiability – derivatives of polynomial, exponential and logarithmic functions – differentiation of sum, difference, product and quotient – function of a function rule – second order derivative – sign of derivative – increasing and decreasing functions – maxima and minima. Integration as inverse operation of differentiation – indefinite and definite integrals – simple examples – properties of integration – first and second fundamental theorems of integral calculus – application of integration – area under a curve. Beta and Gamma integrals – simple properties. Functions of two variables – double integrals – evaluation of double integrals (application in statistics only) – change of variable.

25 hours


Module 3. Probability concepts: Random experiment, sample space, event, classical

definition, axiomatic definition and relative frequency definition of probability.

Concept of probability measure. Addition and multiplication theorem (limited to

three events). Conditional probability and Bayes’ Theorem – numerical problems

15 hours

variable. Probability mass function (pmf), probability density function (pdf) and (cumulative) distribution function (df) and their properties. Change of variables: discrete and continuous cases (univariate case only). Simple problems.

15 hours

Statistics, Wiley Eastern.

2. S.C. Gupta and V.K. Kapoor: Fundamentals of Mathematical Statistics, Sultan Chand and Sons.

3. Mood A.M., Graybill F.A. and Boes D.C.: Introduction to Theory of Statistics, McGraw Hill.

4. Schaum’s Series: Calculus.

5. John E. Freund: Mathematical Statistics (Sixth Edition), Pearson Education (India), New Delhi.


Model Question Paper

B.Sc. STATISTICS

I Semester

CORE COURSE I: METHODOLOGY OF

STATISTICS, BASIC CALCULUS AND PROBABILITY THEORY

Time: 3 Hrs

PART A

Answer all questions (a bunch of 4 carries weightage 1)

(a) calculate mean only (b) representation of data (c) summarize data (d) none

2. If u and v are functions of x, then $\frac{d(uv)}{dx}$ is

(a) $u\frac{dv}{dx} + v\frac{du}{dx}$, (b) $\frac{du\,dv}{dx\,dx}$, (c) $u\frac{dv}{dx} - v\frac{du}{dx}$, (d) $\frac{du}{dx} + \frac{dv}{dx}$

3. If f(x) is an increasing function, then

(a) $\frac{df}{dx} = 0$, (b) $\frac{df}{dx} < 0$, (c) $\frac{df}{dx} \neq 0$, (d) $\frac{df}{dx} > 0$

4. Let $f(x) = 2x^3 + 1$. What is $\frac{d^2 f}{dx^2}$?

(a) $6x^2 + 1$, (b) $12x$, (c) $x$, (d) $2x^2 + 1$

5. What is $\iint xy\,dx\,dy$?

(a) $\frac{x^2 y^2}{2}$, (b) $\frac{x^2 y^2}{4}$, (c) $x^2 y^2$, (d) $4x^2 y^2$

6. Obtain the value of $\int \frac{1}{x}\,dx$

(a) $\frac{-1}{x^2}$, (b) $\log x$, (c) $e^x$, (d) $\frac{1}{x^2}$

7. Sample space of a coin toss experiment is

(a){HT}, (b){H,T}, (c){HH, TH, HT, TT}, (d){H}

8. Which of the following is an axiom of probability.

(a) 0 < P (Ω) < 1, (b) P (Ω) = 1, (c) if A ⊂ B then P ( A) ≤ P ( B ),

(d) P ( A ∩ B ) = P ( A).P ( B )

9. If f ( x) is a probability density function, then

(a) ∫ f ( x) dx = 0 , (b) ∫ f ( x) dx = 1, (c) ∫ f ( x)dx < 1, (d) ∫ f ( x) dx > 0

11. If F(x) is a distribution function, then

(a) F(x) is increasing in x, (b) F(x) is constant, (c) F(x) is decreasing in x, (d) F(x) = 1 for every x

12. If f(x) = x, 0 < x < 1, obtain F(x).

(a) $F(x) = x^2$, (b) $F(x) = \frac{x^2}{2}$, (c) $F(x) = x$, (d) $F(x) = 2x^2$

PART B

Answer all questions wt 1

14. If $\frac{df}{dx_0} > 0$ and $\frac{d^2 f}{dx_0^2} < 0$, then $x_0$ is the point of …………

15. Integral of f(x)=2x+1 over (a,b) is ………….

16. The Beta function is…………..

17. Total probability of a random variable is ……..

18. If A and B are any two events, then P( A | B) is……….

19. If F(x) is a distribution function, its minimum value is…...and maximum value

is……..

20. The third axiom of probability is………

PART C

Answer any four questions wt 2

22. Evaluate $\int_1^2 (2x^2 + 1)\,dx$

23. Evaluate $\iint xy(x^2 + y^2)\,dx\,dy$ over [(0,a),(0,b)]

24. State the three axioms of probability.

25. A continuous random variable X has the pdf given by $f(x) = 2x$, $0 < x < 1$, and 0 elsewhere. Find $F(x)$ and $P(X < 1/2)$.

26. Given $f(x) = e^{-x}$, $x \ge 0$, find the pdf of $y = -3x + 7$.

PART D

Answer any two questions wt 4

28. State and prove the addition theorem for two events. Explain what happens when A is a subset of B.

Find two numbers a and b such that (1) P ( X ≤ a ) = P( X ≥ a ) and (2) P ( X ≥ b) = 0.5.


Module 1. Mathematical Expectations: Expectation of a random variable, moments,

relation between raw and central moments, moment generating function (mgf) and

15 hours

Module 2. Bivariate random variable: Definition (discrete and continuous type).

Joint probability mass function and probability density function, marginal and

conditional distributions, independence of random variables.

Bivariate moments: Definition of raw and central product moments, conditional

mean and conditional variance, covariance, correlation and regression coefficients.

Mean and variance of a random variable in terms of conditional mean and

conditional variance

20 hours

Geometric (definition, simple properties and applications). Discrete Uniform

(definition, mean, variance and mgf only) – Continuous type – Rectangular,

Exponential, Gamma (definition, mean, variance and mgf only)

applications). Lognormal, Pareto and Cauchy Distributions (definition only)

25 hours

probability, Weak Law of Large Numbers, Bernoulli Law of Large Numbers.

12 hours

Statistics, Wiley Eastern.

2. S.C. Gupta and V.K. Kapoor: Fundamentals of Mathematical Statistics, Sultan Chand and Sons.

3. Mood A.M., Graybill F.A. and Boes D.C.: Introduction to Theory of Statistics, McGraw Hill.

4. John E. Freund: Mathematical Statistics (Sixth Edition), Pearson Education (India), New Delhi.

Core Course II Semester II

COURSE II: PROBABILITY DISTRIBUTIONS

Time: 3 Hrs

Part A

(In Part A answer all questions) Weight 1 for a bunch of 4 questions

1. The probability of getting one head and one tail in the toss of two unbiased coins simultaneously is

a) .25 b) .50 c) 1 d) .75

2. If x and y are two independent random variables with joint p.d.f. f(x,y), then E(xy) =

a) E(x).E(y) b) E(x)/E(y) c) E(x) d) E(y)

3. E(x/y) is generally a function of

a) y b) x c) x and y d) none

4. $M_{x+y}(t)$, if x and y are independent r.v.s, is given by

a) $M_x(t) + M_y(t)$ b) $M_x(t)/M_y(t)$ c) $M_x(t) \cdot M_y(t)$ d) none

b) If V(x) = 1, then V(2x ± 3) is

a) 5 b) 13 c) 14 d) 1

c) $E(x - k)^2$ is minimum when

a) k < E(x) b) k = E(x) c) k > E(x) d) $k^2 = E(x)$

5. If x is a random variable having probability function f(x), then the function $\sum e^{tx} f(x)$ is known as

a. moment generating function

b. probability generating function

c. probability distribution function

d. characteristic function

a) ¼ b) 2/4 c) 4 d) 2

8. X is normally distributed with zero mean and unit variance. The variance of $X^2$ is

a) 0 b) 1 c) 2 d) 4

9. In a normal curve, the area to the right of the point $x_1$ is 0.6 and to the left of the point $x_2$ is 0.7. Which is the correct statement?

a) $x_1 > x_2$ b) $x_1 < x_2$ c) $x_1 = x_2$ d) none of them

10. For a normal distribution, Q.D., M.D. and S.D. are in the ratio

a) $\frac{4}{5} : \frac{2}{3} : 1$ b) $\frac{2}{3} : \frac{4}{5} : 1$ c) $1 : \frac{4}{5} : \frac{2}{3}$ d) $\frac{1}{2} : 1 : \frac{4}{5}$

11. If x is a continuous r.v. with mean µ and variance σ², then for any positive number k, $P[|x - \mu| > k\sigma] \le \frac{1}{k^2}$ is known as

a) Liapunov’s inequality b) Tchebycheff’s inequality

c) Bienayme-Tchebycheff’s inequality d) Khinchin’s inequality

12. If x and y are two random variables such that their expectations exist and P(x ≤ y) = 1, then

a) E(x) ≤ E(y) b) E(x) > E(y)

c) E(x) = E(y) d) None of the above

13. Expected value of a random variable x exists if ……………

14. If x is a random variable, $E(x - \text{constant})^2$ is minimum when the constant is ……………

15. Name the discrete distribution for which mean and variance have the same value.

16. What is the third moment about the mean of a Poisson distribution if the second moment about the origin is 12?

17. Identify the distribution (using the uniqueness property) if the moment generating function of the distribution is $M_x(t) = (1 + e^t)^5/32$.

18. State the additive property of Binomial distribution.

19. Write down the pdf of the exponential distribution and write down its first

raw moments.

20. What are the points of inflexion of a normal curve N(µ,σ).

Part C

(Answer any 4 questions) Weight 2

$V(ax + by) = a^2 V(x) + b^2 V(y)$.

22. x and y are independent random variables with means 10 and 20 and variances 2 and 3 respectively. Find the mean and variance of 3x + 4y.

23. A symmetric die is thrown 600 times. Find the lower bound for the probability of getting 80 to 120 sixes.

24. For a binomial distribution, the mean is 6 and the S.D. is 2. Write out all the parameters of the distribution.

25. Show that for the normal distribution the points of inflexion lie at a distance of ±σ from the mean, where σ is the S.D.

26. If x ~ N(30, 5), find the probability of |x − 30| > 5.
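For a problem of the kind posed in question 23 above, the bound comes from Tchebycheff’s inequality, $P(|X - \mu| < \varepsilon) \ge 1 - \sigma^2/\varepsilon^2$. The Python sketch below works it through with exact fractions, reading the event as the number of sixes lying between 80 and 120 in 600 throws; the function name `chebyshev_lower_bound` is hypothetical and this is only one way of setting out the arithmetic.

```python
from fractions import Fraction


def chebyshev_lower_bound(n, p, half_width):
    """Tchebycheff lower bound for P(|X - n*p| < half_width), X ~ Binomial(n, p)."""
    variance = n * p * (1 - p)  # binomial variance npq
    return 1 - variance / Fraction(half_width) ** 2


if __name__ == "__main__":
    # 600 throws, P(six) = 1/6, so the mean is 100 and the interval is 100 ± 20.
    bound = chebyshev_lower_bound(600, Fraction(1, 6), 20)
    print(bound)
```

Here the variance is 600 × (1/6) × (5/6) = 250/3, so the bound is 1 − (250/3)/400 = 19/24.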

Part D

(Answer any 2 questions) Weight 4

28. Show that under certain conditions (to be stated) a Binomial distribution tends to the Poisson distribution.

29. Fit a Poisson distribution to the following data.

Number of mistakes per page : 0 1 2 3 4 Total

Number of pages : 109 65 22 3 1 200
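Fitting a Poisson distribution to data like that in question 29 usually means estimating the parameter by the sample mean and then scaling the Poisson probabilities by the total frequency. The Python sketch below follows that approach, assuming the second row of the table gives the number of pages with 0 to 4 mistakes; `fit_poisson` is an illustrative name, not part of the syllabus.

```python
import math


def fit_poisson(counts):
    """Fit a Poisson distribution by the method of moments.

    counts[k] is the observed frequency of the value k.
    Returns the estimated mean and the expected frequency for each k.
    """
    total = sum(counts)
    mean = sum(k * f for k, f in enumerate(counts)) / total
    expected = [
        total * math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(len(counts))
    ]
    return mean, expected


if __name__ == "__main__":
    observed = [109, 65, 22, 3, 1]  # pages with 0, 1, 2, 3, 4 mistakes
    mean, expected = fit_poisson(observed)
    print("estimated mean:", mean)
    for k, (obs, exp) in enumerate(zip(observed, expected)):
        print(k, obs, round(exp, 2))
```

For this table the estimated mean is 122/200 = 0.61. The expected frequencies do not sum exactly to 200 because values above 4 are not included; in a hand solution the last class is often treated as "4 or more".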


CORE COURSE III: STATISTICAL INFERENCE – I

distribution, sampling distribution of a statistic, standard error, sampling from a normal population, sampling distributions of the sample mean and variance. Chi-square, Student’s t and F distributions – derivations, properties, uses and interrelationships. Central Limit Theorem for independent and identically distributed random variables (Lindeberg-Lévy form).

30 hours

Module 2. Theory of Estimation: Point estimation, desirable properties of a good estimator, unbiasedness, consistency, sufficiency. Fisher-Neyman factorization criterion (statement and application only), efficiency, Cramér-Rao inequality.

25 hours

Module 3. Methods of estimation – method of moments, method of

maximum likelihood, method of least squares. Properties of estimators

obtained by these methods –concept of Bayesian estimation.


20 hours

Module 4. Interval Estimation: Large sample confidence intervals for mean, equality of means, proportions, equality of proportions. Derivation of exact confidence intervals for means, equality of means, variance and ratio of variances based on Normal, t, chi-square and F distributions.

15 hours

Books for reference

Statistics, Wiley Eastern.

2. S.C. Gupta and V.K. Kapoor: Fundamentals of Mathematical Statistics, Sultan Chand and Sons.

3. Mood A.M., Graybill F.A. and Boes D.C.: Introduction to Theory of Statistics, McGraw Hill.

4. John E. Freund: Mathematical Statistics (Sixth Edition), Pearson Education (India), New Delhi.


Model Question Paper

Part A

Answer all questions; a bunch of 4 questions carries weight 1

1. The mean of a Chi – square distribution with n degrees of freedom is

( a ) 2n ( b ) n 2 ( c ) n (d ) n

2. The relation between Student’s t and F distributions is

(a) $t_{(n)}^2 = F_{(n,1)}$ (b) $t_{(n)}^2 = F_{(1,n)}$ (c) $t_{(1)}^2 = F_{(1,n)}$ (d) $t_{(n)}^2 = F_{(1,1)}$

3. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$; then the distribution of $\frac{\sum (x_i - \bar{x})^2}{\sigma^2}$ is

(a) $\chi^2_{(n)}$ (b) $t_{(n)}$ (c) $\chi^2_{(n-1)}$ (d) $t_{(n-1)}$

$s^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$, the unbiased estimator for the population variance $\sigma^2$ is

(a) $\frac{1}{n-1}s^2$ (b) $\frac{1}{n}s^2$ (c) $\frac{n}{n-1}s^2$ (d) $\frac{n-1}{n}s^2$

5. If T is a consistent estimator of θ, then

(a) T is a consistent estimator of $\theta^2$ (b) $T^2$ is a consistent estimator of θ

(c) $T^2$ is a consistent estimator of $\theta^2$ (d) None of the above


6. Let $X_1, X_2, \ldots, X_n$ be a random sample from a Bernoulli population. A sufficient statistic for p is

(a) $\sum X_i$ (b) $\prod X_i$ (c) $\max(X_1, X_2, \ldots, X_n)$ (d) $\min(X_1, X_2, \ldots, X_n)$

8. The 95% confidence interval for mean µ of a normal population N ( µ , σ 2 ) with

known σ 2

n n n n

9. The mean difference between 9 paired observations is 15 and standard deviation of

differences is 5. Then the value of the t statistic used in paired t test is

( a ) 27 ( b ) 9 ( c ) 3 ( d ) 0

10. A sample of 12 specimens taken from a normal population is expected to have a mean of 50 mg/cc. The sample has a mean of 64 mg/cc with a variance of 25. To test $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$, you will choose

11. A random sample of size 20 from a normal population gives a mean of 42 and a variance of 25. Then the value of the $\chi^2$ statistic used for testing the significance of population variance is

(a) 7.81 (b) 15.62 (c) 51.20 (d) 14.36

12. If X > 1 is the critical region for testing $H_0: \theta = 2$ against $H_1: \theta = 1$ on the basis of a single observation from the population $f(x, \theta) = \theta e^{-\theta x}$, $x > 0$, then the value of the type I error is

(a) $e$ (b) $e^2$ (c) $e^{-2}$ (d) $e^{-1}$

Part B

Answer all questions; each question carries weightage 1

13. Let $X_1, X_2$ be a random sample of size 2 from $N(0, 1)$. Then the distribution of $\frac{(X_1 + X_2)^2}{(X_1 - X_2)^2}$ is -------------


15. Let $X_1, X_2, X_3$ be a random sample of size 3 from $N(\mu, \sigma^2)$. The efficiency of $\frac{X_1 + 2X_2 + X_3}{4}$ relative to $\frac{X_1 + X_2 + X_3}{3}$ is ------------

16. Let $X_1, X_2, \ldots, X_n$ be a random sample from the population with pdf $f(x, \theta) = \frac{1}{2} e^{-|x - \theta|}$. The m.l.e. of θ is ---------

17. The diameter of a cylindrical rod is assumed to be normally distributed with a variance of 0.04 cm. A sample of 25 rods has a mean diameter of 4.5 cm. The 95% confidence interval for the population mean is -----------

18. The power of a test is ----------

19. Degrees of freedom for chi-square in the case of a contingency table of order 4×3 is ---

20. In tossing a coin, let the probability of a head turning up be p. The hypotheses are $H_0: p = 0.4$ against $H_1: p = 0.6$. $H_0$ is rejected if there are five or more heads in six tosses. Then the probability of type I error is ----------

Part C

Answer any 4 questions; each question carries a weightage of 2

21. Obtain the distribution of the sample mean of a random sample $X_1, X_2, \ldots, X_n$ of size n from $N(\mu, \sigma^2)$.

$B(1, p)$. Let $T = \sum X_i$. Show that $\frac{T(T-1)}{n(n-1)}$ is an unbiased estimator of $p^2$.

23.Define sufficient statistic. Let X 1 , X 2 ,..., X n be a random sample of size n from

24. An oil company claims that less than 20% of all car owners have not tried its gasoline. Test this claim at the 0.01 level of significance if a random check reveals that 22 out of 200 car owners have not tried the oil company’s gasoline.

25. In the comparison of two kinds of paint, a consumer testing service finds that four 1-gallon cans of one brand cover on the average 546 square feet with a standard deviation of 31 square feet, whereas four 1-gallon cans of another brand cover on the average 492 square feet with a standard deviation of 26 square feet. Assuming that the two populations sampled are normal and have equal variances, test the hypothesis that on the average the first kind of paint covers a greater area than the second.

26. Mention the advantages of non-parametric tests over parametric tests.


Part D

Answer any 2 questions; each question carries weightage 4

27. Let $X_1, X_2, \ldots, X_n$ be a random sample of size n from $N(\mu, \sigma^2)$. Find the mle’s of

28. Explain interval estimation. Obtain $100(1 - \alpha)\%$ confidence intervals for the


CORE COURSE IV: STATISTICAL INFERENCE – 2

1. Module 1. Testing of Hypotheses; concept of testing hypotheses, simple and

composite hypotheses, null and alternative hypotheses, type I and type II

errors, critical region, level of significance and power of a test, most

powerful test, Neyman Pearson theorem and its simple applications. Concept

of p value

35 hours

2. Module 2. Large sample tests concerning mean, equality of means,

proportions, equality of proportions. Small sample tests based on t

distribution for mean, equality of means and paired mean for paired data.

Tests based on F distribution for ratio of variances. Tests based on chi-square distribution for variance, goodness of fit, independence of attributes and homogeneity of proportions. Test for correlation coefficients – Z transformation.

35 hours

Module 3. Non-parametric tests: Basic idea of distribution-free methods. Kolmogorov-Smirnov test – one sample and two sample sign tests. Wilcoxon matched pairs signed rank test – Kruskal-Wallis test and test for randomness (run test).

20 hours

Books for reference

Statistics, Wiley Eastern.

2. Goon A.M., Gupta M.K. and Das Gupta: Fundamentals of Statistics, Vol. I, The World Press, Calcutta.

3. S.C. Gupta and V.K. Kapoor: Fundamentals of Mathematical Statistics, Sultan Chand and Sons.

4. Gibbons J.D.: Non parametric Methods for Quantitative Analysis, McGraw Hill.

5. John E. Freund: Mathematical Statistics (Sixth Edition), Pearson Education (India), New Delhi.


Model Question Paper

IV Semester

COURSE IV: STATISTICAL INFERENCE – 2

Time: 3 Hrs

PartA

(Answer all questions)

(Contains 12 questions, 4 questions carry a weightage of 1)

1. In a chi-square contingency table with 3 rows and 5 columns, the d.f of chi-square

statistic is

a) 15

b) 24

c) 8

d) 7

2. The chi-square test statistic for a goodness of fit test is given by:

a) $\frac{O_i - E_i}{E_i}$

b) $\sum \frac{O_i - E_i}{E_i^2}$

c) $\sum \frac{(O_i - E_i)^2}{E_i^2}$

d) $\sum \frac{(O_i - E_i)^2}{E_i}$

3. In a Poisson goodness of fit test having ‘k’ sets of observed frequencies with estimated

value of λ , the chi-square statistic has d.f.

a) k-2

b) k

c) k-1

d) k-2

a) The variable is continuous

b) The variable is discrete

c) The variable is normal

d) The variable is standard normal

5. The non-parametric equivalent test for a paired t-test is:

a) Signed Rank test

b) Rank sum test

c) Run test

d) Sign test

6. The test used to check the randomness of the collected set of symbols is:

a) Sign test

b) Rank sum test

c) Signed rank test

d) Run test

7 When there are 3 groups, each following normal distribution, and the null hypothesis is

concerned with the equality of means the test used is:

a) Chi square test

b) t-test for equality of means

c) Analysis of variance

d) none of the above

( a ) 2n ( b ) n 2 ( c ) n (d ) n


9. The relation between Student’s t and F distributions is

(a) $t_{(n)}^2 = F_{(n,1)}$ (b) $t_{(n)}^2 = F_{(1,n)}$ (c) $t_{(1)}^2 = F_{(1,n)}$ (d) $t_{(n)}^2 = F_{(1,1)}$

differences is 5. Then the value of the t statistic used in paired t test is

( a ) 27 ( b ) 9 ( c ) 3 ( d ) 0

11. A sample of 12 specimens taken from a normal population is expected to have a mean of 50 mg/cc. The sample has a mean of 64 mg/cc with a variance of 25. To test $H_0: \mu = \mu_0$ against $H_1: \mu \neq \mu_0$, you will choose

12. If X > 1 is the critical region for testing $H_0: \theta = 2$ against $H_1: \theta = 1$ on the basis of a single observation from the population $f(x, \theta) = \theta e^{-\theta x}$, $x > 0$, then the value of the type I error is

(a) $e$ (b) $e^2$ (c) $e^{-2}$ (d) $e^{-1}$

13. In chi-square test of independences of 2 attributes with 2 observations each, the d.f of

the test statistic is 1.

b) Explain your answer.

14 In the case of sign test, the test statistic follows a binomial distribution.

b) Explain your answer.

15 In χ 2 test of goodness of fit if the calculated value of χ 2 is zero, then it is a bad fit.

b) Explain your answer.


c) Let $X_1, X_2$ be a random sample of size 2 from $N(0, 1)$. Then the distribution of $\frac{(X_1 + X_2)^2}{(X_1 - X_2)^2}$ is -------------

17. Degrees of freedom for chi-square in the case of a contingency table of order 4×3 is ---

18. In tossing a coin, let the probability of a head turning up be p. The hypotheses are $H_0: p = 0.4$ against $H_1: p = 0.6$. $H_0$ is rejected if there are five or more heads in six

19. Define Type-II error.

20.Write down the test statistics of paired t test naming the notations.

21. What is the null hypothesis for a chi-square test of homogeneity of proportions and

give the layout of observations.

23. Give an example for a paired t test. Give the test statistics and explain the notations

24. An oil company claims that less than 20% of all car owners have not tried its gasoline. Test this claim at the 0.01 level of significance if a random check reveals that 22 out of 200 car owners have not tried the oil company’s gasoline.

25. In the comparison of two kinds of paint, a consumer testing service finds that four 1-gallon cans of one brand cover on the average 546 square feet with a standard deviation of 31 square feet, whereas four 1-gallon cans of another brand cover on the average 492 square feet with a standard deviation of 26 square feet. Assuming that the two populations sampled are normal and have equal variances, test the hypothesis that on the average the first kind of paint covers a greater area than the second.

26. Mention the advantages of non-parametric tests over parametric test.


27. A factory operates in three shifts. The factory manager feels that the quality of parts is related to shifts. For this purpose he has collected the following data from the past records of production.

Shift      No. of Parts
           Good    Bad
Day        900     130
Evening    700     170
Night      400     200

28. Fifteen patient records from each of two hospitals were received and assigned a score designed to measure level of care. The scores were as follows:

Hospital A: 99 85 73 98 83 88 99 80 74 91 80 94 94 98 80

Hospital B: 78 74 69 79 57 78 79 68 59 91 89 55 60 55 79

Use a proper non-parametric test to see whether the two populations are identical with respect to the level of care.
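For a shift-by-quality table such as the one in question 27 above, the usual tool is the chi-square test of independence, with expected counts (row total × column total) / grand total and (r − 1)(c − 1) degrees of freedom. The Python sketch below computes the statistic, assuming the counts read Day 900 good and 130 bad, Evening 700 and 170, Night 400 and 200; the function name `chi_square_independence` is illustrative.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


if __name__ == "__main__":
    # Rows: Day, Evening, Night; columns: Good, Bad (assumed reading of the table).
    shifts = [[900, 130], [700, 170], [400, 200]]
    statistic, df = chi_square_independence(shifts)
    print(round(statistic, 2), df)
```

The statistic is then compared with the tabulated chi-square value for 2 degrees of freedom at the chosen level of significance.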

real valued functions of one variable. Uniform continuity, Rolle’s theorem, Mean Value theorem and Taylor’s theorem – Maclaurin’s theorem – expansion of a function as a power series – simple examples.

30 hours

2. Module 2. Riemann Integral: Definition, integrability of continuous functions, monotonic functions. Properties of integrals. First mean value theorem and fundamental theorem of integral calculus.

20 hours

3. Module 3. Complex Numbers: Analytic functions – Cauchy Riemann

equations – Cauchy’s integral formula – Taylor and Laurent’s series

expansion – fundamental theorem of algebra – poles and singularities –

contour integration – simple problems.

40 hours

Books for reference

6. Kreyszig: Engineering Mathematics

B.Sc. STATISTICS

Semester III

Core Course V – Mathematical Methods

Part A

(Answer all questions) weight 1 for a bunch of 4 questions

1. The value of $\lim_{x \to 0} \frac{e^{1/x}}{1 + e^{1/x}}$ is

a) 0 b) 1 c) .2 d) does not exist

2. If $\lim_{x \to c} f(x)$ exists and $\lim_{x \to c} f(x) \neq f(c)$, then f(x) has

a) Discontinuity of first kind at x = c b) Discontinuity of second kind at x = c

c) Removable discontinuity at x = c d) None of these

3. If f(x) = 1 when x is irrational and −1 when x is rational, then f(x)

a) Is not uniformly continuous on (−1, 1)

b) Is uniformly continuous on (−1, 1)

c) Has removable discontinuity at x = 0

d) Has discontinuity of first kind at x = 0

5. A function which is continuous on a……………….interval is also uniformly

continuous on that interval

a) Open b) Closed c) Left open d) Right open

6. The function f(x) = |x| is

a) Differentiable at every point on R b) Differentiable on (−1, 1)

c) Not differentiable on x > 0 d) Not differentiable at x = 0

7. The function defined by f(x) = x sin(1/x) for x ≠ 0 and f(x) = 0 for x = 0 is

a) Not continuous and derivable at x = 0

b) Derivable but not continuous at x = 0

c) Continuous but not derivable at x = 0

d) Continuous and derivable at x = 0

8. If f(x) is derivable at x = c and f(c) ≠ 0, then

a) $\frac{1}{f(x)}$ is not derivable at x = c b) $\frac{1}{f(x)}$ is derivable at x = c

c) $\frac{1}{f(x)}$ is not derivable at x ≠ c d) $\frac{1}{f(x)}$ is not derivable at x ≥ 0

9. The function defined by f(x) = 0 when x is rational and 1 when x is irrational

a) Is integrable on any interval on R

b) Is not integrable on any interval on R

c) Is integrable on (0, ∞)

d) Is not integrable on (0, ∞)

10. If f(x) is integrable on (a, b), then

a) |f(x)| is also integrable on (a, b)

b) |f(x)| is not integrable on (a, b)

c) |f(x)| is integrable on (a, b) only if a ≠ 0

d) Cannot say about the integrability of |f(x)| on (a, b)

11. If $\int_a^b f(x)\,dx = F(b) - F(a)$, then F(.) is called

c) Primitive of f(x) d) Refinement of f(x)

12. If f(x) and g(x) are integrable on (a, b), then

a) f + g is integrable whereas f − g is not integrable on (a, b)

b) Both f + g and f − g are not integrable on (a, b)

c) Cannot say about the integrability of f + g and f − g on (a, b)

Part- B

( Answer all questions) weight 1

13. Define uniform continuity

14. State Rolle’s Theorem

15. Write the Taylor’s series of f(x) in powers of (x − a)

16. What is meant by a partition of an interval?

17. When will you say that the integral of f(x) exists?

18 What do you mean by Analytic functions

19 State Cauchy’s integral formula.

20. Define contour

21. Discuss the continuity of f(x) = −x² for x ≤ 0; 5x − 4 for 0 < x ≤ 1; 4x² − 3x for 1 < x ≤ 2; 3x + 4 for x > 2

22. Examine Lagrange’s mean value theorem for f(x) = 2x² − 7x + 10 for 2 ≤ x ≤ 5

23. State and prove the first mean value theorem of integral calculus.

24. Evaluate $\int_1^2 (3x + 1)\,dx$ by partitioning the range into n subintervals of length 1/n

26. Explain singularities of a complex function

Part D (Answer any 2 questions) weight 4

27. State and prove Rolle’s Theorem. Verify Rolle’s Theorem for f(x) = x² − 4x on (−2, 2)

28. Obtain an infinite series expansion of log(1 + x) and log(1 − x) using Maclaurin’s expansion.

expansion.

29. State and prove Cauchy’s Integral formula.


CORE COURSE VI

INFORMATICS AND NUMERICAL MATHEMATICS

programme – executing the C programme

10 hours

Module 2 Numeric constants, variables and data types. Arithmetic operators

and expressions – managing input/output operations

10 hours

Module 3. Conditional operators: Relational operators, loops, one –

dimensional and two dimensional arrays, logical operators and expressions

10 hours

Module 4. Functions: Library functions – mathematical functions – defining

and using functions.

10 hours

Module 5. Simple programmes – summation of series – solution of quadratic

equation – matrix addition and multiplication. Calculation of mean, median,

variance, covariance, correlation and regression coefficients.


20 hours

Module 6. Numerical Analysis: Operators E and Delta and their basic properties. Divided differences. Interpolation formulae: Newton’s forward and backward formulae, Lagrange’s formula, Newton’s divided difference formula. Numerical Integration: Trapezoidal rule, Simpson’s 1/3rd and 3/8th rules and Weddle’s rule.

30 hours

Books for reference

Publishing Company

6. Milne – Thomson : Calculus of finite differences

Model Question Paper

B.Sc. STATISTICS

Semester III

Core Course VI

Time 3hrs Informatics and Numerical mathematics

Weight 1 for a bunch of 4 questions

a. Character */………*/ b. Characters **…………..**

c. Characters /*……….*/ d. Characters //………….//

2. Which is a valid C decimal integer constant

a. 46,711 b. 123.00 c. 0624 d. – 5126

3. Choose the correct variable name allowed in C

a. Sum – 1 b. Sum–1 c. Sum,1 d. Sum .1

4. Which of the following data types is not allowed in C

a. Float b. Double c. Char d. real

a. Reading function b. Writing Function

c. Reading in a formatted way d. Writing in a formatted way

6. In C, all input/output functions are stored in

a. stdio.h b. conio.h c. stdlib.h d. ctype.h

7. The operator used in scanf along with the variable name

a. ? b. /t c) & d) 01

a. Both EI & EII are true b. EI is true but EII is not true

c. EII is true but EI is not true d. EI or EII is true

9. A = pow(x, y) returns

a. E=1-∆ b. ∆ = E+1 c. ∆+ E =1 d. E - ∆ =1

11. Which of the following formulae is independent of the difference table

a. Lagrange’s b. Newton’s c. Gauss’s d. Stirling’s

12. For applying Newton’s forward formula, the origin of the arguments should be shifted to

a. X0 b. Xn c. Middle of X0, X1, Xn d. Arbitrary point

13. What is a symbolic constant used in C.

14. What is the use of getchar in C?

15. Construct an expression for finding the truth value of (AxB) / (CxD) if A is positive

and C or D is non-zero.

16. Write down the statement for storing

i) Marks of 100 students of a school ii) a 3x7 matrix in proper syntax

17. Distinguish between Branching and Looping.

18. Write down Newton’s divided difference formula.

19. If $\int_a^b f(x)\,dx$ is to be evaluated by dividing the range into n equal parts, then h is?

20. Write down Weddle’s rule for integration .

Part- C (Answer any 4 questions )Weight 2

21. Explain briefly the structure of a C program

22. Write a program to output the A.M. of a set of 10 observations.

23. Construct a loop to find $S = \sum_{i=1}^{4} \sum_{j=-1}^{5} \sum_{k=1}^{6} (i^2 + j^2 + k^2)$ using the while

statement

24. Find the value of f(a) if f(x) is

x: 2 4 7 12

f(x) -4 16 196 1296 using Lagrange’s Formula

25. In usual notations, show that i) (1+∆)(1-∇) = 1 ii) E^{-1} = 1 - ∇

Part- D (Answer any 2 questions ) weight 4

of a bivariate data.

S = 1 + 1/(1×2) + 1/(2×3) + 1/(3×4) + ..............

ii) Derive Simpson’s 1/3rd rule of numerical integration.

29. Given

Weight (lbs)    : 20-40   40-60   60-80   80-100   100-120

No. of students : 25      120     100     70       30

Estimate i) the number of students having weight less than 32 lbs.

ii) the number of students having weight more than 105 lbs.

probability sampling, judgement sampling, Organisation and execution of

large scale sample surveys sampling and non sampling errors, preparation of

questionnaire

20 hours

2. Module 2. Simple random sampling with and without replacement, methods

of collecting simple random samples – unbiased estimates of the population

mean and population total – their variances and estimates of these variances

–simple random sampling for proportions.

20 hours

and population total – proportional and Neyman allocation of sample sizes –

cost function – optimum allocation considering cost – comparison with

simple random sampling.

20 hours

4. Module 4. Systematic sampling: Linear and circular systematic sampling –

comparison of systematic sampling with simple random sampling.

10 hours

5. Module 5. Cluster sampling: Clusters with equal sizes – estimation of

population mean and total – comparison with simple random sampling – two

stage cluster sampling – estimate of the variance of the population mean.

20 hours

Calcutta.

2. Cochran W.G.: Sampling Techniques, Wiley Eastern.

3. S.C. Gupta and V.K. Kapoor : Fundamentals of Mathematical Statistics,

Sultan Chand and sons

4. Daroga Singh and F.S. Chaudhary: Theory and Analysis of Sample Survey

Designs, Wiley Eastern Limited.

Model Question Paper

B.Sc. STATISTICS

Semester IV

CORE COURSE VII SAMPLE SURVEYS

Bunch of 4 questions has weight 1

(a) Blood test of a person

(b) When the population is infinite

(c) Testing of life of dry battery cells

(d) all the above

2. Probability of drawing a unit at each selection remains same in:

(a) srswor (b) srswr

(c) both (a) and (b) (d) neither (a) nor (b)

3. Probability of including a specified unit in a sample of size n selected out of N units is:

(a) 1/n (b) 1/N

(c) n/N (d) N/n

4. In simple random sampling with replacement, the same

sampling unit may be included in the sample:

(a) only once (b) only twice

(c) more than once (d) none of the above

5. Greatest drawback of systematic sampling is that:

(a) One requires a large sample

(b) data are not easily accessible

(c) no single reliable formula for standard error of mean is available

(d) None of the above

(a) Systematic sample is superior to stratified random sample

(b) Simple random sample is inferior to systematic sample

(c) Stratified random sample is better than systematic sample

(d) none of the above

7. In srswor the variance of the sample mean is

(a) $\frac{S^2}{n}\cdot\frac{N-n}{N}$ (b) $\frac{S^2}{N}\cdot\frac{N-n}{n}$ (c) $\frac{S^2}{n}\cdot\frac{N-n}{N^2}$ (d) $\frac{S^2}{n}\cdot\frac{n-N}{N}$

8. In srswor Var(p) is

(a) $\frac{PQ}{n}\cdot\frac{N-n}{N-1}$ (b) $\frac{PQ}{n}\cdot\frac{N-1}{N-n}$ (c) $\frac{PQ}{N}\cdot\frac{N-n}{N-1}$ (d) $\frac{PQ}{n}\cdot\frac{N-n}{N1-1}$

9. The total number of samples of size n = 2 from a population of N = 6 is:

value of S2 in question is:

(a) 3 (b) 3.5 (c) 4 (d) 5

11. Consider a population of 6 units with values 1, 2, 3, 4, 5, 6. The

value of σ 2 is:

(a) 2.9 (b) 2.8 (c) 2.7 (d) 3

variance of the sample mean is:

(a) 1.2 (b) 1.1 (c) 1.3 (d) 1
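The quantities asked for in the preceding questions on the population 1, 2, ..., 6 can be checked with a short sketch. The sample size `n = 2` is an assumption taken from the earlier question on samples of size 2, since the stem of the last question is incomplete in this copy; the variable names are illustrative.

```python
from statistics import pvariance, variance

values = [1, 2, 3, 4, 5, 6]        # population of N = 6 units
N = len(values)
n = 2                               # assumed sample size (see lead-in)

S2 = variance(values)               # divisor N - 1
sigma2 = pvariance(values)          # divisor N

# Variance of the sample mean under srswor: (S^2 / n) * (N - n) / N
var_mean_wor = (S2 / n) * (N - n) / N

print(S2, round(sigma2, 4), round(var_mean_wor, 4))
```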

PART-B

14. Cluster sampling helps to ……….cost of the survey.

15. Precision of estimates …………….by proper stratification.

16. Non sampling error arises due to ………..of data.

17. A sample of 30 students is to be drawn from a population consisting of 300 students belonging to

two colleges of strength 200 and 100 respectively. What are the values of n1 and n2 if we use

proportional allocation?

18. In systematic sampling with N = 40 and n = 4, what is k?

19. If the population consists of a linear trend, state the relationship among the variances of the sample means.

20. In cluster sampling the population is divided into…….

PART-C

(Answer any four questions) weight 2

21. Explain the concept of stratified sampling.

22. What is the difference between cluster and systematic sampling?

23. Derive the expression for the variance of the sample mean in srswor.

24. Show that the sample mean is an unbiased estimate of the population mean.

25. What are the advantages of sampling over census?

26. List out the simple random samples for the data given in question

PART – D

(Answer any two questions) weight 4

28. Show that $V(\bar{y}_{sys}) = \frac{nk-1}{nk}\cdot\frac{S^2}{n}\,[1 + (n-1)\rho]$, where ρ is the intraclass correlation between

the units of the same systematic sample.

29. In question No. 17, if the means are 30 and 60 and the standard deviations are 10 and 40

respectively, obtain the variance of sample mean and compare its efficiency with srswor.

QUALITY CONTROL

1. Module 1. Linear programming: Mathematical formulation of LPP,

Graphical and Simplex methods of solving LPP – duality in linear

programming

20 hours

2. Module 2. Transportation and Assignment problems: North – west corner

rule, row column and least cost method – Vogel’s approximation method.

Assignment problem Hungarian algorithm of solution

20 hours

3. Module 3. General theory of control charts, causes of variations in quality,

control limits, sub grouping, summary of out- of control criteria, charts of

attributes, np chart, p chart, c chart. Charts of variables: X-bar chart, R chart

and sigma chart. Revised control charts. Applications and advantages.

25 hours

4. Module 4. Principles of acceptance sampling – Problems and lot acceptance,

stipulation of good and bad lots- producers’ and consumers’ risks, simple

and double sampling plans, their OC functions, concepts AQL, LTPD,

AOQL, Average amount of inspection and ASN function 25 hrs

B.Sc. STATISTICS

Semester IV

CORE COURSE VIII OPERATIONS RESEARCH AND STATISTICAL QUALITY CONTROL

Part A

Time 3hrs Answer all questions (Weight 1 for bunch of 4)

a) in a convex set b) outside a convex set c) at the extreme point of the convex set d) none of them.

2. Transportation problem is.

a) a lpp b) an assignment problem c) Quadratic programming problem d) Dynamic

programming problem.

3. VAM method is used to solve a

a) Assignment problem b) transportation problem c) usual lpp d) none of them

4. Dual of a dual is

a) slack b) surplus c) artificial d) primal.

5. In a control chart the manageable cause is

a) assignable cause b) random cause c) chance cause d) none of them

6. A control chart for fraction defectives is said to be in control if the points lie within

a) X̄ ± 3σ b) p’ ± 3np’q’ c) p’ ± 3√(p’q’/n) d) c ± 3√c

7. The spread of a process is given by

a) 3σ b) 6σ c)2σ d) 1.96σ

8. Upper control limit for R Chart is

a) A2R‾ b) A1R‾ c) D3R‾ d) D4R‾

9. Consumers risk is usually denoted by

a) µ b)∂ c) β d) α

10. The acceptance sampling plan is used for

a) Identifying good lots b) protecting the consumers interest c) protecting the producers

interest

d) All of the above

11. The consumer’s risk is usually fixed at

a) .05 b).01 c).95 d) .99

12. The OC curve gives

a) proportion of bad lots b) proportion of good lots c) discriminating power of the

sampling plan

d) none of them.

Part B ( answer all questions ,weight 1)

13. The inequality constraints are made equalities in an LPP using ---------- variables

14. If the objective of a maximization LPP can be increased indefinitely, the problem is said to have

---------- solution

15. A sampling plan in which we take a decision based on one sample only is called ----------

16. In a non-degenerate transportation problem with m rows and n columns, the number of

allocations will be ----------

17. Expand the term LTPD

18. The method used to solve an assignment problem is called---------------

19. An artificial variable is used for--------------------

20. Chart used for number of defects is based on ---------- distribution

Part C ( answer 4 questions, Weight 2)

21. Define AOQ and LTPD.

22. Define the Linear programming problem.

23. What is double sampling plan?

24. Write the assignment problem as an lpp.

25. What are probability limits?

26. What is an unbalanced transportation problem?

Part D ( answer any 2 questions, weight 4)

27. Distinguish between double and single sampling plans.

28. Draw the OC curve of the single sampling plan showing the consumers and producers

risks.

29. Find the initial basic feasible solution of the following transportation problem. There

are four origins and three destinations. The availabilities are 9, 10, 8, 7 and the requirements

are 17, 10, 7 respectively.

      A   B   C

D     2   3   2

E     1   3   4

F     2   3   1

G     2   4   3
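One way to obtain an initial basic feasible solution for question 29 is the North-West corner rule named in Module 2. The sketch below is an illustration under that choice of method (the question itself does not fix the method); `north_west_corner`, `costs` and `total_cost` are hypothetical names. When a row and a column are exhausted together, this version moves down one row and records a zero allocation, which keeps the number of basic cells at m + n - 1.

```python
def north_west_corner(supply, demand):
    """Initial allocation for a balanced transportation problem."""
    supply, demand = list(supply), list(demand)   # work on copies
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        q = min(supply[i], demand[j])             # ship as much as possible
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1                                # origin exhausted: next row
        else:
            j += 1                                # destination met: next column
    return alloc


costs = [[2, 3, 2],     # origin D to destinations A, B, C
         [1, 3, 4],     # origin E
         [2, 3, 1],     # origin F
         [2, 4, 3]]     # origin G
alloc = north_west_corner([9, 10, 8, 7], [17, 10, 7])
total_cost = sum(c * x for crow, xrow in zip(costs, alloc)
                 for c, x in zip(crow, xrow))
print(alloc, total_cost)
```

The least cost method or Vogel's approximation method would usually give a cheaper starting solution than this rule.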

Sons

components, illustrations, additive and multiplicative models,

determination of trend, growth curves, analysis of seasonal fluctuations,

construction of seasonal indices.

25 hours

2. Module 2. Analysis of income and allied distributions- Pareto distribution

, graphical test, fitting of Pareto’s law, illustrations, log normal

distribution and properties. Lorenz curve, Gini’s coefficient.

20 hours

problems in the construction of index numbers- simple aggregate and

weighted aggregate index numbers. Test of consistency of index

numbers- factor reversal- time reversal test and unit test. Chain base

index numbers- Base shifting- splicing- and deflating of index numbers.

Consumer price index numbers- family budget enquiry- limitations of

index numbers.

30 hours

4. Module 4. Attitude Measurements and scales: Issues in attitude

measurements-scaling of attitude-Guttman scale-Semantic differential

scale-the Likert Scale- selection of appropriate scale- limitations of scales

15 hours

1. SC Gupta and V.K. Kapoor: Fundamentals of Applied Statistics, Sultan

Chand & Sons

2. Goon A.M., Gupta M.K. and Das Gupta: Fundamentals of Statistics Vol.II

The World Press, Calcutta.

3. Box, G.E.P. and G.M. Jenkins: Time Series Analysis, Holden –Day

4. Meister David: Behavioural Analysis and Measurement Methods, John

Wiley, New York

5. Luck D.J. et al: Marketing Research, Prentice Hall of India, New Delhi

Model Question Paper

B.Sc. STATISTICS

Semester IV

CORE COURSE- IX OPERATIONS RESEARCH AND

STATISTICAL QUALITY CONTROL

Part A

Time 3hrs Answer all questions (Weight 1 for bunch of 4)

distribution d) none.

a) Trend b) Seasonal Variation c) Cyclic variation d) Random variation

5. Which of the method can be used for getting trend values for each given time point

10. A model of time- series explains the ……………….relation between value of variable and

time series components

13. Give an example each for seasonal and cyclic variation in a time – series

14. Define period of Moving average.

15. Give any three examples of irregular variation affecting a Time- series data.

17. Give the formula for converting chain base into fixed base and fixed base into chain base

Index numbers.

25. With the help of an Index Number formula, explain Time and Factor Reversal Tests.

26. Explain the use for developing Cost of Living Index Number.

27. Given the following data related to yield of a crop in three different seasons.

Year   Season 1   Season 2   Season 3

1990   12         19         17

1991   14         25         23

1992   13         27         20

1993   15         28         22

1994   17         31         24

28. Briefly explain the use of Pareto distribution and its applications

29. Calculate the Cost of Living Index Number for the data given below.

Rice

Food            30   47   4

Fuel            8    12   1

Clothing        14   18   3

House Rent      22   15   2

Miscellaneous   25   30   1

BLUE – Gauss – Markov theorem – Linear hypothesis

25 hours

2. Module 2. Analysis of variance: One way and two way classification (with a

single observation per cell). Analysis of covariance with a single

observation per cell.

25 hours

3. Module 3. Principles of design – randomization – replication – local control.

Completely randomized design – randomized block design – Latin Square

design. Missing plot technique – comparison of efficiency.

25 hours

4. Module 4. Basic concepts of factorial experiments: 2^3 factorial experiments –

Duncan’s multiple range test

15 hours

Books for reference

1. S.C. Gupta & V.K.Kapoor: Fundamentals of Applied Statistics, Sultan

B.Sc. STATISTICS

Semester IV

Part A

Time 3hrs Answer all questions (Weight 1 for bunch of 4)

(a) increase the efficiency of the design

(b) reduce experimental error

(c) to form homogeneous blocks

(d) all the above

2.Errors in a statistical model are always taken to be:

(a) independent (b) distributed as N(0, σ²)

(c) Both (a) and (b) (d) neither (a) nor (b)

3.A completely randomized design is also known as:

(a) unsystematic design (b) non-restrictional design

(c) Single block design (d) all the above

4.A randomized block design has:

(a) Two way classification (b) one way classification

(c) Three way classification (d) no classification

5.In the analysis of data of a randomized block design

with r blocks and s treatments the error degrees of freedom

are

(a) r (s-1) (b) s(r-1) (c) (r-1) (s-1) (d) none of the above

6.Error sum of squares in RBD as compared to CRD using the

same material is

(a) more (b) less (c) equal (d) not comparable

7.The ratio of the number of replications required in CRD and

RBD for the same amount of information is

(a) 6:4 (b) 10:6 (c) 10:8 (d) 6:10

8 In a randomized block design with 4 blocks and 5 treatments

having one missing value, the error degrees of freedom will be

(a) 12 (b) 11 (c) 10 (d) 9

9.A Latin square design controls

(a) two way variation (b) three way variation

(c) multi way variation (d) no variation

10.While analysing the data of a latin square, the error degrees

of freedom in analysis of variance is equal to

(a) (r-1)(r-1) (b) r(r-1)(r-2) (c) 2r-2 (d) 2r –r-1

11.Two types of effects measured in a factorial experiment are

(a) main and interaction effects (b) simple and complex effects

(c) both (a) and (b) (d) neither (a) nor (b)

factors A and B each at two levels from three replications are,

0,0=18 1,0=17 0,1=25 1,1=30, the sum of square for the

interaction AB is equal to:

(a) 4 (b) 3 (c) 6 (d) 675

PART-B

answer all questions (weight 1)

13. Write down Gauss Markov Linear model.

14. State the necessary and sufficient condition for estimability

of Parametric function.

15. What are the principles of experimental design?

16. Write the expression for estimating missing value in LSD.

17. If there are two missing values in a RBD with 4 blocks and

5 treatments,

What will be the degrees of freedom of error sum

of squares?

18.In a LSD with 4 treatments and error sum of squares is 16,

find the Mean error sum of squares.

19.Write expression for efficiency of LSD compared to CRD

simple effect of B at the second level of A?

Part C

What are the assumptions used in it?

22 Give the analysis for completely randomized design

23. Derive the expression for estimating one missing

observation in RBD

24 Explain the efficiency of LSD compared to RBD

25. How can we estimate the effects and calculate the sums of squares

in a factorial experiment?

RBD were as tabulated below. Analyse the experimental data

and interpret the result.

1 21 20 19

2 19 18 18

3 18 19 19

4 27 25 24

PART – D

(Answer any two questions)Weight 4

28. Derive the analysis of variance of RBD.

29. Estimate the missing value in the following Latin Square

Design and then set up the analysis of variance.

A (12)   C (19)   B (10)   D (8)

C (18)   B (12)   D (6)    _ (missing)

B (22)   D (10)   A (5)    C (21)

D (12)   A (7)    C (27)   B (17)
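For question 29, the usual single-missing-value estimate in an r × r Latin square is x = [r(R + C + T) - 2G] / [(r - 1)(r - 2)], where R, C and T are the totals of the known values in the affected row, column and treatment and G is the grand total of the known values. The sketch below applies it to the layout above. It assumes the missing cell carries treatment A, which is the only letter that keeps the second row and fourth column a Latin square; the variable names are illustrative.

```python
# Each cell is (treatment, yield); None marks the missing yield.
square = [
    [("A", 12), ("C", 19), ("B", 10), ("D", 8)],
    [("C", 18), ("B", 12), ("D", 6), ("A", None)],   # treatment A assumed
    [("B", 22), ("D", 10), ("A", 5), ("C", 21)],
    [("D", 12), ("A", 7), ("C", 27), ("B", 17)],
]
r = len(square)

# Locate the missing cell.
mi, mj = next((i, j) for i in range(r) for j in range(r)
              if square[i][j][1] is None)
missing_treatment = square[mi][mj][0]

known = [(i, j, t, y) for i, row in enumerate(square)
         for j, (t, y) in enumerate(row) if y is not None]

R = sum(y for i, j, t, y in known if i == mi)                  # row total
C = sum(y for i, j, t, y in known if j == mj)                  # column total
T = sum(y for i, j, t, y in known if t == missing_treatment)   # treatment total
G = sum(y for i, j, t, y in known)                             # grand total

estimate = (r * (R + C + T) - 2 * G) / ((r - 1) * (r - 2))
print(R, C, T, G, estimate)
```

After the estimate is inserted, the analysis of variance proceeds as usual with the error degrees of freedom reduced by one.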

CORE COURSE XI: POPULATION STUDIES AND ACTUARIAL SCIENCE

Rates and ratios: mortality rates – crude, age specific and standard death rates

– fertility and reproduction rates – crude birth rates – general and specific

fertility rates – gross and net reproduction rates

20 hours

bridged life tables and its characteristics principal methods of construction of

abridged life tables. Reed-Merrell’s method

40 hours

3. Module 3. Fundamentals of insurance: Insurance defined meaning of loss,

peril, hazard and proximate cause in insurance. Costs and benefits of

insurance to society – branches of insurance. Insurable loss exposures –

features of a loss that is ideal for insurance. Construction of mortality table –

computation of premium of life insurance for fixed duration and for the

whole life –

30 hours

Hall

Model Question Paper

B.Sc. STATISTICS

Semester V

CORE COURSE- XII POPULATION STUDIES AND ACTUARIAL

SCIENCE

Part A

Time 3hrs Answer all questions (Weight 1 for bunch of 4)

1. Vital statistics is mainly concerned with

(a) births (b) deaths (c) marriages (d) all the above

2. Vital rates are customarily expressed as

(a) percentages (b) per thousand (c) per million (d) per ten thousand

3. The registration of births, deaths and marriages are

(a) a fancy of society (b) a part of medical research

(c) a legal document (d) all the above

4. The child bearing age in India is

(a) 20-24 years (b) 20-29 years (c) 13-49 years (d) 15- 49 years

5. The relation between N.R.R and G.R.R is

(a) N.R.R and G.R.R are usually equal (b) N.R.R can never exceed G.R.R

(c) N.R.R is generally greater than G.R.R (d) none of the above

6. Life-table has also been named as

(a) survival table (b) mortality table (c) life expectancy table (d) all the above

7. Normally a life-table is constructed for an age interval of

(a) five years (b) ten years (c) one year (d) 5-10 years

8. The central mortality rate ‘mx ’ in terms of qx is given by the formula

(a) $\frac{2q_x}{2+q_x}$ (b) $\frac{2q_x}{2-q_x}$ (c) $\frac{q_x}{2+q_x}$ (d) $\frac{q_x}{2-q_x}$

9. The payment received by the insurer is known as

(a) loss (b) cost (c) premium (d) benefit

10. _______ is a condition that increases the frequency or severity of loss.

(a) peril (b) hazard (c) risk (d) loss exposure

11. Uncertainty of loss is known as

(a) probability (b) hazard (c) loss exposure (d) risk

12. The cause of loss is defined as

(a) hazard (b) risk (c) peril (d) claim

SECTION B

(Answer all the questions) Weight 1

13. Death rate computed for a specified section of the population is known as ______.

14. The ratio of instantaneous rate of decrease in lx to the value of lx is

defined as _______.

15. The expectation of life at any age can be obtained from a ________.

16. Pearl’s Vital Index = ________

17. An abridged life table usually consists of ages at distance of ________ years.

18. ______ is a financial arrangement that redistributes the costs of unexpected losses.

19. The insured’s possibility of loss is called the insured’s _______.

20. If the covered peril is death, the contract is called _______.

SECTION C

(Answer any four questions) weight 2

21. What are the various uses of vital statistics?

22. What is expectation of life? Distinguish between ‘curtate expectation’ and ‘complete

expectation’ of life.

23. Define general fertility rate. Explain its merits and demerits.

24. What do you understand by an abridged life table?

25. Discuss the costs and benefits of insurance to society.

26. Explain life insurance and fire insurance.

SECTION D

(Answer any two questions) weight 4

27. Compute the crude and standardized death rates of the two populations A and B,

regarding A as standard population, from the following data:

Age-group            A                        B

(Years)      Population   Deaths      Population   Deaths

Under 10     20,000       600         12,000       372

10-20        12,000       240         30,000       660

20-40        50,000       1250        62,000       1612

40-60        30,000       1050        15,000       525

Above 60     10,000       500         3,000        180
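The rates in question 27 can be checked with a short sketch. It assumes direct standardization (the age-specific death rates of B applied to the age distribution of A) and rates expressed per thousand; the names `rows`, `cdr_a`, `cdr_b` and `sdr_b` are illustrative. With A as the standard, the standardized rate of A equals its crude rate.

```python
# (age group, population A, deaths A, population B, deaths B)
rows = [
    ("under 10", 20_000, 600, 12_000, 372),
    ("10-20",    12_000, 240, 30_000, 660),
    ("20-40",    50_000, 1250, 62_000, 1612),
    ("40-60",    30_000, 1050, 15_000, 525),
    ("above 60", 10_000, 500, 3_000, 180),
]

pop_a = sum(r[1] for r in rows)
pop_b = sum(r[3] for r in rows)

# Crude death rates per 1000 population.
cdr_a = 1000 * sum(r[2] for r in rows) / pop_a
cdr_b = 1000 * sum(r[4] for r in rows) / pop_b

# Directly standardized rate for B: B's age-specific rates weighted by
# the standard population A.
expected_deaths = sum((r[4] / r[3]) * r[1] for r in rows)
sdr_b = 1000 * expected_deaths / pop_a

print(round(cdr_a, 2), round(cdr_b, 2), round(sdr_b, 2))
```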

Age in years   lx       dx    px   qx   Lx   Tx          e°x

4              95,000   500   ?    ?    ?    4,850,300   ?

5              ?        400   ?    ?    ?    ?           ?
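The blank entries of the life table fragment above follow from the standard column relations. The sketch below fills them in under one stated assumption: deaths are spread uniformly over each year of age, so that L_x = l_x - d_x / 2. The variable names are illustrative.

```python
# Given entries.
l4, d4, T4 = 95_000, 500, 4_850_300
d5 = 400

# Age 4.
q4 = d4 / l4            # probability of dying between ages 4 and 5
p4 = 1 - q4             # probability of surviving to age 5
L4 = l4 - d4 / 2        # person-years lived in [4, 5), uniform deaths assumed
e4 = T4 / l4            # complete expectation of life at age 4

# Age 5.
l5 = l4 - d4
q5 = d5 / l5
p5 = 1 - q5
L5 = l5 - d5 / 2
T5 = T4 - L4            # person-years lived beyond age 5
e5 = T5 / l5

print(l5, round(q4, 5), L4, T5, round(e4, 2), round(e5, 2))
```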

CORE COURSE XII: PRACTICAL

1. Numerical questions from the following topics of the syllabi are to be asked

for external examination of this paper. The questions are to be evenly chosen

d. Sample surveys

e. Design of Experiments

g. Linear Programming

h. Numerical Analysis

i. Time series

j. Index Numbers

d) Numerical Analysis

e) Sample surveys

f) Design of Experiments

g) Construction of Control Charts

h) Linear Programming

i) Time Series

B.Sc. STATISTICS

Semester VI

(Answer any Four Questions)

1(a) Compute chain index numbers with 1981 prices as base from the following table

giving the average wholesale prices of the commodities A, B and C for the years

1986 to 1990. (Wt-1)

Commodity Average Whole Sale Price (Rs)

1986 1987 1988 1989 1990

A 20 16 28 35 21

B 25 30 24 36 45

C 20 25 30 24 30

(b) Calculate seasonal indices by the ratio to moving average method (Wt-1)

Year I Qtr II Qtr III Qtr IV Qtr

1998 68 62 61 63

1999 65 58 66 61

2000 68 63 63 67

2(a) In a study it is reported that 60 out of a group of 1000 insured persons died within a

year. Examine whether this justifies the assumption that less than 4% only are

likely to die within a year, at 5% level of significance (Wt-1)
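Question 2(a) is a large-sample test for a single proportion. The sketch below computes the usual z statistic, under the assumption that the hypothesis tested is H0: p = 0.04 against the one-sided alternative p > 0.04; the variable names are illustrative, and the normal tail probability is obtained from `math.erfc`.

```python
import math

n = 1000          # insured persons observed
deaths = 60
p0 = 0.04         # hypothesised proportion (assumed null value)

p_hat = deaths / n
se = math.sqrt(p0 * (1 - p0) / n)          # standard error under H0
z = (p_hat - p0) / se

# Upper-tail probability of the standard normal distribution.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(round(z, 3), p_value < 0.05)
```

A z value beyond the one-sided 5% point 1.645 would count against the 4% assumption.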

(b) A sample of 200 boys who passed SSLC examination has a mean marks 50 with

standard deviation 5. The mean marks for a sample of 100 girls was found to be 48

with standard deviation 4. Does this indicate any significant differences between

the abilities of boys and girls, assuming that the standard deviations are the same,

at 5% level of significance (Wt-1)

3 (a) Ten accountants were given intensive coaching and two tests were conducted in a

month. The scores of tests 1 and 2 are given below. (Wt-1)

S.No. of Accountant : 1    2    3    4    5    6    7    8    9    10

Marks in 1st Test   : 50   42   51   42   60   41   70   55   62   38

Marks in 2nd Test   : 62   40   61   52   68   51   64   63   72   50

Do the scores from test 1 to test 2 show an improvement? Test at 5% level of

significance.
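Because the same ten accountants sit both tests, question 3(a) is naturally treated as a paired t-test on the differences (second test minus first test). The sketch below computes the statistic on 9 degrees of freedom; treating the problem as a paired, one-sided test is an assumption about the intended method, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

test1 = [50, 42, 51, 42, 60, 41, 70, 55, 62, 38]
test2 = [62, 40, 61, 52, 68, 51, 64, 63, 72, 50]

diffs = [b - a for a, b in zip(test1, test2)]   # improvement per accountant
n = len(diffs)

d_bar = mean(diffs)
s_d = stdev(diffs)                  # sample standard deviation, divisor n - 1
t_stat = d_bar / (s_d / math.sqrt(n))

print(d_bar, round(s_d, 3), round(t_stat, 3), "df =", n - 1)
```

The statistic is compared with the upper 5% point of the t distribution with 9 degrees of freedom (about 1.833).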

23.6, 28.1, 21.0, 27.8, 19.2, 22.2, 25.0, 23.0, 26.0 obtain (i) a point estimate of

the population mean (ii) a 99% confidence interval for the population mean.

(Wt-1)

4(a) A sample of 30 students, is to be drawn from a population consisting of 300

students belonging to two colleges A and B. The means and standard deviations

are given below.

            Total No. of Students   Mean   Standard Deviation

College A   200                     300    10

College B   100                     60     40

Draw a sample using proportional allocation. Hence obtain the variance of the

population mean and compare its efficiency with SRSWOR. (Wt-2)
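For question 4(a), proportional allocation gives n_h = n N_h / N, and the variance of the stratified sample mean is the sum over strata of W_h^2 (S_h^2 / n_h)(1 - n_h / N_h) with W_h = N_h / N. The sketch below evaluates both. It assumes the quoted standard deviations are the stratum values S_h (divisor N_h - 1); the variable names are illustrative.

```python
sizes = {"A": 200, "B": 100}       # stratum sizes N_h
sds = {"A": 10, "B": 40}           # stratum standard deviations S_h (assumed)
n = 30                             # total sample size

N = sum(sizes.values())

# Proportional allocation: n_h = n * N_h / N.
alloc = {h: round(n * Nh / N) for h, Nh in sizes.items()}

# Variance of the stratified mean with finite population correction.
var_st = sum(
    (sizes[h] / N) ** 2 * (sds[h] ** 2 / alloc[h]) * (1 - alloc[h] / sizes[h])
    for h in sizes
)

print(alloc, round(var_st, 4))
```

The comparison with SRSWOR additionally needs the overall population variance, which depends on the stratum means as well as the standard deviations.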

5(a) An experiment was carried out on wheat with 3 treatments in 4 randomized blocks.

The plan and yield per plot are as follows.

Blocks

1        2        3        4

A (8)    C (10)   A (6)    B (10)

C (12)   B (8)    B (9)    A (8)

B (10)   A (8)    C (10)   C (9)

Analyze the data and give your conclusions. (Wt-1)
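The sums of squares for the randomized block analysis in question 5(a) can be checked with the sketch below. The data are read off the plan above (blocks 1 to 4, treatments A, B, C); the dictionary layout and variable names are illustrative assumptions.

```python
# Yields keyed by block, then treatment.
yields = {
    1: {"A": 8, "B": 10, "C": 12},
    2: {"A": 8, "B": 8, "C": 10},
    3: {"A": 6, "B": 9, "C": 10},
    4: {"A": 8, "B": 10, "C": 9},
}
blocks = sorted(yields)
treatments = sorted(yields[1])
r, t = len(blocks), len(treatments)

values = [yields[b][k] for b in blocks for k in treatments]
cf = sum(values) ** 2 / (r * t)                     # correction factor

ss_total = sum(v * v for v in values) - cf
ss_treat = sum(sum(yields[b][k] for b in blocks) ** 2
               for k in treatments) / r - cf
ss_block = sum(sum(yields[b].values()) ** 2 for b in blocks) / t - cf
ss_error = ss_total - ss_treat - ss_block

df_error = (r - 1) * (t - 1)
f_treat = (ss_treat / (t - 1)) / (ss_error / df_error)

print(round(ss_treat, 3), round(ss_block, 3), round(ss_error, 3),
      round(f_treat, 3))
```

The treatment F ratio is referred to the F distribution with (t - 1, (r - 1)(t - 1)) = (2, 6) degrees of freedom.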

(b) The following are the number of defects noted in the final inspection of 30 days of

woolen cloths:-

0,3,1,3,2,2,1,3,5,0,2,0,0,1,2,4,3,0,0,0,

1,2,4,5,0,9,4,10,3 and 6

Draw suitable control chart. (Wt-1)
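Counts of defects are usually charted with a c chart, whose limits are c̄ ± 3√c̄ with the lower limit truncated at zero. The sketch below computes the limits for the 30 counts above and lists the samples falling outside them; choosing the c chart is an assumption about the intended answer, and the variable names are illustrative.

```python
import math

defects = [0, 3, 1, 3, 2, 2, 1, 3, 5, 0, 2, 0, 0, 1, 2, 4, 3, 0, 0, 0,
           1, 2, 4, 5, 0, 9, 4, 10, 3, 6]

c_bar = sum(defects) / len(defects)          # centre line
ucl = c_bar + 3 * math.sqrt(c_bar)           # upper control limit
lcl = max(0.0, c_bar - 3 * math.sqrt(c_bar)) # lower limit, not below zero

# 1-based sample numbers lying outside the control limits.
out_of_control = [i for i, c in enumerate(defects, start=1)
                  if c > ucl or c < lcl]

print(round(c_bar, 3), round(lcl, 3), round(ucl, 3), out_of_control)
```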

Maximize z = 5x1 + 3x2

Subject to 3x1 + 5x2 ≤ 15

5x1 + 2x2 ≤ 10

x1, x2 ≥ 0
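The graphical method for a two-variable problem amounts to evaluating the objective at the corner points of the feasible region. The sketch below does this with exact fractions, assuming the last condition is the usual non-negativity restriction x1, x2 ≥ 0; the function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
constraints = [(3, 5, 15), (5, 2, 10), (-1, 0, 0), (0, -1, 0)]


def intersection(c1, c2):
    """Point where the boundary lines of two constraints meet, or None."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None                      # parallel boundaries
    return (Fraction(r1 * b2 - r2 * b1, det),
            Fraction(a1 * r2 - a2 * r1, det))


def feasible(point):
    x1, x2 = point
    return all(a1 * x1 + a2 * x2 <= b for a1, a2, b in constraints)


corners = {p for c1, c2 in combinations(constraints, 2)
           if (p := intersection(c1, c2)) is not None and feasible(p)}

best_point = max(corners, key=lambda p: 5 * p[0] + 3 * p[1])
best_z = 5 * best_point[0] + 3 * best_point[1]
print(sorted(corners), best_point, best_z)
```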

PROJECT

1. The project is offered in the fifth and sixth semester of the degree course

and duration of the project may spread over the complete year

in a group shall not exceed five. However, the project report shall be

3. There shall be a teacher from the department to supervise the project and the

synopsis of the project should be approved by that teacher. The head of the

4. As far as possible, topics for the project may be selected from the applied

The following books may be used to get an idea about projects and project

report writing.

ELECTIVE SUBJECTS

can select one of the following Elective subjects in the sixth semester.

A. ACTUARIAL SCIENCE

PROBABILITY MODELS AND RISK THEORY

Module.1 Individual risk model for a short time: Model for individual claim

random variables-Sums of independent random variable-

Approximation for the distribution of the sum-Application to

insurance 10hrs

aggregate claims-Selection of basic distributions-Properties of

compound Poisson distributions –Approximations to the distribution

of aggregate claims 15hrs

adjustment coefficient-Discrete time model-The first surplus below

the initial level-The maximal aggregate loss 15hrs

Approximating the individual model-Stop-loss re-insurance-The

effect of re-insurance on the probability of ruin 14hrs

McCutcheon, J.J., Scott William (1986): An introduction to Mathematics

of Finance

Butcher, M.V., Nesbitt, Cecil (1971): Mathematics of Compound Interest,

Ulrich’s Books

Neill, Alistair (1977): Life Contingencies, Heinemann.

Bowers, Newton L. et al. (1997): Actuarial Mathematics, Society of

Actuaries, 2nd Ed

Time: 3 Hrs

Part A

Choose the correct answer from the brackets

Bunch of four questions carries one weightage

1. Let X be the number obtained when one true die is tossed. Let Y be the sum of the

numbers obtained when X true dice are then thrown. Calculate E[Y]

(a) 4/7 (b) 7/4 (c) 2/6 (d) 3/6

2. Under certain assumptions, the probability of ruin is

Ψ(u) = (0.3) e^{-2u} + (0.2) e^{-4u} + (0.1) e^{-7u}, u > 0. Calculate θ

(a) 2/3 (b) 1/3 (c) 1 (d) 1/2

3. Suppose that λ = 3, C = 1 and P(x) = (1/3) e^{-3x} + (16/3) e^{-6x}, x > 0. Calculate P1

(a) 3/27 (b) 6/27 (c) 4/27 (d) 5/27

4. Suppose that λ = 1, C = 10 and P(x) = (9x/25) e^{-3x/5}, x > 0. Calculate θ

(a) 3 (b) 4 (c) 2 (d)5

5. Suppose that the claim amount distribution is discrete with P(1) = 1/4 and

P(2) = 3/4. If R = log 2, calculate θ

(a) 10/(7 log 2) - 1 (b) 10/(7 log 2) (c) 10/(5 log 2) - 1 (d) 10/(5 log 2)

6. Suppose that Wi assumes only, the value 0 and +2 and that

P[W=0]=p,P[W=2]=q,where p+q=1,Assume that C=1,P>1/254

7. Consider an insurance portfolio that will produce 0, 1, 2, or 3 claims in a fixed

time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual

claim will be of amount 1, 2, or 3 with probabilities 0.5, 0.4 and 0.1

respectively. Calculate E[N]

(a) 1.7 (b) 2.7 (c) 2.8 (d) 1.6

8. Suppose that θ = 2/5 and p(x) = (3/2) e^{-3x} + (7/2) e^{-7x}, x > 0. Calculate γ

(a) 2 (b)3 (c)4 (d)2.5

9. If S has a compound Poisson distribution given by λ=3,p(1)= 5/6,p(2)=1/6,

Calculate fs(x) for x=0

(a) 0.050 (b) 0.25 (c) 0.052 (d) 0.523

10. Consider an insurance portfolio that will produce 0, 1, 2, or 3 claims in a fixed

time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual

claim will be of amount 1, 2, or 3 with probabilities 0.5, 0.4 and 0.1

respectively. Calculate V[N]

(a) 0.8 (b) 0.028 (c) 0.08 (d) 0.285

11. Suppose that θ = 2/5 and p(x) = (3/2) e^{-3x} + (7/2) e^{-7x}, x > 0. Calculate R

(a) 2.5 (b) 3.45 (c) 4.25 (d) 2.5

Calculate Fs(x) for x=2

(a) 0.354 (b) 0.258 (c) 0.520 (d) 0.545

PART B

Attempt all questions- each question carries one weightage

function of N is given by

P[N=n] = p q^n, n = 0, 1, 2, …..

where 0 < q < 1 and p = 1 - q. Determine MS(t) in terms of MX(t).

14. If S has a compound Poisson distribution, specified by λ and p(x), then the distribution of

$Z = \frac{S - \lambda p_1}{\sqrt{\lambda p_2}}$ converges to the standard normal distribution as λ → ∞?

15. Write an expression for the distribution of the surplus level at the first time, the

surplus falls below the initial level u, given that it does fall below u, if all

claims are of size 2?

16. Derive an expression for Ψ(u) if the Xi’s have an exponential claim amount

distribution?

17. Write an expression for the distribution of L, if the size of the individual claims

has an exponential distribution with parameter β?

18. Find the mean and variance of the Inverse Gaussian distribution, by using its

mgf

19. Derive an expression for R in the special case where the Wi’s common

distribution is N(µ,σ2)?

20. Determine the adjustment coefficient if the claim amount distribution is

exponential with parameter β>0?

PART C

Attempt any four questions- each question carries two weightage

21. Assume that u(λ) is the gamma probability density function with parameters α

and β,

$u(\lambda) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,\lambda^{\alpha-1} e^{-\beta\lambda}, \quad \lambda > 0$

where $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1} e^{-y}\,dy$. Show that the marginal distribution of N is negative

binomial with parameters r = α, p = β/(1+β)

22. Prove that if S1, S2, …, Sm are mutually independent random variables, such

that Si has a compound Poisson distribution with parameter λi and d.f. of

claim amount Pi(x), i = 1, 2, …, m, then S = S1 + S2 + … + Sm has a

compound Poisson distribution with $\lambda = \sum_{i=1}^{m} \lambda_i$ and $P(x) = \sum_{i=1}^{m} \frac{\lambda_i}{\lambda} P_i(x)$

23. Assume that u(λ) is the inverse Gaussian p.d.f. with parameters α and β. Exhibit

the moment generating function of N, E [N] and V [N]?

distribution p(x) is exponential with parameter θ ?

25. Calculate the adjustment coefficient if all the claims are of size 1?

26. Calculate the probability of ruin in the case that the claim amount distribution is

exponential with parameter β

PART D

Attempt any two questions- each question carries four weightage

is 1/6 and B, the benefit amount given that there is a claim, has pdf

$f(y) = \begin{cases} 2(1-y), & 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$

Let S be the total claims for the portfolio. Using a normal distribution, Estimate

P[S>4]

28. Prove that for a compound distribution where the probability distribution for N,

the number of claims, satisfies the condition

$\frac{P[N=n]}{P[N=n-1]} = a + \frac{b}{n}, \quad n = 1, 2, \ldots$

and where the distribution of claim amounts is restricted to the

positive integers,

$f_S(x) = \sum_{i=1}^{x} \left[a + \frac{b\,i}{x}\right] p(i)\, f_S(x-i), \quad x = 1, 2, \ldots$

with the starting value fS(0) = P[N=0]
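The recursion in question 28 is straightforward to run numerically. The sketch below applies it to the compound Poisson case, where a = 0, b = λ and fS(0) = e^{-λ}, using the illustrative values λ = 3, p(1) = 5/6, p(2) = 1/6 that appear in Part A; the function name `recursive_pmf` and the dictionary representation of p are assumptions made for the example.

```python
import math


def recursive_pmf(a, b, f0, severity, x_max):
    """f_S(0..x_max) from f_S(x) = sum_i (a + b*i/x) p(i) f_S(x - i)."""
    f = [f0]
    for x in range(1, x_max + 1):
        total = 0.0
        for i in range(1, x + 1):
            # severity.get returns 0 for claim sizes with no probability.
            total += (a + b * i / x) * severity.get(i, 0.0) * f[x - i]
        f.append(total)
    return f


lam = 3.0
severity = {1: 5 / 6, 2: 1 / 6}          # p(1), p(2)

# Poisson claim numbers: P[N=n] / P[N=n-1] = lam / n, so a = 0 and b = lam.
f = recursive_pmf(0.0, lam, math.exp(-lam), severity, 2)
F2 = sum(f)                              # distribution function at x = 2

print([round(v, 4) for v in f], round(F2, 4))
```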

29. Given that θ = 2/5, and P(x) = (3/2) e^{-3x} + (7/2) e^{-7x}, x > 0. Calculate Ψ(u), γ, R.

B. STOCHASTIC MODELING

Module 1. Concept of mathematical modeling, definition, natural testing a

Definition of stochastic process, classification, Markov chain, transition

30hrs

14hrs

1. V.K. Rohatgi: An introduction to probability Theory and Mathematical

B.Sc. STATISTICS

Semester VI

STOCHASTIC MODELING

(a) $P(X = k)$ (b) $\sum_k P(X = k)$ (c) $\sum_k P(X = k)\,s^k$ (d) $P(X = k)\,s^k$

(a) $\int_0^x g(x-y) f(y)\,dy$ (b) $\int_0^x g(x+y) f(y)\,dy$ (c) $\int_0^x g(x-y) f(y-x)\,dy$ (d) $\int_0^x g(x) f(y)\,dy$

process of

(a) Discrete time (b) Continuous time (c) Discrete state & continuous time (d) Discrete time & discrete state

(a) $P(X_n \mid X_{n-1}) = 0$ (b) $P(X_n \mid X_{n-1}) \ne 0$ (c) $P(X_n \mid X_{n-1}) \ge 0$ (d) $P(X_n \mid X_{n-1}) = P(X_n)$

5. State i is a return state if

(a) $P_{ij}^{(n)} > 0$ for some n ≥ 1 (b) $P_{ij}^{(n)} > 0$ for all n ≥ 1 (c) $P_{ij}^{(n)} = 0$ for all n ≥ 1 (d) $P_{ij} > 0$

6. State j is absorbing if

(a) $P_{jj}^{(n)} > 0$ for some n ≥ 1 (b) $f_{jj} = 1$ (c) $P_{jj} < 1$ (d) $f_{ij} = 1$

7. For an irreducible Markov chain, if one state is ergodic, then

(a) all states are ergodic (b) one more state is ergodic (c) no other state is ergodic (d) none

8. For the following Markov chain with states 1, 2, 3,

$$P = \begin{pmatrix} 0 & 1 & 0 \\ 1/2 & 0 & 1/2 \\ 0 & 1 & 0 \end{pmatrix},$$

the chain is

(a)transient (b) recurrent (c) absorbing (d) none of these

9. For the above Markov chain

(a) $P^2 = P$ (b) $P^3 = P$ (c) $P^4 = P$ (d) $P^2 = P^3$

10. For a Poisson process {N(t)}, $p_n(t)$ is

(a) independent of time (b) dependent on t (c) dependent on the length of the time interval (d) zero

11. Which of the following is incorrect for a Poisson process?

(a) Markovian (b) time homogeneous (c) independent (d) nonstationary

12. The interarrival distribution of a Poisson process is

(a)gamma (b) geometric (c) exponential (d) binomial
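For the three-state chain of Questions 8 and 9 above, the matrix powers can be checked directly. The sketch below uses exact fractions and a small hand-written `matmul` helper (a name introduced here only for illustration).

```python
from fractions import Fraction as F

# Transition matrix from Question 8 (states 1, 2, 3 in row order).
P = [[F(0), F(1), F(0)],
     [F(1, 2), F(0), F(1, 2)],
     [F(0), F(1), F(0)]]

def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P2 = matmul(P, P)
P3 = matmul(P2, P)
P4 = matmul(P3, P)
print(P2 == P, P3 == P, P4 == P, P2 == P3)
```

The even and odd powers alternate between two matrices, which reflects the fact that the chain has period 2.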


Part B.

Answer all questions. Weight 1

14. …………..is an example of discrete state stochastic process

15. Markov chain is a ………..time and …………………state stochastic process

16. A state j is recurrent if …………………………….

17. Chapman-Kolmogorov equation of a Markov chain

is………………………………

18. A recurrent non-null and aperiodic state of a Markov chain is

called…………………..

19. Poisson process has……………and …………………increments

20. For a Poisson process, its mean value is………

Part C.

Answer any four questions. Weight 2

22. For the following Markov chain, compute $P(X_3 = 1, X_2 = 2, X_1 = 1, X_0 = 2)$:

$$P = \begin{pmatrix} 3/4 & 1/4 & 0 \\ 1/4 & 1/2 & 1/4 \\ 0 & 3/4 & 1/4 \end{pmatrix},$$

with initial probability $P(X_0 = i) = 1/3$ for all i.

23. For the following Markov chain, check whether all states are ergodic or not:

$$P = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 1/4 & 1/8 & 1/8 & 1/2 \end{pmatrix}$$

24. Prove that in an irreducible chain, all the states are of the same type.

25. Define a Poisson process.

26. Show that the sum of two independent Poisson processes is a Poisson process.
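The joint probability asked for in Question 22 factorises, by the Markov property, into the initial probability times successive one-step transition probabilities. A minimal sketch, assuming the states are labelled 1, 2, 3 in the row order of the matrix (the question does not state the labelling) and using a hypothetical helper `path_probability`:

```python
from fractions import Fraction as F

# Transition matrix and uniform initial distribution from Question 22.
P = [[F(3, 4), F(1, 4), F(0)],
     [F(1, 4), F(1, 2), F(1, 4)],
     [F(0), F(3, 4), F(1, 4)]]
initial = [F(1, 3)] * 3

def path_probability(path, initial, P):
    """P(X_0 = path[0], X_1 = path[1], ...) with states numbered from 1."""
    prob = initial[path[0] - 1]
    for cur, nxt in zip(path, path[1:]):
        prob *= P[cur - 1][nxt - 1]
    return prob

# Path X_0 = 2, X_1 = 1, X_2 = 2, X_3 = 1.
p = path_probability([2, 1, 2, 1], initial, P)
print(p)
```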

Part D.

Answer any two questions. Weight 4

27. Prove that state j is persistent if $\sum_{n=0}^{\infty} p_{ij}^{(n)} = \infty$.


28. For the following Markov chain, show that state 1 is ergodic, state 2 is recurrent and the chain is ergodic:

$$P = \begin{pmatrix} 1/3 & 2/3 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 \\ 0 & 0 & 1/2 & 1/2 \end{pmatrix}$$

29. Derive, for a Poisson process, $p_n(t) = \frac{e^{-\lambda t} (\lambda t)^n}{n!}$, $n = 0, 1, \dots$, using its postulates.

C. RELIABILITY THEORY

series and parallel structure with example-dual structure function-coherent

structures-preservation of coherent system in terms of paths and cuts-

representations of bridge structure-times to failure-relative importance of

components-modules of coherent systems.

(20 hours)

Module 2.Reliability of Coherent systems: reliability of system of independent

components-some basic properties of system reliability-computing exact system

reliability-inclusion exclusion method-reliability importance of components.

(20 hours)

Module 3. Parametric distributions in reliability: A notion of ageing (IFR and

DFR only) with example-exponential distribution-Poisson distribution.

(14 hrs)

1. R. E. Barlow and F. Proschan (1975): Statistical Theory of Reliability and Life Testing, Holt, Rinehart and Winston

2. N. Ravi Chandran: Reliability Theory, Wiley Eastern


Model Question Paper

B.Sc. STATISTICS

Semester VI

ELECTIVE- RELIABILITY THEORY

(a) $\phi(x) = \min(x_1, \dots, x_n)$, (b) $\phi(x) = \max(x_1, \dots, x_n)$, (c) $\phi(x) = x_1 + x_n$, (d) $\phi(x) = x_1 \cdots x_n$

2. A k-out-of-n system functions if

(a) all components function, (b) only one component functions, (c) at least k components function, (d) at most k components function

3. If $\phi(1_i, x) = \phi(0_i, x)$ for all $(\cdot_i, x)$, then component i is

(a) relevant, (b) irrelevant, (c) coherent, (d)monotonic

4. If φ is the structure function, then its dual is

(a) $\phi^D(x) = 1 - \phi(x)$, (b) $\phi^D(x) = 1 - \phi(1 - x)$, (c) $\phi^D(x) = \phi(1 - x)$, (d) $\phi^D(x) = 1 + \phi(x)$

5. For a coherent system, which of the following statements is correct?

(a) a component may be relevant, (b) each component is relevant, (c) no component is relevant, (d) at least two components are relevant

6. Which of the following is reliability of a binary system?

(a) $E\phi(x)$, (b) $E\phi^2(x)$, (c) $1 - E\phi(x)$, (d) $E\phi(x) - 1$

7. Reliability of a three component series system is

(a) $(1 - p)^3$, (b) $p^3$, (c) $1 - (1 - p)^3$, (d) $p(1 - p)^2$

8. Let h(p) be the reliability function of a coherent structure. Then

(a) h(p) is increasing in $p_i$, (b) h(p) is decreasing in $p_i$, (c) h(p) is constant in $p_i$, (d) h(p) is independent of $p_i$

9. Which of the following is true?

(a) $0 < I_h(j) \le 1$, (b) $0 < I_h(j) < 1$, (c) $1 < I_h(j) \le \infty$, (d) $0 < I_h(j) \le \infty$

10. Which of the following is a failure rate function?

(a) $\frac{f(t)}{F(t)}$, (b) $\frac{f(t)}{1 - F(t)}$, (c) $\frac{F(t)}{f(t)}$, (d) $\frac{1 - f(t)}{F(t)}$

11. Which distribution has constant failure rate?

(a) normal, (b) Poisson, (c) exponential, (d) lognormal

12. A process which has stationary independent increments is

(a) gamma process, (b) Poisson process, (c) exponential process, (d) geometric process

PART B

Answer all questions (Weight 1)

14. A parallel system functions if …………component is functioning.

15.The structural importance of a component is……………

16. Reliability of a 2-out-of 3 system is………..

17. If p=0.5, then reliability of a 5 component series system is……….

18. If p=0.6, the reliability of a 2-out of 3 system is………

19. If λ = 1, then failure rate of exponential distribution is……..

20. A distribution having memory-less property is……..
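The fill-in questions on series and 2-out-of-3 systems above all follow from one formula. The sketch below assumes independent components with a common reliability p (the questions give a single p, which suggests this reading); the function name `k_out_of_n_reliability` is introduced here for illustration.

```python
from math import comb

def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n independent components work,
    each working with probability p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# A series system of n components is an n-out-of-n system.
series_5 = k_out_of_n_reliability(5, 5, 0.5)
two_of_three = k_out_of_n_reliability(2, 3, 0.6)
# A parallel system is a 1-out-of-n system.
parallel_3 = k_out_of_n_reliability(1, 3, 0.6)
print(series_5, two_of_three, parallel_3)
```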

PART C

Answer any four questions( weight 2)

22. Let $\phi(x)$ be the coherent structure of n components. Show that $\prod_{i=1}^{n} x_i \le \phi(x) \le \coprod_{i=1}^{n} x_i$.

23. Define relative importance of components?

24. Let h(p) be the reliability function of a coherent structure. Show that h(p) is increasing in each $p_i$.

25. Explain inclusion exclusion method.

26. Define IFR and DFR, and state the corresponding results for the exponential distribution.


PART D

Answer any two questions (weight 4)

28. What is the reliability importance of a component? How can we compute reliability importance in a system?

29. Establish the lack of memory property of the exponential distribution. Check whether its failure rate function is increasing, decreasing or constant.
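A short derivation for Question 29 may help fix ideas. It is a sketch, assuming the exponential density is parametrised by a rate λ > 0 (the question does not fix the parametrisation).

```latex
% Assumption: X has density f(t) = \lambda e^{-\lambda t}, t > 0,
% so that the survival function is 1 - F(t) = e^{-\lambda t}.
\begin{align*}
P(X > s + t \mid X > s)
  &= \frac{P(X > s + t)}{P(X > s)}
   = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
   = e^{-\lambda t}
   = P(X > t), \qquad s, t \ge 0, \\
r(t) &= \frac{f(t)}{1 - F(t)}
   = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}}
   = \lambda .
\end{align*}
```

Under this assumption the failure rate does not depend on t, so the exponential distribution sits on the boundary between the IFR and DFR classes.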

OPEN COURSES

A. ECONOMIC STATISTICS

illustrations, additive and multiplicative models, determination of trend,

growth curves, analysis of seasonal fluctuations, construction of seasonal

indices.

24 hours

Module 2. Index Numbers: Meaning and definition – uses and types- problems

in the construction of index numbers- simple aggregate and weighted

aggregate index numbers. Test of consistency of index numbers- factor

reversal- time reversal test and unit test. Chain base index numbers- Base

shifting- splicing- and deflating of index numbers. Consumer price index

numbers- family budget enquiry- limitations of index numbers.


30 hours

Books for reference

1. SC Gupta and V.K. Kapoor: Fundamentals of Applied Statistics,

Sultan Chand & Sons

2. Goon A.M., Gupta M.K. and Das Gupta: Fundamentals of Statistics

Vol. II, The World Press, Calcutta.

B.Sc. STATISTICS

Semester V

OPEN COURSE (ECONOMIC STATISTICS)

Time 3Hr

Part A

Answer all questions (Weight 1 for bunch of 4)

a. The natural forces affecting the variable value

b. Systematic forces affecting the variable value

c. Manmade forces affecting the variable value

d. Any sort of force affecting the variable value

2. The rise in human population is an example of

a) Trend b) Seasonal Variation c) Cyclic variation d) Random variation

3. ‘Business cycle’ is an example of

a) Trend b) Seasonal Variation c) Cyclic variation d) Random variation

4. In the method of semi-averages, trend is assumed to be

a) Linear b) quadratic c) Exponential Growth d) None of these

5. Which of the following methods can be used for getting trend values for each given time point?

a) Method of simple averages b) Method of moving averages

c) Method of least squares curve fitting d) All the above

6. Non-centered moving averages are due to

a) Odd period b) Even period

c) Odd no. of time points d) Even no. of time points

7. Seasonal variations are periodic due to

a) Man made customs, habits, rituals etc

b) Resulting due to Natural reasons

c) Resulting due to change in weather condition

d) Any force that operate regularly year after year

8. Seasonal variation is measured using

a) Seasonal Averages b) Seasonal Indices

c) Seasonal Relatives d) None of these

9. A monthly seasonal variation measures are adjusted to

a) 12 b) 120 c) 1200 d) None of these

10. A model of time- series explains the ……………….relation between value of

variable and time series components

a) Additive b) Multiplicative c) Mathematical d) None of these

a) 12 % growth from base to current year

b) 112 % growth from base to current year

c) 88 % depreciation from base to current year

d) 12 % depreciation from base to current year

12. Which of the following is called ideal Index Number

a) Laspeyres’ b) Paasche’s c) Fisher’s d) Kelly’s

Part B

Answer all questions wt 1

13. Distinguish between seasonal and cyclic variation in a time series.

14. Define the period of a moving average.

15. Give any three examples of irregular variation affecting time-series data.

16. How is seasonal variation measured?

17. Give the formula for converting chain base into fixed base and fixed base into chain base index numbers.

18. Why is base shifting necessary for index numbers?

19. Why are index numbers called economic barometers?

20. Give three major limitations of index numbers.


Part- C (answer any 4 questions) weight 2

21. How is trend measured using moving averages?

22. Explain periodic variations in Time- Series with suitable examples.

23. Explain the Link Relative Method of measuring seasonal variation.

24. Explain the uses of Index Numbers.

25. With the help of an Index Number formula, explain Time and Factor Reversal Tests.

26. Explain the concept behind developing a cost of living index number.

Part- D (Answer any 2 Questions) weight 4

27. Given the following data related to the yield of a crop in three different seasons:

Yield (kg per 10-cent plot)

Year    Season 1    Season 2    Season 3
1990    12          19          17
1991    14          25          23
1992    13          27          20
1993    15          28          22
1994    17          31          24

i) If this trend is followed, what will be the expected yield in 1995?

ii) Does season influence yield of crop?

28. Briefly explain the problems in the construction of an Index Number.

29. Calculate the cost of Living Index Number for the data given below.

Rice

Year Season 1 Season 2 Season 3

Food 30 47 4

Fuel 8 12 1

Clothing 14 18 3

House Rent 22 15 2

Miscellaneous 25 30 1
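For part (i) of Question 27 above, one possible approach is to fit a straight-line trend by least squares to the annual totals and extrapolate one year. The question does not prescribe a method, so the choice of annual totals and of a linear trend are both assumptions of this sketch, as is the helper name `linear_trend`.

```python
years = [1990, 1991, 1992, 1993, 1994]
# Seasonal yields (Season 1, Season 2, Season 3) from Question 27.
yields = [(12, 19, 17), (14, 25, 23), (13, 27, 20), (15, 28, 22), (17, 31, 24)]
totals = [sum(row) for row in yields]

def linear_trend(xs, ys):
    """Least-squares intercept and slope of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

intercept, slope = linear_trend(years, totals)
forecast_1995 = intercept + slope * 1995
print(slope, forecast_1995)
```

Seasonal indices would be needed to split the forecast total across the three seasons.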


B. QUALITY CONTROL

control limits, sub grouping, summary of out- of control criteria, charts of

attributes, np chart, p chart, c chart. Charts of variables:X bar chart, R chart

and sigma chart. Revised control charts. Applications and advantages.

30 hours

Module 2. Principles of acceptance sampling – Problems of lot acceptance,

stipulation of good and bad lots- producers’ and consumers’ risks, simple

and double sampling plans, their OC functions, concepts AQL, LTPD,

AOQL, Average amount of inspection and ASN function

24 hrs

Sons

Sons

B.Sc. STATISTICS


SEMESTER V -OPEN COURSE (QUALITY CONTROL)

Time 3Hr

Part A

Answer all questions (Weight 1 for bunch of 4)

a) 3σ b) 6σ c)2σ d) 1.96σ

2. Upper control limit for R Chart is

a) $A_2\bar{R}$ b) $A_1\bar{R}$ c) $D_3\bar{R}$ d) $D_4\bar{R}$

3. Consumers risk is usually denoted by

a) µ b)∂ c) β d) α

4. The acceptance sampling plan is used for

a) Identifying good lots b) protecting the consumer’s interest c) protecting the producer’s interest d) All of the above

5. The consumer’s risk is usually fixed at

a) .05 b).01 c).95 d) .99

6. The OC curve gives

a) proportion of bad lots b) proportion of good lots c) discriminating power of the sampling plan d) none of them.

7. Number of breakdowns in an electric wire is studied using

a) R chart b) Sigma chart c) d chart d) c chart

8. The manageable cause of a process out of control is

a) assignable b) random c) unknown d) none

9. The quality of the lot after rectifying inspection will

a) not change b) change c) improve d) worsen.

10.Which of the following is an assignable cause.

a) Humidity b) Temperature d) Location c) Wear & tear.

11. To study the variation of a process involving costly items we use

a) R chart b) sigma chart c) p chart d) d chart.

12. The exact distribution used in acceptance sampling is

a) Binomial b) Poisson c) geometric d) hypergeometric.

13. A sampling plan in which we take a decision based on one sample only is called

--------------

15. Chart used for number of defects is based on ---------- distribution

16. The control limits used before the availability of sufficient data is called-----------

17. The tabled values corresponding to subgroup sizes is given in -------- table.


18. Expand the term AOQ

19. Give an example where there is only upper specification limits.

20. Give an example where there is only lower specification limits.

21. Define AOQ and LTPD.

22. What is double sampling plan?

23. What are probability limits?

24. What are rational subgroups?

25. What happens when the control limits are within the spread of the process?

26. What is AOQL.

27. Distinguish between double and single sampling plans.

28. Draw the OC curve of the single sampling plan showing the consumers and

producers risks.

29. Describe the basis of a control chart.

C. BASIC STATISTICS


Module 1. Elements of sample surveys: Census and sampling, advantages, principal

steps in a sample survey, sampling and non sampling errors. Probability sampling,

judgement sampling and simple random sampling

15 hours

Module 2. Measures of central tendency: Mean, median, mode and their empirical

relationship. Weighted arithmetic mean- Dispersion: absolute and relative measures,

standard deviation and coefficient of variation

15 hours

scatter diagram, curve fitting, principle of least squares, fitting of straight line. Simple

correlation, Pearson’s correlation coefficient, limits of correlation coefficient,

Invariance of correlation coefficient under linear transformation.

19 hours

events. Statistical regularity, frequency definition, classical definition and axiomatic

definition of probability- Addition theorem, conditional probability, multiplication

theorem and independence of events (limited to three events).

20hrs

2. D.C.Sancheti and V.K.Kapoor: Statistics (Theory, Methods and Application)

Time: 3 Hrs


Section A

Answer all questions (Contains 12 questions, 4 Questions carry a weightage of 1)

a) Median

b) Mode

c) Geometric mean

d) Arithmetic mean

2. The most suitable measure for an ordinal data is:

a) Median

b) Arithmetic mean

c) Combined mean

d) Mode

3. Mean of 20 values is 45. If one of these values is to be taken 64 instead of 46, the

correct value of mean is:

a) 49.5

b) 45.9

c) 40.9

d) 42.9

4. The formula to find coefficient of variation is:

a) $\frac{\sigma}{\bar{X}} \times 100$ b) $\frac{\bar{X}}{\sigma} \times 100$

c) $\frac{\text{Median}}{\sigma} \times 100$ d) $\sigma \times 100$

5. Mean deviation from median is:

a) Equal to mean deviation from mean

b) Greater than mean deviation from mean

c) Less than mean deviation from mean

d) No relation

a) Leptokurtic curve

b) Mesokurtic curve

6. The value of the square of Karl Pearson’s coefficient of correlation lies between:

a) 0 and 1 b) -1 and 1


7. Karl Pearson’s coefficient of correlation for the following set of observation (3,12),(5,6)

will be:

a) Negative b) Positive

c) Zero d) No relation

9. Mutually exclusive events other than null event and sure event are:

a) not independent

b) independent

c) no relation

d) independent under some conditions

10. The probability that India wins a cricket match against England is 1/3. If India and

England play 3 matches, what is the probability that India will lose all the three

matches?

11. What is the probability that a non leap year selected at random will have 53 Sundays?

Q12. For a discrete r.v P(X >0) = P(X <0) and P(X =0) = p. The variable takes the

following values -2, -1, 0, 1, 2. What is the probability that X >0?

b) Explain your answer


15. Classical definition of probability can be used in the case of a sample space with

infinite outcomes.

b) Explain your answer

16. In the case of disjoint events A and B, P(A ∪ B) < P(A) + P(B).

a) Say true or false

b) Explain your answer

17. Getting a queen and getting a Jack while drawing cards from a deck of cards are

independent events.

b) Explain your answer

18. The correlation coefficient between X and Y is 0.85. Find the coefficient of determination.

b) Explain your answer

21. Explain why the A.M. is considered the best measure of central tendency.

22. Calculate quartile deviation for the following data:-

26, 54, 33, 41, 94, 41, 54, 26, 93, 87, 81, 64, 68, 95.

23. The first of two sub-groups has 10 items with mean 15 and S.D. 3. If the whole group has 250 items with mean 15.6 and S.D. 13.44, find the standard deviation of the second sub-group.

24. If A and B are two independent events such that

$P(A^c) = 0.7$, $P(B^c) = k$, $P(A \cup B) = 0.8$, then find the value of k.

25. A and B stand in a ring with 12 other persons. Find the probability that A & B are

together.

26. Explain why in the case of two variables there are always two regression lines? When

do they coincide?
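Question 24 above reduces to one line of algebra once independence is used: P(A ∪ B) = P(A) + P(B) − P(A)P(B). A minimal sketch with exact fractions follows; the function name `complement_prob_of_b` is hypothetical.

```python
from fractions import Fraction as F

def complement_prob_of_b(p_not_a, p_union):
    """Solve P(A u B) = P(A) + P(B) - P(A)P(B) for P(B) under
    independence, then return P(B^c)."""
    p_a = 1 - p_not_a
    p_b = (p_union - p_a) / (1 - p_a)
    return 1 - p_b

k = complement_prob_of_b(F(7, 10), F(8, 10))
print(k)
```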


PART D ( Answer any 2 questions) Weight 4

27. State and prove addition theorem for two events? Explain what happens when A is

subset of B?

28. P (A) = 1/3, P(B) = 1/4, P(A∩B) = 1/11. Find the following probabilities.

b) At least one of the events A, B happens.

c) None happens.

29. Explain the concept of rank correlation. When is it used?


STATISTICS: COMPLEMENTARY – I Syllabus for BSc.

ester hours/week hours Ext:Int

No

1   ST1C01   PROBABILITY THEORY          4   3   3   3:1
2   ST2C02   PROBABILITY DISTRIBUTIONS   4   3   3   3:1
3   ST3C03   STATISTICAL INFERENCE       5   3   3   3:1
4   ST4C04   APPLIED STATISTICS          5   3   3   3:1

There shall be 4 parts A, B, C and D in all the question papers*. Part A consists of 12

objective type questions. Part B consists of 8 questions to be answered in a word, phrase

or sentence. Part C consists of 6 questions of short essay type of which the student can

attempt 4. Part D consists of 3 questions of long essay type of which the student can

attempt 2. In Part A the weightage per question is ¼; for Part B the weightage is 1 per question. For Part C the weightage is 2 per question and for Part D the weightage is 4 per question.

As far as possible the number of questions should be proportional to the modules.

follows

1

2. 8 short answer questions 4 theory + 4 problems weight 1

Components Weight

Assignment 1

Test paper 2

Seminar 1

Attendance 1

There shall be two test papers and the average grade point is to be considered for

internal assessment

Semester I


COURSE I : PROBABILITY THEORY

properties.

15 hours

properties.

20 hours


Book for reference


Model Question Paper

Semester I

COMPLEMENTARY COURSE I

PROBABILITY THEORY

Time: 3 Hrs

Part-A

Answer all the questions weight 1 for bunch of 4

1. Cans of soft drinks cost $0.30 in a certain vending machine. What is the

expected value and variance of daily revenue (Y) from the machine, if X, the

number of cans sold per day has E(X) = 125, and Var(X) = 50 ?

(b) E(Y) = 37.5, Var(Y) = 4.5

(c) E(Y) = 37.5, Var(Y) = 15

(d) E(Y) = 37.5, Var(Y) = 15

(e) E(Y) = 125, Var(Y) = 4.5

Solution: b

projected annual cash flow for the new location is:

Annual cash flow   $10,000   $30,000   $70,000   $90,000   $100,000
Probability        0.10      0.15      0.50      0.15      ?

The expected cash flow for the new location is:

(a) $12,800

(b) $64,000

(c) $70,000

(d) $60,000

(e) $50,000

Solution: b


3. The probability that the Red River will flood in any given year has been estimated from 200 years of historical data to be one in four. This means

(a) The Red River will flood every four years.

(b) In the next 100 years, the Red River will flood exactly 25 times.

(c) In the last 100 years, the Red River flooded exactly 25 times.

(d) In the next 100 years, the Red River will flood about 25 times.

(e) in the next 100 years, it is very likely that the Red River will flood exactly 25

times.

Solution: d

4. The chances that you will be ticketed for illegal parking on campus are about 1/3. During the last nine days, you have illegally parked every day and have NOT been ticketed (you lucky person)! Today, on the 10th day, you again decide to park illegally. The chances that you will be caught are:

(a) greater than 1/3 because you were not caught in the last nine days.

(b) less than 1/3 because you were not caught in the last nine days.

(c) still equal to 1/3 because the last nine days do not affect the probability.

(d) equal to 1/10 because you were not caught in the last nine days.

(e) equal to 9/10 because you were not caught in the last nine days.

Solution: c

5. The chance that a person will contract AIDS after a sexual contact with an infected

partner has been estimated to be 1/4. This means:

(a) A person will be infected after exactly 4 sexual contacts with infected partners.

(b) Of 1000 people having sexual contacts with infected partners, exactly 250 will

become infected.

(c) Of 200 people having sexual contacts with infected partners, about 50 will

become infected.

(d) In exactly 25% of all sexual contacts with infected partners, the infection will

spread.

(e) Of 20 people having sexual contact with infected partners it is very likely that

exactly 5 people will become infected.

Solution: c

Y -1 0 1 2

P(y) 3C 2C 0.4 0.1

a) 0.1

b) 0.15

c) 0.2

d) 0.25

e) 0.75

Solution a


7. A random variable X has probability distribution as follows

R 0 1 2 3

P[R=r] 2k 3k 13k 2k

The probability P[X < 2] is equal to

a) 0.9

b) 0.25

c) 0.65

d) 0.15

e) 0.75

Solution b

8. If A, B, C are any three events, the probability of at least one is represented by

a) P[A ∪ B ∪ C]

b) P[AB ∪ AC ∪ BC]

c) P[A ∩ B ∩ C]

d) 1 − P[A ∪ B ∪ C]

e) P[A ∪ B ∪ C]

9. A continuous random variable X has p.d.f. $f(x) = 3x^2$, $0 \le x \le 1$. If $P[X \le a] = P[X > a]$, then a is

1

a)

3

−1 / 3

b) 2

3

1

c)

2

1

d)

3

3

1

e)

2

Solution b

10 If F(x) is the distribution function of X, and if Y = F(x), then E(Y) is

a) $\frac{1}{2}$

b) 1

c) y

d) 2

e) none of the above


11 For a continuous random variable with p.d.f. f(x) and distribution function F(x),

which may not be true

a) $0 \le f(x) \le 1$

b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

c) $0 \le F(x) \le 1$

d) $P[X = 0] = 0$

e) $F(\infty) = 1$

Solution a

12. If the rth moment of a random variable X is $\mu_r' = r!$, the moment generating function is

a) $(1 - t)$

b) $\frac{t}{1 - t}$

c) $(1 - t)^{-1}$

d) $\ln(1 - t)$

e) None of these
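The calculations behind Questions 1 and 2 of this part can be sketched in a few lines. Two assumptions are made here: revenue is Y = 0.30 X, and the unknown probability in the cash-flow table is whatever makes the probabilities sum to one. The helper name `linear_transform_moments` is illustrative only.

```python
def linear_transform_moments(c, mean_x, var_x):
    """Mean and variance of Y = c * X: E(Y) = c E(X), Var(Y) = c^2 Var(X)."""
    return c * mean_x, c ** 2 * var_x

# Question 1: cans cost $0.30, E(X) = 125, Var(X) = 50.
mean_y, var_y = linear_transform_moments(0.30, 125, 50)

# Question 2: expected annual cash flow.
cash_flows = [10_000, 30_000, 70_000, 90_000, 100_000]
known_probs = [0.10, 0.15, 0.50, 0.15]
missing_prob = 1 - sum(known_probs)  # assumed: probabilities sum to 1
probs = known_probs + [missing_prob]
expected_cash_flow = sum(c * p for c, p in zip(cash_flows, probs))
print(mean_y, var_y, expected_cash_flow)
```

Both results agree with the solutions marked in the paper (option b in each case).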

Part-B

Answer all the questions, weight 1

14 State the addition theorem of probability for 3 events.

15 Two coins are tossed one after the other until head appears. Write the sample

space

16. Let A and B be the possible outcomes of a random experiment and suppose P(A) = 0.4, $P(A \cup B) = 0.7$ and P(B) = p. For what choice of p are A and B independent?

17. If $f(x) = \frac{x}{15}$ for x = 1, 2, 3, 4, 5 and $f(x) = 0$ elsewhere, find $P\left(\frac{1}{2} < X < \frac{5}{2} \mid X > 1\right)$.

18. Is the following a distribution function?

$$F(x) = \begin{cases} 0, & x < -a \\ \frac{1}{2}\left(\frac{x}{a} + 1\right), & -a \le x \le a \\ 1, & x > a \end{cases}$$

19. If $\phi_X(t)$ is the characteristic function of X, show that $\phi_X(-t)$ and $\phi_X(t)$ are conjugate functions.

20 Define probability density function of a discrete random variable.
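The conditional probability in Question 17 of this part can be computed by direct enumeration. The sketch below reads the garbled expression as P(1/2 < X < 5/2 | X > 1), which is an assumption about the original layout; the helpers `prob` and `conditional` are illustrative names.

```python
from fractions import Fraction as F

# Probability function f(x) = x/15 on x = 1, ..., 5.
pmf = {x: F(x, 15) for x in range(1, 6)}

def prob(event):
    """Total probability of the values x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda x: event(x) and given(x)) / prob(given)

answer = conditional(lambda x: F(1, 2) < x < F(5, 2), lambda x: x > 1)
print(answer)
```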

Part-C

Answer any four questions ,weight 2

21. State and prove the addition and multiplication theorems of probability for two events.

22 From a vessel containing 3 white and 5 black balls, four balls are transferred in to

an empty vessel. From this vessel a ball is drawn and is found to be white. What

is the probability that out of four balls transferred, 3 are white and 1 is black.

23. Let X be a continuous random variable with p.d.f.

$$f(x) = \begin{cases} kx, & 0 \le x < 1 \\ k, & 1 \le x < 2 \\ -kx + 3k, & 2 \le x < 3 \\ 0, & \text{elsewhere} \end{cases}$$

(1) Find the constant k, (2) Determine the distribution function.

24. Define raw and central moments. Establish the relation between raw and central moments of a random variable.

25. Find the measures of skewness and kurtosis based on moments for the following p.d.f.: $f(x) = \frac{1}{2} x^2 e^{-x}$, $0 < x < \infty$.

26. State and prove Bayes’ theorem.
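Question 22 of this part is a direct application of Bayes' theorem, and the arithmetic can be sketched with exact fractions. The prior is hypergeometric (k white balls among the four transferred) and the likelihood of drawing a white ball is k/4; the function names are introduced here for illustration.

```python
from fractions import Fraction as F
from math import comb

def prior(k):
    """P(k white among the 4 balls transferred from 3 white + 5 black)."""
    return F(comb(3, k) * comb(5, 4 - k), comb(8, 4))

def likelihood(k):
    """P(white ball drawn | k white among the 4 transferred)."""
    return F(k, 4)

evidence = sum(prior(k) * likelihood(k) for k in range(4))
posterior_3_white = prior(3) * likelihood(3) / evidence
print(evidence, posterior_3_white)
```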

Part-D

Answer any two questions, weight 4

27. The distance X, in thousands of km, which car owners get with a certain kind of tyre is a random variable having probability density function $f(x) = \frac{1}{20} e^{-x/20}$ for x > 0 and $f(x) = 0$ for x ≤ 0. Find the probabilities that one of these tyres will last (1) at least 10000 km, (2) anywhere from 16000 to 24000 km and (3) at least 30000 km. (4) Find the expected distance in km the car owners get with the tyre.

28 Explain axiomatic definition of probability

29 Explain the terms. (1) Random experiment, (2) Sample space, (3) Mutually

exclusive events, (4) Equally likely events. With example.
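The four parts of Question 27 above all come from the survival function of the exponential distribution. The sketch assumes the density is (1/20) exp(−x/20) with x in thousands of km, which is how the flattened formula in the question is read here.

```python
import math

MEAN_THOUSAND_KM = 20.0  # assumed mean of the exponential lifetime

def survival(x):
    """P(X > x) for x in thousands of km."""
    return math.exp(-x / MEAN_THOUSAND_KM)

p_at_least_10 = survival(10)               # (1) at least 10000 km
p_16_to_24 = survival(16) - survival(24)   # (2) between 16000 and 24000 km
p_at_least_30 = survival(30)               # (3) at least 30000 km
expected_km = MEAN_THOUSAND_KM * 1000      # (4) expected distance in km
print(p_at_least_10, p_16_to_24, p_at_least_30, expected_km)
```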


Semester II

15hours

15 hours

30 hours


Module 4. Law of large Numbers: Chebychev’s inequality, convergence

12 hours


Model Question Paper

Semester II

COMPLEMENTARY COURSE I

PROBABILITY DISTRIBUTIONS

Time 3hrs

Part A

(Answer all the questions. Choose the correct answer from the alternatives

given below each question). Weight 1 for a bunch of 4 questions

1. For two random variables x and y, the relation E (xy)= E(x) E(y) holds good.

a) if x and y are identical

b) for all x and y

c) if x and y are statistically independent

d) None of the above.

2. If V(x) = 1, then V(2x ± 3) is

a) 5 b) 13 c) 14 d) 1

3. $E(x - k)^2$ is minimum when

a) k < E(x) b) k = E(x) c) k > E(x) d) k² = E(x)

4. If x is a random variable having probability function f(x), then the function $\sum e^{itx} f(x)$, where i is the imaginary unit, is known as

a) moment generating function

b) probability generating function

c) probability distribution function

d) characteristic function

5. The skewness of a binomial distribution will be zero if

a) p < ½

b) p = ½


c) p > ½

d) p < q

6. The coefficient of variation of a Poisson distribution with mean 4 is

a) ¼ b) 2/4 c) 4 d) 2

7. X is normally distributed with zero mean and unit variance. The variance of $X^2$ is

a) 0 b) 1 c) 2 d) 4

8. In a normal curve, the area to the right of the point $x_1$ is 0.6 and to the left of the point $x_2$ is 0.7. Which is the correct statement?

a) $x_1 > x_2$ b) $x_1 < x_2$ c) $x_1 = x_2$ d) none of them

9. For a normal distribution, Q.D., M.D. and S.D. are in the ratio

a) $\frac{4}{5} : \frac{2}{3} : 1$ b) $\frac{2}{3} : \frac{4}{5} : 1$ c) $1 : \frac{4}{5} : \frac{2}{3}$ d) $\frac{1}{2} : 1 : \frac{4}{5}$

10. If x is a continuous r.v. with mean µ and variance σ², then for any positive number k, $P[\,|x - \mu| > k\sigma\,] \le \frac{1}{k^2}$ is known as

a) Liapunov’s inequality b) Tchebycheff’s inequality

c) Bienayme-Tchebycheff’s inequality d) Khinchin’s inequality

11. If x and y are two random variables such that their expectations exist and

P(x ≤y) =1 then

a) E(x) ≤E (y) b) E (x) >E (y)

c) E(x) = E(y) d) None of the above

12. If x is a standard normal variate, then $\frac{1}{2}x^2$ is

a) Gamma variate with parameter 1/2

b) Normal variable

c) Poisson variable with parameter 1/2

d) Exponential variable with parameter 2


Part B

(Answer all the questions) Weight 1

14. If x is a random variable, $E(x - \text{constant})^2$ is minimum when the constant is

15. Name the discrete distribution for which mean and variance have the same

value.

16. What is the third moment about the mean of a Poisson distribution if the second moment about the origin is 12?

17. Identify the distribution (using the uniqueness property) if the moment generating function of the distribution is $M_x(t) = (1 + e^t)^5 / 32$.

18. The relationship between beta distributions of the first and second kind is----

19. What is the characteristic function of a standard Cauchy distribution?

20. What are the points of inflexion of a normal curve N(µ, σ)?

Part C

(Answer any 4 questions) Weight 2

$V(ax + by) = a^2 V(x) + b^2 V(y)$.

22. x and y are independent random variables with means 10 and 20, and variances

2 and 3 respectively find the mean and variances of 3x+4y.

23. A symmetric die is thrown 600 times. Find the lower bound for the probability of getting 80 to 120 sixes.

24. For a binomial distribution, the mean is 6 and the S.D. is 2. Write out all the

parameters of the distribution.

25. Show that for the normal distribution the points of inflexion lie at a distance

of ± σ from the mean where σ is the S. D.

26. If X ~ N(30, 5), find the probability that |X − 30| > 5.
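The bound requested in Question 23 of this part follows from Chebyshev's inequality applied to a binomial count. The sketch reads the question as counting sixes in 600 throws of a fair die, which is an assumption about the garbled wording, and uses exact fractions throughout.

```python
from fractions import Fraction as F

n, p = 600, F(1, 6)          # assumed: number of sixes ~ Binomial(600, 1/6)
mean = n * p                 # expected number of sixes
variance = n * p * (1 - p)   # binomial variance n p q
half_width = 20              # 80 to 120 is mean +/- 20

# Chebyshev: P(|X - mean| < c) >= 1 - variance / c^2.
lower_bound = 1 - variance / half_width ** 2
print(mean, variance, lower_bound)
```

The same variance rule gives Question 22 directly: for independent x and y, V(3x + 4y) = 9 V(x) + 16 V(y).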


Part D

(Answer any 2 questions) Weight 4

28. Show that under certain conditions (to be stated) a binomial distribution

tends to the Poisson distribution

29. Fit a Poisson distribution to the following data.

Number of mistakes per page:   0     1    2    3   4   Total
Frequency:                     109   65   22   3   1   200
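Fitting a Poisson distribution to the data of Question 29 amounts to estimating the parameter by the sample mean and multiplying the Poisson probabilities by the total frequency. A minimal sketch, treating the second row of the table as observed page counts (an assumption, since the row label is missing):

```python
import math

mistakes = [0, 1, 2, 3, 4]
observed = [109, 65, 22, 3, 1]   # assumed: number of pages with x mistakes
total = sum(observed)

# Estimate lambda by the sample mean.
lam = sum(x * f for x, f in zip(mistakes, observed)) / total

# Expected frequencies N * e^{-lambda} * lambda^x / x!
expected = [total * math.exp(-lam) * lam ** x / math.factorial(x)
            for x in mistakes]
for x, obs, exp_freq in zip(mistakes, observed, expected):
    print(x, obs, round(exp_freq, 2))
```

The expected frequencies for x = 0, ..., 4 sum to slightly less than 200 because the Poisson distribution also assigns a small probability to five or more mistakes.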


SEMESTER III

10 hours

and composite hypotheses, null and alternative hypotheses, type I and type


on t distribution for mean, equality of means and paired mean for paired

of attributes. 30 hours

(India),New Delhi.


Model Question Paper

Semester III

Time 3hrs

COMPLEMENTARY COURSE- I

STATISTICAL INFERENCE

Part A

Answer all questions ,4 questions carry weight 1

1. The mean of a Chi – square distribution with n degrees of freedom is

( a ) 2n ( b ) n 2 ( c ) n (d ) n

2. The relation between student’s-t and F distribution is.

(a) $t_{(n)}^2 = F_{(n,1)}$ (b) $t_{(n)}^2 = F_{(1,n)}$ (c) $t_{(1)}^2 = F_{(1,n)}$ (d) $t_{(n)}^2 = F_{(1,1)}$

3. Let $X_1, X_2, \dots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$. Then the distribution of $\frac{\sum (x_i - \bar{x})^2}{\sigma^2}$ is

(a) $\chi^2_{(n)}$ (b) $t_{(n)}$ (c) $\chi^2_{(n-1)}$ (d) $t_{(n-1)}$

4. If $s^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$, the unbiased estimator for the population variance $\sigma^2$ is

(a) $\frac{1}{n-1} s^2$ (b) $\frac{1}{n} s^2$ (c) $\frac{n}{n-1} s^2$ (d) $\frac{n-1}{n} s^2$

5. If T is a consistent estimator of θ then

(a) T is a consistent estimator of θ² (b) T² is a consistent estimator of θ

(c) T² is a consistent estimator of θ² (d) None of the above

6. Let $X_1, X_2, \dots, X_n$ be a random sample from a Bernoulli population. A sufficient statistic for p is

(a) $\sum X_i$ (b) $\prod X_i$ (c) $\max(X_1, X_2, \dots, X_n)$ (d) $\min(X_1, X_2, \dots, X_n)$

8. The 95% confidence interval for mean µ of a normal population N ( µ , σ 2 ) with

known σ 2

n n n n

9. The mean difference between 9 paired observations is 15 and standard deviation of

differences is 5. Then the value of the t statistic used in paired t test is

( a ) 27 ( b ) 9 ( c ) 3 ( d ) 0

10. A sample of 12 specimens taken from a normal population is expected to have a mean of 50 mg/cc. The sample has a mean of 64 mg/cc with a variance of 25. To test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$, you will choose

11. A random sample of size 20 from a normal population gives a mean of 42 and a variance of 25. Then the value of the χ² statistic used for testing the significance of the population variance is

12. If X > 1 is the critical region for testing $H_0: \theta = 2$ against $H_1: \theta = 1$ on the basis of a single observation from the population $f(x, \theta) = \theta e^{-\theta x}$, x > 0, then the probability of type I error is

(a) $e$ (b) $e^2$ (c) $e^{-2}$ (d) $e^{-1}$


Part B

Answer all questions; each question carries a weightage of 1

13. Let $X_1, X_2$ be a random sample of size 2 from N(0, 1). Then the distribution of $\frac{(X_1 + X_2)^2}{(X_1 - X_2)^2}$ is -------------

15. Let $X_1, X_2, X_3$ be a random sample of size 3 from $N(\mu, \sigma^2)$. The efficiency of $\frac{X_1 + 2X_2 + X_3}{4}$ relative to $\frac{X_1 + X_2 + X_3}{3}$ is ------------

16. Let $X_1, X_2, \dots, X_n$ be a random sample from the population with pdf $f(x, \theta) = \frac{1}{2} e^{-|x - \theta|}$. The m.l.e. of θ is ---------

17. The diameter of a cylindrical rod is assumed to be normally distributed with a variance of 0.04 cm². A sample of 25 rods has a mean diameter of 4.5 cm. The 95% confidence interval for the population mean is -----------

18.The power of a test is ----------

19.Degrees of freedom for chi-square in case of contingency table of order 4x3 is ---

20. In tossing a coin, let the probability of a head turning up be $p$. The hypotheses are $H_0: p = 0.4$ against $H_1: p = 0.6$. $H_0$ is rejected if there are five or more heads in six tosses. Then the probability of type I error is ----------
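Questions 17 and 20 reduce to two short calculations, sketched below. The sketch assumes the usual two-sided 95% normal quantile of 1.96 and treats the number of heads as binomial; the helper names are hypothetical.

```python
import math

def z_interval(mean, variance, n, z=1.96):
    """Confidence interval for a normal mean when the variance is known."""
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width

def binomial_upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Question 17: variance 0.04, n = 25, sample mean 4.5
ci_low, ci_high = z_interval(4.5, 0.04, 25)

# Question 20: reject H0 (p = 0.4) when there are 5 or more heads in 6 tosses
alpha = binomial_upper_tail(6, 0.4, 5)

print(round(ci_low, 4), round(ci_high, 4), round(alpha, 5))
```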

Part C

Answer any 4 questions; each question carries a weightage of 2

21.Obtain the distribution of the sample mean of a random sample X 1 , X 2 ,..., X n of size n

from N ( µ , σ 2 ) .

$B(1, p)$. Let $T = \sum X_i$. Show that $\frac{T(T-1)}{n(n-1)}$ is an unbiased estimator of $p^2$.

23.Define sufficient statistic. Let X 1 , X 2 ,..., X n be a random sample of size n from

24. An oil company claims that less than 20% of all car owners have not tried its gasoline. Test this claim at the 0.01 level of significance if a random check reveals that 22 out of 200 car owners have not tried the oil company's gasoline.

25. In the comparison of two kinds of paint, a consumer testing service finds that four 1-gallon cans of one brand cover on the average 546 square feet with a standard deviation of 31 square feet, whereas four 1-gallon cans of another brand cover on the average 492 square feet with a standard deviation of 26 square feet. Assuming that the two populations sampled are normal and have equal variances, test the hypothesis that on the average the first kind of paint covers a greater area than the second.

26. Mention the advantages of non-parametric tests over parametric tests.
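The test statistics needed in questions 24 and 25 can be sketched as follows. This is a minimal illustration, assuming a one-sample z test for a proportion in question 24 and a pooled-variance two-sample t test in question 25; the function names are not from the syllabus.

```python
import math

def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Question 24: 22 of 200 owners, H0: p = 0.20 against H1: p < 0.20
z = one_proportion_z(22, 200, 0.20)

# Question 25: four cans of each brand
t, df = pooled_t(546, 31, 4, 492, 26, 4)

print(round(z, 3), round(t, 3), df)
```

For question 24 the statistic is compared with the lower 0.01 point of the standard normal distribution, which is -2.326.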

Part D

Answer any 2 questions; each question carries a weightage of 4

27 Let X 1 , X 2 ,..., X n be a random sample of size n from N ( µ , σ 2 ) . Find the mle’s

28 Explain Interval estimation.Obtain 100(1 − α )% confidence intervals for the

29. Use the data shown in the following table to test at the 0.01 level of significance whether a person's ability in mathematics is independent of his or her interest in statistics.

                              Ability in Mathematics
                              Low      Average   High
Interest in     Low           63       42        15
Statistics      Average       58       61        31
                High          14       47        29
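The chi-square statistic for the table above can be computed as in the following sketch, which uses the usual expected counts (row total times column total divided by the grand total). The function name is illustrative.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: interest in statistics (low, average, high)
# Columns: ability in mathematics (low, average, high)
observed = [
    [63, 42, 15],
    [58, 61, 31],
    [14, 47, 29],
]
chi2, df = chi_square_independence(observed)
print(round(chi2, 2), df)
```

With 4 degrees of freedom the 0.01 critical value is 13.277, so the statistic is compared against that figure.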


SEMESTER IV

kurtosis

5 hours

30 hours

15 hours


control charts, 3 sigma limits. Control chart for variables – X-bar chart and

25 hours

ANOVA table 15 hours

1. Goon A.M., Gupta M.K and Das Gupta: Fundamentals of Statistics Vol.1


4. 3 long essay type question 1 problem + 2 theory weight 4

Semester IV

Time 3hrs

COMPLEMENTARY COURSE- I

APPLIED STATISTICS

Part A

Answer all questions (weight 1 for a bunch of 4 questions)

Calculators are permitted

1. If the coefficient of kurtosis is equal to 3 the distribution is called

2. If ρ = 0 the lines of regression are .

3. The range of multiple correlation coefficient R is.

(a) 0 to 1  (b) 0 to ∞  (c) −1 to 1  (d) −∞ to ∞

4. The test statistic for testing the significance of ρ = 0 with usual notation is.

r 1− r2 r n−2 r n−2 r 2 (1 − r 2 )

(a)t = (b ) t = (c) t = (d )t =

n−2 1− r2 1− r2 n−2

( a ) Trend ( b ) Cyclic variation

( c ) Seasonal variation ( d ) Irregular variation

6. For the given five values 15, 24, 18, 33, 42, the three-year moving averages are

7. Seasonal variation means the variations occurring within

( a ) a number of years ( b ) parts of a year

( c ) parts of a month ( d ) none of the above

8. Link relatives in a time series remove the influence of.

( a ) Trend ( b ) Cyclic variation

( c ) Seasonal variation ( d ) all the above

10. The error degrees of freedom for two-way ANOVA with k rows and n columns is

( a ) k − 1 ( b ) n − 1 ( c )( k − 1)( n − 1) ( d ) nk − 1

11. The causes leading to vast variation in the specifications of a product are

( a ) random causes ( b ) assignable causes

( c ) non − traceable causes ( d ) all the above

12. The control charts for fraction defectives are known as
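The moving averages asked for in question 6 of this part can be obtained with a few lines. This is a minimal sketch; the function name is illustrative.

```python
def moving_average(values, window):
    """Simple moving averages over consecutive windows of the given length."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

three_year = moving_average([15, 24, 18, 33, 42], 3)
print(three_year)  # [19.0, 25.0, 31.0]
```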

Part B

Answer all questions Weight 1

13. Karl Pearson's formula for the measure of skewness is -------------

and Y are ------------

15. The formula for the multiple correlation coefficient $R_{2.13}$ in terms of the simple correlation coefficients $r_{12}$, $r_{13}$ and $r_{23}$ is ----------

16. Given the trend equation $Y = 108 + 2.8X$ with 2000 as origin and yearly data from 2000 to 2002, the estimated trend value for 2005 is ---------

19 One or more points outside the control limit indicates that -------

Part C

Answer any 4 questions, weight 2

22. Show that the correlation coefficient is independent of change of origin and scale.

two pairs as (6,14) and (8,6) while the correct values were (8,12) and (6,8)

24. In a trivariate distribution r12 = .77, r13 = .72, r23 = .52 .Find the partial correlation

26. What do you understand by a 3-σ control chart? Obtain the 3-σ control limits for the X-bar chart.

Part D

Answer any 2 questions , weight 4

27. The following are the cholesterol contents in milligrams per package that four

laboratories obtained for 6-ounce packages of three very similar diet foods


                 Diet food A   Diet food B   Diet food C
Laboratory 1     3.4           2.6           2.8
Laboratory 2     3.0           2.7           3.1
Laboratory 3     3.3           3.0           3.4
Laboratory 4     3.5           3.1           3.7

Perform a two way analysis of variance and test the null hypotheses concerning

the diet foods and laboratories at the 0.05 level of significance.
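The sums of squares and F ratios for question 27 can be sketched as below, assuming the standard two-way layout with one observation per cell (laboratories as rows, diet foods as columns). The function and key names are illustrative.

```python
def two_way_anova(data):
    """Two-way ANOVA without replication; data[i][j] is row i, column j."""
    r, c = len(data), len(data[0])
    grand_total = sum(sum(row) for row in data)
    correction = grand_total**2 / (r * c)
    ss_total = sum(x * x for row in data for x in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in data) / c - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*data)) / r - correction
    ss_error = ss_total - ss_rows - ss_cols
    ms_error = ss_error / ((r - 1) * (c - 1))
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_error": ss_error,
        "f_rows": (ss_rows / (r - 1)) / ms_error,
        "f_cols": (ss_cols / (c - 1)) / ms_error,
    }

# Rows: laboratories 1 to 4; columns: diet foods A, B, C
cholesterol = [
    [3.4, 2.6, 2.8],
    [3.0, 2.7, 3.1],
    [3.3, 3.0, 3.4],
    [3.5, 3.1, 3.7],
]
result = two_way_anova(cholesterol)
print({k: round(v, 4) for k, v in result.items()})
```

The F ratio for diet foods has (2, 6) degrees of freedom and the one for laboratories has (3, 6); the corresponding 0.05 critical values are 5.14 and 4.76.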

28. Calculate the seasonal index for the following time series by the ratio-to-moving-average method.

1995 65 58 56 61

1996 68 63 63 67

1997 70 59 56 52

1998 60 55 51 58


29. The net weight of a dry bleach product is to be monitored by X-bar and R charts using a sample size of n = 5. Data for 12 preliminary samples are as follows.

Sample no. X1 X2 X3 X4 X5

1 15.8 16.3 16.2 16.1 16.6

2 16.3 15.9 15.9 16.2 16.4

3 16.1 16.2 16.5 16.4 16.3

4 16.3 16.2 15.9 16.4 16.2

5 16.1 16.1 16.4 16.5 16.0

6 16.1 15.8 16.7 16.6 16.4

7 16.2 16.1 16.2 16.1 16.2

8 16.2 16.1 16.2 16.1 16.3

9 16.3 16.2 16.4 16.1 16.5

10 16.6 16.3 16.4 16.1 16.5

11 16.2 16.4 15.9 16.3 16.4

12 15.9 16.6 16.7 16.2 16.5

Set up X-bar and R control charts using these data. Does the process exhibit statistical control?
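A sketch of the control-limit calculation for question 29 follows. It assumes the commonly tabulated control chart constants for subgroups of size 5 (A2 = 0.577, D3 = 0, D4 = 2.114); these constants come from standard tables, not from the question paper.

```python
# Standard tabulated constants for subgroup size n = 5 (assumed)
A2, D3, D4 = 0.577, 0.0, 2.114

samples = [
    [15.8, 16.3, 16.2, 16.1, 16.6],
    [16.3, 15.9, 15.9, 16.2, 16.4],
    [16.1, 16.2, 16.5, 16.4, 16.3],
    [16.3, 16.2, 15.9, 16.4, 16.2],
    [16.1, 16.1, 16.4, 16.5, 16.0],
    [16.1, 15.8, 16.7, 16.6, 16.4],
    [16.2, 16.1, 16.2, 16.1, 16.2],
    [16.2, 16.1, 16.2, 16.1, 16.3],
    [16.3, 16.2, 16.4, 16.1, 16.5],
    [16.6, 16.3, 16.4, 16.1, 16.5],
    [16.2, 16.4, 15.9, 16.3, 16.4],
    [15.9, 16.6, 16.7, 16.2, 16.5],
]

means = [sum(s) / len(s) for s in samples]
ranges = [max(s) - min(s) for s in samples]
grand_mean = sum(means) / len(means)
mean_range = sum(ranges) / len(ranges)

# X-bar chart limits: grand mean +/- A2 * mean range
ucl_x = grand_mean + A2 * mean_range
lcl_x = grand_mean - A2 * mean_range
# R chart limits: D3 * mean range and D4 * mean range
ucl_r = D4 * mean_range
lcl_r = D3 * mean_range

in_control = all(lcl_x <= m <= ucl_x for m in means) and all(
    lcl_r <= r <= ucl_r for r in ranges
)
print(round(lcl_x, 3), round(ucl_x, 3), round(lcl_r, 3), round(ucl_r, 3), in_control)
```

The flag `in_control` only checks whether any point falls outside the 3-sigma limits; run rules and other patterns are not examined in this sketch.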


SYLLABUS OF COMPLEMENTARY II - ACTUARIAL SCIENCE

STATISTICS: COMPLEMENTARY – II

CUCCSSUG 2009 (2009 admission onwards)

Semester No.   Code     Title                                            Hours/week        Hours   Ext:Int

1              AS1C01   FINANCIAL MATHEMATICS                            4            3    3       3:1

2              AS2C02   FINANCIAL MATHEMATICS                            4            3    3       3:1

3              AS3C03   LIFE CONTINGENCIES AND PRINCIPLES OF INSURANCE   5            3    3       3:1

4              AS4C04   LIFE CONTINGENCIES AND PRINCIPLES OF INSURANCE   5            3    3       3:1

Pattern of Question papers.

There shall be 4 parts A, B, C and D in all the question papers. Part A consists of 12 objective type questions. Part B consists of 8 questions to be answered in a word, phrase or sentence. Part C consists of 6 questions of short essay type of which the student can attempt 4. Part D consists of 3 questions of long essay type of which the student can attempt 2. In Part A the weightage per question is ¼. For Part B the weightage is 1 per question. For Part C the weightage is 2 per question and for Part D the weightage is 4 per question.

As far as possible the number of questions should be proportional to the modules.

Components Weight

Assignment 1

Test paper 2

Seminar 1

Attendance 1

There shall be two test papers and the average grade point is to be considered for

internal assessment

SEMESTER I

Course I

Financial mathematics

rate of interest - Accumulation and present value of a single payment - Nominal rate of interest - Constant force of interest - Relationship between these rates of interest - Accumulation and present value of a single payment using these rates of interest - When the force of interest is a function of t, δ(t): definition of A(t1,t2), A(t), v(t1,t2) and v(t) - Expressing accumulation and present values of a single payment using these symbols when the force of interest is a function of t, δ(t) 22hrs

Accumulation and present values of annuities with level payments

and where the payments and interest rates have same frequencies-

Definition and derivation –Definition of perpetuity and derivation-

Accumulation and present values of annuities where payments and

interest rates have different frequencies 22hrs

Annuities payable continuously-Annuities where payments are

increasing continuously and payable continuously-Definition and

derivation 10hrs

credit transaction 18hrs

McCutcheon, J.J., Scott, William (1986): An Introduction to the Mathematics of Finance.

Butcher, M.V., Nesbitt, Cecil (1971): Mathematics of Compound Interest, Ulrich's Books.

Neill, Alistair (1977): Life Contingencies, Heinemann.

Bowers, Newton L. et al. (1997): Actuarial Mathematics, Society of Actuaries, 2nd Ed.

Model Question Paper

Semester I

COMPLEMENTARY COURSE II

FINANCIAL MATHEMATICS

Time: 3 Hrs

Part A

Choose the correct answer from the brackets

Each bunch of four questions carries a weightage of one

1. An investor deposits £4000 in a bank account that pays simple interest at a rate of 6% pa. After 8 years the balance will be ------------------

(a) 5920 (b) 4920 (c) 3920 (d) 3000

2. An investor deposits £4000 in a bank account that pays compound interest at a rate of 6% pa. After 8 years the balance will be ------------------

(a) 5920 (b) 4920 (c) 6375 (d) 6000

3. An investor must make a payment of £5000 in 5years time. The investor wishes to

make provision for this payment by investing a single sum now in a deposit

account that pays 10% pa compound interest. How much should the initial

investment be?

(a)3105 (b)4105 (c)4000 (d)3000

4. An 8-month loan repayable by a single repayment is issued at a rate of commercial discount of 15% pa. If the amount of the repayment is £100,000, how much was initially lent to the borrower?

(a)80000 (b)90000 (c)100000 (d)75000

5. £80 is invested at time 5 and the accumulated amount at time 8 is £100. What is the rate of interest?

(a) 8.33% (b) 8% (c) 7% (d) 7.33%

6. Find the value at time t = 0 of $250 due at time t = 6 and $600 due at time t = 8, if δ(t) = 3% pa for all t

(a)680.79 (b)650 (c)675.25 (d)680

7. Calculate $a_{\overline{25}|}$ at 13½% pa effective

(a)7.095 (b)7.25 (c)8.095 (d)8.75

8. A loan of £900 is repayable by equal monthly payments for 3years, with interest

payable at 18½%pa effective. Calculate the amount of each monthly payments

(a)32.13 (b)31.13 (c)35.25 (d)30.75

9. Find R,if P=7892, l=5, i= 10% and n=10

(a)125.01 (b)123.25 (c)175 (d)150

10. Find P, if l=5, R=125, i=10% and n=20

(a)61.15 (b)65.25 (c)60.825 (d)62.13

11. Calculate the numerical value of $\bar{a}_{\overline{7}|}$ at 7½% pa

(a) 5.4928 (b) 6.492 (c) 7.25 (d) 8.125

12. Calculate ${}_{5|}\ddot{a}^{(3)}_{\overline{8}|}$ at 6%

(a) 3.8247 (b) 4.8247 (c) 5.25 (d) 6.875
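Several of the Part A calculations follow directly from the basic accumulation and discounting formulas. The sketch below covers questions 1, 2, 3, 4 and 6 under the interest conventions each question states; the function names are illustrative, and question 6 is read as a constant force of interest of 3% pa.

```python
import math

def simple_accumulation(principal, rate, years):
    """Accumulated value under simple interest."""
    return principal * (1 + rate * years)

def compound_accumulation(principal, rate, years):
    """Accumulated value under compound interest."""
    return principal * (1 + rate) ** years

def present_value(amount, rate, years):
    """Present value under compound interest."""
    return amount / (1 + rate) ** years

def commercial_discount_proceeds(repayment, discount_rate, years):
    """Amount lent when a repayment is discounted at a simple (commercial) discount rate."""
    return repayment * (1 - discount_rate * years)

def pv_constant_force(amount, force, t):
    """Present value under a constant force of interest."""
    return amount * math.exp(-force * t)

q1 = simple_accumulation(4000, 0.06, 8)
q2 = compound_accumulation(4000, 0.06, 8)
q3 = present_value(5000, 0.10, 5)
q4 = commercial_discount_proceeds(100_000, 0.15, 8 / 12)
q6 = pv_constant_force(250, 0.03, 6) + pv_constant_force(600, 0.03, 8)

print(round(q1), round(q2), round(q3), round(q4), round(q6, 2))
```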

Part B

Attempt all questions; each question carries a weightage of one

14. An investor makes an initial investment of £5000 and is credited with £500

interest at the end of the year. What is the effective rate of interest and the value

of i?

15. £4600 is invested at time 0 and the proceeds at time 10 are £8200. Calculate

A(7,10) if A(0,9)=1.8, A(2,4)=1.1, A(2,7)=1.32, A(4,9)=1.45

16. Find the accumulated value if $1 is invested for 7years at an interest rate of

6.5%pa effective

17. Calculate the present value on 1-Sept-2002 of payments of £280 due on 1-Sept-

2004 and £360 due on 1-March-2005. Interest is 15%pa effective

18. Calculate $a^{(4)}_{\overline{6}|}$ at 1½% pa, first without using the tables and then with the tables

19. Write down a formula for $L_t$, if the loan is repaid by level regular instalments, so that $X_t = X$ for all t

20. Write down a formula for ${}_{m|}\bar{a}_{\overline{n}|}$ in terms of $\bar{a}_{\overline{n}|}$.

PART C

Attempt any four questions; each question carries a weightage of two

21. Consider two non-overlapping time periods. Period 1 has length l time units and period 2 has length m time units. If the effective period 1 interest rate is i, express the equivalent effective period 2 interest rate in terms of i, l and m

22. If the force of interest is δ(t)=0.04,0<t<6 and δ(t)=0.2-0.02t, 6<t<9. Find the

accumulated value at time 8 of a payment of $400 at time 3

23. Find the accumulated value of a payment stream of 0.3+1.5t that is received

continuously from time 4 to time 8 During which time the force of interest is

0.01+0.05t

24. A motorist buys a car costing £5000 using a loan with a flat rate of interest of 10% and repayments at the end of each of the next 12 months. Calculate the loan outstanding immediately after the second payment

25. A loan of $50000 is repayable by equal annual payments at the end of each of the next 5 years; interest is 8% pa for the first 3 years and 12% pa thereafter. Calculate the loan outstanding immediately after the second payment

26. Derive formulae for $(I\ddot{a})_{\overline{n}|}$

a. Algebraically, and

b. By general reasoning, starting from the formula for $(Ia)_{\overline{n}|}$
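Question 22 of this part asks for an accumulation under a varying force of interest, where the accumulation factor is $A(t_1, t_2) = \exp\left(\int_{t_1}^{t_2}\delta(s)\,ds\right)$. The sketch below evaluates the integral numerically with a midpoint rule; the function names and the step count are illustrative choices.

```python
import math

def delta(t):
    """Force of interest from question 22 (stated there for 0 < t < 9)."""
    return 0.04 if t < 6 else 0.2 - 0.02 * t

def accumulation_factor(force, t1, t2, steps=50_000):
    """A(t1, t2) = exp(integral of the force of interest), by the midpoint rule."""
    h = (t2 - t1) / steps
    integral = sum(force(t1 + (k + 0.5) * h) for k in range(steps)) * h
    return math.exp(integral)

# $400 paid at time 3, accumulated to time 8
value_at_8 = 400 * accumulation_factor(delta, 3, 8)
print(round(value_at_8, 2))
```

The integral can also be done by hand: 0.04 × 3 from time 3 to 6 plus 0.12 from time 6 to 8, giving $400e^{0.24}$.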

PART D

Attempt any two questions; each question carries a weightage of four

27. A woman takes out a home improvement loan for £11000 over 5 years. She makes monthly repayments in arrears and the bank charges an effective rate of interest of 6% pa

(a) What is the monthly repayment?

(b) How much interest does she pay in the 3rd year?

(c) How much capital is repaid in the 20th instalment?

28. An investor wishes to find the present value of a stream of property income

payments. She proposes to make the following assumptions

• The level of current payment is £20,000 paid quarterly in advance

• Payments will remain fixed for each 5-year period. At the end of each 5-year period the payments will rise in line with total inflationary growth over the previous 5 years

• Inflation is assumed to be constant at 3% pa

• The interest rate for the calculation is 12%pa effective

Find the P.V of the income stream; assume that the payments continue for 50

years

29. The force of interest is given by

δ(t) = 0.04 + 0.002t,   0 < t < 10
       0.015t − 0.08,   10 < t < 12
       0.07,            t > 12

Find an expression for the accumulation factor from time 0 to t.
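The three parts of question 27 can be read off a loan schedule. The sketch below assumes the 6% pa effective rate is converted to the equivalent monthly effective rate $(1.06)^{1/12} - 1$ and that repayments are level; the function name and tuple layout are illustrative.

```python
def loan_schedule(principal, annual_rate, years, per_year=12):
    """Level-payment schedule; returns (payment, rows of (k, interest, capital, balance))."""
    j = (1 + annual_rate) ** (1 / per_year) - 1  # effective rate per payment period
    n = years * per_year
    payment = principal * j / (1 - (1 + j) ** -n)
    balance = principal
    rows = []
    for k in range(1, n + 1):
        interest = balance * j          # interest due on the outstanding loan
        capital = payment - interest    # remainder of the payment repays capital
        balance -= capital
        rows.append((k, interest, capital, balance))
    return payment, rows

payment, rows = loan_schedule(11000, 0.06, 5)
interest_year3 = sum(row[1] for row in rows[24:36])  # payments 25 to 36
capital_20th = rows[19][2]                           # 20th instalment

print(round(payment, 2), round(interest_year3, 2), round(capital_20th, 2))
```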

SEMESTER II

Probability for the age at death- life tables- The deterministic

survivorship group - Other life table functions - Assumptions for fractional ages - Some analytical laws of mortality - Select and ultimate life tables 25hrs

Module II: Multiple life functions: Joint life status-the last survivor status-

Probabilities and expectations-Insurance and annuity benefits-

Evaluation-Special mortality laws-Evaluation-Uniform distribution

of death-Simple contingent functions-Evaluation 10hrs

Life assurance contracts-(whole, n-year term, n-year endowment,

deferred)-Insurance payable at the moment of death and insurance

payable at the end of year of death-Recursion equations-

Commutation functions 19hrs

life annuities-Discrete life annuities-Life annuities with monthly

payment-Commutation Function formulae for annuities with level

payments-Varying annuities-Recursion equations-complete

annuities-immediate and apportionable annuities-due 18hrs

McCutcheon, J.J., Scott, William (1986): An Introduction to the Mathematics of Finance.

Butcher, M.V., Nesbitt, Cecil (1971): Mathematics of Compound Interest, Ulrich's Books.

Neill, Alistair (1977): Life Contingencies, Heinemann.

Bowers, Newton L. et al. (1997): Actuarial Mathematics, Society of Actuaries, 2nd Ed.

Model Question Paper

Semester II

COMPLEMENTARY COURSE II

PART A

Each bunch of four questions carries a weightage of one

(a) $\frac{1}{100-x}$  (b) $\frac{x}{100}$  (c) $\frac{1}{100}$  (d) $\frac{3}{100}$

2. If $S(x) = (1 - x/100)^{1/2}$, $0 < x < 100$, evaluate ${}_{17}p_{19}$

(a) 1/8 (b) 8/9 (c) 9/8 (d) 5/8

3. Given that ${}_{25}p_{25:50} = 0.2$ and ${}_{15}p_{25} = 0.9$, calculate the probability that a person aged 40 will survive to age 75

(a) 1/2 (b) 9/2 (c) 1/9 (d) 2/9

4. If $\mu(x) = \frac{1}{100-x}$ for $0 < x < 100$, calculate $\mathring{e}_{40:50}$

(a) 36.94 (b) 18.06 (c) 25.05 (d) 32.30

5. In a mortality table known to follow Makeham's law, you are given that $A = 0.003$ and $c^{10} = 3$. If $\mathring{e}_{40:50} = 17$, calculate ${}_{\infty}q_{\overset{1}{40}:50}$

(a) 0.0123 (b) 0.5632 (c) 0.2755 (d) 0.4835

6. Assume mortality is described by $l_x = 100 - x$, $0 < x < 100$, and that the force of interest is $\delta = 0.05$. Calculate $\bar{A}_{\overset{1}{40}:50}$

(a) 0.47890 (b) 1.3567 (c) 0.0523 (d) 0.2378

7. If $l_x = 100 - x$, $0 < x < 100$, and $i = 0.05$, calculate $(IA)_{40}$

(a) 5.5545 (b) 2.5678 (c) 4.2891 (d) 6.7235

8. If $A_x = 0.25$, $A_{x+20} = 0.40$ and $A_{x:\overline{20}|} = 0.55$, calculate $A_{x:\overset{1}{\overline{20}|}}$

(a) 0.2 (b) 0.3 (c) 0.5 (d) 0.125

9. On the basis of the illustrative life table with interest at the effective annual

rate of 6%, calculate the value of ä(12)25:40

(a)11.20 (b)15.038 (c)19.638

(d)25.32

10. Let Y be the PVRV for a continuous 10-year temporary life annuity of 1 pa

commencing at age 60. On the basis of your illustrative life table with uniform

distribution of deaths over each year of age and i= 0.08, calculate mean

(a)6.4634 (b)2.5891 (c)8.7800 (d)5.3239

11. Use the illustrative life table with uniform distribution of deaths over each year of age and $i = 0.07$ to determine $\ddot{a}_{30:\overline{20}|}$

(a) 11.415 (b) 10.415 (c) 9.897 (d) 8.326

12. If $S(x) = 1 - x/100$, $0 < x < 100$, calculate $F_X(x)$

(a) x/100 (b) 1/100 (c) x − 100 (d) 100
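Questions 2 and 3 above rest on the relation ${}_np_x = S(x+n)/S(x)$ and on the multiplication rule for survival probabilities. A minimal sketch, with illustrative function names, and assuming independent lives in question 3:

```python
import math

def survival(x):
    """Survival function from question 2: S(x) = (1 - x/100)^(1/2)."""
    return math.sqrt(1 - x / 100)

def n_p_x(n, x, s=survival):
    """Probability that a life aged x survives n more years."""
    return s(x + n) / s(x)

# Question 2: probability that (19) survives 17 years
p_17_19 = n_p_x(17, 19)

# Question 3: 25p(25:50) = 25p25 * 25p50 and 25p25 = 15p25 * 10p40,
# so 35p40 = 10p40 * 25p50 = 25p(25:50) / 15p25
p_35_40 = 0.2 / 0.9

print(p_17_19, p_35_40)
```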

PART B

Attempt all questions; each question carries a weightage of one

13. On the basis of the life table, evaluate the probability that (20) will

(a) live to 100

(b) die before 70

14.Explain complete expectation of life

15. Under the assumption of uniform distribution of deaths, show that

(a) $\mathring{e}_x = e_x + \frac{1}{2}$

(b) $\mathrm{Var}[T] = \mathrm{Var}[K] + \frac{1}{12}$

16. The pdf of the future lifetime T for (x) is assumed to be

f_T(t) = 1/80,   0 < t < 80
         0,      elsewhere

At a force of interest δ, calculate for Z, the PVRV for a whole life insurance for unit amount issued to (x),

(a) The actuarial present value

(b) The variance

17. Explain Endowment life insurance at the moment of death

18. Compare the variances of the PVRV’s for the complete annuity - immediate

19. Prove that ${}_{n|}a_x = \frac{A_{x:\overline{n}|} - A_x}{d} - {}_nE_x$

20. Explain n-year temporary life annuity – due

PART C

Attempt any four questions; each question carries a weightage of two

21. Assuming that the future lifetimes of (80) and (80) are independent, obtain an

expression in single life table functions for the probability that their

(a) First death occurs after 5 and before 10 years from now

(b) Last death occurs after 5 and before 10 years from now

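22. Prove that ${}_nq_{\overset{2}{x}y} = {}_nq_x - {}_nq_{\overset{1}{x}y}$ and ${}_nq_{\overset{1}{x}x} = \frac{1}{2}\,{}_nq_{xx}$

23. Using life tables, evaluate

(a) ${}_2p_{[30]}$ (b) ${}_5p_{[30]}$ (c) ${}_{1|}q_{[31]}$ (d) $q_{[31]+1}$

24. Under the constant force of mortality assumption, are the random variables K and S independent?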

25. Assume that each of 100 independent lives

(i) Is age x

(ii) Is subjected to a constant force of mortality µ=0.04 and

(iii) Is insured for a death benefit amount of 10 units, payable at the

moment of death

The benefit payments are to be withdrawn from an investment fund earning δ=

0.06. Calculate the minimum amount at t=0, so that the probability is

approximately 0.95 that sufficient funds will be on hand to withdraw the benefit

payment at the death of each individual

26. Consider a 5-year deferred whole life insurance payable at the moment of death of

(x). The individual is subject to a constant force of mortality µ=0.04. For the

distribution of the PV of the benefit payment at δ=0 .10

(a) Calculate the expectation

(b) Calculate the variance
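Questions 25 and 26 both use the constant-force result that, for a unit insurance payable at the moment of death and deferred m years, $E[Z^k] = \frac{\mu}{\mu + k\delta}e^{-(\mu + k\delta)m}$. The sketch below applies it; the normal approximation with the 95th percentile 1.645 in question 25 is the usual textbook route and is an assumption here, as are the function and variable names.

```python
import math

def deferred_insurance_moment(mu, delta, defer, k=1):
    """E[Z^k] for a unit m-year deferred whole life insurance payable at death,
    under a constant force of mortality mu and force of interest delta."""
    rate = mu + k * delta
    return mu / rate * math.exp(-rate * defer)

# Question 26: 5-year deferred, mu = 0.04, delta = 0.10
mean_26 = deferred_insurance_moment(0.04, 0.10, 5)
var_26 = deferred_insurance_moment(0.04, 0.10, 5, k=2) - mean_26**2

# Question 25: 100 independent lives, benefit 10, mu = 0.04, delta = 0.06
benefit, lives = 10, 100
m1 = benefit * deferred_insurance_moment(0.04, 0.06, 0)
m2 = benefit**2 * deferred_insurance_moment(0.04, 0.06, 0, k=2)
fund = lives * m1 + 1.645 * math.sqrt(lives * (m2 - m1**2))

print(round(mean_26, 4), round(var_26, 4), round(fund, 2))
```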

PART D

Attempt any two questions; each question carries a weightage of four

27. Explain the relationship between insurance payable at the moment of death and insurance payable at the end of the year of death

28. Under the assumptions of a constant force of mortality µ and of a constant force of interest δ, evaluate

(a) $\bar{a}_x = E[\bar{a}_{\overline{T}|}]$

(b) $\mathrm{Var}[\bar{a}_{\overline{T}|}]$

(c) The probability that $\bar{a}_{\overline{T}|}$ exceeds $\bar{a}_x$

29. The future lifetimes T(x) and T(y) are independent and each has a distribution defined by the pdf

f(t) = 0.02(10 − t),   0 < t < 10
       0,              elsewhere

(a) Determine the distribution function, survival function and force of

mortality

(b) Determine the joint pdf & joint distribution function and joint survival

Function for T(x) and T(y)

(c) Determine complete expectation for the joint life status T(x,y)

SEMESTER III

Course III

premiums - True m-thly payment premiums - Apportionable premiums - Commutation functions - Accumulation type benefits 20hrs

Module II: Fully continuous net premium reserves - Other formulas for fully discrete net premium reserves - Reserves on a semi-continuous basis - Reserves based on an apportionable or discounted continuous basis - Recursive formulae for the fully discrete basis - Reserves at fractional durations - Allocation of the loss to the policy years - Differential equation for fully continuous reserves 25hrs

Module III: Concept of Risk - The concept of Insurance - Classification of Insurance - Types of Life Insurance - Insurance Act - Fire, marine, motor, engineering, aviation and agricultural insurance - Alternative classification - Insurance of property, pecuniary interest, liability and person - Distinction between Life and General Insurance - History of General Insurance in India. 25hrs

elements of Insurance-optimal insurance-Multiple decrement

models 20 hrs

McCutcheon, J.J., Scott, William (1986): An Introduction to the Mathematics of Finance.

Butcher, M.V., Nesbitt, Cecil (1971): Mathematics of Compound Interest, Ulrich's Books.

Neill, Alistair (1977): Life Contingencies, Heinemann.

Bowers, Newton L. et al. (1997): Actuarial Mathematics, Society of Actuaries, 2nd Ed.

Model Question Paper

Semester III

COMPLEMENTARY COURSE II

Time: 3 Hrs

PART A

Choose the correct answer from the brackets

Each bunch of four questions carries a weightage of one

1. Given, for a double decrement table, that $q'^{(1)}_{40} = 0.02$ and $q'^{(2)}_{40} = 0.04$, calculate $q^{(\tau)}_{40}$ to four decimals

(a) .0909 (b)0.0592 (c)0.0426 (d)0.3296

2. Let the loss random variable X have a pdf given by $f(x) = 0.1e^{-0.1x}$, $x > 0$. Calculate E[X].

(a) 10 (b) 30 (c) 25 (d) 15

3. The loss random variable X have the pdf given by f(x)=1/100, 0<x<100, calculate

V[X]?

(a) 50, (b) 2500/3 (c) 250/3 (d)45/3

4. If $P^{1\,(12)}_{x:\overline{20}|} = 1.032$ and $P_{x:\overline{20}|} = 0.040$, what is the value of $P^{(12)}_{x:\overline{20}|}$?

(a) 0.035 (b) 0.326 (c) 0.957 (d) 0.583

5. Using the illustrative life table, directly calculate $P^{(2)}[\bar{A}_{50:\overline{20}|} / \bar{a}^{(2)}_{50:\overline{20}|}]$

(a) 0.0413 (b) 0.0328 (c) 0.191 (d) 0.0456

6. Using the illustrative life table and an interest rate of 6%, calculate the components of the decomposition

$1000\,P_{50:\overline{20}|} = 1000\left(P^{1}_{50:\overline{20}|} + P_{50:\overset{1}{\overline{20}|}}\right)$

7. An ordinary life contract for a unit amount on a fully discrete basis is issued to a person aged x with an annual premium of 0.048. Assume $d = 0.06$, $A_x = 0.4$ and ${}^2A_x = 0.2$. Let L be the insurer's loss function at issue of this policy. Calculate E[L].

(a) 0.1296 (b) −0.1296 (c) −0.08 (d) 0.08

8. Calculate the value of $P^{1}_{x:\overline{n}|}$ if ${}_nV_x = 0.080$, $P_x = 0.024$ and $P_{x:\overset{1}{\overline{n}|}} = 0.2$

9. If ${}_{10}V_{35} = 0.150$ and ${}_{20}V_{35} = 0.354$, calculate ${}_{10}V_{45}$.

(a) 0.252 (b) 0.240 (c) 0.232 (d) 0.2

10. Assuming $\delta = 0.05$, $q_x = 0.05$ and a uniform distribution of deaths in each year of age, calculate $(\bar{I}\bar{A})^{1}_{x:\overline{1}|}$

(a) 0.01896 (b) 0.02418 (c) 0.2418 (d) 0.1896

11. If $P^{1\,(12)}_{x:\overline{20}|} = 1.032$ and $P_{x:\overline{20}|} = 0.040$, what is the value of $P^{(12)}_{x:\overline{20}|}$?

(a) 0.035 (b) 0.326 (c) 0.957 (d) 0.583

12. Using the illustrative life table, directly calculate $P^{(2)}[\bar{A}_{50:\overline{20}|} / \bar{a}^{(2)}_{50:\overline{20}|}]$

(a) 0.0413 (b) 0.0328 (c) 0.191 (d) 0.0456
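Questions 7 and 9 of this part can be checked numerically. The sketch assumes the standard fully discrete whole life relations $\ddot{a}_x = (1 - A_x)/d$, $E[L] = A_x - P\ddot{a}_x$, $\mathrm{Var}[L] = (1 + P/d)^2({}^2A_x - A_x^2)$ and $1 - {}_{s+t}V_x = (1 - {}_sV_x)(1 - {}_tV_{x+s})$; the function names are illustrative.

```python
def whole_life_loss_moments(premium, d, A, A2):
    """Mean and variance of the loss at issue for a fully discrete whole life policy."""
    annuity_due = (1 - A) / d
    mean = A - premium * annuity_due
    variance = (1 + premium / d) ** 2 * (A2 - A**2)
    return mean, variance

def later_reserve(v_s, v_s_plus_t):
    """tV(x+s) from sV(x) and (s+t)V(x), using 1 - (s+t)Vx = (1 - sVx)(1 - tV(x+s))."""
    return 1 - (1 - v_s_plus_t) / (1 - v_s)

# Question 7: premium 0.048, d = 0.06, Ax = 0.4, 2Ax = 0.2
mean_L, var_L = whole_life_loss_moments(0.048, 0.06, 0.4, 0.2)

# Question 9: 10V35 = 0.150 and 20V35 = 0.354
v_10_45 = later_reserve(0.150, 0.354)

print(round(mean_L, 4), round(var_L, 4), round(v_10_45, 3))
```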

PART B

Attempt all questions; each question carries a weightage of one

13. Determine an expression in actuarial present values and benefit premiums for $\mathrm{Var}[{}_kL \mid K(x) = k, k+1, \ldots]$ for a fully discrete n-year endowment insurance with a unit benefit

14. A fully discrete whole life insurance with a unit benefit issued to (x) has its first years

benefit and the remaining benefit premiums are level and determined by the equivalence

principle

Determine formulas for

a. The first year benefit premium

b. The level benefit premium after the 1st year

15.Calculate P[Āx] and Var[L] with the assumptions that the force of mortality is a

constant µ=0.04 and the force of interest δ=0.06

16.Derive relationships among continuous benefit premiums using identities

17. A decision maker's utility function is given by $u(w) = -e^{-5w}$. The decision maker has two random economic prospects available. The outcome of the first has a normal distribution with mean 5 and variance 2 and the outcome of the second has a normal distribution with mean 6 and variance 2.5. Which prospect will be preferred?

18. Explain fully continuous benefit reserves in whole life insurance

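19. Derive a general expression for $\frac{{}^2\bar{A}_x - (\bar{A}_x)^2}{(\delta\bar{a}_x)^2}$, where $\mu_x(t) = \mu$ and δ is the force of interest, for $t > 0$

20. Prove and interpret the formula $P_{x:\overline{n}|} = {}_nP_x + P_{x:\overset{1}{\overline{n}|}}\,(1 - A_{x+n})$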

PART C

Attempt any four questions; each question carries a weightage of two

22. The normal benefit premiums for a fully discrete whole life insurance with a unit benefit issued to (x) are $\pi_j = \pi w_j$, where $w_j = (1 + r)^j$. The rate r might be selected to estimate the expected growth rate in the insured. Develop formulas for (a) $\pi$ and (b) ${}_hV$ when $r = i$

23. What do you mean by insurance? Explain the classification of insurance

24. What is utility? Explain its importance in insurance

25. The probability that a property will not be damaged in the next period is 0.75. The pdf of a possible loss is given by $f(x) = 0.25(0.01e^{-0.01x})$, $x > 0$. The owner of the property has a utility function given by $u(w) = -e^{-0.05w}$. Calculate the expected loss and the maximum insurance premium the property owner will pay for complete insurance

PART D

Attempt any two questions; each question carries a weightage of four

27. An insurer is planning to issue a policy to a life aged 0, whose curtate future lifetime K is governed by the p.f. ${}_{k|}q_0 = 0.2$, $k = 0, 1, 2, 3, 4$. The policy will pay 1 unit at the end of the year of death in exchange for the payment of a premium P at the beginning of each year, provided the life survives. Find the annual premium P as determined by:

a. Principle I: P will be the annual premium such that the insurer, using a utility of wealth function $u(x) = x$, will be indifferent between accepting and not accepting the risk

b. Principle II: P will be the annual premium such that the insurer, using a utility of wealth function $u(x) = -e^{-0.01x}$, will be indifferent between accepting and not accepting the risk

28. If ${}_{k|}q_x = c(0.96)^{k+1}$, $k = 0, 1, 2, \ldots$, where $c = 0.04/0.96$ and $i = 0.06$, calculate $P_x$ and Var[L]

29. On the basis of De Moivre's law with $l_x = 100 - x$ and an interest rate of 6%, calculate

(a) $P(\bar{A}_{35})$, (b) ${}_tV(\bar{A}_{35})$ and $\mathrm{Var}[{}_tL \mid T(x) > t]$, for $t = 0, 10, 20, \ldots, 60$

SEMESTER IV

Course IV

Probability models and Risk theory

Module I: Individual risk model for a short time: Model for individual claim

random variables-Sums of independent random variable-

Approximation for the distribution of the sum-Application to

insurance 20hrs

Module II: Collective risk models for a single period: The distribution of

aggregate claims-Selection of basic distributions-Properties of

compound Poisson distributions –Approximations to the

distribution of aggregate claims 25hrs

Module III: Collective risk models over an extended period: Claims process-

The adjustment coefficient-Discrete time model-The first surplus

below the initial level-The maximal aggregate loss 20hrs

Approximating the individual model-Stop-loss re-insurance-The

effect of re-insurance on the probability of ruin 25hrs

McCutcheon, J.J., Scott, William (1986): An Introduction to the Mathematics of Finance.

Butcher, M.V., Nesbitt, Cecil (1971): Mathematics of Compound Interest, Ulrich's Books.

Neill, Alistair (1977): Life Contingencies, Heinemann.

Bowers, Newton L. et al. (1997): Actuarial Mathematics, Society of Actuaries, 2nd Ed.

Model Question Paper

Semester IV

COMPLEMENTARY COURSE II

Time: 3 Hrs

Part A

Choose the correct answer from the brackets

Each bunch of four questions carries a weightage of one

1. Let X be the number obtained when one true die is tossed. Let Y be the sum of the numbers obtained when X true dice are then thrown. Calculate E[Y]

(a) 4/7 (b)7/4 (c)2/6 (d)3/6

2. Under certain assumptions, the probability of ruin is $\psi(u) = 0.3e^{-2u} + 0.2e^{-4u} + 0.1e^{-7u}$, $u > 0$. Calculate θ.

(a) 2/3 (b) 1/3 (c) 1 (d) ½

3. Suppose that $\lambda = 3$, $c = 1$ and $p(x) = \frac{1}{3}e^{-3x} + \frac{16}{3}e^{-6x}$, $x > 0$. Calculate $p_1$

(a) 3/27 (b) 6/27 (c) 4/27 (d) 5/27

4. Suppose that $\lambda = 1$, $c = 10$ and $p(x) = \frac{9x}{25}e^{-3x/5}$, $x > 0$. Calculate θ

(a) 3 (b) 4 (c) 2 (d) 5

5. Suppose that the claim amount distribution is discrete with $p(1) = 1/4$ and $p(2) = 3/4$. If $R = \log 2$, calculate θ

(a) $\frac{10}{7\log 2} - 1$ (b) $\frac{10}{7\log 2}$ (c) $\frac{10}{5\log 2} - 1$ (d) $\frac{10}{5\log 2}$

6. Suppose that Wi assumes only, the value 0 and +2 and that

P[W=0]=p,P[W=2]=q,where p+q=1,Assume that C=1,P>1/254

7. Consider an insurance portfolio that will produce 0, 1, 2 or 3 claims in a fixed time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual claim will be of amount 1, 2 or 3 with probabilities 0.5, 0.4 and 0.1 respectively. Calculate E[N]

(a) 1.7 (b) 2.7 (c) 2.8 (d) 1.6

8. Suppose that $\theta = 2/5$ and $p(x) = \frac{3}{2}e^{-3x} + \frac{7}{2}e^{-7x}$, $x > 0$. Calculate γ

(a) 2 (b) 3 (c) 4 (d) 2.5

9. If S has a compound Poisson distribution given by $\lambda = 3$, $p(1) = 5/6$, $p(2) = 1/6$, calculate $f_S(x)$ for $x = 0$

(a) 0.050 (b) 0.25 (c) 0.052 (d) 0.523

10. Consider an insurance portfolio that will produce 0, 1, 2 or 3 claims in a fixed time period with probabilities 0.1, 0.3, 0.4 and 0.2 respectively. An individual claim will be of amount 1, 2 or 3 with probabilities 0.5, 0.4 and 0.1 respectively. Calculate V[N]

(a) 0.8 (b) 0.028 (c) 0.08 (d) 0.285

11. Suppose that $\theta = 2/5$ and $p(x) = \frac{3}{2}e^{-3x} + \frac{7}{2}e^{-7x}$, $x > 0$. Calculate R

(a) 2.5 (b) 3.45 (c) 4.25 (d) 2.5

12. If S has a compound Poisson distribution given by $\lambda = 3$, $p(1) = 5/6$, $p(2) = 1/6$, calculate $F_S(x)$ for $x = 2$

(a) 0.354 (b) 0.258 (c) 0.520 (d) 0.545
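Questions 7, 9, 10 and 12 of this part can be checked with the sketch below. The compound Poisson probabilities use the recursion $f_S(x) = \sum_{i=1}^{x}\frac{\lambda i}{x}p(i)f_S(x-i)$ with $f_S(0) = e^{-\lambda}$, which is the Poisson case ($a = 0$, $b = \lambda$) of the recursion stated in Part D question 28. Function and variable names are illustrative.

```python
import math

def compound_poisson_pmf(lam, claim_pmf, x_max):
    """f_S(0), ..., f_S(x_max) for positive-integer claim sizes, by recursion."""
    f = [math.exp(-lam)]
    for x in range(1, x_max + 1):
        f.append(
            sum(lam * i / x * p * f[x - i] for i, p in claim_pmf.items() if i <= x)
        )
    return f

# Questions 9 and 12: lambda = 3, p(1) = 5/6, p(2) = 1/6
f_s = compound_poisson_pmf(3, {1: 5 / 6, 2: 1 / 6}, 2)
F_s_2 = sum(f_s)

# Questions 7 and 10: claim number N takes 0, 1, 2, 3 with the stated probabilities
n_pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}
mean_n = sum(n * p for n, p in n_pmf.items())
var_n = sum(n * n * p for n, p in n_pmf.items()) - mean_n**2

print(round(f_s[0], 4), round(F_s_2, 4), round(mean_n, 2), round(var_n, 2))
```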

PART B

Attempt all questions; each question carries a weightage of one

13. Assume that N has a geometric distribution; that is, the probability function of N is given by

$P[N = n] = pq^n$, $n = 0, 1, 2, \ldots$

where $0 < q < 1$ and $p = 1 - q$. Determine $M_S(t)$ in terms of $M_X(t)$.

14. If S has a compound Poisson distribution, specified by λ and p(x), show that the distribution of $Z = \frac{S - \lambda p_1}{\sqrt{\lambda p_2}}$ converges to the standard normal distribution as $\lambda \to \infty$.

15.Write an expression for the distribution of the surplus level at the first time, the

surplus falls below the initial level u, given that it does fall below u, if all claims

are of size 2?

16. Derive an expression for Ψ(u) if the Xi’s have an exponential claim amount

distribution?

17. Write an expression for the distribution of L, if the size of the individual claims

has an exponential distribution with parameter β?

18. Find the mean and variance of the Inverse Gaussian distribution, by using its

mgf

19. Derive an expression for R in the special case where the Wi’s common

distribution is N(µ,σ2)?

20. Determine the adjustment coefficient if the claim amount distribution is

exponential with parameter β>0?

PART C

Attempt any four questions; each question carries a weightage of two

21. Assume that $u(\lambda)$ is the gamma probability density function with parameters α and β,

$u(\lambda) = \frac{\beta^{\alpha}\lambda^{\alpha - 1}e^{-\beta\lambda}}{\Gamma(\alpha)}$, $\lambda > 0$,

where $\Gamma(\alpha) = \int_0^{\infty} y^{\alpha - 1}e^{-y}\,dy$. Show that the marginal distribution of N is negative binomial with parameters $r = \alpha$ and $p = \frac{\beta}{1 + \beta}$

that $S_i$ has a compound Poisson distribution with parameter $\lambda_i$ and d.f. of claim amount $P_i(x)$, $i = 1, 2, \ldots, m$, then $S = S_1 + S_2 + \cdots + S_m$ has a compound Poisson distribution with $\lambda = \sum_{i=1}^{m}\lambda_i$ and $P(x) = \sum_{i=1}^{m}\frac{\lambda_i}{\lambda}P_i(x)$

23. Assume that u(λ) is the inverse Gaussian pdf with parameters α and β. Exhibit

the moment generating function of N, E [N] and V [N]?

distribution p(x) is exponential with parameter θ ?

25. Calculate the adjustment coefficient if all the claims are of size 1?

26. Calculate the probability of ruin in the case that the claim amount distribution

is exponential with parameter β >0

PART D

Attempt any two questions; each question carries a weightage of four

is 1/6 and B, the benefit amount given that there is a claim, has pdf

f(y) = 2(1 − y),   0 < y < 1
       0,          elsewhere

Let S be the total claims for the portfolio. Using a normal approximation, estimate P[S > 4]

28. Prove that for a compound distribution where the probability distribution for N, the number of claims, satisfies the condition

$\dfrac{P[N = n]}{P[N = n-1]} = a + \dfrac{b}{n}, \quad n = 1, 2, \dots,$

and where the distribution of claim amounts is restricted to the positive integers,

$f_S(x) = \sum_{i=1}^{x} \left[a + \dfrac{bi}{x}\right] p(i)\, f_S(x - i), \quad x = 1, 2, \dots,$

with the starting value $f_S(0) = P[N = 0]$.
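The recursion in Question 28 can be checked numerically. The following is a minimal sketch, assuming a Poisson claim count with mean 2 (for which a = 0 and b = 2) and a hypothetical claim-size distribution on {1, 2}; the function name `panjer` and the sample values are illustrative and are not part of the question.

```python
import math

def panjer(a, b, p, f0, x_max):
    """Return [f_S(0), ..., f_S(x_max)] using
    f_S(x) = sum_{i=1}^{x} (a + b*i/x) * p(i) * f_S(x - i).

    p maps positive integer claim sizes to probabilities;
    f0 is the starting value f_S(0) = P[N = 0].
    """
    f = [f0]
    for x in range(1, x_max + 1):
        total = 0.0
        for i in range(1, x + 1):
            total += (a + b * i / x) * p.get(i, 0.0) * f[x - i]
        f.append(total)
    return f

# Assumed example: N ~ Poisson(2), so a = 0, b = 2 and P[N = 0] = exp(-2).
lam = 2.0
claim_pmf = {1: 0.6, 2: 0.4}   # hypothetical claim-size probabilities
f_s = panjer(a=0.0, b=lam, p=claim_pmf, f0=math.exp(-lam), x_max=40)
print(f_s[:4])
```

The first few values can be compared with a direct enumeration, for example P[S = 1] = P[N = 1] p(1).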

29. Given that θ = 2/5 and $P(x) = \frac{3}{2} e^{-3x} + \frac{7}{2} e^{-7x}$, x > 0, calculate Ψ(u), γ and R.
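For the adjustment coefficient R in Question 29, a numerical check is possible. The sketch below assumes the usual compound Poisson surplus model, where R is the positive root of $1 + (1+\theta) p_1 r = M_X(r)$, and it reads the given P(x) as the claim-size density (an equal mixture of exponentials with rates 3 and 7). The bisection routine and variable names are illustrative only.

```python
def mgf(r):
    """M_X(r) for the density (3/2)e^{-3x} + (7/2)e^{-7x}; valid for r < 3."""
    return 0.5 * 3.0 / (3.0 - r) + 0.5 * 7.0 / (7.0 - r)

theta = 2.0 / 5.0                    # relative security loading from the question
p1 = 0.5 / 3.0 + 0.5 / 7.0           # mean claim size

def g(r):
    # R is the positive root of g(r) = 0 in the interval (0, 3).
    return mgf(r) - 1.0 - (1.0 + theta) * p1 * r

lo, hi = 1e-6, 3.0 - 1e-6            # g(lo) < 0 < g(hi)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if g(mid) < 0.0:
        lo = mid
    else:
        hi = mid
R = 0.5 * (lo + hi)
print(round(R, 6))
```

Bisection is used here only because it needs no external library; solving the equivalent quadratic by hand gives the same root.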

STATISTICS: COMPLEMENTARY – I Syllabus for BSc.

ester hours/week hours Ext:Int

No

1 SG1C01 STATISTICAL METHODS 4 3 3 3:1

2 SG2C02 Regression Analysis, Time Series and Index Numbers 4 3 3 3:1

3 SG3C03 PROBABILITY 5 3 3 3:1

4 SG4C04 TESTING OF HYPOTHESIS 5 3 3 3:1

There shall be 4 parts A, B, C and D in all the question papers. Part A consists of 12 objective type questions. Part B consists of 8 questions to be answered in a word, phrase or sentence. Part C consists of 6 questions of short essay type of which the student can attempt 4. Part D consists of 3 questions of long essay type of which the student can attempt 2. In Part A the weightage per question is ¼, for Part B the weightage is 1 per question, for Part C the weightage is 2 per question and for Part D the weightage is 4 per question. As far as possible the number of questions should be proportional to the modules.

Table showing the components and weightage for internal assessment

Components Weight

Assignment 1

Test paper 2

Seminar 1

Attendance 1

There shall be two test papers and the average grade point is to be considered for

internal assessment.

Semester I

Module 1. Meaning, Scope and limitations of Statistics – collection of data,

conducting a statistical enquiry – preparation of questionnaire – primary and

secondary data – classification and tabulation – Formation of frequency

distribution – diagrammatic and graphic presentation of data – population and

sample –advantages of sampling over census – methods of drawing random

samples from a finite population. (Only a brief summary of the above topics is

intended to be given by the teacher. Detailed study is expected from the part of

students). 12hrs

arithmetic mean, median, mode, geometric mean and harmonic mean, partition

values – quartiles – deciles and percentiles. 30hrs

dispersion, measures of dispersion – range – quartile deviation – mean

deviation-standard deviation – Lorenz curve – skewness and kurtosis.

30 hours

Model Question Paper

B.Sc.Geography (Main)

I Semester

COURSE I : (Complementary I)

STATISTICAL METHODS

Marks:

Section A

Answer all questions (Contains 12 questions, 4 Questions carry a weightage of 1)

1. The heights of 150 students are collected. The type of classification that is best

suited is

a) Qualitative

b) Quantitative

c) Geographical

d) Chronological

2. A frequency distribution in which the upper limits are not included in their

respective classes is called a

a) Continuous frequency distribution

b) Discrete frequency distribution

c) Raw data

d) Ungrouped frequency distribution

3. The class mark of a class is obtained by

a) upper limit-lower limit

b) upper limit + lower limit

c) (upper limit + lower limit) / 2

d) (upper limit − lower limit) / 2

4. When there are zeroes in the data we can not use

a) Median

b) Mode

c) Geometric mean

d) Arithmetic mean

5. The most suitable measure for an ordinal data is:

a) Median

b) Arithmetic mean

c) Combined mean

d) Mode

6. The mean of 20 values is 45. If one of these values should have been taken as 64 instead of 46, the corrected value of the mean is:

a) 49.5

b) 45.9

c) 40.9

d) 42.9

7. The formula to find coefficient of variation is:

a) $\dfrac{\sigma}{\bar{X}} \times 100$ b) $\dfrac{\bar{X}}{\sigma} \times 100$

c) $\dfrac{\text{Median}}{\sigma} \times 100$ d) $\sigma \times 100$

8. Mean deviation from median is:

a) Equal to mean deviation from mean

b) Greater than mean deviation from mean

c) Less than mean deviation from mean

d) No relation

9. The 50th percentile is equal to:

a) 10th decile

b) 1st decile

c) 2nd decile

d) 5th decile

10. For a symmetric distribution median and mode = 10. The value of mean is:

a) Zero

b) 20

c) 10

d) 5

11. For a positively skewed data:

a) Mean = mode

b) Mean < mode

c) Mean > mode

d) (Mean – Mode)/2

12. A curve which is flatter than a normal curve is called

a) Skewed curve

b) Platykurtic curve

c) Leptokurtic curve

d) Mesokurtic curve

Section B (Contains 6 questions answer any 4) Weight-1

13. When there are open end classes, we use median as a measure of central tendency

(1) Say true or false

(2) Explain your answer

14. In the case of categorical data we can not use histogram

(1) Say true or false

(2) Explain your answer

15. Suppose that the standard deviation of a set of observations is 3. If ‘3’ is subtracted from each observation, the new standard deviation is zero.

(1) Say true or false

(2) Explain your answer

16. If 25% of the items are less than 10 and 25% are more than 40 the coefficient of

quartile deviation is -------.

17. Karl Pearson’s coefficient of skewness of a distribution is 0.32 and its standard

deviation is 6.5. The mean is 29.6. The mode is -------.

18. Define harmonic mean of n observations.

19. Give an example of a primary data.

20. Give the empirical relationship between mean ,median and mode.

Section C

(4 Questions to be answered out of 6) Weight-2

21. Explain why A.M. is considered as the best measure of central tendency?

22. Calculate quartile deviation for the following data:-

26, 54, 33, 41, 94, 41, 54, 26, 93, 87, 81, 64, 68, 95.

23. The first of two sub-groups has 10 items with mean 15 and S.D. 3. If the whole group has 250 items with mean 15.6 and S.D. 13.44, find the standard deviation of the second sub-group.

25. Explain the terms skewness and kurtosis.

26. The means of two samples of sizes 50 and 100 respectively are 54.1 and 50.3. The

standard deviations are 8 and 7. Obtain the mean and standard deviations of the

sample consisting of 150 observations by combining the two samples.
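A worked sketch for Question 26 follows. It assumes the quoted standard deviations use divisor n, as in the prescribed textbooks; the function name `combine` is illustrative.

```python
import math

def combine(n1, mean1, sd1, n2, mean2, sd2):
    """Combined mean and standard deviation of two groups.

    Uses variance = [n1*(sd1^2 + d1^2) + n2*(sd2^2 + d2^2)] / (n1 + n2),
    where d_k is the gap between group mean k and the combined mean.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1, d2 = mean1 - mean, mean2 - mean
    var = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / n
    return mean, math.sqrt(var)

mean, sd = combine(50, 54.1, 8, 100, 50.3, 7)
print(round(mean, 2), round(sd, 2))
```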

27. What is a Lorenz curve? Give its uses.

28. What is meant by classification? What are its bases?

29. Calculate mean deviation about median for the data given below.

Class: 5-10 10-15 15-20 20-25 25-30 30-35

Freq: 8 7 30 26 12 7
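The calculation in Question 29 can be sketched as below, assuming the median is located by linear interpolation within the median class and that class mid-points represent each class. The helper names are illustrative.

```python
def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution by linear interpolation."""
    n = sum(freqs)
    cum = 0
    for (lower, upper), f in zip(classes, freqs):
        if cum + f >= n / 2:
            return lower + (n / 2 - cum) / f * (upper - lower)
        cum += f

def mean_deviation_about_median(classes, freqs):
    med = grouped_median(classes, freqs)
    mids = [(lower + upper) / 2 for lower, upper in classes]
    return sum(f * abs(m - med) for m, f in zip(mids, freqs)) / sum(freqs)

classes = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30), (30, 35)]
freqs = [8, 7, 30, 26, 12, 7]
median = grouped_median(classes, freqs)
md = mean_deviation_about_median(classes, freqs)
print(median, round(md, 2))
```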

Semester II

Course-II Regression Analysis, Time Series and Index Numbers

Module 1. Fitting of curves of the form – linear, $y = ab^x$, $y = ae^{bx}$ – correlation

analysis – concept of correlation – methods of studying correlation – scatter

diagram – Karl Pearson’s correlation coefficient – concept of rank correlation

and Spearman’s rank correlation coefficient – regression analysis – linear

regression – regression equations (concepts only – Derivations are beyond the

scope of this syllabus). 30hrs

Module 2. Index numbers, meaning and use of index numbers – simple and

number, chain base and fixed base index number – construction of cost of

20hrs

secular trend semi average, moving average and least square methods (linear

Model Question Paper

B.Sc.Geography (Main)

Semester II

COURSE II : (Complementary I)

Part A

Answer all questions (A bunch of 4 carries weight 1)

Time 3hrs

1. If the coefficient of kurtosis is equal to 3 the distribution is called

a) leptokurtic b) mesokurtic c) platykurtic d) none

2. If ρ = 0 the lines of regression are .

3. The long term regular movement in a time series is called .

a) trend b) cyclical variation c) seasonal variation d) none

4. For the given five values 15,24,18,33,42,the three years moving averages are.

a) 19,22,23.b) 19,25,33. c)19,30,31. d) 19, 30,33.

5. Seasonal variation means the variations occurring during.

a) a year b) part of a year c) part of a month d) none

6. Non- centered moving averages are due to

11. If an Index Number $I_{01}$ = 112, then it means

Part B

Answer all questions Weight 1

13. Karl Pearson’s formula for measure of skewness is -------------

and Y are ------------

15. Write down the normal equations for fitting a straight line.

16. Given the trend equation Y = 108 + 2.8X with 2000 as origin and yearly data from 2000 to 2002, the estimated trend value for 2005 is ---------

17. The formula for calculating the rank correlation coefficient is--------

Part C

Answer any 4 questions. Weight 2

25.With the help of an Index Number formula, explain Time and Factor Reversal Tests.

Part D (Answer any 2 questions) Weight 4

27.Given the following data related to yield of a crop in three different seasons.

1990 12 19 17

1991 14 25 23

1992 13 27 20

1993 15 28 22

1994 17 31 24

29. Calculate the cost of Living Index Number for the data given below.

Rice

Food 30 47 4

Fuel 8 12 1

Clothing 14 18 3

House Rent 22 15 2

Miscellaneous 25 30 1

Semester III

Course III-PROBABILITY

problems. 30hrs

Model Question Paper

B.Sc.Geography (Main)

Semester III

COURSE III : (Complementary I)

PROBABILITY

Part A

(Answer all the questions. Choose the correct answer from the

alternatives given below each question). Bunch of 4 carries weight 1

a. P (x=X) b P (x>X) c. P (x ≤ X) d) none of the above

2. A and B are any two events, then

a. P (A ∪ B) + P (A ∩ B)> 1

b. P (A ∪ B) + P (A′ ∩ B′) = 1

c. P (A) + P (A ∩ B) = P (A ∪ B)

d. A ∪ B and A′ ∪ B′ are mutually exclusive events.

3. A continuous random variable X has the distribution function $F_X(x)$. The

range of variation of $F_X(x)$ is

a. -∞ to +∞

b. 0 to ∞

c. Is that of the random variable x

d. 0 to 1

4. Two events A and B are said to be independent if

a. P(A/B)= P (A) & P(B/A) = P (B)

b. P (A ∪ B) = P (A) + P (B)

c. P (A ∩ B) = P (A) P (B/A)

d. P (A ∩ B) = 0

5. A random variable x is said to be continuous if it takes

a. an infinite number of values

b. Finite number of values or a countably infinite numbers of values

c. A continuum of value

d. A finite number of values

6. There are 4 houses available and 4 applicants. The probability that all

the applicants apply for the same house is

a. 3/32 b) 1/16, c) 1/64, d) 1/44

7. A coin is tossed 3 times. The chance that head and tail show alternatively is

a. 1/8 b) 1/4, c) 3/8, d) 1/2

8. Let the distribution function of a r.v. n be

$F(n) = 1 - e^{-2n}$ for n ≥ 0, and F(n) = 0 otherwise.

Then the density function is

a) $1 - 2e^{-n}$, n > 0; = 0 otherwise b) $2e^{-2n}$, n > 0; = 0 otherwise

c) $1 - 2e^{-2n}$, n > 0; = 0 otherwise d) $e^{-2n}$, n > 0; = 0 otherwise

9. The theorem of probability which takes into account the prior probabilities

of an event is

a) Addition theorem of probability

b) Multiplication theorem for dependent events

c) Bayes’ Theorem

d) None of these

10. If x is a continuous random variable having the p.d.f. f(x) then

∫xf(x )dx is

a) >1 b) 0 c) 1 d) non negative

11. A and B are two events such that

P (A ∩ B) = 1/3, P (A′ ∩ B′) = 1/6 and 2 P (A) = P (B) = k, then k is

a) 5/9 b) 7/9 c) 1/3 d) 2/3

12. The mean of the standard normal distribution is equal to

a) infinity b) zero c) unity d) none of these

Part B

(Answer all the questions) Weight-1

Then P (A-B) =……………….

14. If F is the distribution function of a r.v. X and if a < b, then

P (a < X ≤ b) = ……………………….

15. A and B are two independent events in a sample space S such that P(A′) = 0.7, P(B′) = k and P (A ∪ B) = 0.8. The value of k is equal to ……………

16. -------- distribution has a mean greater than its variance.

17. If (x,y) is a pair of continuous r.vs with joint p.d.f f (x,y) then

∫ yf (x,y) dy is the ------------------------

18. Write down the set-theoretic equivalent of the statement “neither A nor B occurs”, where A and B are two events in a sample space S.

19. What is the probability of getting a spade or an ace from a pack of cards?

20. Define a probability space.

Part C

(Answer any 4 questions) Weight-2

21. Show that for any two events A and B in a sample space S

P ( A ∩ B) ≥ P (A) + P (B) -1

22. In a swimming race the odds that A will win are 2 to 3 and the odds that

B will win are 1 to 4. Find the probability that A or B wins the race.
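A short sketch for Question 22, assuming the quoted odds are odds in favour (so "2 to 3" means probability 2/5) and that at most one swimmer can win. The helper name `prob_from_odds` is illustrative.

```python
from fractions import Fraction

def prob_from_odds(in_favour, against):
    """Odds of 'in_favour to against' correspond to p = in_favour / (in_favour + against)."""
    return Fraction(in_favour, in_favour + against)

p_a = prob_from_odds(2, 3)   # A wins
p_b = prob_from_odds(1, 4)   # B wins
# Only one swimmer can win, so the two events are mutually exclusive.
p_a_or_b = p_a + p_b
print(p_a, p_b, p_a_or_b)
```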

23. State and prove the multiplication law of probability.

24. What are the properties of a distribution function?

25. For a Poisson distribution with parameter 3, find Pr(X > 2).
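Question 25 reduces to the complement of P(X ≤ 2). A minimal sketch, with an illustrative helper `poisson_pmf`:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3
p_more_than_2 = 1 - sum(poisson_pmf(k, lam) for k in range(3))
print(round(p_more_than_2, 4))
```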

26. Examine whether f(x) as defined below is a pdf.

$f(x) = \begin{cases} 0, & x < 2 \\ \frac{1}{18}(3 + 2x), & 2 \le x < 4 \\ 0, & x > 4 \end{cases}$

Part D

(Answer any 2 questions) Weight-4

P ( A ∩ B) ≤ P (A) ≤ P (A ∪ B) ≤ P (A) + P (B)

28. Let x be a continuous random variable with p.d.f. given by

$f(x) = \begin{cases} ax, & 0 \le x \le 1 \\ a, & 1 \le x \le 2 \\ -ax + 3a, & 2 \le x \le 3 \\ 0, & \text{otherwise} \end{cases}$

Determine the constant a and determine the distribution function F(x).
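One way to check an answer to Question 28 is to integrate the three pieces exactly. The sketch below uses closed-form areas and exact fractions; the function names `shape_area` and `cdf` are illustrative.

```python
from fractions import Fraction

def shape_area(x):
    """Integral from 0 to x of f(t)/a, for 0 <= x <= 3."""
    x = Fraction(x)
    if x <= 1:
        return x * x / 2                              # integral of t
    if x <= 2:
        return Fraction(1, 2) + (x - 1)               # plus integral of 1 from 1
    return Fraction(3, 2) + (3 * x - x * x / 2) - 4   # plus integral of 3 - t from 2

a = 1 / shape_area(3)      # total probability must equal 1

def cdf(x):
    """Distribution function F(x) built from the same pieces."""
    if x <= 0:
        return Fraction(0)
    if x >= 3:
        return Fraction(1)
    return a * shape_area(x)

print(a, cdf(1), cdf(2), cdf(Fraction(5, 2)))
```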

29.Describe the various definitions of probability pointing out the

limitations of each.

Semester IV

Complementary I

Course-IV-TESTING OF HYPOTHESIS

Module 1. Testing of statistical hypotheses, large and small sample tests, basic

squares tests.

35hours

Module 2. Non parametric tests – advantages, sign test, run test, signed rank

30 hours

1. S.C. Gupta and V.K. Kapoor : Fundamentals of Mathematical

Statistics, Sultan Chand and sons

2. Mood A.M., Graybill F.A. and Boes D.C.: Introduction to Theory of

3. Gibbons J.D.: Non parametric Methods for Quantitative Analysis,

McGraw Hill.

4. S.C. Gupta & V.K.Kapoor: Fundamentals of Applied Statistics, Sultan

5. Box, G.E.P. and G.M. Jenkins: Time Series Analysis, Holden –Day

Model Question Paper

B.Sc.Geography (Main)

Semester IV

COURSE IV : (Complementary I)

TESTING OF HYPOTHESIS

Part A

Time 3 hours Answer all questions

a) Rejecting H0 when H0 is true

b) Accepting H0 when H0 is true

c) Accepting H0 when H1 is true

d) Accepting H1 when H0 is true

2. In a paired t-test:

a) The sample sizes should be equal

b) The size of the first sample should be less than the size of the second

c) The size of the second sample should be less than the size of the first

d) Both sample sizes should be ≥ 50

r

a. 1− r2

n−3

r 1− r2

b.

n−2

r 1− r2

c. n−3

r

d. n−2

1− r2

4. In the test of equality of means of two normal population with small samples of

sizes n1 and n2 taken from them and if the population have equal but unknown

variance, the test statistic follows:

a) $t_{n_1+n_2-1}$ b) $t_{n_1+n_2}$ c) $t_{n_1+n_2/2}$ d) $t_{n_1+n_2-2}$

5. In a chi-square contingency table with 3 rows and 5 columns, the d.f of chi-square

statistic is

a) 15

b) 24

c) 8

d) 7

6. The chi-square test statistic for a goodness of fit test is given by:

a) $\dfrac{O_i - E_i}{E_i}$

b) $\sum \dfrac{O_i - E_i}{E_i^2}$

c) $\sum \dfrac{(O_i - E_i)^2}{E_i^2}$

d) $\sum \dfrac{(O_i - E_i)^2}{E_i}$

a) The variable is continuous

b) The variable is discrete

c) The variable is normal

d) The variable is standard normal

8. The non-parametric equivalent test for a paired t-test is:

a) Signed Rank test

b) Rank sum test

c) Run test

d) Sign test

9. The test used to check the randomness of the collected set of symbols is:

a) Sign test

b) Rank sum test

c) Signed rank test

d) Run test

10. When there are 3 groups, each following normal distribution, and the null

hypothesis is concerned with the equality of means the test used is:

a) Chi square test

b) t-test for equality of means

c) Analysis of variance

d) none of the above

11. The test statistic in a two way ANOVA table follows:

a) Chi-square distribution

b) t-distribution

c) Normal distribution

d) F-distribution

12. In a one way ANOVA if the d.f of the total S.S is 13 and the d.f of the between

sample sum of squares is 6, the d.f of the error sum of squares is:

a) 7 b) 6 c) 19 d) 3

13. In chi-square test of independences of 2 attributes with 2 observations each, the d.f

of the test statistic is 1.

b) Explain your answer.

14. In the case of sign test, the test statistic follows a binomial distribution.

b) Explain your answer.

15. In a one-way ANOVA, the total sum of squares of observations is 6212 and the error sum of squares is 3272. The sum of squares between samples is 2900.

b) Explain your answer.

fit.

b) Say true or false.

a) Explain your answer.

18. A sample of size 12 is taken from a normal distribution. The sample variance is 1.8. What is the value of the test statistic for the test of $H_0: \sigma^2 = 3$?

Part C weight-2 ( answer any 4 questions)

21. What is the null hypothesis for a chi-square test of homogeneity of proportions

and give the layout of observations.

24. In a lot containing 1235 articles, 35 were found to be defective. Does the

hypothesis: The proportion of defective articles is less than 0.02 hold?

25. The value of the sample mean from a population which was assumed to have mean 5 is 4. The sample size is 100 and the variance of the sample is 1. Is there a significant difference between the sample mean and the population mean?

26.Explain paired t-test.

Part D Weight-4 ( Answer any 2 questions)

27 A factory operates in three shifts. The factory manager feels that quality of

part is related to shifts. For this purpose he has collected the following data

from the past records of production.

Shift      No. of Parts

           Good   Bad

Day        900    130

Evening    700    170

Night      400    200
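The data in Question 27 suit a chi-square test of independence. The sketch below assumes the rows are Day (900, 130), Evening (700, 170) and Night (400, 200) and computes the Pearson statistic in plain Python; the function name is illustrative.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: Day, Evening, Night; columns: good parts, bad parts.
shifts = [[900, 130], [700, 170], [400, 200]]
stat, df = chi_square_independence(shifts)
print(round(stat, 2), df)
```

The statistic is then compared with the tabulated chi-square value for 2 degrees of freedom (5.991 at the 5% level).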

28 Fifteen patient records from each of two hospitals were received and

assigned a score designed to measure level of care. The scores were as

follows:-

Hospital A: 99 85 73 98 83 88 99 80 74 91 80 94 94 98 80

Hospital B: 78 74 69 79 57 78 79 68 59 91 89 55 60 55 79

Use a proper non-parametric test to see whether the two populations are

identical with respect to the level of care.
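One suitable choice for Question 28 is the Wilcoxon rank sum test. The sketch below assumes mid-ranks for tied scores and the large-sample normal approximation without a tie correction; the helper `midranks` is illustrative.

```python
def midranks(values):
    """Ranks 1..n with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1      # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

hospital_a = [99, 85, 73, 98, 83, 88, 99, 80, 74, 91, 80, 94, 94, 98, 80]
hospital_b = [78, 74, 69, 79, 57, 78, 79, 68, 59, 91, 89, 55, 60, 55, 79]

ranks = midranks(hospital_a + hospital_b)
n_a, n_b = len(hospital_a), len(hospital_b)
w_a = sum(ranks[:n_a])                              # rank sum for Hospital A
w_b = sum(ranks[n_a:])                              # rank sum for Hospital B
mean_w = n_a * (n_a + n_b + 1) / 2                  # E[W_A] under H0
sd_w = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5    # ignores the tie correction
z = (w_a - mean_w) / sd_w
print(w_a, w_b, round(z, 2))
```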

29. The laboratories A and B carry out independent estimates of fat content in ice-

creams made by a firm. A sample is taken from each batch, halved and the

separate halves sent to the two laboratories. The fat contents obtained by the

laboratories are recorded below. (Fat contents in milligrams are given below)

Batch No. 1 2 3 4 5 6 7 8 9 10

Lab A 7 8 7 3 8 6 9 4 7 8

Lab B 9 8 8 4 7 7 9 6 6 6

Is there a significant difference between the mean fat content obtained by the two

laboratories A and B?
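Because each batch is halved between the two laboratories, Question 29 calls for a paired comparison. A minimal sketch of the paired t statistic, assuming the differences are approximately normal:

```python
import math

lab_a = [7, 8, 7, 3, 8, 6, 9, 4, 7, 8]
lab_b = [9, 8, 8, 4, 7, 7, 9, 6, 6, 6]

d = [a - b for a, b in zip(lab_a, lab_b)]          # within-batch differences
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
t = d_bar / (s_d / math.sqrt(n))                   # compare with t on n - 1 d.f.
print(round(d_bar, 2), round(s_d, 3), round(t, 3))
```

The value of |t| is compared with the tabulated t for 9 degrees of freedom (2.262 for a two-sided test at the 5% level).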

STATISTICS: COMPLEMENTARY – I

SYLLABUS FOR BSc. PSYCHOLOGY (MAIN)

CUCCSSUG 2009 (2009 admission onwards)

este al hours

Ext:Int

r No hours/wee

k

1 PS1C01 4 3 3 3:1

STATISTICAL

METHODS

ANALYSIS AND

PROBABILITY

DISTRIBUTIONS AND

PARAMETRIC TESTS

TESTS AND ANALYSIS

OF VARIANCE

There shall be 4 parts A, B, C and D in all the question papers. Part A consists of 12 objective type questions. Part B consists of 8 questions to be answered in a word, phrase or sentence. Part C consists of 6 questions of short essay type of which the student can attempt 4. Part D consists of 3 questions of long essay type of which the student can attempt 2. In Part A the weightage per question is ¼, for Part B the weightage is 1 per question, for Part C the weightage is 2 per question and for Part D the weightage is 4 per question. As far as possible the number of questions should be proportional to the modules.

Table showing the components and weightage for internal assessment


Components Weight

Assignment 1

Test paper 2

Seminar 1

Attendance 1

There shall be two test papers and the average grade point is to be considered for

internal assessment

Semester-I STATISTICAL METHODS

Module 1. Pre-requisites.

A basic idea about data, its collection, organization and planning of survey and

diagrammatic representation of data is expected on the part of the students.

Classification of data, frequency distribution, formation of a frequency distribution, Graphic

representation viz. Histogram, Frequency Curve, Polygon, Ogives and Pie Diagram. 20hr

Mean, Median, Mode, Geometric Mean, Harmonic Mean, Combined Mean, Advantages and

disadvantages of each average. 20hrs

Module 3. Measures of Dispersion.

Range, Quartile Deviation, Mean Deviation, Standard Deviation, Combined Standard

Deviation, Percentiles, Deciles, Relative Measures of Dispersion, Coefficient of Variation.

Module 4. Skewness and Kurtosis.

Pearson’s Coefficient of Skewness, Bowley’s Measure, Percentile Measure of

Kurtosis. 16hrs

Books for Study.

1. Gupta, S P (1988). Statistical Methods, Sultan Chand and Sons, New Delhi.

2. Gupta, S C and Kapoor, V K (2002). Fundamentals of Applied Statistics, Sultan

Chand and Sons, New Delhi.

3. Garrett, H E and Woodworth, R S (1996). Statistics in Psychology and Education,

Vakils, Feffer and Simons Ltd., Bombay.

Model Question Paper

B.Sc. Psychology

I Semester - Statistical Methods

COURSE I : Psychological Statistics (Complementary I)

Time: 3 Hrs

PART A

(Contains 12 questions, 4 Questions carry a weightage of 1)

1. The heights of 150 students are collected. The type of classification that is best suited is

a) Qualitative

b) Quantitative

c) Geographical

d) Chronological

2. A frequency distribution in which the upper limits are not included in their respective

classes is called a

c) Raw data

c)

2

d)

2

4. When there are zeroes in the data we can not use

a) Median

b) Mode

c) Geometric mean

d) Arithmetic mean

a) Median

b) Arithmetic mean

c) Combined mean

d) Mode

6. Mean of 20 values is 45. If one of these values is to be taken 64 instead of 46, the correct

value of mean is:

a) 49.5

b) 45.9

c) 40.9

d) 42.9

a) $\dfrac{\sigma}{\bar{X}} \times 100$ b) $\dfrac{\bar{X}}{\sigma} \times 100$

c) $\dfrac{\text{Median}}{\sigma} \times 100$ d) $\sigma \times 100$

d) No relation

9. The 50th percentile is equal to:

a) 10th decile

b) 1st decile

c) 2nd decile

d) 5th decile

10. For a symmetric distribution median and mode = 10. The value of mean is:

a) Zero

b) 20

c) 10

d) 5

a) Mean = mode

d) (Mean – Mode)/2

a) Skewed curve

b) Platykurtic curve

c) Leptokurtic curve

d) Mesokurtic curve

PART B

13. When there are open end classes, we use median as a measure of central tendency

14. In the case of categorical data we can not use histogram

15. Suppose that the standard deviation of a set of observation is 3. If from each observation

‘3’ is subtracted, the new standard deviation is zero.

16. If 25% of the items are less than 10 and 25% are more than 40 the coefficient of quartile

deviation is -------.

17. Karl Pearson’s coefficient of skewness of a distribution is 0.32 and its standard deviation

is 6.5. The mean is 29.6. The mode is -------.

PART C

21. Explain why A.M. is considered as the best measure of central tendency?

26, 54, 33, 41, 94, 41, 54, 26, 93, 87, 81, 64, 68, 95.

23. The first of two sub-groups has 10 items with mean 15 and S.D. 3. If the whole group has 250 items with mean 15.6 and S.D. 13.44, find the standard deviation of the second sub-group.

26. The means of two samples of sizes 50 and 100 respectively are 54.1 and 50.3. The

standard deviations are 8 and 7. Obtain the mean and standard deviations of the sample

consisting of 150 observations by combining the two samples.

PART D

Freq: 13 15 19 20 23 25 28 13

29. Calculate mean deviation about median for the data given below.

Freq: 8 7 30 26 12 7

COURSE II -SEMESTER-II

REGRESSION ANALYSIS AND PROBABILITY

Meaning, Karl Pearson’s Coefficient of Correlation, Scatter Diagram, Calculation of

Correlation From a 2-way table, Interpretation of Correlation Coefficient, Rank Correlation,

Regression, Regression Equation, Identifying the Regression Lines. 20hrs

Partial and Multiple Correlation Coefficients, Multiple Regression Equation,

Interpretation of Multiple Regression Coefficients (three variable cases only). 16h

Sets, Union, Intersection, Complement of Sets, Sample Space, Events, Classical,

Frequency and Axiomatic Approaches to Probability, Addition and Multiplication Theorems,

Independence of Events (Up-to three events). 20hrs

Discrete and Continuous Random Variables, Probability Mass Function, Distribution

Function of a Discrete Random Variable. 16hrs

Books for Study.

4. Gupta, S P (1988). Statistical Methods, Sultan Chand and Sons, New Delhi.

5. Gupta, S C and Kapoor, V K (2002). Fundamentals of Applied Statistics, Sultan

Chand and Sons, New Delhi.

6. Garrett, H E and Woodworth, R S (1996). Statistics in Psychology and Education,

Vakils, Feffer and Simons Ltd., Bombay.

Model Question Paper

B. Sc. Psychology

II Semester REGRESSION ANALYSIS AND PROBABILITY

Part A

1. The value of the square of Karl Pearson’s coefficient of correlation lies between:

a) 0 and 1 b) -1 and 1

2. Karl Pearson’s coefficient of correlation for the following set of observations (3,12), (5,6) is:

a) Zero b) -1 c) +1 d) infinity

a) Negative b) Positive

c) Zero d) No relation

a) b)

2

1− r 23 1 − r232 1 − r132

c) d)

1− r 2

23 1− r 2

12 1 − r132

5. In a multiple regression equation of X3 on X1 and X2:

b) The joint effect of X2 and X3 are studied keeping the effect of X1 a constant.

d) The correlation between X2 and X3 are studied keeping the effect of X1 a constant.

7. Mutually exclusive events other than null event and sure event are:

a) not independent

b) independent

c) no relation

8. The probability that India wins a cricket match against England is 1/3. If India and England play 3

matches, what is the probability that India will lose all the three matches?

9. What is the probability that a non leap year selected at random will have 53 Sundays?

10. The probability mass function of a discrete r.v is: p(x) = cx/15, x = 1, 2, 3, 4, 5. The value of c is:

a) zero b) 15 c) 5 d) 1

a) constant

b) non-decreasing

c) non-increasing

d) never exists

12. For a discrete r.v. X, P(X > 0) = P(X < 0) and P(X = 0) = p. The variable takes the following values: −2, −1, 0, 1, 2. What is the probability that X > 0?

Part B

13. Classical definition of probability can be used in the case of a sample space with infinite

outcomes.

14. In the case of disjoint events A and B, P(A ∪ B) < P(A) + P(B).

15. Getting a queen and getting a Jack while drawing cards from a deck of cards are independent

events.

16. The correlation coefficient between X and Y is 0.85. Find the coefficient of determination.

Part C

21. Give the axiomatic definition of probability. Mention one advantage of the definition.

22. If A and B are two independent events such that P ( A c ) = 0.7, P ( B c ) = k , P ( A ∪ B ) = 0.8 , then

find the value of k.
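A short sketch for Question 22. It relies on the fact that the complements of independent events are also independent, so that P(A ∪ B) = 1 − P(Aᶜ)P(Bᶜ); exact fractions avoid rounding.

```python
from fractions import Fraction

p_not_a = Fraction(7, 10)     # P(A^c)
p_union = Fraction(8, 10)     # P(A ∪ B)
# For independent events, P(A ∪ B) = 1 - P(A^c) * P(B^c), so k = (1 - P(A ∪ B)) / P(A^c).
k = (1 - p_union) / p_not_a
print(k)
```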

23. A and B stand in a ring with 12 other persons. Find the probability that A & B are together.

24. Explain briefly the concept of partial correlation with the help of an example.

25. Explain why in the case of two variables there are always two regression lines? When do they

coincide?

Part D

27. From a bag containing 5 red and 6 blue balls, 4 balls are taken at random. Find the probability

mass function of:

28. P(A) = 1/3, P(B) = 1/4, P(A∩B) = 1/11. Find the following probabilities.

29. Give an example to show that correlation coefficient is a measure of linear correlation only

Semester-III

Course III - PROBABILITY DISTRIBUTIONS AND PARAMETRIC TESTS

Binomial, Poisson and Normal Distributions, Mean and Variance (without

derivations), Numerical Problems, Fitting, Importance of Normal Distribution, Central Limit

Theorem. 25hrs

Methods of Sampling, Random and Non-random Sampling, Simple Random

Sampling, Stratified, Systematic and Cluster Sampling. 20hrs

Module 3. Testing of Hypotheses.

Fundamentals of Testing, Type-I & Type-II Errors, Critical Region, Level of

Significance, Power, p-value, Tests of Significance.

Large Sample Tests – Test of a Single Mean, Equality of Two Means, Test of a Single

Proportion, Equality of Two Proportions. 25hrs

Module 4. Small Sample Tests.

Test of a Single Mean, Paired and Unpaired t-Test, Chi-Square Test of Variance, F-

Test for the Equality of Variance, Tests of Correlation. 20hrs

Books for Study.

7. Gupta, S P (1988). Statistical Methods, Sultan Chand and Sons, New Delhi.

8. Gupta, S C and Kapoor, V K (2002). Fundamentals of Applied Statistics, Sultan

Chand and Sons, New Delhi.

9. Garrett, H E and Woodworth, R S (1996). Statistics in Psychology and Education,

Vakils, Feffer and Simons Ltd., Bombay.

Model Question Paper

B. Sc. Psychology

III Semester PROBABILITY DISTRIBUTIONS AND PARAMETRIC TESTS

a) Zero

b) 1/4

c) 3/4

d) One

2. The parameter of a Poisson distribution is 6. Its variance is:

a) Less than 6

b) Greater than 6

c) Equal to 6

d) No relation

a) One

b) Zero

c) Three

d) Four

4. If a sample of size n is taken without replacement from a population with N units, the

probability of getting a sample is:

a) 1/n b) 1/N c) 1/NCn d) 1/2N

5. The test statistic that is used to check equality of variance of two normal populations when

two small samples are taken from them is:

a) standard normal

b) F

c) t

d) χ²

6. A statistic is

a) Constant

b) Same as parameter

8. In a paired t-test:

b) The size of the first sample should be less than the size of the second

c) The size of the second sample should be less than the size of the first

r

a. 1− r2

n−3

r 1− r2

b.

n−2

r 1− r2

c. n −3

r

d. n−2

1− r2

10. In the test of equality of means of two normal population with small samples of sizes n1

and n2 taken from them and if the population have equal but unknown variance, the test

statistic follows:

11. Stratified sampling procedure is highly effective in:

a) heterogeneous population

b) homogeneous population

c) infinite population

d) always

population is.

18. A sample of size 12 is taken from a normal distribution. The sample variance is 1.8. What is the value of the test statistic for the test of $H_0: \sigma^2 = 3$?
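The answer to Question 18 depends on how the sample variance was computed, which the question does not state. The sketch below shows both conventions as assumptions; the function name and the `divisor` argument are illustrative.

```python
def variance_test_statistic(n, sample_variance, sigma0_sq, divisor="n"):
    """Chi-square statistic for H0: sigma^2 = sigma0_sq, on n - 1 d.f.

    divisor="n"   : sample variance computed with divisor n     -> n * s^2 / sigma0^2
    divisor="n-1" : sample variance computed with divisor n - 1 -> (n - 1) * s^2 / sigma0^2
    """
    multiplier = n if divisor == "n" else n - 1
    return multiplier * sample_variance / sigma0_sq

stat_n = variance_test_statistic(12, 1.8, 3, "n")
stat_n1 = variance_test_statistic(12, 1.8, 3, "n-1")
print(round(stat_n, 2), round(stat_n1, 2))
```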

19. Define type II error

20 Define the power of the test.

Part C (4 Questions to be answered out of 6) wt 2

21. What do you mean by standard error?

22. Explain paired t-test.

23. What are the advantages of systematic sampling compared to SRS.

24. A correlation coefficient 0.65 was observed in a sample of 50 bi-variate observations. Is

the value significant?
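For Question 24, the usual statistic for testing a zero population correlation is $t = r\sqrt{n-2}/\sqrt{1-r^2}$ on n − 2 degrees of freedom, assuming a bivariate normal population. A minimal sketch:

```python
import math

r, n = 0.65, 50
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)   # t on n - 2 = 48 d.f. under H0: rho = 0
print(round(t, 2))
```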

25. In a lot containing 1235 articles, 35 were found to be defective. Does the hypothesis: The

proportion of defective articles is less than 0.02 hold?
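Question 25 can be approached with the large-sample test for a single proportion. The sketch assumes the boundary value p = 0.02 is used under the null hypothesis, with large positive z counting against "less than 0.02".

```python
import math

n, defectives, p0 = 1235, 35, 0.02
p_hat = defectives / n                              # observed proportion
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)     # standard error evaluated at p0
print(round(p_hat, 4), round(z, 2))
```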

26. The value of the sample mean from a population which was assumed to have mean = 5 is

4. The sample size is 100 and the variance of the sample is 1. Is there significant

difference between sample mean and population mean?

Part D (2 Questions to be answered out of 3) wt4

27. Using Poisson approximation to the Binomial distribution, solve the following. If the

probability that an individual suffers a bad reaction from a particular infection is 0.001,

determine the probability that out of 2,000 individuals,

a) Exactly 3
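Part (a) of Question 27 can be sketched as follows, taking the Poisson mean as λ = np:

```python
import math

n, p = 2000, 0.001
lam = n * p                                          # Poisson mean, here 2
p_exactly_3 = math.exp(-lam) * lam ** 3 / math.factorial(3)
print(round(p_exactly_3, 4))
```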

28. The laboratories A and B carry out independent estimates of fat content in ice-creams

made by a firm. A sample is taken from each batch, halved and the separate halves sent to

the two laboratories. The fat contents obtained by the laboratories are recorded below.

(Fat contents in milligrams are given below)

Batch No. 1 2 3 4 5 6 7 8 9 10

Lab A 7 8 7 3 8 6 9 4 7 8

Lab B 9 8 8 4 7 7 9 6 6 6

Is there a significant difference between the mean fat content obtained by the two laboratories

A and B?

Semester-IV NON PARAMETRIC TESTS AND ANALYSIS OF VARIANCE

Course IV

Module 1. Chi-square Tests.

Chi-square Test of Goodness of Fit, Test of Independence of Attributes, Test of

Homogeneity of Proportions. 25hrs

Module 2. Non-Parametric Tests.

Sign Test, Wilcoxon’s Signed Rank Test, Wilcoxon’s Rank Sum Test, Run Test,

Kruskal-Wallis Test. 20hrs

Module 3. Analysis of Variance.

One-way and Two-way Classification with Single Observation Per Cell, Critical

Difference. 25hrs

Module 4. Preparation of Questionnaire, Scores and Scales of Measurement, Reliability and

Validity of Test Scores. 20hrs

Books for Study.

10. Gupta, S P (1988). Statistical Methods, Sultan Chand and Sons, New Delhi.

11. Gupta, S C and Kapoor, V K (2002). Fundamentals of Applied Statistics, Sultan

Chand and Sons, New Delhi.

12. Garrett, H E and Woodworth, R S (1996). Statistics in Psychology and Education,

Vakils, Feffer and Simons Ltd., Bombay.

Model Question Paper

B. Sc. Psychology

Time: 3 Hrs

Q.1. In a chi-square contingency table with 3 rows and 5 columns, the d.f of chi-square

statistic is

a) 15

b) 24

c) 8

2. The chi-square test statistic for a goodness of fit test is given by:

a) $\dfrac{O_i - E_i}{E_i}$

b) $\sum \dfrac{O_i - E_i}{E_i^2}$

c) $\sum \dfrac{(O_i - E_i)^2}{E_i^2}$

d) $\sum \dfrac{(O_i - E_i)^2}{E_i}$

3. In a Poisson goodness of fit test having ‘k’ sets of observed frequencies with estimated

value of λ , the chi-square statistic has d.f.

a) k-2

b) k

c) k-1

d) k-2

c) Run test

d) Sign test

6. The test used to check the randomness of the collected set of symbols is:

a) Sign test

d) Run test

7. When there are 3 groups, each following a normal distribution, and the null hypothesis is

concerned with the equality of means, the test used is:

c) Analysis of variance

a) Chi-square distribution

b) t-distribution

c) Normal distribution

d) F-distribution

a) t-test

b) Normal test

c) Chi-square test

d) ANOVA

10. In a one-way ANOVA, if the d.f of the total S.S is 13 and the d.f of the

between sample sum of squares is 6, the d.f of the error sum of squares is:

a) 7 b) 6 c) 19

11. The mean value of a set of scores is 50 with S.D.=5. If the raw score of an individual is

55, his z-score is:

a) Zero b) -1 c) 50 d) +1

12. The reliability coefficient of a test of 50 items is 0.60. How much should it be lengthened

to raise the self correlation to 0.9?

a) 5 b) 6 c) 7 d)

Part B Answer all questions weight 1

13. In chi-square test of independence of 2 attributes with 2 observations each, the d.f of the

test statistic is 1.

14. In the case of sign test, the test statistic follows a binomial distribution.

15. In a one-way ANOVA, the total sum of squares of observations is 6212 and the error

sum of squares is 3272. The sum of squares between samples is 2900.

16. In χ² test of goodness of fit, if the calculated value of χ² is zero, then it is a bad fit.

18. In test re-test method Karl Pearson’s coefficient of correlation between two test scores is

0.9. What is the coefficient of reliability?

Part C (Answer 4 questions out of 6) weight 2

21. State the null hypothesis for a chi-square test of homogeneity of proportions and give

the layout of observations.

24. The reliability coefficient of a test of 50 items is 0.6. How much should the test be

lengthened to raise the self correlation to 0.9? What effect will the doubling of the test

length have upon the reliability coefficient?
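Questions of this kind are usually handled with the Spearman-Brown prophecy formula, r_n = n r / (1 + (n - 1) r), where n is the factor by which the test is lengthened. The sketch below assumes that formula applies (that is, the added items are comparable to the existing ones); the function names are illustrative.

```python
def spearman_brown(r, n):
    """Reliability of a test lengthened n times, given current reliability r."""
    return n * r / (1 + (n - 1) * r)


def lengthening_factor(r, r_target):
    """Factor n needed to raise reliability from r to r_target (formula inverted)."""
    return r_target * (1 - r) / (r * (1 - r_target))


n_needed = lengthening_factor(0.6, 0.9)   # how many times longer the test must be
items_needed = round(n_needed * 50)       # total items after lengthening a 50-item test
r_doubled = spearman_brown(0.6, 2)        # reliability if the test length is doubled

print(f"lengthen {n_needed:.1f} times ({items_needed} items); doubled test r = {r_doubled:.2f}")
```

Under these assumptions the test has to be made six times as long, and doubling it raises the reliability from 0.60 to 0.75.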

25. A test of 50 items has reliability 0.7 and validity 0.5. If another 150 comparable items are

added to it what will be the validity?
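A common textbook formula for the validity of a test lengthened n times is r_xy * sqrt(n) / sqrt(1 + (n - 1) * r_xx), where r_xx is the reliability and r_xy the validity of the original test. The sketch below assumes that formula and that the added items are parallel to the existing ones; the function name is hypothetical.

```python
import math


def lengthened_validity(r_xy, r_xx, n):
    """Validity after lengthening a test n times (assumes parallel added items)."""
    return r_xy * math.sqrt(n) / math.sqrt(1 + (n - 1) * r_xx)


n = (50 + 150) / 50                      # 200 items in place of 50, so n = 4
new_validity = lengthened_validity(0.5, 0.7, n)
print(f"n = {n:.0f}, validity of lengthened test = {new_validity:.3f}")
```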

26. In a one-way analysis of variance with three groups (samples) each consisting of 5

observations, the mean error sum of squares is 30.5. Calculate the critical difference. The

group means are 20, 25 and 26 respectively. Find which pairs show significant difference

if any.
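The critical difference for comparing two group means after a one-way ANOVA with equal group sizes is t × sqrt(2 × MSE / n), with t taken at the error degrees of freedom. The sketch below is illustrative only; it reads "mean error sum of squares" as the MSE and assumes the table value 2.179 for the two-sided 5% point of t with 12 d.f.

```python
import math
from itertools import combinations

k, n = 3, 5                      # number of groups, observations per group
mse = 30.5                       # mean error sum of squares from the question
means = {"Group 1": 20, "Group 2": 25, "Group 3": 26}

df_error = k * n - k             # 15 observations minus 3 group means = 12
T_CRIT_12DF = 2.179              # assumed table value: two-sided 5%, 12 d.f.
cd = T_CRIT_12DF * math.sqrt(2 * mse / n)

# A pair differs significantly only if its mean difference exceeds the CD.
significant_pairs = [
    (a, b) for a, b in combinations(means, 2) if abs(means[a] - means[b]) > cd
]
print(f"CD = {cd:.2f}; significant pairs: {significant_pairs}")
```

The largest difference between group means is 6, which is below the computed critical difference, so under these assumptions no pair differs significantly.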

Section D

27. A factory operates in three shifts. The factory manager feels that quality of part is related

to shifts. For this purpose he has collected the following data from the past records of

production.

No. of Parts

Good Bad

28. Fifteen patient records from each of two hospitals were received and assigned a score

designed to measure level of care. The scores were as follows:-

Hospital A: 99 85 73 98 83 88 99 80 74 91 80 94 94 98 80

Hospital B: 78 74 69 79 57 78 79 68 59 91 89 55 60 55 79

Use a proper non-parametric test to see whether the two populations are identical with respect

to the level of care.
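One suitable choice for two independent samples is the Wilcoxon rank-sum test from Module 2. The sketch below ranks the pooled scores (ties get the average rank), sums the ranks for Hospital A, and uses the large-sample normal approximation. It is a minimal illustration: no tie correction is applied to the variance, and the helper name `midranks` is hypothetical.

```python
import math

hospital_a = [99, 85, 73, 98, 83, 88, 99, 80, 74, 91, 80, 94, 94, 98, 80]
hospital_b = [78, 74, 69, 79, 57, 78, 79, 68, 59, 91, 89, 55, 60, 55, 79]


def midranks(values):
    """Rank values 1..N, giving tied values the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


n1, n2 = len(hospital_a), len(hospital_b)
ranks = midranks(hospital_a + hospital_b)
w_a = sum(ranks[:n1])                                # rank sum for Hospital A
mean_w = n1 * (n1 + n2 + 1) / 2                      # expected rank sum under H0
sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)       # SD under H0, ignoring ties
z = (w_a - mean_w) / sd_w
print(f"W_A = {w_a}, z = {z:.2f}, reject at 5% (two-sided): {abs(z) > 1.96}")
```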

Multiple Choice Questions

Analysis of Variance - Single factor completely

randomized design

varies among four brands of cigarettes. Three packs of each brand were selected,

and one cigarette from each pack was placed in a smoking machine to determine

the tar content. An Analysis of Variance was performed and here are the results

(some parts are hidden):

MODEL 3 ***.******** 116.00000000 ***** 0.0028

ERROR ** 80.00000000 ************

CORR. TOTAL ** 428.00000000

BRAND 3 348.00000000 ***** 0.0028

1. The value of the F-statistic for testing the equality of the means is:

(a) 4.35

(b) .0028

(c) 13.05

(d) 11.60

(e) 116.00

Solution: d

Past performance 1990 Apr - 75%

Past performance 1991 Feb - 63% (c-27%)

Past performance 1993 Feb - 84% (c-10%)
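The hidden cells of the ANOVA table can be recovered from the visible ones, since sums of squares and degrees of freedom add up and each mean square is SS / df. The sketch below assumes 12 observations (4 brands × 3 packs) as described in the question; variable names are illustrative.

```python
n_total, k = 12, 4                 # 4 brands x 3 packs each
ss_error, ss_total = 80.0, 428.0   # visible entries in the output

df_model = k - 1                   # 3, as printed
df_error = n_total - k             # 8
df_total = n_total - 1             # 11

ss_model = ss_total - ss_error     # sums of squares are additive
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_stat = ms_model / ms_error

print(f"SS model = {ss_model}, MS model = {ms_model}, MS error = {ms_error}, F = {f_stat}")
```

The recovered model sum of squares (348) and mean square (116) agree with the visible BRAND and MODEL rows of the output.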

2. The hypothesis would be rejected at α=0.05 if the test statistic is greater

than:

(a) 4.07

(b) 3.86

(c) 8.85

(d) 8.81

(e) 3.59

Solution: a

Past performance 1990 Apr - 79%

Past performance 1991 Feb - 61% (b-31%)

Past performance 1993 Feb - 86% (b-12%)

(a) Because the p-value is small, there is evidence that all the brands

differ from each other in the mean amount of tar present.

(b) Because the p-value is small, there is no evidence that any of the

brands differ in the mean tar content.

(c) Because the p-value is small, there is evidence that at least one brand

has a different mean tar content from the other brands.

(d) Because the p-value is small, there is no evidence that at least one

brand has a different mean tar content from the other brands.

(e) Because the p-value is small, there is evidence that all of brands have

the same mean tar content.

Solution: c

Past performance 1993 Feb - 95%

Since the p-value is 0.0028, the hypothesis of equal means is rejected. Consequently, a multiple comparison procedure was performed. Here is a portion of the output:

NOTE: THIS TEST CONTROLS THE TYPE I COMPARISONWISE ERROR RATE,

NOT THE EXPERIMENTWISE ERROR RATE

ALPHA=0.05 DF=* MSE=***

CRITICAL VALUE OF T=2.30600

LEAST SIGNIFICANT DIFFERENCE=5.9541

© 2006 Carl James Schwarz

T GROUPING MEAN N BRAND

A 122.000 3 Wheezer

B 112.000 3 Choker

B

B 110.000 3 Hacker

B

B 108.000 3 Killer

in any comparison.

(b) The experiment-wise error rate is the probability of at least one Type

I error in all possible comparisons

(c) There is no evidence of a difference between the average tar content

of the Hacker and Killer brands.

(d) The Hacker brand appears to have lower mean tar content than the

Choker brand.

(e) Two sample means must differ by the Least Significant Difference (5.9541) before the corresponding population means are declared different.

Solution: d

Past performance 1990 Apr - 62% (B-15%)
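The printed Least Significant Difference can be reproduced as t × sqrt(2 × MSE / n). The sketch below assumes MSE = 80 / 8 = 10 (the masked value, inferred from the ANOVA table) and n = 3 packs per brand, then checks which pairs of brand means differ by more than the LSD.

```python
import math
from itertools import combinations

mse, n = 10.0, 3                 # assumed: MSE inferred as 80 / 8; 3 packs per brand
T_CRIT = 2.306                   # critical value of t printed in the output
lsd = T_CRIT * math.sqrt(2 * mse / n)

means = {"Wheezer": 122.0, "Choker": 112.0, "Hacker": 110.0, "Killer": 108.0}
different = [
    (a, b) for a, b in combinations(means, 2) if abs(means[a] - means[b]) > lsd
]
print(f"LSD = {lsd:.4f}")
for a, b in different:
    print(f"{a} vs {b}: difference {abs(means[a] - means[b]):.1f} exceeds the LSD")
```

Only the comparisons involving Wheezer exceed the LSD, which matches the A/B letter grouping in the output.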

three different brands. She believes that the value of 4 is a good estimate

of the population standard deviation. What is the estimated sample size

to be 80% sure of detecting a 5 mg. difference in the mean tar content

when testing at α=0.05?

(b) 12 cigarettes in total; four of each brand

(c) 14 of each brand for a total of 42 cigarettes

(d) 14 cigarettes in total; five cigarettes in two brands, four in the third

(e) 15 cigarettes in total; five of each of three brands.

Solution: c

Past performance 1990 Apr - 76% (E-13%)

Past performance 1991 Feb - 86%

6. Suppose the analyst wishes to repeat the experiment blocking by the type

of inhalation of smokers. Which of the following is NOT CORRECT about

a randomized complete block design?

(b) Every treatment must appear at least once in every block.

(c) Blocking is used to remove the effects of another factor (not of interest) from the comparison of the levels of the primary factor.

(d) The ANOVA table will have another line in it for the contribution to

the variability from the blocks.

(e) Blocks should contain experimental units that are as different as possible from each other.

Solution: e

Past performance 1990 Apr - 79%

Some varieties of nematodes (round worms that live in soil and are frequently so small that they are invisible to the naked eye) feed on the roots of lawn grasses and crops such as strawberries and tomatoes. The pest, which is particularly troublesome in warm climates, can be treated by the application of nematocides. However, because of the size of the worms, it is very difficult to count them directly. Hence, the yield of a crop is used as a surrogate for the number of worms. Four brands of nematocides are to be compared. Twelve plots of land of comparable fertility that were suffering from nematodes were planted with a crop. Each nematocide was applied to three plots; the assignment of the nematocide to the plot was made at random. At harvest time, the yields of each plot were recorded and part of the ANOVA table appears below:

Source df SS MS F-value

Nematocides * 3.456 * *

Error 8 1.200 *

Total 11 4.656

the mean yields among the four brands is:

(a) 23.04

(b) 2.89

(c) 3.46

(d) 1.20

(e) 7.68

Solution: e

Past performance 1990 Feb - 90%

(b) Reject H if F ∗ > 3.59

(c) Reject H if F ∗ > 4.07

(d) Reject H if F ∗ > 2.60

(e) Reject H if F ∗ > 8.85

Solution: c

Past performance 1990 Feb - 92%

9. Suppose that based upon this experiment, the scientist wishes to be 80%

sure of detecting a difference of about 0.45 kg/plot in the average yield

among the four nematocides when testing at α=0.05. She decides to use

0.15 as an estimate of the population variance. Then:

(a) The required sample size is about 20 plots per nematocide for a total

of 80 plots.

(b) The required total sample size is 20 plots, i.e., 5 plots per nematocide.

(c) The required sample size is about 4 plots per nematocide for a total

of 16 plots.

(d) The required total sample size is 4 plots, i.e., 1 plot per nematocide.

(e) The required sample size cannot be determined because the individual population means are not known.

Solution: a

Past performance 1990 Feb - 40% (A-40%, C-46%)

10. What is the best reason for randomly assigning treatment levels to the

experimental units?

can apply the nematocides in any pattern rather than in a systematic

fashion.

(b) Randomization will tend to average out all other uncontrolled factors such as soil fertility so that they are not confounded with the treatment effects.

(c) Randomization makes the analysis easier because the data can be

collected and entered into the computer in any order.

(d) Randomization is required by statistical consultants before they will

help you analyze the experiment.

(e) Randomization implies that it is not necessary to be careful during

the experiment, during data collection, and during data analysis.

Solution: b

Past performance 1990 Feb - 97%

(a) Conclude that the mean yields of the four nematocides are equal

when in fact at least one is not equal.

(b) Conclude that the mean yields of the four nematocides are equal

when in fact they are equal.

(c) Conclude that the mean yields of the four nematocides are unequal

when in fact at least one is not equal.

(d) Conclude that the mean yields of the four nematocides are unequal

when in fact they are equal.

(e) Fail this exam because you used the osmosis method of studying.

Solution: d

Past performance 1990 Feb - 82%

Cuckoo birds lay their eggs in the nests of other species (the host species). Can cuckoo birds modify their egg sizes according to the nest of the host species? A sample of nests containing a cuckoo egg was found and the size of the cuckoo egg in the host species nest was measured. The following output was obtained:

12. Which is the null and alternate hypothesis?

(a) H: all sample means are equal;

A: at least one sample mean differs from the others.

(b) H: all host species have the same population mean cuckoo egg size;

(c) H: all eggs are the same size:

A: at least one egg differs in size from the others.

(d) H: all host species are the same;

A: at least one host species is different from the others.

(e) H: all host species have the same size eggs;

A: at least one host species has different sized eggs from the others.

Solution: b

Past performance 2006 Dec - 38% (30%-a; 19%-c)

(a) This is a paired experiment because all host species were measured

more than once.

(b) This experiment is unbalanced with unequal number of eggs measured from each host species.

(c) There is no need to carefully select a random sample of host species

nests because the sample size is large.

(d) The ANOVA method tests if the variances are equal across all treatment groups.

(e) In the Analysis of Variance (ANOVA) method, the F-test can be

thought of as test of equal variances.

Solution: b - rats a typing error made the original have no answer

(a) Because the p-value is small, there is very strong evidence that the

means are equal.

(b) The F -ratio of 10.4 tests if all the individual values are the same.

(c) Because some confidence diamonds do not overlap, there is evidence

that not all means are equal.

(d) The Tukey-Kramer output shows that all the means are different

from each other.

(e) The comparison circles show that the eggs from Wren nests are all a

different size than eggs from other host species.

Solution: c

Past performance 2006 Dec - 71% (20%-e)

Multiple Choice Questions

Analysis of Variance - general

of the Analysis of Variance technique?

(b) The populations are normally distributed.

(c) The variances of the populations are the same.

(d) The means of the populations are equal.

(e) all of the above

Solution: d

Multiple Choice Questions

Analysis of Variance - Single factor randomized

complete block designs

preparation on the first year growth of slash pine seedlings. Four locations

(provincial forest areas) were selected, and each location was divided into three

plots. Three methods of soil preparation were used: no preparation, light fertilization, and burning. One treatment was randomly assigned to the plots

within each location and all three treatments were applied at each location.

On each plot, the same number of seedlings was planted, and the average first

year growth for the seedlings on the plot was recorded. Two outputs appear

below - only one of which is a “correct” way of analyzing this data.

Source df SS MS F Prob

prep * 38 **.* **.* 0.1517

Error * 73 **.*

Total * 111

prep Mean

burn 12.0

fertilize 16.0

none 12.5

Source df SS MS F Prob

prep * 38.0 **.* **.* 0.0121

locn * 61.7 **.* **.* 0.0077

Error * 11.3 **.*

Total * 111

prep Mean

burn 12.0

fertilize 16.0

none 12.5

1. The value of the test statistic for testing the appropriate hypothesis is:

(a) 2.3

(b) 10.1

(c) 10.9

(d) 2.6

(e) 11.8

Solution: b

Past performance 1993 Apr - 60% (a-15%; e-10%)
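The two outputs differ only in whether location is treated as a block. The sketch below recomputes both F-ratios from the visible sums of squares, assuming 3 preparations and 4 locations with one plot per combination (12 plots in all), as described in the question.

```python
ss_prep, ss_locn, ss_total = 38.0, 61.7, 111.0
t, b = 3, 4                                   # treatments (prep), blocks (locations)
ms_prep = ss_prep / (t - 1)

# Analysis 1: completely randomized design, locations ignored.
df_err_crd = t * b - t                        # 9
f_crd = ms_prep / ((ss_total - ss_prep) / df_err_crd)

# Analysis 2: randomized complete block design, locations as blocks.
df_err_rcb = (t - 1) * (b - 1)                # 6
ss_err_rcb = ss_total - ss_prep - ss_locn     # about 11.3, as printed
f_rcb = ms_prep / (ss_err_rcb / df_err_rcb)

print(f"F ignoring blocks = {f_crd:.2f}; F with blocks = {f_rcb:.2f}")
```

Because treatments were randomized within each location, the blocked analysis matches the design, and its F-ratio is the one the keyed answer refers to.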

(a) 2.1

(b) 4.3

(c) 2.7

(d) 4.6

(e) 2.4

Solution: e

Past performance 1993 Apr - 33% (c-33%)

detecting a difference between the “burn” and the “none” treatments in

the mean growth when testing at α=0.05. The required sample size for

each treatment is:

(a) > 17

(b) 4

(c) > 21

(d) 12

(e) 14

Solution: a

Past performance 1993 Apr - 25% (b-15%; c-42%; d-10%)

Every winter, tons of salt are dumped on Winnipeg streets. In the spring,

the salt washes into the soil where it can be very harmful to trees and grass.

To investigate this problem, an experimenter wishes to investigate the

effects of different salinity levels upon vegetation growth. Since different

areas of the city differ by soil type and other factors, she blocks by location

in the city. In each location, she administers six different levels of salinity

(15, 20, 30, 35, 45, 50 ppm). The output from SAS follows:

Sum of Mean

Source DF Squares Square F Value Pr > F

Model 8 891.05166667 111.38145833 13.73 0.0001

Error 15 121.67791667 8.11186111

Corr Total 23 1012.72958333

TRT 5 664.43708333 132.88741667 16.38 0.0001

BLOCK 3 226.61458333 75.53819444 9.31 0.0010

NOTE: This test controls the type I comparisonwise error rate not

the experimentwise error rate.

Least Significant Difference= 4.2926

A 18.100 4 20

A

A 14.150 4 15

B 7.475 4 30

B

C B 6.000 4 35

C B

C B 5.775 4 45

C

C 3.075 4 50

(b) F*= 9.31; Reject H if F ∗ > 8.71.

(c) F*= 13.73; Reject H if F ∗ > 2.64.

(d) F*= 9.31; Reject H if F ∗ > 3.29.

(e) F*= 16.38; Reject H if F ∗ > 4.62.

Solution: a

Past performance 1991 Apr - 55% (C-25%)

in any comparison.

(b) The experiment-wise error rate is the probability of at least one Type

I error in all possible comparisons

(c) According to the output, there is evidence of a difference between

15ppm and 35 ppm.

(d) Since the mean biomass at 30 ppm is not found to be different from

that at 35 ppm, and that at 35 ppm is not found to be different

from that at 50 ppm, there is no evidence of a difference in the mean

biomass between 30 ppm and 50 ppm.

(e) Two sample means must differ by the Least Significant Difference (4.29) before the corresponding population means are declared different.

Solution: d

Past performance 1991 Apr - 95%

6. The results of this experiment were interesting but not conclusive. She

now wishes to detect differences when testing at α =.05. Which of the

following is not correct?

difference of 9 in the biomass means.

(b) We would need more than 27 blocks to be 80% sure of detecting a

difference of 1.5 in the biomass means.

(c) We would need 27 blocks to be 80% sure of detecting a difference of

8 in the biomass means.

(d) We would need 13 blocks to be 80% sure of detecting a difference of

4.5 in the biomass means.

(e) We would need 8 blocks to be 80% sure of detecting a difference of

5.8 in the biomass means.

Solution: d

Past performance 1991 Apr - 71% (A-10%)

plete block experiment?

(a) Every block is randomized separately from every other block.

(b) Every treatment must appear at least once in every block.

(c) Blocking is used to remove the effects of another factor (not of interest) from the comparison of levels of the primary factor.

(d) The ANOVA table will have another line in it for the contribution to

the variability from blocks.

(e) Blocks should contain experimental units that are as different as possible from each other.

Solution: e

Past performance 1991 Apr - 93%

Past performance 1998 Dec - 85%

Multiple Choice Questions

Chi-square tests for independence

and poultry operations. A random sample of operators was selected, and the

operators were classified according to the type of operation and the extent of

the rodent population. A total of 78 egg operators and 53 turkey operators were

classified and the summary information is:

1. Which of the following is not correct?

(a) Operators who had both operations could not be used because this

type of analysis requires each unit to be counted in one and only one

cell.

(b) The null hypothesis is that the severity of the rodent problem is

independent of the type of operator.

(c) The alternate hypothesis is that the proportion of turkey operators

with mild, moderate, and severe rodent problems is different from the

proportion of egg operators with mild, moderate, and severe rodent

problems.

(d) A Type I error would be to conclude that the severity of rodent problems is dependent upon the type of operator while, in fact, the proportion of turkey operators with mild, moderate, and severe rodent problems is the same as the proportion of egg operators with mild, moderate, and severe rodent problems.

(e) A Type II error would be to conclude that the proportion of egg

operators with mild, moderate, or severe rodent problems is the same

as the proportion of turkey operators with mild, moderate, or severe

rodent problems when in fact they are independent.

Solution: e

Past performance 1993 Apr - 52% (a-10%; b-10%; c-14%; d-14%)

Past performance 1996 Dec - 61% (a-10%, d-12%)

Past performance 1998 Dec - 72%

(a) about 5.99

(b) about 9.71

(c) about 6.81

(d) about 5.64

(e) about 8.60

Solution: d

Past performance 1993 Apr - 65% (a-14%; c-10%)

Past performance 1998 Dec - 99%

(a) about 26.00

(b) about 33.33

(c) about 53.00

(d) about 31.55

(e) about 78.00

Solution: d

Past performance 1996 Dec - 71% (a-16%)

Past performance 1998 Dec - 87%

(a) about .060

(b) about .014

(c) about .032

(d) about .008

(e) about .05

Solution: a

Past performance 1993 Apr - 48% (b-14%; c-16%; e-13%)

Past performance 1996 Dec - 89%

Past performance 1998 Dec - 96%

5. One reviewer of the study suggested that there may be a problem with the

study because results from small operators were pooled with the results

from large operators. Which of the following is NOT CORRECT?

(a) Simpson’s paradox occurs when conclusions from a pooled table differ

from the individual tables.

(b) Tables can be pooled when the underlying rates are equal among

tables.

(c) Simpson’s paradox occurs when tables with unequal row totals are

pooled.

(d) Inspection of the row or column percents will give a good clue if

Simpson’s paradox is likely to occur.

(e) Simpson’s paradox occurs when the pooled table gives no evidence

of an effect but the individual tables show evidence of an effect.

Solution: c

Past performance 1990 Dec - 68%

Past performance 1993 Apr - 32% (b-16%; d-22%; e-25%)

Past performance 1996 Dec - 65% (b-10%, d-10%)

Past performance 1998 Dec - 73% ( d-10%)

In the paper “Color Association of Male and Female Fourth-Grade School

Children” (J. Psych., 1988, 383-8), children were asked to indicate what

emotion they associated with the color red. The response and the sex of

the child are noted and summarized below. The first number in each cell

is the count, the second number is the row percent.

Frequency|

Row Pct |anger |happy |love |pain | Total

---------+--------+--------+--------+--------+

f | 27 | 19 | 39 | 17 | 102

| 26.47 | 18.63 | 38.24 | 16.67 |

---------+--------+--------+--------+--------+

m | 34 | 12 | 38 | 28 | 112

| 30.36 | 10.71 | 33.93 | 25.00 |

---------+--------+--------+--------+--------+

Total 61 31 77 45 214

------------------------------------------------------

Pearson Chi-Square * 4.629 *****

Likelihood Ratio Chi-Square * 4.661 *****

Mantel-Haenszel Chi-Square 1 0.307 *****

6. Under a suitable null hypothesis, the expected frequency for the cell corresponding to Anger and Males is:

(a) 15.9

(b) 55.7

(c) 30.4

(d) 31.9

(e) 29.1

Solution: d

Past performance 1991 Apr - 63% (C-17%, E-15%)

Past performance 1991 Dec - 84% (e-11%)

Past performance 1997 Aug - 87%
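Under independence, each expected count is (row total × column total) / grand total, and the Pearson statistic sums (O − E)² / E over all cells. The sketch below applies this to the sex-by-emotion table; the dictionary layout and variable names are illustrative.

```python
observed = {
    "f": {"anger": 27, "happy": 19, "love": 39, "pain": 17},
    "m": {"anger": 34, "happy": 12, "love": 38, "pain": 28},
}

row_tot = {r: sum(cells.values()) for r, cells in observed.items()}
col_tot = {c: sum(observed[r][c] for r in observed) for c in observed["f"]}
n = sum(row_tot.values())

# Expected count under independence: row total * column total / grand total.
expected = {r: {c: row_tot[r] * col_tot[c] / n for c in col_tot} for r in observed}

chi_sq = sum(
    (observed[r][c] - expected[r][c]) ** 2 / expected[r][c]
    for r in observed
    for c in col_tot
)
df = (len(row_tot) - 1) * (len(col_tot) - 1)

print(f"E(m, anger) = {expected['m']['anger']:.1f}, chi-square = {chi_sq:.3f}, df = {df}")
```

The statistic agrees with the Pearson Chi-Square value printed in the output, and the degrees of freedom, (2 − 1)(4 − 1) = 3, fill in the masked DF column.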

7. The null hypothesis will be rejected at α=0.05 if the test statistic exceeds:

(a) 3.84

(b) 5.99

(c) 7.81

(d) 9.49

(e) 14.07

Solution: c

Past performance 1991 Apr - 86%

(b) Between .050 and .100

(c) Between .025 and .050

(d) Between .010 and .025

(e) Between .005 and .010

Solution: a

Past performance 1991 Dec - 77% (e-11%)

(a) The children were cross-classified by sex and emotion associated with

red. Each child was counted in one and only one cell.

(b) The null hypothesis is that the type of emotion associated with red

is independent of the sex of the child.

(c) The null hypothesis is that the proportion of emotions associated

with red is the same for both sexes.

(d) All expected cell counts should be greater than five in order that

the distribution of the test statistic is an approximate chi-square

distribution.

(e) If we reject the null hypothesis then we have proven that the two

sexes associate red with emotions in different ways.

Solution: e

Past performance 1991 Apr - 76% (C-12%)

Past performance 1991 Dec - 77% (c-9%, d-12%)

Past performance 1993 Feb - 67% (d-16%)

with the color red than do male students.

(b) More students associate the color red with the emotion “love” than

with the emotion “anger”.

(c) Each student was classified by gender and by emotion association.

Each student was counted in one and only one cell.

(d) We will be unable to compute a correlation for this data because the

variables are not both interval or ratio in scale.

(e) We compute row or column percentages by dividing the cell count by

the table total (214).

Solution: e

Past performance 1993 Feb - 67% (d-16%)

Past performance 1996 Oct - 92%

(a) We conclude that the sex of the child and the emotion associated

with red are independent when in fact they are not independent.

(b) We conclude that the sex of the child and the emotion associated

with red are not independent when in fact they are not independent.

(c) We conclude that the proportion of emotions associated with red

differs between males and females when in fact they are the same.

(d) We conclude that the proportion of emotions associated with red is

the same for male and female when in fact they are the same.

(e) We fail to find any association between the color red and emotions

for either sex.

Solution: c

Past performance 1991 Apr - 76% (E-20%)

Past performance 1991 Dec - 84%

Past performance 1997 Aug - 76%

(a) emotional association with red is independent of gender

(b) gender is dependent upon the emotional association with red

(c) the probability of selecting an emotion with red is related to gender

(d) the number of children in each cell does not depend upon gender nor

upon emotion

(e) the color red is independent of the emotion associated with it and

with gender.

Solution: c

Past performance 1997 Aug - 74%

(a) 4.661 .1983

(b) 4.661 .3966

(c) 4.629 .2011

(d) 4.629 .4022

(e) 4.629 .1006

Solution: b

Past performance 1997 Aug - 76%

14. Each person in a random sample of 50 was asked to state his/her sex and

preferred colour. The resulting frequencies are shown below.

Colour

Red Blue Green

Male 5 14 6

Sex Female 15 6 4

A chi-square test is used to test the null hypothesis that sex and preferred

colour are independent. Which of the following statements is a correct

decision about the null hypothesis?

(b) Reject at the 0.01 level but not at the 0.005 level.

(c) Reject at the 0.025 level but not at the 0.01 level.

(d) Reject at the 0.05 level but not at the 0.025 level.

(e) Accept at the 0.05 level.
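For a 2 × 3 table the statistic has 2 degrees of freedom, and in that special case the upper-tail probability of chi-square has the closed form exp(−x / 2), so the p-value can be obtained without tables. The sketch below is an illustration of that shortcut on the table above; it is not a general-purpose p-value routine.

```python
import math

observed = [[5, 14, 6],      # Male:   Red, Blue, Green
            [15, 6, 4]]      # Female: Red, Blue, Green

row_tot = [sum(row) for row in observed]
col_tot = [sum(col) for col in zip(*observed)]
n = sum(row_tot)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_tot[i] * col_tot[j] / n     # expected count under independence
        chi_sq += (obs - exp) ** 2 / exp

df = (len(row_tot) - 1) * (len(col_tot) - 1)
p_value = math.exp(-chi_sq / 2)               # valid only because df == 2

print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.4f}")
```

The resulting p-value can then be compared directly with the significance levels named in the options.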

15. The following data were obtained from a company which manufactures

special plastic containers which are to hold a specified volume of hazardous

material. On each of the three 8 hour shifts workers are able to make 500

of the containers. Some containers do not meet specifications as required

by the company’s customer because they are too small, others because

they are too large.

Conformance to Specification

Shift Too Small Within Spec. Too Large

8am 36 452 12

4pm 24 443 33

midnight 12 438 50

ber of containers that meet specification on the 4pm shift is

(a) 166.7

(b) 443

(c) 33

(d) 444.3

(e) 500

16. Are all employees equally prone to having accidents? To investigate this

hypothesis, Parry (1985) looked at a light manufacturing plant and classified the accidents by type and by age of the employee.

Accident Type

Age Sprain Burn Cut

Under 25 | 9 17 5

25 or over | 61 13 12

(b) Age seems to be independent of accident type.

(c) Accident type does not seem to be independent of age.

(d) There appears to be a 20.78% correlation between accident type and

age.

(e) The proportion of sprain, cuts and burns seems to be similar for both

age classes.

Solution: c

Past performance 1989 Apr - 64%

two questions: Question 1. Are you happy with your financial situation?

Question 2. Do you approve of the Federal government’s economic policies? The responses are:

Question 1.

Yes No | Total

Question Yes 22 48 | 70

2 No 12 18 | 30

Total 34 66 | 100

response to Question 2 at 5% level, the expected frequency for the cell

(Yes,Yes) and the critical value of the associated test statistic are:

(b) 10.2 and 3.84 respectively

(c) 23.8 and 3.84 respectively

(d) 23.8 and 7.81 respectively

(e) 10.2 and 7.81 respectively

Solution: c

smoking are related. The following information was compiled for 600

individuals:

Smoker Non-smoker

Drinker 193 165

Non-drinker 89 153

Consumption are independent.

(b) The appropriate null hypothesis is H: Smoking and Alcohol Consumption are not independent.

(c) The calculated value of the test statistic is 3.84.

(d) The calculated value of the test statistic is 7.86.

(e) At level .01 we conclude that smoking and alcohol consumption are

related.

Solution: e
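For a 2 × 2 table the Pearson statistic (without continuity correction) reduces to N(ad − bc)² divided by the product of the four marginal totals. The sketch below applies that shortcut to the smoking and drinking counts; the 1% critical value 6.635 for 1 d.f. is an assumed table value.

```python
a, b = 193, 165      # Drinker:     smoker, non-smoker
c, d = 89, 153       # Non-drinker: smoker, non-smoker
n = a + b + c + d

# Shortcut form of the Pearson chi-square for a 2 x 2 table (no Yates correction).
chi_sq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

CRIT_01_1DF = 6.635  # assumed table value: upper 1% point of chi-square, 1 d.f.
print(f"N = {n}, chi-square = {chi_sq:.2f}, reject at 1%: {chi_sq > CRIT_01_1DF}")
```

The computed statistic matches neither of the two values offered in options (c) and (d), and it exceeds the 1% critical value.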

Intermediate. The number of doctors who prescribed tetracycline to at

least one patient under the age of 8 were recorded for each of these practice

areas. The results are:

Tetracycline 95 74 31

No tetracycline 126 84 30

If the county type of practice and the use of tetracycline are independent,

then the expected number of rural doctors who prescribe tetracycline is:

(a) 31.0

(b) 27.7

(c) 1.37

(d) 51%

(e) 62

Solution: b

20. For the problem outlined above, the critical value(table value) of the test

statistic when the level of significance is α =0.05, is:

(a) 0.1026

(b) 7.3778

(c) 5.9915

(d) 12.5916

(e) 7.8147

Solution: c

A study was conducted to determine if the fatality rate depends on the

size of the automobile. The analysis of accidents is as follows (with some

values hidden):

DEATH SIZE

FREQUENCY| m | s | L | TOTAL

---------+--------+--------+--------+

no | 63 | 128 | 46 | 237

---------+--------+--------+--------+

yes | 26 | 95 | 16 | 137

---------+--------+--------+--------+

TOTAL 89 223 62 374

STATISTIC DF VALUE PROB

------------------------------------------------------

CHI-SQUARE * 8.663 *****

LIKELIHOOD RATIO CHI-SQUARE * 8.838 *****

21. Under a suitable null hypothesis, the expected frequency for the cell corresponding to fatal type of accident and small size automobile is:

(a) 81.68

(b) 67.00

(c) 61.43

(d) 63.41

(e) 59.72

Solution: a

Past performance 1990 Apr - 92%

status. Each accident was counted in one and only one cell.

(b) The null hypothesis is that the fatality status is independent of the

size of the automobile.

(c) The null hypothesis is that the proportion of fatality status is the

same for all three sizes of automobiles.

(d) All expected cell counts should be greater than five in order that

the distribution of the test statistic is an approximate chi-square

distribution.

(e) If we reject the null hypothesis then we have proven that the size of

the automobile affects the chances of a fatality.

Solution: e

Past performance 1990 Apr - 39% (B-12%, C-36%)

Past performance 1990 Dec - 20% ( 15% - c, 56% - d)

23. The null hypothesis will be rejected at α=0.05 if the test statistic exceeds:

(a) 12.59

(b) 7.81

(c) 5.99

(d) 3.84

(e) 9.49

Solution: c

Past performance 1990 Apr - 79%

(b) between .005 and .010

(c) between .010 and .025

(d) between .025 and .050

(e) between .050 and .100

Solution: c

Past performance 1990 Dec - 78%

Past performance 1993 Apr - 80%

25. A controversial issue in sports is the use of the “instant replay” for making

decisions on plays that are extremely close or hard to call by an official.

A survey of players in each of four professional sports was conducted,

asking them if they felt “instant replays” should be used to decide close or

controversial calls. The results are as follows:

Favor Oppose

Football 22 2

Baseball 18 6

Basketball 15 26

Soccer 3 10

In testing to see whether opinion with respect to the use of instant replays

is independent of sport, a table of expected frequencies is found. In this

table, the expected number of professional baseball players opposing the

use of instant replays is equal to:

(a) 10.4

(b) 24.1

(c) 11.0

(d) 6.0

(e) 8.4

26. Each person in a random sample of males and females was asked to state

his/her sex and preferred colour. The resulting frequencies are shown

below.

Colour

Red Blue Green

Male 3 11 6

Sex Female 17 11 2

(a) 55% of males prefer the colour blue.

(b) Of those who prefer the colour green, 75% are males.

(c) 44% of people surveyed prefer the colour blue.

(d) A higher percentage of males preferred the colour blue than females.

(e) 15% of people are males who prefer the colour red.

Solution: e

Past performance 2006 Oct - 76% (16%=d)
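The options mix three kinds of percentage: row percents (within a sex), column percents (within a colour), and table percents (out of everyone surveyed). The sketch below computes each one from the table so the denominators are explicit; variable names are illustrative.

```python
counts = {
    "Male":   {"Red": 3,  "Blue": 11, "Green": 6},
    "Female": {"Red": 17, "Blue": 11, "Green": 2},
}

row_tot = {sex: sum(c.values()) for sex, c in counts.items()}
col_tot = {col: sum(counts[sex][col] for sex in counts) for col in counts["Male"]}
n = sum(row_tot.values())

pct_blue_among_males = 100 * counts["Male"]["Blue"] / row_tot["Male"]        # row percent
pct_blue_among_females = 100 * counts["Female"]["Blue"] / row_tot["Female"]  # row percent
pct_male_among_green = 100 * counts["Male"]["Green"] / col_tot["Green"]      # column percent
pct_blue_overall = 100 * col_tot["Blue"] / n                                 # table percent
pct_male_and_red = 100 * counts["Male"]["Red"] / n                           # table percent
pct_red_among_males = 100 * counts["Male"]["Red"] / row_tot["Male"]          # row percent

print(f"males who prefer red, as % of all people: {pct_male_and_red:.0f}%")
print(f"red preference, as % of males only:       {pct_red_among_males:.0f}%")
```

The 15% figure in option (e) is a row percent (red among males), not the share of all people surveyed.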

Multiple Choice Questions

Chi-square tests for independence

and poultry operations. A random sample of operators was selected, and the

operators were classified according to the type of operation and the extent of

the rodent population. A total of 78 egg operators and 53 turkey operators were

classified and the summary information is:

1

1. Which of the following is not correct?

(a) Operators who had both operations could not be used because this

type of analysis requires each unit to be counted in one and only one

cell.

(b) The null hypothesis is that the severity of the rodent problem is

independent of the type of operator.

(c) The alternate hypothesis is that the proportion of turkey operators

with mild, moderate, and severe rodent problems is different from the

proportion of egg operators with mild, moderate, and severe rodent

problems.

(d) A Type I error would be to conclude that the severity of rodent

problems is dependent upon the type of operator while, in fact, the

proportion of turkey operators with mild, moderate, and severe ro-

dent problems is the same as the proportion of egg operators with

mild, moderate, and severe rodent problems.

(e) A Type II error would be to conclude that the proportion of egg

operators with mild, moderate, or severe rodent problems is the same

as the proportion of turkey operators with mild, moderate, or severe

rodent problems when in fact they are independent.

Solution: e

Past performance 1993 Apr - 52% (a-10%; b-10%; c-14%; d-14%)

Past performance 1996 Dec - 61% (a-10%, d-12%)

Past performance 1998 Dec - 72%

(a) about 5.99

(b) about 9.71

(c) about 6.81

(d) about 5.64

(e) about 8.60

Solution: d

Past performance 1993 Apr - 65% (a-14%; c-10%)

Past performance 1998 Dec - 99%

(a) about 26.00

(b) about 33.33

(c) about 53.00

(d) about 31.55

(e) about 78.00

Solution: d

Past performance 1996 Dec - 71% (a-16%)

Past performance 1998 Dec - 87%

(a) about .060

(b) about .014

(c) about .032

(d) about .008

(e) about .05

Solution: a

Past performance 1993 Apr - 48% (b-14%; c-16%; e-13%)

Past performance 1996 Dec - 89%

Past performance 1998 Dec - 96%

5. One reviewer of the study suggested that there may be a problem with the

study because results from small operators were pooled with the results

from large operators. Which of the following is NOT CORRECT?

(a) Simpson’s paradox occurs when conclusions from a pooled table differ

from the individual tables.

(b) Tables can be pooled when the underlying rates are equal among

tables.

(c) Simpson’s paradox occurs when tables with unequal row totals are

pooled.

(d) Inspection of the row or column percents will give a good clue if

Simpson’s paradox is likely to occur.

(e) Simpson’s paradox occurs when the pooled table gives no evidence

of an effect but the individual tables show evidence of an effect.

Solution: c

Past performance 1990 Dec - 68%

Past performance 1993 Apr - 32% (b-16%; d-22%; e-25%)

Past performance 1996 Dec - 65% (b-10%, d-10%)

Past performance 1998 Dec - 73% ( d-10%)
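
The reversal described in the Simpson's paradox options can be seen in a small numerical sketch. The counts below are illustrative only and are not taken from the rodent study; the names `small`, `large`, `rate` and `pooled` are hypothetical. Group A has the higher rate within each stratum, yet the lower rate once the strata are pooled, because the two groups are spread very differently across the strata.

```python
# (successes, trials) per stratum; illustrative counts, not from the rodent study.
small = {"A": (81, 87), "B": (234, 270)}
large = {"A": (192, 263), "B": (55, 80)}


def rate(successes, trials):
    """Proportion of successes."""
    return successes / trials


def pooled(group):
    """Success rate for a group after adding the two strata together."""
    successes = small[group][0] + large[group][0]
    trials = small[group][1] + large[group][1]
    return rate(successes, trials)


# Compare A and B within each stratum and in the pooled table.
a_wins_small = rate(*small["A"]) > rate(*small["B"])
a_wins_large = rate(*large["A"]) > rate(*large["B"])
a_wins_pooled = pooled("A") > pooled("B")

print(a_wins_small, a_wins_large, a_wins_pooled)
```

Comparing the row percents of the separate tables with those of the pooled table is the kind of inspection option (d) refers to.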

In the paper “Color Association of Male and Female Fourth-Grade School

Children” (J. Psych., 1988, 383-8), children were asked to indicate what

emotion they associated with the color red. The response and the sex of

the child are noted and summarized below. The first number in each cell

is the count, the second number is the row percent.

Frequency|
Row Pct  |anger   |happy   |love    |pain    | Total
---------+--------+--------+--------+--------+
f        |     27 |     19 |     39 |     17 |   102
         |  26.47 |  18.63 |  38.24 |  16.67 |
---------+--------+--------+--------+--------+
m        |     34 |     12 |     38 |     28 |   112
         |  30.36 |  10.71 |  33.93 |  25.00 |
---------+--------+--------+--------+--------+
Total          61       31       77       45     214

Statistic                     DF   Value   Prob
------------------------------------------------------
Pearson Chi-Square             *   4.629   *****
Likelihood Ratio Chi-Square    *   4.661   *****
Mantel-Haenszel Chi-Square     1   0.307   *****

6. Under a suitable null hypothesis, the expected frequency for the cell cor-

responding to Anger and Males is:

(a) 15.9

(b) 55.7

(c) 30.4

(d) 31.9

(e) 29.1

Solution: d

Past performance 1991 Apr - 63% (C-17%, E-15%)

Past performance 1991 Dec - 84% (e-11%)

Past performance 1997 Aug - 87%
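
The expected frequency in question 6 follows from the rule (row total × column total) / table total. A minimal sketch, with a hypothetical helper name `expected_counts`, applied to the counts in the table above:

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from the table above; columns: anger, happy, love, pain.
observed = [
    [27, 19, 39, 17],  # female
    [34, 12, 38, 28],  # male
]

expected = expected_counts(observed)
male_anger = expected[1][0]  # 112 * 61 / 214
print(round(male_anger, 1))
```

The same helper gives every other cell of the expected table, which is useful for checking that no expected count is too small.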

7. The null hypothesis will be rejected at α=0.05 if the test statistic exceeds:

(a) 3.84

(b) 5.99

(c) 7.81

(d) 9.49

(e) 14.07

Solution: c

Past performance 1991 Apr - 86%
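
The critical value in question 7 depends only on the degrees of freedom, (rows - 1) × (columns - 1). The sketch below hard-codes the usual 5% points from a printed chi-square table; the names `CRITICAL_05` and `degrees_of_freedom` are hypothetical.

```python
# Upper 5% points of the chi-square distribution, as printed in standard tables.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592, 7: 14.067}


def degrees_of_freedom(n_rows, n_cols):
    """Degrees of freedom for a test of independence in an r x c table."""
    return (n_rows - 1) * (n_cols - 1)


df = degrees_of_freedom(2, 4)   # 2 sexes by 4 emotions
critical = CRITICAL_05[df]
pearson = 4.629                 # Pearson statistic from the output above
reject = pearson > critical

print(df, critical, reject)
```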

(b) Between .050 and .100

(c) Between .025 and .050

(d) Between .010 and .025

(e) Between .005 and .010

Solution: a

Past performance 1991 Dec - 77% (e-11%)

(a) The children were cross-classified by sex and emotion associated with

red. Each child was counted in one and only one cell.

(b) The null hypothesis is that the type of emotion associated with red

is independent of the sex of the child.

(c) The null hypothesis is that the proportion of emotions associated

with red is the same for both sexes.

(d) All expected cell counts should be greater than five in order that

the distribution of the test statistic is an approximate chi-square

distribution.

(e) If we reject the null hypothesis then we have proven that the two

sexes associate red with emotions in different ways.

Solution: e

Past performance 1991 Apr - 76% (C-12%)

Past performance 1991 Dec - 77% (c-9%, d-12%)

Past performance 1993 Feb - 67% (d-16%)

with the color red than do male students.

(b) More students associate the color red with the emotion “love” than

with the emotion “anger”.

(c) Each student was classified by gender and by emotion association.

Each student was counted in one and only one cell.

(d) We will be unable to compute a correlation for this data because the

variables are not both interval or ratio in scale.

(e) We compute row or column percentages by dividing the cell count by

the table total (214).

Solution: e

Past performance 1993 Feb - 67% (d-16%)

Past performance 1996 Oct - 92%

(a) We conclude that the sex of the child and the emotion associated

with red are independent when in fact they are not independent.

(b) We conclude that the sex of the child and the emotion associated

with red are not independent when in fact they are not independent.

(c) We conclude that the proportion of emotions associated with red

differs between males and females when in fact they are the same.

(d) We conclude that the proportion of emotions associated with red is

the same for male and female when in fact they are the same.

(e) We fail to find any association between the color red and emotions

for either sex.

Solution: c

Past performance 1991 Apr - 76% (E-20%)

Past performance 1991 Dec - 84%

Past performance 1997 Aug - 76%

(a) emotional association with red is independent of gender

(b) gender is dependent upon the emotional association with red

(c) the probability of selecting an emotion with red is related to gender

(d) the number of children in each cell does not depend upon gender nor

upon emotion

(e) the color red is independent of the emotion associated with it and

with gender.

Solution: c

Past performance 1997 Aug - 74%

(a) 4.661 .1983

(b) 4.661 .3966

(c) 4.629 .2011

(d) 4.629 .4022

(e) 4.629 .1006

Solution: b

Past performance 1997 Aug - 76%
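
The probabilities paired with the two statistics in the options above can be compared with upper-tail areas of a chi-square distribution with 3 degrees of freedom. For 3 degrees of freedom the tail area has a closed form in terms of the complementary error function, so only the standard library is needed. This is a sketch; the function name `chi2_sf_df3` is hypothetical.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for chi-square with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_pearson = chi2_sf_df3(4.629)           # Pearson chi-square
p_likelihood_ratio = chi2_sf_df3(4.661)  # likelihood ratio chi-square

print(round(p_pearson, 3), round(p_likelihood_ratio, 3))
```

Both tail areas come out near 0.20, well above 0.05.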

14. Each person in a random sample of 50 was asked to state his/her sex and

preferred colour. The resulting frequencies are shown below.

               Colour
               Red   Blue   Green
Sex  Male        5     14       6
     Female     15      6       4

A chi-square test is used to test the null hypothesis that sex and preferred

colour are independent. Which of the following statements is a correct

decision about the null hypothesis?

(b) Reject at the 0.01 level but not at the 0.005 level.

(c) Reject at the 0.025 level but not at the 0.01 level.

(d) Reject at the 0.05 level but not at the 0.025 level.

(e) Accept at the 0.05 level.
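
A sketch of the full calculation for question 14, using only the counts in the table. With 2 degrees of freedom the upper-tail area of the chi-square distribution is exp(-x/2), so no statistical library is needed. Variable names are hypothetical.

```python
import math

observed = [
    [5, 14, 6],   # male: red, blue, green
    [15, 6, 4],   # female: red, blue, green
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected.
statistic = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        statistic += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 the chi-square upper tail has the closed form exp(-x / 2).
p_value = math.exp(-statistic / 2)

print(round(statistic, 2), df, round(p_value, 4))
```

The p-value falls between 0.01 and 0.025, which identifies the significance levels at which the null hypothesis would and would not be rejected.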

15. The following data were obtained from a company which manufactures

special plastic containers which are to hold a specified volume of hazardous

material. On each of the three 8 hour shifts workers are able to make 500

of the containers. Some containers do not meet specifications as required

by the company’s customer because they are too small, others because

they are too large.

                Conformance to Specification
Shift      Too Small   Within Spec.   Too Large
8am               36            452          12
4pm               24            443          33
midnight          12            438          50

ber of containers that meet specification on the 4pm shift is

(a) 166.7

(b) 443

(c) 33

(d) 444.3

(e) 500

16. Are all employees equally prone to having accidents? To investigate this

hypothesis, Parry (1985) looked at a light manufacturing plant and clas-

sified the accidents by type and by age of the employee.

                 Accident Type
Age           Sprain   Burn   Cut
Under 25    |      9     17     5
25 or over  |     61     13    12

(b) Age seems to be independent of accident type.

(c) Accident type does not seem to be independent of age.

(d) There appears to be a 20.78% correlation between accident type and

age.

(e) The proportion of sprain, cuts and burns seems to be similar for both

age classes.

Solution: c

Past performance 1989 Apr - 64%

two questions: Question 1. Are you happy with your financial situation?

Question 2. Do you approve of the Federal government’s economic poli-

cies? The responses are:

                    Question 1
                  Yes    No  | Total
Question 2  Yes    22    48  |    70
            No     12    18  |    30
            Total  34    66  |   100

response to Question 2 at 5% level, the expected frequency for the cell

(Yes,Yes) and the critical value of the associated test statistic are:

(b) 10.2 and 3.84 respectively

(c) 23.8 and 3.84 respectively

(d) 23.8 and 7.81 respectively

(e) 10.2 and 7.81 respectively

Solution: c

smoking are related. The following information was compiled for 600

individuals:

              Smoker   Non-smoker
Drinker          193          165
Non-drinker       89          153

Consumption are independent.

(b) The appropriate null hypothesis is H: Smoking and Alcohol Con-

sumption are not independent.

(c) The calculated value of the test statistic is 3.84.

(d) The calculated value of the test statistic is 7.86.

(e) At level .01 we conclude that smoking and alcohol consumption are

related.

Solution: e
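
For a 2×2 table such as the smoking and drinking data, the Pearson statistic can be computed with the shortcut n(ad - bc)² / (product of the four marginal totals). The sketch below applies it without a continuity correction and uses the 1 degree of freedom tail area erfc(sqrt(x/2)). Variable names are hypothetical.

```python
import math

a, b = 193, 165   # drinker: smoker, non-smoker
c, d = 89, 153    # non-drinker: smoker, non-smoker
n = a + b + c + d

# Shortcut form of the Pearson statistic for a 2x2 table (no continuity correction).
statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(round(statistic, 2), p_value < 0.01)
```

The statistic is far above both values offered in options (c) and (d), and the p-value is below .01.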

Intermediate. The number of doctors who prescribed tetracycline to at

least one patient under the age of 8 were recorded for each of these practice

areas. The results are:

Tetracycline 95 74 31

No tetracycline 126 84 30

If the county type of practice and the use of tetracycline are independent,

then the expected number of rural doctors who prescribe tetracycline is:

(a) 31.0

(b) 27.7

(c) 1.37

(d) 51%

(e) 62

Solution: b

20. For the problem outlined above, the critical value (table value) of the test

statistic when the level of significance is α =0.05, is:

(a) 0.1026

(b) 7.3778

(c) 5.9915

(d) 12.5916

(e) 7.8147

Solution: c

A study was conducted to determine if the fatality rate depends on the

size of the automobile. The analysis of accidents is as follows (with some

values hidden):

DEATH     SIZE
FREQUENCY| m      | s      | L      | TOTAL
---------+--------+--------+--------+
no       |     63 |    128 |     46 |   237
---------+--------+--------+--------+
yes      |     26 |     95 |     16 |   137
---------+--------+--------+--------+
TOTAL          89      223       62     374

STATISTIC                     DF   VALUE   PROB
------------------------------------------------------
CHI-SQUARE                     *   8.663   *****
LIKELIHOOD RATIO CHI-SQUARE    *   8.838   *****

21. Under a suitable null hypothesis, the expected frequency for the cell cor-

responding to fatal type of accident and small size automobile is:

(a) 81.68

(b) 67.00

(c) 61.43

(d) 63.41

(e) 59.72

Solution: a

Past performance 1990 Apr - 92%

status. Each accident was counted in one and only one cell.

(b) The null hypothesis is that the fatality status is independent of the

size of the automobile.

(c) The null hypothesis is that the proportion of fatality status is the

same for all three sizes of automobiles.

(d) All expected cell counts should be greater than five in order that

the distribution of the test statistic is an approximate chi-square

distribution.

(e) If we reject the null hypothesis then we have proven that the size of

the automobile affects the chances of a fatality.

Solution: e

Past performance 1990 Apr - 39% (B-12%, C-36%)

Past performance 1990 Dec - 20% ( 15% - c, 56% - d)

23. The null hypothesis will be rejected at α=0.05 if the test statistic exceeds:

(a) 12.59

(b) 7.81

(c) 5.99

(d) 3.84

(e) 9.49

Solution: c

Past performance 1990 Apr - 79%

(b) between .005 and .010

(c) between .010 and .025

(d) between .025 and .050

(e) between .050 and .100

Solution: c

Past performance 1990 Dec - 78%

Past performance 1993 Apr - 80%

25. A controversial issue in sports is the use of the “instant replay” for making

decisions on plays that are extremely close or hard to call by an official.

A survey of players in each of four professional sports was conducted,

asking them if they felt “instant replays” should be used to decide close or

controversial calls. The results are as follows:

             Favor   Oppose
Football        22        2
Baseball        18        6
Basketball      15       26
Soccer           3       10

In testing to see whether opinion with respect to the use of instant replays

is independent of sport, a table of expected frequencies is found. In this

table, the expected number of professional baseball players opposing the

use of instant replays is equal to:

(a) 10.4

(b) 24.1

(c) 11.0

(d) 6.0

(e) 8.4

26. Each person in a random sample of males and females was asked to state

his/her sex and preferred colour. The resulting frequencies are shown

below.

               Colour
               Red   Blue   Green
Sex  Male        3     11       6
     Female     17     11       2

(a) 55% of males prefer the colour blue.

(b) Of those who prefer the colour green, 75% are males.

(c) 44% of people surveyed prefer the colour blue.

(d) A higher percentage of males preferred the colour blue than females.

(e) 15% of people are males who prefer the colour red.

Solution: e

Past performance 2006 Oct - 76% (16%=d)
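
The five statements in question 26 use three different denominators: a row total, a column total, or the table total. The sketch below computes each percentage from the table; variable names are hypothetical.

```python
counts = {
    "male": {"red": 3, "blue": 11, "green": 6},
    "female": {"red": 17, "blue": 11, "green": 2},
}

n = sum(sum(row.values()) for row in counts.values())
males = sum(counts["male"].values())
females = sum(counts["female"].values())
green = counts["male"]["green"] + counts["female"]["green"]
blue = counts["male"]["blue"] + counts["female"]["blue"]

pct_males_blue = 100 * counts["male"]["blue"] / males        # row percent
pct_green_male = 100 * counts["male"]["green"] / green       # column percent
pct_blue = 100 * blue / n                                    # marginal percent
pct_females_blue = 100 * counts["female"]["blue"] / females  # row percent
pct_male_and_red = 100 * counts["male"]["red"] / n           # table (joint) percent

print(pct_males_blue, pct_green_male, pct_blue,
      round(pct_females_blue, 1), pct_male_and_red)
```

Dividing 3 by the table total gives the percentage of all respondents who are males preferring red; dividing by the number of males gives the percentage of males who prefer red, which is a different quantity.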

Multiple Choice Questions

Experimental and Survey Design

cream sales. This is an example of an association likely caused by:

(a) coincidence

(b) cause and effect relationship

(c) confounding factor

(d) common cause

(e) none of the above

Solution: d

Past performance 1991 Oct - 31% (30% a, 25% c)

Past performance 1992 Oct - 55% (17% a; 17% b)

Past performance 2006 Oct - 70% (10% a; 15% c)

headaches. Four hours after taking the new remedy, 20 of the subjects

reported that their headaches had disappeared. From this information

you conclude:

(b) nothing, because the sample size is too small.

(c) nothing, because there is no control group for comparison.

(d) that the new treatment is better than aspirin.

(e) that the remedy is not effective for the treatment of headaches.

Solution: c

Past performance 1997 Jun - 99%

Past performance 1997 Aug - 99%

3. A nutritionist wants to study the effect of storage time (6, 12, and 18

months) on the amount of vitamin C present in freeze dried fruit when

stored for these lengths of time. Vitamin C is measured in milligrams per

100 milligrams of fruit. Six fruit packs were randomly assigned to each of

the three storage times. The treatment, experimental unit, and response

are respectively:

(b) a fruit pack, amount of vitamin C, a specific storage time

(c) random assignment, a fruit pack, amount of vitamin C

(d) a specific storage time, a fruit pack, amount of vitamin C

(e) a specific storage time, the nutritionist, amount of vitamin C

Solution: d

Past performance 1992 Dec - 92%

Past performance 1996 Dec - 97%

and severity of the flu. We take the next 20 patients that come to the

walk-in clinic complaining of flu and, after a medical exam to verify that

the patients do have the flu, we give them the new medicine and tell them

about the new drug we are giving them. One week later, the patients are

contacted and 15 patients state the new remedy was helpful in reducing

the severity and length of the illness. Which of the following is NOT

CORRECT?

(a) This is a poor experiment because there is no control group. We do

not know how many would feel better in a week without treatment.

(b) This is a poor experiment because it is not double-blinded. The

patients may feel relief because they thought the drug should work.

(c) This is a poor experiment because a convenience sample was selected.

Patients who come to a walk-in clinic may have more severe flu

than people who do not.

(d) This is a poor experiment because we didn’t give the remedy to people

without the flu to assess its effect in a control group.

(e) This is a poor experiment because the sample size is likely to be too

small to detect anything but a gross improvement in measuring the

proportion of people reporting an improvement.

Solution: d

Past performance 1991 Feb - 63% (c-14%, e-13%)

Past performance 1991 Dec - 69% (e-20%)

Past performance 1993 Feb - 56%

Past performance 1996 Oct - 64% (28%-e)

Past performance 1996 Dec - 68% (28%-e)

Past performance 1998 Dec - 80% (15%-e)

pare the starting salaries of women and men. For each graduate, three

variables are to be recorded (among others): sex, starting salary, and

area of specialization.

(a) Sex and starting salary are explanatory variables; area of specializa-

tion is a response variable.

(b) Sex is an explanatory variable; starting salary and area of specializa-

tion are response variables.

(c) Sex is an explanatory variable; starting salary is a response variable;

area of specialization is a possible confounding variable.

(d) Sex is a response variable; starting salary is an explanatory variable;

area of specialization is a possible confounding variable.

(e) Sex and area of specialization are response variables; starting salary

is an explanatory variable.

Solution: c

Past performance 1991 Dec - 74% (b-10%)

Past performance 1993 Apr - 99%

(b) A large sample size always ensures that our sample is representative

of the population.

(c) If all other things are equal, we need a larger sample size for a larger

population.

(d) In a properly chosen sample, an estimate will be less variable with a

large sample size and hence more precise.

(e) In random samples, the randomization ensures that we get precise

and accurate estimates.

Solution: d

Past performance 1992 Dec - 63% (30%e)

Past performance 1996 Dec - 89%

standard fish food and a new product) work equally well at producing fish

of equal weight after a 2-month feeding program. The experimenter has 2

identical fish tanks (1 & 2) to put fish in and is considering how to assign

the 40 tagged fish to the tanks. To properly assign the fish, one step would

be to:

(a) put all the odd tagged numbered fish in one tank, the even in the

other, and give the standard food type to the odd numbered ones

(b) obtain pairs of fish whose weights are virtually equal at the start of

the experiment and randomly assign one to the group tank 1, the

other to tank 2 with the feed assigned at random to the tanks.

(c) proceed as in (b), but put the heavier of the pair into tank

2.

(d) assign the fish at random to the two tanks and give the standard feed

to tank 1.

(e) not to proceed as in (b) because using the initial weight in (b) is a

non-random process. Use the initial length of the fish instead.

Solution: d
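
Assigning the fish at random to the two tanks can be sketched as a shuffle of the tag numbers followed by a split. The seed is fixed only so the sketch is reproducible; the tag numbering 1 to 40 and the variable names are assumptions.

```python
import random

rng = random.Random(2006)  # fixed seed only so the sketch is reproducible

fish = list(range(1, 41))  # assumed tag numbers 1..40
rng.shuffle(fish)

# The first 20 tags in the shuffled order go to tank 1, the rest to tank 2.
tank_1 = sorted(fish[:20])
tank_2 = sorted(fish[20:])

print(tank_1)
print(tank_2)
```

Every fish has the same chance of ending up in either tank, unlike the odd/even rule in option (a), where the split is fixed by the tag numbers.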

a soybean crop. She has 20 plots of land available and she decides to use

a paired experiment – using 10 pairs of plots. Thus, she will:

(a) use a table of random numbers to divide the 20 plots into 10 pairs

and then, for each pair, flip a coin to assign the fertilizers to the 2

plots.

(b) subjectively divide the 20 plots into 10 pairs (making the plots within

a block as similar as possible) and then, for each pair, flip a coin to

assign the fertilizers to the 2 plots.

(c) use a table of random numbers to divide the 20 plots into 10 pairs

and then use the table of random numbers a second time to decide

upon the fertilizer to be applied to each pair.

(d) flip a coin to divide the 20 plots into 10 pairs and then, for each pair,

use a table of random numbers to assign the fertilizers to the 2 plots.

(e) use a table of random numbers to assign the 2 fertilizers to the 20

plots and then use the table of random numbers a second time to

place the plots into 10 pairs.

Solution: b
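
In the paired design, the pairing is done by judgment and only the assignment within each pair is random. A minimal sketch, assuming the 20 plots have already been paired as (1, 2), (3, 4), and so on, and using the hypothetical labels "A" and "B" for the two fertilizers:

```python
import random

rng = random.Random(8)  # fixed seed only for reproducibility

# Hypothetical pairing: plots 1-2, 3-4, ... stand for pairs judged to be similar.
pairs = [(2 * k + 1, 2 * k + 2) for k in range(10)]

assignment = {}
for first, second in pairs:
    # A coin flip decides which plot of the pair receives fertilizer "A".
    if rng.random() < 0.5:
        assignment[first], assignment[second] = "A", "B"
    else:
        assignment[first], assignment[second] = "B", "A"

for first, second in pairs:
    print(first, assignment[first], second, assignment[second])
```

Each pair always contains one plot of each fertilizer, so the comparison is made between plots that are as alike as possible.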

9. A student wishes to examine the effect of wing width and wing length on

the length of flight of a paper airplane. There are 4 different models of

airplanes. Which of the following is NOT correct?

(a) A factor (such as wing width) is an experimental variable under con-

trol of the experimenter.

(b) The order of flights was randomized to remove the influence of any

other variables upon the flight distance of each flight.

(c) It would be better to make four copies of each model of plane to give

some feel for the plane-to-plane variations. Flying a single copy four

times gives information about the internal variation.

(d) Interaction between two factors means that the effect of a factor at

one level depends on the level of the second factor.

(e) Planned experiments (where randomization can take place) are one of

the strongest pieces of evidence in trying to establish a causal relationship.

Solution: b - randomization does not remove influences - makes them

equal in all groups

Past performance 1996 Nov - 8% (41%-c; 18%-d; 30%-e)

10. An experiment was conducted where you flew paper airplanes after mod-

ifying wing depth and wing length. There were four different models of

airplane. One design consideration was the choice between

flying each plane four times or making four copies of each model, each of

which is flown once. Which of the following is NOT correct?

(a) Flying multiple copies of each model (i.e. separate planes of each

model) could give information on variability in flight due to fabrica-

tion effects (i.e. how you made the plane).

(b) Flying a single copy of each model four times could give information

on variability in flight due to changes in initial launch conditions.

(c) The differences in flight length among the different models gives in-

formation on the “effects” of the design factors - wing depth and wing

length.

(d) The response variable is flight length; the explanatory variables are

wing depth and wing width.

(e) Interaction between the effects of wing depth and wing width implies

that the effects of wing depth are the same for all wing widths.

Solution: e

Past performance 1997 Jul - 83%

the amount of water and seed variety upon subsequent growth of plants.

Each plant was potted in a clay pot, and a measured amount of water

was given weekly. The height of the plant at the end of the experiment

was measured. Which of the following is not correct?

(a) The response variable is the plant height.

(b) The explanatory variables are the amount of water and seed variety.

(c) Randomization was used to eliminate the effect of other possible fac-

tors upon the growth of the plants.

(d) A possible uncontrollable factor in this experiment is any nutrients

that might be present in the clay pots.

(e) Designed experiments give the best evidence of “cause-and-effect” re-

lationships.

Solution: c - randomization does not remove influences - makes them

equal in all groups

Past performance 1997 Jun - 54% (11%-b; 19%-d; 15%-e)

12. A survey was conducted by visiting a student parking lot to estimate the

proportion of cars that were red. Which of the following is NOT correct?

(a) If the sampled stall was empty, we can simply choose another stall, at

random, to take its place because it is not likely that the stall being

vacant is related to a car being red.

(b) The sample would be representative of the population if 100 cars were

chosen, regardless of whether randomization was used.

(c) Even though a random sample was taken from cars in the parking

lot, the sample may not be representative of the cars driven by SFU

students because the decision to park in B-lot is self-selected.

(d) If another sample of cars was chosen, it is likely that a different

proportion of cars that are red would be obtained.

(e) The confidence interval computed gave a 95% confidence interval for

the true proportion of cars that were red in the population of cars

that park in B-lot (assuming that the sample was selected using the

3 R’s).

Solution: b

Past performance 1997 Jun - 91%

13. A survey was done to estimate the proportion of cars that are red and are

Japanese made in the City of Vancouver by taking a random sample of

size 25 from a student parking lot at Simon Fraser University. Which of

the following is NOT CORRECT:

(a) This sample may not be representative of the cars in Vancouver be-

cause mainly students park at SFU.

(b) If the particular stall is vacant, we can simply select another stall at

random because it is unlikely that a stall being vacant is related to the

color or manufacturer of the car.

(c) It would be dangerous to simply select the first 25 stalls in the lot

closest to the Applied Science Building because there are a number

of stalls reserved for service vehicles whose primary color is white.

(d) Different students obtained different answers for their sample propor-

tions. This is an example of a sampling distribution for an estimator.

(e) The margin of error will depend upon the total number of cars in the

lot when we did the sample.

Solution: e

Past performance 1998 Nov - 76%

set of variables to allow you to distinguish among groups, e.g. in one of the

assignments, you tried to distinguish among authors based on sentence

length and other statistics. Which of the following is NOT CORRECT?

(a) We needed to adjust some variables to a “per 100 word basis” or to

a “per sentence basis” to adjust for the different number of words in

the texts where authorship is known.

(b) Potentially useful variables are selected by finding variables whose

distributions are as similar as possible for all the authors.

(c) Another example of this method might be a bank making a decision

on granting a student a loan based on characteristics such as grade

point average, past credit history, etc.

(d) We looked at many pairs of plots to find the pair of variables that

gave the best separation between the two authors.

(e) Because of natural variability, errors can always be made. However,

the goal of this analysis is to minimize the costs of misclassification.

Solution: b

Past performance 1998 Nov - 80%

15. An experiment was conducted where you tried to distinguish among

authors based on sentence length and other statistics. Which of the fol-

lowing is NOT correct?

(a) We needed to adjust some variables to a “per 100 word basis” to

adjust for the different number of words on a page.

(b) This was a simplified form of discriminant analysis where, in general,

one wishes to distinguish among groups of objects based on charac-

teristics observed.

(c) Another example of this method might be a bank making a decision

on granting a student a loan based on characteristics such as grade

point average, past credit history, etc.

(d) The polygon plot is a way of “enclosing” typical values of the statistics

for each author.

(e) Potentially useful variables are selected by finding variables whose

distributions are as similar as possible for all the authors.

Solution: e

Past performance 1997 Jul - 71% (20%-c)

16. An experiment was conducted where you analyzed the results of the plant

growth experiment after you manipulated the amount of water and seed

variety. Which of the following is correct?

(a) We randomized the plants to plots to eliminate any effect of hidden

variables.

(b) We could determine the best combination of water and seed variety

by examining the difference in the plant height in the final week of

the experiment.

(c) The variability in growth among plants of the same variety that

received the same amount of water was constant over time.

(d) The growth of a particular plant in week 3 is likely to be independent

(unrelated) of the growth of the same plant in week 2.

(e) The growth of the plants was linear over time.

Solution: b

Past performance 1997 Jul - 39% (30%-a; 11%-c; 11%-d; 7%-e)

17. The following numbers are extracted from a table of random digits:

sample of sites selected without replacement from a population of 45 sites.

The sites are labeled 01, 02, ..., 45 and she starts at the beginning of the

line of random digits and takes consecutive pairs of digits. Which of the

following is correct?

(a) Her sample is 38, 25, 02, 38, 22

(b) Her sample is 38, 68, 35, 02, 22

(c) Her sample is 38, 35, 27, 28, 08

(d) Her sample is 38, 65, 35, 02, 79

(e) Her sample is 38, 35, 02, 22, 40

Solution: e

18. We wish to draw a sample of size 5 without replacement from a population

of 50 households. Suppose the households are numbered 01, 02, ..., 50, and

suppose that the relevant line of the random number table is:

(a) households 11 13 36 62 73

(b) households 11 36 23 08 42

(c) households 11 36 23 23 08

(d) households 11 36 23 56 92

(e) households 11 35 96 90 46

Solution: b

Past performance 1998 Dec - 50% (19% c; 27% d)

Note that (c) is WITH replacement; (d) uses pairs corresponding to house

numbers not in the range 1..50
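
The rule in the note above (skip labels outside 01 to 50 and skip repeats) can be sketched as follows. The digit line is hypothetical, chosen only to be consistent with the options listed for question 18; the function name `sample_from_digits` is also an assumption.

```python
def sample_from_digits(digits, population_size, sample_size):
    """Read consecutive two-digit labels, skipping out-of-range labels and repeats."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen


# Hypothetical digit line, chosen only to be consistent with the options above.
line = "1136235692230842"
households = sample_from_digits(line, population_size=50, sample_size=5)

print(households)
```

With this line, the pairs 56 and 92 are skipped as out of range and the second 23 is skipped as a repeat.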

affected by Turner’s Syndrome was carried out recently in Vancouver. All

34 girls in the study were given the growth hormone and their heights

were measured at the time the hormone was given and again one year

later. No measurements were made on their final adult heights. Which of

the following is NOT a problem with this experiment:

(a) there was no blinding

(b) there was no control group

(c) nonresponse bias

(d) there was insufficient attention to the placebo effect

(e) Because final heights were not measured, it would be impossible to

tell if the hormone affected final height or only accelerated growth

and made no difference to final height.

Solution: c

Past performance 1998 Oct - 71%

tend to behave differently from people who respond.

(b) Non-sampling errors are often bigger than the random sampling er-

rors in surveys.

(c) Slight changes in the wording of questions can make a measurable

difference to survey results.

(d) People will sometimes answer a question differently for different in-

terviewers.

(e) Sophisticated statistical methods can always correct the results if the

population you are sampling from is different from the population of

interest, e.g. due to under-coverage.

Solution: e

Past performance 1998 Oct - 87%

total population of about 30 million) and 1000 Americans (from a total

population of about 300 million). Which of the following is FALSE?

(a) Randomization ensures that both samples are representative of their

respective populations.

(b) The precision is determined by the ratio of the sample size to the

total population size.

(c) A smaller proportion of the American population has been chosen.

Therefore, a particular person has a smaller chance of being selected

in America than in Canada.

(d) A potential stratification variable for both countries could be location

- eastern, middle, or western continental.

(e) Random digit dialing to select people for the survey could induce

biases in the results if the characteristic of interest for the survey is

related to income.

Solution: b - because precision is determined mainly by sample size

Past performance 1998 Oct - 54% (25% c)

Past performance 2006 Nov - 67% (19% c)
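
The point of the solution to question 21 can be checked numerically. The sketch below computes an approximate 95% margin of error for a proportion, including the finite population correction, and assumes a sample of 1000 from each country and a true proportion of 0.5. The function name `margin_of_error` is hypothetical.

```python
import math


def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    # Finite population correction; close to 1 when n is tiny relative to the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc


canada = margin_of_error(1000, 30_000_000)    # assumed sample of 1000
usa = margin_of_error(1000, 300_000_000)      # sample of 1000

print(round(canada, 4), round(usa, 4))
```

The two margins are about 3 percentage points and agree to several decimal places, even though the sampling fractions differ by a factor of ten.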

22. An experiment was conducted by the Schwarz family to look at the yield

of popcorn (total grams that popped when 15 g of popcorn were heated)

when two variables, the type of popcorn (gourmet or plain) and the

amount of oil (little or lots), were varied. A profile plot of the results is

below:

Which of the following is NOT CORRECT:

(a) Because the lines are not parallel, there appears to be evidence of

interaction between the two variables.

(b) The two explanatory factors are the amount of oil and the type of

popcorn. The response variable is the yield of popcorn.

(c) The difference in yield between gourmet and plain popcorn is esti-

mated to increase by about 6 g when lots of oil were used.

(d) There was little change in the yield for plain popcorn when either

little or lots of oil were used.

(e) An interaction would exist if the increase in yield from going from

little to lots of oil were the same for both types of popcorn.

Solution: e

Past performance 1998 Nov - 63% (16% a; 13% c)

bution of universities to the economy was circulated to 394 people who

the magazine decided “are the most likely to know how important are uni-

versities to the Canadian economy”. The main problem with using these

results to draw conclusions about the general public’s perception is:

(a) selection bias

(b) insufficient attention to the placebo effect

(c) no control group

(d) non-response bias

(e) interviewer bias


Solution: a

Past performance 1998 Dec - 90%

the distance origami frogs jumped. Which of the following is FALSE?

(a) This experiment had pseudo-replication because each frog was tested

multiple times.

(b) A better experiment would require us to make multiple copies of each

frog from each paper weight.

(c) Because the stiffer paper is harder to fold, a better experiment would

use a larger sheet of the stiffer paper while making a frog.

(d) A proper experiment could use 10 replicate frogs of the lighter weight

paper and only 5 replicate frogs of the stiffer paper in a completely

random order.

(e) It would be a poor experiment if two people made the frogs jump

with person A using the light weight frogs and person B using the

heavier weight frogs.

Solution: c. Option (d) is not false: there is nothing wrong with an

unbalanced design as long as proper randomization is used. In more

advanced classes you will see that the design with the best power and the

smallest se for the estimated difference has equal sample sizes, but this

does not invalidate the experiment.

Past performance 2006 Oct - 47% (45%-d)

Past performance 2006 Dec - 67% (25%-d)

portion of class who used marijuana in the last year. Each student ob-

tained a random digit between 0 and 9 (inclusive). Of those who received

the digits 0, 1, 2, 3, or 4, these students answered the question on mari-

juana usage. Of those who received the digits 5, 6, 7, 8, 9, these students

answered the question if their favorite person’s birthday was in January

to June (inclusive). We obtained a total of 150 yes and 250 no responses.

Which of the following is FALSE?

(a) We estimate that about 25% of students have used marijuana in the

last year.

(b) About 50% of people have birthdays in January-June (inclusive)

(c) Of the 150 yeses, about 66%=100/150 of these had favorite people

with birthdays in January-June (inclusive).

(d) Of people with birthdays in January-June, we estimate that about

25% used marijuana in the last year.


(e) About 37%=150/400 said yes to having used marijuana in the last

year.

Solution: e

Past performance 2006 Oct - 71% (12%-d)
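
The arithmetic behind option (a) can be sketched as follows. This is a minimal illustration, assuming that half of the students answer the sensitive question (digits 0 to 4) and that half of all birthdays fall in January to June; the function name `randomized_response_estimate` is hypothetical.

```python
def randomized_response_estimate(n_yes, n_total, p_sensitive, p_yes_innocuous):
    """Estimate the proportion who would answer yes to the sensitive question.

    Uses P(yes) = p_sensitive * pi + (1 - p_sensitive) * p_yes_innocuous
    and solves for pi.
    """
    p_yes = n_yes / n_total
    return (p_yes - (1 - p_sensitive) * p_yes_innocuous) / p_sensitive


# 150 yes out of 400 responses; each question is answered with probability 0.5.
estimate = randomized_response_estimate(150, 400, 0.5, 0.5)
print(estimate)  # 0.25
```

Of the 400 students, about 200 answer the birthday question and about 100 of those say yes, which leaves about 50 yeses among the roughly 200 students who answered the marijuana question.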

26. Recall in one assignment, you conducted a two factor experiment to com-

pare the flying distances of paper airplanes. One factor was wing length

with two levels; the second factor was wing depth, also with two levels.

Which of the following is CORRECT?

(a) A good experiment would fly all four copies of the different airplanes

in sequential order.

(b) A good experiment would control for the person launching the planes

by having the same person do all the launches.

(c) A good experiment would make a single copy of each treatment com-

bination and test each copy 10 times.

(d) A good experiment would examine the effect of paper weight on flying

by making all planes of the same weight of paper.

(e) A good experiment would order the planes by weight while running

the experiment.

Solution: b

Past performance 2006 Nov - 70%; 12% choose (c); 14% choose (d)

27. Recall in one assignment you surveyed cars in a parking lot to estimate

the proportion that were red or the proportion that were from a Japanese

manufacturer. Which of the following is NOT CORRECT?

(a) A convenience sample of the cars closest to the Applied Science build-

ing may give a biased estimate of the proportion of cars which are

from a Japanese manufacturer.

(b) Different students may get different answers for the proportion of

cars that are red.

(c) The sample proportion of cars that are red is an unbiased estimate of

the population proportion if the sampling is a simple random sample.

(d) A sample of 100 cars in a convenience sample is always better than

a sample of 20 cars from a proper random sample.

(e) A sample of 100 cars from a proper random sample will give more

precise estimates of the proportion of cars that are red than a sample

of 20 cars from a proper random sample.

Solution: d

Past performance 2006 Nov - 92%


28. Consider an experiment to investigate the efficacy of different insecticides

in controlling pests and their effects on subsequent yield. What is the best

reason for randomly assigning treatment levels (spraying or not spraying)

to the experimental units (farms)?

(a) Randomization makes the experiment easier to conduct because we

can apply the insecticide in any pattern rather than in a systematic

fashion.

(b) Randomization makes the analysis easier because the data can be

collected and entered into the computer in any order.

(c) Randomization is required by statistical consultants before they will

help you analyze the experiment.

(d) Randomization implies that it is not necessary to be careful during

the experiment, during data collection, and during data analysis.

(e) Randomization will tend to average out all other uncontrolled fac-

tors such as soil fertility so that they are not confounded with the

treatment effects.

Solution: e

Past performance 1990 Feb - 97%

Past performance 1993 Feb - 98%

Past performance 1996 Dec - 100%

Past performance 2006 Dec - 99%


Multiple Choice Questions

Inference - Paired samples on means

periment?

(a) The object of pairing (or blocking) is to account for the effect of

possible other factors (such as fertility of soils).

(b) The analysis of paired data starts by finding the difference between

the values of the pair. The order of the difference (as long as it is

consistent) is unimportant.

(c) It is crucial to recognize pairing. If pairing is not recognized, the

results will not be as accurate and precise as possible.

(d) The degrees of freedom is equal to the number of pairs - 1.

(e) Because pairing is beneficial, we can pair all data by matching the

smallest value of each sample, the second smallest value of each sam-

ple, the third smallest value of each sample, etc.

Solution: e

Past performance 1990 Dec - 65%

Past performance 1992 Dec - 93%

2. Trace metals in drinking water wells affect the flavor of the water and un-

usually high concentrations can pose a health hazard. Furthermore, the

water in a well may vary in the concentration of the trace metals depending

upon from where it is drawn. In the paper, “Trace Metals of South Indian

River Region” (Environmental Studies, 1982, 62-6), trace metal concen-

trations (mg/L) on zinc were found from water drawn from the bottom

and the top of each of 6 wells. The data follows:

Well   Bottom   Top
1      .430     .415
2      .266     .238
3      .567     .390
4      .531     .410
5      .707     .605
6      .716     .609


A 95% confidence interval for the mean difference in the zinc concentrations

in this area between water drawn from the top and bottom of wells is:

(b) .0917 ± 2.45(.061)

(c) .0917 ± 2.57(.025)

(d) .0917 ± 2.45(.025)

(e) .0917 ± 2.20(.025)

Solution: c

Past performance 1990 Dec - 64%

Past performance 1992 Dec - 75% (20%a)
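
The numbers in solution (c) can be reproduced with a short sketch. It assumes that the first data column is the bottom reading and the second is the top reading (the column headings did not survive in the text), and it takes the multiplier 2.571 from a t table with 5 degrees of freedom rather than computing it.

```python
import math
import statistics

# Assumed column order: bottom reading first, top reading second.
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
top = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]

# Paired analysis: work with the within-well differences.
diffs = [b - t for b, t in zip(bottom, top)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)

T_MULT = 2.571  # t table value, 95% confidence, n - 1 = 5 df (assumed, not computed)
lower = mean_diff - T_MULT * se_diff
upper = mean_diff + T_MULT * se_diff
print(f"{mean_diff:.4f} +/- {T_MULT} * {se_diff:.3f} -> ({lower:.3f}, {upper:.3f})")
```

The standard error is based on the standard deviation of the six differences, not on the spread of the individual readings, which is why the value .025 appears rather than .061.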


Multiple Choice Questions

Inference - Single sample on means

tion based on information contained in a sample.

(b) A statement made about a sample based on the measurements in

that sample.

(c) A set of data selected from a larger set of data.

(d) A decision, estimate, prediction or generalization about sample based

on information contained in a population.

(e) A set of data that characterizes some phenomenon.

Solution: a

RECT?

(a) If we keep the sample size fixed, the confidence interval gets wider as

we increase the confidence coefficient.

(b) A confidence interval for a mean always contains the sample mean.

(c) If we keep the confidence coefficient fixed, the confidence interval gets

narrower as we increase the sample size.

(d) If the population standard deviation increases, the confidence interval

decreases in width.

(e) If the confidence intervals for two means do not overlap very much,

there is evidence that the two population means are different.

Solution: d

Past performance 1990 Dec - 72%

Past performance 1996 Nov - 76%


3. You have measured the systolic blood pressure of a random sample of 25

employees of a company. A 95% confidence interval for the mean systolic

blood pressure for the employees is computed to be (122,138). Which of

the following statements gives a valid interpretation of this interval?

(a) About 95% of the sample of employees have a systolic blood pressure

between 122 and 138.

(b) About 95% of the employees in the company have a systolic blood

pressure between 122 and 138.

(c) If the sampling procedure were repeated many times, then approx-

imately 95% of the resulting confidence intervals would contain the

mean systolic blood pressure for employees in the company.

(d) If the sampling procedure were repeated many times, then approxi-

mately 95% of the sample means would be between 122 and 138.

(e) The probability that the sample mean falls between 122 and 138 is

equal to 0.95.

Solution: c

Past performance 1997 Aug - 40% (40%-d; 15%-e)

Past performance 1998 Nov - 57% (15%-d; 15%-b)
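
Interpretation (c) can be made concrete with a small simulation. The sketch below is an illustration only: the population mean of 130, the standard deviation of 20, and the t multiplier 2.064 (t table, 24 degrees of freedom) are assumed values, and `coverage_rate` is a hypothetical helper.

```python
import math
import random
import statistics


def coverage_rate(true_mean, sd, n, t_mult, reps, seed=1):
    """Fraction of simulated t intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        xbar = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if xbar - t_mult * se <= true_mean <= xbar + t_mult * se:
            hits += 1
    return hits / reps


rate = coverage_rate(true_mean=130, sd=20, n=25, t_mult=2.064, reps=2000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

Each repetition produces a different interval; the 95% refers to the long-run share of those intervals that capture the fixed population mean, not to the share of employees or of sample means inside any one interval.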

their summer break from studies. A random sample of students gave a

sample average of $3975 and a 95% confidence interval was found to be

($3525 < µ < $4425). This interval is interpreted to mean that:

(a) if the study were to be repeated many times, there is a 95% prob-

ability that the true average summer earnings is not $4500 as the

government claims.

(b) because our specific confidence interval does not contain the value

$4500 there is a 95% probability that the true average summer earn-

ings is not $4500.

(c) if we were to repeat our survey many times, then about 95% of all

the confidence intervals will contain the value $4500.

(d) if we repeat our survey many times, then about 95% of our confi-

dence intervals will contain the true value of the average earnings of

students.

(e) there is a 95% probability that the true average earnings are between

$3525 and $4425 for all students.

about the answer

Solution: d


5. Does playing music to dairy cattle increase their milk production? An

experiment was conducted where a group of dairy cattle was divided into

two groups. Music was played to one group; the control group did not

have music played. The average increase in production was 2.5 L/cow over

the time period in question. A 95% confidence interval for the difference

(treatment-control) in the mean production was computed to be (1.5,3.5)

L/cow. This means:

(a) 95% of the cows increased their production by between 1.5 and 3.5

L.

(b) We are 95% confident that the average increase in production in the

sample is 2.5 L/cow.

(c) Because the confidence interval does not contain zero, we are 95%

confident that there was no effect of playing music.

(d) We don’t know the true increase in production, but we are 95% con-

fident that the increase in the mean production is in this interval.

(e) Because the confidence interval does not include zero, we are 95% con-

fident that the true increase in production for all cows is 2.5 L/cow.

Solution: d

Past performance 1992 Dec - 76% (10%e)

Past performance 1996 Dec - 86%

of oats. A sample of 20 plots gave a mean yield of 2.9 t/hectare, and a

95% confidence interval of (2.48, 3.32) t/ha. This means:

(a) We are sure the true mean yield of this new variety is between 2.48

and 3.32 t/ha.

(b) We are 95% confident that the true mean yield of this variety is 2.9

t/ha.

(c) About 95% of the yields of the new variety will be between 2.48 and

3.32 t/ha.

(d) We are 95% confident that the true mean yield of this variety is

between 2.48 and 3.32 t/ha.

(e) We are 95% confident that the mean yield of 2.9 t/hectare is between

2.48 and 3.32 t/ha.

Solution: d

Past performance 1990 Dec - 87%

7. A 95 percent confidence interval for the mean time taken to process new

insurance policies is (11, 12) days. This interval can be interpreted to

mean that:


(a) only 5 percent of all policies take less than 11 or more than 12 days

to process

(b) only 5 percent of all policies take between 11 and 12 days to process

(c) about 95 out of every 100 such intervals constructed from random

samples of the same size will contain the population mean processing

time

(d) the probability is .95 that all policies take between 11 and 12 days

to process

(e) none of the above

Solution: c

unknown mean and variance. A random sample of size 25 gave a mean

2.5 cm. The 95% confidence interval had length 4 cm. Then

(b) The sample variance is 26.03.

(c) The population variance is 4.84.

(d) The population variance is 23.47.

(e) The sample variance is 23.47.

and ’small ’ sample

9. A turkey producer knows from previous experience that profits are maxi-

mized by selling turkeys when their average weight is 12 kilograms. Before

determining whether to put all their full grown turkeys on the market this

month, the producer wishes to estimate their mean weight. Prior knowl-

edge indicates that turkey weights have a standard deviation of around 1.5

kilograms. The number of turkeys that must be sampled in order to esti-

mate their true mean weight to within 0.5 kilograms with 95% confidence

is:

(a) 35

(b) 5

(c) 65

(d) 10

(e) 150


Solution: a

Past performance 1992 Dec - 85%

Past performance 1998 Nov - 85%
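
The sample-size calculation for a mean with a known standard deviation can be sketched as below. The function name `sample_size_for_mean` is hypothetical; the normal multiplier comes from Python's standard-library `statistics.NormalDist`, and the result is rounded up to the next whole unit.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Turkey question: sigma = 1.5 kg, margin = 0.5 kg, 95% confidence.
n_turkeys = sample_size_for_mean(1.5, 0.5)
print(n_turkeys)  # 35
```

Here (1.96 × 1.5 / 0.5)² is about 34.6, which rounds up to 35.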

10. A random sample of 4 Herefords, each with a frame size of three (on a

one-to-seven scale), gave a sample mean weight of 452 kg and a sample

standard deviation of 12 kg. A 95% confidence interval for the average

weight of all Herefords of this frame size is (using an “exact” confidence

interval):

(a) (435.3, 468.7)

(b) (432.9, 471.1)

(c) (440.2, 463.8)

(d) (428.5, 475.5)

(e) (436.6, 467.4)

Solution: b

Past performance 1990 Dec - 75%

Past performance 1997 Jul - 75%
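
As a worked check of solution (b), assuming the t-table value 3.182 for 95% confidence with n - 1 = 3 degrees of freedom:

```latex
\bar{x} \pm t_{0.975,\,3}\,\frac{s}{\sqrt{n}}
  = 452 \pm 3.182 \cdot \frac{12}{\sqrt{4}}
  = 452 \pm 19.09
  \;\Rightarrow\; (432.9,\ 471.1)
```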

11. Referring to the previous question, about how many animals should be

sampled (in total) in order to be 95% confident of determining the true

mean weight WITHIN 2 kg?

(a) 140

(b) 170

(c) 550

(d) 100

(e) 190

Solution: a

Past performance 1990 Dec - 72%

Past performance 1997 Jul - 60%

farm was found to be 150 bushels. If the yield in bushels per plot in

previous studies was found to be approximately normally distributed with

a variance of 400 bushels², a 98% confidence interval for the mean yield

is:

(b) (144.8, 155.2)


(c) (132.8, 167.2)

(d) (134.5, 165.5)

(e) (145.7, 154.4)

Solution: d

Past performance 1989 Dec - 61% ( 22% -b)

percent confidence interval for mean monthly family income for a large

population: ($600, $800). If the analyst had used a 99 percent confidence

coefficient instead, the confidence interval would be:

(b) wider and would involve a smaller risk of being incorrect

(c) narrower and would involve a smaller risk of being incorrect

(d) wider and would involve a larger risk of being incorrect

(e) wider but it cannot be determined whether the risk of being incorrect

would be larger or smaller

Solution: b

timber plot last year. A random sample of n = 100 seedlings is selected

and the one-year growth for each is measured. The sample results are:

x̄ = 5.62 cm and s = 2.50 cm. The 95 percent confidence interval for the

mean growth is:

(b) (4.98, 6.26)

(c) (5.13, 6.11)

(d) (5.37, 5.87)

(e) (5.57, 5.67)

Solution: c

Past performance 1989 Dec - 93%
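
A worked check of solution (c), using the large-sample normal multiplier 1.96 (a reasonable approximation with n = 100):

```latex
\bar{x} \pm 1.96\,\frac{s}{\sqrt{n}}
  = 5.62 \pm 1.96 \cdot \frac{2.50}{\sqrt{100}}
  = 5.62 \pm 0.49
  \;\Rightarrow\; (5.13,\ 6.11)
```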

biochemist prepares extracts of the mold culture and then measures the

amount of the toxic substance per gram of solution. From six preparations

of the mold culture the following observations on toxic substances (mg)

are obtained:


1.2, .8, .6, 1.1, 1.2, .8.

A 95% confidence interval for the mean amount of toxic substances is:

(b) .95 ± 1.96 (.10)

(c) .95 ± 2.57 (.25)

(d) .95 ± 1.96 (.25)

(e) .95 ± 2.02 (.10)

Solution: a

Past performance 1989 Dec - 57% (18% - c, 11% -b,d)

Past performance 1990 Dec - 58% (14% - c, 14% - b)

Past performance 1990 Dec - 63% (11% - d, 21% - c)

16. The effect of acid rain upon the yield of crops is of concern in many places.

In order to determine baseline yields, a sample of 13 fields was selected,

and the yield of barley (g/400 m²) was determined. The output from SAS

appears below:

                                          QUANTILES(DEF=4)               EXTREMES
N          13        SUM WGTS   13        100% MAX   392    99%   392    LOW   HIGH
MEAN       220.231   SUM        2863      75% Q3     234    95%   392    161   225
STD DEV    58.5721   VAR        3430.69   50% MED    221    90%   330    168   232
SKEW       2.21591   KURT       6.61979   25% Q1     174    10%   163    169   236
USS        671689    CSS        41168.3   0% MIN     161    5%    161    179   239
CV         26.5958   STD MEAN   16.245                      1%    161    205   392

(b) 220.2 ± 1.96(16.2)

(c) 220.2 ± 2.18(58.6)

(d) 220.2 ± 2.18(16.2)

(e) 220.2 ± 2.16(16.2)

Solution: d

Past performance 1989 Dec - 60% (25% - b)
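
The two ingredients of solution (d) can be read off the SAS output. The STD MEAN entry is the standard error of the mean, and the multiplier is assumed here to be the t-table value for N - 1 = 12 degrees of freedom:

```latex
\text{STD MEAN} = \frac{\text{STD DEV}}{\sqrt{N}} = \frac{58.5721}{\sqrt{13}} \approx 16.245,
\qquad t_{0.975,\,12} \approx 2.18
```

Using 1.96, as in option (b), would ignore the small sample size; using 58.6, as in option (c), would confuse the standard deviation with the standard error.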

17. The effect of salinity upon the growth of grasses is of concern in many

places where excess irrigation is causing salt to rise to the surface. In

order to determine baseline yields, a sample of 24 fields was selected, and

the biomass of grasses in a standard sized plot was measured (kg). The

output from SAS appears below:


                                         QUANTILES(DEF=4)                 EXTREMES
N          24       SUM WGTS   24        100% MAX   22.6    99%   22.6    LOW   HIGH
MEAN       9.09     SUM        218.3     75% Q3     11.45   95%   22.52   0.7   15.1
STD DEV    6.64     VARIANCE   44.0      50% MED    8.15    90%   21.8    1     19.8
SKEWNE     0.924    KURTO      -0.0209   25% Q1     3.775   10%   1.6     2.2   21.3
USS        2998     CSS        1012.73   0% MIN     0.7     5%    0.77    2.2   22.3
CV         72       STD MEAN   1.35                         1%    0.7     2.8   22.6
T:MEAN=0   6.7153   PROB>|T|   0.0001    RANGE      21.9

(b) 9.09 ± 2.0639(1.35)

(c) 9.09 ± 2.0639(6.64)

(d) 9.09 ± 2.0687(1.35)

(e) 9.09 ± 2.0687(6.64)

Solution: d

Past performance 1990 Dec - 65%

Past performance 1996 Nov - 82%
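
The choice between options (b) and (d) comes down to the degrees of freedom. Assuming standard t-table values:

```latex
\text{df} = n - 1 = 24 - 1 = 23,
\qquad t_{0.975,\,23} \approx 2.0687,
\qquad t_{0.975,\,24} \approx 2.0639
```

The multiplier 2.0639 corresponds to 24 degrees of freedom, which would be correct only for a sample of 25 fields.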

estimate its mean life. Assuming that the life of the light bulb is normally

distributed and that the standard deviation is known to be 40 hours, how

many bulbs should be tested so that we can be 90 percent confident that

the estimate of the mean will not differ from the true mean life by more

than 10 hours?

(a) 7

(b) 44

(c) 8

(d) 62

(e) 87

Solution: b

Past performance 1989 Dec - 70%

of its passengers disembarking at the Winnipeg airport, took an average

of 24.1 minutes to claim their luggage. From a previous survey it was

willing to assume that time to claim luggage is normally distributed with

a variance of 18 (min²). A 95% confidence interval for the mean time to

claim one’s luggage has endpoints:


(a) 24.1 ± 8.32

(b) 24.1 ± 3.92

(c) 24.1 ± 2.77

(d) 24.1 ± 3.26

(e) 24.1 ± 9.78

Solution: c

20. Consider the following graph of the mean yield of barley in 1980, 1984,

and 1988 along with a 95% confidence interval.

(a) Since the confidence intervals for 1984 and 1980 have considerable

overlap, there is little evidence that the sample means differ.

(b) Since the confidence intervals for 1988 and 1980 do not overlap, there

is good evidence that their respective population means differ.

(c) The sample mean for 1984 is about 195 g/400 m².

(d) The sample mean for 1988 is less than the sample mean for 1984.

(e) The estimate of the population mean in 1988 is more precise than

that for 1980 because the confidence interval for 1988 is narrower

than that for 1980.

Solution: a

Past performance 1989 Dec - 30% (41% - e, 20% - b)

Past performance 1990 Dec - 49% (39% - e)

Past performance 1990 Dec - 40% (25% - b, 32% -e)

Past performance 1991 Dec - 79% (13%-b)

Past performance 1996 Nov - 25%


21. A researcher in biochemistry is attempting to summarize the results of an

experiment. The experiment involved measuring enzyme activity under a

variety of conditions. The analysis has yielded the following statistics:

n 10

Median 157.00

Mean 163.50

Variance 45.29

Std. Deviation 6.73

Range 38.00

(b) (154.9, 159.1)

(c) (158.8, 168.2)

(d) (158.7, 168.3)

(e) (152.2, 161.8)

Solution: d

Past performance 1991 Dec - 95%

22. The United States Golf Association (USGA) tests new brands of golf balls

to assure that they meet USGA specifications. One test involves measuring

the average distance traveled when the ball is hit by a machine called

“Iron Byron”. Past tests have indicated that the standard deviation of the

distances “Iron Byron” hits golf balls is 10 meters. How many golf balls

should be hit by “Iron Byron” in order to estimate the mean distance for

a new brand with a 90% confidence interval of WIDTH 2 meters?

(a) 17

(b) 9

(c) 384

(d) 68

(e) 271

Solution: e
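
A worked check of solution (e). The key step is that a WIDTH of 2 meters means a margin of error of 1 meter; the multiplier 1.645 is the 95th percentile of the standard normal curve, as used for a 90% interval:

```latex
m = \frac{\text{width}}{2} = 1,
\qquad
n = \left(\frac{z_{0.95}\,\sigma}{m}\right)^{2}
  = \left(\frac{1.645 \times 10}{1}\right)^{2}
  \approx 270.6
  \;\Rightarrow\; n = 271
```

Treating the 2 meters as the margin instead of the width gives about 68, which is option (d).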

week taken by college students. Based on a preliminary sample he believes

that σ² is close to 2.1. How large a sample is needed if his estimate is to

be within 0.3 with probability 0.95?


(a) 183

(b) 253

(c) 64

(d) 359

(e) 90

Solution: e
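
A worked check of solution (e). Note that 2.1 is given as the variance, so it enters the formula directly rather than being squared:

```latex
n = \frac{z_{0.975}^{2}\,\sigma^{2}}{m^{2}}
  = \frac{1.96^{2} \times 2.1}{0.3^{2}}
  = \frac{8.067}{0.09}
  \approx 89.6
  \;\Rightarrow\; n = 90
```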

24. Recently, a price war has developed among retailers selling Brand X denim

jeans. A major chain buyer wishes to estimate the mean price of these

jeans during this period to compare it to the normal selling price of $20.00.

A random sample of 7 major retailers produces a mean retail price of

$13.50 with a standard deviation of $3.50. A 80% confidence interval for

the true mean retail price of Brand X jeans during the price war is:

(b) (8.46, 18.54)

(c) (11.81, 15.19)

(d) (10.00, 17.00)

(e) (11.60, 15.40)

the following statements is/are true if the sample size, n, is “large”?

(a) This interval will contain the true value of µ approximately 95 times

out of one hundred.

(b) This interval is an approximate 95% confidence interval for µ

(c) This interval is too narrow to be a useful interval estimator for µ.

(d) This interval will contain the true value of µ 997 times out of 1000.

(e) Both (a) and (b) are true.

Solution: e

only information she has right now is that the strength of a similar fastener

has a standard deviation of 35. Assuming that the new fasteners have the

same standard deviation, how many fasteners should she test so that she

can be 99% confident that the sample mean will be within ± 10 of the true

mean strength? Choose the answer that is closest to your computed value.


(a) 15

(b) 30

(c) 50

(d) 80

(e) 325

Solution: d - Note that if you use a 3 multiplier for a 99% c.i. you will

get an answer near 110.

The exact multiplier for a 99% confidence interval is 2.57 (look for the

99.5th percentile on a normal curve), which gives you an answer of 81.

is going to select a random sample of 30 accounts from Population A and

he is going to use the average amount owing in these sampled accounts as

an estimate of the average amount owing in Population A. Auditor B is

faced with a population of 10,000 accounts (Population B). He is going to

select a random sample of 30 accounts from Population B and he is going

to use the average amount owing in these sample accounts as an estimate

of the average amount owing in Population B. Other things being equal:

(a) Auditor A’s estimate will be about 10 times more accurate than

Auditor B’s estimate.

(b) Auditor B’s estimate will be about 10 times more accurate than Au-

ditor A’s estimate.

(c) Auditor A’s estimate will be about 3.16 times more accurate than

Auditor B’s estimate.

(d) Auditor B’s estimate will be about 3.16 times more accurate than

Auditor A’s estimate.

(e) the accuracy of the two estimates will be about the same.

Solution: e

Past performance 1991 Dec - 95%

28. You wish to estimate µ, the average lifetime of a particular type of battery.

You are planning to select n batteries of this type and to operate them

continuously until they fail. You have some feeling that the standard

deviation of the lifetimes should be around 20 hours, and you wish your

estimate of µ to be within 1 hour of µ with probability 0.95. How many

batteries should you select?

(a) 1537

(b) 784


(c) 40

(d) 77

(e) 1083

Solution: a - The exact answer of 1537 is found using the exact multi-

plier of 1.96 = 97.5th percentile

of the normal curve rather than the approximate multiplier of 2.

29. A statistical procedure to estimate the mean shell thickness of eggs from

chickens contaminated with PCBs obtains a point estimate of 0.70 mm

and an estimated standard error of .05 mm. This means:

(a) The standard deviation of actual shell thickness in the sample was

.05 mm.

(b) We are 95% confident that the sample mean shell thickness is accurate

to within .05 mm.

(c) An estimate of the standard deviation of the sample mean shell thick-

ness over repeated samples is .05 mm

(d) The standard deviation of the population mean over all eggs is about

.05 mm.

(e) An approximate 95% confidence interval for the sample mean shell

thickness is .70mm ± .10mm.

Solution: c - note that e refers to “sample mean”

Past performance 1996 Dec - 34% (13%-d; 45%-e)


Multiple Choice Questions

Inference - Single sample on proportions

of a certain variety of tomato seeds and tests the sample for percentage

germination. If 155 of the 200 seeds germinate, then a 95% confidence

interval for p, the population proportion of seeds that germinate is:

(b) (.717, .833)

(c) (.706, .844)

(d) (.713, .844)

(e) (.726, .833)

Solution: b
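
The interval in solution (b) can be reproduced with the usual large-sample formula for a proportion. This is a minimal sketch; the function name `proportion_ci` is hypothetical, and the normal multiplier comes from the standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Tomato seeds: 155 of 200 germinated.
low, high = proportion_ci(155, 200)
print(f"({low:.3f}, {high:.3f})")  # (0.717, 0.833)
```

The sample proportion is 0.775 with an estimated standard error of about 0.0295, so the margin of error is about 0.058.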

2. Some scientists believe that a new drug would benefit about half of all peo-

ple with a certain blood disorder. To estimate the proportion of patients

who would benefit from taking the drug, the scientists will administer it to

a random sample of patients who have the blood disorder. What sample

size is needed so that the 95% confidence interval will have a width of

0.06?

(a) 748

(b) 1,068

(c) 1,503

(d) 2,056

(e) 2,401

Solution: b

Past performance 1989 Dec - 74%
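
The sample-size calculation for a proportion can be sketched as below. The function name `sample_size_for_proportion` is hypothetical; the planning value `p_guess` is whatever prior guess the question supplies (0.5 here, since the drug is believed to benefit about half of patients), and the margin of error is half the requested interval width.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p_guess, margin, confidence=0.95):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin, using a planning value p_guess."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)


# Width of 0.06 means a margin of 0.03 on each side of the estimate.
n_patients = sample_size_for_proportion(0.5, margin=0.06 / 2)
print(n_patients)  # 1068
```

Using 0.06 as the margin rather than the width would give a sample size about four times too small.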


3. In a random sample of 800 Winnipeg automobile owners, it was found

that 480 would like to see the size of the cars reduced. A 95% confidence

interval for the proportion of all Winnipeg car owners who would like to

see smaller cars is:

(b) (0.572, 0.628)

(c) (0.532, 0.667)

(d) (0.555, 0.645)

(e) (0.560, 0.630)

Solution: a

Past performance 1991 Dec - 92%

4. A random sample of 900 individuals has been selected from a large pop-

ulation. It was found that 180 are regular users of vitamins. Thus, the

proportion of the regular users of vitamins in the population is estimated

to be 0.20. An estimate of the standard error of this estimate is:

(a) 0.1600

(b) 0.0002

(c) 0.4000

(d) 0.0133

(e) 0.0267

Solution: d

Past performance 1996 Dec - 86%
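
A worked check of solution (d), using the usual estimated standard error of a sample proportion:

```latex
\widehat{\text{se}}(\hat{p})
  = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
  = \sqrt{\frac{0.20 \times 0.80}{900}}
  = \sqrt{0.000178}
  \approx 0.0133
```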

5. A Gallup poll of 1089 adults found 326 supported the policies of a particu-

lar political party. A 95% confidence interval for the true level of support

in the entire Canadian population is:

(b) (.299, .300)

(c) (.285, .313)

(d) (.267, .332)

(e) (.273, .327)

Solution: e

Past performance 1989 Dec - 81%

Past performance 1990 Dec - 68%

Past performance 1992 Dec - 77% (12%a)

Past performance 1993 Apr - 80% (a-10%)


6. Refer to the previous question. What sample size would be needed in

order to be 95% confident that the true level of support is within .01 of

the estimated proportion, assuming that the previous poll provides us with

a reasonable estimate of the true support?

(a) 5047

(b) 9604

(c) 1089

(d) 3458

(e) 8068

Solution: e

found that about 80% favoured capital punishment. A Gallup poll of a

sample of 1089 Americans (total population of 260,000,000) also found

that 80% favoured capital punishment. Which of the following statements

is TRUE?

(a) The Canadian poll is much more accurate because a larger proportion

of the total population was surveyed.

(b) The American poll is more accurate because they have a larger total

population.

(c) Both polls are almost equally precise because they have the same

sample size and the two populations are relatively large.

(d) You cannot compare the precision of the two polls because we do not

know the confidence coefficient used.

(e) Both polls are equally precise because in both polls 871 of respondents

favoured capital punishment.

Solution: c

Past performance 1989 Dec - 88%

Past performance 1990 Dec - 81%

Past performance 1992 Dec - 77% (18%e)

Past performance 1993 Apr - 88%

Past performance 1996 Dec - 92%

Past performance 1998 Dec - 87%

television viewers who watch a particular prime-time comedy on May 24th.

The proportion is thought to be about .30 . What is the least number of

viewers that should be randomly selected to ensure that a 95% confidence

interval for the true proportion of viewers will have a WIDTH of .06 or

less ?


(a) 225

(b) 1068

(c) 267

(d) 897

(e) 683

Solution: d

in a large lot of lightbulbs. From past experience, he feels that the actual

fraction of defective bulbs should be somewhere around 0.2 . How large

a sample should be taken if he wants to estimate the true fraction within

.02 using a 95% confidence interval?

(a) 6147

(b) 24587

(c) 38416

(d) 4330

(e) 1537

Solution: e

proportion of air conditioners that have an energy efficiency ratio of at

least 8. He takes a random sample of 400 owners of air conditioners and

finds that 240 own air conditioners with energy efficiency ratio of at least

8. The width of the 95% confidence interval of the true proportion of air

conditioners that have an energy efficiency ratio of at least 8 is:

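(a) $1.96\sqrt{\frac{\frac{240}{400}\left(1-\frac{240}{400}\right)}{400}}$

(b) $1.645\sqrt{\frac{\frac{240}{400}\left(1-\frac{240}{400}\right)}{400}}$

(c) $2(1.96)\sqrt{\frac{\frac{240}{400}\left(1-\frac{240}{400}\right)}{400}}$

(d) $2(1.645)\sqrt{\frac{\frac{240}{400}\left(1-\frac{240}{400}\right)}{400}}$

(e) $\frac{240}{400} \pm 1.96\sqrt{\frac{\frac{240}{400}\left(1-\frac{240}{400}\right)}{400}}$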

Solution: c
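
Evaluating the expression in (c) numerically, with the sample proportion 240/400 = 0.6, gives the width directly:

```latex
2(1.96)\sqrt{\frac{0.6 \times 0.4}{400}}
  = 3.92 \times 0.0245
  \approx 0.096
```

The factor of 2 appears because the width spans the margin of error on both sides of the estimate.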


11. Many television viewers express doubts about the validity of certain com-

mercials. In an attempt to answer their critics, the Timex Corporation

wishes to estimate the proportion of consumers who believe what is shown

in Timex television commercials. Let p represent the true proportion of

consumers who believe what is shown in Timex television commercials. If

Timex has no prior information regarding the true value of p, how many

consumers should be included in their sample so that they will be 85%

confident that their estimate is within 0.03 of the true value of p ?

(a) 400

(b) 12

(c) 576

(d) 384

(e) 544

12. The 3-M company started a new recreation program for its employees in

the hope that a little recreation would improve an employee’s performance

at work. To determine whether the high cost of the program is justified,

the president of the company wishes to estimate the proportion of the

employees who participate in the recreational activities. In a random

sample of 200 employees, 60 were found to regularly participate in the

recreation program. A 95% confidence interval for the true proportion of

3-M employees who participate in the new recreation program is:

(b) (0.298, 0.302)

(c) (0.267, 0.333)

(d) (0.247, 0.353)

(e) (0.237, 0.364)

Solution: e

13. A random sample of married people were asked “Would you remarry your

spouse if you were given the opportunity for a second time?”; Of the

150 people surveyed, 127 of them said that they would do so. Find a

95% confidence interval for the proportion of married people who would

remarry their spouse.

(b) 0.847 ± 0.029

(c) 0.847 ± 0.048

(d) 0.847 ± 0.058

(e) 0.847 ± 0.113

Solution: d

Past performance 1990 Dec - 83%

14. A music buff wants to estimate the percentage of students at the University

of Manitoba who believe that Elvis is still alive. How many students should

he include in a random sample if he wants a 90% confidence interval that

is less than 10 percentage points wide? Choose the sample size that is

closest to your solution

(a) 68

(b) 97

(c) 269

(d) 385

(e) 1022

95th percentile of a normal curve(why?).

As well, the WIDTH is .10 which gives a plus/minus size of .05. Because

the actual proportion is not known, use .5. This gives

$n = 1.645^2 (.5)(.5)/.05^2 \approx 270$.

15. You would like to estimate the percentage of “regular users of vitamins”

in a large population and you would like your estimate to be accurate to

within 4 percentage points, 19 times out of 20. Approximately how large

should your sample size be?

(a) 600

(b) 2400

(c) 400

(d) 1000

(e) 150

Solution: a

Past performance 1990 Dec - 37% (14% - b, 14% -c, 27% - c)

Past performance 1992 Dec - 78% (13%-b)

16. In order for the confidence interval in the previous question to be valid:

(a) we must assume that we have a random sample from a normal pop-

ulation.

(b) we must assume that we have a random sample from some population

(but it need not be a normal population because of the Central Limit

Theorem).

(c) we must assume that the population is normal (but we do not require

a random sample because of the Central Limit Theorem).

(d) we do not need to assume that the population is normal nor that the

sample is random (because of the Central Limit Theorem).

(e) we must assume that we have a random sample from a dichotomous

population.

Solution: b - the Wonderful CLT (it will change your life) strikes again.

on gun control. Each person was asked if they were in favor of gun control

or not in favor of gun control - non respondents were removed from the

results. The survey found that 25% of people contacted were not in favor

of gun control laws. These results were accurate to within 3 percentage

points, 19 times out of 20. Which of the following is NOT CORRECT?

(a) The 95% confidence interval is approximately from (22% to 28%).

(b) We are 95% confident that the true proportion of people not in favor

is within 3 percentage points of 25%.

(c) In approximately 95% of polls on this issue, the confidence interval

will include 25%.

(d) If another poll of similar size were taken, the percentage of people

IN FAVOR of gun control would likely range from 72% to 78%.

(e) A properly designed poll of the same size in the United States would

have the same margin of error.

Solution: c

Past performance 1998 Nov - 25% (10% a; 15% b; 33% d; 14% e)

18. A 95% confidence interval for p the proportion of Canadian beer drinkers

who prefer Lion Red was found to be (0.236 to 0.282). Which of the

following is correct?

(a) About 95% of beer drinkers have between a 23.6% and a 28.2% chance

of drinking Lion Red.

(b) There is a 95% probability that the sample proportion lies between

0.236 and 0.282.

(c) If a second sample was taken, there is a 95% chance that its confidence

interval would contain 0.25.

(d) This confidence interval indicates that we would likely reject the hy-

pothesis H: p=0.25.

(e) we are reasonably certain that the true proportion of beer drinkers

who prefer Lion Red is between 24% and 28%.

Solution: e

Past performance 1998 Dec - 71% (15% c)

19. Refer to the previous question. Suppose that the same poll was repeated

in the United States (whose population is 10 times larger than Canada),

but in this new poll, four times the number of people were interviewed.

The resulting 95% confidence intervals will be:

(a) about 1/2 as wide as the Canadian interval

(b) about 1/4 as wide as the Canadian interval

(c) about 1/10 as wide as the Canadian interval

(d) about 4/10 times as wide as the Canadian interval

(e) the same size as the Canadian interval

Solution: a

Past performance 1998 Dec - 38% (30% b; 20% e)

If you increase the sample size by a factor of x, the CI width decreases

by a factor of sqrt(x).

The easiest way to see this is to simply compute the two se.
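
A minimal sketch of that comparison follows. The sample size `n` is a hypothetical placeholder (the original poll size is not stated), and 0.25 is used only as a representative proportion; the ratio does not depend on either choice.

```python
import math

def se_proportion(p, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

n = 1000  # hypothetical Canadian sample size, for illustration only
ratio = se_proportion(0.25, 4 * n) / se_proportion(0.25, n)
print(ratio)
```

Population size never enters the formula, which is why the tenfold larger population has no effect on the width.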

20. Suppose that we wish to estimate the proportion of Canadians who ac-

tually understand the Constitution of Canada. What is the approximate

number of Canadians who need to be sampled so that the 95% confidence

interval has a width of 2 percentage points?

(a) about 500

(b) about 1,000

(c) about 2,500

(d) about 5,000

(e) about 10,000

Solution: e

Past performance 1998 Dec - 42% (15% b; 28% c)

Multiple Choice Questions

Inference - Two independent samples on means

following different feeding programs were compared. One group contained

breast-fed infants, while the children in another group were fed by a stan-

dard baby formula without any iron supplements. Here are summary

results of blood hemoglobin levels at 12 months of age.

Breast-fed 8 13.3 1.7

Formula-fed 10 12.4 1.8

between the two populations of infants is:

(b) 0.9 ± 2.08

(c) 0.9 ± 2.13

(d) 0.9 ± 2.15

(e) 0.9 ± 1.63

Solution: d

Past performance 1989 Dec - 64% (14% a,c)

Past performance 1990 Dec - 73%

treating Stage 4 AIDS patients. A group of AIDS patients was randomly

divided into two groups. One group received the new drug; the other

group received a placebo. The difference in mean subsequent survival

(those with drugs - those without drugs) was found to be 1.04 years and

a 95% confidence interval was found to be 1.04 ± 2.37 years. Based upon

this information:

(a) We can conclude that the drug was effective because those taking the

drug lived, on average, 1.04 years longer.

(b) We can conclude that the drug was ineffective because those taking

the drug lived, on average, 1.04 years less.

(c) We can conclude that there is no evidence the drug was effective

because the 95% confidence interval covers zero.

(d) We can conclude that there is evidence the drug was effective because

the 95% confidence interval does not cover zero.

(e) We can make no conclusions because we do not know the sample size

nor the actual mean survival of each group.

Solution: c

Past performance 1990 Dec - 79%

Past performance 1998 Dec - 77%

Past performance 2006 Dec - 85%

supermarket to measure the percentage of fat present in the meat, with

the following summary data.

           Outlet 1   Outlet 2

n              5         10

mean         10.3       10.7   percent

std.dev       1.6        2.3   percent

Hence, the pooled standard deviation is:

(a) 1.95

(b) 2.08

(c) 4.38

(d) 2.09

(e) 2.11

Solution: e

Past performance 1989 Dec - 72%

4. The degrees of freedom of the pooled estimate in the previous question is:

(a) 15

(b) 13

(c) 7.5

(d) 5

(e) 10

Solution: b

Past performance 1989 Dec - 90%
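
The pooled standard deviation and its degrees of freedom from the two outlet questions can be reproduced with a short sketch. The function name `pooled_sd` is an illustrative choice, not something from the original notes.

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation and its degrees of freedom for two samples."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    return math.sqrt(pooled_var), df

sp, df = pooled_sd(5, 1.6, 10, 2.3)
print(round(sp, 2), df)
```

Each sample variance is weighted by its own degrees of freedom, so the larger sample pulls the pooled value toward 2.3.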

in an introductory statistics course. Students in one section taught by

instructor A received no assignments. Students in another section taught

by instructor B, received assignments. The final grade of each student was

recorded. A 95% confidence interval for the difference in the mean grades

(Section A - Section B) was computed to be −3.5 ± 1.8. This means:

(a) There is evidence that doing assignments improves the average grade

because the difference in the population means is less than zero.

(b) There is little evidence that doing assignments improves the average

grade because the 95% confidence interval does not cover 0.

(c) There is evidence that doing assignments improves the average grade

because the 95% confidence interval does not cover 0.

(d) There is evidence that doing assignments does not improve the aver-

age grade because the 95% confidence interval does not cover 0.

(e) There is little evidence that doing assignments does not improve the

average grade because the 95% confidence interval does cover 0.

Solution: c

Past performance 1989 Dec - 73%

number of dental caries (cavities) in children. A sample of children was

(with parental consent) entered into a study and followed for several years.

Each child was classified as a sweetened-cereal lover or a non-sweetened

cereal lover. At the end of the study, the amount of tooth damage was

measured. Here is the summary data:

Sugar Bombed 10 6.41 5.0

No sugar 15 5.20 15.0

tooth damage is:

(a) $(6.41 - 5.20) \pm 2.26 \sqrt{\frac{5}{10} + \frac{15}{15}}$

(b) $(6.41 - 5.20) \pm 2.26 \sqrt{\frac{25}{10} + \frac{225}{15}}$

(c) $(6.41 - 5.20) \pm 1.96 \sqrt{\frac{25}{10} + \frac{225}{15}}$

(d) $(6.41 - 5.20) \pm 2.07 \sqrt{\frac{146}{10} + \frac{146}{15}}$

(e) $(6.41 - 5.20) \pm 1.96 \sqrt{\frac{146}{10} + \frac{146}{15}}$

Solution: b

Past performance 1990 Dec - 55%

the prevention of tapeworms in the stomachs of a new breed of sheep.

Samples of size 5 and 8 from each breed were given the drug and the two

sample means were 28.6 and 40.0 worms/sheep. From previous studies, it

is known that the variances in the two groups are 198 and 232, respectively,

and that the number of worms in the stomachs has an approximate normal

distribution. A 95% confidence interval for the difference in the mean

number of worms per sheep is:

(b) 11.4 ± 18.2

(c) −11.4 ± 17.9

(d) 11.4 ± 16.2

(e) −11.4 ± 16.6

Solution: d

Past performance 1989 Dec - 43% (27% -a)
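
Because the variances are treated as known, a z-based interval applies. The sketch below assumes the difference is taken as (second sample minus first), which matches the sign of the printed answer; the function name is hypothetical.

```python
import math

def z_interval_known_variances(mean_a, var_a, n_a, mean_b, var_b, n_b, z=1.96):
    """Difference (mean_a - mean_b) and its margin when variances are known."""
    diff = mean_a - mean_b
    margin = z * math.sqrt(var_a / n_a + var_b / n_b)
    return diff, margin

diff, margin = z_interval_known_variances(40.0, 232, 8, 28.6, 198, 5)
print(round(diff, 1), round(margin, 1))
```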

8. A researcher wants to see if birds that build larger nests lay larger eggs.

She selects two random samples of nests: one of small nests and the other

of large nests. She weighs one egg from each nest. The data are summa-

rized below.

sample size 60 159

sample mean (g) 37.2 35.6

sample variance 24.7 39.0

A 95% confidence interval for the difference between the average mass of

eggs in small and large nests.

(b) 1.6 ± 1.48 = (0.12, 3.08)

(c) 1.6 ± 1.59 = (0.01, 3.19)

(d) 1.6 ± 1.76 = (−0.16, 3.36)

(e) 1.6 ± 7.31 = (−5.71, 8.91)

Solution: c

Past performance 1992 Dec - 82%

within 1.0 g of the true value. What approximate sample size is

needed for each group?

(a) 240

(b) 60

(c) 8000

(d) 2000

(e) 125

Solution: a

Past performance 1992 Dec - 79%
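
Both egg-mass answers can be checked with one sketch. It uses the summary figures from the table (sample sizes 60 and 159, variances 24.7 and 39.0), z = 1.96, and assumes equal group sizes for the planning question; the variable names are illustrative.

```python
import math

z = 1.96
var_small, n_small = 24.7, 60
var_large, n_large = 39.0, 159

# Margin of the large-sample interval for the difference in means.
margin = z * math.sqrt(var_small / n_small + var_large / n_large)

# Per-group n so that z * sqrt((var_small + var_large) / n) <= 1.0 g.
target = 1.0
n_per_group = math.ceil(z**2 * (var_small + var_large) / target**2)

print(round(margin, 2), n_per_group)
```

The planning calculation gives roughly 245 per group, which is closest to the listed choice of 240.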

A researcher wants to see if birds that build larger nests lay larger eggs.

She selects two random samples of nests: one of small nests and the other

of large nests. She measures one egg from each nest. The data are sum-

marized below.

10. Refer to the 95% confidence interval circled on the output. This means:

(a) We are 95% confident that the sample mean egg size in large nests is

between 37 and 40 mm if the survey was repeated.

(b) If the survey was repeated, we are 95% confident that egg sizes in

large nests are between 37 and 40 mm.

(c) We are 95% confident that nests will have large eggs between 37

and 40 mm if the survey was repeated.

(d) We are 95% confident that the true mean egg size for large nests is

between 37 and 40 mm.

(e) We are 95% confident that repeated surveys will have population

means between 37 and 40 mm.

Solution: d

Past performance 2006 Dec - 61% (19%-a; 12%-b)

(a) Because the 95% confidence interval for the difference in means in-

cludes zero, there is no evidence of a difference in the mean egg size.

(b) Because the one-sided p-value is .18, there is no evidence of a differ-

ence in mean egg sizes.

(c) Because the confidence intervals for the two groups have a great deal

of overlap, there is no evidence of a difference in the mean egg size.

(d) Because the individual values of the egg sizes for the two groups

have a great deal of overlap, there is no evidence of a difference in

the means.

(e) Because the 95% confidence intervals for the mean egg sizes are

approximately equal in width, the two estimates are about equally

precise.

Solution: d

Past performance 2006 Dec - 58% (14%-a; 15%-b; 19%-d)

Multiple Choice Questions

Inference - Two independent samples on

proportions

1. Two surveys were conducted before and after the recent Autopac rate

increases to find the proportion of voters who state they would vote for

the current government. The results were as follows:

No. surveyed                       400   600   1000

No. in favor of current gov't      150   150    300

(a) $(.375 - .250) \pm 1.96 \sqrt{\frac{(.375)(.625)}{400} + \frac{(.250)(.750)}{600}}$

(b) $(.375 - .250) \pm 1.96 \sqrt{\frac{(.375)(.625)}{1000} + \frac{(.250)(.750)}{1000}}$

(c) $(.375 - .250) \pm 1.96 \sqrt{\frac{(.300)(.700)}{400} + \frac{(.300)(.700)}{600}}$

(d) $(.375 - .250) \pm 1.96 \sqrt{\frac{(.300)(.700)}{1000} + \frac{(.300)(.700)}{1000}}$

(e) $(.375 - .250) \pm 1.96 \sqrt{\frac{(.375)(.625)}{500} + \frac{(.250)(.750)}{500}}$

Solution: a

Past performance 1992 Dec - 97%

2. The above confidence intervals are of the order ±6 percentage points. What

sample size for each poll would be needed so that we are 95% confident

of being within 2 percentage points of the true difference assuming that

the above proportions are reasonable estimates of the proportions in the

population?

(a) 6,000

(b) 1,000

(c) 15,000

(d) 2,000

(e) 4,000

Solution: e

Past performance 1992 Dec - 73%
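
The two survey questions above can be verified together. This is a sketch under the usual large-sample assumptions with z = 1.96 and equal sample sizes for the planning step; the helper names are made up for illustration.

```python
import math

def diff_margin(p1, n1, p2, n2, z=1.96):
    """Margin of error for the difference of two independent proportions."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def n_per_poll(p1, p2, margin, z=1.96):
    """Equal per-poll sample size that achieves the requested margin."""
    return math.ceil(z**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / margin**2)

current = diff_margin(0.375, 400, 0.250, 600)
needed = n_per_poll(0.375, 0.250, margin=0.02)
print(round(current, 3), needed)
```

The current margin comes out near 0.06, and the required size is a little over 4,000 per poll.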

3. Two surgical procedures are widely used to treat a certain type of cancer.

To compare the success rates of the two procedures, a random sample

from each type of procedure is obtained, and the number of patients with

no reoccurrence of the disease after 1 year was recorded. Here is the data.

n No occurrence

Procedure A 100 78

Procedure B 120 102

(b) .07 ± .0054

(c) .07 ± .103

(d) .07 ± .115

(e) .07 ± .059

Solution: c

Past performance 1989 Dec - 78%

4. There may be a cure for male pattern baldness (at least millions of males

hope there will be) using the blood pressure drug Minoxidil. A group of

males was randomly assigned to two groups. One group received topi-

cal applications of the drug; the other group received applications of an

identical looking placebo. The summary data

                   Sample Size   Number with New Hair Growth

Minoxidil group        310                  100

Placebo group          100                   25

showing new hair growth is:

(b) .073 ± .048

(c) .073 ± .024

(d) .073 ± .051

(e) .073 ± .099

Solution: e

that is currently in use. Two rooms of equal size are sprayed with the

same amount of spray, one room with Type A and the other with Type

B. Two hundred insects are released into each room, and after one hour

the numbers of dead insects are counted. The results are given in the

following table:

SPRAY A SPRAY B

Total number of insects 200 200

Total number of dead insects 140 100

A 90% confidence interval for the difference in the rates of kill for the two

sprays, is:

(a) $.2 \pm 1.645 \sqrt{\frac{.46}{200}}$

(b) $.2 \pm 1.645 \sqrt{\frac{.48}{200}}$

(c) $.2 \pm 1.96 \sqrt{\frac{.46}{200}}$

(d) $.2 \pm 1.96 \sqrt{\frac{.48}{200}}$

(e) $.2 \pm 2.326 \sqrt{\frac{.48}{200}}$

Solution: a

Past performance 1990 Dec - 78%

the difference in success rate very accurately, i.e. to be 95% sure that the

estimated difference is within 0.01 of the true difference. If both vaccines

are expected to have an approximate success rate of 80%, then the required

sample size for each group is obtained by solving:

(a) $.01 = 1.96 \sqrt{\frac{.8(.2)}{n} + \frac{.8(.2)}{n}}$

(b) $.02 = 1.96 \sqrt{\frac{.8(.2)}{n} + \frac{.8(.2)}{n}}$

(c) $.01 = 1.96 \sqrt{\frac{.5(.5)}{n} + \frac{.5(.5)}{n}}$

(d) $.02 = 1.96 \sqrt{\frac{.5(.5)}{n} + \frac{.5(.5)}{n}}$

Solution: a

Past performance 1989 Dec - 80%

the difference in success rate very accurately, i.e. to be 95% sure that the

estimated difference is within 0.01 of the true difference. If both vaccines

are expected to have an approximate success rate of 80%, then the required

sample size is:

(b) about 1500 in each group for a total of 3000 people.

(c) about 3000 in each group for a total of 6000 people.

(d) about 6000 in each group for a total of 12000 people.

(e) about 12000 in each group for a total of 24000 people.

Solution: e

Past performance 1990 Dec - 32% ( 12% - b, 14% - c, 36% - d, 31% - e)

Multiple Choice Questions

Probability - Binomial

1. A random sample of 15 people is taken from a population in which 40%

favour a particular political stand. What is the probability that exactly 6

individuals in the sample favour this political stand?

(a) 0.4000

(b) 0.5000

(c) 0.4000

(d) 0.2066

(e) 0.0041

Solution: d
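
For readers who prefer to compute rather than look up a table, here is a minimal sketch of the binomial probability formula. The function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

prob_six = binom_pmf(6, 15, 0.40)
print(round(prob_six, 4))
```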

2. Experience has shown that a certain lie detector will show a positive read-

ing (indicates a lie) 10% of the time when a person is telling the truth and

95% of the time when a person is lying. Suppose that a random sample of

5 suspects is subjected to a lie detector test regarding a recent one-person

crime. Then the probability of observing no positive reading if all suspects

plead innocent and are telling the truth is

(a) 0.409

(b) 0.735

(c) 0.00001

(d) 0.591

(e) 0.99999

Solution: d

1 PROBABILITY - BINOMIAL DISTRIBUTION

3. It has been estimated that about 30% of frozen chicken contain enough

salmonella bacteria to cause illness if improperly cooked. A consumer

purchases 12 frozen chickens. What is the probability that the consumer

will have more than 6 contaminated chickens?

(a) .961

(b) .118

(c) .882

(d) .039

(e) .079

Solution: d

Past performance 1989 Dec - 74%

Past performance 1990 Oct - 68%

Past performance 1992 Oct - 93%

Past performance 1997 Aug - 91%
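
"More than 6" means 7 through 12, so the answer is an upper-tail sum. The sketch below adds the individual binomial terms directly; `binom_upper_tail` is a hypothetical helper name.

```python
from math import comb

def binom_upper_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

tail = binom_upper_tail(7, 12, 0.30)
print(round(tail, 3))
```

A common slip is to start the sum at 6, which answers "6 or more" instead.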

frozen chickens from a supplier. Find an approximate 95% interval for the

number of frozen chickens that may be contaminated.

(a) (90, 510)

(b) (285, 315)

(c) (0, 730)

(d) (270, 330)

(e) (255, 345)

Solution: d

Past performance 1990 Oct - 74%

Past performance 1997 Aug - 81% (13%-b)

tion?

(a) All trials must be identical.

(b) All trials must be independent.

(c) Each trial must be classified as a success or a failure.

(d) The number of successes in the trials is counted.

(e) The probability of success is equal to .5 in all trials.

Solution: e

Past performance 1990 Oct - 84%

Past performance 1996 Nov - 97%

6. It has been estimated that as many as 70% of the fish caught in certain

areas of the Great Lakes have liver cancer due to the pollutants present.

Find an approximate 95% range for the number of fish with liver cancer

present in a sample of 130 fish.

(a) (80, 102)

(b) (86, 97)

(c) (63, 119)

(d) (36, 146)

(e) (75, 107)

Solution: a

Past performance 1989 Dec - 83%

Past performance 1991 Oct - 56% (11%d, 20% e)

Past performance 1992 Oct - 78%
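
An approximate 95% range of this kind is usually taken as the mean plus or minus two standard deviations of the binomial count. The sketch below assumes that convention; the function name is illustrative.

```python
import math

def binomial_two_sd_range(n, p):
    """Approximate 95% range for a binomial count: mean +/- 2 standard deviations."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean - 2 * sd, mean + 2 * sd

low, high = binomial_two_sd_range(130, 0.70)
print(round(low, 1), round(high, 1))
```

The limits fall just inside 80 and 102, which is why choice (a) is the closest option.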

which are alike, and is asked to pick out the odd one by testing. If a tester

has no well developed sense and can pick the odd one only, by chance,

what is the probability that in five trials he will make four or more correct

decisions?

(a) 11/243

(b) 1/243

(c) 10/243

(d) 233/243

(e) 232/243

Solution: a

1/4. If a random sample of 6 items is taken from the output of this

machine, what is the probability that there will be 5 or more defectives in

the sample?

(a) 1/4096

(b) 3/4096

(c) 4/4096

(d) 18/4096

(e) 19/4096

Solution: e

0.20. If a random sample of 6 items is taken from the output of this

machine, what is the probability that there will be 5 or more defectives in

the sample?

(a) .0001

(b) .0154

(c) .0015

(d) .2458

(e) .0016

Solution: e

10. Suppose 60% of a herd of cattle is infected with a particular disease. Let Y

= the number of non-diseased cattle in a sample of size 5. The distribution

of Y is

(a) binomial with n = 5 and p = 0.6

(b) binomial with n = 5 and p = 0.4

(c) binomial with n = 5 and p = 0.5

(d) the same as the distribution of X, the number of infected cattle.

(e) Poisson with λ = .6

Solution: b

11. Fifteen percent of new residential central air conditioning units installed

by a supplier need additional adjustments requiring a service call. Assume

that a recent sample of seven such units constitutes a Bernoulli process.

Interest centers on X, the number of units among these seven that need

additional adjustments. The mean and variance of X are, respectively

(a) .15; .85

(b) .15; 1.05

(c) .15; .8925

(d) 1.05; .1275

(e) 1.05; .8925

Solution: e - remember variance = (std dev) squared
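
The mean and variance of a binomial count are np and np(1 - p). A short sketch with the numbers from this question (variable names are illustrative):

```python
n, p = 7, 0.15

mean = n * p            # expected number of units needing adjustment
variance = n * p * (1 - p)

print(round(mean, 2), round(variance, 4))
```

Computed this way the variance is 0.8925.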

12. If you buy one ticket in the Provincial Lottery, then the probability that

you will win a prize is 0.11. If you buy one ticket each month for five

months, what is the probability that you will win at least one prize?

(a) 0.55

(b) 0.50

(c) 0.44

(d) 0.45

(e) 0.56

Solution: c

13. Suppose that the probability that a cross between two varieties will express

a particular gene is 0.20. What is the probability that in 8 progeny plants,

two or fewer plants will express the gene?

(a) .2936

(b) .3355

(c) .1678

(d) .6291

(e) .7969

Solution: e

Past performance 1989 Oct - 95%

14. Refer to the previous question. Suppose that 120 crosses are bred. Find

a likely 95% range for the number of progeny that will express the gene.

(a) 24 ± 19.2

(b) 24 ± 4.4

(c) 24 ± 8.8

(d) 24 ± 4.9

(e) 24 ± 9.8

Solution: c

Past performance 1989 Oct - 65%

15. Seventeen people have been exposed to a particular disease. Each one

independently has a 40% chance of contracting the disease. A hospital

has the capacity to handle 10 cases of the disease. What is the probability

that the hospital’s capacity will be exceeded?

(a) .965

(b) .035

(c) .989

(d) .011

(e) .736

Solution: b

Past performance 1991 Oct - 75%

Past performance 1993 Feb - 59% (c-14%; d-14%)

Past performance 1993 Apr - 70%

Past performance 1996 Nov - 90%

Past performance 1998 Nov - 88%

16. Refer to the previous problem. Planners need to have enough beds avail-

able to handle a proportion of all outbreaks. Suppose a typical outbreak

has 100 people exposed, each with a 40% chance of coming down with the

disease. Which is not correct:

(a) This experiment satisfies the assumptions of a binomial distribution.

(b) About 95% of the time, between 30 and 50 people will contract the

disease.

(c) Almost all of the time, between 25 and 55 people will contract the

disease.

(d) On average, about 40 people will contract the disease.

(e) Almost all of the time, less than 40 people will be infected.

Solution: e

Past performance 1993 Feb - 73% (d-13%)

Past performance 1996 Nov - 80% (d- 8%)

Past performance 1998 Nov - 87%

17. There are 10 patients on the Neo-Natal Ward of a local hospital who are

monitored by 2 staff members. If the probability (at any one time) of a

patient requiring emergency attention by a staff member is .3, assuming

the patients behave independently, what is the probability at any

one time that there will not be sufficient staff to attend all emergencies?

(a) .3828

(b) .3000

(c) .0900

(d) .9100

(e) .6172

Solution: e

18. A newborn baby whose Apgar score is over 6 is classified as normal and

this happens in 80% of births. As a quality control check, an auditor

examined the records of 100 births. He would be suspicious if the number

of normal births in the sample of 100 births fell above the upper limit of

a “95%-normal-range”. What is this upper limit?

(a) 112

(b) 72

(c) 88

(d) 8

(e) none of these

Solution: c

Past performance ???? 73% (18% -e)

19. Refer to the previous question. Babies that have Apgar scores of 6 or lower

require more expensive medical care. What is the probability that in the

next 10 births, 3 or more babies will have Apgar scores of 6 or lower?

(a) .2013

(b) .3222

(c) .9999

(d) .0001

(e) .1536

Solution: b

Past performance ???? 48% (19%-c; 11%-d; 14%-e)

20. Newsweek in 1989 reported that 60% of young children have blood lead

levels that could impair their neurological development. Assuming that a

class in a school is a random sample from the population of all children at

risk, the probability that at least 5 children out of 10 in a sample taken

from a school may have a blood level that may impair development is:

(a) about .25

(b) about .20

(c) about .84

(d) about .16

(e) about .64

Solution: c

Past performance 1998 Dec - 80%

21. Refer to the previous problem. The total number of children in the school

is about 400. In order to estimate the cost of treating all the children at

one school, the health board wishes to be reasonably sure of the upper

limit on the number of children affected. This upper limit is:

(b) about 350

(c) about 240

(d) about 400

(e) about 250

Solution: a

Past performance 1998 Dec - 72% (15% c)

22. Consider 8 blood donors chosen randomly from a population. The prob-

ability that the donor has type A blood is .40. Which of the following is

CORRECT?

(a) The probability of 1 or fewer donors having type A blood is about

.11.

(b) The probability of 7 or more donors NOT having type A blood is

about .0087.

(c) The probability of exactly 5 donors having type A blood is about .28.

(d) The probability of exactly 5 donors NOT having type A blood is

about .12.

(e) The probability that between 3 and 5 donors (inclusive) will have

type A blood is about .37.

Solution: a

Past performance 2006 Nov - 84%

Past performance 2006 Dec - 79%

23. Consider 100 blood donors chosen randomly from a population where the

probability of type A is 0.40. What is the approximate probability that

at least 43 donors will have type A blood?

(a) about .43

(b) about .62

(c) about .73

(d) about .27

(e) about .38

Solution: d

Past performance 2006 Nov - 64%

Past performance 2006 Dec - 58% (27%-c)
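
The approximate answer comes from replacing the binomial count with a normal curve having the same mean and standard deviation. The sketch below shows the plain approximation and, as an optional refinement not used in the printed answer, the continuity-corrected version.

```python
import math
from statistics import NormalDist

n, p = 100, 0.40
mean = n * p
sd = math.sqrt(n * p * (1 - p))
normal = NormalDist(mean, sd)

plain = 1 - normal.cdf(43)        # no continuity correction
corrected = 1 - normal.cdf(42.5)  # continuity correction for "at least 43"

print(round(plain, 2), round(corrected, 2))
```

The plain version reproduces the "about .27" choice; the corrected version is a few points higher.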

Multiple Choice Questions

Probability - Expected Value

1. Cans of soft drinks cost $0.30 in a certain vending machine. What is the

expected value and variance of daily revenue (Y) from the machine, if X,

the number of cans sold per day has E(X) = 125, and Var(X) = 50?

(a) E(Y) = 37.5, Var(Y) = 50

(b) E(Y) = 37.5, Var(Y) = 4.5

(c) E(Y) = 37.5, Var(Y) = 15

(d) E(Y) = 37.5, Var(Y) = 15

(e) E(Y) = 125, Var(Y) = 4.5

Solution: b - remember variance = (std dev)^2
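
The rule at work is that for Y = aX, E(Y) = a E(X) and Var(Y) = a^2 Var(X). A minimal sketch with the numbers from this question (variable names are illustrative):

```python
import math

price = 0.30              # dollars per can
mean_x, var_x = 125, 50   # cans sold per day

mean_y = price * mean_x
var_y = price**2 * var_x  # the constant is squared for the variance

print(round(mean_y, 2), round(var_y, 2))
```

Multiplying the variance by 0.30 instead of 0.30 squared gives the tempting wrong answer of 15.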

2. A crop insurance company establishes the following loss table based upon

previous claims

probability | .90 .05 .02 ????

loss in $/hectare is approximately:

(a) 5.2

(b) 7.9

(c) 4.5

(d) 37.5

(e) 25.0

Solution: b

Past performance 1990 Oct - 57%

Past performance 1992 Oct - 92%

Past performance 2006 Nov - 68%

3. A rock concert producer has scheduled an outdoor concert. If it is warm

that day, she expects to make a $20,000 profit. If it is cool that day, she

expects to make a $5,000 profit. If it is very cold that day, she expects to

suffer a $12,000 loss. Based upon historical records, the weather office has

estimated the chances of a warm day to be .60; the chances of a cool day

to be .25. What is the producer’s expected profit?

(a) $5,000

(b) $13,000

(c) $15,050

(d) $13,250

(e) $11,450

Solution: e

Past performance 1989 Apr - 92%

Past performance 1997 Aug - 93%
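
The expected profit is the probability-weighted sum of the outcomes, once the missing probability of a very cold day is filled in so the three probabilities total 1. A short sketch (names are illustrative):

```python
import math

p_warm, p_cool = 0.60, 0.25
p_cold = 1 - p_warm - p_cool  # probabilities must sum to 1

outcomes = [
    (20_000, p_warm),
    (5_000, p_cool),
    (-12_000, p_cold),  # a loss enters as a negative profit
]
expected_profit = sum(profit * prob for profit, prob in outcomes)
print(round(expected_profit))
```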

The projected annual cash flow for the new location is:

Annual Cash Flow   $10,000   $30,000   $70,000   $90,000   $100,000

Probability           0.10      0.15      0.50      0.15          ?

(a) $12,800

(b) $64,000

(c) $70,000

(d) $60,000

(e) $50,000

Solution: b

Past performance 1997 Jul - 99%

the next year on a particular model of car:

prob | .60 .05 .13 ????

(a) $155

(b) $595

(c) $875

(d) $645

(e) $495

Solution: b

Past performance 1989 Oct - 91%

Past performance 1991 Oct - 90%

Past performance 1993 Feb - 96%

Past performance 1996 Dec - 96%

6. Before planting a crop for the next year, a producer does a risk assess-

ment. According to her assessment, she concludes that there are three

possible net outcomes: a $7,000 gain, a $4,000 gain, or a $10,000 loss with

probabilities 0.55, 0.20 and 0.25 respectively. The expected profit is:

(a) $3,850

(b) $0

(c) $2,150

(d) $2,500

(e) $800

Solution: c

Past performance 1992 Dec - 97%

profit of $10,000 with probability 3/20, to make a profit of $5,000 with

probability 9/20, to break even with probability 1/4 and to lose $5,000

with probability 3/20. The expected profit in dollars is:

(a) 1,500

(b) 0

(c) 3,000

(d) 3,250

(e) - 1,500

Solution: c

Past performance 1989 Dec - 96%

Suppose that the following is the distribution of the length of stay in a

hospital after a minor operation:

Days 2 3 4 5 6

Prob .05 .20 .40 .20 ?

(a) .15

(b) .17

(c) 3.3

(d) 4.0

(e) 4.2

Solution: e

Past performance 1993 Apr - 74% (a-13%)

Past performance 1996 Dec - 92%

Past performance 1998 Dec - 95%

conditions: The replacement cost ($5000) will be paid for a total loss. If

it is not a total loss, but the damage is more than $2000, then $1500 will

be paid. Nothing will be paid for damage costing $2000 or less and of

course nothing is paid out if there is no damage. The company estimates

the probability of the first three events as .02, .10, and .30 respectively.

The amount the company should charge if it wishes to make a profit of

$50 above the expected amount paid out in a year is:

(a) $250

(b) $201

(c) $300

(d) $1200

(e) $165

Solution: c

Past performance 1998 Nov - 77%
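
The premium is the expected payout plus the desired profit. The sketch below encodes the three payout events described in the question; the event with no damage pays nothing and so drops out of the sum.

```python
import math

# (payout in dollars, probability) for total loss, damage over $2000,
# and damage of $2000 or less.
payouts = [(5000, 0.02), (1500, 0.10), (0, 0.30)]

expected_payout = sum(amount * prob for amount, prob in payouts)
premium = expected_payout + 50  # add the target profit

print(round(expected_payout), round(premium))
```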

Multiple Choice Questions

Probability - General

1. The probability that the Red River will flood in any given year has been

estimated from 200 years of historical data to be one in four. This means:

(a) The Red River will flood every four years.

(b) In the next 100 years, the Red River will flood exactly 25 times.

(c) In the last 100 years, the Red River flooded exactly 25 times.

(d) In the next 100 years, the Red River will flood about 25 times.

(e) In the next 100 years, it is very likely that the Red River will flood

exactly 25 times.

2. The chances that you will be ticketed for illegal parking on campus are about

1/3. During the last nine days, you have illegally parked every day and

have NOT been ticketed (you lucky person)! Today, on the 10th day, you

again decide to park illegally. The chances that you will be caught are:

(a) greater than 1/3 because you were not caught in the last nine days.

(b) less than 1/3 because you were not caught in the last nine days.

(c) still equal to 1/3 because the last nine days do not affect the proba-

bility.

(d) equal to 1/10 because you were not caught in the last nine days.

(e) equal to 9/10 because you were not caught in the last nine days.

3. The chance that a person will contract AIDS after a sexual contact with

an infected partner has been estimated to be 1/4. This means:

(a) A person will be infected after exactly 4 sexual contacts with infected

partners.

(b) Of 1000 people having sexual contacts with infected partners, exactly

250 will become infected.

(c) Of 200 people having sexual contacts with infected partners, about

50 will become infected.

(d) In exactly 25% of all sexual contacts with infected partners, the in-

fection will spread.

(e) Of 20 people having sexual contacts with infected partners, it is very

likely that exactly 5 people will become infected.

4. A random variable Y has the following distribution:

Y | -1 0 1 2

P(Y)| 3C 2C 0.4 0.1

(a) 0.10

(b) 0.15

(c) 0.20

(d) 0.25

(e) 0.75

5. A random variable X has a probability distribution as follows:

r | 0 1 2 3

P(R=r) | 2k 3k 13k 2k

(a) .90

(b) .25

(c) .65

(d) .15

(e) 1.00

6. Suppose that the allele for tallness (T) is dominant over shortness (t); that

for Yellow (Y) is dominant over green (y); and that for roundness (W) is

dominant over wrinkled(w). Suppose we cross two plants with genotypes

TTYyWw and TtYyWw. The probability of a Tall, Yellow, Round plant

is:

(a) 9/16

(b) 3/32

(c) 1/16

(d) 9/32

(e) 3/16

© 2006 Carl James Schwarz

7. It has been estimated that about 20% of people between the ages of 18

and 25 have used marijuana in the last year. Which of the following is

CORRECT about this statement?

(a) Five people of this age group were randomly selected. This means

that exactly one of them must have used marijuana in the last year.

(b) Twenty people were randomly selected from this age group. Eighteen

of them used marijuana in the last year. The next person selected at

random will have a lower probability of using marijuana.

(c) Ten people were randomly selected from this age group. None of

them have used marijuana in the last year. The next person selected

must have a higher probability of using marijuana in the last year.

(d) A thousand people from this age group were randomly selected. It is

not unusual to find that 217 of them have used marijuana in the last

year.

(e) A million people from this age group were randomly selected. There

must be exactly 200,000 of them that have used marijuana in the last

year.

All human blood can be “ABO” typed as belonging to one of A, B, O, or

AB types. The actual distribution varies slightly among different groups

of people, but for a randomly chosen person from North America, the

following are the approximate probabilities:

Blood type O A B AB

Probability .45 .40 .11 .04

8. Consider an accident victim with type B blood. She can only receive a

transfusion from a person with type B or type O blood. What is the

probability that a randomly chosen person will be a suitable donor?

(a) about .11

(b) about .04

(c) about .15

(d) about .45

(e) about .56

9. What is the probability that both people in a couple will have the SAME

blood type if matings are random with respect to blood type, i.e. one

partner’s blood type does not influence the blood type of the other partner.

(a) about .21

(b) about .16

(c) about .002

(d) about .01

(e) about .38

Multiple Choice Questions

Probability - General

1. The probability that the Red River will flood in any given year has been

estimated from 200 years of historical data to be one in four. This means:

(a) The Red River will flood every four years.

(b) In the next 100 years, the Red River will flood exactly 25 times.

(c) In the last 100 years, the Red River flooded exactly 25 times.

(d) In the next 100 years, the Red River will flood about 25 times.

(e) In the next 100 years, it is very likely that the Red River will flood

exactly 25 times.

Solution: d

Past performance 1989 Oct - 90%

Past performance 1990 Dec - 99%

2. The chances that you will be ticketed for illegal parking on campus are about

1/3. During the last nine days, you have illegally parked every day and

have NOT been ticketed (you lucky person)! Today, on the 10th day, you

again decide to park illegally. The chances that you will be caught are:

(a) greater than 1/3 because you were not caught in the last nine days.

(b) less than 1/3 because you were not caught in the last nine days.

(c) still equal to 1/3 because the last nine days do not affect the proba-

bility.

(d) equal to 1/10 because you were not caught in the last nine days.

(e) equal to 9/10 because you were not caught in the last nine days.

Solution: c

Past performance 1989 Oct - 96%

3. The chance that a person will contract AIDS after a sexual contact with

an infected partner has been estimated to be 1/4. This means:

(a) A person will be infected after exactly 4 sexual contacts with infected

partners.

(b) Of 1000 people having sexual contacts with infected partners, exactly

250 will become infected.

(c) Of 200 people having sexual contacts with infected partners, about

50 will become infected.

(d) In exactly 25% of all sexual contacts with infected partners, the in-

fection will spread.

(e) Of 20 people having sexual contacts with infected partners, it is very

likely that exactly 5 people will become infected.

Solution: c

Past performance 1989 Dec - 88%

Past performance 1990 Oct - 94%

Past performance 1991 Oct - 95%

4. A random variable Y has the following distribution:

Y | -1 0 1 2

P(Y)| 3C 2C 0.4 0.1

(a) 0.10

(b) 0.15

(c) 0.20

(d) 0.25

(e) 0.75

Solution: a

5. A random variable X has a probability distribution as follows:

r | 0 1 2 3

P(R=r) | 2k 3k 13k 2k

(a) .90

(b) .25

(c) .65

(d) .15

(e) 1.00

Solution: b

6. Suppose that the allele for tallness (T) is dominant over shortness (t); that

for Yellow (Y) is dominant over green (y); and that for roundness (W) is

dominant over wrinkled (w). Suppose we cross two plants with genotypes

TTYyWw and TtYyWw. The probability of a Tall, Yellow, Round plant

is:

(a) 9/16

(b) 3/32

(c) 1/16

(d) 9/32

(e) 3/16

Solution: a

Past performance 1992 Oct 78%
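
The stated answer to question 6 can be checked by enumerating the offspring for each gene separately and multiplying the results. This is only a sketch: it assumes the three genes assort independently, which the question does not state explicitly, and the helper name `dominant_fraction` is illustrative.

```python
from fractions import Fraction
from itertools import product

def dominant_fraction(parent1, parent2):
    """Fraction of offspring showing the dominant trait for one gene.

    Each parent is a two-character genotype such as "Tt";
    an uppercase letter marks the dominant allele.
    """
    offspring = list(product(parent1, parent2))  # one allele from each parent
    dominant = sum(1 for a, b in offspring if a.isupper() or b.isupper())
    return Fraction(dominant, len(offspring))

p_tall = dominant_fraction("TT", "Tt")    # every offspring carries a T
p_yellow = dominant_fraction("Yy", "Yy")  # 3 of 4 combinations carry a Y
p_round = dominant_fraction("Ww", "Ww")   # 3 of 4 combinations carry a W

# Assumed independence between genes lets us multiply the three fractions.
p_tall_yellow_round = p_tall * p_yellow * p_round
print(p_tall_yellow_round)  # 9/16
```

Under the independence assumption the product agrees with option (a).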

7. It has been estimated that about 20% of people between the ages of 18

and 25 have used marijuana in the last year. Which of the following is

CORRECT about this statement?

(a) Five people of this age group were randomly selected. This means

that exactly one of them must have used marijuana in the last year.

(b) Twenty people were randomly selected from this age group. Eighteen

of them used marijuana in the last year. The next person selected at

random will have a lower probability of using marijuana.

(c) Ten people were randomly selected from this age group. None of

them have used marijuana in the last year. The next person selected

must have a higher probability of using marijuana in the last year.

(d) A thousand people from this age group were randomly selected. It is

not unusual to find that 217 of them have used marijuana in the last

year.

(e) A million people from this age group were randomly selected. There

must be exactly 200,000 of them that have used marijuana in the last

year.

Solution: d

Past performance 2006 Nov - 91%

The following two questions refer to the following situation.

All human blood can be “ABO” typed as belonging to one of A, B, O, or

AB types. The actual distribution varies slightly among different groups

of people, but for a randomly chosen person from North America, the

following are the approximate probabilities:

Blood type O A B AB

Probability .45 .40 .11 .04

8. Consider an accident victim with type B blood. She can only receive a

transfusion from a person with type B or type O blood. What is the

probability that a randomly chosen person will be a suitable donor?

(a) about .11

(b) about .04

(c) about .15

(d) about .45

(e) about .56

Solution: e

Past performance 2006 Nov - 96%

9. What is the probability that both people in a couple will have the SAME

blood type if matings are random with respect to blood type, i.e. one

partner’s blood type does not influence the blood type of the other partner.

(a) about .21

(b) about .16

(c) about .002

(d) about .01

(e) about .38

Solution: e

Past performance 2006 Nov - 73%

Past performance 2006 Dec - 85%
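
Both blood-type answers follow from the probability table: question 8 adds the probabilities of two mutually exclusive types, and question 9 sums, over the four types, the probability that two independent people share that type. A minimal sketch, with variable names chosen only for illustration:

```python
# Approximate North American ABO probabilities from the table above.
blood_type_prob = {"O": 0.45, "A": 0.40, "B": 0.11, "AB": 0.04}

# Question 8: a type B recipient can take blood from type B or type O donors.
p_suitable_donor = blood_type_prob["B"] + blood_type_prob["O"]

# Question 9: both partners share a type; with random mating the partners are
# independent, so each type contributes p * p.
p_same_type = sum(p * p for p in blood_type_prob.values())

print(round(p_suitable_donor, 2))  # 0.56
print(round(p_same_type, 4))       # 0.3762
```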

Multiple Choice Questions

Normal approximations to discrete distributions

1. The National Broomball League claims to have a balanced league; that is,

for any given game each team has an equal chance of winning or losing with

no ties. Assuming the claim is true, what is the approximate probability

that a given team will lose more than 61 games out of the 100 played?

(a) 0.0500

(b) 0.4918

(c) 0.0107

(d) 0.0082

(e) 0.0164

Solution: c

2. The probability of getting a parking ticket when not paying for a 2-hour

period is 0.3. What is the probability of getting at least 60 tickets if you

park on 250 occasions for a 2-hour period and don’t pay?

(a) 0.016

(b) 0.019

(c) 0.98

(d) 0.93

(e) 0.072

Solution: c
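
The first two answers in this set can be reproduced with a normal approximation to the binomial that includes a continuity correction. The sketch below uses Python's standard library; the function name `binomial_upper_tail_normal` is an illustrative choice, not something from the question set.

```python
from math import sqrt
from statistics import NormalDist

def binomial_upper_tail_normal(n, p, k):
    """Approximate P(X >= k) for X ~ Binomial(n, p).

    Uses a normal curve with matching mean and standard deviation and a
    continuity correction (the cutoff is moved down by 0.5).
    """
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mean, sd).cdf(k - 0.5)

# Question 1: "more than 61 losses" out of 100 means 62 or more.
p_broomball = binomial_upper_tail_normal(100, 0.5, 62)

# Question 2: at least 60 tickets in 250 unpaid parking sessions.
p_tickets = binomial_upper_tail_normal(250, 0.3, 60)

print(round(p_broomball, 4))  # 0.0107
print(round(p_tickets, 2))    # 0.98
```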

3. A professional basketball player sinks 80% of his foul shots, in the long

run. If he gets 100 tries during a season, then the probability that he sinks

between 75 and 90 shots (inclusive) is approximately equal to:

(a) Pr(−1.25 ≤ Z ≤ 2.5)

(b) Pr(−1.125 ≤ Z ≤ 2.625)

(c) Pr(−1.125 ≤ Z ≤ 2.375)

(d) Pr(−1.375 ≤ Z ≤ 2.375)

(e) Pr(−1.375 ≤ Z ≤ 2.625)

Solution: e
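
The z-bounds in question 3 come from widening the interval 75 to 90 by 0.5 on each side (the continuity correction) before standardizing. A short sketch of that arithmetic, with illustrative variable names:

```python
from math import sqrt

n, p = 100, 0.8
mean = n * p                 # expected number of shots made: 80
sd = sqrt(n * p * (1 - p))   # binomial standard deviation: 4

# "Between 75 and 90 inclusive" becomes 74.5 to 90.5 on the continuous scale.
z_low = (75 - 0.5 - mean) / sd
z_high = (90 + 0.5 - mean) / sd

print(round(z_low, 3), round(z_high, 3))  # -1.375 2.625
```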

ments. If 200 students are randomly selected, then the probability that the

number of them living in apartments will be between 50 and 75 inclusive,

is:

(a) .9167

(b) .9298

(c) .9390

(d) .9268

(e) .9208

Solution: c

probability of the event {155 < X < 175} is:

(a) 0.6552

(b) 0.6429

(c) 0.6078

(d) 0.6201

(e) 0.6320

Solution: c

the approximate probability is;

(a) 0.4

(b) larger than that in the previous question

(c) smaller than that in the previous question

(d) equal to that in the previous question

(e) may be smaller or larger than that in the previous question

Solution: b

© 2008 Carl James Schwarz

7. Companies are interested in the demographics of those who listen to the

radio programs they sponsor. A radio station has determined that only

20% of listeners phoning in to a morning talk program are male. During

a particular week, 200 calls are received by this program. What is the

approximate probability that at least 50 of the callers are male?

(a) .0466

(b) .0212

(c) .1168

(d) .1402

(e) Not within ± .01 of any of the above.

Solution: a

people from the labour force is drawn. Find the approximate probability

that the sample contains at least ten unemployed people.

(a) .3879

(b) .3245

(c) .3419

(d) .2946

(e) .3594

Solution: e

9. A politician has targeted 100 homes to visit during a week. From past

experience, 50 percent of the households answer the bell and invite him

in. Of this, 80 percent will agree with his policies. The approximate

probability that the politician will get support from at least 45 households

during a week is:

(a) 0.1991

(b) 0.3212

(c) 0.8643

(d) 0.1376

(e) 0.1788

Solution: d

10. People who have been in contact with a carrier of a disease, have a 40%

chance of contracting the disease. Suppose that the carrier of the dis-

eases may have infected a school with 500 people. Find the approximate

probability that at least 215 people will contract the disease.

(a) .09

(b) .91

(c) between .05 and .34

(d) 1.37

(e) between 2.5% and 17%

Solution: a

Past performance 1993 Apr - 40% (b-22%, c-22%)

Multiple Choice Questions

Probability - Normal distribution

1. One of the side effects of flooding a lake in northern boreal forest areas

(e.g. for a hydro-electric project) is that mercury is leached from the soil,

enters the food chain, and eventually contaminates the fish. The concen-

tration in fish will vary among individual fish because of differences in

eating patterns, movements around the lake, etc. Suppose that the con-

centrations of mercury in individual fish follows an approximate normal

distribution with a mean of 0.25 ppm and a standard deviation of 0.08

ppm. Fish are safe to eat if the mercury level is below 0.30 ppm. What

proportion of fish are safe to eat?

(a) 63%

(b) 23%

(c) 73%

(d) 27%

(e) 37%

Solution: c

Past performance 1992 Dec - 45% (16%a, 22%b, 15%d)

Past performance 1993 Apr - 57% (a-17%; d-17%)

Past performance 1996 Nov - 93%

Past performance 1997 Aug - 84%

Past performance 2006 Dec - 91%
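
The proportion of safe fish in question 1 is a single normal cumulative probability. A minimal check using the standard library, with an illustrative variable name for the distribution:

```python
from statistics import NormalDist

# Mercury concentration in ppm: mean 0.25, standard deviation 0.08.
mercury = NormalDist(mu=0.25, sigma=0.08)

# Fish are safe when the concentration is below 0.30 ppm.
p_safe = mercury.cdf(0.30)

print(round(p_safe, 2))  # 0.73
```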

wishes to know the mercury level of the top 20% of the fish. The appro-

priate percentile and mercury level for this lake is:

(a) 20th percentile has a value of −0.84 ppm

(b) 20th percentile has a value of 0.18 ppm

(c) 80th percentile has a value of 0.32 ppm

(d) 80th percentile has a value of 0.84 ppm

(e) 20th percentile has a value of 0.07 ppm

Solution: c

Past performance 1992 Dec - 46% (28%-b, 15%-d)

Past performance 1997 Aug - 77% (13%-d)

Past performance 2006 Dec - 84% (11%-c)

3. The following graph is a normal probability plot for the amount of rainfall

in acre-feet obtained from 26 randomly selected clouds that were seeded

with silver oxide:

(a) The data appear to show exponential growth; that is, the amount

of rainfall increases exponentially as the amount of silver oxide in-

creases.

(b) The pattern suggests that the measurement is not normally dis-

tributed.

(c) A least squares regression line should be fitted to the rainfall variable.

(d) It can be expected that the histogram of rainfall amount will look

like the normal curve.

(e) The shape of the curve suggests that rainfall is caused by seeding the

clouds with silver oxide.

65 and a standard deviation of 12. Approximately what percentage of the

students have scores below 50?

(a) 11%

(b) 89%

(c) 15%

(d) 18%

(e) 39%

Solution: a

of the mark distribution?

(a) 80

(b) 90

(c) 85

(d) 75

(e) 95

Solution: a

and a variance of 225. If the instructor wishes to assign B’s or higher to

the top 30% of the students in the class, what mark is required to get a

B or higher?

(a) 68.7

(b) 71.5

(c) 73.2

(d) 74.6

(e) 69.9

Solution: e

Past performance 1989 Dec - 50% (25% -d, 10% -b,c)

Past performance 1991 Oct - 67% (10% c, 14% d)

approximately normally distributed with mean equal to 2.4 and standard

deviation equal to 0.8. What fraction of the students will possess a grade

point average in excess of 3.0 ?

(a) .7500

(b) .6000

(c) .2734

(d) .2500

(e) .2266

Solution: e

Past performance 1989 Dec - 52% (18% c,d)

Past performance 1989 Apr - 50% (C-23%, D-18%)

Past performance 1991 Dec - 80% (c-13%)

8. In some courses (but certainly not in an intro stats course!), students are graded on a “normal curve”. For example, students within ± 0.5 standard deviations of the mean receive a C; between 0.5 and 1.0 standard deviations above the mean receive a C+; between 1.0 and 1.5 standard deviations above the mean receive a B; between 1.5 and 2.0 standard deviations above the mean receive a B+, etc. The class average in an exam was 60 with a standard deviation of 10. The bounds for a B grade and the percentage of students who will receive a B grade if the marks are actually normally distributed are:

(a) (65, 75), 24.17%

(b) (70, 75), 18.38%

(c) (70, 75), 9.19%

(d) (65, 75), 12.08%

(e) (70, 75), 6.68%

Solution: c

Past performance 1997 Jul - 85%

Refer to the previous question. Another Instructor decides that the lower

B cutoff should be the 70th percentile. The lower-cutoff for a B grade is:

(a) 70

(b) 65

(c) 60

(d) 75

(e) 80

Solution: b

Past performance 1997 Jul - 71% (14%-a)

with a mean of 2.5 cm and standard deviation of .02 cm. The probability

that a disk picked at random has a diameter greater than 2.54 cm is about:

(a) .5080

(b) .2000

(c) .1587

(d) .0228

(e) .4920

Solution: d

10. Suppose the test scores of 600 students are normally distributed with a

mean of 76 and standard deviation of 8. The number of students scoring

between 70 and 82 is:

(a) 272

(b) 164

(c) 260

(d) 136

(e) 328

Solution: e

11. Bolts that are used in the construction of an electric transformer are sup-

posed to be 0.060 inches in diameter, and any bolt with diameter less than

0.058 inches or greater than 0.062 inches must be scrapped. The machine

that makes these bolts is set to produce bolts of 0.060 inches in diameter,

but it actually produces bolts with diameters following a normal distribu-

tion with µ = 0.060 inches and σ = 0.001 inches. The proportion of bolts

that must be scrapped is equal to:

(a) 0.0456

(b) 0.0228

(c) 0.9772

(d) 0.3333

(e) 0.1667

Solution: a
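
Question 11 asks for a two-sided tail probability: a bolt is scrapped when its diameter falls below 0.058 or above 0.062 inches. A minimal sketch of that calculation, with illustrative variable names:

```python
from statistics import NormalDist

# Bolt diameters in inches: mean 0.060, standard deviation 0.001.
diameter = NormalDist(mu=0.060, sigma=0.001)

# Scrap region: below 0.058 or above 0.062, i.e. more than 2 SD from the mean.
p_too_small = diameter.cdf(0.058)
p_too_large = 1 - diameter.cdf(0.062)
p_scrapped = p_too_small + p_too_large

print(round(p_scrapped, 4))
```

The exact value is about 0.0455. Option (a), 0.0456, is what a printed normal table gives when the rounded one-sided tail 0.0228 is doubled.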

12. The cost of treatment per patient for a certain medical problem was mod-

eled by one insurance company as a normal random variable with mean

$775 and standard deviation $150. What is the probability that the treat-

ment cost of a patient is less than $1,000, based on this model?

(a) .5000

(b) .6826

(c) .8531

(d) .9332

(e) Cannot be computed without knowledge of additional parameters

Solution: d

13. The time that a skier takes on a downhill course has a normal distribution

with a mean of 12.3 minutes and standard deviation of 0.4 minutes. The

probability that on a random run the skier takes between 12.1 and 12.5

minutes is:

(a) 0.1915

(b) 0.3830

(c) 0.3085

(d) 0.6170

(e) 0.6826

Solution: b

with µ = 1200 ohms and σ = 120 ohms. What proportion of the resistors

have resistances that differ from the mean resistance by more than 120

ohms?

(a) 0.9544

(b) 0.3413

(c) 0.1587

(d) 0.6826

(e) 0.3174

Solution: e

tributed with a mean of 12 minutes and a standard deviation of 1.5 min.

Find the probability that a particular assembly takes more than 14.25

minutes.

(a) .9332

(b) .0668

(c) .3413

(d) .4332

(e) .1587

Solution: b

16. Heights of males are approximately normally distributed with a mean of

170 cm and a standard deviation of 8 cm. What fraction of males are

taller than 176 cm?

(a) .7500

(b) .6000

(c) .2734

(d) .2500

(e) .2266

Solution: e

Past performance 1990 Oct - 68%

Past performance 1993 Feb - 87%

Past performance 1998 Dec - 92%

mean of 175 cm and standard deviation 6 cm. The 20th percentile of the

distribution of heights is:

(a) 175

(b) 179

(c) 170

(d) 172

(e) 174

Solution: c

18. The heights of students at a college are normally distributed with a mean

of 175 cm and a standard deviation of 6 cm. One might expect in a sample

of 1000 students that the number with heights less than 163 cm is:

(a) 997

(b) 23

(c) 477

(d) 228

(e) 456

Solution: b

Past performance 1991 Oct - 62% (12% c, 20% d)

Past performance 1996 Dec - 83% (11% d)

Past performance 2006 Nov - 84%

19. The height of an adult male is known to be normally distributed with a

mean of 69 inches and a standard deviation of 2.5 inches. The height of

the doorway such that 96 percent of the adult males can pass through it

without having to bend is:

(a) 1.8

(b) about 65

(c) about 74

(d) about 80

(e) about 58

Solution: c

Past performance 2006 Nov - 96%

distributed. The mean is 80 kg. and approximately 68% of the weights

are between 70 and 90 kg. The standard deviation of the distribution of

weights is equal to:

(a) 20

(b) 5

(c) 40

(d) 50

(e) 10

Solution: e

normally distributed with µ = 55 kg and σ = 5 kg. Which of the following

is true?

(a) About 16 percent of the students will be over 60 kg.

(b) About 2.5 percent will be below 45 kg.

(c) Half of them can be expected to weigh less than 55 kg.

(d) About 5 percent will weigh more than 63 kg.

(e) All the above are true.

Solution: e

distributed with a mean of 35 kg/day and a standard deviation of 6 kg/day.

The probability that a day's production for a single animal will be less than

28 kg. is approximately:

(a) .41

(b) .09

(c) .38

(d) .12

(e) .62

Solution: d

Past performance 1990 Dec - 66%

23. Refer to the previous question. The producer is concerned when the milk

production of a cow falls below the 5th percentile because the animal

may be ill. The 5th percentile (in kg) of the daily milk production is

approximately:

(a) 1.645

(b) -1.645

(c) 33.36

(d) 25.13

(e) 44.87

Solution: d

Past performance 1990 Dec - 64%
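
The two milk-production questions use the same normal model in opposite directions: one goes from a value to a probability, the other from a probability back to a value (a percentile). A short sketch with illustrative names:

```python
from statistics import NormalDist

# Daily milk production per animal in kg: mean 35, standard deviation 6.
milk = NormalDist(mu=35, sigma=6)

# Value to probability: chance that a day's production is below 28 kg.
p_below_28 = milk.cdf(28)

# Probability to value: the 5th percentile of daily production.
fifth_percentile = milk.inv_cdf(0.05)

print(round(p_below_28, 2))        # 0.12
print(round(fifth_percentile, 2))  # 25.13
```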

24. Which of the following is NOT CORRECT about a standard normal dis-

tribution?

(a) P (0 ≤ Z ≤ 1.50) = .4332

(b) P (Z ≤ −1.0) = .1587

(c) P (Z ≥ 2.0) = .0228

(d) P (Z ≤ 1.5) = .9332

(e) P (Z ≥ −2.5) = .4938

Solution: e

Past performance 1989 Dec - 78%

Past performance 1990 Oct - 76%
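
Each statement in question 24 can be checked directly against the standard normal distribution. The sketch below compares the computed probability with the claimed one; the 0.001 tolerance is an assumption chosen to allow for four-decimal table rounding.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# (computed probability, probability claimed in the option)
statements = {
    "a": (Z.cdf(1.5) - Z.cdf(0.0), 0.4332),  # P(0 <= Z <= 1.50)
    "b": (Z.cdf(-1.0), 0.1587),              # P(Z <= -1.0)
    "c": (1 - Z.cdf(2.0), 0.0228),           # P(Z >= 2.0)
    "d": (Z.cdf(1.5), 0.9332),               # P(Z <= 1.5)
    "e": (1 - Z.cdf(-2.5), 0.4938),          # P(Z >= -2.5)
}

incorrect = [label for label, (computed, claimed) in statements.items()
             if abs(computed - claimed) > 0.001]
print(incorrect)  # ['e']
```

Only statement (e) fails: P(Z ≥ −2.5) is about .9938, not .4938.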

25. The measurement of the width of the index finger of a human right hand

is a normally distributed variable with a mean of 6 cm. and a standard

deviation of 0.5 cm. What is the probability that the finger width of a

randomly selected person will be between 5 cm. and 7.5 cm.?

(a) .9759

(b) .0241

(c) .9500

(d) 1.000

(e) not within ± 0.001 of these

Solution: a

26. Lice are a pesky problem for school-aged children and are unrelated to cleanliness. The lifetimes of lice that have fallen off the scalp onto bedding are approximately normally distributed with a mean of 2.2 days and a standard deviation of 0.4 days. We would expect that approximately 90% of the lice would die within:

(a) about 2.6 days

(b) about 3.9 days

(c) about 2.5 days

(d) about 2.7 days

(e) about 3.0 days

Solution: d

Past performance 1998 Nov - 67% (23% e)

Multiple Choice Questions

Probability - Poisson

1. It is sometimes possible to obtain approximate probabilities associated with values of a random variable by using the probability distribution of a different random variable. For example, binomial probabilities can be approximated using the Poisson probability function or the normal distribution. In order for the Poisson to give “good” approximate values for binomial probabilities we must have the condition(s) that:

(a) the population size is large relative to the sample size.

(b) the sample size is large

(c) the probability, p, is small and the sample size is large

(d) the probability, p, is close to .5 and the sample size is large

(e) the probability, p, is close to .5 and the population size is large

Solution: c

2. Suppose flaws (cracks, chips, specks, etc.) occur on the surface of glass

with density of 3 per square metre. What is the probability of there being

exactly 4 flaws on a sheet of glass of area 0.5 square metre?

(a) 0.047

(b) 0.168

(c) 0.981

(d) 0.815

(e) 0.647

Solution: a
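
For question 2, the Poisson mean is the flaw density scaled by the sheet area, and the answer is a single value of the Poisson probability function. A minimal sketch; the helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

flaws_per_square_metre = 3
sheet_area = 0.5                          # square metres
lam = flaws_per_square_metre * sheet_area  # expected flaws on the sheet: 1.5

p_exactly_4 = poisson_pmf(4, lam)
print(round(p_exactly_4, 3))  # 0.047
```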

3. The rate at which a particular defect occurs in lengths of plastic film being

produced by a stable manufacturing process is 4.2 defects per 75 metre

length. A random sample of the film is selected and it was found that the

length of the film in the sample was 25 metres. What is the probability

that there will be at most 2 defects found in the sample?

(a) .2102

(b) .2417

(c) .8335

(d) .1323

(e) .1665

Solution: c

Past performance 1997 Jul - 86%
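
In question 3 the defect rate is quoted per 75 metres but the sample is 25 metres long, so the rate must be rescaled before summing the Poisson probabilities for 0, 1 and 2 defects. A sketch of that step, with the helper name `poisson_cdf` chosen for illustration:

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

defects_per_75_m = 4.2
sample_length_m = 25

# Rescale the rate to the sampled length: 4.2 * 25 / 75 = 1.4 defects expected.
lam = defects_per_75_m * sample_length_m / 75

p_at_most_2 = poisson_cdf(2, lam)
print(round(p_at_most_2, 4))  # 0.8335
```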

larger amount of film. She selects 1000 m of film. If there were no change

in the defect rate from the old process, what would be the number of

defects seen in approximately 95% of such examinations?

(a) (49 to 63)

(b) (34 to 78)

(c) (62 to 98)

(d) (41 to 71)

(e) (71 to 89)

Solution: d

Past performance 1997 Jul - 67% (21% - a)

4. The number of traffic accidents per week in a small city has a Poisson

distribution with mean equal to 1.3. What is the probability of at least

two accidents in 2 weeks?

(a) 0.2510

(b) 0.3732

(c) 0.5184

(d) 0.7326

(e) 0.4816

Solution: d

5. The number of traffic accidents per week in a small city has Poisson dis-

tribution with mean equal to 3. What is the probability of at least one

accident in 2 weeks?

(a) 0.0174

(b) 0.9502

(c) 0.9975

(d) 0.1991

(e) 0.0025

Solution: c

6. Significant birth defects occur at a rate of about 4 per 1000 births in human

populations. After a nuclear accident, there were 10 defects observed in

the next 1500 births. Find the probability of observing at least 10 defects

in this sample if the rate had not changed after the accident.

(a) .008

(b) .003

(c) .041

(d) .084

(e) .042

Solution: d

Past performance 1990 Oct - 58%

Past performance 1991 Dec - 66% (c-17%)

Past performance 1996 Nov - 79% (c-12%)
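
Question 6 needs an upper-tail Poisson probability, which is easiest to get as one minus the cumulative probability of 9 or fewer defects. A minimal sketch under the question's assumption that the rate is unchanged; the helper name is illustrative.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

# 4 defects per 1000 births, observed over 1500 births: 6 defects expected.
lam = 4 / 1000 * 1500

# "At least 10" is the complement of "9 or fewer".
p_at_least_10 = 1 - poisson_cdf(9, lam)
print(round(p_at_least_10, 3))  # 0.084
```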

number of defects that would occur in 1500 births (assuming that the rate

has not changed) is:

(a) (4, 8)

(b) (2, 10)

(c) (2, 6)

(d) (0, 8)

(e) (0, 12)

Solution: b

Past performance 1990 Oct - 78%

Past performance 1996 Dec - 77% (10%-a)

error per 10 seconds. Let the distribution of transmission errors be Pois-

son. What is the probability of more than 1 error in a communication

one-half minute in duration?

(a) 0.950

(b) 0.262

(c) 0.738

(d) 0.199

(e) 0.801

Solution: e

that a large batch of hamburger has an average contamination of 0.3 bac-

teria/gram. Then the probability that a 10 gram sample will contain one

or fewer bacteria is:

(a) .2222

(b) .7408

(c) .9603

(d) .1494

(e) .1992

Solution: e

Past performance 1989 Oct - 89%

Past performance 1991 Oct - 84%

Past performance 1997 Aug - 92%

10. Refer to the previous question. A 95% range for the likely number of

bacteria present in a 100 g sample is:

(a) 30 ± 30.0

(b) 30 ± 5.5

(c) 30 ± 11.0

(d) 30 ± 16.4

(e) 30 ± 2.8

Solution: c

Past performance 1989 Oct - 77%

Past performance 1991 Oct - 71% (19% b)

Past performance 1997 Aug - 85%

11. The number of bacteria in a drop of water from a lake has a Poisson

distribution with an average of 0.5 bacteria/drop. A small dish containing

four drops of water from the lake is placed under a microscope. The

probability of observing at most one bacterium in the sample is:

(a) 0.910

(b) 0.406

(c) 0.271

(d) 0.135

(e) 0.303

Solution: b

Past performance 1989 Dec - 75%

Past performance 1992 Oct - 82%

Past performance 2006 Dec - 74% (11%-a;)

12. Refer to the previous question. An approximate 95% range for the number

of bacteria present in 400 drops of water is:

(a) (171,229)

(b) (361,439)

(c) (185,215)

(d) (157,243)

(e) (0,400)

Solution: a

Past performance 1989 Dec - 70%

Past performance 1992 Oct - 87%

Past performance 2006 Dec - 75% (16%-c)
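
The approximate 95% ranges in questions 10 and 12 are consistent with the "mean ± 2 standard deviations" rule, using the fact that a Poisson count has a standard deviation equal to the square root of its mean. The sketch below assumes that rule (rather than the 1.96 multiplier), and the function name is illustrative.

```python
from math import sqrt

def poisson_two_sd_range(lam):
    """Approximate 95% range for a Poisson count: mean plus or minus 2 SD."""
    sd = sqrt(lam)  # for a Poisson distribution the variance equals the mean
    return lam - 2 * sd, lam + 2 * sd

# Question 12: 0.5 bacteria per drop over 400 drops gives a mean of 200.
low, high = poisson_two_sd_range(0.5 * 400)
print(round(low, 1), round(high, 1))  # 171.7 228.3

# Question 10: 0.3 bacteria per gram over 100 g gives a mean of 30.
half_width = 2 * sqrt(0.3 * 100)
print(round(half_width, 1))  # 11.0
```

These values line up with the stated solutions, (171, 229) and 30 ± 11.0.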

(a) It is used to compute the probability of rare events.

(b) Every event is independent of every other event.

(c) It is parameterized by the sample size and the probability that an

event will occur.

(d) The theoretical range for the number of events that could occur is

0,1,2,3, ...

(e) In order to compute the parameter value, we need to know the stan-

dardized rate and the sample size.

Solution: c

Past performance 1996 Nov - 56% (25%-d; 14%-e)

14. In a biological cell, the average number of genes that will change into mutant genes, when treated radioactively, is 2.4. Assuming a Poisson probability distribution, find the probability that there are at most 3 mutant genes in a biological cell after the radioactive treatment.

(a) .2090

(b) .7576

(c) .5697

(d) .7787

(e) 1.000

Solution: d

15. The number of telephone calls that pass through a switchboard has a Poisson distribution with mean equal to 2 per minute. The probability that no telephone calls pass through the switchboard in two consecutive minutes is:

(a) 0.2707

(b) 0.0517

(c) 0.0183

(d) 0.0366

(e) 0.1353

Solution: c

16. The distribution of phone calls arriving in one minute periods at a switch-

board is assumed to be Poisson with the parameter λ. During 100 periods,

the following distribution was obtained:

# (calls)  |  0   1   2   3   4 or more

Frequency  | 30  43  21   6   0

(a) 1.00

(b) 1.03

(c) 1.04

(d) 1.33

(e) 1.37

Solution: b

17. A can company reports that the number of breakdowns per 8-hour shift

on its machine-operated assembly line follows a Poisson distribution with

a mean of 1.5. Assuming that the machine operates independently across

shifts, what is the probability of no breakdowns during three consecutive

8-hour shifts?

(a) .0744

(b) .0498

(c) .6065

(d) .2231

(e) .0111

Solution: e

18. A fisherman arrives at his favorite fishing spot. From past experience

he knows that the number of fish he catches per hour follows a Poisson

distribution at 0.5 fish/hour. The probability that he catches at least 3

fish in four hours is:

(a) .0126

(b) .0144

(c) .1804

(d) .3233

(e) .8571

Solution: d

19. The number of arrivals per hour at an automatic teller machine is Poisson

distributed with a mean of 3.5 arrivals/hour. What is the probability that

more than three arrivals occur in an hour?

(a) .3209

(b) .4633

(c) .5367

(d) .6791

(e) .7246

Solution: b

20. The marketing manager of a company has noted that she usually receives

10 complaint calls during a week (consisting of five working days), and

that the calls occur at random. Let us suppose that the number of calls

during a week follows the Poisson distribution. The probability that she

gets five such calls in one day is:

(a) .0361

(b) .0378

(c) .9834

(d) .2000

(e) .5

Solution: a

21. Cataracts are a very rare birth defect. In Canada, they occur at a rate

of approximately 3 babies in every 100,000 births. In 1989, there were

approximately 57,000 births in BC. The probability that more than 5

babies will be born with cataracts is approximately:

(a) about .1080

(b) about .0295

(c) about .0216

(d) about .0080

(e) about .0839

Solution: d

Past performance 1998 Nov - 78% (13% a)

Past performance 2006 Nov - 82% (10% b)

22. The number of deaths due to stroke in the Vancouver region each year

varies randomly with a mean of about 555 deaths per year. Assuming

that the number of deaths has an approximate Poisson distribution, then

the probability that there will be at least 600 deaths due to stroke in any

one year is:

(a) about 1%

(b) about 32%

(c) about 16%

(d) about 5%

(e) about 2.5%

Solution: e

Past performance 1998 Nov - 41% (10% a; 14% b; 20% c; 15% d)

Past performance 2006 Nov - 84%

23. The number of babies born with a particular severe eye defect each year

varies randomly, but at a rate of about 30/10,000 live births. Last year

there were about 15,000 live births. The approximate probability that

there will be more than 58 babies born with this eye defect is:

(a) about 16%

(b) about 5%

(c) about 1%

(d) about 0.5%

(e) about 2.5%

Solution: e

Past performance 1998 Dec - 65% (12% d)
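
When the Poisson mean is large, as in questions 22 and 23, the tail probability can be approximated with a normal curve whose mean and variance both equal the Poisson mean. The sketch below adds a continuity correction, which is an assumption; the question set may instead intend the simpler "mean + 2 SD leaves about 2.5% above" rule. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def poisson_upper_tail_normal(lam, k):
    """Approximate P(X >= k) for X ~ Poisson(lam) when lam is large.

    Uses a normal curve with mean lam and standard deviation sqrt(lam),
    with the cutoff moved down by 0.5 as a continuity correction.
    """
    return 1 - NormalDist(lam, sqrt(lam)).cdf(k - 0.5)

# Question 22: mean of 555 stroke deaths per year, at least 600 in a year.
p_stroke = poisson_upper_tail_normal(555, 600)

# Question 23: 30 per 10,000 births over 15,000 births gives 45 expected cases;
# "more than 58" means 59 or more.
lam_eye = 30 / 10_000 * 15_000
p_eye = poisson_upper_tail_normal(lam_eye, 59)

print(round(p_stroke, 3), round(p_eye, 3))
```

Both probabilities come out between 2% and 3%, which matches the "about 2.5%" option in each question.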

Multiple Choice Questions

Correlation

between the eye color (brown, green, blue) of an experimental animal and

the amount of nicotine that is fatal to the animal when consumed. This

indicates:

(a) nicotine is less harmful to one eye color than the others.

(b) the lethal dose of nicotine goes down as the eye color of the animal

changes.

(c) one must always consider the eye color of animals in making state-

ments about the effect of nicotine consumption.

(d) the researchers need to do further study to explain the causes of this

negative correlation.

(e) the researchers need to take a course in statistics because correlation

is not an appropriate measure of association in this situation.

Solution: e - correlation cannot be computed with nominal variables

Past performance 1997 Jun - 98%

2. If the correlation between body weight and annual income were high and

positive, we could conclude that:

(b) low incomes cause people to eat less food.

(c) high income people tend to spend a greater proportion of their income

on food than low income people, on average.

(d) high income people tend to be heavier than low income people, on

average.

(e) high incomes cause people to gain weight.

Solution: d

Past performance 1991 Dec - 70% (c-25%)

Past performance 1993 Apr - 75% (c-25%)

3. A study found a correlation of r = −0.61 between the sex of a worker and

his or her income. You conclude that:

(b) women earn less than men on average.

(c) an arithmetic mistake was made; this is not a possible value of r.

(d) this is nonsense because r makes no sense here.

(e) the correlation of −0.61 is not meaningful here because the relation-

ship between sex and income is likely nonlinear.

Solution: d

Past performance 1993 Feb - 60% (e-33%)

4. A study examined the relationship between the sepal length and sepal

width for two varieties of an exotic tropical plant. Varieties A and B are

represented by x’s and o’s, respectively, in the following plot:

sepal length and sepal width.

(b) Considering variety B alone, the least squares regression line for pre-

dicting sepal length from sepal width has a negative slope.

(c) Considering both varieties together, there is a positive correlation

between sepal length and sepal width.

(d) Considering each variety separately, there is a positive correlation

between sepal length and sepal width.

(e) Considering both varieties together, the least squares regression line

for predicting sepal length from sepal width has a positive slope.

Solution: d

2006

c Carl James Schwarz 2

5. From tax records, it is relatively easy to determine the amount of liquor

consumed per capita and the number of cigarettes consumed per capita

for each of the 10 provinces of Canada. These are plotted on a scatter

plot and a high positive correlation is found. Which of the following is

correct?

(a) This implies that heavy smoking causes people to drink more.

(b) This implies that heavy drinking causes people to smoke more.

(c) We cannot conclude cause and effect, but this also implies that there

is a high positive correlation between cigarette smoking and alcohol

consumption for individuals.

(d) This could be an example of a correlation caused by a common cause

because both activities are highly correlated with average family in-

come and average income varies widely among the provinces.

(e) We cannot conclude cause and effect, but this also implies that the

same individuals both smoke and consume liquor.

Solution: d

Past performance 1993 Feb - 44% (c-44%; e-10%)

changes in another variable.

(b) a measure of the strength of the linear association between two cat-

egorical variables.

(c) a measure of the strength of the association (not necessarily linear)

between two categorical variables.

(d) a measure of the strength of the linear association between two quan-

titative variables.

(e) a measure of the strength of the linear association between a quanti-

tative variable and a categorical variable.

Solution: d

7. On May 11th, 50 randomly selected subjects had their systolic blood pres-

sure (SBP) recorded twice – the first time at about 9:00 a.m. and the

second time at about 2:00 p.m. If one were to examine the relationship

between the morning and afternoon readings, then one might expect:

(a) the correlation to be near zero, as the morning and afternoon readings

should be independent of one another.

(b) the correlation to be high and positive, as those with relatively high

readings in the morning will tend to have relatively high readings in

the afternoon.

(c) the correlation to be high and negative, as those with relatively high

readings in the morning will tend to have relatively low readings in

the afternoon.

(d) the correlation to be near zero, as correlation measures the strength

of the linear association.

(e) the correlation to be near zero, as blood pressure readings should

follow approximately a normal distribution.

Solution: b

Past performance 1996 Dec - 62% (23%-d)

Past performance 1998 Oct - 68%

8. Men tend to marry women who are slightly younger than themselves.

Suppose that every man married a woman who was exactly .5 of a year

younger than themselves. Which of the following is CORRECT?

(a) The correlation is −.5.

(b) The correlation is .5.

(c) The correlation is 1.

(d) The correlation is −1.

(e) The correlation is 0

Solution: c - Draw a scatterplot of various aged men and their wives

Past performance 2006 Oct - 75% (10%-e)
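
The answer to question 8 can be checked numerically. The following is a minimal sketch, assuming a small set of made-up husbands' ages (the ages and the helper name `pearson_r` are illustrative and do not come from the question); it shows that subtracting a constant from every value leaves a perfect positive linear relationship.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical husbands' ages; each wife is exactly 0.5 years younger.
husband_ages = [22.0, 27.5, 31.0, 40.0, 55.5, 63.0]
wife_ages = [age - 0.5 for age in husband_ages]

r = pearson_r(husband_ages, wife_ages)
print(round(r, 6))
```

The size of the age gap affects only the intercept of the line through the points, not the correlation.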

Multiple Choice Questions

Least squares

and Y , we would consider fitting a straight line with X as an explanatory

variable if:

(b) the change in Y is a constant for each unit change in X

(c) the change in Y is a fixed percent of Y

(d) the change in Y is exponential

(e) none of the above

Solution: b

(a) which is determined by use of a function of the distance between the

observed Y ’s and the predicted Y ’s.

(b) which has the smallest sum of the squared residuals of any line

through the data values.

(c) for which the sum of the residuals about the line is zero.

(d) which has all of the above properties

(e) which has none of the above properties.

Solution: b

3. The following information was obtained from the manager of a city water

department for predicting the consumption of water (in gallons) from the

size of household:

Household Size (x)    Water Used (y)
        2                  650
        7                 1200
        9                 1300
        4                  430
       12                 1400
        6                  900
        9                 1800
        3                  640
        3                  793
        2                  925

Here are the summary statistics: $\sum X = 57$, $\sum Y = 10{,}038$, $\sum X^2 = 433$, $\sum Y^2 = 11{,}641{,}474$, $\sum XY = 67{,}669$.

household size is given by:

(b) Ŷ = 999.220 + 0.803X

(c) Ŷ = −1.0028 + 0.0067X

(d) Ŷ = 452.66 + 96.692X

(e) Ŷ = 1003.8 − 96.692X
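
The least squares slope and intercept can be computed directly from sums such as those of the ten households in the table. The sketch below assumes the usual textbook formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X); the function name is illustrative only.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (intercept, slope) of the least squares line from summary sums."""
    sxx = sum_x2 - sum_x ** 2 / n        # corrected sum of squares of X
    sxy = sum_xy - sum_x * sum_y / n     # corrected sum of cross products
    slope = sxy / sxx
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope

# Sums for the household-size (X) and water-use (Y) data in the question.
intercept, slope = least_squares_from_sums(
    n=10, sum_x=57, sum_y=10038, sum_x2=433, sum_xy=67669
)
print(f"fitted line: {intercept:.2f} + {slope:.3f} X")
```

The result can be compared against the listed options.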

4. For children between the ages of 18 months and 29 months, there is approx-

imately a linear relationship between “height” and “age”. The relationship

can be represented by: Ŷ = 64.93 + 0.63X, where Y represents height

(in centimetres) and X represents age (in months). Joseph is 22.5 months

old and is 80 centimetres tall. What is Joseph’s residual?

(a) 79.1

(b) -0.9

(c) +0.9

(d) 56.6

(e) 64.93

Solution: c
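
A residual is the observed value minus the value predicted by the fitted line. A minimal sketch for Joseph's case, using the equation given in the question (the function name is illustrative):

```python
def predicted_height(age_months):
    """Predicted height (cm) from the fitted line in the question."""
    return 64.93 + 0.63 * age_months

age_months = 22.5
observed_height = 80.0

# residual = observed - predicted
residual = observed_height - predicted_height(age_months)
print(round(residual, 3))
```

A positive residual means the child is taller than the line predicts for that age.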

and “age”. One child was measured monthly. Her height was 75 cm at 3

years of age and 85 cm when she was measured 18 months later. A least-

squares line was fit to her data. The slope of this line is approximately:

(b) 10 cm/m

(c) 25 cm/m

(d) 1.57 cm/m

(e) 2.1 cm/m

Solution: a

Past performance 1993 Feb - 72% (b-16%)

Past performance 1996 Oct - 96%

and their age (from 5 to 18 years) described by:

is not correct?

(a) The estimated slope is 6.01 which implies that children increase by

about 6 cm for each year they grow older.

(b) The estimated height of a child who is 10 years old is about 110 cm.

(c) The estimated intercept is 50.3 cm which implies that children reach

this height when they are 50.3/6.01=8.4 years old.

(d) The average height of children when they are 5 years old is about

50% of the average height when they are 18 years old.

(e) My niece is about 8 years old and is about 115 cm tall. She is taller

than average.

Solution: c

Past performance 1993 Apr - 83%

Past performance 1997 Jun - 96%

7. A study was conducted to examine the quality of fish after seven days in

ice storage. For this study:

Y = measurement of fish quality (on a 10 point scale with 10 = BEST.)

X = # of hours after being caught that the fish were packed in ice.

The sample linear regression line is: Ŷ = 8.5 − 0.5X. From this we can say

that:

(a) A one hour delay in packing the fish in ice decreases the estimated

quality by .5

(b) A one hour delay in packing the fish in ice increases the estimated

quality by .5

(c) If the estimated quality increases by 1 then the fish have been packed

in ice one hour sooner.

(d) If the estimated quality increases by 1 the fish have been packed in

ice two hours later.

(e) Can’t really say until we see a plot of the data.

Solution: a

of fertilizer applied, X (kg/ha). An experiment was conducted by apply-

ing different amounts of fertilizer (0 to 10 kg/ha) to plots of land and

measuring the resulting yields. The following estimated regression line

was obtained:

$\widehat{\text{yield}} = 4.85 + 0.05(\text{fertilizer})$

(b) If fertilizer is applied at 10 kg/ha, the estimated yield is 5.35 t/ha.

(c) For every additional kg/ha of fertilizer applied, the yield is estimated

to increase 0.05 t/ha.

(d) To obtain an estimated yield of 5.2 t/ha., you need to apply 7.0 kg/ha

of fertilizer.

(e) If the current level of fertilizer is changed from 7.0 to 9.0 kg/ha, the

yield is estimated to increase by 0.20 t/ha.

Solution: e

Past performance 1991 Apr - 96%

Growth hormones are often used to increase the weight gain of chickens.

In an experiment using 15 chickens, five different doses of growth hormone

(0, .2, .4, .8, and 1.0 mg/kg) were injected into chickens (three for each

dose) and the subsequent weight gain was recorded. An experimenter

plots the data and finds that a linear relationship appears to hold. The

output from SAS follows:

SOURCE             DF    SUM OF SQUARES    MEAN SQUARE    F VALUE    PR > F
MODEL               1           78.4083        78.4083       8.11     .0137
ERROR              13          125.7410         9.6723
CORRECTED TOTAL    14          204.1493

                          T FOR H0:                   STD ERROR OF
PARAMETER    ESTIMATE     PARAMETER=0     PR > |T|    ESTIMATE
INTERCEPT      3.7816            3.23       0.0066    1.1705
DOSE           4.0416            2.85       0.0137    1.4195

(b) Ŷ = 3.23 + 2.85X

(c) Ŷ = 2.85 + 3.23X

(d) Ŷ = 3.78 + 4.04X

(e) Ŷ = 1.17 + 1.42X

Solution: d

Past performance 1989 Apr - 83%

Past performance 1990 Dec - 97%

Past performance 1996 Dec - 84%

(b) 4.04 ± 1.77(1.42)

(c) 4.04 ± 2.16(1.42)

(d) 3.78 ± 1.77(1.17)

(e) 3.78 ± 2.16(1.17)

Solution: c

Past performance 1989 Apr - 50% (A-32%)

Past performance 1990 Dec - 90%

Past performance 1996 Dec - 86%

11. It is suspected that weight gain should increase with dose. An appropriate

null and alternate hypothesis to test the slope, the test statistic, and the

p-value are:

(a) H: β1 = 0; A: β1 ≠ 0; T* = 2.85; p-value = .0069

(b) H: β0 = 0; A: β0 ≠ 0; T* = 3.23; p-value = .0066

(c) H: β1 = 0; A: β1 > 0; T* = 2.85; p-value = .0137

(d) H: β0 = 0; A: β0 > 0; T* = 3.23; p-value = .0033

(e) H: β1 = 0; A: β1 > 0; T* = 2.85; p-value = .0069

Solution: e

Past performance 1989 Apr - 49% (C-31%)

Past performance 1996 Dec - 82%
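
The interval and one-sided test for the DOSE slope can be reproduced from the SAS estimates. This is a sketch under stated assumptions: the critical value 2.16 (t with 13 error degrees of freedom, 95% two-sided) is taken from the answer options rather than computed, and the one-sided p-value is obtained by halving the reported two-sided value, which is valid here because the estimate lies in the direction of the alternative.

```python
def slope_interval(estimate, std_error, t_critical):
    """Confidence interval: estimate +/- t_critical * standard error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin

# Values read from the DOSE row of the SAS output.
slope, se_slope = 4.0416, 1.4195
two_sided_p = 0.0137

t_ratio = slope / se_slope                      # test statistic for H: beta1 = 0
lower, upper = slope_interval(slope, se_slope, t_critical=2.16)

# Alternative is beta1 > 0; halve the two-sided p-value when t_ratio > 0.
one_sided_p = two_sided_p / 2 if t_ratio > 0 else 1 - two_sided_p / 2

print(round(t_ratio, 2), round(lower, 2), round(upper, 2), round(one_sided_p, 5))
```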

Growth hormones are often used to increase the weight gain of chickens.

In an experiment using 15 chickens, five different doses of growth hormone

(0, .2, .4, .8, and 1.0 mg/kg) were injected into chickens (three for each

dose) and the subsequent weight gain was recorded. An experimenter

plots the data and finds that a linear relationship appears to hold. The

output from JMP follows:

(a) Ŷ = 4.55 + .617X

(b) Ŷ = 4.83 + 4.55X

(c) Ŷ = 4.83 + 1.02X

(d) Ŷ = 4.55 + 4.75X

(e) Ŷ = 4.55 + 4.83X

Solution: e

Past performance 1996 Dec - 84%

(a) 4.55 ± .617

(b) 4.83 ± 2.03

(c) 4.83 ± 1.02

(d) 4.55 ± 1.33

(e) 4.83 ± 4.75

Solution: b

Past performance 1996 Dec - 86%

14. It is suspected that the weight gain should increase with dose. An appro-

priate null and alternate hypothesis to test the slope, the test statistic,

and the p-value are:

(a) H: β1 = 0, A: β1 ≠ 0; T* = 7.37; p-value < .0001.

(b) H: β0 = 0, A: β0 ≠ 0; T* = 4.75; p-value = .0004.

(c) H: b1 = 0 A:b1 > 0 T* = 7.37; p-value = .0002.

(d) H: b0 = 0 A:b0 > 0 T* = 4.75; p-value = .0002.

(e) H: β1 = 0, A: β1 > 0; T* = 4.75; p-value = .0002.

Solution: e

Past performance 1996 Dec - 82%

(in inches), and X, the number of weeks after planting. The summary

data are: n = 6, $\bar{X} = 4.67$, $\bar{Y} = 9.467$, $\sum X^2 = 154$, $\sum Y^2 = 696.54$,

$\sum XY = 325.9$. The fitted regression line for seedling height on the number

of weeks after planting is:

(a) Ŷ = 2.8 + 2.62X

(b) Ŷ = −2.8 + 2.62X

(c) Ŷ = 2.62 + 2.8X

(d) Ŷ = 9.5 + 2.62X

(e) Ŷ = 2.62X

Solution: b

16. Refer to the previous question. If the number of weeks after planting

ranged from 2 to 8, what is the predicted height for a seedling after 12

weeks?

X may not be linear beyond 8 weeks.

(b) 9.467

(c) 24.804

(d) 28.584

(e) 31.284

Solution: a

17. A research group was interested in predicting the number of bus riders per

capita in census districts. They felt that the rider-ship per capita, Y , could

be predicted using the average income, X, for the census district. A sample

of 29 census districts was taken and the observations on the samples were

used to obtain n = 29, $\bar{Y} = 62.1429$, $\bar{X} = 3452.178$, $\sum (X - \bar{X})(Y - \bar{Y}) = 189{,}312.0$,

$\sum (X - \bar{X})^2 = 19{,}910{,}691.0$, $\sum (Y - \bar{Y})^2 = 13{,}369.381$, and

MSE = 428.5. Based on this data, a 98% confidence interval for β1 is:

(b) .0095 ± (2.33)(.0046)

(c) .0095 ± (2.33)(20.7894)

(d) .0095 ± (2.467)(.0046)

(e) .0095 ± (2.473)(.0046)

Solution: e
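
The pieces of the interval in question 17 follow from the corrected sums of squares. A minimal sketch, assuming the standard formulas slope = Sxy / Sxx and SE(slope) = sqrt(MSE / Sxx); the critical value 2.473 (t with 27 degrees of freedom, 98% two-sided) is taken from the answer options, not computed here.

```python
from math import sqrt

n = 29
sxy = 189312.0       # sum of (X - Xbar)(Y - Ybar)
sxx = 19910691.0     # sum of (X - Xbar)^2
syy = 13369.381      # sum of (Y - Ybar)^2

slope = sxy / sxx
sse = syy - slope * sxy          # residual sum of squares
mse = sse / (n - 2)              # should agree with the MSE quoted in the question
se_slope = sqrt(mse / sxx)

t_critical = 2.473               # assumed from the listed answer (27 df)
lower = slope - t_critical * se_slope
upper = slope + t_critical * se_slope
print(round(slope, 4), round(se_slope, 4), round(mse, 1))
```

The degrees of freedom are n − 2 = 27 because two parameters (intercept and slope) are estimated.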

The effects of a toxic pollutant upon fish was examined by placing fish in

a two liter solution of water with various concentrations of the pollutant.

The time (in minutes) until the fish showed distress was recorded at which

time the fish were removed from the container. A total of 18 different

experiments were performed. Note that the pollutant is measured on a

logarithmic scale where a change of one unit represents an increase of 10

fold in the pollution concentration. A preliminary plot of the data showed

that the relationship of time vs. log(pollution) was approximately linear.

The output appears below:

ERROR           16    6.45556062    0.40347254
CORR. TOTAL     17    8.67015774

                          T FOR H0:                   STD ERROR OF
PARAMETER    ESTIMATE     PARAMETER=0     PR > |T|    ESTIMATE
INTERCEPT      7.5641            3.82       0.0015    1.978
LOGPOLLUT     -1.0269           -2.34       0.0324    0.438

(b) Ŷ = 7.56 − 1.03X

(c) Ŷ = 3.28 − 2.34X

(d) Ŷ = 7.56 − 10.27X

(e) Ŷ = −1.03 + 75.64X

Solution: b

Past performance 1990 Apr - 89%

Past performance 1991 Dec - 93%

(b) −1.03 ± 1.96(0.438)

(c) 7.56 ± 2.1098(1.978)

(d) −1.03 ± 2.1098(.438)

(e) −1.03 ± 2.1199(.438)

Solution: e

Past performance 1990 Apr - 72%(D-14%)

Past performance 1991 Dec - 88%

20. An appropriate null and alternate hypothesis to test the slope, the test

statistic, and the p-value are:

(b) H: β0 = 0; A: β0 ≠ 0; T* = 3.82; p-value = .0007

(c) H: β1 = 0; A: β1 < 0; T* = -2.34; p-value = .0324

(d) H: β0 = 0; A: β0 ≠ 0; T* = 3.82; p-value = .0015

(e) H: β1 = 0; A: β1 < 0; T* = -2.34; p-value = .0162

Solution: e

Past performance 1990 Apr - 48% (A-24%, C-18%)

22. A similar experiment was performed using a second pollutant. The estimated

regression line is found to be Ŷ = 27.63 − 2.03X. Which of the

following is NOT CORRECT?

sented by an increase of 2 on the logarithmic scale), the average time

to distress decreases by 4.06 minutes.

(b) In order to obtain an estimated time to distress of 25 minutes, the

log(concentration ) of the pollutant should be 1.30.

(c) A ten-fold increase in pollution (represented by an increase of one

unit on the log scale) decreases the time to distress by 20.3 minutes.

(d) It would be inadvisable to extrapolate the line outside of the observed

values of the pollutant concentration.

(e) The method of least squares is often used to obtain the estimates of

the slope and intercept.

Solution: c

Past performance 1990 Apr - 70% (A-11%, B-12%)

Past performance 1991 Dec - 56% (a-17%, b-17%)

plot and the fitted regression line are shown below:

(b) Ŷ = 20 - 4X; r = -0.6

(c) Ŷ = 20 - 2X; r = -0.9

(d) Ŷ = 20 - 4X; r = -0.9

(e) Ŷ = 20 - 2X; r = -0.3

Solution: a

Past performance 1990 Apr - 32% (B-12%, C-28%, E-23%)

Past performance 1991 Dec - 38% (b-13%, c-31%, e-11%)

One concern about the depletion of the ozone layer is that the increase

in UV light will decrease crop yields. An experiment was conducted in a

green house where soybean plants were exposed to varying levels of UV

levels - measured in Dobson units. At the end of the experiment the

yield (kg) was measured. A regression analysis was performed with the

following results:

Here is some output:

(a) which minimizes the sum of the squared differences between the ac-

tual UV values and the predicted UV values.

(b) which minimizes the sum of the squared residuals between the actual

yield and the predicted yield.

(c) which minimizes the sum of the squared differences between the actual

yield and the predicted UV.

(d) which minimizes the sum of the squared residuals between the actual

UV reading and the predicted UV reading.

(e) which minimizes the total variation in the data.

Solution: b

Past performance 1993 Apr - 36% (a-14%; c-25%; e-18%)

Past performance 1997 Aug - 60% (a-15%; d-15%)

Past performance 2006 Oct - 60% (c-15%; e-10%)

(a) If the UV reading is increased by 1 Dobson unit, the yield is expected

to increase by .0463 kg.

(b) If the yield increases by 1 kg, the UV reading is expected to decline

by .0463 Dobson units.

(c) The estimated yield is 3.98 kg when the UV reading is 0 Dobson

units.

(d) The predicted yield is 4.3 kg when the UV reading is 20 Dobson units.

(e) The t-ratios are used to test if the estimated slope are different from

zero.

Solution: c

Past performance 1993 Apr - xx% (b-42%; e-10%)

Past performance 1997 Aug - xx% (b-14%)

Past performance 2006 Oct - 86% (b-14%)

26. A 95% confidence interval for the slope will be centered on the estimated

slope and:

(a) ±0.011

(b) ±0.108

(c) ±0.054

(d) ±0.046

(e) ±0.021

Solution: e

Past performance 1993 Apr - 37% (a-18%; c-18%; d-20%)

Past performance 1997 Aug - 87%

27. The null and alternate hypothesis for a test of the slope, the test statistic,

and the p- value are:

(b) H: β0 = 0; A: β0 < 0; T* = -74.01; p-value < .0001.

(c) H: β1 = 0; A: β1 < 0; T* = -4.31; p-value = .0004.

(d) H: β̂1 = 0; A: β̂1 < 0; T* = -4.31; p-value = .0004.

(e) H: β̂1 = 0; A: β̂1 ≠ 0; T* = -4.31; p-value = .0008.

Solution: c

Past performance 1993 Apr - 72% (d-18%)

Past performance 1997 Aug - 74% (d-18%)

28. A 95% confidence interval for the mean yield when the UV reading is 20

Dobson units is:

(a) 3.3 ± 0.86

(b) 3.3 ± 2.12

(c) 3.3 ± 0.40

(d) 3.3 ± 0.98

(e) 3.3 ± 0.71

Solution: a

Past performance 1993 Apr - 23% (b-25%; c-22%; d-21%; e-10%)

29. Another experiment was conducted where the plants were sprayed with a

chemical that acts like a sun-screen. The following plot was obtained:

(a) 0.06 1.10

(b) 1.10 0.06

(c) 0.10 0.06

(d) 0.06 0.10

(e) 0.10 1.10

Solution: d - note that the intercept is the value of Y when X = 0, but

the vertical axis does not occur at X = 0 in the above graph.

Past performance 1993 Apr - 11% (a-65%; d-10%; e-10%)

Past performance 2006 Oct - 47% (a-44%)

Which of the following provides the most reasonable approximation to the

least squares regression line?

(a) Ŷ = 50 + 10X

(b) Ŷ = 50 + X

(c) Ŷ = 10 + 50X

(d) Ŷ = 1 + 50X

(e) Ŷ = 10 + X

Solution: a

Past performance 1990 Dec - 80%

31. In simple linear regression the model that is being assumed relates the

Dependent Variable, Y , to the Independent Variable, X, according to the

following relationship: $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$, $i = 1, 2, \ldots, n$. For setting

up confidence interval statements for the parameter β1 based on the least

squares estimates, it is necessary to make the following assumption(s)

about the $\epsilon_i$'s:

(b) they are normally distributed

(c) they have a common variance, σ²

(d) all of the above.

(e) least squares is purely a mathematical technique so no assumptions

are required.

32. A marine biologist wants to test the effect of water temperature on the

average dive duration for sea otters. Several otters are available for an

experiment. The biologist collects the following data:

Otter    Water Temp (C), X    Dive Duration (sec), Y
J2               4                     63
J1               8                     75
B7               8                     84
B9              12                     91
M3              12                    101
D4              16                    110
B8              20                    115

The summary statistics are: $\sum X = 80$, $\sum Y = 639$, $\sum X^2 = 1088$, $\sum Y^2 = 60457$, $\sum XY = 7888$.

The least squares regression line is equal to:

(a) Ŷ = 3.4 + 52X

(b) Ŷ = 8.4 + 7.3X

(c) Ŷ = 4.7 + 21X

(d) Ŷ = 53 + 3.4X

(e) Ŷ = 50 − 3.3X

Solution: not available
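
Although no solution is recorded for this question, the fit can be evaluated from the seven otters listed in the question. The sketch below applies the ordinary least squares formulas to the raw data (the function name `fit_line` is illustrative); its output can be compared with the options.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Water temperature (C) and dive duration (sec) for otters J2, J1, B7, B9, M3, D4, B8.
temps = [4, 8, 8, 12, 12, 16, 20]
durations = [63, 75, 84, 91, 101, 110, 115]

intercept, slope = fit_line(temps, durations)
print(f"fitted line: {intercept:.1f} + {slope:.2f} X")
```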

For each circle they guessed the actual area, and then measured the actual

area. The scatter-plot had the guessed areas on the vertical axis and the

actual areas on the horizontal axis. A line was fit to these data

points. One student’s fitted line was Guessed area = 5 + .65 Actual area.

Which of the following is not correct?

(a) The student guessed that a circle has an area of 125 mm². A better

guess would be 86 mm².

(b) The slope in the above equation indicates that, on average, a student

increases her guess by only .65 mm² for every 1 mm² increase in

actual area.

(c) “Calibration” refers to the process where the relationship between the

guessed and real areas is used to correct future guesses.

(d) If the fitted regression line tends to fall below the “45° line”, then this

student tends to underestimate real areas.

(e) The fitted straight line was fit using “least squares”. This line mini-

mizes the sum of the square of the deviations between the actual and

predicted values.

Solution: a

Past performance 1997 Jun - 76%

vs. the amount of fat gave the following results: Calories = 97.1053 +

9.6525 Fat. Which of the following is FALSE:

(a) It is estimated that for every additional gram of fat in the cereal, the

number of calories increases by about 9.

(b) It is estimated that in cereals with no fat, the total amount of calories

is about 97.

(c) If a cereal has 2 g of fat, then it is estimated that the total number

of calories is about 115.

(d) If a cereal has about 145 calories, then this equation indicates that

it has about 5 grams of fat.

(e) One cereal has 140 calories and 5 g of fat. Its residual is about 5 cal.

−5

Past performance 1998 Oct - 55% (12% a; 13% b; 16% e)

35. A selection of cereals was sampled and the number of calories was plotted

against the number of grams of protein with the following results:

(a) The 95% confidence interval for the number of calories per gram of

protein indicates that the known true value of 4 cal/gram may be

consistent with the data.

(b) It is estimated that cereals with no protein would have just over 100

calories/serving.

(c) The observed regression line is Y = 106.0 + .339(protein)

(d) One plausible reason that the confidence interval for the slope is so

wide is that confounding variables may cloud the relationship be-

tween calories and grams of protein.

(e) The standard error for the slope indicates how much the calories may

vary among different cereals in the sample.

Solution: e

Past performance 1998 Nov - 53% (15% a)

Fitness can be measured by the rate of oxygen consumption during exer-

cises with more fit people having higher rates. Unfortunately, this mea-

surement is quite costly to obtain, and so an experiment was done to see

if this measurement could be predicted from the time it takes (in minutes)

to run 1500 m. The following output from JMP was obtained - the M and

F refer to males and females respectively.

36. Which of the following is NOT CORRECT?

(a) We are about 95% confident that the slope for this data is between

-4.0 and -2.5.

(b) The fitted regression line is approximately Ŷ = 82.42 − 3.31(runtime)

(c) There is good evidence that there is a relationship between oxygen

consumption and the run time.

(d) A person who runs 1500 m in 10 minutes would have an estimated

oxygen consumption rate of about 50.

(e) The se of .36 measures how much the estimated slope would vary if

another sample of people were measured.

Solution: a

Past performance 1998 Dec - 39% (16% c; 39% e)

(a) The most relevant null hypothesis is that the estimated change in

oxygen consumption for people who take an additional minute to

run 1500 m is 0.

(b) The most relevant null hypothesis is: H: β1 = −3.31.

(c) The most relevant null hypothesis is that there is no relationship

between the oxygen consumption rate and the time to run 1500 m

among all people.

(d) The most relevant null hypothesis is that we are 95% confident that

the slope is between -4.04 and -2.57.

(e) The most relevant null hypothesis is that we haven’t a clue what this

question is about.

Solution: c

Past performance 1998 Dec - 68% (18% a; 10% b)

38. In the above graph, both males and females appear to have the same

relationship. However, this is, in general, not true. If the relationship

for each group was not the same, then which of the following is NOT

CORRECT?

(a) The slope for the combined data could be substantially different than

either group’s slope.

(b) The intercept for the combined data could be substantially different

than either group’s intercept.

(c) The sample correlation in the combined group could be substantially

different than either group’s correlation.

(d) The combined results may be influenced by a lurking variable, in this

case gender.

(e) The median oxygen consumption for the combined group will be the

average of the medians of each group.

Solution: e

Past performance 1998 Dec - 82%

Multiple Choice Questions

Regression, Correlation, Trends

tially over time is by:

(a) plotting the variable against time and looking for a straight-line pat-

tern.

(b) calculating the least squares regression line of the variable against

time and examining the residuals.

(c) plotting the logarithm of the variable against time and looking for a

straight line pattern.

(d) smoothing the time series by running medians of three or five.

(e) smoothing the scatter plot by median trace

Solution: c

that the revenue is consistently highest in December. The high December

revenue is an illustration of:

(a) trend

(b) seasonal variation

(c) irregular fluctuations

(d) a cycle

(e) ??????

Solution: not available

3. The following data come from a time series of yearly sales of equipment

by a large manufacturer:

Units Sold 330 241 200 499 322 500 601

In order to smooth this series a running median of 3 is calculated. The

smoothed series for the years 1969 to 1973 respectively is:

(b) 330 499 499 500 601

(c) 241 241 322 499 500

(d) 257 313 340 440 474

(e) not enough information is given for us to determine the values.
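
A running median of 3 replaces each interior value with the median of itself and its two neighbours, so a series of seven values yields five smoothed values. A minimal sketch applied to the units-sold series in the question (the function name is illustrative; the two end points are simply left out, which is one common convention and an assumption here):

```python
from statistics import median

def running_median_of_3(series):
    """Median of each value and its two neighbours; end points are dropped."""
    return [median(series[i - 1:i + 2]) for i in range(1, len(series) - 1)]

units_sold = [330, 241, 200, 499, 322, 500, 601]
smoothed = running_median_of_3(units_sold)
print(smoothed)
```

Because medians ignore extreme neighbours, the isolated jump to 499 in the middle of the series is damped in the smoothed values.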

4. The following plot is the net sales (billions of dollars) for Eastman Kodak

Ltd. for the years 1970 through 1989 (1970 is coded as 0):

a(n) pattern in the data.

(b) data set, stem and leaf

(c) linear model, correlation

(d) time series, trend

(e) regression model, multiple variable

oil in the last 100 years, can both be described as being:

(b) well represented by a straight line.

(c) approximately exponential growth.

(d) difficult to determine without detailed statistical analysis.

(e) regular with large residuals.

Multiple Choice Questions

Sampling Distributions

1. The Gallup Poll has decided to increase the size of its random sample of

Canadian voters from about 1500 people to about 4000 people. The effect

of this increase is to:

(a) reduce the bias of the estimate.

(b) increase the standard error of the estimate.

(c) reduce the variability of the estimate.

(d) increase the confidence interval width for the parameter.

(e) have no effect because the population size is the same.

Solution: c

Past performance 1992 Dec - 65% (11%a, 16%e)

Past performance 1997 Jul - 92%

weights of passengers traveling by air between Toronto and Vancouver

have a mean of 78 kg and a standard deviation of 7 kg, the approximate

probability that the combined weight of 100 passengers will exceed 8,000

kg is:

(a) 0.4978

(b) 0.3987

(c) 0.1103

(d) 0.0044

(e) .0022

Solution: e

Past performance 1996 Nov - 84% (10%-b)

Past performance 1997 Aug - 73% (18%-b)

Past performance 1998 Nov - 85%

Past performance 1998 Dec - 88%
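
The probability for a combined weight follows from the distribution of a sum: for n independent passengers the total has mean n·µ and standard deviation √n·σ. A sketch assuming the total is approximately normal, using Python's standard-library `statistics.NormalDist`:

```python
from math import sqrt
from statistics import NormalDist

n = 100
mean_weight = 78.0   # kg per passenger
sd_weight = 7.0      # kg per passenger

# Approximate distribution of the total weight of n independent passengers.
total_weight = NormalDist(mu=n * mean_weight, sigma=sqrt(n) * sd_weight)

p_exceed = 1 - total_weight.cdf(8000)
print(round(p_exceed, 4))
```

Small differences from the listed answer are expected, since table lookups round the z-score to two decimals.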

3. Government regulations indicate that the total weight of cargo in a certain

kind of airplane cannot exceed 330 kg. On a particular day a plane is

loaded with 100 boxes of goods. If the weight distribution for individual

boxes is normal with mean 3.2 kg and standard deviation 0.7 kg, what is

the probability that the regulations will NOT be met:

(a) 1.5%

(b) 92%

(c) 8%

(d) 15%

(e) 85%

Solution: c

Past performance 1997 Jul - 75%

Past performance 2006 Nov - 78%

tributed with a mean of 12 minutes and a standard deviation of 1.5 min.

Find the probability that the time required to assemble all nine compo-

nents (i.e. the total assembly time) is greater than 117 minutes.

(a) .2514

(b) .2486

(c) .4772

(d) .0228

(e) .0013

Solution: d

is a normal random variable with a mean of $200 and a standard deviation

of $50. What is the probability that the total amount in a random sample

of 20 orders is greater than $4500?

(a) .1915

(b) .0125

(c) .3085

(d) .0228

(e) .4875

Solution: b

6. A random sample of 100 observations is to be drawn from a population

with a mean of 40 and a standard deviation of 25. The probability that

the mean of the sample will exceed 45 is:

(a) 0.4772

(b) 0.4207

(c) 0.0793

(d) 0.0228

(e) not possible to compute, based on the information provided.

Solution: d
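
For a sample mean, the relevant spread is the standard error σ/√n rather than σ itself. A minimal sketch for question 6, assuming the sample mean is approximately normal for n = 100:

```python
from math import sqrt
from statistics import NormalDist

pop_mean, pop_sd, n = 40.0, 25.0, 100

standard_error = pop_sd / sqrt(n)                 # 25 / 10 = 2.5
sample_mean_dist = NormalDist(mu=pop_mean, sigma=standard_error)

z = (45 - pop_mean) / standard_error              # how many SEs above the mean
p_exceed = 1 - sample_mean_dist.cdf(45)
print(z, round(p_exceed, 4))
```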

distribution of the sample mean:

(a) The standard error of the sample mean will decrease as the sample

size increases.

(b) The standard error of the sample mean is a measure of the variability

of the sample mean among repeated samples.

(c) The sample mean is unbiased for the true (unknown) population

mean.

(d) The sampling distribution shows how the sample mean will vary

among repeated samples.

(e) The sampling distribution shows how the sample was distributed

around the sample mean.

Solution: e

Past performance 1990 Dec - 40% (c-18%, d-24%)

Past performance 1991 Dec - 41% (a-10%, c-27%, d-18%)

8. The sample mean is an unbiased estimator for the population mean. This

means:

(b) The average sample mean, over all possible samples, equals the pop-

ulation mean.

(c) The sample mean is always very close to the population mean.

(d) The sample mean will only vary a little from the population mean.

(e) The sample mean has a normal distribution.

Solution: b

Past performance 1989 Dec - 77%
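
Unbiasedness can be verified exactly on a tiny population by listing every possible sample. The sketch below uses a hypothetical population of five values (not from the question) and all samples of size 2 drawn without replacement; exact fractions avoid rounding error.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 7, 8, 12, 20]      # hypothetical population values
sample_size = 2

pop_mean = Fraction(sum(population), len(population))

# Mean of every possible sample of the chosen size.
sample_means = [
    Fraction(sum(sample), sample_size)
    for sample in combinations(population, sample_size)
]

average_of_sample_means = sum(sample_means) / len(sample_means)
print(pop_mean, average_of_sample_means)
```

Individual sample means range widely around the population mean, which is why unbiasedness says nothing about how close any single sample mean will be.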

9. Which of the following statements is NOT CORRECT?

known (and often equal) chance of being selected.

(b) The precision of a sample mean or sample proportion depends only

upon the sample size (and not the population size) in a proper ran-

dom sample.

(c) Convenience sampling often leads to biases in estimates because the

sample is often not representative of the population.

(d) If a sample of 1,000,000 families is randomly selected from all of

Canada (with about 8,000,000 families) and the average family in-

come is computed, then the true value of the family income for all

families in Canada is known.

(e) The sampling distribution of the sample mean describes how the

sample mean will vary among repeated samples.

Solution: d

Past performance 1989 Dec - 92%

Past performance 1990 Dec - 90%

(a) the distribution of the various sample sizes which might be used in a

given study

(b) the distribution of the different possible values of the sample mean

together with their respective probabilities of occurrence

(c) the distribution of the values of the items in the population

(d) the distribution of the values of the items actually selected in a given

sample

(e) none of the above

Solution: b

11. The average monthly mortgage payment for recent home buyers in Win-

nipeg is µ = $732, with standard deviation of σ = $421 A random sample

of 125 recent home buyers is selected. The approximate probability that

their average monthly mortgage payment will be more than $782 is:

(a) 0.9082

(b) 0.4522

(c) 0.4082

(d) 0.0478

(e) 0.0918

Solution: e

12. Cans of salmon have a nominal net weight of 250 g. However, due to

variation in the canning process, the actual net weight has an approximate

normal distribution with a mean of 255 g and a standard deviation of 10

g. According to Consumer Affairs, a sample of 16 tins should have less

than a 5% chance that the mean weight is less than 250 g. What is the

actual probability that a sample of 16 tins will have a mean weight less

than 250 g?

(a) .1915

(b) .3085

(c) .0228

(d) .4772

(e) .0500

Solution: c

Past performance 1993 Apr - 58% (b-32%)

Past performance 1996 Nov - 77% (b-19%)

closely by a normal curve

(b) if n is large, and if the population is normal, then the variance of the

sample mean must be small.

(c) if n is large, then the sampling distribution of the sample mean can

be approximated closely by a normal curve

(d) if n is large, and if the population is normal, then the sampling

distribution of the sample mean can be approximated closely by a

normal curve

(e) if n is large, then the variance of the sample must be small.

Solution: c

Which statement is generally correct?

(a) µ is an estimate of X̄; σ is an estimate of s.

(b) X̄ is an estimate of µ; s is an estimate of σ.

(c) µ is an estimate of X̄; s is an estimate of the standard deviation of
the sample mean.

(d) X̄ is an estimate of µ; s is an estimate of the standard deviation of
the sample mean.

(e) X̄ is an estimate of µ; s is the standard error of the sample mean.

Solution: b

15. The central limit theorem tells us that the sampling distribution of the sample mean is

approximately normal. Which of the following conditions are necessary

for the theorem to be valid:

(b) We have to be sampling from a normal population.

(c) The population has to be symmetric.

(d) Population variance has to be small

(e) Both A and C.

Solution: a

to use the normal distribution to make inferences concerning the popula-

tion mean:

(a) provided that the population is normally distributed and the sample

size is reasonably large.

(b) provided that the population is normally distributed (for any sample

size).

(c) provided that the sample size is reasonably large (for any population).

(d) provided that the population is normally distributed and the popu-

lation variance is known (for any sample size).

(e) provided that the population size is reasonably large (whether the

population distribution is known or not).

Solution: c

(b) it guarantees that , when it applies, the samples that are drawn are

always randomly selected.

(c) it enables reasonably accurate probabilities to be determined for

events involving the sample average when the sample size is large

regardless of the distribution of the variable

(d) it tells us that if several samples have produced sample averages

which seem to be different than expected, the next sample average

will likely be close to its expected value.

(e) it is the basis for much of the theory that has been developed in the

area of discrete random variables and their probability distributions.

Solution: c

18. One class decided to estimate the proportion of cars that are red in a

parking lot. They took a random sample of the cars in the closest parking

lot to the class. Which of the following is NOT correct?

(a) Even though the sample was a random sample of cars in the parking

lot, the sample may not be representative of the population of cars

driven by SFU students because the decision to park in B-lot is a

self-selected sample.

(b) If another sample of cars was taken, it is likely that a different propor-

tion for Japanese made cars would be found. The set of all possible

values for the proportion is known as the sampling distribution.

(c) The confidence interval computed refers to the proportion of cars in

the sample that were red.

(d) The sample was a simple random sample from cars parked. This

means that every car in the lot had an equal chance of being selected.

(e) A convenience sample could be chosen by selecting the first 25 cars

in the parking lot that are closest to the Applied Science Building.

Solution: c

Past performance 1996 Nov - 82%

19. Recall in one assignment you surveyed cars in a parking lot to estimate

the proportion that were red or the proportion that were from a Japanese

manufacturer. Which of the following is NOT CORRECT?

(a) A convenience sample of the cars closest to the Applied Science build-

ing may give a biased estimate of the proportion of cars which are

from a Japanese manufacturer.

(b) Different students may get different answers for the proportion of

cars that are red.

(c) The sample proportion of cars that are red is an unbiased estimate of

the population proportion if the sampling is a simple random sample.

(d) A sample of 100 cars in a convenience sample is always better than

a sample of 20 cars from a proper random sample.

(e) A sample of 100 cars from a proper random sample will give more

precise estimates of the proportion of cars that are red than a sample

of 20 cars from a proper random sample.

Solution: d

Past performance 2006 Nov - 92%

(a) The sample standard deviation measures variability of our sample

values.

(b) A larger sample will give answers that vary less from the true value

than smaller samples (assuming both are properly chosen).

(c) The sampling distribution describes how our estimate (answer) will

vary if a new sample is taken.

(d) The standard error measures how much our estimate (answer) may

vary if a new sample of the same size is chosen using the same sam-

pling method.

(e) A large sample size always gives unbiased estimators regardless of

how the sample is chosen.

Solution: e

Past performance 2006 Nov - 93%

Multiple Choice Questions

Hypothesis Testing - Introduction

1 Testing - Introduction

1. To determine the reliability of experts used in interpreting the results of

polygraph examinations in criminal investigations, 280 cases were studied.

The results were:

                          True Status
                       Innocent   Guilty
Examiner's  Innocent      131       15
Decision    Guilty          9      125

we could estimate the probability of making a type II error as:

(a) 15/280

(b) 9/280

(c) 15/140

(d) 9/140

(e) 15/146

Solution: c

The second column percentage is the probability that the examiner
concludes a person is innocent or guilty, given that the person is guilty.
This is what is required for a Type II error, i.e. conditional upon the
person really being guilty.

Past performance 1993 Feb - 13% (a-65%; e-13%)
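The conditional proportions in the polygraph question can be checked with a short sketch. It assumes, consistently with the solution given, that the null hypothesis is "the person is innocent"; the dictionary layout is purely illustrative.

```python
# Counts from the polygraph table, keyed as (examiner's decision, true status).
counts = {
    ("innocent", "innocent"): 131,  # decided innocent, truly innocent
    ("innocent", "guilty"): 15,     # decided innocent, truly guilty
    ("guilty", "innocent"): 9,      # decided guilty, truly innocent
    ("guilty", "guilty"): 125,      # decided guilty, truly guilty
}

# Column totals: condition on the true status, not on the decision.
truly_innocent = counts[("innocent", "innocent")] + counts[("guilty", "innocent")]
truly_guilty = counts[("innocent", "guilty")] + counts[("guilty", "guilty")]

# Assumption: H0 is "the person is innocent".
type_1_rate = counts[("guilty", "innocent")] / truly_innocent  # 9 / 140
type_2_rate = counts[("innocent", "guilty")] / truly_guilty    # 15 / 140

print(type_1_rate, type_2_rate)
```

Dividing by 280 (the grand total) or by 146 (the row total) answers a different question, which is why options (a) and (e) attracted so many responses.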

II. The power of the test, 1 − β is then:

(c) the probability of failing to reject H0 when H0 is true

(d) the probability of rejecting H0 when H0 is true

(e) the probability of failing to reject H0 .

Solution: a

when α, the level of significance, is reduced?

(b) The rejection region is reduced in size.

(c) The rejection region is increased in size.

(d) The rejection region is unaltered.

(e) The answer depends on the form of the alternative hypothesis.

Solution: b

warning light indicates that the fuel gauge may be broken. If Jones decides

to check the fuel level by hand, it will delay the flight by 45 minutes. If

Jones decides to ignore the warning, the aircraft may run out of fuel before

it gets to Gimli. In this situation, what would be:

ii) a type I error?

Type I error: decide to check the fuel by hand when there is in fact

enough fuel.

(b) Null Hypothesis: assume that the warning can be ignored.

Type I error: decide to ignore the warning when there is in fact not

enough fuel.

(c) Null Hypothesis: assume that the fuel should be checked by hand.

Type I error: decide to ignore the warning when there is in fact not

enough fuel.

(d) Null Hypothesis: assume that the fuel should be checked by hand.

Type I error: decide to check the fuel by hand when there is in fact

enough fuel.

(e) Null Hypothesis: assume that the aircraft is already late.

Type I error: taking a commercial flight to Gimli in the first place.

α level.

(b) The probability of a Type II error is controlled by the sample size.

(c) The power of a test depends upon the sample size and the distance

between the null and alternate hypothesis.

(d) The p-value measures the probability that the null hypothesis is true.

(e) The rejection region is controlled by the α level and the alternate

hypothesis.

Solution: d

Past performance 1991 Apr - 55%

(a) The critical region is the values of the test statistic for which we

reject the null hypothesis.

(b) The level of significance is the probability of type I error.

(c) For testing H0 : µ = µ0 , HA : µ > µ0 , we reject H0 for high values of
the sample mean X̄.

(d) In testing H0 : µ = µ0 , HA : µ ≠ µ0 , the critical region is two sided.

(e) The p-value measures the probability that the null hypothesis is true.

Solution: e

(a) Probability of rejecting H0 when H0 is true.

(b) Probability of not rejecting H0 when H0 is true.

(c) Probability of not rejecting H0 when HA is true.

(d) Probability of rejecting H0 when HA is true

(e) 1 − β.

Solution: b

Accept $H_0$ (1) (2)

Reject $H_0$ (3) (4)

(b) The P(making entry (2)) is controlled by the sample size for a given

α level.

(c) A Type I error occurs if entry (3) occurs.

(d) Power refers to P(entry (4))

(e) A Type II error occurs when entry (1) is made.

Solution: e

Past performance 1991 Feb - 66% (a-12%, c-12%)

(a) the null hypothesis will not be rejected unless the data are not un-

usual (given that the hypothesis is true).

(b) the null hypothesis will not be rejected unless the p-value indicates

the data are very unusual (given that the hypothesis is true).

(c) the null hypothesis will not be rejected only if the probability of

observing the data provide convincing evidence that it is true.

(d) the null hypothesis is also called the research hypothesis; the alter-

native hypothesis often represents the status quo.

(e) the null hypothesis is the hypothesis that we would like to prove; the

alternative hypothesis is also called the research hypothesis.

Solution: b

Past performance 1993 Apr - 59% (c-26%; e-10%)

Past performance 1997 Aug - 93%

of 15 experimental plots in a field. Following the collection of data, a

test of significance was conducted under appropriate null and alternative

hypotheses and the P-value was determined to be approximately .03. This

indicates that:

(b) the probability of being wrong in this situation is only .03.

(c) there is some reason to believe that the null hypothesis is incorrect.

(d) if this experiment were repeated 3 per cent of the time we would get

this same result.

(e) the sample is so small that little confidence can be placed on the

result.

Solution: c

Past performance 1996 Dec - 82%

Past performance 1998 Nov - 80%

α = 0.05,

(a) 95% of the time we will make an incorrect inference

(b) 5% of the time we will say that there is a real difference when there

is no difference

(c) 5% of the time we will say that there is no real difference when there

is a difference

(d) 95% of the time the null hypothesis will be correct

(e) 5% of the time we will make a correct inference

Solution: b

Note that (b) is a Type I error; (c) is a Type II error.

The α level controls the Type I error rate.

(a) An extremely small p-value indicates that the actual data differs

markedly from that expected if the null hypothesis were true.

(b) The p-value measures the probability that the hypothesis is true.

(c) The p-value measures the probability of making a Type II error.

(d) The larger the p-value, the stronger the evidence against the null

hypothesis

(e) A large p-value indicates that the data is consistent with the alter-

native hypothesis.

Solution: a

Past performance 1998 Dec - 87%

Multiple Choice Questions

Hypothesis Testing - Multinomial proportions

from a single sample

There are extensive breeding programs for salmon on the West Coast of

Canada to enhance the salmon fishery. One question of interest is whether

inbreeding affects subsequent fitness of the fish. An experiment was conducted

where released salmon were classified as unrelated if the parents were unrelated,

half-sibs if one of the parents was in common, and full-sibs if both parents

were in common. In one release, 25% of the fish were half-sibs, 40% were

unrelated, and 35% were full-sibs. Of 237 returning adult salmon, 45% were

unrelated, 25% were full-sibs, and 30% were half-sibs.

(b) The return rate is dependent upon the relatedness of the fish.

(c) The return rates are 45%, 25%, and 30% for unrelated, full-sibs, and

half-sibs respectively.

(d) The return rates are 40%, 35%, and 25% for unrelated, full-sibs, and

half-sibs respectively.

(e) The release percentages are different from the return percentages.

Solution: d

(d) is preferred over (a) because the hypothesis of independence
is only applicable when there are two classification variables. Here
there is only one variable - the sibship. Also, the proportions that should
return when H is true are known exactly. In the contingency table
analysis, you test if the proportions are the same for all the groups,
but the actual proportions are unknown.

Past performance 1993 Apr - 33% (a-54%)

Past performance 1997 Aug - 82% (a-11%)

(a) 13.1

(b) 4.5

(c) 5.4

(d) 10.8

(e) 6.0

Solution: d

Past performance 1993 Apr - 73% (b-10%; c-10%)

(a) < .005

(b) between .005 and .01

(c) between .01 and .02

(d) between .02 and .05

(e) > .05

Solution: a

Past performance 1993 Apr - 62% (c-14%; e-13%)

(a) .29958

(b) 71

(c) 10.81

(d) 25%

(e) 59.25

Solution: e

Past performance 1997 Aug - 84%

(a) .0034

(b) .0068

(c) .0090

(d) .0045

(e) .0022

Solution: d

Past performance 1997 Aug - 89%
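The numerical answers in the salmon questions above (a test statistic near 10.8, an expected count of 59.25, and a p-value near .0045) can be reproduced with a goodness-of-fit sketch. The observed counts are an assumption: they are obtained by rounding 45%, 25% and 30% of 237 to whole fish. The function name `chi_square_gof` is hypothetical.

```python
import math


def chi_square_gof(observed, null_proportions):
    """Return expected counts and the chi-square goodness-of-fit statistic."""
    total = sum(observed)
    expected = [total * p for p in null_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic


# Category order: unrelated, full-sibs, half-sibs.
observed = [107, 59, 71]         # assumption: 45%, 25%, 30% of 237, rounded
release = [0.40, 0.35, 0.25]     # release proportions stated in the problem

expected, statistic = chi_square_gof(observed, release)

# Three categories give 2 degrees of freedom; for 2 df the chi-square
# upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-statistic / 2)

print(expected, round(statistic, 2), round(p_value, 4))
```

The closed form for the tail probability only holds for 2 degrees of freedom; other cases need tables or a statistics library.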

The paper “Linkage Studies of the Tomato” (Trans. Royal Canad. Inst.

(1931)) reported the accompanying data on phenotypes resulting from

crossing tall cut-leaf tomatoes with dwarf potato-leaf tomatoes. We wish

to investigate if the frequencies below are consistent with the Mendelian

laws which state the phenotypes should occur in the ratio 9:3:3:1.

Phenotype    Tall       Tall          Dwarf      Dwarf
             cut-leaf   potato-leaf   cut-leaf   potato-leaf
Frequency    926        288           293        104

(a) 7.81

(b) 5.99

(c) 1.18

(d) 1.47

(e) 964.01

Solution: d

Past performance 1991 Apr - 90%

(a) 7.81

(b) 5.99

(c) 3.84

(d) 9.49

(e) 11.07

Solution: a

Past performance 1991 Apr - 94%
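For the tomato data, expected counts come from splitting the total in the ratio 9:3:3:1. The sketch below is one way to check the test statistic and the decision; the critical value 7.81 (3 degrees of freedom, α = 0.05) is taken from tables rather than computed, and the dictionary keys are illustrative labels.

```python
observed = {
    "tall cut-leaf": 926,
    "tall potato-leaf": 288,
    "dwarf cut-leaf": 293,
    "dwarf potato-leaf": 104,
}
ratio = {
    "tall cut-leaf": 9,
    "tall potato-leaf": 3,
    "dwarf cut-leaf": 3,
    "dwarf potato-leaf": 1,
}

total = sum(observed.values())       # 1611 plants
ratio_sum = sum(ratio.values())      # 16 parts

# Expected count for each phenotype under the 9:3:3:1 hypothesis.
expected = {k: total * ratio[k] / ratio_sum for k in observed}

statistic = sum(
    (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
)

critical_value = 7.81                # chi-square table, 3 df, alpha = 0.05
reject = statistic > critical_value

print(round(statistic, 2), reject)
```

The statistic is about 1.47, well below 7.81, so the data give no evidence against the 9:3:3:1 ratio.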

Number of spots | 1 2 3 4 5 6

Frequency | 1 4 9 9 2 5

If a chi-square goodness of fit test is used to test the hypothesis that the

die is fair at a significance level of α = 0.05, then the value of the chi-square

statistic and the decision reached are:

(b) 11.6; accept hypothesis

(c) 22.1; reject hypothesis

(d) 22.1; accept hypothesis

(e) 42.0; reject hypothesis

Solution: a
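The arithmetic for the die question, written out. The 30 rolls give an expected count of 5 per face under the hypothesis of a fair die, and the critical value 11.07 (5 degrees of freedom, α = 0.05) is taken from tables:

```latex
\[
\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}
       = \frac{(1-5)^2 + (4-5)^2 + (9-5)^2 + (9-5)^2 + (2-5)^2 + (5-5)^2}{5}
       = \frac{58}{5} = 11.6 .
\]
\[
11.6 > \chi^2_{0.05,\,5} = 11.07
\quad\Longrightarrow\quad \text{reject the hypothesis that the die is fair.}
\]
```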

the instructor suspected that all 300 students who answered the question

simply picked an answer at random. The distribution of students’ answers

to the question is as follows:

answer Frequency

A 68

B 53

C 61

D 75

E 43

p3 = p4 = p5 = .2 and H1 : not all pi = .2, where pi denotes the probability

of choosing answer i. The value of the test statistic is:

(a) 11.60

(b) 10.47

(c) 190.76

(d) 310.47

(e) 48

Solution: b

Past performance 1989 Apr - 87%

10. The following table gives the number of wins for each of the first four post

positions at Assiniboine Downs for 80 races during the 1978 horse-racing

season.

Post Position 1 2 3 4

Number of wins 24 17 19 20

For testing the hypothesis that the probability of winning is the same for

all four post positions, the calculated value of the test statistic is:

(a) 26.00

(b) 1.25

(c) 1.30

(d) 0.40

(e) 20.00

Solution: c

A recent estimate by a large distributor of gasoline claims that 60% of all

cars stopping at their service stations chose unleaded gas and that super

unleaded and regular were each selected 20% of the time. In order to check

the validity of these proportions, a study was conducted of cars stopping

at the distributor’s service stations in a large city. The results were as

follows:

Gasoline Selected

Regular Unleaded Super Unleaded

51 261 88

11. The expected cell counts assuming the distributor’s claim is correct are:

(b) 51, 261, 88

(c) 80, 240, 80

(d) 133, 133, 133

(e) 20%, 60%, 20%

Solution: c

12. If α=0.05, then the value of the appropriate test statistic and the critical

value respectively are:

(b) 13.15, 5.99

(c) 21.75, 7.81

(d) 13.15, 7.81

(e) 13.15, 7.38

Solution: b

Past performance 1990 Apr - 82%

A recent estimate by a large distributor of gasoline claims that 60% of all

cars stopping at their service stations chose unleaded gas and that super

unleaded and regular were each selected 20% of the time. In order to check

the validity of these proportions, a study was conducted of cars stopping

at the distributor’s service stations in a large city. The results were as

follows:

Gasoline Selected

Regular Unleaded Super Unleaded

51 261 88

(a) pregular = .333; punleaded =.333; psuper = .333

(b) pregular = .200; punleaded =.600; psuper = .200

(c) p̂regular = .200; p̂unleaded = .600; p̂super = .200

(d) gasoline selected is independent of the type of car

(e) the probability of each type of gasoline is equal

Solution: b

(d) is not valid because there is no classification by type of car in this

survey

Past performance 1996 Dec - 71% (12%-c)

14. The expected cell counts assuming the distributor’s claim is correct are:

(a) 100, 200, 100

(b) 51, 261, 88

(c) 80, 240, 80

(d) 133, 133, 133

(e) 20%, 60%, 20%

Solution: c

Past performance 1996 Dec - 93%

15. The value of the appropriate test statistic and approximate p-value , re-

spectively, are:

(b) 13.15, .0014

(c) 14.64, .00035

(d) 13.15, .0028

(e) 13.15, .0007

Solution: b

Past performance 1996 Dec - 73% (15%-d)

factured parts in three shifts of 8 hours each. The following table provides

data obtained from a sample of 162 manufactured parts not conforming

to specifications:

Non-conforming 50 44 68 162

A test of the hypothesis that the nonconforming parts are uniformly dis-

tributed among the three shifts can be based upon which of the following

values of the test statistic?

(a) 5.78 with 3 degrees of freedom.

(b) 5.78 with 2 degrees of freedom.

(c) 5.48 with 2 degrees of freedom.

(d) 5.48 with 3 degrees of freedom.

(e) 5.48 with 1 degree of freedom.

Solution: b

An experiment in chicken breeding results in offspring having either very

curly, slightly curly, or normal feathers. If this is the result of a single gene

system, then the proportions of offspring in the three phenotypes should

be 0.25, 0.50, and 0.25 respectively. In one such experiment, 93 chickens

were born. Here is some JMP output (with some values hidden):

(a) H: pn = ps = pv

(b) The phenotypes are independent of the type of feather.

(c) H: pn = 0.25, ps =0.50, pv = 0.25

(d) H: pn = 0.215, ps =0.538, pv = 0.247

(e) The observed proportions of the three feather types occur with prob-

abilities of 0.25, 0.50, and 0.25 respectively.

Solution: c

Past performance 1998 Dec - 85%

(a) An approximate 95% confidence interval for the proportion of birds

with normal feathers is (17% → 26%).

(b) The test statistic is 0.72 and the p-value is .6975/2 or about .35.

(c) The p-value is not small. Consequently, we know that the null hy-

pothesis is true, i.e. it is a single gene system.

(d) Each of the individual confidence intervals includes the hypothesized

value. Hence there is no evidence against the single gene hypothesis.

(e) The se measures how much the population proportion could vary if

a new experiment was done.

Solution: d

(c) is not correct because you NEVER know the truth.

(e) is not correct, because the POPULATION proportion is fixed. The se

measures how much the SAMPLE proportion varies.

Past performance 1998 Dec - 57% (14% c; 15% b)

Are babies considerate of their mothers? A study of 700 births at a local

hospital classified births as falling on weekends or weekdays. Are babies

born equally on all days of the week? Here is some output (some parts

hidden):

19. What is the null hypothesis being tested?

(a) H : pweekend = .50; pweekday = .50

(b) H : µweekend = .22; µweekday = .78

(c) H : µweekend = 2/7; µweekday = 5/7

(d) H : pweekend = .22; pweekday = .78

(e) H : pweekend = 2/7; pweekday = 5/7

Solution: e.

Past performance 2006 Dec - 56% (26% c)

were true:

(a) 156

(b) 200

(c) 544

(d) 500

(e) 100

Solution: b

Past performance 2006 Dec - 82%

21. The test statistic is 13.6 with a p-value that is very small. Which is
CORRECT?

(a) There is strong evidence that the proportion of births on weekends

is different from 2/7.

(b) There is strong evidence that the mean number of births is the same

between weekends and weekdays.

(c) There is strong evidence that the mean number of births differs be-

tween weekends and weekdays.

(d) There is strong evidence that the proportion of births on weekends

is different from that on weekdays.

(e) There is strong evidence that the proportion of births on weekends

is the same as that on weekdays.

Solution: a

Past performance 2006 Dec - 45% (19% c; 31% d)

Multiple Choice Questions

Hypothesis Testing - Population means from

paired experiments

and after treatment with a drug. The blood pressures are as follows:

1 168 171

2 171 170

3 182 180

4 167 173

5 174 178

6 170 172

significant change of the blood pressure before and after taking the drug

at 0.05 level of significance. The absolute value of the test statistic and

the absolute critical value of the test are, respectively:

(b) 1.6151 and 2.571

(c) 0.7192 and 1.96

(d) 0.7192 and 1.812

(e) 0.7192 and 2.228
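The paired t statistic for the blood pressure data can be checked with the sketch below. The column headers were lost from the table, so treating the first data column as "before" and the second as "after" is an assumption; it does not affect the absolute value of the statistic. The critical value 2.571 (two-sided, 5 degrees of freedom, α = 0.05) is taken from tables.

```python
import math
import statistics

before = [168, 171, 182, 167, 174, 170]  # assumption: first data column
after = [171, 170, 180, 173, 178, 172]   # assumption: second data column

# A paired analysis works on the within-patient differences only.
differences = [b - a for b, a in zip(before, after)]

mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)            # sample SD (n - 1 divisor)
standard_error = sd_diff / math.sqrt(len(differences))

t_statistic = mean_diff / standard_error
degrees_of_freedom = len(differences) - 1
critical_value = 2.571                             # t table, 5 df, two-sided

print(round(abs(t_statistic), 4), degrees_of_freedom, critical_value)
```

The absolute statistic is about 1.6151 with 5 degrees of freedom, which corresponds to option (b).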

2. The infamous researcher, Dr. Gnirips, claims to have found a drug that

causes people to grow taller. The coach of the Basketball team at Brandon

University has expressed interest but demands evidence. Ten people are

randomly selected from students at Brandon, their heights measured, the

drug administered, and 2 hours later their heights remeasured. The results

were as follows:

Person      1   2   3   4   5   6   7   8   9  10
Pre-Drug   68  69  74  78  70  66  71  70  71  65
Post-Drug  70  69  75  78  73  69  72  73  72  66

Using the proper test statistic, an appropriate decision rule for the
hypotheses H: Drug has no effect versus A: Drug increases height (at
α = .05) will be

(b) Reject H0 if the test statistic is > 1.645

(c) Reject H0 if the test statistic is > 1.83

(d) Reject H0 if the test statistic is > 1.73

(e) Reject H0 if the test statistic is > 2.10

3. A group of 10 men were given a special diet for two weeks to test weight

loss in pounds. The observed data was:

1 181 178

2 171 172

3 190 185

4 187 184

5 210 201

6 202 201

7 166 160

8 173 168

9 183 180

10 184 179

diet leads to a weight loss, the appropriate test procedure is either:

(b) paired t-test or Wilcoxon Signed Rank test

(c) paired t-test or Wilcoxon Rank Sum test

(d) two sample t-test or Sign test

(e) two sample t-test or paired t-test

4. A manufacturer wished to compare the wearing qualities of two different

types of automobile tires, A and B, and he had 5 cars available for use in

an experiment. To make the comparison, one tire of Type A and one of

Type B were mounted on the rear wheels of each of the five automobiles.

(For each car, a coin was flipped to decide which tire would be mounted on

the left side and which would be mounted on the right.). The automobiles

were then operated for a specified number of miles and the amount of wear

was recorded for each tire. These measurements appear below:

1 10.6 10.2

2 9.8 9.4

3 12.3 11.8

4 9.7 9.1

5 8.8 8.3

hypothesis that there is no difference in the average wear for the two

types of tires. The absolute value of the test statistic calculated from the

data is:

(a) 12.83

(b) 0.57

(c) 8.35

(d) 10.72

(e) 9.45
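For the tire data in question 4, the sketch below contrasts the paired analysis with a two-sample analysis that ignores the pairing. The table headers were lost, so assigning the first data column to tire A and the second to tire B is an assumption.

```python
import math
import statistics

wear_a = [10.6, 9.8, 12.3, 9.7, 8.8]   # assumption: first column is tire A
wear_b = [10.2, 9.4, 11.8, 9.1, 8.3]   # assumption: second column is tire B
n = len(wear_a)

# Paired analysis: car-to-car variation cancels in the differences.
diffs = [a - b for a, b in zip(wear_a, wear_b)]
paired_t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Two-sample (unpooled) analysis: car-to-car variation stays in the SE.
se_two_sample = math.sqrt(
    statistics.variance(wear_a) / n + statistics.variance(wear_b) / n
)
two_sample_t = (statistics.mean(wear_a) - statistics.mean(wear_b)) / se_two_sample

print(round(paired_t, 2), round(two_sample_t, 2))
```

The paired statistic is about 12.83, option (a), while ignoring the pairing gives about 0.57, option (b). Both use the same mean difference of 0.48; only the standard error changes.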

average dive duration for sea otters. Five otters are available for an ex-

periment and each otter is observed diving in both warm and cold water

(with the order being random). The biologist collects the following data:

Warm Cold

Otter Water Water

J2 97 92

B7 65 60

M3 75 77

D4 103 43

B8 90 81

Test for any difference in the length of dives using a non-parametric pro-

cedure:

(a) Rank-sum procedure, Wcold = 25;p−value > .111

(b) Rank-sum procedure, Wcold = 25;p−value > .222

(c) Signed-rank procedure, W − = 1;p−value = .062

(d) Signed-rank procedure, W − = 1;p−value = .124

(e) Sign-test, S = 4;p−value = .187

Solution: d

Past performance 1991 Apr - 38% (C-52%)

of male and female college graduates who find jobs. Pairs are formed by

choosing a male and a female with same major and similar grade-point

averages. Suppose a random sample of 5 pairs and the starting salaries

(in thousands) are as follows:

Pair 1 2 3 4 5

Male 25.9 20.0 28.7 13.5 18.8

Female 24.9 18.5 27.7 13.0 17.8

To test whether the mean starting salary for males is less than that of

females with α= 0.05, the absolute value of the test statistic is:

(a) 1

(b) 0.125

(c) 0.3535

(d) 5.658

(e) 6.3246

The average height of children is believed to have increased in the last

50 years due to better nutrition and better health services. To examine

this hypothesis, measurement of the heights (in centimeters) of 10 pairs

of mothers and their eldest adult daughters yielded the following results:

1 178.2 178.2 6 166.6 172.8

2 173.4 168.6 7 157.4 152.0

3 163.0 164.2 8 176.4 176.4

4 152.2 157.4 9 162.0 159.4

5 155.8 165.2 10 165.1 159.0

7. Consider the differences computed by taking the mother’s height - the

daughter’s height. The value of the Signed-Rank test statistic is:

(a) 36

(b) 19

(c) 16

(d) 6

(e) 20

Solution: c

Past performance 1990 Apr - 61%

8. No longer used

The next three questions refer to the following situation:

All of us non-smokers can rejoice - the mosaic tobacco virus that affects

and injures tobacco plants is spreading! Meanwhile, a tobacco company is

investigating if a new treatment is effective in reducing the damage caused

by the virus. Eleven plants were randomly chosen. On each plant, one

leaf was randomly selected, and one half of the leaf (randomly chosen)

was coated with the treatment - the other half was left untouched (con-

trol). After two weeks, the amount of damage to each half of the leaf was

assessed. The output from SAS follows:

1ST:

CONTROL 11 15.7273 13 9.1224 5 36

2ND:

TRT 11 13.3636 12 10.0725 2 32

1ST-2ND:

DIFF 11 2.36364 3 3.32484 -6 6

OF DIFF | | 9 1 1 | MISSING VALUES |

| DF | 2-TAIL P(BINOMIAL) | |

W | A | 10 | .0215 | PREP1 0 |

W=.8525 | T | SIGN RNK:SUM R+ R- | PREP2 0 |

PR<W | PR>A | 2.358 | 45.5 9.5 | BOTH 0 |

.05<P<=.10 | PR>|T| | 2-TAIL P(TABLES) | OTHER 0 |

| .0401 | .05<P<=.10 | TOTAL 0 |

9. What is the best reason for performing a paired experiment rather than a

two-independent-sample experiment?

(a) It is easier to do because we need fewer experimental units and each

unit receives more than one treatment.

(b) It allows us to remove variation in the results caused by other factors

because we can compare both treatments within the same experi-

mental unit.

(c) The computer program is more accurate because we work only with

the differences.

(d) It requires fewer assumptions because we are only interested in the

difference between treatments

(e) It allows us to do more experiments because we use each experimental

unit twice.

Solution: b

Past performance 1991 Feb - 98%

Past performance 1997 Aug - 95%

10. What is the rejection region (α=.05) and p-value for the paired t-test?

(a) Reject if T* > 1.812; p-value = .040

(b) Reject if T* > 1.812; p-value = .020

(c) Reject if T* > 2.358; p-value = .040

(d) Reject if T* > 2.358; p-value = .020

(e) Reject if T* > 1.645; p-value = .020

Solution: b

Past performance 1991 Feb - 56% (a-13%, e-20%)
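The paired t-test values in question 10 follow from the DIFF line of the SAS output (n = 11, mean 2.36364, standard deviation 3.32484). The critical value is the one-sided t table entry for 10 degrees of freedom:

```latex
\[
T^{*} = \frac{\bar{d}}{s_d / \sqrt{n}}
      = \frac{2.36364}{3.32484 / \sqrt{11}}
      \approx 2.358,
\qquad
\text{df} = n - 1 = 10,
\qquad
t_{0.05,\,10} = 1.812 .
\]
\[
\text{one-sided p-value} = \tfrac{1}{2}\,P(|T| > 2.358) = \tfrac{1}{2}(0.0401) \approx 0.020 .
\]
```

The output reports the two-sided value PR>|T| = .0401; halving it is appropriate here because the alternative (the treatment reduces damage) is one-sided and the observed difference is in that direction.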

the differences is suspect, wishes to perform a non-parametric test. What

is the test statistic and the exact p-value (using tables) for the signed-rank

test?

(b) R+ =45.5; p-value =.016

(c) R+ =45.5; p-value =.064

(d) R+ =45.5; p-value =.0215

(e) R+ =45.5; p-value =.0107

Solution: a

12. A group of 10 men were put on a weight reduction diet. The weights

before (b) and after (a) the diet were measured on each individual. The

differences di = ai -bi, were analyzed, yielding the following results.

- values are not given for some reason?

We wish to test if the diet has reduced the average weight. The test

statistic and critical value (α=.05) are:

(b) -1.04 2.228

(c) -.095 1.812

(d) -2.45 1.812

(e) -2.45 2.228

types of automobile tires, A and B. To make the comparison, a tire of type

A and one of type B were randomly assigned and mounted on the rear

wheels of each of five automobiles. The automobiles were then operated

for a specified number of miles and the amount of wear was recorded for

each tire. These measurements appear below:

1 10.6 10.2

2 9.8 9.4

3 12.3 11.8

4 9.7 9.1

5 8.8 8.3

The absolute value of the test statistic calculated from the data for testing

the null hypothesis that there is no difference in the average wear for the

two types of tires is:

(a) 12.83

(b) 5.7

(c) 8.35

(d) 10.72

(e) 9.45

Solution: a

14. A statistics professor would like to determine whether students in his class

showed improved performance on the final examination as compared to the

mid-term examination. A random sample of 4 students selected from a

large class revealed the following mid-term and final scores:

Student #1 #2 #3 #4

Mid-term 70 62 57 68

Final 80 79 87 88

Making the appropriate assumptions, the value of the test statistic is:

(a) 19.25/8.30

(b) 19.25/(8.30/2)

(c) 19.25/√(28.295/4 + 28.295/4)

(d) 19.25/√(34.92/4 + 21.67/4)

(e) 19.25/(2/8.30)

Solution: b

15. A sample of 8 patients had their lung capacity measured before and after

a certain treatment with the following results:

1 750 850

2 860 880

3 950 930

4 830 860

5 750 800

6 680 740

7 720 760

8 810 800

The Sign Test is used to test the hypothesis that the treatment provides

no increase in lung capacity. The probability, under H0 , of obtaining the

observed result or a more extreme one (i.e. the p-value or observed level

of significance) is:

(a) .0352

(b) .1094

(c) .0498

(d) .1445

(e) .2980

Solution: d
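The sign test p-value in question 15 is a binomial tail probability. The sketch below assumes the first data column is the "before" measurement and the second is "after", since the table headers were lost.

```python
from math import comb

before = [750, 860, 950, 830, 750, 680, 720, 810]  # assumption: first column
after = [850, 880, 930, 860, 800, 740, 760, 800]   # assumption: second column

changes = [a - b for b, a in zip(before, after)]

# Zero changes carry no information about direction and are dropped.
nonzero = [c for c in changes if c != 0]
n = len(nonzero)
increases = sum(1 for c in nonzero if c > 0)

# Under H0 each nonzero change is an increase with probability 1/2, so the
# number of increases is Binomial(n, 1/2). One-sided p-value: P(X >= increases).
p_value = sum(comb(n, k) for k in range(increases, n + 1)) / 2 ** n

print(increases, n, round(p_value, 4))
```

Six of the eight patients show an increase, giving a p-value of 37/256, about .1445.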

16. Seven sets of identical twins are given psychological tests to determine

whether the firstborn of the twins tends to be more aggressive than the

second born. The results are shown in the following table, where the

higher score represents greater aggressiveness.

Set Firstborn Second born Difference

1 86 88 -2

2 77 65 12

3 91 90 1

4 70 65 5

5 75 80 -5

6 88 81 7

7 87 72 15

ric about the median but not necessarily normal, then the value of the

appropriate test statistic is:

(b) 40 and we would reject H0 at α = .05

(c) 1.71 and we would not reject H0 at α = .05

(d) 22.5 and we would not reject H0 at α = .05

(e) 1.71 and we would reject H0 at α = .05

Solution: d
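The signed-rank statistic for the twin data can be computed as below. This is a minimal sketch: the helper `average_ranks` is hypothetical and gives tied absolute differences the average of the ranks they occupy, which is the usual convention.

```python
differences = [-2, 12, 1, 5, -5, 7, 15]  # firstborn minus second born


def average_ranks(values):
    """Rank values from smallest to largest; ties share their average rank."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1           # 1-based position of first tie
        count = ordered.count(v)               # number of tied values
        ranks.append(first + (count - 1) / 2)  # average of the tied positions
    return ranks


# Zero differences would be dropped before ranking (none occur here).
nonzero = [d for d in differences if d != 0]
ranks = average_ranks([abs(d) for d in nonzero])

w_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
w_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)

print(w_plus, w_minus)
```

The two differences of size 5 share ranks 3 and 4, so each receives 3.5. The sum of positive ranks is 22.5, the value in option (d).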

17. The following data give uric acid levels (in milligrams per 100 milliliters)

for 5 subjects before and after a special diet.

1 5.2 5.2

2 6.3 6.2

3 6.4 6.3

4 5.5 5.6

5 5.9 5.6

To test the hypothesis that the diet reduces the uric acid level, we might

use

(a) a two sample t-test because the uric acid levels before and after the

diet can be assumed independent.

(b) a sign test

(c) a paired t-test

(d) a and b

(e) b and c

Solution: e

An agricultural field station is investigating the differences between the

mean yields of two varieties of corn. Because of fertility differences, both

varieties were planted in each of seven farms across the province. At

harvest time, the plots were harvested and the yield recorded. The output

from SAS appears below.

1ST: | | 6 1 0 |

VARA 7 46.1 46.5 4.59 38.5 52.6 | DF | 2-TAIL P(BINOMIAL) |

2ND: | 6 | 0.125 |

VARB 7 43.6 41.7 3.53 40.1 49.8 | T | SIGN RNK:SUM R+ R- |

1ST-2ND: | 2.683 | 25 3 |

DIFF 7 2.5 2.8 2.43 -2.7 5.0 | PR>|T| | |

| .0364 | P(TABLES) < .10 |

(a) H: X̄d = 0 A: X̄d ≠ 0

(b) H: µd = 0 A: µd ≠ 0

(c) H: µd ≠ 0 A: µd = 0

(d) H: µd = 0 A: µd < 0

(e) H: X̄d = 0 A: X̄d < 0

Solution: b

Past performance 1990 Feb - 97%

19. The test statistic, rejection region (α = .05), and p-value are:

(b) T* = 2.683; reject H if T ∗ > 2.45; p-value = .0364

(c) T* = 2.683; reject H if T ∗ > 1.94; p-value = .0182

(d) T* = 2.683; reject H if T ∗ > 2.45; p-value = .0182

(e) T* = 2.683; reject H if T ∗ > 1.89; p-value = .0182

Solution: b

Past performance 1990 Feb - 56% (A-22%,)

(a) There is evidence to believe that the two varieties have a different

mean yield.

(b) There is insufficient evidence to believe that the two varieties have a

different mean yield.

(c) There is evidence to believe that the two varieties have the same

mean yield.

(d) There is insufficient evidence to believe that the two varieties do not

have a difference in their mean yields.

(e) There is sufficient evidence to believe that the two varieties are paired

on each farm.

Solution: a

Past performance 1990 Feb - 83%
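
The SAS output for the corn data also reports a two-tailed binomial probability of 0.125 next to the counts 6 1 0. Assuming those counts are the numbers of positive, negative and zero differences (the column labels did not survive), the sign-test p-value can be sketched as follows; `sign_test_two_sided` is a hypothetical helper name.

```python
from math import comb


def sign_test_two_sided(n_plus, n_minus):
    """Two-sided sign-test p-value under H: P(+) = 0.5, ties already dropped."""
    n = n_plus + n_minus
    k = min(n_plus, n_minus)
    # probability of a split at least as lopsided in one direction
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(sign_test_two_sided(6, 1))  # 0.125
```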

An agricultural field station is investigating the differences between the

mean yields of two varieties of corn. They are particularly interested

in testing if the second variety gives a lower yield than the first variety.

Because of fertility differences, both varieties were planted in each of seven

farms across the province. At harvest time, the plots were harvested and

the yield recorded. The output from JMP appears below.

(a) H: X̄diff = 0 A: X̄diff ≠ 0

(b) H: µdiff = 0 A: µdiff > 0

(c) H: µdiff ≠ 0 A: µdiff = 0

(d) H: µdiff = 0 A: µdiff < 0

(e) H: X̄diff = 0 A: X̄diff < 0

Solution: b

Past performance 1996 Dec - 92%

(a) 2.333 .0584

(b) 1.204 .0292

(c) 2.810 .0584

(d) 1.204 .9708

(e) 2.333 .0292

Solution: e

Past performance 1996 Dec - 89%

23. Suppose that the p-value had been .0093. This would mean:

(a) There is strong evidence against the null hypothesis of equal mean

yields.

(b) There is no evidence to believe that the two varieties have a different

mean yield.

(c) There is strong evidence to believe that the two varieties have the

same mean yield.

(d) There is no evidence to believe that the two varieties do not have a

difference in their mean yields.

(e) There is sufficient evidence to believe that the two varieties are paired

on each farm

Solution: a

Past performance 1996 Dec - 87%

A physician wants to compare the blood pressures of six patients before

and after treatment with a drug that is designed to lower blood pressure.

The blood pressure is measured before and after the drug, and the change

in blood pressure is measured. The summary information on the difference

(after-before) is:

1 168 171

2 171 170

3 182 180

4 167 173

5 174 178

6 170 172

24. The null and alternate hypotheses are:

(a) H: X̄diff = 0 A: X̄diff ≠ 0

(b) H: µdiff = 0 A: µdiff > 0

(c) H: µdiff ≠ 0 A: µdiff = 0

(d) H: µdiff = 0 A: µdiff < 0

(e) H: X̄diff = 0 A: X̄diff < 0

Solution: d - Notice that diff = before − after, so if the drug is effective

in reducing blood pressure, the average before should be greater than the

average after.

Past performance 1997 Aug - 73%

Past performance 2006 Dec - 73% (11% c; 12% e)

25. The estimated difference and the p-value are:

(a) 2.00; .1672

(b) 1.23; .0836

(c) 1.62; .0836

(d) 2.00; .9164

(e) 2.00; .0836

Solution: e

Past performance 1997 Aug - 87%

Past performance 2006 Dec - 79% (10% a)
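
The figures 2.00, 1.23 and 1.62 in the options can be traced back to the six pairs of readings listed above. The sketch below treats the two unlabeled data columns as paired readings on the same patient; which column is "before" and which is "after" is not recoverable from this copy, so only the magnitudes should be read from it.

```python
from math import sqrt
from statistics import mean, stdev

# Two readings per patient, in the order they appear in the data listing.
# Column labels are an assumption: the original headers were lost.
first = [168, 171, 182, 167, 174, 170]
second = [171, 170, 180, 173, 178, 172]

diffs = [b - a for a, b in zip(first, second)]
d_bar = mean(diffs)                     # mean paired difference
se = stdev(diffs) / sqrt(len(diffs))    # standard error of the mean difference
t_stat = d_bar / se                     # paired t statistic, df = n - 1 = 5

print(d_bar, round(se, 2), round(t_stat, 2))
```

The mean difference has magnitude 2.00 and the t statistic is about 1.62 on 5 degrees of freedom; the p-value itself needs a t table or statistical software.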

(a) Pairing would be a good thing if the subject-to-subject variation was

small.

(b) This is a paired design because each subject is measured twice –

before and after.

(c) An unpaired experiment with the same number of data values would

require 12 subjects, half of which would be measured without taking

the drug, and half of which would be measured after taking the drug.

(d) Pairing is a form of stratification or blocking.

(e) The same conclusions would be obtained if the difference in blood

pressure was computed as before − after rather than after − before.

Solution: a

Past performance 2006 Dec - 58% (15% c; 16% e)

Multiple Choice Questions

Hypothesis Testing - Population mean from a single sample

produces a sample mean of 103 and a p-value of 0.08. Thus, at the 0.05

level of significance:

(b) there is sufficient evidence to conclude that µ = 100.

(c) there is insufficient evidence to conclude that µ = 100.

(d) there is insufficient evidence to conclude that µ ≠ 100.

(e) there is sufficient evidence to conclude that µ = 103.

Solution: d - you always try and collect evidence against the null

produces Z = 0.8 for the value of the test statistic. The p-value of the

test is thus equal to:

(a) 0.20

(b) 0.40

(c) 0.29

(d) 0.42

(e) 0.21

Solution: d

The one-sided p-value is P (Z > .8) = .21. Because the alternative hy-

pothesis is two-sided, the two-sided p-value is found as 2 × .21 = .42.
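
The doubling step in this solution can be checked with the standard normal distribution from the Python standard library. This is a small sketch; `two_sided_p` is an illustrative name.

```python
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    upper_tail = 1 - NormalDist().cdf(abs(z))  # one-sided P(Z > |z|)
    return 2 * upper_tail


print(round(two_sided_p(0.8), 2))
```

For Z = 0.8 this gives about 0.42, in line with 2 × .21.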

The next 2 questions refer to the following situation

A Canadian railway company claims that its trains block crossings no

more than 8 minutes per train on the average. The actual times (minutes)

that 10 randomly selected trains block crossings were recorded:

10.1 9.5 6.5 8.0 8.8 >12 7.2 10.5 8.2 9.3

(a) 37

(b) 33

(c) 44

(d) 29

(e) 36

Solution: a

Past performance 1993 Apr - 74% (e-10%)

(a) .101

(b) .053

(c) .248

(d) .049

(e) .064

Solution: d

Past performance 1993 Apr - 72%

DDT is an insecticide that accumulates up the food chain. Predator birds

can be contaminated with quite high levels of the chemical by eating many

lightly contaminated prey. One effect of DDT upon birds is to inhibit

the production of the enzyme carbonic anhydrase which controls calcium

metabolism. It is believed that this causes egg shells to be thinner and

weaker than normal and makes the eggs more prone to breakage. (This is

one of the reasons why the condor in California is near extinction.) An experiment

was conducted where 16 sparrow hawks were fed a mixture of 3 ppm

dieldrin and 15 ppm DDT (a combination often found in contaminated

prey). The first egg laid by each bird was measured and the mean shell

thickness was found to be 0.19 mm with a standard deviation of 0.01 mm.

A normal egg shell has a mean thickness of 0.2 mm.

5. The null and alternate hypotheses are:

(a) H: µ = 0.2 A: µ < 0.2

(b) H: µ < 0.2 A: µ = 0.2

(c) H: X̄ = 0.2 A: X̄ < 0.2

(d) H: X̄ = 0.19 A: X̄ = 0

(e) H: µ = 0.2 A: µ ≠ 0.2

Solution: a

Past performance 1990 Apr - 98%

Past performance 1991 Dec - 84% (11%-e)

Past performance 1993 Feb - 99%

(a) -1.00

(b) -4.00

(c) 0.01

(d) 1.96

(e) 1.75

Solution: b

Past performance 1990 Apr - 95%

Past performance 1993 Feb - 99%
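
Using the summary figures from the sparrow hawk experiment (n = 16, sample mean 0.19 mm, standard deviation 0.01 mm, hypothesised mean 0.2 mm), the one-sample t statistic can be sketched from summary statistics alone. The helper name `t_from_summary` is hypothetical.

```python
from math import sqrt


def t_from_summary(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic computed from summary statistics (df = n - 1)."""
    standard_error = sample_sd / sqrt(n)
    return (sample_mean - mu0) / standard_error


t_stat = t_from_summary(0.19, 0.2, 0.01, 16)
print(round(t_stat, 2))
```

The result is about -4.00 on 15 degrees of freedom, which corresponds to option (b).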

7. The null hypothesis will be rejected (α=0.05) if the test statistic is less

than: (note that if the rejection region is two sided, only one side has been

shown)

(a) -2.1314

(b) -1.7530

(c) -1.9600

(d) -1.6450

(e) -1.7459

Solution: b

Past performance 1990 Apr - 74%

Past performance 1993 Feb - 92%

because then the eggs are so fragile that few survive. What sample size

would be needed to be 80% sure of detecting this decrease at α=0.05?

(a) 8

(b) > 128

(c) 34

(d) 27

(e) > 101

Solution: d

Past performance 1993 Feb - 63%

In some mining operations, a byproduct of the processing is mildly radioac-

tive. Of prime concern is the possibility that release of these byproducts

into the environment may contaminate the freshwater supply. There are

strict regulations for the maximum allowable radioactivity in supplies of

drinking water, namely an average of 5 picocuries per litre (pCi/L) or less.

However, it is well known that even safe water has occasional hot spots

that eventually get diluted, so samples of water are assumed safe unless

there is evidence to the contrary. A random sample of 25 specimens of

water from a city’s water supply gave a mean of 5.39 pCi/L and a standard

deviation of 0.87 pCi/L.

9. The appropriate null and alternative hypotheses are:

(b) H0 : µ = 5.39 vs HA : µ < 5.00

(c) H0 : µ = 5 vs HA : µ = 5.39

(d) H0 : µ = 5 vs HA : µ < 5

(e) H0 : µ = 5 vs HA : µ > 5

Solution: e

Past performance 1991 Feb - 98%

10. The value of the test statistic, the rejection region (α=0.05), and the

p-value (computed by a computer) are:

(b) Z ∗ = 2.24; reject if Z ∗ > 1.645; p-value = .0125

(c) T ∗ = 2.24 with 25 df ; reject if T ∗ > 1.708; p-value = .0171

(d) T ∗ = 2.24 with 24 df ; reject if T ∗ > 1.711; p-value = .0173

(e) T ∗ = 2.24 with 24 df ; reject if T ∗ > 2.064; p-value = .0173

Solution: d

Past performance 1991 Feb - 80%

11. The average time it takes for a person to experience pain relief from aspirin

is 25 minutes. A new ingredient is added to help speed up relief. Let µ

denote the average time to obtain pain relief with the new product. An

experiment is conducted to verify if the new product is better. What are

the null and alternative hypotheses?

(a) H0 : µ = 25 vs HA : µ ≠ 25

(b) H0 : µ = 25 vs HA : µ < 25

(c) H0 : µ < 25 vs HA : µ = 25

(d) H0 : µ < 25 vs HA : µ > 25

(e) H0 : µ = 25 vs HA : µ > 25

Solution: b

12. We wish to test H0 that the average family income of Manitoba families

is at least $15,000 at level of significance α = .05. In order to test the null

hypothesis a sample of size 1000 is selected from the population, and the

p-value of the test is determined to be .02. We then:

(a) reject H0 because the data are sufficiently unusual if the null hypoth-

esis were false.

(b) reject H0 because the data are sufficiently unusual if the null hypothesis

were true.

(c) fail to reject H0 because the data are not sufficiently unusual if the

null hypothesis were true

(d) fail to reject H0 because the data are not sufficiently unusual if the

null hypothesis were false

(e) reject H0 because the data are sufficiently unusual

Solution: b

13. The profit per new car sold by a Winnipeg automobile dealer varies from

car to car. The average profit per sale tabulated for the past 6 days was

$368 with a standard deviation of $190. To test if there is sufficient evidence

to indicate that average profit per sale is less than $480, the appropriate

null and alternative hypotheses for the test are:

(b) H: µ = $480 vs A: µ < $480

(c) H: µ = $480 vs A: µ > $480

(d) H: µ = $480 vs A: µ ≠ $480

(e) H: µ = $368 vs A: µ = $480

Solution: b

14. In order to study the amounts owed to the city, a city clerk takes a random

sample of 16 files from a cabinet containing a large number of delinquent

accounts and finds the average amount X owed to the city to be $230

with a sample standard deviation of $36. It has been claimed that the

true mean amount owed on accounts of this type is greater than $250. If

it is appropriate to assume that the amount owed is a normally distributed

random variable, the value of the test statistic appropriate for testing the

claim is:

(a) -3.33

(b) -1.96

(c) - 2.22

(d) -0.55

(e) - 2.1314

average $17.10 per month for long-distance telephone calls. A random

sample of 10 customers’ bills during a given month produced a sample

mean of $22.10 expended for long-distance calls and a sample variance of

45. A 5% significance test is to be performed to determine if the mean

level of billing for long distance calls per month is in excess of $17.10. The

calculated value of the test statistic and the critical value respectively are:

(b) (1.17, 2.2622)

(c) (2.36, 2.2622)

(d) (1.17, 1.8331)

(e) (0.025, 1.8125)

Solution: a

16. A group of nutritionists is hoping to prove that a new soya bean compound

has more protein per gram than roast beef, which has a mean protein

content of 20. A random sample of 5 batches of the soya compound have

been tested, with the following results:

What assumption(s) do we have to make in order to carry out a legitimate

statistical test of the nutritionists’ claim?

(b) The mean protein content of the 5 batches follows a normal distribu-

tion.

(c) The variance of the population is known.

(d) Both (a) and (b) must be assumed.

(e) Both (a), (b), and (c) must be assumed.

Solution: a

17. Refer to the previous question. What are the appropriate statistical hy-

potheses and the observed value of the corresponding test statistic?

(b) H: µ = 20 vs. A: µ > 20 and T∗ = (19.2 - 20)/sqrt(11.2/5)

(c) H: µ = 20 vs. A: µ > 20 and Z∗ = (19.2 - 20)/sqrt(11.2/5)

(d) H: µ = 20 vs. A: µ < 20 and Z∗ = (19.2 - 20)/sqrt(11.2/5)

(e) None of these is correct.

Solution: b

0.73, 1.92 ) based on n = 15 observations from a population with a normal

N(µ, σ²) distribution. The hypotheses of interest are H0 : µ = 0 versus

Ha : µ ≠ 0. Based on this confidence interval,

(b) we should not reject H0 at the α = 0.05 level of significance.

(c) we should reject H0 at the α = 0.10 level of significance.

(d) we should not reject H0 at the α = 0.10 level of significance.

(e) we cannot perform the required test because we do not know the

value of the test statistic

Solution: b

19. Winnipeg Tribune claims that the time of travel from downtown to the

University via the Pembina bus has an average of µ = 27 minutes. A

student who normally takes this bus believes that µ is greater than 27

minutes. A sample of six ride-times taken to test the hypothesis of interest

gave X = 27.5 minutes and standard deviation s = 2.43 minutes. The value

of the test statistic for testing this hypothesis is:

(a) - 0.532

(b) 0.460

(c) 0.504

(d) - 0.504

(e) - 0.460

Solution: c

20. In the previous question, the appropriate critical region and conclusion

when testing at α = .05 are:

(b) T ∗ > 2.571; and we fail to reject H0 .

(c) T ∗ < 2.015; and we fail to reject H0 .

(d) T ∗ < 2.571; and we fail to reject H0 .

(e) T ∗ < 1.943; and we fail to reject H0 .

Solution: a

21. A Canadian railway company claims that its trains block crossings no

more than 5 minutes per train on the average. The actual times (minutes)

that 10 randomly selected trains block crossings were:

10.4 9.7 6.5 9.5 8.8 11.2 7.2 10.5 8.2 9.3

level of 0.05 and assuming that the crossing times are normally distributed,

the value of the test statistic and the critical value are, respectively:

(b) 8.79 and 1.8331

(c) 5.91 and 1.8331

(d) 8.79 and 2.2622

(e) 2.78 and 1.96

Solution: b
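
The value 8.79 in this solution can be reproduced from the ten crossing times listed in the question, testing against the claimed mean of 5 minutes. The sketch below works from the raw data rather than from summary statistics; `one_sample_t` is an illustrative name.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(data, mu0):
    """One-sample t statistic from raw observations (df = len(data) - 1)."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))


times = [10.4, 9.7, 6.5, 9.5, 8.8, 11.2, 7.2, 10.5, 8.2, 9.3]
t_stat = one_sample_t(times, 5)
print(round(mean(times), 2), round(t_stat, 2))
```

The sample mean is 9.13 minutes and the statistic is about 8.79 on 9 degrees of freedom, to be compared with the one-sided critical value 1.8331 quoted in option (b).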

H is rejected if:

(b) The value of the test statistic is in the acceptance region.

(c) The p-value is less than 0.10.

(d) The p-value is greater than 0.10.

(e) If the sample mean is not equal to 100.

Solution: c

decided to test the hypothesis H0 : µ = 0 vs HA : µ ≠ 0 at the α = 0.05

level, using the same data as was used to construct the c.i.

(a) We cannot test the hypothesis without the original data.

(b) We cannot test the hypothesis at the α= 0.05 level because the α=

0.05 test is connected to the 97.5% confidence interval.

(c) We can only make the connection between hypothesis tests and c.i.

if the sample sizes are large.

(d) We would reject H0 at level α= 0.05.

(e) We would accept H0 at level α= 0.05.

Solution: d

confidence interval for µ calculated from a given random sample is (1.4,

3.6). Based on this finding we:

(a) Fail to reject H0 .

(b) Reject H0 .

(c) Cannot make any decision at all because the value of the test statistic

is not available.

(d) Cannot make any decision at all because the distribution of the pop-

ulation is unknown.

(e) Cannot make any decision at all because (1.4, 3.6) is only a 95%

confidence interval for µ .

Solution: a

that the manufacturer is not short-weighting the product (i.e., underfill-

ing products). To allow for variation in the filling process, the Federal

government takes a sample of 16 bottles of beer with nominal capacity of

344 ml, and if the mean volume in the bottles is less than 340 ml, the

manufacturer is fined. Suppose an unscrupulous brewer sets the machine

to fill, on average, 342 ml. The machine has a standard deviation of 4 ml.

The probability that a Type II error will be made is:

(a) .4772

(b) .0228

(c) .9772

(d) .1915

(e) .3085

Solution: a

The average growth of a certain variety of pine tree is 10.1 inches in three

years. A biologist claims that a new variety will have a greater three-

year growth. A random sample of 25 of the new variety has an average

three-year growth of 10.8 inches and a standard deviation of 2.1 inches.

26. The appropriate null and alternate hypotheses to test the biologist’s claim

are:

(b) H: µ = 10.8 against A: µ ≠ 10.8

(c) H: µ = 10.1 against A: µ > 10.1

(d) H: µ = 10.1 against A: µ < 10.1

(e) H: µ = 10.1 against A: µ ≠ 10.1

Solution: c

Past performance 1991 Apr - 98%

(a) rejected because the calculated value of the test statistic is less than

the appropriate critical value 1.711.

(b) rejected because the calculated value of the test statistic is greater

than the appropriate critical value 1.645.

(c) accepted because the calculated value of the test statistic is less than

the appropriate critical value 1.711.

(d) accepted because the calculated value of the test statistic is less than

the appropriate critical value 1.708.

(e) accepted because the calculated value of the test statistic is less than

the appropriate critical value 2.064.

Solution: c

Past performance 1991 Apr - 77%

28. The p-value for the previous test is computed to be:

(b) between .010 and .015

(c) between .015 and .025

(d) between .025 and .050

(e) between .050 and .100

Solution: e

Past performance 1991 Apr - 75% (D-12%)

Resting pulse rate is an important measure of the fitness of a person’s

cardiovascular system with a lower rate indicative of greater fitness. The

mean pulse rate for all adult males is approximately 72 beats per minute.

A random sample of 25 male students currently enrolled in the Faculty of

Agriculture and now taking 5.211 was selected and the mean resting

pulse rate was found to be 80 beats per minute with a standard deviation

of 20 beats per minute. The experimenter wishes to test if the students

are less fit, on average, than the general population.

29. The null and alternate hypotheses are:

(a) H: µ = 72 A: µ < 72

(b) H: X = 72 A: X < 72

(c) H: µ = 80 A: µ = 72

(d) H: X = 80 A: X > 72

(e) H: µ = 72 A: µ > 72

Solution: e

Past performance 1990 Feb - 88%

Past performance 1993 Apr - 80% (a-17%)

Past performance 1996 Dec - 92%

(a) .32

(b) 2.00

(c) −.32

(d) 1.64

(e) 2.88

Solution: b

Past performance 1990 Feb - 99%

Past performance 1993 Apr - 71% (d-10%)

Past performance 1996 Dec - 96%

31. The null hypothesis will be rejected at α= 0.05 if the test statistic exceeds:

(a) 1.9600

(b) 1.6450

(c) 1.7109

(d) 2.0639

(e) 1.7081

Solution: c

Past performance 1990 Feb - 62% (A-10%, B-18%)

(b) between .020 and .025

(c) between .05 and .10

(d) 7.25

(e) between .005 and .0025

Solution: a

Past performance 1993 Apr - 74% (c-10%)

Past performance 1996 Dec - 92%

(a) Conclude that the students are less fit (on average) than the general

population when in fact they have equal fitness on average.

(b) Conclude that the students have the same fitness (on average) as the

general population when in fact they are less fit on average.

(c) Conclude that the students have the same fitness (on average) as the

general population when in fact they are the same fitness level on

average.

(d) Conclude that the students are less fit (on average) than the general

population, when, in fact, they are less fit on average.

(e) Conclude that the students have the same fitness (on average) when

in fact they are more fit on average.

Solution: b

Past performance 1990 Feb - 79% (A-15%)

Past performance 1993 Apr - 80% (a-10%)

Multiple Choice Questions

Hypothesis Testing - Population proportion from a single sample

Z=1.28 for the value of the test statistic. Thus the p-value (or observed

level of significance) of the test is approximately equal to:

(a) 0.90

(b) 0.40

(c) 0.05

(d) 0.20

(e) 0.10

Solution: d

The one-sided p-value is P (Z > 1.28) = .10. Because the alternative is a

two-sided alternative, the two-sided p-value is 2 × .1 = .2.

2. The power takeoff driveline on tractors used in agriculture is a potentially

serious hazard to operators of farm equipment. The driveline is covered

by a shield in new tractors, but for a variety of reasons, the shield is often

missing on older tractors. Two types of shields are the bolt-on and the flip-up.

It was believed that the bolt-on shield was perceived as a nuisance

by the operators and deliberately removed, but the flip-up shield is easily

lifted for inspection and maintenance and may be left in place. In a study

initiated by the National Safety Council of the U.S., a sample of older

tractors with both types of shields was taken to see what proportion were

removed. Of 183 tractors designed to have bolt-on shields, 35 had been

removed. Of the 136 tractors with flip-up shields, 15 were removed. We

wish to test the hypothesis H: pb = pf vs A: pb ≠ pf where pb and pf are

the proportion of tractors with the bolt-on and flip-up shields removed,

respectively. The test-statistic is computed to be 1.97. The p-value is:

(a) .025

(b) .049

(c) .012

(d) .975

(e) .475

Solution: b

Past performance 1991 Feb - 65% (a-27%)
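
Both the test statistic 1.97 and the p-value .049 in the tractor-shield question follow from the counts given (35 of 183 bolt-on shields removed, 15 of 136 flip-up shields removed). A sketch of the pooled two-proportion z test, assuming the usual large-sample normal approximation; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for H: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = two_proportion_z(35, 183, 15, 136)
print(round(z, 2), round(p_value, 3))
```

This gives z of about 1.97 and a two-sided p-value of about .049; the popular wrong answer (a) .025 is roughly the one-sided tail.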

To test H : p = .25 vs A: p > .25, a random sample of size 5 is taken from

the process. If the number of defectives is 4 or more, the null hypothesis

is rejected. What is the probability of rejecting H if p = .20 ?

(a) .00192

(b) .9933

(c) .0096

(d) .0067

(e) .9936

Solution: d
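
The rejection probability here is an upper binomial tail: the chance of 4 or more defectives in a sample of 5 when p = .20. A minimal sketch, with `prob_reject` as a hypothetical helper name:

```python
from math import comb


def prob_reject(n, cutoff, p):
    """P(X >= cutoff) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(cutoff, n + 1))


print(round(prob_reject(5, 4, 0.20), 4))
```

The two terms are 5 × .2⁴ × .8 = .0064 and .2⁵ = .00032, which sum to about .0067.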

favour of candidate A. The observed value of the test statistic for testing

the null hypothesis H: p = .5 versus the alternative hypothesis A: p ≠ .5

is:

(a) 1.80

(b) 1.90

(c) 1.83

(d) 1.28

(e) 1.75

Solution: a

favour the free trade agreement (FTA). A recent poll indicated that out

of 400 randomly selected individuals, 250 favoured the FTA. At the 5%

level of significance, we would:

(a) Fail to reject H0 because the calculated value of the test statistic is

1.033 which is less than 1.645.

(b) Fail to reject H0 because the calculated value of the test statistic is

1.033 which is less than 1.96.

(c) Fail to reject H0 because the calculated value of the test statistic is

1.0204 which is less than 1.96.

(d) Fail to reject H0 because the calculated value of the test statistic is

1.0204 which is less than 1.645.

(e) Not need to test because everyone knows that FTA is good.

Solution: d

represents the number of successes in 15 trials and if the null hypothesis

is rejected if X ≥ 13 , what is the probability of type I error for this test

?

(a) 0.004

(b) 0.035

(c) 0.050

(d) 0.127

(e) 0.965

Solution: d

7. A seed company claims that 80% of the seeds of a certain variety of tomato

will germinate if sown under normal growing conditions. A government

inspector is interested in whether or not the proportion of seeds germi-

nating is living up to the company’s claim. He randomly selects a sample

of 200 seeds from a large shipment and tests the sample for percentage

germination. If 155 of the 200 seeds germinate, then the calculated value

of the test statistic used to test the hypothesis of interest is:

(a) −.847

(b) −.884

(c) −.897

(d) −.825

(e) −.858

Solution: b
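
For the seed-germination question, the statistic compares the sample proportion 155/200 with the claimed 80%, using the standard error computed under the null value. A sketch under the large-sample normal approximation; `one_proportion_z` is an illustrative name.

```python
from math import sqrt


def one_proportion_z(successes, n, p0):
    """z statistic for H: p = p0, with the standard error evaluated at p0."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se


z = one_proportion_z(155, 200, 0.80)
print(round(z, 3))
```

This gives about -0.884. The same function applied to the wheat data further on (12 infected of 120, null value 0.15) gives about -1.53.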

more than 20% of its customers are purchasers of bakery products. A

random sample of 100 customers found 28% purchased bakery items. A

5% significance test is conducted to determine if the chain should increase

its bakery stock. The p-value for this situation is:

(a) 0.0500

(b) .0750

(c) .0375

(d) .0448

(e) .0228

Solution: e

had 60 males and 40 females. We wish to test if the pattern favours males.

The p-value for this test is

(a) 0.4772

(b) 0.94772

(c) 0.0456

(d) 0.0114

(e) 0.0228

Solution: e

if more than 10% of the buns are crushed. A random sample of 81 buns

finds 13 crushed buns. A 5% significance test is conducted to determine

if the shipment should be accepted. The p value for this situation is:

(a) 0.0348

(b) 0.0500

(c) .0700

(d) 0.0436

(e) 0.0218

Solution: ***

The University of Manitoba research station wishes to investigate if a new

variety of wheat is more resistant to a disease than an old variety. It

is known that this disease strikes approximately 15% of all plants of the

old variety. A field experiment was conducted, and of 120 new plants, 12

became infected.

11. The null and alternative hypothesis are:

(a) H0 : p = 0.10 H1 : p > 0.15

(b) H0 : p = 0.10 H1 : p > 0.10

(c) H0 : p = 0.15 H1 : p ≠ 0.15

(d) H0 : p = 0.15 H1 : p < 0.15

(e) H0 : p = 0.15 H1 : p > 0.15

Solution: d

Past performance 1991 Feb - 90%

(a) 1.83

(b) −1.10

(c) 1.53

(d) −1.83

(e) −1.53

Solution: e

Past performance 1991 Feb - 55% (a-13%, d-18%)

13. A method currently used by doctors to screen women for possible breast

cancer fails to detect cancer in 15% of the women who actually have the

disease. A new method has been developed that researchers hope will be

able to detect cancer more accurately. A random sample of 80 women

known to have breast cancer are to be screened using the new method. At

the 0.05 level of significance, the researchers will be able to conclude that

the new method is better than the one currently in use if the appropriate

test statistic has a value:

(b) less than 1.645

(c) less than −1.645

(d) greater than −1.96

(e) greater than 1.96 in absolute value

Solution: ***

14. Refer to the previous question. After the experiment was performed it

was discovered that the new method failed to detect the breast cancer in

8 of the 80 randomly selected women. The value of the test statistic is

equal to:

(a) 0.10

(b) −1.25

(c) 1.50

(d) 0.15

(e) −0.14

Solution: ***

Multiple Choice Questions

Hypothesis Testing - Population means from two independent samples

1000 subjects participated in the study, with 500 being randomly assigned

to the “treatment group” and the other 500 to the “control (or placebo)

group”. A statistically significant difference was reported between the

responses of the two groups (P < .005). Thus,

(a) there is a large difference between the effects of the treatment and

the placebo.

(b) there is strong evidence that the treatment is very effective.

(c) there is strong evidence that there is some difference in effect between

the treatment and the placebo.

(d) there is little evidence that the treatment has any effect.

(e) there is evidence of a strong treatment effect.

Solution: c

Not (a), (b), or (e), because nothing in the question speaks to the size

of the effect; it may be statistically significant but of no practical

importance (refer to notes).

2. Herbicide A has been used for years in order to kill a particular type of

weed, but an experiment is to be conducted in order to see whether a new

herbicide, Herbicide B, is more effective than Herbicide A. Herbicide A

will continue to be used unless there is sufficient evidence that Herbicide

B is more effective. The alternative hypothesis in this problem is that

(b) Herbicide B is more effective than Herbicide A.

(c) Herbicide A is not more effective than Herbicide B.

(d) Herbicide B is not more effective than Herbicide A.

(e) Herbicides A and B differ in effectiveness.

Solution: b

The Excellent Drug Company claims its aspirin tablets will relieve headaches

faster than any other aspirin on the market. To determine whether Excel-

lent’s claim is valid, random samples of size 15 are chosen from aspirins

made by Excellent and the Simple Drug Company. An aspirin is given

to each of the 30 randomly selected persons suffering from headaches and

the number of minutes required for each to recover from the headache is

recorded. The sample results are:

                  X̄      s²
Excellent (E)     8.4    4.2
Simple (S)        8.9    4.6

aspirin cures headaches significantly faster than Simple’s aspirin.

3. The appropriate hypothesis to be tested is:

(a) H: µE − µS = 0 A: µE − µS > 0

(b) H: µE − µS = 0 A: µE − µS ≠ 0

(c) H: µE − µS = 0 A: µE − µS < 0

(d) H: µE − µS < 0 A: µE − µS = 0

(e) H: µE − µS > 0 A: µE − µS = 0

Solution: c

4. Absolute value of the calculated value of the appropriate test statistic is:

(a) 1.61

(b) 2.33

(c) 0.65

(d) 1.24

(e) 0.85
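
The aspirin summary table (15 tablets per company, means 8.4 and 8.9 minutes, variances 4.2 and 4.6) is enough to compute a two-sample t statistic. The sketch below assumes the equal-variance (pooled) form, which this copy of the question does not state explicitly; the function name is illustrative.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance two-sample t statistic (df = n1 + n2 - 2)."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se


t_stat = pooled_t(8.4, 4.2, 15, 8.9, 4.6, 15)
print(round(abs(t_stat), 2))
```

Under that assumption the absolute value is about 0.65 on 28 degrees of freedom.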

(a) 1.960

(b) 1.701

(c) 2.048

(d) 2.145

(e) 1.645

A new drug has been developed for treating stage four (near terminal)

AIDS patients. Patients were randomized to the old and new drug and

the time to death (months) was recorded:

OLD 32 <25 40 31 35 29

NEW 45 32 >48 34 37 27 35 >48

One patient died before twenty five months, but it was not known when.

Two patients were still alive after four years when the study was termi-

nated.

6. The value of the test statistic (computed on the OLD drug) for testing if

the new drug gave an increased life span is:

(a) 75

(b) 71

(c) 32

(d) 34

(e) 33

Solution: e

Past performance 1990 Apr - 84%
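
The answer 33 is the sum of the ranks of the OLD observations in the pooled sample. Ranks only need the ordering, so the censored values can still be placed: the sketch below assumes "<25" ranks below every other value and the two ">48" values tie for the top ranks, and encodes them as minus and plus infinity purely as placeholders.

```python
from math import inf


def rank_sum(sample, other):
    """Sum of the pooled-sample ranks of `sample`, using average ranks for ties."""
    pooled = sorted(sample + other)

    def average_rank(value):
        first = pooled.index(value) + 1   # 1-based rank of first occurrence
        count = pooled.count(value)       # number of tied values
        return first + (count - 1) / 2

    return sum(average_rank(v) for v in sample)


old = [32, -inf, 40, 31, 35, 29]            # -inf stands for "<25"
new = [45, 32, inf, 34, 37, 27, 35, inf]    # inf stands for ">48"
print(rank_sum(old, new))  # 33.0
```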

8. Which of the following is NOT CORRECT?

(a) Nonparametric procedures require fewer assumptions than paramet-

ric procedures.

(b) The SIGNED-RANK test should be used for paired data.

(c) Nonparametric procedures can be used with ordinal data because all

that is needed are the relative sizes of the values.

(d) Tied values are assigned a rank equal to average of the ranks associ-

ated with the tied values.

(e) The assumption of independence is not important for non-parametric

procedures.

Solution: e

Past performance 1990 Apr - 78% (C-11%)

that has been developed in the laboratory. Experience shows that the

variable being measured can reasonably be considered to be normally dis-

tributed. In order to test to determine if the new technique is more precise

than the old standard technique the researcher uses the Wilcoxon Rank

Sum Test. The researcher has used a procedure which

(b) has greater power to detect small differences than the t test in this

case.

(c) may be easier to use but is less powerful than the t test in this

circumstance.

(d) is both inappropriate and invalid.

(e) will likely lead to a wrong conclusion here.

10. We wish to test if a new feed increases the mean weight gain compared

to an old feed. At the conclusion of the experiment it was found that the

new feed gave a 10 kg bigger gain than the old feed. A two-sample t-test

with the proper one-sided alternative was done and the resulting p-value

was .082. This means:

(b) There was only a 8.2% chance of observing an increase greater than

10 kg (assuming the null hypothesis was true).

(c) There was only an 8.2% chance of observing an increase greater than

10 kg (assuming the null hypothesis was false).

(d) There is an 8.2% chance the alternate hypothesis is true.

(e) There is only an 8.2% chance of getting a 10 kg increase.

Solution: b

Past performance 1991 Feb - 50% (20%-a; 12%-d; 11%-e)

Past performance 1993 Feb - 86%

Past performance 1993 Apr - 81%

Past performance 1997 Aug - 74% (14%-d)

Past performance 2006 Dec - 77% (11%-a)

11. Following the analysis of some data on two samples drawn from popula-

tions in which the variable of interest is normally distributed, the p-value

for the comparison of the two sample means under the null hypothesis that

the two population means are equal (H0 : µ1 = µ2) against HA : µ1 ≠ µ2

was found to be .0063. This p-value indicates that:

(a) there is very little evidence in the data for a conclusion to be reached.

(b) there is rather strong evidence against the null hypothesis.

(c) the evidence against the null hypothesis is not strong.

(d) the null hypothesis should be accepted.

(e) there is rather strong evidence against the alternative hypothesis.

Solution: b

Different varieties of fruits and vegetables have different amounts of nutrients. These differences are important when these products are used to make baby food. We wish to compare the carbohydrate content of two varieties of peaches. The data were analyzed with SAS and the following output was obtained:

VARIETY   N    MEAN   STD DEV   STD ERROR   MINIMUM   MAXIMUM
A         5    33.6     3.781       1.691    29.000    38.000
B         7    25.0    10.392       3.927     2.000    33.000

VARIANCES        T     DF   PROB > |T|
UNEQUAL     2.0110    8.0       0.0791
EQUAL       1.7490   10.0       0.1109

FOR H0: VARIANCES ARE EQUAL, F' = 7.55 WITH 6 AND 4 DF, PROB > F' = 0.0707

12. We wish to test if the two varieties are significantly different in their mean carbohydrate content. The null and alternative hypotheses are:

(a) H: µ1 = µ2 A: µ1 < µ2

(b) H: µ1 = µ2 A: µ1 > µ2

(c) H: µ1 = µ2 A: µ1 ≠ µ2

(d) H: X̄1 = X̄2 A: X̄1 < X̄2

(e) H: X̄1 = X̄2 A: X̄1 ≠ X̄2

Solution: c

Past performance 1990 Apr - 97%

Past performance 1990 Dec - 86%


13. The test statistic, absolute critical value (at α=.05), and p-value are:

(b) 1.7490 1.8125 .0554

(c) 2.0110 2.3060 .0791

(d) 2.0110 1.8595 .0396

(e) 7.5500 6.1600 .0707

Solution: c

Past performance 1990 Apr - 44% ( a=41%, e=12%)
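The two t statistics and the fractional degrees of freedom in the SAS output for the peach data can be recomputed from the summary statistics alone. The following is a minimal Python sketch, assuming the usual unequal-variance (Welch/Satterthwaite) and pooled formulas; the function name `two_sample_t` is illustrative and not part of SAS.

```python
import math

def two_sample_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t_unequal, df_unequal, t_equal, df_equal) from summary statistics."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2   # squared standard errors of each mean
    diff = mean1 - mean2
    # Unequal-variance statistic with Satterthwaite degrees of freedom
    t_unequal = diff / math.sqrt(v1 + v2)
    df_unequal = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # Equal-variance statistic based on the pooled variance
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    t_equal = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    df_equal = n1 + n2 - 2
    return t_unequal, df_unequal, t_equal, df_equal

# Summary statistics for varieties A and B as listed in the output
t_u, df_u, t_e, df_e = two_sample_t(5, 33.6, 3.781, 7, 25.0, 10.392)
print(round(t_u, 3), round(df_u, 1), round(t_e, 3), df_e)
```

Up to rounding, the results should agree with the UNEQUAL (2.0110 on 8.0 df) and EQUAL (1.7490 on 10 df) lines. The p-values require the t distribution, which the Python standard library does not provide, so they are left out of this sketch.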

(b) The unequal variance test is used if the ratio of the sample variances

is more than about 5:1

(c) If both sample sizes are large, the p-value for T ∗ can be approximated

using a normal distribution.

(d) If the df are fractional, we round down to the lower integer

(e) Outliers normally do not affect T ∗ very much in small samples.

Solution: e

Past performance 1990 Apr - 91%

15. These findings were submitted to a journal, and one reviewer questioned

the results because she believed that the data within each group were

not normally distributed. Consequently, a non-parametric procedure was

used, and the output follows:

                 SUM OF    EXPECTED     STD DEV     MEAN
LEVEL    N       SCORES    UNDER H0    UNDER H0    SCORE
A        5        45.50       32.50        6.14     9.10
B        7        32.50       45.50        6.14     4.64

S= 45.50 Z= 2.0371 PROB >|Z|=0.0416


(b) S=45.5 p-value=.0208

(c) Z=2.0371 p-value=.0208

(d) Z=2.0371 p-value=.0664

(e) S=45.5 p-value=.0664

Solution: a

Past performance 1990 Apr - 64% (b=23%)
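The expected rank sum and its standard deviation in the Wilcoxon output above depend only on the two group sizes. A minimal Python sketch follows, assuming the standard no-ties formulas and a continuity correction of 0.5; the function name is hypothetical and the tie correction that SAS applies is not reproduced, because the raw data are not listed.

```python
import math

def rank_sum_normal_approx(w, n_this, n_other, continuity=0.5):
    """Return (expected, sd, z) for a rank sum w, ignoring any tie correction."""
    total = n_this + n_other
    expected = n_this * (total + 1) / 2
    sd = math.sqrt(n_this * n_other * (total + 1) / 12)
    # Continuity correction: move w half a unit toward its expected value
    if w > expected:
        adjusted = w - continuity
    elif w < expected:
        adjusted = w + continuity
    else:
        adjusted = w
    return expected, sd, (adjusted - expected) / sd

# Variety A: rank sum 45.5 with 5 observations, against 7 for variety B
expected_a, sd_a, z_a = rank_sum_normal_approx(45.5, n_this=5, n_other=7)
print(expected_a, round(sd_a, 2), round(z_a, 2))
```

The expected value matches the 32.50 in the output. The untied standard deviation comes out slightly above the reported 6.14. The half-integer rank sum shows that the data contain ties, and a tie correction shrinks the standard deviation, which would account for the slightly larger reported Z of 2.0371.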

Different varieties of fruits and vegetables have different amounts of nutrients. These differences are important when these products are used to make baby food. We wish to compare the carbohydrate content of two varieties of peaches. The data were analyzed with JMP and the following output was obtained:

16. We wish to test if the two varieties are significantly different in their mean carbohydrate content. The null and alternative hypotheses are:

(a) H: µ1 = µ2 A: µ1 < µ2

(b) H: µ1 = µ2 A: µ1 > µ2

(c) H: µ1 = µ2 A: µ1 ≠ µ2

(d) H: X̄1 = X̄2 A: X̄1 < X̄2

(e) H: X̄1 = X̄2 A: X̄1 ≠ X̄2

Solution: c

Past performance 1996 Dec - 96%

(b) 4.264 .1020

(c) 3.137 .2039

(d) 10 .2039


(e) -2.725 .1020

Solution: a

Past performance 1996 Dec - 95%

18. The following are percentages of fat found in 5 samples of each of two

brands of ice cream:

B 6.3 5.7 5.9 6.4 5.1

equal average fat content in the two types of ice cream?

(b) Two sample t-test with 8 d.f.

(c) Paired t-test with 4 d.f.

(d) Two sample t-test with 9 d.f.

(e) Sign test

Solution: b

19. The life, in months of service, before a failure of the color television picture

tube in a random sample of 6 television sets manufactured by Company

A and 8 television sets manufactured by Company B are as follows:

A 32 25 40 31 35 29

B 45 32 47 34 37 27 35 44

The calculated value of the Rank-Sum test statistic for testing the null

hypothesis that the life, in months of service, before failure of picture

tube is the same for both companies is:

(a) 75

(b) 71

(c) 32

(d) 34

(e) 33

Solution: e
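The rank sum for Company A can be checked by ranking the combined sample and giving tied values the average of the ranks they occupy. A minimal Python sketch, assuming midranks for ties (the function name `rank_sum` is illustrative):

```python
def rank_sum(sample, other):
    """Sum of the midranks of `sample` within the combined data."""
    combined = sample + other

    def midrank(x):
        below = sum(1 for v in combined if v < x)   # values strictly smaller
        ties = sum(1 for v in combined if v == x)   # values equal to x (incl. x)
        return below + (ties + 1) / 2               # average rank of the tied block

    return sum(midrank(x) for x in sample)

company_a = [32, 25, 40, 31, 35, 29]
company_b = [45, 32, 47, 34, 37, 27, 35, 44]
print(rank_sum(company_a, company_b))  # 33.0
```

The two tied pairs (32 and 35) receive ranks 5.5 and 8.5, so the rank sum for Company A is 1 + 3 + 4 + 5.5 + 8.5 + 11 = 33.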

In order to compare two kinds of feed, thirteen pigs are split into two

groups, and each group received one feed. The following are the gains in

weight (kilograms) after a fixed period of time:


Feed A: 8.0 7.4 5.8 6.2 8.8 9.5

Feed B: 12.0 18.2 8.0 9.6 8.2 9.9 10.3

We wish to test the hypothesis that Feed B gives rise to larger weight

gains. The output from SAS is as follows:

FEED    N          Mean       Std Dev     Std Error
----------------------------------------------------
a       6    7.45000000    1.33529023    0.54512995
b       7   10.88571429    3.49400848    1.32061107

Variances        T       DF    Prob>|T|
---------------------------------------
Unequal    -2.4048      7.9      0.0431
Equal      -2.2596     11.0      0.0451

(b) T ∗ = -2.4048; p-value = .0216

(c) T ∗ = -2.2596; p-value = .0451

(d) T ∗ = -2.2596; p-value = .0256

(e) F’ = 6.85; p-value = .0520

Solution: b

Past performance 1991 Apr - 56% (A-20%)

21. The results were written up in a report, but a reviewer of the report

thought that some of the assumptions necessary for a two-sample t-test

might be violated. Consequently, a non-parametric procedure was also

done. The rank-sum test statistic computed for Feed A and the corre-

sponding p-value are:

(b) W = 25.5 p-value = .018

(c) W = 23.5 p-value = .003

(d) W = 23.5 p-value = .006

(e) W = 7.45 p-value = .043


Solution: a

Past performance 1991 Apr - 82%

(a) Reject H if WA ≤ 29

(b) Reject H if WA ≥ 55

(c) Reject H if WA ≤ 36

(d) Reject H if WA ≤ 27

(e) Reject H if WA ≤ 34

Solution: a

Past performance 1991 Apr - 77%

In order to compare two kinds of feed, thirteen pigs are split into two

groups, and each group received one feed. The following are the gains in

weight (kilograms) after a fixed period of time:

Feed B: 12.0 18.2 8.0 9.6 8.2 9.9 10.3

We wish to test the hypothesis that Feed B gives rise to larger weight

gains. The output from JMP is as follows:

(a) H: X̄A = X̄B A: X̄A ≠ X̄B

(b) H: µA = µB; A: µA ≠ µB

(c) H: X̄A = X̄B A: X̄A < X̄B

(d) H: µA = µB; A: µA < µB

(e) H: X̄A = X̄B A: X̄A > X̄B


Solution: d

Past performance 1997 Aug - 90%

(a) -3.269

(b) 1.535

(c) .0566

(d) -2.130

(e) -6.647

Solution: d

Past performance 1997 Aug - 95%

(a) .0566

(b) .0283

(c) .1132

(d) .1087

(e) 2.130

Solution: b

Past performance 1997 Aug - 88%

Nitric oxide is one component of the pollution emitted by automobiles.

Two different control devices are to be compared by equipping 10 cars

with device I and 7 cars with device II. The data was analyzed with SAS

and the output follows:

DEVICE   N      MEAN   STD DEV   STD ERROR   MINIMUM   MAXIMUM
I       10    1.0160    0.0377      0.0119    0.9600    1.0800
II       7    0.9942    0.0350      0.0132    0.9500    1.0500

VARIANCES        T     DF   PROB > |T|
UNEQUAL     1.2173   13.7       0.2441
EQUAL       1.2004   15.0       0.2486

FOR H0: VAR ARE EQUAL, F' = 1.16 WITH 9 AND 6 DF, PROB > F' = 0.8868

26. We wish to test if the mean level of nitric oxide from device I is greater

than that of device II. The null and alternate hypotheses are:


(a) H: µ1 − µ2 = 0 A: µ1 − µ2 ≠ 0

(b) H: X̄1 − X̄2 = 0 A: X̄1 − X̄2 < 0

(c) H: µ1 − µ2 = 0 A: µ1 − µ2 < 0

(d) H: X̄1 − X̄2 = 0 A: X̄1 − X̄2 > 0

(e) H: µ1 − µ2 =0 A: µ1 − µ2 < 0.

Solution: c

27. The test statistic, rejection region (α=.05), and the p-value are:

(b) T ∗ =1.2004; reject if T ∗ 1.7530; p-value=.2486

(c) T ∗ =1.2004; reject if T ∗ 1.7530; p-value=.1243

(d) T ∗ =1.2173; reject if T ∗ 1.7709; p-value=.1220

(e) T ∗ =1.2004; reject if T ∗ 2.1314; p-value=.1243

Solution: c

(b) The unequal variance test is used if the ratio of the sample variances

is more than about 5:1

(c) If both sample sizes are large, the p-value for T ∗ can be approximated

using a normal distribution.

(d) If the df are fractional, we round down to the lower integer

(e) Outliers normally do not affect T ∗ very much in small samples.

Solution: e

29. These findings were submitted to a journal, and one reviewer questioned

the results because she believed that the data within each group were

not normally distributed. Consequently, a non-parametric procedure was

used, and the output follows:

                 SUM OF    EXPECTED     STD DEV     MEAN
LEVEL    N       SCORES    UNDER H0    UNDER H0    SCORE
I       10       102.00       90.00       10.20    10.20
II       7        51.00       63.00       10.20     7.29


WILCOXON 2-SAMPLE TEST (NORMAL APPROXIMATION)

(WITH CONTINUITY CORRECTION OF .5)

S= 51.00 Z=-1.1278 PROB >|Z|=0.2594

(b) S=51.0 p-value = .1297

(c) Z=-1.1278 p-value = .2594

(d) Z=-1.1278 p-value = .2760

(e) S=90.0 p-value = .1297

Solution: b

Two different emission control devices for automobiles were being tested

to determine if Device I gives greater emissions, on average, than Device

II. Twenty cars of the same model and year are equipped with the devices;

ten were equipped with Device I and ten were equipped with Device II.

Unfortunately, three cars were involved in accidents and had to be removed

from the study. The following output was obtained from SAS.

DEVICE   N     MEAN   STD DEV   STD ERROR   MINIMUM   MAXIMUM
I       10    1.032    0.0522      0.0165    0.9600    1.1500
II       7    1.004    0.0299      0.0113    0.9600    1.0500

FOR H0: VARIANCES EQUAL, F' = 3.05 WITH 9 AND 6 DF, PROB > F' = 0.1882

VARIANCES        T     DF   PROB > |T|
UNEQUAL     1.3844   14.6       0.1871
EQUAL       1.2590   15.0       0.2273

(a) H: µ1 − µ2 > 0 A: µ1 − µ2 = 0

(b) H: X̄1 − X̄2 > 0 A: X̄1 − X̄2 = 0

(c) H: X̄1 − X̄2 = 0 A: X̄1 − X̄2 > 0

(d) H: µ1 − µ2 = 0 A: µ1 − µ2 < 0


(e) H: µ1 − µ2 = 0 A: µ1 − µ2 > 0

Solution: e

Past performance 1990 Feb - 97%

31. The value of the proper test statistic and rejection region (α= 0.05) are:

(b) T ∗ = 1.26; reject H if T ∗ > 1.75

(c) T ∗ = 1.38; reject H if T ∗ > 2.14

(d) T ∗ = 1.26; reject H if T ∗ > 2.19

(e) T ∗ = 1.26; reject H if T ∗ < −2.14 or T ∗ > 2.14

Solution: b

Past performance 1990 Feb - 92%

(a) .1882

(b) .1871

(c) .2273

(d) .0936

(e) .1136

Solution: e

Past performance 1990 Feb - 65% (C-26%)

ing that the true variance for each type is 4 (ppm²) when testing at

α=.05. The required sample size is estimated to be:

(b) 4 cars for each device for a total of 8 cars

(c) 12 cars in total; 6 cars for each device.

(d) 4 cars in total; 2 cars for each device

(e) 24 cars in total; 12 cars for each device.

Solution: b

Past performance 1990 Feb - 56% (A-27%)

UATION:


A sheep producer wishes to investigate if the mean number of tapeworms

in the stomachs of Suffolk sheep is less if they have been treated with a

drug compared to sheep not treated. He obtains the following sample data

to conduct a 5% significance test:

Group Deviation

1 -No Drug 7 43.2 17.0 42

2 - Drug 7 28.6 14.1 37

(a) H0: µ1 − µ2 = 0; H1: µ1 − µ2 < 0

(b) H0: µ1 − µ2 = 0; H1: µ1 − µ2 > 0

(c) H0: X̄1 − X̄2 = 0; H1: X̄1 − X̄2 < 0

(d) H0: X̄1 − X̄2 = 0; H1: X̄1 − X̄2 > 0

(e) H0: µ1 − µ2 = 0; H1: µ1 − µ2 ≠ 0

Solution: b

(a) 1.54

(b) 1.28

(c) 1.75

(d) 2.1

(e) 4.41

(a) 1.8946

(b) 1.7709

(c) 1.9432

(d) 1.7823

(e) 2.1788


37. Calculate the observed value of the test statistic for the test of H0: µ1 − µ2 = 0 versus Ha: µ1 − µ2 < 0 on the basis of the following information. Test the hypotheses at the 5% level of significance.

Sample statistics for group 1:

sample size 50

sample variance 100

sample mean 403

Sample statistics for group 2:

sample size 60

sample variance 150

sample mean 409

(b) zobs = +2.83 so we conclude that µ1 is greater than µ2 .

(c) zobs = +2.78 so we conclude that µ1 is greater than µ2 .

(d) zobs = -2.78 so we conclude that µ1 is less than µ2 .

(e) zobs = -2.78 so we conclude that µ1 is greater than µ2 .
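With samples this large, the observed statistic is the difference in sample means divided by the unpooled standard error. A minimal Python sketch of the arithmetic, using the summary statistics listed in the question (the variable names are illustrative):

```python
import math
from statistics import NormalDist

# Summary statistics from the question
n1, var1, mean1 = 50, 100, 403
n2, var2, mean2 = 60, 150, 409

# Standard error of the difference in means, using each sample's own variance
se = math.sqrt(var1 / n1 + var2 / n2)
z_obs = (mean1 - mean2) / se

# Lower-tail p-value, because the alternative is mu1 - mu2 < 0
p_lower = NormalDist().cdf(z_obs)
print(round(z_obs, 2), round(p_lower, 4))
```

The standard error is sqrt(2 + 2.5), about 2.12, so z_obs is roughly -2.83, which lies below the one-sided 5% critical value of -1.645.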

the test of hypotheses in the previous question?

(b) The two population variances are equal, i.e. s1² = s2².

(c) Each population follows a normal distribution.

(d) Both (b) and (c) are necessary assumptions.

(e) None of the above assumptions are necessary.

drugs – a new drug and an old drug. The researcher would like to see

whether there is sufficient evidence to say that the new drug is better

than the old drug. In this problem, the researcher will commit a type I

error if:

(a) she concludes that the drugs are equal in effectiveness when in fact

the new drug is better.

(b) she concludes that the drugs are equal in effectiveness when in fact

the old drug is better.


(c) she concludes that the old drug is better when in fact the new drug

is better.

(d) she concludes that the new drug is better when in fact the drugs are

equal in effectiveness.

(e) she concludes that the old drug is better when in fact the drugs are

equal in effectiveness.

Solution: d

Past performance 1990 Dec - 83%

Past performance 1991 Feb - 83% (a-10%)

An experiment was conducted to assess the efficacy of spraying oats with

malathion (at .25 lbs/acre) to control the cereal leaf beetle. A sample of 10

farms were selected at random from southwest Manitoba. Each farm was

assigned at random to either the control group (no spray) or the treatment

group (spray). At the conclusion of the experiment, a plot on each farm

was selected and the number of larvae per stem was measured. Here are

two possible outputs from DataDesk (only one of which is correct; some

output hidden)

t-Tests

separate estimates of sigma_1, sigma_2

vs $H_A:\mu_{not spray}- \mu_{spray} > 0$

Sample mean(spray) =3.0508

------------------------------------

t-Test, paired samples

not spray - spray:


(a) 1.896, 0.033

(b) 1.896, 0.131

(c) 1.896, 0.065

(d) 1.887, 0.059

(e) 1.887, 0.118

Solution: c

Past performance 1993 Feb - 38% (a-53%)

(b) We conclude malathion is effective when in fact it is ineffective.

(c) We conclude malathion is effective when in fact it is effective.

(d) We conclude malathion is ineffective when in fact it is ineffective.

(e) We conclude malathion is neither ineffective or effective.

Solution: a

Past performance 1993 Feb - 83% (b-17%)

effect.

(b) the ability to not detect an effect of malathion when in fact there is

no effect.

(c) the ability to detect an effect of malathion when in fact there is an

effect.

(d) the ability to not detect an effect of malathion when in fact there is

an effect.

(e) the ability to make a correct decision regardless if malathion has an

effect or not.

Solution: c

Past performance 1993 Feb - 66% (a-10%; e-15%)

in controlling pests and their effects on subsequent yield. What is the best

reason for randomly assigning treatment levels (spraying or not spraying)

to the experimental units (farms)?


(a) Randomization makes the experiment easier to conduct because we

can apply the insecticide in any pattern rather than in a systematic

fashion.

(b) Randomization will tend to average out all other uncontrolled fac-

tors such as soil fertility so that they are not confounded with the

treatment effects.

(c) Randomization makes the analysis easier because the data can be

collected and entered into the computer in any order.

(d) Randomization is required by statistical consultants before they will

help you analyze the experiment.

(e) Randomization implies that it is not necessary to be careful during

the experiment, during data collection, and during data analysis.

Solution: b

Past performance 1990 Feb - 97%

Past performance 1993 Feb - 98%

Past performance 1996 Dec - 100%

Past performance 2006 Dec - 99%

In order to study the harmful effects of DDT poisoning, the pesticide was

fed to 6 randomly chosen rats out of a group of 12 rats. The other 6 rats

were used as the control group. The following data gives the measure-

ments of the amount of tremor detected in the bodies of each rat after the

experiment: The more tremor, the more harmful.

Control group : 11.1 12.1 9.3 6.6 9.6 8.2

Here is some output from JMP: (the differences are computed as control-

poisoned)

(a) H: µc = µp A: µc < µp


(b) H: X̄c = X̄p A: X̄c < X̄p

(c) H: pc = pp A: pc < pp A: βc < βp

(d) H: X̄c = X̄p A: X̄c ≠ X̄p

Solution: a

Past performance 1998 Dec - 95%

(a) We are about 95% confident that the rats in the poisoned group have

all between 14 and 2 more tremors than the control group.

(b) The std error measures how much the estimated difference could vary

if a new experiment was done.

(c) We are about 95% confident that the sample mean number of tremors

for the control group is between 2 and 14 more than the sample mean

number of tremors in the poisoned group.

(d) The test-statistic is a measure of how far the data is from that ex-

pected under the alternate hypothesis.

(e) The p-value measures the probability that there is no difference in

the mean number of tremors between the two groups.

Solution: b

Past performance 1998 Dec - 48% (20% a; 23% e)

Note that (a) refers to individual rats, not to the mean over all the rats.

Note that (e) incorrectly states that p-values measure the probability of a hypothesis.

(a) The p-value is small. There is good evidence that the two means are

equal.

(b) The p-value is large. There is good evidence that the two means are

different.

(c) The p-value is small. There is good evidence that the two sample

means differ, in fact, the control group appears to have fewer tremors,

on average.

(d) The confidence interval does not include 0. Hence, there is evidence

that the mean number of tremors for all potential rats in the poisoned

group is larger than that in the control group.

(e) The confidence interval does not include 0. Hence there is no evidence

that the means are the same for both groups.


Solution: d

Past performance 1998 Dec - 23% (20% e; 53% c)

Note: (c) refers to SAMPLE means not population means.

A researcher wants to see if birds that build larger nests lay larger eggs.

She selects two random samples of nests: one of small nests and the other

of large nests. She measures one egg from each nest. The data are sum-

marized below.

(a) H: µL = µS; A: µL > µS

(b) H: ȲL = ȲS; A: ȲL > ȲS

(c) H: µL = µS; A: µL ≠ µS

(d) H: ȲL = ȲS; A: ȲL ≠ ȲS

(e) H: µL = µS; A: µL < µS


Solution: a

Past performance 2006 Dec - 87%

(a) We conclude that larger nests have the same size eggs (on average)

when in fact they are larger.

(b) We conclude that larger nests have larger eggs (on average) when in

fact they are larger.

(c) We conclude that larger nests have the same size eggs (on average)

when in fact there is no difference in the mean.

(d) We conclude that larger nests had larger eggs (on average) when in

fact there is no difference in the mean.

(e) I ever take a statistics course again in my life! (just kidding).

Solution: d

Past performance 2006 Dec - 77% (20%-a)


Multiple Choice Questions

Testing - Two independent samples on proportions

serious hazard to operators of farm equipment. The driveline is covered

by a shield in new tractors, but for a variety of reasons, the shield is often

missing on older tractors. Two types of shields are the bolt-on and the flip-up. It was believed that the bolt-on shield was perceived as a nuisance

by the operators and deliberately removed, but the flip-up shield is easily

lifted for inspection and maintenance and may be left in place. In a study

initiated by the National Safety Council of the U.S., a sample of older

tractors with both types of shields was taken to see what proportion were

removed. Of 183 tractors designed to have bolt-on shields, 35 had been

removed. Of the 136 tractors with flip-up shields, 15 were removed. We

wish to test the hypothesis H: pb = pf vs A: pb ≠ pf where pb and pf are

the proportion of tractors with the bolt-on and flip-up shields removed,

respectively. The test-statistic is computed to be 1.97. The p-value is:

(a) .025

(b) .049

(c) .012

(d) .975

(e) .475

Solution: b

Past performance 1991 Feb - 65% (a-27%)
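The test statistic of 1.97 and the two-sided p-value can be reproduced from the counts given in the question. A minimal Python sketch, assuming the usual pooled-proportion z test for two independent samples (the function name `two_proportion_z` is illustrative):

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for H: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                  # common proportion under H
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Bolt-on: 35 of 183 removed; flip-up: 15 of 136 removed
z, p = two_proportion_z(35, 183, 15, 136)
print(round(z, 2), round(p, 3))
```

Because the alternative is two-sided, the upper-tail area beyond 1.97 (about .025) is doubled, giving a p-value of about .049.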

manufactured by machine B showed 52 and 23 defective bolts respectively.

The observed value of the test statistic for testing the null hypothesis that

there is no difference in the performance of the machines is:

(a) 3.29


(b) 2.47

(c) 8.56

(d) 12.32

(e) 3.41

Solution: e

3. Two different medical procedures are widely used to treat a disease. One

hundred patients were randomly selected for each procedure in a recent

clinical trial, with the following results:

procedure 1 100 78

procedure 2 100 87

What is the absolute value of the test statistic calculated from the data for

testing the null hypothesis that there is no difference between the success

rates between procedure 1 and procedure 2?

(a) +0.658

(b) +1.675

(c) +2.385

(d) +2.575

(e) +31.610

Solution: b

tiveness of teaching English by the traditional classroom lecture system

(T) and by the extensive use of audio visual aids (A). To do so a class of 250

is randomly divided into two groups

150 are taught by method A; of these 105 pass a test.

The appropriate test statistic for testing whether the traditional method

has a lower passing rate than the audio visual methods:

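(a) $\dfrac{.63-.70}{\sqrt{\dfrac{.672\times.328}{100}+\dfrac{.672\times.328}{150}}}$

(b) $\dfrac{.63-.70}{\sqrt{\dfrac{.630\times.370}{100}+\dfrac{.700\times.300}{150}}}$

(c) $\dfrac{.63-.70}{\sqrt{\dfrac{.667\times.333}{100}+\dfrac{.667\times.333}{150}}}$

(d) $\dfrac{(63-67.2)^2}{67.2}+\dfrac{(37-32.8)^2}{32.8}+\dfrac{(105-100.8)^2}{100.8}+\dfrac{(45-49.2)^2}{49.2}$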

(e) none of the above

Solution: a

Manitoba. They are asked their reaction to increased tuition fees. The

results are as follows:

Based on the data (with α = .05):

(b) Our suspicions are confirmed as the p-value is .2939.

(c) Our suspicions are confirmed as the p-value is 0.82.

(d) We cannot conclude that a larger proportion of women are in support

of the increase as the p-value is .2061.

(e) We cannot conclude that a larger proportion of women are in support

of the increase as the p-value is .2939.

In the past decade there have been extensive antismoking campaigns to

try and reduce the proportion of smokers in the population. In 1982, a

survey of 350 adult females revealed that 148 smoked. In 1989, 488 adult

females were surveyed and 163 smoked. Let p represent the proportion of

adult female smokers.

6. The null and alternate hypotheses are:

(b) H: p1982 ≠ p1989 A: p1982 = p1989

(c) H: p1989 = .423 A: p1989 < .423

(d) H: p1982 = .334 A: p1982 > .334

(e) H: p1982 = p1989 A: p1982 ≠ p1989

Solution: a

Past performance 1990 Feb - 83%

Past performance 1990 Apr - 92%

Past performance 1990 Dec - 68% (22% - e)


7. The test statistic would be computed as:

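(a) $.09 \Big/ \sqrt{\dfrac{.423(1-.423)}{350}+\dfrac{.334(1-.334)}{488}}$

(b) $.09 \Big/ \sqrt{\dfrac{.423(1-.423)}{838}+\dfrac{.334(1-.334)}{838}}$

(c) $.09 \Big/ \sqrt{\dfrac{.371(1-.371)}{838}}$

(d) $.09 \Big/ \sqrt{\dfrac{.371(1-.371)}{350}+\dfrac{.371(1-.371)}{488}}$

(e) $.09 \Big/ \sqrt{\dfrac{.423(1-.423)}{350}+\dfrac{.370(1-.370)}{488}}$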

Solution: d

Past performance 1990 Feb - 65% (A-32%)

Past performance 1990 Apr - 83%

(a) 2.63

(b) .004

(c) .009

(d) .496

(e) .089

Solution: b

Past performance 1990 Feb - 80%

Past performance 1990 Apr - 83%

(a) The probability that the proportion of smokers has not changed is

.053.

(b) The proportion of smokers has definitely decreased.

(c) There is some, but not overwhelming evidence, that the proportion

of smokers has decreased.

(d) There is no evidence that the proportion of smokers is the same in

both years.

(e) There is overwhelming evidence that the proportion of smokers has

stayed the same.

Solution: c

Past performance 1990 Dec 61%


10. In a similar study of adult males, the p-value was found to be .053. This

means:

(a) The probability that the proportion of male smokers has not changed

is .053.

(b) The proportion of male smokers has definitely decreased.

(c) If the proportion of male smokers has not changed, then there is only

a .053 chance of seeing the observed drop in the smoking rate in the

survey.

(d) If the proportion of male smokers has changed, then there is only a

.053 chance of detecting a difference.

(e) If the proportion of smokers has changed, then there is only a .053

chance of seeing the observed drop in the smoking rate in the survey.

Solution: c

Past performance 1990 Feb - 38% (A-14%, B-38%, C-38%, D-29%, E-17%)

Past performance 1990 Apr - 64%(C-64%, D-11%, E-21%)

was found that 66% of them had previously attended some other college

or university. In a random sample of 100 University of Waterloo graduate

students, it was found that 35% of them had previously attended some

other college or university. A 95% confidence interval for estimating the

difference in proportions of graduate students who had previously attended

some other college or university between the University of Manitoba and

the University of Waterloo is:

(a) $(0.66-0.35) \pm 1.96\sqrt{(0.3366)(0.6633)\left(\dfrac{1}{200}+\dfrac{1}{100}\right)}$

(b) $(0.66-0.35) \pm 1.96\sqrt{\dfrac{(0.66)(0.34)}{200}+\dfrac{(0.35)(0.65)}{100}}$

(c) $(0.66-0.35) \pm 1.96\sqrt{(0.5566)(0.4433)\left(\dfrac{1}{100}+\dfrac{1}{200}\right)}$

(d) $(0.33-0.35) \pm 1.96\sqrt{(0.5566)(0.4433)\left(\dfrac{1}{100}+\dfrac{1}{200}\right)}$

(e) $(0.33-0.35) \pm 1.645\sqrt{(0.5566)(0.4433)\left(\dfrac{1}{100}+\dfrac{1}{200}\right)}$

One criticism of reforestation efforts after timber harvesting is that too few of the seedlings survive. An experiment was conducted to assess if mulching the slash (limbs, roots, small branches, etc.) and leaving the mulch on the ground improves the survival rate compared to just leaving the slash on the ground. It is believed that mulching will cause the material to break down sooner and release the nutrients to the seedlings. A total of 500 seedlings were randomly assigned to the two treatments and the two year survival rate was measured. Of the 250 seedlings receiving the “mulching” treatment, 75 survived; of the 250 seedlings receiving the “control” treatment, 55 survived.

12. The null and alternate hypotheses are: (m=mulch, c=control)

(b) H: µm = .22 A: µm > .22

(c) H: pm − pc = 0 A: pm − pc > 0

(d) H: µm − µc = 0 A: µm − µc > 0

(e) H: pm − pc = 0 A: pm − pc ≠ 0

Solution: c

Past performance 1993 Feb - 82% (d=19%)

13. The value of the test statistic and the p-value are:

(a) 2.76, .003

(b) 2.05, .042

(c) 2.76, .006

(d) 2.05, .021

(e) 2.05, .011

Solution: d

Past performance 1993 Feb - 84%
