
Circulation

As publication manager of a weekly magazine, you wish to develop guaranteed
circulation figures to use in soliciting advertising. A study of weekly sales over the past
three years fails to reveal any marked trend or seasonal variation; the circulation
figures seem to be roughly normally distributed. During the period studied, the mean
weekly circulation was 650,000 copies, with a sample standard deviation of 40,000
copies. For parts (a)-(d), which review basic concepts involving random variables and
the normal distribution, assume these figures to be accurate estimates.

(a) If you guarantee the sale of at least 600,000 copies next week, what chance
will you face of failing to meet the guarantee?
(b) How many copies of your next issue must you print, in order to have only
one chance in 100 of being caught short?
(c) What mean weekly circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?
(d) What total circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?
(e) The estimate of mean weekly circulation is actually based on a
sample of 156 weeks (3 years). Give a 95%-confidence interval
for the true mean weekly circulation.

Advertising Sales

A magazine publishing house wishes to estimate (for purposes of advertising
sales) the average annual expenditure on furniture among its subscribers. A
sample of 100 subscribers is chosen at random from the 100,000-person
subscription list, and each sampled subscriber is questioned about his
furniture purchases in the last year. The sample mean response is $530,
with a sample standard deviation of $180.

(a) What estimate (of the population average) should be reported?
(b) Give a 95% confidence interval for the estimate.
(c) How would your answer in (b) change, if the subscription list included 400,000 people?
(d) How would your answer in (b) change, if the subscription list included 1,000 people?
(e) How would your answer in (b) change, if the sample size were only 10?
(f) Did any of your answers above require special assumptions?
(g) If (b) had asked for a 99% confidence interval, would you have reported a
wider or narrower interval?

TV Ratings

The A.C. Nielsen Co. wishes to estimate, for each of several cities, the
amount of time a typical adult living in the city spends watching television
each week. In San Diego (which has an adult population of 400,000), they
sample 100 people. The sample mean is 25 hours, with a sample standard
deviation of 10 hours.

(a) Give a 95%-confidence interval for their estimate.
(b) How many more people should they interview, in order to be 95%-confident
that their final estimate is wrong by no more than 1 hour?
(c) Assuming that viewing time varies from person to person in Los Angeles by
about as much as it does in San Diego, and that Nielsen wishes to make as
precise an estimate for L.A. as they did in part (a) for San Diego, how large
a sample should they take in L.A. (which has an adult population of
1,600,000)?

Miscellany

Explain, briefly, the difference between sampling bias and sampling error.

A 95%-confidence interval for an estimate indicates how much you can trust
the (fill in the blank) __________________ .

A firm has conducted a study, and has found that a 95%-confidence interval
for the mean time required to process new insurance policies is (10 days,
12 days). This means that

(a) only 5 percent of all policies take less than 10 or more than 12 days to process.
(b) only 5 percent of all policies take between 10 and 12 days to process.
(c) about 95 out of every 100 intervals similarly constructed will contain the true
mean processing time.
(d) the probability is 0.95 that the true mean lies between 10 and 12 days.

Excel Functions

Excel's normal distribution and t-distribution functions are not consistently
defined. While NORMDIST and NORMINV are both left-tail oriented, TDIST is
right-tail oriented and TINV is two-tail oriented.

1000        mean
200         standard deviation
900         target
30.8538%    NORMDIST: Pr(normally-distributed random variable with given mean and
            standard deviation is less than or equal to target value)

1000        mean
200         standard deviation
95.0000%    target left-tail probability
1328.97     NORMINV: value that a normally-distributed random variable with given mean
            and standard deviation has "target probability" of being less than
            or equal to
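
Outside Excel, the two left-tail normal calculations above can be reproduced with Python's standard library. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative and not part of the worksheet.

```python
from statistics import NormalDist

# Same inputs as the two worksheet examples above.
dist = NormalDist(mu=1000, sigma=200)

# Left-tail probability: Pr(X <= 900), the analogue of NORMDIST with cumulative = TRUE.
left_tail = dist.cdf(900)

# Left-tail quantile: the value x with Pr(X <= x) = 0.95, the analogue of NORMINV.
cutoff = dist.inv_cdf(0.95)

print(f"{left_tail:.4%}")
print(f"{cutoff:.2f}")
```

Both calls are left-tail oriented, which matches the orientation the worksheet attributes to NORMDIST and NORMINV.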

9           degrees of freedom
1.5         target value (must be non-negative)
8.3925%     TDIST: Pr(t-distributed random variable with given number of degrees of
            freedom [and with mean 0 and standard deviation 1] is greater
            than or equal to target value)

9           degrees of freedom
5.0000%     target two-tail probability
2.2622      TINV: value that the absolute value of a t-distributed random variable with
            given number of degrees of freedom has "target probability" of
            exceeding

Excel offers functions for computing the variance and standard deviation of a
"population" of values, and a matching set of functions for using sample
values to estimate a population's variance and standard deviation.

1.0
2.0         a range of sample data
3.0
4.0

2.5000      population and sample mean       sum of observations, divided by "n"
1.2500      population variance              sum of squared differences from the mean, divided by "n"
1.1180      population standard deviation    square root of population variance
1.6667      sample variance                  sum of squared differences from the mean, divided by "n-1"
1.2910      sample standard deviation        square root of sample variance
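
The divide-by-"n" versus divide-by-"n-1" distinction can also be checked outside Excel. The following is a small sketch using Python's standard `statistics` module (an assumption of this example; the worksheet itself relies on Excel's built-in functions), applied to the same four data values.

```python
from statistics import mean, pvariance, pstdev, variance, stdev

data = [1.0, 2.0, 3.0, 4.0]  # the sample range shown above

avg = mean(data)           # sum of observations / n
pop_var = pvariance(data)  # squared deviations from the mean / n
pop_sd = pstdev(data)      # square root of the population variance
samp_var = variance(data)  # squared deviations from the mean / (n - 1)
samp_sd = stdev(data)      # square root of the sample variance

for label, value in [("mean", avg), ("population variance", pop_var),
                     ("population sd", pop_sd), ("sample variance", samp_var),
                     ("sample sd", samp_sd)]:
    print(f"{value:.4f}  {label}")
```

The sample versions are always the larger of each pair, because the same sum of squared deviations is divided by a smaller number.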

Circulation

As publication manager of a weekly magazine, you wish to develop guaranteed
circulation figures to use in soliciting advertising. A study of weekly sales over the past
three years fails to reveal any marked trend or seasonal variation; the circulation
figures seem to be roughly normally distributed. During the period studied, the mean
weekly circulation was 650,000 copies, with a sample standard deviation of 40,000
copies. For parts (a)-(d), which review basic concepts involving random variables and
the normal distribution, assume these figures to be accurate estimates.

(a) If you guarantee the sale of at least 600,000 copies next week, what chance
will you face of failing to meet the guarantee?

You will only fail to meet the guarantee if demand runs below 600,000.

650,000     weekly mean
40,000      standard deviation
600,000     guarantee

10.56%      Pr(demand < guaranteed amount)
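
As a cross-check of the figure above, here is a minimal sketch in Python, assuming the standard library's `statistics.NormalDist` (Python 3.8+) as a stand-in for the spreadsheet calculation; the variable names are illustrative.

```python
from statistics import NormalDist

# Weekly demand modeled as normal with the stated mean and standard deviation.
weekly_demand = NormalDist(mu=650_000, sigma=40_000)

# Chance that demand falls below the guaranteed 600,000 copies.
p_miss = weekly_demand.cdf(600_000)

print(f"{p_miss:.2%}")
```

The guarantee sits 1.25 standard deviations below the mean, so the answer is simply the left-tail area at z = -1.25.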

(b) How many copies of your next issue must you print, in order to have only
one chance in 100 of being caught short?

You will only run short if demand exceeds the number of copies printed.

1%          chance of running out
743,054     number to print
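
The print run is the 99th percentile of weekly demand. A short sketch of that lookup, again assuming Python's `statistics.NormalDist` rather than the worksheet's own formula:

```python
from statistics import NormalDist

weekly_demand = NormalDist(mu=650_000, sigma=40_000)

# Print enough copies that demand exceeds the run only 1% of the time,
# i.e. find the value with 99% of the distribution to its left.
print_run = weekly_demand.inv_cdf(0.99)

print(round(print_run))
```

Note that the 1% "caught short" chance is a right-tail probability, so it is converted to a 0.99 left-tail probability before the inverse lookup.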

(c) What mean weekly circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?

In a typical advertising contract, you promise to rebate some of the fees if your
circulation doesn't reach a specified target level. In this case, you want a 95%
chance that circulation will be above the guaranteed level, i.e., only a 5% chance
of circulation being below that level.

The key here is to remember that the variance of a sum of independent random
variables is the sum of their variances. Therefore the standard deviation of a
sum of n independent, identically-distributed random variables is sqrt(n) times
the standard deviation of a single one, and the standard deviation of the mean is
the original standard deviation divided by sqrt(n).

52          weeks in time period
650,000     mean per week
5,547       standard deviation of mean over time period
5%          chance of failing to meet guarantee

640,876     guarantee to make
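
The square-root rule described above can be sketched as follows. This is an illustrative Python version, assuming independent, identically distributed weeks as the solution does; names such as `sd_of_mean` are not from the worksheet.

```python
from math import sqrt
from statistics import NormalDist

weeks = 52
weekly_mean = 650_000
weekly_sd = 40_000

# Standard deviation of the 52-week average: sigma / sqrt(n).
sd_of_mean = weekly_sd / sqrt(weeks)

# Guarantee the level that the yearly average falls below only 5% of the time.
mean_guarantee = NormalDist(weekly_mean, sd_of_mean).inv_cdf(0.05)

print(round(sd_of_mean), round(mean_guarantee))
```

Averaging over 52 weeks shrinks the standard deviation by a factor of about 7.2, which is why the yearly guarantee can sit so close to the weekly mean.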

(d) What total circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?


The answer obviously must be 52 times the answer to the previous question.

33,325,552 guarantee to make

This could also be obtained directly.

52            weeks in time period
33,800,000    expected total over time period
288,444       standard deviation of total over time period
5%            chance of failing to meet guarantee

33,325,552    guarantee to make
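
Both routes to the total can be compared in a few lines. This is a sketch under the same independence assumption, with illustrative variable names, using Python's standard library:

```python
from math import sqrt
from statistics import NormalDist

weeks, weekly_mean, weekly_sd = 52, 650_000, 40_000

# Direct route: the total has mean n * mu and standard deviation sqrt(n) * sigma.
total_mean = weeks * weekly_mean
total_sd = sqrt(weeks) * weekly_sd
total_guarantee = NormalDist(total_mean, total_sd).inv_cdf(0.05)

# Indirect route: 52 times the guarantee for the weekly average.
mean_guarantee = NormalDist(weekly_mean, weekly_sd / sqrt(weeks)).inv_cdf(0.05)
scaled_guarantee = weeks * mean_guarantee

print(round(total_sd), round(total_guarantee), round(scaled_guarantee))
```

The two guarantees agree up to floating-point rounding, since the total is just the average multiplied by 52.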

(e) The estimate of mean weekly circulation is actually based on a
sample of 156 weeks (3 years). Give a 95%-confidence interval
for the true mean weekly circulation.

650,000 ± 6,277     fine for all practical purposes (normal distribution)
650,000 ± 6,326     a bit more precise (t-distribution with 155 degrees of freedom)
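
The first of the two margins above can be reproduced with a normal-based calculation. The sketch below assumes Python's `statistics.NormalDist`; the standard library has no t-distribution, so the slightly wider second figure is not recomputed here.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 650_000
sample_sd = 40_000
n = 156  # weeks in the three-year sample

# Two-sided 95% interval: put 2.5% in each tail of the standard normal.
z = NormalDist().inv_cdf(0.975)
margin = z * sample_sd / sqrt(n)

lower, upper = sample_mean - margin, sample_mean + margin
print(round(margin), round(lower), round(upper))
```

With 156 observations the normal and t multipliers differ by less than 1%, which is why the first answer is described as fine for practical purposes.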


Advertising Sales

A magazine publishing house wishes to estimate (for purposes of advertising
sales) the average annual expenditure on furniture among its subscribers. A
sample of 100 subscribers is chosen at random from the 100,000-person
subscription list, and each sampled subscriber is questioned about his
furniture purchases in the last year. The sample mean response is $530,
with a sample standard deviation of $180.

(a) What estimate (of the population average) should be reported?

$530        sample mean, and estimate of (true) population mean
$180        sample standard deviation, and estimate of population standard deviation
100         sample size

(b) Give a 95% confidence interval for the estimate.

530 ± 35.28

(c) How would your answer in (b) change, if the subscription list included 400,000 people?

It wouldn't change. But see (d).

(d) How would your answer in (b) change, if the subscription list included 1,000 people?

Again, it wouldn't change. But this is assuming that the sample procedure
used simple random sampling with replacement. Had it used simple random
sampling without replacement, the answers to (b)-(d) would be:

population size     margin of error     for sticklers (t-distribution)
100,000             35.26253            35.6982
400,000             35.27563            35.7115
1,000               33.48629            33.9000
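
The "margin of error" column is consistent with multiplying the with-replacement margin by the finite population correction sqrt((N - n) / (N - 1)). The sketch below assumes that form of the correction and uses an illustrative helper, `fpc_margin`, which is not part of the worksheet. Small differences in the last decimals arise because the table appears to use a rounded multiplier of 1.96.

```python
from math import sqrt
from statistics import NormalDist

def fpc_margin(s, n, population, confidence=0.95):
    """Normal-based margin of error for sampling without replacement."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    correction = sqrt((population - n) / (population - 1))
    return z * s / sqrt(n) * correction

# Sample standard deviation $180, sample size 100, three list sizes.
margins = {N: fpc_margin(180, 100, N) for N in (100_000, 400_000, 1_000)}

for N, m in margins.items():
    print(f"{N:>9,}  {m:.3f}")
```

The correction only matters when the sample is a noticeable fraction of the population, as with the 1,000-person list where 10% of subscribers are sampled.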

(e) How would your answer in (b) change, if the sample size were only 10?
(f) Did any of your answers above require special assumptions?

Assuming the population distribution is roughly normal - probably a bad
assumption in reality, since it's likely that the population contains a number
of people who bought no furniture last year - the answer to (e) would be

530 ± 128.7642

2.262157 how many standard deviations out from the mean you must go in
the t-distribution with 10-1 = 9 degrees of freedom, in order to cut
off 5% of the distribution in the two tails
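
The small-sample margin follows directly from the multiplier quoted above. In this sketch the t critical value is taken from the text as a constant, since Python's standard library has no t-distribution; the variable names are illustrative.

```python
from math import sqrt

sample_sd = 180.0
n = 10

# Two-tail 5% critical value for 9 degrees of freedom, as quoted in the solution.
t_crit = 2.262157

margin = t_crit * sample_sd / sqrt(n)
print(f"{margin:.4f}")
```

Compared with the n = 100 case, the margin grows both because the standard error is larger and because the t multiplier exceeds 1.96.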

(g) If (b) had asked for a 99% confidence interval, would you have reported a
wider or narrower interval?


Wider. The 99%-confidence interval is

530 ± 46.3649     fine
530 ± 47.2753     a bit more precise
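
To see why the interval widens, the normal-based margins at the two confidence levels can be compared side by side. This is a sketch with a hypothetical helper, `z_margin`, assuming Python's `statistics.NormalDist`; it reproduces the first ("fine") figure only.

```python
from math import sqrt
from statistics import NormalDist

def z_margin(s, n, confidence):
    """Normal-based margin of error at the given two-sided confidence level."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * s / sqrt(n)

m95 = z_margin(180, 100, 0.95)
m99 = z_margin(180, 100, 0.99)

print(f"95%: {m95:.4f}   99%: {m99:.4f}")
```

Asking for more confidence pushes the cutoff further into the tails, so the multiplier and the margin both increase.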


TV Ratings

The A.C. Nielsen Co. wishes to estimate, for each of several cities, the
amount of time a typical adult living in the city spends watching television
each week. In San Diego (which has an adult population of 400,000), they
sample 100 people. The sample mean is 25 hours, with a sample standard
deviation of 10 hours.

(a) Give a 95%-confidence interval for their estimate.

25 ± 1.96

(b) How many more people should they interview, in order to be 95%-confident
that their final estimate is wrong by no more than 1 hour?

It takes roughly four times as large a sample, in order to cut the margin
of error in the estimate in half (this is the "square-root" effect). Hence, they
need a total sample of roughly 400 viewers.

Working it out precisely, the total should be

384 viewers in the full sample, i.e., about 284 more than the 100 already interviewed
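
The precise figure comes from solving z * s / sqrt(n) = 1 for n. The sketch below assumes the normal multiplier and illustrative variable names. It rounds up to guarantee the target margin, which gives one viewer more than the rounded figure reported above.

```python
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)  # two-sided 95% multiplier
sample_sd = 10.0                 # hours
target_margin = 1.0              # hours
already_sampled = 100

# Solve z * s / sqrt(n) = target for n.
n_exact = (z * sample_sd / target_margin) ** 2

# Round up so the margin does not exceed the target.
n_total = ceil(n_exact)
n_more = n_total - already_sampled

print(f"{n_exact:.2f}", n_total, n_more)
```

Halving the margin from 1.96 to roughly 1 requires about four times the original sample, in line with the square-root effect described earlier.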

(c) Assuming that viewing time varies from person to person in Los Angeles by
about as much as it does in San Diego, and that Nielsen wishes to make as
precise an estimate for L.A. as they did in part (a) for San Diego, how large
a sample should they take in L.A. (which has an adult population of
1,600,000)?

The same size. For practical purposes (i.e., ignoring the distinction between
sampling with and without replacement), the margin of error in an estimate
depends on the sample size, but not on the population size.


Miscellany

Explain, briefly, the difference between sampling bias and sampling error.

Sampling bias is the result of being dumb; its presence is undetectable by
statistical analysis; it is completely avoidable through the use of good
judgment. Sampling error is the result of being unlucky; our exposure to it
can be measured and controlled; it can never be eliminated (except through
100% sampling).

A 95%-confidence interval for an estimate indicates how much you can trust
the (fill in the blank) __________________ .

... estimation procedure. The point here is that we don't have any idea how
accurate our actual estimate is, but we do know how much accuracy is
inherent in the procedure we used to make the estimate.

A firm has conducted a study, and has found that a 95%-confidence interval
for the mean time required to process new insurance policies is (10 days,
12 days). This means that

(a) only 5 percent of all policies take less than 10 or more than 12 days to process.
(b) only 5 percent of all policies take between 10 and 12 days to process.
(c) about 95 out of every 100 intervals similarly constructed will contain the true
mean processing time.
(d) the probability is 0.95 that the true mean lies between 10 and 12 days.

(a) and (b) are totally wrong, since they refer to the distribution of individuals
within the population, and a 95%-confidence interval tells us nothing about
that. It could well be that half of all the policies are approved instantly, and
the other half take about three weeks.

(c) is objectively (no one could argue) correct. It restates the definition of a
95%-confidence interval.


(d) is subjectively (from the viewpoint of the person making the estimate,
who sees only this sample data) correct, if we accept the principle that
most people equate lack-of-personal-knowledge with randomness. (d) is the
interpretation most commonly used in practice.
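
Interpretation (c) can be illustrated with a small simulation. Everything in this sketch is hypothetical: the true mean of 11 days, the standard deviation of 3 days, the sample size of 50, and the normal population are assumptions chosen for illustration, not figures from the study. The interval uses the known standard deviation so that the nominal 95% coverage is exact.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and study design (assumptions, not from the text).
true_mean, sigma, n, trials = 11.0, 3.0, 50, 2000

z = NormalDist().inv_cdf(0.975)
half_width = z * sigma / sqrt(n)  # known-sigma 95% interval half-width

hits = 0
for _ in range(trials):
    sample = [random.gauss(true_mean, sigma) for _ in range(n)]
    # The interval (mean - half_width, mean + half_width) covers the truth
    # exactly when the sample mean lands within half_width of it.
    if abs(mean(sample) - true_mean) <= half_width:
        hits += 1

coverage = hits / trials
print(f"{coverage:.1%} of intervals contained the true mean")
```

Each simulated study produces a different interval; the 95% figure describes how often the procedure succeeds across studies, not the fate of any single interval.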
