
Name: __________________________

Semester 1 Review 2019


Multiple Choice
Identify the choice that best completes the statement or answers the question.

1. You measure the age, marital status and political affiliation of an SRS of 1463 women. What are the number
and type of variables measured? (Quantitative, Categorical, Neither)
Quantitative – age
Categorical – Marital status and political affiliation

Scenario 1-1
A review of voter registration records in a small town yielded the following table of the number of males and
females registered as Democrat, Republican, or some other affiliation.

             Male   Female
Democrat     400    600
Republican   500    400
Other        200    100

2. Use Scenario 1-1. What is the proportion of males that are registered as Democrats?

3. Use Scenario 1-1. What is the conditional distribution of political party among females?
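Questions 2 and 3 both condition on a column of the two-way table. A minimal sketch of that calculation, using the counts from Scenario 1-1; the helper name `conditional_distribution` is only illustrative:

```python
# Counts copied from the Scenario 1-1 two-way table.
counts = {
    "Democrat": {"Male": 400, "Female": 600},
    "Republican": {"Male": 500, "Female": 400},
    "Other": {"Male": 200, "Female": 100},
}


def conditional_distribution(table, column):
    """Return each row's share of the chosen column's total."""
    total = sum(row[column] for row in table.values())
    return {party: row[column] / total for party, row in table.items()}


male_dist = conditional_distribution(counts, "Male")
female_dist = conditional_distribution(counts, "Female")

# Question 2: proportion of males registered as Democrats (400 / 1100).
print(f"P(Democrat | Male) = {male_dist['Democrat']:.3f}")

# Question 3: conditional distribution of party among females.
for party, share in female_dist.items():
    print(f"P({party} | Female) = {share:.3f}")
```

Each conditional distribution divides by the column total (1100 for either sex), not by the grand total of 2200.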

4. Define skew (left and right) in terms of the median and the mean.
In a distribution with a right skew, the mean is greater than the median.
In a distribution with a left skew, the mean is less than the median.

Scenario 1-6
A sample was taken of the salaries of 20 employees of a large company. The following boxplot shows the
salaries (in thousands of dollars) for this year.
5. Describe the box plot of Scenario 1-6. Include the five-number summary, IQR, and calculate the minimum
and maximum cut-off values for an outlier. (Use Q1 − 1.5 · IQR and Q3 + 1.5 · IQR.)
The distribution is somewhat symmetric with a slight right skew. The median is approximately 48. The
values in the upper 50% have slightly more variation than the numbers in the lower 50%.
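The five-number summary, IQR, and outlier cut-offs requested in question 5 depend on the boxplot's values. A rough sketch, ASSUMING the boxplot was drawn from the same 20 salaries listed in Scenario 2-1 (the source does not confirm this) and using the median-of-halves convention for quartiles; other quartile conventions give slightly different values:

```python
from statistics import median

# ASSUMPTION: these are the 20 salaries (in thousands) from Scenario 2-1.
salaries = [28, 31, 34, 35, 37, 41, 42, 42, 42, 47,
            49, 51, 52, 52, 60, 61, 67, 72, 75, 77]


def five_number_summary(data):
    """Min, Q1, median, Q3, max, with quartiles as medians of each half."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


minimum, q1, med, q3, maximum = five_number_summary(salaries)
iqr = q3 - q1
low_cutoff = q1 - 1.5 * iqr
high_cutoff = q3 + 1.5 * iqr

print(minimum, q1, med, q3, maximum)
print("IQR:", iqr, "cut-offs:", low_cutoff, high_cutoff)
```

Under that assumption the summary is 28, 39, 48, 60.5, 77 with IQR 21.5, so the cut-offs are 6.75 and 92.75 and no salary counts as an outlier.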

6. Define the standard deviation in terms of the mean. When is the standard deviation of a data set equal to
zero?
The standard deviation measures the typical distance of the data values from the mean. It is zero
only when all the values are the same.

7. Explain what it means to be in the 90th percentile.


The 90th percentile is the score that separates the lower 90% of the data from the upper 10% of the
data.

8. Ramon is planning on buying a new car. He is looking at the Ford Escape, a sport-utility vehicle, which
gets 28 highway miles per gallon, and the Ford Fusion, a mid-sized sedan, which gets 31 highway miles per
gallon. The mean fuel efficiency for all sport utility vehicles is 22, with a standard deviation of 7.6. The mean
of all mid-sized sedans is 27, with a standard deviation of 5.2. Use the z-scores to determine which vehicle
has a better standing (in terms of gas mileage), relative to others of the same style?
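One way to compare the two vehicles is to standardize each mileage against its own class. A short sketch using the figures given in the question; the function name `z_score` is only illustrative:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd


# Ford Escape compared with all sport-utility vehicles.
z_escape = z_score(28, mean=22, sd=7.6)
# Ford Fusion compared with all mid-sized sedans.
z_fusion = z_score(31, mean=27, sd=5.2)

print(f"Escape z = {z_escape:.2f}")
print(f"Fusion z = {z_fusion:.2f}")
```

The Escape's z-score (about 0.79) is slightly higher than the Fusion's (about 0.77), so the Escape has the marginally better standing within its class even though its raw mileage is lower.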

Scenario 2-1
A sample was taken of the salaries of 20 employees of a large company. The following are the salaries (in
thousands of dollars) for this year. For convenience, the data are ordered.

28 31 34 35 37 41 42 42 42 47
49 51 52 52 60 61 67 72 75 77

9. Suppose each employee in the company receives a $3,000 raise for next year (each employee's salary is
increased by $3,000). How does this impact the mean, median, and standard deviation? You do not need to
calculate each; just explain in terms of the $3,000.
The median and mean increase by $3,000. The standard deviation is unchanged.
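The claim can be checked numerically on the Scenario 2-1 data. A small sketch, with salaries kept in thousands of dollars so the raise is 3:

```python
from statistics import mean, median, stdev

# Scenario 2-1 salaries, in thousands of dollars.
salaries = [28, 31, 34, 35, 37, 41, 42, 42, 42, 47,
            49, 51, 52, 52, 60, 61, 67, 72, 75, 77]

# Every employee gets the same $3,000 raise.
raised = [s + 3 for s in salaries]

print("mean:  ", mean(salaries), "->", mean(raised))
print("median:", median(salaries), "->", median(raised))
print("stdev: ", round(stdev(salaries), 3), "->", round(stdev(raised), 3))
```

Adding a constant shifts measures of center by that constant but leaves measures of spread alone, because every deviation from the mean is unchanged.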
10. A company produces ceramic floor tiles that are supposed to have a surface area of 36.0 square inches.
Due to variability in the manufacturing process, the actual surface area has a Normal distribution with a mean
of 36.2 square inches and a standard deviation of 0.15 square inches. The proportion of tiles produced by the
process with surface area less than 36.0 square inches is
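A sketch of the calculation, using Python's standard-library `NormalDist` in place of a z-table; the variable names are illustrative:

```python
from statistics import NormalDist

# Actual surface area: Normal with mean 36.2 and SD 0.15 square inches.
tile_area = NormalDist(mu=36.2, sigma=0.15)

# Standardize the 36.0 target, then take the area to its left.
z = (36.0 - 36.2) / 0.15
p_under = tile_area.cdf(36.0)

print(f"z = {z:.2f}")
print(f"P(area < 36.0) = {p_under:.4f}")
```

The z-score is about -1.33, giving a proportion of roughly 0.09.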

11. The time to complete a standardized exam is approximately Normal with a mean of 70 minutes and a
standard deviation of 10 minutes. How much time should be given to complete the exam so that 80% of the
students will complete the exam in the time given?
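This question runs the Normal calculation in reverse: find the time with 80% of the area to its left. A sketch using the standard-library `NormalDist.inv_cdf`:

```python
from statistics import NormalDist

# Completion time: approximately Normal, mean 70 minutes, SD 10 minutes.
exam_time = NormalDist(mu=70, sigma=10)

# The 80th percentile is the time by which 80% of students have finished.
time_needed = exam_time.inv_cdf(0.80)

print(f"Time needed: {time_needed:.1f} minutes")
```

The 80th percentile sits about 0.84 standard deviations above the mean, or roughly 78.4 minutes.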

Scenario 3-1
The height (in feet) and volume (in cubic feet) of usable lumber of 32 cherry trees are measured by a
researcher. The goal is to determine if volume of usable lumber can be estimated from the height of a tree.

12. Use Scenario 3-1. In this study, what are the response variable and the explanatory variable?
The explanatory variable is the height and the response variable is the volume.

13. Use Scenario 3-1. Describe the form, strength, direction of the scatterplot.
The scatterplot is a moderately strong, positive, linear relationship with an outlier/possibly influential
point at (65, 70). As height of the tree increases, the usable volume of lumber increases.

14. Use Scenario 3-1. If the data point (65,70) were removed from this study, how would the value of the
correlation r change? Be specific.
The point does not follow the linear pattern of the rest of the scatterplot. Removing it would
strengthen the linear relationship, so r would increase (move closer to 1).
Scenario 3-8

A fisheries biologist studying whitefish in a Canadian Lake collected data on the length (in centimeters) and
egg production for 25 female fish. A scatter plot of her results and computer regression analysis of egg
production versus fish length are given below.
Note that Number of eggs is given in thousands (i.e., “40” means 40,000 eggs).

Predictor   Coef    SE Coef    T       P
Constant    19.2    15.93      0.13    0.899
Length      45.3    0.02409    2.61    0.028

S = 3.37648 R-Sq = 79.7% R-Sq(adj) = 36.9%

15. Use Scenario 3-8. Write the least squares regression line (LSRL).

16. Use Scenario 3-8. Interpret S in context of the problem.


S is the standard deviation of the residuals: the typical distance between an observed egg count and
the count predicted by the LSRL, here about 3.38 thousand eggs.

17. Use Scenario 3-8. Interpret R² in context of the problem.


Approximately 80% of the variation in eggs produced (in thousands) is explained by the LSRL of eggs
produced and whitefish length.

18. Use Scenario 3-8. Interpret the slope and the y-intercept in context of the problem.
On average, each additional centimeter of whitefish length is associated with a predicted increase of
45.3 thousand eggs. The y-intercept predicts 19.2 thousand eggs for a fish of zero length, which has no
practical meaning in this context.

19. If a LSRL fits the data well, what will the residual plot look like?
The residual plot will show random scatter with no discernible pattern. Ideally, though not
necessarily, approximately the same number of points will fall above the zero line as below it.

20. Compare and contrast an outlier and an influential point in a scatterplot.


An outlier is a point that breaks from the overall pattern of the scatterplot. A point may be an outlier in
the x direction or the y direction. Outliers in the y direction have large residuals, while outliers in the
x direction tend to be influential. An influential point is one that substantially changes the slope,
y-intercept, or correlation when removed from the scatterplot.
21. You want to know the opinions of American schoolteachers about establishing a national test for high
school graduation. You obtain a list of the members of the National Education Association (the largest
teachers' union) and mail a questionnaire to 2500 teachers chosen at random from this list. In all 1347
teachers return the questionnaire. What are the population and the sample?
Population: All American teachers.
Sample: The 1347 teachers that responded to the questionnaire.

22. In order to assess the opinion of students at the University of Minnesota on campus snow removal, a
reporter for the student newspaper interviews the first 12 students he meets who are willing to express their
opinion. What sampling method is the reporter using?
Convenience.

23. Define statistically significant.


An event is statistically significant when the observed effect is large enough that it would rarely occur
by chance alone.

24. A public opinion poll in Ohio wants to determine whether or not registered voters in the state approve of a
measure to ban smoking in all public areas. They select a simple random sample of fifty registered voters
from each county in the state and ask whether they approve or disapprove of the measure. What sampling
method is the poll using?
Stratified.

25. Give an example of undercoverage bias.


Undercoverage occurs when a group in the population is left out of the sampling process, intentionally
or unintentionally. An example would be conducting a poll/survey at 2:00 PM: people working at that
time do not have a chance to take the survey.

26. What differentiates an experiment from an observational study?


An experiment intentionally imposes a treatment. An observational study does not.

27. What are confounding variables? Give an example.


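Two variables are confounded when an experimenter cannot distinguish their effects on the response
variable. In other words, one or both may be influencing the response. For example, if the subjects who
take a new medication also happen to exercise more, the effects of the medication and of exercise on
their health cannot be separated.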

28. What is double-blinding?


Neither the subjects nor the experimenters know which treatment a subject receives.

29. Define sample space and give an example.


A sample space is the set of all possible outcomes of a chance process. For example, when three coins
are tossed, the outcomes are: {TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}
30. What is true if two events are disjoint (mutually exclusive)?
If two events are disjoint, they do not share any outcomes.

Scenario 5-3
Ignoring twins and other multiple births, assume that babies born at a hospital are independent random events
with the probability that a baby is a boy and the probability that a baby is a girl both equal to 0.5.

31. Use Scenario 5-3. Calculate the probability that exactly 3 of the next five babies are girls.
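Because boys and girls are equally likely and births are independent, all 32 birth sequences are equally likely, so the probability can be found by counting. A sketch that enumerates the sample space; the labels "B" and "G" are only illustrative:

```python
from itertools import product

# Every possible sequence of five births, e.g. ('B', 'G', 'G', 'B', 'G').
outcomes = list(product("BG", repeat=5))

# Keep the sequences containing exactly three girls.
favorable = [seq for seq in outcomes if seq.count("G") == 3]

# With p = 0.5 each sequence is equally likely, so count and divide.
probability = len(favorable) / len(outcomes)

print(len(favorable), "of", len(outcomes), "sequences ->", probability)
```

This matches the binomial formula: 10 ways to choose which three births are girls, times (0.5)^5, gives 10/32 = 0.3125. The counting shortcut only works because p = 0.5.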

Scenario 5-8
A student is chosen at random from the River City High School student body, and the following events are
recorded:
M = The student is male
F = The student is female
B = The student ate breakfast that morning.
N = The student did not eat breakfast that morning.
The following tree diagram gives probabilities associated with these events.

32. Use Scenario 5-8. What is the probability that the selected student is a male and ate breakfast?

33. Use Scenario 5-8. What is the probability that the student had breakfast?

34. Use Scenario 5-8. Find P(B | F) and write in words what this expression represents.
35. Give an example of a continuous variable and a discrete variable.
Discrete variable: The number of Cubs fans among 25 randomly selected LTHS teachers.
Continuous variable: The combined weight of two randomly selected LTHS teachers.

Scenario 6-3
In a certain population of students, the number of calculators a student owns is a random variable X described
by the following probability distribution:
X      0    1    2
P(X)   0.2  0.6  0.2

36. Use Scenario 6-3. Calculate the mean or expected value of X.

37. Use Scenario 6-3. Calculate the standard deviation of X.
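Questions 36 and 37 apply the definitions of the mean and standard deviation of a discrete random variable. A minimal sketch using the Scenario 6-3 distribution:

```python
from math import sqrt

# Probability distribution of X = number of calculators owned.
distribution = {0: 0.2, 1: 0.6, 2: 0.2}

# Mean: weight each value by its probability.
mean_x = sum(x * p for x, p in distribution.items())

# Variance: probability-weighted squared deviations from the mean.
variance_x = sum((x - mean_x) ** 2 * p for x, p in distribution.items())
sd_x = sqrt(variance_x)

print(f"mean = {mean_x:.2f}, variance = {variance_x:.2f}, sd = {sd_x:.3f}")
```

The expected value is 1 calculator, the variance is 0.4, and the standard deviation is about 0.632.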

38. The weight of written reports produced in a certain department has a Normal distribution with mean 60 g
and standard deviation 12 g. Calculate the probability that the next report will weigh less than 45g.

Scenario 6-12
There are twenty multiple-choice questions on an exam, each having responses a, b, c, or d. Each question is
worth five points and only one option per question is correct. Suppose the student guesses the answer to each
question, and the guesses from question to question are independent.

39. Verify that the above scenario is binomial. Calculate the mean and the standard deviation.
1) The number of trials is 20. 2) The probability of success is constant at 0.25. 3) The questions are
independent. 4) Each trial has two outcomes: Correct/Incorrect
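With the binomial conditions verified, the mean and standard deviation of the number of correct guesses follow from np and sqrt(np(1 - p)). A short sketch; the variable names are illustrative:

```python
from math import sqrt

n = 20      # questions on the exam
p = 0.25    # chance of guessing one question correctly (1 of 4 options)

mean_correct = n * p
sd_correct = sqrt(n * p * (1 - p))

print(f"mean = {mean_correct}, sd = {sd_correct:.3f}")
```

A guessing student expects 5 correct answers, give or take about 1.94.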

40. A worn out bottling machine does not properly apply caps to 5% of the bottles it fills. If you randomly
select 20 bottles from those produced by this machine, what is the approximate probability that between 2 and
6 (inclusive) caps have been improperly applied? Verify that the scenario is binomial.
1) The number of trials is 20. 2) The probability of success (an improperly applied cap) is constant at
0.05. 3) The bottles are capped independently of one another. 4) Each trial has two outcomes: Properly
applied/Improperly applied
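The requested probability is the sum of the binomial probabilities for 2 through 6 improperly capped bottles. A sketch with a hand-written probability function (the helper name `binomial_pmf` is only illustrative):

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.05

# "Between 2 and 6 inclusive" means k = 2, 3, 4, 5, 6.
p_between = sum(binomial_pmf(k, n, p) for k in range(2, 7))

print(f"P(2 <= X <= 6) = {p_between:.4f}")
```

The result is about 0.264. Nearly all of it comes from X = 2 and X = 3, since seven or more bad caps in 20 bottles is extremely unlikely.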
Scenario 6-17
You are stuck at the Vince Lombardi rest stop on the New Jersey Turnpike with a dead battery. To get on
the road again, you need to find someone with jumper cables that connect the batteries of two cars together so
you can start your car again. Suppose that 16% of drivers in New Jersey carry jumper cables in their trunk.
You begin to ask random people getting out of their cars if they have jumper cables.

41. Use Scenario 6-17. Verify that the scenario is geometric. On average, how many people do you expect
you will have to ask before you find someone with jumper cables?
1) The number of trials is not fixed; you keep asking until the first success. 2) The probability of
success is constant at 0.16. 3) The drivers are independent. 4) Each trial has two outcomes: Cables/No
cables

42. Use Scenario 6-17. You’re going to give up and call a tow truck if you don’t find jumper cables by the
time you’ve asked 10 people. What is the probability you end up calling a tow truck?
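Questions 41 and 42 both follow from the geometric setting with p = 0.16. A sketch of the two calculations; the variable names are illustrative:

```python
p = 0.16  # probability a randomly chosen driver carries jumper cables

# Question 41: expected number of drivers asked, counting the one who
# finally has cables, is 1 / p.
expected_asks = 1 / p

# Question 42: you call a tow truck only if all 10 drivers lack cables.
p_tow_truck = (1 - p) ** 10

print(f"Expected number of people asked: {expected_asks:.2f}")
print(f"P(tow truck) = {p_tow_truck:.4f}")
```

On average you would ask 6.25 people, and the chance that ten drivers in a row have no cables is about 0.175.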

Scenario 7-2
Below are dot plots of the values taken by three different statistics estimating the same parameter in 30
samples from the same population. The true value of the population parameter is marked with an arrow.

43. Use Scenario 7-2. Which statistic has the largest bias among these three? Why?
Statistic C has the greatest bias because the center of distribution C is right of the population
parameter indicated by the arrow.

Scenario 7-3
A 2010 study of 240 randomly-selected residents of a subtropical resort city with 82,000 residents found that
5.4% of them had been exposed to the mosquito-borne virus that causes Dengue fever. Suppose the actual
percentage of people in the city who have been exposed to the virus is 3%.

44. Let p̂ = the proportion of residents who have been exposed in a random sample of 240. What is the mean
of the sampling distribution (μ_p̂)?
45. Use Scenario 7-3. The standard deviation of p̂ is approximately
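Questions 44 and 45 use the standard formulas for the sampling distribution of a sample proportion, taking the true proportion to be 3% as the scenario states. A sketch that also checks the 10% condition for the standard deviation formula; the variable names are illustrative:

```python
from math import sqrt

p = 0.03             # true proportion exposed (given in Scenario 7-3)
n = 240              # sample size
population = 82_000  # residents in the city

# The sample proportion is an unbiased estimator, so its mean equals p.
mean_p_hat = p

# The SD formula is appropriate when the sample is at most 10% of the population.
ten_percent_ok = n <= 0.10 * population
sd_p_hat = sqrt(p * (1 - p) / n)

print(f"mean = {mean_p_hat}, sd = {sd_p_hat:.4f}, 10% condition met: {ten_percent_ok}")
```

The mean of the sampling distribution is 0.03 and the standard deviation is about 0.011.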

46. According to a recent poll, 27% of Americans get 30 minutes of exercise at least five days each week.
Let’s assume this is the parameter value for the population. If a simple random sample of size were
taken, what is the approximate probability that p̂, the proportion who exercise at least five days per week, is
higher than 0.30?

47. Suppose we select an SRS of size n = 100 from a large population having proportion p of successes. Let X
be the number of successes in the sample. For which value of p (probability of success) would it be safe to
assume the sampling distribution of X is approximately normal?

48. The incomes in a certain large population of college teachers have a normal distribution with mean
$60,000 and standard deviation $5000. Four teachers are selected at random from this population to serve on
a salary review committee. What is the probability that their average salary exceeds $65,000?
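Because the population of incomes is Normal, the mean of four randomly selected incomes is also Normal, with standard deviation 5000 / sqrt(4). A sketch using the standard-library `NormalDist`; the variable names are illustrative:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 60_000, 5_000, 4

# Sampling distribution of the mean salary of the four committee members.
sd_of_mean = sigma / sqrt(n)
sample_mean = NormalDist(mu=mu, sigma=sd_of_mean)

# Probability the committee's average salary exceeds $65,000.
p_exceeds = 1 - sample_mean.cdf(65_000)

print(f"SD of the sample mean = {sd_of_mean:.0f}")
print(f"P(mean > 65,000) = {p_exceeds:.4f}")
```

An average of $65,000 is two standard deviations (of $2,500 each) above the mean, so the probability is about 0.023.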
