
Assignment 4 – BUS 336

Nathaniel Payne, February 25, 2019

1. (Regression Problem) (10 marks) Tolko manufacturing is currently one of the world’s most
dominant OSB (oriented strand board) manufacturers. Tolko’s innovative OSB products
embody and are the results of nearly 55 years of innovative industry engineering. They are part
of the company’s commitment to maintaining consistent product quality and are central to
their success. Today, Tolko has hired you as a statistical consultant for one of their forestry
jobs. When OSB is produced, several panels are bonded together to form a board.
Unfortunately, board strength has been a concern. Specifically, one of the factors influencing
the strength of the final board is the amount of glue used in the production process. Earlier this
week, at the FPInnovations lab @UBC, a test was conducted on behalf of Tolko to determine
the optimal breaking point of a board based on the amount of glue used in the production
process. In each test, a board was manufactured using a given amount of glue. Weight was
then applied to determine the point at which the board would fail (or break). This test was
performed 67 times using various amounts of glue. Using the data provided:

a. Prepare a scatter plot of this data. (1 mark)

b. What type of regression function would you use to fit this data? (1 mark)

c. Estimate the parameters of this regression function using the ANOVA. What is the
estimated regression function? (2 marks)

d. Interpret the value of the R² statistic. (2 marks)

e. Suppose the company wants to manufacture boards that will withstand up to 120.5 lbs
of pressure per square inch, as per the Canadian standard. How much glue should they
use? (2 marks)

f. Suppose that a new adhesive manufacturer has approached Tolko with a new adhesive
that requires only 4.2 units of glue (versus the current average of 5.6). Determine the
proposed breaking point of the adhesive under the current standard (or benchmark),
along with an approximate 95% confidence interval. (2 marks)
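
For reference, parts d and f rely on two standard quantities. The block below is a sketch of one common textbook approximation, not necessarily the method expected in this course: it assumes the usual definitions of SSE (error sum of squares) and SST (total sum of squares) from the ANOVA output, with n = 67 observations and k independent variables in the fitted function. The symbols S_e and k are introduced here for illustration.

```latex
% Proportion of variation in breaking point explained by the model
R^2 = 1 - \frac{SSE}{SST}

% Standard error of the estimate (n = 67 tests, k independent variables)
S_e = \sqrt{\frac{SSE}{n - k - 1}}

% Rough 95% interval around a predicted breaking point \hat{y}
\hat{y} \pm 2\, S_e
```

The factor of 2 is a rule-of-thumb stand-in for the exact t-based multiplier, which is why the interval is described as approximate.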

2. (Multiple Regression & Binary Variable Problem) (10 marks) A number of years ago, the
student association at The University of British Columbia published an evaluation of several
hundred courses taught during the preceding semester. Students in each course had
completed a questionnaire in which they rated a number of different aspects of the course on
a 5-point scale (1=very bad to 5=excellent). The variables obtained are:
a. Overall – overall rating of the course.
b. Teach – rating of teaching skills of the instructor.
c. Exams – quality of tests and exams
d. Knowledge – rating of the instructor’s knowledge of the material
e. Grade – student’s anticipated grade for the course (1=F to 5=A)
f. Enroll – enrollment for the course

a) Build a multiple regression model using the variables teach, exams, knowledge, grade, and
enroll as the independent variables, and overall as the dependent variable (the Y). (1 mark)

b) Estimate the parameters of this regression function using the ANOVA. What is the
estimated regression function? (1 mark)

c) Interpret the value of the R² statistic and the adjusted R² for the model. What proportion of
variability is accounted for by the variables in question? (1 mark)

d) At a recent all campus meeting, a number of researchers at the school raised the concern
that differences between faculties might be explaining some of the variation between
teaching scores. Specifically, the hypothesis was that Science and Business courses, with
their superior instructors, were getting higher ratings than Arts courses. Build a multiple
regression model using Faculty as the basis for a binary variable. (2 marks)

e) Estimate the parameters of this regression function using the ANOVA. What is the
estimated regression function? (1 mark)

f) Interpret the value of the R² statistic and the adjusted R² for the model. What proportion of
variability is accounted for by the variables in question? Did you see an improvement in the
fit of the model? (2 marks)

g) A new instructor was recently evaluated by the Faculty. During this evaluation, the
instructor achieved the following score:
a. Teach: 2.7
b. Exams: 2.1
c. Knowledge: 2
d. Grade: 4.0
e. Enroll: 530
f. Faculty: Arts

h) Determine the instructor’s predicted score and construct a 95% confidence interval around that
score. The Faculty has decided to send all faculty members who have a predicted overall
score of 3 to a special instructional workshop. Should this Faculty member be sent to the
workshop? (1 mark)

i) Produce a plot which compares the predicted values for the model you built in question d
with the actual values. Also, plot the upper and lower 95% confidence intervals on this plot.
How many actual values appear outside of the upper or lower 95% confidence interval? (1
mark)

3. (Multiple Regression with Binary / Dummy Variables) (10 marks) The manager of a small sales
force wants to know whether average monthly salary is different for males and females in the
sales force. He obtains data on monthly salary and experience (in months) for each of the 9
employees:

Employee   Months Employed   Gender   Salary ($000)
1          6                 0        7.5
2          10                0        8.6
3          12                0        9.1
4          18                0        10.3
5          30                0        13
6          5                 1        6.2
7          13                1        8.7
8          15                1        9.4
9          21                1        9.8

a. Enter the data into Excel. Then, prepare a scatter plot that compares salary with gender.
Make sure that the plot is appropriately labeled. (1 mark)

b. Create a simple linear regression model that compares salary with gender. (1 mark)

c. Calculate the predicted salary for a male, and then for a female. Are they different? (1
mark)

d. Interpret the R² for the model. Are you confident that the model is accurate? (2 marks)

e. The analyst decides to use additional information to explain employee salary: the employee’s
experience at this company (months employed). Create a multiple regression (binary
variable) model which includes both the months of experience and the gender. (2 marks)

f. Find the predicted salary for a female who has worked at the organization for 14 months.
Include an upper and lower 95% confidence interval. (1 mark)

g. Find the predicted salary for a male who has worked at the organization for 14 months.
Include an upper and lower 95% confidence interval. (1 mark)
h. Compare the results of f and g. Does there appear to be a statistically significant difference
between the 2 predicted values? (1 mark)
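
The assignment asks for this analysis in Excel. As a cross-check, the two models in Question 3 can be sketched in Python with NumPy, using the nine rows from the table above. This is a minimal sketch under stated assumptions, not a model answer: the helper names `fit_ols` and `predict` are hypothetical, the interval uses the rough "prediction plus or minus two standard errors" approximation rather than an exact t-based interval, and the sheet does not say which gender the codes 0 and 1 represent, so the code works with the codes as given.

```python
import numpy as np

# Data copied from the table in Question 3.
months = np.array([6, 10, 12, 18, 30, 5, 13, 15, 21], dtype=float)
gender = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # coding not defined in the sheet
salary = np.array([7.5, 8.6, 9.1, 10.3, 13.0, 6.2, 8.7, 9.4, 9.8])  # $000 per month


def fit_ols(X, y):
    """Least-squares fit with an intercept.

    Returns (coefficients, R^2, standard error of the estimate).
    Hypothetical helper for illustration only.
    """
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sse = float(resid @ resid)                      # error sum of squares
    sst = float(((y - y.mean()) ** 2).sum())        # total sum of squares
    n, p = design.shape                             # p counts the intercept too
    r2 = 1.0 - sse / sst
    se = (sse / (n - p)) ** 0.5
    return beta, r2, se


def predict(beta, se, x_row):
    """Point prediction with a rough 95% interval (y_hat +/- 2 * se)."""
    y_hat = float(beta[0] + np.dot(beta[1:], x_row))
    return y_hat, y_hat - 2 * se, y_hat + 2 * se


# Part b: salary explained by the gender code alone.
beta_simple, r2_simple, se_simple = fit_ols(gender, salary)

# Part e: salary explained by months employed and the gender code.
beta_multi, r2_multi, se_multi = fit_ols(np.column_stack([months, gender]), salary)

# Parts f and g: 14 months of experience, for each gender code.
pred_code1 = predict(beta_multi, se_multi, [14, 1])
pred_code0 = predict(beta_multi, se_multi, [14, 0])

print("simple model:", beta_simple, r2_simple)
print("multiple model:", beta_multi, r2_multi)
print("14 months, code 1:", pred_code1)
print("14 months, code 0:", pred_code0)
```

With a single binary predictor, the fitted intercept equals the mean salary of the group coded 0, and the intercept plus the slope equals the mean of the group coded 1, which is a quick way to sanity-check the Excel output for parts b and c.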

4. (Multiple Regression / Binary Variable Problem) (10 marks) A pharmaceutical company wants
to study the effectiveness of three different versions of a drug. The company refers to these
versions as A, B and C. A clinical study whereby patients are treated using one of the three
versions is administered. Data on drug effectiveness, age, and the version of the drug taken
are provided for 36 patients in the spreadsheet. The company wants to know whether the
three versions of the drug are equal in their effectiveness. It also wants to know whether age
influences the effectiveness of these three versions.

a. Create a scatter plot that compares age with drug effectiveness. (1 mark)

b. Create a simple linear regression model that compares age with drug effectiveness. (1
mark)

c. Interpret the R² for the model. How much variation in the effectiveness does age explain? (2
marks)

d. Create a multiple regression model that compares both the age of the patient and the
version of the drug taken (using a binary variable) to the effectiveness of the drug itself. (1
mark)

e. Interpret the adjusted R² for the model. (2 marks)

f. For a 50-year-old patient, predict the effectiveness of each of the different drug types.
Do any of the drugs seem to work more effectively than the others? (3 marks)
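
Part d needs the three-level drug version turned into binary variables. One common approach, sketched below, is to pick a baseline version and create one 0/1 column for each of the other two, so that three versions need only two binary variables. The function name `dummy_code`, the choice of A as the baseline, and the five sample labels are all assumptions for illustration; the actual 36 patient records are in the course spreadsheet and are not reproduced here.

```python
import numpy as np


def dummy_code(labels, baseline):
    """Encode categorical labels as 0/1 columns, omitting the baseline level.

    Hypothetical helper: returns the non-baseline levels (sorted) and a
    matrix with one column per level and one row per label.
    """
    levels = sorted(set(labels) - {baseline})
    columns = np.array(
        [[1.0 if label == level else 0.0 for level in levels] for label in labels]
    )
    return levels, columns


# Illustrative labels only, not the assignment data.
versions = ["A", "B", "C", "A", "C"]
levels, dummies = dummy_code(versions, baseline="A")

print(levels)   # names of the binary variables that were created
print(dummies)  # rows for version A are all zeros
```

The resulting columns can be placed next to age as independent variables. Each binary coefficient then measures the difference in effectiveness between that version and the baseline at a given age.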
