Exam A
Empirical Research in Economics and Management
WT 2014
General information:
Please check the number of pages. The exam has 27 pages (including the cover sheet).
Leave all pages stapled together.
The working time is 120 minutes. You can achieve a total of 120 points, so plan on roughly
1 minute per point.
Answer the questions exclusively on the distributed pages. If needed, use the back of the page.
Keep your answers short and precise. Important: show your understanding.
You may use a non-electronic dictionary and a non-programmable calculator.
Auxiliary resources such as scripts, books or personal notes are not allowed.
You may also answer in German. Do not mix German and English within a sub-question; this also
applies to technical terms. However, you could, for example, answer Question 1.a) in German and
1.b) in English.
Name: _____________________________________________
Good luck!
Note:
For each question, exactly one of the four statements contains the right answer.
Mark only the correct answer.
Please note that you obtain 0 points if an incorrect box is checked or if none, two or more boxes are
checked.
Question 1: Multiple choice (30 points)
Answer
1a Which of the following statements is correct?
(2 points)
b In a right-skewed distribution the arithmetic mean is typically higher than the median
c In a right-skewed distribution the arithmetic mean typically coincides with the median
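As a small illustration of how skewness separates the two location measures, the sketch below compares the arithmetic mean and the median of a made-up right-skewed sample. The numbers are hypothetical and chosen only to show the pattern.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: most values are small, a few are very large.
incomes = [20, 22, 25, 27, 30, 32, 35, 40, 120, 300]

sample_mean = mean(incomes)      # pulled upwards by the two large values
sample_median = median(incomes)  # middle of the ordered sample, robust to outliers

print(f"mean   = {sample_mean:.1f}")
print(f"median = {sample_median:.1f}")
```

The long right tail drags the mean upwards, while the median stays near the bulk of the data.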
Answer
1b Which of the following statements regarding reliability is correct?
(2 points)
a An empirical analysis is reliable if the same results, apart from sampling errors, are
obtained for other samples from the same population
b An empirical analysis is reliable if the same results, apart from sampling errors, are
obtained for other populations
c The higher the reliability of an empirical analysis conducted for one industry, the better
it can be generalized to other industries.
Answer
1d You want to survey start-up firms in Munich to understand how they negotiate with Venture
Capital firms. There are 1000 start-ups in Munich, 30 of them are Venture Capital funded. For the
survey you want to draw a sample of 50 firms. Which sampling method should you choose?
(1 point)
a Random sampling
b Stratified sampling
c Cluster sampling
d Multi-stage sampling
Answer
1e Which of the following statements regarding validity is not correct?
(2 points)
a Internal validity of an empirical study refers to the extent to which a causal conclusion
based on a study is warranted
b External validity refers to the possibility of generalizing the results of an empirical study
to other units of observation and/or other situations
Answer
1f What is the meaning of the p-value?
(1 point)
c The probability of obtaining the observed value, or even larger ones, if H0 was correct
d The probability that the level of significance (critical value) is exceeded by the test
Answer
1g The Bravais-Pearson correlation can be used to measure correlation between…
(2 points)
a Metric variables
b Ordinal variables
c Nominal variables
Answer
1h What is the advantage of Spearman's rho over the Bravais-Pearson correlation coefficient?
(2 points)
Answer
1i Which of the following assumptions is not required for the OLS regression model?
(2 points)
a The disturbance terms are not correlated with the explanatory variables.
Answer
1j Assume you have 10 explanatory variables, each of which potentially has a causal effect on the
dependent variable, y. Under what conditions does omitting one of these variables in a regression
lead to (one or more) biased coefficient estimates?
(2 points)
c If the p-value of the coefficient of the omitted variable, if it was included in the
regression, would be below 0.05
Answer
1k Which of the following points is not a potential difficulty in multiple regression?
(2 points)
Answer
1l In the regression ln(y) = β0 + β1 x + u, what does β1 indicate?
(2 points)
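A minimal sketch of how a coefficient in such a log-level model translates into a percentage change in y. The value of β1 is hypothetical, and the function name is introduced only for this illustration.

```python
import math


def percent_change_in_y(beta1: float, delta_x: float = 1.0) -> float:
    """Exact percentage change in y when x rises by delta_x in ln(y) = b0 + b1*x + u."""
    return (math.exp(beta1 * delta_x) - 1.0) * 100.0


beta1 = 0.05                         # hypothetical estimate, not taken from the exam
approx = 100.0 * beta1               # common rule of thumb: 100 * beta1 percent
exact = percent_change_in_y(beta1)   # exact effect implied by the log transformation

print(f"approximate: {approx:.2f} %, exact: {exact:.2f} %")
```

For small coefficients the two numbers are close; the gap widens as |β1| grows.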
Answer
1m Which of the following indicators would you use to measure the goodness of fit of a regression
model?
(2 points)
b p-value of coefficients
d Explained variance
Answer
1n The Herfindahl Index is a measure of…
(1 point)
a Industry concentration
b R&D intensity
c Innovativeness of an individual
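For reference, the Herfindahl index is usually computed as the sum of squared market shares. The sketch below uses hypothetical shares and a helper name chosen for this example.

```python
def herfindahl_index(market_shares):
    """Sum of squared market shares; shares are normalized so they add up to 1."""
    total = sum(market_shares)
    return sum((share / total) ** 2 for share in market_shares)


# Hypothetical markets, shares given in percent.
monopoly = herfindahl_index([100])          # a single firm holds the whole market
fragmented = herfindahl_index([10] * 10)    # ten equally sized firms
skewed = herfindahl_index([70, 10, 10, 10])  # one dominant firm

print(monopoly, fragmented, skewed)
```

With shares expressed as fractions, the index ranges from 1/n for n equally sized firms up to 1 for a monopoly.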
Answer
1o Consider a large population of units (e.g., firms), out of which a random sample of N units is
drawn. For each of these N units you determine two variables, x and y (e.g., x = marketing
expenditures and y = sales). You then perform an OLS regression, y = β0 + β1 x + u. The population
coefficient β1 is not known precisely, but we do know that it is positive. Which of the following
statements is correct? If the sample size is increased from N to 2N, then one can expect that …
(2 points)
a R² increases and the standard error of the estimator β̂1 remains unchanged.
c R² remains unchanged and the standard error of the estimator β̂1 decreases.
d R² and the standard error of the estimator β̂1 both remain unchanged.
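As a hedged aside on the role of the sample size: under the standard OLS assumptions with homoskedastic errors (an assumption added for this sketch, not stated in the question), the standard error of the slope estimator can be written as below, where σ is the standard deviation of u and s_x the sample standard deviation of x.

```latex
\operatorname{se}(\hat{\beta}_1)
  = \frac{\sigma}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}}
  \approx \frac{\sigma}{s_x \sqrt{N}},
\qquad
\frac{\operatorname{se}_{2N}(\hat{\beta}_1)}{\operatorname{se}_{N}(\hat{\beta}_1)}
  \approx \frac{1}{\sqrt{2}} \approx 0.71
```

The sample R², by contrast, estimates the population quantity 1 − σ²/Var(y), which does not depend on N, so it should stay roughly the same when the sample grows.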
Answer
1p Which is not a suitable method to reduce the influence of social desirability bias?
(1 point)
b Guaranteeing anonymity
d Asking projective questions, i.e. questions referring not to the respondent but to other
individuals
In a survey about company culture, you are asked to tick any of the following
statements that you find correct:
a Likert scale
b Guttman scale
c Semantic differential
Question 2: Developing the research question and design (18 points)
a) How would you approach this topic with what you have learnt in the lecture? Develop a
meaningful approach. In doing so, use and explain the following terms (10 points):
Research question
Hypothesis
Independent variable
Dependent variable
Operationalization
Research question:
Should be specific, new, and relevant. [1] State RQ [1]
Hypothesis:
Should be falsifiable [0.5] and generalizable [0.5] State hypothesis [1]
Independent variable:
The “input variable” [1] Name IV [1]
Dependent variable:
The “output variable” that is used to measure the effect [1] Name DV [1]
Operationalization:
Measuring a concept that is not directly measurable [1] Give an example of the
operationalization of a variable [1]
b) After finishing your thesis your supervisor challenges your research. She uses the following
terms. Explain each term and how you could improve the quality of your work (6 points):
Low significance
Endogeneity
Omitted variable bias
Significance
Improve: Increase N
Endogeneity
Omitted variable bias: occurs when a model incorrectly leaves out one or more important causal
factors. The bias arises when the model compensates for the missing factor by over- or
underestimating the effect of one of the other factors.
c) Assume that you have two samples of business data: the first sample covers complex technology
industries, the second one, discrete technology industries. You are interested in a number of
variables that relate to innovation. How can you evaluate if the means of these variables differ
between the two types of industries? Please explain. (2 points)
For each variable, conduct a two-sample, two-tailed t-test for a difference in means. [1]
H0: The mean is the same for discrete and complex technology industries [0.5]
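The test described above can be sketched as follows. This uses Welch's variant of the two-sample t-test, which does not assume equal variances; that is a choice made for this sketch, not something the answer key prescribes. The data and names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return the Welch t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    v_a, v_b = variance(sample_a) / n_a, variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_stat, df


# Hypothetical values of one innovation variable (e.g., patents per firm).
complex_tech = [12, 15, 14, 10, 13, 16]
discrete_tech = [8, 9, 11, 7, 10, 9]

t_stat, df = welch_t(complex_tech, discrete_tech)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

For the two-tailed test, |t| is compared with the critical value of the t distribution for the given degrees of freedom and α/2 in each tail; H0 is rejected if |t| exceeds it. The procedure is repeated for each variable of interest.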
Question 3: Survey design (7 points)
a) What type of data can be measured with an ordinal scale? Provide an example. (2 points)
The ordinal scale allows for rank order by which data can be sorted, but there is no meaningful
distance between any two points on the scale. [1]
“strongly agree”, “agree”, “neither agree nor disagree” (or “uncertain”), “disagree”, “strongly
disagree”
c) Under what conditions is unit-nonresponse (i.e., missing responses from some units in the
population) problematic? (3 points)
Systematic means that the characteristics of the object to be examined are related to the causes
of non-response. [1]
As a consequence, results are biased (e.g., (self) selection bias, survivor bias). [1]
Question 4: Cluster and factor analysis (18 Points)
You work for a sports car manufacturer that produces fast, appealing, and very expensive cars
with a high gas consumption. You want to conduct a marketing campaign for your new model.
You need to know more about your potential customers before you decide which customer group
you want to invite to the presentation of the new model. You conduct a cluster analysis. For the
clustering, you use three variables: age, income, and environmental consciousness.
a) Describe cluster analysis as well as factor analysis briefly and precisely. What is the
fundamental difference between the two methods? (3 points)
Cluster analysis:
Factor analysis:
Fundamental difference:
Furthermore, the method differs. For the cluster analysis, a distance matrix is used. For the
factor analysis, a correlation matrix is used. [1]
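The difference between the two input matrices can be sketched as follows: the distance matrix compares objects (rows), whereas the correlation matrix compares variables (columns). The data are hypothetical, and the helper functions are written out only for illustration.

```python
from math import dist, sqrt

# Hypothetical data: one row per customer; columns are age, income (in 1000 EUR)
# and environmental consciousness (score).
data = [
    [22, 15, 5],
    [24, 18, 6],
    [65, 90, 2],
    [68, 85, 1],
]


def distance_matrix(rows):
    """Euclidean distances between objects (input for cluster analysis)."""
    return [[dist(a, b) for b in rows] for a in rows]


def pearson(x, y):
    """Bravais-Pearson correlation between two variables."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def correlation_matrix(rows):
    """Correlations between variables (input for factor analysis)."""
    columns = list(zip(*rows))
    return [[pearson(c1, c2) for c2 in columns] for c1 in columns]


distances = distance_matrix(data)        # 4 x 4: one entry per pair of customers
correlations = correlation_matrix(data)  # 3 x 3: one entry per pair of variables
```

In this toy example, customers 1 and 2 are close to each other and far from customers 3 and 4, while age and income are strongly correlated. In practice the variables would usually be standardized before distances are computed, which this sketch omits.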
The result of the conducted cluster analysis is shown in the following table:
b) Describe the result of the cluster analysis. Give each cluster a meaningful name. Choose the
customer group that you want to invite to the presentation. (5 points)
Each:
Name [0.5]
Description [0.5]
Cluster 1 – Students: 30% of the observation units, young, low income, medium environmental
consciousness.
Cluster 2 – Retired persons: 20% of the observation units, old, high income, low environmental
consciousness.
Cluster 3 – Young professionals: 40% of the observation units, young, medium income,
medium environmental consciousness.
Cluster 4 – Employed persons: 10% of the observation units, medium age, high income, high
environmental consciousness.
In addition to the analysis above, you have two customer data sets available. You want to do a
factor analysis for each of the data sets. You determined the following correlation tables:
Dataset 1:
Dataset 2:
c) Describe each dataset. State and justify if the datasets are suited for a factor analysis.
(4 points)
Dataset 1:
Dataset 2:
Unsuited. With the exception of variables 2/4 and 1/5 very low correlation between the variables.
Heterogeneous data structure.
Answer
4d What do you expect regarding the factors for dataset 1 if you choose a 3-factor solution?
(2 points)
Answer
4e Which statement is correct?
(2 points)
a A high value in the distance matrix indicates a high correlation between two
variables
b The factor loading gives a hint if the independent variable and the dependent
variable are correlated
c The factor loading of a variable is the correlation coefficient between this variable
and the factor
d A high value in the correlation matrix indicates a high correlation between two
factors
Answer
4f Which of the following three statements is not correct?
(2 points)
c A value of Cronbach’s alpha above 0.7 indicates a sufficiently high level of validity of
the respective factor
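For context, Cronbach's alpha is commonly described as a measure of internal consistency, i.e. of reliability. A minimal sketch of the usual formula, with hypothetical item scores and a helper name chosen for this example:

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of item columns, each holding the scores of the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # sum score per respondent
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical answers of five respondents to three items of one factor.
items = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 1, 4],
]

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

The more strongly the items move together, the smaller the share of the item variances in the variance of the sum score, and the closer alpha gets to 1.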
Question 5: Logistic regression (11 points)
You have completed the marketing campaign that was described in Question 4. The new model
has been on the market for six months and you would like to identify the determinants of the
decision to buy the sports car. To this end you conduct a survey among the guests of your market
presentation. For the evaluation of the survey, you use logistic regression analysis.
Logistic regression tries to determine the probability of a certain result and analyzes which
factors influence this probability.
Alternative: The logistic regression uses a non-linear relation between the probability of the
event and the independent variables.
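The non-linear relation mentioned above can be sketched with the logistic function. The intercept and coefficient below are hypothetical, not taken from the exam's regression output.

```python
import math


def purchase_probability(intercept, coefficients, values):
    """P(event) = 1 / (1 + exp(-z)) with z = intercept + sum(b_k * x_k)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model with one regressor: number of previously bought sports cars.
intercept, coefficients = -2.0, [0.8]

p0 = purchase_probability(intercept, coefficients, [0])
p1 = purchase_probability(intercept, coefficients, [1])
p2 = purchase_probability(intercept, coefficients, [2])

print(f"{p0:.3f} {p1:.3f} {p2:.3f}")
```

Each additional car raises z by the same amount, but the change in probability differs from step to step, so a coefficient cannot be read directly as a change in probability.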
Answer
5b Which variables are significant at α=10%?
(1 point)
c) Briefly describe the term “significant influence”. (2 points)
Significance is attained when the p-value is less than the significance level. The p-value is the
probability of observing an effect at least as large as the one observed, given that the null
hypothesis is true, whereas the significance level alpha (α) is the probability of rejecting the
null hypothesis given that it is true. [1]
Influence implies a causal relation. Often one can assume a distinct direction of causality.
[1]
Answer
5d Which statement is wrong?
(2 points)
a The Number of previously bought sports cars has a significant, positive influence on
the probability that the new sports car is bought
b Being Single has a significant, positive influence on the probability that the new sports
car is bought
c Education has a non-significant, positive influence on the probability that the new
sports car is bought
Answer
5e Which statement is correct?
(2 points)
b The probability of buying the car is 48.5% higher for a single than for a non-single
c A Single is 48.5% more likely to buy the new sports car if all other variables are set to
0
Answer
5f For your next market analysis about sports cars, you plan to include personal characteristics of
your potential customers. You are pondering how to operationalize the respective constructs. Which
of the following statements is not correct?
(2 points)
Question 6: Conjoint analysis (11 points)
A colleague approaches you and suggests using a conjoint analysis to determine the customers'
preferences regarding sports cars.
Answer
6a What is not an advantage of a conjoint analysis?
(1 point)
Answer
6b Which of the following statements is not correct?
(2 points)
a Choice-based conjoint – i.e., to ask respondents to pick their most preferred conjoint
card out of each set – is preferable to full ranking because it is more realistic
b Asking respondents to assess each conjoint card on a rating scale from 1 to 100 is
more efficient than full ranking since it provides more information
c Pair comparison – i.e., to ask respondents to pick their preferred alternative from
several pairs of conjoint cards – has the disadvantage, compared to full ranking, of
allowing fewer attributes
d Hybrid conjoint techniques allow using more attributes than traditional full ranking
methods
c) State the two steps that are necessary before conducting a conjoint analysis. Explain these steps
in detail. (6 points)
Answer
6d Which statement is wrong?
(2 points)
Question 7: Time series and panel data analysis (16 points)
Three years have passed since the new sports car model was introduced. You would like to analyze
the number of cars sold between February 2012 and February 2015 (37 months) and to forecast
future sales on a monthly basis.
a) In general, how do you formulate a time series model? What is important to consider? (2 points)
You use a linear model with seasonal dummies to estimate the model. This is the output from
SPSS:
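[SPSS coefficients table, values not reproduced. Columns: Model; Unstandardized Coefficients
(Regression Coefficient B, Std. Error); Standardized Coefficients. Rows include (Constant) and
Time (months).]
[SPSS model summary, values not reproduced. Columns: Model; R-Square; Adjusted R-Square;
Std. Error of the estimate.]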
Answer
7b Which quarter is the reference category?
(1 point)
a Quarter 1
b Quarter 2
Answer
7c Please calculate the point forecast for May 2015. How many cars do you expect to sell in May
2015?
(2 points)
Answer
7d Which statement is wrong?
(2 points)
a The projection interval is the range where the future value of a variable is predicted to
be with a certain probability
b The projection interval depends on the forecast error at the point in time for which the
forecast is made
e) How would you assess the forecast quality of the model? Can you use R² for the assessment? Please
explain. (3 points)
Forecast quality:
The time period for the estimation is shortened compared to the observation time period. This means
that the last values are not used for the estimation of the model. Subsequently these values are
compared to the ex-post forecasted values. [1]
Another method would be not to shorten the time period for the model estimation but to wait for
the actual values. [1]
R²:
Not suitable for the assessment of the forecast quality. R² can only be used to analyze how well the
model fits the observed values in the observation period. [1]
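The hold-out idea described above can be sketched as follows, assuming a plain linear trend without seasonal dummies to keep the example short. The sales figures are hypothetical, and the mean absolute error is one possible quality measure among several.

```python
from statistics import mean


def fit_linear_trend(y):
    """OLS fit of y_t = a + b * t for t = 1, ..., len(y); returns (a, b)."""
    t = list(range(1, len(y) + 1))
    t_bar, y_bar = mean(t), mean(y)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
             / sum((ti - t_bar) ** 2 for ti in t))
    return y_bar - slope * t_bar, slope


# Hypothetical monthly sales; the last three months are held back.
sales = [100, 104, 109, 111, 118, 121, 125, 131, 134, 140, 143, 150]
holdout = 3
estimation, actuals = sales[:-holdout], sales[-holdout:]

intercept, slope = fit_linear_trend(estimation)
forecasts = [intercept + slope * t
             for t in range(len(estimation) + 1, len(sales) + 1)]

# Ex-post forecast error on the months that were not used for estimation.
mae = mean(abs(f - a) for f, a in zip(forecasts, actuals))
print(f"mean absolute forecast error = {mae:.2f}")
```

Unlike R², this error is computed on observations the model has not seen, which is what forecast quality refers to.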
b Y fluctuates around the long term trend and this fluctuation recurs periodically
c The number of degrees of freedom is smaller if monthly dummy variables are used
instead of quarterly dummy variables
d It is useful to use monthly dummy variables instead of quarterly dummy variables if the
observation period is short
g) Assume you have a panel dataset at hand. State four advantages of a panel dataset compared to a
time series. (2 points)
Answer
7h Which statement, referring to panel regression, is wrong?
(2 points)
a Panel regression can be conducted also with discrete dependent variables and logit,
probit, or similar models
c A fixed effects regression has fewer degrees of freedom than a random effects
regression
Question 8: Guest lectures (9 points)
a …unbounded rationality
b …pure self-interest
c …complete self-control
d …variable preferences
a) David Schindler conducted an experiment with you during his guest lecture. Describe the topic
of the experiment and describe how to interpret it. (4 points)
• The person whose number is closest to p·m, where p > 0 and m is the mean of the chosen numbers.
[1]
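Reading the rule in the bullet above as "closest to p times the mean" (an interpretation of the garbled source line), the target can be computed as in the sketch below. The guesses and the value p = 2/3 are hypothetical; the exam only states p > 0.

```python
from statistics import mean


def closest_to_target(guesses, p):
    """Return the participant closest to p times the mean guess, and that target."""
    target = p * mean(guesses.values())
    closest = min(guesses, key=lambda name: abs(guesses[name] - target))
    return closest, target


# Hypothetical numbers chosen by four participants.
guesses = {"A": 50, "B": 33, "C": 22, "D": 0}
closest, target = closest_to_target(guesses, p=2 / 3)

print(closest, target)
```

Here the mean is 26.25, the target is 17.5, and participant C is closest to it.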
b) During his guest lecture Dr. Florian Bauer presented a new model of price research. State two
points of criticism against conventional price management. (2 points)