
AIA-DAGA 2013 Merano

Statistical analysis of a combination of objective and subjective data concerning environmental noise using Factor Analysis and Multinomial Logistic Regression
Margret Engel¹, Emerson Hochsteiner De Vasconcelos Segundo², Paulo Henrique Trombetta Zannin³
¹,³ Laboratory of Environmental and Industrial Acoustics and Acoustic Comfort, Federal University of Paraná, Curitiba, Brazil, Email: margretengel@yahoo.com; zannin@ufpr.br
² Federal University of Paraná, Curitiba, Brazil, Email: e.hochsteiner@gmail.com

Introduction
The city of Curitiba, in the south of Brazil, has a population of approximately 2 million inhabitants and a fleet of 1.3 million vehicles [1]; recent data show that the number of vehicles is growing by 5% per year.

This study was carried out on the Green Line, a former highway that crossed the city from north to south, dividing it into two parts which, to some extent, were not integrated with each other. In 2007 this highway was restructured into an urban avenue to meet the growing demands of urban traffic: heavy-vehicle traffic was diverted to the beltway surrounding the city, and ten lanes were introduced for vehicle traffic, two of them reserved exclusively for buses. The building potential of the area surrounding the avenue was put to good use, since there were several empty lots and sheds. Likewise, the urban zoning was restructured, stimulating new construction intended for commerce and dwellings. In addition, the integration of the two parts of the city led to a growth in the flow of light vehicles (cars and motorcycles), with a concomitant increase in environmental noise in the area surrounding the Green Line and, with it, a greater noise-related nuisance to the residents.

The aim of this study was to examine the efficiency of the combined use of Factor Analysis and Multinomial Logistic Regression in explaining a combination of objective data (noise measurement results) and subjective data (results of interviews about noise perception) in a study on noise pollution in an urban environment, i.e., the Green Line in the city of Curitiba.

Methodology
The noise measurements (objective data) were taken at 23 monitoring sites, always in the afternoon. Seven sites were located on the Green Line and the remainder on two parallel streets, with eight sites each. The measurement at each site lasted ten minutes; the measurement range was from 30 to 120 dB(A) in fast mode, with frequencies measured in 1/3-octave bands. The equipment used was from Brüel & Kjaer (BK 2238, BK 2250 and BK 2260).

The noise perception interviews for collecting the subjective data were conducted with the aid of a questionnaire, which was applied to a sample of 397 informants. To calculate the required sample size, the demographic data of the districts in the study area were taken into account. Residents living in the vicinity of the acoustically monitored streets were given priority over others. The interviews were conducted during the day, on all seven days of the week. The noise perception questionnaire consisted of 21 questions covering demographic information, noise perception, noise-related behaviour and symptoms, as well as the socioeconomic characteristics of the population involved. The reliability of the questionnaire was tested with Cronbach's Alpha, which indicated a reliability of 87%. The distribution of the monitoring sites and of the dwellings in which interviews were conducted can be seen in Fig. 1.

Figure 1: Location of the monitoring sites and of the dwellings in which interviews were conducted

As the number of variables related to the subjective data to be analyzed was relatively large, it was necessary to reduce them through a Factor Analysis, which identified the most important factors that convey, without any great loss, the information collected with the noise perception questionnaire.

Factor Analysis was carried out with the aid of an algorithm developed in the Matlab software.

To compare the objective data with the subjective data, Multinomial Logistic Regression was used; it allowed a simultaneous analysis of the subjective variables (survey questions) extracted by Factor Analysis and of the noise measurement results (objective data). For this purpose, circles with a radius of approximately 150 meters were drawn around the monitoring points. The dwellings in which interviews were conducted and which were situated within these circles were assigned to the data set of the corresponding monitoring site; these data
referred to the noise generated in this place. This procedure allowed the tabulation of the interview data together with the noise measurement results. For this analysis the statistical software SPSS Statistics was used.

Multinomial Logistic Regression is suitable for studies that deal with categorical response variables and whose main objective is to describe the relationship between a dependent variable and a set of predictor variables. This type of regression takes a more general form, as the dependent variable is not limited to two categories, as is the case in Binomial Logistic Regression [2]. In this case, the dependent variable can be nominal or ordinal [3].

Results
a) Factor Analysis
After the data matrix was entered and the calculation was run through the algorithm created in the Matlab software, the results initially showed 15 factor loadings (factors) out of the original 21 variables, as shown in Table 1.

Original variable   Communality   % Participation   % Accumulated participation
V1                  0.0173        0.02              0.02
V2                  3.7571        5.26              5.28
V3                  3.1821        4.45              9.73
V4                  0.0788        0.11              9.84
V5                  7.3347        10.26             20.11
V6                  0.1073        0.15              20.26
V7                  2.4562        3.44              23.69
V8                  1.5771        2.21              25.90
V9                  3.7980        5.31              31.21
V10                 4.5217        6.33              37.54
V11                 0.0647        0.09              37.63
V12                 12.6148       17.65             55.28
V13                 0.0409        0.06              55.34
V14                 8.9106        12.47             67.81
V15                 4.3295        6.06              73.86
V16                 5.1818        7.25              81.11
V17                 4.6344        6.48              87.60
V18                 4.7632        6.66              94.26
V19                 1.3593        1.90              96.16
V20                 2.7301        3.82              99.98
V21                 0.0113        0.02              100.00
Table 1: Adjusted Communalities and extraction of the 7 common factors
Legend: The 15 factor loadings are indicated in red.

Communalities represent the proportion of the variance of each variable included in the analysis that is explained by the extracted components [4]. A variable with a communality below 0.50 must be deleted, so the common factors had to be extracted again with the aid of the algorithm.

The new extraction led to 12 variables, with 7 common factors and 3 degrees of freedom (see Table 2).

Original variable   Communality   % Participation   % Accumulated participation
V2                  2.7911        5.11              5.11
V3                  1.0751        1.97              7.08
V5                  7.2170        13.22             20.30
V9                  2.5240        4.62              24.93
V10                 3.5867        6.57              31.50
V12                 12.5949       23.07             54.57
V14                 8.8743        16.26             70.83
V15                 4.1510        7.60              78.43
V16                 3.4435        6.31              84.74
V17                 3.7356        6.84              91.59
V18                 3.8695        7.09              98.67
V20                 0.7234        1.33              100.00
Table 2: Adjusted Communalities and extraction of the 7 common factors
Legend: The seven common factors can be seen in the communalities column in red.

The number of common factors is determined from the variance explained by the eigenvalues [5]: the point at which the accumulated share of the eigenvalues reaches 80% indicates how many common factors the original variables have. The analysis of the eigenvalue variance yielded 12 variables with 7 common factors, which together represent 83.2812% of the total variance of the questionnaire, as indicated in Table 3.

Common factor   Eigenvalue   % Explained    % Accumulated
Y1              17.0352      25.9032975     25.90329752
Y2              10.6921      16.2581389     42.1614364
Y3              7.60650      11.566253      53.72768936
Y4              6.15310      9.35624941     63.08393878
Y5              5.69000      8.65207118     71.73600995
Y6              4.02500      6.12031397     77.85632392
Y7              3.5677       5.42495507     83.28127899
Y8              2.9855       4.53967636     87.82095535
Y9              2.6971       4.10114256     91.92209791
Y10             2.2561       3.43056903     95.35266694
Y11             2.0298       3.08646293     98.43912987
Y12             1.0265       1.56087013     100
Table 3: Proportion of total variance explained by the major components, measured individually and cumulatively

The variables of the noise perception questionnaire that have common factors are listed below in order of importance (communalities or % of participation in Table 2):
• Variable 12: What does noise cause for you? (Dependent variable in the Multinomial Logistic Regression)
• Variable 14: Which activity that you do inside your residence is directly affected by the noise from outside?
• Variable 5: Which noises annoy you most in your daily life?
• Variable 15: Do you think that the noise in the area where you live can devalue your property?
• Variable 18: When you or your family moved to your home, was the value of the property an important factor in choosing it?
• Variable 17: When you or your family moved to your home, was the size of the property an important factor in choosing it?
• Variable 10: In which period of the week do you notice more noise interference?

The results presented above indicate that Factor Analysis is highly efficient at reducing the subjective variables, extracting common factors from the information collected with the noise perception questionnaire in preparation for the next stage of the statistical analysis.

b) Multinomial Logistic Regression
For this type of analysis, the variables extracted by Factor Analysis were used, along with the variable "noise measurement". The analysis of these variables showed that the variable "What does noise cause for you?" (symptoms and reactions) is the most appropriate dependent variable, since it points to the effects produced by noise perception.

Because Multinomial Logistic Regression does not calculate an R² as linear regression does, pseudo-R² values, which are based on the maximum likelihood of the model, are calculated instead. Three coefficients were considered: the Cox & Snell coefficient, which never reaches the maximum value of 1 (100%); the Nagelkerke coefficient, which normalizes the Cox & Snell coefficient so that the maximum value can reach 1 (100%); and the McFadden coefficient, which is the most reliable, since it indicates whether the model is well fitted: values between 0.2 (20%) and 0.4 (40%) indicate a suitable fit [6], [7], [2]. Fitting by maximum likelihood aims to obtain, from a sample, estimates of the statistical parameters while ensuring consistency, efficiency and adjustment of the model parameters.

In this study the resulting pseudo-R² values were the following:
• Cox & Snell R² = 0.815
• Nagelkerke R² = 0.829
• McFadden R² = 0.417

Thus, the model indicates that 81.5% (Cox & Snell R²) of the interdependencies between the independent variables and the dependent variable can be explained; that is, 81.5% of the answers in the subjective data and of the noise levels in this study were related to the variable "symptoms and reactions". The result also indicates that the variables were well suited to this type of model, as McFadden's R² was around 0.4.

Discussion
With regard to the aim of this study stated above, Factor Analysis condensed the information collected with the noise perception questionnaire very efficiently; the results of this analysis were then used in the Multinomial Logistic Regression, which was an ideal tool for analyzing data with more than two response categories. The explanation of 81.5% of the symptoms and reactions related to noise, in conjunction with other subjective and objective variables, can be considered quite satisfactory, since subjective variables, by their nature, call for more complex interpretations than objective variables.

The efficiency of the multinomial regression may be attributed to the questionnaire, which was subjected to various pre-tests and therefore showed 87% reliability (Cronbach's Alpha). In future studies this efficiency may be further improved by running a larger number of pre-tests, perhaps with different sample sizes, with the aim of eliminating subjectivity from the questions and encouraging the use of unambiguous terms in their formulation.

References
[1] Departamento Nacional de Trânsito – DENATRAN. Frota de veículos no Brasil em 2011. URL: http://www.denatran.gov.br/frota.htm
[2] Maroco, J.: Análise estatística. Com utilização do SPSS. Edições Sílabo Ltda. 3rd ed., Lisbon, 2007.
[3] Fávero, L. P., Belfiore, P., Silva, F. L., Chan, B.: Análise de dados. Modelagem multivariada para tomada de decisões. Elsevier, Rio de Janeiro, 2009.
[4] Schwab, A. J.: Electronic Classroom. URL: http://www.utexas.edu/ssw/eclassroom/schwab.
[5] Ferreira, D. F.: Estatística multivariada. Ed. UFLA, Lavras, 2008.
[6] Long, J. S.: Regression Models for Categorical and Limited Dependent Variables. Sage Publications, Thousand Oaks, 1997.
[7] Freese, J., Long, S.: Regression Models for Categorical Dependent Variables Using Stata. Stata Press, College Station, 2006.
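
The factor-retention rule described in the Results (retain as many common factors as are needed for the accumulated eigenvalue share to reach 80%) can be re-checked from the eigenvalues listed in Table 3. The following is a minimal Python sketch of that arithmetic, not the Matlab algorithm used in the study; the helper names `explained_shares` and `factors_for_threshold` and the 80% default are illustrative assumptions.

```python
from itertools import accumulate

# Eigenvalues Y1..Y12 as listed in Table 3.
EIGENVALUES = [17.0352, 10.6921, 7.6065, 6.1531, 5.69, 4.025,
               3.5677, 2.9855, 2.6971, 2.2561, 2.0298, 1.0265]


def explained_shares(eigenvalues):
    """Return (individual %, accumulated %) of the total variance per factor."""
    total = sum(eigenvalues)
    individual = [100.0 * ev / total for ev in eigenvalues]
    accumulated = list(accumulate(individual))
    return individual, accumulated


def factors_for_threshold(eigenvalues, threshold=80.0):
    """Hypothetical helper: smallest number of leading factors whose
    accumulated share of the total variance reaches `threshold` percent."""
    _, accumulated = explained_shares(eigenvalues)
    for count, share in enumerate(accumulated, start=1):
        if share >= threshold:
            return count
    # Threshold never reached (e.g. above 100): keep every factor.
    return len(eigenvalues)


if __name__ == "__main__":
    individual, accumulated = explained_shares(EIGENVALUES)
    n = factors_for_threshold(EIGENVALUES)
    print(f"factors retained: {n}")
    print(f"accumulated variance: {accumulated[n - 1]:.4f}%")
```

With the Table 3 eigenvalues, six factors stay just below the 80% mark and the seventh brings the accumulated share to about 83.28%, which agrees with the seven common factors reported above.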

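
The three pseudo-R² coefficients discussed in the Results have standard closed forms in terms of the log-likelihood of the intercept-only model and that of the fitted model. The sketch below is a minimal Python illustration of those textbook formulas, not the SPSS computation used in the study; the function name `pseudo_r2` and the numbers in the example call are hypothetical and are not taken from the study's data.

```python
import math


def pseudo_r2(ll_null, ll_model, n_obs):
    """Compute three pseudo-R^2 values from log-likelihoods.

    ll_null  -- log-likelihood of the intercept-only model (negative)
    ll_model -- log-likelihood of the fitted model (negative, >= ll_null)
    n_obs    -- number of observations
    """
    # McFadden: 1 minus the ratio of the two log-likelihoods.
    mcfadden = 1.0 - ll_model / ll_null
    # Cox & Snell: 1 - (L_null / L_model)^(2/n); its upper bound is below 1.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n_obs)
    # Nagelkerke: Cox & Snell divided by its maximum, 1 - L_null^(2/n).
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n_obs)
    nagelkerke = cox_snell / max_cox_snell
    return {"cox_snell": cox_snell,
            "nagelkerke": nagelkerke,
            "mcfadden": mcfadden}


if __name__ == "__main__":
    # Hypothetical log-likelihoods, chosen only to illustrate the formulas.
    for name, value in pseudo_r2(-110.0, -70.0, 100).items():
        print(f"{name}: {value:.3f}")
```

Because the Nagelkerke coefficient divides the Cox & Snell coefficient by a quantity smaller than 1, it is always the larger of the two, which is the same ordering as the values 0.815 and 0.829 reported above.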