
A hybrid methodology to estimate construction material quantities at an early project phase
Article in International Journal of Construction Management, June 2016
DOI: 10.1080/15623599.2016.1176727


All content following this page was uploaded by Dilum Fernando on 03 August 2016.

International Journal of Construction Management

ISSN: 1562-3599 (Print) 2331-2327 (Online) Journal homepage: http://www.tandfonline.com/loi/tjcm20

Borja García de Soto, Bryan T. Adey & Dilum Fernando
To cite this article: Borja García de Soto, Bryan T. Adey & Dilum Fernando (2016): A hybrid
methodology to estimate construction material quantities at an early project phase,
International Journal of Construction Management, DOI: 10.1080/15623599.2016.1176727
To link to this article: http://dx.doi.org/10.1080/15623599.2016.1176727

Published online: 27 Jun 2016.

Download by: [UQ Library]

Date: 30 June 2016, At: 20:54

International Journal of Construction Management, 2016
http://dx.doi.org/10.1080/15623599.2016.1176727

A hybrid methodology to estimate construction material quantities at an early project phase

Borja García de Sotoᵃ*, Bryan T. Adeyᵃ and Dilum Fernandoᵇ

ᵃInstitute of Construction and Infrastructure Management, ETH Zürich, Switzerland; ᵇSchool of Civil Engineering, The University of Queensland, Australia

Preliminary project cost estimates are the first serious estimates made on a project. They play an important role during the
decision-making process, and are the benchmark with which future estimates are expected to agree. This paper
concentrates on the estimation of construction material quantities (CMQs) and presents a methodology to accurately
estimate them during an early project phase. The methodology makes use of existing data and uses regression analysis, neural networks
and case-based reasoning. It encompasses data collection, model development and evaluation,
and the integration of different techniques. The use of the methodology is demonstrated by estimating CMQs of relevant
structures. The accuracy of the methodology is investigated and compared with three state-of-practice approaches. The
results obtained show a significant improvement over the state of the practice, and would improve the accuracy of
preliminary project cost estimates. Through partial automation, the methodology would likely reduce the time required to make estimates.
Keywords: resource estimates; preliminary estimates; hybrid estimation models; neural networks; backward elimination;
case-based reasoning; construction material quantities

Introduction
Preliminary project cost estimates are the first serious estimates of the cost of a project. They are prepared during the conceptual phase of a project. They play an important role in the decision of whether or not to proceed with the project (i.e.
concept or feasibility study), and are the benchmark estimates with which future estimates are often, perhaps unreasonably,
expected to agree (Oberlender & Trost 2001). These preliminary estimates have two major limitations: (1) they are often
not accurate, i.e. there are large variations between these estimates and the actual construction cost; and (2) they require a
significant amount of effort to make. The first of these limitations derives from the fact that at early stages of a project, estimators have little information about the project, have limited scope definition and have little time to prepare estimates. The
second limitation complicates matters because estimators are often expected to deliver more detail, and therefore higher
accuracy, in shorter periods of time than is realistically possible. Due to the importance of these estimates it is of interest
to both increase their accuracy and decrease the time required to make them.
One way to improve the accuracy of construction cost estimates, which is used extensively, is to make preliminary
estimates based on the construction material quantities (CMQs) to be used in the project (Singh 1990, 1991; Bakhoum
et al. 1998; Yeh 1998; Chou 2005; Peng 2006; Chou et al. 2006; K. Kim et al. 2009; Fragkakis et al. 2011; Oh et al. 2013;
Son et al. 2013; Du & Bormann 2014; García de Soto et al. 2014), and then to multiply these estimates by the unit costs
which reflect market prices. These unit costs typically include fabrication and delivery, erection, installation, insurance,
site indirect costs, supervision, and overhead and profit. An advantage of making CMQ-based cost estimates, as compared
to estimating cost directly, is that they provide more insight into the cost estimates as quantities and unit costs are separated. This also allows managers to make better decisions and keep better track of progress throughout the project (Chou
et al. 2006; Yu 2006) by controlling the changes in quantities and unit costs separately. They are also beneficial to managers who want to exercise more control over their projects during the conception, bidding and construction phases, and to
contractors who want to prepare both reliable and competitive bids. A disadvantage, however, is that substantial increases
in accuracy require substantial increases in effort and are, therefore, time intensive. For example, detailed estimates might
require models to be developed and tested for every structure, and its corresponding CMQs, in a facility. Such models
would require substantial, and not always available, information, regarding the design of the structure for which the CMQs
are to be estimated, the designs of similar structures and the CMQs used in these similar structures.
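The CMQ-based approach described above (quantities multiplied by unit costs) can be illustrated with a minimal sketch. The material names, quantities and unit costs below are hypothetical values chosen only for illustration; none of them come from the paper.

```python
# Hypothetical quantities and unit costs (illustrative assumptions only).
quantities = {"concrete_m3": 1200.0, "reinforcement_t": 150.0, "structural_steel_t": 80.0}
unit_costs = {"concrete_m3": 250.0, "reinforcement_t": 1400.0, "structural_steel_t": 3200.0}


def cmq_based_cost(quantities, unit_costs):
    """Return per-material costs (quantity x unit cost) and their total."""
    per_material = {name: qty * unit_costs[name] for name, qty in quantities.items()}
    return per_material, sum(per_material.values())


per_material, total = cmq_based_cost(quantities, unit_costs)
print(per_material)
print(total)
```

Keeping quantities and unit costs in separate inputs mirrors the advantage noted in the text: a change in either one can be tracked on its own.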
Of the models used to make CMQ estimates, regression analysis (RA), neural networks (NNs) and case-based reasoning (CBR) are all common (Bowen & Edwards 1985; Khosrowshahi & Kaka 1996; Hegazy & Ayed 1998; Mohamed &
Celik 1998; Yau & Yang 1998; Yeh 1998; G.H. Kim et al. 2004; Sonmez 2004; Wilmot & Mei 2005; Lowe et al. 2006;
Cho et al. 2013; S. Kim & Shim 2013) and a small but growing number of researchers are proposing combining different
techniques to create hybrid models (G.H. Kim et al. 2005; Yu 2006; An et al. 2007; Koo et al. 2010; Jin et al. 2012;
S. Choi et al. 2013; S. Kim & Shim 2013). In many cases, the idea behind the combination of techniques to improve estimates stems from the work of Crane and Crotty (1967) who suggested combining estimates using RA, and Reid (1968)
and Bates and Granger (1969) who provided an initial theory for the combination of estimates. Combining estimates is
especially useful when it is not clear which approach provides the most accurate estimates (Armstrong 2001), enabling the
use of the unique features of each model to best capture different patterns or features in the data set (Al-Alawi et al. 2008),
and inevitably improving the accuracy of estimates over those made using a single model (Clemen 1989).

*Corresponding author. Email: garcia.de.soto@ibi.baug.ethz.ch
© 2016 Informa UK Limited, trading as Taylor & Francis Group
One way to decrease the amount of effort to make construction cost estimates is to partially automate the process. This
has been attempted by numerous researchers, such as Chou (2005), Peng (2006), Yu (2006) and García de Soto et al.
(2013). As the estimation of the construction cost of a project is a tedious quantification process, which is subject to human
error, partial automation enables project estimators to concentrate more on project-specific features, such as project type,
location and the constitution of the design/construction team, and on making preliminary designs of structures, which in
turn is likely to improve the accuracy of the cost estimates. Partial automation, through the standardization of the type and
level of information to be collected throughout projects, would also enable more meaningful comparisons between preliminary estimates, subsequent estimates and final costs, which would in turn improve future cost estimates.
When thinking of automation, one needs to consider the use of building information modeling (BIM) for the development of cost estimates, or 5D BIM, as it is typically referred to in the construction industry. Some researchers have looked
into the application of BIM for quantity take-offs or cost estimation during different project phases (e.g. J. Choi et al.
2015; Cheung et al. 2012), and there are different software tools to facilitate this process; however, the applications for 5D
BIM are past the conceptual phase. This makes sense because during the conceptual phase of a project the levels of definition (for both detail and information) that could be used by a BIM are not sufficient to develop a meaningful 5D BIM. In
addition to the insufficient level of definitions, the use of BIM is not widespread in all countries and industry sectors especially when used to develop estimates (Sattineni and Bradford 2011; Franco et al. 2015). However, there is no doubt that
BIM for cost estimation will be more prevalent, and it will redefine the way cost estimators prepare project estimates
(Forgues et al. 2012; Wu et al. 2014; McCuen 2015).
In this paper, a methodology is presented to improve the accuracy of preliminary cost estimates through improved estimates (Class 4 estimates as defined by AACEI, formerly the Association for the Advancement of Cost Engineering) of the
CMQs used in construction projects, and to reduce the amount of time required to make these estimates through partial
automation. The methodology makes use of existing data and uses RA, NNs and CBR. It fills a gap in current research as
it includes a process that can be systematically used to both develop and evaluate estimation models using different techniques. The methodology encompasses data collection, model development and evaluation, and the integration of different
techniques to create accurate estimates of CMQs that can then be coupled with unit costs.
The next two sections contain a description of the proposed methodology and an example of how it is used to estimate
the CMQs of structures in cement plant projects. The subsequent sections contain an evaluation of the accuracy of the
methodology, a comparison of this accuracy with three state-of-practice approaches, and a conclusion.
Methodology
The proposed methodology consists of two phases, namely the pre-estimation phase and the estimation phase. Each phase
has steps¹ that can be further divided into sub-steps (Figure 1). This section includes the framework and fundamentals of
the proposed methodology that could be applied to any project type, in any industry.
Pre-estimation phase
The pre-estimation phase involves the steps to be done prior to making CMQ estimates. This phase includes data collection, identification of relevant CMQs and structures, and the development and evaluation of CMQ estimation models.
Before starting these steps, it is necessary to set the boundaries for the type of project² for which CMQ estimates will
be made. One must identify, for the project type to be investigated, the area(s) (e.g. civil/structural work), the corresponding section(s) of the facility (e.g. process buildings and structures), and the type of structures (e.g. tall-frame structures,
storage structures) to be addressed. Then, one must identify the CMQs of interest (e.g. concrete, reinforcement, structural
steel) for the different structure types, and classify the structure types based on their contribution to those CMQs. This
allows the estimator to define the scope for the models to be developed, to gain a better understanding of the different
structures to be considered, to identify the CMQ-relevant structures, and to determine the data required to estimate the
CMQs (i.e. the CMQ-relevant parameters).
Figure 1. Main steps of the proposed methodology. CMQs: construction material quantities.

Therefore, one must first set the boundaries of the estimation by making a decision as to what models will be developed
(i.e. breakdown structure for CMQs to be considered and for which structures). This is obvious for projects, such as
industrial projects (e.g. greenfield cement plant projects), which are composed of different areas (e.g. mechanical, civil/
structural, electrical) with several elements (e.g. for civil/structural: temporary facilities, process buildings and structures,
internal infrastructure) each with many components or structures of different types (e.g. for process buildings: tall-frame
structures, storage structures). It is therefore important to conduct a careful definition and classification of the different
types of components or structures for the selected element in a given area of the typical project type being considered.
This relationship is shown in Figure 2.
Once a given area and element for a typical project type has been chosen (e.g. the civil and structural work related to
the process buildings and structures used in complete-process³ greenfield cement plants), the typical structures for that element can be identified and grouped by structure type (e.g. structure type Y or storage structure; upper level) and structure
subtype (e.g. structure subtype Y1 or storage structure type A, structure subtype Y2 or storage structure type B, structure
subtype Y3 or storage structure type C; lower level). For simplification purposes, no further levels are considered.
The CMQs are assigned to the structures in the lower level (i.e. structure subtypes); therefore, the collection of data should
be made at that level. This relationship is shown in Figure 3.
Once the typical structures have been classified, they can be further analysed to determine their importance, or contribution, in terms of the selected CMQs. Models should be focused on the important structures only (i.e. CMQ-relevant
structures).

[Figure 2: hierarchy diagram. Typical project (e.g. greenfield cement plant) > Area A (e.g. mechanical), Area B (e.g. civil/structural), Area C (e.g. electrical), etc. > Element A (e.g. temporary facilities), Element B (e.g. process buildings/structures), Element C (e.g. internal infrastructure), etc. > Typical structures (e.g. tall-frame structures, storage structures).]

Figure 2. Example of the breakdown structure with the different parts that make up a typical project for which construction material
quantity estimates will be developed.

[Figure 3: hierarchy diagram. Typical structures (e.g. process buildings) > upper level: structure type X (e.g. tall-frame), structure type Y (e.g. storage), etc. > lower level: structure subtype Y1 (e.g. storage A), structure subtype Y2 (e.g. storage B), structure subtype Y3 (e.g. storage C), etc. > CMQ1, CMQ2, CMQ3, etc.]

Additional classifications within a structure subtype are addressed when considering the parameters to be used in the
estimation models to be developed (e.g. construction material, capacity, type of discharge or equipment required, number
of chambers, number of stages). When the amount of data for model development is limited, this consideration could also
be extended to the structure type (i.e. upper level) by rolling the structure subtype up to avoid developing simplified design
models to generate data. For example, for a given structure type, the associated structure subtypes can be used as parameters (i.e. categorical variables) when developing the estimation models for that structure type, e.g. combining all the data
for storage structure types and using the storage structure subtype (i.e. A, B, C) as a categorical variable in the models.
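Using the structure subtype as a categorical variable can be sketched as follows. This is a minimal illustration of dummy coding, assuming a baseline level that is absorbed by the constant term of the model; the function name and the choice of baseline are assumptions and are not taken from the paper.

```python
def dummy_code(subtypes, baseline):
    """Encode subtype labels as 0/1 indicator columns, omitting the baseline level."""
    levels = sorted(set(subtypes) - {baseline})
    rows = [[1 if subtype == level else 0 for level in levels] for subtype in subtypes]
    return levels, rows


# Four hypothetical storage structures of subtypes A, B, C and A.
levels, rows = dummy_code(["A", "B", "C", "A"], baseline="A")
print(levels)  # indicator columns, one per non-baseline subtype
print(rows)    # one row of indicators per structure
```

Pooling the data this way lets one model per structure type be fitted, with the indicator columns capturing the differences between subtypes.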
Step 1: collect and prepare data
In this step, all necessary data are collected and prepared (Figure 4). The data are normally available from different sources, such as bill of quantities (BoQs), as-built drawings, general arrangement drawings, design calculations and technical
reports. When collecting data one should keep in mind the level of detail required. All collected data should be stored in a
standard format (e.g. same units, with clear identification of applicable building code or standard) in a database, to facilitate the extraction of information and the identification of similar structures during the analysis phase of the methodology.

Figure 3. Breakdown structure for typical structures into structure type (upper level) and structure subtype (lower level).
CMQ: construction material quantity.

[Figure 4: flowchart of data collection, processing, and storage for further analysis. Collect: collect raw data from similar projects; collect additional/missing information. Process/Analyze: assign structure type and subtype information; check and adjust units for different fields; analyze data (e.g. identification and removal of outliers, handling of missing values). Store: store data (database); update database. Decision labels: more data required / no more data required; more data available / no more data available; data ready for further use.]

Figure 4. Process to collect and prepare data.

Special attention should be paid to ensure that the CMQ data are coupled together with the information about the structure
type and subtype, dimensions and structure-specific characteristics, as well as site-specific information.
The data are to be prepared to make them homogeneous. This is done by checking them for consistency, eliminating
biases, identifying and removing outliers, handling missing values, transforming the data to a logarithmic scale (i.e. by taking the natural logarithm) to accommodate the functional form to be used in the RA, and scaling values between 0 and 1 to
facilitate the estimation process in the next phase (i.e. determination of similar structures).

Step 2: identify CMQ-relevant structures


In this step, the structure subtypes defined in step 1 are analysed to determine their contribution to the CMQs required for
the project. The contribution of a structure subtype to the CMQs of the project consists of the CMQs of all of the structures
of the defined structure subtype. The CMQ-relevant structures are those belonging to the subtypes that together comprise
the percentage of CMQs to be associated with structures, when the structure subtypes are ordered in descending order of
their contribution to the CMQs. For example, if 80% of the CMQs of a project are to be associated with structures, and
structure subtype 1 makes up 60%, structure subtype 2 makes up 25% and structure subtype 3 makes up 15%, then all
structures belonging to structure subtypes 1 and 2 are considered to be CMQ-relevant, because together they account for 85% and thus meet the 80% threshold. CMQ models
are developed only for the CMQ-relevant structures.
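The selection rule in this example can be sketched in a few lines: rank the subtypes by contribution and keep adding them until the cumulative share reaches the threshold. The function name is an assumption; the input values are the ones used in the example above.

```python
def select_cmq_relevant(contributions, threshold):
    """Pick subtypes in descending order of contribution until the threshold (in %) is met."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for subtype, share in ranked:
        if cumulative >= threshold:
            break  # threshold already covered by the subtypes selected so far
        selected.append(subtype)
        cumulative += share
    return selected, cumulative


selected, covered = select_cmq_relevant(
    {"subtype 1": 60.0, "subtype 2": 25.0, "subtype 3": 15.0}, threshold=80.0
)
print(selected, covered)
```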
In setting the threshold for the percentage of CMQs to be associated with structures, thought must be given to the
increasing amount of effort and time required to collect data and develop the models as the threshold
increases, especially considering that often only a relatively small number of structure subtypes are the source of most
of the CMQs (similar to the Pareto principle). For example, in the construction of a typical greenfield cement plant (i.e.
typical project), about 20% of the structure types contribute about 80% of the CMQs used (García de Soto et al. 2013).
The increase in effort may not be worth the increased accuracy of the CMQ estimate.
The contribution of a structure subtype for each CMQ is determined by calculating the ratio between the CMQ used in
the structures of that structure subtype for sample projects of that project type (i.e. typical project) and the total CMQ
(Equation 1).
$$P_{i,a} = \frac{\mathrm{CMQ}_{i,a}}{\mathrm{CMQ}_{i,T}} \tag{1}$$

where:
$P_{i,a}$: percentage of $\mathrm{CMQ}_i$ for structure subtype $a$ (from $a = 1$ to $b$ and $i = 1$ to $J$);
$\mathrm{CMQ}_{i,a}$: total amount of $\mathrm{CMQ}_i$ for structure subtype $a$ in all typical projects from which data have been collected;
$\mathrm{CMQ}_{i,T}$: total amount for $\mathrm{CMQ}_i$ for all structure subtypes in all typical projects from which data have been collected.
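As a minimal sketch of Equation (1), the ratio for each CMQ and structure subtype can be computed from quantities summed over the sample projects. The data layout (a dictionary keyed by CMQ, then by subtype) and the numbers are assumptions made for illustration.

```python
def contribution_ratios(quantities):
    """Return P[i][a] = CMQ_{i,a} / CMQ_{i,T} for every CMQ i and structure subtype a."""
    ratios = {}
    for cmq, by_subtype in quantities.items():
        total = sum(by_subtype.values())  # CMQ_{i,T}: total over all subtypes
        ratios[cmq] = {subtype: qty / total for subtype, qty in by_subtype.items()}
    return ratios


# Hypothetical totals (e.g. m3 of concrete) summed over all sample projects.
example = {"concrete": {"storage A": 6000.0, "storage B": 3000.0, "tall-frame": 1000.0}}
ratios = contribution_ratios(example)
print(ratios["concrete"])
```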
Instead of ranking structure subtypes separately for each CMQ, the individual percentages (Pi,a) can be combined to
facilitate the identification of CMQ-relevant structures, i.e. the structure subtypes are ranked only once, taking into consideration all the CMQs to be estimated, instead of once for each CMQ to be estimated. Using Equation (2), a ranking for all
of the structure subtypes for one type of project can be made by taking into account the number of structure subtypes contributing to each CMQ and the amount that each structure subtype contributes to each CMQ.
$$W_a = \sum_{a=1}^{b} \sum_{i=1}^{J} \left( \frac{S_a}{S_T} \right) P_{i,a} \times 100 \tag{2}$$

where:
$W_a$: weighted percentage for structure subtype $a$ (from $a = 1$ to $b$);
$S_a$: total number of structure subtypes with $\mathrm{CMQ}_i$ in all typical projects from which data have been collected;
$S_T$: total number of structure subtypes for all CMQs in all typical projects from which data have been collected.
Once the weighted percentage is calculated, the structure subtypes are ranked from high to low contribution, and the
structure subtypes which, when added together, make up for at least the selected threshold are classified as CMQ-relevant
structures. The rest are CMQ-irrelevant and their cumulative CMQs are grouped and used later to estimate the CMQs for
the entire project.

At the end of this step there is a total of P CMQ-relevant structures that contribute in different amounts to J CMQs (e.g.
$\mathrm{CMQ}_i$, $\mathrm{CMQ}_{i+1}$, $\mathrm{CMQ}_J$). It is not likely that there is much variation in the CMQ-relevant structures of projects of one project type. This should, however, be checked if there is reason to believe that there have been fundamental changes to how
structures of certain subtypes are built, such as the use of new technology, construction methods, materials or general project configuration.

Step 3: Develop and evaluate CMQ estimation models


The 11 sub-steps for step 3 can be found in García de Soto et al. (2014). This process uses RA and NNs. The functional
form used to develop models using RA is the back-transformed equation (Equation 3), also known as the constant elasticity
or multiplicative relationship (Albright et al. 2003). It has been used by other researchers, such as Chou (2005), Peng
(2006) and Chou et al. (2006), to develop quantity-based preliminary cost estimation models. The RA models were developed using the backward elimination technique (BET), a stepwise regression technique that uses statistical constraints to
determine whether a variable is kept or removed from the regression equation. The developed models were evaluated using the Akaike information criterion with a second-order correction (AICc) in order to select the most accurate model from
a set. The model with the lowest AICc value is the one with the lowest information loss, and hence the one most likely to be the
most accurate model in the set (Motulsky & Christopoulos 2003). The development and evaluation of the models also tells the estimator the key parameters to be used in the estimation of the CMQs (i.e. identification of CMQ-relevant
parameters). Additional information about this step can be found in García de Soto et al. (2014).
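The model comparison described above can be sketched with the commonly used least-squares form of the corrected criterion, AICc = N ln(SS/N) + 2K + 2K(K+1)/(N − K − 1). This is a sketch under stated assumptions, not the authors' implementation: K is taken here as the number of fitted coefficients plus one, and the candidate models and their residual sums of squares are hypothetical.

```python
import math


def aicc_least_squares(ss_res, n_obs, n_coef):
    """AICc for a least-squares fit; K counts the coefficients plus one (assumption)."""
    k = n_coef + 1
    if n_obs - k - 1 <= 0:
        raise ValueError("too few observations for the AICc correction")
    aic = n_obs * math.log(ss_res / n_obs) + 2 * k
    return aic + (2 * k * (k + 1)) / (n_obs - k - 1)


# Hypothetical candidates: (residual sum of squares, observations, fitted coefficients).
candidates = {"model_a": (4.0, 20, 3), "model_b": (3.8, 20, 5)}
scores = {name: aicc_least_squares(*args) for name, args in candidates.items()}
best = min(scores, key=scores.get)  # lowest AICc = lowest estimated information loss
print(scores, best)
```

In this made-up case the larger model fits slightly better but is penalized for its extra coefficients, so the smaller model has the lower AICc.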


$$Y = X_1^{b_1} X_2^{b_2} \cdots X_n^{b_n} \, e^{\, b_0 + \sum_{j=1}^{m} b_{n+j} X_{n+j} + \frac{SEE}{2}} \tag{3}$$

where:
$Y$: output from the regression equation;
$b_0$: constant term (y-intercept);
$b_1 \rightarrow b_{n+m}$: unstandardized regression coefficients;
$X_1 \rightarrow X_n$: continuous independent variables (IVs);
$X_{n+1} \rightarrow X_{n+m}$: categorical IVs;
$e$: the inverse of natural logarithm (LN);
$SEE$: sum of squared errors.
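A minimal sketch of how a model of this multiplicative form could be evaluated is given below. It assumes that the correction term enters the exponent as SEE/2, that continuous IVs enter as powers and that categorical IVs enter through the exponent; the function name and all coefficient values are hypothetical.

```python
import math


def back_transformed_prediction(x_cont, b_cont, x_cat, b_cat, b0, see):
    """Evaluate Y = prod(X_i ** b_i) * exp(b0 + sum(b_cat * x_cat) + SEE / 2)."""
    product = 1.0
    for x, b in zip(x_cont, b_cont):
        product *= x ** b  # continuous IVs act multiplicatively (constant elasticity)
    exponent = b0 + sum(b * x for b, x in zip(b_cat, x_cat)) + see / 2.0
    return product * math.exp(exponent)


# Hypothetical model: two continuous IVs and one 0/1 categorical IV.
y = back_transformed_prediction(
    x_cont=[4.0, 10.0], b_cont=[0.5, 1.0], x_cat=[1], b_cat=[0.2], b0=0.1, see=0.0
)
print(y)
```

Taking the natural logarithm of both sides gives a linear model in ln(X), which is why the data are log-transformed before the regression is fitted.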
Estimation phase
The estimation phase consists of one step, which in turn consists of the eight sub-steps that are required to make CMQ estimates, once the data have been prepared and the necessary models have been developed, i.e. steps 1 to 3. In this phase, the
CMQs for the structures in the target project are estimated. This is done by using CBR to select similar structures, which
enables the methodology to learn from historical data (i.e. transfer of knowledge). The values obtained from CBR are completed using either RA models or NN models. Throughout this phase, the bases for the estimated CMQs (e.g. the parameters used and the assumptions relied upon in the estimating process) should be tracked and documented (García de Soto
et al. 2013). The eight sub-steps (Figure 5) are described in the following sections.
Step 4.1: obtain values of input parameters
In this step, the estimator obtains the values of the required input parameters and ensures that they are in a form compatible
with the model requirements (i.e. appropriate units, relevant code) for the structures to be estimated. Obtaining these values during the conceptual or feasibility phase of the project is normally not difficult, as most of the information can be
obtained from, for example, mass flow diagrams, flow sheets, feasibility study reports, the project layout and, in some cases,
general arrangement drawings. This is because one of the requirements for the selection of potential key parameters (i.e. CMQ-relevant parameters) is that they be readily available during the early stages of a project (García de Soto et al.
2014). If they are not, the models developed in step 3 should be revised to include only variables that are
readily available.


Figure 5. Process for the estimation of construction material quantities (CMQs) for structures in target projects (Step 4).

Step 4.2: Normalize parameters


In this step, the parameters are normalized by scaling them between 0 and 1 (Equation 4). This ensures that the different
scales and units used for the parameters do not inadvertently skew the search for similar structures.
$$X_{i,norm} = \bar{X}_i = \begin{cases} \dfrac{X_i - X_i^{min}}{X_i^{max} - X_i^{min}}, & \forall X_i^{max} > X_i^{min} \\ 0.5, & \forall X_i^{max} = X_i^{min} \end{cases} \tag{4}$$

where:
$X_{i,norm}$: normalized value between 0 and 1;
$X_i$: raw parameter to be normalized;
$X_i^{min}$: minimum value for parameter $X_i$ (minimum of input⁴ or existing);
$X_i^{max}$: maximum value for parameter $X_i$ (maximum of input or existing).

The natural log values of the variables are used for the normalization or scaling process, i.e. Equation (4) is modified to
use ln(X) instead of X, yielding Equation (5).

$$\ln X_{i,norm} = \overline{\ln X_i} = \begin{cases} \dfrac{\ln X_i - \ln X_i^{min}}{\ln X_i^{max} - \ln X_i^{min}}, & \forall X_i^{max} > X_i^{min} \\ 0.5, & \forall X_i^{max} = X_i^{min} \end{cases} \tag{5}$$
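The scaling in Equations (4) and (5) can be sketched as a single function with an optional logarithmic transform. The function and argument names are assumptions; the 0.5 fallback for a parameter whose minimum and maximum coincide follows the second case of the equations.

```python
import math


def scale(value, lo, hi, use_log=False):
    """Min-max scale a parameter to [0, 1]; return 0.5 when the range is degenerate."""
    if use_log:
        # Equation (5): apply the same scaling to natural-log values (inputs must be > 0).
        value, lo, hi = math.log(value), math.log(lo), math.log(hi)
    if hi == lo:
        return 0.5
    return (value - lo) / (hi - lo)


def scale_against_cases(target_value, existing_values, use_log=False):
    """Take min and max over the target input and the existing cases, then scale the target."""
    lo = min([target_value] + list(existing_values))
    hi = max([target_value] + list(existing_values))
    return scale(target_value, lo, hi, use_log)


print(scale(5.0, 0.0, 10.0))
print(scale(10.0, 1.0, 100.0, use_log=True))
print(scale_against_cases(3.0, [3.0, 3.0]))
```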

Step 4.3: select the similarity threshold


In this step the similarity threshold is selected. It is to be selected taking into consideration the availability of data and the
actual similarity between the situation where the project is to be conducted and the situations where other projects have
been conducted. The adaptation process used (García de Soto & Adey 2015b) reduces the impact of the value of the similarity threshold by controlling the contribution of the existing cases as the basis of the estimate for the new case depending
on their similarity. Therefore, the exact value of the similarity threshold does not have a large effect on the CMQ estimates,
especially when it is below 60%. A default value for the similarity threshold of 70%⁵ has been used (García de Soto &
Adey 2015b).
Step 4.4: calculate the distance between the target structure and existing structures
In this step, the distance between the target structure and all existing structures of the same subtype is calculated. The distance is the difference between the normalized (scaled) parameters for the target and existing structures (García de Soto &
Adey 2015a). The distance between the target and existing structures used in the proposed methodology is determined
using Equation (6) (García de Soto & Adey 2015a).

$$\mathrm{Dist}(X_o, X_j) = \sum_{i=1}^{n} \left| x_{oi} - x_{ji} \right| \bar{b}_i = \bar{Y}_o - \bar{Y}_j \tag{6}$$

where:
$X_o$: existing structure;
$X_j$: target structure;
$x_{oi}$: scaled value of the ith CMQ-relevant parameter for the existing structure ($X_o$);
$x_{ji}$: scaled value of the ith CMQ-relevant parameter for the target structure ($X_j$);
$n$: number of CMQ-relevant parameters, from $i = 1$ to $n$;
$\bar{b}_i$: weight of each parameter using the unstandardized regression coefficients adjusted (Equation 7) to account for
parameter normalization;
$\bar{Y}_o$: normalized and adjusted value for the existing structure corresponding to CMQ-relevant parameters $x_{oi}$;
$\bar{Y}_j$: normalized and adjusted value for the target structure corresponding to CMQ-relevant parameters $x_{ji}$.

In the estimation of distance, not all parameters have the same weight. For example, when estimating the CMQs for a
given structure subtype (e.g. storage structure A) it may be more beneficial to have another storage structure A with a similar wind load than with a similar extreme ground motion, or vice versa. The contribution of each parameter to the distance
between structures is weighted using the unstandardized regression coefficients from the selected regression model. Since
the parameters are normalized, the unstandardized regression coefficients are adjusted to account for this normalization
(Equation 7).
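$$b_{adjusted} = \bar{b}_i = \begin{cases} \dfrac{|b_i| \left( X_i^{max} - X_i^{min} \right)}{\sum |b_i| \left( X_i^{max} - X_i^{min} \right)}, & \forall X_i^{max} > X_i^{min} \\ 0, & \forall X_i^{max} = X_i^{min} \end{cases} \tag{7}$$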

Equation (7) can be modified to use ln(X) instead of X, yielding Equation (8), if this is considered to be beneficial,
e.g. when the regression models have been developed using the transformed data set (i.e. using the natural logarithm) for
Equation (3).

$$b_{adjusted} = \bar{b}_i = \begin{cases} \dfrac{|b_i| \left( \ln X_i^{max} - \ln X_i^{min} \right)}{\sum |b_i| \left( \ln X_i^{max} - \ln X_i^{min} \right)}, & \forall X_i^{max} > X_i^{min} \\ 0, & \forall X_i^{max} = X_i^{min} \end{cases} \tag{8}$$
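The weighting and distance calculations can be sketched together as follows. The sketch reads the adjusted weight as |b_i| times the parameter range, divided by the sum of those products over all parameters, and the distance as the weighted sum of absolute differences between scaled values. Function names and numbers are assumptions for illustration.

```python
def adjusted_weights(coefs, mins, maxs):
    """Weight each parameter by |b_i| * (max_i - min_i), normalized to sum to 1."""
    raw = [abs(b) * (hi - lo) for b, lo, hi in zip(coefs, mins, maxs)]
    total = sum(raw)
    if total == 0:
        # Every parameter has a zero range or zero coefficient (edge case, assumption).
        return [0.0] * len(raw)
    return [value / total for value in raw]  # a zero range yields a zero weight


def distance(existing_scaled, target_scaled, weights):
    """Weighted sum of absolute differences between scaled parameter values."""
    return sum(abs(xo - xj) * w for xo, xj, w in zip(existing_scaled, target_scaled, weights))


# Hypothetical regression coefficients and parameter ranges.
weights = adjusted_weights(coefs=[0.5, -1.5], mins=[0.0, 0.0], maxs=[10.0, 10.0])
dist = distance(existing_scaled=[0.2, 0.4], target_scaled=[0.6, 0.4], weights=weights)
print(weights, dist)
```

Because the weights sum to 1 and the scaled values lie between 0 and 1, the distance computed this way also lies between 0 and 1. For the logarithmic variant, the same function can be called with the natural logarithms of the minimum and maximum values.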

Step 4.5: calculate the similarity between the target and each existing structure
In this step, the similarity of the target structure to each existing structure is calculated using Equation (9). When the distance
between the target and a given existing structure is zero, the similarity is
100%, i.e. there is a perfect match between the two structures.

$$\mathrm{Sim}(X_o, X_j) = \left( 1 - \left| \mathrm{Dist}(X_o, X_j) \right| \right) \times 100 \tag{9}$$

where:
$\mathrm{Sim}(X_o, X_j)$: similarity between the target ($X_j$) and existing ($X_o$) structures as a percentage (100% = perfect match);
$\mathrm{Dist}(X_o, X_j)$: value of the distance between the target ($X_j$) and existing ($X_o$) structures (scaled between 0 and 1).
Step 4.6: select similar structures
In this step the similarity of the target structure with each existing structure is compared to the similarity threshold. If the
similarity is greater than or equal to the similarity threshold, the existing structure is classified as a similar structure (Equation 10) and will be later used for the estimation of CMQs for the target structure. If the similarity threshold is not met, the
existing structure is classified as not similar and will not be used.
$$
Sim_{adj}(X_o, X_j) \geq \text{similarity threshold} \;\rightarrow\; \text{similar structure} \qquad (10)
$$
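The selection rule of Equation (10) might be sketched as follows. The helper `select_similar` is a hypothetical name; the similarities are the values reported for target structure T1 in Table 16, and the 70% default is the threshold used in the example.

```python
def select_similar(similarities: dict[int, float], threshold: float = 70.0) -> list[int]:
    """Equation (10): keep structures whose similarity (%) meets the
    threshold, ordered from most to least similar."""
    kept = [sid for sid, sim in similarities.items() if sim >= threshold]
    return sorted(kept, key=lambda sid: similarities[sid], reverse=True)


# Similarities (%) reported in Table 16 for target T1 (existing IDs 1-14).
table_16 = {1: 84.36, 2: 98.75, 3: 98.47, 4: 88.21, 5: 72.86, 6: 67.61,
            7: 48.48, 8: 78.46, 9: 70.49, 10: 43.86, 11: 45.66, 12: 43.82,
            13: 53.03, 14: 20.47}
similar_ids = select_similar(table_16, threshold=70.0)
print(similar_ids)
```

With these inputs the selected structures are the seven listed in Table 17.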

Step 4.7: estimate CMQs for CMQ-relevant structures


In this step, the CMQs for the CMQ-relevant structures are estimated. The estimation falls into the following categories,
depending on the similarity between the target and the existing structure(s) (Table 1).
When situation 2 occurs (Table 1), the initial estimate is adjusted to account for the differences in the values of the
parameters between the target and the existing similar structure using the selected regression model (Patterson et al. 2002;
Jin et al. 2012; García de Soto et al. 2013; Jin et al. 2014), modified to account for the percentage error of the model.
When there is more than one similar existing structure, their contribution is weighted using their similarity (i.e. more similar structures have a higher weight than less similar structures do). The sub-process used to make the adjustment is
described in Figure 6.


B. García de Soto et al.

Table 1. Categories of similarity.

| No. | Definition | Description | Estimation of CMQs of the target structure |
|---|---|---|---|
| 1 | Sim = 100% | There is at least one other structure with exactly the same values of the CMQ-relevant parameters as the target structure | Is the actual CMQs for the existing structure(s) (i.e. CMQj = CMQoi) |
| 2 | 100% > Sim ≥ SimThres | There is at least one structure that is considered to be similar to the target structure but which does not have exactly the same values of the CMQ-relevant parameters as the target structure | Is the adapted CMQs for the existing structure(s), which is explained in the remainder of this section |
| 3 | Sim < SimThres | There are no structures that are considered to be similar to the target structure | Is estimated directly using the models determined in step 3 |

CMQs: construction material quantities.

Step 4.8: estimate total CMQs based on estimated CMQs for CMQ-relevant structures
The total CMQ for each CMQ of interest in the project is derived using the estimated CMQs for the CMQ-relevant structures (Equation 11).
$$
CMQ_{i,p} = \frac{\sum CMQ_{i,m}}{\sum P_{i,m}} \qquad (11)
$$

where:
CMQ_{i,p}: total amount for CMQi for the project p being estimated;
CMQ_{i,m}: estimated amount of CMQi for structure subtype m;
P_{i,m}: corresponding percentage of CMQi (in decimal form) for structure subtype m (determined from existing project
data, see Equation 1).
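A minimal sketch of Equation (11) is given below. The function name `total_cmq` is hypothetical; the inputs are the concrete estimates for the relevant structures of the Beta plant and the rounded concrete shares quoted in the example.

```python
def total_cmq(estimated: list[float], shares: list[float]) -> float:
    """Equation (11): scale the CMQs of the relevant structures up to the
    whole project by dividing by their combined share (decimal form)."""
    if len(estimated) != len(shares):
        raise ValueError("one share is needed per estimated CMQ")
    combined_share = sum(shares)
    if combined_share <= 0:
        raise ValueError("combined share must be positive")
    return sum(estimated) / combined_share


# Concrete (m3) for the Beta plant, using the rounded shares from the example.
concrete_beta = total_cmq([3519, 6383, 4027, 5751], [0.34, 0.22, 0.12, 0.09])
print(round(concrete_beta))
```

With the rounded shares the sketch gives roughly 25,560 m3, slightly below the 25,700 m3 reported in the example; the difference presumably comes from the example using unrounded percentages.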

Step 1. Use the actual CMQ of the existing similar structure (CMQ_oi).

Step 2. Use the selected regression model to calculate the estimated CMQ (CMQ_omi), i.e. the estimated CMQ for existing similar structure i.

Step 3. Use the selected regression model to calculate the estimated CMQ (CMQ_jm), i.e. the estimated CMQ of the target structure.

Step 4. Modify the estimated CMQ taking into consideration the % error:

$$
CMQ_{jm,i} = CMQ_{jm} + (CMQ_{jm} \times e_i)
$$

e_i: % error obtained from the regression equation (keeping the sign) for existing structure i, e_i = (CMQ_omi - CMQ_oi) / CMQ_oi.

Step 5. Determine the difference between the target and the existing similar structure:

$$
\Delta_i = CMQ_{jm,i} - CMQ_{omi}
$$

Δ_i: difference between the estimated CMQs for existing similar structure i and target structure j (keeping the sign);
CMQ_jm,i: estimated CMQ for the target structure j using the selected regression model and modified to account for the % error for the existing structure (step 4);
CMQ_omi: estimated CMQ for existing similar structure i using the selected regression model.

Step 6. Calculate the adjusted CMQ (CMQ_jadj) for the target structure:

$$
CMQ_{jadj} = \frac{\sum_{i=1}^{n} (CMQ_{oi} + \Delta_i)\, Sim_{oij}}{\sum_{i=1}^{n} Sim_{oij}}
$$

Sim_oij: similarity between target structure j and existing similar structure i in decimal form (determined using Equation (9)).

Figure 6. Adaptation sub-process.
CMQ: construction material quantity.
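The six steps of the adaptation sub-process can be sketched in a few lines. The function `adapt_cmq` and its argument layout are hypothetical; the inputs are the concrete quantities reported for target structure T1 in Table 19 and the similarities in Table 17. Similarities are passed in percent, which leaves the weighted average unchanged.

```python
def adapt_cmq(target_model_estimate, existing):
    """Steps 1-6 of the adaptation sub-process (Figure 6).

    existing: list of (actual CMQ_oi, model estimate CMQ_omi, similarity).
    """
    weighted_sum = 0.0
    similarity_sum = 0.0
    for actual, model_estimate, sim in existing:
        error = (model_estimate - actual) / actual         # e_i, sign kept
        corrected = target_model_estimate * (1.0 + error)  # step 4
        delta = corrected - model_estimate                 # step 5
        weighted_sum += (actual + delta) * sim             # step 6, numerator
        similarity_sum += sim                              # step 6, denominator
    return weighted_sum / similarity_sum


# Existing structures 2, 3, 4, 1, 8, 5, 9: (Step 1, Step 2, Sim %).
similar = [(3358, 3509, 98.75), (3431, 3500, 98.47), (4579, 3906, 88.21),
           (3555, 3141, 84.36), (4310, 4360, 78.46), (3866, 4616, 72.86),
           (4692, 4728, 70.49)]
estimate_t1 = adapt_cmq(3505, similar)  # 3505 m3 is the Step 3 model estimate
print(round(estimate_t1))
```

The result lies within a few cubic metres of the 3519 m3 reported for T1; small deviations are expected because the tabulated inputs are rounded.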

International Journal of Construction Management


Example
This example shows how the proposed methodology can be used to develop estimation models to determine the quantities
of concrete (CO), reinforcement (RE) and structural steel (ST) required in the construction of a typical Greenfield cement
plant project. The example covers the two phases of the proposed methodology and all of the required steps, from data collection to model development and CMQ estimations. The example is explained from the point of view of an estimator following the process. For the estimation phase, three Greenfield plants containing storage, tall-frame, grinding and packing
structures were used. The structures from the three Greenfield plants were not used in the development of the estimation
models incorporated in the complete methodology. Due to space constraints, only the models related to the estimation of concrete and reinforcement for storage structures are presented in detail to show their development. A similar process
was followed for the models related to the estimation of concrete, reinforcement and structural steel for tall-frame structures.^6
For these, only the selected models are shown.


Pre-estimation phase
Step 1: collect and prepare data
Data were collected from Greenfield cement plant construction projects around the world built between 2007 and 2010. The data
contained information about the structures, their dimensions and structure-specific characteristics, as well as site-specific information. The information was obtained from BoQs, as-built and/or general arrangement drawings, mass flow diagrams,
flow sheets and design calculations/technical reports made available by one of the leading suppliers of cement and aggregates worldwide. The collected data were prepared (e.g. investigating outliers, getting information about missing values^7)
as shown in Figure 4, and the design parameters were standardized to the same code values^8 and appropriate units. The
data were transformed (e.g. taking the natural logarithm of the data set), and the transformed values were scaled between 0
and 1.
Step 2: identify CMQ-relevant structures
Equation (1) was used to determine the contribution of the different structure subtypes, with respect to the CMQs evaluated
(Table 2).
Once the contribution of each structure type with respect to the CMQs was calculated, Equation (2) was used
to combine the different percentages and obtain a weighted value used to rank the structures.^9 A total of 332
CMQ entries were collected from all available structures, of which 148 each related to concrete
and to reinforcement, and 36 related to structural steel. For example, the weighted percentage
(W) for storage structure subtype A (Table 2) was calculated as follows:

$$
W_{\text{Storage structure subtype A}} = \frac{148}{332}\,(34\%) + \frac{148}{332}\,(48\%) + \frac{36}{332}\,(0\%) = 37\%
$$
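The worked calculation above can be reproduced with a short sketch. The function `weighted_percentage` is a hypothetical name, and the weighting by entry counts is inferred from the worked numbers rather than taken from Equation (2) directly.

```python
def weighted_percentage(shares: dict[str, float], entries: dict[str, int]) -> float:
    """Weight each material share (%) by its fraction of all CMQ entries."""
    total_entries = sum(entries.values())
    return sum(shares[m] * entries[m] / total_entries for m in entries)


# Entry counts from the example (148 + 148 + 36 = 332) and Table 2 shares (%).
entries = {"concrete": 148, "reinforcement": 148, "steel": 36}
storage_a = {"concrete": 34, "reinforcement": 48, "steel": 0}
w_storage_a = weighted_percentage(storage_a, entries)
print(round(w_storage_a))
```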

Table 2. List of structure types with their corresponding percentage of construction material quantities (CMQs) sorted by percentage of
concrete.

| Structure subtype | Concrete (%) | Reinforcement (%) | Structural steel (%) |
|---|---|---|---|
| Storage structure subtype A | 34 | 48 | 0 |
| Storage structure subtype B | 22 | 22 | 0 |
| Tall-frame structures (subtypes A-L) | 12 | 6 | 61 |
| Grinding structure subtype B | 10 | 4 | 16 |
| Storage structure subtype C | 9 | 10 | 0 |
| Grinding structure subtype A | 9 | 5 | 18 |
| Grinding structure subtype C | 3 | 2 | 5 |
| Packing structure | 1 | 2 | 0 |
| Total | 100 | 100 | 100 |


Table 3. List of structure types ranked in terms of their weighted percentage (W) and classified as relevant or not relevant.

| CMQ relevancy | Structure type | W (%) | Cumulative W (%) |
|---|---|---|---|
| Relevant | Storage structure subtype A | 37 | 37 |
| Relevant | Storage structure subtype B | 19 | 56 |
| Relevant | Tall-frame structures (subtypes A-L) | 15 | 71 |
| Relevant | Storage structure subtype C | 9 | 80 |
| Not relevant | Grinding structure subtype A | 8 | 88 |
| Not relevant | Grinding structure subtype B | 8 | 96 |
| Not relevant | Grinding structure subtype C | 3 | 99 |
| Not relevant | Packing structure | 1 | 100 |

CMQ: construction material quantity.

Table 4. List of combined structure types ranked in terms of their weighted percentage (W) and classified as relevant or not relevant.

| CMQ relevancy | Structure type | W (%) | Cumulative W (%) |
|---|---|---|---|
| Relevant | Storage structure (subtypes A-C) | 65 | 65 |
| Relevant | Tall-frame structures (subtypes A-L) | 15 | 80 |
| Not relevant | Grinding structure subtype A | 8 | 88 |
| Not relevant | Grinding structure subtype B | 8 | 96 |
| Not relevant | Grinding structure subtype C | 3 | 99 |
| Not relevant | Packing structure | 1 | 100 |

CMQ: construction material quantity.

The same calculation was made for all of the structures shown in Table 2, and the results are summarized in Table 3.
Once the weighted percentage was calculated, the structures were ranked to determine the CMQ-relevant structures, using
a threshold of 80%,^10 meaning that the structure types contributing to a cumulative weighted percentage of at least 80% of
the CMQs being evaluated were classified as CMQ-relevant (Table 3).
Due to their similarity, storage structure subtypes A-C were further combined to reduce the number of models to be
developed, using their subtype as a categorical variable (Table 4).
The models were developed for the relevant structures shown in Table 4 [i.e. storage structures (subtypes A-C) and
tall-frame structures (subtypes A-L)], which account for 80% of the weighted CMQs considered. The not relevant structures
are not directly estimated but rolled up into an aggregated amount when estimating the total CMQs for the typical project
being estimated.
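The ranking and cut-off described above can be sketched as follows. The function `classify_relevant` is a hypothetical name, and the rule "keep adding the highest-ranked structure types until the cumulative W reaches the threshold" is an interpretation of the 80% criterion; the weights are those in Table 4.

```python
def classify_relevant(weights: dict[str, float], threshold: float = 80.0) -> list[str]:
    """Rank structure types by W (%) and keep them until the cumulative
    weighted percentage reaches the threshold."""
    relevant, cumulative = [], 0.0
    for name, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        if cumulative >= threshold:
            break
        relevant.append(name)
        cumulative += w
    return relevant


# Weighted percentages (W, %) from Table 4.
table_4 = {"Storage (A-C)": 65, "Tall-frame (A-L)": 15, "Grinding A": 8,
           "Grinding B": 8, "Grinding C": 3, "Packing": 1}
relevant = classify_relevant(table_4)
print(relevant)
```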
Step 3: develop and evaluate CMQ estimation models
This step consists of different sub-steps and processes to develop regression and NN models that are then evaluated and
selected based on their AIC values. The sub-steps and processes for this step are explained in detail in García de Soto et al.
(2014). When implementing this step for the estimation of CMQs for storage structure subtypes A-C, the lowest AICc values for the regression models were 592.9 and 461.3 for concrete and reinforcement, respectively, and for the NN models they
were 613.0 and 474.2 for concrete and reinforcement, respectively. In this case, the regression models performed better
than the NN models did. The selected models are shown in Equation (12) (concrete-CO) and Equation (13) (reinforcement-RE). The key parameters for those structures and CMQs are the capacity (Cap), diameter (Diam), height and ground
acceleration (S1). The storage subtypes (A and B) were used as categorical variables and subtype C was used as the reference subtype.
CO D Cap0:426 Diam0:773 Height0:102 S1 0:573 e1:82 C TypeA 0:176 C TypeB 0:868

(12)

RE D Cap0:390 Diam0:647 Height0:135 S1 0:695 e0:539 C TypeB 0:667

(13)


Table 5. Summary of AICc for regression and NN models for concrete, reinforcement and structural steel.
Model ID


Upper structure

CO-2
CO-NN

RE-2

RE-NN

ST-2

ST-NN

CO-3

CO-NN

RE-3

RE-NN


Foundation

AICc
9.69EC03
1.04EC04
1.79EC04
1.89EC04
1.09EC04
1.11EC04
4.80EC01
2.89EC02
1.50EC03
1.98EC03

TFS_A-L/US/BETTFS_A-L/US/
TFS_A-L/FND/BET
TFS_A-L/FND/





A similar process was followed for the tall-frame structures. The models to estimate the amount of concrete, reinforcement and structural steel for the upper structure and the amount of concrete and reinforcement for the foundation of tall-frame structures were selected based on their AIC values. The AICc for the regression and NN models
are summarized in Table 5. The AICc for the NN models is lower than that for the regression models; hence, the
models selected for the estimation of concrete, reinforcement and structural steel for the upper structure (US) of
tall-frame structures (TFS) are TFS_A-L/US/BET-CO-NN, TFS_A-L/US/BET-RE-NN and TFS_A-L/US/BET-ST-NN. The
ones for the concrete and reinforcement for the foundation (FND) of tall-frame structures are TFS_A-L/FND/BET-CO-NN and TFS_A-L/FND/BET-RE-NN.
The developed regression models to estimate the amount of concrete, reinforcement and structural steel for the
upper structure are shown in Equations (14), (15) and (16), respectively. The regression models to estimate the
amount of concrete and reinforcement for the foundation of tall-frame structures are shown in Equations (17) and
(18), respectively.
CO D Height1:36 Wind 0:323 e1:26 C HB 0:183 C Stage4 0:150 C Stage5 0:074 C String1 0:523

(14)

RE D Height1:46 Wind 0:302 e 0:822 C HB 0:353 C Stage4 0:143 C Stage5 0:0810 C String1 0:503

(15)

ST D Height 0:0930 Wind 0:0230 e6:88 C Stage4 0:160 C Stage5 0:0180 C String1 0:441

(16)

CO D S1 0:104 SoilBC 0:397 e9:88 C Stage4 0:303 C Stage5 0:121 C String1 0:473

(17)

RE D S1 0:112 SoilBC 0:399 e6:90 C Stage4 0:300 C Stage5 0:121 C String1 0:472

(18)

The values of the weights between the input and the hidden layer and between the hidden layer and the output for the NN models
are given in Tables 6 and 7 for the CMQs of the upper structure and in Table 8 for the CMQs of the foundation of the different tall-frame structure subtypes.
Estimation phase
Step 4.1: obtain values of input parameters
The required parameters were derived from the selected models (e.g. Equations 12 and 13). For this example, the values of
the input parameters for the target structures are summarized in Table 9.
Step 4.2: normalize parameters
The required transformation and normalization were done for all parameters of all existing structures. The maximum and
minimum values required to normalize each parameter are shown in Table 10.


Table 6. Summary of estimated weights for a neural network (NN) model using five inputs and three neurons for concrete and reinforcement for the upper structure of tall-frame structures.
Concrete (upper structure)
Input/hidden layer (WA)
Input
Input layer

H2

H1

0.082 0.680
I1: Height_m
I2: Wind_sp
0.172
0.238
I3: HB
0.003
0.202
I4: Stages
0.011 0.127
I5: Strings
0.473 0.102
Bias 1
0.195
0.059

H3

Output layer

Hidden layer (WA)

Concrete (m )

0.371
0.680
0.002
0.137
0.336
0.606

H1
0.240
0.071
0.110
0.039
0.444
0.156

Hidden/output layer (WB) H1


H2
H3
Bias 2


Reinforcement (upper structure)

H2

H3

Output layer
Reinforce-ment (t)

0.448
0.430
0.142 0.176
0.049
0.330
0.770
0.035
0.929 0.536
0.240 0.478
1.152
0.508
0.420
0.673

0.829
0.586
0.423
0.156

Table 7. Summary of estimated weights for a neural network (NN) model using four inputs and three neurons for structural steel for the
upper structure of tall-frame structures.
Structural steel (upper structure)
Input/hidden layer (WA)
Input
Input layer

I1: Height_m
I2: Wind_sp
I3: Stages
I4: Strings
Bias 1

Hidden/output layer (WB)

H1
H2
H3
Bias 2

Output layer

H1

H2

H3

0.933
0.182
0.924
0.187
1.084

0.405
0.148
0.010
1.137
0.926

0.106
0.102
0.354
0.343
0.049

Structural steel (t)

0.943
1.153
0.524
0.009

Table 8. Summary of estimated weights for a neural network (NN) model using four inputs and three neurons for concrete and reinforcement for the foundation of tall-frame structures.
Concrete (foundation)
Input/hidden layer (WA)
Input
Input layer

H1

H2

H3

Reinforcement (foundation)

Output layer
Concrete (m3) H1

0.008
0.003 0.054
I1: S1
I2: Soil_BC 0.054 2.885 0.637
I3: Stages
0.394
0.486 0.013
I4: Strings
0.983
0.496 0.204
Bias 1
1.539 1.899
0.678

Hidden/output layer (WB) H1


H2
H3
Bias 2

Hidden layer (WA)


H2

H3

Output layer
Reinforce-ment (t)

0.004
0.015
0.035
0.622 0.385 2.177
0.478
1.828 0.234
1.306 1.483 0.478
0.511 0.752 1.469
1.174
0.811
0.430
0.594

1.639
1.083
0.939
0.072


Table 9. Input variables for target structures.

| Struct. ID | Subtype | Capacity (t) | Dia. (m) | Height (m) | Ground acc. (S1xg) | Concrete (m3) | Reinf. (t) |
|---|---|---|---|---|---|---|---|
| T1 | A | 10,000 | 18 | 28 | 0.03 | 3221 | 608 |
| T2 | A | 10,700 | 24 | 58 | 0.09 | 4842 | 818 |
| T3 | A | 14,000 | 20 | 60 | 0.33 | 5876 | 1158 |
| T4 | A | 14,800 | 20 | 55 | 0.15 | 5077 | 998 |
| T5 | B | 59,000 | 33 | 30 | 0.13 | 6314 | 1061 |
| T6 | B | 60,000 | 45 | 45 | 0.05 | 8240 | 1408 |
| T7 | B | 75,000 | 48 | 48 | 0.24 | 11,235 | 1629 |
| T8 | B | 107,900 | 28 | 35 | 0.02 | 6871 | 1120 |
| T9 | C | 8000 | 18 | 40 | 0.07 | 4028 | 649 |
| T10 | C | 13,300 | 19 | 50 | 0.16 | 5685 | 893 |
| T11 | C | 15,500 | 22 | 55 | 0.07 | 6494 | 981 |
| T12 | C | 21,800 | 22 | 55 | 0.01 | 6617 | 1125 |

Capacity, diameter, height and ground acceleration are parameters; concrete and reinforcement are CMQs.
CMQs: construction material quantities.

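Table 10. Maximum and minimum values of parameters for different storage structures.

| Parameter | | Storage subtype A | Storage subtype B | Storage subtype C |
|---|---|---|---|---|
| Cap (t) | Max | 26,000 | 111,900 | 22,800 |
| Cap (t) | Min | 7500 | 22,500 | 7000 |
| Diameter (m) | Max | 24 | 51 | 22 |
| Diameter (m) | Min | 18 | 28 | 15 |
| Height (m) | Max | 70 | 50 | 75 |
| Height (m) | Min | 25 | 20 | 30 |
| Ground acc. (S1xg) | Max | 1.33 | 1.28 | 1.24 |
| Ground acc. (S1xg) | Min | 1.01 | 1.01 | 1.01 |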

Using the capacity of structure ID T1 (10,000 tons) as an example, the transformed and normalized capacity was calculated as follows (Equation 5):
$$
\frac{\ln 10000 - \ln 7500}{\ln 26000 - \ln 7500} = \frac{9.21 - 8.92}{10.17 - 8.92} = 0.23
$$
The same calculation was done for all of the parameters in the remaining target storage structure subtypes. The results
of the normalization for each storage structure subtype are shown in Table 11.
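The transformation and normalization can be checked with a small sketch. The function `ln_normalize` is a hypothetical name and its form is inferred from the worked calculation above; the limits are those of storage subtype A in Table 10. Returning 0 for a constant parameter is an assumption of this sketch.

```python
import math


def ln_normalize(value: float, minimum: float, maximum: float) -> float:
    """Log-transform a parameter and scale it to [0, 1] using the minimum
    and maximum of the existing structures of the same subtype."""
    if maximum == minimum:
        return 0.0  # assumption: a constant parameter carries no information
    return (math.log(value) - math.log(minimum)) / (math.log(maximum) - math.log(minimum))


# Target T1 (storage subtype A): capacity and height, limits from Table 10.
cap_t1 = ln_normalize(10_000, 7_500, 26_000)
height_t1 = ln_normalize(28, 25, 70)
print(round(cap_t1, 2), round(height_t1, 2))
```

Both values agree with the entries for T1 in Table 11.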
Step 4.3: enter the similarity threshold
The similarity threshold (i.e. the minimum matching requirement when selecting existing structures) was set to 70%, a
value reported to give accurate CMQ estimates (García de Soto 2014).
Step 4.4: calculate the distance between the target and existing structures
Equation (6) was used to calculate the distance between the target and existing structures. From step 3, the best-performing
regression models for the estimation of concrete and reinforcement for storage structures were STGS_A-C/BET-CO-4^11
and STGS_A-C/BET-RE-5,^12 respectively. The unstandardized regression coefficients were adjusted using Equation (8) to


Table 11. Transformed and normalized parameters for each storage structure.

| Structure ID | Subtype | Capacity | Diam. | Height | Ground acc. |
|---|---|---|---|---|---|
| T1 | A | 0.23 | 0.00 | 0.11 | 0.05 |
| T2 | A | 0.29 | 1.00 | 0.82 | 0.27 |
| T3 | A | 0.50 | 0.37 | 0.85 | 1.00 |
| T4 | A | 0.55 | 0.37 | 0.77 | 0.48 |
| T5 | B | 0.60 | 0.27 | 0.44 | 0.46 |
| T6 | B | 0.61 | 0.79 | 0.89 | 0.16 |
| T7 | B | 0.75 | 0.90 | 0.96 | 0.87 |
| T8 | B | 0.98 | 0.00 | 0.61 | 0.04 |
| T9 | C | 0.11 | 0.48 | 0.31 | 0.28 |
| T10 | C | 0.54 | 0.62 | 0.56 | 0.68 |
| T11 | C | 0.67 | 1.00 | 0.66 | 0.28 |
| T12 | C | 0.96 | 1.00 | 0.66 | 0.00 |

All parameters are transformed (ln) and normalized (0-1).

account for the normalization (i.e. scaling between 0 and 1) of the parameters. As an example, the unstandardized coefficient from STGS_A-C/BET-CO-4 for capacity (0.43) was adjusted for storage structure subtype A as follows:

$$
b_{adj} \times \sum b\,(\ln x^{max} - \ln x^{min}) = |b_i|\,(\ln X_i^{max} - \ln X_i^{min}) = |0.43|\,(\ln 26000 - \ln 7500) = 0.426\,(10.17 - 8.92) = 0.53
$$

For simplicity, the denominator of Equation (8) was excluded at this point and included as a constant when calculating
the distance. The original and adjusted unstandardized coefficients for the different storage structure subtypes are shown
in Table 12.
The denominator of Equation (8) [Σ b(ln(xmax) - ln(xmin))] for the different storage structures and models is shown in
Table 13. The other values required in the estimation of the preliminary distance, i.e. the numerator, are given in Table 14.
For example, ID T1 was compared with existing storage structure ID 1, which has the following parameters: capacity of
7500 tons, diameter of 18 m, height of 25 m, wind speed of 42 m/s, ground acceleration of 1.07 (S1xg), soil bearing
capacity of 45 tons/m2 and soil factor of 1.35. The resulting preliminary distance was 0.16 (before applying the denominator of
Equation 8; Table 13). The distances are summarized in Table 15.
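The adjustment of the coefficients and the distance between T1 and existing structure ID 1 can be retraced with the sketch below. The variable names are hypothetical; the coefficient magnitudes are taken from Equation (12), the limits from Table 10 (subtype A) and the parameter differences from Table 14. Only the four parameters of the concrete model are included, since the remaining columns of Table 14 have zero coefficients.

```python
import math

# (|b|, min, max) for the concrete model and storage subtype A.
params = {"cap": (0.426, 7_500, 26_000), "diam": (0.773, 18, 24),
          "height": (0.102, 25, 70), "s1": (0.573, 1.01, 1.33)}

# Numerator of Equation (8) for each parameter, and their sum (denominator).
b_adj = {k: abs(b) * (math.log(hi) - math.log(lo)) for k, (b, lo, hi) in params.items()}
denominator = sum(b_adj.values())

# Absolute differences between T1 and existing structure ID 1 (Table 14).
diff = {"cap": 0.23, "diam": 0.00, "height": 0.11, "s1": 0.16}
preliminary = sum(diff[k] * b_adj[k] for k in diff)  # before the denominator
distance = preliminary / denominator                 # scaled distance
print(round(b_adj["cap"], 2), round(denominator, 2), round(distance, 2))
```

The rounded results match the capacity coefficient in Table 12, the subtype A denominator in Table 13 and the distance for structure ID 1 in Table 15.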
Step 4.5: calculate the similarity between the target and existing structures
The similarity between the target and the existing structures was calculated using Equation (9). The similarities between
the target storage structure (ID T1) and the existing storage structures are shown in Table 16.
Table 12. Unstandardized and adjusted unstandardized coefficients for selected regression models to estimate the amount of concrete
and reinforcement in storage structures.

| Model (STGS_A-C/BET-) | Parameter | Unstandardized coefficients | Adjusted unstandardized coefficients (storage structure A) | Adjusted unstandardized coefficients (storage structure B) | Adjusted unstandardized coefficients (storage structure C) |
|---|---|---|---|---|---|
| CO-4 | LN_Cap | 0.43 | 0.53 | 0.68 | 0.50 |
| CO-4 | LN_Diam | 0.77 | 0.22 | 0.46 | 0.30 |
| CO-4 | LN_Height | 0.10 | 0.11 | 0.09 | 0.09 |
| CO-4 | LN_S1 | 0.57 | 0.16 | 0.14 | 0.12 |
| RE-5 | LN_Cap | 0.40 | 0.48 | 0.63 | 0.46 |
| RE-5 | LN_Diam | 0.65 | 0.19 | 0.39 | 0.25 |
| RE-5 | LN_Height | 0.14 | 0.14 | 0.12 | 0.12 |
| RE-5 | LN_S1 | 0.70 | 0.19 | 0.17 | 0.14 |


Table 13. Denominator of Equation (8), Σ b(ln(xmax) - ln(xmin)), for the different storage structures.

| Model (STGS_A-C/BET-) | Storage structure A | Storage structure B | Storage structure C |
|---|---|---|---|
| CO-4 | 1.01 | 1.38 | 1.01 |
| RE-5 | 1.00 | 1.31 | 0.97 |

Table 14. Values to be used in the calculation of the preliminary distance, between target structure (ID T1) and existing structure
(ID 1).

| Item | Cap | Diam. | Height | Wind sp. | Ground acc. | Soil BC | Soil factor | Σ |
|---|---|---|---|---|---|---|---|---|
| Target structure (ID T1) (xo) | 0.23 | 0.00 | 0.11 | 0.00 | 0.05 | 1.00 | 0.00 | N/A |
| Existing structure (ID 1) (xj) | 0.00 | 0.00 | 0.00 | 0.88 | 0.21 | 0.71 | 0.64 | N/A |
| Difference \|xoi - xji\| | 0.23 | 0.00 | 0.11 | 0.88 | 0.16 | 0.29 | 0.64 | N/A |
| Adjusted unstandardized coefficients (badj × Σ b(xmax - xmin)) | 0.53 | 0.22 | 0.11 | 0.00 | 0.16 | 0.00 | 0.00 | 1.02 |
| Preliminary distance (\|xoi - xji\| × (badj × Σ b(xmax - xmin))) | 0.12 | 0.00 | 0.01 | 0.00 | 0.02 | 0.00 | 0.00 | 0.16 |

Parameters are transformed (ln) and normalized (0-1).

Table 15. Distances between target structure (ID T1) and existing storage structures (subtype A).

| Structure ID | Cap | Diam. | Height | Wind sp. | Ground acc. | Soil BC | Soil factor | Preliminary distance (\|xoi - xji\| × (badj × Σ b(xmax - xmin))) | Distance (\|xoi - xji\| × badj) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.12 | 0.00 | 0.01 | 0.00 | 0.02 | 0.00 | 0.00 | 0.16 | 0.16 |
| 2 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.00 | 0.00 | 0.01 | 0.01 |
| 3 | 0.00 | 0.00 | 0.01 | 0.00 | 0.01 | 0.00 | 0.00 | 0.02 | 0.02 |
| 4 | 0.08 | 0.00 | 0.04 | 0.00 | 0.01 | 0.00 | 0.00 | 0.12 | 0.12 |
| 5 | 0.06 | 0.08 | 0.01 | 0.00 | 0.13 | 0.00 | 0.00 | 0.28 | 0.27 |
| 6 | 0.00 | 0.16 | 0.08 | 0.00 | 0.10 | 0.00 | 0.00 | 0.33 | 0.32 |
| 7 | 0.27 | 0.16 | 0.09 | 0.00 | 0.01 | 0.00 | 0.00 | 0.52 | 0.52 |
| 8 | 0.13 | 0.04 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 | 0.22 | 0.22 |
| 9 | 0.11 | 0.04 | 0.06 | 0.00 | 0.09 | 0.00 | 0.00 | 0.30 | 0.30 |
| 10 | 0.40 | 0.00 | 0.06 | 0.00 | 0.11 | 0.00 | 0.00 | 0.57 | 0.56 |
| 11 | 0.27 | 0.16 | 0.08 | 0.00 | 0.05 | 0.00 | 0.00 | 0.55 | 0.54 |
| 12 | 0.16 | 0.22 | 0.07 | 0.00 | 0.11 | 0.00 | 0.00 | 0.57 | 0.56 |
| 13 | 0.20 | 0.08 | 0.09 | 0.00 | 0.11 | 0.00 | 0.00 | 0.48 | 0.47 |
| 14 | 0.41 | 0.22 | 0.08 | 0.00 | 0.10 | 0.00 | 0.00 | 0.81 | 0.80 |

Σ b(xmax - xmin) (storage structure A) = 1.01 for all rows.

Step 4.6: select similar structures


The similar structures were selected using Equation (10) and the recommended similarity threshold of 70%. The existing
structures with a similarity greater than or equal to the similarity threshold of the target structure (ID T1) are shown in
Table 17.
Step 4.7: estimate CMQs for CMQ-relevant structures
The CMQs for the CMQ-relevant structures were estimated as described in step 6 of Figure 6. The results for the storage
structures are summarized in Table 18.



Table 16. Similarity between target structure (ID T1) and existing storage structures (subtype A).

| Structure ID | Dist (\|xoi - xji\| × badj) | Sim ((1 - Dist(Xo,Xj)) × 100) |
|---|---|---|
| 1 | 0.16 | 84.36 |
| 2 | 0.01 | 98.75 |
| 3 | 0.02 | 98.47 |
| 4 | 0.12 | 88.21 |
| 5 | 0.27 | 72.86 |
| 6 | 0.32 | 67.61 |
| 7 | 0.52 | 48.48 |
| 8 | 0.22 | 78.46 |
| 9 | 0.30 | 70.49 |
| 10 | 0.56 | 43.86 |
| 11 | 0.54 | 45.66 |
| 12 | 0.56 | 43.82 |
| 13 | 0.47 | 53.03 |
| 14 | 0.80 | 20.47 |

Table 17. Existing storage structures (subtype A) with Sim ≥ SimThres for target structure (ID T1).

| Structure ID | Distance (\|xoi - xji\| × badj) | Similarity ((1 - Dist(Xo,Xj)) × 100) |
|---|---|---|
| 2 | 0.01 | 98.75 |
| 3 | 0.02 | 98.47 |
| 4 | 0.12 | 88.21 |
| 1 | 0.16 | 84.36 |
| 8 | 0.22 | 78.46 |
| 5 | 0.27 | 72.86 |
| 9 | 0.30 | 70.49 |

Table 18. Estimated construction material quantities (CMQs) for target storage structures (upper structure
and foundation).

| Structure ID | Storage subtype | Concrete (m3) | Reinforcement (t) |
|---|---|---|---|
| T1 | A | 3519 | 649 |
| T2 | A | 5020 | 929 |
| T3 | A | 5562 | 1065 |
| T4 | A | 5175 | 978 |
| T5 | B | 6383 | 1065 |
| T6 | B | 8213 | 1333 |
| T7 | B | 10,453 | 1706 |
| T8 | B | 6931 | 1149 |
| T9 | C | 4027 | 657 |
| T10 | C | 5612 | 897 |
| T11 | C | 6459 | 1006 |
| T12 | C | 7244 | 1115 |

The estimation of CMQs for storage structure ID T1 fell into the adaptation category because there were existing
structures with a similarity to the target structure exceeding the minimum requirement (i.e. 70% similarity threshold).
The results of the first five steps in the adaptation sub-process applied to the structures shown in Table 17, using the estimation of the amount of concrete as an example, are shown in Table 19. The results of the sixth step are shown in
Table 20.


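Table 19. Results from steps 1 through 5 of the proposed adaptation process applied to storage structure ID T1. Steps 1 to 5 are given as concrete quantities (m3); negative signs follow from the definitions in Figure 6 (e.g. Step 5 = Step 4 - Step 2).

| Structure ID | % Error from RA model | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 |
|---|---|---|---|---|---|---|
| 2 | 4.51 | 3358 | 3509 | 3505 | 3663 | 153 |
| 3 | 1.99 | 3431 | 3500 | 3505 | 3574 | 75 |
| 4 | -14.70 | 4579 | 3906 | 3505 | 2989 | -916 |
| 1 | -11.64 | 3555 | 3141 | 3505 | 3097 | -45 |
| 8 | 1.18 | 4310 | 4360 | 3505 | 3546 | -815 |
| 5 | 19.39 | 3866 | 4616 | 3505 | 4184 | -432 |
| 9 | 0.76 | 4692 | 4728 | 3505 | 3531 | -1197 |

RA: regression analysis.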

Table 20. Intermediary results for step 6 of the adaptation sub-process applied to storage structure ID T1.

| Structure ID | % Error from model | Step 1 | Step 5 | Step 1 + Step 5 | Sim (%) | (Step 1 + Step 5) × Sim |
|---|---|---|---|---|---|---|
| 2 | 4.51 | 3358 | 153 | 3511 | 98.75 | 346,737 |
| 3 | 1.99 | 3431 | 75 | 3506 | 98.47 | 345,250 |
| 4 | -14.70 | 4579 | -916 | 3662 | 88.21 | 323,062 |
| 1 | -11.64 | 3555 | -45 | 3510 | 84.36 | 296,126 |
| 8 | 1.18 | 4310 | -815 | 3495 | 78.46 | 274,228 |
| 5 | 19.39 | 3866 | -432 | 3434 | 72.86 | 250,231 |
| 9 | 0.76 | 4692 | -1197 | 3495 | 70.49 | 246,397 |
| Sum | | | | | 591.61 | 2,082,031 |

After step 6, the estimated amount of concrete for the target structure (ID T1) was calculated as shown below, resulting
in 3519 m3 of concrete (for the upper structure and foundation).

$$
CMQ_{jadj} = \frac{\sum_{i=1}^{n} (CMQ_{oi} + \Delta_i)\, Sim_{oij}}{\sum_{i=1}^{n} Sim_{oij}} = \frac{2{,}082{,}031}{591.61} = 3{,}519
$$
A similar approach was used for the tall-frame structures. The results are summarized in Table 21 (upper structure) and
Table 22 (foundation).
Table 21. Estimated construction material quantities (CMQs) for target tall-frame structures (upper structure).

| Structure ID | Tall-frame subtype | Concrete (m3) | Reinforcement (t) | Struct. steel (t) |
|---|---|---|---|---|
| T13 | A | 3810 | 706 | N/A |
| T14 | A | 3886 | 717 | N/A |
| T15 | A | 4066 | 757 | N/A |
| T16 | A | 3404 | 641 | N/A |
| T17 | A | 3063 | 573 | N/A |
| T18 | B | 6389 | 1181 | N/A |
| T19 | B | 7271 | 1327 | N/A |
| T20 | B | 6556 | 1181 | N/A |
| T21 | B | 6394 | 1165 | N/A |
| T22 | B | 6108 | 1098 | N/A |
| T23 | C | 4344 | 788 | N/A |
| T24 | C | 3929 | 733 | N/A |
| T25 | C | 4073 | 750 | N/A |
| T26 | C | 4753 | 870 | N/A |
| T27 | C | 4436 | 818 | N/A |
| T28 | D | 6159 | 1121 | N/A |
| T29 | D | 7431 | 1333 | N/A |
| T30 | D | 6701 | 1228 | N/A |
| T31 | D | 7504 | 1348 | N/A |
| T32 | D | 6844 | 1231 | N/A |
| T33 | E | 4734 | 887 | N/A |
| T34 | E | 4862 | 899 | N/A |
| T35 | E | 4723 | 897 | N/A |
| T36 | E | 5265 | 987 | N/A |
| T37 | E | 5383 | 1008 | N/A |
| T38 | F | 13,208 | 2461 | N/A |
| T39 | F | 10,468 | 1967 | N/A |
| T40 | F | 12,734 | 2373 | N/A |
| T41 | F | 11,410 | 2114 | N/A |
| T42 | F | 10,465 | 1979 | N/A |
| T43 | G | 3314 | 520 | 381 |
| T44 | G | 3032 | 471 | 385 |
| T45 | G | 2820 | 447 | 376 |
| T46 | G | 2768 | 437 | 378 |
| T47 | G | 2857 | 456 | 374 |
| T48 | H | 4940 | 764 | 558 |
| T49 | H | 4557 | 696 | 596 |
| T50 | H | 5098 | 776 | 598 |
| T51 | H | 5134 | 782 | 597 |
| T52 | H | 5268 | 807 | 593 |
| T53 | I | 3202 | 500 | 450 |
| T54 | I | 4219 | 659 | 451 |
| T55 | I | 4087 | 641 | 449 |
| T56 | I | 3211 | 502 | 462 |
| T57 | I | 3997 | 618 | 455 |
| T58 | J | 5645 | 864 | 700 |
| T59 | J | 6123 | 938 | 698 |
| T60 | J | 6042 | 933 | 694 |
| T61 | J | 5404 | 835 | 694 |
| T62 | J | 6030 | 917 | 704 |
| T63 | K | 3760 | 589 | 444 |
| T64 | K | 4487 | 714 | 439 |
| T65 | K | 3892 | 614 | 441 |
| T66 | K | 4206 | 658 | 445 |
| T67 | K | 4821 | 764 | 440 |
| T68 | L | 11,082 | 1742 | 678 |
| T69 | L | 9464 | 1516 | 676 |
| T70 | L | 11,358 | 1793 | 675 |
| T71 | L | 9393 | 1551 | 681 |
| T72 | L | 8363 | 1325 | 667 |


Table 22. Estimated construction material quantities (CMQs) for target tall-frame structures (foundation).

| Structure ID | Tall-frame subtype | Concrete (m3) | Reinforcement (t) |
|---|---|---|---|
| T73 | A/G | 3105 | 158 |
| T74 | A/G | 2952 | 150 |
| T75 | A/G | 2114 | 108 |
| T76 | A/G | 1658 | 84 |
| T77 | A/G | 1692 | 86 |
| T78 | B/H | 2806 | 141 |
| T79 | B/H | 3474 | 177 |
| T80 | B/H | 2693 | 134 |
| T81 | B/H | 4520 | 232 |
| T82 | B/H | 4523 | 232 |
| T83 | C/I | 2500 | 126 |
| T84 | C/I | 2502 | 126 |
| T85 | C/I | 2085 | 106 |
| T86 | C/I | 2031 | 103 |
| T87 | C/I | 2549 | 129 |
| T88 | D/J | 4227 | 214 |
| T89 | D/J | 4345 | 221 |
| T90 | D/J | 3281 | 167 |
| T91 | D/J | 5178 | 246 |
| T92 | D/J | 5191 | 250 |
| T93 | E/K | 2255 | 114 |
| T94 | E/K | 2986 | 153 |
| T95 | E/K | 2253 | 114 |
| T96 | E/K | 2267 | 115 |
| T97 | E/K | 3089 | 159 |
| T98 | F/L | 3509 | 177 |
| T99 | F/L | 3567 | 180 |
| T100 | F/L | 4697 | 239 |
| T101 | F/L | 4642 | 236 |
| T102 | F/L | 4503 | 229 |

Step 4.8: estimate total CMQs based on estimated CMQs for CMQ-relevant structures
The structures and related CMQs for the Greenfield plants are shown in Table 23.
The estimated CMQs for the CMQ-relevant structures for the different Greenfield plant projects are summarized in
Table 24.
Table 23. Structures in Greenfield plants and actual construction material quantities (CMQs).

| Plant ID | Structure type/ID | Concrete (m3) | Reinforcement (t) | Structural steel (t) |
|---|---|---|---|---|
| Alpha | STGS A/T3 | 5876 | 1158 | NA |
| Alpha | STGS A/T4 | 5077 | 998 | NA |
| Alpha | STGS B/T7 | 11,235 | 1629 | NA |
| Alpha | STGS B/T8 | 6871 | 1120 | NA |
| Alpha | STGS C/T11 | 6494 | 981 | NA |
| Alpha | STGS C/T12 | 6617 | 1125 | NA |
| Alpha | TFS J/T58 | 9191 | 1053 | 691 |
| Alpha | TFS J/T58 | 9191 | 1053 | 691 |
| Alpha | GRIND A | 6783 | 538 | 400 |
| Alpha | GRIND B | 8227 | 468 | 359 |
| Alpha | GRIND C | 2399 | 188 | 119 |
| Alpha | PACK | 1112 | 169 | NA |
| Alpha | TOTAL | 79,074 | 10,480 | 2260 |
| Beta | STGS A/T1 | 3221 | 608 | NA |
| Beta | STGS B/T5 | 6314 | 1061 | NA |
| Beta | STGS C/T9 | 4028 | 649 | NA |
| Beta | TFS I/T53 | 5613 | 617 | 465 |
| Beta | GRIND A | 2148 | 173 | 134 |
| Beta | GRIND B | 2605 | 151 | 121 |
| Beta | GRIND C | 760 | 60 | 40 |
| Beta | PACK | 352 | 54 | NA |
| Beta | TOTAL | 25,042 | 3374 | 760 |
| Gamma | STGS A/T2 | 4842 | 818 | NA |
| Gamma | STGS B/T6 | 8240 | 1408 | NA |
| Gamma | STGS C/T10 | 5685 | 893 | NA |
| Gamma | TFS G/T43 | 4909 | 575 | 373 |
| Gamma | GRIND A | 2652 | 218 | 108 |
| Gamma | GRIND B | 3217 | 190 | 97 |
| Gamma | GRIND C | 938 | 76 | 32 |
| Gamma | PACK | 435 | 69 | NA |
| Gamma | TOTAL | 30,918 | 4246 | 610 |

STGS: storage structure; TFS: tall-frame structure; GRIND: grinding structure; PACK: packing structure.

Table 24. Relevant structures in Greenfield plants and estimated construction material quantities (CMQs).

| Plant ID | Structure type/ID | Concrete (m3) | Reinforcement (t) | Structural steel (t) |
|---|---|---|---|---|
| Alpha | STGS A/T3 | 5562 | 1065 | NA |
| Alpha | STGS A/T4 | 5175 | 978 | NA |
| Alpha | STGS B/T7 | 10,453 | 1706 | NA |
| Alpha | STGS B/T8 | 6931 | 1149 | NA |
| Alpha | STGS C/T11 | 6459 | 1006 | NA |
| Alpha | STGS C/T12 | 7244 | 1115 | NA |
| Alpha | TFS J/T58 | 9872 | 1078 | 700 |
| Alpha | TFS J/T58 | 9872 | 1088 | 700 |
| Alpha | TOTAL RELEVANT | 61,568 | 9185 | 1400 |
| Beta | STGS A/T1 | 3519 | 649 | NA |
| Beta | STGS B/T5 | 6383 | 1065 | NA |
| Beta | STGS C/T9 | 4027 | 657 | NA |
| Beta | TFS I/T53 | 5751 | 629 | 450 |
| Beta | TOTAL RELEVANT | 19,680 | 3000 | 450 |
| Gamma | STGS A/T2 | 5020 | 929 | NA |
| Gamma | STGS B/T6 | 8213 | 1333 | NA |
| Gamma | STGS C/T10 | 5612 | 897 | NA |
| Gamma | TFS G/T43 | 5006 | 606 | 381 |
| Gamma | TOTAL RELEVANT | 23,851 | 3765 | 381 |

STGS: storage structure; TFS: tall-frame structure; GRIND: grinding structure; PACK: packing structure.


Table 25. Relevant structures in Greenfield plants and estimated construction material quantities (CMQs).

| Plant ID | Estimated (relevant structures): Concrete (m3) | Reinf. (t) | Struct. steel (t) | Estimated (Equation 11): Concrete (m3) | Reinf. (t) | Struct. steel (t) |
|---|---|---|---|---|---|---|
| Alpha | 61,568 | 9185 | 1400 | 80,400 | 10,558 | 2289 |
| Beta | 19,680 | 3000 | 450 | 25,700 | 3448 | 736 |
| Gamma | 23,851 | 3765 | 381 | 31,146 | 4328 | 623 |

These CMQs were used to estimate the total amounts of concrete, reinforcement and structural steel for each Greenfield plant project using Equation (11). Using the amount of concrete from the Beta plant as an example, the estimated total
amount of concrete for the entire project was determined as follows:
$$
CMQ_{i,p} = \frac{\sum CMQ_{i,m}}{\sum P_{i,m}} = \frac{3{,}519 + 6{,}383 + 4{,}027 + 5{,}751}{0.34 + 0.22 + 0.12 + 0.09} = \frac{19{,}680}{0.77} = 25{,}700
$$

The same process was done for all the plants and CMQs. The results are summarized in Table 25.
Accuracy
The accuracy of the proposed methodology was investigated at the structure and project levels. At the structure level, this
was done by comparing the estimated quantities of the target structures (Tables 18, 21 and 22) with the actual quantities.
At the plant level, this was done by comparing the estimated total CMQs based on estimated CMQs for CMQ-relevant
structures with actual Greenfield plants (as shown in Table 23).
Structure level
The resulting errors (Equation 19) were checked against the accuracy ranges proposed by Christensen and Dysert (2011)
for Estimate Class 4. The methodology was evaluated in terms of the percentage errors from the estimated CMQs for individual structures with respect to the proposed accuracy ranges.^13 The actual and estimated CMQs for the target storage
structure subtypes A-C and tall-frame structures A-L (upper structure and foundation), as well as the percentage errors
(Equation 19), are shown in Table 26 and Figure 7. It can be seen that for all the structures, the errors ranged between
-7% and 9% for concrete, between -8% and 14% for reinforcement, and between -4% and 5% for structural steel. The
percentage of estimated CMQs for a given percentage error is shown in Table 27 and Figure 8.
$$
\%\,\mathrm{Error} = \frac{\mathrm{CMQ}_{est} - \mathrm{CMQ}_{act}}{\mathrm{CMQ}_{act}} \tag{19}
$$

Where:
CMQ_est: estimated quantity;
CMQ_act: actual quantity.
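As a small illustration of Equation (19) and the Class 4 check, the sketch below computes a signed percentage error and tests it against the average range quoted in Note 13. The function names are hypothetical, the factor of 100 is added here only to express the ratio in percent, and the sample quantities are the Beta plant values reported in Table 28.

```python
# Average accuracy range for Estimate Class 4 (percent), as quoted in Note 13.
CLASS4_LOW, CLASS4_HIGH = -23.0, 35.0


def percent_error(cmq_est, cmq_act):
    """Signed percentage error of an estimate relative to the actual quantity."""
    if cmq_act == 0:
        raise ValueError("actual quantity must be non-zero")
    return 100.0 * (cmq_est - cmq_act) / cmq_act


def within_class4(error_pct):
    """True when the error lies inside the average Class 4 range."""
    return CLASS4_LOW <= error_pct <= CLASS4_HIGH


# Beta plant: (estimated, actual) pairs taken from Table 28.
concrete_err = percent_error(25700, 25042)  # overestimate, positive error
steel_err = percent_error(736, 760)         # underestimate, negative error

for name, err in (("concrete", concrete_err), ("structural steel", steel_err)):
    print(f"{name}: {err:+.2f}% within Class 4 range: {within_class4(err)}")
```

A positive value means the quantity was overestimated and a negative value means it was underestimated.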
Table 26. Actual and estimated construction material quantities (CMQ) for target storages using the proposed methodology.

Structure                           Proposed methodology
ID     Type         Subtype   % Error (Concrete)   % Error (Reinf.)   % Error (Struct. Steel)
T1     Storage      A         9                    7                  N/A
T2     Storage      A         4                    14                 N/A
T3     Storage      A         5                    8                  N/A
T4     Storage      A         2                    2                  N/A
T5     Storage      B         1                    0                  N/A
T6     Storage      B         0                    5                  N/A
T7     Storage      B         7                    5                  N/A
T8     Storage      B         1                    3                  N/A
T9     Storage      C         0                    1                  N/A
T10    Storage      C         1                    0                  N/A
T11    Storage      C         1                    3                  N/A
T12    Storage      C         9                    1                  N/A
T13    Tall-frame   A         5                    0                  N/A
T14    Tall-frame   A         2                    4                  N/A
T15    Tall-frame   A         1                    3                  N/A
T16    Tall-frame   A         3                    0                  N/A
T17    Tall-frame   A         1                    3                  N/A
T18    Tall-frame   B         3                    3                  N/A
T19    Tall-frame   B         2                    1                  N/A
T20    Tall-frame   B         1                    2                  N/A
T21    Tall-frame   B         1                    3                  N/A
T22    Tall-frame   B         1                    3                  N/A
T23    Tall-frame   C         3                    2                  N/A
T24    Tall-frame   C         2                    0                  N/A
T25    Tall-frame   C         1                    2                  N/A
T26    Tall-frame   C         5                    2                  N/A
T27    Tall-frame   C         4                    1                  N/A
T28    Tall-frame   D         4                    0                  N/A
T29    Tall-frame   D         4                    1                  N/A
T30    Tall-frame   D         0                    4                  N/A
T31    Tall-frame   D         2                    3                  N/A
T32    Tall-frame   D         0                    4                  N/A
T33    Tall-frame   E         0                    2                  N/A
T34    Tall-frame   E         3                    1                  N/A
T35    Tall-frame   E         5                    5                  N/A
T36    Tall-frame   E         1                    2                  N/A
T37    Tall-frame   E         5                    3                  N/A
T38    Tall-frame   F         1                    1                  N/A
T39    Tall-frame   F         2                    1                  N/A
T40    Tall-frame   F         1                    5                  N/A
T41    Tall-frame   F         2                    2                  N/A
T42    Tall-frame   F         1                    4                  N/A
T43    Tall-frame   G         3                    1                  3
T44    Tall-frame   G         5                    3                  5
T45    Tall-frame   G         2                    0                  2
T46    Tall-frame   G         2                    1                  4
T47    Tall-frame   G         4                    1                  0
T48    Tall-frame   H         2                    4                  1
T49    Tall-frame   H         4                    3                  0
T50    Tall-frame   H         5                    4                  1
T51    Tall-frame   H         4                    1                  2
T52    Tall-frame   H         0                    5                  3
T53    Tall-frame   I         3                    2                  1
T54    Tall-frame   I         3                    2                  0
T55    Tall-frame   I         3                    1                  1
T56    Tall-frame   I         4                    5                  3
T57    Tall-frame   I         4                    4                  2
T58    Tall-frame   J         4                    3                  1
T59    Tall-frame   J         1                    5                  4
T60    Tall-frame   J         1                    3                  4
T61    Tall-frame   J         2                    2                  5
T62    Tall-frame   J         1                    2                  4
T63    Tall-frame   K         2                    1                  1
T64    Tall-frame   K         5                    3                  3
T65    Tall-frame   K         0                    4                  4
T66    Tall-frame   K         5                    3                  0
T67    Tall-frame   K         1                    1                  5
T68    Tall-frame   L         2                    0                  1
T69    Tall-frame   L         4                    4                  2
T70    Tall-frame   L         2                    5                  1
T71    Tall-frame   L         2                    2                  4
T72    Tall-frame   L         1                    4                  3
T73    Tall-frame   A/G       12                   17                 N/A
T74    Tall-frame   A/G       3                    17                 N/A
T75    Tall-frame   A/G       3                    3                  N/A
T76    Tall-frame   A/G       4                    4                  N/A
T77    Tall-frame   A/G       3                    5                  N/A
T78    Tall-frame   B/H       17                   3                  N/A
T79    Tall-frame   B/H       16                   2                  N/A
T80    Tall-frame   B/H       15                   12                 N/A
T81    Tall-frame   B/H       3                    10                 N/A
T82    Tall-frame   B/H       10                   3                  N/A
T83    Tall-frame   C/I       11                   7                  N/A
T84    Tall-frame   C/I       10                   7                  N/A
T85    Tall-frame   C/I       6                    5                  N/A
T86    Tall-frame   C/I       8                    9                  N/A
T87    Tall-frame   C/I       7                    15                 N/A
T88    Tall-frame   D/J       13                   5                  N/A
T89    Tall-frame   D/J       3                    13                 N/A
T90    Tall-frame   D/J       11                   12                 N/A
T91    Tall-frame   D/J       9                    13                 N/A
T92    Tall-frame   D/J       1                    4                  N/A
T93    Tall-frame   E/K       15                   7                  N/A
T94    Tall-frame   E/K       1                    8                  N/A
T95    Tall-frame   E/K       8                    13                 N/A
T96    Tall-frame   E/K       1                    4                  N/A
T97    Tall-frame   E/K       0                    10                 N/A
T98    Tall-frame   F/L       2                    13                 N/A
T99    Tall-frame   F/L       17                   14                 N/A
T100   Tall-frame   F/L       12                   7                  N/A
T101   Tall-frame   F/L       5                    12                 N/A
T102   Tall-frame   F/L       0                    3                  N/A


Figure 7. Distribution of errors for estimated construction material quantities (CMQs) during validation.

Table 27. Percentage of estimated construction material quantities (CMQs) for a given percentage error.

% Error      Percentage error (−)   Percentage error (+)   Percentage error (combined + and −)   Cumulative (%)
<5           48                     52                     72                                    72
5 to <10     11                     13                     17                                    89
10 to <20    6                      10                     11                                    100
20 to <30    0                      0                      0
30 to <40    0                      0                      0
40 to <50    0                      0                      0
>50          0                      0                      0

All of the estimates for the CMQs of the structures evaluated were within the specified average range of −23% to
+35% (Christensen & Dysert 2011) for Estimate Class 4. The proposed methodology provides a systematic way to
develop accurate CMQ estimates during an early project phase.

Figure 8. Percentage errors (positive, + and negative, −) for estimated construction material quantities (CMQs) during validation.


Table 28. Relevant structures in Greenfield plants and estimated construction material quantities (CMQs).

         Estimated (Equation 14)                           Actual (entire Greenfield plant)                  Percentage error
Plant    Concrete (m3)  Reinf. (t)  Struct. steel (t)      Concrete (m3)  Reinf. (t)  Struct. steel (t)      Concrete (%)  Reinf. (%)  Struct. steel (%)
Alpha    80,400         10,558      2289                   79,074         10,480      2260                   1.68          0.75        1.30
Beta     25,700         3448        736                    25,042         3374        760                    2.63          2.21        −3.23
Gamma    31,146         4328        623                    30,918         4246        610                    0.74          1.92        2.14
MAPE                                                                                                         1.68          1.63        2.22


Plant level
A similar approach was used for the accuracy at the plant level using the information from Table 23. The estimated total
CMQs for the entire plant were compared with the actual amounts. The results are summarized in Table 28.
It can be seen that for the three Greenfield plants evaluated, the errors ranged between 0.74% and 2.63% for concrete,
between 0.75% and 2.21% for reinforcement, and between −3.23% and 2.14% for structural steel.
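The MAPE row of Table 28 can be approximately reproduced from the estimated and actual plant totals. The sketch below is illustrative only: the helper name `mape` and the dictionary layout are assumptions, and because the tabulated quantities are rounded, the recomputed values can differ from the printed ones in the second decimal.

```python
def mape(estimated, actual):
    """Mean absolute percentage error over paired estimates and actuals."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("estimated and actual must be non-empty and equal in length")
    ratios = [abs(e - a) / a for e, a in zip(estimated, actual)]
    return 100.0 * sum(ratios) / len(ratios)


# (estimated, actual) totals for the Alpha, Beta and Gamma plants, from Table 28.
table28 = {
    "concrete": ([80400, 25700, 31146], [79074, 25042, 30918]),
    "reinforcement": ([10558, 3448, 4328], [10480, 3374, 4246]),
    "structural steel": ([2289, 736, 623], [2260, 760, 610]),
}

plant_mape = {material: mape(est, act) for material, (est, act) in table28.items()}
for material, value in plant_mape.items():
    print(f"{material}: MAPE = {value:.2f}%")
```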
Comparison
In this section, the estimates using the proposed methodology at the structure level are compared with those from
approaches that used only RA models, NN models and CBR. The same data from the earlier example were used. The
results were compared using the percentage errors (Equation 19) and the resulting mean absolute percentage errors
(MAPEs) (Equation 20).
$$
\mathrm{MAPE}\,\% = \frac{100}{n} \sum \frac{\left| \mathrm{CMQ}_{est} - \mathrm{CMQ}_{act} \right|}{\mathrm{CMQ}_{act}} \tag{20}
$$

where n is the number of structures. In addition, the statistical significance of the performance of the proposed methodology was tested using the MAPE and the following hypotheses:
Null hypothesis: There is no difference between the absolute errors from the proposed methodology and the other technique (H0: μ1 = μ2).
Alternative hypothesis: There is a difference between the absolute errors from the proposed methodology and the other technique (H1: μ1 ≠ μ2).

where μ1 and μ2 correspond to the MAPEs for the proposed methodology and the compared approaches, respectively, for
each CMQ. Since the same data set was used, a dependent t-test (also called a paired-sample t-test) was used. Because the
alternative hypothesis (H1: μ1 ≠ μ2) is non-directional, both sides of the distribution were tested (two-tailed test).
The null hypothesis was tested at α = 0.05, and it was rejected when the observed t value (t_obs) was greater than
the critical t value (t_crit) (i.e. the corresponding value for a given α and degrees of freedom in the Student's t-distribution;
Equation 21).

$$
t_{obs} > t_{crit} \;\Rightarrow\; \text{reject } H_0 \tag{21}
$$
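The decision rule in Equation (21) can be illustrated with a paired-sample t-test written with the Python standard library. This is a sketch under stated assumptions rather than the authors' procedure: the function name `paired_t` is hypothetical, the inputs are the rounded concrete errors for structures T1 to T12 from Table 32 (treated here as absolute errors), and the critical value 2.201 is the one listed in Table 33 for α = 0.05 (two-tailed) with 11 degrees of freedom. Because the inputs are rounded, the observed t values only approximate those in Table 33.

```python
import math
import statistics

T_CRIT = 2.201  # two-tailed critical value listed in Table 33


def paired_t(errors_a, errors_b):
    """Observed t statistic of a dependent (paired-sample) t-test."""
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two samples of equal length with at least 2 pairs")
    diffs = [b - a for a, b in zip(errors_a, errors_b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err


# Concrete errors (%) for T1-T12, from Table 32.
proposed = [9, 4, 5, 2, 1, 0, 7, 1, 0, 1, 1, 9]
others = {
    "RA": [38, 42, 25, 30, 52, 43, 23, 62, 24, 8, 2, 2],
    "NN": [5, 3, 0, 5, 6, 3, 4, 13, 3, 22, 24, 23],
    "CBR": [23, 12, 13, 0, 22, 5, 17, 3, 1, 2, 2, 2],
}

results = {}
for name, errors in others.items():
    t_obs = paired_t(proposed, errors)
    # Two-tailed test: compare the magnitude of t_obs with the critical value.
    results[name] = (t_obs, abs(t_obs) > T_CRIT)
    print(f"Proposed vs. {name}: t_obs = {t_obs:.3f}, reject H0: {results[name][1]}")
```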

Other approaches
The RA models were developed using linear regression and the BET without transforming the data set (Equation 22). The
coefficients for all the IVs used in the models were statistically different from zero at α = 0.05. The results are given in
Table 29.
$$
Y = \beta_0 + \sum_{i=1}^{n} \beta_i X_i + \sum_{j=1}^{m} \beta_{n+j} X_{n+j} + \varepsilon \tag{22}
$$
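To make Equation (22) concrete, the sketch below evaluates the fitted linear model for one structure. It is illustrative only: the function name `predict_cmq` is hypothetical, the coefficient values are copied as printed in Table 29 for the concrete model, treating `Type_B` as a 0/1 dummy variable is an assumption, and the input values describe a made-up storage structure rather than one from the data set. The error term is omitted because it is unknown for a new prediction.

```python
def predict_cmq(intercept, coefficients, inputs):
    """Evaluate Y = b0 + sum(b_k * X_k) for one structure."""
    missing = set(coefficients) - set(inputs)
    if missing:
        raise KeyError(f"missing input values for: {sorted(missing)}")
    return intercept + sum(coefficients[name] * inputs[name] for name in coefficients)


# Concrete model: coefficient values as printed in Table 29.
concrete_intercept = 1.05e3
concrete_coeffs = {
    "Cap_t": 5.06e-2,
    "Height_m": 4.25e1,
    "Diam_m": 1.48e2,
    "S1": 3.48e3,
    "Type_B": 2.29e3,  # assumed to be a 0/1 dummy for subtype B
}

# Hypothetical storage structure (values invented for illustration).
example_structure = {"Cap_t": 10000, "Height_m": 40, "Diam_m": 18, "S1": 0.2, "Type_B": 0}

concrete_m3 = predict_cmq(concrete_intercept, concrete_coeffs, example_structure)
print(f"Predicted concrete quantity: {concrete_m3:.0f} m3")
```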


Table 29. Regression analysis (RA) models.

Variable/unstandardized coefficients   Concrete    Reinf.
(Constant)                             1.05E+03    1.19E+02
Cap_t                                  5.06E-02    8.21E-03
Height_m                               4.25E+01    6.54E+00
Diam_m                                 1.48E+02    2.34E+01
S1                                     3.48E+03    7.97E+02
Type_B                                 2.29E+03    4.27E+02

Table 30. Weights for the neural network (NN) models using five inputs and three neurons for concrete and reinforcement.

Concrete
Input/hidden layer (WA)     H1      H2      H3
I1: Cap_t                   0.377   0.535   0.385
I2: Diam_m                  0.654   0.526   0.149
I3: Height_m                0.192   0.281   0.675
I4: S1                      0.297   0.376   0.417
I5: Type_ID                 0.402   0.098   0.318
Bias 1                      0.138   0.733   0.032

Hidden/output layer (WB)    Total concrete (m3)
H1                          0.721
H2                          0.533
H3                          0.677
Bias 2                      0.246

Reinforcement
Input/hidden layer (WA)     H1      H2      H3
I1: Cap_t                   0.146   0.538   0.371
I2: Diam_m                  0.476   0.119   0.337
I3: Height_m                0.351   0.412   0.142
I4: S1                      0.409   0.001   0.051
I5: Type_ID                 0.034   0.093   0.140
Bias 1                      0.093   0.073   0.144

Hidden/output layer (WB)    Total reinforcement (t)
H1                          0.543
H2                          0.447
H3                          0.492
Bias 2                      0.007

The NN models used were the ones developed during step 3, and found in García de Soto et al. (2014). The weights for
the different NN models are given in Table 30.
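The structure of these networks (five inputs, three hidden neurons, one output) can be sketched as a plain forward pass. Several things here are assumptions and not taken from the paper: the logistic activation, the scaling of inputs and output to the range 0 to 1, and the example input vector, which is invented. The weights are copied as printed in Table 30 for the concrete model, and the conversion of the scaled output back to m3 is not shown.

```python
import math


def sigmoid(z):
    """Logistic activation (assumed; the activation function is not stated here)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, w_a, bias_1, w_b, bias_2):
    """Forward pass through one hidden layer and a single output neuron."""
    hidden = [
        sigmoid(sum(x * row[j] for x, row in zip(inputs, w_a)) + bias_1[j])
        for j in range(len(bias_1))
    ]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_b)) + bias_2)
    return hidden, output


# Concrete model weights as printed in Table 30 (rows I1..I5, columns H1..H3).
W_A = [
    [0.377, 0.535, 0.385],  # I1: Cap_t
    [0.654, 0.526, 0.149],  # I2: Diam_m
    [0.192, 0.281, 0.675],  # I3: Height_m
    [0.297, 0.376, 0.417],  # I4: S1
    [0.402, 0.098, 0.318],  # I5: Type_ID
]
BIAS_1 = [0.138, 0.733, 0.032]
W_B = [0.721, 0.533, 0.677]
BIAS_2 = 0.246

# Hypothetical inputs, already scaled to the range 0..1.
scaled_inputs = [0.5, 0.4, 0.6, 0.2, 0.0]

hidden, scaled_output = forward(scaled_inputs, W_A, BIAS_1, W_B, BIAS_2)
print(f"Hidden activations: {[round(h, 3) for h in hidden]}")
print(f"Scaled output: {scaled_output:.3f}")
```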
The estimations using CBR were based on the general CBR approach described in García de Soto (2014). These models used the same similarity function (i.e. the similarity using the city-block distance with the adjusted unstandardized
coefficients from the selected regression model) and similarity threshold of 70%. No adaptation was made.
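A retrieval step of this kind can be sketched as follows. The exact similarity function is defined in García de Soto (2014) and is not reproduced here, so this is one plausible form under stated assumptions: attributes are scaled to the range 0 to 1, similarity is one minus the weighted city-block distance, and the estimate is the plain average of the retrieved cases (consistent with no adaptation being made). The attribute names, weights and case base are all hypothetical.

```python
def city_block_similarity(case_attrs, target_attrs, weights):
    """Similarity in [0, 1] from a weighted city-block distance on scaled attributes."""
    total_weight = sum(weights.values())
    distance = sum(w * abs(case_attrs[k] - target_attrs[k]) for k, w in weights.items())
    return 1.0 - distance / total_weight


def retrieve(case_base, target_attrs, weights, threshold=0.70):
    """Return (similarity, case) pairs that meet the similarity threshold."""
    scored = [(city_block_similarity(c["attrs"], target_attrs, weights), c) for c in case_base]
    return [(s, c) for s, c in scored if s >= threshold]


def estimate_cmq(case_base, target_attrs, weights, threshold=0.70):
    """Average the CMQ of the retrieved cases; None when nothing is similar enough."""
    hits = retrieve(case_base, target_attrs, weights, threshold)
    if not hits:
        return None
    return sum(c["cmq"] for _, c in hits) / len(hits)


# Hypothetical weights, target and case base (attributes scaled to 0..1).
weights = {"cap": 0.5, "height": 0.3, "diam": 0.2}
target = {"cap": 0.5, "height": 0.5, "diam": 0.5}
case_base = [
    {"attrs": {"cap": 0.55, "height": 0.45, "diam": 0.5}, "cmq": 5000},
    {"attrs": {"cap": 0.6, "height": 0.7, "diam": 0.4}, "cmq": 6000},
    {"attrs": {"cap": 0.0, "height": 0.1, "diam": 0.1}, "cmq": 2000},
]

hits = retrieve(case_base, target, weights)
est = estimate_cmq(case_base, target, weights)
print(f"Retrieved {len(hits)} case(s); estimated CMQ = {est}")
```

In the paper the attribute weights come from the adjusted regression coefficients; the weights above are placeholders for that step.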
Results
The results for the estimation of concrete and reinforcement for the storage structure subtypes are given in Table 31. The
percentage errors and corresponding MAPEs for the target storage structures are given in Table 32. The results from the
statistical tests are given in Table 33.
Discussion
The MAPEs for the different structures and CMQs (summarized in the last row of Table 32) show that the proposed methodology performs better than the other approaches. In some cases the difference was large
(e.g. for the estimates of concrete in storage structures, the proposed methodology had a MAPE of 3% vs. 29% from the
RA model). In other cases the difference was small (e.g. for the estimates of structural steel in the upper structure
of tall-frame structures, the proposed methodology had a MAPE of 2% vs. 3% from the regression and NN models), but
the proposed methodology still had the lower MAPE.
The results from the six statistical tests (Table 33) show that in most cases (4/6 or 67% of the cases) the null hypothesis
was rejected, meaning that when estimating the CMQs there was a statistically significant difference between the MAPE
from the proposed methodology and the MAPE from the other approaches. The only cases where the null hypothesis was
not rejected were for the estimation of concrete and reinforcement in storage structures using the NN model. That


Table 31. Estimated construction material quantities (CMQs) for target storage structure subtypes using regression analysis (RA), neural networks (NN) and case-based reasoning (CBR).

                        RA                           NN                           CBR
Structure ID  Subtype   Concrete (m3)  Reinf. (t)    Concrete (m3)  Reinf. (t)    Concrete (m3)  Reinf. (t)
T1            A         4445           707           3395           647           3970           695
T2            A         6857           1098          4705           860           5437           988
T3            A         7363           1239          5856           1100          5107           951
T4            A         6579           1072          4815           866           5101           918
T5            B         9576           1553          5938           1024          7715           1242
T6            B         11,776         1881          8011           1367          8630           1375
T7            B         13,767         2245          10,784         1754          9353           1521
T8            B         11,154         1787          7797           1244          7096           1132
T9            C         5010           805           3920           687           4080           657
T10           C         6165           1009          4437           803           5803           939
T11           C         6618           1058          4908           810           6339           995
T12           C         6728           1062          5115           796           6721           1097

Table 32. Percentage error between actual and estimated construction material quantities (CMQs; and MAPEs) for target storage structure subtypes using the proposed methodology, regression analysis (RA), neural networks (NN), and case-based reasoning (CBR).

               Proposed methodology     RA                       NN                       CBR
Structure ID   % Error     % Error      % Error     % Error      % Error     % Error      % Error     % Error
               (Concrete)  (Reinf.)     (Concrete)  (Reinf.)     (Concrete)  (Reinf.)     (Concrete)  (Reinf.)
T1             9           7            38          16           5           6            23          14
T2             4           14           42          34           3           5            12          21
T3             5           8            25          7            0           5            13          18
T4             2           2            30          7            5           13           0           8
T5             1           0            52          46           6           3            22          17
T6             0           5            43          34           3           3            5           2
T7             7           5            23          38           4           8            17          7
T8             1           3            62          60           13          11           3           1
T9             0           1            24          24           3           6            1           1
T10            1           0            8           13           22          10           2           5
T11            1           3            2           8            24          17           2           1
T12            9           1            2           6            23          29           2           2
MAPE           3           4            29          24           9           10           9           8

Table 33. Results of the statistical tests.

CMQ             Proposed methodology vs.   t_obs   t_crit (α/2)   Reject H0?
Concrete        RA                         4.399   2.201          Yes
Concrete        NN                         2.139   2.201          No
Concrete        CBR                        2.369   2.201          Yes
Reinforcement   RA                         3.920   2.201          Yes
Reinforcement   NN                         2.045   2.201          No
Reinforcement   CBR                        2.540   2.201          Yes


is, in those two cases there was no statistically significant difference between the proposed methodology and the NN
model; however, the MAPEs from the proposed methodology were still lower.


Conclusion
A methodology was presented to improve the accuracy of preliminary cost estimates through improved estimates of the
CMQs used in construction projects and to reduce the amount of time required to make these estimates through partial
automation. Estimating CMQs, rather than estimating cost directly, is beneficial because it allows for a clear separation
between the technical elements (e.g. construction materials) and the financial elements (e.g. unit cost of labor and material)
at an early project phase. This in turn allows for a better understanding of the variations between planned and actual values and
provides a better tracking and control tool during subsequent phases of the project. The methodology makes use of existing
data, combines RA, NNs and CBR, and encompasses data collection, model development and evaluation, and the integration
of the different techniques.
The proposed methodology was used to estimate CMQs at the structure and plant levels. In total, 234 CMQs from relevant structures were estimated. The errors between the actual and estimated amounts ranged between −13% and
+17%, with over 70% of the estimates within a ±5% error. It was found that the use of the proposed methodology
not only provides a systematic way to develop accurate CMQ estimates during an early project phase, but also performs
better (lower MAPEs) than other techniques typically used to develop models to make such estimates (e.g. regression, NN,
and CBR). The statistical hypothesis tests conducted on the estimation of CMQs for the structures investigated using RA, NN and CBR
indicate that this improved performance did not occur by chance.
The results from the methodology for the estimation of the total amount of CMQs for Greenfield plant projects were
also accurate. For the three Greenfield projects evaluated, the errors ranged from −3.2% to 2.6%, with an overall MAPE
of 1.84%.
Although not rigorously tested, it was also found that the partial automation of the process of making CMQ estimates
was able to reduce the calculation time substantially. Finally, it is also expected that the observed improvements in the
estimation of CMQs will translate into improved preliminary cost estimates of construction projects.
Future work includes investigation of these elements.

Acknowledgments
The authors would like to thank the capital expenditure (CAPEX) department at Holcim for financing and making their data available for
this research. Special thanks are given to Mr. Rudy Blum and Mr. Roberto Nores.

Notes
1. With the intent to keep it at a high level so that it can easily be adapted to different project types and different industries.
2. In this paper, a project is related to the construction of a facility with multiple structures.
3. Including clinker production, cement mill and packaging.
4. This is done to avoid computational problems in the case that the input from the new structure being estimated is outside the range of the existing data by adjusting the range to accommodate the new value (as either a maximum or a minimum, whatever the case might be) and ensure that the scale between 0 and 1 is done properly.
5. Nevertheless, the proposed methodology allows for the adjustment of the similarity threshold by the estimator in case it is necessary.
6. Tall-frame structures were grouped according to their number of stages (levels) and their construction type, i.e. pure reinforced concrete (RC) and hybrid (HB) structures (a combination of reinforced concrete and structural steel, e.g. columns and slabs in reinforced concrete, floor beams and main beams in standard structural steel sections), yielding a total of 12 tall-frame structure subtypes.
7. This process was done in collaboration with the owners of the data and, when applicable, adjustments were made based on their knowledge about the data and expertise about the different projects where the data were originally collected.
8. For example, the design wind speed used was in accordance with the Eurocode 1, EN 1991 1-4 (2010), the spectral response acceleration used was in accordance with the 2009 IBC (based on the 2002 USGS National Seismic Hazard Maps), and the soil factor was used in accordance with the Eurocode 8, EN 1998 1-6 (2006).
9. Instead of ranking structures separately for each CMQ the individual percentages (Pi,a) can be combined to facilitate the identification of CMQ-relevant structures. This way the contribution of each structure subtype for each CMQ can be weighted using Equation (2).
10. The selected percentage (in this case 80%) is used as a threshold when determining the CMQ-relevant structures. Therefore, the higher the threshold value the more structures will be included and the lower the uncertainty built into the preliminary estimates for the entire project. However, this also means more work and effort required by the estimator since more data have to be collected and more models need to be developed.


11. Storage structure (STGS), types A–C, backward elimination technique (BET), concrete (CO), model number (4).
12. Storage structure (STGS), types A–C, backward elimination technique (BET), reinforcement (RE), model number (5).
13. AACEI (Christensen & Dysert 2011) indicated an accuracy outer range of −30% to +50% for Estimate Class 4, with average values of −23% to +35%.

ORCID
Borja García de Soto http://orcid.org/0000-0002-9613-8105
Dilum Fernando http://orcid.org/0000-0001-7481-7935


References
Al-Alawi SM, Abdul-Wahab SA, Bakheit CS. 2008. Combining principal component regression and artificial neural networks for more accurate predictions of ground-level ozone. Environ. Model. Softw. 23(4):396–403.
Albright SC, Winston WL, Zappe CJ. 2003. Data analysis & decision making. Pacific Grove: Brooks/Cole, Thomson Learning, Inc.
An SH, Kim GH, Kang KI. 2007. A case-based reasoning cost estimating model using experience by analytic hierarchy process. Build. Environ. 42(7):2573–2579.
Armstrong JS. 2001. Combining forecasts. In: Principles of forecasting. New York: Springer; p. 417–439.
Bakhoum M, Morcous G, Taha M, El-Said M. 1998. Estimation of quantities and cost of prestressed concrete bridges over the Nile in Egypt. J. Egypt. Soc. Eng./Civ. 37(4):17–32.
Bates JM, Granger CW. 1969. The combination of forecasts. Oper. Res. Soc. 20(4):451–468.
Bowen PA, Edwards PJ. 1985. Cost modelling and price forecasting: practice and theory in perspective. Constr. Manag. Econ. 3(3):199–215.
Cheung FK, Rihan J, Tah J, Duce D, Kurul E. 2012. Early stage multi-level cost estimation for schematic BIM models. Autom. Constr. 27:67–77.
Cho H-G, Kim K-G, Kim J-Y, Kim G-H. 2013. A comparison of construction cost estimation using multiple regression analysis and neural network in elementary school projects. J. Korea Inst. Build. Constr. 13(1):66–74.
Choi J, Kim H, Kim I. 2015. Open BIM-based quantity take-off system for schematic estimation of building frame in early design stage. J. Comput. Des. Eng. 2(1):16–25.
Choi S, Kim D, Han S, Kwak Y. 2013. Conceptual cost-prediction model for public road planning via rough set theory and case-based reasoning. J. Constr. Eng. Manage. Available from: http://dx.doi.org/10.1061/(ASCE)CO.1943-7862.0000743.
Chou JS. 2005. Item-level quantity-based preliminary cost estimating system for highway earthwork, landscape, subgrade treatments, base, surface courses, pavement and traffic control [dissertation]. Retrieved from University of Texas Libraries, Digital Repository (2008-08-28T22:20:03Z).
Chou JS, Peng M, Persad KR, O'Connor JT. 2006. Quantity-based approach to preliminary cost estimates for highway projects. Transp. Res. Rec. J. Transp. Res. Board. 1946(1):22–30.
Christensen P, Dysert LR. 2011. Cost estimate classification system – as applied in engineering, procurement, and construction for the process industries (AACE International Recommended Practice No. 18R-97). Morgantown, WV: AACE International, Rev. November 29, 2011. Formerly: Association for the Advancement of Cost Engineering. Recommended Practice.
Clemen RT. 1989. Combining forecasts: a review and annotated bibliography. Int. J. Forecast. 5(4):559–583.
Crane DB, Crotty JR. 1967. A two-stage forecasting model: exponential smoothing and multiple regression. Manag. Sci. 13(8):B501.
Du J, Bormann J. 2014. Improved similarity measure in case-based reasoning with global sensitivity analysis: an example of construction quantity estimating. J. Comput. Civ. Eng. 28(6):04014020.
Forgues D, Iordanova I, Valdivesio F, Staub-French S. 2012. Rethinking the cost estimating process through 5D BIM: a case study. In: Construction Research Congress 2012, ASCE. Construction Challenges in a Flat World; p. 778–786.
Fragkakis N, Lambropoulos S, Tsiambaos G. 2011. Parametric model for conceptual cost estimation of concrete bridge foundations. J. Infrastruct. Syst. 17(2):66–74.
Franco J, Mahdi F, Abaza H. 2015. Using building information modeling (BIM) for estimating and scheduling, adoption barriers. Univers. J. Manag. 3(9):376–384.
García de Soto B. 2014. A methodology to make accurate preliminary estimates of construction material quantities for construction projects [dissertation]. Swiss Federal Institute of Technology in Zurich (ETH Zürich). Diss. ETH No. 22313, Zurich, Switzerland. doi:10.3929/ethz-a-010361720.
García de Soto B, Adey BT. 2015a. Investigation of the case-based reasoning retrieval process to estimate resources in construction projects. Paper presented at: 2015 Creative Construction Conference; Krakow, Poland; 21–24 June 2015.
García de Soto B, Adey BT. 2015b. Regression-based adaptation to estimate quantities using case-based reasoning. Paper presented at the 2015 AACE International Annual Meeting; Las Vegas, Nevada, USA; 28 June–1 July 2015.
García de Soto B, Adey BT, Fernando D. 2014. A process for the development and evaluation of preliminary construction material quantity estimation models using backward-elimination-regression and neural networks. J. Cost Anal. Parametr. 7:1–39. doi:10.1080/1941658X.2014.984880.
García de Soto B, Fernando D, Adey BT. 2013. Methodology to accurately estimate the quantities of construction materials used in cement plant construction projects. MIP Project. Holderbank, Switzerland: Internal HGRS Report.
Hegazy T, Ayed A. 1998. Neural network model for parametric cost estimation of highway projects. J. Constr. Eng. Manag. 124(3):210–218.


Jin R, Cho K, Hyun C, Son M. 2012. MRA-based revised CBR model for cost prediction in the early stage of construction projects. Expert Syst. Appl. 39(5):5214–5222.
Jin R, Han S, Hyun C, Kim J. 2014. Improving accuracy of early stage cost estimation by revising categorical variables in a case-based reasoning model. J. Constr. Eng. Manag. 140(7):04014025.
Khosrowshahi F, Kaka AP. 1996. Estimation of project total cost and duration for housing projects in the UK. Build. Environ. 31(4):375–383.
Kim GH, An SH, Kang KI. 2004. Comparison of construction cost estimating models based on regression analysis, neural networks, and case-based reasoning. Build. Environ. 39(10):1235–1242.
Kim GH, Seo DS, Kang KI. 2005. Hybrid models of neural networks and genetic algorithms for predicting preliminary cost estimates. J. Comput. Civ. Eng. 19(2):208–211.
Kim KJ, Kim K, Kang CS. 2009. Approximate cost estimating model for PSC beam bridge based on quantity of standard work. Korean Soc. Civ. Eng. J. Civ. Eng. 13(6):377–388.
Kim S, Shim JH. 2013. Combining case-based reasoning with genetic algorithm optimization for preliminary cost estimation in construction industry. Can. J. Civ. Eng. 41(1):65–73.
Koo C, Hong T, Hyun C, Koo K. 2010. A CBR-based hybrid model for predicting a construction duration and cost based on project characteristics in multi-family housing projects. Can. J. Civ. Eng. 37(5):739–752.
Lowe DJ, Emsley MW, Harding A. 2006. Predicting construction cost using multiple regression techniques. J. Constr. Eng. Manag. 132(7):750–758.
McCuen TL. 2015. Chapter 3. BIM and cost estimating: a change in the process for determining project costs. In: Issa RRA, Olbina S, editors. Building information modeling: applications and practices. Technical Council on Computing and Information Technology of the American Society of Civil Engineers. Reston (VA): American Society of Civil Engineers; p. 63–81.
Mohamed A, Celik T. 1998. An integrated knowledge-based system for alternative design and materials selection and cost estimating. Expert Syst. Appl. 14(3):329–339.
Motulsky HJ, Christopoulos A. 2003. Fitting models to biological data using linear and nonlinear regression: A practical guide to curve fitting. San Diego, CA: GraphPad Software Inc.
Oberlender GD, Trost SM. 2001. Predicting accuracy of early cost estimates based on estimate quality. J. Constr. Eng. Manag. 127(3):173–182.
Oh CD, Park C, Kim KJ. 2013. An approximate cost estimation model based on standard quantities of steel box girder bridge substructure. Korean Soc. Civ. Eng. J. Civ. Eng. 17(5):877–885.
Patterson D, Rooney N, Galushka M. 2002. A regression based adaptation strategy for case-based reasoning. In: Eighteenth national conference on artificial intelligence, p. 87–92; July; Edmonton, Alberta, Canada: American Association for Artificial Intelligence.
Peng M. 2006. Item-level quantity-based preliminary cost estimating system for highway structures and miscellaneous construction [dissertation]. Retrieved from University of Texas Libraries, Digital Repository (2008-08-28T22:58:08Z).
Reid DJ. 1968. Combining three estimates of gross domestic product. Economica. 35(140):431–444.
Sattineni A, Bradford RH. 2011. Estimating with BIM: a survey of US construction companies. In: Proceedings of the 28th ISARC; Seoul, Korea; p. 564–569.
Singh S. 1990. Cost model for reinforced concrete beam and slab structures in buildings. J. Constr. Eng. Manag. 116(1):54–67.
Singh S. 1991. Cost estimation of prestressed concrete beam and reinforced concrete slab construction. Constr. Manag. Econ. 9(2):205–215.
Son BS, Lee HS, Park M, Han DY, Ahn J. 2013. Quantity based active schematic estimating (Q-BASE) model. Korean Soc. Civ. Eng. J. Civ. Eng. 17(1):9–21.
Sonmez R. 2004. Conceptual cost estimation of building projects with regression analysis and neural networks. Can. J. Civ. Eng. 31(4):677–683.
Wilmot CG, Mei B. 2005. Neural network modeling of highway construction costs. J. Constr. Eng. Manag. 131(7):765–771.
Wu S, Wood G, Ginige K, Jong SW. 2014. A technical review of BIM based cost estimating in UK quantity surveying practice, standards and tools. J. Inf. Technol. Constr. (ITCon). 19:534–562.
Yau NJ, Yang JB. 1998. Case-based reasoning in construction management. Comput. Aided Civ. Infrastruct. Eng. 13(2):143–150.
Yeh IC. 1998. Quantity estimating of building with logarithm-neuron networks. J. Constr. Eng. Manag. 124(5):374–380.
Yu WD. 2006. PIREM: a new model for conceptual cost estimation. Constr. Manag. Econ. 24(3):259–270.
