Keywords
GAINS, Business Intelligence, Data warehouse, ETL
1. INTRODUCTION
The Regional Air Pollution Information and Simulation (RAINS)
model developed by the International Institute for Applied
Systems Analysis (IIASA) combines information on economic
and energy development, emission control potentials and costs,
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
iiWAS2008, November 24–26, 2008, Linz, Austria.
(c) 2008 ACM 978-1-60558-349-5/08/0011 $5.00.
Figure 1. Flow of information in the RAINS/GAINS model
As a next step of the model, the GAINS-BI approach is introduced in this paper. The GAINS-BI is studied and developed based on the Business intelligence (BI) concepts [5,7,11,12,13],
which basically comprise data warehousing infrastructure, analysis, and a reporting environment. Furthermore, Business Intelligence (BI) is the process of gathering enough of the right information in the right manner at the right time, and delivering the right results to the right people for decision-making purposes, so that it continues to yield real business benefits or has a positive impact on business strategy, tactics, and operations. With its set of methodologies and technologies, BI has been described as a promising technology that helps enterprises transform their legacy systems into integrated, user-centric information systems, which are required to improve the effectiveness of business operations and of management and decision-making processes [12].
In this paper, we first introduce the mathematical models used to calculate emissions [8] and costs [14] for a given pollutant, GAINS region, and year within a given GAINS scenario. These mathematically sound concepts make it possible to specify the GAINS-BI conceptual data model and are also used to calculate emissions and costs in the ETL process and during data cube generation. In this context, the GAINS-BI architecture and its concepts are introduced as an application framework. Furthermore, a multidimensional data model is specified, comprising three main fact tables, namely activity_f, emission_f, and cost_f, and six dimensions, namely scenario_d, pollutant_d, region_d, activity_d, sector_d, and time_d, from which three data cubes are defined: the activity, emission, and cost data cubes. To fulfill region-specific requirements, we have developed a class of regional data marts, e.g. GAINS Europe, GAINS Asia, and GAINS World, which collect relevant data from multiple GAINS data sources and are used for regional data analysis and cost optimization. Afterwards, a global GAINS data warehouse has been developed to integrate multidimensional data from the regional data marts for global data analysis and cost optimization. Some typical examples are presented. The use of BI in GAINS provides a feasible and effective method to improve the speed of reporting, analysis, and information delivery for faster operational decision-making and action-taking, making it possible to react rapidly to business problems and to satisfy new requirements.
The rest of this paper is organized as follows: Section 2 introduces some approaches related to our work; after a brief introduction of the GAINS concepts, Section 3 presents a GAINS architecture and ETL framework; Section 4 presents our implementation results. Finally, Section 5 summarizes what has been achieved and outlines future work.
2. RELATED WORKS
The proposed approach is rooted in several research areas of BI, including the trends and concepts of BI solutions, the combined use of mathematical models and data warehousing technologies in supporting BI, and the utilization of BI in GAINS. With the amount of data generated in an enterprise increasing continuously, delivering the right and sufficient amount of information at the right time to the right business users has become more complicated and critical [7]. More and more enterprise solutions and platforms for Business Intelligence, such as IBM DB2 with Business Intelligence Tools, Microsoft SQL Server, Teradata Warehouse, SAS, iData Analyzer, Oracle, Cognos, Business Objects, etc. [11], have been developed with the aim of empowering businesses by providing direct access to the information used to make decisions, create more effective plans, and respond more quickly to problems and opportunities [10]. Thus, this approach effectively and efficiently leverages data resources to satisfy their requirements for analysis, reporting, and decision making.
In the context of the GAINS model, there are several specific questions, such as “How much would a migration from one technology to another, more effective one cost, and how much emissions would it save?” or “What is the most effective way, in terms of use of technologies, to save emissions within a given budget?”. Questions like these are answered with the help of the GAINS optimization module [14].
This paper focuses on applying mathematical models and BI concepts to improve the data integration (ETL) process with some specific calculations for emissions and costs. The GAINS-BI system is then implemented with various data analysis and decision support components, in order to provide efficient ways of obtaining valuable information and knowledge.
3. GAINS-BI CONCEPTUAL MODEL
In this section, we first introduce the GAINS concepts and the scenario-based emission and cost calculations. Furthermore, the GAINS-BI architecture is introduced as a framework for specifying the basic components of the system. In more detail, the ETL process is presented to show how data is collected and calculated in GAINS-BI.
3.1 GAINS concepts
According to [8], the RAINS model has been extended to capture (economic) interactions between the control of conventional air pollutants and greenhouse gases. This GAINS model includes, in addition to the air pollutants covered in RAINS, carbon dioxide (CO2), methane (CH4), nitrous oxide (N2O) and the F-gases [8]. Thereby, the traditional RAINS model constitutes the air pollution-related part of the GAINS model, while the GAINS extensions address the interactions between air pollutants and greenhouse gases.
Figure 2. Environmental effects of air pollutants and greenhouse gases
Scenario-based Emission and Cost Calculation
Emission Calculation
According to [8], the emissions for a given pollutant, GAINS region, and year within a given GAINS scenario are calculated according to the following equation:
$$E_{p,r,y} = \sum_{a,s,t} E_{p,a,s,t,r,y} = \sum_{a,s,t} A_{a,s,r,y} \cdot X_{a,s,t,r,y} \cdot ef_{a,s,p,r} \cdot (1 - \eta_{a,s,t,p,r})$$
where
$p, r, y$: Pollutant, GAINS region, year,
$a, s, t$: GAINS activity, sector, abatement technology (option),
$E_{p,r,y}$: Emissions of the specific pollutant p, GAINS region r, and year y,
$A_{a,s,r,y}$: Activity for a given GAINS activity/sector combination (a, s),
$X_{a,s,t,r,y}$: Actual implementation rate of the considered abatement option,
$ef_{a,s,p,r}$: “Uncontrolled” (“unabated”) emission factor, and
$\eta_{a,s,t,p,r}$: Reduction efficiency.
Cost Calculation
Similar to the emissions, the costs of reducing emissions for a given pollutant, GAINS region, and year can be calculated by the GAINS model according to [14]:
$$C_{p,r,y} = \sum_{a,s,t} C_{p,a,s,t,r,y} = \sum_{a,s,t} \delta_{(a,s,t),p} \cdot A_{a,s,r,y} \cdot X_{a,s,t,r,y} \cdot cf_{a,s,t,r}$$
where
$C_{p,r,y}$: Reduction costs of the specific pollutant p, GAINS region r, and year y,
$cf_{a,s,t,r}$: Unit cost factor of the considered abatement option, and
$\delta_{(a,s,t),p}$: Kronecker delta function that returns 1 if p is the primary cost pollutant for abatement option a, s, t and 0 otherwise.
3.2 GAINS-BI architecture
GAINS-BI is conceived as a data warehousing system whereby member organizations remain responsible for generating and providing their data. The GAINS-BI components can be described as follows:
• ETL Tools. GAINS-BI enables users to provide content (upload) and to access it in terms of GAINS data download. Afterwards, the ETL tool extracts, transforms (emission and cost calculations) (Figure 4), and loads the data into a GAINS data mart.
• Emission and Cost Aggregation Pre-Calculation. In this step, three main data cubes are generated by pre-calculating activity, emission and cost data at multiple levels of data granularity, based on dimension hierarchies and aggregated data values.
• Intranet Report Systems. Jasper Report is used in this case to generate reports, including data tables and charts.
• Web Publisher. In the context of Jasper Report, the output of GAINS-BI can be generated in different formats, e.g. HTML Ajax, PDF, Excel, etc.
• Metadata. GAINS-BI metadata is used to describe all built-steps. First, the metadata contains the Calculation Rules based on the mathematical formulations described in the previous section. The second metadata item is information about Pre-calculation Aggregation (data granularity). The Multi-Dimensional Report Management is used to manage a set of generated reports as well as the configuration of each report. For multiple purposes, e.g. many kinds of end users and different levels of decision making, Data Selection is used to define different sub-cubes. These sub-cubes can be accessed via several methods or protocols, e.g. web services, API, Servlet, etc. The report and result presentation could be
Figure 7 shows an example of an Activity report generated from the Activity Data Cube.
Emission Fact Table (emission_f)
Emissions of each pollutant are calculated as the product of the activity levels, the “uncontrolled” emission factor in absence of any emission control measures, the efficiency of emission control measures and the application rate of such measures. The Emission Fact Table is defined based on six dimensions, i.e. Scenario Dimension (scenario_d), Region Dimension (region_d), Sector Dimension (sector_d), Activity Dimension (activity_d), Time Dimension (time_d), and Pollutant Dimension (pollutant_d), and is denoted as follows:
emission_f(idscenario, idregion, idactivity, idsector, year, idpollutant_fraction, activityvalue, impl_ef, factor_noc_abtd, rem_ef, emiss_calc, emiss, emiss_co2eq)
Cost Fact Table (cost_f)
GAINS neither produces nor uses single-pollutant cost curves in the optimization. However, single-pollutant cost curves can be constructed by GAINS, if so desired [14]. The Cost Fact Table is defined based on seven dimensions, i.e. Scenario Dimension (scenario_d), Region Dimension (region_d), Sector Dimension (sector_d), Activity Dimension (activity_d), Time Dimension (time_d), Pollutant Dimension (pollutant_d), and Technology Dimension (technology_d), and is denoted as follows:
cost_f(idscenario, idregion, idactivity, idsector, idtechnology, year, idpollutant_fraction, factor, activityvalue, perc, cost, idcostsets, unit)
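The product rule described for the Emission Fact Table, together with the analogous cost term, can be illustrated with a minimal Python sketch. It is not the GAINS implementation: the class `AbatementRow`, its field names, the helper functions, and all numbers in `demo_rows` are hypothetical and chosen only to make the arithmetic easy to follow. The reduction efficiency enters as the factor (1 - efficiency), and the cost of an abatement option is counted only for its primary cost pollutant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbatementRow:
    """One (activity, sector, technology) combination for a fixed
    pollutant, region and year. All field names are illustrative."""
    activity: str
    sector: str
    technology: str
    activity_level: float        # A: activity level for (activity, sector)
    implementation_rate: float   # X: share covered by this technology (0..1)
    emission_factor: float       # ef: uncontrolled emission factor
    removal_efficiency: float    # eta: reduction efficiency (0..1)
    unit_cost: float             # cf: unit cost factor of the option
    primary_cost_pollutant: str  # pollutant to which the cost is attributed


def row_emission(row: AbatementRow) -> float:
    """Emission contribution of one row: A * X * ef * (1 - eta)."""
    return (row.activity_level * row.implementation_rate
            * row.emission_factor * (1.0 - row.removal_efficiency))


def row_cost(row: AbatementRow, pollutant: str) -> float:
    """Cost contribution of one row: delta * A * X * cf, where delta is 1
    only if `pollutant` is the primary cost pollutant of the option."""
    delta = 1.0 if row.primary_cost_pollutant == pollutant else 0.0
    return delta * row.activity_level * row.implementation_rate * row.unit_cost


def total_emission(rows) -> float:
    """Sum of the emission contributions over all rows."""
    return sum(row_emission(r) for r in rows)


def total_cost(rows, pollutant: str) -> float:
    """Sum of the cost contributions attributed to `pollutant`."""
    return sum(row_cost(r, pollutant) for r in rows)


# Invented example: one activity/sector, split between two options.
demo_rows = [
    AbatementRow("coal", "power_plants", "no_control",
                 100.0, 0.25, 2.0, 0.0, 0.0, "SO2"),
    AbatementRow("coal", "power_plants", "scrubber",
                 100.0, 0.75, 2.0, 0.5, 4.0, "SO2"),
]

if __name__ == "__main__":
    print(total_emission(demo_rows))     # 50.0 + 75.0
    print(total_cost(demo_rows, "SO2"))  # 0.0 + 300.0
    print(total_cost(demo_rows, "NOX"))  # delta is 0 for every row
```

In an ETL setting, functions of this kind would be applied per scenario, region, and year before the results are loaded into the emission and cost fact tables; how GAINS-BI actually organizes that step is not specified here.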
Figure 7. An example of using Activity Data Cube to generate Energy Data aggregated by Activity
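The kind of aggregation behind such a report, activity data rolled up by activity, can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the column lists of `activity_f` and `activity_d` below are simplified guesses modeled on the fact table notation used in this paper, and the scenario, region, sector, and activity codes and values are invented.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a simplified, assumed schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE activity_d (
            idactivity    TEXT PRIMARY KEY,
            activity_name TEXT
        );
        CREATE TABLE activity_f (
            idscenario    TEXT,
            idregion      TEXT,
            idactivity    TEXT,
            idsector      TEXT,
            year          INTEGER,
            activityvalue REAL
        );
    """)
    con.executemany(
        "INSERT INTO activity_d VALUES (?, ?)",
        [("HC", "Hard coal"), ("GAS", "Natural gas")],
    )
    con.executemany(
        "INSERT INTO activity_f VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("demo", "R1", "HC", "PP", 2010, 10.0),
            ("demo", "R1", "HC", "IN", 2010, 5.0),
            ("demo", "R2", "HC", "PP", 2010, 7.0),
            ("demo", "R1", "GAS", "PP", 2010, 3.0),
        ],
    )
    return con


def activity_by_activity(con: sqlite3.Connection, scenario: str, year: int):
    """Roll up activity values over regions and sectors, keeping only
    the activity dimension for one scenario and year."""
    cur = con.execute(
        """
        SELECT d.activity_name, SUM(f.activityvalue)
        FROM activity_f AS f
        JOIN activity_d AS d ON d.idactivity = f.idactivity
        WHERE f.idscenario = ? AND f.year = ?
        GROUP BY d.activity_name
        ORDER BY d.activity_name
        """,
        (scenario, year),
    )
    return cur.fetchall()


if __name__ == "__main__":
    con = build_demo_db()
    for name, total in activity_by_activity(con, "demo", 2010):
        print(name, total)
```

Pre-calculating such aggregates at several granularities, for example by region or by sector instead of by activity, corresponds to the aggregation pre-calculation step of the architecture; a GROUP BY on a different dimension column is all that changes in this sketch.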
[6] Grant A. J., Luqi, "Intranet Portal Model and Metrics: A Strategic Management Perspective," IT Professional, vol. 7, pp. 37-44, 2005.
[7] Hugh J. W., Barbara H. W., "The Current State of Business Intelligence," Computer, vol. 40, pp. 96-99, 2007.
[8] Klaassen, G., Amann, M., Berglund, C., Cofala, J., Höglund-Isaksson, L., Heyes, C., Mechler, R., Tohka, A., Schöpp, W., Winiwarter, W. (2004) The Extension of the RAINS Model to Greenhouse Gases. An interim report describing the state of work as of April 2004. IIASA IR-04-015.
[9] Makowski M.P. Data Cleaning and Performance Tuning in the GAINS Model. Thesis at the Database and Artificial Intelligence Group (DBAI) of the Technical University of Vienna, 2008.
[10] Ta’a A., Bakar M. S. A., Saleh A. R., “Academic business intelligence system development using SAS® tools”, in Online Proc of the SAS Global Forum, 2008.
[11] Tvrdikova M., "Support of Decision Making by Business Intelligence Tools," in Proc of the 6th International Conference on Computer information systems and Industrial management applications, 2007, pp. 364-368.
[12] Wei X., Xiaofe X., Lei S., Quanlong L., Hao L, “Business intelligence based group decision support system”, in Proc of the International Conferences on Info-tech and Info-net ICII 2001, Beijing, China, 2001, pp. 295-300.
[13] Zeng L., Z. Shi, M. Wang, W. Wu, "Techniques, Process, and Enterprise Solutions of Business Intelligence," in Proc of the IEEE Conference on Systems, Man, and Cybernetics, Taipei, Taiwan, 2006, pp. 4722-4726.
[14] Wagner, F., W. Schoepp and C. Heyes. The RAINS optimization module for the Clean Air For Europe (CAFE) Programme, Interim Report IR-06-029, International Institute for Applied Systems Analysis (IIASA), September 2006.
[15] www: http://www.iiasa.ac.at/web-apps/apd/gains/