
Proceedings of iiWAS2008 iiWAS 2008

GAINS-BI: Business Intelligent Approach for Greenhouse Gas and Air Pollution Interactions and Synergies Information System

Thanh Binh NGUYEN, Tel: (+43 2236) 71 327, nguyenb@iiasa.ac.at
Wolfgang SCHOEPP, Tel: (+43 2236) 71 309, schoepp@iiasa.ac.at
Fabian WAGNER, Tel: (+43 2236) 71 565, wagnerf@iiasa.ac.at

International Institute for Applied Systems Analysis (IIASA)
Schlossplatz 1
A-2361 Laxenburg, Austria

ABSTRACT
The Greenhouse Gas and Air Pollution Interactions and Synergies (GAINS) model has been developed to provide a consistent framework for the analysis of co-benefit reduction strategies for air pollution and greenhouse gas sources. In this paper we introduce a BI approach, namely GAINS-BI, as a further development of the GAINS model. In this context, the GAINS-BI conceptual model, including the GAINS-BI architecture and related concepts, is specified based on sound mathematical models used to calculate emissions and costs. Furthermore, a multidimensional data model, comprising activity, emission and cost data cubes, is introduced to represent the specific multidimensional analysis requirements of the greenhouse gas and air pollution application domains. As a proof of concept, some implementation results are presented.

Categories and Subject Descriptors
H.2.8 [Database Applications]

General Terms
Design

Keywords
GAINS, Business Intelligence, Data warehouse, ETL

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
iiWAS2008, November 24–26, 2008, Linz, Austria.
(c) 2008 ACM 978-1-60558-349-5/08/0011 $5.00.

1. INTRODUCTION
The Regional Air Pollution Information and Simulation (RAINS) model developed by the International Institute for Applied Systems Analysis (IIASA) combines information on economic and energy development, emission control potentials and costs, atmospheric dispersion characteristics and environmental sensitivities towards air pollution [1,2,3,5,8,9,14]. In 2005 the model was extended to meet the new needs of “pollution science” as well as to model pollution through greenhouse gases. The extension of the scientific approach was also reflected in the new name of the model, namely Greenhouse Gas and Air Pollution Interactions and Synergies (GAINS) [8]. These air pollution problems are considered in a multi-pollutant context (Figure 1), quantifying the contributions of sulphur dioxide (SO2), nitrogen oxides (NOx), ammonia (NH3), non-methane volatile organic compounds (VOC), and primary emissions of fine (PM2.5) and coarse (PM10-PM2.5) particles. The main goal of the model is to estimate, for a given energy and agricultural scenario, the costs and environmental effects of user-specified emission control policies (the “scenario analysis” mode); see Figure 2. Furthermore, a linear optimization mode can be used to identify the cost-minimal combination of emission controls meeting user-supplied air quality targets, taking into account regional differences in emission control costs and atmospheric dispersion characteristics.

Figure 1. Flow of information in the RAINS/GAINS model

As a next step of the model, the GAINS-BI approach is introduced in this paper. GAINS-BI is studied and developed based on Business Intelligence (BI) concepts [5,7,11,12,13],


which basically comprise a data warehousing infrastructure, analysis, and a reporting environment. Furthermore, Business Intelligence (BI) is the process of gathering enough of the right information in the right manner at the right time, and delivering the right results to the right people for decision-making purposes, so that it can continue to yield real business benefits or have a positive impact on business strategy, tactics, and operations. With its set of methodologies and technologies, BI has been described as a promising technology that helps enterprises transform their legacy systems into integrated, user-centric information systems, which are required to improve the effectiveness of business operations and of the management and decision-making process [12].

In this paper, we first introduce the mathematical models used to calculate emissions [8] and costs [14] for a given pollutant, GAINS region, and year within a given GAINS scenario. These mathematically sound concepts make it possible to specify the GAINS-BI conceptual data model and are also used for calculating emissions and costs in the ETL process and data cube generation. In this context, the GAINS-BI architecture and its concepts are introduced as an application framework. Furthermore, a multidimensional data model is specified, including three main fact tables, namely activity_f, emission_f, and cost_f, and six dimensions, namely scenario_d, pollutant_d, region_d, activity_d, sector_d, and time_d, which together define three data cubes: the activity, emission and cost data cubes. To fulfill region-specific requirements, we have developed a class of regional data marts, i.e. GAINS Europe, GAINS Asia, GAINS World, etc., which collect relevant data from multiple GAINS data sources and are used for regional data analysis and cost optimization. Afterwards, a global GAINS data warehouse has been developed to integrate multidimensional data from the regional data marts for global data analysis and cost optimization purposes. Some typical examples are presented. The utilization of BI in GAINS provides a feasible and effective method to improve the speed of reporting, analysis, and information delivery for faster operational decision-making and action-taking, thus making it possible to react rapidly to business problems and satisfy new requirements.

The rest of this paper is organized as follows: Section 2 introduces some approaches related to our work; after a brief introduction of the GAINS concepts, Section 3 presents a GAINS architecture and ETL framework; Section 4 presents our implementation results. Finally, Section 5 summarizes what has been achieved and outlines future work.

2. RELATED WORKS
The characteristics of the proposed approach are rooted in several research areas of BI, including the trends and concepts of BI solutions, the combined use of mathematical models and data warehousing technologies in supporting BI, as well as the utilization of BI in GAINS. With the amount of data generated in an enterprise increasing continuously, delivering the right and sufficient amount of information at the right time to the right business users has become more complicated and critical [7]. More and more enterprise solutions and platforms for Business Intelligence have been developed, such as IBM DB2 with Business Intelligence Tools, Microsoft SQL Server, Teradata Warehouse, SAS, iData Analyzer, Oracle, Cognos, Business Objects, etc. [11], with the aim of empowering businesses by providing direct access to information used to make decisions, create more effective plans and respond more quickly to problems and opportunities [10]. Thus, this approach effectively and efficiently leverages the data resources to satisfy their requirements for the analysis, reporting and decision-making process.

In the context of the GAINS model, there are several specific questions like “How much would a migration from one technology to another, more effective one, cost and how much emissions would it save?”, or “What is the most effective way in terms of use of technologies to save emissions within a given budget?”. Questions like these are answered with the help of the GAINS optimization module [14].

This paper focuses on applying mathematical models and BI concepts to improve the data integration (ETL) process with some GAINS-specific calculations for emissions and costs. The GAINS-BI system is then implemented with various data analysis and decision support components, and is intended to provide efficient ways of obtaining valuable information and knowledge.

3. GAINS-BI CONCEPTUAL MODEL
In this section, we first introduce the GAINS concepts and the scenario-based emission and cost calculations. Furthermore, the GAINS-BI architecture is introduced as a framework for specifying the base components of the system. In more detail, the ETL process is presented to show the data flow, i.e. how data are collected and calculated in GAINS-BI.

3.1 GAINS concepts
According to [8], the RAINS model has been extended to capture (economic) interactions between the control of conventional air pollutants and greenhouse gases. The GAINS model includes, in addition to the air pollutants covered in RAINS, carbon dioxide (CO2), methane (CH4), nitrous oxide (N2O) and the F-gases [8]. Thereby, the traditional RAINS model constitutes the air pollution-related part of the GAINS model, while the GAINS extensions address the interactions between air pollutants and greenhouse gases.

Figure 2. Environmental effects of air pollutants and greenhouse gases

Scenario-based Emission and Cost Calculation
Emission Calculation
Following [8], the emissions for a given pollutant, GAINS region, and year within a given GAINS scenario are calculated according to the following equation:


$$E_{p,r,y} = \sum_{a,s,t} E_{p,a,s,t,r,y} = \sum_{a,s,t} A_{a,s,r,y} \cdot X_{a,s,t,r,y} \cdot ef_{a,s,p,r} \cdot (1 - \eta_{a,s,t,p,r})$$

where
$p, r, y$: pollutant, GAINS region, year,
$a, s, t$: GAINS activity, sector, abatement technology (option),
$E_{p,r,y}$: emissions of the specific pollutant $p$ in GAINS region $r$ and year $y$,
$A_{a,s,r,y}$: activity for a given GAINS activity/sector combination $(a, s)$,
$X_{a,s,t,r,y}$: actual implementation rate of the considered abatement option,
$ef_{a,s,p,r}$: “uncontrolled” (“unabated”) emission factor, and
$\eta_{a,s,t,p,r}$: reduction efficiency.

Cost Calculation
Similar to the emissions, the costs of reducing emissions for a given pollutant, GAINS region, and year can be calculated by the GAINS model according to [14]:

$$C_{p,r,y} = \sum_{a,s,t} C_{p,a,s,t,r,y} = \sum_{a,s,t} \delta_{(a,s,t),p} \cdot A_{a,s,r,y} \cdot X_{a,s,t,r,y} \cdot cf_{a,s,t,r}$$

where
$C_{p,r,y}$: reduction costs of the specific pollutant $p$ in GAINS region $r$ and year $y$,
$cf_{a,s,t,r}$: unit cost factor of the considered abatement option, and
$\delta_{(a,s,t),p}$: Kronecker delta function that returns 1 if $p$ is the primary cost pollutant for abatement option $(a, s, t)$ and 0 otherwise.

3.2 GAINS-BI architecture
GAINS-BI is conceived as a data warehousing system whereby member organizations remain responsible for the generation and provision of their data. The GAINS-BI components can be described as follows:

• ETL Tools. GAINS-BI enables users to provide content (upload) and to access it in terms of GAINS data download. Afterwards, the ETL tool extracts, transforms (emission and cost calculations, Figure 4), and loads the data into a GAINS data mart.
• Emission and Cost Aggregation Pre-Calculation. In this step, three main data cubes are generated by pre-calculating activity, emission and cost data at multiple levels of data granularity, based on dimension hierarchies and aggregated data values.
• Intranet Report Systems. Jasper Report is used in this case to generate reports, including data tables and charts.
• Web Publisher. In the context of Jasper Report, the output of GAINS-BI can be generated in different formats, i.e. HTML Ajax, PDF, Excel, etc.
• Metadata. GAINS-BI metadata is used to describe all built-steps. First, the metadata contains the Calculation Rules based on the mathematical formulations described in the previous section. The second metadata item is information about Pre-calculation Aggregation (data granularity). The Multi-Dimensional Report Management is used to manage a set of generated reports as well as the configuration of each report. For multiple purposes, e.g. many kinds of end users or different levels of decision making, Data Selection is used to define different sub-cubes. These sub-cubes can be accessed via several methods or protocols, i.e. web services, API, Servlet, etc. The report and result presentation could be

Figure 3. GAINS BI architecture


Figure 4. GAINS-BI ETL process

configured based on the Presentation Configuration metadata.

4. MODELLING MULTIDIMENSIONAL DATA MODELS AND IMPLEMENTATION RESULTS

4.1 Dimension Definitions
Region Dimension (region_d)
The GAINS Europe model covers 42 land-based regions in Europe, most of them individual countries and four subnational regions in the European part of Russia. Moreover, there are currently five sea regions represented in the model. These regions are denoted as IDRegion. Furthermore, these regions can be grouped into several region groups, denoted by RegionGroup.
AllRegions->RegionGroup->IDRegion

Sector Dimension (sector_d) and Activity Dimension (activity_d)
GAINS covers a number of sectors, and each sector may be associated with a number of different activities. Hence, in GAINS activity data are structured by sector-activity combinations. For example, in the sector ‘industrial boilers’ the associated activities are the various fuels that are used in industrial boilers, i.e., coal, oil, etc. Activities may be further subdivided, e.g., hard coal (grade 1), hard coal (grade 2), etc. The sector and activity dimensions covered by GAINS-BI are organized as follows:
AllSectors->SecType->IDSector
AllActivities->ActType->IDActivity

Pollutant Dimension (pollutant_d)
The set of pollutants in GAINS covers both the traditional air pollutants (SO2, NOx, PM2.5, NH3 and VOC) and the greenhouse gases CO2, CH4, N2O and FGAS (a GWP-weighted average of HFCs, PFCs, SF6). The pollutant dimension is organized as follows:
AllPollutants->PollutantGroup->Pollutant->IDPollutant_Fraction

Technology Dimension (technology_d)
Emissions of pollutants can be controlled with control technologies, but not every technology controls every pollutant. The technology dimension is organized as follows:
AllTechnologies->TechnologyType->IDTechnology

Time Dimension (time_d)
The time dimension is denoted as AllTime->Year

Scenario Dimension (scenario_d)
Emission scenarios define, for each country, the combination of activity projections and control strategies [14]. This combination determines the level of actual emissions. The scenario dimension is organized as follows:
AllScenarios->ScenarioGroup->IDScenario

4.2 Fact table Definitions
Activity Fact Table (activity_f)
Economic activities such as energy consumption, industrial production and agricultural farming cause emissions of air pollutants, which have several negative effects on ecosystems and human health. The activity variables describe the level of the activity in a sector and a country. The Activity Fact Table is defined based on five dimensions, i.e. Scenario Dimension (scenario_d), Region Dimension (region_d), Sector Dimension (sector_d), Activity Dimension (activity_d), and Time Dimension (time_d), and is denoted as follows:
activity_f(idscenario,idregion,idactivity,idsector,year,activityvalue)

Figure 5. Activity Data Cube Schema
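As an illustration of how the dimension hierarchies support aggregation, the following minimal Python sketch rolls activity values up from the IDRegion level to the RegionGroup level and then to AllRegions. Only the column layout follows activity_f; the region codes, group names, scenario name, activity and sector codes, values, and function names are invented for illustration and are not taken from GAINS.

```python
from collections import defaultdict

# Hypothetical slice of the hierarchy AllRegions->RegionGroup->IDRegion.
REGION_GROUP = {"AUST": "EU", "GERM": "EU", "NORW": "NON_EU"}

# Hypothetical activity_f rows:
# (idscenario, idregion, idactivity, idsector, year, activityvalue)
ACTIVITY_F = [
    ("BASE", "AUST", "HC1", "IN_BO", 2010, 10.0),
    ("BASE", "GERM", "HC1", "IN_BO", 2010, 50.0),
    ("BASE", "NORW", "HC1", "IN_BO", 2010, 5.0),
    ("BASE", "AUST", "OIL", "IN_BO", 2010, 4.0),
]


def roll_up_to_region_group(rows, region_group):
    """Aggregate activityvalue from the IDRegion level to the RegionGroup level."""
    totals = defaultdict(float)
    for idscenario, idregion, idactivity, idsector, year, value in rows:
        key = (idscenario, region_group[idregion], idactivity, idsector, year)
        totals[key] += value
    return dict(totals)


def roll_up_to_all_regions(rows):
    """Aggregate activityvalue to the top of the hierarchy (AllRegions)."""
    totals = defaultdict(float)
    for idscenario, _idregion, idactivity, idsector, year, value in rows:
        totals[(idscenario, "AllRegions", idactivity, idsector, year)] += value
    return dict(totals)


by_group = roll_up_to_region_group(ACTIVITY_F, REGION_GROUP)
by_all = roll_up_to_all_regions(ACTIVITY_F)
```

Storing both `by_group` and `by_all` alongside the base rows corresponds to keeping the same measure at several levels of granularity, so that a report can read the level it needs without re-scanning the base data.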


Figure 7 shows an example of an Activity report generated from the Activity Data Cube.

Emission Fact Table (emission_f)
Emissions of each pollutant are calculated as the product of the activity levels, the “uncontrolled” emission factor in the absence of any emission control measures, the efficiency of emission control measures and the application rate of such measures. The Emission Fact Table is defined based on six dimensions, i.e. Scenario Dimension (scenario_d), Region Dimension (region_d), Sector Dimension (sector_d), Activity Dimension (activity_d), Time Dimension (time_d), and Pollutant Dimension (pollutant_d), and is denoted as follows:
emission_f(idscenario,idregion,idactivity,idsector,year,idpollutant_fraction,activityvalue,impl_ef,factor_noc_abtd,rem_ef,emiss_calc,emiss,emiss_co2eq)

Cost Fact Table (cost_f)
GAINS neither produces nor uses single-pollutant cost curves in the optimization. However, single-pollutant cost curves can be constructed by GAINS, if so desired [14]. The Cost Fact Table is defined based on seven dimensions, i.e. Scenario Dimension (scenario_d), Region Dimension (region_d), Sector Dimension (sector_d), Activity Dimension (activity_d), Time Dimension (time_d), Pollutant Dimension (pollutant_d), and Technology Dimension (technology_d), and is denoted as follows:
cost_f(idscenario,idregion,idactivity,idsector,idtechnology,year,idpollutant_fraction,factor,activityvalue,perc,cost,idcostsets,unit)
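To make the calculations behind the emission and cost measures concrete, the sketch below evaluates the emission and cost equations of Section 3.1 for a single activity/sector pair in one region and year, so the sum runs only over the abatement options. It is a minimal illustration under stated assumptions, not the GAINS implementation: the function names, the dictionary layout, the option labels, and all numbers are hypothetical and chosen so that the arithmetic is easy to check by hand.

```python
def emissions(activity, impl_rate, ef, eta):
    """Sum A * X_t * ef * (1 - eta_t) over the abatement options t."""
    return sum(activity * impl_rate[t] * ef * (1.0 - eta[t]) for t in impl_rate)


def costs(activity, impl_rate, unit_cost, primary_pollutant, pollutant):
    """Sum delta * A * X_t * cf_t over the abatement options t.

    The Kronecker delta is modelled by the filter: an option contributes
    only if `pollutant` is its primary cost pollutant.
    """
    return sum(
        activity * impl_rate[t] * unit_cost[t]
        for t in impl_rate
        if primary_pollutant[t] == pollutant
    )


# Hypothetical inputs for one (activity, sector) pair, region and year.
A = 100.0                               # activity level
X = {"NOC": 0.5, "FGD": 0.5}            # implementation rate per option
EF_SO2 = 2.0                            # unabated emission factor
ETA = {"NOC": 0.0, "FGD": 0.75}         # reduction efficiency per option
CF = {"NOC": 0.0, "FGD": 0.5}           # unit cost factor per option
PRIMARY = {"NOC": "SO2", "FGD": "SO2"}  # primary cost pollutant per option

so2_emissions = emissions(A, X, EF_SO2, ETA)
so2_costs = costs(A, X, CF, PRIMARY, "SO2")
nox_costs = costs(A, X, CF, PRIMARY, "NOX")
```

With these invented inputs the uncontrolled half of the activity contributes 100 units and the controlled half 25 units of emissions, giving 125 in total. The costs of 25 units are attributed to SO2 only, because the delta term removes them for every other pollutant.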

Figure 6. Cost Data Cube Schema

Figure 5. Emission Data Cube Schema

Figure 7. An example of using Activity Data Cube to generate Energy Data aggregated by Activity


Figure 8. An example of using three data cubes to generate multiple reports

5. CONCLUSIONS AND FUTURE WORKS
As the process of turning data into information and then into knowledge, the concept of Business Intelligence has been emerging as a potential solution to open new perspectives and areas for improvement of decision-making processes and operations. In this paper, a BI solution, with integrated repositories and a data warehouse as the central components, has been introduced to support GAINS experts in making better business decisions. Moreover, we have presented how to apply the mathematical formulation to specify the GAINS-BI conceptual data model as well as to calculate emissions and costs in the ETL process and data cube generation. The GAINS-BI architecture and its concepts have also been introduced, and the multidimensional data model has been defined.

In the near future, semantic technologies will be used to enhance the efficiency and agility of the GAINS-BI solution, i.e. the representation of data combinations and constraints. Moreover, mathematical optimization and data mining algorithms will be adapted for multidimensional analysis of integrated data from heterogeneous sources in a university environment. Thus, the proposed BI tools can provide improved analytic capabilities for multi-level information delivery across an enterprise, which would help improve visibility into strategic decisions.

6. REFERENCES
[1] Amann, M., Asman, W., Bertok, I., Cofala, J., Heyes, C., Klimont, Z., R., Posch, M. and Schöpp, W. (2007). Cost-optimized reductions of air pollutant emissions in the EU Member States to meet the environmental targets of the Thematic Strategy on Air Pollution. NEC Scenario Analysis Report No. 3. International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria.
[2] Amann, M., Bertok, I., Cofala, J., Heyes, C., Klimont, Z., Posch, M., Schöpp, W. and Wagner, F. (2006). Baseline Scenarios for the Revision of the NEC Emission Ceilings Directive. Part 1: emission projections. NEC Scenario Analysis Report Nr. 1. International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria, http://www.iiasa.ac.at/rains/CAFE_files/NEC-BL-p1-v21.pdf.
[3] Amann, M., Bertok, I., Cabala, R., Cofala, J., Heyes, C., Gyarfas, F., Klimont, Z., Schöpp, W. and Wagner, F. (2005). Analysis for the final CAFE scenario. CAFE Report No. 6. International Institute for Applied Systems Analysis (IIASA), http://www.iiasa.ac.at/rains/CAFE_files/CAFE-D3.pdf.
[4] Amann, M., Cofala, J., Heyes, C., Klimont, Z., Mechler, R., Posch, M. and Schöpp, W. (2004). The RAINS model. Documentation of the model approach prepared for the RAINS review. International Institute for Applied Systems Analysis (IIASA), Laxenburg, Austria, www.iiasa.ac.at/rains/review/index.html.
[5] G. R. Gangadharan, S. N. Swami, "Business Intelligence Systems: Design and Implementation Strategies," in Proc of the 26th International Conference Information Technology Interfaces ITI 2004, Croatia, 2004, pp. 139-144.


[6] Grant A. J., Luqi, "Intranet Portal Model and Metrics: A Strategic Management Perspective," IT Professional, vol. 7, pp. 37-44, 2005.
[7] Hugh J. W., Barbara H. W., "The Current State of Business Intelligence," Computer, vol. 40, pp. 96-99, 2007.
[8] Klaassen, G., Amann, M., Berglund, C., Cofala, J., Höglund-Isaksson, L., Heyes, C., Mechler, R., Tohka, A., Schöpp, W., Winiwarter, W. (2004) The Extension of the RAINS Model to Greenhouse Gases. An interim report describing the state of work as of April 2004. IIASA IR-04-015.
[9] Makowski M.P. Data Cleaning and Performance Tuning in the GAINS Model. Thesis at the Database and Artificial Intelligence Group (DBAI) of the Technical University of Vienna, 2008.
[10] Ta’a A., Bakar M. S. A., Saleh A. R., “Academic business intelligence system development using SAS® tools”, in Online Proc of the SAS Global Forum, 2008.
[11] Tvrdikova M., "Support of Decision Making by Business Intelligence Tools," in Proc of the 6th International Conference on Computer information systems and Industrial management applications, 2007, pp. 364-368.
[12] Wei X., Xiaofe X., Lei S., Quanlong L., Hao L, “Business intelligence based group decision support system”, in Proc of the International Conferences on Info-tech and Info-net ICII 2001, Beijing, China, 2001, pp. 295-300.
[13] Zeng L., Z. Shi, M. Wang, W. Wu, "Techniques, Process, and Enterprise Solutions of Business Intelligence," in Proc of the IEEE Conference on Systems, Man, and Cybernetics, Taipei, Taiwan, 2006, pp. 4722-4726.
[14] Wagner, F., W. Schoepp and C. Heyes. The RAINS optimization module for the Clean Air For Europe (CAFE) Programme, Interim Report IR-06-029, International Institute for Applied Systems Analysis (IIASA), September 2006.
[15] www: http://www.iiasa.ac.at/web-apps/apd/gains/
