
Data Warehousing in the Context of a Bologna Undergraduate Degree

José Victor Ramos
Informatics Engineering Department
Polytechnic Institute of Leiria
Leiria, Portugal
jose.ramos@ipleiria.pt

Rui Oliveira
Informatics Engineering Department
Polytechnic Institute of Leiria
Leiria, Portugal
rui.oliveira@ipleiria.pt

Abstract: Facing the incessant growth of data that organizations have to deal with on a daily basis, decision support systems and data warehousing techniques assume vital importance in supporting the decision-making process. Taking this necessity into consideration, the Bologna process was an opportunity to introduce data warehousing competences in the undergraduate degree of Informatics Engineering at the Polytechnic Institute of Leiria (IPL). The paper focuses on some aspects of this adaptation, the difficulties and challenges encountered during the implementation process, and the solutions adopted towards an effective acquisition of competences by students.

Keywords: data warehousing; decision support systems; acquisition of competences; Bologna process; Informatics Engineering

I. INTRODUCTION

In the past few years the decision support systems industry has grown rapidly, since most organizations have understood the strategic importance of these systems in data analysis and in supporting the decision-making process. With an increasing demand for skilled professionals in this area, higher education institutions are responsible for giving an appropriate answer to marketplace needs.

The Bologna process presented an opportunity for higher education institutions to fulfill these marketplace needs by adapting their curricula, in order to improve educational quality and to guarantee that students acquire the competences to enter the marketplace immediately after finishing an undergraduate degree. Facing this reality, some of the competences typically belonging to pre-Bologna graduate degrees, with a five-year duration, were shifted to undergraduate degrees, with a three-year duration. The Decision Support Systems course exemplifies this change: it belonged to the 8th semester of a graduate degree, while in the Bologna framework it moved back to the 4th semester. Adding this course to the curriculum was a challenge, and the implementation proved more complex than initially expected in the adaptation plan [1].

The paper is organized as follows. Section 2 describes how the course relates to the rest of the Informatics Engineering curriculum, the course objectives, and the techniques used in instruction and assessment. Section 3 describes course operation, with particular emphasis on theoretical, laboratory and tutorial classes, the team project, and knowledge assessment, as well as the challenges we faced and the feedback from the students. Section 4 focuses on the software tools used in the course. Finally, the conclusions and further developments are summarized in Section 5.

II. COURSE CONTEXT

The undergraduate degree in Informatics Engineering has a duration of three years, with 180 ECTS (European Credit Transfer and Accumulation System) credits, and two specializations, Information Systems (IS) and Information and Communication Technologies (ICT). At the beginning of the 4th semester, the student selects the specialization he/she wants to pursue. Each specialization has four main profiles, with Software Engineer common to both. The other profiles for the IS specialization are Database Engineer, Knowledge Engineer and Enterprise Information Systems Engineer. For the ICT specialization the profiles are Systems Engineer, Network Engineer and Multimedia Engineer.

The Decision Support Systems course is part of the Database Engineer and Knowledge Engineer profiles. The course belongs to the 4th semester and has 30 hours of theoretical classes, 45 hours of laboratory classes, 5 hours of tutorial classes and 82 hours of autonomous student work, totaling 162 hours, the equivalent of 6 ECTS.

A short description of the objectives/skills, teaching/learning methodology and assessment methodology follows.

A. Objectives/Skills

The main goals/competences of the course are:
• Identify the technologies needed to implement a data warehouse.
• Define the architecture of a data warehouse.
• Design a data warehouse to find answers to critical business questions.
• Design and implement the Extract, Transform and Load (ETL) process.
• Design and implement the reporting system.
• Use On-Line Analytical Processing (OLAP) tools to analyze data from a data warehouse.
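The competences listed above can be made concrete with a very small example. The following is a minimal sketch, not material from the course: it uses Python with SQLite instead of the Oracle tools the course adopts, and every table, column and data value (`dim_course`, `dim_time`, `fact_grade`, the sample grades) is a hypothetical choice made for illustration. It builds a tiny star schema, loads it with a trivial ETL step, and answers a business question at two levels of aggregation.

```python
import sqlite3

# Hypothetical operational records: (course, semester, year, grade).
SOURCE_ROWS = [
    ("Databases", 1, 2009, 14),
    ("Decision Support Systems", 2, 2009, 16),
    ("Databases", 1, 2009, 11),
    ("Decision Support Systems", 2, 2010, 13),
]


def build_warehouse(rows):
    """Create a tiny star schema in memory and load it from `rows`."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE dim_course (
            course_key INTEGER PRIMARY KEY,
            name       TEXT UNIQUE
        );
        CREATE TABLE dim_time (
            time_key INTEGER PRIMARY KEY,
            year     INTEGER,
            semester INTEGER,
            UNIQUE (year, semester)
        );
        CREATE TABLE fact_grade (
            course_key INTEGER REFERENCES dim_course (course_key),
            time_key   INTEGER REFERENCES dim_time (time_key),
            grade      INTEGER
        );
        """
    )
    for course, semester, year, grade in rows:
        # Load the dimensions first; existing members are left untouched.
        db.execute("INSERT OR IGNORE INTO dim_course (name) VALUES (?)", (course,))
        db.execute(
            "INSERT OR IGNORE INTO dim_time (year, semester) VALUES (?, ?)",
            (year, semester),
        )
        # Look up the surrogate keys, then load the fact row.
        course_key = db.execute(
            "SELECT course_key FROM dim_course WHERE name = ?", (course,)
        ).fetchone()[0]
        time_key = db.execute(
            "SELECT time_key FROM dim_time WHERE year = ? AND semester = ?",
            (year, semester),
        ).fetchone()[0]
        db.execute(
            "INSERT INTO fact_grade VALUES (?, ?, ?)", (course_key, time_key, grade)
        )
    return db


def average_grade_by_year(db):
    """Roll-up: average grade aggregated at year level."""
    return db.execute(
        """
        SELECT t.year, AVG(f.grade)
        FROM fact_grade AS f JOIN dim_time AS t USING (time_key)
        GROUP BY t.year ORDER BY t.year
        """
    ).fetchall()


def average_grade_by_semester(db):
    """Drill-down: the same measure at the finer year/semester level."""
    return db.execute(
        """
        SELECT t.year, t.semester, AVG(f.grade)
        FROM fact_grade AS f JOIN dim_time AS t USING (time_key)
        GROUP BY t.year, t.semester ORDER BY t.year, t.semester
        """
    ).fetchall()
```

Calling `build_warehouse(SOURCE_ROWS)` and then the two query functions shows the same fact table answering a question at year level and at semester level, which is the kind of analysis an OLAP tool automates.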

978-1-4673-6110-1/13/$31.00 ©2013 IEEE Technische Universität Berlin, Berlin, Germany, March 13-15, 2013
2013 IEEE Global Engineering Education Conference (EDUCON)
Page 540
B. Teaching/Learning Methodology

The teaching/learning methodology adopted in the course follows project based learning, in line with the methodology adopted in the undergraduate degree of Informatics Engineering [1]. The development of team projects promotes work planning and organization, the search for and acquisition of knowledge, and also the development of autonomy, initiative, critical analysis and evaluation of solutions [2].

Contact teaching includes theoretical, laboratory and tutorial classes. Autonomous learning comprises study, e-learning and the team project.

C. Assessment Methodology

The main assessment methodology adopted in the course is based on continuous evaluation, in line with the methodology adopted in the undergraduate degree of Informatics Engineering. Continuous evaluation promotes more regular and intensive work by the students, allowing an effective acquisition and development of competences [1], [3].

Besides continuous evaluation there is also a final evaluation, outside the class period, typically for a small number of students, since the majority opts for continuous evaluation.

In order to optimize students' work, a mechanism for reusing assessment components was implemented, allowing students to keep the grade of a component they have already passed for the remaining final evaluation periods.

III. COURSE OPERATION

This section describes the main options for course operation, with particular emphasis on the different teaching/learning and assessment approaches implemented.

A. Theoretical Classes

Theoretical classes are essential for the exposition and understanding of the theoretical concepts. The methodology adopted for the data warehousing process is the one developed by Ralph Kimball [4], [5], [6].

The content of the theoretical classes is as follows:
• Introduction to Decision Support Systems.
• OLAP and Data Warehousing.
• Data Warehousing Process.
• Dimensional Modeling.
• The ETL process.
• Administration and Maintenance of Data Warehouses.
• Data Warehousing Project.
• Business Intelligence and Data Mining.

The first three topics are introductory, and their purpose is to lead students to think in broader terms and to adopt the management perspective of an organization. The next topic, dimensional modeling, fills most of the theoretical classes since it is the core of the course. The last three topics concern issues that broaden the knowledge of students in the data warehousing area.

The Bologna process brought some changes in the contents of the theoretical classes, particularly the simplification of the administration and maintenance topic.

The main difficulty, as expected, was dimensional modeling, because of its complexity. Students, at this point in their academic path, are mostly prepared to solve problems with one solution, using convergent thinking. Dimensional modeling, however, needs divergent thinking, where multiple solutions are possible for the same problem.

B. Laboratory Classes

The main topics covered in laboratory classes are the following:
• Normalization vs. denormalization.
• Implementation of ETL mechanisms.
• OLAP tool for data analysis.

The first topic introduces students to the denormalization concept, its advantages and disadvantages, and its importance in the context of data warehousing. The second topic fills most of the laboratory classes and is one of the key aspects of this course. In the last topic students use an OLAP tool to analyze data from the data warehouse and to create reports and ad-hoc queries. As in theoretical classes, the methodology adopted for the data warehousing process is the one developed by Ralph Kimball [4], [5], [6].

The implementation of a data warehouse in laboratory classes is quite complex, and most of the students face a software system with these features and of this magnitude for the first time. One of the challenges for students has been to understand the data warehouse from both the end user's perspective and the system's perspective. To help students understand the system, several documents were prepared to give a broader view of the system and, simultaneously, an understanding of the implementation details. Among those documents are the dimensional model, the logical data map, the algorithms for data extraction, transformation and loading, and the transformation error logger. Figure 1 presents the data warehouse architecture plan.

Since the transition to Bologna, three different approaches were introduced in laboratory classes to teach the implementation process of a data warehouse:
• Graphical programming.
• Graphical programming and PL/SQL programming.
• PL/SQL programming.

The initial approach, graphical programming, was based on the one that existed before Bologna, with the necessary adaptations.

These three approaches are described below, along with a critical analysis outlining the advantages and disadvantages of each.

Figure 1. Data Warehouse Architecture Plan.

1) Graphical Programming

This approach was introduced in the initial course edition, academic year 2007/2008, after the transition to Bologna, and the software tool adopted to implement a data warehouse was Oracle Warehouse Builder (OWB).

In this course edition there were no problems in the implementation of a data warehouse, although students mentioned the complexity of the system and of the software tool. The second edition of the course, academic year 2008/2009, brought unexpected difficulties, and it became clear that students were not prepared to implement such a complex system and to use a software tool like OWB, which has a slow learning curve. The reason for this discrepancy between two consecutive academic years is that, with the transition to the Bologna process in 2007/2008, most of the students who attended the course were actually 3rd year students. Only in 2008/2009 was the course attended by 2nd year students, as planned in the curriculum, and their preparation, maturity and study autonomy were found to be inadequate. We had expected something of the kind, but despite all efforts it was clear that the challenge was huge and that laboratory classes had to be restructured.

The main difficulties in this approach were the following:
• Learning curve of the software tool.
• Students were more concerned with the software tool than with the implementation of the data warehouse.
• Complexity of the system to implement.
• Weak perception by students of the objectives and utility of the data warehouse after the implementation process.

2) Mixed Approach: Graphical and PL/SQL Programming

In academic year 2009/2010, based on the problems of the previous edition, significant changes were introduced in laboratory classes in order to ensure a higher efficiency in the acquisition of competences by students.

The main change was in the approach used to implement the data warehouse, in this case done mostly by programming in SQL and PL/SQL. The software tool selected was Oracle SQL Developer, because it is a fairly simple development tool. During the implementation of the data warehouse, OWB was used only to show the implementation of some mappings from the ETL process in a graphical environment.

With the adoption of this approach we started to build a didactic platform for teaching the ETL process, allowing the immersion of students in a data warehousing environment, with the following mandatory features:
• Fast adaptation to the environment.
• Fast adaptation to the programming language/software tools for structural manipulation of the environment.
• Focus on fundamental concepts of the ETL process, supported by the state of the art.
• Simplicity of the scenario for easy understanding and adaptation.

The absence of prebuilt, published environments with the desired mandatory features made clear the need for an in-house implementation. As a result, a didactic platform was developed in the form of a data warehouse prototype, scalable and with basic functionalities [7].
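To give a flavor of what a deliberately simple ETL teaching scenario can look like, the following is a minimal sketch in Python. It is not the didactic platform described in [7], which the course implements in SQL and PL/SQL; the record layout, the 0-20 grade range and all names (`EtlResult`, `transform`, `run_etl`) are assumptions introduced purely for illustration. The sketch shows one transformation step together with a transformation error log, so that rejected records are recorded instead of silently dropped.

```python
from dataclasses import dataclass, field


@dataclass
class EtlResult:
    loaded: list = field(default_factory=list)  # cleaned rows ready to load
    errors: list = field(default_factory=list)  # (row_number, reason) pairs


def transform(raw):
    """Clean one source record; raise ValueError when it cannot be repaired."""
    name = raw["course"].strip().title()
    if not name:
        raise ValueError("missing course name")
    grade = int(raw["grade"])  # int() raises ValueError on non-numeric text
    if not 0 <= grade <= 20:  # assumed grading scale, for illustration only
        raise ValueError(f"grade {grade} outside 0-20")
    return {"course": name, "grade": grade}


def run_etl(source_rows):
    """Transform every row, logging failures instead of aborting the run."""
    result = EtlResult()
    for number, raw in enumerate(source_rows, start=1):
        try:
            result.loaded.append(transform(raw))
        except ValueError as exc:
            result.errors.append((number, str(exc)))
    return result
```

Keeping the error log separate from the loaded rows lets a student inspect which source records were rejected and why, without needing a graphical tool to trace the mapping.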

The use of the didactic platform for the partial implementation of the data warehouse in laboratory classes was valuable, considering the knowledge and maturity of the students. This new approach was considered more suitable by most of the students, and particularly by those repeating the course, since it gave them a more exact notion of the details and complexity of the system to implement, as well as a perception of the objective and utility of the data warehouse.

Another important change concerned the way students worked in the laboratory classes: they started to work individually, one student per workstation. In the first three semesters students usually work in groups of two in laboratory classes, so this change was not easy in the beginning. However, it was crucial, since it significantly improved the autonomy of the students, one of the main goals of the Bologna process.

3) PL/SQL Programming

Once the initial difficulties with the introduction of the didactic platform for teaching the ETL process were overcome, some minor adjustments were introduced in academic year 2010/2011 in order to optimize this approach. Graphical programming and the software tool OWB were definitively abandoned, and the implementation of the data warehouse was done exclusively in SQL and PL/SQL.

An important aspect we tried to improve was the students' perception of the objective and utility of the data warehouse. To achieve that, a demonstration was performed before the implementation, showing the concept from the end user's perspective and the system's perspective through OLAP operations (e.g., drill-down, roll-up) and ad-hoc queries, as well as its utility from a practical point of view. This was possible because the didactic platform was fully functional at this point, which was highlighted by students, since it gave them, from the beginning, a full vision of the final result they were expected to reach with the implementation of the data warehouse during the semester.

Table I presents a synthesis of the advantages and disadvantages of the two main approaches used in laboratory classes, graphical programming and PL/SQL programming. The mixed approach is not relevant for this comparison.

TABLE I. MAIN APPROACHES FOR IMPLEMENTATION OF THE DATA WAREHOUSE

Approach                Advantages                               Disadvantages
Graphical Programming   Fast implementation                      Low level of detail
                        Complexity of the solution quite low     High abstraction
                                                                 Slow learning curve
PL/SQL Programming      High level of detail                     Complexity of the solution quite high
                        Low abstraction
                        Fast learning curve
                        Reusing of code

C. Team Project

The course includes the development of a team project in order to promote planning and organization, the search for and acquisition of knowledge, and also to develop autonomy, initiative, critical analysis and evaluation of solutions.

Before Bologna, the team project consisted of the full implementation of a data warehouse, that is, business requirements gathering, dimensional modeling, ETL process implementation, reports and data analysis. With the Bologna process the team project focuses only on business requirements gathering, dimensional modeling and design of the ETL process. Nevertheless, students are confronted with the modeling of a complex software system which, combined with some immaturity in structuring solutions and in abstract reasoning, created inherent difficulties.

In 2007/2008, the first course edition after the adaptation to the Bologna process, student results were quite satisfactory, even though, as mentioned before, the course was mainly attended by 3rd year students as a result of changes in the curriculum. In the second edition, academic year 2008/2009, students showed more difficulties with the team project, with results below expectations.

The main factors that led to this situation were:
• Students' difficulty in understanding the difference between the relational and the dimensional model.
• Weak abstract reasoning for dimensional modeling.
• Difficulties in structuring solutions.
• Difficulties in mapping requirements.
• Complexity of the problem.

To overcome these difficulties some strategies were defined, namely the use of real world case studies in data warehousing and the introduction of tutorial classes for the team project. Among the case studies we must mention SAD-IES (the acronym in English stands for Decision Support System for Higher Education Institutions), developed at IPL. The main advantage of this case study is that students are familiar with the business processes, since most of them are related to academic activities (e.g., admission to courses, evaluation, class attendance, etc.).

D. Tutorial Classes

The tutorial orientation classes provide guidance to students, organized in small groups or individually, in solving problems, as well as supervision of activities related to the course.

Some tutorial classes are dedicated to the study of real world projects in data warehousing, as a way to help students consolidate the acquired knowledge and overcome some difficulties in dimensional modeling.

E. E-Learning

The introduction of new technologies in the learning process allows new ways of interaction and communication between students and teachers, and among students.

The use of content management platforms (e.g., Blackboard, Moodle) was established at IPL long before the Bologna process. Besides the interaction tools in the platform, students have access to all the materials and content of the course (e.g., slides, knowledge verification tests, real world case studies in data warehousing, etc.).

F. Knowledge Assessment

The knowledge assessment includes two different types of assessment: continuous evaluation, during the period of classes, and final evaluation, at the end of the semester.

1) Continuous Evaluation

Most of the students opt for continuous evaluation, which integrates the following elements:
• Theoretical Written Test (TWT) (45%).
• Practical Computer Tests (PCT) (25%).
• Team Project (25%).
• Class Participation (CP) (5%).

TWT evaluates knowledge related to the data warehousing process, whereas PCT evaluates the implementation of a data warehouse. The team project evaluates knowledge related to the data warehousing process, and particularly to dimensional modeling. CP is assessed weekly and, regardless of its small weight in the final grade, is a motivating factor for most of the students.

2) Final Evaluation

Knowledge assessment in final evaluation comprises two elements, a Theoretical Written Test and a Practical Written Test, each with a weight of 50%.

G. Evolution of the Model

To successfully introduce a data warehousing course in an undergraduate degree, two aspects are particularly relevant:
• Dimensional modeling.
• Data warehouse implementation.

The evaluation results and feedback from the students show that the PL/SQL programming approach is the most appropriate, since it allows students to focus mainly on the software system to implement and the level of abstraction is lower. In graphical programming the functionalities inherent to objects, and the way objects interact, create an additional level of abstraction that diverts the focus of students from the main point, which is the implementation of a data warehouse.

Besides the issues inherent to each of the approaches, the other key factors for the success of the PL/SQL programming approach are:
• Demonstration of the data warehouse concept.
• Didactic platform for teaching the ETL process in data warehouses.
• One student per workstation.

The challenge of introducing data warehousing competences in the undergraduate degree of Informatics Engineering was successfully accomplished. However, in a future adaptation of the study programme, moving the Decision Support Systems course from the 4th semester to the 5th semester could make a difference in students' maturity and preparation to deal more easily with a highly complex area such as data warehousing.

IV. SOFTWARE TOOLS

Software tools and the underlying infrastructure are vital to teaching Decision Support Systems courses, and particularly data warehousing technologies. In order to implement a data warehouse it is necessary to select several software tools, namely a Database Management System, a modeling tool, an ETL tool and an OLAP tool. The Database Management System used to implement the data warehouse is Oracle 11g [8], [9]. For dimensional modeling Oracle Data Modeler was chosen. To implement the ETL process we first adopted OWB 11g and later, with the adoption of the PL/SQL programming approach, Oracle SQL Developer. Finally, Oracle Discoverer 11g was the software tool for data analysis, reports and ad-hoc queries.

Most of the work performed by students in this course is quite similar to the work carried out in the data warehousing industry, but the software tools used in the industry are usually not suitable for instruction. The main reasons are the complexity of the tools, the learning curve and a set of additional features that are not necessary in an academic environment. Even so, in the database courses of the undergraduate degree in Informatics Engineering the option was for professional tools, Oracle software tools, since they are valuable when students enter the marketplace.

V. CONCLUSIONS

The shift of content and competences typically belonging to pre-Bologna graduate degrees, with a five-year duration, to Bologna undergraduate degrees, with a three-year duration, posed several challenges during the implementation process. The Decision Support Systems course exemplifies this change, and it took a few editions to optimize its operation and guarantee an effective acquisition of competences by students in data warehousing.

The main contributions of our model are the changes introduced in dimensional modeling and particularly in data warehouse implementation. The development of a didactic platform for teaching the ETL process helped students to improve their perception of the utility of a data warehouse, not only from the end user's perspective but also from the system's perspective and from a practical point of view.

REFERENCES

[1] IPL, Adequação do Ciclo de Estudos do Curso de Licenciatura em Engenharia Informática, Instituto Politécnico de Leiria, 2007.
[2] T. Markham, J. Larmer, and J. Ravitz, Project Based Learning Handbook, Buck Institute for Education, 2003.
[3] J. Biggs, Teaching for Quality Learning at University, The Society for Research into Higher Education, Open University Press, 2003.
[4] R. Kimball, The Data Warehouse Lifecycle Toolkit, John Wiley & Sons, 1998.
[5] R. Kimball, The Data Warehouse Toolkit, John Wiley & Sons, 2002.
[6] R. Kimball and J. Caserta, The Data Warehouse ETL Toolkit, John Wiley & Sons, 2004.
[7] R. Oliveira and J. Ramos, Implementation of a Didactic Platform for Teaching ETL in Data Warehouses, in press.
[8] Oracle, Oracle 11g Data Warehousing Guide, 2007.
[9] Oracle, Oracle 11g PL/SQL Language Reference, 2009.

