WHITE PAPER
Data Warehouse
Project Management
DAVID M WALKER
Version: 1.0
Date: 14/10/2008
http://www.datamgmt.com
White Paper - Data Warehouse Project Management
Synopsis
Data warehouse projects pose a specific set of challenges for the project manager. Whilst
most IT projects are developed to support a well-defined pattern of work, a data
warehouse is, by design, there to support users asking ad hoc questions of the data available
to the business. It is also a project that will have more interfaces and more change than any
other system within the organisation.
Projects often have poorly set expectations in terms of timescales, the likely return on
investment, the vendors’ promises for tools or the expectations set between the business and
IT within an organisation. They also have large technical architectures and resourcing issues
that need to be handled.
This document will outline the building blocks of good project control including the definition of
phases, milestones, activities, tasks, issues, enhancements, test cases, defects and risks and
will discuss how they can be managed, and when, using an event horizon, the project
manager can expect to get information.
To help manage these building blocks this paper will look at the types of tools and technology
that are available and how they can be used to assist the project manager. It also looks at
how these tools fit into methodologies.
The final section of the paper looks at how effective project leadership and estimating
can improve the chances of success for a project. This includes understanding the roles of
the executive sponsor, project manager, technical architect and senior business analyst along
with the use of different leadership styles, organisational learning and team rotation.
Intended Audience
Reader                   Recommended Reading
Executive                Entire Document
Business Users           Synopsis
IT Management            Entire Document
IT Strategy              Synopsis
IT Project Management    Entire Document
IT Developers            Synopsis
Introduction
Data warehousing projects are notorious for being delivered both late and over budget. Why
is this, and what can be done to prevent it?
The common problems that affect the data warehouse can be grouped by the expectations
that are set, the technology used and the management of resources. There is also a need to
define the components of project control and to develop understanding of the level to which
these can be planned and forecast.
With this understanding it is possible to look at tools, technologies and methodologies that
can assist the project manager to control, support and deliver a solution.
Finally this paper looks at the impact of project leadership techniques and the effect that good
estimating can have on the delivery of a project.
This paper does not set out to be an exhaustive manual for the management of a data
warehouse project, but rather a guide to help project managers understand the difference
between data warehousing projects and others for which they may have been responsible. It also
suggests some things that should be considered when managing such projects.
Common Problems
Data warehouse projects often suffer from poorly set expectations, problems with the
technical architecture and resource turnover. These common categories have multiple and
varied underlying causes:
• Timescales
For most large organisations a data warehouse project will have many phases,
take years to build and have to be supported for many more years afterwards.
Development and production phases will overlap and once in production the
environment will be subject to massive amounts of change. When a Data
Warehouse project is conceived it will not be discussed in these terms but as
just another project with a finite timescale and a simple set of phases.
• Return on Investment
The business case generated to justify a data warehouse project will always
make a number of promises about return on investment, which are deemed
necessary to get the budget. However, the real benefit will either come from an
unexpected quarter (e.g. a fraud being discovered or the ability to drop a costly
product range) or be from intangible benefits (e.g. business analysts that stop
creating their own small database and spreadsheet applications and start
analysing the business and delivering indirect business benefit). Business
cases should be looked upon as a reason for starting rather than a defined
return on investment. The deliverable, and therefore the return on investment,
should be measured in terms of what benefit was derived rather than against
some planned benefit. Some phases will deliver more, and some less than
expected.
o One reporting tool will not be sufficient for all your reporting needs.
o Changing from one tool to another often does not fix the underlying
issue, or does so only at significant cost.
o Technology sets leapfrog one another; a unique feature in one tool will
often be in another with the next release.
IT are often guilty of being over-optimistic about the point at which the benefit
will occur:
o The final solution can never be reconciled exactly against the current
solution, because there will be unknown bugs in the currently
delivered solution, new sources of data and different treatments of that
data. It should, however, be possible to explain all the differences.
o The testing and issue resolution of a system will take longer than
expected.
The business will often point to IT for its failings but it too will fail to manage
expectations. Commonly:
o The business will not invest the time required to deliver requirements
and analysis.
o The data quality will be much worse than the business is prepared to
acknowledge and the business will be slow to respond to the need to
change working practices to improve the data quality.
o The business will not spend enough time or dedicate enough resource
to testing.
o The business will not spend enough time or dedicate enough resource
to training.
o There will be phases of the project where they lose focus as other
business critical events take place.
Whilst this may seem obvious, simplicity is hard to achieve and has to be regularly worked
on. Keeping things simple requires a conscious decision. Some common examples
include:
o Small is beautiful.
Creating a larger number of simple ETL mappings, rather than large single blocks of code,
makes change and maintenance easier.
Have a larger number of smaller data marts rather than one single ‘super-
mart’ which results in complex change management.
Have multiple reports for different people instead of one that tries to satisfy
everyone. A change to meet a requirement from one user will undoubtedly
leave another user unhappy.
o Minimise components.
There is often a temptation to make tweaks to the technical architecture to
have a larger number of ‘best of breed’ tools to accommodate certain
requirements. In practice this increases the maintenance costs and support
skills required whereas the existing toolset can often provide a sufficient
solution.
o Keep it clean.
A data warehouse whose data is not clean quickly falls into disrepute and
becomes unused. The technical architecture has to support data quality
monitoring and data cleansing and there must also be processes to deliver
change in the source system.
[1] This may seem counter-intuitive: how can writing a whole separate application to maintain data be simpler than just
writing the SQL? The reason for this requirement is that, when data is maintained with SQL, there is no audit trail and therefore no way of knowing what
updates have been done. If instead a business user has to maintain the data via an application and this is loaded into
the data warehouse via ETL, then the business takes responsibility for maintaining the data and there is an audit trail
of where the data change came from. Writing SQL also creates a large number of small, ad hoc SQL jobs. This
takes time, and sooner or later someone will make a mistake with the SQL that may take a lot of work to correct.
o Tactical Solutions.
Every project has them and there is often a very good case for them but they
always cost more to remove than to create and often de-rail the ‘keep it
simple’ principle. If tactical solutions become necessary then they should
have an implementation, an expected lifetime, maintenance cost for the
lifetime, a lifetime-overrun cost and a decommissioning cost. When presented
with these costs tactical solutions often look a lot less appealing.
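The cost headings above can be turned into a simple whole-life calculation. The following is a minimal sketch only: the class name, the field names, the monthly charging model and all of the figures are assumptions made for illustration and are not taken from any real project.

```python
from dataclasses import dataclass


@dataclass
class TacticalSolution:
    """The costs that should accompany a tactical solution (hypothetical model)."""
    name: str
    implementation_cost: float
    expected_lifetime_months: int
    maintenance_cost_per_month: float
    overrun_cost_per_month: float  # assumed extra charge for each month past the expected lifetime
    decommissioning_cost: float

    def total_cost(self, actual_lifetime_months: int) -> float:
        """Whole-life cost if the solution is retired after actual_lifetime_months."""
        planned = min(actual_lifetime_months, self.expected_lifetime_months)
        overrun = max(0, actual_lifetime_months - self.expected_lifetime_months)
        return (
            self.implementation_cost
            + planned * self.maintenance_cost_per_month
            + overrun * (self.maintenance_cost_per_month + self.overrun_cost_per_month)
            + self.decommissioning_cost
        )


# Hypothetical figures, for illustration only.
extract = TacticalSolution(
    name="manual spreadsheet extract",
    implementation_cost=5_000,
    expected_lifetime_months=6,
    maintenance_cost_per_month=1_000,
    overrun_cost_per_month=500,
    decommissioning_cost=8_000,
)
on_time = extract.total_cost(6)   # retired on schedule
late = extract.total_cost(12)     # left in place for twice as long
```

With these invented figures, a solution that cost 5,000 to build costs 19,000 over its planned life and 28,000 if it overruns by six months, which is the comparison that tends to make the tactical option less appealing.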
Resource Turnover
Data Warehouses are long-term projects and are likely to engage internal and external
staff for several years. For a variety of reasons staff will come and go during that time.
It is also possible for the new arrivals to bring new ideas and ways to improve what is
being done. Whilst this is to be welcomed it also presents a very specific and potentially
expensive risk. This often manifests itself when new members of staff come in and say
‘I wouldn’t have done it this way – let’s start doing it another way’. For whatever reason,
a significant change in direction most often results in one of two outcomes: expensive
re-work to keep a consistent architecture, or a complex architecture that requires more
support. Recording key design decisions [2] and documenting the route to the current
point can avoid such re-work.
In addition ensure that there is sufficient briefing material available so that new starters
can get up to speed quickly and before launching into the ‘I wouldn’t have done it this
way’ discussions.
[2] Key Design Decisions – a template for documenting critical decisions in the design process. See the ‘Data
Warehouse Documentation Roadmap’ white paper from Data Management & Warehousing.
Phases
A phase of the programme is a set of work with a defined outcome that delivers benefit
to the project as a whole. Note that there should be phases of the project that do not
deliver direct business benefit but are required for the project to succeed. It would be
common for a data warehouse project to have the following initial phases:
Phases can overlap but only in a controlled way such as not to cause resourcing or
technical conflict between phases. The above outline means that whilst the speed of
delivery gets faster later in the project the initial delivery will be six to nine months from
inception.
Phases are often driven by an enhancement packaging process [4] that groups together a
series of requirements and enhancements into a phase scope definition.
[3] This is particularly common in system integrator run projects, where the client often says “How big is your team and
when will they arrive?” because the client wants to see movement on the project.
[4] See Appendix 1 for a description of governance processes for a data warehouse environment.
Milestones
Milestones are the checkpoints within a phase of the project. For example each data
mart phase may have milestones that indicate the completion of Requirements
Gathering, Analysis, Design, Build, Test, Deployment, Training, etc.
Activities
Activities describe what needs to be done to complete a specific milestone. Therefore
the milestone ‘Analysis Complete’ may have activities such as ‘Identify Potential
Source Systems’, ‘Perform Data Quality Analysis’, ‘Perform Source System Analysis’,
etc. These are normally carried out by a team or group of people and normally have
estimated elapsed durations associated with them. Whilst some activities are
dependent on others, many will have dependencies at a lower level. For example, whilst
the activity ‘Identify Potential Source Systems’ is going on there may be no reason
why the ‘Perform Data Quality Analysis’ for the first identified system cannot start.
It is often difficult to represent this in a project management tool and common for
project managers to reflect the situation as a number of activities of fixed duration that
are dependent on the completion of one before the next starts, or not to have any
dependencies between activities. Both methods cause the plan to slip.
Tasks
Tasks are the specific items of work to be carried out by an individual. It is important
that a named individual is responsible for the task, not a group or a team. Where a task
is not yet assigned to an individual it should be assigned to the team leader so that they
are aware that they have to (re-)assign it on to someone to actually do the work.
It should also be noted that tasks can have sub-tasks, for example the task ‘Document
Technical Architecture’ may require sub-tasks of ‘Document Database Architecture’
and ‘Document Hardware Architecture’. All three of these tasks should also have an
individual owner who may or may not be the same person as the person responsible
for bringing the whole document together.
Tasks also have real dependencies on other tasks being completed. Individuals can
work very effectively if they have a bank of outstanding tasks, each with a priority, so
that if they cannot work on one task they can move to the next one quickly and easily.
Issues may hold up tasks.
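The idea of a task bank can be sketched in a few lines of code. This is an illustrative sketch rather than a description of any particular tool: the names Task and next_task, the convention that a lower number means a higher priority, the modelling of sub-tasks as dependencies of their parent, and the owner 'Alice' are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    name: str
    owner: str                      # a named individual, never a team
    priority: int                   # assumption: lower number = more urgent
    depends_on: list[str] = field(default_factory=list)
    open_issues: list[str] = field(default_factory=list)
    done: bool = False


def next_task(bank: list[Task], owner: str) -> Optional[Task]:
    """Return the owner's highest-priority task that is not currently held up."""
    finished = {t.name for t in bank if t.done}
    workable = [
        t for t in bank
        if t.owner == owner
        and not t.done
        and not t.open_issues                              # issues may hold up tasks
        and all(dep in finished for dep in t.depends_on)   # real dependencies
    ]
    return min(workable, key=lambda t: t.priority, default=None)


# Hypothetical task bank using the documentation example from the text.
bank = [
    Task("Document Database Architecture", "Alice", 1,
         open_issues=["Awaiting access to the database server"]),
    Task("Document Hardware Architecture", "Alice", 2),
    Task("Document Technical Architecture", "Alice", 3,
         depends_on=["Document Database Architecture",
                     "Document Hardware Architecture"]),
]
chosen = next_task(bank, "Alice")
```

Here the top-priority task is held up by an issue and the overall document is waiting on its sub-tasks, so the owner is pointed at 'Document Hardware Architecture' instead of being left idle.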
Issues
An issue is something that affects the progress of the project. Issues can occur at a
project level or affecting an individual task. For example:
Issues introduce delays in the project and so it is important to see the impact of issues
as soon as possible.
• Project level issues should be driven down to task level as quickly as possible.
Using the example above of long term sickness the process of driving it down
to task level would be to re-assign tasks that belonged to that individual to
others.
• Task level issues often mean that the person responsible moves onto the next
task in the task bank and can come back to the other task when the issue is
resolved.
Enhancements
An enhancement is a change to the system with the aim of improving it. In a
development such as a data warehouse where new requirements are continuously
being developed it is usual that an enhancement is captured, considered and if
appropriate worked into the requirements documentation for inclusion in a subsequent
phase via the enhancement packaging process.
Not all enhancements need to go through the requirements process. Some, such as an
improved navigation paradigm for the user interface, may be considered and passed
directly as a task to the person who can effect the change. This rapid response process
is very useful if there are separate teams responsible for issues and maintenance as
well as the main development teams.
[5] See Appendix 1 for a description of governance processes for a data warehouse environment.
Test cases
Test cases are a special type of task associated with testing.
A test case is a set of tasks supported by scripts and routines that help automate and
manage the testing of software. Every piece of code developed should have an
associated test case. Where an issue is found with the code it becomes a defect that
has to be rectified. A test case may reveal many defects, or none, and test cases may
have to be repeated a number of times until the code is defect free.
In some environments it is useful to consider not only code but also the review of
documents as test cases and the issues raised from the review as defects. This allows
a separation in the reporting between tasks originally associated with the plan and work
carried out because of the incompleteness of the deliverable (i.e. defects). The test
case for a document would simply be to review it.
As with issues above, mistakes are inevitable in the development process and
therefore should be accepted and documented as part of improving the process rather
than being hidden away.
Defects
Defects are a special type of issue associated with testing.
A defect is a material error with a deliverable. In the case of code this may be as a
result of an undelivered but promised requirement, a misinterpretation of the
requirements, unexpected results (common when testing boundary conditions and
exceptional usage) etc. A defect should be resolved before the code is deployed (even
if the resolution is to say that the functionality will be delivered in a subsequent phase).
This can also be applied to review comments for documentation that are, in effect,
defects in the deliverable. Each defect should be reviewed and changes applied where
appropriate before the defect is closed.
Risks
There may be external circumstances or events that must not occur if the project is to be
successful. If you believe such an event is likely to happen, then it would be a risk.
Identifying something as a risk increases its visibility, and allows a proactive risk
management plan to be put into place. [6]
If an event is within control of the project team, such as having testing complete by a
certain date, then it is not a risk. If an event has a one hundred percent chance of
occurring, then it is not a risk, since there is no "likelihood" involved; it is an
issue.
Risks are managed by developing mitigations or ways in which to avoid the risk or
prevent it from becoming a reality.
[6] http://www.mariosalexandrou.com/definition/risk.asp
• The second is the impact, which is a measure of the effect on the project in terms of
resource, time or scope.
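A risk register that records both likelihood and impact can be sketched as follows. This is an illustration under stated assumptions: the Risk class, the use of elapsed days as the unit of impact, the rule of multiplying likelihood by impact to rank risks, and the two example risks are all hypothetical and are not prescribed by this paper.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    likelihood: float    # probability, strictly between 0 and 1
    impact_days: float   # assumed unit of impact: elapsed days lost
    mitigation: str = ""

    def __post_init__(self) -> None:
        if not 0.0 < self.likelihood < 1.0:
            # A likelihood of 1 is a certainty, which the text treats as an issue.
            raise ValueError("likelihood must be strictly between 0 and 1")

    @property
    def exposure(self) -> float:
        """Likelihood multiplied by impact (an assumed scoring rule)."""
        return self.likelihood * self.impact_days


def rank(risks: list[Risk]) -> list[Risk]:
    """Order risks so that the largest exposure comes first."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


# Hypothetical risks, for illustration only.
risks = [
    Risk("Vendor withdraws support for the reporting tool", 0.1, 60),
    Risk("Source system upgrade changes the extract format", 0.5, 20),
]
ranked = rank(risks)
```

Ranking by a combined score is only one way to decide where mitigation effort goes; the point is that both measures are recorded against every risk.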
Short Term
Current or near-term phases need to track every aspect of the work, from the
scope (enhancements) through milestones, activities, risks, tasks and test cases to,
for the current and next phase, issues and defects. It is the issues and defects that will
cause the current phase to be delayed or have other resource impact. The financial
implications in the short term are over-spending or use of contingency because there is
no space in which to manoeuvre.
Medium Term
Phases further out need to have the scope, milestones, activities, risks and where
possible tasks planned out. These plans, their timescales and the budget for these
phases should be adjusted in the light of experience gained on current and previous
phases, which should also influence the work being defined for these phases.
Long Term
Long-term plans are normally limited to rough scopes of work and timescales (for
example we hope to add market segmentation to the data warehouse and expect it to
take 5 people 3 months). This allows estimated budget requirements to be calculated
but is highly dependent on the timescales and successful delivery of the previous
phases. Learning from current work will improve the estimating over time.
Using the above definitions it is possible to indicate what information a project manager
should ideally have available in the following graphical way:
In practice managers normally have even less information available (indicated by the red
line). This means that both the short-term and medium-term horizons each cover two phases
rather than the three phases of the ideal situation. The long-term view is then restricted to
knowing what enhancements are needed.
In order to manage such a large-scale project it is important to use a number of tools to help
keep track of all the aspects of the project.
There are six major groups of software [7] that are needed for corporate programme/project
management:
• Project Management Software [8]
• Collaborative Software [9]
• Ticket Tracking Systems [10]
• Project Portfolio Management [11]
• Resource Management [12]
• Revision Control [13]
For a data warehouse it would be common to use the following tools in the following way:
As a result these tools are useful for holding the phase, milestone and activity level
information but are rarely useful for task level information in data warehouse projects
due to the large number of interlinking tasks required to manage the environment.
Ticketing
Ticketing systems are often used to track issues with software but their use can be
much broader. Most ticketing systems allow the classification of tickets as well as
dependencies between tickets and a workflow element that allows tickets to be
assigned to others. A well-configured system is therefore the ideal tool for managing
tasks, issues, risks, test cases, defects and individual enhancement requests.
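The three features described above (classification, dependencies and a workflow for assignment) can be sketched with a small data model. This is not the schema of any real ticketing product: the names TicketType, Ticket, assign and blocked, the two status values and the example tickets are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class TicketType(Enum):
    TASK = "task"
    ISSUE = "issue"
    RISK = "risk"
    TEST_CASE = "test case"
    DEFECT = "defect"
    ENHANCEMENT = "enhancement"


@dataclass
class Ticket:
    ticket_id: int
    kind: TicketType                              # classification
    title: str
    assignee: str
    status: str = "open"                          # assumed states: "open" or "closed"
    depends_on: list[int] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def assign(self, new_assignee: str) -> None:
        """Workflow step: hand the ticket on and record who held it before."""
        self.history.append(f"{self.assignee} -> {new_assignee}")
        self.assignee = new_assignee


def blocked(tickets: dict[int, Ticket]) -> list[Ticket]:
    """Open tickets that depend on at least one ticket that is still open."""
    return [
        t for t in tickets.values()
        if t.status == "open"
        and any(tickets[d].status == "open" for d in t.depends_on)
    ]


# Hypothetical tickets: a task held up by an issue.
tickets = {
    1: Ticket(1, TicketType.TASK, "Perform Source System Analysis", "team leader"),
    2: Ticket(2, TicketType.ISSUE, "No access to the source system", "project manager"),
}
tickets[1].depends_on.append(2)
tickets[1].assign("Alice")
held_up = blocked(tickets)
```

A list such as held_up is the kind of view that lets a project manager see what is affecting the timeline without asking each team member in turn.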
Tasks are initially raised; as the team member works on the task they may create
dependent or sub-tasks (often as simple reminders to themselves or others). When
issues arise they are captured against the task that is being affected, allowing project
managers to see what is impacting the project timelines. The same is true for the other
ticket types (enhancement, risk, test case and defect).
[7] http://en.wikipedia.org/wiki/List_of_project_management_software
[8] http://en.wikipedia.org/wiki/Project_management_software
[9] http://en.wikipedia.org/wiki/Collaborative_software
[10] http://en.wikipedia.org/wiki/Issue_tracking_system
[11] http://en.wikipedia.org/wiki/Project_Portfolio_Management
[12] http://en.wikipedia.org/wiki/Resource_Management
[13] http://en.wikipedia.org/wiki/Revision_control
Deploying a ticketing system, and opening it to everyone involved in the project from
business user to developer, allowing everyone to see everything, can have dramatic
effects on the speed and quality of development.
It is important that the culture around the system is one of mutual trust and respect. It is
easy for a poor manager to go and count the number of defects in an individual
developer’s code as a KPI of that developer’s performance. This is very naïve for two
reasons: it discourages honesty and it does not measure performance at all.
Discouraging honesty is a disaster, because the manager will never know the true state
of the project again. The number of defects against one developer may be high
because that person is the most skilled developer and therefore gets the most difficult
work to do, because the specification was wrong, because the source system has
changed, or because the requirements changed when the users saw the initial
results. [14]
An effective ticketing system will also highlight delays and bottlenecks in the process
where, for example, critical blocking tasks go unanswered. Careful review of this
information allows risks and issues to be addressed early, reducing their impact, and
gives an understanding of the root causes of delays that can be used to speed
up subsequent phases.
It also removes the process whereby the project manager each week goes around and
disturbs everyone else on the project to get an update on the status of each task, which
wastes huge amounts of time for those disturbed. A project manager can get the status
from the system and then talk to those individuals whose tasks and issues have a
material effect on the status of the project. This is management by exception rather
than micro-management of the process.
Version Control
A version control system is vital to the development of a data warehouse, not just for
the code developed but for every document used throughout the life of the project and
subsequent maintenance. Whilst there are products built especially for document rather
than code version control, it is often sufficient just to use the same piece of software.
The advantages of source code control have long been established but documentation
control is often not addressed. If there is a requirements document on a shared drive
that is updated when a new requirement emerges and (if the author remembers) the
version number on the front page or in the file name is updated this does not mean that
everyone working on the project has the same document.
If someone copies it to their laptop to work on it at home over the weekend and on their
return copies it back to the shared drive, all would appear well. The issue arises if two
people have had the same idea: when the second person copies their work back to the
shared drive, the first person’s work is simply overwritten.
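The difference between a shared drive and a version-controlled repository can be illustrated with a toy model. This is a simplified sketch, not the behaviour of any specific product: the names DocumentRepository, check_out, check_in and ConflictError are hypothetical, and real version control systems typically offer merging rather than a plain refusal.

```python
class ConflictError(Exception):
    """Raised when a check-in is based on an out-of-date revision."""


class DocumentRepository:
    """Toy in-memory store that keeps every revision of one document."""

    def __init__(self, initial_text: str) -> None:
        self._revisions = [initial_text]   # list index = revision number

    @property
    def head(self) -> int:
        return len(self._revisions) - 1

    def check_out(self) -> tuple[int, str]:
        """Return the latest revision number together with its text."""
        return self.head, self._revisions[self.head]

    def check_in(self, base_revision: int, new_text: str) -> int:
        """Store new_text only if nobody has checked in since base_revision."""
        if base_revision != self.head:
            raise ConflictError(
                f"based on revision {base_revision}, but head is {self.head}"
            )
        self._revisions.append(new_text)
        return self.head


# Two people take the same copy home for the weekend.
repo = DocumentRepository("Requirement 1")
base_a, text_a = repo.check_out()
base_b, text_b = repo.check_out()

repo.check_in(base_a, text_a + "\nRequirement 2")      # first person returns
try:
    repo.check_in(base_b, text_b + "\nRequirement 3")  # second person returns
    overwritten = True
except ConflictError:
    overwritten = False                                 # first person's work survives
```

Because every check-in names the revision it started from, the second person is told about the clash instead of silently replacing the first person's work, and all earlier revisions remain available.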
[14] This is a failing in the requirements capture process that should be picked up by the project manager from these
tickets and corrected.
Using a version control system also means that the development team can work against a
fixed version of the requirements for a period of time, even if others in the organisation are
still updating the requirements. This stability allows the development process to be
smoother. A gap analysis performed at a later stage, to understand what has changed, allows
the developers to catch up with the requirements (normally in another phase).
Finally the repository provides an auditable trail of the changes in all aspects of the
project documentation that again can help with improving future phases.
Wiki
The introduction to this section described collaborative software that covers a broad
swathe of technology. In practice the most useful software that one can use is a wiki.
This provides a number of web pages that any user can update with a syntax that is
simpler than standard HTML. The pages that get created are often part of the informal
structure that helps the programme move along.
Wikis can be used for frequently asked questions, staff lists, announcements, social
events, guides to using other parts of the project, etc. Since they are a shared
resource, editable by all they can provide a productive communication tool that is far
more effective than email for encouraging collaboration.
Wikis are more common than most people realise; websites such as Wikipedia build an
encyclopaedia with the same quick, cheap, easy software that can be
deployed as the team’s main source of information for the project. A wiki can also have the
‘water-cooler’ effect, where people from IT and the business gather to share information
in an informal manner rather than through formal documentation or review processes.
Methodologies
Most organisations will have a preferred IT development methodology. It is likely that the
larger the organization the more structured the methodology (for example PRINCE2 [15] or other
waterfall approaches). These processes are often assessed by mechanisms such as CMMI [16].
Few large organisations are willing to adopt agile [17] approaches despite both the business
side saying that IT should deliver faster and meet requirements more completely and the IT
side saying that they are trying to be flexible and meet that demand.
As a result the organisations’ methodologies will not be well suited for the development of a
data warehouse. This is because most methodologies have a fixed scope and a clear
objective. The deployment of a new financial package covers certain well-defined business
processes, has well understood inputs and outputs and bears scrutiny by use cases to
examine the steps.
A data warehouse is a system that will have a number of fixed outputs but also the intangible
‘just load up the data and I will know what I need when I see it’ type requirement. It has a
huge number of changing inputs from the source systems [18] and often has large parts that will
be in production whilst others are under development, causing a crossover between the
deployment and maintenance phases.
Data warehouses are therefore inherently risky ventures and yet assessments such as CMMI
set out to avoid projects that carry risk. Peopleware [19] suggests of CMMI assessments:
The projects most worth doing are the ones that will
move you DOWN one full level on your process scale
Trying to deploy large-scale data warehouses in large organisations is what has led to the
concepts described in this document, whereby phases, milestones and activities can be
managed within the organisation’s standard structures whilst allowing more agile development
methods at the task level, as well as detailed tracking of the relationship between issues and
tasks, which improves time and resource management.
Remember that it is the deliverable and not the methodology used to get there that is
important. It is all too easy to get trapped in trying to complete a methodology rather than the
deliverable.
[15] http://en.wikipedia.org/wiki/Prince2
[16] http://en.wikipedia.org/wiki/CMMI
[17] http://en.wikipedia.org/wiki/Agile_software_development
[18] Imagine an organisation that has 25 source systems and two software upgrades/patches a year to each of them.
This means that there are 50 changes a year, or about one a week, to be analysed, designed, built and tested as well as
managing maintenance and other issues that arise.
[19] Peopleware – Productive Projects and Teams: Tom DeMarco and Timothy Lister, Dorset House Publishing, Second
Edition 1999, Chapter 29: Process Improvement Programmes
Project Leadership
The single biggest impact on any project comes from the leadership it receives. Since a data
warehouse is such a long-term commitment the leadership staff should understand that they
are entering into a project that they may be engaged in for several years. A revolving door
policy of leaders will only slow the project down.
• The sponsors: An executive sponsor and his delegate, the project sponsor, who
commission the project and will ultimately control its use.
• The project team: Leadership within the team normally consists of three individuals, a
project manager, a senior technical architect and a senior business analyst who must
work together to deliver, each taking responsibility for their own area but deferring to
the others for their respective areas of responsibility.
The Executive Sponsor acts as a vocal and visible champion, legitimizes the project’s
goals and objectives, keeps abreast of major project activities, and is the ultimate
decision-maker for the project. The Executive Sponsor provides support for the Project
Sponsor and the Project Manager. They have final approval of all scope changes, and
sign off on approvals to proceed to each succeeding project phase. The Executive
Sponsor may elect to delegate some of the above responsibilities to the Project
Sponsor (sometimes known as the Executive Sponsor’s Agent).
Project Sponsor
The Project Sponsor [20] is someone who strongly supports the objectives of the project
and is willing to act as a champion and advocate on behalf of the project. To this end
they are the primary source of business input and take ownership from a business and
governance perspective.
• Accountability
The project sponsor is responsible for holding the project manager to account
and consequently keeping the project on track. The project sponsors must also
make themselves available to provide support and address risks and issues.
[20] Adapted from:
http://www.stanford.edu/dept/its/projects/PMO/files/linked_files/guidelines_sponsors.pdf
http://ithelp.lincoln.ac.nz/site/story_images/3021_project_sponsor_s8869.pdf
http://www.cit.cornell.edu/computer/robohelp/cpmm/Project_Roles_and_Responsibilities.htm
• Strategic Fit
Ensure that the project is aligned with the organization’s strategic goals.
• Resources
Provide or locate resources for the project especially where these are outside
the control of the project manager and protect resources from being pulled
away.
• Project Finances
Provide or locate funding for the project and track the budget in conjunction
with the project manager.
Project Manager
The Project Manager is the person responsible for ensuring that the Project Team
completes the project. The Project Manager develops the project plan with the team
and manages the team’s performance of project activities. It is also the responsibility of
the Project Manager to secure acceptance and approval of deliverables from the
Project Sponsor and other stakeholders. The Project Manager is responsible for
communication, including status reporting, risk management, escalation of issues that
cannot be resolved in the team, and, in general, making sure the project is delivered in
budget, on schedule, and within scope.
Technical Architect
The Technical Architect is the single point of responsibility for the technical solution
from an application and system perspective.
Leadership Style
Managing a small team to develop a large complex environment requires a degree of
skill and the ability to adjust the style of leadership to match the situation. This type of
flexibility is known as situational leadership and is a necessary quality in a data
warehouse project.
Kenneth Blanchard [21] wrote “... managers should work for their people, ... and not
the reverse. ... If you think that your people are responsible and that your job ...”
[Figure: situational leadership styles; surviving labels are ‘Supporting’ and ‘Coaching’, with the axis ‘Supportive Behaviour’]
As staff develop on a data warehouse project it is typical for a good leadership team to
move from a directive style through the coaching style and supporting style and
ultimately reaching the delegating style. For example the technical architect may start
with ‘Build the data warehouse this way’ moving through ‘We build the data warehouse
this way because …’ and ‘I like what you have done but have you considered …’ and
ending with ‘Thanks, that’s just what I envisioned’.
To this end a strong relationship and shared vision between project manager, technical
architect and senior business analyst is essential.
21
"Leadership and the One-Minute Manager", Kenneth Blanchard, Patricia Zigarmi, and Drea Zigarmi, p18. This is a
refinement of Path-Goal Theory developed by Robert House in 1971
Organisational Learning
Most people working on a project will learn in some way, but this knowledge, locked
away in an individual, is of little value to the business as a whole. The team have to
become a learning organisation that actively creates, captures, transfers and mobilizes
knowledge to enable it to adapt to a changing environment. [22]
“Knowledge is not knowledge until someone else knows that one knows.” [23]
For a data warehouse project organisational learning is critical: not only is the
environment changing rapidly, but the number of places from which information is
sourced is larger than in any other IT project the business is likely to engage in. In this
context information is more than data being transferred inside systems; it also includes
information about how business processes work, key contacts both inside and outside
the organisation, critical business requirements, data quality issues, etc.
22
http://en.wikipedia.org/wiki/Organizational_learning
23
Gaius Lucilius, c. 180 BC - 103 BC
Team Rotation
It is common practice for the most experienced developers to be put in charge of major
enhancements and then, on a sliding scale of experience, to assign people to maintenance
work and issue resolution. This has the initial benefit of getting the best developers to do
the largest chunk of work, but projects can also benefit from rotating people so that, after
the enhancement is developed, those resources move to issue handling and other developers
move up to maintenance from issues and to enhancements from maintenance.
Less experienced team members see career development because they can learn
new skills and have an opportunity to show that they can move up to the next level. If
problems occur they can still be supported by more senior developers, but it is better to
allow people to learn early.
This practice is also helpful in staff retention because it creates varied work and an
opportunity to learn. It also means that people will work across subject areas and
therefore have a greater understanding of the whole. This in turn makes it easier when
individuals do leave because, combined with the organisational learning discussed above, the
team is greater than the sum of its parts.
It is important that everyone understands that the timescales will change in light of
experience. Subsequent re-assessments of effort will become more accurate if the
organisation is willing to learn. This is often done through the organisational learning process
described above, combined with end of milestone or phase reviews and detailed reviews of
the ticketing system: what issues arose, whether similar issues will arise in the next
development, and how to allow or compensate for them.
Tasks are also non-uniform: some, often those at the start and the end of a phase, will take a
short amount of time; others, often those in the middle of the phase, will take
longer. [24] Project monitoring systems that determine completeness by the number of tasks
complete as a percentage of the total number of tasks will see the project initially getting
‘ahead of schedule’ before ‘falling massively behind schedule’ and finally ‘recovering some of
the lost time’. This is exacerbated by the fact that if a ticketing system is used additional tasks
will be created and therefore the percentage complete will appear to drop at some points in
the project. Better measures are around the ratio of tasks to issues and the completion of
activities to planned activity duration.
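The distortion described above can be illustrated with a small sketch. The task list and durations are invented purely for illustration (short tasks at the start and end of a phase, long ones in the middle), and the code assumes that tasks are completed in order and exactly to plan. It compares the share of tasks complete with the share of planned time actually used.

```python
# Hypothetical phase: planned effort in days per task, in execution order.
# Short tasks sit at the start and end, long tasks in the middle.
planned_days = [1, 1, 2, 8, 10, 8, 2, 1, 1]


def completion_profile(durations):
    """Return (elapsed_fraction, task_count_fraction) after each task completes."""
    total_effort = sum(durations)
    total_tasks = len(durations)
    profile = []
    elapsed = 0
    for done, days in enumerate(durations, start=1):
        elapsed += days
        profile.append((elapsed / total_effort, done / total_tasks))
    return profile


for elapsed, by_count in completion_profile(planned_days):
    # A positive gap means the task count overstates progress ("ahead of schedule");
    # a negative gap means it understates progress ("behind schedule").
    gap = by_count - elapsed
    print(f"time used {elapsed:6.1%}  tasks complete {by_count:6.1%}  gap {gap:+7.1%}")
```

Even though this hypothetical project runs exactly to plan, the count-based measure first reports it as ahead, then as behind, and only converges at the end.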
Agile and other similar methodologies provide several techniques for improving
estimation.
The first of these is velocity [25] (sometimes better known as load factor). As used here, this is
the ratio of the time a task actually takes to the developers’ estimate. So if a task is estimated as
taking 4 days and then takes 10 days the “velocity” is 10/4 or 2.5. For the next phase the
developers’ estimates are taken and multiplied by 2.5 to determine the likely timescale. The
velocity is measured for each estimate and, whilst initially it will oscillate widely, after a few
iterations it will settle to a factor that can regularly be used to calculate elapsed time.
The accuracy comes from allowing for two factors. The first factor is the developers’ own
tendency to over- or under-estimate the effort required to do any given piece of work. The
second factor is the distraction that the work environment provides through telephone calls,
e-mails, meetings, etc. These work distractions usually have an organisation-specific pattern
that is not normally factored in by developers but needs to be consistently applied to plans.
Over a period of time developers will come to understand their own estimating and their
velocity will tend to a constant. The consistency of this number is what allows project
managers to estimate accurately.
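As a minimal sketch of the calculation, the following uses the paper's definition of velocity (actual time divided by estimated time). The function names and the history of estimates are hypothetical, and taking the ratio of totals over past tasks is one assumed way of settling on a single factor; the paper does not prescribe a particular averaging method.

```python
def velocity(estimated_days, actual_days):
    """Velocity as defined in the text: actual effort divided by estimated effort."""
    if estimated_days <= 0:
        raise ValueError("estimated_days must be positive")
    return actual_days / estimated_days


def project_velocity(history):
    """Overall factor for a list of (estimated_days, actual_days) pairs.

    Assumption: the factor is the ratio of total actual to total estimated effort.
    """
    total_estimated = sum(est for est, _ in history)
    total_actual = sum(act for _, act in history)
    return velocity(total_estimated, total_actual)


def likely_timescale(new_estimates, factor):
    """Scale the developers' raw estimates by the measured velocity."""
    return sum(new_estimates) * factor


# The worked example from the text: estimated 4 days, took 10 days.
assert velocity(4, 10) == 2.5

# Hypothetical history from earlier iterations and raw estimates for the next phase.
history = [(4, 10), (5, 11), (3, 7), (6, 14)]
factor = project_velocity(history)
print(f"velocity: {factor:.2f}")
print(f"likely timescale: {likely_timescale([3, 5, 2], factor):.1f} days")
```

Recomputing the factor after each iteration shows whether it is settling towards a constant, which is the point at which it becomes useful for planning.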
Agile development processes also introduce the concept of technical debt. [26] This is the
obligation that an organization incurs when it chooses a design or construction approach that
is expedient in the short term but that increases complexity and is more costly in the long
term.
24
See the Data Management & Warehousing white paper “How Data Works” for a discussion of volume vs.
complexity in data and how this affects the timescales in a data warehouse project.
25
Version One discussion of Velocity: http://www.versionone.com/Resources/Velocity.asp
26
First proposed by Ward Cunningham in 1992 at the OOPSLA92 conference (http://c2.com/doc/oopsla92.html)
Technical debt can occur accidentally (when a mistake is made) or deliberately (when a
conscious decision is made to put something off). The secret to a successful implementation
is to work continually to manage and reduce the debt, just like debt in real life. It is technical
debt, so often unaccounted for, that derails large projects: they run out of resource
and do not understand where it has gone.
Debt requires repayment with interest, i.e. whatever has been done has cost something (the
debt) and correcting it requires additional resource (the interest) to put it to how it should be.
As in financial circumstances, delaying repayment always costs more in the long term. [27]
It is often thought that time-boxing project phases either cannot be done for data warehouse
projects or will force developers to deliver something. Neither is true. A time-boxed project will
deliver something but can incur technical debt for those things that get omitted or
compromised. The delivered solution may be of little value if the time allowed is insufficient. If
the techniques described above are employed then early phases should be run as scope or
benefit driven rather than relying on time boxing. As velocity and technical debt become better
understood in the environment, the project can switch to time-boxed activities to allow the
business to see a tangible, time-scaled development and deployment roadmap.
Finally the project manager must understand the cost of their actions. By putting good
communication, processes and systems in place the project manager can effectively monitor
and manage a project by exception, allowing the senior business analyst to handle the
business aspects and the technical architect to manage the technology and together progress
the delivery. If a team of eight people is brought together once a week for a two-hour status
meeting with the project manager, then two working days out of the forty working days (eight
people for five days a week, assuming eight-hour days) available in the week are lost. That
weekly status meeting consumes 5% of all the available resource!
28
The ultimate management sin is wasting people’s time
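The arithmetic behind the 5% figure can be checked with a short sketch. The function name and its parameters are illustrative only; the defaults reflect the paper's assumptions of eight-hour days and a five-day week.

```python
def meeting_cost(people, meeting_hours, hours_per_day=8, days_per_week=5):
    """Return (working days lost, working days available, fraction lost) per week."""
    lost_days = people * meeting_hours / hours_per_day
    available_days = people * days_per_week
    return lost_days, available_days, lost_days / available_days


# The example from the text: eight people in a two-hour weekly status meeting.
lost, available, fraction = meeting_cost(people=8, meeting_hours=2)
print(f"{lost:g} of {available} working days lost ({fraction:.0%})")
```

Because every attendee loses the same two hours out of forty, the percentage is independent of team size; what grows with the team is the absolute number of working days lost.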
27
Note: The original draft of this paper was written before the 2008 financial crisis. The failure to manage debt in the
real world caused significant institutional melt down. This is mirrored by the meltdown of large data warehouse
projects within large organisations that have failed to manage technical debt.
28
Peopleware: Productive Projects and Teams, Tom DeMarco and Timothy Lister, Dorset House Publishing, Second
Edition 1999, Chapter 33
• Phase 1: Mobilisation
Setting up a project has a cost; having a Mobilisation phase allows time and resource
to be allocated to the setup. If there is no allowance for it, then the delivery of the first
phase will be delayed by at least the cost of the setup.
• Transparency
By default let everyone involved with the project see everything. Exceptionally restrict
access to sensitive documents. Always challenge any request to categorise a
document as “sensitive”. This helps for many reasons: delays are not a sudden
surprise to users, others on the project often volunteer help or expertise about issues
in unexpected areas, it develops an inclusive aspect to the team dynamic, it provides
an opportunity for people to learn from the work of others, etc.
29
“UML Distilled”, Martin Fowler: “The more I see of use cases, the less valuable the use case
diagram seems to be. With use cases, concentrate your energy on their text rather than on
the diagram. Despite the fact that the UML has nothing to say about the use case text, it is
the text that contains all the value in the technique.”
Summary
Data warehouse projects pose a specific set of challenges for the project manager. Projects
often have poorly set expectations in terms of timescales, the likely return on investment, the
vendors’ promises for tools or the expectations set between the business and IT within an
organisation. They also have large technical architectures and resourcing issues that need to
be handled.
This document has outlined the building blocks of good project control including the definition
of phases, milestones, activities, tasks, issues, enhancements, test cases, defects and risks
and discussed how they can be managed, and when, using an event horizon, the project
manager can expect to get information.
To help manage these building blocks this paper has looked at the types of tools and
technology and how they can be used to assist the project manager. These tools include
project management software, ticketing systems, version control and wikis. It has also looked
at how these tools fit into methodologies.
The final section of the paper has looked at how effective project leadership and estimating
can improve the chances of success for a project. This has included understanding the roles
of the executive sponsor, project manager, technical architect and senior business analyst
along with the use of different leadership styles, organisational learning and team rotation.
Outlined below are the major components of each of the processes. These are given as
examples only and are shown only with the high level summary diagrams and short
descriptions as indicators to what should be developed by an organisation to service its
environment.
The solution provides a complete environment including version control, ticketing, wiki and
many other features using free, widely available, open source software.
http://projects.datamgmt.com
http://projects.datamgmt.com/demo
The Data Management & Warehousing internal project management (screenshot above) can
be found at:
http://projects.datamgmt.com/datamgmt
A description of how to build the system on your own server can be found at:
http://projects.datamgmt.com/howto
The company also hosts the service for clients who require a fast start on their projects.
Copyright
© 2008 Data Management & Warehousing. All rights reserved. Reproduction not permitted
without written authorisation. References to other companies and their products use
trademarks owned by the respective companies and are for reference purposes only.