
Structured and

Comprehensive Approach to
Data Management and the
Data Management Book of
Knowledge (DMBOK)

Alan McSweeney
Objectives

• To provide an overview of a structured approach to
developing and implementing a detailed data management
policy including frameworks, standards, project, team and
maturity

March 8, 2010 2
Agenda

• Introduction to Data Management
• State of Information and Data Governance
• Other Data Management Frameworks
• Data Management and Data Management Book of
Knowledge (DMBOK)
• Conducting a Data Management Project
• Creating a Data Management Team
• Assessing Your Data Management Maturity

Preamble

• Every good presentation should start with quotations from
The Prince and Dilbert

Management Wisdom

• There is nothing more difficult to take in hand, more perilous to conduct or more
uncertain in its success than to take the lead in the introduction of a new order of
things.
− The Prince

• Never be in the same room as a decision. I'll illustrate my point with a puppet
show that I call "Journey to Blameville" starring "Suggestion Sam" and "Manager
Meg."
• You will often be asked to comment on things you don't understand. These
handouts contain nonsense phrases that can be used in any situation so, let's
dominate our industry with quality implementation of methodologies.
• Our executives have started their annual strategic planning sessions. This involves
sitting in a room with inadequate data until an illusion of knowledge is attained.
Then we'll reorganise, because that's all we know how to do.
− Dilbert

Information

• Information in all its forms – input, processed, outputs – is a
core component of any IT system
• Applications exist to process data supplied by users and
other applications
• Data breathes life into applications
• Data is stored and managed by infrastructure – hardware and
software
• Data is a key organisation asset with a substantial value
• Significant responsibilities are imposed on organisations in
managing data

[Diagram: IT Systems – Applications, Processes, Information, People, Infrastructure]

Data, Information and Knowledge

• Data is the representation of facts as text, numbers, graphics,
images, sound or video
• Data is the raw material used to create information
• Facts are captured, stored, and expressed as data
• Information is data in context
• Without context, data is meaningless - we create meaningful
information by interpreting the context around data
• Knowledge is information in perspective, integrated into a viewpoint
based on the recognition and interpretation of patterns, such as
trends, formed with other information and experience
• Knowledge is about understanding the significance of information
• Knowledge enables effective action

Data, Information, Knowledge and Action

[Diagram: Data, Information, Knowledge, Action]

Information is an Organisation Asset

• Tangible organisation assets are seen as having a value and
are managed and controlled using inventory and asset
management systems and procedures
• Data, because it is less tangible, is less widely perceived as
a real asset, assigned a real value and managed as if it had
a value
• High quality, accurate and available information is a pre-
requisite to effective operation of any organisation

Data Management and Project Success

• Data is fundamental to the effective and efficient
operation of any solution
− Right data
− Right time
− Right tools and facilities
• Without data the solution has no purpose
• Data is too often overlooked in projects
• Project managers frequently do not appreciate the
complexity of data issues

Generalised Information Management Lifecycle

• Generalised lifecycle that differs for specific
information types
• Design, define and implement framework to manage
information through this lifecycle

[Diagram: lifecycle stages]
− Enter, Create, Acquire, Derive, Update, Capture
− Store, Manage, Replicate and Distribute
− Protect and Recover
− Archive and Recall
− Delete/Remove
Spanning all stages: Manage, Control and Administer

Expanded Generalised Information Management
Lifecycle
• Include phases for information management lifecycle design
and implementation of appropriate hardware and
software to actualise lifecycle

[Diagram: expanded lifecycle stages]
− Plan, Design and Specify
− Implement Underlying Infrastructure
− Enter, Create, Acquire, Derive, Update, Capture
− Store, Manage, Replicate and Distribute
− Protect and Recover
− Archive and Recall
− Delete/Remove
Spanning all stages: Design, Implement, Manage, Control and Administer

Data and Information Management

• Data and information management is a business process
consisting of the planning and execution of policies,
practices, and projects that acquire, control, protect,
deliver, and enhance the value of data and information
assets

Data and Information Management

To manage and utilise information as a strategic asset

To implement processes, policies, infrastructure and solutions to
govern, protect, maintain and use information

To make relevant and correct information available in all business
processes and IT systems for the right people in the right context at
the right time with the appropriate security and with the right
quality

To exploit information in business decisions, processes and
relations

Data Management Goals

• Primary goals
− To understand the information needs of the enterprise and all its
stakeholders
− To capture, store, protect, and ensure the integrity of data assets
− To continually improve the quality of data and information,
including accuracy, integrity, integration, relevance and
usefulness of data
− To ensure privacy and confidentiality, and to prevent
unauthorised or inappropriate use of data and information
− To maximise the effective use and value of data and information
assets

Data Management Goals

• Secondary goals
− To control the cost of data management
− To promote a wider and deeper understanding of the value of
data assets
− To manage information consistently across the enterprise
− To align data management efforts and technology with business
needs

Triggers for Data Management Initiative

• When an enterprise is about to undertake architectural
transformation, data management issues need to be
understood and addressed
• A structured and comprehensive approach to data
management enables the effective use of data to capitalise
on its competitive advantages

Data Management Principles

• Data and information are valuable enterprise assets
• Manage data and information carefully, like any other
asset, by ensuring adequate quality, security, integrity,
protection, availability, understanding and effective use
• Share responsibility for data management between
business data owners and IT data management
professionals
• Data management is a business function and a set of
related disciplines

Organisation Data Management Function

• Business function of planning for, controlling and
delivering data and information assets
• Development, execution, and supervision of plans,
policies, programs, projects, processes, practices and
procedures that control, protect, deliver, and enhance the
value of data and information assets
• Scope of the data management function and the scale of
its implementation vary widely with the size, means, and
experience of organisations
• Role of data management remains the same across
organisations even though implementation differs widely

Scope of Complete Data Management Function

Data Management
• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Data Quality Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management

Shared Role Between Business and IT

• Data management is a shared responsibility between data
management professionals within IT and the business data
owners representing the interests of data producers and
information consumers
• Business data ownership is concerned with
accountability for business responsibilities in data
management
• Business data owners are data subject matter experts
• Represent the data interests of the business and take
responsibility for the quality and use of data

Why Develop and Implement a Data Management
Framework?
• Improve organisation data management efficiency
• Deliver better service to business
• Improve cost-effectiveness of data management
• Match the requirements of the business to the management of the
data
• Embed handling of compliance and regulatory rules into data
management framework
• Achieve consistency in data management across systems and
applications
• Enable growth and change more easily
• Reduce data management and administration effort and cost
• Assist in the selection and implementation of appropriate data
management solutions
• Implement a technology-independent data architecture

Data Management Issues

• Discovery - cannot find the right information
• Integration - cannot manipulate and combine information
• Insight - cannot extract value and knowledge from
information
• Dissemination - cannot consume information
• Management – cannot manage and control information
volumes and growth

Data Management Problems – User View

• Managing Storage Equipment
• Application Recoveries / Backup Retention
• Vendor Management
• Power Management
• Regulatory Compliance
• Lack of Integrated Tools
• Dealing with Performance Problems
• Data Mobility
• Archiving and Archive Management
• Storage Provisioning
• Managing Complexity
• Managing Costs
• Backup Administration and Management
• Proper Capacity Forecasting and Storage Reporting
• Managing Storage Growth

Information Management Challenges

• Explosive Data Growth
− Value and volume of data are overwhelming
− More data is seen as critical
− Annual growth rate of 50+%
• Compliance Requirements
− Compliance with stringent regulatory requirements and audit
procedures
• Fragmented Storage Environment
− Lack of enterprise-wide hardware and software data storage
strategy and discipline
• Budgets
− Frozen or being cut

Data Quality

• Poor data quality costs real money
• Process efficiency is negatively impacted by poor data
quality
• Full potential benefits of new systems are not realised
because of poor data quality
• Decision making is negatively affected by poor data quality

State of Information and Data Governance

• Information and Data Governance Report, April 2008
− International Association for Information and Data Quality (IAIDQ)
− University of Arkansas at Little Rock, Information Quality Program
(UALR-IQ)

Your Organisation Recognises and Values Information as a
Strategic Asset and Manages it Accordingly

Strongly Disagree 3.4%
Disagree 21.5%
Neutral 17.1%
Agree 39.5%
Strongly Agree 18.5%

Direction of Change in the Results and Effectiveness of the
Organisation's Formal or Informal Information/Data
Governance Processes Over the Past Two Years

Results and Effectiveness Have Significantly Improved 8.8%
Results and Effectiveness Have Improved 50.0%
Results and Effectiveness Have Remained Essentially the Same 31.9%
Results and Effectiveness Have Worsened 3.9%
Results and Effectiveness Have Significantly Worsened 0.0%
Don’t Know 5.4%

Perceived Effectiveness of the Organisation's Current
Formal or Informal Information/Data Governance Processes

Excellent (All Goals are Met) 2.5%
Good (Most Goals are Met) 21.1%
OK (Some Goals are Met) 51.5%
Poor (Few Goals are Met) 19.1%
Very Poor (No Goals are Met) 3.9%
Don’t Know 2.0%

Actual Information/Data Governance Effectiveness
vs. Organisation's Perception

It is Better Than Most People Think 20.1%
It is the Same as Most People Think 32.4%
It is Worse Than Most People Think 35.8%
Don’t Know 11.8%

Current Status of Organisation's Information/Data
Governance Initiatives

Started an Information/Data Governance Initiative, but Discontinued the Effort 1.5%
Considered a Focused Information/Data Governance Effort but Abandoned the Idea 0.5%
None Being Considered - Keeping the Status Quo 7.4%
Exploring, Still Seeking to Learn More 20.1%
Evaluating Alternative Frameworks and Information Governance Structures 23.0%
Now Planning an Implementation 13.2%
First Iteration Implemented in the Past 2 Years 19.1%
First Iteration in Place for More Than 2 Years 8.8%
Don’t Know 6.4%

Expected Changes in Organisation's Information/Data
Governance Efforts Over the Next Two Years

Will Increase Significantly 46.6%
Will Increase Somewhat 39.2%
Will Remain the Same 10.8%
Will Decrease Somewhat 1.0%
Will Decrease Significantly 0.5%
Don’t Know 2.0%


Overall Objectives of Information / Data Governance
Efforts

Improve Data Quality 80.2%
Establish Clear Decision Rules and Decision-making Processes for Shared Data 65.6%
Increase the Value of Data Assets 59.4%
Provide Mechanism to Resolve Data Issues 56.8%
Involve Non-IT Personnel in Data Decisions IT Should not Make by Itself 55.7%
Promote Interdependencies and Synergies Between Departments or Business Units 49.6%
Enable Joint Accountability for Shared Data 45.3%
Involve IT in Data Decisions non-IT Personnel Should not Make by Themselves 35.4%
Other 5.2%
None Applicable 1.0%
Don't Know 2.6%

Change In Organisation's Information / Data Quality
Over the Past Two Years
Information / Data Quality Has Significantly Improved 10.5%
Information / Data Quality Has Improved 68.4%
Information / Data Quality Has Remained Essentially the Same 15.8%
Information / Data Quality Has Worsened 3.5%
Information / Data Quality Has Significantly Worsened 0.0%
Don’t Know 1.8%

Maturity Of Information / Data Governance Goal
Setting And Measurement In Your Organisation

5 - Optimised 3.7%
4 - Managed 11.8%
3 - Defined 26.7%
2 - Repeatable 28.9%
1 - Ad-hoc 28.9%

Maturity Of Information / Data Governance
Processes And Policies In Your Organisation
5 - Optimised 1.6%
4 - Managed 4.8%
3 - Defined 24.5%
2 - Repeatable 46.3%
1 - Ad-hoc 22.9%

Maturity Of Responsibility And Accountability For
Information / Data Governance Among Employees In Your
Organisation
5 - Optimised 6.9%
4 - Managed 3.2%
3 - Defined 31.7%
2 - Repeatable 25.4%
1 - Ad-hoc 32.8%
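
The three maturity distributions above can be compared with a single summary figure. The sketch below computes a weighted mean level for each one. It assumes the five levels can be treated as evenly spaced numbers, which the survey itself does not claim, and the function and variable names are illustrative only.

```python
def mean_maturity(distribution):
    """Weighted mean level from a {level: percent of respondents} mapping.

    Divides by the sum of the percentages so that rounding in the
    published figures (totals slightly off 100) does not skew the result.
    """
    total = sum(distribution.values())
    return sum(level * pct for level, pct in distribution.items()) / total


# Percentages copied from the three maturity charts.
goal_setting = {5: 3.7, 4: 11.8, 3: 26.7, 2: 28.9, 1: 28.9}
processes = {5: 1.6, 4: 4.8, 3: 24.5, 2: 46.3, 1: 22.9}
accountability = {5: 6.9, 4: 3.2, 3: 31.7, 2: 25.4, 1: 32.8}

for name, dist in [("goal setting and measurement", goal_setting),
                   ("processes and policies", processes),
                   ("responsibility and accountability", accountability)]:
    print(f"{name}: {mean_maturity(dist):.2f}")
```

All three means fall between level 2 (Repeatable) and level 3 (Defined), which matches the shape of the charts.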

Other Data Management Frameworks

Other Data Management-Related Frameworks

• TOGAF (and other enterprise architecture standards) define a
process for arriving at an enterprise architecture definition, including
data
• TOGAF has a phase relating to data architecture
• TOGAF deals with the high level
• DMBOK translates the high level into specific details
• COBIT is concerned with IT governance and controls:
− IT must implement internal controls around how it operates
− The systems IT delivers to the business and the underlying business processes
these systems actualise must be controlled – these are controls external to IT
− To govern IT effectively, COBIT defines the activities and risks within IT that
need to be managed
• COBIT has a process relating to data management
• Neither TOGAF nor COBIT is concerned with detailed data
management design and implementation

DMBOK, TOGAF and COBIT
[Diagram: relationship between TOGAF, COBIT and DMBOK]
• DMBOK is a specific and comprehensive data-oriented framework
• DMBOK provides detail for definition, implementation and
operation of data management and utilisation
• TOGAF defines the process for creating a data architecture as part
of an overall enterprise architecture
• TOGAF can be a precursor to implementing data management
• COBIT provides data governance as part of overall IT governance
• COBIT can provide a maturity model for assessing data management

DMBOK, TOGAF and COBIT – Scope and Overlap
[Venn diagram: scope of DMBOK and its areas of overlap with TOGAF and COBIT]
DMBOK
• Data Development
• Data Operations Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management
• Data Quality Management
TOGAF
• Data Architecture Management
• Data Management
• Data Migration
COBIT
• Data Governance
• Data Security Management

TOGAF and Data Management
• Phase C1 (subset of Phase C) relates to defining a data
architecture

[Diagram: TOGAF Architecture Development Method cycle]
− Phase A: Architecture Vision
− Phase B: Business Architecture
− Phase C: Information Systems Architecture
  − Phase C1: Data Architecture
  − Phase C2: Solutions and Application Architecture
− Phase D: Technology Architecture
− Phase E: Opportunities and Solutions
− Phase F: Migration Planning
− Phase G: Implementation Governance
− Phase H: Architecture Change Management
− Requirements Management (centre of the cycle)

TOGAF Phase C1: Information Systems Architectures
- Data Architecture - Objectives
• Purpose is to define the major types and sources of data
necessary to support the business, in a way that is:
− Understandable by stakeholders
− Complete and consistent
− Stable
• Define the data entities relevant to the enterprise
• Not concerned with design of logical or physical storage
systems or databases

TOGAF Phase C1: Information Systems Architectures
- Data Architecture - Overview
[Diagram: Phase C1 is described in terms of Approach Elements, Inputs, Steps and Outputs]

Approach Elements
• Key Considerations for Data Architecture
• Architecture Repository

Inputs
• Reference Materials External to the Enterprise
• Non-Architectural Inputs
• Architectural Inputs

Steps
• Select Reference Models, Viewpoints, and Tools
• Develop Baseline Data Architecture Description
• Develop Target Data Architecture Description
• Perform Gap Analysis
• Define Roadmap Components
• Resolve Impacts Across the Architecture Landscape
• Conduct Formal Stakeholder Review
• Finalise the Data Architecture
• Create Architecture Definition Document

TOGAF Phase C1: Information Systems Architectures - Data
Architecture - Approach - Key Considerations for Data
Architecture
• Data Management
− Important to understand and address data management issues
− Structured and comprehensive approach to data management enables the
effective use of data to capitalise on its competitive advantages
− Clear definition of which application components in the landscape will serve as
the system of record or reference for enterprise master data
− Will there be an enterprise-wide standard that all application components,
including software packages, need to adopt
− Understand how data entities are utilised by business functions, processes, and
services
− Understand how and where enterprise data entities are created, stored,
transported, and reported
− Level and complexity of data transformations required to support the
information exchange needs between applications
− Requirement for software in supporting data integration with external
organisations

TOGAF Phase C1: Information Systems Architectures - Data
Architecture - Approach - Key Considerations for Data
Architecture
• Data Migration
− Identify data migration requirements and also provide indicators
as to the level of transformation for new/changed applications
− Ensure target application has quality data when it is populated
− Ensure enterprise-wide common data definition is established to
support the transformation

TOGAF Phase C1: Information Systems Architectures - Data
Architecture - Approach - Key Considerations for Data
Architecture
• Data Governance
− Ensures that the organisation has the necessary dimensions in
place to enable the data transformation
− Structure – ensures the organisation has the necessary structure
and the standards bodies to manage data entity aspects of the
transformation
− Management System - ensures the organisation has the
necessary management system and data-related programs to
manage the governance aspects of data entities throughout their
lifecycle
− People - addresses what data-related skills and roles the
organisation requires for the transformation

TOGAF Phase C1: Information Systems Architectures
- Data Architecture - Outputs
• Refined and updated versions of the Architecture Vision phase deliverables
− Statement of Architecture Work
− Validated data principles, business goals, and business drivers
• Draft Architecture Definition Document
− Baseline Data Architecture
− Target Data Architecture
• Business data model
• Logical data model
• Data management process models
• Data Entity/Business Function matrix
• Views corresponding to the selected viewpoints addressing key stakeholder concerns
− Draft Architecture Requirements Specification
• Gap analysis results
• Data interoperability requirements
• Relevant technical requirements
• Constraints on the Technology Architecture about to be designed
• Updated business requirements
• Updated application requirements
− Data Architecture components of an Architecture Roadmap

COBIT Structure

Plan and Organise (PO)
• PO1 Define a strategic IT plan
• PO2 Define the information architecture
• PO3 Determine technological direction
• PO4 Define the IT processes, organisation and relationships
• PO5 Manage the IT investment
• PO6 Communicate management aims and direction
• PO7 Manage IT human resources
• PO8 Manage quality
• PO9 Assess and manage IT risks
• PO10 Manage projects

Acquire and Implement (AI)
• AI1 Identify automated solutions
• AI2 Acquire and maintain application software
• AI3 Acquire and maintain technology infrastructure
• AI4 Enable operation and use
• AI5 Procure IT resources
• AI6 Manage changes
• AI7 Install and accredit solutions and changes

Deliver and Support (DS)
• DS1 Define and manage service levels
• DS2 Manage third-party services
• DS3 Manage performance and capacity
• DS4 Ensure continuous service
• DS5 Ensure systems security
• DS6 Identify and allocate costs
• DS7 Educate and train users
• DS8 Manage service desk and incidents
• DS9 Manage the configuration
• DS10 Manage problems
• DS11 Manage data
• DS12 Manage the physical environment
• DS13 Manage operations

Monitor and Evaluate (ME)
• ME1 Monitor and evaluate IT performance
• ME2 Monitor and evaluate internal control
• ME3 Ensure regulatory compliance
• ME4 Provide IT governance

COBIT and Data Management

• COBIT objective DS11 Manage Data within the Deliver and
Support (DS) domain
• Effective data management requires identification of data
requirements
• Data management process includes establishing effective
procedures to manage the media library, backup and
recovery of data and proper disposal of media
• Effective data management helps ensure the quality,
timeliness and availability of business data

COBIT and Data Management

• Objective is the control over the IT process of managing data that
meets the business requirement for IT of optimising the use of
information and ensuring information is available as required
• Focuses on maintaining the completeness, accuracy, availability and
protection of data
• Involves taking actions
− Backing up data and testing restoration
− Managing onsite and offsite storage of data
− Securely disposing of data and equipment
• Measured by
− User satisfaction with availability of data
− Percent of successful data restorations
− Number of incidents where sensitive data were retrieved after media were
disposed of
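
The "measured by" indicators above are simple ratios over operational records. A minimal sketch follows, assuming a hypothetical log of restoration tests. The `RestoreTest` record and its fields are illustrative and not part of COBIT, and the average restore time is included only as an extra illustrative measure.

```python
from dataclasses import dataclass


@dataclass
class RestoreTest:
    """One restoration attempt; the fields are hypothetical."""
    backup_id: str
    succeeded: bool
    minutes_taken: float


def percent_successful(tests):
    """Percent of successful data restorations (0.0 if there are no tests)."""
    if not tests:
        return 0.0
    return 100.0 * sum(t.succeeded for t in tests) / len(tests)


def average_restore_minutes(tests):
    """Average duration of the successful restorations, or None if none succeeded."""
    times = [t.minutes_taken for t in tests if t.succeeded]
    return sum(times) / len(times) if times else None


# Made-up sample data for illustration only.
tests = [
    RestoreTest("bk-001", True, 42.0),
    RestoreTest("bk-002", False, 0.0),
    RestoreTest("bk-003", True, 58.0),
    RestoreTest("bk-004", True, 50.0),
]
print(percent_successful(tests))       # 75.0
print(average_restore_minutes(tests))  # 50.0
```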

COBIT Process DS11 Manage Data
• DS11.1 Business Requirements for Data Management
− Establish arrangements to ensure that source documents expected from the business are received, all data received from the
business are processed, all output required by the business is prepared and delivered, and restart and reprocessing needs are
supported
• DS11.2 Storage and Retention Arrangements
− Define and implement procedures for data storage and archival, so data remain accessible and usable
− Procedures should consider retrieval requirements, cost-effectiveness, continued integrity and security requirements
− Establish storage and retention arrangements to satisfy legal, regulatory and business requirements for documents, data, archives,
programmes, reports and messages (incoming and outgoing) as well as the data (keys, certificates) used for their encryption and
authentication
• DS11.3 Media Library Management System
− Define and implement procedures to maintain an inventory of onsite media and ensure their usability and integrity
− Procedures should provide for timely review and follow-up on any discrepancies noted
• DS11.4 Disposal
− Define and implement procedures to prevent access to sensitive data and software from equipment or media when they are
disposed of or transferred to another use
− Procedures should ensure that data marked as deleted or to be disposed cannot be retrieved.
• DS11.5 Backup and Restoration
− Define and implement procedures for backup and restoration of systems, data and documentation in line with business
requirements and the continuity plan
− Verify compliance with the backup procedures, and verify the ability to and time required for successful and complete restoration
− Test backup media and the restoration process
• DS11.6 Security Requirements for Data Management
− Establish arrangements to identify and apply security requirements applicable to the receipt, processing, physical storage and
output of data and sensitive messages
− Includes physical records, data transmissions and any data stored offsite
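
DS11.5 above asks for backups to be tested and restorations to be verified. One possible check is sketched here, on the assumption that comparing SHA-256 digests of the source and the restored copy is acceptable evidence. COBIT does not prescribe a method, and the helper names are hypothetical.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches_source(source_path, restored_path):
    """True when the restored copy is byte-identical to the source."""
    return sha256_of(source_path) == sha256_of(restored_path)


# Temporary files stand in for a real backup and restore cycle.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "ledger.dat")
restored = os.path.join(workdir, "ledger.restored")
with open(source, "wb") as handle:
    handle.write(b"example business data")
shutil.copyfile(source, restored)  # stand-in for backup followed by restore
ok_result = restore_matches_source(source, restored)

with open(restored, "ab") as handle:
    handle.write(b"!")  # simulate a corrupted restore
bad_result = restore_matches_source(source, restored)
shutil.rmtree(workdir)

print(ok_result, bad_result)  # True False
```

A digest comparison only shows that the bytes match. It says nothing about the time taken to restore, which DS11.5 also asks to be verified.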

COBIT Data Management Goals and Metrics
Activity Goals
• Backing up data and testing restoration
• Managing onsite and offsite storage of data
• Securely disposing of data and equipment
Are measured by Key Performance Indicators
• Frequency of testing of backup media
• Average time for data restoration

Activity goals drive Process Goals
• Maintain the completeness, accuracy, validity and accessibility of stored data
• Secure data during disposal of media
• Effectively manage storage media
Are measured by Process Key Goal Indicators
• % of successful data restorations
• # of incidents where sensitive data were retrieved after media were disposed of
• # of down time or data integrity incidents caused by insufficient storage capacity

Process goals drive IT goals, which are measured by IT Key Goal Indicators
• Occurrences of inability to recover data critical to business process
• User satisfaction with availability of data
• Incidents of noncompliance with laws due to storage management issues

Data Management Book of Knowledge (DMBOK)

• DMBOK is a generalised and comprehensive framework for
managing data across the entire lifecycle
• Developed by DAMA (Data Management Association)
• DMBOK provides a detailed framework to assist
development and implementation of data management
processes and procedures and ensures all requirements
are addressed
• Enables effective and appropriate data management
across the organisation
• Provides awareness and visibility of data management
issues and requirements

Data Management Book of Knowledge (DMBOK)

• Not a solution to your data management needs
• Framework and methodology for developing and
implementing an appropriate solution
• Generalised framework to be customised to meet specific
needs
• Provides a work breakdown structure for a data
management project to allow the effort to be assessed
• No magic bullet

Scope and Structure of Data Management Book of
Knowledge (DMBOK)

Data Management
Environmental Elements

Data
Management
Functions

DMBOK Data Management Functions
Data Management Functions
• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Data Quality Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management

DMBOK Data Management Functions

• Data Governance - planning, supervision and control over data management and
use
• Data Architecture Management - defining the blueprint for managing data assets
• Data Development - analysis, design, implementation, testing, deployment,
maintenance
• Data Operations Management - providing support from data acquisition to
purging
• Data Security Management - ensuring privacy, confidentiality and appropriate
access
• Data Quality Management - defining, monitoring and improving data quality
• Reference and Master Data Management - managing master versions and
replicas
• Data Warehousing and Business Intelligence Management - enabling reporting
and analysis
• Document and Content Management - managing data found outside of databases
• Metadata Management - integrating, controlling and providing metadata

DMBOK Data Management Environmental Elements
Data Management
Environmental Elements

Goals and Principles Activities

Primary Deliverables Roles and Responsibilities

Practices and Techniques Technology

Organisation and Culture

DMBOK Data Management Environmental Elements

• Goals and Principles - directional business goals of each function and the fundamental
principles that guide performance of each function
• Activities - each function is composed of lower level activities, sub-activities, tasks and
steps
• Primary Deliverables - information and physical databases and documents created as
interim and final outputs of each function. Some deliverables are essential, some are
generally recommended, and others are optional depending on circumstances
• Roles and Responsibilities - business and IT roles involved in performing and supervising
the function, and the specific responsibilities of each role in that function. Many roles will
participate in multiple functions
• Practices and Techniques - common and popular methods and procedures used to perform
the processes and produce the deliverables and may also include common conventions,
best practice recommendations, and alternative approaches without elaboration
• Technology - categories of supporting technology such as software tools, standards and
protocols, product selection criteria and learning curves
• Organisation and Culture – this can include issues such as management metrics, critical
success factors, reporting structures, budgeting, resource allocation issues, expectations
and attitudes, style, culture and approach to change management

DMBOK Data Management Functions and
Environmental Elements
[Matrix: Scope of Each Data Management Function. Each function (row) is described in terms of each environmental element (column).]

Columns (environmental elements)
• Goals and Principles
• Activities
• Primary Deliverables
• Roles and Responsibilities
• Practices and Techniques
• Technology
• Organisation and Culture

Rows (data management functions)
• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Data Quality Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management

Scope of Data Management Book of Knowledge
(DMBOK) Data Management Framework
• Hierarchy
− Function
• Activity
− Sub-Activity (not in all cases)
• Each activity is classified as one (or more) of:
− Planning Activities (P)
• Activities that set the strategic and tactical course for other data management
activities
• May be performed on a recurring basis
− Development Activities (D)
• Activities undertaken within implementation projects and recognised as part of the
systems development lifecycle (SDLC), creating data deliverables through analysis,
design, building, testing, preparation, and deployment
− Control Activities (C)
• Supervisory activities performed on an on-going basis
− Operational Activities (O)
• Service and support activities performed on an on-going basis
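
The hierarchy and classification above can be sketched as a small data structure. This is only an illustration, not part of the DMBOK: the class names and the `in_group` helper are hypothetical, and the group assigned to each example activity is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class ActivityGroup(Enum):
    """The four activity groups named on the slide."""
    PLANNING = "P"
    DEVELOPMENT = "D"
    CONTROL = "C"
    OPERATIONAL = "O"


@dataclass
class Activity:
    name: str
    groups: Set[ActivityGroup]  # each activity is classified as one (or more) groups
    sub_activities: List["Activity"] = field(default_factory=list)  # not in all cases

    def __post_init__(self) -> None:
        if not self.groups:
            raise ValueError(f"{self.name!r} must belong to at least one activity group")


@dataclass
class Function:
    name: str
    activities: List[Activity] = field(default_factory=list)

    def in_group(self, group: ActivityGroup) -> List[str]:
        """Names of activities and sub-activities classified under `group`."""
        names = []
        for activity in self.activities:
            for item in [activity, *activity.sub_activities]:
                if group in item.groups:
                    names.append(item.name)
        return names


# The group assignments below are illustrative assumptions.
governance = Function("Data Governance", [
    Activity("Data Management Planning", {ActivityGroup.PLANNING}, [
        Activity("Develop and Maintain the Data Strategy", {ActivityGroup.PLANNING}),
    ]),
    Activity("Data Management Control", {ActivityGroup.CONTROL}, [
        Activity("Manage and Resolve Data Related Issues", {ActivityGroup.CONTROL}),
    ]),
])

print(governance.in_group(ActivityGroup.CONTROL))
```

Filtering a function by group in this way is one possible route to scoping a sub-project around, say, only the control activities.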

March 8, 2010 65
Activity Groups Within Functions

• Activity groups are classifications of data management activities
− Planning Activities
− Development Activities
− Control Activities
− Operational Activities
• Use the activity groupings to define the scope of data management sub-projects and
identify the appropriate tasks:
− Analysis and design
− Implementation
− Operational improvement
− Management and administration

March 8, 2010 66
DMBOK Function and Activity Structure
• Data Governance
− Data Management Planning
− Data Management Control
• Data Architecture Management
− Understand Enterprise Information Needs
− Develop and Maintain the Enterprise Data Model
− Analyse and Align With Other Business Models
− Define and Maintain the Database Architecture
− Define and Maintain the Data Integration Architecture
− Define and Maintain the DW / BI Architecture
− Define and Maintain Enterprise Taxonomies and Namespaces
− Define and Maintain the Metadata Architecture
• Data Development
− Data Modeling, Analysis, and Solution Design
− Detailed Data Design
− Data Model and Design Quality Management
− Data Implementation
• Data Operations Management
− Database Support
− Data Technology Management
• Data Security Management
− Understand Data Security Needs and Regulatory Requirements
− Define Data Security Policy
− Define Data Security Standards
− Define Data Security Controls and Procedures
− Manage Users, Passwords, and Group Membership
− Manage Data Access Views and Permissions
− Monitor User Authentication and Access Behaviour
− Classify Information Confidentiality
− Audit Data Security
• Data Quality Management
− Develop and Promote Data Quality Awareness
− Define Data Quality Requirement
− Profile, Analyse, and Assess Data Quality
− Define Data Quality Metrics
− Define Data Quality Business Rules
− Test and Validate Data Quality Requirements
− Set and Evaluate Data Quality Service Levels
− Continuously Measure and Monitor Data Quality
− Manage Data Quality Issues
− Clean and Correct Data Quality Defects
− Design and Implement Operational DQM Procedures
− Monitor Operational DQM Procedures and Performance
• Reference and Master Data Management
− Understand Reference and Master Data Integration Needs
− Identify Master and Reference Data Sources and Contributors
− Define and Maintain the Data Integration Architecture
− Implement Reference and Master Data Management Solutions
− Define and Maintain Match Rules
− Establish “Golden” Records
− Define and Maintain Hierarchies and Affiliations
− Plan and Implement Integration of New Data Sources
− Replicate and Distribute Reference and Master Data
− Manage Changes to Reference and Master Data
• DW and BI Management
− Understand Business Intelligence Information Needs
− Define and Maintain the DW / BI Architecture
− Implement Data Warehouses and Data Marts
− Implement BI Tools and User Interfaces
− Process Data for Business Intelligence
− Monitor and Tune Data Warehousing Processes
− Monitor and Tune BI Activity and Performance
• Document and Content Management
− Documents / Records Management
− Content Management
• Metadata Management
− Understand Metadata Requirements
− Define the Metadata Architecture
− Develop and Maintain Metadata Standards
− Implement a Managed Metadata Environment
− Create and Maintain Metadata
− Integrate Metadata
− Manage Metadata Repositories
− Distribute and Deliver Metadata
− Query, Report, and Analyse Metadata
March 8, 2010 67
DMBOK Function and Activity - Planning Activities
[Figure: DMBOK function and activity structure diagram, planning activities view]

March 8, 2010 68
DMBOK Function and Activity - Control Activities
[Figure: DMBOK function and activity structure diagram, control activities view]
March 8, 2010 69
DMBOK Function and Activity - Development Activities
[Figure: DMBOK function and activity structure diagram, development activities view]
March 8, 2010 70
DMBOK Function and Activity - Operational Activities
[Figure: DMBOK function and activity structure diagram, operational activities view]
March 8, 2010 71
DMBOK Environmental Elements Structure
• Goals and Principles
− Vision and Mission
− Business Benefits
− Strategic Goals
− Specific Objectives
− Guiding Principles
• Activities
− Phases, Tasks, Steps
− Dependencies
− Sequence and Flow
− Use Cases and Scenarios
− Trigger Events
• Primary Deliverables
− Inputs and Outputs
− Information
− Documents
− Databases
− Other Resources
• Roles and Responsibilities
− Individual Roles
− Organisation Roles
− Business and IT Roles
− Qualifications and Skills
• Practices and Techniques
− Recognised Best Practices
− Common Approaches
− Alternative Techniques
• Technology
− Tool Categories
− Standards and Protocols
− Selection Criteria
− Learning Curves
• Organisation and Culture
− Critical Success Factors
− Reporting Structures
− Management Metrics
− Values, Beliefs, Expectations
− Attitudes, Styles, Preferences
− Teamwork, Group Dynamics, Authority, Empowerment
− Contracting Strategies
− Change Management Approach
March 8, 2010 72
DMBOK Environmental Elements

March 8, 2010 73
Data Governance

March 8, 2010 74
Data Governance

• Core function of the Data Management Framework
• Interacts with and influences each of the nine other data
management functions
• Data governance is the exercise of authority and control (planning,
monitoring, and enforcement) over the management of data assets
• Data governance function guides how all other data management
functions are performed
• High-level, executive data stewardship
• Data governance is not the same thing as IT governance
• Data governance is focused exclusively on the management of data
assets

March 8, 2010 75
Data Governance – Definition and Goals

• Definition
− The exercise of authority and control (planning, monitoring, and
enforcement) over the management of data assets
• Goals
− To define, approve, and communicate data strategies, policies,
standards, architecture, procedures, and metrics
− To track and enforce regulatory compliance and conformance to
data policies, standards, architecture, and procedures
− To sponsor, track, and oversee the delivery of data management
projects and services
− To manage and resolve data related issues
− To understand and promote the value of data assets

March 8, 2010 76
Data Governance - Overview
• Inputs: Business Goals, Business Strategies, IT Objectives, IT Strategies, Data Needs,
Data Issues, Regulatory Requirements
• Suppliers: Business Executives, IT Executives, Data Stewards, Regulatory Bodies
• Participants: Executive Data Stewards, Coordinating Data Stewards, Business Data
Stewards, Data Professionals, DM Executive, CIO
• Tools: Intranet Website, E-Mail, Metadata Tools, Metadata Repository, Issue Management
Tools, Data Governance KPI Dashboard
• Primary Deliverables: Data Policies, Data Standards, Resolved Issues, Data Management
Projects and Services, Quality Data and Information, Recognised Data Value
• Consumers: Data Producers, Knowledge Workers, Managers and Executives, Data
Professionals, Customers
• Metrics: Data Value, Data Management Cost, Achievement of Objectives, # of Decisions
Made, Steward Representation / Coverage, Data Professional Headcount, Data Management
Process Maturity

March 8, 2010 77
Data Governance Function, Activities and Sub-Activities
• Data Management Planning
− Understand Strategic Enterprise Data Needs
− Develop and Maintain the Data Strategy
− Establish Data Professional Roles and Organisations
− Identify and Appoint Data Stewards
− Establish Data Governance and Stewardship Organisations
− Develop and Approve Data Policies, Standards, and Procedures
− Review and Approve Data Architecture
− Plan and Sponsor Data Management Projects and Services
− Estimate Data Asset Value and Associated Costs
• Data Management Control
− Supervise Data Professional Organisations and Staff
− Coordinate Data Governance Activities
− Manage and Resolve Data Related Issues
− Monitor and Ensure Regulatory Compliance
− Monitor and Enforce Conformance with Data Policies, Standards and Architecture
− Oversee Data Management Projects and Services
− Communicate and Promote the Value of Data Assets

March 8, 2010 78
Data Governance

• Data governance is accomplished most effectively as an
ongoing programme and a continual improvement process
• Every data governance programme is unique, taking into
account distinctive organisational and cultural issues, and
the immediate data management challenges and
opportunities
• Data governance is at the core of managing data assets

March 8, 2010 79
Data Governance - Possible Organisation Structure

• Data governance structure
− Organisation Data Governance Council
− Data Governance Office
− Business Unit Data Governance Councils
− Data Stewardship Committees
− Data Stewardship Teams
• Data management roles shown alongside
− CIO
− Data Management Executive
− Data Technologists

March 8, 2010 80
Data Governance Shared Decision Making
• Decisions range from business decisions, through shared decision making, to IT
decisions; the diagram shows four columns of decision areas, in that order:
− Business Operating Model, IT Leadership, Capital Investments, Research and
Development Funding, Data Governance Model
− Enterprise Information Model, Information Needs, Information Specifications,
Quality Requirements, Issue Resolution
− Enterprise Information Management Strategy, Policies, Standards, Metrics and
Services
− Database Architecture, Data Integration Architecture, Data Warehousing and
Business Intelligence Architecture, Metadata Architecture, Technical Metadata

March 8, 2010 81
Data Stewardship

• Formal accountability for business responsibilities ensuring effective
control and use of data assets
• Data steward is a business leader and/or recognised subject matter
expert designated as accountable for these responsibilities
• Manage data assets on behalf of others and in the best interests of
the organisation
• Represent the data interests of all stakeholders, including but not
limited to, the interests of their own functional departments and
divisions
• Protect, manage, and leverage the data resources
• Must take an enterprise perspective to ensure the quality and
effective use of enterprise data

March 8, 2010 82
Data Stewardship - Roles

• Executive Data Stewards - provide data governance and
make high-level data stewardship decisions
• Coordinating Data Stewards - lead and represent teams of
business data stewards in discussions across teams and
with executive data stewards
• Business Data Stewards - subject matter experts who work
with data management professionals on an ongoing basis
to define and control data

March 8, 2010 83
Data Stewardship Roles Across Data Management Functions - 1
• Roles covered: all data stewards, executive data stewards, coordinating data stewards,
business data stewards
• Data Architecture Management
− All data stewards: review, validate, approve, maintain and refine data architecture
− Executive data stewards: review and approve the enterprise data architecture
− Coordinating data stewards: integrate specifications, resolving differences
− Business data stewards: define data requirements specifications
• Data Development
− Validate physical data models and database designs, participate in database testing
and conversion
− Define data requirements and specifications
• Data Operations Management
− Define requirements for data recovery, retention and performance
− Help identify, acquire, and control externally sourced data
• Data Security Management
− Provide security, privacy and confidentiality requirements, identify and resolve
data security issues, assist in data security audits, and classify information
confidentiality
• Reference and Master Data Management
− Control the creation, update, and retirement of code values and other reference
data, define master data management requirements, identify and help resolve issues

March 8, 2010 84
Data Stewardship Roles Across Data Management Functions - 2
• Roles covered: all data stewards, executive data stewards, coordinating data stewards,
business data stewards
• Data Warehousing and Business Intelligence Management
− Provide business intelligence requirements and management metrics, and identify
and help resolve business intelligence issues
• Document and Content Management
− Define enterprise taxonomies and resolve content management issues
• Metadata Management
− Create and maintain business metadata (names, meanings, business rules), define
metadata access and integration needs and use metadata to make effective data
stewardship and governance decisions
• Data Quality Management
− Define data quality requirements and business rules, test application edits and
validations, assist in the analysis, certification, and auditing of data quality,
lead clean-up efforts, identify ways to solve causes of poor data quality, promote
data quality awareness
March 8, 2010 85
Data Strategy

• High-level course of action to achieve high-level goals
• A data strategy is a data management programme strategy: a
plan for maintaining and improving data quality, integrity,
security and access
• Should address all data management functions relevant to the
organisation

March 8, 2010 86
Elements of Data Strategy

• Vision for data management


• Summary business case for data management
• Guiding principles, values, and management perspectives
• Mission and long-term directional goals of data management
• Management measures of data management success
• Short-term data management programme objectives
• Descriptions of data management roles and business units along
with a summary of their responsibilities and decision rights
• Descriptions of data management programme components and
initiatives
• Outline of the data management implementation roadmap
• Scope boundaries
March 8, 2010 87
Data Strategy

• Data Management Programme Charter
− Overall vision, business case, goals, guiding principles, measures of success,
critical success factors, recognised risks
• Data Management Scope Statement
− Goals and objectives for a defined planning horizon and the roles, organisations,
and individual leaders accountable for achieving these objectives
• Data Management Implementation Roadmap
− Identifying specific programs, projects, task assignments, and delivery milestones

March 8, 2010 88
Data Policies

• Statements of intent and fundamental rules governing the
creation, acquisition, integrity, security, quality, and use of
data and information
• More fundamental, global, and business critical than data
standards
• Describe what to do and what not to do
• There should be few data policies, each stated briefly and directly

March 8, 2010 89
Data Policies

• Possible topics for data policies
− Data modeling and other data development activities
− Development and use of data architecture
− Data quality expectations, roles, and responsibilities
− Data security, including confidentiality classification policies,
intellectual property policies, personal data privacy policies,
general data access and usage policies, and data access by
external parties
− Database recovery and data retention
− Access and use of externally sourced data
− Sharing data internally and externally
− Data warehousing and business intelligence
− Unstructured data - electronic files and physical records

March 8, 2010 90
Data Architecture

• The enterprise data model and other aspects of data
architecture are sponsored at the data governance level
• Need to pay particular attention to the alignment of the
enterprise data model with key business strategies,
processes, business units and systems
• Includes
− Data technology architecture
− Data integration architecture
− Data warehousing and business intelligence architecture
− Metadata architecture

March 8, 2010 91
Data Standards and Procedures

• Include naming standards, requirement specification
standards, data modeling standards, database design
standards, architecture standards and procedural
standards for each data management function
• Must be effectively communicated, monitored, enforced
and periodically re-evaluated
• Data management procedures are the methods,
techniques, and steps followed to accomplish a specific
activity or task

March 8, 2010 92
Data Standards and Procedures

• Possible topics for data standards and procedures
− Data modeling and architecture standards, including data naming conventions,
definition standards, standard domains, and standard abbreviations
− Standard business and technical metadata to be captured, maintained, and
integrated
− Data model management guidelines and procedures
− Metadata integration and usage procedures
− Standards for database recovery and business continuity, database
performance, data retention, and external data acquisition
− Data security standards and procedures
− Reference data management control procedures
− Match / merge and data cleansing standards and procedures
− Business intelligence standards and procedures
− Enterprise content management standards and procedures, including use of
enterprise taxonomies, support for legal discovery and document and e-mail
retention, electronic signatures, report formatting standards and report
distribution approaches
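
A naming standard is easier to enforce when it can be checked automatically. The sketch below is a minimal illustration of that idea, not a DMBOK standard: the lower snake_case rule, the list of approved class words and the abbreviation table are all hypothetical assumptions, as is the `check_column_name` helper.

```python
import re
from typing import List

# Hypothetical standard: lower snake_case names ending in an approved class word.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
APPROVED_CLASS_WORDS = {"id", "name", "date", "amount", "code"}   # assumed list
STANDARD_ABBREVIATIONS = {"customer": "cust", "number": "nbr"}    # assumed list


def check_column_name(name: str) -> List[str]:
    """Return the list of naming-standard violations for a column name."""
    if not NAME_PATTERN.match(name):
        # The remaining checks assume words separated by underscores.
        return ["not lower snake_case"]
    problems = []
    words = name.split("_")
    if words[-1] not in APPROVED_CLASS_WORDS:
        problems.append(f"last word '{words[-1]}' is not an approved class word")
    for word in words:
        if word in STANDARD_ABBREVIATIONS:
            short = STANDARD_ABBREVIATIONS[word]
            problems.append(f"use standard abbreviation '{short}' for '{word}'")
    return problems


for candidate in ["cust_id", "CustomerID", "customer_number"]:
    print(candidate, check_column_name(candidate))
```

A check of this kind could be run against data models or database schemas as one way of monitoring conformance with the standard.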

March 8, 2010 93
Regulatory Compliance

• Most organisations are affected by government and
industry regulations
• Many of these regulations dictate how data and
information are to be managed
• Compliance is generally mandatory
• Data governance guides the implementation of adequate
controls to ensure, document, and monitor compliance
with data-related regulations

March 8, 2010 94
Regulatory Compliance

• Data governance needs to work with the business to find the best
answers to the following regulatory compliance questions
− How relevant is a regulation?
− Why is it important for us?
− How do we interpret it?
− What policies and procedures does it require?
− Do we comply now?
− How do we comply now?
− How should we comply in the future?
− What will it take?
− When will we comply?
− How do we demonstrate and prove compliance?
− How do we monitor compliance?
− How often do we review compliance?
− How do we identify and report non-compliance?
− How do we manage and rectify non-compliance?
March 8, 2010 95
Issue Management

• Data governance assists in identifying, managing, and resolving
data-related issues
− Data quality issues
− Data naming and definition conflicts
− Business rule conflicts and clarifications
− Data security, privacy, and confidentiality issues
− Regulatory non-compliance issues
− Non-conformance issues (policies, standards, architecture, and procedures)
− Conflicting policies, standards, architecture, and procedures
− Conflicting stakeholder interests in data and information
− Organisational and cultural change management issues
− Issues regarding data governance procedures and decision rights
− Negotiation and review of data sharing agreements

March 8, 2010 96
Issue Management, Control and Escalation

• Data governance implements issue controls and procedures
− Identifying, capturing, logging and updating issues
− Tracking the status of issues
− Documenting stakeholder viewpoints and resolution alternatives
− Objective, neutral discussions where all viewpoints are heard
− Escalating issues to higher levels of authority
− Determining, documenting and communicating issue resolutions.
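
The issue controls above can be sketched as a simple issue log. This is a minimal illustration under stated assumptions: the `Issue` class and its methods are hypothetical, and the escalation order through the governance bodies is assumed for the example rather than prescribed by the DMBOK.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional

# Assumed escalation order, lowest to highest level of authority.
ESCALATION_PATH = [
    "Data Stewardship Team",
    "Data Stewardship Committee",
    "Business Unit Data Governance Council",
    "Organisation Data Governance Council",
]


class Status(Enum):
    OPEN = "open"
    ESCALATED = "escalated"
    RESOLVED = "resolved"


@dataclass
class Issue:
    title: str
    status: Status = Status.OPEN
    level: int = 0  # index into ESCALATION_PATH
    viewpoints: Dict[str, str] = field(default_factory=dict)
    resolution: Optional[str] = None
    history: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Capturing and logging the issue is the first history entry.
        self.history.append(f"logged with {self.owner}")

    @property
    def owner(self) -> str:
        return ESCALATION_PATH[self.level]

    def record_viewpoint(self, stakeholder: str, view: str) -> None:
        """Document a stakeholder viewpoint or resolution alternative."""
        self.viewpoints[stakeholder] = view

    def escalate(self) -> None:
        """Move the issue to the next higher level of authority."""
        if self.status is Status.RESOLVED:
            raise ValueError("issue is already resolved")
        if self.level == len(ESCALATION_PATH) - 1:
            raise ValueError("already at the highest level of authority")
        self.level += 1
        self.status = Status.ESCALATED
        self.history.append(f"escalated to {self.owner}")

    def resolve(self, resolution: str) -> None:
        """Record the resolution so it can be communicated."""
        self.resolution = resolution
        self.status = Status.RESOLVED
        self.history.append(f"resolved by {self.owner}: {resolution}")


issue = Issue("Two definitions of 'active customer'")
issue.record_viewpoint("Sales", "any customer with an open account")
issue.record_viewpoint("Finance", "any customer invoiced in the last 12 months")
issue.escalate()
issue.resolve("adopt the Finance definition")
for entry in issue.history:
    print(entry)
```

In practice an issue management tool would hold this information; the sketch only shows which fields and state changes the controls imply.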

March 8, 2010 97
Data Management Projects

• Data management roadmap sets out a course of action for
initiating and/or improving data management functions
• Consists of an assessment of current functions, definition
of a target environment and target objectives and a
transition plan outlining the steps required to reach these
targets including an approach to organisational change
management
• Every data management project should follow the project
management standards of the organisation

March 8, 2010 98
Data Asset Valuation

• Data and information are truly assets because they have
business value, tangible or intangible
• There are different approaches to estimating the value of data assets
• Identify the direct and indirect business benefits derived
from use of the data
• Identify the cost of data loss by identifying the impacts of not
having the current amount and quality level of data

March 8, 2010 99
Data Architecture Management

March 8, 2010 100


Data Architecture Management

• Concerned with defining and maintaining specifications that
− Provide a standard common business vocabulary
− Express strategic data requirements
− Outline high level integrated designs to meet these requirements
− Align with enterprise strategy and related business architecture
• Data architecture is an integrated set of specification
artifacts used to define data requirements, guide
integration and control of data assets and align data
investments with business strategy
• Includes formal data names, comprehensive data
definitions, effective data structures, precise data integrity
rules, and robust data documentation

Data Architecture Management – Definition and
Goals
• Definition
− Defining the data needs of the enterprise and designing the
master blueprints to meet those needs
• Goals
− To plan with vision and foresight to provide high quality data
− To identify and define common data requirements
− To design conceptual structures and plans to meet the current
and long-term data requirements of the enterprise



Data Architecture Management - Overview
Inputs
• Business Goals
• Business Strategies
• Business Architecture
• Process Architecture
• IT Objectives
• IT Strategies
• Data Strategies
• Data Issues and Needs
• Technical Architecture

Suppliers
• Executives
• Data Stewards
• Data Producers
• Information Consumers

Participants
• Data Stewards
• Subject Matter Experts (SMEs)
• Data Architects
• Data Analysts and Modelers
• Other Enterprise Architects
• DM Executive and Managers
• CIO and Other Executives
• Database Administrators
• Data Model Administrator

Tools
• Data Modeling Tools
• Model Management Tool
• Metadata Repository
• Office Productivity Tools

Primary Deliverables
• Enterprise Data Model
• Information Value Chain Analysis
• Data Technology Architecture
• Data Integration / MDM Architecture
• DW / BI Architecture
• Metadata Architecture
• Enterprise Taxonomies and Namespaces
• Document Management Architecture
• Metadata

Consumers
• Data Producers
• Knowledge Workers
• Managers and Executives
• Data Professionals
• Customers

Metrics
• Data Value
• Data Management Cost
• Achievement of Objectives
• # of Decisions Made
• Steward Representation / Coverage
• Data Professional Headcount
• Data Management Process Maturity

Enterprise Data Architecture

• Integrated set of specifications and documents
− Enterprise Data Model - the core of enterprise data architecture
− Information Value Chain Analysis - aligns data with business
processes and other enterprise architecture components
− Related Data Delivery Architecture - including database
architecture, data integration architecture, data warehousing /
business intelligence architecture, document content
architecture, and metadata architecture



Data Architecture Management Activities

• Understand Enterprise Information Needs
• Develop and Maintain the Enterprise Data Model
• Analyse and Align With Other Business Models
• Define and Maintain the Database Architecture
• Define and Maintain the Data Integration Architecture
• Define and Maintain the Data Warehouse / Business
Intelligence Architecture
• Define and Maintain Enterprise Taxonomies and
Namespaces
• Define and Maintain the Metadata Architecture



Understanding Enterprise Information Needs

• In order to create an enterprise data architecture, the
organisation must first define its information needs
• An enterprise data model is a way of capturing and
defining enterprise information needs and data
requirements
• Master blueprint for enterprise-wide data integration
• Enterprise data model is a critical input to all future
systems development projects and the baseline for
additional data requirements analysis
• Evaluate the current inputs and outputs required by the
organisation, both from and to internal and external
targets

Develop and Maintain the Enterprise Data Model

• Data is the set of facts collected about business entities
• Data model is a set of data specifications that reflect data
requirements and designs
• Enterprise data model is an integrated, subject-oriented
data model defining the critical data produced and
consumed across the organisation
• Define and analyse data requirements
• Design logical and physical data structures that support
these requirements



Enterprise Data Model

• Subject Area Model
• Conceptual Data Model
• Enterprise Logical Data Models
• Other Enterprise Data Model Components
− Data Steward Responsibility Assignments
− Valid Reference Data Values
− Data Quality Specifications
− Entity Life Cycles



Enterprise Data Model

• Build an enterprise data model in layers
• Focus on the most critical business subject areas



Subject Area Model

• List of major subject areas that collectively express the
essential scope of the enterprise
• Important to the success of the entire enterprise data
model
• List of enterprise subject areas becomes one of the most
significant organisation classifications
• Acceptable to organisation stakeholders
• Useful as the organising framework for data governance,
data stewardship, and further enterprise data modeling



Conceptual Data Model

• Conceptual data model defines business entities and their
relationships
• Business entities are the primary organisational structures in a
conceptual data model
• Business needs data about business entities
• Include a glossary containing the business definitions and other
metadata associated with business entities and their relationships
• Assists improved business understanding and reconciliation of terms
and their meanings
• Provide the framework for developing integrated information
systems to support both transactional processing and business
intelligence.
• Depicts how the enterprise sees information



Enterprise Logical Data Models

• Logical data models contain a level of detail below the
conceptual data model
• Contain the essential data attributes for each entity
• Essential data attributes are those data attributes without
which the enterprise cannot function – can be a subjective
decision



Other Enterprise Data Model Components

• Data Steward Responsibility Assignments - for subject
areas, entities, attributes, and/or reference data value sets
• Valid Reference Data Values - controlled value sets for
codes and/or labels and their business meaning
• Data Quality Specifications - rules for essential data
attributes, such as accuracy / precision requirements,
currency (timeliness), integrity rules, nullability,
formatting, match/merge rules, and/or audit requirements
• Entity Life Cycles - show the different lifecycle states of
the most important entities and the trigger events that
change an entity from one state to another
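An entity life cycle can be written down as a table of states and trigger events. The sketch below is illustrative only: the `OrderState` entity, its states and the event names are assumptions made for the example, not taken from the DMBOK.

```python
from enum import Enum


class OrderState(Enum):
    """Hypothetical lifecycle states for an 'order' entity."""
    DRAFT = "draft"
    SUBMITTED = "submitted"
    FULFILLED = "fulfilled"
    CANCELLED = "cancelled"


# Allowed transitions: (current state, trigger event) -> next state.
TRANSITIONS = {
    (OrderState.DRAFT, "submit"): OrderState.SUBMITTED,
    (OrderState.DRAFT, "cancel"): OrderState.CANCELLED,
    (OrderState.SUBMITTED, "ship"): OrderState.FULFILLED,
    (OrderState.SUBMITTED, "cancel"): OrderState.CANCELLED,
}


def next_state(current: OrderState, event: str) -> OrderState:
    """Return the state reached from `current` on `event`, or raise if not allowed."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in state {current.name}"
        ) from None
```

Keeping the transitions in one table makes the life cycle easy to review with data stewards, since any event not listed is rejected.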

Analyse and Align with Other Business Models

• Information value-chain analysis maps the relationships
between enterprise model elements and other business
models
• Business value chain identifies the functions of an
organisation that contribute directly or indirectly to the
organisation’s goals



Define and Maintain the Data Technology
Architecture
• Data technology architecture guides the selection and integration of
data-related technology
• Data technology architecture defines standard tool categories,
preferred tools in each category, and technology standards and
protocols for technology integration
• Technology categories include
− Database management systems (DBMS)
− Database management utilities
− Data modelling and model management tools
− Business intelligence software for reporting and analysis
− Extract-transform-load (ETL), changed data capture (CDC), and other data
integration tools
− Data quality analysis and data cleansing tools
− Metadata management software, including metadata repositories

Define and Maintain the Data Technology
Architecture
• Classify technology architecture components as
− Current - currently supported and used
− Deployment - deployed for use in the next 1-2 years
− Strategic - expected to be available for use in the next 2+ years
− Retirement - the organisation has retired or intends to retire this
year
− Preferred - preferred for use by most applications.
− Containment - limited to use by certain applications
− Emerging - being researched and piloted for possible future
deployment



Define and Maintain the Data Integration
Architecture
• Defines how data flows through all systems from
beginning to end
• Both data architecture and application architecture,
because it includes both databases and the applications
that control the data flow into the system, between
databases and back out of the system



Define and Maintain the Data Warehouse / Business
Intelligence Architecture
• Focuses on how data changes and snapshots are stored in
data warehouse systems for maximum usefulness and
performance
• Data integration architecture shows how data moves from
source systems through staging databases into data
warehouses and data marts
• Business intelligence architecture defines how decision
support makes data available, including the selection and
use of business intelligence tools



Define and Maintain Enterprise Taxonomies and
Namespaces
• Taxonomy is the hierarchical structure used for outlining
topics
• Organisations develop their own taxonomies to organise
collective thinking about topics
• Overall enterprise data architecture includes
organisational taxonomies
• Definition of terms used in such taxonomies should be
consistent with the enterprise data model



Define and Maintain the Metadata Architecture

• Metadata architecture is the design for integration of
metadata across software tools, repositories, directories,
glossaries, and data dictionaries
• Metadata architecture defines the managed flow of
metadata
• Defines how metadata is created, integrated, controlled,
and accessed
• Metadata repository is the core of any metadata
architecture
• Focus of metadata architecture is to ensure the quality,
integration, and effective use of metadata

Data Architecture Management Guiding Principles

• Data architecture is an integrated set of specification master blueprints used to
define data requirements, guide data integration, control data assets, and align
data investments with business strategy
• Enterprise data architecture is part of the overall enterprise architecture, along
with process architecture, business architecture, systems architecture, and
technology architecture
• Enterprise data architecture includes three major categories of specifications: the
enterprise data model, information value chain analysis, and data delivery
architecture
• Enterprise data architecture is about more than just data - it helps to establish a
common business vocabulary
• An enterprise data model is an integrated subject-oriented data model defining
the essential data used across an entire organisation
• Information value-chain analysis defines the critical relationships between data,
processes, roles and organisations and other enterprise elements
• Data delivery architecture defines the master blueprint for how data flows across
databases and applications
• Architectural frameworks like TOGAF help organise collective thinking about
architecture

Data Development


• Analysis, design, implementation, deployment, and
maintenance of data solutions to maximise the value of
the data resources to the enterprise
• Subset of project activities within the system development
lifecycle focused on defining data requirements, designing
the data solution components, and implementing these
components
• Primary data solution components are databases and
other data structures



Data Development – Definition and Goals

• Definition
− Designing, implementing, and maintaining solutions to meet the
data needs of the enterprise
• Goals
− Identify and define data requirements
− Design data structures and other solutions to these requirements
− Implement and maintain solution components that meet these
requirements
− Ensure solution conformance to data architecture and standards
as appropriate
− Ensure the integrity, security, usability, and maintainability of
structured data assets



Data Development - Overview
Inputs
• Business Goals and Strategies
• Data Needs and Strategies
• Data Standards
• Data Architecture
• Process Architecture
• Application Architecture
• Technical Architecture

Suppliers
• Data Stewards
• Subject Matter Experts
• IT Steering Committee
• Data Governance Council
• Data Architects and Analysts
• Software Developers
• Data Producers
• Information Consumers

Participants
• Data Stewards and SMEs
• Data Architects and Analysts
• Database Administrators
• Data Model Administrators
• Software Developers
• Project Managers
• DM Executives and Other IT Management

Tools
• Data Modeling Tools
• Database Management Systems
• Software Development Tools
• Testing Tools
• Data Profiling Tools
• Model Management Tools
• Configuration Management Tools
• Office Productivity Tools

Primary Deliverables
• Data Requirements and Business Rules
• Conceptual Data Models
• Logical Data Models and Specifications
• Physical Data Models and Specifications
• Metadata (Business and Technical)
• Data Modeling and DB Design Standards
• Data Model and DB Design Reviews
• Version Controlled Data Models
• Test Data
• Development and Test Databases
• Information Products
• Data Access Services
• Data Integration Services
• Migrated and Converted Data

Consumers
• Data Producers
• Knowledge Workers
• Managers and Executives
• Customers
• Data Professionals
• Other IT Professionals



Data Development Function, Activities and Sub-Activities

Data Modelling, Analysis and Solution Design
• Analyse Information Requirements
• Develop and Maintain Conceptual Data Models
− Entities
− Relationships
• Develop and Maintain Logical Data Models
− Attributes
− Domains
− Keys
• Develop and Maintain Physical Data Models

Detailed Data Design
• Design Physical Databases
− Physical Database Design
− Performance Modifications
− Physical Database Design Documentation
• Design Information Products
• Design Data Access Services
• Design Data Integration Services

Data Model and Design Quality Management
• Develop Data Modeling and Design Standards
• Review Data Model and Database Design Quality
− Conceptual and Logical Data Model Reviews
− Physical Database Design Review
− Data Model Validation
• Manage Data Model Versioning and Integration

Data Implementation
• Implement Development / Test Database Changes
• Create and Maintain Test Data
• Migrate and Convert Data
• Build and Test Information Products
• Build and Test Data Access Services
• Validate Information Requirements
• Prepare for Data Deployment



Data Development - Principles

• Data development activities are an integral part of the software development lifecycle
• Data modeling is an essential technique for effective data management and system design
• Conceptual and logical data modeling express business and application requirements while
physical data modeling represents solution design
• Data modeling and database design define detailed solution component specifications
• Data modeling and database design balance tradeoffs and needs
• Data professionals should collaborate with other project team members to design
information products and data access and integration interfaces
• Data modeling and database design should follow documented standards
• Design reviews should review all data models and designs, in order to ensure they meet
business requirements and follow design standards
• Data models represent valuable knowledge resources and so should be carefully managed
and controlled through library, configuration, and change management to ensure
data model quality and availability
• Database administrators and other data professionals play important roles in the
construction, testing, and deployment of databases and related application systems



Data Modeling, Analysis, and Solution Design

• Data modeling is an analysis and design method used to
define and analyse data requirements, and design data
structures that support these requirements
• A data model is a set of data specifications and related
diagrams that reflect data requirements and designs
• Data modeling is a complex process involving interactions
between people and with technology, which must not
compromise the integrity or security of the data
• Good data models accurately express and effectively
communicate data requirements and quality solution
design

Data Model

• The purposes of a data model are:
− Communication - a data model is a bridge to understanding data between
people with different levels and types of experience. Data models help us
understand a business area, an existing application, or the impact of modifying
an existing structure. Data models may also facilitate training new business
and/or technical staff
− Formalisation - a data model documents a single, precise definition of data
requirements and data related business rules
− Scope – a data model can help explain the data context and scope of
purchased application packages
• Data models that include the same data may differ by:
− Scope - expressing a perspective about data in terms of function (business view
or application view), realm (process, department, division, enterprise, or
industry view), and time (current state, short-term future, long-term future)
− Focus - basic and critical concepts (conceptual view), detailed but independent
of context (logical view), or optimised for a specific technology and use
(physical view)



Analyse Information Requirements

• Information is relevant and timely data in context
• To identify information requirements, first identify business
information needs, often in the context of one or more business
processes
• Business processes (and the underlying IT systems) consume
information output from other business processes
• Requirements analysis includes the elicitation, organisation,
documentation, review, refinement, approval, and change control of
business requirements
• Some of these requirements identify business needs for data and
information
• Logical data modeling is an important means of expressing business
data requirements



Develop and Maintain Conceptual Data Models

• Visual, high-level perspective on a subject area of
importance to the business
• Contains the basic and critical business entities within a
given realm and function with a description of each entity
and the relationships between entities
• Define the meanings of the essential business vocabulary
• Reflect the data associated with a business process or
application function
• Independent of technology and usage context



Develop and Maintain Conceptual Data Models

• Entities
− A data entity is a collection of data about something that the
business deems important and worthy of capture
− Entities appear in conceptual or logical data models
• Relationships
− Business rules define constraints on what can and cannot be done
− Data Rules - define constraints on how data relates to other data
− Action Rules - instructions on what to do when data elements contain
certain values



Develop and Maintain Logical Data Models

• Detailed representation of data requirements and the
business rules that govern data quality
• Independent of any technology or specific implementation
technical constraints
• Extension of a conceptual data model
• Logical data models transform conceptual data model
structures by normalisation and abstraction
− Normalisation is the process of applying rules to organise business
complexity into stable data structures
− Abstraction is the redefinition of data entities, elements, and
relationships by removing details to broaden the applicability of
data structures to a wider class of situations
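As a minimal sketch of normalisation, the example below splits a flat set of order rows, where customer details repeat on every row, into a customer structure and an order structure that refers to it by key. The field names and sample values are assumptions made for illustration.

```python
# Flat rows: the customer name is repeated on every order for that customer.
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_name": "Acme Ltd", "amount": 120.0},
    {"order_id": 2, "customer_id": "C1", "customer_name": "Acme Ltd", "amount": 75.5},
    {"order_id": 3, "customer_id": "C2", "customer_name": "Bolt plc", "amount": 40.0},
]


def normalise(rows):
    """Split flat order rows into a customer table and an order table."""
    customers = {}
    orders = []
    for row in rows:
        # Customer attributes depend only on customer_id, so store them once.
        customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
        # Each order keeps only a reference (foreign key) to its customer.
        orders.append(
            {
                "order_id": row["order_id"],
                "customer_id": row["customer_id"],
                "amount": row["amount"],
            }
        )
    return customers, orders


customers, orders = normalise(flat_orders)
```

After the split, a change to a customer name is made in one place rather than on every order row.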



Develop and Maintain Physical Data Models

• Physical data model optimises the implementation of
detailed data requirements and business rules in light of
technology constraints, application usage, performance
requirements, and modeling standards
• Physical data modeling transforms the logical data model
• Includes specific decisions
− Name of each table and column or file and field or schema and
element
− Logical domain, physical data type, length, and nullability of each
column or field
− Default values
− Primary and alternate unique keys and indexes
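The decisions listed above can be illustrated with a small table definition. This is a sketch using Python's built-in `sqlite3` module; the `customer` table, its columns and the index name are hypothetical. SQLite does not enforce declared column lengths, so that decision is not shown here.

```python
import sqlite3

# Hypothetical physical design for one entity: names, types, nullability,
# a default value, a primary key, an alternate unique key and an index.
DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    email         TEXT    NOT NULL UNIQUE,
    display_name  TEXT    NOT NULL,
    status        TEXT    NOT NULL DEFAULT 'ACTIVE',
    phone         TEXT
);
CREATE INDEX ix_customer_display_name ON customer (display_name);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Columns left out of the insert take their default (status) or NULL (phone).
conn.execute(
    "INSERT INTO customer (email, display_name) VALUES (?, ?)",
    ("a@example.com", "Ann"),
)
row = conn.execute("SELECT status, phone FROM customer").fetchone()
```

Inserting a second row with the same `email` would be rejected by the alternate unique key.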

Detailed Data Design

• Detailed data design activities include
− Detailed physical database design, including views, functions,
triggers, and stored procedures
− Definition of supporting data structures, such as XML schemas
and object classes
− Creation of information products, such as the use of data in
screens and reports
− Definition of data access solutions, including data access objects,
integration services, and reporting and analysis services



Design Physical Databases

• Create detailed database implementation specifications
• Ensure the design meets data integrity requirements
• Determine the most appropriate physical structure to house and organise the data,
such as relational or other type of DBMS, files, OLAP cubes, XML, etc.
• Determine database resource requirements, such as server size and location, disk
space requirements, CPU and memory requirements, and network requirements
• Create detailed design specifications for data structures, such as relational
database tables, indexes, views, OLAP data cubes, XML schemas, etc.
• Ensure performance requirements are met, including batch and online response
time requirements for queries, inserts, updates, and deletes
• Design for backup, recovery, archiving, and purge processing, ensuring availability
requirements are met
• Design data security implementation, including authentication, encryption needs,
application roles and data access and update permissions
• Review code to ensure that it meets coding standards and will run efficiently



Physical Database Design

• Choose a database design based on both a choice of architecture
and a choice of technology
• Base the choice of architecture (for example, relational, hierarchical,
network, object, star schema, snowflake, cube, etc.) on data
considerations
• Consider factors such as how long the data needs to be kept,
whether it must be integrated with other data or passed across
system or application boundaries, and requirements for data
security, integrity, recoverability, accessibility, and reusability
• Consider organisational or political factors, including organisational
biases and developer skill sets, that lean toward a particular
technology or vendor



Physical Database Design - Principles

• Performance and Ease of Use - Ensure quick and easy access to data
by approved users in a usable and business-relevant form
• Reusability - The database structure should ensure that, where
appropriate, multiple applications would be able to use the data
• Integrity - The data should always have a valid business meaning and
value, regardless of context, and should always reflect a valid state
of the business
• Security - True and accurate data should always be immediately
available to authorised users, but only to authorised users
• Maintainability - Perform all data work at a cost that yields value by
ensuring that the cost of creating, storing, maintaining, using, and
disposing of data does not exceed its value to the organisation



Physical Database Design - Questions

• What are the performance requirements? What is the maximum permissible time for a
query to return results, or for a critical set of updates to occur?
• What are the availability requirements for the database? What are the window(s) of time
for performing database operations? How often should database backups and transaction
log backups be done (i.e., what is the longest period of time we can risk non-recoverability
of the data)?
• What is the expected size of the database? What is the expected rate of growth of the
data? At what point can old or unused data be archived or deleted? How many concurrent
users are anticipated?
• What sorts of data virtualisation are needed to support application requirements in a way
that does not tightly couple the application to the database schema?
• Will other applications need the data? If so, what data and how?
• Will users expect to be able to do ad-hoc querying and reporting of the data? If so, how and
with which tools?
• What, if any, business or application processes does the database need to implement?
(e.g., trigger code that does cross-database integrity checking or updating, application
classes encapsulated in database procedures or functions, database views that provide
table recombination for ease of use or security purposes, etc.).
• Are there application or developer concerns regarding the database, or the database
development process, that need to be addressed?
• Is the application code efficient? Can a code change relieve a performance issue?

Performance Modifications

• Consider how the database will perform when applications
make requests to access and modify data
• Indexing can improve query performance in many cases
• Denormalisation is the deliberate transformation of a
normalised logical data model into tables with redundant
data
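The effect of an index on a query can be checked by comparing the query plan before and after the index is created. The sketch below uses Python's built-in `sqlite3` module; the `orders` table, the index name and the sample data are assumptions for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT SUM(amount) FROM orders WHERE customer_id = 42"


def plan(sql):
    """Return the detail text of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the fourth column of each plan row.
    return " | ".join(r[3] for r in rows)


plan_before = plan(QUERY)  # no index on customer_id yet: a table scan
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")
plan_after = plan(QUERY)   # the plan should now name the new index
```

Printing `plan_before` and `plan_after` shows whether the optimiser switched from scanning the table to searching the index.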



Physical Database Design Documentation

• Create physical database design document to assist
implementation and maintenance



Design Information Products

• Design data-related deliverables
• Design screens and reports to meet business data requirements
• Ensure consistent use of business data terminology
• Reporting services give business users the ability to execute both pre-developed
and ad-hoc reports
• Analysis services give business users the ability to slice and dice data across multiple
dimensions
• Dashboards display a wide array of analytics indicators, such as charts and graphs,
efficiently
• Scorecards display information that indicates scores or calculated evaluations of
performance
• Use data integrated from multiple databases as input to software for business
process automation that coordinates multiple business processes across disparate
platforms
• Data integration is a component of Enterprise Application Integration (EAI)
software, enabling data to be easily passed from application to application across
disparate platforms



Design Data Access Services

• May be necessary to access and combine data from
remote databases with data in the local database
• Goal is to enable easy and inexpensive reuse of data across
the organisation preventing, wherever possible, redundant
and inconsistent data
• Options include
− Linked database connections
− SOA web services
− Message brokers
− Data access classes
− ETL
− Replication



Design Data Integration Services

• Critical aspect of database design is determining
appropriate update mechanisms and database transactions
for recovery
• Define source-to-target mappings and data transformation
designs for extract-transform-load (ETL) programs and
other technology for ongoing data movement, cleansing
and integration
• Design programs and utilities for data migration and
conversion from old data structures to new data structures
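A source-to-target mapping can be kept as data and applied by a small transform step. The following is a sketch only: the legacy field names (`CUST_NO`, `CUST_NM`, `SIGNUP_DT`), the target names and the date format are hypothetical.

```python
from datetime import datetime

# Source-to-target mapping: target field -> (source field, transformation).
MAPPING = {
    "customer_id": ("CUST_NO", int),
    "full_name": ("CUST_NM", lambda v: v.strip().title()),
    "signup_date": (
        "SIGNUP_DT",
        lambda v: datetime.strptime(v, "%d/%m/%Y").date().isoformat(),
    ),
}


def transform(source_row):
    """Apply the mapping to one source row and return the target row."""
    return {
        target: convert(source_row[source])
        for target, (source, convert) in MAPPING.items()
    }


legacy_rows = [{"CUST_NO": "0042", "CUST_NM": "  jane DOE ", "SIGNUP_DT": "08/03/2010"}]
target_rows = [transform(r) for r in legacy_rows]
```

Holding the mapping in one structure lets it double as documentation that data stewards can review alongside the code.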



Data Model and Design Quality Management

• Balance the needs of information consumers (the people
with business requirements for data) and the data
producers who capture the data in usable form
• Work within time and budget constraints
• Ensure data resides in data structures that are secure,
recoverable, sharable, and reusable, and that this data is
as correct, timely, relevant, and usable as possible
• Balance the short-term versus long-term business data
interests of the organisation



Develop Data Modeling and Design Standards

• Data modeling and database design standards serve as the guiding
principles to effectively meet business data needs, conform to data
architecture, and ensure data quality
• Data modeling and database design standards should include
− A list and description of standard data modeling and database design
deliverables
− A list of standard names, acceptable abbreviations, and abbreviation rules for
uncommon words, that apply to all data model objects
− A list of standard naming formats for all data model objects, including attribute
and column class words
− A list and description of standard methods for creating and maintaining these
deliverables
− A list and description of data modeling and database design roles and
responsibilities
− A list and description of all metadata properties captured in data modeling and
database design, including both business metadata and technical metadata,
with guidelines defining metadata quality expectations and requirements
− Guidelines for how to use data modeling tools
− Guidelines for preparing for and leading design reviews

Review Data Model and Database Design Quality

• Conduct requirements reviews and design reviews,
including a conceptual data model review, a logical data
model review, and a physical database design review



Conceptual and Logical Data Model Reviews

• Conceptual data model and logical data model design
reviews should ensure that:
− Business data requirements are completely captured and clearly
expressed in the model, including the business rules governing
entity relationships
− Business (logical) names and business definitions for entities and
attributes (business semantics) are clear, practical, consistent,
and complementary
− Data modeling standards, including naming standards, have been
followed
− The conceptual and logical data models have been validated



Physical Database Design Review

• Physical database design reviews should ensure that:
− The design meets business, technology, usage, and performance
requirements
− Database design standards, including naming and abbreviation
standards, have been followed
− Availability, recovery, archiving, and purging procedures are
defined according to standards
− Metadata quality expectations and requirements are met in order
to properly update any metadata repository
− The physical data model has been validated



Data Model Validation

• Validate data models against modeling standards, business
requirements, and database requirements
• Ensure the model matches applicable modeling standards
• Ensure the model matches the business requirements
• Ensure the model matches the database requirements
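Part of the check against modeling standards can be automated. The sketch below assumes a hypothetical naming standard (lower snake_case column names ending in an approved class word); the rule set and the sample model are illustrative, not taken from the DMBOK.

```python
import re

# Hypothetical standard: lower snake_case, ending in an approved class word.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
CLASS_WORDS = {"id", "name", "date", "amount", "code"}


def check_column_names(model):
    """Return (table, column, problem) tuples for names that break the standard."""
    problems = []
    for table, columns in model.items():
        for column in columns:
            if not NAME_PATTERN.match(column):
                problems.append((table, column, "not lower snake_case"))
            elif column.rsplit("_", 1)[-1] not in CLASS_WORDS:
                problems.append((table, column, "missing approved class word"))
    return problems


model = {"customer": ["customer_id", "customer_name", "SignupDate", "notes"]}
problems = check_column_names(model)
```

A check like this covers only the mechanical part of validation; agreement with business and database requirements still needs review by people.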



Manage Data Model Versioning and Integration

• Data models and other design specifications require
change control
• Each change should include
− Why the project or situation required the change
− What and how the object(s) changed, including which tables had
columns added, modified, or removed, etc.
− When the change was approved and when the change was made
to the model
− Who made the change
− Where the change was made
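The why, what, when, who and where of each change can be captured in a simple record. This is a sketch under assumptions: the `ModelChange` class, its field names and the approval check are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ModelChange:
    """One change-control entry for a data model (hypothetical structure)."""
    why: str            # reason the project or situation required the change
    what: str           # objects changed and how
    approved_on: date   # when the change was approved
    applied_on: date    # when the change was made to the model
    who: str            # person who made the change
    where: str          # model or file in which the change was made


change_log = []


def record_change(change: ModelChange) -> None:
    """Append a change to the log, rejecting changes applied before approval."""
    if change.applied_on < change.approved_on:
        raise ValueError("change applied before it was approved")
    change_log.append(change)


record_change(
    ModelChange(
        why="New regulatory report needs customer segment",
        what="Added column customer.segment_code",
        approved_on=date(2010, 3, 1),
        applied_on=date(2010, 3, 8),
        who="data modeler",
        where="enterprise logical data model",
    )
)
```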



Data Implementation

• Data implementation consists of data management
activities that support system building, testing, and
deployment
− Database implementation and change management in the
development and test environments
− Test data creation, including any security procedures
− Development of data migration and conversion programs, both
for project development through the SDLC and for business
situations
− Validation of data quality requirements
− Creation and delivery of user training
− Contribution to the development of effective documentation



Implement Development / Test Database Changes

• Implement changes to the database that are required
during the course of application development
• Monitor database code to ensure that it is written to the
same standards as application code
• Identify poor SQL coding practices that could lead to errors
or performance problems


Create and Maintain Test Data

• Populate databases in the development environment with
test data
• Observe privacy and confidentiality requirements and
practices for test data
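One way to respect privacy requirements when test data is derived from production records is to mask sensitive columns before loading them. The Python sketch below is only an illustration: the function names, the salt, the token format, and the sample row are all assumptions, and a salted hash of this kind is pseudonymisation rather than a reviewed anonymisation scheme.

```python
import hashlib


def mask_value(value: str, salt: str = "test-env") -> str:
    """Replace a sensitive value with a stable 12-character token."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "X" + digest[:11]


def mask_row(row: dict, sensitive_columns: set) -> dict:
    """Return a copy of the row with the sensitive columns masked."""
    return {
        col: mask_value(str(val)) if col in sensitive_columns else val
        for col, val in row.items()
    }


# Hypothetical production record and its masked test counterpart.
production_row = {"customer_id": 42, "name": "Jane Example", "email": "jane@example.com"}
test_row = mask_row(production_row, {"name", "email"})
```

Because the token is deterministic, the same input always maps to the same masked value, so joins between masked tables still line up.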


Migrate and Convert Data

• A key component of many projects is the migration of legacy
data to a new database environment, including any
necessary data cleansing and reformatting


Build and Test Information Products

• Implement mechanisms for integrating data from multiple
sources, along with the appropriate metadata to ensure
meaningful integration of the data
• Implement mechanisms for reporting and analysing the
data, including online and web-based reporting, ad-hoc
querying, BI scorecards, OLAP, portals, and the like
• Implement mechanisms for replication of the data, if
network latency or other concerns make it impractical to
service all users from a single data source


Build and Test Data Access Services

• Develop, test, and execute data migration and conversion
programs and procedures, first for development and test
data and later for production deployment
• Data requirements should include business rules for data
quality to guide the implementation of application edits
and database referential integrity constraints
• Business data stewards and other subject matter experts
should validate the correct implementation of data
requirements through user acceptance testing
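A data quality business rule can often be expressed directly as a database constraint. The Python sketch below uses the standard `sqlite3` module with an in-memory database; the table names, columns, and the rule "quantity must be positive" are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        quantity    INTEGER NOT NULL CHECK (quantity > 0)
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, 5)")


def is_rejected(sql: str) -> bool:
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# An order for a customer that does not exist violates referential integrity.
orphan_rejected = is_rejected("INSERT INTO customer_order VALUES (101, 999, 1)")
# An order with a non-positive quantity violates the CHECK business rule.
bad_quantity_rejected = is_rejected("INSERT INTO customer_order VALUES (102, 1, 0)")
```

Checks like these are the kind of behaviour that data stewards could confirm during user acceptance testing.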


Validate Information Requirements

• Test and validate that the solution meets the
requirements, and plan deployment, including the development
of training and documentation
• Data requirements may change abruptly, in response to
changed business requirements, invalid
assumptions regarding the data, or reprioritisation of
existing requirements
• Test the implementation of the data requirements and
ensure that the application requirements are satisfied


Prepare for Data Deployment

• Leverage the business knowledge captured in data modeling to
define clear and consistent language in user training and
documentation
• Business concepts, terminology, definitions, and rules depicted in
data models are an important part of application user training
• Data stewards and data analysts should participate in deployment
preparation, including development and review of training materials
and system documentation, especially to ensure consistent use of
defined business data terminology
• Help desk support staff also require orientation and training in how
system users appropriately access, manipulate, and interpret data
• Once installed, business data stewards and data analysts should
monitor the early use of the system to see that business data
requirements are indeed met


Data Operations Management


• Data operations management is the development, maintenance, and
support of structured data to maximise the value of the
data resources to the enterprise and includes
− Database support
− Data technology management


Data Operations Management – Definition and
Goals
• Definition
− Planning, control, and support for structured data assets across
the data lifecycle, from creation and acquisition through archival
and purge
• Goals
− Protect and ensure the integrity of structured data assets
− Manage the availability of data throughout its lifecycle
− Optimise performance of database transactions


Data Operations Management - Overview

• Inputs
− Data Requirements
− Data Architecture
− Data Models
− Legacy Data
− Service Level Agreements
• Suppliers
− Executives
− IT Steering Committee
− Data Governance Council
− Data Stewards
− Data Architects and Modelers
− Software Developers
• Participants
− Database Administrators
− Software Developers
− Project Managers
− Data Stewards
− Data Architects and Analysts
− DM Executives and Other IT Management
− IT Operators
• Tools
− Database Management Systems
− Data Development Tools
− Database Administration Tools
− Office Productivity Tools
• Primary Deliverables
− DBMS Technical Environments
− Dev/Test, QA, DR, and Production Databases
− Externally Sourced Data
− Database Performance
− Data Recovery Plans
− Business Continuity
− Data Retention Plan
− Archived and Purged Data
• Consumers
− Data Creators
− Information Consumers
− Enterprise Customers
− Data Professionals
− Other IT Professionals
• Metrics
− Availability
− Performance


Data Operations Management Function, Activities
and Sub-Activities

• Database Support
− Implement and Control Database Environments
− Obtain Externally Sourced Data
− Plan for Data Recovery
− Backup and Recover Data
− Set Database Performance Service Levels
− Monitor and Tune Database Performance
− Plan for Data Retention
− Archive, Retain, and Purge Data
− Support Specialised Databases
• Data Technology Management
− Understand Data Technology Requirements
− Define the Data Technology Architecture
− Evaluate Data Technology
− Install and Administer Data Technology
− Inventory and Track Data Technology Licenses
− Support Data Technology Usage and Issues

Data Operations Management - Principles

• Write everything down
• Keep everything
• Whenever possible, automate a procedure
• Focus: understand the purpose of each task, manage scope,
simplify, do one thing at a time
• Measure twice, cut once
• React to problems and issues calmly and rationally, because panic
causes more errors
• Understand the business, not just the technology
• Work together: collaborate, be accessible, share knowledge
• Use all of the resources at your disposal
• Keep up to date

Database Support - Scope

• Ensure the performance and reliability of the database,
including performance tuning, monitoring, and error
reporting
• Implement appropriate backup and recovery mechanisms
to guarantee the recoverability of the data in any
circumstance
• Implement mechanisms for clustering and failover of the
database, if continual data availability is a
requirement
• Implement mechanisms for archiving data

Database Support - Deliverables

• A production database environment, including an instance of the
DBMS and its supporting server, of a sufficient size and capacity to
ensure adequate performance, configured for the appropriate level
of security, reliability and availability
• Mechanisms and processes for the controlled implementation of
changes to databases in the production environment
• Appropriate mechanisms for ensuring the availability, integrity, and
recoverability of the data in response to all possible circumstances
that could result in loss or corruption of data
• Appropriate mechanisms for detecting and reporting any error that
occurs in the database, the DBMS, or the data server
• Database availability, recovery, and performance in accordance with
service level agreements

Implement and Control Database Environments

• Update DBMS software
• Maintain multiple installations, including different DBMS versions
• Install and administer related data technology, including data
integration software and third party data administration tools
• Set and tune DBMS system parameters
• Manage database connectivity
• Tune operating systems, networks, and transaction processing
middleware to work with the DBMS
• Optimise the use of different storage technology for cost-effective
storage


Obtain Externally Sourced Data

• A managed approach to data acquisition centralises
responsibility for data subscription services
• Document the external data source in the logical data
model and data dictionary
• Implement the necessary processes to load the data into
the database and/or make it available to applications


Plan for Data Recovery

• Establish service level agreements (SLAs) with IT data management
services organisations for data availability and recovery
• SLAs set availability expectations, allowing time for database
maintenance and backup, and set recovery time expectations for
different recovery scenarios, including potential disasters
• Ensure a recovery plan exists for all databases and database servers,
covering all possible scenarios
− Loss of the physical database server
− Loss of one or more disk storage devices
− Loss of a database, including the DBMS master database, temporary storage
database, transaction log segment, etc.
− Corruption of database index or data pages
− Loss of the database or log segment file system
− Loss of database or transaction log backup files

Backup and Recover Data

• Make regular backups of databases and their
transaction logs
• Balance the importance of the data against the cost of
protecting it
• Databases should reside on some sort of managed storage
area
• For critical data, implement some sort of replication facility


Set Database Performance Service Levels

• Database performance has two components - availability
and performance
• An unavailable database has a performance measure of
zero
• SLAs between data management services organisations
and data owners define expectations for database
performance
• Availability is the percentage of time that a system or
database can be used for productive work
• Availability requirements are constantly increasing, raising
the business risks and costs of unavailable data
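Availability as defined above is a simple ratio, and an SLA target translates directly into a downtime budget. The Python sketch below shows the arithmetic; the function names, the 30-day window, the 90 minutes of downtime, and the 99.9% target are illustrative assumptions.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Percentage of the period in which the database could be used."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must lie between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def allowed_downtime_minutes(total_minutes: float, target_percent: float) -> float:
    """Downtime budget implied by an availability target for the period."""
    return total_minutes * (1 - target_percent / 100.0)


minutes_in_30_days = 30 * 24 * 60  # 43,200 minutes
measured = availability_percent(minutes_in_30_days, downtime_minutes=90)
budget = allowed_downtime_minutes(minutes_in_30_days, target_percent=99.9)
```

With these assumed figures, 90 minutes of downtime gives roughly 99.79% availability, which misses a 99.9% target whose budget is about 43 minutes for the same period.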


Set Database Performance Service Levels

• Factors affecting availability include
− Manageability - ability to create and maintain an effective environment
− Recoverability - ability to reestablish service after interruption, and correct
errors caused by unforeseen events or component failures
− Reliability - ability to deliver service at specified levels for a stated period
− Serviceability - ability to determine the existence of problems, diagnose their
causes, and repair / solve the problems
• Tasks to ensure databases stay online and operational
− Running database backup utilities
− Running database reorganisation utilities
− Running statistics gathering utilities
− Running integrity checking utilities
− Automating the execution of these utilities
− Exploiting table space clustering and partitioning
− Replicating data across mirror databases to ensure high availability


Set Database Performance Service Levels

• Causes of loss of database availability include
− Planned and unplanned outages
− Loss of the server hardware
− Disk hardware failure
− Operating system failure
− DBMS software failure
− Application problems
− Network failure
− Data center site loss
− Security and authorisation problems
− Corruption of data (due to bugs, poor design, or user error)
− Loss of database objects
− Loss of data
− Data replication failure
− Severe performance problems
− Recovery failures
− Human error


Monitor and Tune Database Performance

• Optimise database performance both proactively and reactively, by
monitoring performance and by responding to problems quickly and
effectively
• Run activity and performance reports against both the DBMS and
the server on a regular basis including during periods of heavy
activity
• When performance problems occur, use the monitoring and
administration tools of the DBMS to help identify the source of the
problem, such as:
− Memory allocation (buffer / cache for data)
− Locking and blocking
− Failure to update database statistics
− Poor SQL coding
− Insufficient indexing
− Application activity
− Increase in the number, size, or use of databases
− Database volatility
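As one example of using DBMS tooling to locate a problem source, the Python sketch below asks SQLite for its query plan before and after adding an index. SQLite and its `EXPLAIN QUERY PLAN` statement stand in for whatever monitoring tools a given DBMS provides; the table, index name, and query are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = 42"


def query_plan(sql: str) -> str:
    """Return the human-readable plan SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each plan row holds the descriptive text.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(QUERY)  # no usable index: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(QUERY)   # the planner can now search via the index
```

Comparing the two plan strings shows whether a slow query is caused by insufficient indexing or by something else in the list above.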


Support Specialised Databases

• Some specialised situations require specialised types of
databases


Data Technology Management

• Managing data technology should follow the same
principles and standards as managing any other technology
• Use a reference model for technology management such
as the Information Technology Infrastructure Library (ITIL)


Understand Data Technology Requirements

• Understand the data and information needs of the business
• Understand the best possible applications of technology to solve business
problems and take advantage of new business opportunities
• Understand the requirements of a data technology before determining what
technical solution to choose for a particular situation
− What problem is this data technology meant to solve?
− What does this data technology do that is unavailable in other data technologies?
− What does this data technology not do that is available in other data technologies?
− Are there any specific hardware requirements for this data technology?
− Are there any specific Operating System requirements for this data technology?
− Are there any specific software requirements or additional applications required for this
data technology to perform as advertised?
− Are there any specific storage requirements for this data technology?
− Are there any specific network or connectivity requirements for this data technology?
− Does this data technology include data security functionality? If not, what other tools
does this technology work with that provide data security functionality?
− Are there any specific skills required to be able to support this data technology? Do we
have those skills in-house or must we acquire them?


Define the Data Technology Architecture

• Data technology architecture addresses three core questions
− What technologies are standard (which are required, preferred, or
acceptable)?
− Which technologies apply to which purposes and circumstances?
− In a distributed environment, which technologies exist where, and how does
data move from one node to another?
• Technology is never free - even open-source technology requires
maintenance
• Technology should always be regarded as the means to an end,
rather than the end itself
• Buying the same technology that everyone else is using, and using it
in the same way, does not create business value or competitive
advantage for the organisation


Define the Data Technology Architecture

• Technology categories include
− Database management systems (DBMS)
− Database management utilities
− Data modelling and model management tools
− Business intelligence software for reporting and analysis
− Extract-transform-load (ETL), changed data capture (CDC), and
other data integration tools
− Data quality analysis and data cleansing tools
− Metadata management software, including metadata repositories


Define the Data Technology Architecture

• Classify technology architecture components as
− Current - currently supported and used
− Deployment - deployed for use in the next 1-2 years
− Strategic - expected to be available for use in the next 2+ years
− Retirement - the organisation has retired it or intends to retire it this
year
− Preferred - preferred for use by most applications
− Containment - limited to use by certain applications
− Emerging - being researched and piloted for possible future
deployment
• Create a road map for the organisation consisting of these
components to help govern future technology decisions

Evaluate Data Technology

• Selecting appropriate data related technology, particularly the
appropriate database management technology, is an important data
management responsibility
• Data technologies to be researched and evaluated include:
− Database management systems (DBMS) software
− Database utilities, such as backup and recovery tools, and performance
monitors
− Data modeling and model management software
− Database management tools, such as editors, schema generators, and
database object generators
− Business intelligence software for reporting and analysis
− Extract-transform-load (ETL) and other data integration tools
− Data quality analysis and data cleansing tools
− Data virtualisation technology
− Metadata management software, including metadata repositories


Evaluate Data Technology

• Use a standard technology evaluation process
− Understand user needs, objectives, and related requirements
− Understand the technology in general
− Identify available technology alternatives
− Identify the features required
− Weigh the importance of each feature
− Understand each technology alternative
− Evaluate and score each technology alternative’s ability to meet
requirements
− Calculate total scores and rank technology alternatives by score
− Evaluate the results, including the weighted criteria
− Present the case for selecting the highest ranking alternative
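The weighting, scoring, and ranking steps above amount to a weighted sum per alternative. The Python sketch below illustrates this under assumed inputs: the feature names, weights, scores, and product names are invented for the example.

```python
def weighted_score(weights: dict, scores: dict) -> float:
    """Sum of (feature weight x feature score) for one alternative."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[feature] * scores[feature] for feature in weights)


def rank_alternatives(weights: dict, alternatives: dict) -> list:
    """Return (name, total score) pairs, highest score first."""
    totals = {name: weighted_score(weights, s) for name, s in alternatives.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Importance of each required feature (assumed scale: 1 = low, 5 = high).
weights = {"scalability": 5, "tooling": 3, "cost_of_ownership": 4}

# How well each hypothetical product meets each feature (assumed scale: 1 to 5).
alternatives = {
    "Product A": {"scalability": 4, "tooling": 3, "cost_of_ownership": 2},
    "Product B": {"scalability": 3, "tooling": 5, "cost_of_ownership": 4},
}

ranking = rank_alternatives(weights, alternatives)
# Product A: 5*4 + 3*3 + 4*2 = 37; Product B: 5*3 + 3*5 + 4*4 = 46
```

The ranking is only as good as the weights, which is why the process asks for the results, including the weighted criteria, to be reviewed before a recommendation is presented.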


Evaluate Data Technology

• Selecting strategic DBMS software is very important
• Factors to consider when selecting DBMS software include:
− Product architecture and complexity
− Application profile, such as transaction processing, business intelligence, and
personal profiles
− Organisational appetite for technical risk
− Hardware platform and operating system support
− Availability of supporting software tools
− Performance benchmarks
− Scalability
− Software, memory, and storage requirements
− Available supply of trained technical professionals
− Cost of ownership, such as licensing, maintenance, and computing resources
− Vendor reputation
− Vendor support policy and release schedule
− Customer references


Install and Administer Data Technology

• Need to deploy new technology products in development /
test, QA / certification, and production environments
• Create and document processes and procedures for
administering the product
• The cost and complexity of implementing new technology are
usually underestimated
• Features and benefits are usually overestimated
• Start with small pilot projects and proof-of-concept (POC)
implementations to get a good idea of the true costs and
benefits before proceeding with larger production
implementation

Inventory and Track Data Technology Licenses

• Comply with licensing agreements and regulatory
requirements
• Track and conduct yearly audits of software license and
annual support costs
• Track other costs such as server lease agreements and
other fixed costs
• Use this data to determine the total cost-of-ownership (TCO)
for each type of technology and technology product
• Evaluate technologies and products that are becoming
obsolete, unsupported, less useful, or too expensive
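A simple way to combine the tracked costs into a TCO figure is to add one-off costs to recurring costs over the period of interest. The Python sketch below makes that explicit; the cost categories, amounts, and three-year horizon are assumptions, and the calculation deliberately ignores discounting and inflation.

```python
def total_cost_of_ownership(one_off_costs: dict, annual_costs: dict, years: int) -> float:
    """One-off costs plus recurring annual costs over the given number of years."""
    if years < 0:
        raise ValueError("years must not be negative")
    return sum(one_off_costs.values()) + years * sum(annual_costs.values())


# Hypothetical figures for one technology product.
tco = total_cost_of_ownership(
    one_off_costs={"licence": 50_000, "implementation": 20_000},
    annual_costs={"support": 10_000, "server_lease": 6_000},
    years=3,
)
# 70,000 one-off + 3 x 16,000 recurring = 118,000
```

Calculating the same figure for each product makes it easier to spot technologies that have become too expensive relative to their usefulness.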


Support Data Technology Usage and Issues

• Work with business users and application developers to
− Ensure the most effective use of the technology
− Explore new applications of the technology
− Address any problems or issues that surface from its use
• Training is important to effective understanding and use of
any technology


Data Security Management


• Planning, development, and execution of security policies
and procedures to provide proper authentication,
authorisation, access, and auditing of data and information
assets
• Effective data security policies and procedures ensure that
the right people can use and update data in the right way,
and that all inappropriate access and update is restricted
• An effective data security management function establishes
governance mechanisms that are easy enough to abide by
on a daily operational basis


Data Security Management – Definition and Goals

• Definition
− Planning, development, and execution of security policies and
procedures to provide proper authentication, authorisation,
access, and auditing of data and information.
• Goals
− Enable appropriate, and prevent inappropriate, access and
change to data assets
− Meet regulatory requirements for privacy and confidentiality
− Ensure the privacy and confidentiality needs of all stakeholders
are met


Data Security Management

• Protect information assets in alignment with privacy and
confidentiality regulations and business requirements
− Stakeholder Concerns - organisations must recognise the privacy and
confidentiality needs of their stakeholders, including clients, patients, students,
citizens, suppliers, or business partners
− Government Regulations - government regulations protect some of the
stakeholder security interests. Some regulations restrict access to information,
while other regulations ensure openness, transparency, and accountability
− Proprietary Business Concerns - each organisation has its own proprietary data
to protect - ensuring competitive advantage provided by intellectual property
and intimate knowledge of customer needs and business partner relationships
is a cornerstone in any business plan
− Legitimate Access Needs - data security implementers must also understand
the legitimate needs for data access


Data Security Requirements and Procedures

• Data security requirements and the procedures to meet
these requirements
− Authentication - validate users are who they say they are
− Authorisation - identify the right individuals and grant them the
right privileges to specific, appropriate views of data
− Access - enable these individuals and their privileges in a timely
manner
− Audit - review security actions and user activity to ensure
compliance with regulations and conformance with policy and
standards


Data Security Management - Overview

• Inputs
− Business Goals
− Business Strategy
− Business Rules
− Business Process
− Data Strategy
− Data Privacy Issues
− Related IT Policies and Standards
• Suppliers
− Data Stewards
− IT Steering Committee
− Data Stewardship Council
− Government
− Customers
• Participants
− Data Stewards
− Data Security Administrators
− Database Administrators
− BI Analysts
− Data Architects
− DM Leader
− CIO/CTO
− Help Desk Analysts
• Tools
− Database Management System
− Business Intelligence Tools
− Application Frameworks
− Identity Management Technologies
− Change Control Systems
• Primary Deliverables
− Data Security Policies
− Data Privacy and Confidentiality Standards
− User Profiles, Passwords and Memberships
− Data Security Permissions
− Data Security Controls
− Data Access Views
− Document Classifications
− Authentication and Access History
− Data Security Audits
• Consumers
− Data Producers
− Knowledge Workers
− Managers
− Executives
− Customers
− Data Professionals


Data Security Management Function, Activities and
Sub-Activities

• Understand Data Security Needs and Regulatory Requirements
− Business Requirements
− Regulatory Requirements
• Define Data Security Policy
• Define Data Security Standards
• Define Data Security Controls and Procedures
• Manage Users, Passwords, and Group Membership
− Password Standards and Procedures
• Manage Data Access Views and Permissions
• Monitor User Authentication and Access Behaviour
• Classify Information Confidentiality
• Audit Data Security


Data Security Management - Principles
• Be a responsible trustee of data about all parties. Understand and respect the privacy and confidentiality needs of all
stakeholders, be they clients, patients, students, citizens, suppliers, or business partners
• Understand and comply with all pertinent regulations and guidelines
• Data-to-process and data-to-role relationship (CRUD Create, Read, Update, Delete) matrices help map data access
needs and guide definition of data security role groups, parameters, and permissions
• Definition of data security requirements and data security policy is a collaborative effort involving IT security
administrators, data stewards, internal and external audit teams, and the legal department
• Identify detailed application security requirements in the analysis phase of every systems development project
• Classify all enterprise data and information products against a simple confidentiality classification schema
• Every user account should have a password set by the user following a set of password complexity guidelines, and
expiring every 45 to 60 days
• Create role groups; define privileges by role; and grant privileges to users by assigning them to the appropriate role
group. Whenever possible, assign each user to only one role group
• Some level of management must formally request, track, and approve all initial authorisations and subsequent
changes to user and group authorisations
• To avoid data integrity issues with security access information, centrally manage user identity data and group
membership data
• Use relational database views to restrict access to sensitive columns and / or specific rows
• Strictly limit and carefully consider every use of shared or service user accounts
• Monitor data access to certain information actively, and take periodic snapshots of data access activity to understand
trends and compare against standards criteria
• Periodically conduct objective, independent, data security audits to verify regulatory compliance and standards
conformance, and to analyse the effectiveness and maturity of data security policy and practice
• In an outsourced environment, be sure to clearly define the roles and responsibilities for data security and
understand the chain of custody for data across organisations and roles


Understand Data Security Needs and Regulatory
Requirements
• Distinguish between business rules and procedures and
the rules imposed by application software products
• It is common for systems to have their own unique set of data
security requirements over and above those required by
business processes


Business Requirements

• Implementing data security within an enterprise requires
an understanding of business requirements
• Business needs of an enterprise define the degree of
rigidity required for data security
• Business rules and processes define the security touch
points
• Data-to-process and data-to-role relationship matrices are
useful tools to map these needs and guide definition of
data security role-groups, parameters, and permissions
• Identify detailed application security requirements in the
analysis phase of every systems development project
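A data-to-role matrix of the kind mentioned above can be held as a simple lookup from role and data entity to the permitted CRUD operations. The Python sketch below is illustrative only: the roles, entities, and permissions are invented, and the helper names are assumptions.

```python
# Rows are roles, columns are data entities, cells list the permitted
# operations using the letters C (Create), R (Read), U (Update), D (Delete).
CRUD_MATRIX = {
    "order_clerk": {"customer": "R", "order": "CRU"},
    "account_manager": {"customer": "CRU", "order": "R"},
    "auditor": {"customer": "R", "order": "R"},
}


def is_allowed(role: str, entity: str, operation: str) -> bool:
    """True if the matrix grants the operation on the entity to the role."""
    if operation not in ("C", "R", "U", "D"):
        raise ValueError("operation must be one of C, R, U, D")
    return operation in CRUD_MATRIX.get(role, {}).get(entity, "")


def roles_with(entity: str, operation: str) -> list:
    """Roles that may perform the operation on the entity, in name order."""
    return sorted(r for r in CRUD_MATRIX if is_allowed(r, entity, operation))
```

For example, `roles_with("customer", "U")` lists the roles that would need update permission on customer data, which is a starting point for defining role groups.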

Regulatory Requirements

• Organisations must comply with a growing set of
regulations
• Some regulations impose security controls on information
management


Define Data Security Policy

• Definition of data security policy based on data security
requirements is a collaborative effort involving IT security
administrators, data stewards, internal and external audit
teams, and the legal department
• Enterprise IT strategy and standards typically dictate high-
level policies for access to enterprise data assets
• Data security policies are more granular in nature and take
a very data-centric approach compared to an IT security
policy


Define Data Security Standards

• There is no one prescribed way of implementing data security to meet
privacy and confidentiality requirements
• Regulations generally focus on achieving an end without
defining the means for achieving it
• Organisations should design their own security controls,
demonstrate that the controls meet the requirements of the law or
regulations and document the implementation of those controls
• Information technology security standards can also affect
− Tools used to manage data security
− Data encryption standards and mechanisms
− Access guidelines to external vendors and contractors
− Data transmission protocols over the internet
− Documentation requirements
− Remote access standards
− Security breach incident reporting procedures

Define Data Security Standards

• Consider physical security, especially with the explosion of portable
devices and media, to formulate an effective data security strategy
− Access to data using mobile devices
− Storage of data on portable devices such as laptops, DVDs, CDs or USB drives
− Disposal of these devices in compliance with records management policies
• An organisation should develop a practical, implementable security
policy including data security guiding principles
• Focus should be on quality and consistency, not on creating a lengthy
body of guidelines
• Execution of the policy requires satisfying the elements of securing
information assets: authentication, authorisation, access, and audit
• Information classification, access rights, role groups, users, and
passwords are the means to implementing policy and satisfying
these elements


Define Data Security Controls and Procedures

• Implementation and administration of data security policy
is primarily the responsibility of security administrators
• Database security is often one responsibility of database
administrators
• Implement proper controls to meet the objectives of
relevant laws
• Implement a process to validate assigned permissions
against a change management system used for tracking all
user permission requests
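Validating assigned permissions against approved requests is essentially a comparison of two sets. The Python sketch below assumes both sides can be exported as (user, role group) pairs; the function name and sample data are hypothetical.

```python
def reconcile(assigned: set, approved: set) -> tuple:
    """Compare permissions in the database with approved change requests.

    Both arguments are sets of (user, role_group) pairs. Returns the pairs
    that exist without approval and the approvals that were never applied.
    """
    unapproved = assigned - approved
    not_applied = approved - assigned
    return unapproved, not_applied


# Hypothetical extracts from the DBMS and from the change management system.
assigned_permissions = {("alice", "finance"), ("bob", "finance"), ("carol", "sales")}
approved_requests = {("alice", "finance"), ("carol", "sales"), ("dave", "sales")}

unapproved, not_applied = reconcile(assigned_permissions, approved_requests)
```

With this sample data, bob's finance membership has no matching request and dave's approved request has not been applied; both would be followed up.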


Manage Users, Passwords, and Group Membership

• Role groups enable security administrators to define
privileges by role and to grant these privileges to users by
enrolling them in the appropriate role group
• Data consistency in user and group management is a
challenge in a mixed IT environment
• Construct group definitions at a workgroup or business
unit level
• Organise roles in a hierarchy, so that child roles further
restrict the privileges of parent roles
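The idea of a role hierarchy in which child roles narrow the privileges of their parent can be sketched as follows. This Python example is an assumption-laden illustration: the class design, role names, privilege names, and users are invented, and each user is assigned to a single role group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Role:
    name: str
    granted: frozenset = frozenset()   # privileges defined on a top-level role
    removed: frozenset = frozenset()   # privileges a child takes away from its parent
    parent: Optional["Role"] = None

    def effective_privileges(self) -> frozenset:
        # A child starts from its parent's privileges and can only restrict them.
        base = self.parent.effective_privileges() if self.parent else self.granted
        return base - self.removed


finance = Role(
    "finance",
    granted=frozenset({"read_invoice", "update_invoice", "approve_payment"}),
)
finance_clerk = Role("finance_clerk", removed=frozenset({"approve_payment"}), parent=finance)

# Privileges are granted to users only through their role group.
user_roles = {"alice": finance, "bob": finance_clerk}


def can(user: str, privilege: str) -> bool:
    """True if the user's role group carries the privilege."""
    role = user_roles.get(user)
    return role is not None and privilege in role.effective_privileges()
```

Because a child role can only remove privileges, reviewing the top-level role is enough to know the maximum access anyone in that branch can hold.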


Password Standards and Procedures

• Passwords are the first line of defense in protecting access
to data
• Every user account should be required to have a password
set by the user with a sufficient level of password
complexity defined in the security standards
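A password standard is easier to enforce consistently when it is written as an explicit check. The Python sketch below shows one possible form; the minimum length and the character-class rules are assumed values rather than recommendations from the source, and the 60-day maximum age is taken from the upper end of the 45 to 60 day range given in the principles.

```python
import string
from datetime import date, timedelta

MIN_LENGTH = 10      # assumed value; set from the organisation's own standard
MAX_AGE_DAYS = 60    # upper end of the 45 to 60 day expiry range


def meets_complexity(password: str) -> bool:
    """True if the password has the assumed length and character mix."""
    return (
        len(password) >= MIN_LENGTH
        and any(c.islower() for c in password)
        and any(c.isupper() for c in password)
        and any(c.isdigit() for c in password)
        and any(c in string.punctuation for c in password)
    )


def is_expired(last_changed: date, today: date) -> bool:
    """True if the password is older than the maximum permitted age."""
    return today - last_changed > timedelta(days=MAX_AGE_DAYS)
```

For instance, `meets_complexity("password")` is false under these assumed rules because the value is too short and has no upper-case letter, digit, or punctuation.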


Manage Data Access Views and Permissions

• Data security management involves not just preventing
inappropriate access, but also enabling valid and appropriate access
to data
• Most sets of data do not have any restricted access requirements
• Control sensitive data access by granting permissions - opt-in
• Access control degrades when achieved through shared or service
accounts
− Implemented as a convenience for administrators, these accounts often come
with enhanced privileges and are untraceable to any particular user or
administrator
− Enterprises using shared or service accounts run the risk of data security
breaches
− Evaluate use of such accounts carefully, and never use them frequently or by
default
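A relational view is one common way to expose only appropriate columns and rows of sensitive data. The Python sketch below uses the standard `sqlite3` module; the table, view name, and data are invented. SQLite has no user permissions, so the step of granting access on the view rather than on the base table, which a multi-user DBMS would need, is assumed and not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", "Sales", 52000.0),
        (2, "Bob", "Sales", 48000.0),
        (3, "Cy", "Finance", 61000.0),
    ],
)

# The view hides the salary column and exposes only Sales rows.
conn.execute(
    """
    CREATE VIEW sales_directory AS
    SELECT employee_id, name, department
    FROM employee
    WHERE department = 'Sales'
    """
)

cursor = conn.execute("SELECT * FROM sales_directory ORDER BY employee_id")
visible_columns = [col[0] for col in cursor.description]
visible_rows = cursor.fetchall()
```

Users who query `sales_directory` see three columns and two rows; the salary column and the Finance row never leave the base table.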

Monitor User Authentication and Access Behaviour

• Monitoring authentication and access behaviour is critical
because:
− It provides information about who is connecting and accessing
information assets, which is a basic requirement for compliance
auditing
− It alerts security administrators to unforeseen situations,
compensating for oversights in data security planning, design, and
implementation
• Monitoring helps detect unusual or suspicious transactions
that may warrant further investigation and issue resolution
• Perform monitoring either actively or passively
• Automated systems with human checks and balances in
place best accomplish both methods
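An automated check that hands suspicious cases to a person for review might look like the following Python sketch. The event format, the function name, and the threshold of three failed logins are assumptions for illustration; real monitoring would read from the DBMS or identity system's own logs.

```python
from collections import Counter


def flag_accounts(events: list, max_failures: int = 3) -> list:
    """Return accounts with more failed logins than the threshold.

    Each event is a (user, succeeded) pair taken from an authentication log.
    """
    failures = Counter(user for user, succeeded in events if not succeeded)
    return sorted(user for user, count in failures.items() if count > max_failures)


# Hypothetical authentication log extract.
events = [
    ("alice", True),
    ("bob", False), ("bob", False), ("bob", False), ("bob", False),
    ("carol", False), ("carol", True),
]

accounts_to_review = flag_accounts(events)
```

The function only flags accounts; deciding whether the activity is a forgotten password or an attack is left to the human check.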

Classify Information Confidentiality

• Classify an organisation’s data and information using a simple
confidentiality classification schema
• Most organisations classify the level of confidentiality for
information found within documents, including reports
• A typical classification schema might include the following five
confidentiality classification levels:
− For General Audiences: Information available to anyone, including the general
public
− Internal Use Only: Information limited to employees or members, but with
minimal risk if shared
− Confidential: Information which should not be shared outside the organisation.
Client Confidential information may not be shared with other clients
− Restricted Confidential: Information limited to individuals performing certain
roles with the need to know
− Registered Confidential: Information so confidential that anyone accessing the
information must sign a legal agreement to access the data and assume
responsibility for its secrecy
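
The five levels above can be represented as an ordered enumeration. This is a minimal Python sketch under a simplifying assumption: access is decided purely by comparing levels. In practice, Restricted Confidential depends on role and need to know, and Registered Confidential on a signed agreement, so the `may_access` rule here is hypothetical.

```python
from enum import IntEnum

class Confidentiality(IntEnum):
    """The five levels from the list above, ordered least to most restricted."""
    GENERAL_AUDIENCES = 1
    INTERNAL_USE_ONLY = 2
    CONFIDENTIAL = 3
    RESTRICTED_CONFIDENTIAL = 4
    REGISTERED_CONFIDENTIAL = 5

def may_access(clearance, label):
    """Assumed rule: a reader may see data at or below their clearance level."""
    return clearance >= label

print(may_access(Confidentiality.CONFIDENTIAL, Confidentiality.INTERNAL_USE_ONLY))
print(may_access(Confidentiality.INTERNAL_USE_ONLY, Confidentiality.RESTRICTED_CONFIDENTIAL))
```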

Audit Data Security

• Auditing data security is a recurring control activity with
responsibility to analyse, validate, counsel, and recommend policies,
standards, and activities related to data security management
• Auditing is a managerial activity performed with the help of analysts
working on the actual implementation and details
• The goal of auditing is to provide management and the data
governance council with objective, unbiased assessments, and
rational, practical recommendations
• Auditing data security is no substitute for effective management of
data security
• Auditing is a supportive, repeatable process, which should occur
regularly, efficiently, and consistently

Audit Data Security

• Auditing data security includes:
− Analysing data security policy and standards against best practices and needs
− Analysing implementation procedures and actual practices to ensure
consistency with data security goals, policies, standards, guidelines, and
desired outcomes
− Assessing whether existing standards and procedures are adequate and in
alignment with business and technology requirements
− Verifying the organisation is in compliance with regulatory requirements
− Reviewing the reliability and accuracy of data security audit data
− Evaluating escalation procedures and notification mechanisms in the event of a
data security breach
− Reviewing contracts, data sharing agreements, and data security obligations of
outsourced and external vendors, ensuring they meet their obligations, and
ensuring the organisation meets its obligations for externally sourced data
− Reporting to senior management, data stewards, and other stakeholders on
the state of data security within the organisation and the maturity of its
practices
− Recommending data security design, operational, and compliance
improvements

Data Security and Outsourcing

• Outsourcing IT operations introduces additional data security
challenges and responsibilities
• Outsourcing increases the number of people who share
accountability for data across organisational and geographic
boundaries
• Previously informal roles and responsibilities must now be explicitly
defined as contractual obligations
• Outsourcing contracts must specify the responsibilities and
expectations of each role
• Any form of outsourcing increases risk to the organisation
• Data security risk is escalated to include the outsource vendor, so
any data security measures and processes must look at the risk from
the outsource vendor not only as an external risk, but also as an
internal risk

Data Security and Outsourcing

• Transferring control, but not accountability, requires tighter risk
management and control mechanisms:
− Service level agreements
− Limited liability provisions in the outsourcing contract
− Right-to-audit clauses in the contract
− Clearly defined consequences to breaching contractual obligations
− Frequent data security reports from the service vendor
− Independent monitoring of vendor system activity
− More frequent and thorough data security auditing
− Constant communication with the service vendor
• In an outsourced environment, it is important to maintain and track
the lineage, or flow, of data across systems and individuals to
maintain a chain of custody



Reference and Master Data Management

• Reference and Master Data Management is the ongoing reconciliation and
maintenance of reference data and master data
− Reference Data Management is control over defined domain values (also
known as vocabularies), including control over standardised terms, code values
and other unique identifiers, business definitions for each value, business
relationships within and across domain value lists, and the consistent, shared
use of accurate, timely and relevant reference data values to classify and
categorise data
− Master Data Management is control over master data values to enable
consistent, shared, contextual use across systems, of the most accurate,
timely, and relevant version of truth about essential business entities
• Reference data and master data provide the context for transaction
data



Reference and Master Data Management –
Definition and Goals
• Definition
− Planning, implementation, and control activities to ensure
consistency with a golden version of contextual data values
• Goals
− Provide authoritative source of reconciled, high-quality master
and reference data
− Lower cost and complexity through reuse and leverage of
standards
− Support business intelligence and information integration efforts



Reference and Master Data Management - Overview

• Inputs
− Business Drivers
− Data Requirements
− Policy and Regulations
− Standards
− Code Sets
− Master Data
− Transactional Data
• Suppliers
− Steering Committees
− Business Data Stewards
− Subject Matter Experts
− Data Consumers
− Standards Organisations
− Data Providers
• Participants
− Data Stewards
− Subject Matter Experts
− Data Architects
− Data Analysts
− Application Architects
− Data Governance Council
− Data Providers
− Other IT Professionals
• Tools
− Reference Data Management Applications
− Master Data Management Applications
− Data Modeling Tools
− Process Modeling Tools
− Metadata Repositories
− Data Profiling Tools
− Data Cleansing Tools
− Data Integration Tools
− Business Process and Rule Engines
− Change Management Tools
• Primary Deliverables
− Master and Reference Data Requirements
− Data Models and Documentation
− Reliable Reference and Master Data
− Golden Record Data Lineage
− Data Quality Metrics and Reports
− Data Cleansing Services
• Consumers
− Application Users
− BI and Reporting Users
− Application Developers and Architects
− Data Integration Developers and Architects
− BI Developers and Architects
− Vendors, Customers, and Partners
• Metrics
− Reference and Master Data Quality
− Change Activity
− Issues, Costs, Volume
− Use and Re-Use
− Availability
− Data Steward Coverage

Reference and Master Data Management Function,
Activities and Sub-Activities

• Understand Reference and Master Data Integration Needs
• Identify Reference and Master Data Sources and Contributors
• Define and Maintain the Data Integration Architecture
• Implement Reference and Master Data Management Solutions
• Define and Maintain Match Rules
• Establish Golden Records
− Vocabulary Management and Reference Data
− Defining Golden Master Data Values
• Define and Maintain Hierarchies and Affiliations
• Plan and Implement Integration of New Data Sources
• Replicate and Distribute Reference and Master Data
• Manage Changes to Reference and Master Data
• Master data subject areas shown in the diagram
− Party Master Data
− Financial Master Data
− Product Master Data
− Location Master Data



Reference and Master Data Management -
Principles
• Shared reference and master data belongs to the organisation, not to a particular
application or department
• Reference and master data management is an on-going data quality improvement
program; its goals cannot be achieved by one project alone
• Business data stewards are the authorities accountable for controlling reference
data values. Business data stewards work with data professionals to improve the
quality of reference and master data
• Golden data values represent the organisation’s best efforts at determining the
most accurate, current, and relevant data values for contextual use. New data
may prove earlier assumptions to be false. Therefore, apply matching rules with
caution, and ensure that any changes that are made are reversible
• Replicate master data values only from the database of record
• Request, communicate, and, in some cases, approve changes to reference data
values before implementation

Reference Data

• Reference data is data used to classify or categorise other data
• Business rules usually dictate that reference data values
conform to one of several allowed values
• In all organisations, reference data exists in virtually every
database
• Reference tables link via foreign keys into other relational
database tables, and the referential integrity functions
within the database management system ensure only valid
values from the reference tables are used in other tables
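
The foreign key mechanism described above can be sketched with Python's built-in `sqlite3` module. The table and column names (`country_code`, `customer`) and the helper `add_customer` are illustrative assumptions; the point is only that the database rejects a value that is missing from the reference table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Reference table holding the allowed values
conn.execute("CREATE TABLE country_code (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
# Table that may only use values present in the reference table
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " country TEXT NOT NULL REFERENCES country_code(code))"
)
conn.executemany(
    "INSERT INTO country_code VALUES (?, ?)",
    [("IE", "Ireland"), ("FR", "France")],
)

def add_customer(name, country):
    """Insert a customer; return False if the country is not a valid reference value."""
    try:
        conn.execute("INSERT INTO customer (name, country) VALUES (?, ?)", (name, country))
        return True
    except sqlite3.IntegrityError:
        return False

print(add_customer("Acme Ltd", "IE"))  # valid reference value
print(add_customer("Globex", "XX"))    # rejected by referential integrity
```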

Master Data

• Master data is data about the business entities that
provide context for business transactions
• Master data is the authoritative, most accurate data
available about key business entities, used to establish the
context for transactional data
• Master data values are considered golden
• Master Data Management is the process of defining and
maintaining how master data will be created, integrated,
maintained, and used throughout the enterprise



Master Data Challenges

• What are the important roles, organisations, places, and things referenced
repeatedly?
• What data is describing the same person, organisation, place, or thing?
• Where is this data stored? What is the source for the data?
• Which data is more accurate? Which data source is more reliable and credible?
Which data is most current?
• What data is relevant for specific needs? How do these needs overlap or conflict?
• What data from multiple sources can be integrated to create a more complete
view and provide a more comprehensive understanding of the person,
organisation, place or thing?
• What business rules can be established to automate master data quality
improvement by accurately matching and merging data about the same person,
organisation, place, or thing?
• How do we identify and restore data that was inappropriately matched and
merged?
• How do we provide our golden data values to other systems across the
enterprise?
• How do we identify where and when data other than the golden values is used?

Party Master Data

• Includes data about individuals, organisations, and the roles they
play in business relationships
• Customer relationship management (CRM) systems perform MDM
for customer data (also called Customer Data Integration (CDI))
• Focus is to provide the most complete and accurate information
about each and every customer
• Need to identify duplicate, redundant and conflicting data
• Party master data issues
− Complexity of roles and relationships played by individuals and organisations
− Difficulties in unique identification
− High number of data sources
− Business importance and potential impact of the data

Financial Master Data

• Includes data about business units, cost centers, profit
centers, general ledger accounts, budgets, projections, and
projects
• Financial MDM solutions focus on not only creating,
maintaining, and sharing information, but also simulating
how changes to existing financial data may affect the
organisation’s bottom line

Product Master Data

• Product master data can consist of information on an
organisation’s products and services or on the entire
industry in which the organisation operates, including
competitor products and services
• Product Lifecycle Management (PLM) focuses on managing
the lifecycle of a product or service from its conception
(such as research), through its development,
manufacturing, sale / delivery, service, and disposal

Location Master Data

• Provides the ability to track and share reference
information about different geographies, and create
hierarchical relationships or territories based on
geographic information to support other processes
• Different industries require specialised earth science data
(geographic data about seismic faults, flood plains, soil,
annual rainfall, and severe weather risk areas) and related
sociological data (population, ethnicity, income, and
terrorism risk), usually supplied from external sources



Understand Reference and Master Data Integration
Needs
• Reference and master data requirements are relatively
easy to discover and understand for a single application
• Potentially much more difficult to develop an
understanding of these needs across applications,
especially across the entire organisation
• Analysing the root causes of a data quality problem usually
uncovers requirements for reference and master data
integration
• Organisations that have successfully managed reference
and master data typically have focused on one subject
area at a time
− Analyse all occurrences of a few business entities, across all
physical databases and for differing usage patterns

Identify Reference and Master Data Sources and
Contributors
• Successful organisations first understand the needs for
reference and master data
• Then trace the lineage of this data to identify the original
and interim source databases, files, applications,
organisations and the individual roles that create and
maintain the data
• Understand both the upstream sources and the
downstream needs to capture quality data at its source

Define and Maintain the Data Integration
Architecture
• Effective data integration architecture controls the shared access, replication, and
flow of data to ensure data quality and consistency, particularly for reference and
master data
• Without data integration architecture, local reference and master data
management occurs in application silos, inevitably resulting in redundant and
inconsistent data
• The selected data integration architecture should also provide common data
integration services
− Change request processing, including review and approval
− Data quality checks on externally acquired reference and master data
− Consistent application of data quality rules and matching rules
− Consistent patterns of processing
− Consistent metadata about mappings, transformations, programs and jobs
− Consistent audit, error resolution and performance monitoring data
− Consistent approach to replicating data
• Establishing master data standards can be a time-consuming task as it may involve
multiple stakeholders
• Apply the same data standards, regardless of integration technology, to enable
effective standardisation, sharing, and distribution of reference and master data

Data Integration Services Architecture

• Data Quality Management
• Data Acquisition, File Management and Audit
− Source Data
− Archives
• Data Standardisation, Cleansing and Matching
− Rules
− Errors
− Staging
• Replication Management
− Reconciled Master Data
− Subscriptions
• Metadata Management
− Business Metadata
− Integration Metadata
− Job Flow and Statistics



Implement Reference and Master Data
Management Solutions
• Reference and master data management solutions are
complex
• Given the variety, complexity, and instability of
requirements, no single solution or implementation
project is likely to meet all reference and master data
management needs
• Organisations should expect to implement reference and
master data management solutions iteratively and
incrementally through several related projects and phases

Define and Maintain Match Rules

• Matching, merging, and linking of data from multiple systems about
the same person, group, place, or thing is a major master data
management challenge
• Matching attempts to remove redundancy, to improve data quality,
and provide information that is more comprehensive
• Data matching is performed by applying inference rules
− Duplicate identification match rules focus on a specific set of fields that
uniquely identify an entity and identify merge opportunities without taking
automatic action
− Match-merge rules match records and merge the data from these records into
a single, unified, reconciled, and comprehensive record.
− Match-link rules identify and cross-reference records that appear to relate to a
master record without updating the content of the cross-referenced record
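
The three kinds of match rule above can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than a prescribed method: the `PartyRecord` fields, the choice of normalised name plus email as the identifying fields, the "first non-empty value wins" merge, and the `G-` prefix for master identifiers.

```python
from dataclasses import dataclass

@dataclass
class PartyRecord:
    source: str
    record_id: str
    name: str
    email: str = ""
    phone: str = ""

def match_key(rec):
    """Assumed identifying fields: whitespace/case-normalised name plus email."""
    return (" ".join(rec.name.lower().split()), rec.email.lower())

def find_duplicates(records):
    """Duplicate identification: group merge candidates, take no automatic action."""
    groups = {}
    for rec in records:
        groups.setdefault(match_key(rec), []).append(rec)
    return [group for group in groups.values() if len(group) > 1]

def match_merge(group):
    """Match-merge: build one record, taking the first non-empty value per field."""
    def first(field):
        return next((getattr(r, field) for r in group if getattr(r, field)), "")
    return PartyRecord("golden", "G-" + group[0].record_id,
                       first("name"), first("email"), first("phone"))

def match_link(group):
    """Match-link: cross-reference source records without changing their content."""
    master_id = "G-" + group[0].record_id
    return {(r.source, r.record_id): master_id for r in group}

records = [
    PartyRecord("crm", "17", "Ann Byrne", "ann@example.com"),
    PartyRecord("billing", "903", "ann  byrne", "ANN@example.com", "555-0100"),
    PartyRecord("crm", "18", "Tom Walsh", "tom@example.com"),
]

for group in find_duplicates(records):
    print(match_merge(group))
    print(match_link(group))
```

Keeping the source records untouched, as `match_link` does, makes it easier to undo a match that later proves wrong.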

Establish Golden Records

• Establishing golden master data values requires more
inference, application of matching rules, and review of the
results

Vocabulary Management and Reference Data

• A vocabulary is a collection of terms / concepts and their
relationships
• Vocabulary management is defining, sourcing, importing,
and maintaining a vocabulary and its associated reference
data
− See ANSI/NISO Z39.19 - Guidelines for the Construction, Format,
and Management of Monolingual Controlled Vocabularies -
http://www.niso.org/kst/reports/standards?step=2&gid=&project_key=7cc9b583cb5a62e8c15d3099e0bb46bbae9cf38a
• Vocabulary management requires the identification of the
standard list of preferred terms and their synonyms
• Vocabulary management requires data governance,
enabling data stewards to assess stakeholder needs

Vocabulary Management and Reference Data

• Key questions to ask to enable vocabulary management:
− What information concepts (data attributes) will this vocabulary support?
− Who is the audience for this vocabulary? What processes do they support, and
what roles do they play?
− Why is the vocabulary needed? Will it support applications, content
management, analytics, and so on?
− Who identifies and approves the preferred vocabulary and vocabulary terms?
− What are the current vocabularies different groups use to classify this
information? Where are they located? How were they created? Who are their
subject matter experts? Are there any security or privacy concerns for any of
them?
− Are there existing standards that can be leveraged to fulfill this need? Are
there concerns about using an external standard vs. internal? How frequently
is the standard updated and what is the degree of change of each update? Are
standards accessible in an easy to import / maintain format in a cost efficient
manner?



Defining Golden Master Data Values

• Golden data values are the data values thought to be the most
accurate, current, and relevant for shared, consistent use across
applications
• Determine golden values by analysing data quality, applying data
quality rules and matching rules, and incorporating data quality
controls into the applications that acquire, create, and update data
• Establish data quality measurements to set expectations, measure
improvements, and help identify root causes of data quality
problems
• Assess data quality through a combination of data profiling activities
and verification against adherence to business rules
• Once the data is standardised and cleansed, the next step is to
attempt reconciliation of redundant data through application of
matching rules
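
Data quality measurement by profiling and business rule checks can be illustrated with a short Python sketch. The sample rows, the allowed country set and the loose email pattern are all assumptions made for the example; real measurements would be defined with the business data stewards.

```python
import re

# Hypothetical customer rows; None marks a missing value
rows = [
    {"id": 1, "email": "ann@example.com", "country": "IE"},
    {"id": 2, "email": None, "country": "FR"},
    {"id": 3, "email": "not-an-email", "country": "XX"},
    {"id": 4, "email": "tom@example.com", "country": "IE"},
]

VALID_COUNTRIES = {"IE", "FR"}  # assumed reference data values
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately loose

def completeness(rows, field):
    """Profiling: share of rows in which the field is populated."""
    return sum(1 for r in rows if r[field] is not None) / len(rows)

def rule_pass_rate(rows, rule):
    """Business rule check: share of rows that satisfy the rule."""
    return sum(1 for r in rows if rule(r)) / len(rows)

def email_is_valid(row):
    return row["email"] is not None and EMAIL_PATTERN.match(row["email"]) is not None

def country_is_valid(row):
    return row["country"] in VALID_COUNTRIES

print(completeness(rows, "email"))
print(rule_pass_rate(rows, email_is_valid))
print(rule_pass_rate(rows, country_is_valid))
```

Tracking these ratios over time gives a baseline against which improvements can be measured.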

Define and Maintain Hierarchies and Affiliations

• Vocabularies and their associated reference data sets are
often more than lists of preferred terms and their
synonyms
• Affiliation management is the establishment and
maintenance of relationships between master data
records



Plan and Implement Integration of New Data
Sources
• Integrating new reference data sources involves
− Receiving and responding to new data acquisition requests from
different groups
− Performing data quality assessment services using data cleansing
and data profiling tools
− Assessing data integration complexity and cost
− Piloting the acquisition of data and its impact on match rules
− Determining who will be responsible for data quality
− Finalising data quality metrics

Replicate and Distribute Reference and Master Data

• Reference and master data may be read directly from a
database of record, or may be replicated from the
database of record to other application databases for
transaction processing, and data warehouses for business
intelligence
• Reference data most commonly appears as pick list values
in applications
• Replication aids maintenance of referential integrity

Manage Changes to Reference and Master Data

• Specific individuals have the role of a business data
steward with the authority to create, update, and retire
reference data
• Formally control changes to controlled vocabularies and
their reference data sets
• Carefully assess the impact of reference data changes



Data Warehousing and Business Intelligence
Management

• A Data Warehouse is a combination of two primary
components
− An integrated decision support database
− Related software programs used to collect, cleanse, transform,
and store data from a variety of operational and external sources
• Both components combine to support historical, analytical,
and business intelligence (BI) requirements
• A Data Warehouse may also include dependent data
marts, which are subset copies of a data warehouse
database
• A Data Warehouse includes any data stores or extracts
used to support the delivery of data for BI purposes



Data Warehousing and Business Intelligence
Management
• Data Warehousing means the operational extract, cleansing,
transformation, and load processes and associated control processes
that maintain the data contained within a data warehouse
• Data Warehousing process focuses on enabling an integrated and
historical business context on operational data by enforcing business
rules and maintaining appropriate business data relationships and
processes that interact with metadata repositories
• Business Intelligence is a set of business capabilities including
− Query, analysis, and reporting activity by knowledge workers to monitor and
understand the financial and operational health of, and make business decisions
about, the enterprise
− Strategic and operational analytics and reporting on corporate operational
data to support business decisions, risk management, and compliance



Data Warehousing and Business Intelligence
Management
• Together Data Warehousing and Business Intelligence
Management is the collection, integration, and
presentation of data to knowledge workers for the
purpose of business analysis and decision-making
• Composed of activities supporting all phases of the
decision support life cycle that provides context
− Moves and transforms data from sources to a common target
data store
− Provides knowledge workers various means of access,
manipulation, and reporting of the integrated target data



Data Warehousing and Business Intelligence
Management – Definition and Goals
• Definition
− Planning, implementation, and control processes to provide
decision support data and support knowledge workers engaged in
reporting, query and analysis
• Goals
− To support and enable effective business analysis and decision
making by knowledge workers
− To build and maintain the environment / infrastructure to support
business intelligence activity, specifically leveraging all the other
data management functions to cost effectively deliver consistent
integrated data for all BI activity



Data Warehousing and Business Intelligence
Management - Overview

• Inputs
− Business Drivers
− BI Data and Access Requirements
− Data Quality Requirements
− Data Security Requirements
− Data Architecture
− Technical Architecture
− Data Modeling Standards and Guidelines
− Transactional Data
− Master and Reference Data
− Industry and External Data
• Suppliers
− Executives and Managers
− Subject Matter Experts
− Data Governance Council
− Information Consumers (Internal and External)
− Data Producers
− Data Architects and Analysts
• Participants
− Business Executives and Managers
− DM Execs and Other IT Management
− BI Program Manager
− SMEs and Other Information Consumers
− Data Stewards
− Project Managers
− Data Architects and Analysts
− Data Integration (ETL) Specialists
− BI Specialists
− Database Administrators
− Data Security Administrators
− Data Quality Analysts
• Tools
− Database Management Systems
− Data Profiling Tools
− Data Integration Tools
− Data Cleansing Tools
− Business Intelligence Tools
− Analytic Applications
− Data Modeling Tools
− Performance Management Tools
− Metadata Repository
− Data Quality Tools
− Data Security Tools
• Primary Deliverables
− DW/BI Architecture
− Data Warehouses
− Data Marts and OLAP Cubes
− Dashboards and Scorecards
− Analytic Applications
− File Extracts (for Data Mining/Statistical Tools)
− BI Tools and User Environments
− Data Quality Feedback Mechanism/Loop
• Consumers
− Knowledge Workers
− Managers and Executives
− External Customers and Systems
− Internal Customers and Systems
− Data Professionals
− Other IT Professionals
• Metrics
− Usage Metrics
− Customer/User Satisfaction
− Subject Area Coverage %
− Response/Performance Metrics



Data Warehousing and Business Intelligence
Management Objectives
• Providing integrated storage of required current and historical data, organised by
subject areas
• Ensuring credible, quality data for all appropriate access capabilities
• Ensuring a stable, high-performance, reliable environment for data acquisition,
data management, and data access
• Providing an easy-to-use, flexible, and comprehensive data access environment
• Delivering both content and access to the content in increments appropriate to
the organisation’s objectives
• Leveraging, rather than duplicating, relevant data management component
functions such as Reference and Master Data Management, Data Governance,
Data Quality, and Metadata
• Providing an enterprise focal point for data delivery in support of the decisions,
policies, procedures, definitions, and standards that arise from DG
• Defining, building, and supporting all data stores, data processes, data
infrastructure, and data tools that contain integrated, post-transactional, and
refined data used for information viewing, analysis, or data request fulfillment
• Integrating newly discovered data as a result of BI processes into the DW for
further analytics and BI use.

Data Warehousing and Business Intelligence
Management Function, Activities and Sub-Activities

• Understand Business Intelligence Information Needs
• Define and Maintain the DW-BI Architecture
• Implement Data Warehouses and Data Marts
• Implement Business Intelligence Tools and User Interfaces
− Query and Reporting Tools
− On Line Analytical Processing (OLAP) Tools
− Analytic Applications
− Implementing Management Dashboards and Scorecards
− Performance Management Tools
− Predictive Analytics and Data Mining Tools
− Advanced Visualisation and Discovery Tools
• Process Data for Business Intelligence
− Staging Areas
− Mapping Sources and Targets
− Data Cleansing and Transformations (Data Acquisition)
• Monitor and Tune Data Warehousing Processes
• Monitor and Tune BI Activity and Performance

Data Warehousing and Business Intelligence
Management Principles
• Obtain executive commitment and support as these projects are labour intensive
• Secure business SMEs as their support and high availability are necessary for getting the correct data
and a useful BI solution
• Be business focused and driven. Make sure DW / BI work is serving real priority business needs and
solving burning business problems. Let the business drive the prioritisation
• Demonstrable data quality is essential
• Provide incremental value. Ideally deliver in continual 2-3 month segments
• Transparency and self service. The more context (metadata of all kinds) provided, the more value
customers derive. Wisely exposing information about the process reduces calls and increases
satisfaction.
• One size does not fit all. Make sure you find the right tools and products for each of your customer
segments
• Think and architect globally, act and build locally. Let the big-picture and end-vision guide the
architecture, but build and deliver incrementally, with much shorter term and more project-based
focus
• Collaborate with and integrate all other data initiatives, especially those for data governance, data
quality, and metadata
• Start with the end in mind. Let the business priority and scope of end-data-delivery in the BI space
drive the creation of the DW content. The main purpose for the existence of the DW is to serve up data
to the end business customers via the BI capabilities
• Summarise and optimise last, not first. Build on the atomic data and add aggregates or summaries as
needed for performance, but not to replace the detail.

Understand Business Intelligence Information Needs

• All projects start with requirements
• Gathering requirements for DW-BIM projects has both similarities to and differences from
gathering requirements for other projects
• For DW-BIM projects, it is important to understand the broader business context of the
business area targeted as reporting is generalised and exploratory
• Capturing the actual business vocabulary and terminology is a key to success
• Document the business context, then explore the details of the actual source data
• Typically, the ETL portion can consume 60%-70% of a DW-BIM project’s budget and time
• The DW is often the first place where the pain of poor quality data in source systems and /
or data entry functions becomes apparent
• Creating an executive summary of the identified business intelligence needs is a best
practice
• When starting a DW-BIM programme, a good way to decide where to start is using a simple
assessment of business impact and technical feasibility
− Technical feasibility will take into consideration things like complexity, availability and state of the
data, and the availability of subject matter experts
− Projects that have high business impact and high technical feasibility are good candidates for
starting.
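
The impact and feasibility assessment can be as simple as scoring each candidate on both axes and shortlisting those that score well on both. In this Python sketch the candidate projects, the 1 to 5 scale and the threshold of 4 are hypothetical values chosen for illustration.

```python
# Hypothetical candidates scored 1 (low) to 5 (high) on each axis
candidates = [
    {"name": "Sales reporting", "impact": 5, "feasibility": 4},
    {"name": "Supplier risk analytics", "impact": 4, "feasibility": 2},
    {"name": "HR headcount dashboard", "impact": 2, "feasibility": 5},
]

def rank_candidates(candidates, threshold=4):
    """Return names scoring at least threshold on both axes, best combined score first."""
    good = [c for c in candidates
            if c["impact"] >= threshold and c["feasibility"] >= threshold]
    good.sort(key=lambda c: c["impact"] + c["feasibility"], reverse=True)
    return [c["name"] for c in good]

print(rank_candidates(candidates))
```

Requiring a minimum on both axes, rather than a high total, keeps out projects that are valuable but not yet feasible, and feasible projects that deliver little value.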

Define and Maintain the DW-BI Architecture

• Successful DW-BIM architecture requires the identification and
bringing together of a number of key roles
− Technical Architect - hardware, operating systems, databases and DW-BIM
architecture
− Data Architect - data analysis, systems of record, data modeling and data
mapping
− ETL Architect / Design Lead - staging and transform, data marts, and schedules
− Metadata Specialist - metadata interfaces, metadata architecture and
contents
− BI Application Architect / Design Lead - BI tool interfaces and report design,
metadata delivery, data and report navigation and delivery
• Technical requirements including performance, availability, and
timing needs are key drivers in developing the DW-BIM architecture
• The design decisions and principles for what data detail the DW
contains is a key design priority for DW-BIM architecture
• Important that the DW-BIM architecture integrate with the overall
corporate reporting architecture

Define and Maintain the DW-BI Architecture

• No DW-BIM effort can be successful without business acceptance of
data
• Business acceptance includes the data being understandable, having
verifiable quality and having a demonstrable origin
• Sign-off by the Business on the data should be part of the User
Acceptance Testing
• Structured random testing of the data in the BIM tool against data in
the source systems over the initial load and a few update load cycles
should be performed to meet sign-off criteria
• Meeting these requirements is paramount for every DW-BIM
architecture
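
Structured random testing of warehouse data against the source can be sketched as a seeded random sample of business keys whose values are compared on both sides. The extracts, the key format and the `sample_mismatches` helper below are assumptions for illustration; a real test would query the source systems and the BI tool.

```python
import random

# Hypothetical extracts keyed by business key: source system vs. warehouse
source = {f"ORD-{i}": i * 10 for i in range(1, 101)}
warehouse = dict(source)
warehouse["ORD-42"] = 999  # simulated load defect

def sample_mismatches(source, warehouse, sample_size, seed):
    """Compare a reproducible random sample of source keys against the warehouse."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-run for sign-off
    keys = rng.sample(sorted(source), sample_size)
    return sorted(k for k in keys if warehouse.get(k) != source[k])

# Sampling every row is the degenerate case and must find the defect
print(sample_mismatches(source, warehouse, sample_size=len(source), seed=1))
```

Repeating the sample after the initial load and after each of the first few update cycles gives the business evidence to support sign-off.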

March 8, 2010 250


Implement Data Warehouses and Data Marts

• The purpose of a data warehouse is to integrate data from multiple
sources and then serve up that integrated data for BI purposes
• Consumption is typically through data marts or other systems
• A single data warehouse will integrate data from multiple source
systems and serve data to multiple data marts
• The purpose of data marts is to provide data for analysis to knowledge
workers
• Start with the end in mind - identify the business problem to solve,
then identify the details and what would be used and continue to
work back into the integrated data required and ultimately all the
way back to the data sources.

March 8, 2010 251


Implement Business Intelligence Tools and User
Interfaces
• There is a well-defined set of well-proven BI tools
• Implementing the right BI tool or User Interface (UI) is
about identifying the right tools for the right user set
• Almost all BI tools also come with their own metadata
repositories to manage their internal data maps and
statistics

March 8, 2010 252


Query and Reporting Tools

• Query and reporting is the process of querying a data
source and then formatting the results to create a report
• With business query and reporting the data source is more
often a data warehouse or data mart
• While IT develops production reports, power users and
casual business users develop their own reports with
business query tools
• Business query and reporting tools serve users who want
to author their own reports or create outputs for use by
others

March 8, 2010 253


Query and Reporting Tools Landscape

[Figure: query and reporting tools landscape, relating user groups to tool types]
• User groups: Customers, Suppliers and Regulators; Frontline Workers; Executives and
Managers; Analysts and Information Workers; IT Developers
• Tool types: Published Reports, Embedded BI, Scorecards, Dashboards, Interactive Fixed
Reports, OLAP, BI Spreadsheets, Business Query, Production Reporting Tools, Statistics
• The figure distinguishes commonly used tools from specialist tools
March 8, 2010 254
On Line Analytical Processing (OLAP) Tools

• OLAP provides interactive, multi-dimensional analysis with different
dimensions and different levels of detail
• The value of OLAP tools and cubes is reduction of the chance of
confusion and erroneous interpretation by aligning the data content
with the analyst's mental model
• Common OLAP operations include slice and dice, drill down, drill up,
roll up, and pivot
− Slice - a slice is a subset of a multi-dimensional array corresponding to a single
value for one or more members of the dimensions not in the subset
− Dice - the dice operation is a slice on more than two dimensions of a data
cube, or more than two consecutive slices
− Drill Down / Up - drilling down or up is a specific analytical technique whereby
the user navigates among levels of data, ranging from the most summarised
(up) to the most detailed (down)
− Roll-Up – a roll-up involves computing all of the data relationships for one or
more dimensions. To do this, define a computational relationship or formula
− Pivot - to change the dimensional orientation of a report or page display
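
The OLAP operations listed above can be illustrated on a very small cube. The sketch below uses plain Python rather than an OLAP engine; the fact table, the dimension names and the function names are all hypothetical, and each function is a simplified reading of the corresponding operation (dice here means restricting several dimensions at once).

```python
from collections import defaultdict

# A tiny cube: each fact has three dimensions and one measure (sales).
facts = [
    {"region": "North", "product": "A", "month": "2010-01", "sales": 100},
    {"region": "North", "product": "B", "month": "2010-01", "sales": 50},
    {"region": "South", "product": "A", "month": "2010-02", "sales": 70},
    {"region": "South", "product": "B", "month": "2010-04", "sales": 30},
]


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [r for r in rows
            if all(r[d] in vals for d, vals in allowed.items())]


def roll_up(rows, dims, level=lambda r, d: r[d]):
    """Roll up: sum the measure over every dimension not named in `dims`."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(level(r, d) for d in dims)] += r["sales"]
    return dict(totals)


def quarter(r, d):
    """Map a month to its quarter (drill up); other dimensions unchanged."""
    if d != "month":
        return r[d]
    year, month = r["month"].split("-")
    return f"{year}-Q{(int(month) - 1) // 3 + 1}"


def pivot(totals):
    """Pivot: swap the two dimensions of a two-dimensional result."""
    return {(b, a): v for (a, b), v in totals.items()}


north = slice_(facts, "region", "North")
south_all_products = dice(facts, region={"South"}, product={"A", "B"})
by_region = roll_up(facts, ["region"])
by_quarter = roll_up(facts, ["month"], level=quarter)  # drilled up
by_month = roll_up(facts, ["month"])                   # drilled down
region_by_product = roll_up(facts, ["region", "product"])
product_by_region = pivot(region_by_product)
```

Drilling up and down is expressed here as rolling up the same facts at two levels of the time hierarchy, quarter and month.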

March 8, 2010 255


Analytic Applications

• Analytic applications include the logic and processes to
extract data from well-known source systems, such as
vendor ERP systems, a data model for the data mart, and
pre-built reports and dashboards
• Analytic applications provide businesses with a pre-built
solution to optimise a functional area or industry vertical
• Different types of analytic applications include customer,
financial, supply chain, manufacturing, and human
resource applications

March 8, 2010 256


Implementing Management Dashboards and
Scorecards
• Dashboards and scorecards are both ways of efficiently
presenting performance information
• Dashboards are oriented more toward dynamic
presentation of operational information while scorecards
are more static representations of longer-term
organisational, tactical, or strategic goals
• Typically, scorecards are divided into 4 quadrants or views
of the organisation such as Finance, Customer,
Environment, and Employees, each with a number of
metrics

March 8, 2010 257


Performance Management Tools

• Performance management applications include budgeting,
planning, and financial consolidation

March 8, 2010 258


Predictive Analytics and Data Mining Tools

• Data mining is a particular kind of analysis that reveals
patterns in data using various algorithms
• A data mining tool will help users discover relationships or
show patterns in a more exploratory fashion

March 8, 2010 259


Advanced Visualisation and Discovery Tools

• Advanced visualisation and discovery tools allow users to
interact with the data in a highly visual, interactive way
• Patterns in a large dataset can be difficult to recognise in a
display of numbers
• A pattern can be picked up visually fairly quickly when
thousands of data points are loaded into a sophisticated
display on a single page

March 8, 2010 260


Process Data for Business Intelligence

• Most of the work in any DW-BIM effort is in the
preparation and processing of the data

March 8, 2010 261


Staging Areas

• A staging area is the intermediate data store between an
original data source and the centralised data repository
• All required cleansing, transformation, reconciliation, and
relationships happen in this area

March 8, 2010 262


Mapping Sources and Targets

• Source-to-target mapping is the documentation activity that defines
data type details and transformation rules for all required entities
and data elements, from each individual source to each
individual target
• DW-BIM adds additional requirements to this source-to-target
mapping process encountered as a component of any typical data
migration
• One of the goals of the DW-BIM effort should be to provide a
complete lineage for each data element available in the BI
environment all the way back to its respective source(s)
• A solid taxonomy is necessary to match the data elements in
different systems into a consistent structure in the EDW
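
A source-to-target mapping that also supports lineage queries can be sketched as data plus two small functions. The system names, field names and transformation rules below are hypothetical, chosen only to show the idea; real mappings are usually held in a mapping document or in the ETL tool's metadata.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Mapping:
    source_system: str
    source_field: str
    target_field: str
    target_type: type
    transform: Callable = lambda v: v  # transformation rule; identity by default


# Hypothetical mappings from two source systems into one EDW customer entity.
MAPPINGS = [
    Mapping("crm", "cust_nm", "customer_name", str, str.strip),
    Mapping("crm", "cust_id", "customer_id", int),
    Mapping("billing", "CUSTNO", "customer_id", int),
    Mapping("billing", "BAL", "balance", float),
]


def load(source_system, record):
    """Apply the mappings for one source record and return the target row."""
    row = {}
    for m in MAPPINGS:
        if m.source_system == source_system and m.source_field in record:
            value = m.transform(record[m.source_field])
            row[m.target_field] = m.target_type(value)
    return row


def lineage(target_field):
    """Trace a target data element back to its source(s)."""
    return [f"{m.source_system}.{m.source_field}"
            for m in MAPPINGS if m.target_field == target_field]


crm_row = load("crm", {"cust_nm": "  Alice ", "cust_id": "17"})
billing_row = load("billing", {"CUSTNO": "17", "BAL": "12.50"})
customer_id_sources = lineage("customer_id")
```

Because the same mapping table drives both the load and the lineage query, the documented lineage cannot drift away from what the load actually does.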

March 8, 2010 263


Data Cleansing and Transformations (Data
Acquisition)
• Data cleansing focuses on the activities that correct and enhance the
domain values of individual data elements, including enforcement of
standards
• Cleansing is particularly necessary for initial loads where significant
history is involved
• The preferred strategy is to push data cleansing and correction
activity back to the source systems whenever possible
• Data transformation focuses on activities that provide organisational
context between data elements, entities, and subject areas
• Organisational context includes cross-referencing, reference and
master data management and complete and correct relationships
• Data transformation is an essential component of being able to
integrate data from multiple sources
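
The difference between cleansing (correcting the domain value of a single element) and transformation (adding organisational context across elements) can be shown with a small sketch. The country-code standard, the field names and the master customer cross-reference are assumptions made for the example.

```python
# Assumed standard for the example: ISO 3166-1 alpha-2 country codes.
COUNTRY_STANDARD = {"ireland": "IE", "irl": "IE", "ie": "IE",
                    "united kingdom": "GB", "uk": "GB", "gb": "GB"}


def cleanse_country(raw):
    """Cleansing: correct the domain value of one data element.
    Returns None when the value cannot be mapped to the standard."""
    return COUNTRY_STANDARD.get(raw.strip().lower())


def transform(orders, customer_master):
    """Transformation: add organisational context by cross-referencing each
    order's local customer code to the master customer identifier."""
    resolved, unresolved = [], []
    for order in orders:
        master_id = customer_master.get(order["local_customer"])
        target = resolved if master_id is not None else unresolved
        target.append({**order, "master_customer_id": master_id})
    return resolved, unresolved


# Hypothetical cross-reference from local codes to master identifiers
customer_master = {"CRM-17": "M-001", "BILL-9": "M-001"}
orders = [
    {"order": 1, "local_customer": "CRM-17", "country": " Irl "},
    {"order": 2, "local_customer": "WEB-3", "country": "UK"},
]
cleansed = [{**o, "country": cleanse_country(o["country"])} for o in orders]
resolved, unresolved = transform(cleansed, customer_master)
```

Records that land in the unresolved list are candidates for correction at the source system, in line with the preferred strategy above.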

March 8, 2010 264


Monitor and Tune Data Warehousing Processes

• Processing should be monitored across the system for
bottlenecks and dependencies among processes
• Database tuning techniques should be employed where
and when needed, including partitioning and tuned backup
and recovery strategies
• Archiving is a difficult subject in data warehousing
• Users often consider the data warehouse as an active
archive due to the long histories that are built, and are
unwilling, particularly if the OLAP sources have dropped
records, to see the data warehouse engage in archiving

March 8, 2010 265


Monitor and Tune BI Activity and Performance

• A best practice for BI monitoring and tuning is to define
and display a set of customer-facing satisfaction metrics
• Average query response time and the number of users per
day / week / month, are examples of useful metrics to
display
• Regular review of usage statistics and patterns is essential
• Reports providing frequency and resource usage of data,
queries, and reports allow prudent enhancement
• Tuning BI activity is analogous to the principle of profiling
applications in order to know where the bottlenecks are
and where to apply optimisation efforts
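
The metrics mentioned above (average query response time, users per day, and frequency and resource usage per report) can be derived from a query log. The log format, user names and report names in this sketch are hypothetical; most BI tools expose equivalent information in their own audit tables.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical query log entries: (day, user, report, response time in seconds)
query_log = [
    (date(2010, 3, 1), "ann", "sales_summary", 2.0),
    (date(2010, 3, 1), "bob", "sales_summary", 4.0),
    (date(2010, 3, 1), "ann", "stock_levels", 12.0),
    (date(2010, 3, 2), "ann", "sales_summary", 2.0),
]

# Average query response time across the whole log
average_response = mean(seconds for _, _, _, seconds in query_log)

# Number of distinct users per day
users_per_day = defaultdict(set)
for day, user, _, _ in query_log:
    users_per_day[day].add(user)
daily_user_counts = {day: len(users) for day, users in users_per_day.items()}

# Frequency and total time per report show where tuning effort would pay off.
usage = defaultdict(lambda: {"runs": 0, "seconds": 0.0})
for _, _, report, seconds in query_log:
    usage[report]["runs"] += 1
    usage[report]["seconds"] += seconds
slowest = max(usage, key=lambda r: usage[r]["seconds"] / usage[r]["runs"])
```

As with profiling an application, the per-report figures point to the bottleneck: here the least frequently run report is also the slowest on average.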
March 8, 2010 266
Document and Content Management

March 8, 2010 267


Document and Content Management

• Document and Content Management is the control over capture,
storage, access, and use of data and information stored outside
relational databases
• Strategic and tactical focus overlaps with other data management
functions in addressing the need for data governance, architecture,
security, managed metadata, and data quality for unstructured data
• Document and Content Management includes two sub-functions:
− Document management is the storage, inventory, and control of electronic and
paper documents. Document management encompasses the processes,
techniques, and technologies for controlling and organising documents and
records, whether stored electronically or on paper
− Content management refers to the processes, techniques, and technologies for
organising, categorising, and structuring access to information content,
resulting in effective retrieval and reuse. Content management is particularly
important in developing websites and portals, but the techniques of indexing
based on keywords, and organising based on taxonomies, can be applied
across technology platforms.
March 8, 2010 268
Document and Content Management – Definition
and Goals
• Definition
− Planning, implementation, and control activities to store, protect,
and access data found within electronic files and physical records
(including text, graphics, images, audio, and video)
• Goals
− To safeguard and ensure the availability of data assets stored in
less structured formats
− To enable effective and efficient retrieval and use of data and
information in unstructured formats
− To comply with legal obligations and customer expectations
− To ensure business continuity through retention, recovery, and
conversion
− To control document storage operating costs

March 8, 2010 269


Document and Content Management - Overview
• Inputs
− Text Documents
− Reports
− Spreadsheets
− Email
− Instant Messages
− Faxes
− Voicemail
− Images
− Video Recordings
− Audio Recordings
− Printed Paper Files
− Microfiche/Microfilm
− Graphics
• Suppliers
− Employees
− External Parties
• Participants
− All Employees
− Data Stewards
− DM Professionals
− Records Management Staff
− Other IT Professionals
− Data Management Executive
− Other IT Managers
− Chief Information Officer
− Chief Knowledge Officer
• Tools
− Stored Documents
− Office Productivity Tools
− Image and Workflow Management Tools
− Records Management Tools
− XML Development Tools
− Collaboration Tools
− Internet
− Email Systems
• Primary Deliverables
− Managed Records in Many Media Formats
− E-discovery Records
− Outgoing Letters and Emails
− Contracts and Financial Documents
− Policies and Procedures
− Audit Trails and Logs
− Meeting Minutes
− Formal Reports
− Significant Memoranda
• Consumers
− Business and IT Users
− Government Regulatory Agencies
− Senior Management
− External Customers
• Metrics
− Return on Investment
− Key Performance Indicators
− Balanced Scorecards
March 8, 2010 270
Document and Content Management Function,
Activities and Sub-Activities
Document and Content Management
• Document / Record Management
− Plan for Managing Documents / Records
− Implement Document / Record Management Systems for Acquisition,
Storage, Access, and Security Controls
− Backup and Recover Documents / Records
− Retention and Disposition of Documents / Records
− Audit Document / Records Management
• Content Management
− Define and Maintain Enterprise Taxonomies (Information Content
Architecture)
− Document / Index Information Content Metadata
− Provide Content Access and Retrieval
− Govern for Quality Content

March 8, 2010 271


Document and Content Management - Principles

• Everyone in an organisation has a role to play in protecting
its future. Everyone must create, use, retrieve, and dispose
of records in accordance with the established policies and
procedures
• Experts in the handling of records and content should be
fully engaged in policy and planning. Regulatory and best
practices can vary significantly based on industry sector
and legal jurisdiction
• Even if records management professionals are not
available to the organisation, everyone can be trained and
have an understanding of the issues. Once trained,
business stewards and others can collaborate on an
effective approach to records management

March 8, 2010 272


Document and Content Management

• A document management system is an application used to track and
store electronic documents and electronic images of paper
documents
• Document management systems commonly provide storage,
versioning, security, metadata management, content indexing, and
retrieval capabilities
• A content management system is used to collect, organise, index,
and retrieve information content; storing the content either as
components or whole documents, while maintaining links between
components
• While a document management system may provide content
management functionality over the documents under its control, a
content management system is essentially independent of where
and how the documents are stored

March 8, 2010 273


Document / Record Management

• Document / Record Management is the lifecycle management of the
designated significant documents of the organisation
• Records can be
− Physical, such as documents, memos, contracts, reports or microfiche
− Electronic, such as email content, attachments, and instant messaging
− Content on a website
− Documents on all types of media and hardware
− Data captured in databases of all kinds
• More than 90% of the records created today are electronic
• Growth in email and instant messaging has made the management
of electronic records critical to an organisation

March 8, 2010 274


Document / Record Management

• The lifecycle of Document / Record Management includes:
− Identification of existing and newly created documents / records
− Creation, Approval, and Enforcement of documents / records
policies
− Classification of documents / records
− Documents / Records Retention Policy
− Storage: Short and long term storage of physical and electronic
documents / records
− Retrieval and Circulation: Allowing access and circulation of
documents / records in accordance with policies, security and
control standards, and legal requirements
− Preservation and Disposal: Archiving and destroying documents /
records according to organisational needs, statutes, and
regulations

March 8, 2010 275


Plan for Managing Documents / Records

• Plan the document lifecycle from creation or receipt, through organisation for
retrieval and distribution, to archiving or disposition
• Develop classification / indexing systems and taxonomies so that the
retrieval of documents is easy
• Base planning and policy for documents and records on the
value of the data to the organisation and its role as evidence of business
transactions
• Identify the responsible, accountable organisational unit for
managing the documents / records
• Develop and execute a retention plan and policy to archive records, such as
those selected for long-term preservation
• Destroy records at the end of their lifecycle according to
operational needs, procedures, statutes and regulations

March 8, 2010 276


Implement Document / Record Management Systems for
Acquisition, Storage, Access, and Security Controls

• Documents can be created within a document management system
or captured via scanners or OCR software
• Electronic documents must be indexed via keywords or text during
the capture process so that the document can be found
• A document repository enables check-in and check-out features,
versioning, collaboration, comparison, archiving, status state(s),
migration from one storage media to another and disposition
• Document management can support different types of workflows
− Manual workflows that indicate where the user sends the document
− Rules-based workflow, where rules are created that dictate the flow of the
document within an organisation
− Dynamic rules that allow for different workflows based on content
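
A rules-based workflow, including a rule that looks at document content, can be sketched as an ordered list of conditions. The document fields, the value limit and the queue names below are hypothetical and stand in for whatever a given document management system provides.

```python
# Each rule pairs a condition on the document with the next recipient.
# Document fields and queue names are hypothetical.
RULES = [
    (lambda d: d["type"] == "contract" and d["value"] >= 100_000,
     "legal_review"),
    (lambda d: d["type"] == "contract", "contract_admin"),
    # A content-based rule: routing depends on what the document says.
    (lambda d: "confidential" in d["text"].lower(), "records_manager"),
]


def route(document, default="general_inbox"):
    """Return the queue chosen by the first matching rule."""
    for condition, queue in RULES:
        if condition(document):
            return queue
    return default


big_contract = {"type": "contract", "value": 250_000, "text": "Master agreement"}
small_contract = {"type": "contract", "value": 5_000, "text": "Renewal"}
memo = {"type": "memo", "value": 0, "text": "CONFIDENTIAL: salary review"}
note = {"type": "memo", "value": 0, "text": "Canteen opening hours"}

queues = [route(d) for d in (big_contract, small_contract, memo, note)]
print(queues)
```

Rule order matters in this sketch: the first matching rule wins, so the more specific contract rule is listed before the general one.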

March 8, 2010 277


Backup and Recover Documents / Records

• The document / record management system needs to be
included as part of the overall corporate backup and
recovery activities for all data and information
• The document / records manager should be involved in risk mitigation
and management, and in business continuity, especially
regarding security for vital records
• A vital records program provides the organisation with
access to the records necessary to conduct its business
during a disaster and to resume normal business afterward

March 8, 2010 278


Retention and Disposition of Documents / Records

• Defines the period of time during which documents /
records of operational, legal, financial or historical value
must be maintained
• Specifies the processes for compliance, and the methods
and schedules for the disposition of documents / records
• Must deal with privacy and data protection issues
• Legal and regulatory requirements must be considered
when setting up document / record retention schedules
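
A retention schedule can be represented as a lookup from record class to retention period, from which disposition dates follow. The classes and periods below are invented for illustration; real periods come from the applicable legal and regulatory requirements. The legal-hold flag is also an assumption added for the example and is not taken from the slide.

```python
from datetime import date

# Hypothetical retention schedule: record class -> years to keep.
RETENTION_YEARS = {"invoice": 7, "contract": 10, "meeting_minutes": 3}


def disposition_date(record_class, closed_on):
    """Earliest date on which a record may be disposed of."""
    years = RETENTION_YEARS[record_class]
    try:
        return closed_on.replace(year=closed_on.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return closed_on.replace(year=closed_on.year + years, day=28)


def due_for_disposition(records, today):
    """Names of records whose retention period has ended and that are not
    under a legal hold (an assumed flag for this sketch)."""
    return [r["name"] for r in records
            if not r.get("legal_hold", False)
            and disposition_date(r["class"], r["closed_on"]) <= today]


records = [
    {"name": "INV-1", "class": "invoice", "closed_on": date(2003, 3, 1)},
    {"name": "CON-1", "class": "contract", "closed_on": date(2005, 1, 1)},
    {"name": "MIN-1", "class": "meeting_minutes",
     "closed_on": date(2006, 6, 1), "legal_hold": True},
]
due = due_for_disposition(records, today=date(2010, 3, 8))
print(due)
```

Keeping the schedule as data makes it straightforward to review against changing legal requirements without touching the disposition logic.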

March 8, 2010 279


Audit Document / Records Management

• Document / records management requires auditing on a periodic basis to ensure
that the right information is getting to the right people at the right time for
decision making or performing operational activities
− Inventory - Each location in the inventory is uniquely identified
− Storage - Storage areas for physical documents / records have adequate space to
accommodate growth
− Reliability and Accuracy - Spot checks are executed to confirm that the documents /
records are an adequate reflection of what has been created or received
− Classification and Indexing Schemes - Metadata and document file plans are well
described
− Access and Retrieval - End users find and retrieve critical information easily
− Retention Processes - Retention schedule is structured in a logical way
− Disposition Methods - Documents / records are disposed of as recommended
− Security and Confidentiality - Breaches of document / record confidentiality and loss of
documents / records are recorded as security incidents and managed appropriately
− Organisational Understanding of Documents / Records Management - Appropriate
training is provided to stakeholders and staff as to the roles and responsibilities related
to document / records management

March 8, 2010 280


Content Management

• Organisation, categorisation, and structure of data /
resources so that they can be stored, published, and
reused in multiple ways
• Includes data / information that exists in many forms and
in multiple stages of completion within its lifecycle
• Content management systems manage the content of a
website or intranet through the creation, editing, storing,
organising, and publishing of content

March 8, 2010 281


Define and Maintain Enterprise Taxonomies
(Information Content Architecture)
• Process of creating a structure for a body of information or
content
• Contains a controlled vocabulary that can help with
navigation and search systems
• Content Architecture identifies the links and relationships
between documents and content, specifies document
requirements and attributes and defines the structure of
content in a document or content management system

March 8, 2010 282


Document / Index Information Content Metadata

• Development of metadata for unstructured data content


• Maintenance of metadata for unstructured data becomes
the maintenance of a cross-reference of various local
schemes to the official set of organisation metadata

March 8, 2010 283


Provide Content Access and Retrieval

• Once the content has been described by metadata / keyword
tagging and classified within the appropriate
Information Content Architecture, it is available for
retrieval and use
• Finding unstructured data can be eased through portal
technology

March 8, 2010 284


Govern for Quality Content

• Managing unstructured data requires effective
partnerships between data stewards, data professionals,
and records managers
• The focus of data governance can include document and
record retention policies, electronic signature policies,
reporting formats, and report distribution policies
• High quality, accurate, and up-to-date information will aid
in critical business decisions
• Timeliness of the decision-making process with high
quality information may increase competitive advantage
and business effectiveness
March 8, 2010 285
Metadata Management

March 8, 2010 286


Metadata Management

• Metadata is data about data


• Metadata Management is the set of processes that ensure proper creation,
storage, integration, and control to support associated usage of metadata
• Leveraging metadata in an organisation can provide benefits
− Increase the value of strategic information by providing context for the data, thus aiding
analysts in making more effective decisions
− Reduce training costs and lower the impact of staff turnover through thorough
documentation of data context, history, and origin
− Reduce data-oriented research time by assisting business analysts in finding the
information they need, in a timely manner
− Improve communication by bridging the gap between business users and IT
professionals, leveraging work done by other teams, and increasing confidence in IT
system data
− Increase speed of system development time-to-market by reducing system
development life-cycle time
− Reduce risk of project failure through better impact analysis at various levels during
change management
− Identify and reduce redundant data and processes, thereby reducing rework and use of
redundant, out-of-date, or incorrect data
March 8, 2010 287
Metadata Management – Definition and Goals

• Definition
− Planning, implementation, and control activities to enable easy
access to high quality, integrated metadata
• Goals
− Provide organisational understanding of terms and usage
− Integrate metadata from diverse sources
− Provide easy, integrated access to metadata
− Ensure metadata quality and security

March 8, 2010 288


Metadata

• Metadata is information about the physical data, technical and business processes, data rules and
constraints, and logical and physical structures of the data, as used by an organisation
• Descriptive tags describe data, concepts and the connections between the data and concepts
− Business Analytics: Data definitions, reports, users, usage, performance
− Business Architecture: Roles and organisations, goals and objectives
− Business Definitions: The business terms and explanations for a particular concept, fact, or other item
found in an organisation
− Business Rules: Standard calculations and derivation methods
− Data Governance: Policies, standards, procedures, programs, roles, organisations, stewardship
assignments
− Data Integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII, migration /
conversion
− Data Quality: Defects, metrics, ratings
− Document Content Management: Unstructured data, documents, taxonomies, name sets, legal
discovery, search engine indexes
− Information Technology Infrastructure: Platforms, networks, configurations, licenses
− Logical Data Models: Entities, attributes, relationships and rules, business names and definitions
− Physical Data Models: Files, tables, columns, views, business definitions, indexes, usage, performance,
change management
− Process Models: Functions, activities, roles, inputs / outputs, workflow, business rules, timing, stores
− Systems Portfolio and IT Governance: Databases, applications, projects and programs, integration
roadmap, change management
− Service-Oriented Architecture (SOA) Information: Components, services, messages, master data
− System Design and Development: Requirements, designs and test plans, impact
− Systems Management: Data security, licenses, configuration, reliability, service levels

March 8, 2010 289


Metadata Management - Overview
• Inputs
− Metadata Requirements
− Metadata Issues
− Data Architecture
− Business Metadata
− Technical Metadata
− Process Metadata
− Operational Metadata
− Data Stewardship Metadata
• Suppliers
− Data Stewards
− Data Architects
− Data Modelers
− Database Administrators
− Other Data Professionals
− Data Brokers
− Government and Industry Regulators
• Participants
− Metadata Specialist
− Data Integration Architects
− Data Stewards
− Data Architects and Modelers
− Database Administrators
− Other DM Professionals
− Other IT Professionals
− DM Executive
− Business Users
• Tools
− Metadata Repositories
− Data Modeling Tools
− Database Management Systems
− Data Integration Tools
− Business Intelligence Tools
− System Management Tools
− Object Modeling Tools
− Process Modeling Tools
− Report Generating Tools
− Data Quality Tools
− Data Development and Administration Tools
− Reference and Master Data Management Tools
• Primary Deliverables
− Metadata Repositories
− Quality Metadata
− Metadata Models and Architecture
− Metadata Management
− Operational Analysis
− Metadata Analysis
− Data Lineage
− Change Impact Analysis
− Metadata Control Procedures
• Consumers
− Data Stewards
− Data Professionals
− Other IT Professionals
− Knowledge Workers
− Managers and Executives
− Customers and Collaborators
− Business Users
• Metrics
− Metadata Quality
− Master Data Service Data Compliance
− Metadata Repository Contribution
− Metadata Documentation Quality
− Steward Representation / Coverage
− Metadata Usage / Reference
− Metadata Management Maturity
− Metadata Repository Availability
March 8, 2010 290
Metadata Management Function, Activities and Sub-
Activities
Metadata Management
• Understand Metadata Requirements
− Business User Requirements
− Technical User Requirements
• Define the Metadata Architecture
− Centralised Metadata Architecture
− Distributed Metadata Architecture
− Hybrid Metadata Architecture
• Develop and Maintain Metadata Standards
− Industry / Consensus Metadata Standards
− International Metadata Standards
− Standard Metadata Metrics
• Implement a Managed Metadata Environment
• Create and Maintain Metadata
• Integrate Metadata
• Manage Metadata Repositories
− Metadata Repositories
− Directories, Glossaries and Other Metadata Stores
• Distribute and Deliver Metadata
• Query, Report and Analyse Metadata

March 8, 2010 291


Metadata Management - Principles
• Establish and maintain a metadata strategy and appropriate policies, especially clear goals and objectives for
metadata management and usage
• Secure sustained commitment, funding, and vocal support from senior management concerning metadata
management for the enterprise
• Take an enterprise perspective to ensure future extensibility, but implement through iterative and
incremental delivery
• Develop a metadata strategy before evaluating, purchasing, and installing metadata management products
• Create or adopt metadata standards to ensure interoperability of metadata across the enterprise
• Ensure effective metadata acquisition for both internal and external metadata
• Maximise user access, since a solution that is not accessed or is under-accessed will not show business value
• Understand and communicate the necessity of metadata and the purpose of each type of metadata;
socialisation of the value of metadata will encourage business usage
• Measure content and usage
• Leverage XML, messaging, and Web services
• Establish and maintain enterprise-wide business involvement in data stewardship, assigning accountability for
metadata
• Define and monitor procedures and processes to ensure correct policy implementation
• Include a focus on roles, staffing, standards, procedures, training, and metrics
• Provide dedicated metadata experts to the project and beyond
• Certify metadata quality

March 8, 2010 292


Understand Metadata Requirements

• Metadata management strategy must reflect an
understanding of enterprise needs for metadata
• Gather requirements to confirm the need for a metadata
management environment, to set scope and priorities, to
educate and communicate, to guide tool evaluation and
implementation, metadata modeling, internal
metadata standards and the services provided that rely on
metadata, and to estimate and justify staffing needs
• Gather requirements from business and technical users
• Summarise the requirements from an analysis of roles,
responsibilities, challenges, and the information needs of
selected individuals in the organisation

March 8, 2010 293


Business User Requirements

• Business users require improved understanding of the
information from operational and analytical systems
• Business users require a high level of confidence in the
information obtained from corporate data warehouses,
analytical applications, and operational systems
• Business users need appropriate access to information delivery methods,
such as reports, queries, ad hoc analysis, OLAP and dashboards, with a
high degree of quality documentation and context
• Business users must understand the intent and purpose of
metadata management

March 8, 2010 294


Technical User Requirements

• Technical requirement topics include


− Daily feed throughput: size and processing time
− Existing metadata
− Sources - known and unknown
− Targets
− Transformations
− Architecture flow, logical and physical
− Non-standard metadata requirements
• Technical users must understand the business context of
the data at a sufficient level to provide the necessary
support, including implementing the calculations or
derived data rules
March 8, 2010 295
Define the Metadata Architecture

• Metadata management solutions consist of


− Metadata creation / sourcing
− Metadata integration
− Metadata repositories
− Metadata delivery
− Metadata usage
− Metadata control / management

March 8, 2010 296


Centralised Metadata Architecture

• Single metadata repository that contains copies of the live metadata
from the various sources
• Advantages
− High availability, since it is independent of the source systems
− Quick metadata retrieval, since the repository and the query reside together
− Resolved database structures that are not affected by the proprietary nature of
third party or commercial systems
− Extracted metadata may be transformed or enhanced with additional
metadata that may not reside in the source system, improving quality
• Disadvantages
− Complex processes are necessary to ensure that changes in source metadata
quickly replicate into the repository
− Maintenance of a centralised repository can be substantial
− Extraction could require custom additional modules or middleware
− Validation and maintenance of customised code can increase the demands on
both internal IT staff and the software vendors
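
The trade-offs of the centralised approach can be seen in a toy repository that stores copies of source metadata. The class and method names are hypothetical; the sketch only illustrates that lookups do not depend on the source being available, that copies can be enhanced, and that copies are stale until the next refresh.

```python
class CentralRepository:
    """Holds copies of metadata extracted from each source system."""

    def __init__(self):
        self._entries = {}  # (source, element) -> metadata dict

    def refresh(self, source_name, extracted):
        """Copy the latest extract in, keeping enhancements already added.
        Elements dropped at the source are not removed in this sketch."""
        for element, meta in extracted.items():
            self._entries.setdefault((source_name, element), {}).update(meta)

    def enhance(self, source_name, element, **extra):
        """Add metadata that does not exist in the source system."""
        self._entries[(source_name, element)].update(extra)

    def lookup(self, source_name, element):
        """Answer a query from the repository alone; the source is not
        contacted, so it may be offline."""
        return self._entries.get((source_name, element))


# Hypothetical source metadata for a CRM system
live_crm = {"cust_nm": {"type": "varchar(50)"}}

repo = CentralRepository()
repo.refresh("crm", live_crm)
repo.enhance("crm", "cust_nm", steward="Sales Operations")

live_crm["cust_nm"]["type"] = "varchar(80)"           # change at the source
stale_type = repo.lookup("crm", "cust_nm")["type"]    # copy not yet updated
repo.refresh("crm", live_crm)                         # replication step
current = repo.lookup("crm", "cust_nm")
```

The refresh step is where the replication complexity listed under the disadvantages lives.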
March 8, 2010 297
Distributed Metadata Architecture

• A metadata retrieval engine responds to user requests by retrieving
metadata from source systems in real time, with no persistent repository
• Advantages
− Metadata is always as current and valid as possible
− Queries are distributed, possibly improving response / process time
− Metadata requests from proprietary systems are limited to query processing
rather than requiring a detailed understanding of proprietary data structures,
therefore minimising the implementation and maintenance effort required
− Development of automated metadata query processing is likely simpler,
requiring minimal manual intervention
− Batch processing is reduced, with no metadata replication or synchronisation
processes
• Disadvantages
− No enhancement or standardisation of metadata is possible between systems
− Query capabilities are directly affected by the availability of the participating
source systems
− No ability to support user-defined or manually inserted metadata entries since
there is no repository in which to place these additions
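
By contrast, a distributed arrangement stores nothing and asks the sources on every request. The sketch below uses hypothetical class names and an in-memory stand-in for a source system; it shows metadata that is always current and a query that fails when the source is unavailable.

```python
class SourceUnavailable(Exception):
    """Raised when a source system cannot answer a metadata request."""


class SourceSystem:
    """Stand-in for a system that can describe its own data elements."""

    def __init__(self, name, metadata, online=True):
        self.name = name
        self.metadata = metadata
        self.online = online

    def describe(self, element):
        if not self.online:
            raise SourceUnavailable(self.name)
        return self.metadata.get(element)


class RetrievalEngine:
    """Answers each request by asking the sources; nothing is stored."""

    def __init__(self, sources):
        self._sources = {s.name: s for s in sources}

    def lookup(self, source_name, element):
        return self._sources[source_name].describe(element)


crm = SourceSystem("crm", {"cust_nm": {"type": "varchar(50)"}})
engine = RetrievalEngine([crm])

before = engine.lookup("crm", "cust_nm")
crm.metadata["cust_nm"] = {"type": "varchar(80)"}  # change at the source
after = engine.lookup("crm", "cust_nm")            # visible immediately

crm.online = False
try:
    engine.lookup("crm", "cust_nm")
    available = True
except SourceUnavailable:
    available = False
```

There is also nowhere in this design to record a steward or a business definition, which is the last disadvantage listed above.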

March 8, 2010 298


Hybrid Metadata Architecture

• In a hybrid architecture, metadata still moves directly from the source systems
into a repository, but the repository design only accounts for the user-added
metadata, the critical standardised items and the additions from manual sources
• Advantages
− Near-real-time retrieval of metadata from its source and enhanced metadata to meet
user needs most effectively, when needed
− Lowers the effort for manual IT intervention and custom-coded access functionality to
proprietary systems.
• Disadvantages
− Source systems must be available because the distributed nature of the back-end
systems handles processing of queries
− Additional overhead is required to link those initial results with metadata augmentation
in the central repository before presenting the result set to the end user
− Design forces the metadata repository to contain the latest version of the metadata
source and forces it to manage changes to the source, as well
− Sets of program / process interfaces to tie the repository back to the metadata
source(s) must be built and maintained
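
The hybrid lookup described above can be sketched as follows. This is a minimal illustration, not a reference design: `HybridMetadataService`, `crm_adapter` and the sample attributes are hypothetical names, and the "source system" is a stub function rather than a real catalogue query.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A source adapter is any callable that returns live metadata for an element name.
SourceAdapter = Callable[[str], Dict[str, str]]


@dataclass
class HybridMetadataService:
    sources: Dict[str, SourceAdapter]
    # The repository holds only user-added and standardised items, keyed by element name.
    repository: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def annotate(self, element: str, **extra: str) -> None:
        """Store manually supplied metadata in the central repository."""
        self.repository.setdefault(element, {}).update(extra)

    def lookup(self, source: str, element: str) -> Dict[str, str]:
        """Fetch live metadata from the source, then link repository additions to it."""
        if source not in self.sources:
            raise KeyError(f"source system not available: {source}")
        live = self.sources[source](element)          # near-real-time retrieval
        augmented = self.repository.get(element, {})  # user-added metadata
        return {**live, **augmented}


def crm_adapter(element: str) -> Dict[str, str]:
    # Hypothetical stand-in for a query against a source system catalogue.
    return {"name": element, "type": "VARCHAR(50)", "system": "CRM"}


service = HybridMetadataService(sources={"crm": crm_adapter})
service.annotate("customer_name", steward="Sales Operations")
result = service.lookup("crm", "customer_name")
```

The `lookup` method shows both disadvantages listed above: it fails when the source system is unavailable, and every request pays the cost of merging the two result sets.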



Develop and Maintain Metadata Standards

• Check industry or consensus standards and international
standards
• International standards provide the framework from which
the industry standards are developed and executed



Industry / Consensus Metadata Standards

• Understanding the various standards for the implementation and management of
metadata in industry is essential to the appropriate selection and use of a metadata solution for
an enterprise
− OMG (Object Management Group) specifications
• Common Warehouse Metamodel (CWM)
• Information Management Metamodel (IMM)
• MDC Open Information Model (OIM)
• Extensible Markup Language (XML)
• Unified Modeling Language (UML)
• XML Metadata Interchange (XMI)
• Ontology Definition Metamodel (ODM)
− World Wide Web Consortium (W3C) RDF (Resource Description Framework) for describing and
interchanging metadata using XML
− Dublin Core Metadata Initiative (DCMI) interoperable online metadata standard using RDF
− Distributed Management Task Force (DMTF) Web-Based Enterprise Management (WBEM)
Common Information Model (CIM) standards-based management tools facilitating the exchange of
data across otherwise disparate technologies and platforms
− Metadata standards for unstructured data
• ISO 5964 - Guidelines for the establishment and development of multilingual thesauri
• ISO 2788 - Guidelines for the establishment and development of monolingual thesauri
• ANSI/NISO Z39.1 - American Standard Reference Data and Arrangement of Periodicals
• ISO 704 - Terminology work - Principles and methods



International Metadata Standards

• ISO / IEC 11179 is an international metadata standard for
standardising and registering data elements to make
data understandable and shareable



Standard Metadata Metrics

• Controlling the effectiveness of the deployed metadata
environment requires measurements to assess user
uptake, organisational commitment, and content coverage
and quality
− Metadata Repository Completeness
− Metadata Documentation Quality
− Master Data Service Data Compliance
− Steward Representation / Coverage
− Metadata Usage / Reference
− Metadata Management Maturity
− Metadata Repository Availability



Implement a Managed Metadata Environment

• Implement a managed metadata environment in
incremental steps in order to minimise risks to the
organisation and to facilitate acceptance
• First implementation is a pilot to prove concepts and learn
about managing the metadata environment



Create and Maintain Metadata

• Metadata creation and update facility provides for the
periodic scanning and updating of the repository in
addition to the manual insertion and manipulation of
metadata by authorised users and programs
• Audit process validates activities and reports exceptions
• Metadata is the guide to the data in the organisation so its
quality is critical



Integrate Metadata

• Integration processes gather and consolidate metadata from across
the enterprise including metadata from data acquired outside the
enterprise
• Challenges will arise in integration that will require resolution
through the governance process
• Use a non-persistent metadata staging area to store temporary and
backup files that supports rollback and recovery processes and
provides an interim audit trail to assist repository managers when
investigating metadata source or quality issues
• ETL tools used for data warehousing and Business Intelligence
applications are often used effectively in metadata integration
processes



Manage Metadata Repositories

• Implement a number of control activities in order to
manage the metadata environment
• Control of repositories is control of metadata movement
and repository updates performed by the metadata
specialist



Metadata Repositories

• Metadata repository refers to the physical tables in which
the metadata are stored
• Use a generic design that does not merely reflect the source
system database designs
• Metadata should be as integrated as possible; this is
one of the most direct value-added elements of the
repository



Directories, Glossaries and Other Metadata Stores

• A Directory is a type of metadata store that limits the
metadata to the location or source of data in the
enterprise
• A Glossary typically provides guidance for use of terms
• Other Metadata stores include specialised lists such as
source lists or interfaces, code sets, lexicons, spatial and
temporal schema, spatial reference, and distribution of
digital geographic data sets, repositories of repositories
and business rules



Distribute and Deliver Metadata

• Metadata delivery layer is responsible for the delivery of
the metadata from the repository to the end users and to
any applications or tools that require metadata feeds



Query, Report and Analyse Metadata

• Metadata guides management and use of data assets
• A metadata repository must have a front-end application
that supports the search-and-retrieval functionality
required for all this guidance and management of data
assets



Data Quality Management



Data Quality Management

• Critical support process in organisational change management
• Data quality is synonymous with information quality since poor data
quality results in inaccurate information and poor business
performance
• Data cleansing may result in short-term and costly improvements
that do not address the root causes of data defects
• A more rigorous data quality programme is necessary to provide an
economic solution to improved data quality and integrity
• Institutionalising processes for data quality oversight, management,
and improvement hinges on identifying the business needs for
quality data and determining the best ways to measure, monitor,
control, and report on the quality of data
• Continuous process for defining the parameters for specifying
acceptable levels of data quality to meet business needs, and for
ensuring that data quality meets these levels



Data Quality Management – Definition and Goals

• Definition
− Planning, implementation, and control activities that apply quality
management techniques to measure, assess, improve, and ensure
the fitness of data for use
• Goals
− To measurably improve the quality of data in relation to defined
business expectations
− To define requirements and specifications for integrating data
quality control into the system development lifecycle
− To provide defined processes for measuring, monitoring, and
reporting conformance to acceptable levels of data quality



Data Quality Management

• Data quality expectations provide the inputs necessary to
define the data quality framework
• Framework includes defining the requirements, inspection
policies, measures, and monitors that reflect changes in
data quality and performance
• Requirements reflect three aspects of business data
expectations
− Way to record the expectation in business rules
− Way to measure the quality of data within that dimension
− Acceptability threshold
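
The three aspects of a requirement listed above (a recorded rule, a way to measure, and an acceptability threshold) can be sketched as one small structure. This is an illustrative sketch only: `DataQualityRequirement`, the sample customer records and the 0.9 threshold are assumptions made for the example, and treating an empty data set as fully conformant is a design choice, not something the framework prescribes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]


@dataclass(frozen=True)
class DataQualityRequirement:
    name: str
    rule: Callable[[Record], bool]  # the expectation recorded as a business rule
    threshold: float                # minimum acceptable conformance, 0.0 to 1.0

    def measure(self, records: Iterable[Record]) -> float:
        """Return the fraction of records that conform to the rule."""
        rows = list(records)
        if not rows:
            return 1.0  # assumption: nothing to violate the rule
        return sum(1 for r in rows if self.rule(r)) / len(rows)

    def is_acceptable(self, records: Iterable[Record]) -> bool:
        """Compare the measured conformance with the acceptability threshold."""
        return self.measure(records) >= self.threshold


customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4, "email": "d@example.com"},
]
email_present = DataQualityRequirement(
    "email present", lambda r: bool(r.get("email")), threshold=0.9
)
score = email_present.measure(customers)             # 3 of 4 records conform
acceptable = email_present.is_acceptable(customers)  # 0.75 is below 0.9
```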



Data Quality Management Approach

• Planning for the assessment of the current state and
identification of key metrics for measuring data quality
• Deploying processes for measuring and improving the
quality of data
• Monitoring and measuring the levels in relation to the
defined business expectations
• Acting to resolve any identified issues to improve data
quality and better meet business expectations



Data Quality Management - Overview
• Inputs
− Business Requirements
− Data Requirements
− Data Quality Expectations
− Data Policies and Standards
− Business metadata
− Technical metadata
− Data Sources and Data Stores
• Suppliers
− External Sources
− Regulatory Bodies
− Business Subject Matter Experts
− Information Consumers
− Data Producers
− Data Architects
− Data Modelers
• Participants
− Data Quality Analysts
− Data Analysts
− Database Administrators
− Data Stewards
− Other Data Professionals
− DRM Director
− Data Stewardship Council
• Tools
− Data Profiling Tools
− Statistical Analysis Tools
− Data Cleansing Tools
− Data Integration Tools
− Issue and Event Management Tools
• Primary Deliverables
− Improved Quality Data
− Data Management
− Operational Analysis
− Data Profiles
− Data Quality Certification Reports
− Data Quality Service Level Agreements
• Consumers
− Data Stewards
− Data Professionals
− Other IT Professionals
− Knowledge Workers
− Managers and Executives
− Customers
• Metrics
− Data Value Statistics
− Errors / Requirement Violations
− Conformance to Expectations
− Conformance to Service Levels



Data Quality Management Function, Activities and
Sub-Activities
• Data Quality Management
− Develop and Promote Data Quality Awareness
− Define Data Quality Requirements
− Profile, Analyse and Assess Data Quality
− Define Data Quality Metrics
− Define Data Quality Business Rules
− Test and Validate Data Quality Requirements
− Set and Evaluate Data Quality Service Levels
− Continuously Measure and Monitor Data Quality
− Manage Data Quality Issues
− Clean and Correct Data Quality Defects
− Design and Implement Operational DQM Procedures
− Monitor Operational DQM Procedures and Performance



Data Quality Management - Principles

• Manage data as a core organisational asset
• All data elements will have a standardised data definition, data type, and
acceptable value domain
• Leverage Data Governance for the control and performance of DQM
• Use industry and international data standards whenever possible
• Downstream data consumers specify data quality expectations
• Define business rules to assert conformance to data quality expectations
• Validate data instances and data sets against defined business rules
• Business process owners will agree to and abide by data quality SLAs
• Apply data corrections at the original source, if possible
• If it is not possible to correct data at the source, forward data corrections to the
owner of the original source whenever possible
• Report measured levels of data quality to appropriate data stewards, business
process owners, and SLA managers
• Identify a gold record for all data elements



Develop and Promote Data Quality Awareness

• Promoting data quality awareness means more than ensuring that
the right people in the organisation are aware of the existence of
data quality issues
• Establish a data governance framework for data quality
− Set priorities for data quality
− Develop and maintain standards for data quality
− Report relevant measurements of enterprise-wide data quality
− Provide guidance that facilitates staff involvement
− Establish communications mechanisms for knowledge sharing
− Develop and apply certification and compliance policies
− Monitor and report on performance
− Identify opportunities for improvements and build consensus for approval
− Resolve variations and conflicts



Define Data Quality Requirements

• Applications are dependent on the use of data that meets specific needs associated with
the successful completion of a business process
• Data quality requirements are often hidden within defined business policies
− Identify key data components associated with business policies
− Determine how identified data assertions affect the business
− Evaluate how data errors are categorised within a set of data quality dimensions
− Specify the business rules that measure the occurrence of data errors
− Provide a means for implementing measurement processes that assess conformance to those
business rules
• Dimensions of data quality
− Accuracy
− Completeness
− Consistency
− Currency
− Precision
− Privacy
− Reasonableness
− Referential Integrity
− Timeliness
− Uniqueness
− Validity



Profile, Analyse and Assess Data Quality

• Perform an assessment of the data using two different approaches,
bottom-up and top-down
• Bottom-up assessment of existing data quality issues involves
inspection and evaluation of the data sets themselves
• Top-down approach involves understanding how business processes
consume data, and which data elements are critical to the success of
the business application
− Identify a data set for review
− Catalog the business uses of that data set
− Subject the data set to empirical analysis using data profiling tools and
techniques
− List all potential anomalies, review and evaluate
− Prioritise criticality of important anomalies in preparation for defining data
quality metrics
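
The empirical analysis step can be illustrated with a very small column profiler. This is a sketch using only the Python standard library, not a substitute for a data profiling tool; `profile_column` and the sample `orders` data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Mapping, Optional


def profile_column(rows: List[Mapping[str, Optional[str]]], column: str) -> Dict[str, object]:
    """Summarise one column: row count, missing values, distinct values, top values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v not in (None, "")]
    frequencies = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(frequencies),
        "most_common": frequencies.most_common(3),
    }


orders = [
    {"order_id": "1001", "country": "IE"},
    {"order_id": "1002", "country": "IE"},
    {"order_id": "1003", "country": "Ireland"},
    {"order_id": "1004", "country": None},
    {"order_id": "1004", "country": "UK"},
]
country_profile = profile_column(orders, "country")
order_id_profile = profile_column(orders, "order_id")
```

In this sample the profile surfaces three candidate anomalies for review: a missing country, two spellings of the same country, and an `order_id` column with fewer distinct values than rows.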



Define Data Quality Metrics

• Poor data quality affects the achievement of business objectives
• Seek and use indicators of data quality performance to report the
relationship between flawed data and missed business objectives
• Measure data quality in the same way as any other type of business
performance activity
• Data quality metrics should be reasonable and effective
− Measurability
− Business Relevance
− Acceptability
− Accountability / Stewardship
− Controllability
− Trackability



Define Data Quality Business Rules

• Measurement of conformance to specific business rules
requires definition
• Monitoring conformance to these rules requires
− Segregating data values, records, and collections of
records that do not meet business needs from the valid
ones
− Generating a notification event alerting a data steward of a
potential data quality issue
− Establishing an automated or event-driven process for
aligning or possibly correcting flawed data within business
expectations
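
A minimal sketch of the first two monitoring steps, segregating non-conforming records and generating steward notifications, might look like this. The rule names, the `segregate` and `notify_steward` functions and the steward identifier are all hypothetical, and the notification is returned as text rather than sent through any real messaging system.

```python
from typing import Callable, Dict, List, Mapping, Tuple

Record = Mapping[str, object]
Rule = Callable[[Record], bool]


def segregate(records: List[Record], rules: Dict[str, Rule]) -> Tuple[List[Record], List[dict]]:
    """Split records into valid ones and exceptions annotated with the failed rules."""
    valid, exceptions = [], []
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        if failed:
            exceptions.append({"record": record, "failed_rules": failed})
        else:
            valid.append(record)
    return valid, exceptions


def notify_steward(exceptions: List[dict], steward: str) -> List[str]:
    """Stand-in for a notification event: returns the messages that would be sent."""
    return [
        f"To {steward}: record {e['record'].get('id')} failed {', '.join(e['failed_rules'])}"
        for e in exceptions
    ]


rules = {
    "age_in_range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    "country_present": lambda r: bool(r.get("country")),
}
records = [
    {"id": 1, "age": 34, "country": "IE"},
    {"id": 2, "age": 212, "country": "IE"},
    {"id": 3, "age": 51, "country": ""},
]
valid, exceptions = segregate(records, rules)
messages = notify_steward(exceptions, "customer-data-steward")
```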
Test and Validate Data Quality Requirements

• Data profiling tools analyse data to find potential anomalies
• Data profiling tools allow data analysts to define data rules for
validation, assessing frequency distributions and corresponding
measurements and then applying the defined rules against the data
sets
• Characterising data quality levels based on data rule conformance
provides an objective measure of data quality
• By using defined data rules to validate data, an organisation can
distinguish those records that conform to defined data quality
expectations and those that do not
• In turn, these data rules are used to baseline the current level of
data quality as compared to ongoing audits



Set and Evaluate Data Quality Service Levels

• Data quality SLAs specify the organisation’s expectations for response and
remediation
• Having data quality inspection and monitoring in place increases the likelihood of
detection and remediation of a data quality issue before a significant business
impact can occur
• Operational data quality control defined in a data quality SLA includes
− The data elements covered by the agreement
− The business impacts associated with data flaws
− The data quality dimensions associated with each data element
− The expectations for quality for each data element for each of the identified dimensions
in each application or system in the value chain
− The methods for measuring against those expectations
− The acceptability threshold for each measurement
− The individual(s) to be notified in case the acceptability threshold is not met
− The timelines and deadlines for expected resolution or remediation of the issue
− The escalation strategy and possible rewards and penalties when the resolution times
are met or missed
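
The SLA contents listed above map naturally onto a small record plus a check. The following is a sketch under stated assumptions: the field names, the example threshold of 0.98, the 24-hour resolution window and the contact names are invented for illustration and are not taken from any standard SLA template.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class DataQualitySLA:
    data_element: str           # element covered by the agreement
    dimension: str              # data quality dimension being measured
    threshold: float            # acceptability threshold for the measurement
    notify: List[str]           # individuals to notify on a breach
    resolution_time: timedelta  # deadline for resolution or remediation
    escalate_to: str            # escalation contact if the deadline is missed


@dataclass(frozen=True)
class Breach:
    data_element: str
    measured: float
    notify: List[str]
    resolve_by: datetime


def evaluate(sla: DataQualitySLA, measured: float, measured_at: datetime) -> Optional[Breach]:
    """Return a Breach when the measurement falls below the threshold, else None."""
    if measured >= sla.threshold:
        return None
    return Breach(sla.data_element, measured, sla.notify, measured_at + sla.resolution_time)


def escalation_target(sla: DataQualitySLA, breach: Breach, now: datetime) -> Optional[str]:
    """Return the escalation contact once the resolution deadline has passed."""
    return sla.escalate_to if now > breach.resolve_by else None


sla = DataQualitySLA(
    "customer.email", "completeness", 0.98,
    ["steward@example.com"], timedelta(hours=24), "data-governance-council",
)
breach = evaluate(sla, 0.95, datetime(2010, 3, 8, 9, 0))
no_breach = evaluate(sla, 0.99, datetime(2010, 3, 8, 9, 0))
escalated = escalation_target(sla, breach, datetime(2010, 3, 10, 9, 0))
```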



Continuously Measure and Monitor Data Quality

• Provide continuous monitoring by incorporating control
and measurement processes into the information
processing flow
• Incorporating the results of the control and measurement
processes into both the operational procedures and
reporting frameworks enables continuous monitoring of the
levels of data quality



Manage Data Quality Issues

• Supporting the enforcement of the data quality SLA requires a
mechanism for reporting and tracking data quality incidents and
activities for researching and resolving those incidents
• Data quality incident reporting system provides this capability
• Tracking of data quality incidents provides performance reporting
data, including mean-time-to-resolve issues, frequency of
occurrence of issues, types of issues, sources of issues and common
approaches for correcting or eliminating problems
• Data quality incident tracking also requires a focus on training staff
to recognise when data issues appear and how they are to be
classified, logged and tracked according to the data quality SLA
• Implementing a data quality issues tracking system provides a
number of benefits
− Information and knowledge sharing can improve performance and reduce
duplication of effort
− Analysis of all the issues will help data quality team members determine any
repetitive patterns, their frequency, and potentially the source of the issue
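
The performance reporting data mentioned above, such as mean time to resolve and frequency by type or source, can be derived from a simple incident log. This is a minimal sketch; the `Incident` fields and the sample incidents are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Incident:
    issue_type: str
    source: str
    opened: datetime
    resolved: Optional[datetime] = None  # None while the incident is still open


def mean_time_to_resolve(incidents: List[Incident]) -> Optional[timedelta]:
    """Average resolution time over resolved incidents, or None if none are resolved."""
    durations = [i.resolved - i.opened for i in incidents if i.resolved is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


def frequency_by(incidents: List[Incident], attribute: str) -> Counter:
    """Count incidents grouped by an attribute such as 'issue_type' or 'source'."""
    return Counter(getattr(i, attribute) for i in incidents)


incidents = [
    Incident("missing value", "CRM", datetime(2010, 3, 1, 9), datetime(2010, 3, 1, 11)),
    Incident("duplicate record", "CRM", datetime(2010, 3, 2, 9), datetime(2010, 3, 2, 13)),
    Incident("missing value", "Billing", datetime(2010, 3, 3, 9)),
]
mttr = mean_time_to_resolve(incidents)
by_source = frequency_by(incidents, "source")
by_type = frequency_by(incidents, "issue_type")
```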
Clean and Correct Data Quality Defects

• Perform data correction in three general ways
− Automated correction - Submit the data to data quality and data
cleansing techniques using a collection of data transformations
and rule-based standardisations, normalisations, and corrections
− Manual directed correction - Use automated tools to cleanse and
correct data but require manual review before committing the
corrections to persistent storage
− Manual correction - Data stewards inspect invalid records and
determine the correct values, make the corrections, and commit
the updated records
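
The difference between automated and manual directed correction can be sketched with a single rule-based standardisation. Everything here is illustrative: the `COUNTRY_STANDARD` lookup table is a hypothetical rule set, and the `approve` callback stands in for a human review step.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical standardisation rules mapping free-text values to country codes.
COUNTRY_STANDARD = {"ireland": "IE", "ie": "IE", "uk": "GB", "united kingdom": "GB"}


def standardise_country(value: str) -> str:
    """Return the standard code, or the original value when no rule applies."""
    return COUNTRY_STANDARD.get(value.strip().lower(), value)


def automated_correction(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Apply the rules to every record with no manual review."""
    return [{**r, "country": standardise_country(r["country"])} for r in records]


def propose_corrections(records: List[Dict[str, str]]) -> List[Tuple[int, str, str]]:
    """Manual directed correction, step 1: list (index, old, new) without committing."""
    proposals = []
    for index, r in enumerate(records):
        corrected = standardise_country(r["country"])
        if corrected != r["country"]:
            proposals.append((index, r["country"], corrected))
    return proposals


def commit_approved(records, proposals, approve: Callable[[str, str], bool]):
    """Manual directed correction, step 2: commit only the reviewed and approved changes."""
    result = [dict(r) for r in records]
    for index, old, new in proposals:
        if approve(old, new):
            result[index]["country"] = new
    return result


records = [
    {"id": "1", "country": "Ireland"},
    {"id": "2", "country": " uk "},
    {"id": "3", "country": "IE"},
]
auto = automated_correction(records)
proposals = propose_corrections(records)
reviewed = commit_approved(records, proposals, approve=lambda old, new: new == "IE")
```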



Design and Implement Operational DQM Procedures

• Using defined rules for validation of data quality provides a
means of integrating data inspection into a set of
operational procedures associated with active DQM
• Design and implement detailed procedures for
operationalising activities
− Inspection and monitoring
− Diagnosis and evaluation of remediation alternatives
− Resolving the issue
− Reporting



Monitor Operational DQM Procedures and
Performance
• Accountability is critical to the governance protocols
overseeing data quality control
• Issues must be assigned to some number of individuals,
groups, departments, or organisations
• Tracking process should specify and document the
ultimate issue accountability to prevent issues from
dropping through the cracks
• Metrics can provide valuable insights into the effectiveness
of the current workflow, as well as systems and resource
utilisation and are important management data points that
can drive continuous operational improvement for data
quality control



Conducting a Data Management Project



Conducting a Data Management Project

• Data management project depends on:
− Scope of the Project – data management functions to be
encompassed
− Type of Project – from architecture to analysis to implementation
− Scope Within the Organisation – one or more business units or
the entire organisation



Data Management Function and Project Type

• Scope of Project (data management functions, matrix columns)
− Data Governance
− Data Architecture Management
− Data Development
− Data Operations Management
− Data Security Management
− Reference and Master Data Management
− Data Warehousing and Business Intelligence Management
− Document and Content Management
− Metadata Management
− Data Quality Management
• Type of Project (matrix rows)
− Architecture
− Analysis and Design
− Implementation
− Operational Improvement
− Management and Administration



Mapping the Path Through the Selected Data
Management Project
• Use the framework to define the breakdown of the selected project



Project Elements – Data Management Functions,
Type of Project, Organisational Scope

Diagram axes: Organisational Scope of Project; Data Management Functions Within Scope of Project; Type of Project

• Select the project building blocks based on the project scope


Creating a Data Management Team



Creating a Data Management Team

• Having implemented a data management framework, it
must be monitored, managed and constantly improved
• Need to consolidate and coordinate data management and
governance efforts to meet the challenges of
− Demand for performance management data
− Complexity in systems and processes
− Greater regulatory and compliance requirements
• Build a Data Management Center of Excellence (DMCOE)



Data Management Center of Excellence

• Separate business units within the organisation generally implement their own
solutions
• Each business unit will have different IT systems, data warehouses/data marts and
business intelligence tools
• Organisation-wide coordination of data resources requires a centralised dedicated
structure like the DMCOE providing data services
• Leads an organisation to business benefits through continuous improvement of
data management
• DMCOE functions need to focus on leveraging organisational knowledge and skills
to maximise the value of data to the organisation
• Maximise technology investment while decreasing costs and increasing efficiency,
centralise best practices and standards, empower knowledge workers with
information and provide thought leadership to the entire company
• DMCOE does not exist in isolation from other operations and service management
functions



DMCOE Functions

• Maximise the value of the data technology investment to
the organisation by taking a portfolio approach to increase
skills and leverage and to optimise the infrastructure
• Focus on project delivery and information asset creation
with an emphasis on reusability and knowledge
management along with solution delivery
• Ensure the integrity of the organisation’s business
processes and information systems
• Ensure the quality compliance effort related to the
configuration, development, and documentation of
enhancements
• Develop information learning and effective practices



Data Charter

• Create a charter that lists the fundamental principles of data
management the DMCOE will adhere to:
− Data Strategy - Create a data blueprint, based upon business
functions to facilitate data design
− Data Sharing - Promote the sharing of data across the
organisation and reduce data redundancy
− Data Integrity - Ensure the integrity of data from design and
availability perspectives
− Technical Expertise - Provide the expertise for the development
and support of data systems
− High Availability and Optimal Performance - Ensure consistent
high availability of data systems through proper design and use
and optimise performance of the data systems
DMCOE Skills

• DMCOE needs skills across three dimensions
− Specific data management functions
− Business management and administration
− Technology and service management



DMCOE Skills
• Data Management Business Skills
− Data Management Strategy
− Data Management Portfolio Management
− Data Management Process Management
− Personnel Management
− Data Management Design and Development
• Data Management Specific Functions
− Data Governance
− Data Architecture Management
− Data Development
− Data Operations Management
− Data Security Management
− Reference and Master Data Management
− Data Warehousing and Business Intelligence Management
− Document and Content Management
− Metadata Management
− Data Quality Management
• Data Management Technology and Service Functions
− Environment and Infrastructure Management
− Service Management and Support
− Application Deployment and Data Migration
− Technical Architecture
• Idealised set of DMCOE skills that needs to be customised to suit specific organisation needs
• Just one view of a DMCOE



DMCOE Business Management and Administration
Skills
• Skill areas
− Data Management Strategy
− Data Management Portfolio Management
− Data Management Process Management
− Personnel Management
− Data Management Design and Development
• Skills within these areas
− Strategic Planning Processes
− Management of Portfolio of Data Management Initiatives
− Creation and Enforcement of Process Standards
− Education and Skills Identification and Development
− Requirements Definition and Management
− Co-ordination of Data Management Systems and Initiatives
− Management of Data Processes
− Resource Management and Allocation
− Analysis and Design
− Creation and Enforcement of Data Principles and Standards
− Vendor Management
− Performance Management
− Development Standards
− Data Usage Strategy
− Data Quality
− Solution Development and Deployment



DMCOE Technology and Service Management Skills
• Environment and Infrastructure Management
− Change Management and Control
− Version Management and Control
− Performance Monitoring and Management
− Reporting
• Service Management and Support
− Service Desk
− Service Level Management
− Security Management
− System Maintenance
• Application Deployment and Data Migration
− Application Deployment
− Test Management – System, Integration, UAT, UAT Support
− Data Migration Management
• Technical Architecture
− Infrastructure Architecture
− Application and Tools Architecture
− Data and Content Architecture
− Integration Architecture



Benefits of DMCOE

• Consistent infrastructure that reduces time to analyse,
design and implement new IT solutions
• Reduced data management costs through a consistent
data architecture and data integration infrastructure -
reduced complexity, redundancy, tool proliferation
• Centralised repository of the organisation's data
knowledge
• Organisation-wide standard methodology and processes to
develop and maintain data infrastructure and procedures
• Increased data availability
• Increased data quality



Assessing Your Data Management Maturity



Assessing Your Data Management Maturity

• A Data Management Maturity Model provides both a measure of, and a
process for determining, the level of maturity that exists within an
organisation’s data management function
• Provides a systematic framework for improving data management
capability and identifying and prioritising opportunities, reducing
cost and optimising the business value of data management
investments
• Measure of data management maturity so that:
− It can be tracked over time to measure improvements
− It can be used to define projects for data management maturity improvements
within costs, time, and return on investment constraints
• Enables organisations to improve their data management function
so that they can increase productivity, increase quality, decrease
cost and decrease risk



Data Management Maturity Model

• Assesses data management maturity on a scale of 1 to 5
across a number of data management capabilities

Level  Title                        Description
1      Initial                      Data management is ad hoc and localised. Everybody has their own approach that is unique and not standardised except for local initiatives.
2      Repeatable and Reactive      Data management has become independent of the person or business unit administering and is standardised.
3      Defined and Standardised     Data management is fully documented, determined by subject matter experts and validated.
4      Managed and Predictable      Data management results and outcomes are stored and proactively cross-related within and between business units. The data management function actively exploits the benefits of standardisation.
5      Optimising and Innovating    As time, resources, technology, requirements and the business landscape change, the data management function can be adjusted easily and quickly to fit new needs and environments.



Maturity Level 1 - Initial

• Data management processes are mostly disorganised and generally performed on
an ad hoc or even chaotic basis
• Data is considered as general purpose and is not viewed by either business or
executive management to be a problem or a priority
• Data is accessible but not always available and is not secure or auditable
• No data management group and no one owns the responsibility for ensuring the
quality, accuracy or integrity of the data
• Data management (to the degree that it is done at all) is reliant on the efforts and
competence of individuals
• Data proliferates without control and the quality is inconsistent across the various
business and applications silos
• Data exists in unconnected databases and spreadsheets using multiple formats
and inconsistent definitions
• Little data profiling or analysis and data is not considered or understood as a
component of linked processes
• No formal data quality processes and the processes that do exist are not
repeatable because they are neither well defined nor well documented



Maturity Level 2 - Repeatable and Reactive

• Fundamental data management practices are established, defined, documented
and can be repeated
• Data policies for creation and change management exist, but still rely on
individuals and are not institutionalised throughout the organisation
• Data as a valuable asset is a concept understood by some, but senior management
support is lacking and there is little organisational buy-in to the importance of an
enterprise-wide approach to managing data
• Data is stored locally and data quality is reactive to circumstances
• Requirements are known and managed at the business unit and application level
• Procurement is ad hoc based on individual needs and data duplication is mostly
invisible
• Data quality varies among business units and data failures occur on a cross-
functional basis.
• Most data is integrated point-to-point and not across business units



Maturity Level 3 - Defined and Standardised

• Business analysts begin to control the data management process with IT playing a
supporting role
• Data is recognised as a business enabler and moves from an undervalued commodity to an
enterprise asset but there are still limited controls in place
• Executive management appreciates and understands the role of data governance and
commits resources to its management
• Data administrative function exists as a complement to the database administration
function and data is present for both business and IT related development discussions
• Some core data has a defined policy that is documented as part of the applications
development lifecycle; the policies are enforced to a limited extent and testing is
performed to ensure that data quality requirements are being achieved
• Data quality is not fully defined and there are multiple views of what quality means
• Metadata repository exists and a data group maintains corporate data definitions and
business rules
• A centralised platform for managing data is available at the group level and feeds analytical
data marts
• Data is available to business users and can be audited



Maturity Level 4 - Managed and Predictable

• Data is treated as a critical corporate asset and viewed as equivalent to other enterprise wide assets
• Unified data governance strategy exists throughout the enterprise with executive level and CEO
support
• Data management objectives are reviewed by senior management
• Business process interaction is completely documented and planning is centralised
• Data quality control, integration and synchronisation are integral parts of all business processes
• Content is monitored and corrected in real time to manage the reliability of the data manufacturing
process and is based on the needs of customers, end users and the organisation as a whole
• Data quality is understood in statistical terms and managed throughout the transaction lifecycle
• Root cause analysis is well established and proactive steps are taken to prevent and not just correct
data inconsistencies
• A centralised metadata repository exists and all changes are synchronised
• Data consistency is expected and achieved
• Data platform is managed at the enterprise level and feeds all reference data repositories
• Advanced platform tools are used to manage the metadata repository and all data transformation
processes
• Data quality and integration tools are standardised across the enterprise



Maturity Level 5 - Optimising and Innovating

• The organisation is in continuous improvement mode
• Process enhancements are managed through monitoring feedback and a
quantitative understanding of the causes of data inconsistencies
• Enterprise-wide business intelligence is possible
• Organisation is agile enough to respond to changing circumstances and evolving
business objectives
• Data is considered as the key resource for process improvement
• Data requirements for all projects are defined and agreed prior to initiation
• Development stresses the re-use of data and is synchronised with the
procurement process
• Process of data management is continuously being improved
• Data quality (both monitoring and correction) is fully automated and adaptive
• Uncontrolled data duplication is eliminated and controlled duplication must be
justified
• Governance is data driven and the organisation adopts a “test and learn”
philosophy



Data Management Maturity Evaluation - Key Capabilities and Maturity Levels

Grid columns (maturity levels): Level 1, Level 2, Level 3, Level 4, Level 5

Grid rows (key capabilities):

• Data Governance
• Data Architecture Management
• Data Development
• Data Operations Management
• Data Security Management
• Reference and Master Data Management
• Data Warehousing and Business Intelligence Management
• Document and Content Management
• Metadata Management
• Data Quality Management

Each cell of the grid: < Description of capability associated with maturity level >
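
The evaluation grid can be recorded as a simple mapping from each key capability to an assessed level from 1 to 5. The following is a minimal sketch only: the function name `summarise`, the summary fields and the idea of reporting the lowest-scoring capabilities are illustrative assumptions and are not part of the DMBOK or of this presentation.

```python
# Illustrative sketch: record one assessed maturity level (1-5) per key capability.
# The summary rules below are assumptions, not a prescribed scoring method.

CAPABILITIES = [
    "Data Governance",
    "Data Architecture Management",
    "Data Development",
    "Data Operations Management",
    "Data Security Management",
    "Reference and Master Data Management",
    "Data Warehousing and Business Intelligence Management",
    "Document and Content Management",
    "Metadata Management",
    "Data Quality Management",
]

MIN_LEVEL, MAX_LEVEL = 1, 5


def summarise(assessment: dict) -> dict:
    """Validate an assessment and return a small summary of it."""
    missing = [c for c in CAPABILITIES if c not in assessment]
    if missing:
        raise ValueError(f"No level recorded for: {missing}")
    unknown = [c for c in assessment if c not in CAPABILITIES]
    if unknown:
        raise ValueError(f"Unknown capabilities: {unknown}")
    for capability, level in assessment.items():
        if not isinstance(level, int) or not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(f"Level for {capability!r} must be an integer from 1 to 5")

    lowest = min(assessment.values())
    return {
        # Assumption: the weakest capabilities are the ones to address first.
        "lowest_level": lowest,
        "weakest_capabilities": [c for c in CAPABILITIES if assessment[c] == lowest],
        "mean_level": sum(assessment.values()) / len(assessment),
    }


# Hypothetical example: every capability assessed at Level 3 except metadata.
example = {capability: 3 for capability in CAPABILITIES}
example["Metadata Management"] = 2
summary = summarise(example)
```

In this hypothetical example the summary highlights Metadata Management as the only capability at Level 2, which is the kind of gap the grid is intended to make visible.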



More Information

Alan McSweeney
alan@alanmcsweeney.com
