
Matthew Lawler lawlermj1@gmail.com Datawarehouse Dictionary

Datawarehouse
Dictionary

Matthew Lawler lawlermj1@gmail.com

D:\D\Documents\DW Me\0 Publish\DW Dictionary.docx February 13, 2018 1 of 13



INTRODUCTION

WHAT IS THE PROBLEM?

WHAT IS THE SOLUTION?

DICTIONARY DESIGN


Introduction

Licence
As these are generic software documentation standards, they will be covered by the 'Creative
Commons Zero v1.0 Universal' CC0 licence.

Warranty
The author does not make any warranty, express or implied, that any statements in this document
are free of error, or are consistent with any particular standard of merchantability, or that they will
meet the requirements for any particular application or environment. They should not be relied on for solving
a problem whose incorrect solution could result in injury or loss of property. If you do use this
material in such a manner, it is at your own risk. The author disclaims all liability for direct or
consequential damage resulting from its use.

Purpose
This document describes the design for a Metadata Registry.

Audience
The Metadata Registry (MDR) should be of use to all staff. This document can be read by all to
understand the why and how of the MDR.

Assumptions
It is assumed that this will be a useful firm wide resource. No business or technical knowledge is
required.

Approach
This document defines a simple implementation of the ISO/IEC 11179 standard for use on the firm’s
Intranet.

ISO/IEC 11179 (Metadata Registry) is the international standard for representing an organization's
metadata. In effect, it is a dictionary standard. Most large companies consist of a series of distinct
professional areas, each with its own distinct, and sometimes overlapping, terms. A common Metadata
Registry can disambiguate the conflicting terms, and remove an important source of confusion.

Related Documents
Document Title: ISO/IEC 11179, Information Technology -- Metadata registries (MDR)
Document Owner: ISO - International Organization for Standardization
Department: ISO/IEC JTC 001 "Information technology"
File Name and Location: http://metadata-standards.org/

Definitions


Term                  Definition

Authority             An organization or person responsible for maintaining a set of
                      data elements, e.g. APRA, ASIC, ATO, Visa.

Conceptual Domain     A set of unique terms that are used by a group of people to
                      represent concepts that the group needs for communication. For
                      example, Treasury, GL, IT, HR, etc. This is the same as a
                      Namespace.

Data Element          A Data Element is considered to be a basic unit of data of
                      interest to an organization. It is a unit of data for which the
                      definition, identification, representation, and permissible
                      values are specified by means of a set of attributes. For
                      example, Customer occurs in many systems.

Data Element Concept  A Data Element Concept is a concept that can be represented in
                      the form of a data element, described independently of any
                      particular representation. For example, Account (GL) is
                      different to Account (Cards).

Metadata              Under ISO/IEC 11179, Metadata is defined to be data that defines
                      and describes other data. This means that metadata are data, and
                      data become metadata when they are used in this way. This
                      happens under particular circumstances, for particular purposes,
                      and with certain perspectives, as no data are always metadata.
                      The set of circumstances, purposes, or perspectives for which
                      some data are used as metadata is called the context. So,
                      metadata are data about data in some context.

Metadata Registry     An information system for registering metadata. This could also
                      be called a Metadata Glossary or Metadata Dictionary.

Value Domain          A set of Permissible Values. For example, the set of all
                      6-character alphanumeric fields, or the subset of 6-digit
                      numeric values used for BSB.

Value Enumerated      This is a finite allowed inventory of notions that can be
                      enumerated. For example, valid BSB numbers are …
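
The difference between a Value Domain defined by a rule and an enumerated one can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the enumerated values are made-up placeholders, not real BSB numbers or anything prescribed by ISO/IEC 11179.

```python
import re

# Value Domain described by a rule: any 6-character alphanumeric string.
ALPHANUMERIC_6 = re.compile(r"[A-Za-z0-9]{6}")

# Narrower Value Domain: exactly 6 digits, the form used for a BSB.
NUMERIC_6 = re.compile(r"[0-9]{6}")

# Enumerated Value Domain: a finite, explicit list of permitted values.
# These values are made-up placeholders, not real BSB numbers.
SAMPLE_ENUMERATED_BSBS = frozenset({"000001", "000002"})


def in_alphanumeric_domain(value: str) -> bool:
    """True if the value is exactly 6 alphanumeric characters."""
    return ALPHANUMERIC_6.fullmatch(value) is not None


def in_numeric_domain(value: str) -> bool:
    """True if the value is exactly 6 digits."""
    return NUMERIC_6.fullmatch(value) is not None


def in_enumerated_domain(value: str) -> bool:
    """True if the value is one of the explicitly listed values."""
    return value in SAMPLE_ENUMERATED_BSBS
```

Every value in the numeric domain is also in the alphanumeric domain, but a well-formed 6-digit value is only in the enumerated domain if it has been listed.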

Tags
Business Intelligence ; Data Governance ; Data Mapping ; ISO 11179 ; Metadata ; Metadata
Dictionary ; Metadata Glossary ; Metadata Registry ; Namespace ; Standards ; Data Architect ;
Data Architecture ;


What is the problem?

How does confusion arise?


Confusion can arise for many reasons:

• New staff

• Staff who move to new functional areas

• Staff who need to work in cross-functional areas

• Standard terms that are used in a unique way within the organisation or function

• New terms introduced by external parties, such as regulators or IT package providers

The number of terms can easily expand to the thousands. The key need is to be able to collect all
terms in one place. Homonyms and synonyms can then be distinguished, and separate definitions
provided. Note that this is not an attempt to impose standard terms on all staff. Instead, the aim is to
recognise that each area has its own jargon, and to help staff negotiate the quite valid
differences.

Tower of Babel


"That is why it was called Babel because there the Lord confused the language of the whole world."
Genesis 11:9

Current Situation
There are a number of HTML pages, PDFs, etc. that are used to define terms throughout the organisation.

These are inadequate because they:

1. Do not distinguish the domain over which the definition applies (e.g. Account has a different
meaning in Cards and GL).

2. Do not distinguish the authority that created the definition (e.g. Product is defined
differently across the organisation, creating reporting confusion).

3. Do not support the resolution of definitional issues such as homonyms, synonyms, etc.

4. Do not cover the whole company, or all required definitions.

5. Force staff to spend time collecting and resolving definitions on an ad hoc basis.

6. Do not allow updates from staff except through a manual process.

7. Cannot support more rigorous semantic data mappings.

The result is an incomplete and potentially contradictory set of definitions that is only available to
some staff.


What is the Solution?


Overview
Establish a partial implementation of ISO 11179. It consists of a number of HTML pages that support
the data below. This should provide visibility on terms used, and help to identify terms that need
clarification.

ISO High Level Meta Model


This is not a full attempt to define the standard. The following data model shows the 4 key entities
required to support the MDR. This was extracted from the specification document. While this is not
a large number of entities, it does capture the essential idea. This would be one of the tools needed
for Data Governance.

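Associations in the high-level metamodel (role label and cardinality shown at each end):

data_element_concept_conceptual_domain_relationship: Data_Element_Concept (having, 0..*) to Conceptual_Domain (specifying, 1..1)
data_element_concept_expression: Data_Element_Concept (expressed_by, 1..1) to Data_Element (expressing, 0..*)
conceptual_domain_representation: Conceptual_Domain (represented_by, 1..1) to Value_Domain (representing, 0..*)
data_element_representation: Data_Element (represented_by, 0..*) to Value_Domain (representing, 1..1)

Figure 2: High-level metamodel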
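
One way to picture the four entities of the high-level metamodel is as plain Python dataclasses. This is a minimal sketch, not the standard's own notation: the field names and the sample objects are assumptions chosen for illustration. Each "1..1" end becomes a required reference, and each "0..*" end is implied by any number of objects pointing at the same parent.

```python
from dataclasses import dataclass


@dataclass
class ConceptualDomain:
    name: str


@dataclass
class ValueDomain:
    name: str
    conceptual_domain: ConceptualDomain  # exactly one (1..1)


@dataclass
class DataElementConcept:
    name: str
    conceptual_domain: ConceptualDomain  # exactly one (1..1)


@dataclass
class DataElement:
    name: str
    concept: DataElementConcept  # exactly one (1..1)
    value_domain: ValueDomain    # exactly one (1..1)


# Hypothetical sample objects, for illustration only.
cards = ConceptualDomain("Cards")
account_concept = DataElementConcept("Account", cards)
account_number_format = ValueDomain("Card account number format", cards)
account_number = DataElement("Account Number", account_concept, account_number_format)
```

A Data Element therefore reaches its Conceptual Domain by two paths, through its concept and through its value domain, and in a consistent registry both paths lead to the same domain.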

Example Data
The following shows sample data that applies to an organisation.


Metadata Registry Conceptual Domains (Namespace) Sample


Short       Full name

11179       ISO 11179
AS3806      Australian Standard AS3806-2006
Australia   Australia
Data        Data
Domain      Domain
Finance     Finance
HR          Human Resources
IT          Information Technology
Legal       Legal
Location    Location
Measure     Measure
Operations  Operations
OU          Organisation Unit
Product     Product
Project     Project
Risk        Risk
Sales       Sales
Strategy    Strategy
System      System
Time        Time
Tran        Transaction
Treasury    Treasury
UML         Unified Modelling Language
XO          External Organisation


Metadata Registry Authorities Sample


Abbreviation  Full Name                                               Web

APCA          Australian Payments Clearing Association
APRA          Australian Prudential Regulation Authority              www.apra.gov.au
AS            Australian Standard
ASIC          Australian Securities and Investments Commission        www.asic.gov.au
ASX           Australian Securities Exchange
ATO           Australian Taxation Office
BO            Business Objects
BPMI          Business Process Management Initiative
DM            Data Modelling
GA            General Abbreviation
GAAP          Generally Accepted Accounting Principles
GIT           General Information Term
Govt          Australian Government
IASB          International Accounting Standards Board
IBM           International Business Machines
IETF          Internet Engineering Task Force
ISACA         Information Systems Audit and Control Association
ISO           International Organization for Standardization
ITU           International Telecommunication Union
NYSSCPA       New York State Society of Certified Public Accountants
OMG           Object Management Group
Org           Organisation Name
RBA           Reserve Bank of Australia
Visa          Visa International


Metadata Registry Terms Sample


Term                Domain    Authority  Description

Acquirer            Cards     Visa       Acquirer
ADI                 XO        APRA       Authorised Deposit-taking Institution
APCA                XO        APCA       Australian Payments Clearing Association
APRA                XO        APRA       Australian Prudential Regulation Authority
ASIC                XO        ASIC       Australian Securities and Investments Commission
B&F                 IT        GA         Budgeting and Forecasting tool
B2B                 IT        GA         Business to business
B2C                 IT        GA         Business to consumer/customer
BAU                 IT        GA         Business as usual
BC                  IT        GA         Business Case
BCM                 IT        GA         Business Continuity Management
BCP                 IT        GA         Business Continuity Planning
BECS                System    APCA       Bulk Electronic Clearing System
BI                  IT        GA         Business Intelligence
BO                  Product   BO         Business Objects
BPAY                XO        Org        Bill Pay
BTS                 OU                   Business Technology Services
CAP                 IT                   Corporate Application Policy
Capital             Finance   GA         Generally, the money used to run a business. However, the
                                         organisation's capital base is much larger in order to
                                         cover regulatory requirements against losses as well as
                                         encompass the amount necessary for operation.


Capital raising     Finance   GA         Seeking money from current shareholders and other
                                         potential investors so the organisation can fund growth
                                         plans.
CFD                 Treasury  ASX        Contracts for difference
COB                 Treasury  ASX        Close of business
Compliance          AS3806    AS         Adhering to requirements of laws, industry and
                                         organisational standards and codes, principles of good
                                         government and accepted community and ethical standards.
Compliance culture  AS3806    AS         The values, ethics and beliefs that exist throughout an
                                         organisation and interact with the organisation's
                                         structures and control systems to produce behavioural
                                         norms that are conducive to compliance outcomes.


Dictionary Design

What are the definitions for Term, Domain and Authority?


A Term is a word or phrase of interest to an organisation. A Term is like a word in a dictionary.

A Domain or Namespace is a set of unique terms that are used by a group of people to represent
common concepts. For example, the set of all terms to do with Cards. Other relevant domains
include Treasury, Finance, IT, etc. For example, Account normally means a member's credit card
account, but it can also mean a GL Account. Both meanings are valid within their domain or context.
Every term should have a unique definition within its domain, but the same term can be reused in
another domain.
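
The namespace idea can be sketched as a lookup keyed on the pair (term, domain). This is a minimal illustration, assuming an in-memory Python dictionary; the function names are hypothetical and the definition texts are placeholders rather than agreed definitions.

```python
from typing import Dict, List, Tuple

# Key: (term, domain). The same term may appear in several domains.
# The definition texts are placeholders for illustration only.
definitions: Dict[Tuple[str, str], str] = {
    ("Account", "Cards"): "A member's credit card account.",
    ("Account", "GL"): "A General Ledger account.",
}


def define(term: str, domain: str) -> str:
    """Return the definition of a term within one domain."""
    return definitions[(term, domain)]


def domains_using(term: str) -> List[str]:
    """List every domain in which a term is defined, in sorted order."""
    return sorted(d for (t, d) in definitions if t == term)
```

Calling `domains_using("Account")` shows at a glance that the term is a homonym across Cards and GL, while `define` always answers within a single domain.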

An Authority is an organization or person responsible for maintaining a set of dictionary terms. For
example, the organisation itself or APRA. In some instances, the actual source may not yet be
determined, in which case it is called a General term.

Who can change these terms?


All staff can add or edit these definitions. This is similar to the bottom-up editing approach used by
Wikipedia. To paraphrase Linus's Law, "Given a large enough group, almost every term will be
obvious to someone." So, the best way to ensure quality is to have all staff use and review these
definitions.

How can terms and definitions be added or changed?


When defining any Term, the Authority and Domain must always also be added. Consequently, the
combination of Term, Domain and Authority should only occur once.
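
The uniqueness rule can be enforced at the point of entry. The sketch below is one possible approach, assuming a simple in-memory store; `Entry`, `Registry` and `DuplicateEntryError` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Entry:
    term: str
    domain: str
    authority: str
    definition: str


class DuplicateEntryError(ValueError):
    """Raised when a (term, domain, authority) combination already exists."""


class Registry:
    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str, str], Entry] = {}

    def add(self, entry: Entry) -> None:
        # Domain and Authority must always accompany the Term.
        if not (entry.term and entry.domain and entry.authority):
            raise ValueError("Term, Domain and Authority are all required")
        key = (entry.term, entry.domain, entry.authority)
        # The combination of Term, Domain and Authority may occur only once.
        if key in self._entries:
            raise DuplicateEntryError(f"{key} is already defined")
        self._entries[key] = entry

    def __len__(self) -> int:
        return len(self._entries)
```

The same term can still be added again under a different Domain or a different Authority, since only the full three-part combination is treated as the key.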

1 Is the Term used by staff within any domain at the organisation?

If not, then the term need not be defined.

If yes, then continue.

2 Does the Term belong to a currently defined Domain?

If yes, then choose from the current list.

If not, then ask the moderator to create a new domain.

3 Is there an external Authority that defines the Term?

If yes, then choose from the current list.

If not, then determine the new Authority, and ask the moderator to add it, or to add a new
internal Subject Matter Expert (SME) as the Authority.

4 Add the new term, domain, authority and definition.

5 Is the external definition correct for the organisation’s use?

If yes, then do nothing.


If not, then add the new term, domain, internal authority and definition. Note that the
external definition should be retained. This will help to clarify the reason for the different definition,
and help people who know the external definition. Clearly, if the Authority is external then there
should not be much change in the term. Locally defined Terms may be subject to more change. See
content policy below for guidelines.

6 Are there any duplicate entries, that is, the same Term with an identical Domain and Authority?

If no, then do nothing.

If yes, then determine how the duplicate arose. If needed, the moderator will resolve the
duplication.
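
Steps 5 and 6 can be illustrated with a small check over collected rows. This is a sketch only: the row layout, the function name `find_duplicates` and the internal authority code "SME" are assumptions, and the internal definition text is a placeholder.

```python
from collections import Counter
from typing import List, Tuple

Row = Tuple[str, str, str, str]  # (term, domain, authority, definition)


def find_duplicates(rows: List[Row]) -> List[Tuple[str, str, str]]:
    """Return each (term, domain, authority) key that occurs more than once."""
    counts = Counter((term, domain, authority) for term, domain, authority, _ in rows)
    return sorted(key for key, n in counts.items() if n > 1)


rows = [
    # External definition, retained as step 5 requires.
    ("Capital", "Finance", "GA", "Generally, the money used to run a business."),
    # Internal variant under a hypothetical internal authority "SME".
    ("Capital", "Finance", "SME", "Placeholder internal definition."),
    # Accidental re-entry of the first row: this is what step 6 looks for.
    ("Capital", "Finance", "GA", "Money used to run a business."),
]
```

The external and internal definitions of Capital coexist because their Authorities differ; only the repeated GA row would be reported to the moderator.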

What is the dictionary content policy?


Verifiability: Always provide a reliable source as an authority. For internal definitions, this would be
a Subject Matter Expert. Samuel Johnson: "Knowledge is of two kinds. We know a subject
ourselves, or we know where we can find information on it."

No original research: If it is an internal definition, always provide a definition that has been
previously agreed by two or more staff members. Do not use this dictionary as a forum to discuss a
new definition. Use e-mail or a meeting instead.

Neutral point of view: Always present all possible definitions from a neutral point of view, without
bias.

Simplicity: Einstein: "Everything should be made as simple as possible, but no simpler." Be careful
not to oversimplify.

What is the background to this dictionary?


On a technical note, the structure of this data dictionary is a partial implementation of the ISO/IEC
11179 Metadata Registry (MDR) standard. See the following website for more details:
metadata-standards.org.

Next Steps
Staff would provide definitions. Initially, these will be collected from the intranet, documents,
contracts, etc. They can then be consolidated and published. Staff should be able to update
them directly via the intranet.

The domains and authorities pages would be fairly stable, so these could be set up as static HTML.
The terms pages would be more dynamic, and staff across the organisation need to be able to update them.
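
As a rough sketch of how a stable page such as the domains list might be produced as static HTML, the Python below renders a list of (short, full name) pairs into a single page. The function name, page title and table layout are assumptions for illustration, not a description of any existing intranet page.

```python
from html import escape
from typing import List, Tuple


def render_domains_page(domains: List[Tuple[str, str]]) -> str:
    """Render (short, full name) pairs as one static HTML page."""
    # Escape each value so characters such as "&" display correctly.
    rows = "\n".join(
        f"    <tr><td>{escape(short)}</td><td>{escape(full)}</td></tr>"
        for short, full in domains
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head><title>Conceptual Domains</title></head>\n<body>\n"
        "  <table>\n"
        "    <tr><th>Short</th><th>Full name</th></tr>\n"
        f"{rows}\n"
        "  </table>\n</body>\n</html>\n"
    )


page = render_domains_page([("HR", "Human Resources"), ("IT", "Information Technology")])
```

The resulting string can be written to a file and regenerated whenever the moderator adds a domain, which suits pages that change rarely.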
