
1

Data Quality

20 November 2008
Alex Bruschke – Sales Consultant
Informatica Benelux

2
Technology Framework
Common Project Lifecycle: Do More With Less

Project types: Data Warehouse, Data Migration, Data Consolidation, Master Data Management, Data Synchronization, B2B Data Exchange

Products: PowerExchange, PowerCenter, Identity Resolution, B2B Exchange, Data Explorer, Data Quality

3
Data Quality Products

Identify & Measure: Informatica Data Explorer (IDE)

• Rapid analysis of data in multiple source systems
• Catalog details of each data source in repository
• Tables, columns, domains
• Data structures (inferred & documented)
• Data completeness & redundancy
• High-level DQ status & issues
• Tag data and document instructions for follow-on processes

Design & Implement: Informatica Data Quality (IDQ)

• Define & build DQ rule sets for
  • Analysis
  • Standardization & enrichment
  • Matching & consolidation
  • Monitoring & reporting
• Deploy DQ rule sets and manage over time
• Seamless integration with PowerCenter
• Batch, real time deployment

4
Analyze Phase
Informatica Data Explorer
High level features
• Import Metadata
• Column Profiling (Column Analysis)
• Dependency Profiling (Single Table Analysis)
• Redundancy Profiling (Cross Table Analysis)
• Orphan Analysis/Key Validation
• Data Design
• Tags & Specifications
• Open Repository & Repository Reports
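As a rough illustration of what column profiling reports, the sketch below summarises a single column in plain Python. It is not how Informatica Data Explorer works internally; the function name `profile_column`, the pattern notation (9 for a digit, A for a letter) and the sample postcodes are all assumptions made for this example.

```python
from collections import Counter


def profile_column(values):
    """Summarise one column: row count, nulls, distinct values, top value patterns."""
    # Treat None and blank strings as missing (an assumption of this sketch).
    non_null = [v for v in values if v is not None and str(v).strip() != ""]

    def pattern(value):
        # Map digits to 9 and letters to A so values with the same shape share a pattern.
        return "".join(
            "9" if c.isdigit() else "A" if c.isalpha() else c for c in str(value)
        )

    patterns = Counter(pattern(v) for v in non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "patterns": patterns.most_common(3),
    }


# Hypothetical sample column.
postcodes = ["1234 AB", "5678 CD", None, "1234AB", "", "1234 AB"]
profile = profile_column(postcodes)
print(profile)
```

The pattern counts make format deviations visible at a glance: here one value lacks the space that the other three share.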

5
Data Explorer

• Informatica Data Explorer (IDE)


• Demo

6
Six Types of Data Quality Dimensions

Completeness What data is missing or unusable?

Conformity What data is stored in a non-standard format?

Consistency What data values give conflicting information?

Accuracy What data is incorrect or out of date?

Duplicates What data records or attributes are repeated?

Integrity What data is missing or not referenced?
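Three of these dimensions can be expressed as simple ratios. The sketch below is a minimal illustration in plain Python, not an Informatica API; the sample records, the field names and the assumed standard postcode format (`1234 AB`) are hypothetical.

```python
import re

# Hypothetical customer records.
records = [
    {"id": 1, "name": "Jansen", "postcode": "1234 AB"},
    {"id": 2, "name": "", "postcode": "5678 CD"},
    {"id": 3, "name": "De Vries", "postcode": "9012ef"},
    {"id": 4, "name": "Jansen", "postcode": "1234 AB"},
]

# Assumed standard format: four digits, a space, two capital letters.
POSTCODE = re.compile(r"^\d{4} [A-Z]{2}$")


def completeness(rows, field):
    """Share of rows where the field is filled in."""
    return sum(1 for r in rows if r[field].strip()) / len(rows)


def conformity(rows, field, regex):
    """Share of rows where the field matches the standard format."""
    return sum(1 for r in rows if regex.match(r[field])) / len(rows)


def duplicate_share(rows, fields):
    """Share of rows that repeat an earlier combination of the given fields."""
    keys = [tuple(r[f] for f in fields) for r in rows]
    return 1 - len(set(keys)) / len(keys)


print(completeness(records, "name"))
print(conformity(records, "postcode", POSTCODE))
print(duplicate_share(records, ["name", "postcode"]))
```

Consistency, accuracy and integrity usually need a second field, a reference source or a related table to check against, so they are left out of this sketch.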

7
Customer Master Data - Examples

COMPLETENESS CONFORMITY CONSISTENCY DUPLICATION INTEGRITY ACCURACY

8
Business Rules Example
Quality criteria for the analysis of basic relation data (Kwaliteitscriteria t.b.v. analyse Basis Relatie gegevens)

Field no.  Field description  Quality criteria

1  Naam-man (Name, male)
   - Prefix always placed after the surname
   - No names containing a hyphen (-)
2  Naam-vrouw (Name, female)
   - Prefix always placed after the surname
   - No names containing a hyphen (-)
   - When Geslacht (gender) = M, the field is empty
3  Voornaam (First name)
   - Not empty, at least one alphabetic character
4  Overige voorletters (Remaining initials)
   - Letters separated by dots
5  Straat + nummer (Street and number)
   - Valid Dutch street name, spelled as in the Cendris postal-code table
…  …

Callouts on the slide tag the rules with the dimensions Conformity, Consistency, Accuracy and Completeness.
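Some of the rules above can be written as record-level checks. The sketch below is one possible reading in plain Python, not the rule set built in the demo; the snake_case field names, the reading of "separated by dots" as `A.B.` and the sample records are assumptions. The prefix rule and the street-name rule are omitted because they need a prefix list and the Cendris reference table.

```python
import re

# Assumed reading of rule 4: capital letters, each followed by a dot, e.g. "P.J."
INITIALS = re.compile(r"^(?:[A-Z]\.)+$")


def check_record(rec):
    """Return the list of rule violations for one record (hypothetical field names)."""
    issues = []
    # Rules 1 and 2: no names containing a hyphen.
    for field in ("naam_man", "naam_vrouw"):
        if "-" in rec.get(field, ""):
            issues.append(f"{field}: contains a hyphen")
    # Rule 2: Naam-vrouw is empty when gender is M.
    if rec.get("geslacht") == "M" and rec.get("naam_vrouw", ""):
        issues.append("naam_vrouw: must be empty when geslacht is M")
    # Rule 3: first name holds at least one alphabetic character.
    if not any(c.isalpha() for c in rec.get("voornaam", "")):
        issues.append("voornaam: needs at least one letter")
    # Rule 4: remaining initials, when present, are separated by dots.
    initials = rec.get("overige_voorletters", "")
    if initials and not INITIALS.match(initials):
        issues.append("overige_voorletters: letters must be separated by dots")
    return issues


good = {"naam_man": "Vries de", "naam_vrouw": "", "geslacht": "M",
        "voornaam": "Jan", "overige_voorletters": "P.J."}
bad = {"naam_man": "Jansen-Smit", "naam_vrouw": "Bakker", "geslacht": "M",
       "voornaam": "", "overige_voorletters": "PJ"}

print(check_record(good))
print(check_record(bad))
```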

9
Business and IT Collaboration
Informatica Supports the Entire Data Quality Lifecycle
1. Profile the Data (SME)
   Step 1 involves profiling, which is used to discover and profile undocumented data sources.

2. Establish Metrics and Define Targets (SME)
   Step 2 is a data quality audit which identifies, categorises and quantifies data quality issues within all sources.

3. Design and Implement Data Quality Rules (Developer)
   Step 3 involves designing the data quality rules, i.e. the cleansing, standardisation, matching and consolidation rules.

4. Integrate DQ Rules into DI Processes (Developer)
   Step 4 is deployment. Typically there is the option to deploy standalone, i.e. independent of a data integration platform, or as part of a data integration process, i.e. with PowerCenter.

5. Review Exceptions and Refine Rules (SME)
   Step 5 is an exception review process, where all records which have failed automated processes go to review.

6. Monitor Data Quality Versus Targets (SME)
   Targets can be viewed with the data quality scorecard. Step 6 reports on the results, i.e. web-based reporting to distribute the results in the form of customizable web reports including drilldown and alerts.
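Monitoring data quality versus targets comes down to comparing each measured metric with its target and flagging the shortfalls. The sketch below shows that comparison in plain Python; it is not the Informatica scorecard, and the metric names, target values and status labels are hypothetical.

```python
def scorecard(measured, targets):
    """Compare measured scores (0..1) with targets; return one row per metric."""
    rows = []
    for metric, target in targets.items():
        score = measured.get(metric)
        if score is None:
            status = "no data"   # metric has a target but was not measured
        elif score >= target:
            status = "ok"
        else:
            status = "alert"     # below target: candidate for exception review
        rows.append((metric, score, target, status))
    return rows


# Hypothetical targets set in step 2 and scores measured in step 6.
targets = {"completeness": 0.98, "conformity": 0.95, "duplicates_free": 0.99}
measured = {"completeness": 0.99, "conformity": 0.91}

for row in scorecard(measured, targets):
    print(row)
```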

10
Data Quality

• Informatica Data Quality (IDQ)


• Demo

11
Key Strengths of Informatica Data Quality
• Business-Focused Data Quality
• DQ audits, business-case building, collaboration between Business and IT,
common data quality definitions, business rules, metrics monitoring and
ease of use drive rapid implementation

• All Master Data Objects


• All master data types, structured & unstructured, across all
domains including Customer, Product, Supplier, Materials, Assets,
Contracts, …

• Data Quality Metrics


• Measure, report on and monitor data quality using an extensible set of
data quality dimensions

• Enterprise Data Quality Deployment


• Multiple data quality deployment options: interactive, batch and
real-time; SOA; on-site or hosted On Demand

