Data Quality
20 November 2008
Alex Bruschke – Sales Consultant
Informatica Benelux
Technology Framework
Common Project Lifecycle: Do More With Less
PowerExchange, PowerCenter, Identity Resolution
Data Quality Products
Analyze Phase
Informatica Data Explorer
High level features
• Import Metadata
• Column Profiling (Column Analysis)
• Dependency Profiling (Single Table Analysis)
• Redundancy Profiling (Cross Table Analysis)
• Orphan Analysis/Key Validation
• Data Design
• Tags & Specifications
• Open Repository & Repository Reports
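To make two of the profiling features listed above concrete, here is a minimal Python sketch of column profiling and orphan analysis. It is an illustration only and does not use or represent the Informatica Data Explorer API; the function names `profile_column` and `find_orphans` and the sample values are hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Return basic column statistics: row count, nulls, distinct values, top pattern."""
    non_null = [v for v in values if v not in (None, "")]
    # Reduce each value to a character pattern: letters -> A, digits -> 9.
    patterns = Counter(
        "".join("A" if c.isalpha() else "9" if c.isdigit() else c for c in str(v))
        for v in non_null
    )
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "top_pattern": patterns.most_common(1)[0][0] if patterns else None,
    }


def find_orphans(child_keys, parent_keys):
    """Return child foreign-key values that have no matching parent key."""
    parents = set(parent_keys)
    return sorted({k for k in child_keys if k is not None and k not in parents})


# Hypothetical sample data: a postcode column and a child/parent key pair.
postcodes = ["1234AB", "5678CD", "", "1234AB", "12345"]
print(profile_column(postcodes))
print(find_orphans([1, 2, 7, 7, None], [1, 2, 3]))
```

The pattern count is what makes format outliers visible: a value whose pattern differs from the dominant one is a candidate conformity issue.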
Data Explorer
Six Types of Data Quality Dimensions
Customer Master Data - Examples
Business Rules Example
Quality criteria for the analysis of basic customer (relation) data
Data quality dimensions called out on the slide: Conformity, Consistency, Accuracy, Completeness
No. | Field (Dutch) | Field (English) | Quality criteria
1 | Naam-man | Name-male | Prefix always after the surname; no names with a dash (-)
2 | Naam-vrouw | Name-female | Prefix always after the surname; no names with a dash (-); when Geslacht (gender) = M, then empty
3 | Voornaam | First name | Not empty, at least one alphabetic character
4 | Overige voorletters | Remaining initials | Letters separated by a dot
5 | Straat + nummer | Street and number | Valid Dutch street name, conforming to the spelling in the Cendris postal-code table
… | … | … | …
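As an illustration of how business rules like the ones above can be turned into automated checks, here is a minimal Python sketch. It is not Informatica Data Quality syntax. The function name `check_record`, the record layout (a dict keyed by the Dutch field names), and the assumed initials format (each letter followed by a dot, e.g. "P.H.") are assumptions. The prefix-position rule and the street-name rule are left out because they need reference data (a prefix list and the Cendris postal-code table) that is not available here.

```python
import re


def check_record(rec):
    """Return a list of rule violations for one customer record (dict of strings)."""
    errors = []
    # Rules 1 and 2: no names with a dash.
    for field in ("Naam-man", "Naam-vrouw"):
        if "-" in rec.get(field, ""):
            errors.append(f"{field}: name contains a dash")
    # Rule 2: Naam-vrouw must be empty when gender is M.
    if rec.get("Geslacht", "") == "M" and rec.get("Naam-vrouw", ""):
        errors.append("Naam-vrouw: must be empty when Geslacht is M")
    # Rule 3: first name needs at least one alphabetic character.
    if not re.search(r"[^\W\d_]", rec.get("Voornaam", "")):
        errors.append("Voornaam: needs at least one alphabetic character")
    # Rule 4 (assumed format): every initial is a letter followed by a dot.
    initials = rec.get("Overige voorletters", "")
    if initials and not re.fullmatch(r"(?:[^\W\d_]\.)+", initials):
        errors.append("Overige voorletters: letters must be separated by dots")
    return errors


# Hypothetical records for illustration.
good = {"Naam-man": "Jansen", "Naam-vrouw": "", "Geslacht": "M",
        "Voornaam": "Jan", "Overige voorletters": "P.H."}
bad = {"Naam-man": "Smit-Jansen", "Naam-vrouw": "Bakker", "Geslacht": "M",
       "Voornaam": "1", "Overige voorletters": "PH"}
print(check_record(good))
print(check_record(bad))
```

Returning a list of violations per record, rather than a single pass/fail flag, keeps each failure traceable to the rule that caused it.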
Business and IT Collaboration
Informatica Supports the Entire Data Quality Lifecycle
Roles: SME (subject matter expert) and Developer
1. Profile the Data (SME)
2. Establish Metrics and Define Targets (SME)
3. Design and Implement Data Quality Rules (Developer)
4. Integrate DQ Rules into DI Processes (Developer)
5. Review Exceptions and Refine Rules (SME)
6. Monitor Data Quality Versus Targets (SME)
Step details:
• Step 1 involves profiling the data sources: a process which identifies, categorises and quantifies data quality issues within all sources, and which is used to discover undocumented rules.
• Step 2 establishes metrics and defines targets; these are the targets that data quality is monitored against in step 6 (data quality scorecard).
• Step 3 involves designing the cleansing, standardisation, matching and consolidation rules.
• Step 4 is deployment. Typically there is the option to deploy standalone, i.e. independent of a data integration platform, or as part of a data integration process, i.e. with PowerCenter.
• Step 5 is an exception review process for all records which have failed automated processing.
• Step 6 reports on the results, i.e. web-based reporting to distribute the results in the form of customizable web reports including drilldown and alerts.
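The relationship between metrics, targets, monitoring and exception review in this lifecycle can be sketched in a few lines of Python. This is a simplified illustration, not how Informatica Data Quality implements scorecards. The function names `scorecard` and `exceptions`, the rule names, the sample records and the target values are all hypothetical.

```python
def scorecard(records, rules, targets):
    """Compute the pass rate per rule and compare it with its target."""
    report = {}
    for name, rule in rules.items():
        passed = sum(1 for r in records if rule(r))
        rate = passed / len(records) if records else 0.0
        report[name] = {"rate": rate, "target": targets[name], "met": rate >= targets[name]}
    return report


def exceptions(records, rules):
    """Collect records failing at least one rule, for review by the SME."""
    return [r for r in records if not all(rule(r) for rule in rules.values())]


# Hypothetical data: completeness rules and targets for two fields.
records = [
    {"Voornaam": "Jan", "postcode": "1234AB"},
    {"Voornaam": "", "postcode": "5678CD"},
    {"Voornaam": "Piet", "postcode": ""},
    {"Voornaam": "Eva", "postcode": "9012EF"},
]
rules = {
    "first_name_complete": lambda r: bool(r["Voornaam"]),
    "postcode_complete": lambda r: bool(r["postcode"]),
}
targets = {"first_name_complete": 0.75, "postcode_complete": 0.9}

for name, row in scorecard(records, rules, targets).items():
    print(name, row)
print(len(exceptions(records, rules)), "records need review")
```

Keeping the rules as named functions means the same definitions feed both the scorecard (step 6) and the exception list (step 5), so business and IT work from one set of rule definitions.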
Data Quality
Key Strengths of Informatica Data Quality
• Business-Focused Data Quality
• DQ audits, building the business case, collaboration between Business and IT, common data quality definitions, business rules, metrics monitoring and ease of use drive rapid implementation