
Data validation and Error handling

SAP NetWeaver Regional Implementation Group Business Intelligence SAP AG

Contents

1 Motivation
2 Data validation with BW 3.x
3 Error handling with BW 3.x
4 Data repair

SAP AG 2004, Efficient usage of BW InfoProviders / 2


Why Data Validation?


BW data are expected to be of high quality
  BW data need to be complete
  BW data need to be correct
  BW data need to be up to date

BW requires high data accuracy for effective decision support
  On top management level
  To support operational processes

BW is used as EDW (enterprise data warehouse), i.e. it is the central data store for consolidation and distribution of enterprise-wide data
  BW data often serve as foundation for further processing
  BW data are highly integrated

BW data are queried frequently

ROI of Data Quality - Data Quality as an Investment


You should ask
  What is the risk of incomplete / incorrect data sets?
  What is the cost to fix data once it is contaminated?
  What are the corporate quality standards?
  In what time frame does incorrect data need to be repaired?
  The availability of correct data in BW might be seen as just as critical as the availability of the operational system

However, you should also ask
  What is the reliability of the source data?
  Where is the point of diminishing returns?
  Can different data models / data validation procedures be set up to respond to different needs concerning availability and quality of data?

Sources for Dirty Data


Data are incorrect in source system
Data consolidation causes issues
Technical platforms are different (code pages, etc.)
Administration issues (double loads, ...)
Custom logic (errors in routines, ...)
Technology issues (SW, DB, O/S, HW, ...)

Data Contaminants - 1

Sample records (country, record as delivered):
  CA  red wheel, type "014-2221"
  CA  blue wheel, type "012-3342"
  CA  023-2211 white wheel
  US  012-3344 Cup Holder, green
  US  012-3378 Cup Holder, red
  US  Lighter, black 012-4122
  US  012-552 white cover
  US  012-7662 green Cup Holder
  JP  012-401 Cup Holder, green
  JP  012-4122 phone plug
  JP  012-661 channel
  JP  013-1452 plastic cover, red
  JP  013-1452 (pink version of above)

Contaminants illustrated: free form fields, inconsistent keys, multiple keys, invalid characters, surprises

Data Contaminants - 2

Sample records (customer, date, amount):
  XYZ.com Ltd.  10/10/99    $ 44332
  XYZ.com Ltd.  10/12/99    $ 33222

  XYZ.com Ltd.  10/10/2000  $ 67221
  XYZ.com Ltd.  10/10/2000  $ 67221
  XYZ.com Ltd.  10/10/2000  $ 67221
  XYZ.com Ltd.  10/12/2000  $ 35332
  XYZ.com Ltd.  10/14/2000  $ 31122
  XYZ.com Ltd.  10/17/2000  $ 99999999
  XYZ.com Ltd.  10/19/2000  $ 78882

  ABC Co.       10/14/2000  $ 4333
  LMN Ltd.      10/14/2000  $ 9000
  XYZ.com Ltd.  10/14/2000  $ 31122
  ZZZ Sl.       10/14/2000  $ 122211

Contaminants illustrated: data format, data redundancy, data anomalies

Data Contaminants - 3
Data Contamination during upload via Exits
  Application Exits
  Generic BW Exit RSAP0001
  Transfer- / Update-Routines
  Virtual Exits

Consider the following:
  Timeliness of Data
  Check for Versions
  Check for Return Codes
  Delta Trigger Capabilities
  Performance and General Architecture


Data validation
Data validation answers the questions:

check what?
  Technical quality
  Semantic quality (business rules)
  Completeness

check where?
  Automatically during data load
  Rule driven

check how?
  Built-in
  Routine
  Formula (planned)

Where: Checks in BW 3.x


[Figure: BW 3.x data flow with check locations. Elements shown: Data, Transfer Rules, InfoSource, Update Rules, and the data targets ODS Object, Master Data (Texts, Master data, Hierarchies) and InfoCube.]

What: possible checks


Checks are classified by check type and by degree of detail.
Legend (slide colors): black = built in, grey = not built in

Check type
  technical: Empty field, correct data type, code page, master data check (SID)
  business: Free delivery: Revenue = 0; Supplier <> Receiver

Degree of detail
  Single field
  Single record
  Multi record: Check for double records; Records sent to BW = Records updated; Sum of single revenues < 200
  Multi table: Referential integrity (foreign key check)
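
The first three degrees of detail can be illustrated outside BW with a minimal Python sketch. It is not BW code: the field names (material, revenue, free_delivery, supplier, receiver) and the function names are hypothetical, chosen only to mirror the example checks listed above.

```python
from collections import Counter


def check_single_field(record):
    """Technical checks on single fields: empty field and data type."""
    errors = []
    if not record.get("material"):
        errors.append("material is empty")
    if not isinstance(record.get("revenue"), (int, float)):
        errors.append("revenue is not numeric")
    return errors


def check_single_record(record):
    """Business rules that can be evaluated within one record."""
    errors = []
    if record.get("free_delivery") and record.get("revenue") != 0:
        errors.append("free delivery must have revenue = 0")
    if record.get("supplier") == record.get("receiver"):
        errors.append("supplier must differ from receiver")
    return errors


def check_multi_record(records, key_fields, records_sent):
    """Checks across a data package: double records and record count."""
    errors = []
    keys = Counter(tuple(r[f] for f in key_fields) for r in records)
    for key, count in keys.items():
        if count > 1:
            errors.append(f"double record for key {key}")
    if len(records) != records_sent:
        errors.append("records sent to BW <> records updated")
    return errors
```

The split into three functions follows the degree of detail: the more records or tables a check needs, the later in the load it can run and the more it costs.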

What: Captured errors (1)


In a transfer rule:
  Not allowed characteristic values
  Lower case letters
  Arithmetic and conversion errors
  User built routine with returncode <> 0
  No aggregation

Check for referential integrity on the InfoSource
  Against Master data tables
  Against ODS-Objects

In an update rule:
  Arithmetic or conversion error
  Master data read unsuccessful
  Currency translation or time conversion error
  User built routine with error message
  No aggregation

What: Captured errors (2)


Checks during master data and text update
  Not allowed characteristic values
  No SID for navigational attribute
  No language in text upload
  Double records concerning the key
  Overlapping or invalid time intervals
  Data does not match the scheduler selection
  No aggregation

Checks during Hierarchy Update
  Errors in Hierarchy structure
  Overlapping time intervals
  No aggregation

Errors during InfoCube update
  No SID for characteristic values
  No aggregation

Check for Permitted Characters


Case A: characters not permitted
Case B: characters permitted

Permitted by standard: !"%&'()*+,-/:;<=>?_0123456789 ABCDEFGHIJKLMNOPQRSTUVWXYZ


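
A character check of this kind can be sketched in Python as follows. The character set is copied from the slide. Two things are assumptions made for illustration: the blank is treated as not part of the set, and the additionally_permitted parameter (a hypothetical name) stands in for Case B, where further characters have been permitted.

```python
# Character set quoted on the slide as "permitted by standard".
# Assumption: the blank shown between "9" and "A" is not part of the set.
PERMITTED_BY_STANDARD = set(
    '!"%&\'()*+,-/:;<=>?_0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
)


def invalid_characters(value, additionally_permitted=""):
    """Return the characters of value that are not permitted.

    additionally_permitted is a hypothetical parameter for characters
    that were explicitly permitted on top of the standard set.
    """
    allowed = PERMITTED_BY_STANDARD | set(additionally_permitted)
    return [ch for ch in value if ch not in allowed]
```

An empty result means the value passes the check; a non-empty result lists the offending characters in the order they occur.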

Consistency Check for Characteristic Values


Checking for
  use of character values in fields of data type NUMC
  correct consideration of the conversion routine ALPHA
  use of lower case letters
  use of special characters
  plausibility of date / time fields
Consider performance impacts!
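
Three of these checks can be sketched in Python. This is an illustration under stated assumptions, not the BW implementation: NUMC values are taken to consist of ASCII digits only, dates are assumed to arrive in the internal YYYYMMDD format, and all function names are hypothetical.

```python
from datetime import datetime


def is_valid_numc(value):
    """NUMC fields may contain digits only (assumed: ASCII digits)."""
    # str.isdigit also accepts some non-ASCII digits, so restrict to ASCII.
    return value.isascii() and value.isdigit()


def has_lower_case(value):
    """True if the value contains at least one lower case letter."""
    return any(ch.islower() for ch in value)


def is_plausible_date(value):
    """True if value is an existing calendar date in YYYYMMDD format."""
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    # strptime tolerates one-digit months and days, so check the length too.
    return len(value) == 8
```

Checks like these run once per field and record, which is why the slide asks to consider the performance impact.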

New in BW 3.x: Check of Referential Integrity


ODS-Object defined in InfoObject as check table
Enable check (optional) in the Communication Structure

Example: records in the Communication Structure
  Bus. Partner   Material
  1000           PC 9000
  9000           PC 9000

Look up against InfoObject Bus. Partner: 1000 exists, 9000 is not allowed
Error Handling: Bus. Partner 9000 doesn't meet the referential integrity => the record is marked as erroneous
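
The lookup in this example can be sketched in Python. The record values are those from the slide; the function name and the dictionary layout are assumptions made for illustration only.

```python
def check_referential_integrity(records, field, check_table):
    """Split records into valid and erroneous ones.

    A record is erroneous when its value in `field` does not exist
    in the check table (master data table or ODS object).
    """
    valid, erroneous = [], []
    for record in records:
        if record[field] in check_table:
            valid.append(record)
        else:
            erroneous.append(record)
    return valid, erroneous


# Values from the slide: business partner 1000 exists, 9000 does not.
records = [
    {"bus_partner": "1000", "material": "PC 9000"},
    {"bus_partner": "9000", "material": "PC 9000"},
]
valid, erroneous = check_referential_integrity(records, "bus_partner", {"1000"})
```

Returning both lists instead of raising an error mirrors the slide: the record is marked as erroneous and handed to error handling, while the load of valid records can continue.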

Check of Referential Integrity: Example

ODS-Object defined in InfoObject as check table
Check for existing master data not possible here

[Figure: example data flow from the Transfer Structure via the Communication Structure and COSTC##_FLEX_MD to COSTC## Master Data. The Transfer Structure field 0TXTSH contains 9999-0000000001100; the formula SUBSTRING ( TXTSH , '0' , '4' ) fills 0COMP_CODE in the Communication Structure with 9999.]

ODS: Flexible Master Data Staging

Additional Master Data layer
  Optional
  Cleansing
  Consolidation

Benefit: Flexibility

[Figure: elements shown are Master Data InfoSource, Update Rules, Master Data ODS-Object Business Partners, Master Data Customers and Master Data Vendors.]

Difference to Master Data check in InfoPackage

Referential Integrity
  All data targets
  One check in transfer rules
  Only for selected InfoObjects
  Error handling
  Works for all ODS object types
  Check against MD-table or ODS object is possible

Master Data Check
  All data targets
  Check after update rules for each data target
  All InfoObjects
  BW 3.0: Error handling (except ODS object)
  BW 2.0 SP18: ODS object only if BEx-Reporting is active
  Check only against SID-table

Check: No aggregation allowed

If you select this indicator, the request is regarded as incorrect if the number of records received in BW does not match the number of updated records. That is, the request is regarded as incorrect if records are filtered out, aggregated or created in any of the following:
  transfer rules
  update rules
  update
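
The effect of the indicator can be shown with a small Python sketch. It only models the record-count comparison described above; the function and field names are hypothetical.

```python
def aggregate(records, key_field, value_field):
    """Aggregate records by key, as an update might do."""
    totals = {}
    for record in records:
        key = record[key_field]
        totals[key] = totals.get(key, 0) + record[value_field]
    return [{key_field: k, value_field: v} for k, v in totals.items()]


def request_is_correct(records_received, records_updated, no_aggregation):
    """With the indicator set, both record counts must match."""
    return (not no_aggregation) or records_received == records_updated


received = [
    {"customer": "C1", "revenue": 100},
    {"customer": "C1", "revenue": 50},
    {"customer": "C2", "revenue": 70},
]
updated = aggregate(received, "customer", "revenue")
```

Here three records are received but only two are updated, so the request counts as incorrect when the indicator is set and as correct when it is not.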

Handling of double data records


Handling of double data records is available in the InfoPackage for time-independent master data and text DataSources
R/3 DataSources can deliver a flag indicating whether they transfer double data records

Handling of double data records (checked on the key fields of the characteristics) means that only the last data record is updated to the master data / text table
Checking for double data records is only possible if the update method is "Only PSA" with "Update Subsequently in Data Targets"
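
The rule "only the last data record is updated" can be sketched in Python as follows. The function name and the sample fields are assumptions; the sketch only illustrates the key-based selection, not how BW implements it.

```python
def drop_double_records(records, key_fields):
    """Keep only the last record for each key, in order of last occurrence."""
    last_by_key = {}
    for record in records:
        key = tuple(record[f] for f in key_fields)
        # Remove an earlier record with the same key so that the
        # surviving record takes the position of the last occurrence.
        last_by_key.pop(key, None)
        last_by_key[key] = record
    return list(last_by_key.values())


package = [
    {"customer": "C1", "city": "Berlin"},
    {"customer": "C2", "city": "Paris"},
    {"customer": "C1", "city": "Munich"},
]
deduplicated = drop_double_records(package, ["customer"])
```

Because the last record wins, the order in which records arrive matters, which fits the restriction that the check needs the data to be in PSA first.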

Customer-built checks

Implement checks in customer routines in the update or transfer rules; these checks can call the Error Handling
Early checking in customer routines can avoid time-consuming rollback and recovery of complex load scenarios!

Additional customer check scenarios might be:
  Build business check procedure
  Data Integrity Checks on Data Packages in PSA
  Use custom check points during extraction
  Check on master data completeness
  Build Audit Dimension in Data Modeling

Build own business check


Data loaded to BW data targets is compared with source data using business rules such as: correct subtotals, correct +/- signs, etc.
The check can be performed with a MultiProvider query that compares source data (in PSA or from the source system, possibly using Virtual or Remote cubes) with the data contained in BW data targets
Build an exception on the column Difference <> 0

Sales organization   Source Data   Data in InfoProvider   Difference
1000                 120.000,00    120.000,00                  0,00
2000                 190.000,00    189.600,00              - 400,00

Proactive alerting of the administrator via the Reporting Agent
Embed this check in BW 3.x process chains
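
The comparison behind such a query can be sketched in Python. The totals are the ones from the table above; the function names are hypothetical, and in BW the comparison itself would be done by the query and its exception rather than by code like this.

```python
from decimal import Decimal


def compare_totals(source_totals, infoprovider_totals):
    """Return rows of (key, source, loaded, difference) per key."""
    rows = []
    for key in sorted(set(source_totals) | set(infoprovider_totals)):
        source = source_totals.get(key, Decimal("0"))
        loaded = infoprovider_totals.get(key, Decimal("0"))
        rows.append((key, source, loaded, loaded - source))
    return rows


def exceptions(rows):
    """Rows where the difference is not zero."""
    return [row for row in rows if row[3] != 0]


source = {"1000": Decimal("120000.00"), "2000": Decimal("190000.00")}
loaded = {"1000": Decimal("120000.00"), "2000": Decimal("189600.00")}
deviations = exceptions(compare_totals(source, loaded))
```

Decimal is used instead of float so that amounts compare exactly and a difference of 0,00 is really zero.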

Build own business check


[Figure: data from the Source System is loaded via PSA, Transfer Rules, InfoSource and Update Rules into the InfoCube Sales (Basis Cube) in the Business Information Warehouse ("Check and filter data"). A Remote Cube accesses the source system. A MultiCube on top is used to "Compare valid with loaded data", using the variables 0LSTRQID and 0MAPRQID.]

*see: How to Validate InfoCube Data by Comparing it With PSA Data

Data Integrity Checks on Data Packages in PSA

APIs are available to read PSA contents
  Function RSAR_ODS_MAINTAIN, ...
Check for reference between records
Summary checks, ...

Use Custom Check Points during extraction

Identify check points in source system
Write check point data to custom table
Use generic extractor for load
Populate check cube
Perform compression with zero suppression
Execute exception report

Check on master data completeness


Scenario: master data is loaded from different source systems

Attributes of Material: Material Type, Global Material, Packing Size, QM Status

[Figure: Source System 1, Source System 2 and Source System 3 each load 0Material and deliver different attributes, such as Packing Size, Global Material and Material Type.]

Check on master data completeness


Attributes of Material: Material Type, Global Material, Packing Size, QM Status

[Figure: Source System 1, Source System 2 and Source System 3 load Material. Via the Myself Data Mart (Export DataSource), Material is loaded again with a completeness check, and each record is marked as Complete or Incomplete.]

Check on master data completeness

Attributes of Material: Material Type, Global Material, Packing Size, QM Status, each record marked as Complete or Incomplete

Expert report: all stored information
Consumer report: only complete information
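
The completeness scenario can be sketched in Python. The four attributes are those named on the slides; the attribute keys, function names and sample values are hypothetical, and the sketch assumes that an attribute counts as missing when it is absent or empty.

```python
# The four attributes named on the slides, as hypothetical field names.
REQUIRED_ATTRIBUTES = ("material_type", "global_material", "packing_size", "qm_status")


def merge_sources(*sources):
    """Merge the attributes each source system delivers per material."""
    merged = {}
    for source in sources:
        for material, attributes in source.items():
            filled = {k: v for k, v in attributes.items() if v not in (None, "")}
            merged.setdefault(material, {}).update(filled)
    return merged


def is_complete(attributes):
    """A material is complete when every required attribute is filled."""
    return all(attributes.get(name) not in (None, "") for name in REQUIRED_ATTRIBUTES)


def consumer_view(merged):
    """Only complete materials; the expert view would be `merged` itself."""
    return {m: a for m, a in merged.items() if is_complete(a)}


system_1 = {"M1": {"material_type": "FERT", "packing_size": "10"}, "M2": {"material_type": "ROH"}}
system_2 = {"M1": {"global_material": "G1"}}
system_3 = {"M1": {"qm_status": "OK"}, "M2": {"qm_status": ""}}
merged = merge_sources(system_1, system_2, system_3)
```

Storing the completeness flag with the master data, rather than filtering during the load, keeps incomplete records available for the expert report.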

Build Audit Dimensions in Data Modeling

Audit Dimensions can identify:
  When were the data created?
  Which source did the data come from?
  Which tools were used for extraction?
  Which rules have touched the data?


Data Cleansing
Where:
  In the Source System?
  During Data Extraction?
  In the BW System?
When:
  In the productive phase?
  In the test phase?
  In the blueprint phase?
Who:
  Is it a technical issue?
  Is it a project issue?
  Is it an organizational issue?

Data cleansing occurs at all levels. Avoid the tendency to attempt cleansing only within the BW extraction process. Often data cleansing is best performed at the legacy / source system level.

Data cleansing is one of the greatest risks in data movement efforts. Its design belongs in the blueprint phase. Test data are often cleaner than real data. Data quality and inconsistency issues are often systemic in the organization and must be addressed at a higher level in the organization to get resolved.

Error handling in BW 3.x


Handling of invalid data answers the questions:

What to do in case of error?
  Abort validation or continue loading
  Update or don't update valid data
  Report or don't report valid data

How to correct the invalid data?
  Source System
  PSA
  During upload

How to update the corrected data again?
  Re-load from source system
  Update from PSA to data target

Error handling features

Show error status of records in PSA table
Possibility to choose in the scheduler to ...
  abort the process when errors occur
  process the correct records but do not allow reporting on them
  process the correct records and allow reporting on them
You can also choose the number of errors at which the whole request is regarded as incorrect
Write invalid records to a new request
Update the invalid records after correction
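
The three scheduler options and the error threshold can be modeled in a short Python sketch. It is a simplified illustration, not BW behavior in detail: the enum and function names are hypothetical, and it is an assumption that reaching the threshold has the same effect as choosing to abort.

```python
from enum import Enum


class ErrorHandling(Enum):
    NO_UPDATE_NO_REPORTING = 1
    UPDATE_VALID_NO_REPORTING = 2
    UPDATE_VALID_REPORTING_POSSIBLE = 3


def process_request(records, is_valid, mode, max_errors):
    """Return which records are updated and whether reporting is allowed.

    Assumption: once the number of invalid records reaches max_errors,
    the whole request is treated as incorrect and nothing is updated.
    """
    valid = [r for r in records if is_valid(r)]
    invalid = [r for r in records if not is_valid(r)]
    if not invalid:
        return {"updated": valid, "invalid": [], "reporting": True}
    if mode is ErrorHandling.NO_UPDATE_NO_REPORTING or len(invalid) >= max_errors:
        return {"updated": [], "invalid": invalid, "reporting": False}
    reporting = mode is ErrorHandling.UPDATE_VALID_REPORTING_POSSIBLE
    return {"updated": valid, "invalid": invalid, "reporting": reporting}
```

In all cases the invalid records are returned separately, which corresponds to writing them to a new request for later correction.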

Error handling
Restrictions on Error handling capabilities in InfoPackages depend on
  Connected data targets (is data updated to an ODS object?) and
  Update mode (is the data load a delta update?) and
  Serialization (is serialization required?) or
  Transfer method (is transfer method IDoc used?)

Error handling: Overview


Consider automation using Process Chains!

Error Handling:
  1 - No Update, No Reporting
  2 - Valid Records Update, No Reporting
  3 - Valid Records Update, Reporting Possible

Correction of invalid data:
  within source system
  manually in PSA
  by rule

[Figure: within the Business Information Warehouse, the Scheduler triggers the extraction and the data passes through PSA and the Staging Engine. Records that are OK are updated; records in error are kept in PSA for correction.]

Error handling: Features

Features compared per scenario: Monitor entry, Abort of update, Upd. valid records, Application log, Marked in PSA, ErrorRequest, Color of Request

No Error handling
  No PSA available (e.g. transfer via IDoc): request is red
  PSA available: request is red

Error handling
  No update, no reporting: request is red
  Update valid records, no reporting: request is red
  Update valid records, reporting possible: request is green

Error handling in BW (2.0B)


[Figure: error handling in the BW 2.0B data flow. Elements shown: Data, Transfer Rules, InfoSource, IDoc, Update Rules, and the data targets ODS object, Master Data (Texts, Master Data, Hierarchies) and InfoCube.]

Error handling in BW (3.x)


[Figure: error handling in the BW 3.x data flow. Elements shown: Data, Transfer Rules, InfoSource, PSA, IDoc, Update Rules, and the data targets ODS object, Master Data (Texts, Master Data, Hierarchies) and InfoCube.]

Call Error handling from customer routines


Customer routines in the update rules or transfer rules can mark the record and call the Error Handling
  From update rules: append to table MONITOR
  From transfer rules: append to table G_T_ERRORLOG
  Process a single record using field RECORD

If no Error Handling is needed, records or even the whole data package can be skipped:
  RETURNCODE <> 0 means skipping the record
  ABORT <> 0 means skipping the data package
With BW 3.0B the functions SKIP RECORD and ABORT PACKAGE exist in the Transformation library
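
The effect of RETURNCODE and ABORT can be simulated in Python. This is an analogy rather than ABAP routine code: only the names RETURNCODE, ABORT and MONITOR come from the slide, while the driver function, the sample routine and its rules are hypothetical.

```python
def check_amount(record, monitor):
    """Hypothetical routine returning (returncode, abort)."""
    if record.get("currency") is None:
        monitor.append(f"record {record['id']}: currency missing, package skipped")
        return 0, 1  # ABORT <> 0: skip the whole data package
    if record["amount"] < 0:
        monitor.append(f"record {record['id']}: negative amount, record skipped")
        return 4, 0  # RETURNCODE <> 0: skip this record only
    return 0, 0


def run_routine(package, routine):
    """Apply the routine to each record of a data package."""
    monitor, kept = [], []
    for record in package:
        returncode, abort = routine(record, monitor)
        if abort != 0:
            return [], monitor
        if returncode == 0:
            kept.append(record)
    return kept, monitor
```

The monitor list plays the role of the table a routine appends to, so every skipped record or package leaves a message behind.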

Example

[Figure: two variants of a lookup between Transfer structure and Communication structure. Bad: the lookup aborts the load. Good: lookup + check + error handling.]

Automate Error handling using a customer program


Automatic correction of the error request can be done in a customer program
To do so, use method GET_ERRORS of class CL_RSSM_ERROR_HANDLER
Program RS_ERRORLOG_EXAMPLE can be used as a template


BW Data and Metadata Test and Repair Environment


Transaction RSRV
  Transaction RSRV checks the consistency of data stored in BW.
  The transaction interface was redesigned for SAP BW Release 3.0.

Conversion to consistent Internal Values


Converting Inconsistent Internal Characteristic Values
  Find and correct incorrect internal characteristic values.
  A characteristic value is inconsistent when the characteristic has a conversion routine and its values do not correspond to the internal format of that conversion routine.
  The following conversion routines are covered:
    ALPHA
    NUMCV
    GJAHR
  Successful conversion of the system is required before upgrading to Release 3.x
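
For the ALPHA routine, the idea of an inconsistent internal value can be sketched in Python. This is a simplified model under stated assumptions: purely numeric values are right-aligned and filled with leading zeros up to the field length, all other values are left unchanged, and the function names are hypothetical.

```python
def alpha_input(value, length):
    """Simplified model of the internal format of the ALPHA routine."""
    stripped = value.strip()
    if stripped.isascii() and stripped.isdigit():
        # Purely numeric values get leading zeros up to the field length.
        return stripped.zfill(length)
    # Other values are left-aligned and otherwise unchanged.
    return stripped


def is_alpha_consistent(value, length):
    """A stored value is consistent if conversion would not change it."""
    return value == alpha_input(value, length)
```

A value such as 1100 stored in a ten-character field would be reported as inconsistent, because its internal format is 0000001100.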

Conversion to consistent Internal Values


Once the conversion is complete, stricter checks / conversions are activated in the system to prevent new inconsistent values entering the system. You can then run an Optional Conversion to correct internal values (according to conversion exit in the Transfer Rules)


Repair requests
With BW 3.x repair requests can be updated to an ODS object. This means:
  A full update for correction purposes is updated to an ODS object that is usually updated using delta loads
  In the InfoPackage menu the request is then marked as repair request
  Before doing this, incorrect data can be deleted selectively from the ODS object

Possible approach in BW 2.0B / 2.1C: Use a generic DataSource based on the PSA table to correct ODS object data via full uploads
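
The repair sequence, selective deletion followed by a full update, can be sketched in Python. The ODS object is modeled as a dictionary keyed by its semantic key, which is an assumption made for illustration; the function names and sample data are hypothetical.

```python
def delete_selectively(ods, is_incorrect):
    """Remove the rows that match the selection from the ODS object."""
    return {key: row for key, row in ods.items() if not is_incorrect(row)}


def full_repair_update(ods, records, key_field):
    """Full update: records overwrite existing rows with the same key."""
    repaired = dict(ods)
    for record in records:
        repaired[record[key_field]] = record
    return repaired


ods = {
    "4711": {"order": "4711", "plant": "1000", "quantity": 5},
    "4712": {"order": "4712", "plant": "2000", "quantity": -3},
}
cleaned = delete_selectively(ods, lambda row: row["plant"] == "2000")
repaired = full_repair_update(cleaned, [{"order": "4712", "plant": "2000", "quantity": 3}], "order")
```

Rows that are not part of the repair load stay untouched, so subsequent delta loads can continue on top of the repaired data.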

Further Information
Public Web:
  www.sap.com > Solutions > SAP NetWeaver

SAP Service Marketplace:
  http://service.sap.com/bw
  Folder Data Consistency
  SAP BW InfoIndex > Data Quality
  Services & Implementation > HOW TO Guides > Guide List SAP BW 2.x > How to Create monitor entries from an update routine

Copyright 2004 SAP AG. All rights reserved


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft, WINDOWS, NT, EXCEL, Word, PowerPoint and SQL Server are registered trademarks of Microsoft Corporation. IBM, DB2, OS/2, DB2/6000, Parallel Sysplex, MVS/ESA, RS/6000, AIX, S/390, AS/400, OS/390, and OS/400 are registered trademarks of IBM Corporation. ORACLE is a registered trademark of ORACLE Corporation. INFORMIX-OnLine for SAP and Informix Dynamic ServerTM are registered trademarks of Informix Software Incorporated. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, the Citrix logo, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, MultiWin and other Citrix product names referenced herein are trademarks of Citrix Systems, Inc. HTML, DHTML, XML, XHTML are trademarks or registered trademarks of W3C, World Wide Web Consortium, Massachusetts Institute of Technology. JAVA is a registered trademark of Sun Microsystems, Inc. JAVASCRIPT is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and implemented by Netscape. SAP, SAP Logo, R/2, RIVA, R/3, SAP ArchiveLink, SAP Business Workflow, WebFlow, SAP EarlyWatch, BAPI, SAPPHIRE, Management Cockpit, mySAP.com Logo and mySAP.com are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other products mentioned are trademarks or registered trademarks of their respective companies.

