
Ariba Spend Visibility Best Practice Guide: Data Extraction Services

Table of Contents
Introduction
Overview
Purpose
Best Practice Recommendations
Phase 1: Kick-off
Phase 2: Data Collection - Mapping
Phase 3: Data Collection - Extraction
Phase 4: Data Collection - File Transmission
Phase 5: Data Validation
Conclusion
Appendix 1: Common Data Collection Problems
Appendix 2: Summary of Bad Practices and Potential Impact to the Project


Task: Extracting data from company ERP systems to load into the Ariba Spend
Visibility solution.

Required Solution(s): Ariba Spend Visibility

Audience: The intended audience of this document is a project team with the task of
extracting data for a spend analysis utility.

Copyright © 2010 Ariba, Inc. All rights reserved.



Introduction
Overview: The first and likely most-difficult task in implementing a spend analysis tool,
such as Ariba Spend Visibility, is extracting data from the existing ERP systems. In a
perfect world, one button is pushed and the data is migrated between systems.
Unfortunately, the process can be quite daunting and time-consuming. This can delay
results, or even scare companies away from implementing such a solution. This creates
a status-quo where a single corporate-level view of spend is not obtained. In today’s
global economy, this can mean millions of dollars in savings opportunities lost.

This document will walk you step-by-step through a typical extraction process up to and
including the validation step. We will use the example of a customer who just
purchased Ariba Spend Visibility and will do the extract work themselves (i.e. no
consulting or Data Transformation services). Each phase will include tasks that should
be completed and tips for best practices.

Purpose: Based on Ariba’s experience with more than 100 current Ariba Spend
Visibility customers, this guide will share best practices that we recommend to complete
this task quickly and accurately.

Best Practice Recommendations

Phase 1: Kick-off
Day one, and the Ariba Spend Visibility contract is signed, sealed, and delivered. The
project champion has assigned an internal project manager (PM) to lead the initiative.
At this stage, the source systems that will be included should have been identified. The
PM should confirm the list and work to have at a minimum one IT resource and one
business user per source (it is possible that there is overlap for some sources).

This team should be in place and be required to attend the project kick-off meeting,
where the Ariba project manager (APM) will present the full project overview. Because
most team members typically did not take part in the sales discussions, this meeting
gives the internal project team its first overview of the project. This is a key step
so that everyone involved knows the scope of the project and which steps require
their participation.

The APM will schedule the data schema overview soon after the project kick-off. This
session provides guidelines and formatting requirements for the Ariba Analysis tool.
It should be attended by the PM, all IT resources, and core business users. We cover
more details in the next section, Data Collection - Mapping.


Common FAQ/Scenarios and Recommended Best Practices

FAQ/Scenario: Who should attend the kick-off presentation and data schema (extract) training?
Recommended Best Practice:
• Business users representing the site/ERP system should also attend, both to understand the format so they can assist in mapping fields and to provide input on data requirements.
• Although spend analytics tools are intended for finding sourcing opportunities, other groups can benefit from the work that is being done. These include the senior management team, the tax department, and the accounting/finance teams. They should be represented in these discussions and will help drive the ROI of the analytics tool.
• All IT resources should attend the data schema kick-off. While the data schema and format requirements may seem straightforward, the session provides additional details and is a good opportunity for discussion.

FAQ/Scenario: Customer is deploying more than 10 source systems.
Recommended Best Practice: Identify an IT project manager who is able to lead the IT effort, track task completion, coordinate between all source system owners, and drive consistency.

Phase 2: Data Collection - Mapping


With the project kick-off and data schema sessions complete, it is now time to determine
what data needs to be pulled. Prior to mapping sessions, the core team needs to define
the overall requirements. This should include the project champion and business users.
These requirements will build consistency and ensure that all teams (each source
system) work together. Below is a suggested list of items that should be determined
before a single line of data is extracted. All teams should agree to these in order to
provide a consistent and accurate extract.

• What constitutes an accounts payable (AP) spend event?
This could be the date an invoice is received, entered, or paid, or another date
used internally.
• What date range should be included, based on the spend event date chosen above?
Will you load the past 12 months, the last complete year, etc.? This should have
been determined as part of the commercial discussions because there is a potential
impact to the SLAs.
• Is there data that should be excluded?
Examples of data that would be excluded would be payroll, inter-company
transactions, and executive expenses. This will also lead to a discussion on
whether data that is included needs to be reported on in the tool, and, if so, does it
need to be restricted from certain users. (We will not cover data access controls in
this document, but you can contact your APM to discuss these.)


• Do you have the ability to load specific invoice lines such as taxes, freight, and
other indirect costs associated with purchases?
• Is it necessary to flag direct vs. indirect spend?
This is commonly overlooked, and ends up being a valuable filter in reporting. In
some cases, this will affect enrichment, but being able to quickly filter direct vs.
indirect spend is beneficial.
• Is it necessary to flag suppliers as preferred, internal, or in some other way?
For example, if you are not able to link contract details to invoices, you may still
be able to flag preferred suppliers quite easily. This allows quick compliance
reviews to see where quick-hit savings can be achieved.
• Are there other internal requirements, such as property-level reporting?
A use case for this would be state-level tax reporting guidelines that require
companies to report all spend within a specific state or country.
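
The scoping decisions above can be written down once and applied identically by every source-system team. The following is only a minimal sketch of that idea: the `ExtractScope` class, the `paid_date` and `category` field names, and the exclusion codes are all hypothetical and are not part of the Ariba data schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ExtractScope:
    """Agreed extract rules shared by all source systems (hypothetical names)."""
    spend_event_date_field: str = "paid_date"   # which date defines an AP spend event
    start_date: date = date(2009, 1, 1)         # first day of the agreed date range
    end_date: date = date(2009, 12, 31)         # last day of the agreed date range
    excluded_categories: set = field(
        default_factory=lambda: {"PAYROLL", "INTERCOMPANY", "EXEC_EXPENSE"}
    )


def in_scope(row: dict, scope: ExtractScope) -> bool:
    """Return True if the row falls in the date range and is not an excluded category."""
    event_date = date.fromisoformat(row[scope.spend_event_date_field])
    if not (scope.start_date <= event_date <= scope.end_date):
        return False
    return row.get("category", "").upper() not in scope.excluded_categories


rows = [
    {"invoice_id": "1", "paid_date": "2009-03-15", "category": "MRO"},
    {"invoice_id": "2", "paid_date": "2009-03-20", "category": "Payroll"},
    {"invoice_id": "3", "paid_date": "2008-12-30", "category": "MRO"},
]
scope = ExtractScope()
kept = [r["invoice_id"] for r in rows if in_scope(r, scope)]
print(kept)
```

Keeping the rules in one shared definition also gives the validation phase a record of which filters were applied to the extract.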

Common FAQ/Scenarios and Recommended Best Practices

FAQ/Scenario: How much data should be provided for the initial deployment?
Recommended Best Practice: At least the prior 12 months of spend detail should be loaded. This will provide a good review and allow analysis of trends. We often suggest that customers load the last full fiscal or calendar year and any months since to have a solid baseline. Adding significantly more data increases cost without adding proportional value for most companies.

FAQ/Scenario: Are there data points that we should consider providing?
Recommended Best Practice: Example fields that should be considered include:
• Indirect vs. Direct spend
• Sourceable/Non-Sourceable spend
• Preferred supplier details
• Corporate Hierarchy information
• Invoice Quantity (allows price-level reporting)

FAQ/Scenario: Where do we start in data mapping, since each site has different data?
Recommended Best Practice: The team should determine at a high level what fields should be provided. This includes company site levels, etc. Specifics will need to be worked out within each source, but building a high-level guideline will assist in source field mapping and maintain consistency.

FAQ/Scenario: Do we have to go through this process for each refresh?
Recommended Best Practice: No, but the client project manager should track the answers to the overall assumptions that the team determines. This is an excellent reference point should questions come up down the road or when future source systems are added.


You have now defined the scope of the extract and the overall requirements. Next up is
the mapping session. Mapping can be difficult because the available detail differs
between ERPs and spend types (e.g., indirect spend may not have part-level detail,
while direct spend does).

The APM will have provided the data acquisition schema after the training session. You
will need to work with each source system team (IT and business user) to map their fields.

The list of available tables is as follows:

• “Invoice2.csv” for AP transactions table
• “PO2.csv” for Purchase Order Line transactions
• “Account.csv” for General Ledger accounts table
• “CompanySite.csv” for Facility/Site location Master table
• “Contract.csv” for Contract Master table
• “CostCenter.csv” for Cost Center Master table
• “CostCenterMgmt.csv” for Cost Center management hierarchy
• “ERPCommodity.csv” for the Material Code Master table
• “FlexDimension1.csv” for other related dimensions
• “FlexDimension2.csv” for other related dimensions
• “FlexDimension3.csv” for other related dimensions
• “FlexDimension4.csv” for other related dimensions
• “FlexDimension5.csv” for other related dimensions
• “FlexDimension6.csv” for other related dimensions
• “Part.csv” for Item Master table
• “Supplier.csv” for Supplier Master table
• “User.csv” for the Buyer Master table
• “CurrencyMap.csv” for mapping Currency codes
• “UOMMap.csv” for mapping Unit of Measure codes

The diagram on the next page provides a sample overview of the available fields and
how the tables are related.


Your project manager will provide you with the data acquisition schema and document.
They will also provide the team with training on the extract requirements.


Common FAQ/Scenarios and Recommended Best Practices

FAQ/Scenario: What take-away is there after this step is completed?
Recommended Best Practice: A mapping guide should be maintained for each source system. It should include details on which field in the ERP is mapped to each Ariba field, along with any formatting requirements and other pertinent information. It should also be kept up to date if changes are required during the validation step. This will allow seamless knowledge transfer if there is a resource change.

FAQ/Scenario: Are there required fields?
Recommended Best Practice: The data schema document will denote what is required vs. recommended. It is common that a customer will not have all data elements that Ariba can accept. In addition to the ID key fields that link the fact tables (Invoice and PO) to the dimension tables (e.g., Supplier, Account, etc.), additional fields are recommended to support enrichment.

For commodity enrichment, the following fields are sent to the enrichment team. Anything that the customer believes will assist in assigning a commodity should be provided in one of these fields:
- Invoice Description
- PO Description
- Part Description
- Supplier
- ERP Commodity
- GL Account
- Flex Fields (the first three of six are used as enrichment key fields)

Similarly, the more details that are provided about the supplier, the more accurate the supplier enrichment process can be. The following fields should be populated as much as possible:
- Supplier Name
- Street Address
- City
- State
- Postal Code
- Country

FAQ/Scenario: I have a field that I need to provide, but I cannot find its equivalent in your data schema.
Recommended Best Practice: Do not get stuck on field names. They can be renamed in the utility if they need to fit the requirements of the data you would like to provide. A common reason for this is to provide hierarchies on appropriate fields.


Phase 3: Data Collection - Extraction


Data mapping has now been completed and it is time for the IT teams to extract the
data. This process may include multiple iterations of extracts to ensure the content and
format is accurate. The following are recommendations to follow during this process.

• Document the extracts very well. It should be extremely easy to understand what
was done, what data was pulled, etc. If proper documentation is not done, this
could lead to delays in the long run.
• Create the extracts to be as automated as possible. There are two reasons for this:
- Manual steps can lead to errors and are not repeatable.
- Most spend analytic projects include refreshes, so it is not a single extract, but
will typically include four per year.
• Update the extracts if there are changes required on the data. It is common that
small changes are just made in the raw files because it is easier, but the extracts
are never updated. Then the same problems are encountered during a refresh,
delaying the project.
• Have validation reports created as part of the extract. Further details can be found
in the Validation section of this document, but it is helpful to have these built with
the extracts.

Ariba has extraction guides for PeopleSoft, SAP, and Oracle. These guides provide
some basic extract details, common field mappings, and starter scripts, but will need
to be adjusted for the specific configuration of each system. Ask your APM for these
guides if you have one of these systems.
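
To illustrate the recommendations above, the following sketch writes an extract with every field quoted and builds a small summary report (row count, total spend, spend per supplier) in the same run. It is an assumption-laden example, not an Ariba-provided script: the column names `InvoiceId`, `SupplierId`, `Amount`, and `Description` are hypothetical, and the real field names come from the data acquisition schema.

```python
import csv
import io
from decimal import Decimal


def write_extract(rows, fieldnames, out):
    """Write rows as CSV with every field double-quoted, so embedded commas are safe."""
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


def summarize(rows, amount_field, group_field):
    """Build control totals to compare later against the loaded data."""
    total = Decimal("0")
    by_group = {}
    for row in rows:
        amount = Decimal(row[amount_field])  # Decimal avoids float rounding drift
        total += amount
        by_group[row[group_field]] = by_group.get(row[group_field], Decimal("0")) + amount
    return {"row_count": len(rows), "total": total, "by_group": by_group}


# Hypothetical rows standing in for the result of an ERP query.
rows = [
    {"InvoiceId": "0001", "SupplierId": "S10", "Amount": "100.50",
     "Description": 'hex ½"x4", cap screw'},
    {"InvoiceId": "0002", "SupplierId": "S10", "Amount": "49.50",
     "Description": "gasket"},
    {"InvoiceId": "0003", "SupplierId": "S20", "Amount": "25.00",
     "Description": "freight"},
]

buffer = io.StringIO()  # a real script would open a file with newline="" instead
write_extract(rows, ["InvoiceId", "SupplierId", "Amount", "Description"], buffer)
report = summarize(rows, "Amount", "SupplierId")
print(buffer.getvalue())
print(report)
```

Because the summary is produced by the same script as the extract, it is regenerated automatically at every refresh and stays consistent with any later changes to the extract logic.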

Common FAQ/Scenarios and Recommended Best Practices

FAQ/Scenario: Can we do any testing on the exports?
Recommended Best Practice: Ariba recommends creating a sample extract and forwarding it to the Ariba project manager for review before pulling the full dataset. In some cases, a full extract will take time to run, while a subset (perhaps one week of data) can be generated quickly to make sure there are no major issues with the data.

FAQ/Scenario: Do you have recommended software that should be used to create the files?
Recommended Best Practice: The data files must be uploaded in comma-separated (csv) format. Unless absolutely required, do not open and save files in Excel during data extraction or file transmission. Excel can remove leading zeroes, change date formats, etc., which causes multiple problems during the upload/load process. If you need to review the data, always use a database such as Access or a SQL database, or a text editor such as TextPad or Notepad.
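
A short script is another way to review a file without Excel altering it. The sketch below reads the first few rows with Python's standard `csv` module, which returns every value as an unmodified string, so leading zeroes and date text are shown exactly as they appear in the file. The `preview_rows` helper and the sample column names are hypothetical.

```python
import csv
import io
from itertools import islice


def preview_rows(stream, limit=5):
    """Return the first `limit` data rows as dicts of unmodified strings."""
    return list(islice(csv.DictReader(stream), limit))


# In-memory sample; for a real file, pass open(path, newline="", encoding=...) instead,
# using whatever encoding the extract was written in.
sample = io.StringIO(
    '"SupplierId","InvoiceDate","Amount"\n'
    '"000123","2009-03-15","100.50"\n'
)
rows = preview_rows(sample)
first = rows[0]
print(first)
```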


Phase 4: Data Collection – File Transmission


You are now in the home stretch—the files have been generated and they are ready to
be sent to the team. Ariba Spend Visibility provides two file transmission options; your
APM will provide details for the selected option:
• Manual via a user-interface on the client-specific https secure website
• Automated via a Java script utility that can be utilized as a stand-alone or entered in
the extract scripts
Any files you uploaded are kept in the staging area. You will be able to view the files
you uploaded. This includes the validation report that is completed automatically when
the file is uploaded. Some of the validations that are completed automatically are
duplicate checks on the ID keys, header and format issues. Additional system
validations are completed during the upload process provided you check the boxes for
extended validations when uploading the files. These validations include duplicate
checks on the ID keys and population density of recommended fields. The APM will
also review these validation reports and will provide feedback and recommendations.

The following screenshots show the staging area and a sample validation report.


NOTE: These example screenshots show the validations when the extended validations
option is not checked. Ariba recommends that you always check that option.

Common FAQ/Scenarios and Recommended Best Practices

FAQ/Scenario: Do we have to upload each file individually? That can be up to 17 files per source.
Recommended Best Practice: The individual csv files should be zipped into one archive per source system. This is more efficient than loading individual files and keeps the data together.

FAQ/Scenario: Customer is deploying more than 20 source systems.
Recommended Best Practice: Customers with more than 20 source systems may consider creating a central collection location. The transmission utility could be used from there to send the files to Ariba. Some customers have also built a validation step at this location to perform customer-specific validations.
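
Packaging the files per source system is easy to automate alongside the extract. The following is a sketch using Python's standard `zipfile` module; the directory name `ERP_A`, the archive name, and the choice to place the files at the root of the archive are assumptions for illustration, so confirm the expected layout with your APM.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_source_system(source_dir: Path, zip_path: Path) -> list:
    """Zip every .csv file in source_dir into one archive and return the file names."""
    csv_files = sorted(source_dir.glob("*.csv"))
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for csv_file in csv_files:
            # arcname keeps only the file name, not the local directory path
            archive.write(csv_file, arcname=csv_file.name)
    return [f.name for f in csv_files]


# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source_dir = Path(tmp) / "ERP_A"
    source_dir.mkdir()
    (source_dir / "Invoice2.csv").write_text('"InvoiceId"\n"0001"\n', encoding="utf-8")
    (source_dir / "Supplier.csv").write_text('"SupplierId"\n"S10"\n', encoding="utf-8")

    zip_path = Path(tmp) / "ERP_A.zip"
    zipped = zip_source_system(source_dir, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        names_in_zip = sorted(archive.namelist())

print(zipped)
```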


Phase 5: Data Validation


Data has been sent to Ariba and it has been loaded to the Ariba Spend Visibility site.
Prior to enrichment services, the data needs to be validated. You should determine
which data points can be validated against the original source data. Customers
complete this phase in a variety of ways: some look only at the overall spend total,
while others do a line-by-line comparison.

This is an extremely important step, as incorrect data at rollout can derail user adoption.
Keeping this in mind, while completing the task efficiently, Ariba recommends the following:

• Review the total spend of all data as a whole and per source system. It may be
difficult to match the figure to the penny, but you should be close, allowing for a
small margin. The comparison figure should come from the ERP system itself.
• Review spend in the top 20 items for each dimension that was provided (for
example, top 20 suppliers by spend, top 20 GL accounts, etc.) per source system. This
will confirm that the data loaded and is linking accurately from the master table to
the supporting dimensions. Ninety-nine percent of the time, any potential data issue
will be found by following this process.
• Review unclassified spend for each provided dimension (for example, unclassified
supplier names, account names, etc.). Were these unclassified because the dimension
table lists the ID without a name, or because the ID on the Invoice is not listed in
the dimension table at all?
• As mentioned in the last section, summary reports created with the extracts will
speed up the process and be the validation that the data extracted matches the
data loaded.
• Keep in mind any filters that were built on the extract. It is common to forget that
you excluded certain expenses, and time is wasted attempting to find the
discrepancy between Ariba and the ERP system.
• If you provide hierarchies for items such as Company Site, Account, Cost Center
Management Files or Flex Dimension 6, check for accurate roll-ups. Making sure
the spend matches is first priority, but you want to ensure your hierarchies are
named correctly and all fields are populated.
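
The top-20 and unclassified reviews above can be run against the extract files themselves to produce comparison figures. The sketch below is one possible approach under stated assumptions: the rows are already parsed into dictionaries, and the field names `SupplierId`, `SupplierName`, and `Amount` are hypothetical placeholders for the schema's actual names.

```python
from collections import defaultdict
from decimal import Decimal


def top_n_spend(invoice_rows, key_field, amount_field, n=20):
    """Return the n largest (key, total spend) pairs for one dimension."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in invoice_rows:
        totals[row[key_field]] += Decimal(row[amount_field])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


def unclassified_reasons(invoice_rows, dimension_rows, key_field, name_field):
    """Split unclassified keys into 'ID not in dimension' and 'ID present but no name'."""
    names = {row[key_field]: row.get(name_field, "").strip() for row in dimension_rows}
    missing_from_dimension, missing_name = set(), set()
    for row in invoice_rows:
        key = row[key_field]
        if key not in names:
            missing_from_dimension.add(key)
        elif not names[key]:
            missing_name.add(key)
    return missing_from_dimension, missing_name


invoices = [
    {"SupplierId": "S10", "Amount": "100.00"},
    {"SupplierId": "S20", "Amount": "300.00"},
    {"SupplierId": "S30", "Amount": "50.00"},
    {"SupplierId": "S10", "Amount": "25.00"},
]
suppliers = [
    {"SupplierId": "S10", "SupplierName": "Acme"},
    {"SupplierId": "S20", "SupplierName": ""},
]

top = top_n_spend(invoices, "SupplierId", "Amount", n=2)
missing_from_dimension, missing_name = unclassified_reasons(
    invoices, suppliers, "SupplierId", "SupplierName"
)
print(top, missing_from_dimension, missing_name)
```

Running the same functions per source system gives figures that can be compared side by side with the reports in the Ariba Spend Visibility site.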

Conclusion
The recent economic downturn has created a need for increased visibility to track risk,
compliance, and to make sure that companies are doing everything they can to spend
wisely. While spend analytics tools require an investment in time and money, you can
no longer afford to use complexity as an excuse.

In addition to using this document, contact your assigned Ariba project manager with
any questions you may have. Ariba has completed many successful large deployments,
and the experience from each of those projects is shared among the project manager
team. Those customers started their projects with the same concerns about up-front
effort, but once those hurdles were crossed, the benefits were achieved. We hope this
document is helpful in getting you started on the project.


Appendix 1: Common Data Collection Problems


Problem: Duplicate Records
Definition: Each table that is created has fields that make the item unique, such as Supplier ID and Supplier Location ID on the Supplier Dimension. If there are multiple records with the same keys, they are considered duplicates. Because of the way the loading process into Ariba Analysis works, only one record per key-set is created, and subsequent records with the same key-set overwrite the previous record. The impact can be incorrect reference data, in this case the wrong supplier. In the case of an Invoice or PO file, the total spend can be incorrect.
Identification/Solution: When the file is uploaded to the Ariba site, you can select an option to perform a duplicate row check. The results will be provided on the error summary report. As mentioned in this document, the Ariba PM will also monitor the uploaded files and provide feedback based on the information, though you can review the file yourself.
If you want to check the files prior to sending them to Ariba, a simple query can be done in most databases (SQL, Oracle, etc.). Depending on the file size (less than 1GB), MS Access has a relatively simple way to check for duplicates using the “Find Duplicates Query Wizard”. Simply select the key ID fields for the given table in the wizard and it will list any duplicates contained in the file.
If duplicates are found, additional research can be done to determine the impact and what changes need to be made to the extract.

Problem: Invalid joins between supporting tables
Definition: Links between fact tables (Invoice and PO) and the dimensions (e.g., Supplier, GL Account, ERP Commodity) are extremely important. If the links fail, then reporting on the data is not possible. Keeping with the supplier dimension example above, the Supplier ID and Supplier Location ID on the Invoice table must be in the same format as those on the Supplier Dimension. This links the invoice to the supplier and allows reporting on that record. If the link fails, data will appear as “Unclassified” in the reporting tool, which is the default value for data elements that are NULL.
Identification/Solution: When the file is uploaded to the Ariba site, you can select an option to perform dimension reference checks. The results will be provided on the error summary report. As mentioned in this document, the Ariba PM will also monitor the uploaded files and provide feedback based on the information, though you can review the file yourself.
If you want to check the files prior to sending them to Ariba, a simple query can be done in most databases (SQL, Oracle, etc.). Depending on the file size (less than 1GB), MS Access has a relatively simple way to check for unmatched records using the “Find Unmatched Query Wizard”. Simply select the key ID fields for the given tables. If the joins are not valid, additional research must be done to correct the extract files.

Problem: Commas/Double Quotes in data
Definition: The Ariba Analysis database uses comma-separated format files. Many customer ERP systems allow commas to be included in fields such as descriptions. If this data is included, the fields must be enclosed in double quotes or the fields will run together, causing problems with links and field content.
Identification/Solution: This is one of the more difficult issues to review. The upload validation will catch the problem if the delimiter issue causes data to jump into a field that then fails validation. The Ariba PM will also work with you to review validation reports after the data is loaded to make sure each of the fields loaded is correct.
You can also use Access or another database to search for unparsable separators, typically reported as an error when importing the files.
An example of how the data should appear in the file is as follows:
ERP Data: hex ½"x4", cap screw
Extract View: "10","100","hex ½""x4"", cap screw"

Problem: Incorrect date format (YYYY-MM-DD)
Definition: This is one of the more common issues seen in the initial extracts. The Ariba required format is yyyy-mm-dd.
Identification/Solution: The standard upload validation on the Ariba site checks for this and provides feedback on the validation report. It is also recommended that the extract script be reviewed and that a sanity check be done on the raw files prior to uploading.

Problem: Extracts are not automated/changes not documented
Definition: Best practice is to automate the extracts as fully as possible. In doing so, any changes in formats or mappings must be reflected in the extract automation. Risks of not doing this include:
• Any manual work increases the risk of errors
• Refreshes are delayed due to the re-work, and potentially the same errors being made as in the initial deployment
• Increased transition time if new team members are brought on to do the work
Identification/Solution: Ariba will provide a mapping template that can be used to track the mapping and format elements.
The customer PM and IT Lead (if applicable) should be diligent in confirming with the IT team members that the extracts are automated and updated throughout the process.
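
For teams that prefer a script to a database query, several of the problems above can be checked before upload. The following sketch covers duplicate key-sets, invoice keys missing from a dimension, and dates that are not in yyyy-mm-dd format. It is an illustration under assumptions, not Ariba's validation logic: the rows are assumed to be already parsed into dictionaries, and the field names `SupplierId`, `SupplierLocationId`, and `InvoiceDate` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime


def duplicate_keys(rows, key_fields):
    """Return key-sets that appear on more than one row."""
    counts = Counter(tuple(row[f] for f in key_fields) for row in rows)
    return sorted(key for key, count in counts.items() if count > 1)


def orphan_references(fact_rows, dimension_rows, key_fields):
    """Return fact-table key-sets that have no matching dimension row."""
    known = {tuple(row[f] for f in key_fields) for row in dimension_rows}
    used = {tuple(row[f] for f in key_fields) for row in fact_rows}
    return sorted(used - known)


def bad_dates(rows, date_field):
    """Return date values that are not valid calendar dates in yyyy-mm-dd form."""
    bad = []
    for row in rows:
        value = row[date_field]
        try:
            # The regex enforces zero-padding; strptime rejects impossible dates.
            if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
                raise ValueError(value)
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            bad.append(value)
    return bad


keys = ["SupplierId", "SupplierLocationId"]
suppliers = [
    {"SupplierId": "S10", "SupplierLocationId": "L1"},
    {"SupplierId": "S10", "SupplierLocationId": "L1"},
    {"SupplierId": "S20", "SupplierLocationId": "L1"},
]
invoices = [
    {"SupplierId": "S10", "SupplierLocationId": "L1", "InvoiceDate": "2009-03-15"},
    {"SupplierId": "S30", "SupplierLocationId": "L1", "InvoiceDate": "03/15/2009"},
    {"SupplierId": "S20", "SupplierLocationId": "L1", "InvoiceDate": "2009-02-30"},
]

print(duplicate_keys(suppliers, keys))
print(orphan_references(invoices, suppliers, keys))
print(bad_dates(invoices, "InvoiceDate"))
```

The keys are compared as exact strings, so a formatting difference such as `010` on the invoice and `10` on the dimension is reported as a missing reference, which matches the requirement that both sides use the same format.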


Appendix 2: Summary of Bad Practices and Potential Impact to the Project

Phase: Kick-off/Data Mapping
Bad Practice: Representatives from all teams are not present in the kick-off meeting and data schema trainings.
Impact: Teams that get pulled in during the middle of the project have little to no background, can become frustrated, and may not see the project as a priority. Not only will this delay the implementation plan, but it can have longer-term impacts on user adoption and the ultimate value of the tool.
Also, re-work could be required on extracts when new teams determine that data was missed, shouldn't have been included, or is incorrect. This would delay the deployment date and also increase internal costs due to the re-work.
Lastly, data changes that impact enrichment could ultimately have commercial impacts due to re-work. Ariba will work with customers to avoid this as much as possible, but it is a possibility.

Phase: Data Collection
Bad Practice: Mapping guides are not created and updated with changes.
Impact: This is quite common, and ultimately leads to the same errors and delays being encountered during refreshes. Again, this increases the internal cost of creating the extracts and delays deployment of new data.

Phase: Data Collection
Bad Practice: Data file changes are done manually to correct data issues instead of incorporating the changes in the extracts themselves.
Impact: Even if the updates are tracked well in a mapping guide, any manual work can create problems, especially if there is a resource change. It also increases turnaround time and resource costs to pull an extract. The extracts should be as automated as possible.

Phase: Data Validation
Bad Practice: No plan is developed with business users and IT to do a proper validation.
Impact: Failure to validate the data and provide proof that it is correct will ultimately cause user deployment issues. End users will constantly question the data, whereas good documentation of the validation plan will alleviate concern that the data is not accurate.
