
Financial Data Model Overview

Daniel Grieb, Lori Silvestri

December 4, 2008

Agenda
Reporting Solution
Star Schema Primer
Data Modeling Process
Finance Data Models
Design Challenges and Choices
Implementation
Conclusion

Finance Data Modeling Guidelines


Campus solution must use the CSU Finance Reporting Solution as source
Replace existing:
1. Revenue and Expense (P&L)
2. Trial Balance Reporting
3. Drill from Summary to Transaction

Need daily refresh of large data sets
Anticipate analytical reporting


Levels of Reporting
Analytics
Enterprise Data Warehouse
Combined information from multiple source systems
Current and historical information
Much more sophisticated data structures to enable analysis: cubes and star schema

Operational Reporting
Tactical data from production systems that address operational needs
Denormalized data structures with embedded business logic

Transactional Reporting
Supports day-to-day transactional users
Requires knowledge of transactional data


REPORTING SOLUTION


CSU Reporting Solution


Attribute Tables
one set for each Set ID XXCMP, XXCSU, XXGAP

Transaction Tables
separate tables per Business Unit

Summary Table
XXCMP and XXCSU
* Brothwell, Kist, and Yelland, Finance 9.0 Reporting Solution Training April, 2008


CSU Reporting Solution - Attributes


Attribute Tables: one set for each Set ID (XXCMP, XXCSU, XXGAP)
Fund        CSU_R_FUND_TBL
Department  CSU_R_DEPT_TBL
Account     CSU_R_ACCT_TBL
Program     CSU_R_PRGM_TBL
Project     CSU_R_PROJ_TBL
Class       CSU_R_CLASS_TBL

Can be joined to transaction and summary tables
Department table contains flattened version of the campus organization department tree
* Brothwell, Kist, and Yelland, Finance 9.0 Reporting Solution Training April, 2008


CSU Reporting Solution - Transactions


Transaction Tables: separate tables per Business Unit

Campus Business Unit Transaction Tables
Actuals           CSU_R_ACTDT_CMP
Budgets           CSU_R_BUDDT_CMP
Encumbrances      CSU_R_ENCDT_CMP
Pre-Encumbrances  CSU_R_PREDT_CMP

CSU Business Unit Transaction Tables
GAP Business Unit Transaction Tables
* Brothwell, Kist, and Yelland, Finance 9.0 Reporting Solution Training April, 2008


CSU Reporting Solution - Summary


Summary Tables (XXCMP and XXCSU)

Campus Business Unit Summary Table
CSU_R_SUMBL_CMP

CSU Business Unit Summary Table
CSU_R_SUMBL_CSU
* Brothwell, Kist, and Yelland, Finance 9.0 Reporting Solution Training April, 2008


Benefits of the Reporting Solution to the Dimensional Data Model


Validated independently
Reporting solution was validated between January and September 2008
Finance was heavily invested in, helped design, and trusted the reporting solution
Sped up data model validation because we could tie to the reporting solution
Finance validated within days, rather than weeks
Validated using the dashboards


Benefits of the Reporting Solution to the Dimensional Data Model


Reporting solution now used in parallel by Finance for internal querying and to fill ad hoc requests
Phase one of the data models did not have to incorporate all of the reporting solution data
Helped constrain project scope


STAR SCHEMA PRIMER


What Is a Star Schema


The star schema is perhaps the simplest data warehouse schema. It is called a star schema because the diagram of this schema resembles a star, with points radiating from a central table. The center of the star consists of a large fact table and the points of the star are the dimension tables.


Star Schema Database Design


Star Schema: a data model that consists of one fact table and one or more dimension tables

Fact Table: contains facts and/or measures to be analyzed (e.g., amount, count) and foreign keys (keys to dimension tables)

Dimension Table: contains attributes describing a campus entity (e.g., department, account type, ledger)

[Diagram: a central fact table surrounded by dimension tables]
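
A minimal sketch of the fact/dimension split described above, using Python's built-in sqlite3 module. The table names, column names, and sample rows are illustrative assumptions, not the actual Cal Poly model.

```python
import sqlite3

def build_star(conn):
    # One fact table holding a measure plus foreign keys to two dimensions.
    # All names below are hypothetical.
    conn.executescript("""
        CREATE TABLE department_dim (
            department_key   INTEGER PRIMARY KEY,
            department_code  TEXT,
            department_descr TEXT
        );
        CREATE TABLE account_dim (
            account_key  INTEGER PRIMARY KEY,
            account_code TEXT,
            account_type TEXT
        );
        CREATE TABLE actual_fact (
            department_key  INTEGER REFERENCES department_dim(department_key),
            account_key     INTEGER REFERENCES account_dim(account_key),
            monetary_amount REAL
        );
    """)
    conn.executemany("INSERT INTO department_dim VALUES (?, ?, ?)",
                     [(1, "D100", "Biology"), (2, "D200", "Physics")])
    conn.executemany("INSERT INTO account_dim VALUES (?, ?, ?)",
                     [(1, "A500", "Expense"), (2, "A400", "Revenue")])
    conn.executemany("INSERT INTO actual_fact VALUES (?, ?, ?)",
                     [(1, 1, 250.0), (1, 1, 100.0), (2, 1, 75.0), (2, 2, 40.0)])

def expense_by_department(conn):
    # Dimensions supply the filter and the grouping; the fact supplies the measure.
    return conn.execute("""
        SELECT d.department_descr, SUM(f.monetary_amount)
        FROM actual_fact f
        JOIN department_dim d ON d.department_key = f.department_key
        JOIN account_dim a ON a.account_key = f.account_key
        WHERE a.account_type = 'Expense'
        GROUP BY d.department_descr
        ORDER BY d.department_descr
    """).fetchall()

conn = sqlite3.connect(":memory:")
build_star(conn)
result = expense_by_department(conn)
```

Each query joins the fact table directly to the dimensions it needs, which is why the number of joins stays small.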


Star Schema
Fact tables, located in the center, contain process activity (quantitative data)
Some example facts are monetary amount, budget amount and statistics amount
Dimensions tell the story and provide the detail to the facts
Which departments budget?
When was the last transaction posted for a given account?

[Diagram: THE FACTS in the center, surrounded by WHO?, WHAT?, WHERE?, WHEN?]


Star Schema Benefits


Data model is easy to understand
Based on business process

Easy to define hierarchies


City - State - Country
Day - Accounting Period - Fiscal Year

Easy to navigate
Number of table joins reduced
Star schema recognized by leading query tools

Maintainable and Scalable


Dimension tables shared between data models
Can add new fact tables which use existing dimensions
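
As a rough illustration of the Day - Accounting Period - Fiscal Year hierarchy mentioned above, the sketch below rolls fact amounts up to any level by reading the parent levels off a date dimension. The dimension rows, period numbering, and fiscal year values are made-up assumptions for the example.

```python
from collections import defaultdict

# Hypothetical date dimension: each day carries its parent hierarchy levels.
DATE_DIM = {
    "2008-07-01": {"accounting_period": 1, "fiscal_year": 2009},
    "2008-07-15": {"accounting_period": 1, "fiscal_year": 2009},
    "2008-08-02": {"accounting_period": 2, "fiscal_year": 2009},
}

# Hypothetical fact rows: (day, amount).
FACTS = [("2008-07-01", 100), ("2008-07-15", 50), ("2008-08-02", 25)]

def roll_up(facts, date_dim, levels):
    """Sum amounts grouped by the requested hierarchy levels."""
    totals = defaultdict(int)
    for day, amount in facts:
        attributes = date_dim[day]
        key = tuple(attributes[level] for level in levels)
        totals[key] += amount
    return dict(totals)

by_period = roll_up(FACTS, DATE_DIM, ["fiscal_year", "accounting_period"])
by_year = roll_up(FACTS, DATE_DIM, ["fiscal_year"])
```

Because the hierarchy lives in the dimension, the same fact rows answer both the period-level and the year-level question.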


Why Star Schema for Cal Poly Finance?


1. Dimensions can easily be reused
across current and future finance models

2. Superior query performance for large datasets


i.e., over 5 million rows

3. Usability
Understandable for users
Better support for unanticipated questions

4. Star schemas are extremely compatible with business intelligence query tools such as OBIEE.

DATA MODELING PROCESS


Data Modeling Process


Interactive / iterative process
Requirements gathering
Domain research
Data profiling
Modeling tool
Design sessions with data steward


Data Modeling Process: Requirements Gathering


Primarily done by Reporting Solution development
Our requirement: refashion the Reporting Solution into a dimensional model
Performance
Accessibility


Data Modeling Process: Research


Domain research
Finance
Cal Poly Financials
Cal Poly Reports (nVision, Brio)
Industry Finance Models (Kimball)

Data profiling
Querying reporting solution
Correlating fields / values
Matrix of attributes across document sources


Data Modeling Process: Design


Modeling tool
Needed a tool to support efficient design
Limitations of modeling tools like Visio
Embarcadero ER Studio

Design sessions with data steward


Model reviews
Validated groupings of attributes into dimensions
New (non-reporting solution) sources (i.e., dept, prog and proj trees)

Prototyping dashboards


FINANCE DATA MODELS


Cal Poly Finance Data Models


4 data models implemented to date

22 Dimensions
Reused across models
Chart fields, Business unit, Ledger, etc.

4 Fact tables
Actual Transactions
Budget Transactions
Encumbrance Transactions
Actual, Budget and Encumbrance Summary


High Level Finance Data Model Diagram

[Diagram: four fact tables (Actual Fact, Budget Fact, Encumbrance Fact, Summary Fact) surrounded by shared dimensions]
Who (Dept ID, Vendor, etc.)
What (Account, Fund, etc.)
When (Acctg. Period, Fiscal Year, etc.)
Where (Business Unit, etc.)

Model Overview: Actual, Budget and Encumbrance Summary

Model Overview: Actual Transactions

Model Overview: Budget Transactions

Model Overview: Encumbrance Transactions

Closer Look at a Dimension


Department
FINANCE_DEPARTMENT

Initial source was CSU Reporting Solution Department Attribute table


PS_CSU_R_DEPT_TBL


Closer Look at a Dimension


Source Department table
Contains flattened version of campus organization department tree
Ragged hierarchy

Added additional source data: Cal Poly department tree
Non-ragged hierarchy
Robust hierarchy for data exploration
Supports reporting on department reorganization or renaming
Cal Poly users are accustomed to using this tree
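
To make "flattened" and "ragged" concrete, the sketch below flattens a parent-child department tree into fixed level columns. Branches of uneven depth leave trailing level columns empty, which is one way a flattened hierarchy ends up ragged. The tree contents and the function name are hypothetical; the slides do not show how either tree is actually stored.

```python
def flatten_tree(parent_of, max_levels):
    """Return {node: [level_1, ..., level_n]}, root first, padded with None."""
    flat = {}
    for node in parent_of:
        path = []
        current = node
        while current is not None:
            path.append(current)
            current = parent_of[current]
        path.reverse()  # root first, the node itself last
        flat[node] = path + [None] * (max_levels - len(path))
    return flat

# Hypothetical tree as child -> parent; None marks the root.
PARENT_OF = {
    "UNIVERSITY": None,
    "ACADEMIC_AFFAIRS": "UNIVERSITY",
    "COLLEGE_OF_SCIENCE": "ACADEMIC_AFFAIRS",
    "BIOLOGY": "COLLEGE_OF_SCIENCE",
    "PRESIDENT_OFFICE": "UNIVERSITY",
}

flat = flatten_tree(PARENT_OF, max_levels=4)
```

Here `BIOLOGY` fills all four level columns while `PRESIDENT_OFFICE` fills only two, so a report grouped on level 3 would see a gap for that branch.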


Closer Look at Department Dimension


Department Budget Specialist and Manager
Reporting Solution provides a single manager field

Cal Poly needs Primary and Secondary Budget Specialists and Managers
Available for querying and display in reports
Used for access control in Finance dashboards (filtering / ease of use)

Source: Excel spreadsheet
Provided by Finance
Updated weekly
Plan to create mini-web application to capture data in future


Department Dimension


Presentation of Data Models


Transactional vs. Summary Models


Dimensions in the summary model are a subset of those in the transactional models
Allows for drill-across from summary to transactional models
Feels like a drill-down
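
A small sketch of why shared dimensions make drill-across possible: a summary row is identified by dimension values that also exist on the transaction rows, so the detail can be fetched by matching on them. The row layouts and the choice of shared dimensions are assumptions for illustration only.

```python
# Hypothetical summary and transaction facts. The transaction fact has the
# summary's dimensions plus an extra one (vendor).
SUMMARY_FACT = [
    {"department": "D100", "account": "A500", "period": 1, "amount": 350},
    {"department": "D200", "account": "A500", "period": 1, "amount": 75},
]
TRANSACTION_FACT = [
    {"department": "D100", "account": "A500", "period": 1, "vendor": "V1", "amount": 250},
    {"department": "D100", "account": "A500", "period": 1, "vendor": "V2", "amount": 100},
    {"department": "D200", "account": "A500", "period": 1, "vendor": "V1", "amount": 75},
]
SHARED_DIMENSIONS = ("department", "account", "period")

def drill_across(summary_row, transactions, shared=SHARED_DIMENSIONS):
    """Return the transaction rows that match a summary row on every shared dimension."""
    return [
        row for row in transactions
        if all(row[dim] == summary_row[dim] for dim in shared)
    ]

detail = drill_across(SUMMARY_FACT[0], TRANSACTION_FACT)
```

To the user this reads as a drill-down from one summary line to its transactions, even though two separate fact tables are involved.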


DESIGN CHALLENGES AND CHOICES


Design Challenges
Challenge: Reporting solution is denormalized
PolyData typically draws on normalized data sources and manages denormalization itself

Solution: Took us a little outside of our comfort zone
Deconstructed the reporting tables into unique combinations of elements


Design Challenges
Challenge: Attributes are overloaded
For example, a document_id can represent an invoice number, a PO number, a journal identifier, etc.

Solution: Preserved this concept in the dimensional models because it is familiar to Finance


Design Challenges
Challenge: Uniqueness not enforced in the reporting solution

Solution: Added an instance number for identical transactions
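
One plausible reading of the instance-number approach, sketched below: when several source rows are identical in every column, number them 1, 2, 3 and so on, so that the row plus its instance number is unique. The row layout is hypothetical, and the slides do not say how the number was actually assigned.

```python
from collections import defaultdict

def add_instance_numbers(rows):
    """Append a running instance number to each row; identical rows get 1, 2, 3, ..."""
    seen = defaultdict(int)
    numbered = []
    for row in rows:
        seen[row] += 1
        numbered.append(row + (seen[row],))
    return numbered

# Hypothetical transactions as (journal_id, department, amount); the first two are identical.
rows = [("J001", "D100", 50), ("J001", "D100", 50), ("J002", "D100", 50)]
numbered = add_instance_numbers(rows)
```

With the extra column, every numbered row is distinct, so it can serve as part of a key in the fact table.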


Design Challenges
Challenge: Nightly rebuild of the reporting solution potentially deletes rows

Solution: Effective-dated transactions in the fact
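
The slides do not describe the mechanics, but effective dating could work roughly as sketched here: rows that disappear from the nightly snapshot are closed with an end date rather than deleted, and rows seen for the first time are opened. The field names, the open-ended date marker, and the function are all assumptions.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed marker for "still present in the source"

def apply_nightly_snapshot(fact_rows, snapshot_keys, load_date):
    """Expire rows missing from tonight's snapshot; add rows seen for the first time."""
    current_keys = {r["key"] for r in fact_rows if r["effective_to"] == OPEN_END}
    for row in fact_rows:
        if row["effective_to"] == OPEN_END and row["key"] not in snapshot_keys:
            row["effective_to"] = load_date  # deleted upstream: keep the row, but close it
    for key in sorted(set(snapshot_keys) - current_keys):
        fact_rows.append(
            {"key": key, "effective_from": load_date, "effective_to": OPEN_END}
        )
    return fact_rows

fact = []
apply_nightly_snapshot(fact, {"T1", "T2"}, date(2008, 10, 1))
apply_nightly_snapshot(fact, {"T1", "T3"}, date(2008, 10, 2))  # T2 vanished, T3 is new
```

After the second load, T2 is still in the fact with a closed date range, so a report as of October 1 can still include it.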


Design Challenges
Challenge: Transactional and summary reporting tables may not tie
Journal vs. ledger sources
Summing the detail may give the wrong answer

Solution: This is a known issue to which Finance is accustomed
Opportunity for a dashboard integrity report
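
An integrity report of the kind suggested above could, as a rough sketch, sum the transaction detail per key and list the keys where it differs from the summary. The key layout and sample amounts are invented for the example.

```python
from collections import defaultdict

def integrity_report(summary, transactions):
    """Return (key, summary_amount, detail_amount, difference) for keys that do not tie."""
    detail_totals = defaultdict(int)
    for key, amount in transactions:
        detail_totals[key] += amount
    report = []
    for key in sorted(set(summary) | set(detail_totals)):
        summary_amount = summary.get(key, 0)
        detail_amount = detail_totals.get(key, 0)
        if summary_amount != detail_amount:
            report.append((key, summary_amount, detail_amount, summary_amount - detail_amount))
    return report

# Hypothetical data keyed by (department, account).
summary = {("D100", "A500"): 350, ("D200", "A500"): 80}
transactions = [
    (("D100", "A500"), 250),
    (("D100", "A500"), 100),
    (("D200", "A500"), 75),
]
report = integrity_report(summary, transactions)
```

Only the D200 line is reported, because its ledger-based summary is 5 higher than the sum of its journal-based detail.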


Design Challenges - Naming


Challenge: Reporting Solution names did not conform to PolyData Warehouse standards

Solution: Data Warehouse standards
Field and table names use full English words when possible for usability
Codes precede corresponding description (Code, Descr)

Used reporting solution names with full spelling, adding Code and Descr where appropriate.
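
The "full spelling" rule can be pictured as a lookup that expands abbreviated name parts. The abbreviation map below is a guess based on the abbreviations visible in the reporting solution table names (DEPT, ACCT, PRGM, PROJ); the actual PolyData standards list and the real column names are not given in the slides.

```python
# Hypothetical abbreviation map; not the actual PolyData standards list.
ABBREVIATIONS = {
    "DEPT": "DEPARTMENT",
    "ACCT": "ACCOUNT",
    "PRGM": "PROGRAM",
    "PROJ": "PROJECT",
}

def to_warehouse_name(source_name):
    """Expand known abbreviations in an underscore-separated name."""
    parts = source_name.upper().split("_")
    return "_".join(ABBREVIATIONS.get(part, part) for part in parts)

department_descr = to_warehouse_name("dept_descr")
account_code = to_warehouse_name("ACCT_CODE")
```

Parts that are not in the map, such as CODE and DESCR, pass through unchanged.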


Design Choices: Slowly Changing Dimensions

Most dimensional attributes were determined by the data steward to be slowly changing dimension Type 1 (SCD1).
Exception: Department table
SCD1 attributes such as department description
SCD2 department tree data

*IF* you need to track historical changes to dimensions
You may need to source dimensions from source system(s)
Candidates include chart fields, vendors, customers


Design Choices: SCD Example

Cal Poly needs department tree history

Department tree data
Slowly Changing Dimension Type 2: preserves history
Effective-dated rows (effective from and to dates)
Add new row for each change

All other department attributes
Slowly Changing Dimension Type 1: overwrites history
Replace old/outdated data with current
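
The two behaviours above can be contrasted in a short sketch: a Type 1 change overwrites the attribute on every row, while a Type 2 change closes the current row and adds a new effective-dated one. Field names, the open-ended date marker, and the sample department are assumptions, not the FINANCE_DEPARTMENT layout.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed marker for the current row

def apply_scd1(rows, dept_code, new_descr):
    """Type 1: overwrite the description on every row; no history is kept."""
    for row in rows:
        if row["dept_code"] == dept_code:
            row["descr"] = new_descr

def apply_scd2(rows, dept_code, new_parent, change_date):
    """Type 2: close the current row and add a new row carrying the new tree parent."""
    current = next(
        r for r in rows
        if r["dept_code"] == dept_code and r["effective_to"] == OPEN_END
    )
    if current["tree_parent"] == new_parent:
        return  # nothing changed
    current["effective_to"] = change_date
    rows.append({
        **current,
        "tree_parent": new_parent,
        "effective_from": change_date,
        "effective_to": OPEN_END,
    })

rows = [{
    "dept_code": "D100", "descr": "Bio", "tree_parent": "SCIENCE",
    "effective_from": date(2008, 7, 1), "effective_to": OPEN_END,
}]
apply_scd2(rows, "D100", "ENGINEERING", date(2008, 10, 1))  # reorganization
apply_scd1(rows, "D100", "Biological Sciences")             # rename
```

After both calls there are two rows for D100: the tree move is visible as history, while the rename shows up on both rows.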


Design Choices: New Sources

In design and prototyping sessions with end users, it became apparent that additional source data was needed
New non-reporting solution sources were needed to supplement existing source
Department tree
Program tree
Project tree

Design change from using only reporting solution as source


IMPLEMENTATION


Time and Resources


Modeling / domain familiarization
2 data modelers
June through August 2008

Source-to-target analysis and documentation
2 analysts
July through September 2008

Time and Resources


Coding and system integration
4 ETL programmers
August through October 2008

Total person-days
July through October 2008
Approximately 140 person-days


Time and Resources


Caveats
Established documentation methods and coding standards
Slowly changing logic developed or provided by toolset
3 transactional models implemented identically


Nightly Build
Job                       Minutes (approximate)
Source pull               30
Reporting solution build  70
Data model build          140
End user table refresh    60
TOTAL                     300


Performance Tuning: Nightly Build


Coordination with Finance on their builds
Nightly processing
Reporting solution (in transactional database)

Approximately one month to level out on timing
Tuning specific to the finance jobs
Coordination with other PolyData warehouse jobs


Performance Tuning: End-User Tables


Performance was reasonable prior to indexing
Largely due to the dimensional structure

Performance screamed after indexing


Indexes on fields used in selection criteria and drillable hierarchies
Bitmap indexes on foreign keys in facts
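
For readers unfamiliar with bitmap indexes, the following is a conceptual sketch only, not how the warehouse database implements them. Each distinct foreign key value gets a bitmap with one bit per fact row, and filters on several dimensions are combined with a bitwise AND. The fact rows and key values are invented.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an integer whose set bits mark matching rows."""
    index = {}
    for position, row in enumerate(rows):
        value = row[column]
        index[value] = index.get(value, 0) | (1 << position)
    return index

def matching_positions(bitmap):
    """List the row positions whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Hypothetical fact rows holding only foreign keys.
FACT = [
    {"department_key": 1, "account_key": 10},
    {"department_key": 2, "account_key": 10},
    {"department_key": 1, "account_key": 20},
    {"department_key": 1, "account_key": 10},
]

dept_index = build_bitmap_index(FACT, "department_key")
acct_index = build_bitmap_index(FACT, "account_key")

# Rows where department_key = 1 AND account_key = 10.
hits = matching_positions(dept_index[1] & acct_index[10])
```

Foreign keys in a fact table tend to have few distinct values relative to the row count, which is the situation where this kind of index is usually considered.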

Implementation: Interface with Front End Developers


Joins should be fully documented
Front end developers may need some training in interpreting models
We still have not come up with an ideal method for documenting hierarchies
Challenge: knowledge of hierarchies is shared
Data steward
Front end developers
Modelers


CONCLUSION


Future Work
Labor Cost
GAAP Reporting
Management Dashboard/Analytics
Integration with HR and Student Data


Questions?
Daniel Grieb, Data Warehouse Architect, Analyst/Programmer
Lori Silvestri, Data Warehouse Analyst/Programmer


Contact
OBIEE Technical Conference:
http://polydata.calpoly.edu/dashboards/obiee_conf/index.html

Email: polydata@calpoly.edu
