
Data Warehousing & Informatica

By Deepthi.G

AGENDA

INTRODUCTION TO DATA WAREHOUSING
DATA WAREHOUSING ARCHITECTURE
STEPS FOR BUILDING A DATA WAREHOUSE
TYPES OF SCHEMAS
CONCEPTS OF DATA WAREHOUSING
LIST OF TOOLS

INTRODUCTION TO INFORMATICA
INFORMATICA ARCHITECTURE
COMPONENTS OF INFORMATICA
WORKING WITH INFORMATICA

INSTALLATION AND CONFIGURATION

AS PER W. H. INMON, A DATA WAREHOUSE IS

Subject-Oriented
Integrated
Time-Variant
Non-volatile

Data warehousing is also known as a decision support system (DSS).

Subject Oriented Analysis

Process Oriented (Transactional Storage)
Entry: Sales Rep, Quantity Sold, Prod Number, Date, Customer Name, Product Description, Unit Price, Mail Address

Subject Oriented (Data Warehouse Storage)
Sales, Customers, Products

Integration of Data

Encoding
Transactional Storage: Appl. A - M, F; Appl. B - 1, 0; Appl. C - X, Y
Data Warehouse Storage: M, F

Unit of Attributes
Transactional Storage: Appl. A - pipeline cm; Appl. B - pipeline inches; Appl. C - pipeline mcf
Data Warehouse Storage: pipeline cm

Physical Attributes
Transactional Storage: Appl. A - balance dec(13,2); Appl. B - balance PIC 9(9)V99; Appl. C - balance float
Data Warehouse Storage: balance dec(13,2)

Naming Conventions
Transactional Storage: Appl. A - bal-on-hand; Appl. B - current_balance; Appl. C - balance
Data Warehouse Storage: balance

Data Consistency
Transactional Storage: Appl. A - date (Julian); Appl. B - date (yymmdd); Appl. C - date (absolute)
Data Warehouse Storage: date (Julian)
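
The integration idea above can be sketched in a few lines of Python. This is only an illustration, not part of any tool: the function names are hypothetical, and the meaning assigned to each application's codes (for example, that 1 and X stand for M) is an assumption, since the slide lists the codes without saying which maps to which. Only the inches-to-cm conversion is shown for units.

```python
from decimal import Decimal

# Assumed meaning of each application's codes (the slide only lists the codes).
GENDER_CODES = {
    "A": {"M": "M", "F": "F"},
    "B": {"1": "M", "0": "F"},
    "C": {"X": "M", "Y": "F"},
}

CM_PER_INCH = Decimal("2.54")  # exact by definition


def integrate_gender(app, code):
    """Map an application-specific code to the warehouse encoding (M, F)."""
    return GENDER_CODES[app][str(code)]


def pipeline_to_cm(value, unit):
    """Convert a pipeline measurement to the warehouse unit (cm)."""
    amount = Decimal(str(value))
    if unit == "cm":
        return amount
    if unit == "inches":
        return amount * CM_PER_INCH
    raise ValueError(f"no conversion defined for unit: {unit}")


print(integrate_gender("B", 1))      # code 1 from Appl. B in warehouse encoding
print(pipeline_to_cm(10, "inches"))  # 10 inches expressed in cm
```

Using Decimal rather than float mirrors the dec(13,2) choice for the warehouse balance column: fixed-point values avoid rounding surprises.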

Volatility of Data

Volatile (Transactional Storage)
Insert, Change, Delete, Access
Record-by-Record Data Manipulation

Non-Volatile (Data Warehouse Storage)
Load, Access
Mass Load / Access of Data

Time Variant Data Analysis

Transactional Storage: Current Data
Data Warehouse Storage: Historical Data

[Chart: Sales by region (East, West, North) for the 1st quarter of Year 97 (January, February, March); sales in lakhs, axis 0 to 20]

Data warehouses store large volumes of data that are frequently used by decision support systems (DSS).
A data warehouse is maintained separately from the organization's operational databases.
Data warehouses are relatively static, with only infrequent updates.
A data warehouse is a stand-alone repository of information, integrated from several, possibly heterogeneous, operational databases.

Data warehousing is the enabling technology that facilitates improved business decision-making.
It is a process, not a product.
It is a technique for assembling and managing a wide variety of data from multiple operational systems for decision support and analytical processing.

It is a journey, not a destination.

Data Warehouse Architecture

[Diagram: Source -> Staging Area -> Data Warehouse -> Data Mart -> Analysis]
Sources: Oracle, Teradata, DB2, SQL Server
Data warehouse contents: Metadata, Raw Data, Summary Data
End uses: Analysis, Reporting, Data Mining

Source:

The database from which data is extracted.
Examples: Oracle, Teradata, Sybase, DB2

Staging area:

A temporary storage area used for processing the data.

Metadata:

Data about the data, or a description of the data.

Data Mart:

A data mart is a data warehouse for a specific domain.
A data mart can be one of two types:
Independent data mart
Dependent data mart

Steps For Building A Data Warehouse

Identify key business drivers, sponsorship and risks.
Survey information needs, identify desired functionality and define functional requirements for the initial subject area.
Architect the long-term data warehousing architecture.
Evaluate and finalize the DW tool and technology.
Conduct a proof of concept.
Design the target database schema.
Build data mapping, extract, transformation, cleansing and aggregation/summarization rules.
Build the initial data mart, using an exact subset of the enterprise data warehousing architecture, and expand to the enterprise architecture over subsequent phases.
Maintain and administer the data warehouse.

Snowflake Schema

Used in the same way as the star schema, but the cube has at least one dimension with two or more levels, organized in a hierarchy.
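
A snowflake schema is commonly described as a star schema in which a dimension is split into further levels. A minimal sketch using Python's built-in sqlite3 module follows. All table names, column names and rows are hypothetical and chosen only for illustration: the product dimension has a second level (category) in its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Second level of the product dimension (the "snowflake" part).
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT
);
-- First level of the product dimension, pointing to its category.
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)
);
-- Fact table, pointing to the first level of the dimension.
CREATE TABLE fact_sales (
    sale_id       INTEGER PRIMARY KEY,
    product_id    INTEGER REFERENCES dim_product(product_id),
    quantity_sold INTEGER
);
INSERT INTO dim_category VALUES (1, 'Beverages'), (2, 'Snacks');
INSERT INTO dim_product VALUES (10, 'Tea', 1), (11, 'Coffee', 1), (12, 'Chips', 2);
INSERT INTO fact_sales VALUES (1, 10, 5), (2, 11, 3), (3, 12, 7);
""")

# Rolling up to the category level needs two joins: fact -> product -> category.
rows = conn.execute("""
    SELECT c.category_name, SUM(f.quantity_sold)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(rows)
```

In a star schema the category name would sit directly in dim_product, so the same roll-up would need only one join.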

List Of Tools

ETL TOOLS
Informatica, Ascential DataStage, IBM Visual Warehouse, Oracle Warehouse Builder.

OLAP SERVER
Oracle Express Server, Hyperion Essbase, IBM DB2 OLAP Server, Microsoft SQL Server OLAP Services, Seagate HOLOS, SAS/MDDB.

OLAP TOOLS
Oracle Express Suite, Business Objects, Web Intelligence, SAS, Cognos PowerPlay/Impromptu, KALIDO, MicroStrategy, Brio Query, MetaCube.

DATA WAREHOUSE
Oracle, Informix, Teradata, DB2/UDB, Sybase, Microsoft SQL Server.

INTRODUCTION TO INFORMATICA

Informatica is an ETL tool:
Extracting data from the sources
Performing the transformations
Loading the data into the target
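
The three ETL stages can be illustrated with a short, tool-independent Python sketch. It does not use Informatica or any of its APIs. The function names, the CSV sample and the sales table are all hypothetical, and an in-memory SQLite database stands in for the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source would be a file or a database.
SOURCE_CSV = (
    "customer_name,quantity_sold,unit_price\n"
    " alice ,2,10.50\n"
    "Bob,1,4.00\n"
)


def extract(text):
    """Extract: read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean the names and derive the sale amount."""
    result = []
    for row in rows:
        name = row["customer_name"].strip().title()
        quantity = int(row["quantity_sold"])
        amount = round(quantity * float(row["unit_price"]), 2)
        result.append((name, quantity, amount))
    return result


def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_name TEXT, quantity_sold INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute("SELECT * FROM sales ORDER BY customer_name").fetchall()
print(loaded)
```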

INFORMATICA ARCHITECTURE

[Diagram: Source -> Informatica Server -> Target]
Repository: Informatica Repository Manager, Repository Server, Repository Admin Console
Client tools: Designer, Workflow Manager, Workflow Monitor
Other diagram labels: validation session, Status

Components of Informatica

REPOSITORY MANAGER
DESIGNER
SERVER MANAGER


REPOSITORY MANAGER

Functions:
REPOSITORY SECURITY
FOLDER MANAGEMENT
METADATA REPORTING
REPOSITORY MAINTENANCE

Windows:
ANALYSIS WINDOW
NAVIGATOR WINDOW
DEPENDENCY WINDOW
OUTPUT WINDOW

REPOSITORY SECURITY

Users and groups:
CREATE USERS
CREATE GROUPS
ASSIGN PRIVILEGES
MOVE USERS INTO GROUPS
ASSIGN ADDITIONAL PRIVILEGES TO USERS

Locks:
LOCK TYPES ( READ, WRITE, EXEC, FETCH, SAVE )
OBJECT LOCKS ( FOLDERS, SOURCE DEF., TARGET DEF. )
VIEW LOCKS ( EDIT | SHOW LOCKS )
UNLOCKING OBJECTS

FOLDER MANAGEMENT

FOLDER ATTRIBUTES
* OWNER
* PERMISSIONS
* SHARED
* SHORTCUT
* VERSIONS

DESIGNER

SOURCE ANALYZER: to create source definitions
WAREHOUSE DESIGNER: to create target definitions
TRANSFORMATION DEVELOPER: to create reusable transformations
MAPPLET DESIGNER: to create mapplets (reusable sets of transformations)
MAPPING DESIGNER: to create source-to-target mappings

Designer

Mapping = Source + Transformation + Target

Transformations are of two types:
Active transformation: can change the number of rows that pass through it
Passive transformation: does not change the number of rows that pass through it

ACTIVE TRANSFORMATION
Sorter, Rank, Router, Normalizer, Source Qualifier, Joiner, Aggregator, Advanced External Procedure, Update Strategy, Custom Transformation, Transaction Control, Union

PASSIVE TRANSFORMATION
Lookup, Expression, Stored Procedure, Sequence Generator, External Procedure, XML Source Qualifier
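
The difference between the two types can be shown with a plain Python analogy. An active transformation can change the number of rows passing through it, while a passive one cannot. The two functions below are hypothetical stand-ins that only imitate the row behaviour of an Expression and an Aggregator transformation; they are not Informatica code, and the sample rows are invented.

```python
def expression_transform(rows):
    """Passive: adds a derived column, one output row per input row."""
    return [{**row, "amount": row["quantity"] * row["unit_price"]} for row in rows]


def aggregator_transform(rows):
    """Active: groups by region, so the output can have fewer rows."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return [{"region": region, "total": total} for region, total in totals.items()]


source_rows = [
    {"region": "East", "quantity": 2, "unit_price": 10},
    {"region": "East", "quantity": 1, "unit_price": 5},
    {"region": "West", "quantity": 3, "unit_price": 4},
]

passive_out = expression_transform(source_rows)
active_out = aggregator_transform(passive_out)

print(len(source_rows), len(passive_out), len(active_out))
```

The row count is unchanged after the passive step and reduced after the active step, because the two East rows collapse into one.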

SERVER MANAGER

CONFIGURE SERVER
CREATE SESSION
START SESSION
MONITOR SESSION
VIEW LOGS
CORRECT SESSION PROBLEMS

Important Bottlenecks:

TEST SQL QUERY
CHECK SESSION LOG FOR ERRORS
CHECK PERFORMANCE DETAILS
REDUCE NUMBER OF RECORDS PROCESSED
INDEX THE SOURCE
REPLACE DEFAULT QUERY WITH AN OPTIMIZED QUERY
DROP INDEXES BEFORE LOADING
CONSIDER INCREASING COMMIT LEVEL
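
Two of the points above, dropping indexes before loading and committing less often, can be sketched with Python's built-in sqlite3 module. This is a generic illustration under stated assumptions, not Informatica configuration: the table, the index name, the bulk_load function and its commit_interval parameter are all hypothetical.

```python
import sqlite3


def bulk_load(conn, rows, commit_interval=1000):
    """Load rows in batches, rebuilding the index once at the end."""
    # Drop the index first so it is not maintained row by row during the load.
    conn.execute("DROP INDEX IF EXISTS idx_sales_region")
    for start in range(0, len(rows), commit_interval):
        batch = rows[start:start + commit_interval]
        conn.executemany("INSERT INTO sales VALUES (?, ?)", batch)
        conn.commit()  # one commit per batch instead of one per row
    # Recreate the index after all rows are in place.
    conn.execute("CREATE INDEX idx_sales_region ON sales(region)")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.execute("CREATE INDEX idx_sales_region ON sales(region)")

rows = [("East" if i % 2 else "West", i) for i in range(2500)]
bulk_load(conn, rows)

count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
index_row = conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'index' AND name = 'idx_sales_region'"
).fetchone()
print(count, index_row)
```

A larger commit_interval means fewer commits; whether that helps in practice depends on the target database and should be measured.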

INSTALLATION AND CONFIGURATION

SYSTEM REQUIREMENTS:

OPERATING SYSTEM ( WINDOWS 95/98/NT 4.0 )
DISK SPACE ( 120 MB )
RAM ( 32 MB )
CONNECTIVITY ( MERANT ODBC 3.50 )
NETWORK SUPPORT ( TCP/IP OR IPX/SPX )

THANK YOU
