
SAP

SAP : - SYSTEMS APPLICATIONS AND PRODUCTS IN DATA PROCESSING


ERP : - ENTERPRISE RESOURCE PLANNING
SAP R/3 : - R -> REAL TIME DATA PROCESSING
3 -> THREE TIER ARCHITECTURE (i.e. DATABASE LAYER, APPLICATION
SERVER LAYER AND PRESENTATION LAYER (CLIENT SAP GUI))
BW : - BUSINESS INFORMATION WAREHOUSE
ABAP : - ADVANCED BUSINESS APPLICATION PROGRAMMING
(A high-level programming language created by SAP, a German software
company)
XI : - EXCHANGE INFRASTRUCTURE

By default SAP software has the clients 000 (SAP client) and 066. Any other client,
such as 100, is a copy of client 000 and becomes the active client.
In SAP BW only one client can be active at any given point in time.
In SAP R/3 we can have multiple active clients, each of which is a copy of client 000.

SAP R/3 has both client-dependent and client-independent tables, whereas SAP BW has
only client-independent tables.

Software: SAP software contains the following:

- Programming language.
- Database (the place where data is stored).
- Operating system (the interface between user and system).
- Packages (applications which cannot be customized).
- Applications (software which can be customized as per client/customer
requirements).

Here an application consists of:
PRESENTATION LAYER + DATABASE + OPERATING SYSTEM + CONCEPT.

The father of data warehousing is William H. (Bill) Inmon.

Types of Applications:
1. OLTP: On-Line Transaction Processing
a. It is used to record transactions.
b. An OLTP system records a transaction in the database or modifies an existing one;
for reporting, it reads the data back from the same database and displays the report.
c. Master Data + Application = Transaction Data.
d. Ex: SAP R/3.

2. OLAP: On-Line Analytical Processing
a. It is used to extract transactions from different heterogeneous sources and
provide analytical reports.
b. These analytical reports help the business to take decisions and to improve its
business processes.
c. Transaction Data + Application = Analytical Data.
d. Ex: SAP BW.

Master Data:
It is the detailed information about an entity, i.e. data that remains unchanged
over an extended period of time.

Transaction Data:
This is the data that represents the transactions in a business process. Each business
transaction is assigned to certain master data.
When two or more entities interact with each other, they form a transaction.

Entity:
An entity is an object which can perform work by itself or which can be used to perform
some other work.

SAP Products:
- Enterprise Portal
- SAP R/3 (OLTP system)
- SAP BW (OLAP system)
- SAP CRM (SD + BW)
- SAP APO (PP + MM + BW)
- SAP SEM (FICO + BW)

SAP CRM: Customer Relationship Management.
Holds detailed information about customers.
The CRM server is called CRMONLINE.
CRM is used for retaining customers through the following channels:
1. Internet sales -> e-commerce.
2. Mobile -> sales, service.
3. Call center.
CRM + ABAP + ORACLE/DB2 + UNIX/WINDOWS + TO RETAIN EXISTING CUSTOMERS.

SAP APO: Advanced Planner and Optimizer.

APO is used for production planning. It has 5 sub-modules:
1. DP - Demand Planning (the demand for products can be planned).
2. SNP - Supply Network Planning.
3. GATP - Global Available-to-Promise (checking the availability of a product
globally).
4. TP/VS - Transportation Planning and Vehicle Scheduling.
5. PP/DS - Production Planning and Detailed Scheduling.

SAP SEM: Strategic Enterprise Management.

SEM is used for financial planning. It has 4 modules:
1. BPS -> Business Planning and Simulation.
2. BCS -> Business Consolidation.
3. CPM -> Corporate Performance Monitor (actual/plan comparison).
4. SRM -> Stakeholder Relationship Management (maintains information about
stockholders/shareholders).

Enterprise Portal:
This application is used to post data/reports. It works as a single sign-on location to
access data (reports) from any of the modules.
SAP NetWeaver is not a single product; it is a suite of multiple products.

SAP NETWEAVER (layers, as shown in the original diagram)

1. PEOPLE INTEGRATION: multi-channel access, portal (Enterprise Portal).
2. INFORMATION INTEGRATION: BI, Knowledge Management, Master Data Management.
3. PROCESS INTEGRATION.
4. APPLICATION PLATFORM LAYER: J2EE, ABAP (OLTP system).

Enterprise Portal was built by SAP + YAHOO.


Transaction (actual) data comes from SAP R/3, and plan data comes from SAP SEM and APO.
We compare the actual and plan data in BW and finally publish the strategic reports
on the Enterprise Portal.

Web Reports:
These can be published on the web as well as on the Enterprise Portal.

BW Versions:
BW 2.0, 2.1c, 3.0, 3.1, 3.5, BI 7.0.
Business Intelligence (BI 7.0) is a part of NetWeaver.
Using MDM (Master Data Management), all the master data is maintained in one application
and referred to whenever required.

The application platform layer has SAP as well as non-SAP systems.

A non-SAP application (non-SAP OLTP system) cannot be connected to BW directly, so
we use integration tools:
1. SAP XI (which integrates non-SAP systems with SAP systems).
2. TIBCO.

PRESENTATION LAYER + DATABASE + OPERATING SYSTEM + CONCEPT
ABAP + DATABASE + UNIX/WINDOWS + DATA WAREHOUSE

BI  - Business Intelligence (maintains historical data)
DSS - Decision Support System (helps to make effective decisions)
FBS - Fact Based System (maintains data that is not subject to change)
DW  - Data Warehouse (collects all data from OLTP systems)

Business Intelligence:
This concept was introduced by the management consulting firm Gartner Group.
The main concept of BI is to have all the historical data available for taking decisions.
The father of data warehousing is William H. (Bill) Inmon.

[Diagram: BI provides the historical data, DSS supports the decision-making process,
and both rest on the DW, a fact-based system that holds the facts.]

- The main concept of BI is to keep all historical data for decision making: data is
extracted from different heterogeneous source systems to generate reports using
multidimensional methods, and these reports feed the decision support system.

Data in a Data Warehouse should have the following properties:

1. Time Variant (T):
Any information stored in the Data Warehouse should be stored with at least one
time factor (year, month, date).

2. Integrated (I):
We should be able to integrate the data coming from any heterogeneous OLTP
system.

3. Non-Volatile (N):
Access should be fast, and once loaded the data should be fixed and not changeable.

4. Subject oriented and should support decision making (S):
Data is stored according to subject: sales data is stored in a sales cube and
finance data is stored in a finance cube.

Data Warehouse tools:
- Cognos -> reporting only.
- Informatica -> ETL tool.
- BO (Business Objects) -> reporting.
- Oracle & DW -> staging.
- SAP BW -> an end-to-end data warehouse solution covering modeling,
extraction and reporting.

Different Layers in Data Warehousing:

Data Provider Layer     Service Provider Layer     Information Consumer Layer
Extraction              Modeling                   Reporting, BEx

Data Warehouse Management Layer

In modeling we build the cubes and other objects. Each area has its own role:

- Modeling: data architect
- Extraction: ETL consultant
- Reporting: reporting consultant

Modeling Concept
Database design in OLTP and OLAP:
In any database we store the data in the form of tables.

Tables:
A table is a collection of rows and columns. The columns of the table together give the
logical definition of an entity. A row is also called a record; each row or record
represents one physical occurrence of the entity.

Primary Key:
Every table must have a primary key. A primary key is a column (or a combination of
columns) with which we can identify a record uniquely in the table.

There are two types of columns in a table:

1. Key column: any column which is a part of the primary key.

2. Non-key column: any column that is not a part of the primary key.

All the non-key columns of a table are attributes/properties of the key column (primary key).

Customer Master Data
(primary key/key column: CNO; non-key columns, i.e. attributes of the key column:
CNAME, CADDRESS, CREGION)

CNO    CNAME   CADDRESS   CREGION
C100   ABC     HYD        SOUTH
C200   XYZ     HYD        SOUTH

Sales Transaction Data
(composite key/key columns: BILL NO + ITEM NO; non-key columns: CUST NO, MAT NO, AMOUNT)

BILL NO   ITEM NO   CUST NO   MAT NO   AMOUNT
B001      10        C100      M100     1000
B001      20        C100      M200     1000
B002      15        C200      M300     1500
B002      25        C200      M400     1100

Within a bill, an item number cannot be duplicated. In this table no single column can
act as the primary key, so we use a combination of columns.
Here BILL NO + ITEM NO together form the primary key.
When multiple columns together act as the primary key, that key is called a composite key.
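
The composite key idea can be checked with a few lines of Python. This is only a minimal sketch: the helper name duplicate_keys and the row values are assumptions made for illustration, modelled on the sales transaction table. It shows that BILL NO alone repeats, while BILL NO + ITEM NO identifies each row uniquely.

```python
from collections import Counter

# Illustrative sales rows: (bill_no, item_no, cust_no, mat_no, amount).
sales_rows = [
    ("B001", 10, "C100", "M100", 1000),
    ("B001", 20, "C100", "M200", 1000),
    ("B002", 15, "C200", "M300", 1500),
    ("B002", 25, "C200", "M400", 1100),
]


def duplicate_keys(rows, key_positions):
    """Return the key values that occur in more than one row (hypothetical helper)."""
    counts = Counter(tuple(row[i] for i in key_positions) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# BILL NO alone repeats, so it cannot be the primary key by itself.
bill_only_duplicates = duplicate_keys(sales_rows, [0])
# BILL NO + ITEM NO together produce no duplicates: a valid composite key.
composite_duplicates = duplicate_keys(sales_rows, [0, 1])

print(bill_only_duplicates)
print(composite_duplicates)
```

An empty result for the second call is what qualifies the column combination as a candidate composite key for this data.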

BILL NO   ITEM NO   CUST NO   CADDR   CREGI   MAT NO   AMOUNT
B001      10        C100      HYD     SOUTH   M100     1000
B001      20        C100      HYD     SOUTH   M200     2000
B002      10        C100      HYD     SOUTH   M300     1000
B002      30        C100      HYD     SOUTH   M400     1500

Denormalized table

When all the information is stored in one table, that table is called a denormalized table.
The problems with such a table are data redundancy (repeated values), increased complexity
and wasted storage space.
To overcome these problems, we can store the information in two different tables.

Customer table:

CTNO   CNAM   CADD   CREGI
C100   ABC    HYD    SOUTH
C200   XYZ    BAN    WEST

Sales table:

BNO    ITNO   MTNO   AMNT   CTNO
B001   10     M100   1000   C100
B001   20     M200   2000   C100
B002   10     M300   1000   C100
B002   30     M400   1500   C100

When the primary key of one table appears as a column in another table, it is called a
foreign key in that other table.

Normalized Table:
A table without redundant data is called a normalized table.
To overcome the problems of a denormalized table, we split its data into multiple
smaller normalized tables and connect them with primary key and foreign key relationships.

The process of converting denormalized tables into normalized tables is called
normalization.
Normalization is carried out with the help of normal forms.
In OLTP the database design is completely normalized.
In OLAP the database design is completely denormalized.
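
The split from a denormalized table into normalized tables can be sketched in Python as follows. This is an illustration under stated assumptions, not a method prescribed by these notes: the function name normalize and the dictionary layout are hypothetical, and the rows mirror the denormalized table shown earlier.

```python
# Denormalized rows: the customer address and region repeat on every sales line.
denormalized = [
    {"bill_no": "B001", "item_no": 10, "cust_no": "C100", "caddr": "HYD",
     "cregi": "SOUTH", "mat_no": "M100", "amount": 1000},
    {"bill_no": "B001", "item_no": 20, "cust_no": "C100", "caddr": "HYD",
     "cregi": "SOUTH", "mat_no": "M200", "amount": 2000},
    {"bill_no": "B002", "item_no": 10, "cust_no": "C100", "caddr": "HYD",
     "cregi": "SOUTH", "mat_no": "M300", "amount": 1000},
    {"bill_no": "B002", "item_no": 30, "cust_no": "C100", "caddr": "HYD",
     "cregi": "SOUTH", "mat_no": "M400", "amount": 1500},
]


def normalize(rows):
    """Split rows into a customer table and a sales table (hypothetical helper)."""
    customers = {}  # primary key cust_no -> customer attributes, stored once
    sales = []      # sales lines keep cust_no only as a foreign key
    for row in rows:
        customers[row["cust_no"]] = {"caddr": row["caddr"], "cregi": row["cregi"]}
        sales.append({name: row[name] for name in
                      ("bill_no", "item_no", "mat_no", "amount", "cust_no")})
    return customers, sales


customers, sales = normalize(denormalized)
print(len(customers), "customer record(s),", len(sales), "sales line(s)")
```

The customer attributes are now stored once per customer, and every sales line can still reach them through the foreign key cust_no.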

Database design in OLTP system:
- It is two-dimensional (row-column).
- It is implemented with the concept of ERM (Entity Relationship Model).

[ER diagram: the entities MATERIAL GROUP, MATERIAL, SALES DEPARTMENT, SALES PERSON and
CUSTOMER are connected to SALES TRANSACTION, which is the intersection entity.]

When entities are trying to interact with each other (to perform a transaction)
we need to know the relationship between the entities.

Step 1: Define the business/business process/business system:
Identify all the transactions included in the business; this requires knowing the
business completely.

Step 2: Identify entities in the business:
We identify the entities using the noun-phrase approach, i.e. the nouns in the
description of the business become candidate entities:
- Customer
- Product
- Sales Person

Step 3: Draw the entity diagram:
An entity diagram is a rectangular box with 3 blocks:
1. The first block holds the name of the entity.
2. The second block holds the properties of the entity (attributes).
3. The third block holds the functionality of the entity (methods).

Entity     Properties (attributes)   Functionality (methods)
Customer   CNO, CNAME, ...           Sales transactions in the company: Sales( )
Product    PNO, PNAME, ...           Sales transactions: Sales( )
Sales      SID, STYPE, ...           Sales transactions: Sales( )

The attributes are taken care of by the database design, and the methods are handled by
the front-end developers.

Step 4: Convert each entity into a separate table:
Convert the customer entity into a customer table, the product entity into a product table,
and so on. The first column of each table is the primary key (key column).

Customer table:   CNO    CNAME   CADDR   CREGI
Product table:    PNO    PNAME   PGRP    PCLR
Sales table:      SPNO   SPNAM   SPDSG   SPDEP

Step 5: Normalize the entity table:
1. Identify the key columns and non-key columns.
For the customer table:
Key column   Non-key columns
CNO          CNAME, CADDR, CREGI

2. i. Find out the relationship between the key column and each non-key column.

CREGI -> CNO

This indicates that customer region and customer number have a one-to-many relationship.
Data redundancy is possible because of the one-to-many relationship, i.e. the same
region value is repeated for many customers.

CNO    CNAME   CADDR     CREGI
C100   XYZ     HYD       SOUTH
C200   ABC     HYD       SOUTH
C300   SSS     CHENNAI   SOUTH
C400   ABZ     DELHI     NORTH
C500   ACX     DELHI     NORTH

Since CREGI has a one-to-many relationship with CNO (i.e. one region can contain 'n'
customers), the CREGI data can be split into a separate table as below.

Region table (primary key: CREGNO):

CREGNO   CREGI
10       SOUTH
20       NORTH

Customer table (primary key: CNO; foreign key: CREGNO):

CNO    CNAME   CADDR     CREGNO
C100   XYZ     HYD       10
C200   ABC     HYD       10
C300   SSS     CHENNAI   10
C400   ABZ     DELHI     20
C500   ACX     DELHI     20

Here the customer table refers to the region through the numeric key CREGNO, because
numeric values are processed faster than alphanumeric values.

ii. CADDR -> CNO

Customer address and customer number have a one-to-many relationship, like
customer region.

iii. CNAME -> CNO

Each customer number has its own unique customer name, so the data cannot be duplicated
and there is no need to split the table.

- If there is a one-to-many relationship between the key column and a non-key column,
we should split the table into two tables and connect them with a primary key and
foreign key relationship.
- If there is a one-to-one relationship between the key column and a non-key column,
there is no need to split the table.
- If there is a many-to-many relationship between the key column and a non-key column,
we should split the table into two tables and connect them with a primary key and
foreign key relationship.
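
The rules above can be tried out on the customer table with a short Python sketch. The function relationship_to_key and its classification labels are assumptions made for this illustration; it only distinguishes one-to-one from one-to-many by checking whether a non-key value is shared by several keys.

```python
from collections import defaultdict

COLUMNS = ("CNO", "CNAME", "CADDR", "CREGI")
customer_rows = [
    ("C100", "XYZ", "HYD", "SOUTH"),
    ("C200", "ABC", "HYD", "SOUTH"),
    ("C300", "SSS", "CHENNAI", "SOUTH"),
    ("C400", "ABZ", "DELHI", "NORTH"),
    ("C500", "ACX", "DELHI", "NORTH"),
]


def relationship_to_key(rows, column_index, key_index=0):
    """Classify a non-key column relative to the key column (hypothetical helper)."""
    keys_per_value = defaultdict(set)
    for row in rows:
        keys_per_value[row[column_index]].add(row[key_index])
    # If any value is shared by several keys, the value repeats in the table.
    if any(len(keys) > 1 for keys in keys_per_value.values()):
        return "one-to-many"
    return "one-to-one"


# Columns whose values repeat are candidates for a separate table.
split_candidates = [
    COLUMNS[i]
    for i in range(1, len(COLUMNS))
    if relationship_to_key(customer_rows, i) == "one-to-many"
]
print(split_candidates)
```

For this sample data the address and region columns are flagged, while the customer name is not, which matches cases i to iii above.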

The product table can be normalized in the same way.

Step 6: Form the intersection entity table (transaction data table):
(primary key: BILLNO + IDNO; foreign keys: CNO, PNO, SPNO)

BILLNO   IDNO   CNO    PNO    SPNO   P   Q    TOTAL
B001     10     C100   P100   S100   5   10   50
B002     20     C100   P200   S100   6   10   60

Intersection entity table

Step 7: Normalize the intersection entity tables:
(i.e. split the entity tables in order to avoid redundancy of data)
In an OLTP system data is stored according to the ERM (Entity Relationship Model), which
is completely normalized.

When looking at the ER model we should be able to identify:

1. Intersection entity:
The connection between the stronger entities and the attributes of the stronger
entities is called the intersection entity.

2. Stronger entity:
Any table which is directly connected to the intersection entity table by a
primary key and foreign key relationship.

3. Attributes of stronger entity:
The entity tables which are connected to the stronger entities.
Database design in OLAP:
OLAP uses Multidimensional Modeling (MDM).
MDM comes in different schemas:
1. STAR schema or traditional schema.
2. Extended star schema or BW star schema.
3. Snowflake.
4. Hybrid.

1. STAR schema/Traditional schema:
It offers the flexibility of analyzing data from multiple angles. It consists of a cube.
The table at the centre is called the fact table/transaction data table, and the tables
surrounding the fact table are called dimension tables/master data tables. These tables
are connected by primary key and foreign key relationships. All these tables exist in a cube.

Transaction table:
It stores all the transaction data.

Master Data Table:
These tables store the master data.

Dimension tables:
When the master data tables act as the sides of a cube, they are called
dimension tables.

Fact table:
Records in it cannot be changed, but we can add records.
In a fact table there are 2 types of columns:
1. Characteristics:
The basis on which we analyze the key figures.
Ex: When we analyze revenue by customer, revenue is the key figure and
customer is the characteristic.

2. Key figures:
These are the quantitative measures, i.e. what we are going to analyze.

Fact table (intersection entity):

CID   MID   SID   PR   QTY   REV
C1    M1    S1    4    2     8
C2    M1    S1    6    4     24
C1    M1    S2    3    5     15
C2    M1    S2    6    4     24

From the above table:

Characteristics   Key figures
CID               PRICE (PR)
MID               QUANTITY (QTY)
SID               REVENUE (REV)
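
To make the roles of characteristics and key figures concrete, the fact table above can be summed up in Python. This is a small sketch: the function total_by is a hypothetical helper, and the rows are the four fact table records.

```python
from collections import defaultdict

# Fact table rows: characteristics CID, MID, SID and key figures PR, QTY, REV.
fact_rows = [
    {"CID": "C1", "MID": "M1", "SID": "S1", "PR": 4, "QTY": 2, "REV": 8},
    {"CID": "C2", "MID": "M1", "SID": "S1", "PR": 6, "QTY": 4, "REV": 24},
    {"CID": "C1", "MID": "M1", "SID": "S2", "PR": 3, "QTY": 5, "REV": 15},
    {"CID": "C2", "MID": "M1", "SID": "S2", "PR": 6, "QTY": 4, "REV": 24},
]


def total_by(rows, characteristic, key_figure):
    """Sum one key figure per value of one characteristic (hypothetical helper)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[characteristic]] += row[key_figure]
    return dict(totals)


# The same key figure analyzed from two different angles.
revenue_by_customer = total_by(fact_rows, "CID", "REV")
revenue_by_sales_person = total_by(fact_rows, "SID", "REV")

print(revenue_by_customer)      # {'C1': 23, 'C2': 48}
print(revenue_by_sales_person)  # {'S1': 32, 'S2': 39}
```

Each characteristic gives one angle of analysis; the key figure being summed stays the same.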

Principles of STAR schema:
- Each characteristic column in the fact table is connected to one master
data table/dimension table.
- The maximum number of characteristics in a fact table is 16.
- Hence the maximum number of angles from which we can analyze the data is 16.

Difference between ERM and STAR schema:

ERM                      STAR schema
Normalized data          Denormalized data
2-dimensional modeling   Multidimensional modeling (MDM)

Advantages of STAR schema:
1. Data can be analyzed from multiple angles.

Disadvantages of STAR schema:
1. Master data is not reusable.
2. Degraded performance (alphanumeric keys in the fact table are processed more slowly
than numeric keys).
3. Limited analysis.

2. Extended STAR schema/BW STAR schema:
This is the STAR schema plus SID technology (SID = Surrogate ID).
Simply changing the data type of CID to NUMC does not help: the value is still treated
as a character string, so there is no improvement in performance.

SID table:
- Every master data table has its own SID table.
- The SID table, like the master data table, lies outside the cube.
- For every record in the master data table, an SID is generated automatically in
the SID table.
- The SID (e.g. SID_CID) is always numeric. By implementing SID tables we can improve
performance.

When does a SID get generated?
For every record created in a master data table, a SID is created in the corresponding
SID table.
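
The SID idea can be sketched in Python as a lookup that hands out one numeric ID per master data key. This is only an illustration of the concept: the class SidTable and its method sid_for are hypothetical names, and the sketch does not describe how SAP BW implements SID tables internally.

```python
from itertools import count


class SidTable:
    """Hypothetical sketch: one numeric surrogate ID per master data key."""

    def __init__(self, start=1):
        self._next_sid = count(start)
        self._sid_by_key = {}

    def sid_for(self, key):
        # A new SID is generated only the first time a master data key is seen.
        if key not in self._sid_by_key:
            self._sid_by_key[key] = next(self._next_sid)
        return self._sid_by_key[key]


customer_sids = SidTable()
for cno in ["C1", "C2"]:      # one SID per customer master record
    customer_sids.sid_for(cno)

print(customer_sids.sid_for("C1"))
print(customer_sids.sid_for("C2"))
```

Asking for the SID of a known key again returns the same number, so the alphanumeric key is stored once and the numeric SID is used everywhere else.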

Dimension table:
To widen the analysis, dimension tables rather than SID tables are placed in the
cube; they act as a mediator between the SID tables and the fact table.

- When transaction data (taken from the OLTP system) is loaded into the cube, the
dimension IDs are generated.
- Only one dimension ID is created for each entity from the master data table.
- We can connect 248 master data tables to one dimension table.
- The maximum number of columns in any table is 255.
- Out of the 255 columns, 6 are used for internal purposes of the cube and
one is used for the dimension key, which leaves 248 columns.
- Therefore a fact table can have 16 dimension tables, and each dimension
table can have 248 SID tables.
- The maximum number of master data tables we can have for a fact table is
16 x 248.
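
The arithmetic behind these limits can be written out as a quick check. The constant names are chosen for this sketch; the figures themselves are the ones stated in the list above.

```python
MAX_COLUMNS_PER_TABLE = 255          # maximum number of columns in any table
INTERNAL_COLUMNS = 6                 # used for internal purposes of the cube
DIMENSION_KEY_COLUMNS = 1            # the dimension key itself
MAX_DIMENSIONS_PER_FACT_TABLE = 16   # dimension tables per fact table

# Columns left in one dimension table for SIDs of master data tables.
sid_columns_per_dimension = (
    MAX_COLUMNS_PER_TABLE - INTERNAL_COLUMNS - DIMENSION_KEY_COLUMNS
)
# Upper bound on master data tables reachable from one fact table.
max_master_tables_per_fact_table = (
    MAX_DIMENSIONS_PER_FACT_TABLE * sid_columns_per_dimension
)

print(sid_columns_per_dimension)         # 248
print(max_master_tables_per_fact_table)  # 3968
```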

Customer master data table (primary key: CNO), 2 records:

CNO   CNAME   CADD   CREGI
C1    ABC     HYD    SOUTH
C2    XYZ     BANG   WEST

Material master data table (primary key: MNO), 3 records:

MNO   MNAME   MDESC
M1    A       ----------
M2    B       ----------
M3    C       ----------

Customer SID table (primary key: SID_CID; foreign key: CNO), 2 records:

SID_CID   CNO
1         C1
2         C2

Material SID table (primary key: SID_MID; foreign key: MNO), 3 records:

SID_MID   MNO
3         M1
4         M2
5         M3

Customer dimension table (primary key: DI_CID; foreign key: SID_CID), 2 records:

DI_CID   SID_CID
9        1
10       2

Material dimension table (primary key: DI_MID; foreign key: SID_MID), 3 records:

DI_MID   SID_MID
11       3
12       4
13       5

Dimension tables: 2 + 3 = 5 records in total.

FACT TABLE (foreign keys: DI_CID, DI_MID):

DI_CID   DI_MID   PRICE   QUANTY   REVENUE
9        11       5       10       50
9        12       6       10       60
10       11       7       7        49

The dimension tables and the fact table together form the INFOCUBE; the master data
tables and SID tables lie outside it.
Here all the keys in the fact table are numeric, so the extended star schema gives better
performance.
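
The chain from a fact table row back to readable master data can be followed in Python. This is a sketch built from the sample values of the tables in this section; the dictionaries and the function resolve are hypothetical stand-ins for the database tables and joins.

```python
# Master data tables (outside the cube), keyed by their primary key.
customer_master = {"C1": "ABC", "C2": "XYZ"}          # CNO -> CNAME
material_master = {"M1": "A", "M2": "B", "M3": "C"}   # MNO -> MNAME

# SID tables: numeric surrogate ID -> master data key.
customer_sid = {1: "C1", 2: "C2"}                     # SID_CID -> CNO
material_sid = {3: "M1", 4: "M2", 5: "M3"}            # SID_MID -> MNO

# Dimension tables (inside the cube): dimension ID -> SID.
customer_dim = {9: 1, 10: 2}                          # DI_CID -> SID_CID
material_dim = {11: 3, 12: 4, 13: 5}                  # DI_MID -> SID_MID

# Fact table rows: (DI_CID, DI_MID, PRICE, QUANTY, REVENUE), all numeric.
fact_table = [(9, 11, 5, 10, 50), (9, 12, 6, 10, 60), (10, 11, 7, 7, 49)]


def resolve(fact_row):
    """Follow fact -> dimension -> SID -> master data for one row."""
    di_cid, di_mid, _price, _quantity, revenue = fact_row
    cno = customer_sid[customer_dim[di_cid]]
    mno = material_sid[material_dim[di_mid]]
    return {
        "customer": customer_master[cno],
        "material": material_master[mno],
        "revenue": revenue,
    }


report = [resolve(row) for row in fact_table]
for line in report:
    print(line)
```

Only numeric IDs are stored in the fact table; the alphanumeric names are looked up at the end of the chain when a report is displayed.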

Here the customer master data table has 2 records and so does its SID table, because a
SID is generated for every record entered in the master data table. Likewise the material
master data table and its SID table have 3 records each.
Therefore each dimension table can hold at most as many records as its respective
master data table.
Here the fact table can have 16 characteristics, each with its own dimension table, so
there are 16 dimension tables in an InfoCube. Each dimension table is connected to a SID
table, and each SID table to a master data table.
This still limits the analysis to 16 characteristics. In order to widen the analysis, we
can connect several SID tables to one dimension table, as shown below:

Customer master data table, 2 records:

CNO   CNAME   CADD
C1    ABC     HYD
C2    XYZ     BAN

Material master data table, 3 records:

MNO   MNAME
M1    A
M2    B
M3    C

Customer SID table, 2 records:

SID_CID   CNO
1         C1
2         C2

Material SID table, 3 records:

SID_MID   MNO
3         M1
4         M2
5         M3

DIMENSION TABLE (total 6 records), columns:

DI   SID_CID   SID_MID

FACT TABLE, columns:

DI   P   Q   REVENUE

The dimension table and the fact table form the INFOCUBE.

- The maximum number of records in a dimension table depends on the number of
records in the SID table of each master data table.

In order to design an optimized InfoCube, we have to concentrate on reducing the
number of dimension tables in the cube and, at the same time, on reducing
the number of records in each dimension table.
If two master data tables have a many-to-many relationship, it is preferable to
have two dimension tables in order to reduce the number of records.
If two master data tables have a one-to-many relationship, it is preferable to
have one dimension table.
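
The effect of these two rules on the number of dimension records can be estimated with a short Python sketch. The function dimension_rows and the sample pairs are assumptions made for illustration; the sketch counts one dimension record per distinct combination of values.

```python
def dimension_rows(pairs):
    """Records needed when both characteristics share one dimension table:
    one record per distinct combination (hypothetical helper)."""
    return len(set(pairs))


customers = ["C1", "C2"]
materials = ["M1", "M2", "M3"]

# Many-to-many: in the worst case every customer buys every material.
many_to_many = [(c, m) for c in customers for m in materials]
combined_rows = dimension_rows(many_to_many)      # 2 x 3 combinations
separate_rows = len(customers) + len(materials)   # two dimension tables

# One-to-many: each customer belongs to exactly one region, so the combined
# dimension never has more records than there are customers.
customer_region = [("C1", "SOUTH"), ("C2", "WEST"), ("C1", "SOUTH")]
one_to_many_rows = dimension_rows(customer_region)

print(combined_rows, separate_rows, one_to_many_rows)
```

With two customers and three materials the difference is small (6 records against 5), but it grows with the product of the table sizes: 1,000 customers and 500 materials could need up to 500,000 records in one combined dimension, against 1,500 records in two separate dimension tables.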
