SAP HANA Smart Data Integration and SAP HANA Smart Data Quality 2.0 SP00
Document Version: 1.0 2016-12-07
Master Guide
Content
1 Getting Started
1.1 Overview
1.2 About This Document
2 Use Cases
3 Deployment Options
3.1 Deployment in High Availability Scenarios
4 Architecture
5 Components
Master Guide
2 PUBLIC Content
1 Getting Started
SAP HANA smart data integration and SAP HANA smart data quality provide a set of tools and processes that let
you connect to any source, provision and cleanse data, and load it into SAP HANA on premise or in the cloud.
1.1 Overview
The SAP HANA smart data integration and SAP HANA smart data quality options provide tools to access source
data, and provision, replicate, and transform that data in SAP HANA on-premise or in the cloud.
The smart data integration and smart data quality options let you enhance, cleanse, and transform data to make it
more accurate and useful. These options let you efficiently connect to any source to provision and cleanse data
for loading into SAP HANA on-premise or in the cloud, and for supported systems, write back to the original
source.
Capabilities include:
A simplified landscape, that is, one environment in which to provision and consume data.
Access to more data formats including an open framework for new data sources.
In-memory performance, which means increased speed and decreased latency.
SAP HANA smart data integration
Real-time, high-speed data provisioning, bulk data movement, and federation. Provides built-in adapters plus an SDK so you can build your own.
Replication Editor in the SAP HANA Web-based Development Workbench, which lets you set up batch or real-time data replication scenarios in an easy-to-use web application
Transformations presented as nodes in SAP HANA Web IDE and SAP HANA Web-based Development Workbench, which lets you set up batch or real-time data transformation scenarios
Data Provisioning Agent, a lightweight component that hosts data provisioning adapters, enabling data federation, replication, and transformation scenarios for on-premise or in-cloud deployments
Data Provisioning adapters for connectivity to remote sources
Adapter SDK to create custom adapters
Monitors for Data Provisioning Agents, remote subscriptions, and data loads, accessible from the SAP HANA cockpit

SAP HANA smart data quality
Real-time, high-performance data cleansing, address cleansing, and geospatial data enrichment. Provides an intuitive interface to define data transformation flowgraphs in SAP HANA Web IDE and SAP HANA Web-based Development Workbench.
1.2 About This Document
This Master Guide is the central starting point for the technical implementation of SAP HANA smart data
integration and SAP HANA smart data quality. It covers the following topics:
Overview
Architecture
Software components
Deployment scenarios
2 Use Cases
You can use SAP HANA smart data integration and SAP HANA smart data quality to replicate or transform data
from remote sources.
Replication
You can replicate data (batch and real time) into an SAP HANA system (on premise or in the cloud).

Transformation
You can transform data (batch and real time) on the way to the SAP HANA system (on premise or in the cloud).
An example of transforming data is cleansing data using smart data quality.
3 Deployment Options
Common deployment options for SAP HANA systems, Data Provisioning Agents, and source systems are
described.
Landscape Description
Using SAP HANA on premise or in the cloud is a choice of deployment. Here are some things to keep in mind when
deciding which deployment to use. If your deployment includes SAP HANA in the cloud and a firewall between
SAP HANA and the Data Provisioning Agent:
The Data Provisioning Proxy must be deployed. This is done by downloading and deploying the HANA_IM_DP
delivery unit.
The Data Provisioning Agent must be configured to communicate with SAP HANA using HTTP. This is done
using the Data Provisioning Agent Configuration tool.
A single Data Provisioning Agent cannot be registered in multiple SAP HANA instances.
You may have multiple instances of the Data Provisioning Agent installed on multiple machines. For example,
a developer may want to have a Data Provisioning Agent installed on their computer to work on a custom
adapter.
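The deployment rules above can be expressed as a small checklist. The following Python sketch is illustrative only: the `Agent` and `Landscape` classes, their field names, and the protocol labels are hypothetical and are not part of any SAP tool or API. It simply encodes the constraints stated in this section.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    protocol: str  # "TCP" or "HTTP" (illustrative labels, not SAP settings)
    registered_with: list[str] = field(default_factory=list)


@dataclass
class Landscape:
    hana_in_cloud: bool
    firewall_between: bool
    proxy_deployed: bool  # True once the HANA_IM_DP delivery unit is deployed
    agents: list[Agent] = field(default_factory=list)


def check_landscape(landscape: Landscape) -> list[str]:
    """Return the rule violations found in a landscape (empty list if none)."""
    problems = []
    # Rules that apply only to SAP HANA in the cloud with a firewall in between.
    if landscape.hana_in_cloud and landscape.firewall_between:
        if not landscape.proxy_deployed:
            problems.append("Data Provisioning Proxy (HANA_IM_DP) is not deployed")
        for agent in landscape.agents:
            if agent.protocol != "HTTP":
                problems.append(f"{agent.name}: must communicate using HTTP")
    # Rule that applies to every landscape.
    for agent in landscape.agents:
        if len(agent.registered_with) > 1:
            problems.append(f"{agent.name}: registered in multiple SAP HANA instances")
    return problems
```

Calling `check_landscape` on a description of a planned landscape returns one message per violated rule, which can serve as a review aid before installation.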
3.1 Deployment in High Availability Scenarios
In addition to installing SAP HANA in a multiple-host configuration, you can use agent grouping to provide
automatic failover and load balancing for SAP HANA smart data integration and SAP HANA smart data quality
functionality in your landscape.
In a multiple-host SAP HANA system, the Data Provisioning Server runs only on the active worker host. If the active
worker host fails, the Data Provisioning Server is automatically started on the standby host when it takes over, and
any active replication tasks are resumed.
Note
Load-balancing is not supported by the Data Provisioning Server.
For more information about installing SAP HANA in a multiple-host configuration, see the SAP HANA Server
Installation and Update Guide.
Agent grouping provides automatic failover for connectivity to data sources accessed through Data Provisioning
Adapters.
When an agent that is part of a group is inaccessible for longer than the configured heartbeat time limit,
the Data Provisioning Server chooses a new active agent within the group and resumes replication for any remote
subscriptions active on the original agent.
Any remote subscriptions between the Queue and Distribute states at the time the agent became unavailable are
logged as exceptions, and must be manually reset with new Queue and Distribute commands.
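The failover decision described above can be sketched as follows. This is a simplified Python illustration, not the Data Provisioning Server's implementation: the `AgentStatus` class and `pick_active_agent` function are hypothetical, and choosing the first reachable agent in list order is an assumption, because this guide does not state how the new active agent is selected.

```python
from dataclasses import dataclass


@dataclass
class AgentStatus:
    name: str
    last_heartbeat: float  # timestamp in seconds, e.g. from time.monotonic()


def pick_active_agent(group, active_name, now, heartbeat_limit):
    """Return the name of the agent that should be active, or None.

    The current agent stays active while its last heartbeat is within the
    limit. Otherwise the first other reachable agent in the group is chosen
    (assumed policy). None means no agent in the group is reachable.
    """
    def reachable(agent):
        return now - agent.last_heartbeat <= heartbeat_limit

    by_name = {agent.name: agent for agent in group}
    if active_name in by_name and reachable(by_name[active_name]):
        return active_name
    for agent in group:
        if agent.name != active_name and reachable(agent):
            return agent.name
    return None
```

In this sketch, a change in the returned name corresponds to the point at which replication would be resumed on another agent. Subscriptions caught between the Queue and Distribute states still need the manual reset described above.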
Agent grouping also provides load balancing between individual agent hosts by using a round robin policy.
For example, with multiple agents in the group and a remote source configured on the group, each smart data
access request to the remote source is load balanced by rotating through the agents in the group.
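As a rough illustration of the round robin policy, the following Python sketch rotates requests through the agents of a group. The `RoundRobinGroup` class and the agent names are hypothetical and do not correspond to an SAP API.

```python
from itertools import cycle


class RoundRobinGroup:
    """Rotates through the agents of a group, one agent per request."""

    def __init__(self, agent_names):
        names = list(agent_names)
        if not names:
            raise ValueError("an agent group needs at least one agent")
        self._rotation = cycle(names)

    def next_agent(self):
        # Each call returns the next agent in order and wraps around at the end.
        return next(self._rotation)
```

With three agents in the group, five consecutive requests would be served by the first, second, third, first, and second agent in turn.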
For changed-data capture and real-time operations, an agent in the group is selected when a QUEUE command is
executed on a remote subscription. Any following CDC requests are directed to the agent selected during the
QUEUE command.
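The difference from per-request rotation is that a remote subscription stays with one agent. A minimal Python sketch of that behavior follows. The `CdcRouter` class is hypothetical, and rotating through the group at QUEUE time is an assumption: this guide states only that an agent is selected when the QUEUE command runs, not how.

```python
class CdcRouter:
    """Binds each remote subscription to the agent chosen at QUEUE time."""

    def __init__(self, agent_names):
        self._agents = list(agent_names)
        if not self._agents:
            raise ValueError("an agent group needs at least one agent")
        self._next = 0
        self._assigned = {}  # subscription name -> agent name

    def queue(self, subscription):
        # Assumed policy: rotate through the group when QUEUE is executed.
        agent = self._agents[self._next % len(self._agents)]
        self._next += 1
        self._assigned[subscription] = agent
        return agent

    def route_cdc(self, subscription):
        # Later CDC requests go to the agent selected during QUEUE.
        return self._assigned[subscription]
```

Repeated calls to `route_cdc` for the same subscription return the same agent until `queue` is called again for that subscription.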
For complete information about configuring agent groups, see the Administration Guide for SAP HANA Smart Data
Integration and SAP HANA Smart Data Quality.
4 Architecture
These diagrams represent common deployment architectures for using smart data integration and smart data
quality with SAP HANA.
In all deployments, the basic components are the same. However, the connections between the components may
differ depending on whether SAP HANA is deployed on premise, in the cloud, or behind a firewall.
Figure 2: SAP HANA deployed in the cloud or behind a firewall
The following tables explain the diagram and the network connections in more detail.
Outbound Connections
Data Provisioning Agent
When SAP HANA is deployed on premise, the Data Provisioning Server within SAP HANA connects to the agent using the
TCP/IP protocol.
Port: 5050
Inbound Connections
Data Provisioning Agent
When SAP HANA is deployed in the cloud or behind a firewall, the Data Provisioning Agent connects to the SAP HANA XS
engine using the HTTP/S protocol.
Ports: 80xx, 43xx
Note
When the agent connects to SAP HANA in the cloud over HTTP/S, data is automatically gzip compressed to minimize
the required network bandwidth.
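The two connection modes can be summarized in a short Python sketch. The `connection_profile` and `encode_payload` functions and the dictionary keys are hypothetical and only restate the connection details above. Leaving the TCP/IP path uncompressed is an assumption, since this guide mentions gzip compression only for HTTP/S connections to SAP HANA in the cloud.

```python
import gzip


def connection_profile(hana_in_cloud_or_behind_firewall: bool) -> dict:
    """Describe who initiates the connection and over which protocol."""
    if hana_in_cloud_or_behind_firewall:
        return {
            "initiator": "Data Provisioning Agent",
            "target": "SAP HANA XS engine",
            "protocol": "HTTP/S",
            "ports": ["80xx", "43xx"],
            "gzip": True,
        }
    return {
        "initiator": "Data Provisioning Server",
        "target": "Data Provisioning Agent",
        "protocol": "TCP/IP",
        "ports": ["5050"],
        "gzip": False,  # assumption: compression is not described for this path
    }


def encode_payload(payload: bytes, profile: dict) -> bytes:
    """Compress the payload with gzip when the profile asks for it."""
    return gzip.compress(payload) if profile["gzip"] else payload
```

Repetitive row data usually compresses well, which is why compressing on the HTTP/S path reduces the bandwidth needed between the agent and SAP HANA in the cloud.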
5 Components
SAP HANA smart data integration and SAP HANA smart data quality include a number of components that you
need to install, deploy, and configure.
Data Provisioning Server
The Data Provisioning Server is a native SAP HANA process. It is built as an index server variant, runs in the SAP
HANA cluster, and is managed and monitored just like other SAP HANA services. It provides out-of-the-box native
connectivity for many sources and connectivity to the Data Provisioning Agent.
The Data Provisioning Server is installed with, but must be enabled in, the SAP HANA Server.

Data Provisioning Agent
The Data Provisioning Agent is a container running outside the SAP HANA environment, but it is managed by the Data
Provisioning Server. It provides connectivity for all those sources where the driver cannot run inside the Data
Provisioning Server. Through the Data Provisioning Agent, the preinstalled Data Provisioning Adapters communicate
with the Data Provisioning Server for connectivity, metadata browsing, and data access. The Data Provisioning Agent
also hosts custom adapters created using the Adapter SDK.
The Data Provisioning Agent is installed separately from the SAP HANA server or client.

HANA_IM_DP delivery unit
The HANA_IM_DP delivery unit bundles monitoring and administration capabilities and the Data Provisioning Proxy
for connecting to SAP HANA in the cloud.
The delivery unit includes the Data Provisioning administration application, the Data Provisioning Proxy, and the
Data Provisioning monitor.

Data Provisioning admin application
The Data Provisioning administration application is an XS application that manages the administration functions
of the Data Provisioning Agent with SAP HANA in the cloud.

Data Provisioning Proxy
The Data Provisioning Proxy is an XS application that acts as a proxy to provide communication between the Data
Provisioning Agent and Data Provisioning Server when SAP HANA runs in the cloud. When SAP HANA is in the cloud, the
agent uses HTTP(S) to connect to the Data Provisioning Proxy in the XS Engine, which eliminates the need to open
additional ports in corporate IT firewalls.

Data Provisioning monitor
The Data Provisioning monitor is a browser-based interface that lets you monitor agents, tasks, and remote
subscriptions created in the SAP HANA system. You can view the monitors by directly entering the URL of each
monitor into a web browser or by accessing the smart data integration links in the SAP HANA cockpit, a web-based
launchpad that is installed with SAP HANA Server.
Enable Data Provisioning monitoring functionality (for agents, data loads, and remote subscriptions) by creating
the statistics tables and deploying the HANA_IM_DP delivery unit.

SAP HANA Web-based Development Workbench Replication Editor
The SAP HANA Web-based Development Workbench, which includes the Replication Editor to set up replication tasks,
is installed with SAP HANA Server.

SAP HANA Web-based Development Workbench Flowgraph Editor
The SAP HANA Web-based Development Workbench Flowgraph Editor provides an interface to create data provisioning
and data quality transformation flowgraphs.

Application function modeler
The application function modeler provides an interface to create data provisioning and data quality transformation
flowgraphs.
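As a quick reference, the delivery information in the component descriptions above can be captured in a lookup table. The following Python sketch is illustrative: the `DELIVERED_WITH` mapping and `delivered_by` function are hypothetical names, and the "separate installation" key is a label chosen for this example. Components whose delivery is not stated in this guide are omitted.

```python
# Which package or installation provides each component, as stated in this guide.
DELIVERED_WITH = {
    "SAP HANA Server": [
        "Data Provisioning Server",  # installed with the server, must be enabled
        "SAP HANA Web-based Development Workbench",
        "SAP HANA cockpit",
    ],
    "HANA_IM_DP delivery unit": [
        "Data Provisioning administration application",
        "Data Provisioning Proxy",
        "Data Provisioning monitor",
    ],
    "separate installation": [
        "Data Provisioning Agent",
    ],
}


def delivered_by(component: str) -> str:
    """Return the package that provides a component, or raise KeyError."""
    for package, components in DELIVERED_WITH.items():
        if component in components:
            return package
    raise KeyError(component)
```

For example, looking up the Data Provisioning Proxy returns the HANA_IM_DP delivery unit, which matches the deployment requirement for SAP HANA in the cloud.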
6 Summary of Workflow and Tasks
Implementation and use of SAP HANA smart data integration and SAP HANA smart data quality require the
installation, enablement, and deployment of various components across the SAP HANA landscape. Some of these
components are required and some are optional, depending on how you want to use these options.
The following table provides you with an overview of the various tasks that administrators and developers need to
perform to enable smart data integration and smart data quality.
Enable the Data Provisioning Server
Performed by: Administrator
Tool to use: SAP HANA studio
Where to find more information: Administration Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality
Notes: The DP Server is disabled by default.

Enable the Script Server
Performed by: Administrator
Tool to use: SAP HANA studio
Where to find more information: Administration Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality
Notes: The Script Server is disabled by default. Enable it only if you plan on using smart data quality.

Task: Create custom adapters

Sub-task: Install Data Provisioning Framework
Performed by: Adapter developer
Tool to use: SAP HANA studio
Where to find more information: Adapter SDK Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality

Sub-task: Import Launch Configuration
Performed by: Adapter developer
Tool to use: SAP HANA studio
Where to find more information: Adapter SDK Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality

Sub-task: Install PDE in order to export developed plugin
Performed by: Adapter developer
Tool to use: SAP HANA studio
Where to find more information: Adapter SDK Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality

Sub-task: Export plug-in project as JAR file
Performed by: Adapter developer
Tool to use: SAP HANA studio
Where to find more information: Adapter SDK Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality

Task: Configure data movement

Sub-task: Replication, using SAP HANA Web-based Development Workbench
Performed by: ETL Developer / Application Consultant
Tool to use: SAP HANA Web-based Development Workbench
Where to find more information: Configuration Guide for SAP HANA Smart Data Integration and SAP HANA Smart Data Quality
Important Disclaimers and Legal Information
Coding Samples
Any software coding and/or code lines / strings ("Code") included in this documentation are only examples and are not intended to be used in a productive system
environment. The Code is only intended to better explain and visualize the syntax and phrasing rules of certain coding. SAP does not warrant the correctness and
completeness of the Code given herein, and SAP shall not be liable for errors or damages caused by the usage of the Code, unless damages were caused by SAP
intentionally or by SAP's gross negligence.
Accessibility
The information contained in the SAP documentation represents SAP's current view of accessibility criteria as of the date of publication; it is in no way intended to be a
binding guideline on how to ensure accessibility of software products. SAP in particular disclaims any liability in relation to this document. This disclaimer, however, does
not apply in cases of willful misconduct or gross negligence of SAP. Furthermore, this document does not result in any direct or indirect contractual obligations of SAP.
Gender-Neutral Language
As far as possible, SAP documentation is gender neutral. Depending on the context, the reader is addressed directly with "you", or a gender-neutral noun (such as "sales
person" or "working days") is used. If when referring to members of both sexes, however, the third-person singular cannot be avoided or a gender-neutral noun does not
exist, SAP reserves the right to use the masculine form of the noun and pronoun. This is to ensure that the documentation remains comprehensible.
Internet Hyperlinks
The SAP documentation may contain hyperlinks to the Internet. These hyperlinks are intended to serve as a hint about where to find related information. SAP does not
warrant the availability and correctness of this related information or the ability of this information to serve a particular purpose. SAP shall not be liable for any damages
caused by the use of related information unless damages have been caused by SAP's gross negligence or willful misconduct. All links are categorized for transparency
(see: http://help.sap.com/disclaimer).
go.sap.com/registration/contact.html