2018-02-14
1 Overview (page 4)
2 What's New (page 6)
4 Configuration Tasks (page 13)
4.1 Enable and Deploy the Service (page 13)
4.2 Install SAP Predictive Service Engine (page 14)
4.3 Create the Technical Database User (page 15)
4.4 Bind the Data Source (page 15)
4.5 Assign Roles to Users (page 16)
4.6 Start the Service and Check the Binding (page 17)
5 Business Services (page 18)
5.1 Service Description (page 18)
5.2 Architecture Overview (page 20)
5.3 Synchronous Mode Versus Asynchronous Mode (page 21)
5.4 Use the Service (page 21)
5.5 REST API Quick Reference (page 25)
Clustering APIs (page 26)
Dataset APIs (page 32)
Forecasts APIs (page 45)
Key Influencers APIs (page 52)
Outliers APIs (page 58)
Recommendation APIs (page 62)
Scoring Equation APIs (page 77)
What If APIs (page 81)
Job Response Body Parameters (page 86)
5.6 Usage Scenarios (page 87)
Creating Clusters with Either High or Low Target Rate (page 87)
5.7 Error Messages Explained (page 96)
General Service Parameter Error Messages (EXX) (page 97)
Database Error Messages (EDB) (page 98)
Dataset Service Error Messages (EDS) (page 99)
Job Access Error Messages (EJB) (page 100)
Modeling Parameter Error Messages (EMO) (page 100)
The SAP Predictive Service User Guide is your documentation reference for learning how to access and
consume the machine learning services provided on the SAP Cloud Platform.
The SAP Predictive service is a service available on the SAP Cloud Platform for enabling applications with
predictive capabilities. Using the service, an application can analyze the data stored in an SAP HANA instance
to get insights and make predictions.
The SAP Predictive service offers two collections of RESTful web services that you deploy on the platform as
one application:
● Business Services
These services allow business analysts to get insights from data. Each service answers a specific
business question and therefore returns a specific type of insight.
● Predictive Analytics Integrator Services
They allow non-predictive cloud applications to easily integrate and consume predictive models. These
services enable the productive utilization of predictive models within the context of real-life business
processes.
You deploy this application on the instance of the SAP Cloud Platform of your company or customer before
using it.
Audience
You are able to understand business questions and you explore your data in search of new insights. Learn more
about the services in Business Service Description [page 18] to find which insights the predictive service can
provide.
You know the SAP Cloud Platform cockpit and how to deploy the services on the platform instance. You assign
roles to future users of the service. You create the technical database user and bind the service to the database
instance. See Configuration Tasks [page 13].
You address end-user needs identified by the business user. You develop the cloud application by using the
service that your cloud administrator has deployed. You are familiar with the REST architecture style and
are able to develop cloud applications based on REST APIs using Java or HTML5. You also know the OData 2.0
specification and are able to understand the entity model that supports the Predictive Analytics Integrator
services. REST APIs require a minimum amount of programming. See the REST API Quick Reference [page
25] for the Business Services and the OData REST API Quick Reference [page 103] for the Predictive
Analytics Integrator Services.
You have programming and data mining skills and you are able to help the developer to implement model
management tasks. See the OData REST API Quick Reference [page 103].
Learn More
This guide describes the services available in this release and how to install and deploy them on the instance.
The following table describes other information available for you to learn more about the predictive service and
the underlying Automated Analytics concepts.
● Automated Analytics User Guides and Scenarios on the SAP Help Portal, section User Guides and Scenarios: The SAP Predictive Analytics user guide for classification, regression, segmentation, and clustering scenarios. This guide contains additional information about variables and how to use them.
● SAP HANA Automated Predictive Library Reference Guide on the SAP Help Portal, section Development: The reference for SAP HANA APL. This provides information on the predictive business functions such as forecast, key influencers, and scoring equation.
● Introducing SAP Predictive service: A set of videos that show you how to get started with the predictive service APIs, how they work, and how to use them.
● SAP API Business Hub - Business Services: Explore, test, and consume the Business Services through the SAP API Business Hub.
● SAP API Business Hub - Predictive Analytics Integrator Services: Explore, test, and consume the Predictive Analytics Integrator Services through the SAP API Business Hub.
● Tutorial Catalog: Learn by doing on the SAP Cloud Platform developer center.
The following list provides information about what is new and what has changed in recent releases.

2018-02-14
● A section about data protection and privacy has been added. See Data Protection and Privacy [page 160].
● Release notes are now archived into the user guide. See Archive - Release Notes for SAP Predictive Service 2017 [page 166].

2017-11-21 (version not applicable)
● New: The Feature Scope Description document is released. See Feature Scope Description.

2.02 (2017-09-25)
● Announcement: The service name is now SAP Predictive service.
● The version numbers of the predictive service have been added to the What's New.
● The service now allows end users to get the variables that are correlated to each other, with their coefficient of correlation. See Key Influencers APIs [page 52].

2.00 (2017-06-19)
● New: The new collection of services, Predictive Analytics Integrator Services, is available to add model management tasks to your application. See Service Description [page 101].
● The service now allows end users to specify the schema and the table of the dataset in SAP HANA separately in the request. This is enabled through the location input parameter. hanaURL has been deprecated. See Dataset APIs [page 32].
● The service now allows application users to choose the type of output generated by the scoring equation (predicted value or probability). This is enabled through the predictionOutputType input parameter. See Scoring Equation APIs [page 77].
● The Outliers and Scoring Equation services now allow application users to set the key of the target variable through the targetKey input parameter. See Outliers APIs [page 58] and Scoring Equation APIs [page 77].
● You can now explore, test, and consume the predictive service through the SAP API Business Hub.
● A link to videos has been added to the Overview [page 4]. These videos explain how to deploy the predictive service and how it works.
● The Dataset service now allows application users to modify the value types of the variables. This is enabled through the new API [POST] /api/analytics/dataset/<datasetID>/variables/update.
● The Forecasts service now returns the past data with both the predicted and real values of each data point. This is enabled through the numberOfPastValuesInOutput request parameter.
● The Forecasts service now returns the definition of the trend, cycles, and fluctuation features found in the data and used by the underlying time series model to generate forecasts. This information is available through modelInformation in the output.
The SAP Predictive service offers two sets of web services used to perform predictive analysis on data
resources stored in the SAP HANA database in the cloud.
The predictive service hides the complexity of developing services directly on the predictive model
engine. Web services are easy to understand and provide a simple programming paradigm that is well suited
to web applications in the cloud. The predictive service supports CRUD (Create, Read, Update, Delete)
operations on data over HTTPS, sending requests and receiving responses in JSON format only.
You deploy these two sets as an application on the SAP Cloud Platform instance before using them. A schema
is created on the database to store the data used by the service, such as the service call history and the job
results. The data resources are datasets stored in SAP HANA database tables. They must be registered in the
database schema, which can be different from the one used by the service.
Business Services
These services are REST APIs and allow you to expose the predictive analytics functionalities in the application
you develop.
On each call to a service, a predictive model is built from the dataset and a target by using SAP HANA APL
(included in SAP Predictive service engine). The results returned by the service are retrieved from the
predictive model.
The predictive service provides synchronous and asynchronous predictive model execution. For models that
require a long processing time, asynchronous execution allows a better user experience.
One service returns one type of insight. Each service runs the complete data mining process and allows you
to fine-tune the creation of the model, for example by using the variable auto-selection feature, by correcting
the variable descriptions, or by ignoring columns that contain unnecessary information, such as identifier
columns.
Predictive Analytics Integrator Services
These services are OData REST APIs and allow you to integrate predictive models for consumption in the
application you develop. The OData 2.0 specification is used to represent data objects that describe and
reference the models available.
A series of calls to the services is necessary to train and apply any predictive model stored in the back-end
system. These model management tasks can be fully integrated with the application's own user experience
and workflow.
Before using the predictive service, make sure you complete the following tasks.
Prerequisites
Remember
You need a productive SAP HANA instance to use the service. Your SAP HANA instance requires at least 64
GB of memory.
1. Click Services in the left menu of your SAP Cloud Platform cockpit.
If you have enabled the service, this role has automatically been assigned to you.
c. Go back to the Predictive Service page.
5. Click Go to Service.
You can also change the Compute Unit setting to assign more CPU/RAM to the services.
You must install SAP Predictive service engine on the SAP HANA instance.
1. Click SAP HANA / SAP ASE Database Systems in the left menu, then click the name of your instance.
2. Click Install components.
The option Review and confirm before applying the installation is checked by default. If you uncheck it, click
Install to install SAP Predictive service engine directly; otherwise, click Continue.
5. After the component is prepared for installation, click Install.
You must create the technical database user, which is used to bind the SAP HANA instance to the application.
You create a technical database user that will be used to bind the SAP HANA instance to the application and
grant it the necessary rights to use the service.
1. Connect to your productive SAP HANA instance using an SAP HANA administration tool such as the SAP
HANA cockpit or SAP HANA Studio.
2. Create a new database user.
Remember
You cannot use your own database user as a technical database user for the service. You must create a
new database user.
○ SELECT rights to the schema that contains the datasets that the service will analyze
○ CREATE ANY and INSERT rights to the schema into which the service will write analysis results.
The technical database user has been created. It is granted all necessary rights for using the service.
Bind the SAP HANA instance to the application using the technical database user.
Create a data source binding for the application against the SAP HANA instance. You connect the predictive
service to your SAP HANA instance with the technical database user.
You must keep the default proposed <default> data source name.
Caution
The database user name you set in Custom Logon must be the new technical database user name. It
must not be your own database user name.
See the "Assign Users or Groups to the Roles" section in Managing Roles.
Users with the C4PA-User role can call the service after the application is started. Users with the C4PA-Admin
role can access the administration interface of the service after the service is started.
Next task: Start the Service and Check the Binding [page 17]
1. Go to the Overview page of the application dashboard and click Start to start the aac4paservices
application.
2. Once the application is started, check whether the binding is correct:
a. Click the application URL on the Overview page.
b. Choose the Administration tile and make sure that the status is OK.
c. Choose the Binding tile to see the binding details.
The Business Services allow you to perform predictive analysis operations on a dataset.
Clustering
This service:
● Segments a population into homogeneous clusters
● Sends segmentation results into the desired SAP HANA table or view for visualization purposes
● Creates clusters driven by a target indicator, which means members of a cluster are similar according to some business question (supervised clustering)
Using this service, an end user can for example:
● Group together similar customers
● Identify emerging customer profiles
● Identify interesting clusters they can focus on

Forecasts
This service generates forecasts for a time series. Using this service, an end user can predict the next values of a time series from a reference date.

Key Influencers
This service returns the variables which have an influence on a specified target, ordered by decreasing contribution. Using this service, an end user can for example:
● Have a better understanding of the profile of the targeted population
● Identify the drivers of success and learn how to improve performance
● Have leads on potential causes of a targeted event

Outliers
This service identifies the odd profiles of a dataset whose target indicator is significantly different from what is expected. Using this service, an end user can detect for example:
● Attributes that were not filled correctly
● Potential frauds
● Unconventional profiles

Recommendation
This service creates and uses a recommendation model to generate a list of items to suggest to users. Using this service, an end user can do the following:
● Create a model based on a dataset restricted to a specific time period
● Generate recommendations for a given user or group of users
● Generate recommendations from a basket of items that are not purchased yet
● Update recommendations at the same time as the transaction history is updated
● Estimate the cost and performance of the model before its creation
● Iterate to find the modeling settings with the best balance between costs and capabilities

Scoring Equation
This service exports the scoring equation of a predictive model. Using this service, an end user can for example:
● Integrate the predictive model within an application
● Use the predictive model as many times as wanted
● Apply the predictive model on new data on the fly

What If
This service simulates a planned action and returns the significant deviations compared to what is expected. Using this service, an end user can for example:
● Identify unexpected consequences of an action, such as additional costs, workload re-estimation, and changes in a process
● Gain potential insights by investigating and validating the hidden relationship between the planned action and its consequences
Dataset
This service:
● Registers a dataset stored in another SAP HANA schema for further use with the predictive service
● Retrieves dataset and variable information
The SAP Cloud Platform application you develop consumes the predictive service deployed on your
subaccount.
User data and service metadata are stored in your SAP HANA instance. User data are datasets owned by the
end user and service metadata are service call history and job results (Predictive Service Repository).
It is possible to create a dataset that refers to a table/view in a different schema from the Predictive Service
Repository, as long as schemas are stored in the same SAP HANA instance.
1 The SAP Cloud Platform application you have developed reads/writes user data.
2 The predictive service reads the user data to create dataset objects.
3 SAP HANA APL reads user data for all service processing and writes to it for the Recommendation and Clustering services.
4 The predictive service reads/writes service metadata to the Predictive Service Repository.
The SAP Cloud Platform application also relies on the SAP Cloud Platform services such as administration,
monitoring, and authentication services.
An end user can call a service using either a synchronous or asynchronous mode.
The synchronous mode is convenient for testing services on small datasets, as you receive the results in a single
call. It may not be appropriate for larger datasets or when building the model takes too long. In those cases, the
end user can use the service in asynchronous mode to save time.
In asynchronous mode, the service first creates a job whose ID is returned to the end user. The job then
proceeds the same way as in the synchronous mode, except that the job results are saved for a certain amount
of time. Instead of waiting for the results, the end user can retrieve the job results later, after making sure they
are available.
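To make the flow concrete, the following sketch shows a possible asynchronous round trip with the Key Influencers service, written in the same $.ajax() style as the example later in this guide. The base URL variable root, the job identifier field, and the status value that is checked are assumptions for illustration; refer to the Job Response Body Parameters [page 86] for the exact job fields.

// Minimal sketch of the asynchronous flow (assumes jQuery and a base URL stored in `root`).
// 1. Create the job, 2. poll its status, 3. retrieve the results once the job has finished.
function runKeyInfluencerJob(datasetID, targetColumn) {
  $.ajax({
    type : "POST",
    contentType : "application/json",
    url : root + "/api/analytics/keyinfluencer",                 // asynchronous job creation
    data : JSON.stringify({ "datasetID" : datasetID, "targetColumn" : targetColumn }),
    dataType : 'json',
    success : function(job) {
      pollJobStatus(job.ID);                                     // job identifier field assumed, see [page 86]
    }
  });
}
function pollJobStatus(jobID) {
  $.ajax({
    type : "GET",
    url : root + "/api/analytics/keyinfluencer/" + jobID + "/status",
    dataType : 'json',
    success : function(jobStatus) {
      if (jobStatus.status === "SUCCESSFUL") {                   // status value assumed, see [page 86]
        getJobResult(jobID);
      } else {
        setTimeout(function() { pollJobStatus(jobID); }, 5000);  // try again after 5 seconds
      }
    }
  });
}
function getJobResult(jobID) {
  $.ajax({
    type : "GET",
    url : root + "/api/analytics/keyinfluencer/" + jobID,        // retrieves the saved job results
    dataType : 'json',
    success : function(result) {
      console.log(result.influencers);                           // the list of key influencers
    }
  });
}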
You can use any supported programming language to develop your application in the cloud, for example Java
or HTML5/JavaScript. See the SAP Cloud Platform developer documentation.
You are developing an application that will allow an end user to perform predictive analysis from their own SAP
HANA data.
1. Use the following Dataset Service API to register an existing SAP HANA table as a dataset: POST/api/
analytics/dataset/sync
2. Call the service either in synchronous or in asynchronous mode:
● Synchronous mode: Call the service by using the dataset ID and other parameters.
POST /api/analytics/keyinfluencer/sync
Example
You are developing an HTML5 application on the SAP Cloud Platform to demonstrate the Key Influencer APIs
usage. Your goal is to create buttons that trigger API calls and to make the application display dataset
information and final results. You use JavaScript to write server requests with the $.ajax() jQuery AJAX
function. In the code below, the displayVariables function triggers the GET /api/analytics/dataset call to retrieve
the dataset information:
function displayVariables(id) {
$.ajax({
type : "GET",
contentType : "application/json",
url : root + "/api/analytics/dataset/" + id,
dataType : 'json',
success : function(data, status, request) {
var i = 0;
var rowCount = 0;
var oItem;
ddlb_variables.destroyItems();
for (i = 0; i < data.variables.length; i++) {
addRowInTable("variablesTable", data.variables[i].name,
data.variables[i].value);
oItem = new sap.ui.core.ListItem();
oItem.setText(data.variables[i].name);
ddlb_variables.addItem(oItem);
}
ddlb_variables.setValue(data.variables[data.variables.length -
1].name);
gTarget = data.variables[data.variables.length - 1].name;
ddlb_variables.attachChange(function(){
$('#target').html(ddlb_variables.getValue());
gTarget = ddlb_variables.getValue();
});
ddlb_variables.placeAt("ddlb_ChooseTarget");
$('#target').html(ddlb_variables.getValue());
},
...
  });
}
3. Write a button that triggers the following function to display the job result:
4. Write a button that triggers the following function to display the results:
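The functions referenced in steps 3 and 4 are not reproduced here. A minimal sketch of what they could look like is shown below; the status field and the modelPerformance field nesting are assumptions based on the Key Influencers output parameters [page 54] and the Job Response Body Parameters [page 86]. The HTML layout that hosts the corresponding buttons and placeholders follows.

// Hypothetical sketch of the functions triggered by the buttons in steps 3 and 4.
function displayJobStatus(jobID) {
  $.ajax({
    type : "GET",
    url : root + "/api/analytics/keyinfluencer/" + jobID + "/status",
    dataType : 'json',
    success : function(data) {
      $('#status').html(data.status);          // status field assumed, see [page 86]
    }
  });
}
function displayKeyInfluencerResult(jobID) {
  $.ajax({
    type : "GET",
    url : root + "/api/analytics/keyinfluencer/" + jobID,
    dataType : 'json',
    success : function(data) {
      // Field nesting assumed from the Key Influencers output parameters [page 54].
      $('#tKI').html(data.modelPerformance.predictivePower);
      $('#tKR').html(data.modelPerformance.predictionConfidence);
    }
  });
}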
<head>
<title>Your demo application</title>
<script id="sap-ui-bootstrap"
src="https://sapui5.hana.ondemand.com/resources/sap-ui-core.js"
data-sap-ui-theme="sap_bluecrystal"
data-sap-ui-libs="sap.ui.commons"></script>
<script type="text/javascript" src="code.js"></script>
</head>
<table id="param">
<tr>
<th><div id="get_dataset"></div></th>
<th>Dataset ID</th>
<th><input type="text" id="datasetID"/></th>
<th><div id="get_variables"></div></th>
</tr>
<tr>
<th colspan="4"><div id="out_param"></div></th>
</tr>
</table>
<table id="variablesTable">
<tr>
<th>Rank</th>
<th>Name</th>
<th>Type</th>
</tr>
</table>
To display the target variable, the job status and the results:
<table id="param">
<tr>
<td>Choose target</td>
<td><div id="ddlb_ChooseTarget"></div></td>
<td><div id="target"></div></td>
</tr>
<tr>
<td><div id="bKeyInfluencer"></div></td>
<td><div id="bGetStatus"></div></td>
<td><div id="status"></div></td>
</tr>
<tr>
<td><div id="bKeyInfluencerResult"></div></td>
<td>Job ID </td>
<td><input type="text" id="jobID"/></td>
</tr>
<tr>
<td>Predictive Power (KI)</td>
<td><div id="tKI"></div></td>
<td>Robustness (KR)</td>
<td><div id="tKR"></div></td>
</tr>
An overview of the SAP Predictive service APIs provided as web services in the cloud.
Test these APIs directly in the SAP API Business Hub with the sample datasets described in Datasets
Available for SAP API Business Hub [page 164].
Note
The base URL to use with the REST APIs depends on the instance where you have deployed the predictive
service.
Related Information
The Clustering service analyzes a dataset and segments it into homogeneous clusters.
The service groups similar entities of a dataset into clusters. The resulting clustering information is then
exported to the SAP HANA database. The cluster IDs can be written to a database table or view.
The end user must specify the numbers of clusters that the service should return, expressed as a range. The
service returns the best clustering whose number of clusters is in the range.
The user must also set the name and schema of the table or view where the results are stored. Optionally, they
can choose the variables from which the clustering can be defined and set a target variable in case of
supervised clustering.
The service returns information on the clusters and stores results into the requested table. It also provides
model performance information in case of supervised clustering.
APIs
● Segmenting the dataset into clusters: POST /api/analytics/clustering/sync (input [page 27], output [page 30])
● Creating a job: POST /api/analytics/clustering (input [page 27], output [page 86])
● Getting the job status: GET /api/analytics/clustering/<jobID>/status (no input, output [page 86])
Path parameter:
● <jobID> (required, Integer): The job identifier that you get by creating a job first.
None
The list of input parameters for the synchronous mode API and the asynchronous job creation API of the
Clustering service.
{
"datasetID",
"numberOfClusters",
"exportSettings" : {
"method",
"destination" : {
"schema",
"table",
"overwrite",
},
"clusterIDColumn"
},
"selectedVariables",
"skippedVariables",
"target" : {
"column",
"value"
},
"modelSQLExportEnabled",
"distance"
}
● datasetID (required, Integer): The identifier of a dataset that has been registered in the schema. Default: N/A.
● numberOfClusters (required, Array of integers): The minimum and maximum numbers of clusters requested, specified as a 2-value array. Default: N/A.
Note: A clustering model is created for each number of clusters in the range, so that the service can select the best. Therefore, numberOfClusters could hinder the service performance if a wide range is specified.
● exportSettings [page 28] (required, Object): The settings that configure how the segmentation results are returned to the end user. Default: N/A.
● method (String): The export method, which can be one of the following:
○ table: The service exports the results into an SAP HANA database table. If a primary key has been specified for the input dataset, the destination table only contains the primary key component columns and the column that contains the cluster IDs. Else, all columns from the input dataset are copied to the table, which could hinder the service performance.
○ view: The service exports the results in the form of a dynamic view created on top of the dataset specified in the request. As the clustering definition is integrated into the view, new data points in the table can automatically be assigned a cluster ID.
Remember:
○ The view method requires the clustering models to be defined as SQL queries, which means modelSQLExportEnabled must be set to true if you choose this export method.
○ Both methods generate different segmentation results, because modelSQLExportEnabled is activated by default in the view case, but not in the table case.
● destination [page 29] (required, Object): The destination where the segmentation results are stored.
● clusterIDColumn (optional, String): The name of the column that contains the cluster IDs in the destination table or view. Default: CLUSTER_ID.
● schema (required, String): The name of the schema where the destination table or view is created. Default: N/A.
● table (required, String): The name of the destination table or view that is created. Default: N/A.
● overwrite (optional, Boolean): Indicates that the destination table or view is dropped and recreated if it exists. Default: false.
● selectedVariables (optional, Array of strings): The list of variables used to do the clustering. Default: all variables are selected.
● skippedVariables (optional, Array of strings): The list of variables that should not be included in the analysis. If selectedVariables is specified, skippedVariables is ignored. Default: no variable is excluded.
● variableDescription (Array of objects, deprecated): The tuples are name and value pairs for the following parameters that describe the variable: variable, storage, value, key, missing. Default: Null; the description stored with the dataset is used.
Note: Only variable, storage, and value must be in the input. The other parameters can be omitted.
● target [page 30] (optional, Object): The definition of the target in the case of supervised clustering. Default: Null.
● modelSQLExportEnabled (optional, Boolean): Indicates if the clustering model can be exported as an SQL query. This allows the clustering results to be exported as a view. It must be set to true if you choose the view method ("method" : "view"). Default: false if the method is table, true if the method is view.
● distance (optional, String): The distance used to measure the proximity of two data points. Default: SystemDetermined. Possible values are the following:
● column (required, String): The name of the target column in case of supervised clustering. Default: N/A.
● value (optional, String): The target value if the target column is binary. Default: Null.
Note: This setting only applies to classification (binary target) and is ignored in regression (continuous target).
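For illustration, a possible synchronous Clustering request is sketched below in the same $.ajax() style used elsewhere in this guide. The dataset ID, schema, table, and column names are invented for the example and must be replaced with your own values; the base URL is assumed to be stored in a root variable.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/clustering/sync",
  data : JSON.stringify({
    "datasetID" : 12,                                   // a dataset registered beforehand (example ID)
    "numberOfClusters" : [3, 8],                        // search for the best model between 3 and 8 clusters
    "exportSettings" : {
      "method" : "table",
      "destination" : {
        "schema" : "MY_SCHEMA",                         // example schema name
        "table" : "CLUSTERING_RESULTS",                 // example table name
        "overwrite" : true
      },
      "clusterIDColumn" : "CLUSTER_ID"
    },
    "skippedVariables" : ["CustomerID"],                // ignore identifier columns
    "target" : { "column" : "Churn", "value" : "yes" }  // supervised clustering (example target)
  }),
  dataType : 'json',
  success : function(result) {
    console.log(result.clusters);                       // clusters with their IDs and frequencies
  }
});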
Related Information
The list of output parameters for the synchronous and asynchronous APIs of the Clustering service.
● clusters [page 31] (Array of objects): The list of clusters with their IDs and frequency.
● modelPerformance [page 32] (Object): Indicators on the quality of the results. Only provided in the case of supervised clustering.
● percentageOfUnassignedRecords (Number): The percentage of rows in the dataset that are not in any clusters.
● overlapRate (Number): The percentage of rows in the dataset that are in multiple clusters.
● frequency (Number): The size of the cluster as a percentage of the number of rows in the dataset.
● targetMean (Number): The average value of the target inside the cluster. Only returned if a target variable has been defined in the request (supervised clustering).
● schema (String): The name of the schema where the segmentation results are stored.
● table (String): The name of the table or view where the segmentation results are stored.
● confidenceIndicator (Integer): The model robustness indicator. 1 if the results are reliable, else 0.
● predictivePower (Number): The predictive power of the model that has generated the results.
● predictiveConfidence (Number): The prediction confidence of the model that has generated the results.
Related Information
The Dataset services provide a series of features that manage datasets to be used with the predictive service.
These services register datasets, return dataset and variable information, update variable descriptions, and unregister datasets.
By using this service, you register a dataset stored in an SAP HANA table into the application and assign an ID
to it.
Request
URI: /api/analytics/dataset/sync
{
"location" : {
"schema",
"table"
},
"variables" : [
{
"position",
"name",
"storage",
"value",
"key"
},
{...},
...
]
}
● hanaURL (String, deprecated): The reference to the SAP HANA table that needs to be registered: <schema_name>/<table_name>.
● location (required, Object): The location of the SAP HANA table or view that is to be registered as a dataset.
● variables (optional, Array of objects): The list of variables and their description in the dataset.
● schema (optional, String): The schema of the dataset in the SAP HANA database. If not specified, the Predictive Service Repository schema is considered by default. Default: Predictive Service Repository schema.
● table (required, String): The name of the table or view where the dataset is stored.
● storage (optional, String): The data type of the value stored in the variable. Default: N/A.
● value (required, String): The type of the value stored in the variable. Default: Null.
Note: Ordinal variables are currently considered as either continuous or nominal variables depending on their storage type.
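For illustration, a possible registration request is sketched below, assuming a root variable that holds the base URL; the schema and table names are examples only.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/dataset/sync",
  data : JSON.stringify({
    "location" : {
      "schema" : "MY_SCHEMA",          // example schema name
      "table" : "CUSTOMERS"            // example table or view name
    }
  }),
  dataType : 'json',
  success : function(dataset) {
    console.log(dataset.ID);           // the identifier assigned to the registered dataset
  }
});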
Response
{
"ID",
"name",
"location" : {
● location (Object): The location of the SAP HANA table or view where the dataset is stored, with the following information: schema, table.
● variables (Array of objects): The list of variables in the dataset with the following information: position, name, storage, value, key.
Example
Continuous vs Nominal vs Ordinal
The variable "salary" is a numerical variable, but in addition, is also a continuous variable. It may, for
instance, take on the following values: "$1,050", "$1,700", or "$1,750". The mean of these values may be
calculated.
The variable "zip code" is a nominal variable. The variable values ("10111", "20500", "90210", for example)
are clearly distinct, non-ranked categories, although they are represented by numbers. Binary variables are
considered nominal variables.
The variable "school grade" is an ordinal variable. Its values actually belong to definite categories and can be
sorted. This variable can be:
This service unregisters the specified dataset from the application. Unregistering a dataset prevents further
use of this dataset with the predictive service. It does not remove the dataset content.
The service returns an error when the dataset does not exist.
Request
URI: /api/analytics/dataset/<datasetID>
HTTP Method:DELETE
None
None
Response
None
This service returns an error when the end user cannot access the dataset.
Request
URI: /api/analytics/dataset/<datasetID>
HTTP Method: GET
None
None
{
"ID",
"name",
"location" : {
"schema",
"table"
},
"numberOfRows",
"numberOfColumns",
"variables" : [
{
"position",
"name",
"storage",
"value",
"key"
},
{...},
...
]
}
● location (Object): The location of the SAP HANA table or view where the dataset is stored, with the following information: schema, table.
● variables (Array of objects): The list of variables in the dataset with the following information: position, name, storage, value, key.
Related Information
This service returns the description of a specific variable of a dataset, along with statistics.
The statistics returned by the service depend on the value type of the variable:
● Nominal: the list of the distinct values of this variable with their frequencies is returned.
● Continuous: the minimum, maximum, and average values of this variable are returned. The average value
is calculated only for numeric variables, not for dates.
Request
URI: /api/analytics/dataset/<datasetID>/variable/<variablePosition>
HTTP Method:GET
None
None
Response
● Variable information
● Statistics information, depending on whether the variable is nominal or continuous
● storage (String): The data type of the value stored in the variable.
Note: Ordinal variables are currently considered as either continuous or nominal variables depending on their storage type.
● key (Integer): A flag that indicates if the column is a component of a primary key.
● numberOfCategories (Integer): The number of distinct values of the variable. Only returned if the variable is nominal.
● valueStatistics (Array of objects): The statistics generated on the values of the variables.
Nominal Variable
If the variable is nominal, there are as many category and frequency pairs as there are categories.
{
"position",
"name",
"storage",
"value",
"key",
"numberOfCategories",
"valueStatistics" : [
{
"category",
"frequency"
Continuous Variable
If the variable is continuous, the statistics contain the minimum, maximum, and average values of the variable.
{
"position",
"name",
"storage",
"value",
"key",
"valueStatistics" : {
"minimum",
"maximum",
"average"
}
}
Example
Continuous vs Nominal vs Ordinal
The variable "salary" is a numerical variable, but in addition, is also a continuous variable. It may, for
instance, take on the following values: "$1,050", "$1,700", or "$1,750". The mean of these values may be
calculated.
The variable "zip code" is a nominal variable. The variable values ("10111", "20500", "90210", for example)
are clearly distinct, non-ranked categories, although they are represented by numbers. Binary variables are
considered nominal variables.
The variable "school grade" is an ordinal variable. Its values actually belong to definite categories and can be
sorted. This variable can be:
● Correct the value types of variables, if the automatic guess did not return the correct results
● Indicate which columns are components of the primary key of the dataset
Only value types or components of the primary key can be changed. Positions, variable names, and storage
types must be the same as the ones in the physical dataset. The new description will be stored with dataset
metadata and used by default whenever the dataset is used by a service. The response contains the dataset
information with the changed value types or components of the primary key.
Note
The service does not allow partial change. If any of the changes specified in the request body cannot be
done, no change will be done.
Request
URI: /api/analytics/dataset/<datasetID>/variables/update
[
{
"name",
"value",
"key"
},
{...},
...
]
● value (optional, String): The new value type of the variable, which can be one of the following: continuous, nominal, or ordinal.
Note: Ordinal variables are currently considered as either continuous or nominal variables depending on their storage type.
● key (optional, Integer): The new value of the key of the variable: 1 to indicate that the column is part of the primary key of the dataset, else 0.
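For illustration, a possible update request is sketched below; the dataset ID and variable names are examples only, and the base URL is assumed to be stored in a root variable.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/dataset/42/variables/update",   // 42 is an example dataset ID
  data : JSON.stringify([
    { "name" : "ZipCode", "value" : "nominal" },    // numbers that actually represent categories
    { "name" : "CustomerID", "key" : 1 }            // flag the column as part of the primary key
  ]),
  dataType : 'json',
  success : function(dataset) {
    console.log(dataset.variables);                 // the updated variable descriptions
  }
});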
Response
{
"ID",
"name",
"location" : {
"schema",
"table"
},
"numberOfRows",
"numberOfColumns",
"variables" : [
{
"position",
"name",
"storage",
"value",
"key"
},
{...},
...
]
}
● location (Object): The location of the SAP HANA table or view where the dataset is stored, with the following information: schema, table.
● variables (Array of objects): The list of variables in the dataset with the following information: position, name, storage, value, key.
● storage (optional, String): The data type of the value stored in the variable. Default: N/A.
● value (required, String): The type of the value stored in the variable: continuous, nominal, or ordinal. Default: Null.
Example
Continuous vs Nominal vs Ordinal
The variable "salary" is a numerical variable, but in addition, is also a continuous variable. It may, for
instance, take on the following values: "$1,050", "$1,700", or "$1,750". The mean of these values may be
calculated.
The variable "zip code" is a nominal variable. The variable values ("10111", "20500", "90210", for example)
are clearly distinct, non-ranked categories, although they are represented by numbers. Binary variables are
considered nominal variables.
The Forecasts service analyzes a dataset containing the successive values of a target indicator over time to
predict the next values.
This service:
The predictive model combines the trend, cycles, and fluctuations found in the time series to generate
forecasts. The prediction also depends on information provided through extra-predictive variables, if any. The
granularity of the prediction is the same as the granularity used in the dataset. For example, if the dataset
contains daily observations of a time series, the service computes the values of the series for the next days. See
the Time Series Scenarios on the SAP Help Portal for a description of the time-series components.
Note
● By default, all the variables in the dataset are taken into account to calculate the forecasts. If the dataset contains variables other than the target column and the date column, make sure they all have values in the forecast period, or remove them from the analysis using the skippedVariables setting. Otherwise, the following EXX114 error message might appear: "An internal error has occurred: The training data set does not contain enough values for the extra-predictable variables to cover the number of requested forecasts."
● The service may return forecasts without error bars beyond the maximum confident horizon.
APIs
● Getting the forecasts: POST /api/analytics/forecast/sync (input [page 46], output [page 49])
● Creating a job: POST /api/analytics/forecast (input [page 46], output [page 86])
● Getting the job status: GET /api/analytics/forecast/<jobID>/status (no input, output [page 86])
Path parameter:
● <jobID> (required, Integer): The job identifier that you get by creating a job first.
None
The list of input parameters for the synchronous mode and the asynchronous job creation API of the Forecasts
service.
{
"datasetID",
"targetColumn",
"dateColumn",
"numberOfForecasts",
"referenceDate",
"numberOfPastValuesInOutput",
"skippedVariables",
"weightVariable",
"smoothingCycleLength",
"forecastMethod",
"maxLag"
}
● datasetID (required, Integer): The identifier of a dataset that has been registered in the schema. Default: N/A.
● targetColumn (required, String): The name of the column containing the values of the time series. Default: N/A.
● dateColumn (required, String): The name of the column containing the timestamps of the time series. Default: N/A.
● numberOfForecasts (required, Integer): The number of forecasts to generate from the current time series. Default: N/A.
● referenceDate (optional, DateTime): The last date of the data used to train the model. The service generates the forecasts from the date after referenceDate. In this case, the service uses only the data prior to this date. Default: the date of the last known value of the time series.
Remember: The date must follow one of the two ISO 8601 formats below:
● numberOfPastValuesInOutput:
Caution: If numberOfPastValuesInOutput is -1, then all past values are returned. This can be voluminous.
● variableDescription (Array of objects, deprecated): The tuples are name and value pairs for the following parameters that describe the variable: variable, storage, value, key, missing. Default: Null; the description stored with the dataset is used.
Note: Only variable, storage, and value must be in the input. The other parameters can be omitted.
● skippedVariables (optional, Array of strings): The list of variables that should not be included in the analysis. Default: no variable is excluded.
● weightVariable (optional, String): The variable to be used as weight during modeling. Default: Null; no variable is used as weight.
● smoothingCycleLength (optional, Integer): The cycle length to be used when smoothing the time series. Default: Null.
● forecastMethod (optional, String): The method used by the underlying model to generate the forecasts. Default: default.
● maxLag (optional, Integer): The maximum lag to consider to compute forecasts. maxLag controls the way that the time series analysis handles the random fluctuations in the signal. It defines the maximum dependency of the signal on its own past values. Default: the default value in Predictive Analytics.
Remember: This parameter only applies with the default forecast method and is ignored if another forecast method is used.
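For illustration, a possible synchronous Forecasts request is sketched below; the dataset ID and column names are examples only, and the base URL is assumed to be stored in a root variable.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/forecast/sync",
  data : JSON.stringify({
    "datasetID" : 7,                              // a dataset registered beforehand (example ID)
    "targetColumn" : "Sales",                     // example time series column
    "dateColumn" : "Date",                        // example date column
    "numberOfForecasts" : 10,                     // predict the next 10 values
    "numberOfPastValuesInOutput" : 20,            // also return the last 20 known data points
    "skippedVariables" : ["StoreID"]              // ignore columns without values in the forecast period
  }),
  dataType : 'json',
  success : function(result) {
    console.log(result.forecasts);                // forecast values with their error bars
  }
});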
Related Information
The list of output parameters for the synchronous and asynchronous APIs of the Forecasts service.
{
"parameters" : {
...
},
"forecasts" : [
{
"date",
"realValue",
"forecastValue",
"errorBarLowerBound",
"errorBarHigherBound"
},
{...},
...
],
"modelInformation" : {
"trend,
"cycles",
"fluctuations"
},
"modelPerformance" : {
● errorBarLowerBound (Number): The lower bound of the error bar for the current forecast.
● errorBarHigherBound (Number): The higher bound of the error bar for the current forecast.
● cycles (String): The periodic elements that can be found at least twice in the data.
● fluctuations (String): Residuals after extraction of trend and cycles, modeled through auto-regression.
● mape (Number): The horizon-wide Mean Absolute Percentage Error (MAPE) indicator.
Caution: If maximumConfidentHorizon is lower than numberOfForecasts, the MAPE indicator might not be calculated based on the performances on the whole requested horizon.
● mapePerHorizon (Array of numbers): Array of MAPE indicators for each horizon until the requested horizon (the value of numberOfForecasts). This shows the evolution of the performance of the predictive model depending on the horizon of the forecasts. The first element of this array is the performance at horizon 1 (the next value), the second element is the performance at horizon 2 (the second next), and so on.
● maximumConfidentHorizon (Integer): The maximum horizon for which the performance indicators are reliable. For a horizon higher than maximumConfidentHorizon, the service may provide forecasts without error bars.
Quality Rating
The following table shows the correspondence between the MAPE indicator and the quality rating.
● Quality rating 0: MAPE > 0.8
● Quality rating 1: MAPE > 0.7
● Quality rating 2: MAPE > 0.5
● Quality rating 3: MAPE > 0.4
● Quality rating 4: MAPE > 0.2
● Quality rating 5: MAPE <= 0.2
Related Information
The Key Influencers service analyzes a dataset to identify the variables with an influence on a specified target.
This service:
● Identifies the variables with an influence on a specified target ordered by decreasing contribution
● Returns detailed information on the grouped categories for each contributive variable
● Returns the variables excluded from the analysis
● Returns the variables that are correlated with each other
● Provides indicators on the reliability of the results
Remember
The target of the dataset must be either binary or continuous. Multinomial targets are not supported.
APIs
● Getting the key influencers: POST /api/analytics/keyinfluencer/sync (input [page 53], output [page 54])
● Creating the job: POST /api/analytics/keyinfluencer (input [page 53], output [page 86])
● Getting the job status: GET /api/analytics/keyinfluencer/<jobID>/status (no input, output [page 86])
● Getting the key influencers of a job: GET /api/analytics/keyinfluencer/<jobID> (no input, output [page 54])
Path parameter:
● <jobID> (required, Integer): The job identifier that you get by creating a job first.
None
The list of input parameters for the synchronous mode API and the asynchronous job creation API of the Key
Influencers service.
{
"datasetID",
"targetColumn",
"numberOfInfluencers",
"targetKey",
"skippedVariables",
"weightVariable",
"autoSelection"
}
● datasetID (required, Integer): The identifier of a dataset that has been registered in the schema. Default: N/A.
● targetColumn (required, String): The name of the column containing the target to use for the analysis. Default: N/A.
● numberOfInfluencers (optional, Integer): A positive integer that represents the number of key influencers to return. Default: Null; all contributive variables are returned.
● variableDescription (Array of objects, deprecated): The tuples are name and value pairs for the following parameters that describe the variable: variable, storage, value, key, missing. Default: Null; the description stored with the dataset is used.
Note: Only variable, storage, and value must be in the input. The other parameters can be omitted.
● targetKey (optional, String or number): The value of the target of interest. Default: the least frequent category.
● skippedVariables (optional, Array of strings): The list of variables that should not be included in the analysis. Default: no variable is excluded.
● weightVariable (optional, String): The variable to be used as weight during modeling. Default: Null; no variable is used as weight.
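For illustration, a possible synchronous Key Influencers request is sketched below; the dataset ID and column names are examples only, and the base URL is assumed to be stored in a root variable.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/keyinfluencer/sync",
  data : JSON.stringify({
    "datasetID" : 12,                          // a dataset registered beforehand (example ID)
    "targetColumn" : "Churn",                  // example target column
    "targetKey" : "yes",                       // category of interest for a binary target
    "numberOfInfluencers" : 5,                 // return only the top 5 contributive variables
    "skippedVariables" : ["CustomerID"]        // ignore identifier columns
  }),
  dataType : 'json',
  success : function(result) {
    console.log(result.influencers);           // variables ordered by decreasing contribution
  }
});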
Related Information
The list of output parameters of the synchronous and asynchronous APIs of the Key Influencers service.
{
"parameters" : {
...
},
"influencers" : [
{
"variable",
"contribution",
"groups" : [
{
"groupName",
"groupDefinition" : {
"categories",
"higherBound",
"higherBoundIncluded",
"lowerBound",
"lowerBoundIncluded",
"kxmissingIncluded"
},
"significance",
"normalProfit",
"frequency",
"targetMean"
},
{...},
...
]
},
...
],
"modelPerformance" : {
"qualityRating",
"confidenceIndicator",
● influencers [page 55] (Array of objects): The list of the top N most contributive variables and detailed information on how they influence the target.
● excludedVariables [page 56] (Array of objects): The list of variables that were excluded automatically from the model.
● correlatedVariables [page 57] (Array of objects): The list of pairs of correlated variables found in the dataset.
● groupDefinition (Object): The definition of the grouped category. See the table below.
● frequency (Number): The frequency of the grouped category. This is the percentage of the dataset that belongs to the grouped category.
● targetMean (Number): The target average in the grouped category. If the target is a binary value, it corresponds to the target rate inside the grouped category.
● categories (Array of strings): The list of values in the grouped category of a nominal variable. Null if the variable is continuous.
● higherBound (Number): The higher bound in the current category range for continuous variables.
● lowerBound (Number): The lower bound in the current category range for continuous variables.
Note: The following group definition values are null if the variable is nominal: higherBound, higherBoundIncluded, lowerBound, lowerBoundIncluded, kxmissingIncluded.
● confidenceIndicator (Integer): The model robustness indicator. 1 if the results are reliable, else 0.
● predictivePower (Number): The predictive power of the model that has generated the results.
● predictionConfidence (Number): The prediction confidence of the model that has generated the results.
● reason (String): The reason why the variable has been excluded, which can be: Leak Variable, Fully Compressed, Small KI on Estimation, Small KI on Validation, Large KI Difference, Small KR, Constant, or Small variance.
Note: The reason values are the ones used in SAP Predictive Analytics. The Leak Variable reason corresponds to Suspicious Variable. For more information about the variable exclusion causes, see the Automated Analytics User Guides and Scenarios on the SAP Help Portal.
Related Information
The Outliers service identifies the odd profiles of a dataset whose target indicator is significantly different from
what is expected.
This service:
An outlier can either result from a data quality issue to correct or represent a suspicious case to investigate.
Remember
The target of the dataset must be either binary or continuous. Multinomial targets are not supported.
APIs
● Getting the outliers: POST /api/analytics/outliers/sync (input [page 59], output [page 60])
● Creating the job: POST /api/analytics/outliers (input [page 59], output [page 86])
● Getting the job status: GET /api/analytics/outliers/<jobID>/status (no input, output [page 86])
Path parameter:
● <jobID> (required, Integer): The job identifier that you get by creating a job first.
None
The list of input parameters for the synchronous mode API and the asynchronous job creation API of the
Outliers service.
{
"datasetID",
"targetColumn",
"numberOfOutliers",
"numberOfReasons",
"targetKey",
"skippedVariables",
"weightVariable",
"autoSelection
}
● datasetID (required, Integer): The identifier of a dataset that has been registered in the schema. Default: N/A.
● targetColumn (required, String): The name of the column containing the target. Default: N/A.
● numberOfOutliers (optional, Integer): The number of outliers to return. The value 0 returns all the outliers. Default: 100.
● variableDescription (Array of objects, deprecated): The tuples are name and value pairs for the following parameters that describe the variable: variable, storage, value, key, missing. Default: Null; the description stored with the dataset is used.
Note: Only variable, storage, and value must be in the input. The other parameters can be omitted.
● targetKey (optional, String or number): The value of the target of interest. Default: the least frequent category.
● skippedVariables (optional, Array of strings): The list of variables that should not be included in the analysis. Default: no variable is excluded.
● weightVariable (optional, String): The variable to be used as weight during modeling. Default: Null; no variable is used as weight.
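For illustration, a possible synchronous Outliers request is sketched below; the dataset ID and column names are examples only, and the base URL is assumed to be stored in a root variable.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/outliers/sync",
  data : JSON.stringify({
    "datasetID" : 12,                          // a dataset registered beforehand (example ID)
    "targetColumn" : "ClaimAmount",            // example target column
    "numberOfOutliers" : 50,                   // return the 50 most significant outliers
    "numberOfReasons" : 3,                     // return up to 3 reasons per outlier
    "skippedVariables" : ["ClaimID"]           // ignore identifier columns
  }),
  dataType : 'json',
  success : function(result) {
    console.log(result.outliers);              // records whose target deviates from the expected value
  }
});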
Related Information
The list of output parameters for the synchronous and asynchronous APIs of the Outliers service.
{
"parameters" : {
...
},
"numberOfOutliers",
"outliers" : [
{
"dataPoint",
"predictedValue",
"errorBar",
"realValue",
"reasons" : [
{
● outliers [page 61] (Array of objects): The list of outliers, including their reasons.
● dataPoint (Object): The content of the record flagged as an outlier, that is, the values of all the dataset columns for this record.
● predictedValue (Number): The expected value of the target indicator computed from each attribute value of a record.
● errorBar (Number): The error bar associated with the expected value.
● confidenceIndicator (Integer): The model robustness indicator. 1 if the results are reliable, else 0.
● predictivePower (Number): The predictive power of the model that has generated the results.
● predictionConfidence (Number): The prediction confidence of the model that has generated the results.
Related Information
The Recommendation APIs provide a set of services that allows you to create a recommendation model and
generate recommendations from it.
The services compute the model and its recommendations based on a transaction dataset. The dataset
contains information about user transactions, which consist of customer/purchased item pairs.
Related Information
This Recommendation service creates a recommendation model from the user transaction history.
This service:
● Estimates the costs and performance of a recommendation model before you create it
● Creates a recommendation model, either in synchronous mode or asynchronous mode
● Returns statistics on the input transaction data, such as the number of rows, which gives an idea of the size
of the dataset
● Returns statistics on the selected recommendation rules for the created model
● Deletes the recommendation model and the corresponding job
The estimates include the duration of the process and the size of the resulting model.
Remember
In synchronous mode, the call also generates a job to make the recommendation model accessible. The call
outputs the job identifier, which is also the model identifier.
Recommender APIs
● Creating a recommendation model: POST /api/analytics/recommendations/recommender/sync (input [page 64], output [page 66])
● Estimating the costs of a recommendation model before creation: POST /api/analytics/recommendations/recommender/guess/sync (input [page 64], output [page 67])
● Creating the job: POST /api/analytics/recommendations/recommender (input [page 64], output [page 86])
● Getting the job status: GET /api/analytics/recommendations/recommender/<jobID>/status (no input, output [page 86])
Path parameter:
● <jobID> (required, Integer): The job identifier that you get by creating a job first.
Remember: This is also the identifier of the recommendation model.
None
The list of input parameters for the synchronous mode API and the asynchronous job creation API that create a
recommendation model.
{
"transactionData" : {
"datasetID",
"transaction" : {
"userColumn",
"itemColumn",
"dateColumn"
},
"period" :{
"startDate",
"endDate"
}
},
"modelingSettings" : {
"minimumSupport",
"minimumConfidence",
"minimumPredictivePower",
"bestSellersThreshold"
}
}
● transactionData [page 65] (required, Object): The details on the transaction data that is used to create the recommendation model.
● datasetID (required, Integer): The identifier of a dataset that has been registered in the schema. Default: N/A.
● transaction (required, Object): The list of the dataset columns that define a transaction. Default: N/A.
● period (optional, Object): The time period of the dataset on which the creation of the model is based. Default: N/A.
● userColumn (required, String): The name of the column containing the user IDs related to a transaction. Default: N/A.
● itemColumn (required, String): The name of the column containing the item IDs related to a transaction. Default: N/A.
● dateColumn (required, String): The name of the column containing the date or timestamp of the transaction. Default: N/A.
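For illustration, a possible synchronous model creation request is sketched below. The dataset ID, column names, and period are examples only; the optional modelingSettings object is omitted here, and the base URL is assumed to be stored in a root variable.

$.ajax({
  type : "POST",
  contentType : "application/json",
  url : root + "/api/analytics/recommendations/recommender/sync",
  data : JSON.stringify({
    "transactionData" : {
      "datasetID" : 21,                          // a transaction dataset registered beforehand (example ID)
      "transaction" : {
        "userColumn" : "CustomerID",             // example user column
        "itemColumn" : "ProductID",              // example item column
        "dateColumn" : "PurchaseDate"            // example date column
      },
      "period" : {
        "startDate" : "2017-01-01",              // example time period (date format assumed)
        "endDate" : "2017-12-31"
      }
    }
  }),
  dataType : 'json',
  success : function(result) {
    console.log(result.recommenderID);           // identifier of the resulting recommendation model
  }
});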
Related Information
The list of output parameters for the synchronous and asynchronous APIs that create a recommendation
model.
{
"parameters" : {...},
"recommenderID",
"transactionDataStatistics" : {
"numberOfRows",
"numberOfUsers",
"numberOfItems",
"numberOfUserItemPairs",
"density"
},
"modelMetrics" : {
"numberOfRules",
"numberOfItems",
"percentageOfItems"
}
}
● recommenderID (Number): The identifier of the recommender job to access the resulting recommendation model.
● transactionDataStatistics [page 67] (Object): A summary of the statistics of the dataset used to generate the recommendations.
● modelMetrics [page 67] (Object): Statistics about the generated model and the possible recommended items.
● numberOfUserItemPairs (Integer): The number of distinct user/item pairs in the transaction dataset.
● density (Number): The ratio between the number of existing user/item pairs and the total number of possible user/item pairs.
● numberOfItems (Integer): The number of distinct items that can be recommended by this model.
● percentageOfItems (Number): The percentage of items that can be recommended by this model compared to the total number of distinct items in the dataset (transactionDataStatistics.numberOfItems).
Related Information
The list of output parameters of the "guess" API of the Recommendation service.
{
"transactionDataStatistics" : {
"numberOfRows",
"numberOfUsers",
● estimates [page 68] (Object): The estimation of the costs and size of the recommendation model that would result from the call [POST] /api/analytics/recommender with the same request.
● numberOfUserItemPairs (Integer): The number of distinct user/item pairs in the transaction dataset.
● density (Number): The ratio between the number of existing user/item pairs and the total number of possible user/item pairs.
● rulesCountRange (Array of integers): The range of the number of rules in the recommendation model.
● modelingProcessDuration (Integer): An estimate in seconds of the time required to create the recommendation model.
Related Information
You can get a list of recommendations from a recommendation model for a specific user.
Scores and ranks are computed with the metric selected and passed in the request.
By using the fillList parameter, this service guarantees that users always get recommendations, regardless of their purchase history. Bestseller items can be added to the recommendation list if the maximum number of items that can be recommended for a user is not reached; in that case they are called fillers. Bestsellers are the items with the highest frequencies in the dataset and have no associated score.
This service also identifies which items are part of the user's purchase history and lets you exclude them from the recommendations. By default, purchased items are not recommended.
Request
From a User ID
URI: /api/analytics/recommendations?recommenderID=<integer>&userID=<string>&maxItems=<integer>&rankingMetric=<enum>&fillList=<boolean>&skipAlreadyOwned=<boolean>
From a List of Items
URI: /api/analytics/recommendations?recommenderID=<integer>&itemList=<string>,<string>,...&maxItems=<integer>&rankingMetric=<enum>&fillList=<boolean>&skipAlreadyOwned=<boolean>
Parameter | Required | Type | Description | Default
userID | No | String | The ID of the user for whom the service generates recommendations. | N/A
itemList | No | Array of strings | The list of items used as basis to generate recommendations. Note: If userID and itemList are both specified in the request, then the service generates recommendations using the values of itemList and filters them according to the user purchase history. | N/A
rankingMetric | No | String | The metric used to rank the items in the recommendation list. Possible values are: SUPPORT, LIFT, CONFIDENCE, KI, COSINE, ADDED_VALUE. | CONFIDENCE
fillList | No | Boolean | A flag that indicates whether the recommendation list must be filled until the maximum number of items (maxItems) is reached. | false
skipAlreadyOwned | No | Boolean | A flag that indicates whether purchased items are removed from the recommendation list. | true
None
Response
[
{
"itemID",
"itemScore",
"itemRank",
"isFiller"
},
{...},
...
]
Parameter | Type | Description
itemScore | Number | The score associated with the recommendation, using the specified ranking metric.
itemRank | Integer | The rank of the recommended item in the recommendation list.
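As an illustration, assuming a recommender job with ID 5 and a user ID "U123" (both made up for this example), a call and its response might look like the sketch below; whether filler items carry a null itemScore is an assumption based on the note above that bestsellers have no associated score.
Request
URI: /api/analytics/recommendations?recommenderID=5&userID=U123&maxItems=3&rankingMetric=CONFIDENCE&fillList=true&skipAlreadyOwned=true
Response
[
   {
      "itemID" : "I042",
      "itemScore" : 0.82,
      "itemRank" : 1,
      "isFiller" : false
   },
   {
      "itemID" : "I007",
      "itemScore" : null,
      "itemRank" : 2,
      "isFiller" : true
   }
]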
This Recommendation service generates recommendations from a recommendation model for all users or a
subset of users of a dataset.
This service:
● Generates a list of recommendations either in synchronous mode or asynchronous mode and stores it in
an SAP HANA database table
● Returns statistics on the input transaction data
● Returns statistics on the generated recommendations
● Deletes the batch job
By using the fillList parameter, this service guarantees that users always get recommendations, regardless of their purchase history. Bestseller items can be added to the recommendation list if the maximum number of items that can be recommended for a user is not reached; in that case they are called fillers. Bestsellers are the items with the highest frequencies in the dataset and have no associated score.
This service also identifies which items are part of the user's transaction history and lets you exclude them from the recommendations. By default, items already purchased are not recommended.
Remember
The predictive service DB user must be granted the CREATE ANY and INSERT permissions on the destination schema in order to write the resulting recommendations to the destination table.
Task | Method and URI | Request body | Response body
Generating recommendations | POST /api/analytics/recommendations/batch/sync | input [page 72] | output [page 76]
Creating the job | POST /api/analytics/recommendations/batch | input [page 72] | output [page 86]
Getting the job status | GET /api/analytics/recommendations/batch/<jobID>/status | None | output [page 86]
<jobID> Yes Integer The job identifier that you get by creating a job first.
The list of input parameters for the synchronous mode API and the asynchronous job creation API that return
recommendations for a group of users.
{
   "recommenderID",
   "maxItemsPerUser",
   "destination" : {
      "schema",
      "table"
   },
   "transactionData" : {
      "datasetID",
      "transaction" : {
         "userColumn",
         "itemColumn",
         "dateColumn"
      },
      "period" : {
         "startDate",
         "endDate"
      }
   },
   "users",
   "recommendationSettings" : {
      "rankingMetric",
      "threshold"
   }
}
Parameter | Required | Type | Description | Default
recommenderID | Yes | Integer | The identifier of a recommender job. The recommendation model resulting from this job will be used to generate recommendations. Note: The job must have a SUCCESSFUL status for the service to generate recommendations. | N/A
destination [page 74] | Yes | Object | The SAP HANA database table where recommendations are stored. Note: The SAP HANA user subaccount used by the service must have CREATE ANY and INSERT rights granted on the destination schema or table to write the resulting data. | N/A
transactionData [page 74] | No | Object | The details on the transaction data that is used to generate recommendations. If not specified, the dataset used to create the recommendation model is used with the same definition of a transaction. All other settings related to the transaction data have default values. | null
users | No | Array of strings | The list of user IDs for which the service generates recommendations. | By default, recommendations are generated for all users of the transaction dataset.
recommendationSettings [page 75] | No | Object | Settings which impact the content of the recommendation list. | N/A
datasetID | Yes | Integer | The identifier of a dataset that has been registered in the schema. | N/A
transaction | Yes | Object | The list of the dataset columns that define a transaction. | N/A
period | No | Object | The time period of the dataset on which the creation of the model is based. | By default, no time period is defined. The recommendation model is created using all available data.
userColumn | Yes | String | The name of the column containing the user IDs related to a transaction. | N/A
itemColumn | Yes | String | The name of the column containing the item IDs related to a transaction. | N/A
dateColumn | Yes | String | The name of the column containing the date or timestamp of the transaction. | N/A
rankingMetric | No | String | The metric used as score to sort the recommendations. Possible values are: SUPPORT, LIFT, CONFIDENCE, KI, COSINE, ADDED_VALUE. | CONFIDENCE
threshold | No | Number | The threshold of the ranking metric above which an item is kept in the recommendation list. | null
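For illustration, a minimal body for the synchronous batch call is sketched below, assuming recommender job 5 and a destination table RECO_RESULTS in OUTPUT_SCHEMA; the job ID, schema, table, and user IDs are made up for this example. Omitting users would generate recommendations for all users of the transaction dataset.
{
   "recommenderID" : 5,
   "maxItemsPerUser" : 3,
   "destination" : {
      "schema" : "OUTPUT_SCHEMA",
      "table" : "RECO_RESULTS"
   },
   "users" : ["U123", "U456"]
}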
Related Information
The list of output parameters for the synchronous and asynchronous APIs that return recommendations for a
group of users.
{
"parameters" : {...},
"transactionDataStatistics" : {
"numberOfRows",
"numberOfUsers",
"numberOfItems",
"numberOfUserItemPairs",
"density"
},
"recommendationsStatistics" : {
"numberOfRecommendations",
"numberOfUsers",
"percentageOfUsers",
"numberOfItems",
"percentageOfItems"
}
}
Parameter | Type | Description
transactionDataStatistics [page 76] | Object | A summary of the statistics of the dataset used to generate the recommendations.
numberOfUserItemPairs | Integer | The number of distinct user/item pairs in the transaction dataset.
density | Number | The ratio between the number of existing user/item pairs and the total number of possible user/item pairs.
numberOfUsers | Integer | The number of distinct users which have at least one recommendation.
percentageOfUsers | Number | The percentage of users which have at least one recommendation compared to the total number of distinct users in the dataset (transactionDataStatistics.numberOfUsers).
percentageOfItems | Number | The percentage of recommended items compared to the total number of distinct items in the dataset (transactionDataStatistics.numberOfItems).
Related Information
The Scoring Equation service builds a predictive model from a dataset and exports its scoring equation to
either an SAP HANA SQL query or a score card in CSV format.
This service:
The scoring equation generates predicted values for each data point of the specified dataset. In regression cases, the predicted value is an estimate of the value of the target indicator. In classification cases, the predicted value is a score: the more likely a data point is to be a target, the higher the score. This score has no semantics of its own and can only be used to sort data points from most likely to least likely to be a target. However, it can be converted into a probability by setting the predictionOutputType parameter to probability.
Remember
The target of the dataset must be either binary or continuous. Multinomial targets are not supported.
APIs
Task | Method and URI | Request body | Response body
Getting the scoring equation | POST /api/analytics/scoringequation/sync | input [page 78] | output [page 80]
Creating a job | POST /api/analytics/scoringequation | input [page 78] | output [page 86]
Getting the job status | GET /api/analytics/scoringequation/<jobID>/status | None | output [page 86]
<jobID> Yes Integer The job identifier that you get by creating a job first
None
The list of input parameters for the synchronous mode API and the asynchronous job creation API of the
Scoring Equation service.
{
"datasetID",
"targetColumn",
"predictionOutputType",
"equationFormat",
"keyColumn",
"datasetName",
"targetKey",
"skippedVariables",
"weightVariable",
"autoSelection"
}
Parameter | Required | Type | Description | Default
datasetID | Yes | Integer | The identifier of a dataset that has been registered in the schema. | N/A
targetColumn | Yes | String | The name of the column containing the target to use for the analysis. | N/A
equationFormat | Yes | String | The format of the scoring equation of the resulting model (an SAP HANA SQL query or a CSV score card). | N/A
keyColumn | No | String | The name of the column considered as key. Applicable only if equationFormat is HANA. | $Key. The $Key variable must be set when executing the SQL query.
datasetName | No | String | The name of the table which the scoring equation is applied to. Applicable only if equationFormat is HANA. | $Dataset. The $Dataset variable must be set when executing the SQL query.
variableDescription (Caution: Deprecated) | No | Array of objects | The tuples are name and value pairs for the following parameters that describe the variable: variable, storage, value, key, missing. Note: Only variable, storage, and value must be in the input. The other parameters can be omitted. | Null. The description stored with the dataset is used.
targetKey | No | String or number | The value of the target of interest. | The least frequent category.
skippedVariables | No | Array of strings | The list of variables that should not be included in the analysis. | No variable is excluded.
weightVariable | No | String | The variable to be used as weight during modeling. | Null. No variable is used as weight.
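As a sketch, a minimal request body might look as follows, reusing the CENSUS dataset (ID 118) and the class target from the clustering scenario later in this guide; HANA is the only equation format explicitly named in this section, and the targetKey value is illustrative.
{
   "datasetID" : 118,
   "targetColumn" : "class",
   "equationFormat" : "HANA",
   "targetKey" : 1
}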
Related Information
The list of output parameters of the synchronous and asynchronous APIs of the Scoring Equation service.
{
"parameters": {
...
},
"scoringEquation”,
"modelPerformance" : {
"qualityRating",
"confidenceIndicator",
"predictivePower",
"predictionConfidence"
}
}
Related Information
The What If service simulates a planned action and returns the significant changes that result from it.
This service:
The simulation consists of changing the weight assigned to a group of values of the variable. The service
returns the deviations observed on the series of variables affected by this change and lists the affected
categories for each of them. It also compares the frequency of each category before and after the change is
applied.
Task | Method and URI | Request body | Response body
Running the simulation | POST /api/analytics/whatif/sync | input [page 82] | output [page 84]
Creating a job | POST /api/analytics/whatif | input [page 82] | output [page 86]
Getting the job status | GET /api/analytics/whatif/<jobID>/status | None | output [page 86]
<jobID> Yes Integer The job identifier that you get by creating a job first
None
The list of input parameters for the synchronous mode API and the asynchronous job creation API of the What
If service.
{
"datasetID",
"simulation":{
"variable",
"weights": [
{
"categories",
"range":{
"lowerBound",
"lowerBoundIncluded",
"higherBound",
Parameter | Required | Type | Description | Default
datasetID | Yes | Integer | The identifier of a dataset that has been registered in the schema. | N/A
variableDescription (Caution: Deprecated) | No | Array of objects | The tuples are name and value pairs for the following parameters that describe the variable: variable, storage, value, key, missing. Note: Only variable, storage, and value must be in the input. The other parameters can be omitted. | Null. The description stored with the dataset is used.
skippedVariables | No | Array of strings | The list of variables that should not be included in the analysis. | No variable is excluded.
weightVariable | No | String | The existing weight variable to be considered in the analysis. | Null. No variable is used as weight.
variable | Yes | String | The variable whose distribution is modified for the sake of the simulation. | N/A
weights | Yes | Array of objects | The list of changes to apply to the variable. Each change corresponds to a new weight assigned to a specific group of values. A group is specified either as a set of values or a range of values. | N/A
categories | No | Array of strings | The set of values which a new weight is assigned to. This list can contain strings but cannot contain null values. | N/A
range | No | Object | The range of values which a new weight is assigned to. | N/A
weight | Yes | Number | The new weight assigned to a group for the simulation. | N/A
Remember
You must specify either categories or range. The service returns an error if both are specified.
kxmissingIncluded | No | Integer | 1 if the group also includes missing values, else 0. | N/A
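For illustration, the sketch below increases the weight of one category of a CENSUS variable; the dataset ID and variable come from the clustering scenario later in this guide, while the category value "Private" and the weight are made up for this example.
{
   "datasetID" : 118,
   "simulation" : {
      "variable" : "workclass",
      "weights" : [
         {
            "categories" : ["Private"],
            "weight" : 2
         }
      ]
   }
}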
Related Information
The list of output parameters of the synchronous and asynchronous APIs of the What If service.
{
"parameters" : {
...
},
"deviations" : [
{
"variable",
"categories",
"statistics" : [
{
"category",
Parameter | Type | Description
deviations [page 85] | Array of objects | The list of the deviations observed after simulation.
categories | Array of strings | The list of all the deviant categories for the current deviant variable.
statistics | Array of objects | The comparison data of the current deviant variable between the original and simulation datasets. See table below.
originalFrequency | Number | The frequency of the current category in the original dataset.
simulationFrequency | Number | The frequency of the current category in the simulation dataset.
frequencyIncrease | Number | The relative increase of the frequency of the current category between the original and simulation datasets.
Related Information
The list of output parameters of the job creation and status check APIs.
{
"id",
"status",
"type",
"input": "{
}"
}
Possible values of status:
● NEW
● PROCESSING
● SUCCESSFUL
● FAILED
Possible values of type:
● clustering
● forecasts
● key_influencer
● outliers
● recommender
● recommendations_batch
● scoring_equation
● whatif
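For illustration, a status response for a finished clustering job might look like the following; the values are made up, and the assumption that input echoes the original request body as a JSON string is based on the schema above.
{
   "id" : 7,
   "status" : "SUCCESSFUL",
   "type" : "clustering",
   "input" : "{ \"datasetID\" : 118, \"numberOfClusters\" : [4, 5] }"
}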
Related Information
With this scenario, the end user will be able to segment a population into "interesting" / "not interesting"
clusters and focus on specific clusters of interest instead of the whole population.
Step | Description
Register the dataset and specify the primary key | Registering the Dataset [page 87]
Create the clustering model in asynchronous mode | Calling the Clustering Service [page 92]
Get the clustering job results | Getting the Results [page 94]
Get access to segmentation results | Accessing the Segmentation Results [page 95]
The SAP HANA schema is DATA_SCHEMA and contains the CENSUS dataset, which has the id column as
primary key. OUTPUT_SCHEMA is the schema that will contain the output table of the clustering process.
Note
For a description of the CENSUS dataset, see Datasets Available for SAP API Business Hub [page 164].
Request
URI: /api/analytics/dataset/sync
Request body:
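The original request body is not reproduced here. As a rough sketch only, registering the CENSUS table could look like the following; the parameter names hanaURL, schema, and table are assumptions inferred from other parts of this guide, so refer to the Dataset APIs reference for the exact request format.
{
   "hanaURL" : "<hana_host>:<port>",
   "schema" : "DATA_SCHEMA",
   "table" : "CENSUS"
}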
Response
{
"ID": 118,
"name": "CENSUS",
"numberOfColumns": 16,
"numberOfRows": 48842,
"variables": [
{
"name": "id",
"position": 0,
"storage": "integer",
"value": "continuous"
},
{
"name": "age",
"position": 1,
"storage": "integer",
"value": "continuous"
},
{
"name": "workclass",
"position": 2,
"storage": "string",
"value": "nominal"
},
{
"name": "fnlwgt",
"position": 3,
"storage": "integer",
"value": "continuous"
},
{
"name": "education",
"position": 4,
"storage": "string",
"value": "nominal"
},
{
"name": "education-num",
"position": 5,
"storage": "integer",
"value": "nominal"
},
{
"name": "marital-status",
"position": 6,
"storage": "string",
"value": "nominal"
},
{
"name": "occupation",
"position": 7,
"storage": "string",
"value": "nominal"
},
{
"name": "relationship",
A dataset does not require a primary key to use the clustering service. However, if it does not have one, performance is degraded because the whole dataset is copied to the segmentation results. To specify a primary key, modify the description of the id column of the dataset and indicate that it is the only component of the primary key. In this scenario, correct the description of some other variables at the same time.
URI: /api/analytics/dataset/118/variables/update
Request body:
[
{
"name" : "id",
"value" : "nominal",
"key" : 1
},
{
"name" : "education-num",
"value" : "ordinal"
},
{
"name" : "capital-gain",
"value" : "continuous"
},
{
"name" : "capital-loss",
"value" : "continuous"
}
]
Response
{
"ID": 118,
"name": "CENSUS",
"numberOfColumns": 16,
"numberOfRows": 48842,
"variables": [
{
"name": "id",
"position": 0,
"storage": "integer",
"value": "nominal",
"key" : 1
},
{
"name": "age",
"position": 1,
"storage": "integer",
"value": "continuous"
},
{
"name": "workclass",
"position": 2,
"storage": "string",
"value": "nominal"
},
{
"name": "fnlwgt",
"position": 3,
The clustering call must meet the following requirements:
● Not too many clusters, as each cluster would require a specific analysis (4 or 5).
● The clustering process is driven by a target indicator (the class column) to ideally get clusters with either a high or a low target rate. That way, you can focus on the former and ignore the latter.
● The segmentation results are exported to a table called MY_CLUSTERING in the OUTPUT_SCHEMA schema.
Send an asynchronous call that creates a clustering job. The job creates, trains, applies, and deletes a
clustering model.
Request
URI: /api/analytics/clustering
Request body:
{
"datasetID" : 118,
"numberOfClusters" : [4, 5],
"exportSettings" : {
"method" : "table",
"destination" : {
"schema" : "OUTPUT_SCHEMA",
"table" : "MY_CLUSTERING"
}
},
"target" : {
"column" : "class",
"value" : 1
}
}
Response
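The body of this response is a job description as listed in Job Response Body Parameters. Assuming the job is created with ID 7, as used by the status calls below, it might resemble the following sketch (the initial status value is an assumption):
{
   "ID": 7,
   "status": "NEW",
   "type": "clustering"
}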
Request
URI: /api/analytics/clustering/7/status
Response
{
"ID": 7,
"status": "SUCCESSFUL",
"type": "clustering"
}
Request
URI: /api/analytics/clustering/7
Response
{
"parameters" : {
"datasetID" : 118,
"numberOfClusters" : [4, 5],
"exportSettings" : {
"method" : "table"
"destination" : {
"schema" : "OUTPUT_SCHEMA",
"table" : "MY_CLUSTERING"
}
},
"target" : {
"column" : "class",
"value" : 1
}
},
"clustering" : {
"numberOfClusters" : 4
},
"clusters" : [
{
"ID" : 1,
"name" : "Cluster_1",
"frequency" : 0.0635,
"targetMean" : 0.7708
},
{
"ID" : 2,
"name" : "Cluster_2",
"frequency" : 0.5332,
"targetMean" : 0.0473
},
The clustering model has segmented all the data points of the dataset into four clusters:
● Cluster 1 has a high target rate (77%), but it is a bit small (6% of the dataset).
● Cluster 3's target rate is a bit lower (64%), but the cluster is twice as big (15%).
● Cluster 2's target rate is very low, which means targets from this cluster should be ignored.
● Nothing special comes from Cluster 4.
The clustering model has a high predictive power (69%), which means the cluster ID assigned to a data point is useful information for deducing the value of the target variable. The clusters are stable (prediction confidence = 99%), which means the cluster ID is reliable information for deducing the value of the target of a data point.
The cluster assignment information has been successfully exported to the MY_CLUSTERING table of the OUTPUT_SCHEMA schema.
Cluster assignment information is now available in the MY_CLUSTERING table. Since a primary key and a target have been specified, this table only contains:
You can get the complete view of the segmentation results where each data point has an additional column
containing their assigned cluster ID. Create the MY_CLUSTERING_FULL view to merge the CENSUS table and
the MY_CLUSTERING table using the id column as merge key.
Error messages may appear while you are using the predictive service.
If a request is unsuccessful, the call returns a message in JSON format, as follows:
{
"errors":[
{
"errorCode": string,
"errorMessage": string
},
{...},
...
]
}
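For example, a request that omits a mandatory parameter might return a payload like the sketch below; the exact message text depends on the missing parameter and is shown here only to illustrate the structure.
{
   "errors" : [
      {
         "errorCode" : "EXX101",
         "errorMessage" : "The mandatory datasetID parameter is missing. Please set datasetID."
      }
   ]
}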
Related Information
Code | HTTP Status | Message | Explanation
EXX100 | 500 | An error has occurred. | This is the default error message.
EXX101 | 400 | The mandatory <parameter> parameter is missing. Please set <parameter>. | You do not specify one of the mandatory parameters.
EXX102 | 400 | The "<variable>" variable used as <parameter> was not found in the dataset. | The dataset does not contain the specified variable.
EXX103 | 400 | The "<variable>" variable is used as <parameter> but its storage type is identified as <real storage type>. A variable used as <parameter> must have one of the following storage types: <storageType, storageType2, …>. | The specified variable does not have the required storage type.
EXX104 | 400 | The "<variable>" variable is used as <parameter> but its value type is identified as <real value type>. A variable used as <parameter> must have one of the following value types: <valueType, valueType2, …>. | The specified variable does not have the required value type.
EXX105 | 400 | The <parameter> parameter does not support the value "<value>". The supported values are "<value1, value2…>". | The current parameter does not support the specified value.
EXX106 | 400 | The "<variable>" variable does not contain a "<value>" value. | The dataset does not contain the specified value.
EXX107 | 400 | The <parameter> parameter must be a positive integer. | The value specified for the current parameter must be a positive integer.
EXX108 | 500 | The service cannot access the dataset <datasetID>. | The dataset identifier used with the service is not accessible.
EXX109 | 400 | The "<variable>" variable is used with multiple roles: <role1, role2>. | A variable is used with 2 or more roles.
EXX110 | 500 | An SQL error has occurred. | Various issues returned by the SAP HANA database, for example: a table that does not exist, wrong credentials, and so on.
EXX111 | 400 | You must specify either "<parameter1>",… or "<parameter>" parameter. | Only one parameter must be defined among a list of parameters. Either no or several parameters have been defined.
EXX112 | 403 | The dataset cannot contain a variable named "<variable>". Please rename this variable before using the service. | The name of a variable of the dataset corresponds to a variable internally created by the service.
EXX113 | 400 | <parameter> is not a valid parameter. Please refer to documentation for the JSON Schema of the request body of the service. | One of the request parameters is not valid.
EXX114 | 500 | An internal error has occurred: <submessage>. | An error occurred while training a data mining model. <submessage> refers to an error message raised by the predictive model engine.
EXX115 | 400 | The <parameter> parameter requires at least <n> values. | The number of values of one of the array type parameters is lower than the minimum required number of values.
EXX116 | 400 | The value set for the <parameter> parameter is not valid. | The value assigned to one of the request parameters is not valid.
EXX117 | 400 | The job specified as <setting> must refer to an existing <type> job with a SUCCESSFUL status. | One of the settings refers to a job which either is not accessible or does not have a SUCCESSFUL status.
EXX118 | 400 | <schema>.<table> is specified both as input dataset and destination table. | The destination table cannot be the table used as input dataset.
EXX119 | 400 | The body of the request is not a valid JSON string. | There are syntax issues in the request.
EXX120 | 403 | The service cannot write inside <schema>.<table>. | The service cannot write inside the specified destination table.
EXX121 | 400 | The <parameter> cannot be set to <value> if <parameter2> is set to <value2>. | This combination of parameters is not valid.
Code | HTTP Status | Message | Explanation
EDB002 | 500 | Datasource not initialized: binding=<binding_name> | The binding does not exist or has not been initialized properly.
EDB003 | 500 | No datasource, binding does not exist: binding=<binding_name> | The binding has been initialized but can't be used.
EDS101 | 400 | "<hanaURL>" is not accessible. Please make sure that hanaURL is correct and that access rights were granted to the predictive services. | The datasource URL is not accessible.
EDS102 | 404 | The dataset <datasetID> was not found. | The dataset ID does not exist or is unregistered.
EDS103 | 500 | The dataset registered with ID <datasetID> is not registered any more. Please unregister this dataset. | The dataset is still registered in the application but not accessible anymore.
EDS104 | 404 | The dataset <datasetID> does not have any variable at position <position>. | No variable is found at the specified position in the specified dataset.
EDS105 | 403 | The dataset <datasetID> cannot be deleted because it has dependencies: <datasetID1, datasetID2,...>. Please unregister the dependencies before unregistering this dataset. | The unregistered dataset must be deleted but it has dependencies.
EDS106 | 400 | The provided list of variables does not contain <variable>. | A list of variables has been provided, but it does not contain an existing dataset variable. Edit the list of variables.
EDS107 | 400 | The <variable> variable was not found in the dataset. | The provided list of variables contains a name that does not match any actual dataset variable. Edit the list of variables.
EDS108 | 400 | The <variable> variable is specified with a <storage_type> storage type, but is stored as <actual_storage_type> in the database. | The storage type of a specified variable does not match its actual representation in the database. Edit the list of variables.
EDS109 | 400 | The <variable> variable is specified more than once. | A variable should appear at most once in the list describing variable properties. Edit the list of variables.
EDS110 | 400 | The variable <variable> has incompatible storage (<storage_type>) and value (<value_type>) types. | The specified combination of settings either is not allowed or has no meaning. Edit the list of variables.
EDS111 | 400 | Column names with blank characters are not allowed in the datasets. The following columns have blank characters in their names: [<column_name1>,...,<column_namen>] | The specified dataset contains one or more columns with names containing spaces. These datasets are not supported.
EJB101 | 404 | The <service> job <jobID> was not found. | The specified job ID does not exist or does not refer to the specified service.
This release allows you to train an Automated Analytics model on a dataset and to apply this model on a new
dataset.
End-users can create simple Automated Analytics models by specifying only a few settings. They can then run
these models on a specific set of data to produce predictive results. The Predictive Analytics Integrator
services rely on new concepts that describe the models to be managed:
Object Description
Catalog Catalogs are containers for predictive scenarios and datasets. They behave like folders in a
file system.
Predictive scenario Applications interact with predictive models through a predictive scenario.
Dataset A dataset is a reference to a physical data source, such as a database table or view.
Task A task is the object that you run in order to train and apply predictive models.
Model version A model version represents a trained model. It contains a reference to a physical model
within the back-end system.
Remember
The predictive scenario is a static interface between the consuming application and the predictive
capabilities. It abstracts the concept of models and model versions away from the application developer.
This way, applications can interact with a predictive scenario without knowing anything about the physical
model behind it.
First, the user answers the business question by creating a predictive scenario without a signature, which is the
description of the input datasets and output results. Then, they have to create a Train task and provide certain
settings to generate a model and a model version. The user indicates if the model version is active when the
The user must browse the catalog to find the predictive scenario. If the predictive scenario already has a
signature defined, it is read-only at this point and then the model version added is used to validate the model
against the signature. No automatic extraction is done. If the validation fails, the model version is not added to
the scenario.
Remember
● The Train task works only with Automated Analytics models.
● The Apply task does not create a new model version. Apply is focused on data in and data out.
A predictive scenario is a logical entity that needs data accessed through an SAP HANA table or view in order
to train models and generate predictions. To this end, it is possible to bind a dataset to a predictive scenario for
use as the default when tasks are run. Default bindings can be overridden by input bindings set at the task level
between an input dataset and a task. The back-end system validates the binding against the signature of the
predictive scenario.
Related Information
The back-end system validates the dataset that is bound to a predictive scenario to make sure it conforms to
the signature of the predictive scenario. Only conforming datasets can be bound to the corresponding
predictive scenario. A conforming dataset has column name and storage values identical to those of the
signature input structure.
Before an end user can run a task against a predictive scenario, the model version needs to be made active for
that predictive scenario. There are three ways to do this:
● Setting the AutoActivate parameter of the Train Task object to true before running it. The model
version created is activated automatically once the task is finished.
● Setting the Active parameter of the ModelVersion object to true once the task has run.
● Setting the ActiveModelVersion parameter of the PredictiveScenario object via the call
[PUT] /api/pai/PredictiveScenarios('GUID')/$links/ActiveModelVersion with the
reference to the model version in the request.
You can choose the method you prefer as they are equivalent to each other.
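For example, the third method corresponds to a call of the following shape; the GUIDs are placeholders, and a concrete example appears in the usage scenarios later in this chapter.
Request
URI: /api/pai/PredictiveScenarios('<scenarioGUID>')/$links/ActiveModelVersion
Request body:
{
   "uri" : "http://<server>:<port>/api/pai/ModelVersions('<modelVersionGUID>')"
}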
Related Information
The Predictive Analytics Integrator services allow an end user to create, consume, and manage predictive
models.
Overview
These services are a set of OData REST APIs that you use to integrate predictive analysis features into a cloud
application. This release allows you to train an Automated Analytics model on a dataset and to apply this model
on a new dataset.
Test these APIs directly in the SAP API Business Hub with the sample datasets described in Datasets
Available for SAP API Business Hub [page 164].
Permissions: sap.hana.pai::ExecutePAI
Common Headers
Note
According to the OData specification, a call that returns an entity always returns the entity metadata ("d:"{ "__metadata": {...). To avoid returning metadata, add odatametadata=none to the Accept request header: application/json;odatametadata=none.
Header Description
Accept application/json
The service uses the common OData 2.0 response codes described in this table. Some specific codes are used
for request validation errors (400) and not found objects (404). For 500 error codes, a generic message is
returned and the exception is written to the logs.
Code Reason
200 OK. Indicates that a request has been received and processed successfully by a data service
and that the response body is not empty.
202 Accepted. Indicates that a batch request has been accepted for processing, but that the
processing has not been completed.
204 No Content. Indicates that a request has been received and processed successfully by a
data service and that the response does not include a response body.
400 Bad Request. Indicates that the payload, request headers, or request URI provided in a request are not correctly formatted according to the syntax rules defined in this document.
404 Not Found. Indicates that a segment in the request URI's resource path does not map to an
existing resource in the data service. A data service may respond with a representation of an
empty collection of entities if the request URI addressed a collection of entities.
405 Method Not Allowed. Indicates that a request used an HTTP method not supported by the
resource identified by the request URI.
412 Precondition Failed. Indicates that one or more of the conditions specified in the request
headers evaluated to false.
500 Internal Server Error. Indicates that a request being processed by a data service encountered an unexpected error during processing.
{
"error": {
"code",
"message": {
"lang":"en",
"value"
}
}
}
where:
Note
The same applies for asynchronous tasks, except that the message and code are written to the Message
property of the Task object.
Related Information
http://www.odata.org/documentation/odata-version-2-0/
The object that the application code deals with first is the PredictiveScenario. This object contains the
Task (Apply or Train) and the Model. The model does not represent the underlying model itself (predictive
model or pipeline), it is only a container for ModelVersion objects, which represent the actual underlying
models. The PredictiveScenario also references the active model version, if there is one.
The application code stores the end user datasets and the Apply results in the SAP HANA database. The model
version references the underlying model managed by the back-end system of the services and stored in SAP
HANA.
Properties
Each entity or object has a set of properties. Some of these properties are extracted from the underlying model
or created by the back-end system automatically, while others need to be provided by the application end user.
System-provided properties are read-only, while user-provided properties can be modified. The application
code does not have to set system-provided properties in a request, since most of them are actually extracted
from the model content. User-provided properties can be set either at object creation time only or at any time.
Containment Relationships
The model makes heavy use of containment relationships between many of the objects so that their lifecycles
are connected. When you delete the parent, all children are deleted automatically. So you don’t have to manage
each object individually. A containment relationship is identified by a navigation property. See the Usage
Scenarios [page 131] for examples.
Deep Insert
OData allows you to call a deep insert, which creates multiple objects at the same time by simply embedding
them inside each other. This is similar to what you get at query time when you use $expand. For example, you
can define a task directly inside the Tasks property of a PredictiveScenario and then pass the whole thing
in so you create both the PredictiveScenario and the child Task at the same time.
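As a sketch, a deep insert that creates a PredictiveScenario together with a child Train Task could look like the request below; the names and target are made up, and a real request would also bind an input Dataset in Bindings, as shown in the second usage scenario at the end of this chapter.
{
   "Name" : "MyScenario",
   "ScenarioType" : "Classification",
   "Tasks" : [
      {
         "Name" : "TrainTask",
         "TaskType" : "Train",
         "Definition" : { "Target" : "class" }
      }
   ]
}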
Resources
Resource | Description | Resource path
Dataset [page 119] | Represents a physical data source, such as an SAP HANA database table or view. | /Datasets
Model [page 127] | Represents the physical model. Acts as a container for ModelVersion objects. | /Models
Model Version [page 129] | Represents the physical trained model with all its information (version, metadata, and metrics). | /ModelVersions
Common Properties
In addition to any standard OData properties, all the objects have the following properties in common.
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Type is Edm.DateTime.
6.3.2 Catalog
Catalogs are a simple way to organize predictive scenarios and datasets. They behave like folders in a file
system. A Catalog has a name and optionally a reference to a parent Catalog. Each Catalog has a
Resource Path:/Catalogs
Operations
CRUD Operations
Properties
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Property | Type | Description | From | Required on POST
Parent | Catalog object or null | The parent Catalog of the current object. Note: The target entity set in which an entity will be created will determine the parent. POST /Catalogs results in a null Parent (root object) whereas POST /Catalogs('GUID')/Catalogs sets the entity with the GUID identifier as parent. | User | Yes
Example
Here's a Catalog object, as it can be returned by a GET request:
{
"d": {
"__metadata": {
"id": "http://<server>:<port>/api/pai/
Catalogs('29eb9468-58c8-431d-9f71-27951e6860bb')",
"uri": "http://<server>:<port>/api/pai/
Catalogs('29eb9468-58c8-431d-9f71-27951e6860bb')",
"type": "com.sap.aa.ii.backend.ODataCatalog"
},
"CatalogType": "DemoFolder",
"Type": "Catalog",
"CreationTime": "2016-05-26T17:01:26.233",
"Description": "",
"GUID": "29eb9468-58c8-431d-9f71-27951e6860bb",
"LastModificationTime": "2016-05-26T17:01:26.233",
"Name": "MyCatalog",
"Path": "MyCatalog",
Related Information
A PredictiveScenario is the interface through which an application can interact with predictive models. A
PredictiveScenario can be child of a Catalog or a root object by itself.
A predictive scenario has a signature, which describes the input datasets and output results. Only models and
datasets conforming to this signature can be used in the predictive scenario. The real physical model will be
dynamically selected at runtime by resolving the predictive scenario into a particular model version. The
signature will never change during the lifespan of the predictive scenario. It assures you that the predictive
scenario stays the same even if the model implementation changes behind the scenes.
Operations
CRUD Operations
Properties
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Property | Type | Description | From | Required on POST
Parent | Catalog object or null | The parent Catalog of the current object. | User | No
Signature [page 113] | Object | The description of the input datasets and output results that make up the interface of a model. | System/User (optional) | No
Models | Array of Model objects | The list of child Model objects of the current PredictiveScenario. | User | No
Inputs | Array of objects | The variables of the input datasets required by the model. Note: This release only accepts an array of size 1 (one input). | User | No
Outputs | Array of objects | The variables of the output datasets generated by the model when an Apply task is run. | User | No
Name | String | For inputs, the name of the variable in the model. For outputs, the name of the column in the Apply output table. Note: Name is important for validation purposes. | User | No
Outputs | Array of objects | The physical location where the data should be stored when the Apply task is run. | User | No
Reference | Dataset object | The URI reference to the Dataset object to be used as input. | User | No
Location | Object | The physical location of the SAP HANA table or view containing the data. | User | No
TableName | String | The name of the SAP HANA table or view. | User | Yes
Example
Here's a PredictiveScenario object without default binding, as it can be returned by a GET request:
{
"__metadata":{
"id":"/api/pai/PredictiveScenarios('6c156b2d-
da98-4f83-80a2-62574e590e37')"
},
"GUID":"6c156b2d-da98-4f83-80a2-62574e590e37",
"Name":"FraudsterDetector",
"Parent":"/api/pai/Catalogs('8358096b-039e-47d9-84d0-f30a0ddb4ba2')",
"Path":"MyFunctionalArea/FraudsterDetector",
"Type":"PredictiveScenario",
"Description":"Find the persons who might lie regarding their age",
"CreationTime":"2016-05-26T17:01:26.233",
"LastModificationTime":"2016-05-26T17:01:26.233",
"ScenarioType":"Regression",
"Signature":{
"Inputs":[
{
"Name":"inputDataset",
"Description":"Structure of input dataset expected by the
predictive model",
"Structure":[
{
"Name":"id",
"Storage":"Integer",
"Type":"Key"
},
{
"Name":"age",
"Storage":"Integer",
"Type":"Target"
},
{
"Name":"workclass",
"Storage":"String(16)"
},
{
6.3.4 Dataset
A Dataset is a reference to a physical data source, such as a database table or view. A Dataset can be the
child of a Catalog or a root object by itself.
A dataset can be bound to a predictive scenario for use as input when performing tasks. The back-end system
ensures that only datasets conforming to the signature of the predictive scenario can be bound to that
predictive scenario. A Dataset can be used by multiple PredictiveScenario objects.
Operations
CRUD Operations
Properties
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Property | Type | Description | From | Required on POST
Parent | Catalog object or null | The parent Catalog object of the current object. | User | No
Location | Object | The physical location of the SAP HANA table or view containing the data. | User | Yes
TableName | String | The name of the SAP HANA table or view. | User | Yes
Name | String | The name of the variable column in the table or view. | System
Example
Here's a Dataset object, as it can be returned by a GET request:
{
"__metadata":{
"id":"/api/pai/Datasets('b0bcd522-3728-4deb-9f33-ad1580ab1ca5')"
},
"GUID":"b0bcd522-3728-4deb-9f33-ad1580ab1ca5",
"Name":"DATA_SCHEMA_CENSUS",
"Parent":"/api/pai/Catalogs('8358096b-039e-47d9-84d0-f30a0ddb4ba2')",
"Path":"MyFunctionalArea/DATA_SCHEMA_CENSUS",
"Type":"Dataset",
"Description":"Example Dataset",
"CreationTime":"2016-05-26T17:01:26.233",
"LastModificationTime":"2016-05-26T17:01:26.233",
"Location":{
"Schema":"DATA_SCHEMA",
Related Information
6.3.5 Task
Task objects are children of PredictiveScenario objects. A Task is deleted when the
PredictiveScenario is deleted.
Tasks are run against PredictiveScenario objects. The logical predictive scenario is resolved into a physical
model version when the Apply task is run.
Tasks normally run asynchronously in the background. If the application code knows that a task will be fast-
running and wants to wait for the operation to complete, it can mark the task as synchronous.
Remember
Possible operations in this release are Apply and Train.
Operations
CRUD Operations
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Property | Type | Description | From | Required on POST
Synchronous | Boolean | Indicates whether the task will run synchronously with its creation. | User | No
Messages | Array of objects | The error code and message returned if the task fails (TaskStatus is Failure). | System
Example
Train Task
{
"__metadata": {
"id": "/api/pai/Tasks('7d60b5ac-caec-4b61-95f1-1a3a4b55cfef')"
},
"GUID" : "7d60b5ac-caec-4b61-95f1-1a3a4b55cfef",
"Name" : "FraudsterDetector 2016-11-20T09:00:32.153",
"Parent" : "/api/pai/PredictiveScenarios('6c156b2d-
da98-4f83-80a2-62574e590e37')",
"Path" : "FraudsterDetector/FraudsterDetector 2016-05-26T17:01:26.233",
"Type" : "Task",
"Description" : "Train a new classification model",
"CreationTime" : "2016-11-20T09:00:32.153",
"LastModificationTime" : "2016-11-20T09:00:32.153",
"TaskType" : "Train",
"Definition": {
"Target" : "class",
"Key" : ["id"],
"AutoActivate": true,
"ApplyOutput" : {
"Reasons" : {
"Positive" : 3,
"Negative" : 1
}
}
},
"Bindings" : {
"Inputs" : [
{
"Name" : "inputDataset",
"Reference" : "/api/pai/Datasets('b0bcd522-3728-4deb-9f33-
ad1580ab1ca5')"
}
]
},
"TaskStatus" : "Processing"
}
Example
Apply Task
Here's an Apply Task object with overriding bindings, as it can be returned by a GET request:
{
"__metadata": {
"id": "/api/pai/Tasks('69b5cfb4-8b4b-4428-bb5f-766d055798cd')"
Related Information
6.3.6 Model
The Model object can be viewed as a container for child ModelVersion objects.
Models cannot be created by themselves. They must always have at least one model version, and most of the
metadata is extracted from the underlying model content.
Operations
CRUD Operations
Properties
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Property | Type | Description | From
ModelType | String | The native model type extracted from the physical model. For example, Kxen.RobustRegression. | System
Example
Here's a Model object, as it can be returned by a GET request:
{
"__metadata": {
"id": "/api/pai/Models('f21511b7-d7a8-46ff-a6f1-85e94a433d8c')"
},
"GUID" : "f21511b7-d7a8-46ff-a6f1-85e94a433d8c",
"Name" : "K2R_Census_Age",
Related Information
A ModelVersion object references a real physical model stored in SAP HANA. It is attached to a
PredictiveScenario. ModelVersion objects are children of Model objects.
Additionally, the ModelVersion object contains a version number and some model metrics. It can be marked
as "active" and then applications can access this active version from the ActiveModelVersion property of a
predictive scenario.
Operations
CRUD Operations
Properties
Remember
In this document, the description of a property shows if it is provided by the system ('System' in 'From'
column) or can be input by the application end user ('User'). User-provided properties may or may not be
required in the POST request when creating the object ('No' in 'Required on POST' column). If a property set
by the end user cannot be modified, it is also mentioned in the description.
Property | Type | Description | From | Required on POST
Parent | Model object | The parent Model of the current ModelVersion in which the model version will live. | User | Yes
Example
Here's a ModelVersion object, as it can be returned by a GET request:
{
"__metadata": {
"id": "/api/pai/ModelVersions('27f91587-dbe3-4b1d-a4a7-9926247428b3')"
},
"GUID" : "27f91587-dbe3-4b1d-a4a7-9926247428b3",
"Name" : "K2R_Census_age_Version_3",
"Parent" : "/api/pai/Models('f21511b7-d7a8-46ff-a6f1-85e94a433d8c')",
"Path" : "MyFunctionalArea/FraudsterDetector/K2R_Census_Age/
K2R_Census_age_Version_3",
"Type" : "ModelVersion",
"Description" : "Target = \"age\", NumberOfReasonCode=3",
"Version" : 1,
"CreationTime" : "2016-05-26T17:40:45.333",
"LastModificationTime" : "2016-05-26T17:40:45.333",
"Active" : true,
"Metrics" : [
{
"Name" : "predictivePower",
"Value" : 0.6275,
"Flag" : "HigherIsBetter"
}, {
"Name" : "predictionConfidence",
"Value" : 0.99,
"Flag" : "HigherIsBetter"
}
]
}
Related Information
The following scenarios illustrate a typical usage of the services to create and consume a predictive model.
Note
In these scenarios, the call responses show deserialized complex properties.
Related Information
In this scenario, the end user answers a business question by performing a predictive analysis on their
customer data stored on SAP HANA.
Step | Description
Create a predictive scenario without model | Creating a Predictive Scenario [page 132]
Create a Train task to initialize the predictive scenario with an automated model | Training a Model [page 136]
Apply the predictive scenario on a dataset stored in an SAP HANA database table and get the results | Applying the Model [page 141]
You create a predictive scenario with minimal information. There is no underlying physical model.
Request
URI: /api/pai/PredictiveScenarios
{
"Name" : "CustomerClassification",
"Description" : "Identify people with gain over 40K USD",
"ScenarioType" : "Classification"
}
Response
The response contains properties generated by the service plus information provided by the request. Since
there is no underlying model from which to extract model metadata, it does not contain any signature.
{ "d": {
"__metadata": {
"id": "https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')",
"uri": "https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')",
"type": "com.sap.aa.ii.backend.ODataPredictiveScenario"
},
"Bindings": "",
"ScenarioType": "Classification",
"Signature": "",
"Type": "PredictiveScenario",
"CreationTime": "2016-12-28T10:48:47.329",
"Description": "Identify people with gain over 40K USD",
"GUID": "9ed39768-c0fb-4085-abfb-54a6ae6f88c6",
"LastModificationTime": "2016-12-28T10:48:47.329",
"Name": "CustomerClassification",
"Path": "CustomerClassification",
"ActiveModelVersion": {
"__deferred": {
"uri": "https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')/ActiveModelVersion"
}
},
"Models": {
"__deferred": {
"uri": "https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')/Models"
}
},
"Parent": {
"__deferred": {
"uri":"https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')/Parent"
}
},
"Tasks": {
"__deferred": {
"uri": "https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')/Tasks"
}
}
}
}
Note
ActiveModelVersion, Models, Parent, and Tasks are navigation properties between the predictive
scenario and its children or parent. They are defined in the service metadata. According to the OData
You create a dataset with minimal information, that is, the location of the dataset in SAP HANA.
Note
The CENSUS table is a sample dataset that the database user can read (SELECT right). CENSUS will be used
to train and apply the model. SERVICE_TEST is the schema within the same SAP HANA instance as the
deployed predictive service.
Request
URI: /api/pai/Datasets
Request body:
{
"Name": "Census",
"Description": "Census demo dataset",
"Location": { "Schema": "SERVICE_TEST", "TableName": "CENSUS" }
}
Response
The response contains properties generated by the service, plus information provided by the request. The service returns the list of input variables from the dataset whose table name and schema were passed in the request. The service has also generated a specific GUID for this dataset.
{
"d":{
"__metadata":{
"id":"https://<server>:<port>/api/pai/Datasets('3f4ba129-87b2-4cc6-99dd-
d07b37ec377b')",
"uri":"https://<server>:<port>/api/pai/
Datasets('3f4ba129-87b2-4cc6-99dd-d07b37ec377b')",
"type":"com.sap.aa.ii.backend.ODataDataset"
},
"Location":{
"Schema":"SERVICE_TEST",
"TableName":"CENSUS"
},
"Type":"Dataset",
"Columns":[
Create a train task with the input binding on the table used for train and a given target variable.
Caution
When providing the Input Dataset for the training Task, if the dataset contains a DECIMAL column, the
precision and the scale must be included in the column type definition. Make sure to provide both
arguments at all times to avoid errors during the creation of the ModelVersion. For more details see: SAP
HANA SQL and System Views Reference.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')/Tasks
Request body:
An input dataset is bound to the task by a reference in bindings. Definition specifies the target variable to be
used when training the model.
{
"Name":"TrainTask",
"TaskType":"Train",
"Definition":{
"Target":"class"
},
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Reference":"/api/pai/Datasets('3f4ba129-87b2-4cc6-99dd-
d07b37ec377b')"
}
]
}
}
Response
The response contains properties generated by the service plus information provided by the request. The task is still ongoing, as shown by TaskStatus, and asynchronous by default, as shown by Synchronous being null.
{
"d":{
"__metadata":{
"id":"https://<server>:<port>/api/pai/Tasks('01ee5873-30ba-4c8f-8210-
b08dc4932c9b')",
"uri":"https://<server>:<port>/api/pai/Tasks('01ee5873-30ba-4c8f-8210-
b08dc4932c9b')",
"type":"com.sap.aa.ii.backend.ODataTask"
},
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Mapping":null,
"Reference":"/api/pai/
Datasets('bad17c49-1053-4854-9b0e-1d48163069b9')"
}
]
},
"Definition":{
"Target":"class"
},
"Messages":"[]",
"Name":"TrainTask",
"TaskStatus":"Pending",
"TaskType":"Train",
"Type":"Task",
"Synchronous":null,
"CreationTime":"2016-05-26T17:01:26.233",
"Description":"",
"GUID":"01ee5873-30ba-4c8f-8210-b08dc4932c9b",
"LastModificationTime":"2016-05-26T17:01:26.233",
"Path":"CustomerClassification/TrainTask",
"ModelVersion":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
Tasks('01ee5873-30ba-4c8f-8210-b08dc4932c9b')/ModelVersion"
}
},
"Parent":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
Tasks('01ee5873-30ba-4c8f-8210-b08dc4932c9b')/Parent"
}
}
}
}
Note
ModelVersion and Parent are navigation properties between the task and its children or parent. They are
defined in the service metadata. According to the OData specification, the __deferred property contains a
link to the object when this one is not asked to be returned as a whole object.
Request
URI: /api/pai/Tasks('69b5cfb4-8b4b-4428-bb5f-766d055798cd')/TaskStatus
Response
{
"d":{
"TaskStatus":"Success"
}
}
The user activates the model version by setting a specific URI corresponding to the actual model version to the
ActiveModelVersion property.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')/$links/
ActiveModelVersion
Request body:
{
"uri" : "http://<server>:<port>/api/pai/ModelVersions('97ad0c02-ee9d-485c-b120-
ac41c0f8cc85')"
}
Response
The user can check the ActiveModelVersion property to see if the model has been activated successfully.
This is an optional request.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')?
$expand=ActiveModelVersion
Response
The object requested and returned is the PredictiveScenario. The $expand parameter set to the
ActiveModelVersion property allows you to get a response that also contains the whole active
ModelVersion object. The signature has been created from the underlying model (active model version) for
the current predictive scenario. The system has computed some metrics when the model has been trained
against the input dataset, and the service has returned them.
{
"d":{
"__metadata":{
"id":"https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')",
"uri":"https://<server>:<port>/api/pai/PredictiveScenarios('9ed39768-
c0fb-4085-abfb-54a6ae6f88c6')",
"type":"com.sap.aa.ii.backend.ODataPredictiveScenario"
},
"Bindings":"",
"ScenarioType":"Classification",
"Signature":{
"Inputs":[
{
"Name":"inputDataset",
"Description":"",
"Structure":[
{
"Name":"id",
"Storage":"Integer",
"Type":"Key"
},
{
...
},
{
"Name":"class",
"Storage":"SmallInteger",
"Type":"Target"
}
]
}
Note
Parent and Tasks are navigation properties between the model version and its children or parent. They are
defined in the service metadata. According to the OData specification, the __deferred property contains a
link to the object when this one is not asked to be returned as a whole object.
The predictive scenario has an underlying model. Now you create an Apply task to predict values for the target
variable on a new dataset. Here the same dataset is used.
Note
CENSUS_RESULT is the table that will receive the result of the Apply task. The predictive service DB user
must have rights to the table specified in the output binding for the Apply.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')/Tasks
Request body:
{
"Name":"ApplyTask",
"TaskType":"Apply",
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Reference":"/api/pai/Datasets('3f4ba129-87b2-4cc6-99dd-
d07b37ec377b')"
}
],
"Outputs":[
{
"Name":"outputDataset",
"Location":{
"Schema":"SERVICE_TEST",
"TableName":"CENSUS_RESULT"
}
}
Response
The response contains properties generated by the service plus information provided by the request. Definition specifies the target variable to be used when applying the model. The task is still ongoing, as shown by TaskStatus, and asynchronous by default, as shown by Synchronous being null.
{
"d":{
"__metadata":{
"id":"https://<server>:<port>/api/pai/Tasks('01ee5873-30ba-4c8f-8210-
b08dc4932c9b')",
"uri":"https://<server>:<port>/api/pai/Tasks('01ee5873-30ba-4c8f-8210-
b08dc4932c9b')",
"type":"com.sap.aa.ii.backend.ODataTask"
},
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Mapping":null,
"Reference":"/api/pai/
Datasets('bad17c49-1053-4854-9b0e-1d48163069b9')"
}
],
"Outputs":[
{
"Name":"outputDataset",
"Location":{
"Schema":"SERVICE_TEST",
"TableName":"CENSUS_RESULT"
}
}
]
},
"Messages":"[]",
"Name":"ApplyTask",
"TaskStatus":"Pending",
"TaskType":"Apply",
"Type":"Task",
"Synchronous":null,
"CreationTime":"2016-05-26T17:01:29.233",
"Description":"",
"GUID":"01ee5873-30ba-4c8f-8210-b08dc4932c9b",
"LastModificationTime":"2016-05-26T17:01:26.233",
"Path":"CustomerClassification/ApplyTask",
"ModelVersion":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
Tasks('01ee5873-30ba-4c8f-8210-b08dc4932c9b')/ModelVersion"
}
},
"Parent":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
Tasks('01ee5873-30ba-4c8f-8210-b08dc4932c9b')/Parent"
}
}
}
}
Note
ModelVersion and Parent are navigation properties between the task and its children or parent. They are
defined in the service metadata. According to the OData specification, the __deferred property contains a
link to the related object when that object is not requested to be returned inline.
Request
URI: /api/pai/Tasks('01ee5873-30ba-4c8f-8210-b08dc4932c9b')/TaskStatus/$value
Response
"Success"
This scenario illustrates the concept of deep insert in OData through the creation of a predictive scenario and a
Train task together in one request.
Step 1: Create a predictive scenario with a Train task to initialize the predictive scenario with an automated model. See Creating a Predictive Scenario with a Train Task [page 146].
Step 2: Apply the predictive scenario on a dataset stored in an SAP HANA database table and get the results. See Applying the Model [page 155].
You create a dataset with minimal information, that is, the location of the dataset in SAP HANA.
Note
The CENSUS table is a sample dataset that the database user can read (SELECT right). CENSUS will be used
to train and apply the model. SERVICE_TEST is the schema within the same SAP HANA instance as the
deployed predictive service.
Request
URI: /api/pai/Datasets
Request body:
{
"Name": "Census",
"Description": "Census demo dataset",
"Location": { "Schema": "SERVICE_TEST", "TableName": "CENSUS" }
}
Response
The response contains properties generated by the service, plus information provided by the request. The
service returns the list of input variables from the dataset whose table name and schema were passed in the
request. The service has also generated a unique GUID for this dataset.
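As an illustration, the following Java sketch sends this registration request with the standard java.net.http client, assuming the Datasets collection accepts a POST with the JSON body shown above; the server address and the omitted authentication are placeholders to adapt.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterDataset {
    public static void main(String[] args) throws Exception {
        // Hypothetical server address; replace it and add authentication as required
        String base = "https://myserver.example.com:443/api/pai";
        String body = "{"
                + "\"Name\": \"Census\","
                + "\"Description\": \"Census demo dataset\","
                + "\"Location\": { \"Schema\": \"SERVICE_TEST\", \"TableName\": \"CENSUS\" }"
                + "}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(base + "/Datasets"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // The response body contains the generated GUID and the detected input variables
        System.out.println(response.body());
    }
}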
Add task properties to the request along with predictive scenario details.
Request
URI: /api/pai/PredictiveScenarios
Request body:
An input dataset is bound to the task by a reference in bindings. Definition specifies the target variable to be
used when training the model.
{
"Name":"CustomerClassification",
"Description":"Identify people with gain over 40K USD",
"ScenarioType":"Classification",
"Tasks":[
{
"Name":"TrainTask",
"TaskType":"Train",
"Definition":{
"Target":"class"
},
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Reference":"/api/pai/Datasets('3f4ba129-87b2-4cc6-99dd-
d07b37ec377b')"
}
]
}
}
]
}
Response
The predictive scenario and the task have been created, but the signature is still empty. The task is still in
progress, as shown by TaskStatus, and it is asynchronous by default, as shown by Synchronous being set to
null. The user has to wait for the task to finish to get a model version attached to a model, and a model
attached to the predictive scenario.
{
"d":{
"__metadata":{
"id":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')",
"uri":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')",
"type":"com.sap.aa.ii.backend.ODataPredictiveScenario"
},
"Bindings":"",
"ScenarioType":"Classification",
"Signature":"",
"Type":"PredictiveScenario",
"CreationTime":"2016-05-26T17:01:26.233",
"Description":"Identify people with gain over 40K USD",
"GUID":"437a5303-2512-4c16-852b-190659474096",
"LastModificationTime":"2016-05-26T17:01:26.233",
"Name":"CustomerClassification",
"Path":"CustomerClassification",
"ActiveModelVersion":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')/ActiveModelVersion"
}
},
"Models":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')/Models"
}
},
"Parent":{
"__deferred":{
"uri":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')/Parent"
}
},
"Tasks":{
"results":[
{
"__metadata":{
"id":"https://<server>:<port>/api/pai/
Tasks('ef390129-2cc1-40ea-b2e6-002301353ee3')",
"uri":"https://<server>:<port>/api/pai/
Tasks('ef390129-2cc1-40ea-b2e6-002301353ee3')",
"type":"com.sap.aa.ii.backend.ODataTask"
},
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Reference":"/api/pai/Datasets('2a07adaa-403f-44e9-bda4-
a6e9af9fe297')",
This mode is asynchronous by default. Check the task status. The predictive scenario is returned with a
signature once the task is finished.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')/
Tasks('ef390129-2cc1-40ea-b2e6-002301353ee3')
{
"d":{
"results":[
{
"Definition":{
"Target":"Class"
},
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Mapping":null,
"Reference":"/api/pai/Datasets('b03f57a4-373d-4b03-b709-
eca0f577ed0d')"
}
]
},
"Messages":"",
"Name":"TrainTask",
"TaskStatus":"Pending",
"TaskType":"Apply",
"Type":"Task",
"Synchronous":null,
"CreationTime":"2016-05-26T17:01:26.233",
"Description":"",
"GUID":"1a4ef494-2115-4c7f-96f7-a79fd983f62f",
"LastModificationTime":"2016-05-26T17:01:26.233",
"Path":"CustomerClassification/ApplyTask"
}
]
}
}
The user activates the model version by setting a specific URI corresponding to the actual model version to the
ActiveModelVersion property.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')/$links/
ActiveModelVersion
Request body:
{
"uri" : "http://<server>:<port>/api/pai/ModelVersions('6ccbef39-a9f5-4f74-
a453-6fb0103fafa5')"
}
The user can check the ActiveModelVersion property to see if the model has been activated successfully.
This is an optional request.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')?
$expand=ActiveModelVersion
Response
The $expand parameter set to the ActiveModelVersion property allows you to get the response that
contains the whole active ModelVersion object. The signature has been created from the underlying model
(active model version) for the current predictive scenario.
{
"d":{
"__metadata":{
"id":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')",
"uri":"https://<server>:<port>/api/pai/
PredictiveScenarios('437a5303-2512-4c16-852b-190659474096')",
"type":"com.sap.aa.ii.backend.ODataPredictiveScenario"
},
"Bindings":"",
"ScenarioType":"Classification",
"Signature":{
"Inputs":[
{
"Description":"",
"Name":"inputDataset",
"Structure":[
{
"Name":"id",
"Storage":"Integer",
"Type":"Input"
},
{
"Name":"age",
"Storage":"TinyInteger",
"Type":"Input"
Note
Parent, Models, and Tasks are navigation properties between OData objects. They are defined in the
service metadata. For example:
<NavigationProperty Name="Parent"
  Relationship="com.sap.aa.ii.backend.ODataModelVersion_Parent_ODataModel_Versions"
  FromRole="Parent" ToRole="Versions"/>
<AssociationSet Name="ODataModelVersion_Parent_ODataModel_Versions"
  Association="com.sap.aa.ii.backend.ODataModelVersion_Parent_ODataModel_Versions">
  <End EntitySet="Models" Role="Versions"/>
  <End EntitySet="ModelVersions" Role="Parent"/>
</AssociationSet>
According to the OData specification, the __deferred property contains a link to the related object when that
object is not requested to be returned inline.
The predictive scenario has an underlying model. Now you create an Apply task to predict values for the target
variable on a new dataset. Here the same dataset is used.
Note
CENSUS_RESULT is the table that will receive the result of the Apply task. The predictive service DB user
must have rights to the table specified in the output binding for the Apply.
Request
URI: /api/pai/PredictiveScenarios('9ed39768-c0fb-4085-abfb-54a6ae6f88c6')/Tasks
Request body:
{
"Name":"ApplyTask",
"TaskType":"Apply",
"Bindings":{
"Inputs":[
{
"Name":"inputDataset",
"Reference":"/api/pai/Datasets('3f4ba129-87b2-4cc6-99dd-
d07b37ec377b')"
}
],
"Outputs":[
{
"Name":"outputDataset",
"Location":{
"Schema":"SERVICE_TEST",
"TableName":"CENSUS_RESULT"
}
}
]
}
}
Response
The response contains properties generated by the service plus information provided by the request.
Definition specifies the target variable to be used when applying the model. The task is still in progress, as
shown by TaskStatus, and it is asynchronous by default, as shown by Synchronous being set to null.
{
"d":{
"__metadata":{
Note
ModelVersion and Parent are navigation properties between the task and its children or parent. They are
defined in the service metadata. According to the OData specification, the __deferred property contains a
link to the related object when that object is not requested to be returned inline.
Request
URI: /api/pai/Tasks('01ee5873-30ba-4c8f-8210-b08dc4932c9b')/TaskStatus/$value
Response
"Success"
As specified when creating the Apply task, the application has to query the SERVICE_TEST.CENSUS_RESULT
table to get the apply results. The generated column is proba_rr_class.
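For example, an application could read the results over JDBC as sketched below. The connection URL, credentials, and the presence of the SAP HANA JDBC driver on the classpath are assumptions to adapt to your system; only the schema, table, and proba_rr_class column names come from this scenario.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReadApplyResults {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details for the SAP HANA instance hosting SERVICE_TEST
        String url = "jdbc:sap://myserver.example.com:30015/";
        try (Connection conn = DriverManager.getConnection(url, "MY_USER", "MY_PASSWORD");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT * FROM \"SERVICE_TEST\".\"CENSUS_RESULT\"")) {
            while (rs.next()) {
                // proba_rr_class is the score column generated by the Apply task
                System.out.println(rs.getDouble("proba_rr_class"));
            }
        }
    }
}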
You serialize or deserialize JSON strings of complex properties when handling model entities.
The entity data model does not use Edm:ComplexType to represent complex properties. Only simple
properties based on primitive types like Edm.String, Edm.Boolean, or Edm.DateTime are used. Complex
properties are represented as Edm.String properties containing serialized JSON, which gives more flexibility
to express data structures. You can identify these complex properties in the service metadata because they are
qualified with the attribute content="json". For example:
<EntityType Name="ODataDataset">
<Key>
<PropertyRef Name="GUID"/>
</Key>
<Property Name="Location" Type="Edm.String" content="json"/>
<Property Name="Type" Type="Edm.String" sap:default="Dataset"
xmlns:sap="http://www.sap.com/Protocols/SAPData"/>
<Property Name="Variables" Type="Edm.String" content="json"/>
<Property Name="CreationTime" Type="Edm.DateTime"/>
<Property Name="Description" Type="Edm.String" sap:default=""
xmlns:sap="http://www.sap.com/Protocols/SAPData"/>
<Property Name="GUID" Type="Edm.String"/>
<Property Name="LastModificationTime" Type="Edm.DateTime"/>
<Property Name="Name" Type="Edm.String" sap:default="" xmlns:sap="http://
www.sap.com/Protocols/SAPData"/>
<Property Name="Path" Type="Edm.String"/>
<NavigationProperty Name="Parent"
Relationship="com.sap.aa.ii.backend.ODataDataset_Parent_ODataCatalog_Datasets"
FromRole="Parent" ToRole="Datasets"/>
</EntityType>
Instances of the Dataset entity will contain a Location property whose value is a serialized JSON string:
{
"d": {
...,
"Location": "{
\"Schema\":\"APL_SAMPLES\",
\"TableName\":\"CENSUS\"
}"
...,
}
}
You can easily parse the serialized JSON content into a language-specific data structure using standard
libraries.
JavaScript
{
...
"Location": {
"Schema": "APL_SAMPLES",
"TableName": "CENSUS"
}
...
}
Java
In Java, you can use the open-source Jackson library to handle the mapping between JSON and Java
structures. Use the ObjectMapper object to map to and from generic JsonNode objects:
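For example (a minimal sketch, assuming the Jackson databind library is on the classpath and that the variable json holds the serialized Location string shown above):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper mapper = new ObjectMapper();
// Parse the serialized JSON string into a generic tree of JsonNode objects
JsonNode location = mapper.readTree(json);
String schema = location.get("Schema").asText();        // "APL_SAMPLES"
String tableName = location.get("TableName").asText();  // "CENSUS"
// ...and back from the tree to a JSON string
String newJson = mapper.writeValueAsString(location);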
You can also have Java classes representing the types to be more specific:
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

class Location {
    // Map the capitalized JSON property names to the Java fields
    @JsonProperty("Schema")
    private String schema;
    @JsonProperty("TableName")
    private String tableName;
    // standard getters and setters
}

ObjectMapper mapper = new ObjectMapper();
// ...from JSON to a Java instance of the Location class
Location loc = mapper.readValue(json, Location.class);
// ...from Java back to a JSON string
String newJsonString = mapper.writeValueAsString(loc);
7.1 Introduction
Governments place legal requirements on industry to protect data and privacy. We provide features and
functions to help you meet these requirements.
Note
SAP does not provide legal advice in any form. SAP software supports data protection compliance by
providing security features and data protection-relevant functions, such as blocking and deletion of personal
data. In many cases, compliance with applicable data protection and privacy laws is not covered by a
product feature. Furthermore, this information should not be taken as advice or a recommendation
regarding additional features that would be required in specific IT environments. Decisions related to data
protection must be made on a case-by-case basis, taking into consideration the given system landscape and
the applicable legal requirements. Definitions and other terms used in this documentation are not taken
from a specific legal source.
7.2 Glossary
Business purpose: A legal, contractual, or in other form justified reason for the processing of personal data. The assumption is that any purpose has an end that is usually already defined when the purpose starts.

Consent: The action of the data subject confirming that the usage of his or her personal data shall be allowed for a given purpose. A consent functionality allows the storage of a consent record in relation to a specific purpose and shows if a data subject has granted, withdrawn, or denied consent.

End of business: Date where the business with a data subject ends, for example the order is completed, the subscription is canceled, or the last bill is settled.

End of purpose (EoP): End of purpose and start of blocking period. The point in time when the primary processing purpose ends (for example, the contract is fulfilled).

End of purpose (EoP) check: A method of identifying the point in time for a data set when the processing of personal data is no longer required for the primary business purpose. After the EoP has been reached, the data is blocked and can only be accessed by users with special authorization (for example, tax auditors).

Purpose: The information that specifies the reason and the goal for the processing of a specific set of personal data. As a rule, the purpose references the relevant legal basis for the processing of personal data.

Residence period: The period of time between the end of business and the end of purpose (EoP) for a data set during which the data remains in the database and can be used in case of subsequent processes related to the original purpose. At the end of the longest configured residence period, the data is blocked or deleted. The residence period is part of the overall retention period.

Retention period: The period of time between the end of the last business activity involving a specific object (for example, a business partner) and the deletion of the corresponding data, subject to applicable laws. The retention period is a combination of the residence period and the blocking period.

Sensitive personal data: A category of personal data that usually includes the following type of information:

Where-used check (WUC): A process designed to ensure data integrity in the case of potential blocking of business partner data. An application's where-used check (WUC) determines if there is any dependent data for a certain business partner in the database. If dependent data exists, this means the data is still required for business activities. Therefore, the blocking of business partners referenced in the data is prevented.
The SAP Predictive service does not manage read-access logging. Read-access logging should be put in place
in the data-owning system.
The SAP Predictive service does not handle information reports. Any data inquiries should be directed to the
data-owning system.
7.5 Erasure
The SAP Predictive service does not delete any data. The deletion of data should be handled by the data-
owning system.
The SAP Predictive service does not modify any data. The change log should be managed in the data-owning
system.
The SAP Predictive service does not collect any data. User consent should be managed by the data-owning
system.
A description of the sample datasets available in SAP HANA to be used with SAP Predictive service.
The following entries specify which services can be tested with each dataset.

CENSUS
Services: Clustering, Key Influencers, Outliers, Scoring Equation, and What-if
Description: Excerpt from the American Census Bureau database, completed in 1994 by Barry Becker. It contains 14 characteristics of an individual extracted from a census dataset, associated with an indicator equal to 1 when the individual earned more than fifty thousand dollars the previous year, else 0. In the dataset, the name of this indicator is class.
Schema: APL_SAMPLES
Tables: CENSUS_500, CENSUS_1000

CashFlows
Services: Forecasts
Description: Contains historical cash flow data from 2016 and 23 other indicators.
Schema: APL_SAMPLES
Table: CASHFLOWS
Announcement
Fix
Documentation
The configuration tasks have been updated. See Configuration Tasks [page 13].
Enhancement
Documentation
The version numbers of the predictive service have been added to the What's New. See What's New [page 6].
Enhancement
When creating a job, the input parameter of the response is now formatted as a JSON object instead of a string. See
Job Response Body Parameters [page 86].
Enhancement
Key Influencers
End-users can now get the variables that are correlated to each other, with their coefficient of correlation. See the
correlatedVariables property in Response Body Parameters [page 54].
Enhancement
EXX121 is a new message related to incompatibility between input parameters. See General Service Parameter Error
Messages (EXX) [page 97].
Documentation
An important note has been added to the Clustering APIs documentation to explain the link between the view export
method and the modelSQLExportEnabled input parameter. See Request Body Parameters [page 27].
Fix
Documentation
A correction has been made to the data source binding procedure. Only <default> can be used as data source name. See
Bind the Data Source [page 15].
New
The new collection of services Predictive Analytics Integrator Services is available to add model management tasks to
your application. See Service Description [page 101].
Enhancement
Clustering
End-users can now specify the distance used to measure the proximity of two data points. It's enabled through the
distance input parameter. See Request Body Parameters [page 27].
Enhancement
Forecasts
● Specify the maximum lag to consider when forecasts are computed. It's enabled through the maxLag input parameter. See Request Body Parameters [page 46].
● Get the MAPE indicator for each horizon. It's output in the mapePerHorizon property under
modelPerformance. See Response Body Parameters [page 49].
Enhancement
Dataset
End-users can now specify the schema and the table of the dataset in SAP HANA separately in the request. It's enabled
through the location input parameter. hanaURL has been deprecated. See Register an SAP HANA Table as Dataset
[page 33].
Enhancement
Scoring Equation
End-users can now choose the type of output generated by the scoring equation (predicted value or probability). It's enabled through the predictionOutputType input parameter. See Request Body Parameters [page 78].
Enhancement
All services
The variableDescription parameter has been deprecated from the request body of the APIs. From now on, end-users must use the Dataset APIs to specify the variable descriptions (Register an SAP HANA Table as Dataset [page 33]) or to modify them (Modify the Variable Description [page 42]).
Enhancement
End-users can now set the key of the target variable through the TargetKey input parameter. See Request Body Parameters [page 59] (outliers) and Request Body Parameters [page 78] (scoring equation).
New
Documentation
The documentation now specifies that referenceDate must follow the ISO 8601 format in the Forecasts API request.
See Request Body Parameters [page 46].
New
Clustering
A new Clustering service is available and provides a set of APIs that allow end-users to segment an input dataset into
clusters and to get segmentation results into an SAP HANA database table or view. See Clustering APIs [page 26].
Enhancement
Outliers
End-users can now enable autoselection of variables through the autoSelection input parameter. See Request
Body Parameters [page 59].
New
Documentation
A first usage scenario has been added to the documentation. It describes an end-to-end clustering process. See Creating Clusters with Either High or Low Target Rate [page 87].
Enhancement
Forecasts
● End-users can now specify the modeling technique used to generate forecasts: the default Automated Analytics
technique, the exponential smoothing, or the linear regression. It's enabled through the forecastMethod input
parameter of the APIs [POST] /api/analytics/forecast/sync and [POST] /api/analytics/
forecast.
● End-users can now specify the cycle length in the case of the smoothing technique. It's enabled through the
smoothingCycleLength input parameter of the same APIs.
Enhancement
Dataset
End-users can now specify if a variable column is a component of the primary key. See Modify the Variable Description
[page 42].
You can now explore, test, and consume the predictive service through the SAP API Business Hub.
New
SAP HANA
Enhancement
Dataset
End-users can now modify the value types of the variables. It's enabled through the new API [POST] /api/
analytics/dataset/<datasetID>/variables/update. See Modify the Variable Description [page 42].
Enhancement
New
Enhancement
Forecasts
● End-users can now get the past data with both predicted and real values of each data point. It's enabled through the
numberOfPastValuesInOutput input parameter. See Request Body Parameters [page 46].
● End-users can now get the definition of the trend, cycles, and fluctuation features found in the data and used by the
underlying time series model to generate forecasts. It's available in the modelInformation output property.
See Response Body Parameters [page 49].
Dataset
When registering a dataset, end-users can now provide the description of the variables contained in the dataset. The
specified description is used whenever the associated dataset is used with the predictive service. See Register an SAP
HANA Table as Dataset [page 33].
New
EDS106 to EDS110 are new messages related to errors in the list of variables provided in the dataset service request. See
Dataset Service Error Messages (EDS) [page 99].
Coding Samples
Any software coding and/or code lines / strings ("Code") included in this documentation are only examples and are not intended to be used in a productive system
environment. The Code is only intended to better explain and visualize the syntax and phrasing rules of certain coding. SAP does not warrant the correctness and
completeness of the Code given herein, and SAP shall not be liable for errors or damages caused by the usage of the Code, unless damages were caused by SAP
intentionally or by SAP's gross negligence.
Gender-Neutral Language
As far as possible, SAP documentation is gender neutral. Depending on the context, the reader is addressed directly with "you", or a gender-neutral noun (such as
"sales person" or "working days") is used. If when referring to members of both sexes, however, the third-person singular cannot be avoided or a gender-neutral noun
does not exist, SAP reserves the right to use the masculine form of the noun and pronoun. This is to ensure that the documentation remains comprehensible.
Internet Hyperlinks
The SAP documentation may contain hyperlinks to the Internet. These hyperlinks are intended to serve as a hint about where to find related information. SAP does not
warrant the availability and correctness of this related information or the ability of this information to serve a particular purpose. SAP shall not be liable for any
damages caused by the use of related information unless damages have been caused by SAP's gross negligence or willful misconduct. All links are categorized for
transparency (see: https://help.sap.com/viewer/disclaimer).