Version 4.2
Windows
BusinessObjects TM
No part of the computer software or this document may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopying, recording, or by any information storage and retrieval system, without permission in writing from Business Objects.
The information in this document is subject to change without notice. If you find any problems with this documentation, please report them to
Business Objects in writing at documentation@businessobjects.com. Business Objects does not warrant that this document is error free.
Trademarks:
The Business Objects logo, BusinessMiner, BusinessQuery, and WebIntelligence are registered trademarks of Business Objects SA.
The Business Objects tagline, Broadcast Agent, BusinessObjects, Personal Trainer, Rapid Deployment Templates, and Set Analyzer are
trademarks of Business Objects SA.
Microsoft, Windows, Windows NT, Access, Microsoft VBA and other names of Microsoft products referenced herein are trademarks or
registered trademarks of Microsoft Corporation.
Oracle is a registered trademark of Oracle Corporation. All other names of Oracle products referenced herein are trademarks or registered
trademarks of Oracle Corporation.
All other product and company names mentioned herein are the trademarks of their respective owners.
This software and documentation is commercial computer software under Federal Acquisition regulations, and is provided only under the
Restricted Rights of the Federal Acquisition Regulations applicable to commercial computer software provided at private expense. The use,
duplication, or disclosure by the U.S. Government is subject to restrictions set forth in subdivision (c)(1)(ii) of the Rights in Technical Data and
Computer Software clause at 252.227-7013.
Edition: 3
Contents
Chapter 1 Introduction
An Overview of Data Mining
BusinessObjects and BusinessMiner
The BusinessMiner Interface
Glossary
Index
In this preface
Multimedia
BUSINESSOBJECTS documentation in multimedia includes Quick Tour and the
BUSINESSMINER tutorial, both of which cover the main concepts and features of
the products using images and animation.
Quick Tour
Quick Tour is a multimedia presentation that introduces new features in
BUSINESSOBJECTS. Aimed primarily at users updating from a previous version of
BUSINESSOBJECTS, it is also an excellent primer for first-time users of the product.
You can use Quick Tour as an accompaniment to the guide Getting Started with
BusinessObjects.
Online Guides
User’s Guides
All BUSINESSOBJECTS user’s guides are available as Acrobat Portable Document
Format (PDF) files. Designed for online reading, PDF files enable you to view,
navigate through, or print any of their contents. The full list of BUSINESSOBJECTS
guides is provided in the Deployment Guide.
From a PDF file, you can search for specific occurrences of a word using the Find
command, or navigate to the exact location of a topic by clicking on an entry in
the Index or Table of Contents.
During installation, the BUSINESSOBJECTS installer program automatically copies
these files to:
Business Objects\BusinessObjects 5.0\Online Guides\En
To view the PDF files, you need Adobe Acrobat Reader version
3.0 or higher installed on your machine. The Reader is available on the
BUSINESSOBJECTS CD-ROM. You can also download it for free from Adobe
Corporation’s web site at:
http://www.adobe.com
Online Help
The extensive online help system consists of step-by-step procedures and
reference information for all the commands, toolbars, and options of the product.
For BUSINESSOBJECTS Windows desktop products, online help is available in the
form of .hlp and .cnt files that comply with the standards of Microsoft Windows
online help.
For WEBINTELLIGENCE products, the online help is available as HTML files
that are accessible directly from the interface.
Audience
This guide is intended for the user who would like to discover and use hidden
relationships in a database. The user needs neither a technical background nor a
knowledge of database structures. BUSINESSMINER presents information in
everyday business terms.
In this chapter
A Historical Perspective
Over the last decade, faster and improved devices for collecting and storing data
led to an exponential growth in the data found in corporate databases.
While advances in database technology provided the basic tools for handling
such massive and unprecedented volumes of data, they nonetheless did not
address the issue of transforming this data into useful knowledge.
It soon became evident that automated, powerful techniques were needed to
uncover the hidden relationships or patterns buried deep in the data. From this
need a new discipline, data mining, was born.
Decision trees
A decision tree is a model that represents a knowledge structure as a sequence of
decisions. These decisions are depicted as a directed graph. The graph is built
from a data set, and is made up of nodes, paths, and leaves.
Each node in the tree represents a premise or condition. Each node is labelled
with a causal factor. The first node is called the root node. The final nodes are
called the leaves.
From the root node, the tree traces a number of paths, each of which leads to the
conclusion of a specific premise. A tree can lead to multiple conclusions. Such
conclusions need not be mutually exclusive; in fact, several can hold true at the
same time.
The diagram below illustrates a partial and simplified decision tree for classifying
four-legged animals.
Decision trees can be used for real life business applications. The next example is
a decision tree built with BUSINESSMINER. The purpose of this decision tree is to
analyze the credit worthiness or risk of specific types of bank customers. The
rectangular elements are the nodes; the lines joining them are the paths.
Among data mining models, decision trees are by far the easiest to work with.
Decision trees yield results that are precise, efficient, and easy to interpret.
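As a rough illustration of the structure described above, the following Python sketch models a tree as nodes, paths, and leaves. The Node class, the attribute names, and the tiny animal tree are hypothetical; they are not taken from BUSINESSMINER or from the original diagram. For simplicity the sketch follows a single path to one conclusion.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """One node of a decision tree: a condition to test, or a conclusion at a leaf."""
    factor: Optional[str] = None       # the causal factor tested at this node
    conclusion: Optional[str] = None   # set only on leaves
    paths: Dict[str, "Node"] = field(default_factory=dict)  # answer -> child node

    def is_leaf(self) -> bool:
        return not self.paths


def classify(root: Node, case: Dict[str, str]) -> str:
    """Follow one path from the root node to a leaf and return its conclusion."""
    node = root
    while not node.is_leaf():
        answer = case[node.factor]
        node = node.paths[answer]
    return node.conclusion


# Hypothetical, simplified tree for four-legged animals.
tree = Node(factor="barks", paths={
    "yes": Node(conclusion="dog"),
    "no": Node(factor="has_hooves", paths={
        "yes": Node(conclusion="horse"),
        "no": Node(conclusion="cat"),
    }),
})

print(classify(tree, {"barks": "no", "has_hooves": "yes"}))
```

The root node tests the first factor, each answer selects a path, and the walk stops at a leaf.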
BusinessMiner Windows
Inside BUSINESSMINER’s main window is the Project window, which is illustrated
on page 37. From the Project window, you can access BUSINESSMINER’s other
principal windows.
BusinessMiner Toolbars
BUSINESSMINER provides three toolbars:
• Windows Standard toolbar
• Model toolbar
• Mine toolbar
Model toolbar
The Model toolbar gives you quick access to some items on the Model menu,
and lets you display the tree browser and Tree Caption window.
Model Toolbar
Button Description
Expand selection
Collapse selection
Mine toolbar
The Mine toolbar gives you quick access to the items in the Mine menu.
Mine Toolbar
Button Description
Discover rules...
Visualize...
What-if...
Segment
2. In the Toolbars dialog box that appears, click the check boxes of the toolbars
you want to see. Click the checkbox if you want to display tooltips.
3. Click OK.
In this chapter
❑ Overview
What is a BusinessMiner Project?
Preparing and Obtaining Data: Your Gold Mine
How Do You Build a BusinessMiner Project?
Overview
This chapter teaches you about BUSINESSMINER projects: what they are and how
you build them. You will also learn about things you must do before building
projects, notably obtaining the data you need.
What’s in a project?
A project consists of data that you put in, and the results you obtain by analyzing
the data:
• The data you put in can come from a BUSINESSOBJECTS report, or from an
external file such as a Microsoft Excel workbook. The data in a project is
categorized as objects. For example, the values of the Customer object are
customer names.
• Once you have put data into a project, you obtain results by building a data
mining model, then performing analysis to discover relationships.
Note: This chapter deals with building a project, i.e., inputting the data. You can
find out about analyzing the data in Chapter 4.
.ald The raw rows of data imported into the project, i.e., the
records.
Personal data files    Microsoft Excel, Lotus 1-2-3, dBASE and ASCII files.
OLAP servers    Online Analytical Processing (OLAP) servers, which are
multidimensional databases that store summarized data,
ready for business analysis.
You can find detailed information on all types of data providers in the
BUSINESSOBJECTS online help.
Tip: You can find detailed information on editing data providers and creating user
objects in the BUSINESSOBJECTS online help.
4. Click Finish.
The new project appears in BUSINESSMINER, as illustrated on the next page.
Tip: In the New Project Wizard, avoid using Build a data mining model automatically.
You will get better results by building models interactively.
Refer to Chapter 4 for more information on building models.
[Figure: A BUSINESSMINER project, with callouts a to f]
a. The name of the project is taken from the BUSINESSOBJECTS microcube. When you
build a project from an external file, you name the project yourself.
The Project window displays folders that contain the project’s objects, records,
models, pictures and results. To open or close a folder, double-click it.
b. BUSINESSMINER objects are the categories of data in the project: Profession, Gender,
Marital Status, etc.
You can view all the project’s objects and their values, and you can also change
object labels, i.e., their names.
c. Records are the rows of raw data imported from your data source.
You cannot change records in any way.
d. Models are the decision trees you build in the project. So, until you build a model,
this folder is empty.
e. Pictures are the column and line charts you build from your data.
f. Results are the rules you discover by using the Discover Rules command (Mine
menu) on a decision tree.
Note: Also at this stage, you can deselect any objects you do not wish to include
in your analysis. Objects you deselect in the external file will not be available in
the project. This is not the case when you build a project from BUSINESSOBJECTS.
3. Click the type of file (Microsoft Access, Text delimited, etc.) you want to use,
then click Next.
5. Click Next. The next action depends on the type of file you selected:
Access or Excel    If the file contains more than one sheet or table, select one.
If not, go to the next step.
6. Deselect any objects you do not wish to include in the project, then click Next.
9. Click Next, type a name for the project, then click Finish.
The project appears in BUSINESSMINER. You can find an illustration of a project
on page 37.
Example: Using file overview information to set parameters in a delimited text file
In the following illustration, BUSINESSMINER indicates that, in the text file, objects
and values are separated by commas:
Parameters to set
The following table indicates what each parameter lets you specify. It also
indicates whether the parameter is relevant to fixed-length text files, delimited
text files, or both.
This parameter...    Lets you specify...    Fixed length    Delimited
1. In the Data Access Wizard, illustrated below, read the information in File
overview:
3. In the Width box, type the number of characters for the object’s values. For
example, if the object is Gender, the number of characters is 6 because the
longest value of Gender is Female.
4. Indicate the type of the object in the Type box, then click Add.
5. Repeat the three previous steps until you have specified all the objects you
need, then click Next.
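To make the Width setting concrete, here is a minimal Python sketch of how a fixed-length line can be cut into values once each object's width is known. The layout, the object names other than Gender, and the sample line are hypothetical; the sketch does not reproduce how the Data Access Wizard reads files.

```python
# Hypothetical field layout: (object name, width in characters).
# A width must fit the longest value, e.g. 6 for Gender because of "Female".
layout = [("Customer", 10), ("Gender", 6), ("Status", 8)]


def required_width(values):
    """Smallest width that holds every value of an object."""
    return max(len(v) for v in values)


def parse_fixed_length(line, layout):
    """Cut one line into values using the declared widths, trimming padding."""
    record, start = {}, 0
    for name, width in layout:
        record[name] = line[start:start + width].strip()
        start += width
    return record


# Build a sample line by padding each value to its declared width.
line = "Smith".ljust(10) + "Female".ljust(6) + "Single".ljust(8)

print(required_width(["Male", "Female"]))
print(parse_fixed_length(line, layout))
```

If a width is too small, the longest values are cut off and the following objects start at the wrong position, which is why the width is taken from the longest value.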
In this chapter
❑ Overview
❑ Modeling Data
Modifying the Building Options
Modifying the Display Options
Building the Decision Tree
Managing the Decision Tree
Reading the Decision Tree
❑ Mining Data
Discovering Rules
Visualizing Data Relationships
Performing a What-If? Analysis
Segmenting the Data
Overview
Data mining consists of two principal activities:
• Modeling the relationships among objects in the data
• Mining to explore and use those relationships
This chapter explains how to build a model and how to use it to mine the data.
The examples in this chapter are based on the sample database files bmdemo.rep
and bmdemo.txt, which are included on the installation CD.
Modeling Data
BUSINESSMINER uses a decision tree as its data mining model.
Each node of the tree represents a condition that BUSINESSMINER has tested. The
branches starting from a node represent the different conditions of the test. The
first node in the tree is called the root. All the data is present in the root. The data
is successively split into the different branches to obtain a finer analysis. The final
nodes are called the leaves.
BUSINESSMINER builds a decision tree automatically if you so specified in the New
Project wizard or if you choose Build Full Tree from the Model menu. However,
you can build the decision tree yourself interactively, step by step, to better
control the way the tree is built.
BUSINESSMINER provides default build and display settings that apply whether
BUSINESSMINER builds the decision tree automatically or you build the tree
interactively.
The default settings are the most appropriate settings for a typical situation.
However, you can modify the way the decision tree is built and displayed to suit
the needs of your analysis.
Note: If you modify the building and display options in the Project window, the
modifications apply to the entire BUSINESSMINER project, that is, to other trees
you build in this project.
If you modify the building and display options in a Tree window, the
modifications apply only to that tree.
1. From the Model menu, select Modify Building Options, then click the Control
tab.
The following dialog box appears:
2. Enter the maximum depth of the decision tree.
This value must be a positive integer.
3. Enter the minimum number of records you want per node.
This value must be a positive integer.
4. If you do not want to group the symbolic values within a node, click Group
symbolic values to uncheck it.
5. Click OK.
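The effect of the maximum depth and the minimum number of records per node can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the build function, the nested-dictionary tree, the sample records, and the rule "do not split if any resulting node would fall below the minimum" are hypothetical and do not describe BUSINESSMINER's actual algorithm.

```python
from collections import Counter


def build(records, objects, output, max_depth, min_records, depth=0):
    """Split records recursively; stop at max_depth or when a node would get too small."""
    node = {
        "count": len(records),
        "distribution": Counter(r[output] for r in records),
        "children": {},
    }
    if depth >= max_depth or not objects:
        return node
    split_on, rest = objects[0], objects[1:]  # naive choice: next object in the list
    groups = {}
    for r in records:
        groups.setdefault(r[split_on], []).append(r)
    # Assumption: only split when every resulting node keeps at least min_records records.
    if len(groups) < 2 or any(len(g) < min_records for g in groups.values()):
        return node
    node["split_on"] = split_on
    for value, group in groups.items():
        node["children"][value] = build(group, rest, output, max_depth, min_records, depth + 1)
    return node


def depth_of(node):
    """Number of levels below this node."""
    if not node["children"]:
        return 0
    return 1 + max(depth_of(child) for child in node["children"].values())


# Hypothetical records.
records = [
    {"Credit Limit": "High", "Marital Status": "Single", "Account Status": "High Profit"},
    {"Credit Limit": "High", "Marital Status": "Married", "Account Status": "Low Risk"},
    {"Credit Limit": "Low", "Marital Status": "Single", "Account Status": "Low Risk"},
    {"Credit Limit": "Low", "Marital Status": "Married", "Account Status": "Low Risk"},
]
objects = ["Credit Limit", "Marital Status"]

shallow = build(records, objects, "Account Status", max_depth=1, min_records=1)
deeper = build(records, objects, "Account Status", max_depth=2, min_records=1)
coarse = build(records, objects, "Account Status", max_depth=2, min_records=3)

print(depth_of(shallow), depth_of(deeper), depth_of(coarse))
```

A smaller maximum depth or a larger minimum number of records gives a smaller tree whose nodes each describe more records.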
Note: For any object except a numeric object, you can choose not to reassign
unknown values. The unknown values of numeric objects are always reassigned.
1. From the Model menu, select Modify Building Options, then click the Objects
tab.
The dialog box that appears lists all the objects that were imported into the
project:
2. Select the objects you want to mine, that is, to use in the data analysis.
• For each object, specify whether or not to mine it (Yes or No).
• For each object, specify whether or not to reassign unknown values (Yes or
No).
3. Select one of the objects as the output object, that is, the object which interests
you the most.
You want to discover the relationship of the other objects to this output object.
4. Click OK.
1. From the Model menu, select Modify Building Options, then click the Priority
tab.
The following dialog box appears:
2. Click on an object you want to prioritize from the list of objects in the left
column.
Other objects Allows you to select other objects you want to see in the
decision tree nodes. Seeing other objects provides you
with more information about the population of a
particular node.
• To apply the change to the Tree window, click Apply to make your change
take effect immediately, then click OK to close the dialog box.
The information for that node appears in the Tree window.
Object to analyze Displays the object used to build the current decision
tree.
1. From the Model menu, select Modify Display Options, then click the Charts
tab.
The following dialog box appears:
6. To apply the change to the entire project, click OK, then click Yes in the dialog
box that appears.
To apply the change to the selected tree, click Apply to make your change take
effect immediately, then click OK to close the dialog box.
Specifying alerters
You can assign colors to decision tree nodes and designate the meaning of the
assigned colors to alert you to situations of interest or alarm. To do so:
1. From the Model menu, select Modify Display Options, then click the Alerters
tab.
The following dialog box appears:
6. To apply the change to the entire project, click OK, then click Yes in the dialog
box that appears.
To apply the change to the selected tree, click Apply to make your change take
effect immediately, then click OK to close the dialog box.
2. Select the information you want to see in the Tree Browser window.
3. To apply the change to the entire project, click OK, then click Yes in the dialog
box that appears.
To apply the change to the selected tree, click Apply to make your change take
effect immediately, then click OK to close the dialog box.
Labels
You can specify whether you want to see codes or labels in BUSINESSMINER
windows. To do so:
4. Click OK.
Model Menu
Command: Expand Tree One Level
Description: Expands the current tree by one level. Uses the object that has the
highest discriminating power, or the highest priority that you specified.
The next level appears to the right of the existing levels.
Enabled when: Tree window is opened

Command: Collapse Tree One Level
Description: Collapses the current tree by one level.
Enabled when: Tree depth is one or more nodes
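This guide does not say how discriminating power is measured. Purely as an illustration, the Python sketch below uses information gain, a common measure in the decision tree literature, and lets a priority that you specify override it. The function names and sample records are hypothetical, and the choice of information gain is an assumption, not a description of BUSINESSMINER.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of output values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, obj, output):
    """Drop in entropy of the output object after splitting the records on obj."""
    base = entropy([r[output] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[obj], []).append(r[output])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder


def choose_object(records, candidates, output, priority=None):
    """A priority you specify wins; otherwise take the object with the highest gain."""
    if priority:
        return priority
    return max(candidates, key=lambda obj: information_gain(records, obj, output))


# Hypothetical records: Credit Limit separates the output values, Gender does not.
records = [
    {"Credit Limit": "High", "Gender": "Male", "Account Status": "High Profit"},
    {"Credit Limit": "High", "Gender": "Female", "Account Status": "High Profit"},
    {"Credit Limit": "Low", "Gender": "Male", "Account Status": "Low Risk"},
    {"Credit Limit": "Low", "Gender": "Female", "Account Status": "Low Risk"},
]

print(choose_object(records, ["Gender", "Credit Limit"], "Account Status"))
```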
A Tree window
In a Tree window, selected nodes are displayed with a darkened border. In the
illustration above, the node Marital Status was selected. Using this window, you
can view and manipulate the decision tree.
Copying records
In the Tree window
1. Select a node.
Node information
The nodes of the decision tree contain important information. You can select the
type of information to be displayed in each node of the decision tree. Refer to the
section titled “Modifying Display Options” in this chapter.
A first level of information is the number of data records contained in each node.
A second level is the distribution of the values of the output object at each node
of the tree. This distribution is presented as a percentage and optionally, with a
bar chart or stacked bar chart.
In the tree on the previous page, at the root node you see that in a total of 407
examples, 60.7% have a low risk status, 21.4% have a high risk status and 17.9%
have a high profit status.
The test on credit limit lets you conclude that, when the credit limit is low, very
low, or mid-level, the account status is:
• Low risk with a probability of 84.4%;
• High risk with a probability of 8.3%;
• High profit with a probability of 7.2%.
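The percentages shown in a node are simply the share of each output value among the node's records. The short Python sketch below recomputes the root node figures quoted above. The record counts (247, 87, and 73) are not given in this guide; they are reconstructed from the quoted percentages and the total of 407, so treat them as an assumption.

```python
from collections import Counter


def distribution(statuses):
    """Percentage of each output value among the records of a node, to one decimal."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Counts reconstructed from the percentages quoted for the root node (an assumption).
root = ["Low Risk"] * 247 + ["High Risk"] * 87 + ["High Profit"] * 73

print(len(root))
print(distribution(root))
```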
Alerters
A third level of information is provided by the use of colored nodes, which you
can use as alerters. You can assign a color that corresponds to the number of data
records in a node that have a particular value. In the decision tree shown on page
72, the chart at the bottom of each node shows the color key for the low risk, high
risk, and high profit customers. In this example, when a node contains over 80%
of the high profit customers, the node is gold. With this coding, BUSINESSMINER
alerts you immediately to the nodes that contain the majority of high profit
customers. The objects that lead to the gold nodes describe these customers.
Mining Data
After you have constructed the model of your data, you can explore the
relationships it contains in various ways. This is called mining.
You can:
• View the rules BUSINESSMINER discovered
• Visualize data relationships
• Create a “what-if?” analysis to assess a new case
• Isolate a segment of the data that is similar in some way
To describe BUSINESSMINER’s data mining capabilities, this section continues to
use the demonstration credit database example provided with BUSINESSMINER.
Discovering Rules
In the demonstration database described in Chapter 1, suppose you want to
understand the customers whose account status is High Profit in order to market
extra incentives. In this case, you want to find all the customers whose account
status is High Profit with a likelihood greater than 70%.
You decide to discover the rules that determine this population in your database
and generate a report that states these rules.
3. Select the output object, that is, the object about which you want rules.
6. Click OK.
The Rules window appears, as shown below.
It lists the criteria used to classify the decision tree nodes and displays the
percentage of population in the node that meets the criteria.
Rules window
The Edit menu commands are active when the rules window is active. You can
edit this report, copy the data onto the Windows clipboard, and paste it into any
other application.
The first column of the Rules window shows the percentage associated with each
description. The second column shows the number of records in the description.
You can use this information to check the validity of the given rules.
For the Rules window shown above, the settings in the Discover Rules dialog box
are:
Output Object: Credit Account_Status
Value: High Profit
Sort: Increasing Order
Threshold: 70%
For each rule, the objects are linked by logical AND statements, and the values
are linked by logical OR statements.
The first rule shown above states that
• IF credit limit is high
• AND marital status is widowed or single
• AND monthly disposable is greater than 2403
• THEN account status will be High Profit with a likelihood of 88.6%
The second rule shown above states that
• IF credit limit is high
• AND marital status is widowed or single
• AND monthly disposable is greater than 2403
• AND the customer rents, rather than owns, the home
• THEN account status will be High Profit with a likelihood of 96.9%
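The AND/OR structure of a rule can be sketched in Python as follows. The dictionary layout, the object names used as keys, the sample customer, and the third low-likelihood rule are hypothetical; only the conditions and likelihoods of the two rules above come from this guide.

```python
# A rule maps each object to a test on its value; all tests must pass (AND).
# A set of accepted values expresses OR between the values of one object.
first_rule = {
    "conditions": {
        "Credit Limit": lambda v: v in {"High"},
        "Marital Status": lambda v: v in {"Widowed", "Single"},
        "Monthly Disposable": lambda v: v > 2403,
    },
    "likelihood": 88.6,
}
second_rule = {
    "conditions": dict(first_rule["conditions"], **{"Home": lambda v: v in {"Rent"}}),
    "likelihood": 96.9,
}
weak_rule = {"conditions": {"Credit Limit": lambda v: v in {"Low"}}, "likelihood": 55.0}


def matches(rule, record):
    """True when the record satisfies every condition of the rule."""
    return all(test(record[obj]) for obj, test in rule["conditions"].items())


def select_rules(rules, threshold):
    """Keep rules whose likelihood exceeds the threshold, in increasing order."""
    kept = [r for r in rules if r["likelihood"] > threshold]
    return sorted(kept, key=lambda r: r["likelihood"])


# Hypothetical customer record.
customer = {"Credit Limit": "High", "Marital Status": "Single",
            "Monthly Disposable": 3000, "Home": "Own"}

print(matches(first_rule, customer), matches(second_rule, customer))
print([r["likelihood"] for r in select_rules([second_rule, weak_rule, first_rule], 70)])
```

The hypothetical customer satisfies the first rule but not the second, because the second rule adds the condition that the customer rents the home.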
4. In the next dialog box, shown below, select the object you want to plot for the
X axis, and, if appropriate, the object to use for the Y axis and the object against
which you want to differentiate the X axis object.
5. For charts of numeric objects, specify how you want the curves smoothed.
Low smoothing provides more detail; high smoothing provides less detail.
6. Click Finish.
The Chart window appears, as shown below. It allows you to visualize the
relationships in your data.
Chart window
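This guide does not state which smoothing method is applied to the curves. As an illustration only, the Python sketch below uses a centered moving average, one common technique, to show why low smoothing keeps more detail than high smoothing. The function, the window sizes, and the sample series are hypothetical.

```python
def smooth(values, window):
    """Centered moving average; a larger window smooths more and keeps less detail."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


series = [1, 9, 2, 8, 3, 7]
low = smooth(series, 1)    # a window of 1 leaves the curve unchanged
high = smooth(series, 5)   # a wider window flattens the peaks and troughs

print(low)
print(high)
```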
1. Open a decision tree that is completely built; that is, a tree in which all the
branches are expanded to the maximum depth.
2. From the Mine menu, select What-if?
The What If? dialog box appears:
Note: When you do not know a value, click the Unknown value check box. This
causes the what-if? analysis to consider all the branches from the selected nodes
until the leaves. The result is a combination of the probabilities of all the nodes
In some cases, BusinessMiner will not be able to make a prediction because the
information is not available in the original data. In such a case, the error message
“Impossible to predict” appears.
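One way to picture a what-if? analysis with unknown values is sketched below in Python. How BUSINESSMINER actually combines the probabilities is not specified here; this sketch assumes an average of the leaf distributions weighted by their record counts. The tree layout, the counts, and the probabilities are hypothetical.

```python
def predict(node, case):
    """Return (record_count, {value: probability}) for a what-if case, or None.

    When the case has no value for the object tested at a node, every branch is
    followed and the results are averaged, weighted by record count (assumption).
    None stands for the "Impossible to predict" situation.
    """
    if not node.get("children"):
        return node["count"], node["distribution"]
    value = case.get(node["split_on"])  # None means "Unknown value"
    if value is None:
        branches = list(node["children"].values())
    elif value in node["children"]:
        branches = [node["children"][value]]
    else:
        return None  # the tree has no branch for this value
    results = [predict(child, case) for child in branches]
    if any(r is None for r in results):
        return None
    total = sum(count for count, _ in results)
    combined = {}
    for count, dist in results:
        for label, p in dist.items():
            combined[label] = combined.get(label, 0.0) + p * count / total
    return total, combined


# Hypothetical one-level tree.
tree = {
    "split_on": "Credit Limit",
    "children": {
        "High": {"count": 100, "distribution": {"High Profit": 0.8, "Low Risk": 0.2}},
        "Low": {"count": 300, "distribution": {"High Profit": 0.1, "Low Risk": 0.9}},
    },
}

print(predict(tree, {"Credit Limit": "High"}))
print(predict(tree, {}))                          # credit limit unknown
print(predict(tree, {"Credit Limit": "Medium"}))  # no such branch
```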
2. Click the node for which you want the segmented information.
To open the Print Preview, select Print Preview... from the File menu.
To zoom in on your results or decision tree in Preview, click the left mouse
button.
To print the decision tree or results from Print Preview, simply click the Print
button. BUSINESSMINER prints your selection, closes Print Preview, and returns
you to the main BUSINESSMINER window.
To return to the main BUSINESSMINER window without printing, click the Close
button.
Using Print
BUSINESSMINER lets you print all the decision trees, rules, and charts of your
project. It also allows you to print information about the project itself.
The procedures for printing the Project window as well as the other
BUSINESSMINER windows appear in the next sections.
1. In the BUSINESSMINER window, either click the window whose contents you
wish to print, or activate the window by double-clicking it from the Project
window.
Tip: You may find that the nodes of a decision tree are easier to view if you print
them with the Orientation option set to Landscape. For large trees, you may wish
to set the Scaling option to less than 100%. The Orientation options (Landscape
and Portrait) and the Scaling option are located in the Print Setup dialog box.
Printing charts
To print a chart:
visualization Data mining feature which enables you to view data relationships
graphically. In BUSINESSMINER, you can visualize results
in line charts and column charts.
what-if? analysis Data mining feature which enables you to input data that is
then tested against the findings you have already made. For
example, in BUSINESSMINER you can enter data about a
customer to establish their credit risk status, based on the
results you have already obtained from your decision tree.