IBM Analytics Platform:

The IBM analytics platform provides the capabilities you need to address a whole
range of critical agendas, from creating a new data foundation to deriving value
from your unstructured content.
Building from the bottom up, it works with data in motion as well as data at rest.
It works with both structured and unstructured data and content. And it works with
enterprise data as well as data outside the firewall. It works with data wherever it
resides: on premises, in the cloud or in a hybrid environment.
No other vendor offers such a complete platform for addressing your current needs
and helping you adapt quickly and easily to future needs.

Integrate and manage the full variety, velocity and volume of Big Data

Apply advanced analytics

Visualize all available data for ad-hoc analysis

Support workload optimization and scheduling

Provide for security and governance

Integrate with enterprise software

Solution Components
This section presents an overview of the software components that are part of the IBM
proposal for the POC. Key capabilities of these software products are highlighted
here. Note that not all of these capabilities are necessarily relevant to or in scope of
the current proposal.

IBM InfoSphere DataStage


IBM InfoSphere DataStage integrates data across multiple systems using a
high-performance parallel framework, and it supports extended metadata management
and enterprise connectivity. The scalable platform provides more flexible integration
of all types of data, including big data at rest (Hadoop-based) or in motion
(stream-based), on distributed and mainframe platforms.

Powerful, scalable ETL platform

Manages data arriving in near real time as well as data received on a periodic
or scheduled basis.

Provides high-performance processing of very large data volumes.

Leverages the parallel processing capabilities of multiprocessor hardware
platforms to help you manage growing data volumes and shrinking batch
windows.

Supports heterogeneous data sources and targets in a single job, including
text files, XML, ERP systems, most databases (including partitioned
databases), web services, and business intelligence tools.

Transform and aggregate any volume of information

Deliver data in batch or real time through visually designed logic

Hundreds of built-in transformation functions

Metadata-driven productivity, enabling collaboration

Workload and business rules management

Helps enable policy-driven control of system resources and prioritization of
different classes of workloads.

Helps you optimize hardware utilization and prioritize tasks, control job
activities where resources exceed specified thresholds, and assess and
reassign the priority of jobs as they are submitted into the queue.

Ease of use

Includes an operations console and interactive debugger for parallel jobs to
help you enhance productivity and accelerate problem resolution.

Helps reduce the development and maintenance cycle for data integration
projects by simplifying administration and maximizing development
resources.

Offers operational intelligence capabilities, smart management of metadata
and metadata imports, and parallel debugging capabilities to help enhance
productivity when working with partitioned data.

IBM InfoSphere QualityStage


IBM InfoSphere QualityStage is a foundational component for your data quality and
information governance initiatives. It helps you create and maintain consistent
views of key entities including customers, vendors, locations and products. This lets
you investigate, cleanse and manage your data.
Use InfoSphere QualityStage to deliver quality data for your big data, business
intelligence, data warehousing, application migration and master data management
projects.

High quality data about core business entities

Offers an easy-to-use graphical user interface (GUI) for specifying automated
data quality processes: data investigation, standardization, matching and
survivorship

Delivers investigation and analysis processing capabilities for free-form data

Offers a single set of standardization, cleansing, matching and survivorship
rules executed in batch, near real time or as a web service

Achieves superior match rates with probabilistic matching technology and
fuzzy matching capabilities

Lets you validate and monitor data in motion

Delivers domain-agnostic data cleansing capabilities including product data,
phone numbers, email addresses, birth dates, events and other comment and
descriptive fields

Delivers worldwide address standardization, verification and enrichment
capabilities including postal certification modules for United States, Canada
and Australia
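
The matching capability listed above can be illustrated with a minimal sketch. It is not QualityStage's algorithm. It assumes a simple Fellegi-Sunter style scoring in which each field adds an agreement or disagreement weight, and it uses Python's standard `difflib` for the fuzzy comparison. The field names, the m/u probabilities and the 0.85 similarity threshold are hypothetical values chosen for illustration.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (illustrative, not product defaults):
# m = P(field agrees | same entity), u = P(field agrees | different entities)
FIELD_PROBS = {
    "name": (0.95, 0.01),
    "city": (0.90, 0.10),
    "phone": (0.85, 0.001),
}


def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy agreement: true when the normalized strings are close enough."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return ratio >= threshold


def match_weight(rec_a: dict, rec_b: dict) -> float:
    """Sum log2 agreement or disagreement weights over all configured fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if similar(rec_a[field], rec_b[field]):
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


a = {"name": "Jonathan Smith", "city": "Toronto", "phone": "416-555-0101"}
b = {"name": "Jonathon Smith", "city": "toronto", "phone": "416-555-0101"}
c = {"name": "Maria Gonzalez", "city": "Toronto", "phone": "905-555-0199"}

# The misspelled duplicate (a, b) scores higher than the unrelated pair (a, c).
print(match_weight(a, b), match_weight(a, c))
```

In practice a pair whose total weight exceeds an upper cutoff would be treated as a match, and a pair between two cutoffs would be sent for clerical review. Those cutoffs are tuning decisions and are not shown here.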

Data quality within a unified platform

Reduces risks and lowers the total cost of ownership (TCO) by integrating
with other components of InfoSphere Information Server platform to deliver
end-to-end data integration capabilities

Leverages shared metadata across InfoSphere Information Server platform
for greater consistency of information and the ability to perform impact
analysis

Shares other platform capabilities that include connectivity to
sources/targets, centralized rule management and a parallel processing
engine that scales to handle large volumes of data

Offers unified installation, deployment and source control for fast startup as
well as unified data quality and transformation functions to reduce project
costs

Support for information governance

Offers an enterprise-level exception monitoring system with a web-based
dashboard. It features comprehensive search, filtering and drill-down
capabilities for business users to investigate quality issues

Lets users create business terms and information governance policies and
then link them to quality rules that are monitored

IBM InfoSphere Information Analyzer


IBM InfoSphere Information Analyzer provides data quality assessment, data quality
monitoring and data rule design and analysis capabilities. This software helps you
derive more meaning from your enterprise data, reduces the risk of proliferating
incorrect information, facilitates the delivery of trusted content, and helps to lower
data integration costs.

Advanced analysis and monitoring

Enables users to easily classify data, display data using semantics, validate
column/table relationships and move to exception rows for further analysis.

Provides data quality assessment functions such as column, primary key,
foreign key, cross-domain and baseline analysis, and offers 80 configurable
reports for visualizing analysis and trends.

Uses the IBM Information Server scheduling service to allow scheduled
execution of profiling, rules and metrics.

Provides auditing, tracking and monitoring of data quality conditions over
time to support data governance initiatives.

Uses project-, role- and user-based approaches to control access to sensitive
information, including the ability to restrict access to original data sources.
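
As a rough illustration of what column and primary key analysis involves, the sketch below profiles a small in-memory data set. It is not Information Analyzer's implementation. The function name `profile_columns`, the treatment of empty strings as nulls and the sample rows are assumptions made for this example.

```python
from collections import Counter


def profile_columns(rows: list[dict]) -> dict:
    """Return basic per-column statistics for a list of row dictionaries."""
    columns = rows[0].keys() if rows else []
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Assumption: None and empty strings both count as missing values.
        non_null = [v for v in values if v not in (None, "")]
        distinct = Counter(non_null)
        report[col] = {
            "rows": len(values),
            "nulls": len(values) - len(non_null),
            "distinct": len(distinct),
            # Primary key candidate: every row has a unique, non-null value.
            "key_candidate": len(non_null) == len(values)
            and len(distinct) == len(values),
        }
    return report


sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "a@example.com"},
]

for column, stats in profile_columns(sample).items():
    print(column, stats)
```

Running the same profile on a schedule and comparing the results with a stored earlier run is one simple way to approximate the baseline analysis mentioned above.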

Integrated rules analysis

Provides common data rules to perform trending and pattern analysis and to
establish baselines consistently across data sources.

Offers multiple-level rules analysis (by rule, record, pattern) for evaluating
data issues by record rather than in isolation.

Provides pre-packaged data validation rules to reduce development time.

Offers exception-based management of business rules and transformations.

Scalable, collaborative platform

Provides native parallel execution for enterprise scalability to support large
volumes of data.

Supports multiple analytical reviews and asynchronous profiling to allow more
than one user to work in a project-based context.

Uses virtual tables and columns for analyzing data without requiring changes
to a host database.

Provides annotations to enable users to add their business names,
descriptions, business terms and other attributes to tables, columns and
rules.

Supports multiple languages including English, Chinese (simplified and
traditional), Japanese, Korean, Spanish, Portuguese (Brazil), Italian, German
and French.

Support for heterogeneous data

Uses open database connectivity (ODBC) or native connectivity to profile IBM
DB2, IBM Informix, Oracle, Microsoft SQL Server, Sybase, Microsoft Access,
Teradata and other data sources such as text files.

Allows reuse and sharing of data rules in IBM InfoSphere DataStage through
IBM InfoSphere QualityStage and InfoSphere Information Analyzer to help you
align data quality metrics throughout the project lifecycle.

Uses metadata to allow analytical results to be shared across all IBM
InfoSphere Information Server modules.

Integrates with IBM InfoSphere Metadata Workbench and IBM InfoSphere
Business Glossary.

Reporting & Visualization Software


We have proposed IBM Cognos Analytics for the reporting and visualization
requirements of this POC. IBM Cognos Analytics offers guided, self-service
capabilities designed to address these demands. The platform delivers a personal
approach to analytics by empowering business users to solve individual or
workgroup challenges on their own while providing IT with a proven solution that
can be easily scaled as business needs grow.

Smarter self-service
For business users, Cognos Analytics offers a guided experience that helps them
create or customize dashboards and reports in minutes. Smart search capabilities
anticipate their intent and help them access trusted and personal data sources
without assistance from IT.

Get answers when and where you need them with consistent web and mobile
user experiences

Capture and combine all types of data effortlessly with intent-based modeling

Increase your analytic skills without training by seamlessly transitioning
between tasks

Make decisions with confidence


Business users want assurance that the data is trustworthy and protected, and
they can't afford to waste time moving data around. Cognos Analytics offers a single
environment spanning departmental and enterprise reporting that's designed to
connect people and ideas.

Be confident that your decisions are based on trusted data

Uncover new insights with the flexibility to report from multiple data sources

Gain insights from corporately sanctioned data

Delivery of content to mobile devices


IBM Cognos Mobile extends interactive Cognos Analytics to a broad range of mobile
devices, including the Apple iPhone and iPad and Android phones and tablets. With a
rich client, users can view and fully interact with Cognos reports, dashboards, metrics,
analysis and other information in a security-rich environment. Users receive timely,
informative and interactive business intelligence (BI) to support their
decision-making, regardless of location. Embedded "how-to" tutorials provide tips for
getting the most out of the Cognos Mobile application.

Cognos Mobile is designed to enable you to:

Experience insight wherever you are with support for a variety of mobile
technologies with a rich, interactive BI interface.

Interact with data online or off with highly visual, interactive reports.

Enable IT to confidently deploy BI to many mobile devices and give them
tools to protect against data loss and theft while enforcing policy compliance.

Experience insight wherever you are


Cognos Mobile offers what you need to gain insight wherever you are.

BI capabilities on mobile devices: View and interact with reports, dashboards and
analysis anywhere you happen to be. You can also work with broad analytic
content and all supported data sources in an uninterrupted experience from office
to notebook, tablet or mobile phone. In essence, you can work with the same
information that is available to you in the office.

The most up-to-date BI available at your fingertips: Make timely and accurate
decisions based on the most up-to-date information. You do not need to go to the
office to get the latest report because you can automatically refresh or receive
reports and dashboards on your device. Reports load page by page, so you don't
have to wait for the entire report to download before reviewing it.

Personalized mobile experience: Organize your workspace, give it a new name or
connect to multiple servers with the tap of a finger. Arrange content the way you
want to see it and select a unique background. Embedded how-to tutorials provide
tips and tricks to get the most out of the Cognos Mobile application.

Figure 1: Personalize your Cognos Mobile experience by rearranging content and
renaming your workspace

Analysis on the go: Keep a pulse on your business while on the road. Cognos Mobile
provides the same navigation you have in the office, so you can get to the right
level of information when you need it.

Interact with information offline or online
You want the same experience in the work world that you have in the consumer
world in terms of interactivity and consistency. Cognos Mobile provides you with the
same rich, interactive experience you get from consumer mobile applications. You
can access highly visual reports and dashboards offline, on your desktop computer
or on your mobile device. You can explore business information without having to
rely on network connectivity. For example, you can share your perspectives with
others on your iPad by highlighting an area for discussion and then sending an email
with comments, insights and actions to the appropriate people. Because you can
communicate information regardless of where you are, you can take timely action.

Figure 2: Access highly visual and interactive reports with Cognos Active Report
when connected or disconnected.

Analytics Data Mart


We have proposed IBM DB2 Advanced Workgroup Server Edition in the role of
the analytics data mart. This single database will host multiple logical areas for
staging data, detailed data, and dimensional summaries.

DB2 Advanced Workgroup Server Edition


IBM DB2 Advanced Workgroup Server Edition is the ideal multi-workload database
solution that offers data warehousing, transactional and analytics capabilities. It
provides storage optimization, system availability, workload management and
performance to help reduce overall database costs.

Optimized database storage and performance

Incorporates the new BLU Acceleration to exploit dynamic in-memory
columnar technology and other innovations such as parallel vector
processing, actionable compression and data skipping to speed processing
(applicable if the BLU deployment option is selected).

Offers database partitioning to deliver massively parallel processing using
multiple partitions and servers, ideal for warehousing workloads.

Supports continuous data ingest from a variety of sources to enable more
rapid and reliable decision making in near real time.

Contains advanced workload management to optimize the handling of mixed
workloads and random inputs and outputs (I/O).

Includes InfoSphere Optim Performance Manager Extended Edition to enable
IT staff to identify, diagnose, solve and prevent performance problems in
database and warehousing associated applications.

Faster, more cost-effective queries

Provides multidimensional clustering (MDC) to continuously and automatically
cluster table data in multiple dimensions. This improves query performance
by reducing I/O requirements.

Utilizes materialized query tables (MQTs) to improve the performance of
repeatedly requested queries by caching results.

Includes multi-temperature data management to place hot data on the
fastest and most expensive solid state disks (SSDs) and place other data on
less expensive storage systems.

Uses the DB2 Workload Manager to allow administrators to prioritize queries
from different users and applications and control the number of underlying
resources dedicated to these processes.
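
The idea behind materialized query tables, answering a repeated aggregate from a precomputed result instead of rescanning detail rows, can be sketched as follows. This is only an analogy. It uses Python's built-in `sqlite3` module, which has no MQT feature, so an ordinary summary table stands in for the MQT and is queried explicitly. The table and column names (`sales`, `sales_summary`, `region`, `sale_year`, `amount`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", 2023, 120.0),
        ("East", 2023, 80.0),
        ("East", 2024, 50.0),
        ("West", 2023, 200.0),
    ],
)

# Precompute the aggregate once; this table plays the role of the cached result.
conn.execute(
    "CREATE TABLE sales_summary AS "
    "SELECT region, sale_year, SUM(amount) AS total "
    "FROM sales GROUP BY region, sale_year"
)


def total_from_detail(region: str, year: int):
    """Aggregate over the detail rows on every call."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ? AND sale_year = ?",
        (region, year),
    ).fetchone()
    return row[0]


def total_from_summary(region: str, year: int):
    """Read the precomputed total; returns None when no summary row exists."""
    row = conn.execute(
        "SELECT total FROM sales_summary WHERE region = ? AND sale_year = ?",
        (region, year),
    ).fetchone()
    return row[0] if row else None


print(total_from_detail("East", 2023), total_from_summary("East", 2023))
```

The summary in this sketch goes stale as soon as new detail rows arrive, so a real deployment needs a refresh policy. How and when the proposed database refreshes its MQTs is a configuration decision outside the scope of this sketch.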

High availability and workload management capabilities

Provides a high availability disaster recovery (HADR) solution for both partial
and complete site failures. Protects against data loss by replicating data
changes from a source database.

Leverages DB2 pureScale reliability for always available transactions. When
coupled with HADR, it further mitigates the impact of unplanned outages.

Offers SQL compatibility features to help reduce the cost and risk of migrating
legacy Oracle database applications to DB2.

Uses adaptive compression technology to significantly reduce the need for
table reorganization and overall data maintenance for further cost savings
and performance improvements.

Simplified database administration and development tools

Enables users to design, model, reverse engineer and validate physical
database schemas using Design Studio.

Supports data movement and transformation through the SQL Warehousing
Tool (SQW), a part of Design Studio, to generate SQL for warehouse
maintenance and administration.

Includes InfoSphere Data Architect, a collaborative data design tool that can
discover, model, visualize, relate and standardize diverse and distributed
data assets.

Contains InfoSphere Optim Query Workload Tuner to help developers write
more efficient SQL queries, identify query candidates and facilitate analysis of
queries. Supports the analysis of single queries or query workloads to
improve performance.

Uses InfoSphere Optim Configuration Manager, a central repository designed
to help IT departments manage client configurations and security, track
configuration changes and balance workloads.

Provides security capabilities using row and column access controls based on
defined rules for users, groups and roles.
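
To make the row and column access control idea concrete, the sketch below applies one row rule and one column mask to an in-memory table. It is a conceptual illustration only, not DB2 syntax or behavior. The `User` class, the roles `manager` and `analyst`, and the columns `region` and `account_no` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    role: str
    region: str


def row_permitted(user: User, row: dict) -> bool:
    """Row rule: managers see all rows, other roles only their own region."""
    return user.role == "manager" or row["region"] == user.region


def mask_columns(user: User, row: dict) -> dict:
    """Column rule: only managers see the full account number."""
    masked = dict(row)
    if user.role != "manager":
        masked["account_no"] = "****" + row["account_no"][-4:]
    return masked


def query(user: User, table: list[dict]) -> list[dict]:
    """Apply the row rule first, then the column mask, for the given user."""
    return [mask_columns(user, row) for row in table if row_permitted(user, row)]


accounts = [
    {"region": "East", "account_no": "1234567890"},
    {"region": "West", "account_no": "9876543210"},
]

analyst = User(name="asha", role="analyst", region="East")
manager = User(name="li", role="manager", region="West")

print(query(analyst, accounts))
print(query(manager, accounts))
```

The point of enforcing such rules inside the database rather than in each application is that every query path, including reporting tools, sees the same filtered and masked view.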