MEET ANDREW
Founder, Blue Badge Insights
Chief Technology Officer, Tallan
Member, Microsoft BI Partner Advisory Council
Redmond Review columnist for Visual Studio Magazine and Redmond Developer News
brustblog.com
@andrewbrust
AGENDA
Concepts, terms, approaches and products for DW, multi-dimensional databases
Other BI product categories
CPM, scorecards, dashboards
Stacks, acquisitions, vendors
Open Source
Excel
Cloud
NoSQL
Fusion
BI'S PREMISE
Conventional RDBMSes are for operational, transactional use
Execute a trade; look up an order; book a flight
They perform data maintenance very well
If we want to move from tracking data to discovering information, we need different methodologies, architectures
We may need different technologies and products too
Reading data, not writing it
Transactions Process
Transaction Database
DIMENSIONAL MODEL
Measure
Dimension
Hierarchy
Grain
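The four terms above can be illustrated with a small sketch. The data, field names and the `roll_up` helper are all hypothetical: `amount` plays the role of the measure, `date` and `product` are dimensions, the date has a year > month > day hierarchy, and the grain is assumed to be one row per product per day.

```python
from collections import defaultdict

# Hypothetical fact rows at a grain of one row per product per day.
# "amount" is the measure; "date" and "product" are dimensions.
facts = [
    {"date": "2011-01-15", "product": "Widget", "amount": 100.0},
    {"date": "2011-01-20", "product": "Gadget", "amount": 50.0},
    {"date": "2011-02-03", "product": "Widget", "amount": 75.0},
    {"date": "2012-01-09", "product": "Widget", "amount": 20.0},
]

def roll_up(rows, level):
    """Sum the measure at one level of the date hierarchy (year > month > day)."""
    # ISO dates let us pick a hierarchy level by truncating the string.
    width = {"year": 4, "month": 7, "day": 10}[level]
    totals = defaultdict(float)
    for row in rows:
        totals[row["date"][:width]] += row["amount"]
    return dict(totals)

by_year = roll_up(facts, "year")
by_month = roll_up(facts, "month")
```

Rolling up never goes below the grain: with daily rows, the finest answer available is a daily total.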
STAR SCHEMAS
Physical data model
Central fact table
Multiple dimension tables
Used to constrain fact table queries
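A minimal star schema sketch, using Python's built-in sqlite3 module so it runs anywhere. The table and column names (`fact_sales`, `dim_date`, `dim_product`) are illustrative assumptions, not taken from any particular product. The query shows a dimension table constraining the fact table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes used to filter and group.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Central fact table: foreign keys to each dimension plus the measure.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);

INSERT INTO dim_date VALUES (1, 2011, 1), (2, 2011, 2), (3, 2012, 1);
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 15.0), (2, 1, 75.0), (3, 1, 20.0);
""")

# Constrain the fact table through a dimension (category = 'Hardware'),
# then aggregate the measure by another dimension attribute (year).
rows = conn.execute("""
    SELECT d.year, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    WHERE p.category = 'Hardware'
    GROUP BY d.year
    ORDER BY d.year
""").fetchall()
```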
DATA WAREHOUSE VS. DATA MART AND BILL INMON VS. RALPH KIMBALL
A Data Mart is suited to a purpose, specific team, department, division or other subgrouping of the organization's data
May be a subset of the DW
May be a standalone, tactical repository
Kimball vs. Inmon
Kimball (bottom-up): Data Warehouse is constructed from Data Marts; both use dimensional model/star schema
Inmon (top-down): Data Marts are extracted from the DW. DW uses 3rd Normal Form; Marts use Star Schema
Transactional data
Age of data
Unstructured data
Metadata
Still a challenger to Kimball
DW 2.0
DW VENDORS
All major RDBMS vendors, including:
IBM (DB2)
Oracle
Microsoft (SQL Server)
Specialty DW technologies and vendors
Formerly DATAllegro
Teradata
Netezza (acquired by IBM)
Greenplum, ParAccel, others
Multi-Dimensional Model
Multi-Dimensional Data Store Relational Data Store
FROM DW TO MULTIDIMENSIONAL
Data Warehouse
Transactions Process
Transaction Database
Multidimensional Hierarchical
OLAP Database
OLAP STANDARDS
MDX: MultiDimensional eXpression language
An OLAP query language that is superficially similar to SQL
XML for Analysis (XMLA)
A SOAP Web Service interface for performing admin/DDL and query tasks over HTTP
Both come from (Microsoft) Analysis Services
Are nonetheless implemented (to varying degrees) in numerous competing products
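To make the two standards concrete, here is a hedged sketch that wraps an MDX statement in an XMLA Execute request. The cube, measure, hierarchy and catalog names (`[Sales]`, `[Sales Amount]`, `[Date].[Calendar Year]`, `SalesCatalog`) are hypothetical, and `build_execute_request` is an illustrative helper rather than part of any library. Nothing is sent over HTTP; the code only builds the SOAP body that a client would POST to an XMLA endpoint.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"

# MDX looks like SQL, but it places sets of members on axes
# instead of returning a flat list of columns.
mdx = (
    "SELECT {[Measures].[Sales Amount]} ON COLUMNS, "
    "[Date].[Calendar Year].MEMBERS ON ROWS "
    "FROM [Sales]"
)

def build_execute_request(statement, catalog):
    """Return an XMLA Execute request (SOAP envelope) as a string."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = statement
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    return ET.tostring(envelope, encoding="unicode")

request_xml = build_execute_request(mdx, "SalesCatalog")
```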
COLUMN-ORIENTED
Store values for a given column next to each other, instead of values for a given row
More efficient for aggregating single columns
Facilitates high degree of compression
Manipulate large amounts of data in memory
Examples
Sybase IQ (now owned by SAP)
Vertica
QlikView
Microsoft PowerPivot (and SQL Server v.Next)
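A toy sketch of why the column layout helps. The sample data is made up, and run-length encoding is used here only as one simple compression technique that benefits from adjacent repeated values; it is not a claim about how any of the products listed above work internally.

```python
# Row layout: each tuple holds all the values for one row.
rows = [
    ("2011-01-15", "East", 100),
    ("2011-01-15", "East", 50),
    ("2011-01-16", "East", 75),
    ("2011-01-16", "West", 20),
]

# Column layout: the values of each column are stored next to each other.
columns = {
    "date": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

# Aggregating one column reads only that column, not every field of every row.
total = sum(columns["amount"])

def run_length_encode(values):
    """Collapse runs of repeated adjacent values into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]

# Low-cardinality columns tend to contain long runs, so they compress well.
region_rle = run_length_encode(columns["region"])
```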
OTHER BI TECHNOLOGIES
Enterprise Reporting
Enterprise Data Management (EDM)
Extract, transform and load (ETL)
Data Quality Management (DQM)
Master Data Management (MDM)
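As a rough illustration of ETL, the sketch below extracts rows from CSV text, applies a simple cleanup and one data-quality rule, and loads the result into an in-memory SQLite table. The sample data, the `sales` table and the three function names are assumptions made for the example; real ETL tools add scheduling, logging, error handling and much more.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with inconsistent casing, stray spaces
# and one row that is missing its amount.
raw = "order_id,region,amount\n1, east ,100.50\n2,WEST,20\n3,East,\n"

def extract(text):
    """Extract: parse the CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(records):
    """Transform: normalize region names and drop rows with no amount."""
    cleaned = []
    for rec in records:
        if not rec["amount"].strip():
            continue  # simple data-quality rule
        cleaned.append((
            int(rec["order_id"]),
            rec["region"].strip().title(),
            float(rec["amount"]),
        ))
    return cleaned

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(raw)))
loaded = conn.execute(
    "SELECT order_id, region, amount FROM sales ORDER BY order_id"
).fetchall()
```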
Management methodology
Perspectives, Objectives, KPIs, Strategy Maps
Data Visualization
Dashboards
If BI is the platform, CPM is the application
BSC EXAMPLE
ACQUISITIONS
Acquisitions define this market
Biggies
IBM -> Cognos, SPSS, Netezza
M&A RAMIFICATIONS
Most mega-vendors' stacks are stitched together
Best of breed is a difficult strategy
Common licenses may bundle disjoint product sets
VENDORS - COMMERCIAL
Mega-vendors
IBM
Oracle
SAP
Microsoft
Others
MicroStrategy
SAS
QlikTech
Tableau
Tibco/Spotfire
OPEN SOURCE
Pentaho
Jaspersoft and
EXCEL AS BI TOOL
Formal classifications notwithstanding, Excel is the #1 BI tool out there
Most BI technologies have an Excel add-in story
Excel has native database connectivity; PivotTables (and charts) are designed for dimensional analysis
Excel can query SQL Server Analysis Services cubes directly
CLOUD BI
Amazon Web Services-Based
Pentaho/ParAccel
Jaspersoft
GoodData
Proprietary
Oco
Birst
myDIALS
Tableau Public
Predixion
Microsoft: SQL Azure Federation and SQL Azure Reporting
NOSQL
Works with unstructured data; abhors schemas
NoSQL is not really a BI technology, but:
Uses MapReduce and sharding, which have some commonality with MPP architectures
Lives in the same Open Source and/or startup milieu as MPP and columnar
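The MapReduce and sharding ideas mentioned above can be sketched in a single process. All names here (`shard_for`, `map_phase`, `reduce_phase`, `map_reduce`) and the sample records are hypothetical. A real system would place each shard on a different node and reduce them in parallel, which is where the resemblance to MPP comes from; this version runs the shards one after another.

```python
from collections import defaultdict
import zlib

def shard_for(key, shard_count):
    """Pick a shard by hashing the key (crc32 is stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % shard_count

def map_phase(record):
    """Map: emit (key, value) pairs from one input record."""
    yield record["region"], record["amount"]

def reduce_phase(key, values):
    """Reduce: combine all values that share a key."""
    return key, sum(values)

def map_reduce(records, shard_count=2):
    # Shuffle: route every emitted pair to the shard that owns its key,
    # so all values for a given key land on the same shard.
    shards = [defaultdict(list) for _ in range(shard_count)]
    for record in records:
        for key, value in map_phase(record):
            shards[shard_for(key, shard_count)][key].append(value)

    # Each shard holds a disjoint set of keys and could be reduced independently.
    results = {}
    for shard in shards:
        for key, values in shard.items():
            reduced_key, total = reduce_phase(key, values)
            results[reduced_key] = total
    return results

records = [
    {"region": "East", "amount": 100},
    {"region": "West", "amount": 20},
    {"region": "East", "amount": 50},
]
totals = map_reduce(records)
```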
FUSION, CONSOLIDATION
MPP Data Warehouse appliances may be built on open source databases (even SQL Server PDW started that way)
MPP products adding columnar features
WHAT NEXT?
If you're not using BI, you should be
Start simple: small data marts, cubes
Exercise healthy skepticism of MPP, Columnar, Cloud, but do not dismiss any of them
Have an Excel strategy
Design for your users
After some experimentation, consider DQM, MDM strategy
RESOURCES
Links to most companies, products and subjects mentioned in this talk can be found at: http://bit.ly/IASABrustBILinks
QUESTIONS
andrew.brust@tallan.com
@andrewbrust