E : info@rittmanmead.com
W : www.rittmanmead.com
Introducing Hadoop
A new approach to data processing and data storage
Rather than a small number of large, powerful servers, it spreads processing over
large numbers of small, cheap, redundant servers
Spreads the data you're processing over lots of distributed nodes
Has a scheduling/workload process (the Job Tracker) that sends parts of a job to each of the nodes
- a bit like Oracle Parallel Execution
And does the processing where the data sits (a Task Tracker runs alongside each Data Node)
- a bit like Exadata storage servers
Shared-nothing architecture
Low-cost and highly horizontally scalable
[Diagram: a Job Tracker sending work to Task Trackers, each paired with a Data Node]
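The scheduling idea above can be sketched in a few lines of Python. This is only an illustration, not Hadoop's API: the node names, the log lines and the functions `task_tracker` and `job_tracker` are all hypothetical, and the tasks run one after another here rather than in parallel on separate machines.

```python
from collections import Counter

# Hypothetical cluster: each "data node" holds its own block of log lines.
NODES = {
    "node1": ["GET /index.html", "GET /about.html"],
    "node2": ["GET /index.html", "POST /login"],
    "node3": ["GET /index.html"],
}


def task_tracker(local_lines):
    """Map step: runs where the data sits and only sees that node's lines."""
    return Counter(line.split()[1] for line in local_lines)


def job_tracker(nodes):
    """Hands one task to each node, then merges the partial counts (reduce step)."""
    partials = [task_tracker(lines) for lines in nodes.values()]
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


page_hits = job_tracker(NODES)
print(page_hits.most_common())
```

Only the small per-node counts travel back to the scheduler; the raw lines never leave the node that stores them, which is the point of moving the processing to the data.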
SmartScan
Oracle Big Data SQL
DATABASE_NAME                  TABLE_NAME
------------------------------ ------------------------------
default                        access_per_post
default                        access_per_post_categories
default                        access_per_post_full
default                        apachelog
default                        categories
default                        countries
default                        cust
default                        hive_raw_apache_access_log
[Diagram: Big Data SQL Server data flow, steps numbered 1 to 3. Labels: Data Node, Disks, External Table Services (RecordReader, SerDe), Smart Scan]
Apply filters
Project columns
Parse JSON/XML
Score models
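As a rough illustration of why applying filters and projecting columns close to the data helps, here is a minimal Python sketch. It is not how Big Data SQL is implemented; the records, field names and the function `scan_on_storage_node` are hypothetical.

```python
import json

# Hypothetical raw records as they might sit on a data node's disks.
RAW_RECORDS = [
    '{"custid": 1, "country": "UK", "page": "/index.html", "bytes": 512}',
    '{"custid": 2, "country": "US", "page": "/about.html", "bytes": 2048}',
    '{"custid": 3, "country": "UK", "page": "/blog.html", "bytes": 4096}',
]


def scan_on_storage_node(raw_records, predicate, columns):
    """Parse, filter and project next to the data; yield only what the query needs."""
    for raw in raw_records:
        row = json.loads(raw)                    # parse JSON
        if predicate(row):                       # apply filters
            yield {c: row[c] for c in columns}   # project columns


result = list(
    scan_on_storage_node(
        RAW_RECORDS,
        predicate=lambda r: r["country"] == "UK",
        columns=["custid", "page"],
    )
)
print(result)
```

Two of the three rows, and two of the four fields, are returned; everything else is discarded before it would have to cross the network to the database.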
Linear Regression
Fitting of an ordinary-least-squares
regression line to a set of number pairs
Descriptive Statistics
Correlations
Pearson's correlation coefficients
Crosstabs
Chi-squared, phi coefficient
Hypothesis Testing
Student's t-test, binomial test
Distribution
Anderson-Darling test, etc.
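For readers who want to see what two of these functions compute, the following Python sketch fits an ordinary-least-squares line and calculates Pearson's correlation coefficient for a small set of number pairs. It shows the arithmetic only; the function names and sample data are made up for illustration and are not the database's own implementation.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the pairs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of the paired values."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical sample lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = ols_fit(xs, ys)
r = pearson_r(xs, ys)
print(slope, intercept, r)
```

Because the sample points sit exactly on a straight line, the fitted slope and intercept recover that line and the correlation is at its maximum.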
Combined output in report form
movieapp_log_odistage.custid = CUSTOMER.CUSTID
Sqoop extract
movieapp_log_odistage.custid = customer.custid
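The join condition above is a plain equi-join on `custid`. A minimal Python sketch of what such a join produces, assuming hypothetical rows and extra columns (`movieid`, `name`) that are not taken from the slides:

```python
# Hypothetical rows standing in for the two sources.
movieapp_log_odistage = [
    {"custid": 101, "movieid": 7},
    {"custid": 102, "movieid": 9},
    {"custid": 999, "movieid": 3},  # no matching customer
]
customer = [
    {"custid": 101, "name": "Ann"},
    {"custid": 102, "name": "Bob"},
]


def equi_join(left, right, key):
    """Inner join: index the right side by key, then probe it with each left row."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            joined.append({**left_row, **right_row})
    return joined


joined = equi_join(movieapp_log_odistage, customer, "custid")
print(joined)
```

The log row with `custid` 999 drops out because no customer row matches it, as an inner join on this condition would behave.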
2
Register in ODI Model
1
Register in ODI Model
4
Hive table AP uses LKM Hive to Oracle (Big Data SQL)
2
IKM Oracle Insert
Summary
Oracle Big Data SQL extends Exadata capabilities over Hadoop and Hive
Makes Hive Metastore visible through Oracle Data Dictionary
Register Hive tables as ORACLE_HIVE external tables and include in Oracle SQL queries
Used with OBIEE, allows RPDs to be created across both Oracle + Hive data, with
query federation handled by Oracle RDBMS rather than BI Server
Enables use of Oracle Advanced Analytics functions over Hadoop data
Useful for ODI as way of using full set of join operators on Hive data, and simplifying
the addition of Hive data to Oracle mappings
For developers working with Exadata + BDA, useful addition to the data access toolkit
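To make the registration step concrete, the sketch below builds the text of a `CREATE TABLE ... ORGANIZATION EXTERNAL (TYPE ORACLE_HIVE ...)` statement in Python. Treat it as an assumption-laden outline: the helper `oracle_hive_ddl`, the column names and types, and the `DEFAULT_DIR` directory are hypothetical, and the exact access parameters should be checked against the Big Data SQL documentation for your release.

```python
def oracle_hive_ddl(oracle_table, hive_table, columns, hive_db="default"):
    """Build CREATE TABLE text for an ORACLE_HIVE external table (illustrative only)."""
    col_list = ",\n".join(f"  {name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE {oracle_table} (\n{col_list}\n)\n"
        "ORGANIZATION EXTERNAL (\n"
        "  TYPE ORACLE_HIVE\n"
        "  DEFAULT DIRECTORY DEFAULT_DIR\n"  # assumed directory object
        "  ACCESS PARAMETERS (\n"
        f"    com.oracle.bigdata.tablename: {hive_db}.{hive_table}\n"
        "  )\n"
        ")"
    )


# Hypothetical columns for the Hive table "apachelog" in the "default" database.
ddl = oracle_hive_ddl(
    "apachelog",
    "apachelog",
    [("host", "VARCHAR2(100)"), ("request", "VARCHAR2(4000)")],
)
print(ddl)
```

Once a statement like this has been run, the table can be queried and joined like any other Oracle table, which is what the OBIEE and ODI points above rely on.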