to your BI Landscape
Tillmann Eitelberg
Oliver Engels
Oliver Engels, CEO, oh22data AG
Tillmann Eitelberg, CEO, oh22information services GmbH
Azure Data Lake Analytics: Decision tree
[Figure: decision tree. Legible label fragments: "cheap to", "bi-lingual", "easy to use", "enterprise friendly", "operationalize it", "exclusive to Java"]
My problem?
I want to do big data when I have a real big data problem!
Everything else I do with my favourite SQL Server!
If I do big data, it will never stand alone, I need to integrate!
Azure Data Lake as part of Cortana Analytics Suite
Sources: Business apps, Custom apps, Sensors and devices
Information Management: Azure Data Factory, Azure Data Catalog, Azure Event Hub
Big Data Stores: Azure SQL Data Warehouse, Azure Data Lake Store
Machine Learning and Analytics: Azure Machine Learning, Azure Stream Analytics, Azure HDInsight (Hadoop), Azure Data Lake Analytics
Dashboards and Visualizations: Power BI
Personal Digital Assistant: Cortana
Perceptual Intelligence: Face, vision; Speech, text
Business Scenarios: Recommendations, customer churn, forecasting, etc.
Consumers: People, Automated Systems
Why data lakes?
Gather data from all sources, store indefinitely, analyze, see results
Iterate
[Figure: data ingestion paths into ADL Store. Sources shown: server logs, Azure SQL DB, Azure SQL DW, Azure Tables (Table Storage). Transfer mechanisms shown: Apache Flume, Apache Sqoop, built-in copy service, .NET SDK, JavaScript CLI, Azure Portal, Azure PowerShell]
Note: If you are using Hadoop (MapReduce programs, Hive, or HBase) or Spark, you will not program directly against Azure Data Lake Store; they all access Azure Data Lake Store transparently under the covers.
Developing scripting applications
Provides a native Windows and cross-platform (Mac, Linux) scripting experience
Scripting against Azure Data Lake Store: Azure PowerShell cmdlets, JavaScript CLI
Scripting operations include:
Create new directories
List the contents of a directory
Upload files to a directory
Delete files/directories
Rename files/directories
…
Federated queries: Query data where it lives
Easily query data in multiple Azure data stores without moving it to a single store
Benefits
Avoid moving large amounts of data across the network between stores
[Figure labels: Azure Storage Blobs, Azure SQL DB]
Combining RowSets
U-SQL provides a number of operators to combine RowSets
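As a rough illustration of these operators, the following U-SQL sketch combines two small inline rowsets with UNION, INTERSECT and EXCEPT. The rowset names, values and output paths are invented for the example, and the script has not been run against a live ADLA account.

```usql
// Two hypothetical inline rowsets (illustrative values only).
@a =
    SELECT * FROM (VALUES ("DE", 100), ("FR", 200)) AS T(Countryiso, Sales);
@b =
    SELECT * FROM (VALUES ("FR", 200), ("US", 300)) AS T(Countryiso, Sales);

// UNION ALL keeps duplicate rows, UNION DISTINCT removes them.
@union_all =
    SELECT Countryiso, Sales FROM @a
    UNION ALL
    SELECT Countryiso, Sales FROM @b;

@union_distinct =
    SELECT Countryiso, Sales FROM @a
    UNION DISTINCT
    SELECT Countryiso, Sales FROM @b;

// INTERSECT returns rows present in both rowsets.
@in_both =
    SELECT Countryiso, Sales FROM @a
    INTERSECT DISTINCT
    SELECT Countryiso, Sales FROM @b;

// EXCEPT returns rows of @a that do not appear in @b.
@only_a =
    SELECT Countryiso, Sales FROM @a
    EXCEPT DISTINCT
    SELECT Countryiso, Sales FROM @b;

// Hypothetical output paths in the default Data Lake Store account.
OUTPUT @union_all TO "/output/union_all.csv" USING Outputters.Csv();
OUTPUT @union_distinct TO "/output/union_distinct.csv" USING Outputters.Csv();
OUTPUT @in_both TO "/output/in_both.csv" USING Outputters.Csv();
OUTPUT @only_a TO "/output/only_a.csv" USING Outputters.Csv();
```

Both sides of a set operator need the same column layout, which is why each SELECT lists the columns explicitly.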
OK, I have:
Transactional Data on my SQL Azure DB (PaaS)
Master Data on my
Create secrets for your PaaS and IaaS databases in the database (here: master) of your ADLA account
Check that your network settings allow access to the databases from outside
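Once the secrets exist, each remote database is typically registered as a U-SQL data source. The sketch below shows what that might look like for the two sources queried in this deck. The data source names SALES and COUNTRY_MASTER and the database name TransformationDB come from the federated queries in the slides; the credential names and the database name SalesDB are hypothetical placeholders, and the credentials are assumed to have been created beforehand.

```usql
USE DATABASE master;

// Azure SQL Database (PaaS) holding the transactional sales data.
// SalesDbCredential and SalesDB are hypothetical names.
CREATE DATA SOURCE IF NOT EXISTS SALES
FROM AZURESQLDB
WITH
(
    PROVIDER_STRING = "Database=SalesDB;Trusted_Connection=False;Encrypt=True",
    CREDENTIAL = SalesDbCredential,
    REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)
);

// SQL Server in an Azure VM (IaaS) holding the country master data.
// CountryMasterCredential is a hypothetical name.
CREATE DATA SOURCE IF NOT EXISTS COUNTRY_MASTER
FROM SQLSERVER
WITH
(
    PROVIDER_STRING = "Database=TransformationDB;Trusted_Connection=False;Encrypt=True",
    CREDENTIAL = CountryMasterCredential,
    REMOTABLE_TYPES = (bool, byte, sbyte, short, ushort, int, uint, long, ulong, decimal, float, double, string, DateTime)
);
```

The server address is taken from the credential, so only the database and connection options go into the provider string.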
How do I integrate? Let's do a sample
Federated Queries:
//Get Data from Azure SQL Database
@sales_internal = SELECT
    Countryiso
    ,Market
    ,Product
    ,SYear
    ,SMonth
    ,Sales
    ,Units
FROM EXTERNAL SALES EXECUTE @"
    SELECT
    [COUNTRYISO] AS Countryiso
    ,[MARKET] AS Market
    ,[PRODUCT] AS Product
    ,[SMONTH] AS SMonth
    ,[SYEAR] AS SYear
    ,[SALES] AS Sales
    ,[UNITS] AS Units
    FROM [dbo].[Sales]";

//Get Data from SQL Server 2016 IaaS Database on Azure
@country_mds = SELECT *
FROM EXTERNAL COUNTRY_MASTER EXECUTE @"
    SELECT
    [Code]
    ,[Name] AS Country
    ,[ISO2] AS Countryiso
    ,[Capital]
    ,[Area]
    ,[Population]
    FROM [TransformationDB].[dbo].[vw_GetCountryMaster]";
How do I integrate? Let's do a sample
Federated Queries:
// Calculate Sales
// @sales_external_unp is not defined on this slide.
// As on the slide, Units uses the same "SALES" filter as Sales; a different Ttype value was probably intended.
@sales_external =
    SELECT C.Countryiso,
           S.Market,
           S.Product,
           S.SYear,
           S.SMonth,
           SUM((Ttype == "SALES") ? SValue : 0 ) AS Sales,
           SUM((Ttype == "SALES") ? SValue : 0 ) AS Units
    FROM @sales_external_unp AS S INNER JOIN @country_mds AS C ON
         S.Country == C.Country
    GROUP BY
           C.Countryiso,
           S.Market,
           S.Product,
           S.SYear,
           S.SMonth;

// Union the two streams from Internal Sales and External Marketresearch
@sales =
    SELECT Countryiso,
           Market,
           Product,
           SYear,
           SMonth,
           Sales,
           Units,
           "External" AS Source
    FROM @sales_external
    UNION
    SELECT Countryiso,
           Market,
           Product,
           Convert.ToInt32(SYear) AS SYear,
           Convert.ToInt32(SMonth) AS SMonth,
           Sales,
           Units,
           "Internal" AS Source
    FROM @sales_internal;
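The slides stop at the @sales rowset and do not show how it is written out. A minimal sketch, assuming the result should land as a CSV file in the default Data Lake Store account (the path is a hypothetical placeholder), might be:

```usql
// Write the combined rowset to a CSV file with a header row.
// "/output/sales_combined.csv" is a hypothetical path.
OUTPUT @sales
TO "/output/sales_combined.csv"
USING Outputters.Csv(outputHeader: true);
```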
How do I integrate?
Using ADF
https://azure.microsoft.com/en-us/documentation/articles/data-factory-data-movement-activities/#supported-data-stores-and-formats
How do I integrate? Access ADLA Results
Azure Data Lake Analytics results can only be output to:
How can you integrate this into your on-prem world without downloading data?
PolyBase!
Build an external table with PolyBase to access the data
Currently works only with Azure Blob Storage, not ADLS
DEMO
PolyBase to access Azure Data Lake Analytics results
How do I integrate? Trigger ADLA Jobs
Visual Studio
ADF
Azure Portal
.NET (SDKs)
PowerShell:
#Azure Data Lake Analytics Job Execution via PowerShell
$ADLA_Account = "adlaoh22"
$usql = "C:\Users\oengels\OneDrive\PASS\Summit2016\FederatedQuery.usql"
Woohoo!
That is nice technology, I will start today with the preview!
Thank You!