Ekta Parashar
Solutions Architect Manager, AISPL
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
• Quick recap of Amazon Redshift
• Additional resources
• Open Q&A
Who am I?
• Solutions Architect Manager
Amazon Redshift
• OLAP
• MPP
• Columnar
• PostgreSQL
[Diagram: Amazon Redshift shown alongside Amazon SWF, Amazon VPC, IAM, Amazon EC2, Amazon S3, AWS KMS, Amazon Route 53, and Amazon CloudWatch]
November 2018
February 2013
Amazon Redshift architecture
Massively parallel, shared-nothing columnar architecture
SQL clients/BI tools connect over JDBC/ODBC
Leader node
• SQL endpoint
• Stores metadata
• Coordinates parallel SQL processing
Compute nodes
• Local, columnar storage
• Execute queries in parallel
• Load, unload, backup, restore (to and from Amazon S3)
Amazon Redshift Spectrum nodes (1, 2, 3, 4, ... N)
• Execute queries directly against Amazon Simple Storage Service (Amazon S3)
[Diagram: one leader node and three compute nodes, each labeled 128 GB RAM, 16 cores, 16 TB disk]
Use the ETL, SQL, and BI tools you love
Data Integration, Business Intelligence, Systems Integrators
Amazon Redshift: What’s new and what’s coming
We’re innovating across the 4 things that matter most to customers
Improvements to speed
• Result caching
• Compiled code cache
• Late materialization
• Improvements for the COPY operation when ingesting data from Parquet and ORC formats
• Support for lateral column alias reference
• Queries operating over CHAR and VARCHAR columns
• Single-row inserts
• Queries with intermediate subquery results that can be distributed
• Query processing
• Query planning improvements
• 2x the number of tables in a cluster
• Complex EXCEPT subqueries
• Cluster resize operations
• DC2 nodes
• Short query acceleration
• Hash join memory utilization optimizations and cache line prefetching
• Resource management for memory-intensive queries
• Queries that refer to stable functions with constant expressions
• Expressions on the partition columns of external tables
• Faster string manipulation
10x faster than it was two years ago
• Automatically creates more clusters on demand
• Consistently fast performance even with thousands of concurrent queries
[Diagram: Amazon Redshift, managed S3]
Enabling Concurrency Scaling
Concurrency Scaling configuration
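The configuration screenshot is not recoverable from this export. As a rough sketch only, Concurrency Scaling is switched on per WLM queue, and the snippet below builds a WLM JSON document of that shape. The property names ("query_group", "query_concurrency", "concurrency_scaling") are given as recalled from the wlm_json_configuration parameter and should be checked against the current Amazon Redshift documentation; the "reports" query group is hypothetical.

```python
import json

# Assumption: each WLM queue is one JSON object and Concurrency Scaling is
# controlled per queue with "concurrency_scaling": "auto" or "off".
# The "reports" query group is a made-up name for illustration.
wlm_config = [
    {
        "query_group": ["reports"],
        "query_concurrency": 5,
        "concurrency_scaling": "auto",  # eligible queries may run on extra clusters
    },
    {
        "query_concurrency": 5,
        "concurrency_scaling": "off",  # this queue stays on the main cluster
    },
]

# The serialized string is what would be supplied as the parameter value.
wlm_json = json.dumps(wlm_config)
print(wlm_json)
```

Only queues marked "auto" would route eligible queries to the additional clusters; the other queue keeps queuing on the main cluster.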
Amazon Redshift’s throughput scales with concurrent users
[Chart: y-axis 0 to 10,000; x-axis: number of concurrently active users, 5 to 180]
• 97% of users will never see a charge for auto-scale resources
• For every 24 hours your main cluster is in use, we’ll provide a one-hour credit for concurrent cluster usage
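The credit rule on this slide can be sketched as simple arithmetic. This is a minimal illustration, assuming credits accrue in whole hours and ignoring any cap or expiry that the service may apply; the function name and parameters are hypothetical.

```python
def billable_concurrency_hours(main_cluster_hours: int, concurrency_hours: int) -> int:
    """Hours of Concurrency Scaling usage left after applying earned credits.

    Assumption: one credit hour is earned per full 24 hours of main-cluster
    use, as stated on the slide. Caps and expiry are not modeled.
    """
    credit_hours = main_cluster_hours // 24
    return max(0, concurrency_hours - credit_hours)


# A main cluster running for 30 days (720 hours) earns 30 credit hours.
print(billable_concurrency_hours(720, 20))  # fully covered by credits
print(billable_concurrency_hours(720, 35))  # 5 hours remain billable
```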
Improvements to scale
Integrate seamlessly with your data lake
• Renaming external table columns
• DATE data type
• Push the LENGTH() string function to Spectrum
• Support for Parquet, ORC, Avro, CSV, and other open file formats
• Retrieving metadata for late-binding views
• Support for Enhanced VPC Routing
• Specify the root of an S3 bucket as the source for an existing table
• Arrays of arrays and arrays of maps
• JDBC/ODBC
Sizing an Amazon Redshift cluster for production
Estimate the uncompressed size of the incoming data
Assume 3x compression (actual can be > 4x)
Target 30-40% free space (resize to add/remove storage as needed)
• Disk utilization should be at least 15% and less than 80%
Based on performance requirements, pick SSD or HDD
• If required, nodes can be added for increased performance
Example:
20 TB of uncompressed data ~= 6.67 TB compressed
Depending on performance requirements, recommendation:
• 4 x DC2.8xlarge or 5 x DS2.xlarge = ~10 TB of capacity
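The sizing steps above can be sketched as a short calculation. This is an illustration under stated assumptions, not an official sizing tool: the per-node storage figures (2.56 TB for DC2.8xlarge, 2 TB for DS2.xlarge) and the one-third free-space target (inside the 30-40% range) are assumptions chosen for the example, and the helper name is hypothetical.

```python
import math


def nodes_needed(uncompressed_tb, per_node_tb, compression_ratio=3.0, free_fraction=0.33):
    """Return (compressed_tb, required_capacity_tb, node_count).

    compression_ratio: assumed 3x, as on the slide.
    free_fraction: share of disk to keep free (slide suggests 30-40%).
    """
    compressed_tb = uncompressed_tb / compression_ratio
    # Capacity such that the compressed data fills only (1 - free_fraction) of it.
    required_tb = compressed_tb / (1.0 - free_fraction)
    return compressed_tb, required_tb, math.ceil(required_tb / per_node_tb)


# Assumed usable storage per node, in TB (verify against current node specs).
NODE_STORAGE_TB = {"dc2.8xlarge": 2.56, "ds2.xlarge": 2.0}

for node_type, per_node_tb in NODE_STORAGE_TB.items():
    compressed, required, count = nodes_needed(20, per_node_tb)
    print(f"{node_type}: {compressed:.2f} TB compressed, "
          f"{required:.2f} TB capacity target, {count} nodes")
```

With 20 TB of uncompressed data this reproduces the slide's recommendation of 4 DC2.8xlarge or 5 DS2.xlarge nodes; a different free-space target within the 30-40% range can change the node count.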
Resizing Amazon Redshift
Classic resize
• Data is transferred from old cluster to new cluster (within hours)
• Change node types
• Enable/disable full disk encryption
Elastic resize
• Nodes are added/removed to/from existing cluster (within minutes)
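The choice between the two resize types described above can be sketched as a small decision helper. The function and its parameters are hypothetical and encode only what this slide states: changing node types or toggling encryption calls for a classic resize, while adding or removing nodes can use an elastic resize.

```python
def choose_resize(change_node_type: bool = False, change_encryption: bool = False) -> str:
    """Pick a resize type from the rules listed on the slide.

    Classic resize: needed to change node types or to enable/disable
    full disk encryption (data is copied to a new cluster, within hours).
    Elastic resize: nodes are added or removed in place (within minutes).
    """
    if change_node_type or change_encryption:
        return "classic"
    return "elastic"


print(choose_resize())                        # only the node count changes
print(choose_resize(change_node_type=True))   # node type changes
```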
Classic resize
[Diagram: SQL clients/BI tools connect over JDBC/ODBC to the leader node; binary data transfer moves data from the old cluster to the new cluster]
Elastic resize
[Diagram: SQL clients/BI tools, leader node, and compute nodes with backups in Amazon S3]
• Elastic resize is requested
• Cluster is fully available for reads and writes
Elastic resize
[Diagram: SQL clients/BI tools and leader node, with a timeline: elastic resize is requested, elastic resize starts, elastic resize finishes]
Elastic resize
[Diagram: SQL clients/BI tools; Node 1, Node 2, Node 3, Node 4; restore from backups in Amazon S3]
• Cluster is fully available; data transfer continues in the background
• Hot blocks are moved first
When to use elastic vs. classic resize
Elastic resize Classic resize
Scale up and down for workload spikes ✔
Incrementally add/remove storage ✔
Change cluster instance type (SSD ←→ HDD) ✔
Improvements to simplicity
• Efficiency of backup performance
• CloudWatch metrics for workload execution breakdown
• Enhancements to VACUUM DELETE
• Automatic vacuum delete
• Current and trailing tracks for release updates
• Lateral column alias reference
• Stream real-time data in Parquet or ORC formats using Amazon Kinesis Data Firehose
• CloudWatch metrics for query duration by WLM queues
• Cluster resize operations
• Short query acceleration is now self-optimizing
• Query Monitoring Rules (QMR) support 3x more rules
• Free upgrade from DC1 RIs to DC2
• DISTSTYLE AUTO distribution style
• CloudWatch query runtime breakdown metric
• CloudWatch metrics for query throughput, query duration
Amazon Redshift query editor
Launched in October!
• Query data directly from the AWS console
• Results are instantly visible within the console
• No need to install and set up an external JDBC/ODBC client
Run stored procedures in Amazon Redshift
Bring your existing stored procedure and run it in Amazon Redshift.
Amazon Redshift performs administration automatically
ALL
• Federated authentication with single sign-on (IAM)
• Encrypt your previously unencrypted cluster with 1 click
• Cross-region backups for KMS-encrypted clusters (KMS)
[Diagram: data sources (OLTP, ERP, CRM, LOB, sensors, devices, social, web) and AWS services (AI Services, Amazon Athena, Amazon ES, Amazon Redshift, Kinesis, Amazon QuickSight)]
Amazon Redshift: new features
Speed
• Improving short query acceleration
• Elastic resize
• Concurrency Scaling
• 10x average performance improvement
Scale
• Spectrum Request Accelerator
• Unload to Parquet
Simplicity
• Auto-WLM Concurrency Setting
• Deferred Maintenance
• Auto-Vacuum & Auto-Analyze
• Snapshot Scheduler
• Auto Data Distribution
• Support for stored procedures
Security
• AWS Lake Formation integration
More than 10K customers use AWS
Nasdaq uses AWS to build a data lake
Data lake analytics with Amazon Redshift Spectrum
• different dashboards to support our stakeholders
• different fact & dimension tables in Amazon Redshift
• of Data users
More places to learn about Amazon Redshift
Try it out for yourself: https://aws.amazon.com/redshift/
Learn from AWS experts. Advance your skills and knowledge. Build your future in the AWS Cloud.
Why work with an APN Partner?
APN Partners are uniquely positioned to help your organization at any stage of your cloud adoption journey, and they:
• Share your goals, focused on your success
• Help you take full advantage of all the business benefits that AWS has to offer
• Provide services and solutions to support any AWS use case across your full customer life cycle
APN Partners with deep expertise in AWS services:
• AWS Managed Service Provider (MSP) Partners: APN Partners with cloud infrastructure and application migration expertise
• AWS Competency Partners: APN Partners with verified, vetted, and validated specialized offerings
• AWS Service Delivery Partners: APN Partners with a track record of delivering specific AWS services to customers
aws-apac-marketing@amazon.com
twitter.com/AWSCloud
facebook.com/AmazonWebServices
youtube.com/user/AmazonWebServices
slideshare.net/AmazonWebServices
twitch.tv/aws