
Scaling

ThisM
I'm not as cool as Zach
There's no picture for this; it would change too much
The Living and Evolving AWS Cloud

Infrastructure building blocks
Platform building blocks
Tools to access services
Cross Service features
Each day, AWS adds the
equivalent server capacity to power Amazon
when it was a global, $2.76B enterprise
(circa 2000)
The Cloud Scales: Amazon S3 Growth
Total number of objects stored in Amazon S3:
Q4 2006: 2.9 billion
Q4 2007: 14 billion
Q4 2008: 40 billion
Q4 2009: 102 billion
Q4 2010: 262 billion
Q3 2011: 566 billion
Peak requests: 370,000+ per second
Global Infrastructure for Global Enterprises
AWS Regions:
US West (Northern California)
US East (Northern Virginia)
Europe West (Dublin)
Asia Pacific Region (Singapore)
Asia Pacific Region (Tokyo)
GovCloud (US ITAR Region)
AWS Edge Locations
Powerful, highly scalable, highly available, highly responsive, fault-tolerant, cost-effective, globally deployed web application
[Diagram: Seriouslyradwebsite.com on an Elastic IP, served by one Amazon EC2 instance (Apache, PHP Mod, MySQL) in Availability Zone #1, with backups in Amazon S3 buckets]
Pattern #1: Design for failure and nothing will fail
[Diagram: the same single-instance setup, now with Amazon EBS root and data volumes attached to the EC2 instance, and with snapshots, logs, static data and backups stored in Amazon S3 buckets]
Pattern #2: Edge cache static content
[Diagram: an Amazon CloudFront distribution in front of the S3 buckets serves Media.Seriouslyradwebsite.com (static data); Seriouslyradwebsite.com (dynamic data) still reaches the EC2 instance through the Elastic IP]
[Diagram: the same layout with Amazon RDS added and the Amazon EBS volumes no longer shown; logs, static data and backups remain in Amazon S3]
[Diagram: Production EC2 instance (App v1.1, Apache, PHP Mod) at Seriouslyradwebsite.com, Elastic IP 183.2.3.1, backed by Amazon RDS (MySQL); Staging EC2 instance (App v1.2, Apache, PHP Mod) at staging.Seriouslyradwebsite.com, dynamic IP 172.3.1.4]
Cloud Tip: Smart use of Elastic IPs (when upgrading to new versions of your app)
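The Elastic IP tip above amounts to re-pointing one stable public address from the old instance to the new one. What follows is a minimal sketch in plain Python, not AWS API code: `ElasticIpTable`, `upgrade`, `rollback` and the instance names are hypothetical, and only the address 183.2.3.1 comes from the slide.

```python
class ElasticIpTable:
    """Toy model of Elastic IP associations: public IP -> instance name."""

    def __init__(self):
        self._owner = {}

    def associate(self, ip, instance):
        """Point `ip` at `instance` and return the previous owner (or None)."""
        previous = self._owner.get(ip)
        self._owner[ip] = instance
        return previous

    def resolve(self, ip):
        """Return the instance currently behind `ip`."""
        return self._owner[ip]


def upgrade(table, ip, new_instance, healthy):
    """Move `ip` to `new_instance` only if it passes the health check."""
    if not healthy(new_instance):
        return False
    table.associate(ip, new_instance)
    return True


def rollback(table, ip, old_instance):
    """Point `ip` back at the previous instance."""
    table.associate(ip, old_instance)


if __name__ == "__main__":
    table = ElasticIpTable()
    table.associate("183.2.3.1", "production-v1.1")
    # Hypothetical health check that always passes.
    upgrade(table, "183.2.3.1", "staging-v1.2", healthy=lambda name: True)
    print(table.resolve("183.2.3.1"))
```

Because clients only ever see the address, the cutover and a possible rollback are both a single re-association.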
Principles of elastic cloud architectures

Resilient to reboot and re-launch:
Design the system such that in the event of a failure, it is resilient
enough to automatically re-launch and restart. Forcefully fail and test.

Stateless:
Extract stateful components out and make them stateless

Packable into an AMI:
Package and deploy your application into an AMI so it can run on an
Amazon EC2 instance. Try to run multiple instances of the application
on one EC2 instance, if needed. Run multiple instances on multiple
Amazon EC2 instances.

Decouple:
Isolate the components using Amazon SQS. Decouple code from
deployment and configuration.
Pattern #3: Implement Elasticity
[Diagram: an Amazon Machine Image is added for the EC2 instance; the site is now labelled www.myphpwebsite.com (dynamic data) and media.myphpwebsite.com (static data)]
[Diagram: Amazon Route 53 (DNS) and an Elastic Load Balancer in front of an Auto Scaling group, with the Amazon Machine Image alongside]
[Diagram: Amazon CloudWatch (monitoring), Amazon SimpleDB (catalog and config data) and Amazon SNS (notifications) are added]
[Diagram: Auto Scaling group "Web App Tier" with two Apache/PHP Mod instances behind the load balancer, backed by Amazon RDS (MySQL), in Availability Zone #1]
[Diagram: Tight coupling: Controller A, Controller B and Controller C connected directly. Loose coupling using queues: the same controllers connected through queues (Q)]
Cloud Tip: Decouple components. The looser they're coupled, the bigger they scale.
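The loose-coupling idea above can be tried out locally. This is a minimal sketch that uses Python's in-process `queue.Queue` as a stand-in for a hosted queue such as Amazon SQS; the `stage` and `run_pipeline` helpers and the `None` stop signal are assumptions made for illustration.

```python
import queue
import threading


def stage(name, inbox, outbox):
    """Consume items from inbox, tag them, and pass them on; None means stop."""
    while True:
        item = inbox.get()
        if item is None:
            outbox.put(None)  # forward the stop signal downstream
            break
        outbox.put(f"{item}>{name}")


def run_pipeline(items):
    """Push items through controllers A, B and C, which only share queues."""
    q_a, q_b, q_c, done = (queue.Queue() for _ in range(4))
    workers = [
        threading.Thread(target=stage, args=("A", q_a, q_b)),
        threading.Thread(target=stage, args=("B", q_b, q_c)),
        threading.Thread(target=stage, args=("C", q_c, done)),
    ]
    for worker in workers:
        worker.start()
    for item in items:
        q_a.put(item)
    q_a.put(None)
    for worker in workers:
        worker.join()

    results = []
    while True:
        item = done.get()
        if item is None:
            break
        results.append(item)
    return results


if __name__ == "__main__":
    print(run_pipeline(["job1", "job2"]))
```

No controller calls another directly, so any stage could be replaced by several workers reading the same queue without the others noticing.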
Pattern #4: Leverage Multiple Availability Zones
[Diagram: the Amazon RDS primary in Availability Zone #1 gains a Multi-AZ standby slave in Availability Zone #2; the Auto Scaling web app tier, Elastic Load Balancer, Route 53, CloudFront and S3 are as before]
Pattern #5: Isolate read and write traffic; isolate static and dynamic traffic
[Diagram: the Amazon RDS primary (master) feeds two read replicas through asynchronous replication and keeps a Multi-AZ standby across Availability Zones #1 and #2; static data is served from S3 through CloudFront at media.myphpwebsite.com, dynamic data through the Elastic Load Balancer at www.myphpwebsite.com]
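Isolating read and write traffic usually ends up as a small routing decision in the application. The sketch below is one hedged way to express it: the `ReadWriteRouter` class and the endpoint names are hypothetical, and the "starts with SELECT" test is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across read replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over replicas; fall back to the master if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        """Return the endpoint that should run this statement."""
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["replica-1", "replica-2"])
    for statement in ("SELECT * FROM posts", "INSERT INTO posts VALUES (1)"):
        print(statement, "->", router.route(statement))
```

Since replication to the replicas is asynchronous, a read that must see a write made moments ago should be sent to the master instead.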
Pattern #6: Automate your in-cloud Software Development and Deployment Lifecycle
YAGNI (You ain't gonna need it)
YAGNI-UYRNI (You ain't gonna need it, until you really need it)
Automate Build and Deployment Using Cloud APIs

1. Keep absolutely everything in version control
2. Commit early and commit often
3. Always check in to trunk and avoid branching
4. Take responsibility if your check-in breaks the build
5. Automate the build, test, deploy process
6. Be prepared to stop the mainline when/if the build breaks
7. Create a comprehensive automated test suite
8. Have only one way to deploy, and have everybody use that same way
9. Be prepared to revert to the previous revision
10. Continuously improve collaboration and increase speed of feedback
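Practices 5 and 6 in the list above (automate build, test and deploy; stop when the build breaks) can be sketched as a tiny stage runner. The `run_stages` helper and the stage names are hypothetical; a real CI server does far more than this.

```python
def run_stages(stages):
    """Run (name, callable) stages in order and stop at the first failure.

    Returns a report: a list of (name, "passed" | "failed") tuples.
    A stage fails if it returns a falsy value or raises an exception.
    """
    report = []
    for name, step in stages:
        try:
            ok = bool(step())
        except Exception:
            ok = False
        report.append((name, "passed" if ok else "failed"))
        if not ok:
            break  # later stages are never started
    return report


if __name__ == "__main__":
    pipeline = [
        ("build", lambda: True),
        ("test", lambda: False),   # simulated test failure
        ("deploy", lambda: True),  # skipped because "test" failed
    ]
    for name, status in run_stages(pipeline):
        print(f"{name}: {status}")
```

The returned report is what would be sent back to the developer whose commit triggered the run.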

Application Containers - JBoss, Tomcat, IIS, Mongrel. NOTE: there are so many app containers, I'm not going to try to list all of them.
Build Tools - Ant, AntContrib, NAnt, MSBuild, Buildr, Gant, Gradle, make, Maven, Rake
Code Review - Crucible
Code Insight - Fisheye
Continuous Integration - Bamboo, Jenkins, AntHill Pro, Go, TeamCity, TFS 2010
Database - Hibernate, MySQL, Liquibase, Oracle, PostgreSQL, SQL Server, SimpleDB, SQL Azure, Ant, MongoDB
Database Change Management - dbdeploy, Liquibase
Data Center Configuration Automation - Capistrano, Cobbler, BMC Bladelogic, CFEngine, IBM Tivoli Provisioning Manager, Puppet, Chef, Bcfg2, AWS CloudFormation, Windows Azure AppFabric. NOTE: There are many names and overlap for this tool "category".
Dependency Management - Ivy, Archiva, Nexus, Artifactory, Bundler
Deployment Automation - Java Secure Channel, ControlTier, Altiris, Capistrano, Fabric, Func
Information Sharing - Confluence, Google Apps
Installer - InstallShield, IzPack
Integrated Development Environment (IDE) - Eclipse, IDEA, Visual Studio
Issue Tracking - Greenhopper, JIRA
Multi-Type - rPath
Passwords - PassPack, PasswordSafe
Protected Configuration - ESCAPE, ConfigGen
Project Management - JIRA, Pivotal Tracker, SmartSheet
Provisioning - JEOS, BoxGrinder, CLIP, Eucalyptus, AppLogic
Reporting/Documentation - Doxygen, Grand, GraphViz, JavaDoc, NDoc, SchemaSpy, UmlGraph
Static Analysis - CheckStyle, Clover, Cobertura, FindBugs, FxCop, JavaNCSS, JDepend, PMD, Sonar, Simian
Systems Monitoring - CloudKick, Nagios, Zabbix, Zenoss
Testing - AntUnit, Cucumber, DbUnit, webrat, easyb, Fitnesse, JMeter, JUnit, NBehave, SoapUI, Selenium, RSpec, SauceLabs
Version-Control System - SVN/Subversion, git, Perforce
Source: Paul Duvall's blog
http://blog.stelligent.com/integrate-button/2011/03/list-of-software-tools-for-continuous-delivery-in-the-cloud.html
Cloud Continuous Integration
[Diagram: Dev commits to Git/master in version control; the CI server pulls the code, runs distributed builds and tests in parallel, sends a build report to Dev, and stops everything if the build failed; the package builder creates AMIs and a repo; the deploy server pushes config and installs code, config and tests into the Test, Staging and Prod environments, which are generated from CloudFormation templates]
Pattern #7: Cache as much as possible
[Diagram: a cache tier (Memcache) is added in Availability Zone #1 alongside the Auto Scaling web tier (Apache, PHP Mod, Tomcat) behind the Elastic Load Balancer; RDS master in Availability Zone #1, RDS Multi-AZ standby in Availability Zone #2]
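A cache tier like the one in this pattern is commonly used in a cache-aside fashion: look in the cache first, and go to the database only on a miss. The sketch below is illustrative only; a plain dict stands in for Memcache, and `CacheAside`, `load_from_db` and `slow_lookup` are hypothetical names.

```python
class CacheAside:
    """Cache-aside reads: check the cache, fall back to the database on a miss."""

    def __init__(self, load_from_db):
        self._cache = {}          # stand-in for a Memcache client
        self._load = load_from_db
        self.db_reads = 0         # counts how often the database was hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_reads += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a key after the underlying row changes."""
        self._cache.pop(key, None)


if __name__ == "__main__":
    def slow_lookup(user_id):
        # Hypothetical database query.
        return {"id": user_id, "name": f"user-{user_id}"}

    users = CacheAside(slow_lookup)
    users.get(7)
    users.get(7)
    print("database reads:", users.db_reads)
```

Invalidating on write keeps the cache from serving stale rows, at the cost of one extra database read afterwards.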
Pattern #8: Hardening security at every stage
SAS 70 Type II Audit
ISO 27001/2 Certification
PCI DSS 2.0 Level 1-5
HIPAA/SOX Compliance
FISMA A&A Low



Enforce IAM policies
Use MFA, VPC, S3 bucket policies, EC2 security groups, EFS in EC2, etc.
Encrypt data in transit
Encrypt data at rest
Protect your AWS credentials
Rotate your keys
Secure your application, OS, stack and AMIs
In the cloud, security is a shared responsibility
Infrastructure security: How we secure our infrastructure
Services security: What security options and features are available to you?
Application security: How can you secure your application and what is your responsibility?
[Diagram: the cache-tier architecture again, annotated with the security group rules below]
# Permit HTTP(S) access to Web Layer from the entire Internet
ec2auth Web -p 80,443 -s 0.0.0.0/0

# Permit Web Layer access to App Layer
ec2auth App -p 8000 -s 1.2.3.4/32

# Permit App Layer access to DB
ec2auth DB -p 3209 -s 1.2.3.4/32

# Permit administrative SSH access to all three layers
ec2auth Web -p 22 -o App
ec2auth DB -p 22 -o App
[Diagram: the full architecture spread over Availability Zone #1 through #n: Route 53 (DNS) and an Elastic Load Balancer in front of Auto Scaling web tiers (Apache, PHP Mod, Tomcat) and Memcache cache tiers in each zone, with the DB master, a read replica and a Multi-AZ slave in Amazon RDS, and CloudFront distributions over Amazon S3 buckets for media.myphpwebsite.com (static data)]
Pattern #9: Go Global Quickly (with single API)

Centralized Architecture (Asia Pacific, Europe, US-West, US-East)
Web application is hosted in a centralized location in the US-East region
Web application is accessed from the US-West, US-East, Europe and Asia Pacific regions

Geo-Distributed Architecture (Asia Pacific, Europe, US-West, US-East, with a data replicator)
Web application is hosted globally (US-West, US-East, Europe and Asia Pacific)
Web application requests are directed to the servers residing in the nearest region
Data is synchronized between databases across regions using a custom data replicator program

[Diagram: a Geo IP/directional DNS server routes US East, US West, Europe and Asia traffic to an ELB in US-East, US-West, EU-West and AP-SOUTHEAST respectively; each region runs an Auto Scaling group (Web App Tier) with an RDS master and an RDS Multi-AZ standby in a second zone (US-East-1b, US-West-1b, EU-West-1b, AP-SOUTHEAST-1b); a software-based data replicator links the regions]
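The directional-DNS idea in the geo-distributed architecture can be sketched as a lookup with a fallback. The traffic labels and region names below are taken from the diagram; the `pick_region` function, the mapping table and the fallback order are assumptions, not how any particular DNS service decides.

```python
# Which region each traffic source prefers (labels as they appear on the slide).
PREFERRED_REGION = {
    "US East": "US-East",
    "US West": "US-West",
    "Europe": "EU-West",
    "Asia": "AP-SOUTHEAST",
}


def pick_region(client_area, healthy_regions, default="US-East"):
    """Return the nearest healthy region for a client, with simple fallbacks."""
    preferred = PREFERRED_REGION.get(client_area, default)
    if preferred in healthy_regions:
        return preferred
    if default in healthy_regions:
        return default
    if healthy_regions:
        # Deterministic last resort: first healthy region in sorted order.
        return sorted(healthy_regions)[0]
    raise RuntimeError("no healthy region available")


if __name__ == "__main__":
    healthy = {"US-East", "US-West", "EU-West"}  # AP-SOUTHEAST is down here
    for area in ("Europe", "Asia"):
        print(area, "->", pick_region(area, healthy))
```

Falling back to another region only works if the data replicator keeps that region's database reasonably current.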
Pattern #10: Keep optimizing and see the savings in the next month's bill
[Chart: cost (0 to 450,000) over months 1 to 24 for On Demand, 1-year RI and 3-year RI, with numbered callouts 1 to 3]
1-year RI versus On Demand:
cost savings realized after first 6 months of usage
3-year RI versus On Demand:
cost savings realized after first 9 months of usage.
3-year RI versus 1-year RI:
Net savings of 3-year RI versus 1-year RI begin by month 13 and
continue throughout the RI term (additional 23 months of savings)
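The break-even points above follow from comparing cumulative cost month by month. Here is a small worked sketch; the prices are entirely hypothetical and were picked only so that the crossovers resemble the ones described, and `cumulative_cost` and `first_cheaper_month` are illustrative helpers.

```python
import math


def cumulative_cost(month, upfront, monthly, term_months):
    """Total spend through `month` (1-based); upfront is paid again at each renewal."""
    terms_started = math.ceil(month / term_months)
    return upfront * terms_started + monthly * month


def first_cheaper_month(plan_a, plan_b, horizon=36):
    """First month where plan_a's cumulative cost is strictly below plan_b's."""
    for month in range(1, horizon + 1):
        if cumulative_cost(month, **plan_a) < cumulative_cost(month, **plan_b):
            return month
    return None


# Hypothetical prices, not AWS list prices.
ON_DEMAND = {"upfront": 0, "monthly": 100, "term_months": 1}
RI_1_YEAR = {"upfront": 300, "monthly": 50, "term_months": 12}
RI_3_YEAR = {"upfront": 450, "monthly": 50, "term_months": 36}

if __name__ == "__main__":
    print("1-year RI beats On Demand from month", first_cheaper_month(RI_1_YEAR, ON_DEMAND))
    print("3-year RI beats On Demand from month", first_cheaper_month(RI_3_YEAR, ON_DEMAND))
    print("3-year RI beats 1-year RI from month", first_cheaper_month(RI_3_YEAR, RI_1_YEAR))
```

With these made-up numbers the 1-year RI draws level with On Demand at month 6, the 3-year RI at month 9, and the 3-year RI overtakes the 1-year RI at month 13, when the 1-year term has to be renewed.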
Pattern #1: Design for failure and nothing will fail
Pattern #2: Edge cache static content
Pattern #3: Implement Elasticity
Pattern #4: Leverage Multiple Availability Zones
Pattern #5: Isolate read and write traffic; Isolate static and dynamic traffic
Pattern #6: Automate your in-cloud Development and Deployment Lifecycle
Pattern #7: Cache as much as possible
Pattern #8: Hardening security at every stage
Pattern #9: Go global quickly (with single API)
Pattern #10: Keep optimizing and see the savings in the next month's bill
THIS IS JUST ONE EXAMPLE!
Your stuff is, uhm... different.


Architecture is the practice of figuring out how
your stuff ought to survive change, and
making it so.
VPC is part of the Autodesk internal network
Source: Autodesk
NYTimes TimesMachine (June 2008)
1851-1922 articles, TIFF -> PDF
Input: 11 million articles (4 TB of data)

What did he do?
Spun up 100 EC2 instances for 24 hours
Input: all data on S3
Output: 1.5 TB of data
Used: Hadoop, iText, JetS3t
SAP as a customer
AWS Cloud Usage @ SAP
Customer Demo and POC
Business ByDesign
BusinessOne
Training
Workshops
Development and Testing
SAP Carbon Impact
OnDemand runs on AWS!



SAP AWS footprint (as of Nov 2010)



Source: SAP
Case Study: Optimizing Video Transcoding Workloads (On-demand + Spot + Reserved)

Free Offering
Optimize for reducing cost
Acceptable delay limits
Implementation: set persistent requests; use on-demand instances if delayed
Maximum bid price < on-demand rate
Get your set reduced price for your workload

Premium Offering
Optimized for response times
No delays
Implementation: invest in RIs; use on-demand for elasticity
Maximum bid price >= on-demand rate
Get instant capacity for a higher price
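The two offerings in the case study boil down to a capacity decision per job. The sketch below is one possible reading of it, not the customer's actual logic: `choose_capacity`, its parameters and the prices are hypothetical, and the premium branch leaves out the high-bid Spot option for brevity.

```python
def choose_capacity(offering, spot_price, max_bid, waited_minutes=0,
                    delay_limit_minutes=0, reserved_free=0):
    """Pick where the next transcoding job should run.

    offering: "free" (optimize for cost) or "premium" (optimize for response time).
    Returns "reserved", "on-demand", "spot" or "wait".
    """
    if offering == "premium":
        # Premium jobs never wait: use Reserved capacity first, then On-Demand.
        return "reserved" if reserved_free > 0 else "on-demand"

    # Free jobs bid below the On-Demand rate and tolerate some delay.
    if spot_price <= max_bid:
        return "spot"
    if waited_minutes >= delay_limit_minutes:
        return "on-demand"  # delay limit reached, stop waiting for Spot
    return "wait"


if __name__ == "__main__":
    print(choose_capacity("free", spot_price=0.03, max_bid=0.05))
    print(choose_capacity("free", spot_price=0.08, max_bid=0.05,
                          waited_minutes=5, delay_limit_minutes=30))
    print(choose_capacity("premium", spot_price=0.08, max_bid=0.05))
```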

Other stuff that needs to scale
Storage
Coffee supply
Number of hours in the day
Queues
Memory
Bandwidth

Campus Acreage
More Scalable tools from AWS
DynamoDB
CloudFormation (and CloudFormer!)
SQS, SNS, SES, SDB, S3
Elastic Beanstalk

You need to scale to scale to scale
Staff
Product
Infra Customers
Growth
++
So do we!


Thanks!

Miles Ward

miward@amazon.com
@milesward
