UNIV/POLTEK
DIGITAL TALENT
SCHOLARSHIP
2019
digitalent.kominfo.go.id
Designing Resilient Architectures
Daya Adianto <dayaadianto@cs.ui.ac.id>
Outline
• Reliability
• Cloud Design Principles
• Service Availability
• “Nines”
• Resiliency
• Redundancy
• Autoscaling
• Health Check
• Caching
• Case Study: Building a Resilient Web Architecture on AWS
Reliability
• It covers “the ability of a system to recover from
infrastructure or service disruptions, dynamically
acquire computing resources to meet demand, and
mitigate disruptions such as misconfigurations or
transient network issues.” – Amazon
• In practice, it means ensuring that a system keeps delivering value to its users, even during peak loads and disruptions
Service Availability
• Commonly defined as the percentage of time that an
application is operating normally
• Availability is reduced any time the application is not
operating normally, including both scheduled and
unscheduled interruptions
• Measurement: availability = normal operation time /
total time
• Common shorthand for availability: the number of nines
• E.g. five nines = 99.999%
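The measurement above can be worked through in a few lines. This is a minimal sketch; the function names and the sample figures (8.76 hours of interruption in a 365-day year) are illustrative assumptions, not taken from the AWS material.

```python
import math


def availability(normal_operation_time: float, total_time: float) -> float:
    """availability = normal operation time / total time (same unit for both)."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return normal_operation_time / total_time


def count_nines(avail: float) -> int:
    """Number of leading nines in an availability fraction, e.g. 0.999 -> 3."""
    if not 0.0 <= avail < 1.0:
        raise ValueError("avail must be in [0, 1)")
    # Rounding guards against floating-point noise such as 2.9999999999999996.
    return math.floor(round(-math.log10(1.0 - avail), 9))


# Hypothetical example: 8.76 hours of scheduled and unscheduled
# interruptions over one 365-day year.
hours_in_year = 365 * 24
a = availability(hours_in_year - 8.76, hours_in_year)
print(f"{a:.3%} availability, {count_nines(a)} nines")
```

Both scheduled and unscheduled interruptions are subtracted from the normal operation time, as in the definition above.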
Nines Table
Availability | Max Disruption (per year) | Application Categories
99% | 3 days 15 hours | Batch processing, data extraction, transfer and load jobs
99.9% | 8 hours 45 minutes | Internal tools like knowledge management, project tracking
99.95% | 4 hours 22 minutes | Online commerce, point of sale
99.99% | 52 minutes | Video delivery, broadcast systems
99.999% | 5 minutes | ATM transactions, telecommunications systems
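The maximum-disruption figures in the table follow directly from the availability percentage. The sketch below recomputes them, assuming a 365-day year; the function name is a hypothetical choice made for this example.

```python
from datetime import timedelta

# Assumption: a non-leap year of 365 days.
YEAR = timedelta(days=365)


def max_disruption_per_year(availability_percent: float) -> timedelta:
    """Downtime budget per year for a given availability percentage."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return YEAR * (1.0 - availability_percent / 100.0)


for percent in (99.0, 99.9, 99.95, 99.99, 99.999):
    budget = max_disruption_per_year(percent)
    hours, remainder = divmod(int(budget.total_seconds()), 3600)
    print(f"{percent}% -> about {hours} h {remainder // 60} min per year")
```

The table rounds these values down to whole minutes or hours.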
Resiliency
Resilient
• “(of a substance or object) able to recoil or spring back
into shape after bending, stretching, or being
compressed” – Lexico
• Murphy’s Law: “Anything that can go wrong, will go
wrong”
• It is important to be able to recover from a failure, whether it is system-wide or isolated to one or more resources
Achieving Resiliency
• Redundancy
• Autoscaling
• Health Check
• Caching
Redundancy
• “The duplication of components of a system in order to
increase the overall availability of that system.”
(Hornsby, 2018)
• In case one component fails, the remaining online
components can take over without disrupting the main
system
• Implementing redundancy:
• Deploy across multiple Availability Zones
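To see why duplication raises overall availability, consider a rough model: if each copy of a component is available a fraction a of the time and the copies fail independently, the system is down only when all n copies are down at once, so the combined availability is 1 - (1 - a)^n. The independence assumption is a simplification introduced for this sketch (deploying across Availability Zones is one way to approximate it), and the function name is hypothetical.

```python
def combined_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures."""
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("component_availability must be in [0, 1]")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The system is unavailable only when every copy is unavailable.
    return 1.0 - (1.0 - component_availability) ** copies


# Hypothetical example: a component that is 99% available on its own.
for n in (1, 2, 3):
    print(f"{n} copies -> {combined_availability(0.99, n):.6f}")
```

Under this model, two copies of a 99% component already give roughly 99.99%. Real failures are often correlated, so the actual gain is smaller.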
Autoscaling
• Performs horizontal/vertical scaling automatically according to predefined rules
• Example: Increase the number of app servers if the load balancer detects that the load on every running app server exceeds 75%
• Example: Shut down unused app servers when traffic drops below a certain threshold
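The two example rules can be sketched as a single decision function. The 75% scale-out threshold comes from the example above; the 25% scale-in threshold, the server limits, and all names are assumptions made for illustration, and this is not the API of any AWS service.

```python
from typing import Sequence


def desired_server_count(
    loads: Sequence[float],
    scale_out_threshold: float = 0.75,  # from the example: every server above 75%
    scale_in_threshold: float = 0.25,   # assumed value for "a certain threshold"
    min_servers: int = 2,               # assumed lower bound to keep redundancy
    max_servers: int = 10,              # assumed upper bound to cap cost
) -> int:
    """Return how many app servers should run, given each server's load (0.0 to 1.0)."""
    current = len(loads)
    if current == 0:
        return min_servers
    if all(load > scale_out_threshold for load in loads):
        target = current + 1  # every server is busy: add one
    elif sum(loads) / current < scale_in_threshold:
        target = current - 1  # mostly idle: remove one
    else:
        target = current
    return max(min_servers, min(max_servers, target))


print(desired_server_count([0.80, 0.90, 0.76]))  # all above 75%: scale out
print(desired_server_count([0.10, 0.20, 0.10]))  # mostly idle: scale in
```

Changing the count by one server at a time and clamping it between a minimum and a maximum keeps the sketch from oscillating wildly or scaling to zero.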
Health Check
Caching
Source: https://aws.amazon.com/about-aws/global-infrastructure/
• AWS spans 21 geographic regions and 66 Availability Zones (last checked: June 18th 2019)
Website Components
Server-side Layer
• Running custom code on AWS:
• On EC2 instances (VM)
• On an ECS cluster (container)
• Combining API Gateway + Lambda (serverless)
• Augment with autoscaling across AZs
• Set up ELB (load balancers) to distribute traffic
(Majerowicz, 2017)
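To illustrate how a load balancer distributes traffic while avoiding failed servers, here is a minimal round-robin sketch. It is not how ELB is implemented or configured; the class, its methods, and the server names are hypothetical.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Hands out servers in turn, skipping any that are marked unhealthy."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers: List[str] = list(servers)
        self._healthy: Set[str] = set(self._servers)
        self._index = 0

    def mark_unhealthy(self, server: str) -> None:
        self._healthy.discard(server)

    def mark_healthy(self, server: str) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self) -> str:
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._index]
            self._index = (self._index + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


# Hypothetical app servers spread across three Availability Zones.
balancer = RoundRobinBalancer(["app-az-a", "app-az-b", "app-az-c"])
balancer.mark_unhealthy("app-az-b")  # e.g. a failed health check
print([balancer.next_server() for _ in range(4)])
```

In a real deployment the health status would come from periodic health checks rather than manual calls.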
Persistence Layer
• Provision EC2 instances running the database engine of your choice, or use Amazon RDS
• Amazon RDS supports Multi-AZ deployments, with synchronous replication to a standby instance in another AZ
(Majerowicz, 2017)
Static Assets
• Put assets into S3 storage and use CloudFront to deliver the assets
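The idea behind serving assets through a cache in front of the origin can be sketched with a small time-to-live (TTL) cache. This is only an illustration of the caching pattern, not of CloudFront's actual behavior or API; the class name, the 60-second TTL, and the `fetch_from_origin` callback are assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Serves cached copies until they expire, then refetches from the origin."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1  # still fresh: the origin is not contacted
            return entry[1]
        self.misses += 1  # absent or expired: go back to the origin
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical origin standing in for an S3 bucket.
def origin(path: str) -> bytes:
    return f"contents of {path}".encode()


cache = TTLCache(origin)
cache.get("/img/logo.png")
cache.get("/img/logo.png")
print(cache.hits, cache.misses)
```

Every hit is a request the origin never sees, which reduces its load and lets cached assets remain available for a while even if the origin is briefly unreachable.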
Improving Availability
• Route 53 performs health checks and traffic routing
• If the main website goes down, traffic is routed to another region or to S3
• Provide a static failover site on S3 in case the main website becomes unavailable
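The failover decision described above can be sketched as a priority list with a health check. Route 53 itself is configured declaratively rather than with code like this; the sketch only mirrors the decision logic, and the function name, endpoint names, and the injected `is_healthy` callback are all hypothetical.

```python
from typing import Callable, Sequence


def choose_endpoint(
    endpoints: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> str:
    """Return the first healthy endpoint in priority order.

    The last endpoint is treated as the static failover site and is
    returned without a health check when everything before it is down.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    for endpoint in endpoints[:-1]:
        if is_healthy(endpoint):
            return endpoint
    return endpoints[-1]


# Hypothetical endpoints: main site, a second region, then a static S3 site.
priority = ["main-site", "second-region-site", "static-s3-site"]
down = {"main-site"}  # pretend the main website failed its health check
print(choose_endpoint(priority, lambda endpoint: endpoint not in down))
```

Passing the health check in as a function keeps the sketch testable; a real check would typically issue an HTTP request to each endpoint at a regular interval.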
Summary
• Features that improve overall resiliency and availability
• Multiple EC2 instances on multiple AZs, plus autoscaling
• Multiple database instances on multiple AZs (RDS)
• Caching via CloudFront
• Failover DNS routing via Route 53
• Caching DB queries
References
• “AWS Reliability Pillar”. Amazon Web Services. 2019. Available at: https://d1.awsstatic.com/whitepapers/architecture/AWS-Reliability-Pillar.pdf (Accessed: June 19th 2019)
• Hornsby, Adrian. “Patterns for Resilient Architecture – Part 1”. 2018. Available at: https://medium.com/@adhorn/patterns-for-resilient-architecture-part-1-d3b60cd8d2b6 (Accessed: June 18th 2019)
• Hornsby, Adrian. “Patterns for Resilient Architecture – Part 3”. 2018. Available at: https://medium.com/@adhorn/patterns-for-resilient-architecture-part-3-16e8601c488e (Accessed: June 19th 2019)
• Hornsby, Adrian. “Patterns for Resilient Architecture – Part 4”. 2018. Available at: https://medium.com/@adhorn/patterns-for-resilient-architecture-part-4-85afa66d6341 (Accessed: June 19th 2019)
• Majerowicz, Lucas. “Architecting for the cloud: building a resilient web architecture on AWS”. 2017. Available at: http://hecodes.com/2017/04/architecting-cloud-building-resilient-web-architecture-aws/ (Accessed: June 18th 2019)