Introduction to Cloud:
Cloud and Cloud Storage
Lecture 2
Dr. Dalit Naor
IBM Haifa Research
Storage Systems
Advanced Topics in Storage Systems for Big Data - Spring 2014, Tel-Aviv University
http://www.eng.tau.ac.il/semcom
Content
What is the Cloud paradigm
Cloud principles and virtualization
Cloud Storage and cost models
How is it done?
Cloud-based file systems
Cloud object stores
Benefits of cloud:
Speed and Agility
Cost Savings
Economies of scale, utilization improvement, and standardization
Private Cloud
Owned and operated by a single company for its internal use
Internal datacenters
Takes advantage of cloud efficiencies such as elasticity, virtualization, and cost
Virtualization
Abstraction of the physical layers and resources
Widely exists in computer systems
Memory, Storage, Compute, Networking
Virtual machines
Dates back to IBM mainframes and IBM AIX/Power systems
Revolution: x86 virtualization
- VMware
- Linux KVM, Xen
Objects
WAN (Cloud)
Web-based HTTP protocol
Put/Get operations on fixed (immutable) content
Enables new extensions: integrity checking, deduplication
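Why fixed content makes integrity checking and deduplication natural can be illustrated with a toy, in-memory store. This is only a sketch under stated assumptions: the class `FixedContentStore` and its methods are hypothetical names invented for illustration, there is no HTTP layer, and no real object store API is being reproduced.

```python
import hashlib


class FixedContentStore:
    """Toy in-memory object store with Put/Get on immutable content (hypothetical)."""

    def __init__(self):
        self._blobs = {}  # content digest -> bytes; each unique payload is kept once
        self._names = {}  # object name -> content digest

    def put(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Deduplication: identical content hashes to the same digest,
        # so a second copy of the same payload adds no new blob.
        self._blobs.setdefault(digest, data)
        self._names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        digest = self._names[name]
        data = self._blobs[digest]
        # Integrity: because content never changes after Put, the stored
        # digest can be re-verified on every read.
        if hashlib.sha256(data).hexdigest() != digest:
            raise IOError("object %r failed its integrity check" % name)
        return data

    def unique_blobs(self) -> int:
        return len(self._blobs)


store = FixedContentStore()
first = store.put("scan-001", b"x-ray image bytes")
second = store.put("scan-001-copy", b"x-ray image bytes")
```

Here both names refer to one stored payload, which is the essence of deduplication; an update-in-place interface would make both the digest check and the sharing much harder.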
Source: http://aws.amazon.com/s3/
Costs
Cost is typically a combination of
Used Capacity
Network data transfer
Number of requests
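The three cost components above can be combined into a simple monthly estimate. The sketch below is illustrative only: the `Rates` fields, the function `monthly_cost`, and every rate value are hypothetical placeholders, not any provider's actual price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    per_gb_month: float        # charge for used capacity
    per_gb_transferred: float  # charge for network data transfer
    per_1000_requests: float   # charge for Put/Get requests


def monthly_cost(rates: Rates, stored_gb: float, transferred_gb: float,
                 requests: int) -> float:
    """Sum the capacity, transfer, and request charges for one month."""
    capacity = stored_gb * rates.per_gb_month
    transfer = transferred_gb * rates.per_gb_transferred
    request_charge = (requests / 1000) * rates.per_1000_requests
    return capacity + transfer + request_charge


# Hypothetical rates chosen only to make the arithmetic easy to follow.
EXAMPLE_RATES = Rates(per_gb_month=0.10, per_gb_transferred=0.05,
                      per_1000_requests=0.01)
example = monthly_cost(EXAMPLE_RATES, stored_gb=1000, transferred_gb=200,
                       requests=500_000)
```

With these made-up rates, capacity dominates the bill; a workload with many small reads would shift the weight toward the transfer and request terms.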
"If we have a 2 terabyte model, it would cost $155.65 per month in US-West and US-East standard Reduced Redundancy Storage on Amazon S3 storage. At that point, you may as well treat yourself to the standard storage option, which would run $194.56 per month for the same 2 terabytes. Over three years, that is over $7,000 to keep 2 terabytes in the public storage cloud. Most on-premise storage systems would cost less, but in the disaster recovery use case the abstraction that cloud storage brings is priceless. But how much power, cooling, and operational expense would be avoided?"
Source: "How to determine if cloud storage is a cost savings", TechRepublic, March 4, 2013
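The three-year figure in the quote can be checked directly from the two monthly prices it gives. The snippet below is a small arithmetic sketch; the variable and function names are invented for illustration, and only the two dollar amounts and the three-year horizon come from the quoted text.

```python
from decimal import Decimal

MONTHS = 36  # three years

reduced_redundancy_monthly = Decimal("155.65")  # quoted price for 2 TB
standard_monthly = Decimal("194.56")            # quoted price for 2 TB


def total_cost(monthly: Decimal, months: int) -> Decimal:
    """Total spend over a number of months at a flat monthly price."""
    return monthly * months


three_year_reduced = total_cost(reduced_redundancy_monthly, MONTHS)  # 5603.40
three_year_standard = total_cost(standard_monthly, MONTHS)           # 7004.16
```

The standard option comes to $7,004.16, which matches the "over $7,000" claim; the reduced redundancy option comes to $5,603.40 over the same period.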
http://lifehacker.com/five-best-cloud-storage-providers-614393607
Limitations
Data lock-in
Security
Multi-tenancy
Secure delete
Data confidentiality and auditability
How vulnerable is the cloud infrastructure?
New architecture
New, relaxed protocols and system operations (I/O and management)
New solutions for resiliency and high availability based on replication rather than RAID
Support for computation
Designed for new workloads: large streaming, sequential writes, or analytics
Assumptions
Based on commodity hardware
Components always fail
- Need self monitoring to detect, tolerate, and recover from failures
Optimized for large files
Results
No POSIX API
Each chunk is replicated d times (typically d = 3)
Smart placement of chunks
Scribed from: Clouddbms2011.pdf
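The two results above, chunking with d-way replication and placement that avoids correlated failures, can be sketched as follows. This is a deliberately simplified illustration: the function names, the rack and node names, and the round-robin policy are all assumptions, and real cloud file systems use more elaborate placement policies.

```python
from typing import Dict, List


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut a file into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_replicas(chunk_index: int, nodes_by_rack: Dict[str, List[str]],
                   d: int = 3) -> List[str]:
    """Pick d nodes for a chunk, each on a different rack (hypothetical policy)."""
    racks = sorted(nodes_by_rack)
    if d > len(racks):
        raise ValueError("need at least d racks to keep replicas apart")
    chosen = []
    for r in range(d):
        # Consecutive rack offsets are distinct because d <= len(racks).
        rack = racks[(chunk_index + r) % len(racks)]
        nodes = nodes_by_rack[rack]
        chosen.append(nodes[chunk_index % len(nodes)])
    return chosen


# Hypothetical cluster layout used only for this example.
cluster = {
    "rack-a": ["a1", "a2"],
    "rack-b": ["b1", "b2"],
    "rack-c": ["c1", "c2"],
    "rack-d": ["d1", "d2"],
}
chunks = split_into_chunks(b"abcdefghij", chunk_size=4)
placements = [place_replicas(i, cluster) for i in range(len(chunks))]
```

Spreading the copies of a chunk over different racks means that losing one rack (a shared switch or power supply) removes at most one of the d replicas.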
Examples
Hadoop File System
(HDFS, Yahoo) - 2009
Source: The Hadoop Distributed File System: Architecture and Design by Dhruba Borthakur
HDFS Architecture
Examples
Google File System
(GFS) 2002
Source: http://en.wikipedia.org/wiki/File:GoogleFileSystemGFS.svg
Source: http://docs.openstack.org/grizzly/openstack-compute/install/apt/content/example-object-storage-installationarchitecture.html
Building Blocks
Proxy servers: handle all incoming API requests
Rings: map logical names of data to locations on particular disks
Zones: failure domains
Storage nodes
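How a ring can map a logical name to disks in separate zones is sketched below. This is a minimal illustration and not the object store's actual ring code: the partition count, the hash choice, the zone and disk names, and the functions `partition_for`, `build_ring`, and `locate` are all assumptions.

```python
import hashlib
from typing import Dict, List

PARTITION_POWER = 4  # hypothetical: 2**4 = 16 partitions


def partition_for(name: str) -> int:
    """Hash a logical name to a partition number in [0, 2**PARTITION_POWER)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - PARTITION_POWER)


def build_ring(disks_by_zone: Dict[str, List[str]],
               replicas: int = 3) -> List[List[str]]:
    """Assign each partition to `replicas` disks, one per distinct zone."""
    zones = sorted(disks_by_zone)
    if replicas > len(zones):
        raise ValueError("need at least as many zones as replicas")
    ring = []
    for part in range(2 ** PARTITION_POWER):
        disks = []
        for r in range(replicas):
            zone = zones[(part + r) % len(zones)]
            zone_disks = disks_by_zone[zone]
            disks.append(zone_disks[part % len(zone_disks)])
        ring.append(disks)
    return ring


def locate(ring: List[List[str]], name: str) -> List[str]:
    """What a proxy server would do: look up the disks for a logical name."""
    return ring[partition_for(name)]


# Hypothetical layout: four zones with two disks each.
layout = {
    "z1": ["z1-sda", "z1-sdb"],
    "z2": ["z2-sda", "z2-sdb"],
    "z3": ["z3-sda", "z3-sdb"],
    "z4": ["z4-sda", "z4-sdb"],
}
ring = build_ring(layout)
targets = locate(ring, "account/container/object")
```

Because the lookup depends only on the name and the ring table, any proxy server holding a copy of the ring can route a request without consulting a central directory.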
Replication
Everything is stored three times (by default)
Upon a disk failure, the data is replicated to other zones, so that three copies are maintained
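The recovery step can be sketched as a function that drops the failed disk from every object's replica list and then tops the list back up from zones that do not yet hold a copy. This is only an illustrative sketch: the `Device` tuple, the function `handle_disk_failure`, and the sample layout are hypothetical and do not reproduce any real replicator.

```python
from typing import Dict, List, Tuple

Device = Tuple[str, str]  # (zone, disk); a hypothetical representation


def handle_disk_failure(placement: Dict[str, List[Device]], failed: Device,
                        all_devices: List[Device],
                        replicas: int = 3) -> Dict[str, List[Device]]:
    """Return a new placement in which every object has `replicas` copies again."""
    healthy = [d for d in all_devices if d != failed]
    repaired = {}
    for obj, copies in placement.items():
        survivors = [d for d in copies if d != failed]
        used_zones = {zone for zone, _ in survivors}
        for candidate in healthy:
            if len(survivors) >= replicas:
                break
            # Only copy into a zone that does not already hold this object.
            if candidate[0] not in used_zones:
                survivors.append(candidate)
                used_zones.add(candidate[0])
        repaired[obj] = survivors
    return repaired


devices = [("z1", "sda"), ("z2", "sda"), ("z3", "sda"), ("z4", "sda")]
before = {"photo.jpg": [("z1", "sda"), ("z2", "sda"), ("z3", "sda")]}
after = handle_disk_failure(before, failed=("z2", "sda"), all_devices=devices)
```

In this example the copy lost with the disk in zone z2 is recreated in zone z4, so the object is again held in three distinct failure domains.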
Summary
Cloud is a paradigm shift
Cloud Storage is prevailing
Cloud Storage requires new storage architectures, e.g.
Cloud file systems
Cloud object stores