
DISTRIBUTED DATABASES IN THE CLOUD USING NoSQL

Sidney SHEK sshek2@csc.com September 8, 2011

CSC Leading Edge Forum Technology Grant FY11


9/12/2011 9:12 AM PPT 2007_MASTER_FMT 1

Agenda
Introduction
What is NoSQL and why is it relevant to us?
Three key principles of NoSQL
Case studies
Applying NoSQL to Enterprises
Future trends
Conclusion


Challenges faced by Internet giants


1. Support massively scalable and high performance web services
Massive data volumes...and growing!
Customers distributed around the world
Highly available
Low-latency


2. Dealing with flexible and complex data structures
Document storage
Social networks
Multimedia

Source: IDC White Paper - sponsored by EMC. As the Economy Contracts, the Digital Universe Expands. May 2009


What is NoSQL?
Movement away from traditional relational databases
Address challenges posed by Cloud and Big Data
One size no longer fits all


Who's using NoSQL?

References:
http://www.mongodb.org/display/DOCS/Production+Deployments
http://wiki.apache.org/cassandra/ArticlesAndPresentations
http://en.wikipedia.org/wiki/NoSQL
http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html

Disclaimer: All logos, trade marks and brand names used in this presentation belong to their respective owners.

Who's paying?
NoSQL: 10gen stores in $6.5m for MongoDB database
Source: WIREDvc http://www.wiredvc.com/nosql-10gen528stores-in-6-5m-for-mongodb-database/

VMware hires key developer for Redis
Source: VMware http://blogs.vmware.com/console/2010/03/vmware-hires-key-developer-for-redis.html

Cassandra NoSQL Database Gets Commercial Support
Source: Database Journal http://www.databasejournal.com/sqletc/article.php/3878651/Cassandra-NoSQL-Database-Gets-Commercial-Support.htm

Principle #1: Build Infinitely Scalable Systems

Add more servers to boost capacity and throughput
Parallel processing for linear scalability (e.g. MapReduce)
Process where the data is to reduce network hops
[Diagram: users ask the processor to run a program on their data; the processor runs the program on the data nodes and returns the results to the users.]
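The MapReduce idea on this slide can be sketched in a few lines. This is a minimal illustration, not the implementation used in the grant work: the node names, the sample records and the functions `map_on_node`, `reduce_counts` and `run_job` are all hypothetical, and threads stand in for separate servers.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data nodes: each holds only its own slice of the records.
DATA_NODES = {
    "node-a": ["sydney", "perth", "sydney"],
    "node-b": ["perth", "sydney"],
    "node-c": ["melbourne"],
}


def map_on_node(records):
    """Runs where the data lives: count locally, return a small summary."""
    counts = defaultdict(int)
    for record in records:
        counts[record] += 1
    return dict(counts)


def reduce_counts(partials):
    """Runs on the coordinating processor: merge the per-node summaries."""
    totals = defaultdict(int)
    for partial in partials:
        for key, count in partial.items():
            totals[key] += count
    return dict(totals)


def run_job(nodes):
    # Threads stand in for separate servers; adding a node adds a worker.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(map_on_node, nodes.values()))
    return reduce_counts(partials)


result = run_job(DATA_NODES)
print(result)
```

Only the small per-node summaries cross the network, never the raw records, which is the point of processing where the data is.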

Principle #2: Avoid Distributed Transactions

Distributed transactions don't scale
Design for co-located transactional data
Accept eventual consistency

What happens when nodes are far apart?


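One common way to accept eventual consistency is last-write-wins replication, sketched below. This is an assumption for illustration, not necessarily the approach the talk had in mind: the `Replica` class, the site names and the integer timestamps are hypothetical, and a real system needs a more careful notion of time than a hand-picked counter.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data; accepts writes locally without coordination."""
    name: str
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        self._apply(key, timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

    def _apply(self, key, timestamp, value):
        current = self.store.get(key)
        # Last write wins: keep the entry with the newer timestamp
        # (ties are broken by comparing the values themselves).
        if current is None or (timestamp, value) > current:
            self.store[key] = (timestamp, value)

    def sync_from(self, other):
        """Background exchange: pull in the other replica's entries."""
        for key, (timestamp, value) in other.store.items():
            self._apply(key, timestamp, value)


sydney = Replica("sydney")
london = Replica("london")

# Each site accepts a write without waiting for the other.
sydney.write("theme", "blue", timestamp=1)
london.write("theme", "green", timestamp=2)
before = (sydney.read("theme"), london.read("theme"))  # replicas disagree

# After the replicas exchange state, both hold the newer write.
sydney.sync_from(london)
london.sync_from(sydney)
after = (sydney.read("theme"), london.read("theme"))
print(before, after)
```

Neither write waits on a round trip to the far site; the cost is a window in which the two sites return different answers.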


No more overhead in acquiring locks across nodes!


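Designing for co-located transactional data can be sketched as routing every record that must change together to the same node by a shared partition key. The example below is a hedged illustration under that assumption: `Node`, `Cluster`, `place_order` and the customer data are hypothetical names, and hash-modulo placement is only one of several possible schemes.

```python
import hashlib
import threading


class Node:
    """A single server with its own local lock and its own data."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()
        self.records = {}  # (customer_id, record_type) -> value


class Cluster:
    def __init__(self, node_names):
        self.nodes = [Node(name) for name in node_names]

    def node_for(self, customer_id):
        # A stable hash of the partition key picks exactly one node.
        digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def place_order(self, customer_id, amount):
        """Update balance and order history together on one node."""
        node = self.node_for(customer_id)
        with node.lock:  # local lock only, no cross-node coordination
            balance = node.records.get((customer_id, "balance"), 0)
            if balance < amount:
                return False
            orders = node.records.get((customer_id, "orders"), [])
            node.records[(customer_id, "balance")] = balance - amount
            node.records[(customer_id, "orders")] = orders + [amount]
            return True


cluster = Cluster(["node-a", "node-b", "node-c"])
home = cluster.node_for("alice")
home.records[("alice", "balance")] = 100

ok = cluster.place_order("alice", 30)         # succeeds on alice's node
rejected = cluster.place_order("alice", 500)  # insufficient balance
print(ok, rejected, home.records[("alice", "balance")])
```

Because a customer's balance and orders share a partition key, the transaction touches one node and one local lock, however many nodes the cluster has.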

Principle #3: Choose the right tools for the job

[Figure: Indicative comparison of NoSQL databases]


Case Study 1: Cloud-based Application Config Management

[Diagram: App 1 and App 2 deployed in Sydney, New York, Melbourne and London, with Config stores and central administrators.]


Case Study 2: Data Capture and Dashboard in the Cloud

[Dashboard: Australia-wide power, Perth region power, Sydney region power]

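The regional and Australia-wide figures on such a dashboard could be produced by a simple roll-up of captured readings. The sketch below is an assumption about how that might look, not the case study's actual design: the readings are made-up values, and `regional_totals` and `national_total` are hypothetical helpers.

```python
from collections import defaultdict

# Hypothetical captured readings: (region, kilowatts). Values are made up.
READINGS = [
    ("Perth", 120.0),
    ("Sydney", 300.0),
    ("Perth", 80.0),
    ("Sydney", 250.0),
]


def regional_totals(readings):
    """Sum the captured readings per region for the regional panels."""
    totals = defaultdict(float)
    for region, kilowatts in readings:
        totals[region] += kilowatts
    return dict(totals)


def national_total(region_totals):
    """Roll the regional figures up into the Australia-wide figure."""
    return sum(region_totals.values())


by_region = regional_totals(READINGS)
australia_wide = national_total(by_region)
print(by_region, australia_wide)
```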


Case Study 3: Migrating Location-based Service to NoSQL


Possible Applications To Many Enterprise Scenarios


Complex bioinformatics data analysis
Forensic analysis
Master data management
Real-time web application analytics


Future Trends
Consolidation, standardisation and new features for NoSQL databases
  Spring Data
  UnQL and coSQL
Database-as-a-Service
RAM Cloud
High-performance data grids
The new enterprise stack


The Old Stack: Relational Database for Everything

Queries (SQL)
Relational database
Monolithic hardware (few CPUs and network computers)
Shared disk/memory architecture (centralised processing)


The New Stack: No More One Size Fits All

Direct record access or queries
MapReduce programs

High-performance traditional relational database (e.g. Oracle Exadata)
NoSQL database and data grids (e.g. CouchDB, GemFire)
Parallel relational database (e.g. Greenplum)
MapReduce engines (Hadoop)

Monolithic hardware (few CPUs and network computers)
Distributed hardware (multi-core CPUs, multiple computers connected via high-performance network)

Shared disk/memory architecture (centralised processing)
Shared nothing architecture (distributed parallel processing)


Summary: What is NoSQL?

Move away from traditional relational databases
Address challenges posed by Cloud and Big Data by:
  Building infinitely scalable systems
  Avoiding distributed transactions
  Choosing the right database(s) for the job


Summary: Join the crowd who are using NoSQL


For more information


Contact me at sshek2@csc.com
Distributed Databases in the Cloud Using NoSQL, LEF grant report
Life beyond Distributed Transactions: an Apostate's Opinion, by Pat Helland


Thank you
