
Distributed Systems

Tutorial 6: Apache ZooKeeper

Written by Alex Libov
Winter semester, 2012-2013
ZooKeeper
Because Coordinating Distributed Systems is a Zoo
http://zookeeper.apache.org

An open-source, high-performance coordination service for
distributed applications, providing:
Naming
Configuration management
Synchronization
Group services
Architecture
The servers that make up the ZooKeeper service must all know about each other.
As long as a majority of the servers are available, the ZooKeeper service will be
available.
Clients connect to a single ZooKeeper server.
The client maintains a TCP connection through which it sends requests, gets
responses, gets watch events, and sends heartbeats. If the TCP connection to the
server breaks, the client will connect to a different server.
ZooKeeper stamps each update with a number (the zxid) that reflects the order of all
ZooKeeper transactions.
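The majority rule above can be made concrete with a small arithmetic sketch (illustrative only, written in Java; the names are ours and not part of any ZooKeeper API):

```java
// Sketch: quorum arithmetic for a ZooKeeper ensemble of n servers.
public class Quorum {
    // Smallest number of servers that constitutes a majority of n.
    static int majority(int n) { return n / 2 + 1; }

    // Largest number of simultaneous server failures the ensemble survives.
    static int tolerated(int n) { return n - majority(n); }

    public static void main(String[] args) {
        for (int n = 3; n <= 7; n += 2) {
            System.out.println(n + " servers: need " + majority(n)
                + " up, tolerate " + tolerated(n) + " failures");
        }
    }
}
```

This is also why ensembles usually have an odd number of servers: 4 servers tolerate no more failures than 3 (one in both cases).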
Data model and the hierarchical namespace
The namespace provided by ZooKeeper is much like that of a standard file
system.
A name is a sequence of path elements separated by a slash (/), for example /app/config.
Every node in ZooKeeper's namespace is identified by its path.
Znodes
Each node in a ZooKeeper namespace (called a znode) can have
data associated with it as well as children. It is like having a file
system that allows a file to also be a directory.
Znodes maintain a stat structure that includes version
numbers for data changes, ACL changes, and timestamps.
Each time a znode's data changes, the version number increases.
Each node has an Access Control List (ACL) that restricts who can do
what.
The data stored at each znode in a namespace is read and
written atomically.
Unlike a typical file system, which is designed for storage,
ZooKeeper data is kept in-memory.
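A minimal in-memory sketch of the data-version bookkeeping described above. The class and method names here are ours; the real stat structure also tracks ACL versions, child versions, zxids, and timestamps:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: per-znode data version, bumped on every data change, with the
// conditional-update check that a versioned setData call performs.
public class VersionedStore {
    static class Znode {
        byte[] data = new byte[0];
        int version = 0;
    }

    final Map<String, Znode> nodes = new HashMap<>();

    void create(String path) { nodes.put(path, new Znode()); }

    int version(String path) { return nodes.get(path).version; }

    // Succeeds only if the caller's expected version is current.
    boolean setData(String path, byte[] data, int expectedVersion) {
        Znode z = nodes.get(path);
        if (z == null || z.version != expectedVersion) return false;
        z.data = data;
        z.version++;
        return true;
    }
}
```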

Ephemeral nodes
Ephemeral nodes are znodes that exist only as long as the
session that created them is active.
When the session ends, the znode is deleted.
Ephemeral nodes are useful when you want to
implement
group membership:
each client creates an ephemeral node, which is deleted when the client
exits or crashes.
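The group-membership idea can be sketched in a few lines (an illustrative in-memory model with names of our choosing, not the ZooKeeper client API):

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Sketch: ephemeral znodes owned by sessions; ending a session deletes
// every node it created, so the member list tracks live clients.
public class EphemeralGroup {
    final Map<String, Long> ownerOf = new TreeMap<>();  // znode path -> session id

    void createEphemeral(long sessionId, String path) {
        ownerOf.put(path, sessionId);
    }

    // Client exits or crashes: its session ends, its nodes disappear.
    void closeSession(long sessionId) {
        ownerOf.values().removeIf(owner -> owner == sessionId);
    }

    Set<String> members() { return ownerOf.keySet(); }
}
```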


Conditional updates and watches
Clients can set a watch on a znode.
A watch will be triggered and removed when the znode
changes.
When a watch is triggered the client receives a packet saying
that the znode has changed.
The client will then typically try to read that znode.
If the connection between the client and one of the
ZooKeeper servers is broken, the client will receive a
local notification.
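The one-shot trigger-and-remove behaviour above can be sketched as follows (illustrative; the real client also distinguishes event types and delivers events on a dedicated thread):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: watches are one-shot. A watch set on a path fires on the next
// change of that path and is removed in the same step.
public class WatchTable {
    final Map<String, List<Runnable>> watches = new HashMap<>();

    void setWatch(String path, Runnable callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    // Called when a znode changes: fire and discard its watches.
    void znodeChanged(String path) {
        List<Runnable> fired = watches.remove(path);
        if (fired != null) fired.forEach(Runnable::run);
    }
}
```

A client that wants further notifications must set a new watch when it re-reads the znode.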
Guarantees
Sequential Consistency - Updates from a client will be
applied in the order that they were sent.
Atomicity - Updates either succeed or fail. No partial
results.
Single System Image - A client will see the same view
of the service regardless of the server that it connects to.
Reliability - Once an update has been applied, it will
persist from that time forward until a client overwrites
it.
Timeliness - The client's view of the system is
guaranteed to be up-to-date within a certain time bound.

Simple API
Create - creates a node at a location in the tree
Delete - deletes a node
Exists - tests if a node exists at a location
Get data - reads the data from a node
Set data - writes data to a node
Get children - retrieves a list of children of a node
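The six calls above can be mimicked by a small in-memory tree (an illustrative Java model of our own; it omits ACLs, versions, watches, and ephemeral/sequential create modes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: a flat map from full paths to data, exposing the six core
// operations listed above.
public class MiniTree {
    final Map<String, byte[]> nodes = new TreeMap<>();

    MiniTree() { nodes.put("/", new byte[0]); }   // implicit root

    void create(String path, byte[] data) { nodes.put(path, data); }
    void delete(String path)              { nodes.remove(path); }
    boolean exists(String path)           { return nodes.containsKey(path); }
    byte[] getData(String path)           { return nodes.get(path); }
    void setData(String path, byte[] d)   { nodes.put(path, d); }

    // Direct children of path: nodes exactly one level below it.
    List<String> getChildren(String path) {
        String prefix = path.equals("/") ? "/" : path + "/";
        List<String> kids = new ArrayList<>();
        for (String p : nodes.keySet()) {
            if (p.startsWith(prefix) && !p.equals(path)
                    && p.indexOf('/', prefix.length()) < 0) {
                kids.add(p.substring(prefix.length()));
            }
        }
        return kids;
    }
}
```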
Implementation - reads
With the exception of the request processor, each of the
servers that make up the ZooKeeper service replicates its
own copy of each of the components.
The replicated database is an in-memory database containing
the entire data tree.
Updates are logged to disk for recoverability, and writes are
serialized to disk before they are applied to the in-memory database.
Every ZooKeeper server services clients. Clients connect to
exactly one server to submit requests. Read requests are
serviced from each server's local replica of the database.

Implementation - writes
Requests that change the state of the service (write
requests) are processed by an agreement protocol.
All write requests from clients are forwarded to a single server,
called the leader.
The rest of the ZooKeeper servers, called followers, receive
message proposals from the leader and agree upon message
delivery.
ZooKeeper uses a custom atomic messaging protocol.
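The write path can be sketched as a single ordering point (a deliberately simplified model of ours; it omits the proposal and quorum-acknowledgement rounds of the real atomic broadcast protocol):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: all writes funnel through one leader, which assigns each a
// strictly increasing transaction id and commits in that order.
public class LeaderOrdering {
    long nextTxnId = 1;                       // zxid-like counter
    final List<String> committed = new ArrayList<>();

    // Followers forward every write request here.
    long propose(String write) {
        long id = nextTxnId++;
        committed.add(id + ":" + write);      // one total order for all writes
        return id;
    }
}
```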
ZooKeeper in practice with C#
Implementing a barrier
[Figure: a root "barrier" node with four child znodes, b13, b25, b47, b32]
A root barrier node is created.
Upon entering the barrier, a process creates a child node
under the root barrier node
and waits until all children are created.
Upon exit, a process deletes its child node
and waits until all children are deleted.
public class Barrier : IWatcher
{
    // Assumes the ZooKeeperNet client library:
    // using ZooKeeperNet; using Org.Apache.Zookeeper.Data; (for Stat)
    static ZooKeeper zk = null;
    static readonly Object mutex = new Object();
    static String root = "/barrier";
    String name;
    int size;

    public Barrier(String address, String name, int size)
    {
        this.name = name;
        this.size = size;
        if (zk == null) {
            zk = new ZooKeeper(address,
                new TimeSpan(0, 0, 0, 0, 10000), this);  // 10,000 ms session timeout
        }
        Stat s = zk.Exists(root, false);
        if (s == null) {
            zk.Create(root, new byte[0], Ids.OPEN_ACL_UNSAFE,
                CreateMode.Persistent);
        }
    }
Initialize a ZooKeeper client if one has not been initialized yet.
Create the root node if it does not exist.
    public bool Enter() {
        // Announce this process by creating an ephemeral child node.
        zk.Create(root + "/" + name, new byte[0],
            Ids.OPEN_ACL_UNSAFE, CreateMode.Ephemeral);
        while (true) {
            lock (mutex) {
                // true: leave a watch so Process() is called on the next change.
                IEnumerable<String> list = zk.GetChildren(root, true);
                if (list.Count() < size) {
                    System.Threading.Monitor.Wait(mutex);
                } else {
                    return true;
                }
            }
        }
    }
    public bool Leave() {
        zk.Delete(root + "/" + name, -1);   // -1: delete regardless of version
        while (true) {
            lock (mutex) {
                IEnumerable<String> list = zk.GetChildren(root, true);
                if (list.Count() > 0) {
                    System.Threading.Monitor.Wait(mutex);
                } else {
                    return true;
                }
            }
        }
    }
    public void Process(ZooKeeperNet.WatchedEvent @event)
    {
        // A watch fired (the barrier's children changed): wake the thread
        // blocked in Enter()/Leave() so it re-reads the children.
        lock (mutex)
        {
            System.Threading.Monitor.Pulse(mutex);
        }
    }
}
