The Book of Ozone

object store for hadoop

Anu Engineer

Table of Contents

Ozone Object Storage
1.1 Introduction
1.2 Object Store Overview
1.3 Ozone's Storage Elements
1.3.1 Common REST Headers
1.3.2 Common Reply Headers
1.3.3 Storage Volumes
1.3.4 Buckets
1.3.5 Keys
1.3.6 Authentication & Access Control
1.3.7 Accounting

Using Ozone
2.1 Setting up Ozone
2.2 Command Line Interface
2.2.1 Volume commands
2.2.2 Bucket Commands
2.2.3 Key commands
2.3 Client Libraries
2.4 Managing Ozone
2.4.1 Cluster Level
2.4.2 Server Level
2.4.3 User Level
2.4.4 Volume Level
2.4.5 Bucket Level
2.4.6 Key Level
2.4.7 Request Level
2.4.8 Logging

REST Protocol
3.1 Storage Volume APIs
3.1.1 Create storage volume
3.1.2 Update storage volume
3.1.3 Delete storage volume
3.1.4 Info storage volume
3.1.5 List Storage Volumes
3.2 Buckets
3.2.1 Create bucket
3.2.2 Update bucket
3.2.3 Delete Bucket
3.2.4 Info Bucket
3.2.5 List Bucket
3.3 Keys
3.3.1 Put Key
3.3.2 Get Key
3.3.3 Delete Key
3.3.4 Info Key
3.3.5 List Keys

Customization


1. Ozone Object Storage

1.1 Introduction

Ozone is an Amazon S3-like [1] object store. It is a redundant, distributed object store built by
leveraging primitives present in HDFS. Ozone provides a REST interface similar to Amazon
S3. The primary design point of ozone is scalability, and its aim is to scale to trillions of objects.
However, the guarantees and user-visible behavior of an object store are different from those of a
distributed file system like HDFS.

This is the user documentation needed to work with ozone. If you are looking for the design
of ozone, please look at the ozone architecture document in Apache JIRA - HDFS-7240.
Ozone is a work in progress, so please don't assume that this is the final draft of the ozone
spec.

1.2 Object Store Overview

Object stores have been around for a while, but it is with the ascent of cloud computing
platforms that they became mainstream. HDFS itself is based on the Google File System, which
enshrines a lot of ideas from the original vision of an object store. The core idea of an object
store [2] is to separate infrequent meta-data operations from frequent data operations like read
and write.
A cloud-based object store takes this idea further by creating a REST interface and moving
away from hierarchical storage management. Instead of directories and files organized in a
hierarchical way, most cloud-based object stores use a key-value mechanism for file storage.
In cloud storage lingo, file names are keys and data is the value of these keys. These keys can
be put in containers called buckets. For example, in the case of Amazon S3, users can create
buckets and put files into those buckets. Since cloud-based object stores are accessed remotely,
they generally speak a REST protocol over HTTP/HTTPS.

[1] http://aws.amazon.com/s3/
[2] http://www.pdl.cmu.edu/ftp/NASD/Sigmetrics97.pdf

1.3 Ozone's Storage Elements

All files in ozone are organized as keys and values. Each key is part of a bucket, and
buckets live inside a storage volume. Quota is attached to a volume and ACLs are attached to a
bucket. The following diagram illustrates ozone's storage hierarchy.

1.3.1 Common REST Headers

The ozone protocol uses a set of common headers. These are part of most requests. They are
1. Authorization - This is the authorization field; its value changes based on which
authentication scheme is being used. Here is an example of how this header looks
when using the simple [3] authentication scheme. Authorization : OZONE bilbo
2. x-ozone-version - This is a required header; if the version number is not specified the
request will be rejected by ozone. E.g. x-ozone-version : v1. If you want to use a specific
version of the API you can specify that. E.g. x-ozone-version : v2 [4]
3. Date - Standard HTTP header that represents dates. The format is day of the week,
month, day, year and time (24-hour format) in GMT. Any other time zone will be
rejected by the ozone server. E.g. Date : Mon, Apr 4, 2016 06:22:00 GMT

[3] Simple Auth is discussed later in this document.
[4] Right now we have only one version - v1.

1.3.2 Common Reply Headers

The ozone protocol uses a set of common reply headers. These are part of all replies. They are
1. Date - This is the HTTP date header and it is set to the server's local time expressed in GMT.
2. x-ozone-request-id - This is a UUID string that represents a unique request ID. This ID is
used to track the request through the ozone system and is useful for debugging purposes.
3. x-ozone-server-name - Fully qualified domain name of the server which handled the request.
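As a rough illustration of the common request headers described above, the following sketch assembles them into a map. The class and method names (OzoneHeaders, formatDate, commonHeaders) are hypothetical and not part of ozone, and the date pattern is an assumption inferred from the example values in this document rather than a confirmed server format.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the ozone code base.
final class OzoneHeaders {
  // Assumed pattern, inferred from examples such as "Mon, Apr 04, 2016 06:22:00 GMT".
  private static final DateTimeFormatter OZONE_DATE =
      DateTimeFormatter.ofPattern("EEE, MMM dd, yyyy HH:mm:ss 'GMT'", Locale.US);

  // Converts to UTC first, because any other time zone is rejected by the server.
  static String formatDate(ZonedDateTime when) {
    return OZONE_DATE.format(when.withZoneSameInstant(ZoneOffset.UTC));
  }

  // Builds the three common request headers for the simple authentication scheme.
  static Map<String, String> commonHeaders(String user, ZonedDateTime when) {
    Map<String, String> headers = new LinkedHashMap<>();
    headers.put("Authorization", "OZONE " + user);
    headers.put("x-ozone-version", "v1");
    headers.put("Date", formatDate(when));
    return headers;
  }

  public static void main(String[] args) {
    commonHeaders("bilbo", ZonedDateTime.now(ZoneOffset.UTC))
        .forEach((name, value) -> System.out.println(name + " : " + value));
  }
}
```

Keeping the date formatting in one place makes it easy to adjust the pattern if the server turns out to expect a different layout.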
1.3.3 Storage Volumes

A storage volume represents a unit of ownership in the ozone world. Once ozone is installed, an
administrator [5] will have to create a storage volume for a user. The notions of user and group
are defined outside ozone. A user or a group might exist as a Kerberos principal or as some
other sort of name. Please see the customization section to understand how to extend ozone
authorization to use custom schemes.

1.3.4 Buckets

Buckets are similar to directories in a real file system, except that buckets cannot be nested.
They hold keys and values, or file names and data.

1.3.5 Keys

Keys are unique entities in a bucket. The value part of a key is the data stream.

1.3.6 Authentication & Access Control

Just like HDFS, ozone supports different modes of authentication. The default is the insecure
access [6] mode. Ozone relies on user names but has no notion of authentication. It simply makes
calls to the UserAuth interface to establish user identity. Ozone will support other authentication
schemes - especially Kerberos. It is trivial to extend the user authentication to use your own
schemes, for example an HMAC based scheme like Amazon AWS or OAuth. Please see the
customization section for a deeper discussion on this.
The access control in ozone is a stripped down version of the controls offered in traditional Unix
systems. It is based on a notion of users and groups which are defined outside ozone.
The ACLs offered by ozone are:
user:name:rw
group:name:rw
world::rw
where name is a valid user name or group name. If this is backed by Kerberos, those names
must exist as valid Kerberos entities. The maximum number of ACLs supported on a resource is
32 [7].
ACLs can be added or removed using the x-ozone-acl header.
x-ozone-acl : ADD user:bilbo:rw - Adds an ACL to the resource.
x-ozone-acl : REMOVE user:bilbo:rw - Removes an ACL from the resource.
We always evaluate REMOVE ACLs first. That is, if you send x-ozone-acl : ADD
user:bilbo:rw and x-ozone-acl : REMOVE user:bilbo:rw in the same HTTP request, we will
first remove the ACL on the resource if it exists and then add the new ACL to the resource.
Ozone ships with a simple authentication scheme. This is called simple since it does not
really do any authentication. It just trusts the user name that is specified in the HTTP header.
Simple uses "root" and "hdfs" as admin user names. This means that all sections which are
marked as admin API can only be invoked if you specify the user name as hdfs or as root.
Simple also converts all user names to lowercase, that is, user HDFS, hdfs and HdFs will
all map to hdfs. In other words, user names are case-insensitive [8]. Simple also trims leading
and trailing white-space characters. Please note that this is the default behavior of the simple
authentication filter that is shipped with ozone. Each authentication filter is free to implement
its own policies.
Right now ACLs are supported only at bucket level. We do plan to extend ACL support to
volumes and keys in later versions of ozone.

[5] Please see the simple authentication section to understand what an administrator means in that context.
[6] This is the default insecure protocol - called simple. Please do not use this if you want a secure ozone cluster.
[7] Yes, this is arbitrary. There is nothing critical in the system forcing us to use 32. The only limit is that we
would like to keep HTTP headers less than 8 KB.
[8] Many unix systems follow this pattern; we are eager to hear if this causes any issues for you.
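The REMOVE-before-ADD rule and the user name handling of the simple scheme can be sketched as follows. This is only an illustration under stated assumptions: AclSketch, apply and normalizeUser are hypothetical names, and ACLs are modeled as plain strings such as user:bilbo:rw rather than whatever types ozone uses internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; none of these names come from the ozone code base.
final class AclSketch {
  static final int MAX_ACLS = 32;

  // Mirrors the simple scheme: trim white space, then lower-case the name.
  static String normalizeUser(String name) {
    return name.trim().toLowerCase(Locale.ROOT);
  }

  // Applies x-ozone-acl header values such as "ADD user:bilbo:rw".
  // Every REMOVE is evaluated before any ADD, regardless of header order.
  static Set<String> apply(Set<String> current, List<String> headerValues) {
    Set<String> result = new LinkedHashSet<>(current);
    List<String> toAdd = new ArrayList<>();
    for (String value : headerValues) {
      String[] parts = value.trim().split("\\s+", 2);
      if (parts.length == 2 && parts[0].equals("REMOVE")) {
        result.remove(parts[1]);
      } else if (parts.length == 2 && parts[0].equals("ADD")) {
        toAdd.add(parts[1]);
      } else {
        throw new IllegalArgumentException("Malformed x-ozone-acl value: " + value);
      }
    }
    result.addAll(toAdd);
    if (result.size() > MAX_ACLS) {
      throw new IllegalStateException("More than " + MAX_ACLS + " ACLs on one resource");
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> acls = new LinkedHashSet<>(Arrays.asList("user:bilbo:rw"));
    List<String> headers = Arrays.asList("ADD user:bilbo:rw", "REMOVE user:bilbo:rw");
    System.out.println(apply(acls, headers));
    System.out.println(normalizeUser("  HdFs "));
  }
}
```

Because removals are applied first, sending ADD and REMOVE for the same ACL in one request leaves that ACL present on the resource.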
1.3.7 Accounting

Ozone intends to provide built-in capabilities to do accounting, and it follows the AWS model of
accounting [9]. It calculates how many byte-hours of storage have been used by a storage volume.
In other words, it computes how many bytes have been stored for how long in an ozone cluster.
This can be extended via Java interfaces or read from ozone using the REST API [10].

[9] Please see the billing section at http://aws.amazon.com/s3/faqs/ to understand the AWS accounting model.
[10] TBD - This API is not finalized yet.
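To make the byte-hours unit concrete, here is a minimal arithmetic sketch. ByteHours and byteHours are hypothetical names used only for illustration; the accounting API itself is not finalized, so this shows the unit and not ozone's implementation.

```java
import java.time.Duration;

// Hypothetical illustration of the byte-hours unit; not ozone code.
final class ByteHours {
  // byte-hours = number of bytes stored multiplied by the hours they were stored.
  static double byteHours(long bytes, Duration stored) {
    double hours = stored.toMillis() / 3_600_000.0;
    return bytes * hours;
  }

  public static void main(String[] args) {
    long oneGiB = 1L << 30;
    // One GiB kept for a day and a half.
    System.out.println(byteHours(oneGiB, Duration.ofHours(36)));
  }
}
```

Summing this quantity over every interval in which a volume's stored size was constant gives the volume's total usage for a period.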


2. Using Ozone

2.1 Setting up Ozone

Setting up ozone is simple. If you have a hadoop cluster [1], ozone can be enabled by setting
these two properties in hdfs-site.xml.
<property>
  <name>ozone.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.handler.type</name>
  <value>local</value>
</property>
The ozone.enabled key enables ozone functionality. The second key, ozone.handler.type, tells
ozone which storage handler to use. By default we ship with two handlers, the distributed handler
and a local handler. The local handler is strictly for testing. Since ozone is still a work in
progress [2], this document uses local for all illustration purposes. The ozone local handler mimics
an object store under /tmp or a user-defined directory.
Once ozone is enabled, ozone's REST server listens at the datanode HTTP port. The following
examples assume that we have an ozone cluster called http://ozone.self

[1] Based on the HDFS-7240 branch.
[2] This means that the distributed handler is not fully functional yet.

2.2 Command Line Interface

Ozone ships with a command line tool called oz. Here is a typical sequence of operations
performed for creating a bucket. To use ozone, we need to create a storage volume [3].

Creating a bucket in ozone
1. Admin creates a volume. This step can be achieved using the createVolume command.
2. Ozone returns success; at this point a storage volume has been created.
3. User creates a bucket. This can be achieved by the createBucket command.
4. Ozone returns success, indicating that a bucket has been created.

[3] We deliberately chose not to tie this to an account since historically HDFS is deployed both in secure
and insecure formats.

2.2.1 Volume commands

Volume commands are used to manage storage volumes. Other than list volume and info volume,
all volume commands need administrator privilege.
Create Volume

In order to create a volume you need to be an admin and specify who the owner of the volume is.
# Creates a volume called shire that is owned by user bilbo
hdfs oz -createVolume http://ozone.self:50075/shire -user bilbo -quota 100TB -root
Please note the usage of -root. If you specify this parameter then the calling user is set to
hdfs; otherwise the caller name defaults to the currently logged-in user [4].

Update Volume

Updates information like ownership and quota on an existing volume.
# This call sets quota to 500 TB for shire, admin only
hdfs oz -updateVolume http://ozone.self:50075/shire -quota 500TB -root

Delete Volume

This call will delete a volume if it is empty.
# Deletes a volume if it is empty
hdfs oz -deleteVolume http://ozone.self:50075/shire -root

Info Volume

The info volume command allows the owner or the administrator of the cluster to read meta-data
about a specific volume.
# Get information about a volume that has been created
hdfs oz -infoVolume http://ozone.self:50075/shire -root

List Volumes

The list volume command can be used by an administrator to list the volumes of any user. It can
also be used by a user to list the volumes they own.
# List all volumes owned by user bilbo
hdfs oz -listVolume http://ozone.self:50075/ -user bilbo -root

[4] This just makes the life of a developer easy; we may not ship this feature.

2.2.2 Bucket Commands

Bucket commands follow a pattern similar to volume commands. However, bucket commands
are designed to be run by the owner of the volume. The following examples assume that these
commands are run by the owner of the volume or bucket.
Create Bucket

Creates a bucket on a given volume.
# This command creates a bucket called rings on a volume called shire.
# If the volume does not exist, then this call will fail.
hdfs oz -createBucket http://ozone.self:50075/shire/rings

Update Bucket

Updates bucket meta-data, like ACLs.
# This call adds frodo as a user on "rings" bucket.
# This implies that frodo will be able to read and write
# both data and meta-data of the bucket.
hdfs oz -updateBucket http://ozone.self:50075/shire/rings -addAcl user:frodo:rw

Delete Bucket

Deletes a bucket if it is empty.
# Deletes the specified bucket if it is empty.
hdfs oz -deleteBucket http://ozone.self:50075/shire/rings

Info Bucket

Returns information about a given bucket.
# Info bucket returns the information about the bucket.
hdfs oz -infoBucket http://ozone.self:50075/shire/rings

List Buckets

Lists buckets on a given volume.
# List bucket returns list of buckets on a volume.
hdfs oz -listBucket http://ozone.self:50075/shire

2.2.3 Key commands

Set of commands to manage keys (objects) in an ozone bucket.

Put Key

Creates or overwrites a key in the ozone store; -file points to the file you want to upload.
# Uploads a file to ozone bucket
hdfs oz -putKey http://ozone.self:50075/shire/rings/narya -file narya

Get Key

Downloads a key from the ozone store.
# Download a file from ozone bucket
hdfs oz -getKey http://ozone.self:50075/shire/rings/narya -file naryaCopy

Delete Key

Deletes a key from the ozone store.
# Deletes the specified key
hdfs oz -deleteKey http://ozone.self:50075/shire/rings/narya

Info Key

Reads key metadata from the ozone store.
# Metadata of a specified key
hdfs oz -infoKey http://ozone.self:50075/shire/rings/narya

List Keys

Lists all keys in an ozone bucket.
# Lists all keys in a bucket
hdfs oz -listKey http://ozone.self:50075/shire/rings
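All of the commands above address resources with URLs of the form http://host:port/volume/bucket/key. The sketch below shows one way such a URL could be split into its parts. OzonePath is a hypothetical name, and treating everything after the bucket as the key name (including any further slashes) is an assumption, not documented ozone behavior.

```java
import java.net.URI;

// Hypothetical helper that splits an ozone resource URL; not part of ozone.
final class OzonePath {
  final String volume;
  final String bucket;
  final String key;

  private OzonePath(String volume, String bucket, String key) {
    this.volume = volume;
    this.bucket = bucket;
    this.key = key;
  }

  static OzonePath parse(String url) {
    String path = URI.create(url).getPath();
    if (path.startsWith("/")) {
      path = path.substring(1);
    }
    // Limit to 3 parts so that any further '/' stays inside the key name (assumption).
    String[] parts = path.split("/", 3);
    return new OzonePath(part(parts, 0), part(parts, 1), part(parts, 2));
  }

  // Returns null for a missing or empty path component.
  private static String part(String[] parts, int index) {
    return index < parts.length && !parts[index].isEmpty() ? parts[index] : null;
  }

  public static void main(String[] args) {
    OzonePath p = parse("http://ozone.self:50075/shire/rings/narya");
    System.out.println(p.volume + " / " + p.bucket + " / " + p.key);
  }
}
```

A volume-level URL such as http://ozone.self:50075/shire yields a volume name with a null bucket and key, which matches how the volume commands are addressed.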

2.3 Client Libraries

Ozone ships with a default Java client library [5]. Please look at the REST protocol section if you
would like to write your own client library.
The ozone Java client library consists of four main classes: OzoneClient, OzoneVolume,
OzoneBucket and OzoneKey. You start communicating with the server using OzoneClient, and
from the client you can get to a volume and so on.

[5] The current library is written to support the ozShell and tests, hence should be treated as experimental
and incomplete.

Here is a simple example of how to use the ozone client library.

import java.io.File;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.hadoop.ozone.web.client.OzoneBucket;
import org.apache.hadoop.ozone.web.client.OzoneClient;
import org.apache.hadoop.ozone.web.client.OzoneVolume;
// The package of OzoneException is an assumption; adjust it to match your build.
import org.apache.hadoop.ozone.web.exceptions.OzoneException;

public static void example() {
  try {
    // Create an Ozone Client
    OzoneClient client = new OzoneClient();
    // Set the machine name which we are communicating with.
    client.setEndPointURI("http://localhost");
    // Set the user name of the caller
    client.setUserAuth("hdfs");
    // This call creates a volume called shire whose owner is
    // bilbo which has a quota of 100TB.
    // The newly created volume is returned as OzoneVolume.
    OzoneVolume vol = client.createVolume("shire", "bilbo", "100TB");

    // Given a volume we can perform a bunch of operations like create,
    // delete, update etc on a bucket.
    // This is an example of create bucket call.
    OzoneBucket bucket = vol.createBucket("bagend");

    // This allows you to put your keys into the bucket.
    // getFileObject() stands in for however your application obtains
    // the local file to upload.
    File f = getFileObject();
    String keyName = "ring";
    bucket.putKey(keyName, f);

    // If you want to get a copy of your ring file,
    // this is how you would download it from ozone.
    Path targetPath = Paths.get("/tmp/secret-ring");
    bucket.getKey(keyName, targetPath);
  } catch (IOException | OzoneException ozException) {
    System.out.printf("Error : %s%n", ozException.getMessage());
  }
}

2.4 Managing Ozone

The hardest part of running any service is managing it. Ozone comes with a set of management
interfaces out of the box. The management interface of ozone is built on a notion called "levels
& pillars". Each metric or user-facing configuration is tagged with [6] its level and pillar. Ozone
currently defines these levels: Cluster, Server, User, Volume, Bucket, Key and Request.
The pillars of ozone's management are Faults, Configuration, Accounting, Performance and
Security.
We are hoping that this kind of tagging will make it easy for a user to locate a configuration or
a metric [7]. Let us say that we have an "access denied" error; this kind of classification
will allow the end user to quickly find all configuration and metrics that deal with security at the
required level. You can think of this as a matrix of configuration and metrics for ozone. Please
look at the cheat sheet provided at the end of this section to see an example.
The following sections describe the pillars for each level of ozone and what metrics and
configuration are provided.

[6] Ideally we should do it in code; right now it is only documented.
[7] Lessons learned from the HDFS world.

2.4.1 Cluster Level

Faults

At cluster level we capture a set of counters that represent faults. They are
1. Total error count - Total number of errors in the cluster for the last hour.
2. Top servers with errors - Top x (say 20) servers with max error counts.
3. Straggler nodes - Nodes which are slower by three standard deviations from the mean
response time for gets and puts.
These error counts give the administrator a view, or indicate a trend, of how healthy
ozone currently is.
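The straggler rule above (slower than the mean by three standard deviations) can be sketched as follows. StragglerSketch and its method are hypothetical names, the input is assumed to be one mean response time per node, and the population standard deviation is used; how ozone actually collects and compares latencies is not specified here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the three-standard-deviations rule; not ozone code.
final class StragglerSketch {
  // Returns the nodes whose latency exceeds mean + 3 * (population) standard deviation.
  static List<String> stragglers(Map<String, Double> latencyMsByNode) {
    List<String> result = new ArrayList<>();
    int n = latencyMsByNode.size();
    if (n == 0) {
      return result;
    }
    double sum = 0.0;
    for (double latency : latencyMsByNode.values()) {
      sum += latency;
    }
    double mean = sum / n;
    double squaredDiffs = 0.0;
    for (double latency : latencyMsByNode.values()) {
      squaredDiffs += (latency - mean) * (latency - mean);
    }
    double threshold = mean + 3.0 * Math.sqrt(squaredDiffs / n);
    for (Map.Entry<String, Double> entry : latencyMsByNode.entrySet()) {
      if (entry.getValue() > threshold) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Double> latencies = new LinkedHashMap<>();
    for (int i = 0; i < 20; i++) {
      latencies.put("node" + i, 10.0);
    }
    latencies.put("slow", 1000.0);
    System.out.println(stragglers(latencies));
  }
}
```

Note that with this definition a single outlier can only be flagged when there are more than ten nodes, since one value among n can be at most sqrt(n - 1) population standard deviations from the mean.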

Configuration

We have a number of configuration parameters for ozone, but two of them are important, since
correct configuration values are required to enable ozone.
1. ozone.enabled - Must be set to true for ozone to work.
2. ozone.storage.handler.type - Must be set to local or distributed to work.

Accounting

Accounting in the context of ozone is something that reflects a chargeable quantity. For example,
Amazon charges per byte stored over time, for the number of requests, or for bytes that are read
from the bucket. Since there are no chargeable entities at the cluster level, we will have no
accounting at the cluster level.

Performance

Ozone will expose a set of performance counters at the cluster level.
1. Bytes read - Similar to HDFS
2. Bytes written - Similar to HDFS
3. Number of file puts and deletes
4. Total used capacity
5. Request latency percentile - Counter that allows users to see the latency of
requests.

Security

If security is enabled for ozone, configuration settings will be enabled using hdfs-site.xml.
Ozone will be able to leverage things like Kerberos from the current HDFS configuration.

2.4.2 Server Level

Faults

At server level we capture a set of counters that represent faults. They are
1. Total errors - Total errors reported by this server for the last hour.
2. Errors by error code - Errors broken down by error code. This is useful since "internal
server error" requires a different admin response from "access denied". Please see the REST
protocol section for details on error codes.

Configuration

We hope we will not need any server level configuration, or that whatever we need is already
present in hdfs-site.xml. An example of server specific configuration would be _HOST used by
Kerberos enabled clusters.

Accounting

A server is a shared resource, so accounting at this level does not make sense.

Performance

1. Bytes read - will expose load on this server.
2. Bytes written - will expose load on this server.

Security

HTTPS certificate - we share the ozone port with WebHDFS, so no ozone specific configuration is
needed.
2.4.3 User Level

Other than security, nothing in ozone deals with users. We rely on the UserAuth interface to tell
us the user's identity. The default UserAuth handler is Simple, which means the user is deploying
an insecure cluster [8].
ozone.user.provider - This value needs to be set to use a different provider. If it is not set,
ozone will use org.apache.hadoop.ozone.web.userauth.Simple as the default provider [9].

2.4.4 Volume Level

Faults

Since volumes can span multiple physical machines, it is hard to keep faults aggregated by
volume. Hence ozone currently exposes no fault counters at volume level.

Configuration

1. Quota - Allows admins to specify how much disk can be consumed by this volume.
2. Owner - Specifies which user owns this volume. Please see the REST protocol section for
more details.

Accounting

1. Bytes per period - All accounting is done at the volume level. We will store how
many bytes are stored per day at the volume level. Ozone will emit an accounting entry
whenever we put or delete a blob. If the put is overwriting an existing blob, we will emit
two entries, as if the older blob was deleted and a new one was added. These numbers are
aggregated and exposed as accounting entries. Please do note that ozone accounts per
volume and not per user.
2. Requests - This is something that we are planning to do in the future; it will capture how
many requests are made to a volume.

Performance

No performance data is maintained at volume level.

Security

Volume Ops - Create, Delete and Update are only possible for cluster admins.
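The volume-level accounting rule described in this section (one entry per put or delete, two entries for an overwrite) could be modeled roughly as below. VolumeAccounting and its Entry type are hypothetical, and the in-memory maps stand in for whatever storage ozone would actually use; this is a sketch of the bookkeeping idea only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of per-volume accounting entries; not ozone code.
final class VolumeAccounting {
  // One accounting entry: a signed change in stored bytes for a volume.
  static final class Entry {
    final String volume;
    final long deltaBytes;

    Entry(String volume, long deltaBytes) {
      this.volume = volume;
      this.deltaBytes = deltaBytes;
    }
  }

  private final Map<String, Long> blobSizes = new HashMap<>();
  private final List<Entry> entries = new ArrayList<>();

  void put(String volume, String keyPath, long size) {
    Long old = blobSizes.put(volume + "/" + keyPath, size);
    if (old != null) {
      // Overwrite: account as if the older blob was deleted first.
      entries.add(new Entry(volume, -old));
    }
    entries.add(new Entry(volume, size));
  }

  void delete(String volume, String keyPath) {
    Long old = blobSizes.remove(volume + "/" + keyPath);
    if (old != null) {
      entries.add(new Entry(volume, -old));
    }
  }

  int entryCount() {
    return entries.size();
  }

  // Aggregates all entries for one volume into its currently stored bytes.
  long storedBytes(String volume) {
    long total = 0;
    for (Entry entry : entries) {
      if (entry.volume.equals(volume)) {
        total += entry.deltaBytes;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    VolumeAccounting accounting = new VolumeAccounting();
    accounting.put("shire", "rings/narya", 100);
    accounting.put("shire", "rings/narya", 250);
    System.out.println(accounting.entryCount() + " entries, "
        + accounting.storedBytes("shire") + " bytes stored");
  }
}
```

Recording the overwrite as a delete plus an add keeps every entry a simple signed delta, so aggregation per volume is a plain sum.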
2.4.5 Bucket Level

Faults

It might be interesting to maintain fault counters at bucket level; however, ozone currently has
no counters at this level.

Configuration

1. Bucket Version Support - Enabled, Disabled. When implemented this will store
versions of files.
2. Bucket Storage Type - DISK, SSD or ARCHIVE. Standard types supported by HDFS.

Accounting

No accounting at bucket level.

Performance

No performance data is maintained at bucket level.

Security

Bucket ACLs - Access control in ozone is specified at bucket level.

[8] Caveat Emptor - Let the buyer beware.
[9] This ability to change providers is not enabled yet.

2.4.6 Key Level

Currently we support no features at key level. At some point we would like to support
policies and ACLs on keys.

2.4.7 Request Level

We will log at INFO level for each request failure, so users can correlate that with the ozone
request ID (x-ozone-request-id) and find out what went wrong.
Given pillars and levels, we can think of ozone's management interface as a simple matrix
which looks like this.

Level | Faults | Configuration | Accounting | Performance | Security
Cluster | 1. Error count 2. Server errors | 1. ozone.enabled 2. ozone.storage.handler.type | N/A | 1. Bytes read 2. Bytes written 3. Put and delete count 4. Used capacity 5. Straggler nodes | Kerberos
Server | 1. Total errors 2. Errors by error code | N/A | N/A | 1. Bytes read 2. Bytes written 3. Request count | HTTPS certificate
User | N/A | UserAuth Handler, default Simple | N/A | N/A | UserAuth Handler
Volume | N/A | 1. Quota 2. Owner | 1. Bytes per period 2. Requests | N/A | Admin only
Bucket | N/A | 1. Version Support 2. Storage Type | N/A | N/A | ACLs
Key | N/A | N/A | N/A | N/A | N/A
Request | Logging | N/A | N/A | N/A | N/A

Table 2.1: Ozone management cheat sheet


2.4.8 Logging

By default ozone writes its logs to ozone.log in the default hadoop logging directory. This is the
standard location where you would look for datanode/namenode logs. Here is an example of
what ozone.log looks like:
2016-03-23 15:01:39,325 INFO getVolumeInfo bilbo/vol1 bilbo 7407bd2d-d0fd-425c-824c-9dceed032170 - Success
2016-03-23 17:56:00,697 INFO createVolume bilbo/shire hdfs 920cef58-b7cf-4314-9ec3-249a1c7e95cb - Returning exception. ex: {"httpCode":409,
"shortMessage":"volumeAlreadyExists","resource":"bilbo/shire","message":null,
"requestID":"920cef58-b7cf-4314-9ec3-249a1c7e95cb",
"hostName":"aglon.ozone.self"}

The fields of the log are time stamp, level, ozone function, resource name, user, request ID
and message.
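Given the field order listed above, a log line could be split into its fields with a regular expression along these lines. OzoneLogLine and the field names in the returned map are hypothetical, and the pattern assumes whitespace-separated fields with a " - " before the message, as in the sample lines; it only looks at a single physical line.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for single ozone.log lines; not part of ozone.
final class OzoneLogLine {
  // date time, level, function, resource, user, request ID, then " - " and the message.
  private static final Pattern LINE = Pattern.compile(
      "^(\\S+\\s+\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+(\\S+)\\s+-\\s+(.*)$");

  private static final String[] FIELDS = {
      "timestamp", "level", "function", "resource", "user", "requestId", "message"};

  // Returns the fields keyed by name, or null if the line does not match.
  static Map<String, String> parse(String line) {
    Matcher m = LINE.matcher(line.trim());
    if (!m.matches()) {
      return null;
    }
    Map<String, String> fields = new LinkedHashMap<>();
    for (int i = 0; i < FIELDS.length; i++) {
      fields.put(FIELDS[i], m.group(i + 1));
    }
    return fields;
  }

  public static void main(String[] args) {
    String sample = "2016-03-23 15:01:39,325 INFO getVolumeInfo bilbo/vol1 bilbo "
        + "7407bd2d-d0fd-425c-824c-9dceed032170 - Success";
    System.out.println(parse(sample));
  }
}
```

Extracting the request ID this way makes it straightforward to match a log entry against the x-ozone-request-id value returned to a client.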


3. REST Protocol

This chapter covers ozone's REST protocol in detail. Unless you are developing an ozone
client or are really interested in understanding the protocol, you can safely skip this chapter.
Tools like the oz shell or the client library take care of speaking this protocol.

3.1 Storage Volume APIs

Storage volumes support create, delete, update, info and list operations. The minimum length of a
storage volume name is 3 and the maximum length is 64. Valid characters for storage
volume names are lowercase letters, numbers, dashes (-), underscores (_), and dots (.).
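The naming rule above translates directly into a small check. VolumeNames and isValid are hypothetical names for illustration; ozone's own validation may apply additional rules that are not described here.

```java
import java.util.regex.Pattern;

// Hypothetical validator for the documented volume naming rule; not ozone code.
final class VolumeNames {
  // 3 to 64 characters: lowercase letters, digits, dash, underscore or dot.
  private static final Pattern VALID = Pattern.compile("[a-z0-9._-]{3,64}");

  static boolean isValid(String name) {
    return name != null && VALID.matcher(name).matches();
  }

  public static void main(String[] args) {
    System.out.println(isValid("shire"));
    System.out.println(isValid("Shire"));
    System.out.println(isValid("ab"));
  }
}
```

A client can run a check like this before sending a create request, to avoid a round trip that would end in a 400 "Invalid volume name" reply.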

Action                     Description
Create Storage Volume      Creates a storage volume
Update Storage Volume      Updates an existing storage volume
Get Storage Volume Info    Retrieves storage volume info
Delete Storage Volume      Deletes a storage volume if it is empty
List Storage Volumes       List all the storage volumes or by user

3.1.1 Create storage volume

This API requires admin privilege. This call creates a new storage volume.
API Details

POST /{volume}
Here is a working example of this request. The following call creates a storage volume called
shire, whose owner is bilbo. To specify the owner of the volume we use an ozone specific header
called x-ozone-user.

POST /shire
Authorization : OZONE gandalf
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-user : bilbo
x-ozone-version : v1

201 CREATED
Date : Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-request-id : 920cef58-b7cf-4314-9ec3-249a1c7e95cb
x-ozone-server-name : aglon.ozone.self

An optional query parameter is used to specify a quota for this volume.

POST /shire?quota=10TB
Authorization : OZONE gandalf
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-user : bilbo
x-ozone-version : v1

201 CREATED
Date : Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-quota : 10 TB
x-ozone-request-id : 759d5028-af9c-4bd6-9ddf-2b1f88569e86
x-ozone-server-name : aglon.ozone.self

In the above example the admin was able to set a quota of 10 TB on a storage volume called
shire. Under ozone we associate quotas with storage volumes and not with user accounts. If no
quota is specified then none is presumed to exist.

Query Parameters

Param    Value                          Comments
Quota    long <BYTES | MB | GB | TB>    Optional. Quota size in BYTES, MBs, GBs or TBs
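A quota value of the form shown above (a number followed by BYTES, MB, GB or TB) could be converted to bytes roughly as follows. QuotaSketch and toBytes are hypothetical names, and the use of binary multipliers (1 TB = 2^40 bytes) is an assumption; this document does not state whether ozone uses binary or decimal units.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical quota parser; unit sizes are assumed to be binary multiples.
final class QuotaSketch {
  private static final Pattern QUOTA = Pattern.compile("(\\d+)\\s*(BYTES|MB|GB|TB)");

  static long toBytes(String quota) {
    Matcher m = QUOTA.matcher(quota.trim());
    if (!m.matches()) {
      throw new IllegalArgumentException("Malformed quota: " + quota);
    }
    long value = Long.parseLong(m.group(1));
    long unit;
    switch (m.group(2)) {
      case "MB":
        unit = 1L << 20;
        break;
      case "GB":
        unit = 1L << 30;
        break;
      case "TB":
        unit = 1L << 40;
        break;
      default:
        unit = 1L; // BYTES
        break;
    }
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    return Math.multiplyExact(value, unit);
  }

  public static void main(String[] args) {
    System.out.println(toBytes("10TB"));
    System.out.println(toBytes("512BYTES"));
  }
}
```

Rejecting anything that does not match the pattern mirrors the 400 "Malformed quota" error listed for this API.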

Two headers are mandatory for all ozone requests: Date and x-ozone-version.
For individual requests only special headers are highlighted as required. We are aware
of RFC 6648 [1], which deprecates the use of the x- prefix in header names. However, most users
of the HTTP protocol are familiar with reading x- as a custom parameter, hence we have chosen
to use it in ozone.

[1] http://tools.ietf.org/html/rfc6648

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version
x-ozone-user       User name             Required - Volumes need owners

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Location               string                volume location
x-ozone-quota          size                  if valid quota parameter was specified
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
400           Malformed quota          Invalid quota specified
400           Invalid volume name      Invalid characters in the volume
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
403           Access denied            Access denied
404           User not found           Invalid user name
409           Volume already exists    Volume name has to be unique
500           Internal server error    Server error

3.1.2 Update storage volume

This API requires admin privilege. Allows an admin to update volume properties.
API Details
PUT /{volume}
Here is an example call where we move the ownership of shire from bilbo to frodo. This same
operation can be used to update quota information. If you need to remove an existing quota, you
can specify ?quota=remove.
PUT /shire
Authorization : OZONE gandalf
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-user : frodo
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-request-id : ddb48938-3a19-4287-aafc-d0751ed9c2c6
x-ozone-server-name : aglon.ozone.self

Changing the ownership of a volume does not change the bucket ACLs; it only changes the
owner of the storage volume. However, the ACL set of every bucket grants the storage volume
owner full access rights, so after an ownership change the new owner is able to read and
write the ACLs if needed.

Query Parameters

Param    Value                                   Comments
quota    long <BYTES | MB | GB | TB> | remove    Optional

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version
x-ozone-user       User name             Optional - used to change volume ownership

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Location               String                Volume location
x-ozone-quota          Size                  If a quota change was requested
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
400           Malformed quota          Invalid quota specified
400           Invalid volume name      Invalid characters in the volume
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
403           Access denied            Access denied
404           User not found           Invalid user name
404           No such volume           Volume not found
500           Internal server error    Server error

3.1.3 Delete storage volume

This API requires admin privilege. It deletes a storage volume if it is empty.
API Details
DELETE /{volume}

DELETE /shire
Authorization : OZONE gandalf
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : e194316a-6c3c-43f9-89d6-dea876e33e12
x-ozone-server-name : aglon.ozone.self

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request


Error Codes

Error Code    Value                    Comments
400           Invalid volume name      Invalid characters in the volume
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
404           User not found           Invalid user name
404           No such volume           Volume not found
409           Volume not empty         Volume must not have any buckets
500           Internal server error    Server error

3.1.4 Info storage volume

This API allows admins to read the information of any storage volume. Owners can use it to
read information about the volumes they own.
API Details
GET /{volume}?info=volume

This is an example of an admin reading the info of a volume owned by another user.

GET /shire?info=volume
Authorization : OZONE gandalf
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Content-Length : 400
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-request-id : 3230d711-f377-4b2d-ba6b-be3a9a0313ba
x-ozone-server-name : aglon.ozone.self
{
"owner" : {
"name" : "bilbo"
},
"quota" : {
"unit" : "TB",
"size" : 100
},
"volumeName" : "shire",
"createdOn" : "Mon, Apr 04 2016 06:22:00 GMT",
"createdBy" : "hdfs"
}
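The JSON payload above can be read with any JSON parser. The following sketch maps it onto a small Python record. The field names come from the example reply; the `VolumeInfo` class and `parse_volume_info` function are hypothetical and are not the ozone response classes.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeInfo:
    """Hypothetical client-side view of a volume info reply."""
    name: str
    owner: str
    quota_size: int
    quota_unit: str
    created_on: str
    created_by: str


def parse_volume_info(payload: str) -> VolumeInfo:
    """Convert the JSON body of an info=volume reply into a VolumeInfo."""
    doc = json.loads(payload)
    return VolumeInfo(
        name=doc["volumeName"],
        owner=doc["owner"]["name"],
        quota_size=doc["quota"]["size"],
        quota_unit=doc["quota"]["unit"],
        created_on=doc["createdOn"],
        created_by=doc["createdBy"],
    )


SAMPLE = """
{
  "owner": {"name": "bilbo"},
  "quota": {"unit": "TB", "size": 100},
  "volumeName": "shire",
  "createdOn": "Mon, Apr 04 2016 06:22:00 GMT",
  "createdBy": "hdfs"
}
"""
info = parse_volume_info(SAMPLE)
print(info.name, info.owner, info.quota_size, info.quota_unit)
```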

Query Parameters

Query parameter info=volume is required for this call to succeed.

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Content-Length         Integer               Length of the payload
Location               String                Volume location
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
400           Invalid volume name      Invalid characters in the volume
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
404           User not found           Invalid user name
404           No such volume           Volume not found
500           Internal server error    Server error

3.1.5 List Storage Volumes

The list storage volumes API allows users and admins to list volumes.
API Details

Ozone allows three distinct ways to list volumes. The following table summarizes them. The
Action, Scope and Invoked by columns describe the semantics of each type of query.
Action              Scope             Invoked by       Query                                            Header
List all volumes    of a user         the same user    GET /?prefix=&max_keys=&prev_key=
List all volumes    of a user         admin            GET /?prefix=&max_keys=&prev_key=                x-ozone-user: name
List all volumes    in the cluster    admin            GET /?prefix=&max_keys=&prev_key=&root_scan=1

GET /?prefix=<string>&max_keys=<int>&prev_key=<string>&root_scan=1

Since the query parameters are optional, here is an example of a user listing the volumes they own.

GET /
Authorization : OZONE bilbo
Date : Sat, 16 Apr 2016 20:06:29 GMT
x-ozone-version : v1

In this case the user is recognized and owns two storage volumes called rings and swords.
The reply would look like this.
200 OK
content-length: 300
Content-Type: application/octet-stream
Date: Sat, 16 Apr 2016 20:06:29 GMT
x-ozone-request-id: 34e85523-80fa-459c-a150-95366fa84a92
x-ozone-server-name : aglon.ozone.self
{
"volumes":[
{
"owner":{
"name":"bilbo"
},
"quota":{
"unit":"TB",
"size":100
},
"volumeName":"rings",
"createdOn":"Sat, 16 Apr 2016 19:19:21 GMT",
"createdBy":"hdfs"
},
{
"owner":{
"name":"bilbo"
},
"quota":{
"unit":"TB",
"size":100
},
"volumeName":"swords",
"createdOn":"Sat, 16 Apr 2016 19:19:21 GMT",
"createdBy":"hdfs"
}
]
}

The same API can be invoked by an admin by adding the x-ozone-user header and specifying the
user that the admin wants data about. The usage is similar to the Info storage volume API.

Query Parameters

Param        Value                               Comments
prefix       Prefix string                       Return all entries that match this prefix
max_keys     Maximum number of keys to return    Default value is 1000
prev_key     Last seen key                       Should be a volume name
root_scan    Indicates cluster level listing     Should be set to 1

Please note that all query parameters for this call are optional.
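The prefix, max_keys and prev_key parameters lend themselves to paging through a long listing. The sketch below builds the query string and walks the pages. It rests on two assumptions that the text does not state: that prev_key takes the name of the last volume seen, and that a page shorter than max_keys marks the end of the listing. `list_volumes_path` and `iter_volumes` are hypothetical helpers, and `fetch` stands in for whatever function performs the GET and decodes the JSON reply.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import urlencode


def list_volumes_path(prefix: Optional[str] = None,
                      max_keys: Optional[int] = None,
                      prev_key: Optional[str] = None,
                      root_scan: bool = False) -> str:
    """Build the request path for a list volumes call; all parameters are optional."""
    params: Dict[str, str] = {}
    if prefix:
        params["prefix"] = prefix
    if max_keys is not None:
        params["max_keys"] = str(max_keys)
    if prev_key:
        params["prev_key"] = prev_key
    if root_scan:
        params["root_scan"] = "1"  # cluster level listing, admin only
    return "/?" + urlencode(params) if params else "/"


def iter_volumes(fetch: Callable[[str], dict], page_size: int = 1000,
                 prefix: Optional[str] = None) -> Iterator[dict]:
    """Yield every volume entry, requesting one page at a time."""
    prev_key = None
    while True:
        page = fetch(list_volumes_path(prefix, page_size, prev_key))["volumes"]
        yield from page
        if len(page) < page_size:  # assumption: a short page is the last page
            return
        prev_key = page[-1]["volumeName"]


print(list_volumes_path(prefix="ri", max_keys=100))
```

Keeping the HTTP call behind the `fetch` argument lets the paging logic be exercised without a running cluster.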
Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Content-Length         Integer               Length of the payload
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
403           Access denied            Access denied
404           User not found           Invalid user name
500           Internal server error    Server error

3.2 Buckets

All buckets are created under a storage volume. Buckets can be created only by the owner of the
volume. Buckets support access control lists and storage types.

Action           Description
Create bucket    Creates a bucket
Update bucket    Updates a bucket
Delete bucket    Deletes a bucket if it is empty
Info bucket      Gets bucket info
List bucket      Lists all buckets in a volume

3.2.1 Create bucket

The create bucket API allows users to create a bucket. It supports the following storage types:
ARCHIVE
DEFAULT
DISK
RAM_DISK
SSD
You can specify the storage type when you create a bucket. Storage types allow buckets to be
placed on volumes with specific characteristics.
API Details
POST /{volume}/{bucket}

Bucket names must be between 3 and 64 characters long. They may contain only lowercase
letters, numbers, dashes (-), underscores (_), and dots (.).
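The naming rule above can be checked on the client before a create call is sent. This is a small sketch: `is_valid_bucket_name` is a hypothetical helper, and it assumes that "lowercase letters" means ASCII a-z.

```python
import re

# 3 to 64 characters drawn from: lowercase ASCII letters, digits, '.', '_' and '-'
_BUCKET_NAME = re.compile(r"[a-z0-9._-]{3,64}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the documented bucket naming rule."""
    return _BUCKET_NAME.fullmatch(name) is not None


for candidate in ("bag-end", "Bag-End", "ab"):
    print(candidate, is_valid_bucket_name(candidate))
```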
POST /shire/bag-end
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-storage-class : ssd
x-ozone-version : v1

201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location :/shire/bag-end
x-ozone-request-id : e0688666-fa07-45a1-a33e-7536560e96db
x-ozone-server-name : aglon.ozone.self

ACLs can be added by specifying x-ozone-acl in the HTTP headers.

POST /shire/bag-end
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-acl : ADD user:samwise:rw
x-ozone-version : v1

201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location :/shire/bag-end
x-ozone-request-id : c7ca87f4-1d4e-4d49-bc34-74fa865951ce
x-ozone-server-name : aglon.ozone.self
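The x-ozone-acl value in the request above has the shape `<action> <type>:<name>:<rights>`. The sketch below splits such a value into its parts. It is based only on the single example shown (`ADD user:samwise:rw`): the `AclChange` class and `parse_acl_header` function are hypothetical, and restricting the rights to the letters r and w is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclChange:
    """Hypothetical parsed form of one x-ozone-acl header value."""
    action: str  # e.g. "ADD"
    kind: str    # e.g. "user"
    name: str    # e.g. "samwise"
    rights: str  # e.g. "rw"


def parse_acl_header(value: str) -> AclChange:
    """Parse a value such as 'ADD user:samwise:rw'; raise ValueError if malformed."""
    action, _, spec = value.strip().partition(" ")
    parts = spec.strip().split(":")
    if not action or len(parts) != 3 or not all(parts):
        raise ValueError(f"Invalid ACL: {value!r}")
    kind, name, rights = parts
    if set(rights) - set("rw"):  # assumption: rights are built from 'r' and 'w'
        raise ValueError(f"Invalid ACL rights: {rights!r}")
    return AclChange(action.upper(), kind.lower(), name, rights)


print(parse_acl_header("ADD user:samwise:rw"))
```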

Request Headers

Header                       Value                 Comments
Authorization                OZONE name            Simple protocol
Date                         Date & Time in GMT    HTTP date
x-ozone-version              v1                    Ozone protocol version
x-ozone-acl                  <ozone acl>           Ozone ACLs
x-ozone-bucket-versioning    enabled/disabled      Optional, versioning (not yet supported)
x-ozone-storage-class        Storage type          Optional, allows placement control

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Location               String                Bucket location
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                     Comments
307           Redirect                  Redirect URL
400           Invalid bucket name       Invalid characters or length in bucket name
400           Invalid volume name       Invalid characters in the volume
400           Missing version           Missing x-ozone-version
400           Missing date              Missing date header
400           Bad date                  Unable to parse date
400           Invalid ACL               Invalid ACL
400           Unknown storage class     Invalid storage class
400           Unknown versioning tag    Unknown versioning
401           Unauthorized              Access token is missing or invalid
403           Insufficient quota        Quota exceeded
403           Access denied             Access denied
404           No such volume            Volume not found
404           User not found            Invalid user name in ACL
409           Bucket already exists     Bucket name should be unique under a volume
500           Internal server error     Server error

3.2.2 Update bucket

The update bucket API allows users to make changes to an existing bucket. This API allows
changing of ACLs and versioning.
API Details
PUT /{volume}/{bucket}

ACLs are specified using the x-ozone-acl header. You specify whether you are adding or removing
an ACL with the appropriate prefix. You can also enable or disable versioning on a bucket using
x-ozone-bucket-versioning : enabled and x-ozone-bucket-versioning : disabled. If you disable
versioning on a bucket, all versions other than the current version are deleted. This is a
destructive operation and old versions are irrecoverably lost.
PUT /shire/bag-end
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-acl : ADD user:peregrin:rw
x-ozone-bucket-versioning : enabled
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location :/shire/bag-end
x-ozone-request-id : 8f6db344-0654-11e6-a738-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self

With this request we added peregrin to the users who can read and write to the bag-end
bucket. At some point ozone will support versioning of objects stored in a bucket. To enable
that feature (which does not exist yet), all you need to do is add x-ozone-bucket-versioning :
enabled to the HTTP headers.
Request Headers

Header                       Value                 Comments
Authorization                OZONE name            Simple protocol
Date                         Date & Time in GMT    HTTP date
x-ozone-version              v1                    Ozone protocol version
x-ozone-acl                  <ozone acl>           Ozone ACLs
x-ozone-bucket-versioning    enabled/disabled      Bucket versioning

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Location               String                Bucket location
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                     Comments
307           Redirect                  Redirect URL
400           Invalid bucket name       Invalid characters or length in bucket name
400           Invalid volume name       Invalid characters in the volume
400           Missing version           Missing x-ozone-version
400           Missing date              Missing date header
400           Bad date                  Unable to parse date
400           Invalid ACL               Invalid ACL
400           Unknown storage class     Invalid storage class
400           Unknown versioning tag    Unknown versioning
401           Unauthorized              Access token is missing or invalid
403           Access denied             Access denied
404           No such volume            Volume not found
404           User not found            Invalid user name in ACL
404           No such bucket            Bucket not found
409           Bucket already exists     Bucket name should be unique under a volume
500           Internal server error     Server error

3.2.3 Delete Bucket

Deletes a bucket if it is empty.
API Details
DELETE /{volume}/{bucket}

DELETE /shire/bag-end
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : fd7b5c7c-f21e-4dad-8f61-3cb2271fddff
x-ozone-server-name : aglon.ozone.self

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid volume name      Invalid characters in volume
400           Invalid bucket name      Invalid characters in bucket
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
403           Access denied            Access denied
404           User not found           Invalid user name
404           No such volume           Volume not found
404           No such bucket           Bucket not found
409           Bucket not empty         Bucket is not empty
500           Internal server error    Server error

3.2.4 Info Bucket

This API returns information about a given bucket.

API Details
GET /{volume}/{bucket}?info=bucket

GET /shire/bag-end?info=bucket
Authorization : OZONE samwise
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Content-Length : 200
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire/bag-end
x-ozone-request-id : 74ec29cd-136c-4b88-9291-f3c11a9a87f3
x-ozone-server-name : aglon.ozone.self

{
"volumeName":"shire",
"bucketName":"bag-end",
"acls":[
{
"type":"USER",
"name":"frodo",
"rights":"READ_WRITE"
},
{
"type":"USER",
"name":"samwise",
"rights":"READ_WRITE"
}
],
"versioning":"DISABLED",
"storageType":"DISK"
}

Query Parameters

Query parameter info=bucket is required for this call to succeed.


Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Content-Length         Integer               Length of the payload
Location               String                Bucket location
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid volume name      Invalid characters in volume
400           Invalid bucket name      Invalid characters in bucket
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
404           User not found           Invalid user name
404           No such volume           Volume not found
404           No such bucket           Bucket not found
500           Internal server error    Server error

3.2.5 List Bucket

The list bucket API allows the user to list all buckets in a volume. Its syntax is very
similar to that of list volumes.
API Details
GET /{volume}?prefix=<string>&max_keys=<int>&prev_key=<string>

Since all the query parameters are optional, here is an example of a list bucket call without
any parameters.

GET /shire
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Content-Length: 722
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : fc8c8f3b-e52e-4095-8e3b-634210932c24
x-ozone-server-name : aglon.ozone.self

{
"buckets":[
{
"volumeName":"shire",
"bucketName":"bag-end",
"acls":[
{
"type":"USER",
"name":"frodo",
"rights":"READ_WRITE"
},
{
"type":"USER",
"name":"samwise",
"rights":"READ_WRITE"
}
],
"versioning":"DISABLED",
"storageType":"DISK"
},
{
"volumeName":"shire",
"bucketName":"hobbiton",
"acls":[
{
"type":"USER",
"name":"frodo",
"rights":"READ_WRITE"
},
{
"type":"USER",
"name":"samwise",
"rights":"READ_WRITE"
}
],
"versioning":"DISABLED",
"storageType":"SSD"
}
]
}

Query Parameters

Param       Value                                  Comments
prefix      Prefix string                          Return all entries that match this prefix
max_keys    Maximum number of results to return    Default value is 1000
prev_key    Last seen key                          Should be a bucket name

Please note that all query parameters for this call are optional.

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Location               String                Bucket location
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid bucket name      Invalid characters or length in bucket name
400           Invalid volume name      Invalid characters in the volume
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
404           No such volume           Volume not found
500           Internal server error    Server error

3.3 Keys

Keys in ozone represent the objects that we store in ozone.

Action        Description
Put key       Creates or overwrites a key
Get key       Gets an existing key
Delete key    Deletes a key
Info key      Gets key metadata
List keys     Lists all keys in a bucket

3.3.1 Put Key

Allows users to create or overwrite keys inside a bucket.


API Details
PUT /{volume}/{bucket}/{key}

When putting a key, it is the responsibility of the user to ensure that the key does not already
exist; otherwise the put will overwrite the existing key. Key names can be from 3 bytes up to a
maximum of 1024 bytes long. The maximum size of the data that can be stored in a single key is
5 GB. All valid URI [3] characters can be used in key names.
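The limits above, together with the Content-Length and Content-MD5 headers used by put key, can be prepared on the client. The following is a minimal sketch under stated assumptions: `put_key_headers` is a hypothetical helper, 5 GB is taken to mean 5 * 1024^3 bytes, key length is measured on the UTF-8 encoding of the name, and the MD5 is sent as a hexadecimal digest to match the examples in this chapter.

```python
import hashlib
from typing import Dict

MIN_KEY_NAME_BYTES = 3
MAX_KEY_NAME_BYTES = 1024
MAX_DATA_BYTES = 5 * 1024 ** 3  # assumption: 5 GB means 5 * 1024^3 bytes


def put_key_headers(key: str, data: bytes, user: str, date: str) -> Dict[str, str]:
    """Validate the key name and data size, then build the put key headers."""
    name_length = len(key.encode("utf-8"))
    if not MIN_KEY_NAME_BYTES <= name_length <= MAX_KEY_NAME_BYTES:
        raise ValueError(f"Invalid key: name is {name_length} bytes long")
    if len(data) > MAX_DATA_BYTES:
        raise ValueError("Data exceeds the maximum size of a single key")
    return {
        "Authorization": f"OZONE {user}",
        "Content-Length": str(len(data)),
        "Content-MD5": hashlib.md5(data).hexdigest(),  # hex digest, as in the examples
        "Date": date,
        "x-ozone-version": "v1",
    }


headers = put_key_headers("palantir", b"hello", "frodo", "Mon, 04 Apr 2016 06:22:00 GMT")
print(headers["Content-MD5"])
```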
PUT /shire/bag-end/palantir
Authorization : OZONE frodo
Content-Length: 1024
Content-MD5: 464dcf43fde437538ad2de1956db340c
Date : Mon, Apr 04, 2016 06:22:00 GMT
Expect :100-continue
x-ozone-version : v1

data

201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : da547452-0595-11e6-afdd-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version
Content-Length     Length                Standard HTTP header
Content-MD5        File hash             Standard HTTP header

3 Charset as defined here: https://tools.ietf.org/html/rfc2396#appendix-A

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid volume name      Invalid characters in the volume
400           Invalid bucket name      Invalid characters in the bucket
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
400           Bad digest               MD5 does not match
400           Invalid key              Key is invalid
401           Unauthorized             Access token is missing or invalid
404           No such volume           Volume not found
404           No such bucket           Bucket not found
500           Internal server error    Server error

3.3.2 Get Key

Get key allows users to read an existing key.
API Details
GET /{volume}/{bucket}/{key}

GET /shire/bag-end/palantir
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1
Range: bytes=0-512

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Content-Length : 512
Content-MD5: e6c810657a5a80446a58484e4b708b16
x-ozone-request-id : 65a0dea0-0597-11e6-ba86-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self

data

Ranges are standard HTTP ranges [4].

4 http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35 - This feature is not supported yet

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid volume name      Invalid characters in the volume
400           Invalid bucket name      Invalid characters in the bucket
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
400           Invalid key              Key is invalid
400           Invalid range            Range is invalid
401           Unauthorized             Access token is missing or invalid
404           No such volume           Volume not found
404           No such bucket           Bucket not found
404           No such key              Key not found
500           Internal server error    Server error

3.3.3 Delete Key

Deletes a key from ozone.
API Details
DELETE /{volume}/{bucket}/{key}

DELETE /shire/bag-end/palantir
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : f7c2992c-0597-11e6-9b1f-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self

Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request

Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid volume name      Invalid characters in the volume
400           Invalid bucket name      Invalid characters in the bucket
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
400           Invalid key              Key is invalid
401           Unauthorized             Access token is missing or invalid
404           No such volume           Volume not found
404           No such bucket           Bucket not found
404           No such key              Key not found
500           Internal server error    Server error

3.3.4 Info Key

Info key provides detailed key information.
API Details
GET /{volume}/{bucket}/{key}?info=key

GET /shire/bag-end/palantir?info=key
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : 1a3ce592-0657-11e6-b55f-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
{
"keyName":"palantir",
"version":0,
"md5hash":"e6edf9e1cb57057502cdaafa998e1426",
"createdOn":"Mon, Apr 04, 2016 06:22:00 GMT ",
"size":1024
}

Query Parameters

Query parameter info=key is required for this call to succeed.


Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request


Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid volume name      Invalid characters in the volume
400           Invalid bucket name      Invalid characters in the bucket
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
400           Invalid key              Key is invalid
401           Unauthorized             Access token is missing or invalid
404           No such volume           Volume not found
404           No such bucket           Bucket not found
404           No such key              Key not found
500           Internal server error    Server error

3.3.5 List Keys

List keys allows users to list the contents of a bucket.
API Details
GET /{volume}/{bucket}?prefix=<string>&maxresult=<int>&prev_key=<string>

GET /shire/bag-end?prefix=ring&maxresult=100
Authorization : OZONE frodo
Date : Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-version : v1

200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Content-Length: 722
x-ozone-request-id : fadd901e-05a9-11e6-86b0-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
{
"Name": "bag-end",
"Prefix": "ring",
"prev_key": "",
"keyList":[
{
"version":0,
"md5hash":"e6edf9e1cb57057502cdaafa998e1426",
"createdOn":"Mon, Apr 04, 2016 06:22:00 GMT",
"size":1024,
"keyName":"ring.0"
},

{
"version":0,
"md5hash":"01c884d16f23e3da058100510967c6c8",
"createdOn":"Mon, Apr 04, 2016 06:22:00 GMT",
"size":1024,
"keyName":"ring.1"
}
]
}

Query Parameters

Param       Value                                  Comments
prefix      Prefix string                          Return all entries that match this prefix
max_keys    Maximum number of results to return    Default value is 1000
prev_key    Last seen key                          Should be a key name

Please note that all query parameters for this call are optional.
Request Headers

Header             Value                 Comments
Authorization      OZONE name            Simple protocol
Date               Date & Time in GMT    HTTP date
x-ozone-version    v1                    Ozone protocol version

Reply Headers

Header                 Value                 Comments
Date                   Date & Time in GMT    HTTP date
Location               String                Bucket location
x-ozone-request-id     Hexadecimal string    Request ID for ozone call
x-ozone-server-name    Server name           Ozone server that replied to this request


Error Codes

Error Code    Value                    Comments
307           Redirect                 Redirect URL
400           Invalid bucket name      Invalid characters or length in bucket name
400           Invalid volume name      Invalid characters in the volume
400           Missing version          Missing x-ozone-version
400           Missing date             Missing date header
400           Bad date                 Unable to parse date
401           Unauthorized             Access token is missing or invalid
404           No such volume           Volume not found
404           No such bucket           Bucket not found
500           Internal server error    Server error

The examples show various payloads. They are all declared under
org.apache.hadoop.ozone.web.response. Please look up that package if you are working with the
REST protocol directly. Ozone errors are JSON formatted; here is an example of an ozone error.
{
"httpCode":400,
"shortMessage":"invalidKey",
"resource":"/volume/bucket/key",
"message":"Invalid key",
"requestID":"bd3244e8-de4c-42ee-9b16-9812734a7cbd",
"hostName":"aglon.ozone.self"
}
The error class and the table of error codes can be found under org.apache.hadoop.ozone.web.exceptions.
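On the client side, the JSON error body shown above can be turned into an exception. This is a minimal sketch using the field names from the example; the Python `OzoneError` class below is hypothetical and is not the Java error class from the exceptions package.

```python
import json


class OzoneError(Exception):
    """Hypothetical client-side exception built from an ozone JSON error body."""

    def __init__(self, http_code, short_message, resource, message,
                 request_id, host_name):
        super().__init__(f"{http_code} {short_message}: {message} ({resource})")
        self.http_code = http_code
        self.short_message = short_message
        self.resource = resource
        self.message = message
        self.request_id = request_id
        self.host_name = host_name

    @classmethod
    def from_json(cls, payload: str) -> "OzoneError":
        """Build the exception from the JSON text of an error reply."""
        doc = json.loads(payload)
        return cls(doc["httpCode"], doc["shortMessage"], doc["resource"],
                   doc["message"], doc["requestID"], doc["hostName"])


SAMPLE_ERROR = """
{
  "httpCode": 400,
  "shortMessage": "invalidKey",
  "resource": "/volume/bucket/key",
  "message": "Invalid key",
  "requestID": "bd3244e8-de4c-42ee-9b16-9812734a7cbd",
  "hostName": "aglon.ozone.self"
}
"""
err = OzoneError.from_json(SAMPLE_ERROR)
print(err)
```

Keeping the request ID and host name on the exception makes it easier to match a client-side failure with the server that reported it.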

4. Customization

Ozone is designed with customization in mind. Other than configuration keys, there are three major points of customization. They are all defined under org.apache.hadoop.ozone.web.interfaces.
The first interface allows end users to define a custom authentication scheme. UserAuth is used
by the system to discover the identity of the user. Implementing this interface and tweaking
UserHandlerBuilder allows you to plug your own user authentication scheme into ozone.
Ozone currently ships with an authenticator called Simple.
The second interface allows end users to plug into ozone accounting. This is achieved by
implementing the Accounting interface. Ozone invokes this interface for every successful put
and delete.
The third interface is the file system interface, which in most cases end users should not
touch. It is defined under StorageHandler. There are two implementations of this interface:
LocalStorageHandler, which speaks to the local file system, and DistributedStorageHandler,
which talks to the container layer that powers ozone. If you need to intercept ozone requests,
this interface is a good place to do so.

Acknowledgments

This document was prepared using a LaTeX template called "The Legrand Orange Book",
written by Mathias Legrand and licensed under Creative Commons [1]. The cover photo was taken
by an unknown photographer and is licensed under Creative Commons Zero [2]. The cover photo is
a modified image downloaded from this URL [3]. Early drafts of this book were written using
the Overleaf online LaTeX editing service [4].
Ozone is an ongoing project. It was conceived by Jitendra Pandey, Sanjay Radia and Suresh
Srinivas. It is mainly designed and developed by Jitendra Pandey, Arpit Agarwal, Chris Nauroth,
Jing Zhao, Tsz Wo Nicholas Sze and Anu Engineer.
The Apache community has been very helpful and we were supported by comments and
contributions from Kanaka Kumar Avvaru, Edward Bortnikov, Thomas Demoor, Nick Dimiduk,
Chris Douglas, Jian Fang, Lars Francke, Gautam Hegde, Lars Hofhansl, Jakob Homan, Virajith
Jalaparti, Charles Lamb, Steve Loughran, Haohui Mai, Colin Patrick McCabe, Aaron Myers,
Owen O'Malley, Liam Slusser, Jeff Sogolov, Enis Soztutar, Andrew Wang, Fengdong Yu,
Zhe Zhang, khanderao and others. Conversations with these community members have been
instrumental for many critical features of ozone. If I have inadvertently missed anyone I apologize
for the oversight.
I also want to thank Chris Nauroth & Arpit Agarwal for answering my numerous questions
and introducing me to the beautiful world of hadoop. If you find any errors or other issues in this
document, please email me at aengineer AT hortonworks DOT com.

1 http://creativecommons.org/licenses/by-nc-sa/3.0/

2 https://creativecommons.org/publicdomain/zero/1.0/
3 https://static.pexels.com/photos/3853/sunny-sand-desert-hiking.jpeg
4 https://www.overleaf.com
