Anu Engineer
Table of Contents
1.1 Introduction
1.2 Object Store Overview
1.3 Ozone's Storage Elements
1.3.1 Common REST Headers
1.3.2 Common Reply Headers
1.3.3 Storage Volumes
1.3.4 Buckets
1.3.5 Keys
1.3.6 Authentication & Access Control
1.3.7 Accounting
2. Using Ozone
2.1 Setting up Ozone
2.2 Command Line Interface
2.2.1 Volume commands
2.2.2 Bucket Commands
2.2.3 Key commands
2.3 Client Libraries
2.4 Managing Ozone
2.4.1 Cluster Level
2.4.2 Server Level
2.4.3 User Level
2.4.4 Volume Level
2.4.5 Bucket Level
2.4.6 Key Level
2.4.7 Request Level
2.4.8 Logging
3. REST Protocol
3.1 Storage Volumes
3.1.1 Create Volume
3.1.2 Update Volume
3.1.3 Delete Volume
3.1.4 Info Volume
3.1.5 List Volumes
3.2 Buckets
3.2.1 Create bucket
3.2.2 Update bucket
3.2.3 Delete Bucket
3.2.4 Info Bucket
3.2.5 List Bucket
3.3 Keys
3.3.1 Put Key
3.3.2 Get Key
3.3.3 Delete Key
3.3.4 Info Key
3.3.5 List Keys
4. Customization
1.1 Introduction
Ozone is an Amazon S3-like object store. It is a redundant, distributed object store built by
leveraging primitives present in HDFS. Ozone provides a REST interface similar to Amazon
S3. The primary design point of ozone is scalability, and its aim is to scale to trillions of objects.
However, the guarantees and user-visible behavior of an object store are different from those of
a distributed file system like HDFS.
1.2 Object Store Overview
This is the user documentation needed to work with ozone. If you are looking for the design
of ozone, please see the ozone architecture document in Apache JIRA HDFS-7240.
Ozone is a work in progress, so please do not assume that this is the final draft of the ozone
spec.
1.3 Ozone's Storage Elements
1.3.1 Common REST Headers
1.3.2 Common Reply Headers
1.3.3 Storage Volumes
A storage volume represents a unit of ownership in the ozone world. Once ozone is installed, an
administrator [5] has to create a storage volume for a user. The notions of user and group
are defined outside ozone. A user or a group might exist as a Kerberos principal or as some
other sort of name. Please see the customization section to understand how to extend ozone
authorization to use custom schemes.
1.3.4 Buckets
Buckets are similar to directories in a real file system, except that buckets cannot be nested.
They hold keys and values, or file names and data.
1.3.5 Keys
Keys are unique entities in a bucket. The value part of a key is the data stream.
1.3.6 Authentication & Access Control
Right now ACLs are supported only at the bucket level. We plan to extend ACL support to
volumes and keys in later versions of ozone.
[5] See the simple authentication section to understand what an administrator means in that context.
[6] This is the default insecure protocol, called simple. Please do not use this if you want a secure ozone cluster.
[7] Yes, this is arbitrary. There is nothing critical in the system forcing us to use 32. The only limit is that we would like
to keep HTTP headers under 8 KB.
[8] Many Unix systems follow this pattern; we are eager to hear if this causes any issues for you.
1.3.7 Accounting
Ozone intends to provide built-in accounting capabilities, and it follows the AWS model of
accounting [9]. It calculates how many byte-hours of storage have been used by a storage volume.
In other words, it computes how many bytes have been stored, and for how long, in an ozone cluster.
This can be extended via Java interfaces or read from ozone using the REST API [10].
9 Please
10 TBD
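As a rough illustration of the byte-hours idea described above, the sketch below multiplies each stored object's size by the time it was kept and sums the results for one volume. The names StoredInterval and byte_hours are hypothetical and are not part of ozone; the record shape is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass
class StoredInterval:
    """One object's residency in a volume (hypothetical record shape)."""
    size_bytes: int
    hours_stored: float


def byte_hours(intervals):
    """Sum size * duration over every interval charged to one volume."""
    return sum(i.size_bytes * i.hours_stored for i in intervals)


# A 1 GiB object kept for 24 hours plus a 512 MiB object kept for 6 hours.
usage = [
    StoredInterval(size_bytes=1024 ** 3, hours_stored=24),
    StoredInterval(size_bytes=512 * 1024 ** 2, hours_stored=6),
]
total = byte_hours(usage)
```

Under this model a small object stored for a long time and a large object stored briefly can contribute the same amount to a volume's usage.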
2. Using Ozone
2.1 Setting up Ozone
Setting up ozone is simple. If you have a hadoop cluster [1], ozone can be enabled by setting
these two properties in hdfs-site.xml.
<property>
  <name>ozone.enabled</name>
  <value>true</value>
</property>
<property>
  <name>ozone.handler.type</name>
  <value>local</value>
</property>
The ozone.enabled key enables ozone functionality. The second key, ozone.handler.type, tells
ozone which storage handler to use. By default we ship with two handlers, the distributed handler
and a local handler. The local handler is strictly for testing. Since ozone is still a work in progress [2],
this document uses local for all illustration purposes. The ozone local handler mimics an object store
under /tmp or a user-defined directory.
Once ozone is enabled, ozone's REST server listens on the datanode HTTP port. The following
examples assume that we have an ozone cluster called http://ozone.self.
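One way to double-check the two properties before restarting a datanode is to parse the configuration file and inspect the values. The following is only a sketch: the sample file content, the enclosing configuration element, and the helper names read_properties and ozone_is_configured are assumptions for illustration, and the accepted handler values (local, distributed) are taken from the text of this guide.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; only the two ozone properties come from the text.
HDFS_SITE = """<configuration>
  <property>
    <name>ozone.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ozone.handler.type</name>
    <value>local</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a {name: value} dict for every <property> element."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def ozone_is_configured(props):
    """True when ozone is enabled and a known handler type is selected."""
    return (
        props.get("ozone.enabled") == "true"
        and props.get("ozone.handler.type") in ("local", "distributed")
    )


properties = read_properties(HDFS_SITE)
configured = ozone_is_configured(properties)
```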
[1] On the HDFS-7240 branch.
[2] This means that the distributed handler is not fully functional yet.
[3] We deliberately chose not to tie this to an account since historically HDFS is deployed in both secure and insecure
formats.
2.2 Command Line Interface
2.2.1 Volume commands
Volume commands are used to manage storage volumes. Other than list volume and info volume,
all volume commands need administrator privileges.
Create Volume
To create a volume you need to be an admin and specify who the owner of the volume is.
# Creates a volume called shire that is owned by user bilbo
hdfs oz -createVolume http://ozone.self:50075/shire -user bilbo -quota 100TB -root
Please note the usage of -root. If you specify this parameter then the calling user is set to
hdfs; otherwise the caller name defaults to the currently logged-in user [4].
Update Volume
Info Volume
The info volume command allows the owner or the administrator of the cluster to read metadata
about a specific volume.
# Get information about a volume that has been created
hdfs oz -infoVolume http://ozone.self:50075/shire -root
List Volumes
The list volume command can be used by an administrator to list the volumes of any user. It can
also be used by a user to list the volumes they own.
# List all volumes owned by user bilbo
hdfs oz -listVolume http://ozone.self:50075/ -user bilbo -root
[4] This just makes the life of a developer easy; we may not ship this feature.
2.2.2 Bucket Commands
Bucket commands follow a pattern similar to volume commands. However, bucket commands
are designed to be run by the owner of the volume. The following examples assume that these
commands are run by the owner of the volume or bucket.
Create Bucket
2.2.3 Key commands
A set of commands to manage keys (objects) in an ozone bucket.
Put Key
Creates or overwrites a key in the ozone store. The -file option points to the file you want to upload.
# Uploads a file to ozone bucket
hdfs oz -putKey http://ozone.self:50075/shire/rings/narya
-file
Get Key
-file
narya
Delete Key
2.3 Client Libraries
Ozone ships with a default Java client library [5]. Please look at the REST protocol section if you
would like to write your own client library.
The ozone Java client library consists of four main classes: OzoneClient, OzoneVolume,
OzoneBucket and OzoneKey. You start communicating with the server using OzoneClient, and
from the client you can get to a volume and so on.
Here is a simple example of how to use the ozone client library.
import org.apache.hadoop.ozone.web.client.OzoneClient;
import org.apache.hadoop.ozone.web.client.OzoneVolume;
import org.apache.hadoop.ozone.web.client.OzoneBucket;
2.4 Managing Ozone
The hardest part of running any service is managing it. Ozone comes with a set of management
interfaces out of the box. The management interface of ozone is built on a notion called "levels
& pillars". Each metric or user-facing configuration is tagged with [6] its level and pillar. Ozone
currently defines these levels: Cluster, Server, User, Volume, Bucket, Key and Request.
The pillars of ozone's management are Faults, Configuration, Accounting, Performance and
Security.
We hope that this kind of tagging will make it easy for a user to locate a configuration or
a metric [7]. Say we have an "access denied" error; this kind of classification
allows the end user to quickly find all configuration and metrics that deal with security at the
required level. You can think of this as a matrix of configuration and metrics for ozone. Please
look at the cheat sheet provided at the end of this section to see an example.
The following sections describe the pillars for each level of ozone and what metrics and configuration
are provided.
2.4.1 Cluster Level
Faults
At the cluster level we capture a set of counters that represent faults. They are:
1. Total error count - Total number of errors in the cluster for the last hour.
2. Top servers with errors - Top x (say 20) servers with the highest error counts.
3. Straggler nodes - Nodes which are slower by three standard deviations from the mean
response time for gets and puts.
These error counts give the administrator a view of, or indicate a trend in, how healthy
ozone currently is.
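The straggler rule in the list above can be sketched as a z-score check: a node is flagged when its response time sits more than three standard deviations above the cluster mean. This is only an illustration of the rule as stated; the function name find_stragglers, the node names and the sample timings are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def find_stragglers(response_ms, threshold=3.0):
    """Return nodes slower than the cluster mean by more than `threshold` deviations."""
    values = list(response_ms.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every node responds equally fast, so nothing stands out.
        return []
    return sorted(
        node for node, value in response_ms.items()
        if (value - mu) / sigma > threshold
    )


# Twenty healthy nodes at 10 ms and one slow node at 100 ms (made-up numbers).
samples = {f"dn{i}": 10.0 for i in range(1, 21)}
samples["dn21"] = 100.0
slow_nodes = find_stragglers(samples)
```

Note that with very few nodes a single outlier cannot exceed three standard deviations, so a rule like this only becomes meaningful on clusters of a dozen nodes or more.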
6 Ideally
7 Lessons
Configuration
We have a number of configuration parameters for ozone, but two of them are important, since
correct configuration values are required to enable ozone.
1. ozone.enabled - Must be set to true for ozone to work.
2. ozone.handler.type - Must be set to local or distributed.
Accounting
Accounting in the context of ozone reflects a chargeable quantity. For example,
Amazon charges for bytes stored over time, for the number of requests, or for bytes read from
the bucket. Since there are no chargeable entities at the cluster level, there is no accounting
at the cluster level.
Performance
Security
If security is enabled for ozone, it is configured through hdfs-site.xml.
Ozone will be able to leverage things like Kerberos from the current HDFS configuration.
2.4.2 Server Level
Faults
At the server level we capture a set of counters that represent faults. They are:
1. Total errors - Total errors reported by this server for the last hour.
2. Errors by error code - Errors broken down by error code. This is useful since "internal
server error" requires a different admin response from "access denied". Please see the REST
protocol section for details on error codes.
Configuration
We hope that we will not need any server-level configuration, or that whatever we need is already
present in hdfs-site.xml. An example of server-specific configuration is _HOST, used by
Kerberos-enabled clusters.
Accounting
Server is a shared resource, so accounting at this level does not make sense.
Performance
2.4.3 User Level
Other than security, nothing in ozone deals with the user. We rely on the UserAuth interface to tell us
the user's identity. The default UserAuth handler is Simple, which means the user is deploying
an insecure cluster [8].
ozone.user.provider - Set this value to use a different provider. If it is not set,
ozone uses org.apache.hadoop.ozone.web.userauth.Simple as the default provider [9].
2.4.4 Volume Level
Faults
Since volumes can span multiple physical machines, it is hard to keep faults aggregated by
volume. Hence ozone currently exposes no fault counters at the volume level.
Configuration
1. Quota - Allows admins to specify how much disk space can be consumed by this volume.
2. Owner - Specifies which user owns this volume. Please see the REST protocol section for
more details.
Accounting
1. Bytes per period - All accounting is done at the volume level. We store how
many bytes are stored per day at the volume level. Ozone emits an accounting entry
whenever we put or delete a blob. If the put overwrites an existing blob, we emit
two entries, as if the older blob was deleted and a new one was added. These numbers are
aggregated and exposed as accounting entries. Please note that ozone accounts per
volume and not per user.
2. Requests - This is something that we are planning to do in the future; it will capture how
many requests are made to a volume.
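The put, delete and overwrite rule described in the list above can be sketched with an in-memory ledger. This is not ozone code: the functions put_blob, delete_blob and bytes_in_volume, and the tuple layout of a ledger entry, are hypothetical and exist only to show that an overwrite produces two entries.

```python
def put_blob(store, ledger, volume, key, size):
    """Record a put; an overwrite is logged as a delete of the old blob plus an add."""
    old_size = store.get((volume, key))
    if old_size is not None:
        ledger.append((volume, "delete", -old_size))
    ledger.append((volume, "put", size))
    store[(volume, key)] = size


def delete_blob(store, ledger, volume, key):
    """Record a delete as a negative byte delta for the owning volume."""
    size = store.pop((volume, key))
    ledger.append((volume, "delete", -size))


def bytes_in_volume(ledger, volume):
    """Aggregate the entries of one volume; accounting is per volume, not per user."""
    return sum(delta for vol, _, delta in ledger if vol == volume)


store, ledger = {}, []
put_blob(store, ledger, "shire", "rings/narya", 100)
put_blob(store, ledger, "shire", "rings/narya", 250)  # overwrite: two entries
current_bytes = bytes_in_volume(ledger, "shire")
```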
Performance
Security
Volume Ops - Create, delete and update are only possible for cluster admins.
2.4.5 Bucket Level
Faults
It might be interesting to maintain fault counters at the bucket level. However, ozone currently has
no counters at this level.
Configuration
1. Bucket Version Support - Enabled or Disabled. When implemented, this will store
versions of files.
2. Bucket Storage Type - DISK, SSD or ARCHIVE. Standard types supported by HDFS.
Accounting
Security
2.4.6 Key Level
Nothing; currently we support no features at the key level. At some point we would like to support
policies and ACLs on keys.
2.4.7 Request Level
We log at INFO level for each request failure, so users can correlate that with the ozone request ID and find out what went wrong.
So, given pillars and levels, we can think of ozone's management interface as a simple matrix,
which looks like this.
Level    Faults                   Configuration                     Accounting  Performance              Security
Cluster  1. Error count           1. ozone.enabled                  N/A         1. Bytes read            Kerberos
         2. Server errors         2. ozone.handler.type                         2. Bytes written         HTTPS certificate
                                                                                3. Put and delete count
                                                                                4. Used capacity
                                                                                5. Straggler nodes
Server   1. Total errors          N/A                               N/A         1. Bytes read            N/A
         2. Errors by error code                                                2. Bytes written
                                                                                3. Request count
User     N/A                      UserAuth Handler, default Simple  N/A         N/A                      UserAuth Handler
Volume   N/A                      1. Quota                                      N/A                      Admin only
                                  2. Owner
Bucket   N/A                      1. Version Support                            N/A                      ACLs
                                  2. Storage Type
Key      N/A                      N/A                               N/A         N/A                      N/A
Request  Logging                  N/A                               N/A         N/A                      N/A
2.4.8 Logging
By default ozone writes its logs to ozone.log in the default hadoop logging directory. This is the
standard location where you would look for datanode/namenode logs. Here is an example of
what ozone.log looks like:
2016-03-23 15:01:39,325 INFO getVolumeInfo bilbo/vol1 bilbo 7407bd2d-d0fd-425c-824c-9dceed032170 - Success
2016-03-23 17:56:00,697 INFO createVolume bilbo/shire hdfs 920cef58-b7cf-4314-9ec3-249a1c7e95cb - Returning exception. ex: {"httpCode":409, "shortMessage":"volumeAlreadyExists","resource":"bilbo/shire","message":null, "requestID":"920cef58-b7cf-4314-9ec3-249a1c7e95cb", "hostName":"aglon.ozone.self"}
The fields of the log are timestamp, level, ozone function, resource name, user, request ID
and message.
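Given the field order listed above, a log entry can be split into its parts with a few lines of code. The sketch below assumes each entry occupies a single line with whitespace-separated fields and a " - " separator before the message, as in the sample; the names OzoneLogEntry and parse_log_line are hypothetical.

```python
from typing import NamedTuple


class OzoneLogEntry(NamedTuple):
    timestamp: str
    level: str
    function: str
    resource: str
    user: str
    request_id: str
    message: str


def parse_log_line(line):
    """Split one ozone.log line into the seven fields named in the text."""
    # The timestamp spans two tokens (date and time), so eight pieces in total.
    date, time, level, function, resource, user, request_id, rest = line.split(maxsplit=7)
    message = rest[2:] if rest.startswith("- ") else rest
    return OzoneLogEntry(f"{date} {time}", level, function, resource, user, request_id, message)


entry = parse_log_line(
    "2016-03-23 15:01:39,325 INFO getVolumeInfo bilbo/vol1 bilbo "
    "7407bd2d-d0fd-425c-824c-9dceed032170 - Success"
)
```

The request_id field is what an administrator would match against the x-ozone-request-id value returned to the client.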
3. REST Protocol
This chapter covers ozone's REST protocol in detail. Unless you are developing an ozone
client or are really interested in understanding the protocol, you can safely skip this chapter.
Tools like the oz shell or the client library take care of speaking this protocol.
3.1 Storage Volumes
3.1.1 Create Volume
POST /{volume}
Here is a working example of this request. The following call creates a storage volume called
shire, whose owner is bilbo. To specify the owner of the volume we use an ozone-specific header
called x-ozone-user.
POST /shire
201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-request-id : 920cef58-b7cf-4314-9ec3-249a1c7e95cb
x-ozone-server-name : aglon.ozone.self
An optional query parameter is used to specify a quota for this volume.
POST /shire?quota=10TB
201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-quota: 10 TB
x-ozone-request-id : 759d5028-af9c-4bd6-9ddf-2b1f88569e86
x-ozone-server-name : aglon.ozone.self
In the above example the admin was able to set a quota of 10 TB on a storage volume called
shire. In ozone we associate quotas with storage volumes and not with user accounts. If no quota
is specified then none is presumed to exist.
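The create-volume call above can be assembled with the Python standard library as follows. This is a sketch under stated assumptions: the helper name build_create_volume is hypothetical, the port 50075 is borrowed from the command line examples in chapter 2, the Date header uses the standard HTTP date format, and the Authentication header is left out because its exact value is not spelled out here. The request object is only built, not sent.

```python
from email.utils import formatdate
from urllib.parse import urlencode
from urllib.request import Request


def build_create_volume(base_url, volume, owner, quota=None):
    """Build (but do not send) a POST /{volume} request for the given owner."""
    url = f"{base_url}/{volume}"
    if quota is not None:
        # The quota travels as an optional query parameter, e.g. ?quota=10TB.
        url += "?" + urlencode({"quota": quota})
    headers = {
        "Date": formatdate(usegmt=True),  # mandatory for all ozone requests
        "x-ozone-version": "v1",          # mandatory for all ozone requests
        "x-ozone-user": owner,            # volumes need owners
    }
    return Request(url, headers=headers, method="POST")


request = build_create_volume("http://ozone.self:50075", "shire", "bilbo", quota="10TB")
```

Passing the resulting object to urllib.request.urlopen would issue the call against a live cluster; a 201 CREATED reply with a Location header is what the example above shows on success.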
Query Parameters
Param  Value  Comments
Quota
Two headers are mandatory for all ozone requests. They are Date and x-ozone-version.
For individual requests only special headers are highlighted as required. We are aware
of RFC 6648 [1], which deprecates the use of the x- prefix in header names. However, most users
of the HTTP protocol are familiar with reading x- as a custom parameter, hence we have chosen
to use it in ozone.
[1] http://tools.ietf.org/html/rfc6648
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
x-ozone-user     User name           Required - Volumes need owners
Reply Headers
Header               Comments
Date                 HTTP date
Location             Volume location
x-ozone-quota        If a valid quota parameter was specified
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value
400         Malformed quota
400         Invalid volume name
400         Missing version
400         Missing date
400         Bad date
401         Unauthorized
403         Access denied
404         User not found
409         Volume already exists
500         Internal server error
3.1.2 Update Volume
/shire
x-ozone-user : frodo
x-ozone-version : v1
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-request-id : ddb48938-3a19-4287-aafc-d0751ed9c2c6
x-ozone-server-name : aglon.ozone.self
Changing the ownership of a volume does not change the bucket ACLs; it just changes
the ownership of the storage volume. However, all buckets have the storage volume owner
with full access rights in the ACL set, so when you change storage volume ownership the
new owner will be able to read and write ACLs if needed.
Query Parameters
Param  Comments
Quota  Optional
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP date
x-ozone-version  v1                  Ozone protocol version
x-ozone-user     User name           Optional - used to change volume ownership
Reply Headers
Header               Comments
Date                 HTTP date
Location             Volume location
x-ozone-quota        If a quota change was requested
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value
400         Malformed quota
400         Invalid volume name
400         Missing version
400         Missing date
400         Bad date
401         Unauthorized
403         Access denied
404         User not found
404         No such volume
500         Internal server error
3.1.3 Delete Volume
DELETE /{volume}
DELETE /shire
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : e194316a-6c3c-43f9-89d6-dea876e33e12
x-ozone-server-name : aglon.ozone.self
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP date
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value  Comments
400
400
400
400
401
404
404
409
500
3.1.4 Info Volume
GET /{volume}?info=volume
This is an example of an admin reading the volume info of another user.
GET /shire?info=volume
200 OK
Content-Length : 400
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire
x-ozone-request-id : 3230d711-f377-4b2d-ba6b-be3a9a0313ba
x-ozone-server-name : aglon.ozone.self
{
  "owner" : {
    "name" : "bilbo"
  },
  "quota" : {
    "unit" : "TB",
    "size" : 100
  },
  "volumeName" : "shire",
  "createdOn" : "Mon, Apr 04 2016 06:22:00 GMT",
  "createdBy" : "hdfs"
}
Query Parameters
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
Content-Length       Length of the payload
Location             Volume location
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value  Comments
400
400
400
400
401
404
404
500
3.1.5 List Volumes
Ozone allows three distinct ways to query volumes. The following table summarizes the different
methods. The action, scope and invoked-by columns describe the semantics of each
type of query.
Action  Scope           Invoked by  Query                                          Header
        of a user                   GET /?prefix=&max_keys=&prev_key=
        of a user                   GET /?prefix=&max_keys=&prev_key=              x-ozone-user: name
        in the cluster              GET /?prefix=&max_keys=&prev_key=&root_scan=1
/?prefix=<string>&max_keys=<int>&prev_key=<string>&root_scan=1
Since query parameters are optional, here is an example of a user listing the volumes they own.
GET
In this case the user is recognized and has two storage volumes called rings and
swords. This is what the reply would look like.
200 OK
content-length: 300
Content-Type: application/octet-stream
Date: Sat, 16 Apr 2016 20:06:29 GMT
x-ozone-request-id: 34e85523-80fa-459c-a150-95366fa84a92
x-ozone-server-name : aglon.ozone.self
{
  "volumes":[
    {
      "owner":{
        "name":"bilbo"
      },
      "quota":{
        "unit":"TB",
        "size":100
      },
      "volumeName":"rings",
      "createdOn":"Sat, 16 Apr 2016 19:19:21 GMT",
      "createdBy":"hdfs"
    },
    {
      "owner":{
        "name":"bilbo"
      },
      "quota":{
        "unit":"TB",
        "size":100
      },
      "volumeName":"swords",
      "createdOn":"Sat, 16 Apr 2016 19:19:21 GMT",
      "createdBy":"hdfs"
    }
  ]
}
The same API can be invoked by an admin by adding the x-ozone-user header and specifying the
user that the admin wants data about. The usage is similar to the Info storage volumes API.
Query Parameters
Param      Comments
prefix     Prefix string
max_keys   Maximum number of keys to return
prev_key   Last seen key
root_scan  Indicates cluster level listing
Please note that all query parameters for this call are optional.
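Because every query parameter is optional, a client only needs to append the ones it actually uses. The sketch below builds the list-volumes URL from the four parameters named above; the helper name list_volumes_url is hypothetical, and the port 50075 is an assumption taken from the command line examples in chapter 2.

```python
from urllib.parse import urlencode


def list_volumes_url(base_url, prefix=None, max_keys=None, prev_key=None, root_scan=False):
    """Build the GET URL for listing volumes, skipping parameters that are not set."""
    params = {}
    if prefix is not None:
        params["prefix"] = prefix
    if max_keys is not None:
        params["max_keys"] = max_keys
    if prev_key is not None:
        # Pass the last key seen on the previous page to continue a listing.
        params["prev_key"] = prev_key
    if root_scan:
        params["root_scan"] = 1
    query = urlencode(params)
    return f"{base_url}/?{query}" if query else f"{base_url}/"


first_page = list_volumes_url("http://ozone.self:50075", prefix="ri", max_keys=10)
next_page = list_volumes_url("http://ozone.self:50075", prefix="ri", max_keys=10, prev_key="rings")
```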
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP date
Content-Length       Length of the payload
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
401         Unauthorized           Access token is missing or invalid token
403         Access denied          Access denied
404         User not found         Invalid user name
500         Internal server error  Server error
3.2 Buckets
All buckets are created under a storage volume. Buckets can be created only by the owner of the
volume. Buckets support access control lists and storage types.
Action         Description
Create bucket  Creates a bucket
Update bucket  Updates a bucket
Delete bucket  Deletes a bucket if it is empty
Info bucket    Gets bucket info
List bucket    Lists all buckets in a volume
3.2.1 Create bucket
The create bucket API allows users to create a bucket. It supports a set of storage types. They are:
ARCHIVE
DEFAULT
DISK
RAM_DISK
SSD
You can specify the storage type when you create buckets. Storage types allow buckets to be
placed on volumes with specific characteristics.
API Details
POST /{volume}/{bucket}
The minimum length of a bucket name is 3 and the maximum length is 64.
Bucket names must contain only lowercase letters, numbers, dashes (-), underscores (_), and
dots (.).
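The naming rule above translates directly into a regular expression, which a client can use to reject bad names before calling the server. This is a sketch of the rule as written here, not ozone's own validation code; the names BUCKET_NAME_RE and is_valid_bucket_name are hypothetical.

```python
import re

# 3 to 64 characters drawn from lowercase letters, digits, dots, underscores and dashes.
BUCKET_NAME_RE = re.compile(r"[a-z0-9._-]{3,64}")


def is_valid_bucket_name(name):
    """Return True when the whole name satisfies the length and character rules."""
    return BUCKET_NAME_RE.fullmatch(name) is not None


checks = {
    "bag-end": is_valid_bucket_name("bag-end"),
    "Bag-End": is_valid_bucket_name("Bag-End"),
    "ab": is_valid_bucket_name("ab"),
}
```

A name that fails this check would be expected to produce the 400 "Invalid bucket name" error listed for this call.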
POST /shire/bag-end
201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location :/shire/bag-end
x-ozone-request-id : e0688666-fa07-45a1-a33e-7536560e96db
x-ozone-server-name : aglon.ozone.self
POST /shire/bag-end
201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location :/shire/bag-end
x-ozone-request-id : c7ca87f4-1d4e-4d49-bc34-74fa865951ce
x-ozone-server-name : aglon.ozone.self
Request Headers
Header                     Value               Comments
Authentication             OZONE name          Simple protocol
Date                       Date & Time in GMT  HTTP Date
x-ozone-version            v1                  Ozone protocol version
x-ozone-acl                <ozone acl>         Ozone ACLs
x-ozone-bucket-versioning  enabled/disabled    Optional, versioning [2]
x-ozone-storage-class      storage type        Optional, allows placement control
Reply Headers
Header               Comments
Date                 HTTP Date
Location             Bucket location
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                   Comments
307         Redirect                Redirect URL
400         Invalid bucket name     Invalid characters or length in bucket name
400         Invalid volume name     Invalid characters in the volume
400         Missing version         Missing x-ozone-version
400         Missing date            Missing date header
400         Bad date                Unable to parse date
400         Invalid ACL             Invalid ACL
400         Unknown storage class   Invalid storage class
400         Unknown versioning tag  Unknown versioning
401         Unauthorized            Access token is missing or invalid token
403         Insufficient quota      Quota exceeded
403         Access denied           Access denied
404         No such volume          Volume not found
404         User not found          Invalid user name in ACL
409         Bucket already exists   Bucket name should be unique under a volume
500         Internal server error   Server error
3.2.2 Update bucket
The update bucket API allows users to make changes to an existing bucket. This API allows
changing ACLs and versioning.
API Details
PUT /{volume}/{bucket}
ACLs are specified using the x-ozone-acl header. You specify whether you are adding or removing
an ACL with the appropriate prefix. You can also add or remove versioning from a bucket using
x-ozone-bucket-versioning : enabled and x-ozone-bucket-versioning : disabled. If you disable
versioning on a bucket then all versions other than the current version are deleted. This is a
destructive operation and old versions are irrecoverably lost.
PUT /shire/bag-end
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location :/shire/bag-end
x-ozone-request-id : 8f6db344-0654-11e6-a738-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
With this request we just added peregrin to the people who can read and write to the bag-end
bucket. At some point, we will support versioning of objects stored in a bucket. To enable
that feature (which does not exist now), all you need to do is add x-ozone-bucket-versioning :
enabled to the HTTP headers.
Request Headers
Header                     Value                Comments
Authentication             OZONE name           Simple protocol
Date                       Date & Time in GMT   HTTP Date
x-ozone-version            v1                   Ozone protocol version
x-ozone-acl                ozone acl            Ozone ACLs
x-ozone-bucket-versioning  enabled or disabled  Bucket versioning
Reply Headers
Header               Comments
Date                 HTTP Date
Location             Bucket location
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                   Comments
307         Redirect                Redirect URL
400         Invalid bucket name     Invalid characters or length in bucket name
400         Invalid volume name     Invalid characters in the volume
400         Missing version         Missing x-ozone-version
400         Missing date            Missing date header
400         Bad date                Unable to parse date
400         Invalid ACL             Invalid ACL
400         Unknown storage class   Invalid storage class
400         Unknown versioning tag  Unknown versioning
401         Unauthorized            Access token is missing or invalid token
403         Access denied           Access denied
404         No such volume          Volume not found
404         User not found          Invalid user name in ACL
404         No such bucket          Bucket not found
409         Bucket already exists   Bucket name should be unique under a volume
500         Internal server error   Server error
3.2.3 Delete Bucket
Deletes a bucket if it is empty.
API Details
DELETE /{volume}/{bucket}
DELETE /shire/bag-end
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : fd7b5c7c-f21e-4dad-8f61-3cb2271fddff
x-ozone-server-name : aglon.ozone.self
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP date
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid volume name    Invalid characters in volume
400         Invalid bucket name    Invalid characters in bucket
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
401         Unauthorized           Access token is missing or invalid token
403         Access denied          Access denied
404         User not found         Invalid user name
404         No such volume         Volume not found
404         No such bucket         Bucket not found
409         Bucket not empty       Bucket is not empty
500         Internal server error  Server error
3.2.4 Info Bucket
This API returns information about a given bucket.
API Details
GET /{volume}/{bucket}?info=bucket
GET /shire/bag-end?info=bucket
200 OK
Content-Length : 200
Date: Mon, Apr 04, 2016 06:22:00 GMT
Location : /shire/bag-end
x-ozone-request-id : 74ec29cd-136c-4b88-9291-f3c11a9a87f3
x-ozone-server-name : aglon.ozone.self
{
  "volumeName":"shire",
  "bucketName":"bag-end",
  "acls":[
    {
      "type":"USER",
      "name":"frodo",
      "rights":"READ_WRITE"
    },
    {
      "type":"USER",
      "name":"samwise",
      "rights":"READ_WRITE"
    }
  ],
  "versioning":"DISABLED",
  "storageType":"DISK"
}
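A client that receives the reply above can decode it with any JSON parser. The sketch below loads the same payload and extracts the users holding a given right; the field names come from the reply shown here, while the helper name users_with_rights is hypothetical.

```python
import json

# The info bucket reply body from the example above.
REPLY = """
{
  "volumeName": "shire",
  "bucketName": "bag-end",
  "acls": [
    {"type": "USER", "name": "frodo", "rights": "READ_WRITE"},
    {"type": "USER", "name": "samwise", "rights": "READ_WRITE"}
  ],
  "versioning": "DISABLED",
  "storageType": "DISK"
}
"""


def users_with_rights(bucket_info, rights):
    """Return the names of USER entries in the ACL list that hold `rights`."""
    return [
        acl["name"]
        for acl in bucket_info["acls"]
        if acl["type"] == "USER" and acl["rights"] == rights
    ]


info = json.loads(REPLY)
writers = users_with_rights(info, "READ_WRITE")
```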
Query Parameters
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
Content-Length       Length of the payload
Location             Bucket location
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid volume name    Invalid characters in volume
400         Invalid bucket name    Invalid characters in bucket
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
401         Unauthorized           Access token is missing or invalid token
404         User not found         Invalid user name
404         No such volume         Volume not found
404         No such bucket         Bucket not found
500         Internal server error  Server error
3.2.5 List Bucket
The list bucket API allows the user to list all buckets in a volume. Its syntax is very similar to that
of list volumes.
API Details
GET /{volume}?prefix=<string>&max_keys=<int>&prev_key=<string>
Since all the query parameters are optional, here is an example of a list bucket call without any
parameters.
GET /shire
200 OK
Content-Length: 722
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : fc8c8f3b-e52e-4095-8e3b-634210932c24
x-ozone-server-name : aglon.ozone.self
{
"buckets":[
{
"volumeName":"shire",
"bucketName":"bag-end",
"acls":[
{
"type":"USER",
"name":"frodo",
"rights":"READ_WRITE"
},
{
"type":"USER",
"name":"samwise",
"rights":"READ_WRITE"
}
],
"versioning":"DISABLED",
"storageType":"DISK"
},
{
"volumeName":"shire",
"bucketName":"hobbiton",
"acls":[
{
"type":"USER",
"name":"frodo",
"rights":"READ_WRITE"
},
{
"type":"USER",
"name":"samwise",
"rights":"READ_WRITE"
}
],
"versioning":"DISABLED",
"storageType":"SSD"
}
]
}
Query Parameters
Param     Comments
prefix    Prefix string
max_keys  Maximum number of results to return
prev_key  Last seen key
Please note that all query parameters for this call are optional.
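The query parameters above can be combined to page through a long bucket list. The sketch below is one possible client-side approach in Python, not ozone code. It assumes that prev_key takes the name of the last bucket already seen and that a page shorter than max_keys marks the end of the list; neither detail is stated in this guide. The names list_buckets_path, iter_buckets and fake_fetch are hypothetical, and fake_fetch stands in for a real HTTP call.

```python
from typing import Callable, Dict, Iterator, Optional
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def list_buckets_path(volume: str, prefix: Optional[str] = None,
                      max_keys: Optional[int] = None,
                      prev_key: Optional[str] = None) -> str:
    """Build the List Bucket path, leaving out parameters that are not set."""
    params: Dict[str, str] = {}
    if prefix is not None:
        params["prefix"] = prefix
    if max_keys is not None:
        params["max_keys"] = str(max_keys)
    if prev_key is not None:
        params["prev_key"] = prev_key
    path = "/" + quote(volume)
    if params:
        path += "?" + urlencode(params)
    return path


def iter_buckets(volume: str, fetch: Callable[[str], dict],
                 page_size: int = 100,
                 prefix: Optional[str] = None) -> Iterator[dict]:
    """Yield buckets page by page; `fetch` maps a path to a parsed JSON reply."""
    prev_key = None
    while True:
        page = fetch(list_buckets_path(volume, prefix, page_size, prev_key))
        buckets = page.get("buckets", [])
        yield from buckets
        if len(buckets) < page_size:  # assumption: a short page is the last page
            return
        prev_key = buckets[-1]["bucketName"]


def fake_fetch(path: str) -> dict:
    """Stand-in for the server: returns bucket names sorted after prev_key."""
    query = parse_qs(urlsplit(path).query)
    prev = query.get("prev_key", [""])[0]
    limit = int(query.get("max_keys", ["1000"])[0])
    names = [n for n in ["bag-end", "bywater", "hobbiton"] if n > prev]
    return {"buckets": [{"volumeName": "shire", "bucketName": n} for n in names[:limit]]}


names = [b["bucketName"] for b in iter_buckets("shire", fake_fetch, page_size=2)]
print(names)
```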
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
Location             Bucket location
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid bucket name    Invalid characters or length in bucket name
400         Invalid volume name    Invalid characters in the volume
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
401         Unauthorized           Access token is missing or invalid token
404         No such volume         Volume not found
500         Internal server error  Server error
3.3 Keys
Keys in ozone represent the objects that we store in ozone. The following actions are supported
on keys: Put key, Get key, Delete key, Info key and List keys.
3.3.1 Put Key
Allows the user to create or overwrite keys inside a bucket.
API Details
PUT /{volume}/{bucket}/{key}
Put overwrites an existing key of the same name, so it is the responsibility of the user to ensure
that the key does not already exist. Key names can be from 3 bytes up to a maximum length of
1024 bytes. The maximum size of data that can be stored in a single key is 5 GB. All valid URI [3]
characters can be used for keys.
PUT /shire/bag-end/palantir
data
201 CREATED
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : da547452-0595-11e6-afdd-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
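The limits on key names and data size can be checked on the client before a Put Key request is sent. The following Python sketch shows one way to do this. It is an illustration only: the function name check_put_key is hypothetical, the name length is measured on the UTF-8 encoding, and 5 GB is taken to mean 5 * 2^30 bytes, which this guide does not state.

```python
from typing import List

MIN_KEY_NAME_BYTES = 3
MAX_KEY_NAME_BYTES = 1024
# Assumption: "5 GB" is read as 5 * 2**30 bytes.
MAX_KEY_DATA_BYTES = 5 * 1024 ** 3


def check_put_key(key_name: str, data_length: int) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    name_length = len(key_name.encode("utf-8"))  # limits are stated in bytes
    if not MIN_KEY_NAME_BYTES <= name_length <= MAX_KEY_NAME_BYTES:
        problems.append(
            "key name is %d bytes, expected %d to %d"
            % (name_length, MIN_KEY_NAME_BYTES, MAX_KEY_NAME_BYTES)
        )
    if data_length > MAX_KEY_DATA_BYTES:
        problems.append(
            "data is %d bytes, maximum is %d" % (data_length, MAX_KEY_DATA_BYTES)
        )
    return problems


print(check_put_key("palantir", 1024))
print(check_put_key("ab", 1024))
```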
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Content-Length   length              Standard HTTP header
Content-MD5      file hash           Standard HTTP header
[3] Charset
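The Content-Length and Content-MD5 headers of a Put Key request can be derived from the data being uploaded. The sketch below assumes that Content-MD5 follows the usual HTTP convention (RFC 1864), that is, the base64 encoding of the binary MD5 digest. This guide only describes the value as a file hash, so treat the encoding as an assumption. The function names are hypothetical.

```python
import base64
import hashlib
from typing import Dict


def content_md5(data: bytes) -> str:
    """Base64 of the raw MD5 digest (assumed RFC 1864 encoding)."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def put_key_body_headers(data: bytes) -> Dict[str, str]:
    """Headers that describe the request body of a Put Key call."""
    return {
        "Content-Length": str(len(data)),
        "Content-MD5": content_md5(data),
    }


body_headers = put_key_body_headers(b"data")
print(body_headers)
```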
Reply Headers
Header               Comments
Date                 HTTP Date
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid volume name    Invalid characters in the volume
400         Invalid bucket name    Invalid characters in the bucket
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
400         Bad Digest             MD5 does not match
400         Invalid key            Key is invalid
401         Unauthorized           Access token is missing or invalid token
404         No such volume         Volume not found
404         No such bucket         Bucket not found
500         Internal server error  Server error
3.3.2 Get Key
Get Key allows the user to read an existing key.
API Details
GET /{volume}/{bucket}/{key}

GET /shire/bag-end/palantir
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid volume name    Invalid characters in the volume
400         Invalid bucket name    Invalid characters in the bucket
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
400         Invalid key            Key is invalid
400         Invalid range          Range is invalid
401         Unauthorized           Access token is missing or invalid token
404         No such volume         Volume not found
404         No such bucket         Bucket not found
404         No such key            Key not found
500         Internal server error  Server error
3.3.3 Delete Key
Deletes a key from ozone.
API Details
DELETE /{volume}/{bucket}/{key}

DELETE /shire/bag-end/palantir
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : f7c2992c-0597-11e6-9b1f-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid volume name    Invalid characters in the volume
400         Invalid bucket name    Invalid characters in the bucket
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
400         Invalid key            Key is invalid
401         Unauthorized           Access token is missing or invalid token
404         No such volume         Volume not found
404         No such bucket         Bucket not found
404         No such key            Key not found
500         Internal server error  Server error
3.3.4 Info Key
Info key provides detailed key information.
API Details
GET /{volume}/{bucket}/{key}?info=key

GET /shire/bag-end/palantir?info=key
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
x-ozone-request-id : 1a3ce592-0657-11e6-b55f-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
{
"keyName":"palantir",
"version":0,
"md5hash":"e6edf9e1cb57057502cdaafa998e1426",
"createdOn":"Mon, Apr 04, 2016 06:22:00 GMT ",
"size":1024
}
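The size and md5hash fields of the reply can be used to check that a downloaded key matches what the server holds. The Python sketch below assumes that md5hash is the hexadecimal MD5 digest of the key's data, which is what the 32-digit value in the example suggests but which this guide does not state. The function name matches_key_info is hypothetical.

```python
import hashlib


def matches_key_info(data: bytes, key_info: dict) -> bool:
    """Compare local bytes against the size and md5hash of an Info Key reply.

    Assumption: md5hash is the hex MD5 digest of the stored data.
    """
    if len(data) != key_info["size"]:
        return False
    return hashlib.md5(data).hexdigest() == key_info["md5hash"].lower()


# Build a reply in the same shape as the example, for locally generated data.
data = b"one ring to rule them all"
key_info = {
    "keyName": "palantir",
    "version": 0,
    "md5hash": hashlib.md5(data).hexdigest(),
    "size": len(data),
}

print(matches_key_info(data, key_info))
print(matches_key_info(data + b"!", key_info))
```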
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid volume name    Invalid characters in the volume
400         Invalid bucket name    Invalid characters in the bucket
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
400         Invalid key            Key is invalid
401         Unauthorized           Access token is missing or invalid token
404         No such volume         Volume not found
404         No such bucket         Bucket not found
404         No such key            Key not found
500         Internal server error  Server error
3.3.5 List Keys
List Keys allows users to list the contents of a bucket.
API Details
GET /{volume}/{bucket}?prefix=<string>&maxresult=<int>&prev_key=<string>

GET /shire/bag-end?prefix=ring&maxresult=100
200 OK
Date: Mon, Apr 04, 2016 06:22:00 GMT
Content-Length: 722
x-ozone-request-id : fadd901e-05a9-11e6-86b0-60f81dc39ce0
x-ozone-server-name : aglon.ozone.self
{
"Name": "bag-end",
"Prefix": "ring",
"prev_key": "",
"keyList":[
{
"version":0,
"md5hash":"e6edf9e1cb57057502cdaafa998e1426",
"createdOn":"Mon, Apr 04, 2016 06:22:00 GMT",
"size":1024,
"keyName":"ring.0"
},
{
"version":0,
"md5hash":"01c884d16f23e3da058100510967c6c8",
"createdOn":"Mon, Apr 04, 2016 06:22:00 GMT",
"size":1024,
"keyName":"ring.1"
}
]
}
Query Parameters
Param     Comments
prefix    Prefix string
max_keys  Maximum number of results to return
prev_key  Last seen key
Please note that all query parameters for this call are optional.
Request Headers
Header           Value               Comments
Authentication   OZONE name          Simple protocol
Date             Date & Time in GMT  HTTP Date
x-ozone-version  v1                  Ozone protocol version
Reply Headers
Header               Comments
Date                 HTTP Date
Location             Bucket location
x-ozone-request-id   Request ID for ozone call
x-ozone-server-name  Ozone server that replied to this request
Error Codes
Error Code  Value                  Comments
307         Redirect               Redirect URL
400         Invalid bucket name    Invalid characters or length in bucket name
400         Invalid volume name    Invalid characters in the volume
400         Missing version        Missing x-ozone-version
400         Missing date           Missing date header
400         Bad date               Unable to parse date
401         Unauthorized           Access token is missing or invalid token
404         No such volume         Volume not found
404         No such bucket         Bucket not found
500         Internal server error  Server error
The payloads shown in the examples are all declared under org.apache.hadoop.ozone.web.response.
Please look up that package if you are working with the REST protocol directly.
Ozone errors are JSON formatted. Here is an example of an ozone error.
{
"httpCode":400,
"shortMessage":"invalidKey",
"resource":"/volume/bucket/key",
"message":"Invalid key",
"requestID":"bd3244e8-de4c-42ee-9b16-9812734a7cbd",
"hostName":"aglon.ozone.self"
}
The error class and the table of error codes can be found under org.apache.hadoop.ozone.web.exceptions.
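A REST client can turn this JSON error into an exception of its own. The following Python sketch shows one way to do so. The class OzoneRestError and the function raise_for_ozone_error are hypothetical names for illustration; they are not the Java classes in org.apache.hadoop.ozone.web.exceptions. The sketch treats every status of 400 or above as an error and leaves redirects (307) to the caller.

```python
import json

# Error payload copied from the example above.
SAMPLE_ERROR = """
{
  "httpCode": 400,
  "shortMessage": "invalidKey",
  "resource": "/volume/bucket/key",
  "message": "Invalid key",
  "requestID": "bd3244e8-de4c-42ee-9b16-9812734a7cbd",
  "hostName": "aglon.ozone.self"
}
"""


class OzoneRestError(Exception):
    """Hypothetical client-side wrapper around an ozone JSON error."""

    def __init__(self, payload: dict):
        self.http_code = payload.get("httpCode")
        self.short_message = payload.get("shortMessage")
        self.resource = payload.get("resource")
        self.message = payload.get("message")
        self.request_id = payload.get("requestID")
        self.host_name = payload.get("hostName")
        super().__init__(
            "%s %s: %s" % (self.http_code, self.short_message, self.message)
        )


def raise_for_ozone_error(status: int, body: str) -> None:
    """Raise OzoneRestError for status >= 400; do nothing otherwise."""
    if status < 400:
        return
    try:
        payload = json.loads(body)
    except ValueError:
        # The body was not JSON; keep whatever text the server sent.
        payload = {"httpCode": status, "message": body}
    raise OzoneRestError(payload)


try:
    raise_for_ozone_error(400, SAMPLE_ERROR)
except OzoneRestError as error:
    print(error, "request", error.request_id, "on", error.host_name)
```

The request ID and host name are kept on the exception because they identify the call and the server that replied, which helps when reading server logs.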
4. Customization
Ozone is designed with customization in mind. Other than configuration keys, there are three
major points of customization. They are all defined under org.apache.hadoop.ozone.web.interfaces.
The first interface allows end users to define a custom authentication scheme. UserAuth is used
by the system to discover the identity of the user. Implementing this interface and tweaking
UserHandlerBuilder allows you to plug your own user authentication scheme into ozone. Ozone
currently ships with an authenticator called Simple.
The second interface allows end users to plug into ozone accounting. This can be achieved
by implementing the interface specified in Accounting. Ozone invokes this interface for every
successful put and delete.
The third interface is the file system interface, which in most cases end users should not
touch. It is defined under StorageHandler. There are two implementations of this interface:
LocalStorageHandler, which speaks to the local file system, and DistributedStorageHandler, which
talks to the container layer that powers ozone. If you need to intercept ozone requests, this
interface is a good place to do so.
Acknowledgments
This document has been prepared using a LaTeX template called "The Legrand Orange Book",
written by Mathias Legrand and licensed under Creative Commons [1]. The cover photo was taken
by an unknown photographer and is licensed under Creative Commons Zero [2]. The cover photo is a
modified image downloaded from this URL [3]. Early drafts of this book were written using the
Overleaf online LaTeX editing service [4].
Ozone is an ongoing project. It was conceived by Jitendra Pandey, Sanjay Radia and Suresh
Srinivas. It is mainly designed and developed by Jitendra Pandey, Arpit Agarwal, Chris Nauroth,
Jing Zhao, Tsz Wo Nicholas Sze and Anu Engineer.
The Apache community has been very helpful, and we were supported by comments and
contributions from Kanaka Kumar Avvaru, Edward Bortnikov, Thomas Demoor, Nick Dimiduk,
Chris Douglas, Jian Fang, Lars Francke, Gautam Hegde, Lars Hofhansl, Jakob Homan, Virajith
Jalaparti, Charles Lamb, Steve Loughran, Haohui Mai, Colin Patrick McCabe, Aaron Myers,
Owen O'Malley, Liam Slusser, Jeff Sogolov, Enis Soztutar, Andrew Wang, Fengdong Yu,
Zhe Zhang, khanderao and others. Conversations with these community members have been
instrumental for many critical features of ozone. If I have inadvertently missed anyone, I apologize
for the oversight.
I also want to thank Chris Nauroth and Arpit Agarwal for answering my numerous questions
and introducing me to the beautiful world of hadoop. If you find any errors or other issues in this
document, please email me at aengineer AT hortonworks DOT com.
[1] http://creativecommons.org/licenses/by-nc-sa/3.0/
[2] https://creativecommons.org/publicdomain/zero/1.0/
[3] https://static.pexels.com/photos/3853/sunny-sand-desert-hiking.jpeg
[4] https://www.overleaf.com