

WebDAV protocol analysis


LINGI2141 Project 2
Orban Pierre-Yves
I. INTRODUCTION

Web Distributed Authoring and Versioning (WebDAV) is an HTTP extension that facilitates collaboration between users managing files stored on a web server. Developed by the IETF, this set of methods, headers and content types auxiliary to HTTP/1.1 allows clients to perform remote web content authoring operations such as editing, changing and moving documents. In short, it turns the web into a writable medium.

WebDAV includes four important abstractions. First, it provides methods to create, remove and query metadata about web resources, such as their name, authors and publication date. Second, WebDAV includes the collection concept. A collection is a set of documents. The extension allows users to create collections and retrieve a hierarchical membership listing, like a directory listing in a file system. Third, WebDAV provides a locking feature: the protocol defines operations that keep more than one person from working on a document at the same time. Locking acts as overwrite protection and avoids the lost update problem, in which modifications are lost as first one author, then another, writes changes without merging the other author's changes. As described later, WebDAV provides two types of lock: exclusive and shared. Fourth, it provides namespace management: the ability to copy or move web resources within a server's namespace.

These four abstractions are manipulated through the WebDAV-specific HTTP methods and the extra HTTP headers used with them, both described later. WebDAV also uses XML for property names and for some request and response bodies. XML was chosen because it is a flexible, self-describing structured data format. Moreover, XML supports internationalization because it can encode information in ISO 10646 character sets.
WebDAV uses the HTTP status codes and adds some extra status codes for WebDAV-specific errors and responses.

The WebDAV protocol has several advantages. It runs over HTTP (by default on port 80), which usually passes through firewalls and NAT routers, whereas the FTP ports are often blocked on public networks. Furthermore, WebDAV can run over HTTPS and so provide secure file transfer. Finally, WebDAV support ships with the Apache HTTP server, so the protocol can be enabled quickly on an existing server, whereas FTP requires installing and configuring a separate FTP server.

One last word: in this document I will not describe the versioning operations, which are specified in [RFC3253]. I'll focus on the WebDAV protocol as specified in [RFC4918]. First, I'll explain the general structure and mechanisms of WebDAV. Then I'll explain some more technical parts of the protocol. Finally, I show some traces and conclude with a personal opinion on this file management protocol.

II. WEBDAV ABSTRACTIONS

The fourth abstraction, namespace management, is treated together with the third abstraction, the locking operations.

A. Properties

Properties, in WebDAV, are pieces of data that describe the state of a resource. They are also called metadata, i.e. data about data. Properties are name/value pairs. There are two kinds of properties: live and dead. A live property is one whose semantics and syntax are enforced by the server. For example, the value of the live property DAV:getcontentlength, the length of the entity returned by a GET request, is calculated automatically by the server. A dead property is one whose semantics and syntax are not enforced by the server. The server only records the value of a dead property; the client is responsible for keeping its syntax and semantics consistent.
For example, an author property can be a dead property. A property name is a universally unique identifier, and the value of a property is always a well-formed XML fragment. For example, a well-formed property is:
<D:creationdate>2011-11-23T16:37:23Z</D:creationdate>
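As a rough illustration of the name/value structure of a property, the fragment above can be split into its namespace, local name and value with Python's standard XML parser. This is only a sketch: the `xmlns:D="DAV:"` declaration is added here so that the fragment parses on its own, and the helper name `split_property` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment from the text, with the DAV: namespace declared so that it
# can be parsed in isolation (assumption: D is bound to "DAV:").
FRAGMENT = '<D:creationdate xmlns:D="DAV:">2011-11-23T16:37:23Z</D:creationdate>'


def split_property(fragment: str) -> tuple[str, str, str]:
    """Return (namespace, local name, text value) of a one-element property."""
    element = ET.fromstring(fragment)
    # ElementTree stores qualified names as "{namespace}local".
    namespace, _, local = element.tag[1:].partition("}")
    return namespace, local, (element.text or "").strip()


print(split_property(FRAGMENT))
```

The namespace and the local name together form the property name, which is what makes it universally unique.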

A particular property used in write operations is the entity tag (ETag), exposed as the DAV:getetag property. This tag is a unique identifier given by the server to a specific version of a resource. It is used to detect file modifications when two or more users try to change the same file.

B. Collections of Web Resources

As previously said, collections are sets of web resources. The goal of a collection resource is to model collection-like objects, such as file system directories, within a server's namespace. Like most namespace models, the HTTP URL namespace is hierarchical, with levels delimited by the / character. Note that only the scheme and host of a URL are case-insensitive; path segments are, in general, case-sensitive. Collection resources use this namespace to act as containers. At a minimum, a collection's state consists of a set of mappings between path segments and resources, and a set of properties on the collection itself. With the collection concept comes the notion of depth: a method applied to a collection can act on the collection only (depth 0), on the collection and its direct children (depth 1), or on the collection and all its descendants (depth infinity). Apart from being a container, a collection behaves like a non-collection resource: it has the same properties and may have some additional state.

C. Locking

Locks are an important abstraction provided by WebDAV. Locking a resource provides a mechanism for serializing access to it, so an authoring client can give a reasonable guarantee that another user will not modify the resource while it is being edited. Locks follow some simple rules: a lock either directly or indirectly locks a resource; each lock is identified by a single globally unique lock token; a server must not create conflicting locks on a resource; a resource becomes directly locked when a LOCK request on it succeeds; a lock with infinite depth on a collection locks all the collection's children and descendants.

There are two kinds of lock: exclusive and shared. Exclusive locks are the most basic form. While a resource is held under an exclusive lock, nobody but the lock holder can modify or lock it for the whole duration of the lock. An exclusive lock can be released by the lock holder, by a system administrator or by a timeout. This solution is not always the best: in some cases the lock serves rather as a way for users to indicate that they intend to exercise their access rights. For this purpose, WebDAV offers shared locks. Shared locks allow several users to hold a lock on the same resource. They can be used to limit access to a resource to a certain set of users. This kind of lock avoids problems when a user forgets to release an exclusive lock or when a lock is not released properly (for example when a program crashes).

When a user requests a lock, exclusive or shared, the server creates a unique lock associated with the requesting user. Thus, if three users request a shared lock on the same resource, the server grants three different locks and generates three lock tokens, each returned in a response header. A lock token is a type of state token that identifies a particular lock. Each lock has exactly one unique lock token.
The WebDAV protocol asks clients not to attempt to interpret lock tokens in any way. When a user takes a lock on a resource, he receives a lock token (if the resource is lockable). After that, for each operation on the locked resource, he has to submit the lock token in a request header. We'll see later how this is done technically. As previously said, a lock can be released by a timeout, so a lock may have a limited lifetime. The client can suggest this lifetime when it creates or refreshes the lock, but it is always the server that chooses the timeout value, whether or not it takes the client's proposal into account. The timeout counter must, however, be restarted whenever the lock holder successfully refreshes the lock. When the timeout expires, the lock should be released by the server. A user should nevertheless always release the lock explicitly at the end of the work rather than rely on the timeout. One last thing about locks in WebDAV: a resource does not have to be lockable; locking support is optional. So, before trying to lock a resource, a user can ask the server whether the resource is lockable. To do that, WebDAV provides a useful property which has to be supported by all DAV-compliant lockable resources:
DAV:supportedlock

Another interesting property lists all outstanding locks, describes their type, and may even provide the lock tokens. It can be useful for a user who wants to know who is locking a resource. This property is named:
DAV:lockdiscovery
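The locking rules of this section can be pictured with a small in-memory model. The sketch below is not how any particular server implements locking; the class `LockTable` and its methods are hypothetical names, timeouts and depth are left out, and tokens are simply generated as `urn:uuid:` URIs.

```python
import uuid


class LockTable:
    """Toy in-memory model of exclusive and shared write locks."""

    def __init__(self):
        # Maps a resource path to a list of (token, scope, owner) tuples.
        self._locks = {}

    def lock(self, path, owner, scope="exclusive"):
        """Grant a lock and return its token, or return None on conflict."""
        held = self._locks.setdefault(path, [])
        # An exclusive request conflicts with any existing lock; a shared
        # request conflicts only with an existing exclusive lock.
        if held and (scope == "exclusive"
                     or any(s == "exclusive" for _, s, _ in held)):
            return None
        token = f"urn:uuid:{uuid.uuid4()}"  # one unique token per lock
        held.append((token, scope, owner))
        return token

    def unlock(self, path, token):
        """Remove the lock identified by token; return True if it existed."""
        held = self._locks.get(path, [])
        remaining = [entry for entry in held if entry[0] != token]
        self._locks[path] = remaining
        return len(remaining) != len(held)

    def discover(self, path):
        """Rough analogue of DAV:lockdiscovery: list (scope, owner) pairs."""
        return [(scope, owner) for _, scope, owner in self._locks.get(path, [])]
```

With this model, two users asking for a shared lock on the same path each receive a different token, while a third user asking for an exclusive lock is refused until both shared locks are released.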

III. WEBDAV TECHNICALLY

Before going into the details of the WebDAV protocol, I would like to explain how I ran my tests and recorded packets. To avoid installing Apache on my personal computer, I signed up for an account on Box.net, which provides free data storage space with WebDAV support. I used BitKinex, a WebDAV client, and Nautilus, the Ubuntu file manager, which supports WebDAV, to interact with the server. I used Wireshark to capture the exchanged packets.

A. General overview

As previously said, WebDAV is an HTTP extension, so it uses the HTTP request and response format with some extra features. HTTP runs over a TCP connection, by default on port 80, and WebDAV does the same. An HTTP request has the form:
Command
Headers
[Empty line]
Request body
For example, a GET request on Google.be gives:
GET / HTTP/1.1\r\n
Host: www.google.be\r\n
Connection: keep-alive\r\n
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2\r\n
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n
Accept-Encoding: gzip,deflate,sdch\r\n
Accept-Language: fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3\r\n
[truncated] Cookie: HSID=Apz0jh; SID=DQ \r\n
[Full request URI: http://www.google.be/]

Unlike plain HTTP/1.1, which passes method parameters only in HTTP headers, WebDAV encodes method parameters either in an XML request entity body or in an HTTP header. Moreover, WebDAV responses that carry structured data are also encoded in XML. I have already explained why the IETF chose XML (see I. Introduction). Regarding resource paths in parameters or responses, WebDAV accepts both absolute and relative paths; the sender decides which form to use. To illustrate an XML response, here are two packets: in the first, the client asks the server for resource properties with PROPFIND (a method described later), and in the second, the server answers with an HTTP response carrying XML content.

PROPPATCH operates on the resource identified by the Request-URI. All DAV-compliant resources must support this method and must process the instructions it carries. The method requires the propertyupdate XML element in its message body. Once the modifications are done, the server reports in its response, for each property, a status code that indicates whether the update succeeded. Below is a PROPPATCH example which updates the file's author.
Hypertext Transfer Protocol
    PROPPATCH /dav/file.txt HTTP/1.1\r\n
        Request Method: PROPPATCH
        Request URI: /dav/file.txt
        Request Version: HTTP/1.1
    Host: www.box.net\r\n
    \r\n
eXtensible Markup Language
    <?xml
    <D:propertyupdate xmlns:D="DAV:">
        <D:set>
            <D:prop>
                <Z:Author>P-Y Orban</Z:Author>
            </D:prop>
        </D:set>
    </D:propertyupdate>
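A request body like the one in this trace could be generated as follows. This is a minimal sketch using Python's standard library; the function name `propertyupdate` is hypothetical, and because the trace does not show which namespace the `Z` prefix is bound to, the URI `http://example.org/ns/` is an assumption made purely for illustration.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
CUSTOM = "http://example.org/ns/"  # assumed namespace for the dead property


def propertyupdate(name: str, value: str) -> bytes:
    """Build a PROPPATCH body that sets one dead property to a text value."""
    # Ask ElementTree to serialize with the same prefixes as the trace.
    ET.register_namespace("D", DAV)
    ET.register_namespace("Z", CUSTOM)
    root = ET.Element(f"{{{DAV}}}propertyupdate")
    set_element = ET.SubElement(root, f"{{{DAV}}}set")
    prop = ET.SubElement(set_element, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{CUSTOM}}}{name}").text = value
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


print(propertyupdate("Author", "P-Y Orban").decode("utf-8"))
```

Removing a property would use a `D:remove` element in place of `D:set`, with an empty property element inside `D:prop`.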

HTTP PROPFIND request:


PROPFIND /dav HTTP/1.1\r\n
Connection: Keep-Alive\r\n
Host: www.box.com\r\n
User-Agent: Microsoft-WebDAV-MiniRedir/6.1.7601\r\n
Depth: 0\r\n
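To make the layout of such a request concrete, the sketch below assembles a PROPFIND request as raw bytes, with a Depth header and an XML body that names the wanted properties. It is an illustration only: `build_propfind` is a hypothetical helper, nothing is sent over the network, and the request in the trace above carries no body at all.

```python
def build_propfind(path: str, host: str, depth: str, props: list[str]) -> bytes:
    """Return a complete PROPFIND request (request line, headers, XML body)."""
    if depth not in ("0", "1", "infinity"):
        raise ValueError("Depth must be 0, 1 or infinity")
    # One empty element per requested property, all in the DAV: namespace.
    wanted = "".join(f"<D:{name}/>" for name in props)
    body = (
        '<?xml version="1.0" encoding="utf-8"?>'
        f'<D:propfind xmlns:D="DAV:"><D:prop>{wanted}</D:prop></D:propfind>'
    ).encode("utf-8")
    head = (
        f"PROPFIND {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Depth: {depth}\r\n"
        'Content-Type: application/xml; charset="utf-8"\r\n'
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # the empty line separates headers from the body
    ).encode("ascii")
    return head + body


print(build_propfind("/dav/", "www.box.com", "1", ["getetag", "resourcetype"]).decode())
```

Passing the depth as a string keeps the three legal values (0, 1, infinity) in one type and lets the function reject anything else.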

HTTP response (some lines were deliberately removed):


Hypertext Transfer Protocol
    HTTP/1.1 207 Multi-Status\r\n
    Server: nginx\r\n
    Content-Type: text/xml; charset="utf-8"\r\n
    Connection: keep-alive\r\n
    Content-Length: 750\r\n
    \r\n
eXtensible Markup Language
    <?xml
    <D:multistatus xmlns:D="DAV:">
        <D:response xmlns:D="DAV:">
            <D:href>/dav/</D:href>
            <D:propstat>
                <D:prop>
                    <D:creationdate>2011-11-26T17:39:51Z</D:creationdate>
                    <D:displayname>dav</D:displayname>
                    <D:getcontentlength>373595</D:getcontentlength>
                    <D:getcontenttype>httpd/unix-directory</D:getcontenttype>
                    <D:getetag>"62f88dfbd1be9c47ed87bcbaa47e64c1"</D:getetag>
                    <D:resourcetype><D:collection/></D:resourcetype>
                </D:prop>
                <D:status>HTTP/1.1 200 OK</D:status>
            </D:propstat>
        </D:response>
    </D:multistatus>
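A client has to walk this XML structure to find out what the server returned. The sketch below shows one way to do it with Python's standard library, on a shortened copy of the response above; the function name `list_resources` and the `SAMPLE` constant are introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"D": "DAV:"}

# Shortened version of the 207 Multi-Status body shown in the trace.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/dav/</D:href>
    <D:propstat>
      <D:prop>
        <D:displayname>dav</D:displayname>
        <D:resourcetype><D:collection/></D:resourcetype>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>"""


def list_resources(xml_bytes: bytes) -> list:
    """Return (href, is_collection, status) for each propstat of each response."""
    root = ET.fromstring(xml_bytes)
    results = []
    for response in root.findall("D:response", NS):
        href = response.findtext("D:href", namespaces=NS)
        for propstat in response.findall("D:propstat", NS):
            status = propstat.findtext("D:status", namespaces=NS)
            # A resource is a collection if resourcetype contains D:collection.
            collection = propstat.find("D:prop/D:resourcetype/D:collection", NS)
            results.append((href, collection is not None, status))
    return results


print(list_resources(SAMPLE))
```

Each `D:propstat` groups the properties that share one status, which is why the status is read per propstat rather than per response.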

3. MKCOL Method

The MKCOL method creates a new collection resource at the location specified by the Request-URI. If this URI is already mapped to a resource, the MKCOL request must fail. The request can have a body that describes, for example, the collection's members, the bodies of these members or properties on them. If there is no request body, the server should create only the collection, without any members. The server's response consists of an HTTP status code.

4. GET, HEAD, POST, PUT methods for Collections

B. Extra HTTP methods

The WebDAV protocol defines some extra HTTP methods to provide its writing and locking operations. After each request, the server responds with a status code and, in some cases, some data.

1. PROPFIND Method

GET, HEAD, POST and PUT are standard HTTP/1.1 methods. Their behavior does not change when they are used with WebDAV. However, a PUT request on a collection resource does not create a collection; it is treated as an error. Users have to use the MKCOL method to create collection resources.

5. DELETE Method

The PROPFIND method retrieves properties defined on the resource identified by the Request-URI. All DAV-compliant resources must support this method. The sender of a PROPFIND request may submit a Depth header with the value 0, 1 or infinity. If no Depth header is included, the server must treat the request as if a Depth: infinity header had been included. With this method, the client can submit a propfind XML element in the request body to say which information is requested. It can request particular property values by naming the desired properties, request all dead properties plus the live properties defined in the specification by using the allprop element, or request a list of the names of all the resource's properties by using the propname element.

2. PROPPATCH Method

The DELETE method is already defined by HTTP/1.1; WebDAV changes some of its handling requirements. When a server receives a DELETE request, it must destroy the locks rooted on the deleted resource and it must remove the mapping from the Request-URI to any resource. When DELETE is applied to a collection resource, the server must act as if a Depth: infinity header had been used. The client must not submit a Depth header with any value other than infinity.

6. COPY Method

The PROPPATCH method processes instructions specified in the request body to set and/or remove properties defined on the resource identified by the Request-URI.

The COPY method duplicates the source resource identified by the Request-URI to the destination resource identified by the URI in the Destination header. All DAV-compliant resources must support this method. COPY can be applied to both collection and non-collection resources. When the source is not a collection, COPY creates a new resource that is as similar as possible to the source. The result may still differ, because the destination environment may be different from the source environment due to factors outside the server's control. A COPY applied to a collection must use a depth of 0 or infinity. If there is no Depth header, the method acts with infinite depth. In both cases, after a successful copy, all dead properties of the source resource should be duplicated on the destination resource. Live properties are also duplicated, but not necessarily with the same values; servers should not convert live properties into dead properties. Finally, COPY accepts an Overwrite header. If its value is F and a resource already exists at the destination URI, the request must fail. Otherwise, the method overwrites the resource already present.

7. MOVE Method

The DAV header must appear on all OPTIONS responses. However, in my personal tests against the Box.net server, I never saw this header in the server's OPTIONS responses. Second, the Depth header, already mentioned, is used with methods applied to resources that can have internal members, to indicate the scope of the method. Another header already mentioned is the Destination header, which identifies the destination resource for methods like COPY or MOVE. The Lock-Token header carries a lock token: the server returns it in the response to a LOCK request, and the client sends it in an UNLOCK request. The Overwrite header tells the server whether a resource may overwrite another one (for the COPY and MOVE methods). The sixth header is the Timeout request header, which clients may include in their LOCK requests. Last but not least, WebDAV introduces the If header. The If header has two goals. The first, like an if statement in a programming language, is to make a request conditional.
In this case, the If header tests conditions on the specified resource. If at least one of the listed conditions evaluates to true, the request proceeds; otherwise the request must fail. The second goal is to indicate that the client knows the state token of a resource, by placing this state token in the request header. The If header is particularly useful for lock management.

IV. ANALYSIS OF SOME TRACES

In this section, based on traces, I'll show and explain some WebDAV resource operations such as locking a resource or updating it. In all the traces shown, some irrelevant lines are deliberately omitted to save space.

A. First contact with the server

To contact the server, the client sends a simple PROPFIND request. If the specified directory is protected, the server responds with an HTTP 401 (Unauthorized) error. I tested two different clients and saw a difference in the way they provide the user name and password. BitKinex uses HTTP Basic authentication, which sends the password merely Base64-encoded, i.e. effectively in clear, while Nautilus uses HTTP Digest authentication, which sends an MD5 hash computed from the password.

BitKinex PROPFIND request:
PROPFIND /dav/ HTTP/1.1\r\n
    Request Method: PROPFIND
    Request URI: /dav/
Host: www.box.net\r\n
User-Agent: BitKinex/3.2.3\r\n
Depth: 1\r\n
Authorization: Basic b3JiYW4ucGllcnJleXZlc0BnbWFpbC5jb206d3B0eWpvMjU=\r\n
    Credentials: orban.pierreyves@gmail.com:password
\r\n
eXtensible Markup Language
    <?xml
    <d:propfind xmlns:d="DAV:">
        <d:prop>
            <d:getlastmodified/>
            <d:getcontentlength/>
            <d:getcontenttype/>
            <d:resourcetype/>
            <d:getetag/>
            <d:lockdiscovery/>
        </d:prop>
    </d:propfind>
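The reason Basic authentication offers no confidentiality can be shown in a few lines: the header value is only the Base64 encoding of `user:password`, and anyone who captures it can reverse it. The sketch below uses the example credentials from the HTTP authentication RFC rather than those of the trace, and both function names are hypothetical.

```python
import base64


def basic_header(user: str, password: str) -> str:
    """Build the value of an Authorization header for Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic(header: str) -> tuple[str, str]:
    """Recover (user, password) from a Basic Authorization header value."""
    scheme, _, token = header.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    # Base64 is an encoding, not encryption: decoding needs no secret.
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


header = basic_header("Aladdin", "open sesame")
print(header)
print(decode_basic(header))
```

This is why Basic authentication is normally only acceptable over HTTPS.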

The MOVE method is the logical equivalent of a COPY, followed by consistency maintenance processing, followed by a DELETE of the source resource. These three steps are performed as a single atomic operation. The consistency maintenance step allows the server to perform the updates caused by the move.

8. LOCK Method

The LOCK method, as already said, locks a collection or non-collection resource. A LOCK request may include the Depth header to tell the server whether the lock covers a single resource (value 0) or applies recursively to all children and descendants of the identified resource (value infinity). If no Depth header is included, the server acts as if it had received an infinite depth. After a successful lock operation, the server's response must contain two important elements: a body with the value of the DAV:lockdiscovery property in a prop XML element, and the Lock-Token response header, which contains the token associated with the lock just granted. The LOCK method is also used to refresh a lock, simply by including the lock token in the If header. Some lock examples are shown later.

9. UNLOCK Method

The UNLOCK method removes the lock identified by the token given in the Lock-Token request header. The Request-URI must identify a resource within the scope of the lock. When the unlock operation is applied to a collection resource and any one of the locked resources cannot be unlocked, the whole request must fail.

C. Extra HTTP headers

Like the extra methods, the WebDAV protocol adds some extra HTTP headers. First, the DAV header indicates that the resource supports the DAV schema and protocol.

The PROPFIND request sent by Nautilus is the same, except for the authentication method:
Authorization: Digest username="orban.pierreyves@gmail.com",realm="box.net/dav",nonce="ce713a9f86ede27c0c6c6d7a2bf2d028",uri="/dav",cnonce="6fc2932a9cbe1c47ad41d6ffb55c4abc",nc=00000001,algorithm=MD5,response="dc0ec6278e0ccc5a8
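The `response` field of a Digest header is an MD5 hash built from the credentials, the server nonce and the request itself, so the password never travels on the wire. The sketch below follows the computation defined for HTTP Digest authentication with `qop=auth`; since the trace above is truncated and does not show a `qop` field, that choice is an assumption, and the values used in the usage lines are the example from the HTTP authentication RFC, not the ones from the trace.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(user, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' value (MD5 algorithm, qop=auth assumed)."""
    ha1 = md5_hex(f"{user}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")             # what is being asked
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


print(digest_response(
    user="Mufasa", realm="testrealm@host.com", password="Circle Of Life",
    method="GET", uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001", cnonce="0a4f113b",
))
```

Because the server nonce and the client nonce enter the hash, a captured response cannot simply be replayed for a different request.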

After this first authenticated request, the server responds with a multi-status response. A multi-status response is a server response with an XML body. In this body, there are several <d:response> elements, one for each resource contained in the collection. It is a little bit like the ls command on a UNIX system. A multi-status response example is shown below (only the first element is expanded).
Hypertext Transfer Protocol
    HTTP/1.1 207 Multi-Status\r\n
        Request Version: HTTP/1.1
        Status Code: 207
        Response Phrase: Multi-Status
    \r\n
eXtensible Markup Language
    <?xml
    <d:multistatus xmlns:d="DAV:">
        <d:response xmlns:d="DAV:">
            <d:href>/dav/</d:href>
            <d:propstat>
                <d:prop>
                    <d:getlastmodified xmlns:d="DAV:">2011-12-01T14:25:08Z</d:getlastmodified>
                    <d:getcontentlength xmlns:d="DAV:">373604</d:getcontentlength>
                    <d:getcontenttype xmlns:d="DAV:">httpd/unix-directory</d:getcontenttype>
                    <d:resourcetype xmlns:d="DAV:"><d:collection/></d:resourcetype>
                    <d:getetag xmlns:d="DAV:">"5e575335558a3156228685c92afc14e6"</d:getetag>
                    <d:lockdiscovery xmlns:d="DAV:"></d:lockdiscovery>
                </d:prop>
                <d:status>HTTP/1.1 200 OK</d:status>
            </d:propstat>
        </d:response>
        <d:response
        <d:response
        <d:response
        <d:response
        <d:response
    </d:multistatus>

We can see in the XML part that the request has to contain some information about the lock, such as its scope (exclusive or shared), its type (at present, only the write lock is defined in the protocol) and its owner. The server then sends its response, with an XML body containing a <D:locktoken> element. The client has to record this token: it is needed for subsequent operations on the locked resource. For example, a PUT request on a locked file looks like this:
PUT /dav/file.txt HTTP/1.1\r\n
    Request Method: PUT
    Request URI: /dav/file.txt
    Request Version: HTTP/1.1
Connection: Keep-Alive\r\n
If: (<urn:uuid:the-resource-lock-token>)\r\n
Host: www.box.net\r\n
\r\n
[Full request URI: http://www.box.net/dav/file.txt]
Data ....
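The value of the If header in this request follows a simple pattern: a parenthesized list holding the lock token in angle brackets, optionally followed by an entity tag in square brackets. The helper below is only a sketch of that pattern for the single-list, untagged case; the name `if_header` and the sample token are hypothetical.

```python
from typing import Optional


def if_header(lock_token: str, etag: Optional[str] = None) -> str:
    """Build an If header value with one condition list.

    lock_token is the opaque token returned by LOCK; etag, if given, must
    already include its double quotes, as it appears in DAV:getetag.
    """
    conditions = f"<{lock_token}>"
    if etag is not None:
        # All conditions inside one list must hold for the list to match.
        conditions += f" [{etag}]"
    return f"({conditions})"


print("If: " + if_header("urn:uuid:the-resource-lock-token"))
print("If: " + if_header("urn:uuid:the-resource-lock-token", '"5e57"'))
```

Adding the entity tag makes the write conditional on both holding the lock and the resource being unchanged since it was last read.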

We can see the If header, which is used to check that the client holds the right lock token for the operation on this resource. I'll not show more requests or responses: they always follow the same pattern.

V. CONCLUSION

To conclude, WebDAV is an interesting protocol which provides good file management. A big advantage of WebDAV is that it runs over HTTP, and so uses port 80 by default. Moreover, with HTTPS, file transfer and management can be done securely. With this assignment, I discovered a good HTTP extension and learned to analyze traffic with Wireshark and tcpdump. I also learned a lot of extra information about the HTTP protocol.

REFERENCES

I read many different websites to do this work, but I mention only the RFC because it was my principal source.
[1] L. Dusseault, HTTP Extensions for Web Distributed Authoring and Versioning (WebDAV), [RFC4918], June 2007.

B. Locking a resource

As already said, a big advantage of the WebDAV protocol is the ability to lock a resource before working on it. Requesting a lock is relatively simple with the LOCK method:
Hypertext Transfer Protocol
    LOCK /dav/file.txt HTTP/1.1\r\n
        Request Method: LOCK
        Request URI: /dav/file.txt
        Request Version: HTTP/1.1
    Timeout: Second-3600\r\n
    Host: www.box.net\r\n
    \r\n
    [Full request URI: http://www.box.net/dav/file.txt]
eXtensible Markup Language
    <?xml
    <D:lockinfo xmlns:D="DAV:">
        <D:lockscope><D:exclusive/></D:lockscope>
        <D:locktype><D:write/></D:locktype>
        <D:owner><D:href>orban.pierreyves@gmail.com</D:href></D:owner>
    </D:lockinfo>
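The Timeout header in this request is only a suggestion: the server reads it and then decides on the lifetime it actually grants. The sketch below shows how a server might parse the header and apply a cap. Both function names and the 1800-second maximum are assumptions made for the illustration, not values taken from the protocol or from the Box.net server.

```python
def parse_timeout(value: str):
    """Return the first requested timeout in seconds, or None for Infinite."""
    for choice in value.split(","):
        choice = choice.strip()
        if choice == "Infinite":
            return None
        if choice.startswith("Second-"):
            return int(choice[len("Second-"):])
    raise ValueError(f"unsupported Timeout value: {value!r}")


def granted_timeout(requested, server_max: int = 1800) -> int:
    """Hypothetical server policy: honor the request up to server_max seconds."""
    if requested is None or requested > server_max:
        return server_max
    return requested


requested = parse_timeout("Second-3600")
print(requested, granted_timeout(requested))
```

A refresh of the lock would go through the same decision again and restart the countdown with the newly granted value.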
