This is the second in a series of tutorial blog posts that provide an overview of the CDMI standard from the perspective of cloud clients. The goal of this tutorial series is to demonstrate the value of CDMI for solving real-world problems, with a focus on how easy it is to add CDMI functionality to existing applications.
As a cloud client, I want to access data stored in a cloud using the CDMI protocol in order to take advantage of enhanced cloud storage functionality.
The CDMI standard is built on top of the HTTP standard. In addition to being a standard HTTP server, every CDMI server also provides native CDMI functionality. In order to take advantage of the CDMI functionality, a client has to make a "CDMI request", as opposed to a "Non-CDMI request" which we reviewed in the first tutorial.
As CDMI requests provide more functionality than a non-CDMI request, there's a fair bit to review. Fortunately, CDMI uses standard HTTP and JSON, so it's easy to use existing libraries to request CDMI data, and to parse it, especially from within Web 2.0 (AJAX) applications running in the browser.
Let's start by reviewing how to make a CDMI request:
CDMI vs. Non-CDMI Access
One key concept of CDMI is the difference between CDMI and Non-CDMI access. So, what differentiates a CDMI request from a Non-CDMI request?
It's all in the headers.
CDMI is a RESTful protocol, which means it is fundamentally based around the concept of representations. In HTTP, a given resource (what is found at a specific URL) can have more than one representation. For example, an image could have a JPEG and a TIFF representation, and the server delivers whichever representation the client requests (if supported).
The way that a client indicates which representation is being requested on a GET is via the HTTP "Accept" header, and for a PUT or POST, it is via the HTTP "Content-Type" header. So, when a CDMI server receives an HTTP request, the logic that determines if it is a CDMI operation or a standard HTTP operation is based on the value of these headers.
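There are five CDMI content-types that are defined in RFC 6208. These are:
- application/cdmi-capability
- application/cdmi-container
- application/cdmi-domain
- application/cdmi-object
- application/cdmi-queue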
So, for example, when requesting a data object via CDMI, the client would include the below header:
Accept: application/cdmi-object
Each of these representations is defined by the CDMI specification, and is returned in JSON format. Later in this tutorial, we'll look through the data object representation in more detail.
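To make the header-based dispatch concrete, here is a minimal Python sketch of how a server might decide whether an incoming request is a CDMI request. The function name `is_cdmi_request` and the plain-dictionary representation of the headers are assumptions made for illustration; a real server would plug this logic into its own HTTP framework.

```python
# The five CDMI media types registered by RFC 6208.
CDMI_MEDIA_TYPES = {
    "application/cdmi-capability",
    "application/cdmi-container",
    "application/cdmi-domain",
    "application/cdmi-object",
    "application/cdmi-queue",
}


def is_cdmi_request(method, headers):
    """Return True if the headers select a CDMI representation.

    Hypothetical helper: `headers` is a plain dict of header name to value.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    # GET negotiates via Accept; PUT and POST describe the body via Content-Type.
    header_name = "accept" if method.upper() == "GET" else "content-type"
    raw = normalized.get(header_name, "")
    # A header may list several media types, each with optional parameters.
    media_types = {part.split(";")[0].strip().lower() for part in raw.split(",")}
    return bool(media_types & CDMI_MEDIA_TYPES)
```

With this sketch, a GET carrying `Accept: application/cdmi-object` is treated as a CDMI request, while a GET carrying `Accept: text/plain` (or no Accept header at all) falls through to standard HTTP handling.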
Once the CDMI server has determined that a CDMI representation is being used, it looks at a CDMI-defined header, X-CDMI-Specification-Version, to determine compatibility between the client and the server. If this header is missing or does not include a version string that matches a version supported by the server, the operation fails.
This version check is part of the standard so that CDMI can be enhanced over time while servers can still respond to clients that understand an older version of the standard.
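The version check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the behaviour of any particular server: the set of supported versions is made up, the helper name `negotiate_version` is hypothetical, and the sketch assumes the header may carry a comma-separated list of versions from which the server picks the highest one it also supports.

```python
# Assumption: the versions this hypothetical server understands.
SUPPORTED_VERSIONS = {"1.0.1", "1.0.2"}


def negotiate_version(headers, supported=SUPPORTED_VERSIONS):
    """Return the version to use, or None if the operation should fail."""
    # HTTP header names are case-insensitive.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-cdmi-specification-version")
    if raw is None:
        # Header missing: the CDMI operation fails.
        return None
    requested = [v.strip() for v in raw.split(",") if v.strip()]
    common = [v for v in requested if v in supported]
    if not common:
        # No version in common: the CDMI operation fails.
        return None
    # Compare versions numerically, component by component.
    return max(common, key=lambda v: tuple(int(p) for p in v.split(".")))
```

A caller would treat a `None` result as a failed operation and otherwise process the request according to the returned version.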
So, to put these together, our CDMI request would include the below headers:
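As a hedged illustration of combining the two headers on the client side, the following Python sketch builds a CDMI GET request using only the standard library. The helper name `build_cdmi_get`, the example host, and the version string "1.0.2" are assumptions; use whichever version your server actually supports.

```python
import urllib.request


def build_cdmi_get(url, version="1.0.2"):
    """Build (but do not send) a CDMI GET request for a data object.

    Hypothetical helper; the default version string is an assumption.
    """
    request = urllib.request.Request(url, method="GET")
    # Ask for the CDMI data object representation instead of the raw bytes.
    request.add_header("Accept", "application/cdmi-object")
    # Tell the server which version of the CDMI specification we speak.
    request.add_header("X-CDMI-Specification-Version", version)
    return request


# Example host is a placeholder, not a real CDMI endpoint.
request = build_cdmi_get("http://cloud.example.com/Hello.txt")
```

Passing the resulting object to `urllib.request.urlopen` would send it. Dropping the two `add_header` calls turns the same request back into the Non-CDMI request from the first tutorial.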
On the Wire
When a CDMI client requests a data object, the following request is sent over the wire:
GET /Hello.txt HTTP/1.1
This results in the CDMI server sending back the following response:
HTTP/1.1 200 OK
"aceflags": "OBJECT_INHERIT, CONTAINER_INHERIT, INHERITED",
"aceflags": "OBJECT_INHERIT, CONTAINER_INHERIT, INHERITED",
"value": "Hello CDMI World"
That's a lot of information returned, so let's walk through each of the JSON fields:
objectType: Every CDMI object has a type. This is one of the five content-types defined in RFC 6208. This makes the JSON self-describing.
objectID: Every CDMI object has a globally unique object identifier. This is another key CDMI concept, and one that will be explored in more detail in a future tutorial entry.
objectName: CDMI objects may have a user-assigned name.
parentURI: When a CDMI object has a name, this field indicates the parent container. When a CDMI object does not have a name, this indicates the URI used for accessing objects by ID (typically /cdmi_objectid/).
parentID: When a CDMI object has a name, this field indicates the object ID of the parent container. When a CDMI object does not have a name, this indicates the object ID of the object access URI.
domainURI: CDMI systems can support domains, which allow for multiple administrative control groups and administrative delegation. Every object belongs to one domain (which may be a sub-domain of another domain). This field indicates the URI to the domain associated with the object (and can be a URI to the name of the domain, or to the domain's object ID).
Domains will be covered in more detail in a future tutorial.
capabilitiesURI: Every CDMI object has a corresponding capabilities object that allows a client to discover what operations are supported for that object. Capabilities will be covered in more detail in a future tutorial.
completionStatus: Indicates if the object is complete. This field will be covered in more detail in a future tutorial.
mimetype: For data objects, this field indicates the MIME type of the stored content.
metadata: Every CDMI object can have associated system and user-defined metadata. This is one of the most powerful aspects of object storage, and is the core of CDMI's value proposition. While we will cover metadata in much greater detail in future tutorials, in this example, we have only storage-system metadata, such as the creation time, modification time, modification count, owner, group, object ACL and size. (CDMI uses standard NFSv4 ACLs.)
valuerange: For data objects, this field indicates the byte range within the object that is being returned.
valuetransferencoding: For data objects, this field indicates how the value is encoded within the JSON representation. For UTF-8 clean text, the value can be directly included in the JSON, but for binary data, the value must be base64 encoded.
There is also a CDMI extension in public review to allow the object body to be transported in a second MIME part using multi-part MIME.
value: For data objects, this field contains the value of the object.
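Because the response is plain JSON, a client can pull out the object's content with a standard JSON parser. The Python sketch below assumes the field names used by the CDMI specification (`objectName`, `valuerange`, `valuetransferencoding`, `value`); the helper name `read_value` and the abbreviated sample body are made up for illustration and omit most of the fields a real server returns.

```python
import base64
import json


def read_value(body):
    """Parse a CDMI data object representation and return its content as bytes.

    Hypothetical helper returning (object name, value range, raw bytes).
    """
    obj = json.loads(body)
    encoding = obj.get("valuetransferencoding", "utf-8")
    if encoding == "base64":
        # Binary content is carried base64 encoded inside the JSON string.
        data = base64.b64decode(obj["value"])
    else:
        # UTF-8 clean text is embedded directly in the JSON string.
        data = obj["value"].encode("utf-8")
    return obj.get("objectName"), obj.get("valuerange"), data


# Abbreviated, illustrative body; a real response carries more fields.
sample = json.dumps({
    "objectType": "application/cdmi-object",
    "objectName": "Hello.txt",
    "mimetype": "text/plain",
    "valuerange": "0-15",
    "valuetransferencoding": "utf-8",
    "value": "Hello CDMI World",
})

name, value_range, data = read_value(sample)
```

Handling both encodings in one place means the rest of the client can work with bytes and never needs to know how the value travelled inside the JSON.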
CDMI access of data objects is mandatory for all CDMI servers that implement support for reading data objects by name or ID.
The following capabilities must be present:
- System-wide Capabilities: cdmi_dataobjects, cdmi_acls, cdmi_domains
- Data Object Capabilities (on the data object being accessed): cdmi_read_value, cdmi_read_metadata
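A client can check these requirements before issuing the request. The sketch below assumes the capabilities have already been fetched and parsed into a dictionary mapping capability name to the string "true" or "false"; the helper name `missing_capabilities` and the sample dictionaries are hypothetical.

```python
# Capability names taken from the two lists above.
REQUIRED_SYSTEM = {"cdmi_dataobjects", "cdmi_acls", "cdmi_domains"}
REQUIRED_OBJECT = {"cdmi_read_value", "cdmi_read_metadata"}


def missing_capabilities(capabilities, required):
    """Return the sorted names of required capabilities that are not enabled.

    Assumption: `capabilities` maps each name to "true"/"false" (or a bool).
    """
    return sorted(
        name
        for name in required
        if str(capabilities.get(name, "")).lower() != "true"
    )


# Illustrative capability sets, not taken from a real server.
system_caps = {"cdmi_dataobjects": "true", "cdmi_acls": "true", "cdmi_domains": "true"}
object_caps = {"cdmi_read_value": "true", "cdmi_read_metadata": "false"}

problems = (
    missing_capabilities(system_caps, REQUIRED_SYSTEM)
    + missing_capabilities(object_caps, REQUIRED_OBJECT)
)
```

An empty `problems` list would mean the CDMI read described in this tutorial should be supported; here the list names the one capability the sample object lacks.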