Designing RESTful representations


A resource can be represented in several ways. We’ll examine different strategies to return correct representations, and to avoid common mistakes.

Some HTTP headers are commonly used to describe a resource. They must be added to the response, in order to represent that resource in a complete way.

Content-Type

It indicates the type of the resource, with its charset. Examples are text/plain, image/jpeg, application/pdf. Media types specify which format (PDF, JSON, XML) has been used to encode the body, and are used by clients to select proper parsing engines.

A server that receives a request with a body but no Content-Type should return a 400 (Bad Request).

A client receiving a response without Content-Type should treat it as a bad response.
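As a minimal sketch of how a client might read this header before choosing a parser, the following splits a Content-Type value into media type and charset using Python's standard email.message module. The function name parse_content_type is hypothetical and only serves the illustration.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type header value into (media_type, charset).

    charset is None when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors return lower-cased values.
    return msg.get_content_type(), msg.get_content_charset()


print(parse_content_type("application/xml;charset=UTF-8"))
```

A client could then use the media type to select a parser and fall back to the format's own rules when the charset is missing.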

Content-Length

It specifies the size of the body (in bytes). HTTP 1.1 includes a more efficient mechanism known as chunked transfer encoding.

This header should be checked only if there is no Transfer-Encoding: chunked.
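The precedence between the two headers can be sketched as follows. This is an illustration only: the function name body_framing and the convention of passing headers as a dict with lower-cased names are assumptions, not part of any library.

```python
def body_framing(headers):
    """Decide how the message body is delimited.

    `headers` is a dict with lower-cased header names (an assumption of
    this sketch). Returns ("chunked", None), ("length", n) or
    ("unknown", None).
    """
    encodings = [token.strip().lower()
                 for token in headers.get("transfer-encoding", "").split(",")]
    if "chunked" in encodings:
        # Chunked framing wins: Content-Length is not consulted.
        return ("chunked", None)
    if "content-length" in headers:
        return ("length", int(headers["content-length"]))
    return ("unknown", None)
```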

Content-Language

This header specifies the language of localized representations. RFC 5646 defines the language tags used as values, including their optional region subtags.

Content-MD5

This header carries an MD5 digest of the representation, so that recipients can verify its integrity. The value is computed after applying content encoding (e.g. gzip) but before transfer encoding (e.g. chunked).
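A rough sketch of that ordering, using only the Python standard library: the body is gzipped first, and the digest is taken over the compressed bytes. The helper name content_md5 and the sample body are hypothetical; the header value is the base64 form of the 128-bit digest.

```python
import base64
import gzip
import hashlib


def content_md5(body_bytes):
    """Return the base64-encoded MD5 digest of the given bytes."""
    return base64.b64encode(hashlib.md5(body_bytes).digest()).decode("ascii")


# The digest is taken over the content-encoded (here gzipped) bytes,
# before any chunked transfer encoding is applied.
compressed = gzip.compress(b"<task><name>First task</name></task>")
headers = {
    "Content-Encoding": "gzip",
    "Content-MD5": content_md5(compressed),
}
```

A recipient would recompute the digest over the bytes it received, after removing the chunked framing but before decompressing, and compare it with the header value.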

Content-Encoding

If the body is encoded, this header states how (e.g. compress, deflate or gzip). Character encoding is a separate matter: if the media type allows a charset parameter, include it in Content-Type to specify the encoding used to convert characters to bytes.

If representations are JSON, XML or HTML data, let their parsers interpret the character set (see RFC 4627).

Also, avoid declaring one charset in Content-Type and a different one in the body, since that leads to an encoding mismatch.
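On the receiving side, the two kinds of encoding are undone in a fixed order: first the content encoding, then the character encoding. The sketch below assumes a hypothetical helper decode_body and handles only gzip and deflate, since the Python standard library has no decoder for compress.

```python
import gzip
import zlib


def decode_body(raw, content_encoding=None, charset="utf-8"):
    """Undo the content encoding, then convert bytes to text."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    elif content_encoding == "deflate":
        raw = zlib.decompress(raw)
    elif content_encoding is not None:
        raise ValueError("unsupported Content-Encoding: " + content_encoding)
    # The charset comes from Content-Type (or from the format's own rules).
    return raw.decode(charset)
```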

Last-Modified

It indicates the last time the resource was modified.

Media types and formats should be chosen in a flexible way, depending on use cases and client needs. The best place to find a standard format is the IANA (Internet Assigned Numbers Authority) Media Type Registry.

Use extensible formats if no standard media type and format is defined, such as XML (application/xml), Atom Syndication Format (application/atom+xml), or JSON (application/json).

Images and other rich formats should be used to provide alternative representations of data. When using such media types, add a Content-Disposition header, as in

 

Content-Disposition: attachment; filename=example.pdf

 

to give hints on the file to save.

If a new format and/or media type must be used, register both with IANA, by following what’s described in RFC 4288.

Standard media types are:

Media type | Format | Reference
application/xml | Generic XML format | RFC 3023
application/*+xml | Special-purpose media type described as XML | RFC 3023
application/atom+xml | XML format for Atom documents | RFC 4287 and RFC 5023
application/json | Generic JSON format | RFC 4627
application/javascript | JavaScript | RFC 4329
application/x-www-form-urlencoded | Query strings | HTML 4.01
application/pdf | PDF | RFC 3778
text/html | Various HTML versions | RFC 2854
text/csv | Generic comma-separated values format | RFC 4180

Generic formats do not contain application-specific semantics, so they are customised for each representation they describe.

New formats and media types

Although it’s best to use standard formats and media types whenever possible, sometimes new ones are required (e.g. when introducing a new video encoder or a new document type). If such formats are widely available, new media types are strongly encouraged.

These are rules to follow:

  • if media types are based on XML, the subtype should end with +xml;
  • if media types are for private use only, the subtype should start with the vnd. prefix (trailing dot included);
  • if media types are for public use, register them with IANA, as indicated in RFC 4288.
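The naming rules above are easy to check mechanically. The following is a small sketch with a hypothetical function name, and the media type used in the usage line is invented for the example.

```python
def describe_media_type(media_type):
    """Return naming-rule facts about a media type string."""
    type_, _, subtype = media_type.partition("/")
    return {
        "type": type_,
        "subtype": subtype,
        "xml_based": subtype.endswith("+xml"),      # XML-based formats
        "vendor_tree": subtype.startswith("vnd."),  # private/vendor formats
    }


# "application/vnd.example.task+xml" is a made-up media type.
print(describe_media_type("application/vnd.example.task+xml"))
```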

When designing representations, there are some common points to keep in mind in order to make them exchangeable. As general rules:

  • add a self link (i.e. a link with the relation type self) to the resource, equivalent to the request URI or the Content-Location header;
  • include identifiers for each domain entity that can be mapped to a resource;
  • specify the language if natural-language text is included in the representation.

XML example

Let’s suppose we are retrieving a task within a checklist. The XML request is:

 

GET /checklist/1/task/1.xml HTTP/1.1
Host: www.example.org

 

and the corresponding response:

 

HTTP/1.1 200 OK
Content-Type: application/xml;charset=UTF-8
Content-Language: en

<task xmlns:atom="http://www.w3.org/2005/Atom">
  <id>urn:example:checklist:1:task:1</id>
  <atom:link rel="self" href="http://www.example.org/checklist/1/task/1.xml" />
  <name xml:lang="en">First task</name>
  <priority>high</priority>
</task>

 

If a new task has to be created, the request will be

 

POST /checklist/1.xml HTTP/1.1
Host: www.example.org
Content-Type: application/xml;charset=UTF-8
Content-Language: en

<task>
  <name xml:lang="en">Second task</name>
  <priority>low</priority>
</task>

 

and the response

HTTP/1.1 201 Created
Location: http://www.example.org/checklist/1/task/2.xml
Content-Location: http://www.example.org/checklist/1/task/2.xml
Content-Type: application/xml;charset=UTF-8
Content-Language: en

<task xmlns:atom="http://www.w3.org/2005/Atom">
  <id>urn:example:checklist:1:task:2</id>
  <atom:link rel="self" href="http://www.example.org/checklist/1/task/2.xml" />
  <name xml:lang="en">Second task</name>
  <priority>low</priority>
</task>

 

JSON example

Mapping the body of the last response to JSON, the result would be:

 

{
  "id": "urn:example:checklist:1:task:2",
  "link": {
    "rel": "self",
    "href": "http://www.example.org/checklist/1/task/2.xml"
  },
  "name": {
    "lang": "en",
    "value": "Second task"
  },
  "priority": "low"
}
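
A representation with this shape could be produced along the following lines. This is only a sketch: the function task_to_json and its parameters are hypothetical, and the key names simply mirror the JSON document in this section.

```python
import json


def task_to_json(task_id, self_href, name, lang, priority):
    """Serialize a task into a JSON document with id, self link and language."""
    representation = {
        "id": task_id,
        "link": {"rel": "self", "href": self_href},
        "name": {"lang": lang, "value": name},
        "priority": priority,
    }
    return json.dumps(representation, indent=2)


print(task_to_json(
    "urn:example:checklist:1:task:2",
    "http://www.example.org/checklist/1/task/2.xml",
    "Second task", "en", "low"))
```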

 

Collections

Collections group a number of resources together and let clients iterate through them. To simplify access to different sets of resources, some additional details can be added:

  • a self link to the collection
  • a link to the next page, if available
  • a link to the previous page, if available
  • the total size of the collection (this should be a hint rather than the exact number, which could be very expensive to calculate)
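
The details listed above can be sketched as a paged collection built from an in-memory list. Everything specific here is an assumption made for illustration: the function name collection_page, the ?page= query parameter and the key names links, size and members.

```python
def collection_page(base_url, items, page, page_size, size_hint):
    """Build one page of a collection as a plain dict.

    `page` is 1-based; `size_hint` is an approximate total, not an exact count.
    """
    start = (page - 1) * page_size
    links = [{"rel": "self", "href": "{0}?page={1}".format(base_url, page)}]
    if len(items) > start + page_size:
        # More members follow this page, so advertise the next one.
        links.append({"rel": "next", "href": "{0}?page={1}".format(base_url, page + 1)})
    if page > 1:
        links.append({"rel": "previous", "href": "{0}?page={1}".format(base_url, page - 1)})
    return {
        "links": links,
        "size": size_hint,
        "members": items[start:start + page_size],
    }
```

Clients would follow the next and previous links instead of building page URIs themselves.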

Group objects into one collection (e.g. cars and boats) only when its members are structurally and semantically similar.

To make representations easy to consume by different applications, avoid formats that depend on language, region or country, except for text presented to end users. Use these standards instead:

Type | Reference | Example
Numbers | W3C XML Schema | -123.456
Countries and territories | ISO 3166-1, ISO 3166-2 | US, US-WA
Currencies | ISO 4217 | USD, CAD
Dates and times | RFC 3339 | 2012-01-15Z
Language tags | BCP 47 | en, en-CA
Time zone identifiers | Java |
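
As an illustration of locale-independent output, the sketch below formats a timestamp in the RFC 3339 date-time form and a number with a dot as decimal separator. The function names portable_timestamp and portable_number are hypothetical.

```python
from datetime import datetime, timezone


def portable_timestamp(dt):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def portable_number(value):
    """Format a number with '.' as decimal separator, ignoring the locale."""
    return repr(float(value))


print(portable_timestamp(datetime(2012, 1, 15, 12, 30, tzinfo=timezone.utc)))
print(portable_number(-123.456))
```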

Use entity identifiers when links within the REST protocol can change, or when the same resources can be used outside the REST web service (e.g. by SOAP web services or RPCs). They should also be used when resources are not a direct mapping of domain entities and their uniqueness must still be guaranteed.

There is no established standard that defines these ids, but a common convention is to use a syntax like

 

<id>urn:example:user:1234</id>
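
A tiny helper can keep such identifiers consistent across representations. This is a sketch of the convention only; the function name entity_urn is hypothetical.

```python
def entity_urn(namespace, *parts):
    """Join a namespace and entity path into a urn:... identifier."""
    return ":".join(["urn", namespace] + [str(part) for part in parts])


print(entity_urn("example", "user", 1234))
```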

 

Sometimes we need to encode data that have no textual representation, such as a video or an image.

Avoid base64 encoding; instead, use multipart media types such as multipart/form-data, multipart/mixed, multipart/related or multipart/alternative.

Another way to handle different media types is to provide links to separate resources.
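
To show what a multipart body looks like on the wire, here is a minimal hand-rolled sketch that carries a JSON part and a binary part without base64. The function build_multipart_related and the boundary value are hypothetical, and a real implementation would also have to make sure the boundary never occurs inside a payload.

```python
def build_multipart_related(parts, boundary):
    """Assemble a multipart/related body from (content_type, bytes) pairs.

    Returns (content_type_header_value, body_bytes).
    """
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for content_type, payload in parts:
        chunks.append(delimiter)
        chunks.append(b"Content-Type: " + content_type.encode("ascii"))
        chunks.append(b"")       # blank line ends the part headers
        chunks.append(payload)   # binary payload is sent as-is, no base64
    chunks.append(delimiter + b"--")  # closing delimiter
    chunks.append(b"")
    header = 'multipart/related; boundary="{0}"'.format(boundary)
    return header, b"\r\n".join(chunks)
```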

Errors are another way to express the result of an HTTP exchange between a client and a server. They are usually composed of a status code, headers and a content body, just like any other response.

Errors due to clients are mapped with 4xx status codes, while 5xx represent server errors.

Response bodies should contain a description of the error in plain text or HTML, properly formatted and localized.

A link (in a Link header or an anchor within the body) can point to an informational resource, possibly including debug notices. An error identifier is useful when reporting problems.

One common mistake is to return a successful status code (2xx or 3xx) with an error described in the body. Avoid this to help cache servers, proxies and clients parse correct responses.

4xx: client errors

This is a mapping of errors due to client behavior:

Error | Usage
400 (Bad Request) | Syntactic errors due to requests not understood by the server.
401 (Unauthorized) | The client has no access to the resource, but can obtain it after proper authentication. A WWW-Authenticate header helps the client use the correct authentication method (such as Basic or Digest).
403 (Forbidden) | The client has no access to the resource, even if authenticated.
404 (Not Found) | No resource is found.
405 (Method Not Allowed) | The method is not allowed for this resource (e.g. the request is a POST when only GET is available). An Allow header shows which methods are available.
406 (Not Acceptable) | (see a following post)
409 (Conflict) | The request conflicts with the current state of the resource.
410 (Gone) | The resource existed, but has been deleted. If there is no tracking of deleted resources, return a 404 (Not Found) instead.
412 (Precondition Failed) | (see a following post)
413 (Request Entity Too Large) | The body of a POST or PUT request is too large. The response body should state the accepted sizes, or alternatives.
415 (Unsupported Media Type) | The server cannot recognize the media type.
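
As one concrete case from this mapping, a server-side check for the request method might look like the sketch below, which answers with 405 and an Allow header. The function check_method is hypothetical, and the 200 it returns for allowed methods is just a placeholder for normal processing.

```python
def check_method(method, allowed_methods):
    """Return (status, headers) for a request method on a resource."""
    if method.upper() in allowed_methods:
        return 200, {}
    # Tell the client which methods it may use instead.
    return 405, {"Allow": ", ".join(sorted(allowed_methods))}


print(check_method("POST", {"GET", "HEAD"}))
```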

5xx: server errors

The only available server errors to map are:

Error | Usage
500 (Internal Server Error) | Due to implementation bugs.
503 (Service Unavailable) | The server cannot fulfill the request at this time. This could be due to database connections or request rate limit exceeded by clients. A Retry-After header should be included to suggest when the client can resend the same request.

Error bodies

Error responses should carry a body for all methods except HEAD. In all other cases, the body should contain:

  • a short description of the error condition
  • a longer description with steps on how to fix the error, if possible
  • an error identifier
  • a link to help resources
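
Putting the four elements together, a plain-text error response could be assembled roughly as follows. The function error_response, the body layout and the sample values are assumptions made for the sketch; help is used here as the link relation for the informational resource.

```python
def error_response(status, summary, detail, error_id, help_url):
    """Build (status, headers, body) for a plain-text error response."""
    body = "{0}\n\n{1}\n\nError id: {2}\nHelp: {3}\n".format(
        summary, detail, error_id, help_url)
    headers = {
        "Content-Type": "text/plain;charset=UTF-8",
        "Content-Language": "en",
        "Link": '<{0}>; rel="help"'.format(help_url),
    }
    return status, headers, body


status, headers, body = error_response(
    409,
    "Task name already in use",
    "Choose a different name for the task and submit the request again.",
    "E1234",
    "http://www.example.org/help/E1234")
print(status)
print(body)
```

The status code stays in the 4xx or 5xx range, so caches, proxies and clients never mistake the error for a successful response.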

Clients have to handle two different kinds of errors: network failures and server responses. The first kind is usually handled with custom exceptions, the second with explicit code. The following table shows how each error should be treated:

Error | Client action
400 (Bad Request) | Examine the body for hints.
401 (Unauthorized) | Ask the user for credentials or obtain them otherwise, then retry with an Authorization header.
403 (Forbidden) | The client can’t access the resource, so do not repeat the request.
404 (Not Found) | The resource is no longer available, so clean up local data (or mark it as deleted).
405 (Method Not Allowed) | Examine the Allow header, apply what it indicates and retry the request.
406 (Not Acceptable) | (see a following post)
409 (Conflict) | Look for conflicts and resolve them.
410 (Gone) | See 404 (Not Found).
412 (Precondition Failed) | (see a following post)
413 (Request Entity Too Large) | Look in the body for accepted sizes.
415 (Unsupported Media Type) | Examine the body for available media types.
500 (Internal Server Error) | Log and report to developers.
503 (Service Unavailable) | If there is a Retry-After header, wait for that time interval and retry.
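
The table can be turned into explicit client code along these lines. This is a sketch under assumptions: the action names, the function next_action and the fallback choices (giving up on a 503 without Retry-After, logging unknown codes) are illustrative and not prescribed by HTTP.

```python
# Coarse client-side actions per status code (names are illustrative).
ACTIONS = {
    400: "inspect_body",
    401: "authenticate_and_retry",
    403: "do_not_retry",
    404: "mark_deleted",
    405: "retry_with_allowed_method",
    409: "resolve_conflict",
    410: "mark_deleted",
    413: "inspect_body",
    415: "inspect_body",
    500: "log_and_report",
}


def next_action(status, headers):
    """Return (action, delay_seconds) for an error response.

    `headers` is a dict keyed by header name as written in this article.
    """
    if status == 503:
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            # Only the delay-in-seconds form is handled in this sketch;
            # Retry-After may also carry an HTTP date.
            return ("wait_and_retry", int(retry_after))
        return ("give_up", None)
    return (ACTIONS.get(status, "log_and_report"), None)
```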

Errors should be handled as resources, not exceptions, since they contain useful information to act on.
