A resource can be represented in several ways. We’ll examine different strategies to return correct representations, and to avoid common mistakes.
Representing resources with HTTP headers
Some HTTP headers are commonly used to describe a resource. They must be added to the response, in order to represent that resource in a complete way.
Content-Type
It indicates the type of the resource, with its charset. Examples are text/plain, image/jpeg, application/pdf. Media types specify which format (PDF, JSON, XML) has been used to encode the body, and are used by clients to select proper parsing engines.
A server receiving a request body without a Content-Type header should return a 400 (Bad Request).
A client receiving a response without Content-Type should treat it as a bad response.
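As a minimal sketch of how a receiver might read this header, the Python snippet below splits a Content-Type value into media type and charset using the standard library's `email.message.Message`. The helper name `parse_content_type` is an assumption made for illustration, and raising `ValueError` on a missing header is just one way to signal the "bad message" cases described above.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type value into (media_type, charset_or_None)."""
    if not header_value:
        # Mirrors the rule above: a message without Content-Type is rejected.
        raise ValueError("missing Content-Type")
    msg = Message()
    msg["Content-Type"] = header_value
    # Both values come back lowercased; charset is None when absent.
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("application/xml;charset=UTF-8")
print(media_type, charset)
```

A client could then use the returned media type to pick a parser, and the charset (when present) to decode the bytes.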
Content-Length
It specifies the size of the body in bytes. HTTP/1.1 also offers chunked transfer encoding, which lets a sender stream a body without knowing its size in advance.
Check this header only if the message does not carry Transfer-Encoding: chunked.
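The precedence between the two headers can be sketched as follows. This is an illustrative helper, not part of any HTTP library: the function name `body_framing` and its return values are assumptions, and `headers` is assumed to be a plain dictionary of header names to values.

```python
def body_framing(headers):
    """Return ("chunked", None), ("length", n) or ("none", None)."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    codings = [
        coding.strip().lower()
        for coding in normalised.get("transfer-encoding", "").split(",")
    ]
    if "chunked" in codings:
        # Chunked framing wins: Content-Length is ignored in this case.
        return ("chunked", None)
    if "content-length" in normalised:
        return ("length", int(normalised["content-length"]))
    return ("none", None)
```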
Content-Language
This header specifies the language of localized representations. RFC 5646 defines the language tags, which can include a region subtag (for example en-CA).
Content-MD5
This header carries an MD5 digest of the representation, so that the recipient can verify its integrity. The value is computed after applying content encoding (e.g. gzip) but before transfer encoding (e.g. chunked).
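A small sketch of that ordering in Python: the body is gzip-compressed first, then the digest is taken over the compressed bytes and base64-encoded, which is the form the header value takes. The function name `content_md5` and the sample XML body are assumptions for illustration.

```python
import base64
import gzip
import hashlib


def content_md5(encoded_body: bytes) -> str:
    """Base64 of the 128-bit MD5 digest of the already content-encoded body."""
    return base64.b64encode(hashlib.md5(encoded_body).digest()).decode("ascii")


# Content encoding (gzip) is applied first, the digest second.
body = gzip.compress(b"<task><name>First task</name></task>")
headers = {"Content-Encoding": "gzip", "Content-MD5": content_md5(body)}

# A recipient recomputes the digest over the bytes it received and compares.
assert headers["Content-MD5"] == content_md5(body)
```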
Content-Encoding
If the body is encoded, this header states how (e.g. compress, deflate or gzip). Character encoding is a separate matter: if the media type allows a charset parameter, include it in Content-Type to state the encoding used to convert characters to bytes.
If representations are JSON, XML or HTML data, let their parsers interpret the character set (see RFC 4627).
Also, avoid declaring one charset in Content-Type and a different one in the body, since that leads to an encoding mismatch.
Last-Modified
It indicates the last time the resource was modified.
Choosing formats and media types
Media types and formats should be chosen in a flexible way, depending on use cases and client needs. The best place to find a standard format is the IANA (Internet Assigned Numbers Authority) Media Type Registry.
If no standard media type and format fits, use an extensible format such as XML (application/xml), Atom Syndication Format (application/atom+xml), or JSON (application/json).
Images and other rich formats should be used to provide alternative representations of data. When using such media types, add a Content-Disposition header to give hints on the file to save.
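For illustration only, a Content-Disposition value with a suggested file name could be built as in the sketch below. The helper name and the file name `report.pdf` are hypothetical, and the sketch assumes ASCII file names (non-ASCII names need an extended encoding that is not covered here).

```python
def content_disposition(filename: str) -> str:
    """Build an 'attachment' Content-Disposition value for an ASCII file name."""
    # Escape backslashes and quotes so the quoted-string stays valid.
    safe = filename.replace("\\", "\\\\").replace('"', '\\"')
    return f'attachment; filename="{safe}"'


headers = {
    "Content-Type": "application/pdf",
    "Content-Disposition": content_disposition("report.pdf"),
}
print(headers["Content-Disposition"])
```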
If a new format or media type must be used, register it with IANA by following the procedure described in RFC 4288.
Standard media types are:
| Media type | Format | Reference |
|---|---|---|
| application/xml | Generic XML format | RFC 3023 |
| application/*+xml | Special-purpose media type described as XML | RFC 3023 |
| application/atom+xml | XML format for Atom documents | RFC 4287 and RFC 5023 |
| application/json | Generic JSON format | RFC 4627 |
| application/javascript | JavaScript | RFC 4329 |
| application/x-www-form-urlencoded | Query strings | HTML 4.01 |
| application/pdf | Portable Document Format (PDF) | RFC 3778 |
| text/html | Various HTML versions | RFC 2854 |
| text/csv | Generic comma-separated values format | RFC 4180 |
Generic formats do not contain application-specific semantics, so they are customised for each representation they describe.
New formats and media types
Even if it’s best to use standard formats and media types whenever possible, sometimes new ones are required (e.g., when introducing a new video encoder, or a new document type). If such formats are widely available, new media types are strongly encouraged.
These are rules to follow:
- if media types are based on XML, the subtype should end with +xml.
- if media types are for private use only, the subtype should start with the vnd. prefix.
- if media types are for public use, register them with IANA, as indicated in RFC 4288.
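The first two naming rules lend themselves to a simple check. The sketch below is one possible way to express them; the function name, its keyword flags and the sample media type `application/vnd.example.task+xml` are assumptions made for illustration, and the IANA registration step is of course outside what code can verify.

```python
def check_media_type_name(media_type: str, *, xml_based: bool, private: bool) -> list:
    """Return a list of naming problems (empty when the name follows the rules)."""
    type_, _, subtype = media_type.partition("/")
    if not type_ or not subtype:
        return ["not of the form type/subtype"]
    problems = []
    if xml_based and not subtype.endswith("+xml"):
        problems.append("XML-based subtype should end with +xml")
    if private and not subtype.startswith("vnd."):
        problems.append("private subtype should start with vnd.")
    return problems


print(check_media_type_name("application/vnd.example.task+xml", xml_based=True, private=True))
```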
Designing representations
When designing representations, there are some common points to keep in mind in order to make them exchangeable. As general rules:
- add a self link (i.e. a link with a relation type self) to the resource, equivalent to the request URI or the Content-Location header;
- include identifiers for each domain entity that can be mapped to a resource;
- specify the language if natural text is included in the representation.
XML example
Let’s suppose we are retrieving a task within a checklist. The XML request is:
Host: www.example.org
and the relative response:
Content-Type: application/xml;charset=UTF-8
Content-Language: en
<task xmlns:atom="http://www.w3.org/2005/Atom">
  <id>urn:example:checklist:1:task:1</id>
  <atom:link rel="self" href="http://www.example.org/checklist/1/task/1.xml" />
  <name xml:lang="en">First task</name>
  <priority>high</priority>
</task>
If a new task has to be created, the request will be
Host: www.example.org
Content-Type: application/xml;charset=UTF-8
Content-Language: en
<task>
  <name xml:lang="en">Second task</name>
  <priority>low</priority>
</task>
and the response
Location: http://www.example.org/checklist/1/task/2.xml
Content-Location: http://www.example.org/checklist/1/task/2.xml
Content-Type: application/xml;charset=UTF-8
Content-Language: en
<task xmlns:atom="http://www.w3.org/2005/Atom">
  <id>urn:example:checklist:1:task:2</id>
  <atom:link rel="self" href="http://www.example.org/checklist/1/task/2.xml" />
  <name xml:lang="en">Second task</name>
  <priority>low</priority>
</task>
JSON example
Mapping the body of the last response in JSON, the result would be:
{
  "id": "urn:example:checklist:1:task:2",
  "link": {
    "rel": "self",
    "href": "http://www.example.org/checklist/1/task/2.xml"
  },
  "name": {
    "lang": "en",
    "value": "Second task"
  },
  "priority": "low"
}
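A representation like this could be assembled in code along the following lines. This is only a sketch: the function name `task_representation` and the `BASE_URL` constant are assumptions, while the URL pattern and the identifier pattern are copied from the examples in this post.

```python
import json

BASE_URL = "http://www.example.org"  # host taken from the examples in this post


def task_representation(checklist_id, task_id, name, priority, lang="en"):
    """Build a task representation with an id, a self link and a language tag."""
    path = f"/checklist/{checklist_id}/task/{task_id}.xml"
    return {
        "id": f"urn:example:checklist:{checklist_id}:task:{task_id}",
        "link": {"rel": "self", "href": f"{BASE_URL}{path}"},
        "name": {"lang": lang, "value": name},
        "priority": priority,
    }


print(json.dumps(task_representation(1, 2, "Second task", "low"), indent=2))
```

Building the dictionary first and serialising it with `json.dumps` avoids hand-written punctuation mistakes in the JSON text.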
Collections
Collections group a number of resources together and make it possible to iterate through them. To simplify access to different sets of resources, some additional details can be added:
- a self link to the collection
- a link to the next page, if available
- a link to the previous page, if available
- the total size of the collection (this should be a hint rather than the exact number, whose calculation could be very expensive)
Group objects into a single collection (e.g. cars and boats) only when its members are structurally and semantically similar.
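One way to carry the self, next and previous links plus the size hint is sketched below. Everything named here is hypothetical: the function `collection_page`, the `page` query parameter, and the collection URL used in the usage line are assumptions, not part of the examples above.

```python
def collection_page(items, base_url, page, page_size, approx_total):
    """Return one page of a collection with self/next/prev links and a size hint."""
    start = (page - 1) * page_size
    members = items[start:start + page_size]
    links = [{"rel": "self", "href": f"{base_url}?page={page}"}]
    if start + page_size < len(items):
        # A next link is only added when more members follow this page.
        links.append({"rel": "next", "href": f"{base_url}?page={page + 1}"})
    if page > 1:
        links.append({"rel": "prev", "href": f"{base_url}?page={page - 1}"})
    # approx_total is a hint supplied by the caller, not a computed count.
    return {"links": links, "total": approx_total, "items": members}


tasks = [f"task {n}" for n in range(1, 26)]
second_page = collection_page(tasks, "http://www.example.org/checklist/1/tasks", 2, 10, 25)
print([link["rel"] for link in second_page["links"]])
```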
Portable data formats
Except for text to be presented to end users, avoid formats that depend on language, region or country, so that representations stay usable by different applications. Instead, apply the following standards:
| Type | Reference | Example |
|---|---|---|
| Numbers | W3C XML Schema | -123.456 |
| Countries and territories | ISO 3166-1, ISO 3166-2 | US, US-WA |
| Currencies | ISO 4217 | USD, CAD |
| Dates and times | RFC 3339 | 2012-01-15T00:00:00Z |
| Language tags | BCP 47 | en, en-CA |
| Time zone identifiers | Java | |
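As a hedged illustration of two rows of this table, the Python helpers below emit a UTC timestamp in RFC 3339 form and a number without locale-specific separators. The function names are assumptions, and rejecting naive datetimes is a design choice of this sketch, made so that an ambiguous local time is never published.

```python
from datetime import datetime, timedelta, timezone


def rfc3339_utc(moment: datetime) -> str:
    """Format an aware datetime as an RFC 3339 timestamp in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime: attach a time zone first")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def portable_number(value: float) -> str:
    """Render a number with '.' as decimal separator, regardless of locale."""
    return repr(value)


pacific = timezone(timedelta(hours=-8))
print(rfc3339_utc(datetime(2012, 1, 15, 2, 30, tzinfo=pacific)))
print(portable_number(-123.456))
```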
Entity identifiers
Use entity identifiers when links within the REST protocol can change, or when the same resources can be used outside the REST web service (e.g. for SOAP web services, or RPCs). They should also be used when resources are not a direct mapping of domain entities and their uniqueness must still be guaranteed.
There is no established standard that defines these ids, but a common convention is to use a URN syntax like urn:example:checklist:1:task:1, as in the XML examples above.
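As a rough illustration, identifiers shaped like the ones in the XML examples (urn:example:checklist:1:task:1) could be assembled by a helper like the one below. The function name `entity_urn` and its argument layout are assumptions; the `example` namespace is taken from those examples.

```python
def entity_urn(namespace, *pairs):
    """Join (entity, identifier) pairs into a urn:<namespace>:... string."""
    parts = ["urn", namespace]
    for entity, identifier in pairs:
        parts += [entity, str(identifier)]
    return ":".join(parts)


print(entity_urn("example", ("checklist", 1), ("task", 2)))
```

Because the identifier does not embed a host or path, it can stay stable even if the URIs of the resources change.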
Binary data
Sometimes we need to encode data that have no textual representation, such as a video or an image.
Avoid base64 encoding; instead, use multipart media types such as multipart/form-data, multipart/mixed, multipart/related or multipart/alternative.
Another way to handle different media types is to provide links to separate resources.
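To show what a multipart body looks like on the wire, here is a minimal hand-rolled sketch that places a JSON part and raw image bytes side by side in a multipart/mixed message, with no base64 step. The function name is an assumption, the image bytes are a placeholder rather than a real PNG, and the sketch assumes the boundary string does not occur inside any part.

```python
import uuid


def build_multipart_mixed(parts, boundary=None):
    """parts: list of (media_type, payload_bytes). Returns (content_type, body)."""
    boundary = boundary or uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for media_type, payload in parts:
        chunks.append(delimiter + b"\r\n")
        chunks.append(b"Content-Type: " + media_type.encode("ascii") + b"\r\n\r\n")
        # Binary payloads are written as-is, without any textual encoding.
        chunks.append(payload + b"\r\n")
    chunks.append(delimiter + b"--\r\n")  # closing delimiter
    return f'multipart/mixed; boundary="{boundary}"', b"".join(chunks)


content_type, body = build_multipart_mixed(
    [
        ("application/json", b'{"name": "First task"}'),
        ("image/png", b"\x89PNG\x00\xff"),  # placeholder bytes, not a real image
    ],
    boundary="b1",
)
print(content_type, len(body))
```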
Returning errors from server
Errors are another way to express the result of an HTTP exchange between a client and a server. They are usually composed of a status code, headers and a content body, just like any other response.
Errors due to clients are mapped with 4xx status codes, while 5xx represent server errors.
Response bodies should contain a description of the error in plain text or HTML, properly formatted and localized.
A link (in a Link header or an anchor within the body) can point to an informational resource, possibly including debug notices. An error identifier helps when reporting problems.
One common mistake is to return a successful status code (2xx or 3xx) with an error described in the body. Avoid this to help cache servers, proxies and clients parse correct responses.
4xx: client errors
This is a mapping of errors due to client behavior:
| Error | Usage |
|---|---|
| 400 (Bad Request) | Syntactic errors due to requests not understood by the server. |
| 401 (Unauthorized) | The client has no access to the resource, but can obtain it after a proper authentication. A WWW-Authenticate header will help the client to use the correct authentication method (such as Basic or Digest). |
| 403 (Forbidden) | The client has no access to the resource, even if authenticated. |
| 404 (Not Found) | No resource is found. |
| 405 (Method Not Allowed) | The method is not allowed for this resource (e.g., the request is a POST when only a GET is available). An Allow header will show which methods are available. |
| 406 (Not Acceptable) | (see a following post) |
| 409 (Conflict) | The request conflicts with the current state of the resource. |
| 410 (Gone) | The resource existed, but has been deleted. If there is no tracking of deleted resources, return a 404 (Not Found) instead. |
| 412 (Precondition Failed) | (see a following post) |
| 413 (Request Entity Too Large) | The body of a POST or PUT request is too large. The response body should state the accepted sizes, or alternatives. |
| 415 (Unsupported Media Type) | The server does not support the media type of the request body. |
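A few rows of this table can be turned into request checks. The sketch below is one possible ordering of those checks, not a prescribed one: the function name, its parameters and the limits used in the usage lines are all assumptions, and `content_type` is assumed to be the bare media type without parameters.

```python
def client_error_status(method, content_type, body_size, *,
                        allowed_methods, supported_types, max_body_size):
    """Return (status, extra_headers) for the first failed check, or None."""
    if method not in allowed_methods:
        # 405 carries an Allow header listing the available methods.
        return 405, {"Allow": ", ".join(sorted(allowed_methods))}
    if method in ("POST", "PUT"):
        if content_type is None:
            return 400, {}
        if content_type not in supported_types:
            return 415, {}
        if body_size > max_body_size:
            return 413, {}
    return None


limits = dict(
    allowed_methods={"GET", "POST"},
    supported_types={"application/xml", "application/json"},
    max_body_size=1024,
)
print(client_error_status("DELETE", None, 0, **limits))
print(client_error_status("POST", "text/csv", 10, **limits))
```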
5xx: server errors
The only available server errors to map are:
| Error | Usage |
|---|---|
| 500 (Internal Server Error) | Due to implementation bugs. |
| 503 (Service Unavailable) | The server cannot fulfill the request at this time. This could be due to unavailable database connections or to clients exceeding a request rate limit. A Retry-After header should be included to suggest when the client can resend the same request. |
Error bodies
The error response should contain a body for all methods except HEAD. In all other cases, the body should include:
- a short description of the error condition
- a longer description with steps on how to fix the error, if possible
- an error identifier
- a link to help resources
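The four elements listed above could be put together as in the following sketch, which returns a plain-text body and also exposes the help link through a Link header. The function name, the error identifier `err-42`, the help URL and the sample messages are hypothetical.

```python
def error_response(status, summary, details, error_id, help_url, method="GET"):
    """Return (status, headers, body_bytes) for an error, with no body on HEAD."""
    headers = {
        "Content-Type": "text/plain;charset=UTF-8",
        "Content-Language": "en",
        "Link": f'<{help_url}>; rel="help"',
    }
    if method == "HEAD":
        return status, headers, b""
    text = f"{summary}\n\n{details}\n\nError id: {error_id}\nHelp: {help_url}\n"
    return status, headers, text.encode("utf-8")


status, headers, body = error_response(
    409,
    "Task already completed",
    "Reopen the task before changing its priority.",
    "err-42",
    "http://www.example.org/help/errors/err-42",
)
print(status, body.decode("utf-8"))
```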
Handling errors in clients
Clients have to handle two different kinds of errors: network failures and server responses. The former are usually handled with custom exceptions, the latter with explicit code paths. The following table shows how each error should be treated:
| Error | Handling |
|---|---|
| 400 (Bad Request) | Examine the body for hints. |
| 401 (Unauthorized) | Ask the user or obtain credentials, and retry adding an Authorization header. |
| 403 (Forbidden) | The client can’t access the resource, so do not repeat the request. |
| 404 (Not Found) | The resource is no longer available, so clean up local data (or mark it as deleted). |
| 405 (Method Not Allowed) | Examine the Allow header, switch to one of the listed methods and retry the request. |
| 406 (Not Acceptable) | (see a following post) |
| 409 (Conflict) | Examine the conflict and resolve it. |
| 410 (Gone) | See 404 (Not Found). |
| 412 (Precondition Failed) | (see a following post) |
| 413 (Request Entity Too Large) | Look in the body for the accepted sizes. |
| 415 (Unsupported Media Type) | Examine the body for available media types. |
| 500 (Internal Server Error) | Log and report to developers. |
| 503 (Service Unavailable) | If there is a Retry-After header, wait for that time interval and retry. |
Errors should be handled as resources, not exceptions, since they contain useful information to be managed.
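The client-side table can be sketched as a small dispatch function that treats the response as data rather than raising an exception. The function name and the action labels are assumptions, `headers` is assumed to be a dictionary keyed by the canonical header names, and only the delay-in-seconds form of Retry-After is handled here.

```python
def next_action(status, headers):
    """Map an error status to (action, detail), following the table above."""
    if status == 401:
        return ("authenticate", headers.get("WWW-Authenticate"))
    if status == 403:
        return ("give-up", None)
    if status in (404, 410):
        return ("clean-up", None)
    if status == 405:
        allowed = [m.strip() for m in headers.get("Allow", "").split(",") if m.strip()]
        return ("retry-with-allowed-method", allowed)
    if status == 409:
        return ("resolve-conflict", None)
    if status in (400, 413, 415):
        return ("inspect-body", None)
    if status == 500:
        return ("log-and-report", None)
    if status == 503:
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            return ("retry-later", int(retry_after))
    # Anything not covered by the table is left to the caller.
    return ("unhandled", None)


print(next_action(503, {"Retry-After": "120"}))
print(next_action(405, {"Allow": "GET, HEAD"}))
```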





