REST is an architectural style for networked applications. It is defined as a set of constraints on how network components are implemented, favoring uniform interface semantics over application-specific implementations and syntax.
REST as a network architecture was first documented by Roy Fielding in his doctoral thesis titled Architectural Styles and the Design of Network-based Software Architectures, published in 2000. REST allows clients to interact with data stored on a server, without having any prior knowledge of the server or what exists on it.
What Does REST Stand For?
REST stands for Representational State Transfer. Acronym purists sometimes write it as ReST, because the E in REST does not stand for a word of its own. It is merely the second letter of the word representational. Including the E does at least settle the pronunciation, avoiding the kind of controversy that has plagued GIF in recent years. Without it, the more accurate acronym RST might be pronounced as “wrist” or spelled out as R-S-T.
REST General Constraints
There is no specific definition or standard of what REST is. As an architecture, REST defines how various components are linked via connectors, and how data is exchanged over interfaces. REST is a series of constraints or requirements that, when followed, create an implementation of the REST architectural style.
REST architecture does not encourage the creation of additional, situation-specific methods. For example, a REST architecture would use the GET method to retrieve data in every circumstance. Only the resource identifier and its parameters change from one type of data to another. Contrast this with creating one method to get information about a user (getUser), another to get information about pricing (getPricing), and so on.
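The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the RESOURCES table, the identifiers in it, and the get function are hypothetical stand-ins for a server, not part of any real API.

```python
# Hypothetical in-memory "server": resource identifiers mapped to data.
RESOURCES = {
    "/users/42": {"name": "Ada Lovelace"},
    "/pricing/basic": {"monthly_usd": 10},
}


def get(resource_id):
    """One generic retrieval operation shared by every kind of resource."""
    if resource_id not in RESOURCES:
        return 404, None
    return 200, RESOURCES[resource_id]


# The same operation serves users and pricing; only the identifier differs,
# so no getUser or getPricing method is needed.
user_status, user = get("/users/42")
price_status, price = get("/pricing/basic")
missing_status, _ = get("/orders/7")
```

Adding a new kind of data in this sketch means adding an entry to the table, not a new method for clients to learn.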
Under the client-server constraint, the user interface on the client is separate from, and independent of, the data storage on the server. This allows the client to be implemented and modified regardless of what is, or is not, happening on the server. Likewise, the data on a server can be used and modified regardless of how it is being accessed by the client. Such a design allows both client and server systems to evolve at their own rate, independent of each other.
Under the stateless constraint, no session data is stored on the server between requests from the client. In other words, every request must carry everything needed to understand it completely, without relying on any additional context or data stored on the server. All session data resides exclusively on the client.
This constraint simplifies load balancing and fault tolerance. A different server could respond to each request from a client and return the same data the original server would have, so long as both servers have the same original data. There is no need to pass session-specific data between such load-balancing servers. A stateful session, by contrast, might produce a different response if a request is processed by a different server, because only the previous server would hold the data specific to that session with the client.
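A minimal sketch of this property, assuming a hypothetical token check and a small data set shared by every replica (VALID_TOKENS, SHARED_DATA, and the Server class are all invented for illustration):

```python
# Hypothetical data available to every replica.
VALID_TOKENS = {"token-abc"}
SHARED_DATA = {"/reports/1": "Q1 summary"}


class Server:
    """A replica that keeps no per-client session state between requests."""

    def __init__(self, name):
        self.name = name

    def handle(self, request):
        # Everything needed to answer arrives inside the request itself.
        if request.get("token") not in VALID_TOKENS:
            return {"status": 401, "body": None}
        body = SHARED_DATA.get(request["resource"])
        return {"status": 200 if body is not None else 404, "body": body}


request = {"token": "token-abc", "resource": "/reports/1"}

# Two different replicas give the same answer to the same request.
first = Server("a").handle(request)
second = Server("b").handle(request)

# A request that omits the token cannot lean on an earlier login.
anonymous = Server("a").handle({"resource": "/reports/1"})
```

Because neither replica remembers anything about the client, a load balancer in this sketch could route each request to either one.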
In every exchange, response data must be marked as either cacheable or non-cacheable. Data that is cacheable may be stored and reused by the client. Session-specific data is still not kept on the server, because this would violate the stateless constraint. Caching offsets the extra bandwidth that a stateless client-server interaction would otherwise require.
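The effect of the cacheable marker can be sketched as follows. The two classes and the URIs are hypothetical, and the cache_control field only imitates the spirit of HTTP's Cache-Control header. Expiry is deliberately ignored to keep the sketch short.

```python
class CountingServer:
    """Hypothetical origin that labels each response and counts requests."""

    def __init__(self):
        self.hits = 0

    def get(self, uri):
        self.hits += 1
        if uri == "/logo.png":
            return {"body": b"logo-bytes", "cache_control": "max-age=3600"}
        return {"body": b"balance: 10", "cache_control": "no-store"}


class CachingClient:
    """Client that reuses a response only when it was marked cacheable."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def get(self, uri):
        if uri in self.cache:
            return self.cache[uri]  # served locally, no network round trip
        response = self.server.get(uri)
        if response["cache_control"] != "no-store":
            self.cache[uri] = response["body"]
        return response["body"]


server = CountingServer()
client = CachingClient(server)
client.get("/logo.png")
client.get("/logo.png")  # answered from the client cache
client.get("/balance")
client.get("/balance")   # non-cacheable, so fetched again
```

Four client calls reach the server only three times here, which is the bandwidth saving the constraint is after.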
To achieve full interoperability, the interface is decoupled from the type of data exchanged, so that all interactions operate in the same manner. Four sub-constraints ensure a uniform interface: identification of resources, manipulation of resources via representations, self-descriptive messages, and Hypermedia as the Engine of Application State (HATEOAS).
Identification of Resources
Any information that can be named is a resource. A resource may be any form of data. A resource identifier is a way to refer to a specific resource at a particular point in time. Such resources may be updated on the server without the client needing to have advance knowledge because each request is fully described and answered.
Manipulation via Representations
A representation is the current state of a resource, along with the accompanying metadata that allows the representation to be understood. A representation’s data format is known as its media type. For example, a client may request an image from a server using the resource identifier of that image (likely a URI). The server returns a representation composed of the bytes that make up the image, along with metadata identifying the data format as a media type such as image/jpeg. The client may then display the image to the user.
For messages to be self-descriptive, every message from the client to the server must contain all the information necessary to process it. In the case of security and authentication, the security token must be exchanged within every message.
Cookies are widely used on the web. Relying on a cookie that points to session state held on the server, rather than carrying all of the necessary data in every message, is an example of a non-self-descriptive message, and is thus not technically RESTful.
Hypermedia as the Engine of Application State (HATEOAS)
Hypermedia is similar to the concept of hypertext, or hyperlinks, except that it encompasses all forms of media and not just text or links. It is a non-linear way of providing information or data typically via following a link or other marker in a published resource such as a web page. Hypermedia may be text, graphic, video or other data forms.
Under HATEOAS, a Client Interacts Via a Network Through Hypermedia
The client interacts with a server using only hypermedia. That same hypermedia is delivered dynamically by the server in response to a RESTful request. As a result, the client needs no prior knowledge of the data on a server, its structure, or how that data is stored. It needs only a well-defined hypermedia structure and set of methods.
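One way to picture this is a client that knows only an entry point and the names of link relations. The sketch below assumes a hypothetical DOCUMENTS table standing in for a server, with each representation carrying a "links" field. None of these names come from a real API.

```python
# Hypothetical server content: each representation advertises its own links.
DOCUMENTS = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {"links": {"first": "/orders/1"}},
    "/orders/1": {"total": 25, "links": {"collection": "/orders"}},
}


def fetch(uri):
    """Stand-in for a network request to the server."""
    return DOCUMENTS[uri]


def follow(start_uri, relations):
    """Navigate from the entry point using link relation names only."""
    document = fetch(start_uri)
    for relation in relations:
        # The next URI comes from the response, not from client code.
        document = fetch(document["links"][relation])
    return document


order = follow("/", ["orders", "first"])
```

The client never hard-codes "/orders/1". If the server moved that resource, updating the links in its responses would be enough for this client to keep working.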
In a layered system, components are arranged in hierarchical layers, and no component can see or interact with anything beyond the immediate layer it is connected to. As a result, the client need not know how to, nor even whether it is necessary to, connect to any additional server, proxy, firewall, router, or endpoint. Rather, each intermediary connects to subsequent servers according to REST constraints, and the response returned to the client via the intermediaries is also REST compliant. Changes or disruptions to the intermediary systems are therefore invisible to the client, which allows such intermediaries to provide load balancing, security, or other functions.
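A rough sketch of layering, assuming every component exposes the same hypothetical request-in, response-out interface (origin, make_proxy, and client_fetch are invented names):

```python
def origin(request):
    """Hypothetical origin server."""
    return {"status": 200, "body": f"content of {request['uri']}"}


def make_proxy(upstream, log):
    """Wrap an upstream component in an intermediary with the same interface."""

    def proxy(request):
        log.append(request["uri"])  # e.g. logging, security checks, balancing
        return upstream(request)

    return proxy


def client_fetch(connector, uri):
    """The client sees only the component it is directly connected to."""
    return connector({"uri": uri})


log = []
direct = client_fetch(origin, "/a")
layered = client_fetch(make_proxy(make_proxy(origin, log), log), "/a")
```

The client code is identical whether it talks to the origin or to a chain of two intermediaries, and the response is the same in both cases.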
Code On Demand
Although technically labeled as optional, this constraint holds that clients in a REST architecture should be able to download and execute scripts. This allows for the extension of more complicated functionality and systems, while still providing the same REST-style communication between client and server. As with basic requests and responses, the entire instruction for running the scripted code must be independent and not require pre-implementation on the client.
Components of a REST system communicate via a representation of a resource in one of several agreed, standardized formats, such as graphic formats, document formats, and various web formats. Again, to be a true REST system, each transaction must carry everything needed to retrieve and interpret the desired resource.
A resource is anything that can be named. It is stored on the server and requested by the client. A resource can be a static file, a document, a database, a picture, or any other data that may be requested.
Although a common concept now, the original idea of a resource as a generic, changeable, point-in-time element was a key feature of REST and the web in general. A resource is any nameable data on a server. That data can change not only to a newer version of the same data, but even to an entirely different type of data. This allows the data on a server to be updated at any time without the need for any client to know about it ahead of time, a key feature of interoperability and availability.
The resource identifier is the specific location of the resource requested. In the case of an HTTP-based system, the resource identifier is the URL or URI. The resource identifier specifies a single resource. A single resource identifier may refer to different data, or resources, at different times. In a REST compliant system, the client need not know in advance what kind of resource the resource identifier is requesting. The response will include metadata that describes how to interpret the data received.
For example, even if the requested URL is /server/index.html, if the resource at that address returns a representation that is an image file, along with the corresponding metadata, the image will still be displayed properly regardless of what the resource identifier suggested.
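This behavior can be sketched as a client that decides how to handle a response from its metadata alone. The response dictionary and the interpret function below are hypothetical. The Content-Type header name and the image/jpeg media type are the ones HTTP actually uses.

```python
def interpret(response):
    """Choose a handler from the response metadata, never from the URI."""
    content_type = response["headers"]["Content-Type"]
    # Drop any parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "text/html":
        return "render as HTML"
    if media_type.startswith("image/"):
        return "display as image"
    return "offer as download"


# The URI ends in .html, but the metadata says the body is a JPEG image.
response = {
    "uri": "/server/index.html",
    "headers": {"Content-Type": "image/jpeg"},
    "body": b"\xff\xd8",  # placeholder bytes, not a complete image
}
action = interpret(response)
```

The "uri" field is carried along only to show that it plays no part in the decision.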
The representation is the data sent to the client. As mentioned above, the representation must be in one of the standardized data formats. The data is not processed on the server, but rather interpreted by the client. For example, a client requesting an HTML document does not receive a graphic to be displayed on the monitor, but rather HTML markup that is then interpreted and displayed on the client. Other representation examples include graphic files like JPEG and GIF images.
Representation metadata provides information about the representation to the system. As with most metadata, representation metadata is typically not part of what is displayed to the end user. Representation metadata may include the media type, creation and modification dates, and a version number.
Resource metadata is additional information provided to the system about the resource that exists on the server, rather than about the representation displayed on the client. Examples of resource metadata include source links, alternates, and information about the resource. For instance, resource metadata may include alternate text to be shown when an image representation cannot be displayed for some reason (such as ALT image text in HTML).
Control data is primarily concerned with the validity of the resource and its representation on the client. Control data includes whether or not data is cacheable, as well as an absolute expiry time or a limit on how long the data may be used. Control data may also include a checksum, or other means of ensuring integrity.
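As a rough illustration, control data can be modeled as a small record travelling with the body. The field names max_age_seconds and sha256 are assumptions made for this sketch, not standard header names. The only real library call is hashlib.sha256 from the Python standard library.

```python
import hashlib


def build_response(body, max_age_seconds):
    """Attach hypothetical control data: a usage limit and a checksum."""
    return {
        "body": body,
        "control": {
            "max_age_seconds": max_age_seconds,
            "sha256": hashlib.sha256(body).hexdigest(),
        },
    }


def is_fresh(response, age_seconds):
    """True while the stored copy is younger than its usage limit."""
    return age_seconds < response["control"]["max_age_seconds"]


def is_intact(response):
    """True when the body still matches the checksum sent with it."""
    digest = hashlib.sha256(response["body"]).hexdigest()
    return digest == response["control"]["sha256"]


response = build_response(b"hello", max_age_seconds=60)
tampered = {**response, "body": b"hellO"}
```

A client holding this response could reuse it for up to 60 seconds and could detect that the tampered copy no longer matches its checksum.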
When taken together, the REST constraints create a highly scalable, completely transparent, and reusable framework in which the clients are decoupled from the implementations of services on the servers. It is platform-independent, for both the client and the server. It is language and structure independent. It does not matter if the data exists in a database, is retrieved by a Java server process, and is sent to a local program, such as a browser, coded in one of several versions of C.
On the modern web, REST is implemented using a common standard vocabulary between client and server known as Hypertext Transfer Protocol, or HTTP. However, any implementation that conforms to all of the tenets of the REST architecture is considered a RESTful implementation. HTTP is not the only possibility.
REST and HTTP
Although HTTP and REST are not the same thing, HTTP in its original form is one implementation of REST. This is not surprising, considering that Roy Fielding was working on the HTTP/1.1 protocol while developing the REST architecture.
Identification of Resources in HTTP
Each resource is identified by a uniform resource identifier (URI). Typically, this is implemented as a URL within a web browser. A URI identifies a resource. No prior knowledge of the system where the resource resides on the server is required. Also, the URI (or URL) is able to specify the representation of a resource, without knowledge of that resource’s file structure.
Representation of Resources
In HTTP, the representation of resources is implemented via support for various file types within an HTTP browser. Thus, the metadata accompanying a resource defines one of several widely supported file formats such as HTML, CSS, JPG, GIF, and so on.
A pure REST implementation of HTTP requires using four core methods: GET, POST, PUT, and DELETE. Each method should be used explicitly and mapped to exactly one of the core actions: RETRIEVE DATA, CREATE RESOURCE, UPDATE RESOURCE, and DELETE RESOURCE, respectively.
Certain HTTP methods should be idempotent. Idempotent means that executing the same method with the same parameters any number of times leaves the server in the same state as executing it once. An idempotent method may therefore change data on the server, as long as repeating the identical request causes no further change. A method that does not change server-based data at all, such as GET, is called safe, and every safe method is also idempotent.
GET, POST, PUT, DELETE
A GET request is used to retrieve a resource or information about the resource. Although most implementations of HTTP will process parameters with a GET request to modify or create resources, such action would not be compliant with a RESTful implementation. GET requests should be safe as well as idempotent: they should not change data on the server.
A POST request creates new data on a server. By definition, a POST request is NOT idempotent. With each execution, a POST request would create more data.
Whereas a POST request creates new data, a PUT request modifies existing data, for example by changing the last name of an existing user. Perhaps counterintuitively, PUT requests are idempotent. While a PUT request does change data, it does so in the same way every time. Thus, running a PUT request that changes a user’s last name would always produce the same result so long as all parameters are consistent.
A DELETE request removes or deletes data, as the name suggests. DELETE requests are also idempotent, in that running one repeatedly will always result in the same final state, with the data in question no longer existing on the server. To the client, there may appear to be a difference: once a resource is deleted, a repeated request may be answered with an error message such as file not found. The method is still considered idempotent, because no matter what error is sent during the transaction, the end state on the server is the same as after the first request.
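The difference between the four methods can be checked with a small in-memory store. UserStore and its status codes are a hypothetical sketch rather than a real HTTP server. The point is only how repeating each call affects the stored data.

```python
import itertools


class UserStore:
    """Hypothetical in-memory resource collection."""

    def __init__(self):
        self.users = {}
        self._ids = itertools.count(1)

    def get(self, user_id):
        # Safe: reads data without changing anything.
        return self.users.get(user_id)

    def post(self, data):
        # Not idempotent: every call creates another resource.
        user_id = next(self._ids)
        self.users[user_id] = dict(data)
        return 201, user_id

    def put(self, user_id, data):
        # Idempotent: repeating the call leaves the same state.
        self.users[user_id] = dict(data)
        return 200

    def delete(self, user_id):
        # Idempotent: the reply may differ, the end state does not.
        if user_id in self.users:
            del self.users[user_id]
            return 204
        return 404


store = UserStore()
store.post({"last_name": "Smith"})
store.post({"last_name": "Smith"})  # a second, separate user now exists
count_after_posts = len(store.users)

store.put(1, {"last_name": "Jones"})
state_after_first_put = dict(store.users)
store.put(1, {"last_name": "Jones"})  # no further change

first_delete = store.delete(2)
second_delete = store.delete(2)  # different reply, same end state
```

Repeating the POST doubles the data, while repeating the PUT or the DELETE changes nothing beyond what the first call did.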
Requirements of a RESTful API
In a blog post on his now abandoned blog, Roy Fielding discussed what criteria an API must meet in order to be considered truly RESTful. Building off his previously published thesis, this post described the same REST architecture concepts as they should apply to APIs.
A RESTful API is thus one that uses ONLY REST architecture without the need for additional documentation or methods beyond those that fit the model. Fielding provided the following points as clarification of what makes an API REST compliant.
Independence of Protocol
A truly RESTful API should not depend upon any one protocol and should be able to support any protocol that uses URIs for identification. Otherwise, identification is not separate from interaction.
Support of Protocols as Standardized
A REST API should not require changes to standardized protocols, particularly the addition of extra features. Whenever possible, any required workarounds should be separately defined with the goal of removing them altogether once the workarounds are no longer necessary.
Does Not Define Fixed Resources
A server namespace should be independent of the API definition and requirements. In other words, the API should work with any REST capable server, not just servers that conform to a particular API specification.
- Architectural Styles and the Design of Network-based Software Architectures
- REST APIs Must Be Hypertext-Driven
- Representational State Transfer
- Restful Rails Development – O’Reilly – Chapter 1
- Learn REST: A Tutorial
- What Is REST
- RESTful Web Services – The Basics
- REST – The Short Version
- Describing RESTful Applications