This guide is a work in progress. Some of the advice is still contradictory and has not stabilized yet.
This documentation was forked from https://github.com/interagent/http-api-design and transformed to use docpress instead of GitBook format. The original documentation is extracted from work on the Heroku Platform API.
Help developers create good, consistent APIs.
There are tens or hundreds of individual APIs and this guide tries to ensure that API users can trust them to follow the same conventions. This speeds up app development, communication, API design process, and more.
Our goals here are consistency and focusing on business logic while avoiding design bikeshedding. We’re looking for a good, consistent, well-documented way to design APIs, not necessarily the only/ideal way.
When a new service is created, these artifacts/actions are required:
API documentation as Swagger 2.0 YAML
The latest Swagger YAML/JSON definition should always be uploaded to IBM API Connect.
For services based on the common Hapi/Hapi-swagger template, the Swagger definition is available to download from the /swagger.json path on the running service. This definition can be used to add the API under an existing product in IBM API Connect.
To push a new version of your API to IBM API Connect you can use a command-line based tool available here.
Currently, the command-line tool cannot add new APIs in a catalog (only in Drafts). The API can be first published to a catalog using the IBM API Connect web interface, and then later updated using the command-line tool.
Public API is exposed via IBM API Connect
“How to get the service running” documentation
Your service repository should provide or link to this type of documentation in its README.md file.
Preferably, the service uses Docker or Vagrant to run the local environment. This makes it easier for new developers or maintainers joining the project to get started.
See also Service building tips for additional information on how to build services.
OpenID Connect is used for user authentication. It provides a way for applications (mobile, web, or any other client) to perform operations on behalf of the user without ever needing to know the details of that identity (e.g. username and password).
Client executes OAuth2 Authorization code flow to receive access_token.
The access_token is an encoded JWT token.
Client requests /api/x via API Gateway with correct Authorization: Bearer <access_token> header.
API Gateway includes e.g. x-token header in the request and passes it to Service X.
The x-token is a shared secret between Service X and API Gateway.
Note: x-token is just one of the ways to authenticate requests between API Gateway and backing services. It is important that there is some mechanism to authenticate the requests, but it doesn’t need to be exactly x-token header mechanism.
Service X verifies x-token and Authorization header.
The Authorization header carries a JWT that has been signed with a private key (only the OpenID service knows it). The service can verify the signature by using the public key that the OpenID service publishes.
Service X can now trust that the request came via API Gateway and that the JWT payload was created by the OpenID service.
The JWT payload contains information about the end user who made the request. The service can use this information, e.g. to implement fine-grained authorization.
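The first two checks Service X performs can be sketched as follows. This is a minimal illustration, not the template's actual implementation: the function names are hypothetical, headers are assumed to arrive as a plain dict with exactly these key spellings, and JWT signature verification is deliberately left out. In a real service the token returned by `extract_bearer_token` still has to be verified with a JWT library against the OpenID service's public key before its payload is trusted.

```python
import hmac


def is_from_gateway(headers: dict, expected_token: str) -> bool:
    """Return True if the x-token header matches the shared secret."""
    received = headers.get("x-token", "")
    # compare_digest takes constant time, so response timing does not leak the secret.
    return hmac.compare_digest(received.encode(), expected_token.encode())


def extract_bearer_token(headers: dict):
    """Return the raw access_token from the Authorization header, or None."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    # The signature of this token is NOT verified here (see the note above).
    return token.strip()
```

A request that fails either check should be rejected before any business logic runs.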
Versioning and the transition between versions can be one of the more challenging aspects of designing and operating an API. As such, it is best to start with some mechanisms in place to mitigate this from the start.
To prevent surprising, breaking changes for users, it is best to require that a version be specified with every request. Default versions should be avoided, as they are very difficult, at best, to change in the future.
It is best to provide version specification in the headers, with other metadata, using the Accept header with a custom content type, e.g.:
Accept: application/json; version=3
Follow the Zalando versioning guidelines, and see them for more detail.
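On the server side, reading the version from the Accept header could look roughly like this. It is a sketch under simplifying assumptions: `parse_accept_version` is a hypothetical helper, and it handles a single media type only, not a comma-separated list of media ranges. In line with the advice above, it refuses requests without a version instead of falling back to a default.

```python
def parse_accept_version(accept: str) -> int:
    """Extract the version parameter from an Accept header value."""
    _media_type, *params = [part.strip() for part in accept.split(";")]
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "version" and value.strip().isdigit():
            return int(value.strip())
    # No default version: clients must always state which version they want.
    raise ValueError("Accept header must specify a version, e.g. 'application/json; version=3'")
```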
Require secure connections with TLS to access the API, without exception. It’s not worth trying to figure out or explain when it is OK to use TLS and when it’s not. Just require TLS for everything.
Ideally, simply reject any non-TLS requests by not responding to requests for http or port 80 to avoid any insecure data exchange. In environments where this is not possible, respond with 403 Forbidden.
Redirects are discouraged since they allow sloppy/bad client behaviour without providing any clear gain. Clients that rely on redirects double up on server traffic and render TLS useless since sensitive data will already have been exposed during the first call.
Your APIs should be described in Swagger 2.0 YAML format. The definition should not be maintained manually; it should be generated from the service’s HTTP endpoint code.
Don’t maintain API endpoints, parameters and their descriptions in multiple places. They will go out of sync.
In addition to endpoint details, provide an API overview with information about:
API stability and versioning, including how to select the desired API version.
Common request and response headers.
Error serialization format.
Examples of using the API with clients in different languages.
API Connect does this when correct Swagger examples are specified.
Accept serialized JSON on PUT/PATCH/POST request bodies, either instead of or in addition to form-encoded data. This creates symmetry with JSON-serialized response bodies, e.g.:
$ curl -X POST https://service.com/apps \
    -H "Content-Type: application/json" \
    -d '{"name": "demoapp"}'
{
  "id": "01234567-89ab-cdef-0123-456789abcdef",
  "name": "demoapp",
  "owner": {
    "email": "username@example.com",
    "id": "01234567-89ab-cdef-0123-456789abcdef"
  },
  ...
}
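A service that accepts JSON in addition to form-encoded data can choose the parser from the Content-Type header. The following is only a sketch: `parse_request_body` is a hypothetical helper, and a real framework (such as the Hapi template mentioned earlier) normally does this for you.

```python
import json
from urllib.parse import parse_qs


def parse_request_body(content_type: str, raw_body: bytes) -> dict:
    """Decode a request body according to its Content-Type."""
    # Ignore parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    text = raw_body.decode("utf-8")
    if media_type == "application/json":
        return json.loads(text)
    if media_type == "application/x-www-form-urlencoded":
        # parse_qs returns a list per field; keep the first value of each.
        return {key: values[0] for key, values in parse_qs(text).items()}
    raise ValueError(f"Unsupported Content-Type: {content_type}")
```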
Method | Description |
---|---|
HEAD | Can be issued against any resource to get just the HTTP header info. |
GET | Get one or multiple resources. |
POST | Create a new resource. Id of the resource is unknown before request. |
PUT | Fully replace an existing resource. Id of the resource is known before request. Note: You must send the full object on each PUT request. |
PATCH | Add or modify attributes for an existing resource. |
DELETE | Delete an existing resource. |
Good examples:
GET /api/products
Get paginated array of products.
GET /api/products/:id
Get product by id.
DELETE /api/products/:id
Delete product by id.
POST /api/products
Create a new product.
PUT /api/products/:id
Replace a product.
PATCH /api/products/:id
Add or modify attributes for a product.
PUT /api/servers/:id/actions/hibernate
Special “hibernate” action for a virtual machine.
Prefer endpoint layouts that don’t need any special actions for individual resources. In cases where special actions are needed, place them under a standard actions prefix, to clearly delineate them:
/resources/:resource/actions/:action
e.g.
/products/actions/search
/machines/1/actions/shutdown
Use POST or PUT method for actions.
When each API follows the same rules, using the Kesko API ecosystem becomes much easier because you can rely on certain conventions.
If a convention is not documented, always follow the existing conventions. When introducing a new convention, there should be a plan for how it will be adopted across all services.
Changing API conventions across multiple services takes time, so choose wisely.
Use camelcased attribute names, plural array keys and correct JSON types for data. You may use strings for money to make sure the API user acknowledges that using float values is dangerous.
Example of good JSON naming conventions:
{
  "id": "123e4567-e89b-12d3-a456-426655440000",
  "name": "Test Name",
  "plussaCards": [
    {
      "number": "0123123191999",
      "owner": {
        "_link": "https://keskoapi.com/api/users/123e4567-e89b-12d3-a456-426655440003",
        "id": "123e4567-e89b-12d3-a456-426655440003"
      }
    },
    {
      "number": "0123123191998",
      "owner": {
        "_link": "https://keskoapi.com/api/users/123e4567-e89b-12d3-a456-426655440003",
        "id": "123e4567-e89b-12d3-a456-426655440003"
      }
    }
  ],
  "birthYear": 1991
}
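The advice about money as strings can be illustrated with a small sketch. The field names `price` and `quantity` and the function `order_total` are made up for this example; the point is that a client which parses the string into a decimal type avoids binary floating-point rounding.

```python
import json
from decimal import Decimal


def order_total(body: str) -> str:
    """Compute a total from a payload whose price is serialized as a string."""
    payload = json.loads(body)
    # Decimal parses "19.99" exactly; float("19.99") would only approximate it.
    total = Decimal(payload["price"]) * payload["quantity"]
    return json.dumps({"total": str(total)})
```

For comparison, `0.1 + 0.2 == 0.3` is `False` with floats, which is the kind of surprise the string representation is meant to prevent.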
Use downcased and dash-separated path names, for alignment with hostnames, e.g:
service-api.com/users
service-api.com/app-setups
Use camelcase in query parameter names:
?isAdmin=true
?hasComment=false&minRating=1.2
Type | Good examples | Bad examples | Note |
---|---|---|---|
Booleans | ?a=true , ?a=false | ?a=1 , ?a=False | |
Arrays | ?id=1&id=2 | ?ids=1,2 , ?ids=1&ids=2 | Use singular in parameter name |
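A strict parser for these conventions might look like the sketch below. The helper names are hypothetical, integer IDs are assumed, and treating a missing `isAdmin` as `false` is an assumption of this example rather than a rule from the guide.

```python
from urllib.parse import parse_qs


def parse_bool(value: str) -> bool:
    """Accept only the lowercase literals 'true' and 'false'."""
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"Expected 'true' or 'false', got {value!r}")


def parse_query(query: str) -> dict:
    """Parse ?isAdmin=true&id=1&id=2 style query strings."""
    params = parse_qs(query.lstrip("?"))
    return {
        "isAdmin": parse_bool(params.get("isAdmin", ["false"])[0]),
        # Repeating the singular parameter name yields the array.
        "ids": [int(value) for value in params.get("id", [])],
    }
```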
If your API requests are complex and need a lot of query parameters, consider moving the parameters into a JSON request body, similar to Elasticsearch queries.
In data models with nested parent/child resource relationships, paths may become deeply nested, e.g.:
/stores/:storeId/assortments/:assortmentId/products/:productId
Limit nesting depth by preferring to locate resources at the root path. Use nesting to indicate scoped collections. For example, for the case above, where a product belongs to an assortment that belongs to a store:
/stores/:storeId
/stores/:storeId/assortments
/assortments/:assortmentId
/assortments/:assortmentId/products
/products/:productId
In other words, have only one level of parent/child relationship depth in one path.
Use the plural version of a resource name unless the resource in question is a singleton within the system (for example, the overall status of the system might be /status). This keeps the way you refer to particular resources consistent.
Generate consistent, structured response bodies on errors. Include a human-readable error message.
HTTP/1.1 400 Bad Request
{
  "statusCode": 400,
  "error": "Bad Request",
  "message": "child \"weight\" fails because [\"weight\" is required]",
  "validation": {
    "source": "payload",
    "keys": [
      "weight"
    ]
  }
}
Use HapiJS Boom error payload format.
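For services that are not built on Hapi, a payload with the same three core fields as the example above can be produced with a few lines. This is a sketch: `error_payload` is a hypothetical helper, not part of Boom, and it leaves out the optional `validation` block.

```python
from http import HTTPStatus


def error_payload(status_code: int, message: str) -> dict:
    """Build an error body with statusCode, error and message fields."""
    return {
        "statusCode": status_code,
        # The standard reason phrase, e.g. "Bad Request" for 400.
        "error": HTTPStatus(status_code).phrase,
        "message": message,
    }
```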
Return appropriate HTTP status codes with each response. Successful responses should be coded according to this guide:
200: Request succeeded for a GET, POST, DELETE, or PATCH call that completed synchronously, or a PUT call that synchronously updated an existing resource.
201: Request succeeded for a POST or PUT call that synchronously created a new resource. It is also best practice to provide a Location header pointing to the newly created resource. This is particularly useful in the POST context as the new resource will have a different URL than the original request.
202: Request accepted for a POST, PUT, DELETE, or PATCH call that will be processed asynchronously. E.g. if your task is processed in the background with a worker.
206: Request succeeded on GET, but only a partial response returned: see the section on ranges.
Pay attention to the use of authentication and authorization error codes:
401 Unauthorized: Request failed because user is not authenticated. E.g. token is not valid.
403 Forbidden: Request failed because user does not have authorization to access a specific resource. E.g. the given token is not allowed to access the resource.
Return suitable codes to provide additional information when there are errors:
400 Bad Request: Request was incorrectly formed. E.g. invalid JSON or missing required attributes.
422 Unprocessable Entity: Your request was correctly formed, but contained invalid parameters. E.g. endDate is before startDate.
429 Too Many Requests: You have been rate-limited, retry later.
500 Internal Server Error: Something went wrong on the server, check status site and/or report the issue.
Refer to the HTTP response code spec for guidance on status codes for user error and server error cases.
Accept and return times in UTC only. Render times in ISO8601 format, e.g.:
"finishedAt": "2012-01-01T12:00:00Z"
You may have milliseconds in the timestamp too.
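Rendering a timestamp in this format could be sketched like this. `to_api_timestamp` is a hypothetical helper; it rejects naive datetimes on the assumption that guessing a timezone is worse than failing loudly, and it drops sub-second precision for brevity.

```python
from datetime import datetime, timedelta, timezone


def to_api_timestamp(moment: datetime) -> str:
    """Render a timezone-aware datetime as ISO 8601 in UTC with a Z suffix."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone first")
    # Convert to UTC before formatting so the trailing Z is truthful.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```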
Use existing globally unique IDs when possible. E.g. use EAN codes for products when possible to avoid creating yet another ID. You may use different IDs internally, but the public API should only expose one unique ID for a resource. This one ID should be used in resource objects, URL paths etc.
Think carefully before exposing multiple IDs for resources such as EAN and UUID.
Give each resource an id attribute by default. Use UUIDs unless you have a very good reason not to. Don’t use IDs that won’t be globally unique across instances of the service or other resources in the service, especially auto-incrementing IDs.
Render UUIDs in downcased 8-4-4-4-12 format, e.g.:
"id": "01234567-89ab-cdef-0123-456789abcdef"
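As a brief sketch, generating and normalizing IDs in this format needs only a UUID library. The function names below are hypothetical, and the choice of random version 4 UUIDs is an assumption of this example.

```python
import uuid


def new_resource_id() -> str:
    """Generate a random UUID rendered in lowercase 8-4-4-4-12 form."""
    return str(uuid.uuid4())


def normalize_id(value: str) -> str:
    """Parse any accepted UUID spelling and render it in the canonical form."""
    # uuid.UUID raises ValueError for strings that are not valid UUIDs.
    return str(uuid.UUID(value))
```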
Keep things simple while designing by separating the concerns between the different parts of the request and response cycle. Keeping simple rules here allows for greater focus on larger and harder problems.
Requests and responses will be made to address a particular resource or collection. Use the path to indicate identity, the body to transfer the contents and headers to communicate metadata. In edge cases, query params may also be used to pass header information, but headers are preferred as they are more flexible and can convey more diverse information.
Move long-running requests to worker processes that live independently of the request-response lifecycle. For example, video resolution scaling would be a perfect use case for a worker architecture. A good rule of thumb is that if your processing takes >500ms, consider using a background worker.
Benefits:
Always fast HTTP responses (even though the actual processing might take time). Use a ticketing or similar system to be able to poll progress information.
Decouple job processing from web framework. Allows you to write the worker processing with a different language since the jobs are defined in a generic job queue.
Job queues are robust. They allow retrying, alerts from increasing job queue depth etc. Worker can be disposed at any time and the job is still persisted in the queue.
Scale worker processes independently from HTTP responses. Background processing might need different specs from the server, e.g. more CPU. HTTP serving is usually more IO-bound.
Read more: https://devcenter.heroku.com/articles/background-jobs-queueing
Large responses should be broken across multiple requests using Range headers to specify when more data is available and how to retrieve it. See the Heroku Platform API discussion of Ranges for the details of request and response headers, status codes, limits, ordering, and iteration.
Include a Request-Id header in each API response, populated with a UUID value. By logging these values on the client, server and any backing services, it provides a mechanism to trace, diagnose and debug requests.
The recommended way is to include an ETag header in all responses, identifying the specific version of the returned resource. This allows users to cache resources and use requests with this value in the If-None-Match header to determine if the cache should be updated.
You can also use different HTTP cache headers for specific use cases, but consider ETag as the default pick.
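The ETag round trip can be sketched as follows. Both function names are hypothetical, deriving the tag from a SHA-256 hash of the body is just one possible choice, and the comparison is simplified: a full implementation would also handle `*` and comma-separated lists in If-None-Match.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag value from the response body."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The cached copy is still valid, so no body needs to be sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

A client stores the ETag from the first response and sends it back as If-None-Match on later requests.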
Use JSON-LD format with API response data. Link other entities by referencing an IRI which is expected to return another schema.org entity.
Schema.org aims to create and maintain open schemas for structured data on the web. Structured data means data that is accompanied by semantics through an ontology. In other words, structured data contains information about the data types and hierarchical links and relationships to other entities. JSON-LD is a format specification that extends JSON with linking capabilities. Linked data enables, for example, rich results in Google search.
Example of JSON-LD with schema.org vocabulary:
{
  "@context": "http://schema.org",
  "@type": "HardwareStore",
  "branchCode": "PK035-K-rauta-Lielahti",
  "name": "K-Rauta Lielahti",
  "telephone": "010 538 0300",
  "email": "lielahti@k-rauta.fi",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "Turvesuonkatu 10",
    "addressLocality": "Tampere",
    "postalCode": "33400",
    "addressCountry": "FI"
  },
  "hasOfferCatalog": {
    "@type": "OfferCatalog",
    "name": "Products",
    "url": "https://keskoapi.com/api/products/PK035-K-rauta-Lielahti"
  }
}
Extra whitespace adds needless response size to requests, and many clients for human consumption will automatically “prettify” JSON output. It is best to keep JSON responses minified e.g.:
{"beta":false,"email":"alice@heroku.com","id":"01234567-89ab-cdef-0123-456789abcdef","lastLogin":"2012-01-01T12:00:00Z","createdAt":"2012-01-01T12:00:00Z","updatedAt":"2012-01-01T12:00:00Z"}
Instead of e.g.:
{
  "beta": false,
  "email": "alice@heroku.com",
  "id": "01234567-89ab-cdef-0123-456789abcdef",
  "lastLogin": "2012-01-01T12:00:00Z",
  "createdAt": "2012-01-01T12:00:00Z",
  "updatedAt": "2012-01-01T12:00:00Z"
}
You should also compress API responses with gzip if the client supports it.
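Both recommendations, minified JSON and optional gzip compression, can be combined in one small sketch. `encode_response` is a hypothetical helper, and the substring check on Accept-Encoding is a simplification that ignores quality values such as `gzip;q=0`.

```python
import gzip
import json


def encode_response(data: dict, accept_encoding: str = ""):
    """Serialize data as minified JSON, gzip-compressed if the client allows it."""
    # separators=(",", ":") removes the spaces json.dumps adds by default.
    body = json.dumps(data, separators=(",", ":")).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if "gzip" in accept_encoding.lower():
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return headers, body
```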
Provide the full resource representation (i.e. the object with all attributes) whenever possible in the response. Always provide the full resource on 200 and 201 responses, including PUT/PATCH and DELETE requests, e.g.:
$ curl -X DELETE \
    https://service.com/apps/1f9b/domains/0fd4
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
...
{
  "createdAt": "2012-01-01T12:00:00Z",
  "hostname": "subdomain.example.com",
  "id": "01234567-89ab-cdef-0123-456789abcdef",
  "updatedAt": "2012-01-01T12:00:00Z"
}
202 responses will not include the full resource representation, e.g.:
$ curl -X DELETE \
https://service.com/apps/1f9b/dynos/05bd
HTTP/1.1 202 Accepted
Content-Type: application/json;charset=utf-8
...
{}
Provide createdAt and updatedAt timestamps for resources by default, e.g.:
{
  // ...
  "createdAt": "2012-01-01T12:00:00Z",
  "updatedAt": "2012-01-01T13:00:00Z",
  // ...
}
These timestamps may not make sense for some resources, in which case they can be omitted.
Plan and ideally describe the stability of your API or its various endpoints according to its maturity and stability, e.g. with prototype/development/production flags.
See the Heroku API compatibility policy for a possible stability and change management approach.
Once your API is declared production-ready and stable, do not make backwards incompatible changes within that API version. If you need to make backwards-incompatible changes, create a new API with an incremented version number.
_link for foreign key relations
Serialize foreign key references with a nested object, e.g.:
{
  "name": "service-production",
  "owner": {
    "_link": "https://keskoapi.com/api/users/01234567-89ab-cdef-0123-456789abcdef",
    "id": "01234567-89ab-cdef-0123-456789abcdef"
  },
  // ...
}
Instead of e.g.:
{
  "name": "service-production",
  "ownerId": "01234567-89ab-cdef-0123-456789abcdef",
  // ...
}
If needed, this approach makes it possible to inline more information about the related resource without having to change the structure of the response or introduce more top-level response fields, e.g.:
{
  "name": "service-production",
  "owner": {
    "id": "5d8201b0...",
    "name": "Alice",
    "email": "alice@heroku.com"
  },
  // ...
}
However, prefer linking to the original data instead of inlining data.
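A serializer can build these nested reference objects with a one-line helper. This is a sketch: `link_to` is a hypothetical function, and the default base URL is taken from the examples above purely for illustration.

```python
def link_to(resource: str, resource_id: str, base_url: str = "https://keskoapi.com/api") -> dict:
    """Build the nested object used to reference a related resource."""
    return {
        "_link": f"{base_url}/{resource}/{resource_id}",
        "id": resource_id,
    }


def serialize_service(name: str, owner_id: str) -> dict:
    """Serialize a resource whose owner is referenced by link rather than by ownerId."""
    return {"name": name, "owner": link_to("users", owner_id)}
```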
The purpose of this guide is to focus on the API design details instead of the whole architecture, but going through the underlying architecture principles helps to understand the whole picture.
Most of the backend services are built with microservice architecture. The architecture is used to gain certain benefits. It’s not a silver bullet but has been a good fit for the domain.
The following rules should apply for the services to benefit from the architecture:
Align them with the business capabilities. By focusing on a single responsibility, technology choices can be picked to suit the domain best. For example, you could model graph-heavy data with a graph database.
If you are unsure about the splitting, it might be better to create a broader service first, and split later. This way you don’t end up with unnecessary operational overhead.
For example, the product service can be deployed independently from the product order status service. When the product order status service is down or broken, the product service should not be affected by the incident.
Services should also have independent attached resources such as Redis, Postgres etc.
Example smell of bad design: deploying a service requires updating another service at the same time. Correct versioning fixes this.
When accessing data of other microservices, you should use their normal HTTP APIs. Treat them as you would e.g. when integrating with the GitHub API. And as said before, assume the other service can be down at any time.
Each microservice should be horizontally scalable to avoid single points of failure.
Environments help in rapid development. This is a good set of environments, tested in practice, that you should have. At minimum, have qa and prod environments.
Environment | Purpose |
---|---|
dev | Experimental. Sharing new features / fixes with other developers or the customer. May break at any time, but should be kept as a working environment. |
qa | Should be as close to prod as possible. All deployments must be tested in this environment before deployment to prod. |
prod | The real deal. Used to serve end users. Response times should ideally be <100ms. |
All these environments should be as similar to each other as possible, e.g. have the same external dependencies (such as Postgres 9.4).