"Don't Repeat Yourself" When Designing APIs

Techniques for keeping your OpenAPI definitions DRY

Apr 09, 2023

Your journey of implementing API ideas with OpenAPI designs has reached a critical mass. In previous articles, you’ve seen how to model responses for GET operations and how to describe problem and error responses. In both cases, we included the JSON Schema that describes the results directly in the operation’s responses object. However, this does not scale: it requires copious copying and pasting, which in turn leads to fragile code and an increased likelihood of bugs. This article will show you the path to API design enlightenment via OpenAPI’s support for the Don't Repeat Yourself principle, otherwise known as the DRY principle [1].

Welcome to the next article in the Language of API Design. Rather than jumping into the middle of this series, I encourage new subscribers and visitors to read Your Guide to The Language of API Design and review previous posts in the series.

The Horrible Experience

First, let’s look at the wrong approach. Consider defining multiple problem responses in just one operation, to cover the 400 (Bad Request) and 422 (Unprocessable Entity) HTTP status codes. If each response repeated the JSON Schema for the application/problem+json response, the API definition would become verbose very quickly: almost 150 lines of YAML source for just those responses in one operation.

See my multiple-400-level-application-problem+json-response.yaml gist—I hope you see the “bad code smell” in that solution. Repeat that for every operation in an API and the result is a horrible developer experience. (This pattern is also known as Write Everything Twice or WET.)

Here are just a few immediately obvious problems of this WET method:

Once a construct is copied, changing that construct requires updating duplicate code in numerous places.
A natural consequence of updating multiple sections of code is that developers may accidentally skip some copies, resulting in defects.
Similarly, if the original code had defects, copy and paste also duplicates those defects.

Fortunately, there is a better way!

The (Not Horrible) Better Experience

OpenAPI provides two general mechanisms to avoid the need for such verbosity and duplication of code.

The components object provides a place to define reusable API design elements, such as schemas, parameters, and response objects. Instead of inlining a schema or response object or other element, you can reference a component that is defined within the components object.
You can split the API definition into multiple source documents. Components that are used across multiple APIs can be moved into the components of a common or shared OpenAPI document.
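
The first mechanism can be sketched with a reusable parameter. This is a minimal illustration, not part of the series' API definition: the `limit` parameter and its constraints are assumptions made up for the example. The `$ref` syntax it uses is explained later in this article.

```yaml
# Hypothetical example: the `limit` parameter is an illustrative assumption.
components:
  parameters:
    limit:
      name: limit
      in: query
      description: Maximum number of items to return in one page.
      required: false
      schema:
        type: integer
        minimum: 1
paths:
  /chainLinks:
    get:
      parameters:
        # Reference the shared definition instead of inlining it.
        - $ref: '#/components/parameters/limit'
      responses:
        '200':
          description: OK.
```

Any other operation that accepts the same query parameter can use the same one-line reference.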

With regard to problem responses, OpenAPI also provides a way to define one response object for all 4XX level (or all 5XX level) responses for an operation that are not already explicitly defined.

Refactoring an OpenAPI Document to Use `components`

Let’s refactor our API document to use OpenAPI components correctly. We will do this in stages instead of one huge rewrite, since incremental refactorings are easier to verify. (Unfortunately, I don’t know of IDE tools which support such OpenAPI refactorings. This is a great opportunity for a clever toolsmith…)

First, let’s extract the JSON Schema definition for our RFC 7807 application/problem+json responses. To do this, we need a name for the schema so we can reference the schema by name where needed. RFC 7807 does not specify a standard name, so we’ll pick something descriptive: apiProblem.

OpenAPI’s components object defines a place to put all reusable schemas: inside the nested schemas object. (The components object sits at the top level of the OpenAPI document, as a sibling of the info and paths and other top-level objects.) The key of each item in the /components/schemas object is the name of that schema:

components:
  schemas:
    apiProblem:
      title: API Problem
      description: >-
        API problem response, as per
        [RFC 7807](https://tools.ietf.org/html/rfc7807).
      type: object
      properties:
        # ... properties as listed above

Next, we’ll reference that schema by name instead of making in-line copies. To reference a reusable component, OpenAPI uses a reference object with $ref as the key and a value that is a URI identifying the referenced element’s location. Prior to OpenAPI 3.1, this was a (restrictive) JSON Reference object, but in 3.1 the specification loosened this to allow reference objects to also include description and summary values. We’ll see how that benefits us below.

For example, the reference to our apiProblem schema is

$ref: '#/components/schemas/apiProblem'

(The path must be quoted in the YAML format because the # character is interpreted as a comment character when not in a string.)

This path is the fragment portion of a relative URI. The # marks the beginning of the fragment, and the rest are path elements: the components object, its nested schemas object, and the schema named apiProblem. In general, the fragment is interpreted as an RFC 6901 JSON Pointer.
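
Because the fragment is a JSON Pointer, a key that itself contains a / must be escaped: RFC 6901 writes a literal / as ~1 and a literal ~ as ~0. The sketch below is for illustration only. It points directly into the paths object rather than at a named component, which is not the approach this article recommends, and tool support for such references may vary.

```yaml
# Illustration only: a pointer into `paths` rather than `components`.
# The key "/chainLinks" contains a "/", so it is written "~1chainLinks".
$ref: '#/paths/~1chainLinks/get/responses/400'
```

Named components avoid this escaping entirely, because component names cannot contain a / character.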

If the $ref location starts with “#”, the referenced component is resolved from the same source document. The $ref may also start with a document URI or a (relative) file reference, such as

$ref: '../common/openapi.yaml#/components/schemas/apiProblem'

to reference API elements from a library of reusable components in another file.
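
As a rough sketch, the shared file behind that reference might look like the following. The title and version values are assumptions for illustration, and the empty paths object is there only so the file stays a valid OpenAPI document under both 3.0 and 3.1 even though it defines no operations.

```yaml
# ../common/openapi.yaml (hypothetical shared library document)
openapi: 3.1.0
info:
  title: Common API Components   # assumed title
  version: 1.0.0                 # assumed version
paths: {}                        # no operations; this file only holds components
components:
  schemas:
    apiProblem:
      title: API Problem
      type: object
      # ... properties as in the apiProblem schema above
```

The rest of this article keeps the schema in the same document and uses the #-only form of the reference.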

This leaves us with the following somewhat shorter definition of the 400 and 422 responses for the listChainLinks operation:

paths:
  /chainLinks:
    get:
      responses:
        '200': ... as seen above ...
        '400':
          description: Bad Request.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/apiProblem'
        '422':
          description: Unprocessable Entity.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/apiProblem'

Note that this more concise code is much easier to understand because all the repetitive schema definitions are referenced rather than copied. But we can do even better!

To allow multiple operations (listChainLinks as shown above, but also listAuthors, listChains, listUniverses, listCharacters, etc.) to share the same responses, you can put reusable response objects in the #/components/responses object and reference them. For example:

components:
  responses:
    '400':
      description: Bad Request.
      content:
        application/problem+json:
          schema:
            $ref: '#/components/schemas/apiProblem'
    '422':
      description: Unprocessable Entity.
      content:
        application/problem+json:
          schema:
            $ref: '#/components/schemas/apiProblem'

Now our operation can reference those common response objects and be even more concise (and more consistent):

paths:
  /chainLinks:
    get:
      responses:
        '200': ... as seen above ...
        '400':
          $ref: '#/components/responses/400'
        '422': 
          $ref: '#/components/responses/422'

  /characters:
    get:
      responses:
        '200': ... as seen above ...
        '400':
          $ref: '#/components/responses/400'
        '422': 
          $ref: '#/components/responses/422'
  # ... and so on for other operations

Default/Wildcard Response Objects

You can explicitly list different responses, as shown above. OpenAPI also provides a shortcut: a “wildcard” 4XX response defines the response format for all 4xx-level response codes not explicitly listed in an operation’s responses object. The example below also uses the OpenAPI 3.1 ability to place a description next to a $ref, which lets each operation tailor the description of a shared response:

paths:
  /chainLinks:
    get:
      responses:
        '200': ... as seen above ...
        '400':
          description: Bad Request. Invalid query parameters for the
            listChainLinks operation.
          $ref: '#/components/responses/400'
        '422':
          description: Unprocessable Entity. The listChainLinks request
            was syntactically correct but could not be processed due to
            unsupported combinations or values.
          $ref: '#/components/responses/422'
        '4XX':
          $ref: '#/components/responses/4XX'
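
The example above references #/components/responses/4XX, which has not been defined yet. A minimal sketch of that component, assuming it follows the same shape as the 400 and 422 components (the description text is an assumption):

```yaml
components:
  responses:
    '4XX':
      # Assumed wording; describes any client error not listed explicitly.
      description: Client error. The request could not be processed.
      content:
        application/problem+json:
          schema:
            $ref: '#/components/schemas/apiProblem'
```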

Rather than using only the 4XX wildcard to cover all client request problems, I recommend defining explicit response objects, each with a description, for the most common problems a developer may face. This allows better API documentation and sets expectations for how the client should handle the most common API problems.

OpenAPI also lets you define a default response object for 5xx-level HTTP response codes that are not explicitly defined:

        # ...
        '5XX':
          description: Server error.
          content:
            application/problem+json:
              schema:
                $ref: '#/components/schemas/apiProblem'

Reusable Response Body Schemas

As I discussed in What Am I Getting Out of This?, the listChainLinks operation returns a JSON object containing a page of chain link items. The initial implementation put that schema directly in the operation’s responses object. However, it is helpful to lift such response schemas into the schemas object within components and reference them, for two reasons:

You are more likely to reuse that schema elsewhere in the API.
Providing a named schema helps when generating Software Development Kits (SDKs) for your API consumers. SDK generators [2] can use the schema names as the names for programming language constructs such as types, classes, or interfaces. If you use “anonymous” (unnamed) schemas, the generated symbols are fabricated, gangly, and harder to use. Named schemas provide a direct association between the OpenAPI document and the SDK, resulting in a better Developer Experience.
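
A sketch of what lifting the response body into a named schema could look like. The schema names chainLink and chainLinks and the items property are assumptions for illustration; the schema defined earlier in this series may differ.

```yaml
# Hypothetical names: `chainLink`, `chainLinks`, and the `items` property
# are illustrative assumptions, not the series' actual definitions.
components:
  schemas:
    chainLink:
      title: Chain Link
      type: object
      # ... properties of a single chain link
    chainLinks:
      title: Chain Links Page
      description: One page of chain link items.
      type: object
      properties:
        items:
          type: array
          items:
            $ref: '#/components/schemas/chainLink'
paths:
  /chainLinks:
    get:
      operationId: listChainLinks
      responses:
        '200':
          description: OK. A page of chain links.
          content:
            application/json:
              schema:
                # The named schema replaces the inline definition.
                $ref: '#/components/schemas/chainLinks'
```

An SDK generator reading this document has the names chainLink and chainLinks available for the types it emits, rather than having to invent names for inline schemas.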

Observations

Much literature exists on the benefits of the DRY principle. Here are my primary reasons to apply DRY (modularization and componentization) to API design with OpenAPI:

DRY API designs lead to better API design consistency.
DRY API definitions are more concise and therefore easier to construct, easier to read and understand, easier to review (as part of an API governance program), and easier to maintain.
Named reusable schemas lead to well-named SDK constructs like types, classes, and interfaces.
Less copy and paste results in fewer defects.
By making reuse of API elements convenient for the API designer, the structure of the OpenAPI Specification actively encourages good API design. That is, you are more likely to reuse components simply because it is easy to do so. It’s one of those virtuous cycle things: through component references, you can establish rich and powerful patterns of API design.
Additionally, by removing API element duplication, you eliminate the need to read between the lines or wonder “Is this instance of the response schema different from that one, and if so, how do they differ and why do they differ?” (Can you tell if or how the 400 response differs from the 422 response in this example?) Remember, you are not the only one who will be reading the OpenAPI documents. Future maintainers will as well.
To me, the most important benefit of moving key API elements (schemas, responses, parameters, etc.) into components is that the intent and design structure of the API becomes much more evident—it is not buried in all the syntax chrome that is otherwise required to define all those elements without components.

The remainder of The Language of API Design will employ the components object for these reasons. So stick around, we'll be right back.

[1] Otherwise known as the DRY principle.

[2] SwaggerHub, openapi-generator, APIMatic, Speakeasy, and others.