Normalizing JSON Data from REST APIs

TL;DR:

{
  items: [1, 2, ...otherUsersIds],
  users: {
    '1' : {
      content: {
        id: 1,
        name: 'Popol',
        organization_id: 1
      },
      avatarURL: '//img.ur/trololol.png'
    },
    ...otherUsers
  },
  organisations: {
    '1': {
      content: {
        id: 1,
        name: 'Popol inc.',
        owner_id: 1
      }
    }
  }
}
  

I have long wanted to write this post about how I design the JSON data of the REST APIs I build, if only as a reference to give to people who ask me why I'm doing it that way.

If you didn't ragequit right after reading the TL;DR, then here is everything you have to know to understand the reasons behind those choices.

Normalization

This is not about structuring or standardization, but really about normalization as you would apply it to a relational database, except that here we are normalizing a single JSON document.

As you may know, a lot of APIs embed additional resources in a given resource representation, mainly the ones it is related to. For instance, a GET /users/:userId endpoint could embed the user's organization representation.

One could say that this is a sign you need to use GraphQL. I won't be that categorical. I think it is convenient to add some related data to your JSON representations. In fact, RESTful principles allow several representations of the same resource.

But a common mistake when doing so is to add the linked resource to the originating one as one of its properties. This leads to content duplication: if two users belong to the same organization, it will be embedded twice.
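To make the duplication concrete, here is a minimal sketch of how an embedded payload could be turned into the structure from the TL;DR. Everything in it is an assumption made for illustration: the `embeddedUsers` payload, the second user named 'Jane', the embedded `organization` property, and the `normalizeUsers` helper are hypothetical names, not part of any real API.

```javascript
// Hypothetical response where each user embeds its organization as a property.
// The same organization is therefore serialized twice.
const embeddedUsers = [
  { id: 1, name: 'Popol', organization: { id: 1, name: 'Popol inc.', owner_id: 1 } },
  { id: 2, name: 'Jane', organization: { id: 1, name: 'Popol inc.', owner_id: 1 } },
];

// Move every embedded organization into a map keyed by id and keep only
// a foreign key (organization_id) on the user itself.
function normalizeUsers(users) {
  const normalized = { items: [], users: {}, organisations: {} };
  for (const { organization, ...user } of users) {
    normalized.items.push(user.id);
    normalized.users[user.id] = {
      content: { ...user, organization_id: organization.id },
    };
    // Writing to the same key twice keeps a single copy of the organization.
    normalized.organisations[organization.id] = { content: organization };
  }
  return normalized;
}

const normalized = normalizeUsers(embeddedUsers);
console.log(Object.keys(normalized.organisations).length); // 1
```

Two users went in, each carrying its own copy of the organization, and only one organization entry comes out.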

You should now better understand this post's TL;DR. The JSON structure I use avoids duplication of linked resources.

You'll also notice that the collection items aren't put directly in the corresponding array; only identifiers are there. The reason is to allow collections with repeated items. For instance, a GET /usersQueue endpoint could have the same user twice in the queue, and the JSON format I use allows that without duplicating the user. Finally, if a user owns the organization in the above example, you can easily find that user from the organization's owner identifier.
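On the client side, reading such a payload could look like the sketch below. The `payload` contents (including the second user 'Jane' and the exact shape of a GET /usersQueue response) and the `findOwner` helper are assumptions made up for this example.

```javascript
// Hypothetical normalized payload for GET /usersQueue: user 1 is queued twice,
// but its representation appears only once in the users map.
const payload = {
  items: [1, 2, 1],
  users: {
    '1': { content: { id: 1, name: 'Popol', organization_id: 1 } },
    '2': { content: { id: 2, name: 'Jane', organization_id: 1 } },
  },
  organisations: {
    '1': { content: { id: 1, name: 'Popol inc.', owner_id: 1 } },
  },
};

// Rebuild the ordered collection by looking each identifier up in the map.
const queue = payload.items.map((id) => payload.users[id].content);

// Follow owner_id from an organization back to the user who owns it.
function findOwner(data, organizationId) {
  const organization = data.organisations[organizationId].content;
  return data.users[organization.owner_id].content;
}

console.log(queue.map((user) => user.name)); // [ 'Popol', 'Jane', 'Popol' ]
console.log(findOwner(payload, 1).name); // Popol
```

Both entries for user 1 in the rebuilt queue point to the same object, so the repetition costs one extra identifier rather than a second copy of the user.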

You may wonder why I did not use JSON Reference. There are 3 reasons for it:
