Dynamic Depth Serialization in django REST Framework

I think it's fair to claim that django is the most used tool in my professional and private development. Whether it's for small prototypes or massive projects, if there is a web interface and a database involved, chances are that I'll choose django to develop that project.

Practically a requirement when building an app these days is the ability to also provide a REST API for interacting with the data programmatically. This is where django's ecosystem of packages and modules shines. One such package is Django REST framework. Since Django REST framework is quite a mouthful, I'll be referring to it by its abbreviation DRF from now on.

DRF allows you to expose a REST API based on your django data models with very little code. To get a basic REST API that supports Create-Read-Update-Delete operations on your data you have to do two things:

  1. You specify a serializer that tells DRF how to go from your model to a JSON representation.
  2. You specify a viewset that tells DRF what actions (creating a new object, listing objects, deleting objects) can be used. In the words of the Ruby on Rails documentation linked to by the DRF docs, "your [...] [viewset] is responsible for making sense of the request and producing the appropriate output".

And that is about it. DRF then takes care of routing your API requests, has built-in support for various authentication schemes (like token-based authentication) and supports more advanced patterns such as pagination out of the box.

One problem with this approach, however, is that you are doing pretty much exactly what was said above: you expose your data model via a REST API. This isn't necessarily the best API design, and REST has its own design problems.

One of these problems is underfetching. Underfetching means that your front end or API client needs different (or more) data than is available from a single API endpoint. A common way to solve this problem is to design your API endpoints so that they provide exactly the information your front end needs. And here you probably see the problem with DRF: it exposes exactly your django models, and your django models rarely match exactly what you need in your front end.

To illustrate, let's have a look at a very simple group model. We have a User model that stores some information on a user and a Group model that stores the name of a group as well as a list of members of this group. I assume that you are familiar with django and the django REST framework here. If you are not, this is a great starting guide on django and this is the quickstart for DRF. The code for this example application is available on my GitHub here.

The entities/models.py file for this might look like this:

import uuid
from django.db import models

class User(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    first_name = models.CharField(max_length=200)
    last_name = models.CharField(max_length=200)
    mail = models.EmailField()

    def __str__(self):
        return f"{self.first_name} {self.last_name}({self.mail})"

class Group(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=200)
    members = models.ManyToManyField(User)

    def __str__(self):
        # count() lets the database do the counting instead of loading every member
        return f"{self.name} ({self.members.count()} member)"

Implementing a normal serializer (in entities/serializers.py) like this ...

from entities.models import User, Group
from rest_framework import serializers

class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'first_name', 'last_name', 'mail']

class GroupSerializer(serializers.ModelSerializer):
    class Meta:
        model = Group
        fields = ['id', 'name', 'members']

... a normal view set (in entities/api_views.py) like this ...

from entities.models import User, Group
from entities.serializers import UserSerializer, GroupSerializer
from rest_framework import viewsets

class UserViewSet(viewsets.ModelViewSet):
    queryset = User.objects.all()
    serializer_class = UserSerializer

class GroupViewSet(viewsets.ModelViewSet):
    queryset = Group.objects.all()
    serializer_class = GroupSerializer

... and wiring them all up to be routed properly by the router (in urls.py) like this ...

from django.contrib import admin
from django.urls import path, include

from rest_framework import routers

from entities import api_views as ApiViews

router = routers.DefaultRouter()
# The prefix must not end in a slash; the router appends the trailing slash itself
router.register(r'groups', ApiViews.GroupViewSet)
router.register(r'users', ApiViews.UserViewSet)

urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/v0/', include(router.urls)),
]

... a GET request to list all groups (GET /api/v0/groups) would result in the following JSON:

[
    {
        "id": "2068130f-35de-4698-8b96-1505a79fa7ee",
        "name": "Friends of the Python Language",
        "members": [
            "bfbad58d-0123-4991-8fd9-0c9dd0b18cc4",
            "f89dd82c-9a97-47fa-9b45-e22c14f00c3e"
        ]
    }
]

So if, in our front end, we now wanted to get the names of all the people that are a member of this group, we'd have to iterate over all the ids in the members field and make a GET request to /api/v0/users/<id> for each of them. This is also referred to as the (n+1) problem: one request for the group, plus n requests to the /users endpoint to resolve the names of all n members. While this is doable for small groups, it's still a hassle both from a load perspective and from a front-end perspective. Also remember that every round trip you take to the server increases the likelihood of encountering a network error.
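
To make the request count concrete, here is a minimal sketch that replaces the real API with an in-memory stand-in. The FakeApi class and the member_names helper are hypothetical names I made up for illustration; the ids and names are the ones used in the example responses in this post.

```python
class FakeApi:
    """Hypothetical in-memory stand-in for the REST API that counts requests."""

    def __init__(self):
        self.request_count = 0
        self._users = {
            "bfbad58d-0123-4991-8fd9-0c9dd0b18cc4": {
                "first_name": "Marcel", "last_name": "Neidinger",
            },
            "f89dd82c-9a97-47fa-9b45-e22c14f00c3e": {
                "first_name": "Max", "last_name": "Mustermann",
            },
        }
        self._groups = [
            {
                "id": "2068130f-35de-4698-8b96-1505a79fa7ee",
                "name": "Friends of the Python Language",
                "members": list(self._users),
            }
        ]

    def get(self, path):
        """Return the payload for a path and count the call as one round trip."""
        self.request_count += 1
        if path == "/api/v0/groups":
            return self._groups
        prefix = "/api/v0/users/"
        if path.startswith(prefix):
            return self._users[path[len(prefix):]]
        raise KeyError(path)


def member_names(api):
    """Resolve the member names of the first group the way a naive client would."""
    group = api.get("/api/v0/groups")[0]  # 1 request for the group list
    names = []
    for user_id in group["members"]:  # n further requests, one per member
        user = api.get(f"/api/v0/users/{user_id}")
        names.append(f"{user['first_name']} {user['last_name']}")
    return names


api = FakeApi()
names = member_names(api)
print(names, api.request_count)
```

For a group with two members the client ends up making three round trips, and the count grows linearly with the size of the group.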

What we could do is specify a depth option in the Meta class of our serializer, which tells DRF how many levels of relationships to expand into nested representations. We alter the serializer for our Group to look like this:

class GroupSerializer(serializers.ModelSerializer):
    class Meta:
        model = Group
        depth = 1  # expand related objects one level deep
        fields = ['id', 'name', 'members']

The returned JSON now looks like this:

[    
    {
        "id": "2068130f-35de-4698-8b96-1505a79fa7ee",
        "name": "Friends of the Python Language",
        "members": [
            {
                "id": "bfbad58d-0123-4991-8fd9-0c9dd0b18cc4",
                "first_name": "Marcel",
                "last_name": "Neidinger",
                "mail": "marcel@mail.com"
            },
            {
                "id": "f89dd82c-9a97-47fa-9b45-e22c14f00c3e",
                "first_name": "Max",
                "last_name": "Mustermann",
                "mail": "max@mail.com"
            }
        ]
    }
]

As you can see, the members have been extended to include the full JSON representation for the associated object.

However, this way we'd always get all of that information. If we just wanted to fetch a list of all the groups and their names, our payload would still include all the unnecessary information for every member. This is an example of overfetching. Simply speaking, overfetching is the opposite of underfetching: the API returns more properties than we actually need.

As mentioned before, one solution would be to build different API endpoints for the different scenarios. For example, you could build a /api/v0/groups endpoint that just displays the names and then have a /api/v0/group-details endpoint that lists all the groups including their member details. However, this takes away the simplicity of DRF, so an alternative is a dynamic depth parameter.

With this query parameter we can dynamically choose the depth that we want to receive. This way, our front end can send a request to /api/v0/groups whenever it just wants the names of the groups and use /api/v0/groups?depth=1 whenever it also needs all user information. The latter call sets the serialization depth to 1 and thus, similar to what we saw when changing that setting in the serializer itself, gives us all the user details in the response.
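
The behaviour we are after can be sketched in plain Python, independent of DRF. The serialize_group helper and the USERS lookup table below are hypothetical and only illustrate what the depth query parameter is supposed to switch between; this is not how DRF implements nesting internally.

```python
# Hypothetical lookup table standing in for the User table.
USERS = {
    "bfbad58d-0123-4991-8fd9-0c9dd0b18cc4": {
        "first_name": "Marcel", "last_name": "Neidinger", "mail": "marcel@mail.com",
    },
    "f89dd82c-9a97-47fa-9b45-e22c14f00c3e": {
        "first_name": "Max", "last_name": "Mustermann", "mail": "max@mail.com",
    },
}

GROUP = {
    "id": "2068130f-35de-4698-8b96-1505a79fa7ee",
    "name": "Friends of the Python Language",
    "members": list(USERS),
}


def serialize_group(group, depth=0):
    """Return members as plain ids (depth 0) or as nested objects (depth >= 1)."""
    if depth >= 1:
        members = [{"id": user_id, **USERS[user_id]} for user_id in group["members"]]
    else:
        members = list(group["members"])
    return {"id": group["id"], "name": group["name"], "members": members}


shallow = serialize_group(GROUP)          # what /api/v0/groups should return
nested = serialize_group(GROUP, depth=1)  # what /api/v0/groups?depth=1 should return
print(shallow["members"][0])
print(nested["members"][0]["first_name"])
```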

This feature has even been requested for the core code base of DRF, but the request was declined. So let's see how you can build your own dynamic depth serializer in DRF. The solution proposed here is partially based upon this Stack Overflow discussion and a few other places that mentioned how to implement such a serializer.

We first need to specify our own custom serializer class. This custom DynamicDepthSerializer class overrides the default __init__ method, retrieves a depth parameter from its context and writes it into the Meta information that the parent ModelSerializer class reads. If no depth parameter is present we set a default depth of 0. We then have the serializers for our User and Group models inherit from DynamicDepthSerializer instead of the DRF-provided ModelSerializer.

from entities.models import User, Group
from rest_framework import serializers

class DynamicDepthSerializer(serializers.ModelSerializer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.Meta.depth = self.context.get('depth', 0)

class UserSerializer(DynamicDepthSerializer):
    class Meta:
        model = User
        fields = ['id', 'first_name', 'last_name', 'mail']

class GroupSerializer(DynamicDepthSerializer):
    class Meta:
        model = Group
        fields = ['id', 'name', 'members']
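
One thing to keep in mind with this approach: Meta is a class attribute, so assigning to self.Meta.depth changes the value for every instance of that serializer class, not just the current one. The following plain-Python sketch (no DRF involved, SketchSerializer is a made-up name) shows the effect. Since the value is overwritten each time a serializer is constructed, this is usually harmless when requests are handled one after another, but it may be worth remembering if serializers are created concurrently or kept around.

```python
class SketchSerializer:
    """Hypothetical stand-in for a serializer with an inner Meta class."""

    class Meta:
        depth = 0

    def __init__(self, context=None):
        context = context or {}
        # Same assignment as in DynamicDepthSerializer: this writes to the
        # shared Meta class, not to this particular instance.
        self.Meta.depth = context.get("depth", 0)


first = SketchSerializer(context={"depth": 1})
depth_after_first = first.Meta.depth  # 1, as requested

second = SketchSerializer()  # no depth in the context, so the default 0 is written
depth_after_second = first.Meta.depth  # the first instance now sees 0 as well

print(depth_after_first, depth_after_second)
```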

With our serializer done, we need a way of changing the context of our serializer based on a query parameter, since the context is where our DynamicDepthSerializer class takes the depth parameter from. We can do this in our view set. To do so we'll define a custom DynamicDepthViewSet that inherits from ModelViewSet and overrides the get_serializer_context() method to add the depth parameter from the request's query parameters to the serializer context. We then make the view sets for our User and Group inherit from DynamicDepthViewSet instead of the DRF-provided ModelViewSet.

from entities.models import User, Group
from entities.serializers import UserSerializer, GroupSerializer
from rest_framework import viewsets

class DynamicDepthViewSet(viewsets.ModelViewSet):
    def get_serializer_context(self):
        context = super().get_serializer_context()
        depth = 0
        try:
            depth = int(self.request.query_params.get('depth', 0))
        except ValueError:
            pass # Ignore non-numeric parameters and keep default 0 depth

        context['depth'] = depth

        return context

class UserViewSet(DynamicDepthViewSet):
    queryset = User.objects.all()
    serializer_class = UserSerializer

class GroupViewSet(DynamicDepthViewSet):
    queryset = Group.objects.all()
    serializer_class = GroupSerializer
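
As far as I can tell, DRF's ModelSerializer asserts that depth lies between 0 and 10, so a request such as ?depth=-1 or ?depth=50 would trip that assertion with the view set above. A possible variation is to move the parsing into a small helper that also clamps the value. parse_depth and its max_depth argument are names I made up for this sketch; the default cap of 10 mirrors DRF's upper bound, though in practice you probably want a much lower one.

```python
def parse_depth(raw, default=0, max_depth=10):
    """Turn a raw query parameter into a depth between 0 and max_depth.

    Missing or non-numeric values fall back to the default.
    """
    try:
        depth = int(raw)
    except (TypeError, ValueError):
        # None (parameter missing) raises TypeError, "abc" raises ValueError
        return default
    return max(0, min(depth, max_depth))


print(parse_depth("1"), parse_depth(None), parse_depth("abc"))
print(parse_depth("-3"), parse_depth("50"), parse_depth("5", max_depth=2))
```

Inside get_serializer_context() the try/except could then shrink to a single line: context['depth'] = parse_depth(self.request.query_params.get('depth')).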

There is no need to change the router and thus we can now make a GET request to /api/v0/groups?depth=1 and see the result of our custom serialization and view set.

Result of request in Postman

Omitting the depth parameter and sending a GET request to /api/v0/groups will then result in a response that doesn't include all the extended information on the user objects.

Result of request in Postman

Pretty neat. I hope you enjoyed this little excursion into django, DRF and building REST APIs.