HotstarX - Part 3: The BFF baseplate

In the previous instalment, we wrote about our tenets for building widgets and how the server-vended widget response allows the client apps to paint a delightful UX.

In this post, we’ll unravel our server-side architecture, which gives us:

  1. 🏃 Mad Agility – An agile platform that allows 200+ engineers to concurrently build and manage hundreds of widgets within the set guard-rails of performance and quality.
  2. ⚙️ Mad Flexibility – Dynamically change page layouts, control the widgets in a page, their order and other widget properties like orientation, styling, etc.
  3. 🚀 Performance – A performant approach to fetching data and hydrating widgets per form factor.

BFF (Backends For Frontends) is a proven pattern that works well. However, prior work on this pattern had surfaced some distinct challenges that we wanted to address:

  1. Logic duplication: When every team has discretion over how to compose data from the underlying domain APIs, implementations diverge. E.g. images can be pulled from the source of truth or from the personalization layer. Governance is needed!
  2. Sub-optimal data aggregation: Multiple customer widgets, all asking for similar or shared data and built by multiple teams, result in chatty, duplicate calls. These calls need to be aggregated based on a data dependency graph to keep things lean and clean.
  3. Operationally expensive: A common orchestration framework that harmonizes all of this adds operational complexity. Every BFF team would need to spend effort maintaining it and keeping it performant.
  4. Build for evolution: We’d like to re-use our BFFs without much fuss in future iterations, which requires thoughtful segregation of business logic. Again, given multiple teams building simultaneously, how does one tackle it? 🤔

A customized BFF was the need of the hour. Step 1 was to make our data APIs authoritative (by domain), with clear lines of ownership and no awareness of the UX that consumes them.

The widget orchestration piece was moved centrally under a single team. This would provide a consistent page architecture and manage cross-cutting concerns of performance and operational excellence uniformly.

Here’s what our high-level architecture looks like.

BFF architecture at Hotstar

Let’s dive into the details of each logical component:

📺 Display Data Services (DDS)

When we peeled back the layers of logic in our legacy client apps, we recognized that they were often producing new “presentation entities”.

Two flavors emerged:

  • Presentation-Aggregate-Entity: Joins existing domain entities (e.g. joining user, subscription and playback data to decide which playback URLs to vend out).
  • Enriched-Entity: Overrides the domain entities with richer business logic like personalization (e.g. artwork personalization) or marketplace/feature-specific rules (e.g. content age-rating filtering when kids mode is turned on).
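
To make the two flavors concrete, here is a minimal Python sketch. It is an illustration only, not Hotstar’s implementation: the `Content` and `PlaybackInfo` types, the plan ranking, the `min_age` field and the `kids_max_age` threshold are all hypothetical names and rules chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional

# Hypothetical plan ordering, assumed for this example only.
PLAN_RANK = {"free": 0, "premium": 1}


@dataclass(frozen=True)
class Content:
    content_id: str
    image_url: str  # default artwork from the domain source of truth
    min_age: int    # hypothetical age-rating field


@dataclass(frozen=True)
class PlaybackInfo:
    content_id: str
    playback_url: Optional[str]  # None when the user is not entitled


def aggregate_playback(user_plan: str, content_plan: str,
                       playback_url: str, content_id: str) -> PlaybackInfo:
    """Presentation-Aggregate-Entity: join subscription and playback data."""
    entitled = PLAN_RANK[user_plan] >= PLAN_RANK[content_plan]
    return PlaybackInfo(content_id, playback_url if entitled else None)


def enrich_catalog(items: List[Content], personalized_images: Dict[str, str],
                   kids_mode: bool, kids_max_age: int = 7) -> List[Content]:
    """Enriched-Entity: apply kids-mode filtering, then override artwork."""
    visible = [c for c in items if not kids_mode or c.min_age <= kids_max_age]
    return [
        # Fall back to the domain artwork when no personalized one exists.
        replace(c, image_url=personalized_images.get(c.content_id, c.image_url))
        for c in visible
    ]
```

In both cases the caller receives an entity that is ready for display and never sees the join or override rules behind it.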

We aligned on the fact that these are first-class entities that must be owned and mastered on the server side rather than being scattered in the BFF layer under a shared ownership model. This is not a new idea, and simply piggy-backs on the notion of aggregates and entities in Domain-Driven Design. We branded this set of services as Display-Data-Services (DDS).

✨ DDS — Abstract the magic

Simply put, if someone were to look for contentTitle, they should be able to fetch it from the Content-DDS (CDDS). CDDS internally would own the logic to fetch and build the most relevant title object on the intersection of contentType, personaRecommendations, cmsTitle, etc.

This also meant that a lot of complex orchestration would move into the DDS layers and could be managed by a handful of teams, much like other micro-services in the ecosystem.

It was reassuring that, around the time we made this decision, SoundCloud (the original pioneers of the BFF pattern) had made similar observations after years of operating their BFF stack and had landed on a solution similar to ours. Their own write-up covers this in more detail (their VAS layer is analogous to our DDS layer).

All the presentation needed binding! The missing layer now was a set of modules that’d fetch display data from these DDS services and map it to the widget data object. For example, the TrayWidget would ask for a list and, for each item, would then recursively ask for contentTitle, contentImage, trayTitle, etc.

Once the data objects were fetched, this mapper layer would take those response objects, parse them and set them in the widget proto response object. We refer to this layer as the Binders.

Binders are also the layer where UX concerns such as language localization, feature flags (whether to show a certain feature in a given request context or not) and A/B experimentation get handled.
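
As a rough illustration of what a binder does, the Python sketch below maps a DDS-style tray response onto a widget data object while applying localization and a feature flag. Everything here is assumed for the example: the dict shape of the DDS response, the `RequestContext` fields, the `watchlist_button` flag and the English fallback. Plain dataclasses stand in for the widget proto objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class RequestContext:
    language: str
    enabled_flags: Set[str] = field(default_factory=set)


@dataclass
class TrayItem:
    title: str
    image: str


@dataclass
class TrayWidget:
    tray_title: str
    items: List[TrayItem]
    show_watchlist_button: bool


def bind_tray(dds_tray: dict, ctx: RequestContext) -> TrayWidget:
    """Map display data onto the widget object for this request context."""

    def localized(titles: Dict[str, str]) -> str:
        # Assumption: every title carries an "en" entry to fall back on.
        return titles.get(ctx.language, titles["en"])

    items = [
        TrayItem(title=localized(item["contentTitle"]), image=item["contentImage"])
        for item in dds_tray["items"]
    ]
    return TrayWidget(
        tray_title=localized(dds_tray["trayTitle"]),
        items=items,
        # Feature flag decides whether this UX element is shown at all.
        show_watchlist_button="watchlist_button" in ctx.enabled_flags,
    )
```

The binder holds no fetching logic of its own; it only transforms data it is handed, which is what keeps it a lean data-mapper.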

Together, Binders and DDS resemble the conventional View-Model layer in the MVVM architecture. This layer is responsible for extracting the domain data (Model), applying business logic and UX-centric transforms to it, and then returning an object model that the client (View) can consume.

🎵 Binders Runtime and Orchestrator

Time to make music! We had to decide on the strategy to host and run these binders. Given that binders are lean data-mappers, it didn’t make sense for them to be managed as independent micro-services.

Plus, efficient data scatter-gather was possible only if the binders ran in a shared runtime where a central execution framework owned their orchestration.

👊 Enter PageCompositor

We introduced a new component, PageCompositor, that’d be responsible for parsing the incoming request, deciding which widgets to render in the given request context by consulting a layout manager (discussed later), and then firing off each widget construction in parallel. The widget binders would be hosted as plugins inside the compositor.

  1. Each widget would declaratively describe what data it needed.
  2. PageCompositor would then fetch those data-sets in the most optimal fashion.
  3. Once the data-sets were fetched, we’d pass the data on to the respective widget binders, which would perform the data mappings and UX transforms and return the widget data objects.
  4. PageCompositor would then iterate over these widget responses and compose the final page response for the client.
Sequence Diagram for GetPage flow
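
The four steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual PageCompositor: `WidgetBinder`, `compose_page` and the fetcher map are hypothetical names, data needs are flat string keys rather than a full dependency graph, and binders run one after another once the concurrent fetches complete.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, FrozenSet, List

Fetcher = Callable[[], Awaitable[object]]


@dataclass(frozen=True)
class WidgetBinder:
    name: str
    needs: FrozenSet[str]  # step 1: declarative data requirements
    bind: Callable[[Dict[str, object]], Dict[str, object]]


async def compose_page(binders: List[WidgetBinder],
                       fetchers: Dict[str, Fetcher]) -> List[Dict[str, object]]:
    # Step 2: take the union of all declared needs so that shared data
    # is fetched once, and run the fetches concurrently.
    needed = sorted(set().union(*(b.needs for b in binders)))
    results = await asyncio.gather(*(fetchers[key]() for key in needed))
    data = dict(zip(needed, results))

    # Step 3: hand each binder only the data it declared.
    # Step 4: the ordered list of widget objects is the page response.
    return [b.bind({key: data[key] for key in b.needs}) for b in binders]
```

Because the compositor sees every widget’s declared needs up front, two widgets asking for the same data-set trigger a single upstream call, which is the property a shared runtime makes possible.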

The last missing piece of the puzzle was some form of centralized API lifecycle manager, covering concerns like authentication, authorization, enrichment of request context, rate-limiting, etc.

Instead of replicating these concerns across various layers in the stack, we decided to pull them forward into our API gateway. We use Ambassador as the API gateway to our Kubernetes (K8s) clusters and, by writing custom Envoy plugins, we were able to handle these cross-cutting concerns in one single layer.

Our server-side architecture was now ready to return a page response with a set of configured widgets, while each widget was being independently built and managed.

However, some questions remained:

  • What should inform the PageCompositor about which widgets to return for a given request?
  • How do we get the ability to change the order of widgets, change the look and feel of a widget, or even drop some widgets from the page, all from the server?
  • How do we A/B test new widget templates and roll out newer versions to updated clients that can support those templates?

Enter – Layout Service

Layout Service is our control plane for managing page and widget configurations. It exposes an API endpoint through which all the widgets are registered (as mastered in the widget_registry).

An operator can then manage the page configurations in the Layout Service Dashboard. This provides a single pane of glass to view and modify the choice of widgets on each page.

It also allows for different page configurations for varying request properties. For example, in some regions the homepage for the Kids cohort looks very different from the default homepage for adults, both in terms of the page layout and the choice of widgets on the page.
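
One way to picture this is as a lookup from request properties to an ordered widget list. The Python sketch below is hypothetical: `PageConfig`, `resolve_layout` and the "most specific match wins" rule are assumptions made for illustration, and the source does not describe how Layout Service actually stores or matches configurations.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PageConfig:
    page: str
    widgets: Tuple[str, ...]  # ordered widget names for the page
    # Request properties this config applies to; empty means "default".
    match: Dict[str, str] = field(default_factory=dict)


def resolve_layout(configs: List[PageConfig], page: str,
                   request_props: Dict[str, str]) -> Tuple[str, ...]:
    """Return the widget order of the most specific matching config."""
    candidates = [
        c for c in configs
        if c.page == page
        and all(request_props.get(k) == v for k, v in c.match.items())
    ]
    if not candidates:
        raise LookupError(f"no layout configured for page {page!r}")
    # Assumed rule: the config matching the most properties wins.
    return max(candidates, key=lambda c: len(c.match)).widgets
```

With a default homepage config and a second one matching `{"cohort": "kids", "region": "IN"}`, a kids request in that region gets the kids layout and every other request falls back to the default.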

Layout Service, in conjunction with the widget_registry, manages the deprecation and promotion of newer widget templates. Given the incoming client version in the request, Layout Service is able to influence PageCompositor on which widget version (v.X or v.Y) to return.
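
A minimal Python sketch of such version gating is shown below. The registry shape, the `min_client_version` field and the rule "pick the newest non-deprecated template the client supports" are assumptions for illustration, not the documented behaviour of the widget_registry.

```python
from dataclasses import dataclass
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "5.10" into (5, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class TemplateVersion:
    widget: str
    template_version: int
    min_client_version: str  # oldest client able to render this template
    deprecated: bool = False


def pick_template(registry: List[TemplateVersion], widget: str,
                  client_version: str) -> TemplateVersion:
    client = parse_version(client_version)
    supported = [
        t for t in registry
        if t.widget == widget
        and not t.deprecated
        and parse_version(t.min_client_version) <= client
    ]
    if not supported:
        raise LookupError(f"no template of {widget!r} supports client {client_version}")
    # Newest template that this client version can render.
    return max(supported, key=lambda t: t.template_version)
```

Comparing parsed tuples rather than raw strings matters here: as strings, "5.9" sorts after "5.10", which would hand new templates to old clients.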

This becomes the layer where the final decisions about which widgets and widget versions to return are made. Eventually, we’ll even move parts of this decision-making to machine-learnt models that will find the best-performing ranked order of widgets on a page.