In the last instalment, we discussed the high-level components of our BFF architecture. There’s a lot more that went into bringing this stack together: tonnes of interesting technical choices and frameworks, many of which we expect to evolve rapidly in the coming months. Dig in and see for yourself how the HotstarX stack works under the covers!
Here’s the high-level view of the HotstarX architecture:
- Display Data Services (DDS): Contain the business logic for aggregating, decorating and processing the underlying domain data and exposing displayable data properties.
- Binders: Mappers that take the DDS data and massage it into the widget data output. They also handle concerns such as localization and translations.
- Page Compositor: The orchestrator that breaks down an incoming page request into its constituent widgets (by consulting the Layout Service), gets the data for each widget, invokes the respective binders and stitches together the entire page response.
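To make the division of responsibilities concrete, here is a minimal Go sketch of the flow described above. Every name in it (`LayoutService`, `DataFetcher`, `Binder`, `ComposePage` and the demo helpers) is a hypothetical stand-in rather than a HotstarX API, and the data source is an in-memory stub.

```go
package main

import "fmt"

// Hypothetical stand-ins for the roles described above.
type (
	LayoutService func(page string) []string          // page -> ordered widget names
	DataFetcher   func(widget string) (string, error) // widget -> display data (DDS stand-in)
	Binder        func(data string) string            // display data -> widget output
)

// ComposePage resolves a page into widgets, fetches the data for each one,
// runs the matching binder and stitches the outputs together in layout order.
func ComposePage(page string, layout LayoutService, fetch DataFetcher, binders map[string]Binder) ([]string, error) {
	var out []string
	for _, w := range layout(page) {
		bind, ok := binders[w]
		if !ok {
			return nil, fmt.Errorf("no binder registered for widget %q", w)
		}
		data, err := fetch(w)
		if err != nil {
			return nil, fmt.Errorf("fetching data for %q: %w", w, err)
		}
		out = append(out, bind(data))
	}
	return out, nil
}

// Demo stubs: a fixed layout, a fake data source and two trivial binders.
func demoLayout(page string) []string { return []string{"HeroWidget", "TrayWidget"} }

func demoFetch(widget string) (string, error) { return "data for " + widget, nil }

func demoBinders() map[string]Binder {
	return map[string]Binder{
		"HeroWidget": func(d string) string { return "hero[" + d + "]" },
		"TrayWidget": func(d string) string { return "tray[" + d + "]" },
	}
}

func main() {
	widgets, err := ComposePage("home", demoLayout, demoFetch, demoBinders())
	if err != nil {
		fmt.Println("compose failed:", err)
		return
	}
	for _, w := range widgets {
		fmt.Println(w)
	}
}
```

The sketch is deliberately sequential; it only shows who calls whom, not how the calls are scheduled.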
Far from being cut and dried, we had several challenges which influenced the implementation choices we made.
We wanted our binders to declaratively describe their data dependencies. This is crucial because:
- Developer experience: We want widget owners to care about what data they need, rather than worrying about where and how to get it.
- Performance and cost: We want each widget to request only the data it actually needs, and not over-fetch.
- Centralised orchestration control: In order to plan how the data is fetched, de-duplicated, cached, etc. across various widgets, we need the data dependencies to be separated from the code that consumes them.
This created the need for an orchestration framework that would read each widget’s declared data dependencies, flat-map them at the page level, build the most efficient execution plan (service-call graph) and then execute it within the right SLAs.
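As a rough illustration of the flat-mapping step, the Go sketch below collects per-widget declarations into a page-level plan in which each distinct field is fetched once. The `WidgetDeps` type, the field names and the helper functions are assumptions made for the example; they are not part of the HotstarX codebase.

```go
package main

import (
	"fmt"
	"sort"
)

// WidgetDeps is a hypothetical declaration: a widget names the data
// fields it needs, not the services that provide them.
type WidgetDeps struct {
	Widget string
	Fields []string
}

// PlanPage flat-maps the per-widget declarations into one page-level plan.
// Each distinct field appears once, mapped to the widgets that consume it.
func PlanPage(widgets []WidgetDeps) map[string][]string {
	plan := make(map[string][]string)
	for _, w := range widgets {
		seen := make(map[string]bool) // guards against a widget listing a field twice
		for _, f := range w.Fields {
			if seen[f] {
				continue
			}
			seen[f] = true
			plan[f] = append(plan[f], w.Widget)
		}
	}
	return plan
}

// FetchList returns the distinct fields of a plan in a stable order.
func FetchList(plan map[string][]string) []string {
	fields := make([]string, 0, len(plan))
	for f := range plan {
		fields = append(fields, f)
	}
	sort.Strings(fields)
	return fields
}

// demoWidgets declares six field requests, four of them distinct.
func demoWidgets() []WidgetDeps {
	return []WidgetDeps{
		{Widget: "TrayWidget", Fields: []string{"title", "image", "watchProgress"}},
		{Widget: "HeroWidget", Fields: []string{"title", "image", "description"}},
	}
}

func main() {
	plan := PlanPage(demoWidgets())
	for _, f := range FetchList(plan) {
		fmt.Printf("%s -> %v\n", f, plan[f])
	}
}
```

Because the declarations live outside the consuming code, the orchestrator can see that `title` and `image` are shared by both widgets and fetch them once.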
GraphQL, for data orchestration
While building something like this in-house would have been an inspiring engineering problem to solve, we felt that GraphQL had already solved parts of this problem. Although it does not cover everything we want, we decided to adopt GraphQL as a starting point.
This provided us with:
- A solid declarative data-fetch spec that is tailored for use-cases like ours to avoid over-fetching.
- GraphQL federation, which allows us to easily compose data from multiple services in a unified spec without worrying about how to join schemas and manage them. A lot of the tooling comes OOTB, which reduced our bootstrap time.
- Mature server-side frameworks like Netflix DGS (Java) and gqlgen (Golang), which allowed us to quickly build our DDS services as well as adapt our existing domain services to become GQL compatible. This was a major boost for us, since we didn’t have to invest at all in building any DDS frameworks.
We settled on using the Apollo Federated GraphQL Gateway as the entry point for our DDS interactions. This took care of all the heavy lifting associated with syntactic validations, query planning, scatter-gather, and many other OOTB primitives for circuit breaking, caching, etc.
Golang, for scale
We decided to build the PageCompositor natively in Golang. At Hotstar, we’ve acquired experience building and operating high-scale, concurrent services in Golang, and given that PageCompositor was going to be the entry point for all our page requests, we needed it to be finely tuned and highly scalable.
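One reason Go suits an entry-point service like this is that fanning out per-widget work is cheap with goroutines. The sketch below fetches all widgets of a page concurrently under a shared time budget and keeps the results in layout order. The names (`FetchAll`, `WidgetResult`, `demoFetch`, `runDemo`), the widget names and the 100 ms budget are illustrative assumptions, not details of the actual PageCompositor.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// WidgetResult carries the outcome for one widget (hypothetical type).
type WidgetResult struct {
	Name string
	Data string
	Err  error
}

// FetchFunc stands in for whatever call retrieves a widget's data.
type FetchFunc func(ctx context.Context, widget string) (string, error)

// FetchAll fetches every widget concurrently under one shared time budget.
// Each goroutine writes only to its own slice index, so no lock is needed
// and the results stay in layout order.
func FetchAll(ctx context.Context, widgets []string, fetch FetchFunc, budget time.Duration) []WidgetResult {
	ctx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()

	results := make([]WidgetResult, len(widgets))
	var wg sync.WaitGroup
	for i, w := range widgets {
		wg.Add(1)
		go func(i int, w string) {
			defer wg.Done()
			data, err := fetch(ctx, w)
			results[i] = WidgetResult{Name: w, Data: data, Err: err}
		}(i, w)
	}
	wg.Wait()
	return results
}

// demoFetch simulates a data source: "SlowWidget" takes far longer than the
// budget, and every call honours context cancellation.
func demoFetch(ctx context.Context, widget string) (string, error) {
	delay := 5 * time.Millisecond
	if widget == "SlowWidget" {
		delay = 2 * time.Second
	}
	select {
	case <-time.After(delay):
		return "data for " + widget, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// runDemo composes three widgets with a 100 ms budget.
func runDemo() []WidgetResult {
	widgets := []string{"HeroWidget", "SlowWidget", "TrayWidget"}
	return FetchAll(context.Background(), widgets, demoFetch, 100*time.Millisecond)
}

func main() {
	for _, r := range runDemo() {
		if r.Err != nil {
			fmt.Printf("%s: skipped (%v)\n", r.Name, r.Err)
			continue
		}
		fmt.Printf("%s: %s\n", r.Name, r.Data)
	}
}
```

With this shape, one slow widget costs at most the budget and does not hold back the widgets that answered in time.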
DSL[Golang plugins], for Binders
Given that Binders were to run in the compositor runtime, we wanted to minimize the performance and operational risks of managing arbitrary code. Our initial idea was to manage the Binders using some form of DSL. Also, since Binders are supposed to be very thin by design, they shouldn’t require powerful programming constructs.
We couldn’t find any performant DSL framework that would run at near-native speeds, and eventually settled on using Golang plugins. This allowed us to package and manage each binder as its own plugin and load it within the compositor runtime. Long-term, we aspire to support hot-reloading of plugins and independent lifecycle management of each plugin, so that the core framework (compositor) and the widget code (binders) can evolve independently of each other, bound only by the orchestration primitives.
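For readers unfamiliar with Go plugins, the sketch below shows how a host process can load one with the standard `plugin` package. The exported symbol name `Bind`, its signature and the file path are assumptions made for illustration; the article does not describe the actual binder plugin contract.

```go
package main

import (
	"fmt"
	"plugin"
)

// BindFunc is the signature each binder plugin is assumed to export under
// the symbol name "Bind". It uses only built-in types, so host and plugin
// agree on the type without sharing a package. (Hypothetical contract.)
type BindFunc = func(ddsJSON []byte) ([]byte, error)

// LoadBinder opens a shared object built with `go build -buildmode=plugin`
// and looks up its exported Bind function.
func LoadBinder(path string) (BindFunc, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("opening binder plugin %s: %w", path, err)
	}
	sym, err := p.Lookup("Bind")
	if err != nil {
		return nil, fmt.Errorf("plugin %s has no Bind symbol: %w", path, err)
	}
	bind, ok := sym.(BindFunc)
	if !ok {
		return nil, fmt.Errorf("plugin %s: Bind has unexpected type %T", path, sym)
	}
	return bind, nil
}

func main() {
	// Illustrative path; when the file does not exist, the error branch runs.
	bind, err := LoadBinder("./binders/ContentTray.so")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	out, err := bind([]byte(`{}`))
	fmt.Println(string(out), err)
}
```

Two properties of the `plugin` package are worth keeping in mind with this approach: plugins must be built with the same toolchain and dependency versions as the host, and the package offers no way to unload a plugin once it has been opened, so hot-reloading needs a strategy beyond a plain re-open.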
The theory is great, but does it work? Let us now walk through an end-to-end example of a simple TrayWidget, which was introduced in our last article. To quickly recap, here’s what the widget looks like:
Step 1 — Proto spec
The client team will first define the widget contract (in protobuf) in a model package repo. We chose protobuf as our IDL and our on-wire data format given its strong type-safety and lean size. This proto is then used to auto-generate client-side bindings for our Android, iOS and Web clients. We also generate Golang bindings to emit these objects in the server response.
Following is a very simple widget proto definition.
message TrayWidget {
  .base.WidgetCommons widget_commons = 1;
  reserved 3 to 10;
  Data data = 11;
  message Data {
    Header header = 1;
    repeated Item items = 2;
  }
  message Header {
    string title = 1;
  }
  message Item {
    string title = 1;
    string duration = 2;
    .feature.image.Image image = 4;
  }
}
- The widget_commons field is used to capture standard properties like widgetName, version, analytics info, etc.
- The data object contains a header and a list of items, each of which has a title, duration, progressMarkers and an image associated with it.
Step 2 — Widget Registration
The team that owns this widget will then register it by making an entry in the widget_registry. This is a .yaml file that describes what data the widget wants to query and which binder will be used to map the response to a valid proto object.
It also contains information about the widget owner that can be used to emit observability data and contact the owners in case of malformed or misbehaving widgets.
name: ContentTray # Unique name for the widget instance
tags: # Can be used to power discovery of widgets
  - AutoGenerated
template_query_binder_mapping:
  - path_to_binder: ContentTray/ContentTrayTrayWidget # Folder path where this widget's binder code can be found
    path_to_query: ContentTray/ContentTrayTrayWidget # Folder path where this widget's query can be found
    template_name: TrayWidget # The actual proto template that this instance is going to use
ownership_info:
  team: # Team name
  team_slack: # Team Slack channel
  team_mailing_list: # Team mailing list
  pager_duty: # Team PD details
    team_service_directory_id:
    team_escalation_policy_id:
The widget team then commits a query and binder code to the respective folders in the widget repo.
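Since the registry is what ties a widget name to its query, binder and owners, it is a natural place for basic validation before a widget ships. The Go sketch below mirrors a subset of the fields in the entry above as plain structs and checks that the required ones are present. The struct names, the `Validate` method, the choice of which fields are mandatory and the `example-team` value are all assumptions for illustration; loading the YAML file itself is left out.

```go
package main

import "fmt"

// Mapping mirrors one item of template_query_binder_mapping.
type Mapping struct {
	PathToBinder string
	PathToQuery  string
	TemplateName string
}

// RegistryEntry mirrors a subset of one widget_registry entry (hypothetical type).
type RegistryEntry struct {
	Name     string
	Tags     []string
	Mappings []Mapping
	Team     string // ownership_info.team
}

// Validate returns a list of problems; an empty list means the entry is usable.
// Which fields are mandatory is an assumption of this sketch.
func (e RegistryEntry) Validate() []string {
	var problems []string
	if e.Name == "" {
		problems = append(problems, "name is required")
	}
	if len(e.Mappings) == 0 {
		problems = append(problems, "at least one template/query/binder mapping is required")
	}
	for i, m := range e.Mappings {
		if m.PathToBinder == "" || m.PathToQuery == "" || m.TemplateName == "" {
			problems = append(problems, fmt.Sprintf("mapping %d: binder path, query path and template name are all required", i))
		}
	}
	if e.Team == "" {
		problems = append(problems, "ownership_info.team is required so owners can be contacted")
	}
	return problems
}

// demoEntry reproduces the ContentTray entry; the team value is a placeholder.
func demoEntry() RegistryEntry {
	return RegistryEntry{
		Name: "ContentTray",
		Tags: []string{"AutoGenerated"},
		Mappings: []Mapping{{
			PathToBinder: "ContentTray/ContentTrayTrayWidget",
			PathToQuery:  "ContentTray/ContentTrayTrayWidget",
			TemplateName: "TrayWidget",
		}},
		Team: "example-team",
	}
}

func main() {
	fmt.Println("ContentTray problems:", demoEntry().Validate())
	fmt.Println("empty entry problems:", RegistryEntry{}.Validate())
}
```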
Step 3 — Query
The query asks for a content collection (which an underlying DDS has advertised as available). For each item in the collection, it then asks for some additional data.
The query layer doesn’t expose which DDS provides what data, which keeps the binder layer truly agnostic of the data provider and orchestration details.
query ContentCollection($country: String, $platform: String) {
  fetchContentCollection(collectionRequest: {
    hsRequest: {
      countryCode: $country,
      platformCode: $platform
    }
  }) {
    collectionItem {
      watchProgress {
        contentId
        duration
      }
      coreAttributes {
        title
        horizontalImage {
          url
          height
          width
        }
      }
    }
    collectionTitle
  }
}
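A query like this is typically sent to a GraphQL gateway as an HTTP POST whose JSON body carries the query text and its variables. The Go sketch below builds such a request without sending it. The gateway URL, the function names and the variable values are assumptions, and the query is abbreviated to a single field; how the compositor actually talks to the gateway is not covered in this article.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// gqlRequest is the conventional JSON body of a GraphQL-over-HTTP request.
type gqlRequest struct {
	Query     string            `json:"query"`
	Variables map[string]string `json:"variables,omitempty"`
}

// BuildPayload serialises a query and its variables into the request body.
func BuildPayload(query string, vars map[string]string) ([]byte, error) {
	return json.Marshal(gqlRequest{Query: query, Variables: vars})
}

// NewQueryRequest wraps the payload in a POST request for the given gateway.
// Nothing is sent; the caller decides which HTTP client to use.
func NewQueryRequest(gatewayURL, query string, vars map[string]string) (*http.Request, error) {
	body, err := BuildPayload(query, vars)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, gatewayURL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// demoQuery is an abbreviated form of the ContentCollection query.
const demoQuery = `query ContentCollection($country: String, $platform: String) {
  fetchContentCollection(collectionRequest: {hsRequest: {countryCode: $country, platformCode: $platform}}) {
    collectionTitle
  }
}`

func main() {
	vars := map[string]string{"country": "in", "platform": "android"} // illustrative values
	req, err := NewQueryRequest("http://gateway.example/graphql", demoQuery, vars)
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))
}
```

Keeping the variables separate from the query text means the same stored query can serve every country and platform.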
Step 4 — Binder
For the sake of brevity, we’ve stripped out some non-essential bits of code. Very simply, the binder takes the incoming DDS response and maps it to the various fields of the TrayWidget proto object. Concerns like language localization (refer to the title field mappings) are handled in this layer so that the DDSes can be truly devoid of presentation concerns.
package main

import (
	"context"
	"encoding/json"
	// project-internal packages (v1, model, widget, base, image, component,
	// constants, localizationUtil) are omitted here
)

func (b *binder) Execute(ctx context.Context, input *v1.BinderInput) (interface{}, error) {
	// first we need to take the raw DDS response and convert it into known structs
	contentTrayResponse := new(model.ContentTrayResponse)
	// contentTrayResponse is already a pointer, so it is passed as-is and stays non-nil
	if err := json.Unmarshal(input.DdsJsonResponse, contentTrayResponse); err != nil {
		return nil, err
	}
	contentTrayResponseItems := contentTrayResponse.ContentTrayResponseItems

	// after unmarshalling, we can start transformation/mapping
	// we need to return the WidgetTemplate which was registered, in this case it's TrayWidget
	ret := &widget.TrayWidget{
		Template: &base.Template{
			Name:    constants.TrayWidget,
			Version: constants.Version1,
		},
		// this is a common object that encapsulates standard widget properties
		WidgetCommons: &base.WidgetCommons{
			Id:      constants.TrayWidget,
			Version: constants.Version1,
		},
		Data: &widget.TrayWidget_Data{
			Header: &widget.TrayWidget_Header{
				// display concerns like localization are handled at binders using standard libs
				Title: localizationUtil.GetLocalisedString(contentTrayResponse.CollectionTitle),
			},
			// widgetContext is set up in code stripped from this snippet
			Items: b.getContentTrayItems(ctx, contentTrayResponseItems, widgetContext),
		},
	}
	return &v1.WidgetBinderOutput{
		TypeInstanceName: constants.TrayWidget,
		Data:             ret,
	}, nil
}

// Iterate over collectionItems and build the inner widget data object
func (b *binder) getContentTrayItems(ctx context.Context, contentTrayResponseItems []model.ContentTrayResponseItem, widgetContext *component.Widget) []*widget.TrayWidget_Item {
	var ret []*widget.TrayWidget_Item
	for _, item := range contentTrayResponseItems {
		collectionItem := item.CollectionItem
		itemTitle := collectionItem.CoreAttributes.Title
		widgetItem := &widget.TrayWidget_Item{
			Title: localizationUtil.GetLocalisedString(itemTitle),
			Image: &image.Image{
				Src:    collectionItem.CoreAttributes.Images.HorizontalImage.Url,
				Alt:    itemTitle,
				Height: int32(collectionItem.CoreAttributes.Images.HorizontalImage.Height),
				Width:  int32(collectionItem.CoreAttributes.Images.HorizontalImage.Width),
			},
			Duration: int64(collectionItem.WatchProgress.Duration) * 1000,
		}
		ret = append(ret, widgetItem)
	}
	return ret
}
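Because a binder is essentially a pure mapping from a DDS payload to a widget object, it is easy to exercise in isolation. The self-contained Go sketch below mimics that mapping with simplified stand-in types. `ddsResponse`, `Tray`, `TrayItem` and `BindTray` are hypothetical and much smaller than the real models, and treating the factor of 1000 as a seconds-to-milliseconds conversion is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ddsResponse is a simplified stand-in for the DDS payload, not the real model.
type ddsResponse struct {
	CollectionTitle string `json:"collectionTitle"`
	Items           []struct {
		Title    string `json:"title"`
		Duration int64  `json:"duration"` // assumed to be in seconds
	} `json:"items"`
}

// TrayItem and Tray are simplified stand-ins for the generated widget types.
type TrayItem struct {
	Title      string
	DurationMs int64
}

type Tray struct {
	Title string
	Items []TrayItem
}

// BindTray decodes the payload and maps it field by field onto the widget.
func BindTray(ddsJSON []byte) (*Tray, error) {
	var resp ddsResponse
	if err := json.Unmarshal(ddsJSON, &resp); err != nil {
		return nil, fmt.Errorf("decoding DDS response: %w", err)
	}
	tray := &Tray{
		Title: resp.CollectionTitle,
		Items: make([]TrayItem, 0, len(resp.Items)),
	}
	for _, it := range resp.Items {
		tray.Items = append(tray.Items, TrayItem{
			Title:      it.Title,
			DurationMs: it.Duration * 1000, // seconds to milliseconds (assumption)
		})
	}
	return tray, nil
}

func main() {
	payload := []byte(`{"collectionTitle":"Continue Watching","items":[{"title":"Episode 1","duration":90}]}`)
	tray, err := BindTray(payload)
	if err != nil {
		fmt.Println("bind failed:", err)
		return
	}
	fmt.Printf("%s: %+v\n", tray.Title, tray.Items)
}
```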
Step 5 — Magic 🎩
Voila! And just like that, we have a living and breathing widget ready to create magic on your screens.
If a developer wants to repurpose this widget template to display another set of collections (say, top-trending content), they could simply write a new query and a binder, update the widget_registry and get their widget to production, all from the server side.