Watch Hotstar, Watch your friends

Echo Cancellation
In a typical VoIP application, we need to cancel the remote speaker’s audio that the microphone picks up, so that they do not hear their own voice played back. In “Watch With Friends” (WwF), however, the problem is harder: the audio of the content that is playing also needs to be cancelled. Hence two audio tracks need cancellation, and only the speaker’s mic input must be sent over the network.

To cancel the audio of the content, we need to get its raw audio buffers and feed them to our echo cancellation unit, so that the content audio is not sent over the network.
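To make the idea concrete, here is a minimal sketch of cancelling two played-back tracks at once. It assumes that the remote voice and the content audio leave the same loudspeaker, so their sum can serve as one reference signal, and it swaps in a textbook NLMS adaptive filter in place of whatever echo cancellation unit is actually used. The function names, the echo path, and all parameter values are hypothetical.

```python
import math
import random


def mix(track_a, track_b):
    """Sum two equally long reference tracks sample by sample."""
    return [a + b for a, b in zip(track_a, track_b)]


def nlms_cancel(mic, reference, taps=8, mu=0.2, eps=1e-6):
    """Remove the echo of `reference` from `mic` with an NLMS adaptive filter.

    Returns the cleaned signal, i.e. the mic input minus the estimated echo.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # recent reference samples, newest first
    cleaned = []
    for mic_sample, ref_sample in zip(mic, reference):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = mic_sample - echo_estimate
        energy = sum(h * h for h in history) + eps
        # Normalised LMS update: step size scaled by the reference energy.
        weights = [w + mu * error * h / energy for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def simulate(seed=0, n=4000):
    """Build a toy scene: local speech plus an echo of (remote voice + content)."""
    rng = random.Random(seed)
    remote_voice = [rng.uniform(-1, 1) for _ in range(n)]
    content_audio = [rng.uniform(-1, 1) for _ in range(n)]
    local_speech = [0.1 * math.sin(0.05 * i) for i in range(n)]
    reference = mix(remote_voice, content_audio)
    # Hypothetical room echo path: two delayed, attenuated copies of the reference.
    echo = [
        0.6 * (reference[i - 1] if i >= 1 else 0.0)
        + 0.3 * (reference[i - 3] if i >= 3 else 0.0)
        for i in range(n)
    ]
    mic = [s + e for s, e in zip(local_speech, echo)]
    return local_speech, echo, mic, reference


if __name__ == "__main__":
    local_speech, echo, mic, reference = simulate()
    cleaned = nlms_cancel(mic, reference)
    residual = sum((c - s) ** 2 for c, s in zip(cleaned[2000:], local_speech[2000:]))
    echo_power = sum(e * e for e in echo[2000:])
    print(f"residual echo / original echo: {residual / echo_power:.4f}")
```

Once the filter has adapted, what remains in the cleaned signal is mostly the local speech, which is the only track that should go over the network.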

Prioritisation of 2 different audio tracks

Optimising CPU / Battery Consumption
Video conferencing consumes a lot of battery because the system has to perform CPU-intensive operations such as encoding and decoding multiple video streams.
Our use case is one step more CPU-intensive than that of a regular video conferencing app.

A key optimisation for us was to select the best resolution for our video conferencing based on the UX, together with a lower frame rate: 120p at 10fps. This worked very well for the viewport size the UX called for.

We also decoded video only for participants who were in the viewport. So if there were 10 participants in a watch party, we would decode video for only 5–6 of them, as the rest were not in the viewport (because they were not actively speaking at that moment).
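A minimal sketch of that selection, assuming the viewport has a fixed number of slots and that slots go to the most recent speakers. The `Participant` type, the `last_spoke_at` field, and the ranking rule are hypothetical; the actual layout logic is not described here.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    user_id: str
    last_spoke_at: float  # hypothetical: time of most recent speech, in seconds


def participants_to_decode(participants, viewport_slots):
    """Return the ids of participants whose video should be decoded.

    Only as many participants as fit in the viewport are picked,
    most recent speakers first; everyone else stays undecoded.
    """
    ranked = sorted(participants, key=lambda p: p.last_spoke_at, reverse=True)
    return {p.user_id for p in ranked[:viewport_slots]}


if __name__ == "__main__":
    party = [Participant(f"user{i}", last_spoke_at=float(i)) for i in range(10)]
    print(sorted(participants_to_decode(party, viewport_slots=5)))
```

Re-running the selection whenever the active speaker changes keeps the decode cost bounded by the viewport size rather than by the party size.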

Prioritising the video playback over watch party
Video playback is the heart of Hotstar. We could not compromise the playback of the primary content because of the video conference, and we also wanted our mobile customers to be able to participate in watch parties. To achieve this, we tuned our resolution and bit rates so as not to add an appreciable load on top of the primary content being streamed.

We limited the number of participants so that bandwidth consumption could be controlled and playback would not be affected. We also start the watch party only after playback has started, to ensure that the main video always starts to play. It was important not to impact the video start lag of the primary content.

Seamless join experience
A frictionless join and invite experience is very important for any watch party. For this, we came up with a new deeplink composition design that allows us to pass party meta info such as channel and encryption information.

Part 1 is our base URL, which defines the content to be played; part 2 defines the feature to be used on that content; part 3 carries the meta info for the content. This way we get all the info in a single URL to take the customer to the right content and to ensure that they join the right party.
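The three-part composition can be sketched as follows. The URL shape, the `watch-party` feature segment, and the meta keys are all hypothetical placeholders, since the real link format is not shown here; the point is only that content, feature, and party meta travel in one URL and can be split apart again on the receiving side.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_party_link(content_url, feature, meta):
    """Compose part 1 (content URL), part 2 (feature) and part 3 (meta info)."""
    return f"{content_url.rstrip('/')}/{feature}?{urlencode(meta)}"


def parse_party_link(link):
    """Split a link produced by build_party_link back into its three parts."""
    parts = urlsplit(link)
    content_path, _, feature = parts.path.rpartition("/")
    content_url = f"{parts.scheme}://{parts.netloc}{content_path}"
    meta = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return content_url, feature, meta


if __name__ == "__main__":
    link = build_party_link(
        "https://www.example.com/content/1234",         # hypothetical base URL
        "watch-party",                                   # hypothetical feature name
        {"channel": "party-42", "min_version": "2.0"},  # hypothetical meta info
    )
    print(link)
    print(parse_party_link(link))
```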

Backward Compatibility
Not all participants may be on a build of the app that supports watch parties. In such a scenario, it was important to ensure a smooth customer experience.

Using the metadata encoded into the new deeplink, we added version checks and were able to indicate to the customer that they needed to upgrade to use Watch Party.
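A version gate of this kind might look like the sketch below. It assumes the deeplink carries a minimum supported version as a dotted string; the field, the version format, and the action names are hypothetical.

```python
def parse_version(text, width=3):
    """Turn a dotted version such as '2.10' into a comparable tuple (2, 10, 0)."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def join_action(app_version, min_party_version):
    """Decide whether this build can join the party or must be upgraded first."""
    if parse_version(app_version) >= parse_version(min_party_version):
        return "join"
    return "prompt_upgrade"


if __name__ == "__main__":
    print(join_action("2.10.0", "2.9.5"))
    print(join_action("1.8", "2.0"))
```

Comparing integer tuples rather than raw strings avoids the classic mistake of treating "2.10" as older than "2.9".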

Video Synchronisation among participants
Imagine a scenario where your friend yells out “OUT” while you are still watching the bowling action. Spoiler alert! Hence, while watching live sports with friends, it’s really important that everybody is at almost the same live position so that they can have an exciting party together.

We strengthened this by letting participants who fell behind catch up to live more aggressively.
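One common way to implement such catch-up is to speed playback up slightly for small lags and to jump straight to the live edge for large ones. The sketch below illustrates that general approach only; the thresholds, the maximum rate, and the function name are hypothetical and are not taken from the actual player.

```python
def catch_up_plan(lag_seconds, tolerance=1.0, max_rate=1.25, seek_threshold=10.0):
    """Return (action, playback_rate) for a participant behind the live edge.

    - within `tolerance`: play normally
    - beyond `seek_threshold`: jump to live
    - in between: speed up in proportion to the lag, capped at `max_rate`
    """
    if lag_seconds >= seek_threshold:
        return ("seek_to_live", 1.0)
    if lag_seconds <= tolerance:
        return ("play", 1.0)
    fraction = (lag_seconds - tolerance) / (seek_threshold - tolerance)
    return ("play", round(1.0 + fraction * (max_rate - 1.0), 3))


if __name__ == "__main__":
    for lag in (0.5, 5.5, 12.0):
        print(lag, catch_up_plan(lag))
```

Lowering `seek_threshold` or raising `max_rate` makes the catch-up more aggressive, at the cost of more noticeable jumps or sped-up playback.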

Security
When you are in a watch party with friends and family, it’s really important for us to make sure that no stranger can enter or peek into your party. To make this happen, we have security gates at various levels.

Our tokenization architecture ensures that only Hotstar users can join the party. We also encrypt our deeplinks, so that no one outside Hotstar knows the channel name and password needed to join your party.
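As an illustration of the token gate, here is a minimal sketch of a short-lived join token signed by the server with HMAC-SHA256. This is a generic pattern, not the actual tokenization architecture, and it shows signing rather than the deeplink encryption mentioned above. The claim names, the token layout, and the five-minute lifetime are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(secret, user_id, channel, ttl=300, now=None):
    """Server side: sign a short-lived token tying a logged-in user to a channel."""
    now = time.time() if now is None else now
    claims = {"uid": user_id, "ch": channel, "exp": int(now) + ttl}
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_token(secret, token, channel, now=None):
    """Return the user id if the token is authentic, unexpired and for `channel`."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    claims = json.loads(payload)
    if claims["ch"] != channel or claims["exp"] < now:
        return None
    return claims["uid"]


if __name__ == "__main__":
    secret = b"demo-secret"  # hypothetical key; a real one lives server side only
    token = issue_token(secret, "user-1", "party-42")
    print(verify_token(secret, token, "party-42"))
```

Because the signature covers the channel and the expiry, a token cannot be reused for another party or replayed after it expires.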