"That diagram is upside down!” a broadcast engineer exclaimed as I was explaining how a personalized advertising-insertion solution worked for multi-screen simulcast.
I looked at the diagram that I had drawn, and I had put the client devices at the top. I realized the broadcast engineer would have drawn the video source and encoding components at the top.
This is not just pedantry. Web engineers tend to think in user- and device-centric terms, whereas broadcast engineers tend to think in headend-centric terms. It’s called broadcast for a reason! However, these worlds are colliding. Thanks to advances in online video delivery protocols, multi-screen encoding systems and cloud-based distribution services, it is now possible to perform advertising insertion or content replacement and to tailor it to individual users. This article explains how.
Until recently, simulcast streaming to connected devices was performed using protocols like RTMP, RTSP and MMS. In 2009, when the iPhone 3GS was launched, iOS 3.0 included a new streaming protocol called HTTP Live Streaming (HLS), part of a new class of video delivery protocol.
HLS differed from its predecessors by relying only upon HTTP to carry video and flow control data to the device. This made the protocol far more firewall-friendly and easier to scale, as it required no specialist streaming servers distributed throughout the Internet to deliver streams to end users. The regular HTTP caching proxies that serve as the backbone of all content delivery networks (CDNs) would suffice.
Apple was not alone in making this paradigm switch. Microsoft and Adobe also introduced their own protocols: Smooth Streaming and HTTP Dynamic Streaming (HDS), respectively. Today, work is ongoing to standardize these approaches into a single unified protocol, under a framework known as MPEG-DASH.
What is significant about all of these protocols is that they separate the control aspects of the protocol from the video data. They share the general concept that video data is encoded into chunks and placed onto an origin server or a CDN. To start a streaming session, a client device loads a manifest file that tells it which chunks to load and in what order. The infrastructure that serves the manifest can be completely separate from the infrastructure that serves the chunks.
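To make the split between manifest and chunks concrete, here is a minimal Python sketch of what a client does with an HLS-style media playlist: it reads the manifest and works out which chunks to fetch, in order. The playlist contents, the host names (manifests.example.com and cdn.example.net) and the function name chunk_urls are illustrative assumptions; only the playlist tags themselves come from HLS.

```python
from urllib.parse import urljoin

# Hypothetical manifest location; note that it is a different host from the chunks.
MANIFEST_URL = "https://manifests.example.com/live/channel1.m3u8"

# Illustrative HLS media playlist. The chunk URLs are invented for this sketch.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:10.0,
https://cdn.example.net/ch1/chunk_120.ts
#EXTINF:10.0,
https://cdn.example.net/ch1/chunk_121.ts
#EXTINF:10.0,
https://cdn.example.net/ch1/chunk_122.ts
"""


def chunk_urls(playlist_text, manifest_url):
    """Return (duration_seconds, absolute_url) pairs in playback order."""
    chunks = []
    duration = None
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # Format is "#EXTINF:<duration>,<optional title>"
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#"):
            # Any non-tag, non-blank line is a chunk URI. Relative URIs are
            # resolved against the manifest URL; absolute ones pass through.
            chunks.append((duration, urljoin(manifest_url, line)))
            duration = None
    return chunks


if __name__ == "__main__":
    for seconds, url in chunk_urls(SAMPLE_PLAYLIST, MANIFEST_URL):
        print(seconds, url)
```

In this sketch the manifest is served from one host while every chunk URL points at another, which is the separation described above.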
This separation of concerns provides the basis for dynamic content replacement: the manifest file can be rewritten on the fly to point the client device at an alternative sequence of video chunks that has been pre-encoded and placed on the CDN. Swapping chunks in this way relies on the encoding workflow producing video chunks whose boundaries align with possible replacement events.
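The manifest manipulation can be sketched in a few lines of Python. This is a simplified illustration, not a production implementation: the function name swap_chunks, the playlist and the ad URLs are all hypothetical, and the sketch assumes each replacement chunk has the same duration as the chunk it replaces. It also inserts the HLS #EXT-X-DISCONTINUITY tag wherever the stream switches between original and replacement content, since the replacement was encoded separately. Handling of a sliding live window is deliberately left out.

```python
# Illustrative playlist; chunk and ad URLs are invented for this sketch.
ORIGINAL = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:10.0,
https://cdn.example.net/ch1/chunk_120.ts
#EXTINF:10.0,
https://cdn.example.net/ch1/chunk_121.ts
#EXTINF:10.0,
https://cdn.example.net/ch1/chunk_122.ts
"""

# Hypothetical per-user mapping: original chunk URI -> pre-encoded replacement.
AD_MAP = {
    "https://cdn.example.net/ch1/chunk_121.ts":
        "https://cdn.example.net/ads/user42/ad_0.ts",
}


def swap_chunks(playlist_text, replacements):
    """Return a playlist in which the listed chunk URIs are replaced."""
    out = []
    pending = []         # tag lines waiting for the next chunk URI
    prev_swapped = None  # None until the first chunk has been seen
    for line in playlist_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            pending.append(line)
            continue
        swapped = line in replacements
        # Signal a discontinuity whenever we cross between original and
        # replacement content; the tag must precede the chunk's own tags.
        if prev_swapped is not None and swapped != prev_swapped:
            out.append("#EXT-X-DISCONTINUITY")
        out.extend(pending)
        out.append(replacements.get(line, line))
        pending = []
        prev_swapped = swapped
    out.extend(pending)  # trailing tags, if any
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(swap_chunks(ORIGINAL, AD_MAP))
```

Because only the manifest changes, the chunks themselves stay cacheable on the CDN; the per-user work is limited to generating a small text file.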