This article is for web developers who want to become more familiar with the video domain. It is recommended that you first read the article Demystifying HTML5 Video Player and our ABR tutorial (part 1, part 2 and part 3).
Parsing the Manifest
As explained in the ABR tutorial, every streaming format has a manifest that contains all the information about the media stream: where the media chunks can be downloaded, how the media is encoded and which tracks are available. The first thing our streaming video player must do is fetch and parse this manifest. Using the streaming format MPEG-DASH as an example, a snippet from such a manifest can look like this:
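A sketch of what such a Representation could look like, with values matching the discussion below (the exact element layout and the codecs string are assumptions, not the original snippet):

```xml
<AdaptationSet mimeType="video/mp4" segmentAlignment="true">
  <SegmentTemplate timescale="12800"
                   media="vinn-$RepresentationID$-$Time$.dash"
                   initialization="vinn-$RepresentationID$.dash">
    <SegmentTimeline>
      <S t="0" d="25600" r="2"/>
    </SegmentTimeline>
  </SegmentTemplate>
  <Representation id="video=1660000" bandwidth="1660000"
                  width="1280" height="720" codecs="avc1.64001f"/>
</AdaptationSet>
```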
This snippet shows one of the representations of the content: a representation 1280 pixels wide and 720 pixels high, encoded with AVC1 (H.264) at an average bandwidth of 1660 kbps. Representations belonging to the same AdaptationSet are aligned so that it is possible to jump between them, and by having representations of different resolutions and bitrates we can select the one that guarantees playback without interruptions. The syntax differs between streaming formats but the principle is the same.
Instead of specifying the filename for each media chunk we are presented with a pattern, in this case
vinn-$RepresentationID$-$Time$.dash. The SegmentTimeline specifies the sequence of media chunks, where t is the time offset, d is the duration of the chunk and r is the number of repetitions. For example, to get the filename of the first media chunk we replace $RepresentationID$ with video=1660000 and $Time$ with 0, resulting in vinn-video=1660000-0.dash. The next media chunk then gets the filename vinn-video=1660000-25600.dash, the following vinn-video=1660000-51200.dash, and so on. The timescale is 12800, which means that t=12800 corresponds to 1 second, so with d=25600 every media chunk in this example is 2 seconds long. The initialization chunk is called vinn-video=1660000.dash and contains the actual mp4 header, as the media chunks only contain the encoded video data.
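The expansion of t, d and r into concrete filenames can be sketched in a few lines of JavaScript (the helper name expandSegmentTimeline is ours, not part of any API):

```javascript
// Expand a DASH SegmentTimeline into concrete chunk filenames.
// entries is a list of {t, d, r} objects as found in the <S> elements.
function expandSegmentTimeline(pattern, representationId, entries) {
  const names = [];
  let time = 0;
  for (const { t, d, r = 0 } of entries) {
    if (t !== undefined) time = t; // explicit time offset resets the clock
    for (let i = 0; i <= r; i++) { // r counts *extra* repetitions
      names.push(pattern
        .replace('$RepresentationID$', representationId)
        .replace('$Time$', String(time)));
      time += d;
    }
  }
  return names;
}

// Three 2-second chunks (d=25600 at timescale 12800):
expandSegmentTimeline('vinn-$RepresentationID$-$Time$.dash',
                      'video=1660000', [{ t: 0, d: 25600, r: 2 }]);
// → ['vinn-video=1660000-0.dash',
//    'vinn-video=1660000-25600.dash',
//    'vinn-video=1660000-51200.dash']
```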
Set Up a MediaSource
Before creating the MediaSource object we need to check whether the browser supports this combination of container and codec. The MediaSource class has a static function isTypeSupported() that we can use for this. We construct a MIME type string that we pass to this function; in our case it is:
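A minimal sketch of this check. The exact codecs parameter is an assumption here (avc1.64001f, i.e. H.264 High profile, level 3.1); in a real player it is derived from the manifest:

```javascript
// The codecs string is an assumption for this example; in practice
// it comes from the codecs attribute of the manifest's Representation.
const mimeCodec = 'video/mp4; codecs="avc1.64001f"';

// Feature-detect before going any further (this is a no-op outside a browser).
if (typeof MediaSource !== 'undefined' && MediaSource.isTypeSupported(mimeCodec)) {
  console.log('Codec supported, safe to create a MediaSource');
}
```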
When a MediaSource object is created it is initially in the state closed, which can be verified through the object's readyState attribute:
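Creating the object and inspecting its state might look like this (guarded so it only runs where MSE is available):

```javascript
// readyState starts as "closed" and changes to "open" once the
// MediaSource has been attached to a media element.
if (typeof MediaSource !== 'undefined') {
  const mediaSource = new MediaSource();
  console.log(mediaSource.readyState); // "closed"
}
```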
Attach MediaSource to Media Element
The next thing to do is to attach this MediaSource object to an HTML media element, for example a video element. Attaching the MediaSource object is achieved by using the static method
URL.createObjectURL(). This method creates a
DOMString containing a URL representing the MediaSource object.
Update: According to newer versions of the specification this way is deprecated, and you should instead assign the MediaSource object directly to the media element's srcObject attribute (where browser support allows).
Once we have attached the MediaSource object to the media element, we need to wait for the
sourceopen event before we can continue.
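Put together, attachment and waiting for sourceopen can be sketched like this (the helper name attachMediaSource is ours; it assumes a browser environment at call time):

```javascript
// Attach a new MediaSource to a media element and invoke onOpen
// once the sourceopen event fires and SourceBuffers can be added.
function attachMediaSource(video, onOpen) {
  const mediaSource = new MediaSource();
  // The object URL acts as a handle the media element can stream from.
  video.src = URL.createObjectURL(mediaSource);
  mediaSource.addEventListener('sourceopen', () => onOpen(mediaSource), { once: true });
  return mediaSource;
}

// Usage (browser only):
// const video = document.querySelector('video');
// attachMediaSource(video, (ms) => console.log(ms.readyState)); // "open"
```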
Adding a SourceBuffer to the MediaSource
The MediaSource is now open and we can add a SourceBuffer to it using the method
mediaSource.addSourceBuffer(mimeCodec). Later we will use the method
sourceBuffer.appendBuffer(arrayBuffer) to append video data to the MediaSource.
We also set the duration of the MediaSource; in this example we will append three chunks of 2 seconds each, giving a total duration of 6 seconds.
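This step can be sketched as follows (the helper name setupSourceBuffer is ours, and the duration of 6 seconds assumes the three 2-second chunks of this example):

```javascript
// Runs inside the sourceopen handler, where the MediaSource is "open".
function setupSourceBuffer(mediaSource, mimeCodec) {
  const sourceBuffer = mediaSource.addSourceBuffer(mimeCodec);
  mediaSource.duration = 6; // three chunks x 2 seconds each
  return sourceBuffer;
}
```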
Fetching Chunks and Appending
To simplify and focus on the basic principles, we will in this example assume that the manifest has been parsed and that we have three chunks we want to append and play.
The first chunk (segment) we need to fetch is the initialization segment containing the mp4 header. It is important that the MIME type of the SourceBuffer matches the container and video format of the initialization segment.
Once that chunk has been fetched and appended to the SourceBuffer we can download and append the other segments. As the code snippet above shows, we can start video playback before all segments have been appended. The
fetchSegmentAndAppend() function is presented in full below.
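A sketch of what fetchSegmentAndAppend() might look like, under the assumptions of this example: it downloads one segment as an ArrayBuffer, appends it, and resolves once the SourceBuffer has finished updating (the updateend event). Segment URLs follow the naming pattern from the manifest above:

```javascript
// Fetch one segment and append it to the SourceBuffer, resolving
// when the append operation has completed.
function fetchSegmentAndAppend(segmentUrl, sourceBuffer) {
  return fetch(segmentUrl)
    .then((response) => {
      if (!response.ok) throw new Error('Failed to fetch ' + segmentUrl);
      return response.arrayBuffer();
    })
    .then((data) => new Promise((resolve, reject) => {
      // Only one append may be in flight at a time, so wait for updateend.
      sourceBuffer.addEventListener('updateend', () => resolve(), { once: true });
      sourceBuffer.addEventListener('error', reject, { once: true });
      sourceBuffer.appendBuffer(data);
    }));
}

// Usage: the initialization segment first (it carries the mp4 header),
// then the media chunks in order.
// fetchSegmentAndAppend('vinn-video=1660000.dash', sourceBuffer)
//   .then(() => fetchSegmentAndAppend('vinn-video=1660000-0.dash', sourceBuffer))
//   .then(() => video.play());
```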
The Example Code
What we have described here is basically the main task of the Player Engine: fetching segments and appending them to a MediaSource. Determining which representation to use is the task of the ABR Manager, and I leave it as a good exercise for the reader to extend this example to append chunks of different resolutions. The example code in full is available below.
If you have any further questions or comments on this blog post, drop a comment below or tweet me on Twitter (@JonasBirme).