In a previous blog post, my colleague Jonas Rydholm Birmé explains how you can use SRT together with ffmpeg to create and upload a live stream to the cloud.
To further test the potential of SRT as part of a commercial video distribution flow, we decided to try it out together with Edgeware, using their newly launched managed service for content ingest and repackaging in the cloud.
To get something similar to a live stream, I created a UDP stream using ffmpeg with 2 looping PNG images and a timecode overlay. After experimenting a little with generating multi-bitrate streams in the way described here, I ended up creating just a single constant-bitrate stream.
The reason for sticking to a single bitrate was that the low-spec laptop running ffmpeg couldn't keep up with the workload when generating multiple bitrates, causing discontinuities in the generated TS stream.
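As a rough sketch, a constant-bitrate test stream of this kind can be generated along the following lines. This is a minimal single-image variant, not the exact command used in this setup: the file name, bitrate, GOP length, and overlay styling are all assumptions for illustration (the original setup alternated two PNGs).

```shell
# Hypothetical sketch: loop one PNG, burn in a local-time overlay, and
# encode a constant-bitrate MPEG-TS stream to a local UDP socket.
ffmpeg -re -loop 1 -i slide.png \
  -vf "drawtext=text='%{localtime\:%T}':fontsize=48:x=20:y=20:fontcolor=white" \
  -c:v libx264 -b:v 2500k -minrate 2500k -maxrate 2500k -bufsize 5000k \
  -g 50 -pix_fmt yuv420p \
  -f mpegts "udp://127.0.0.1:12345?pkt_size=1316"
```

The `pkt_size=1316` on the UDP output matches what the SRT transmitter expects on its input, so that each UDP datagram fits seven 188-byte TS packets.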
Setting up the SRT-connection
Using the open source SRT package from Haivision, I set up an SRT transmitter at the Eyevinn office and established a connection to the SRT receiver that Edgeware had set up in their managed service running in AWS. In this setup, the transmitter ran in caller mode and the receiver in listener mode. The SRT transmitter takes the UDP stream generated by ffmpeg as input and sends it in SRT format to the ingest point provided by Edgeware in AWS.
./srt-live-transmit "udp://127.0.0.1:12345?pkt_size=1316" "srt://[ingest-ip]:[port]?pkt_size=1316&latency=1000&oheadbw=100"
In order to improve the robustness of the stream, we used the following parameters to avoid packet losses due to fluctuating bandwidth.
- latency: Sets the maximum accepted transmission latency; should be >= 2.5 times the RTT (default: 120 ms). When the two parties set different values, the higher of the two is used for both.
- oheadbw: Maximum bandwidth overhead that may be used, expressed as a percentage over the input stream rate. 100% is the maximum value that can be set.
Adding Encryption to the SRT-connection
SRT has built-in encryption to protect the stream. It can be enabled quite easily by just setting the passphrase and pbkeylen parameters on the transmitter and adding the same shared passphrase on the receiver. You can find details on how SRT encryption works here.
./srt-live-transmit "udp://127.0.0.1:12345?pkt_size=1316" "srt://[ingest-ip]:[port]?pkt_size=1316&passphrase=1234567890&pbkeylen=16&latency=1000&oheadbw=100"
On the receiver, the input stream was configured with just the passphrase.
Leaving out the host IP in the input path indicates that the receiver runs in listener mode.
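The receiver-side invocation might look roughly like this. The port and the output UDP address are assumptions for illustration, not the actual Edgeware configuration; the key point is that the srt:// input URL carries no host IP, which puts srt-live-transmit in listener mode.

```shell
# Hypothetical listener-side sketch: srt-live-transmit waits for an
# incoming caller on the given port, decrypts using the shared
# passphrase, and forwards the stream to a local UDP socket.
./srt-live-transmit \
  "srt://:1234?pkt_size=1316&passphrase=1234567890" \
  "udp://127.0.0.1:12346?pkt_size=1316"
```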
Live Ingest / TV Content Capture
The SRT receiver feeds the stream to the content capture service, which provides the live ingest capability: it synchronizes the audio and video streams and segments them for further processing in the repackager service. The segments are stored as CMAF in a circular buffer, allowing for catch-up viewing and cloud DVR.
Just in time repackager / TV Repackager
The TV Repackager converts and repackages the stream retrieved from the TV Content Capture service, generating manifests and segments for HLS and DASH distribution to the client device on request.
Playing the Streams
To get some idea of the performance of the setup in an end-to-end scenario, I thought I would try it with one of the commercial video players available, in this case the Bitmovin player with its integrated analytics SDK, using the Bitmovin Analytics Dashboard to monitor some of the performance indicators.
Now, with everything in place, I was able to play back the streams on any device supporting either HLS or DASH in a browser, and monitor the behavior of each session using the analytics dashboard.
Latency Measurement and Break-down
One of the hot topics in the streaming community today is latency, which is especially critical for live content. With a more or less end-to-end A/V setup at my fingertips, I thought it would be interesting to measure and break down the latency from stream generation to the end user's screen. With the help of Edgeware engineers, we could conclude the following.
The steps in the latency chain are described below, using 2-second segments:
A) Two looping images plus a timestamp overlay T1 are put together and fed into the ffmpeg MPEG-TS encoder.
B) The TS stream is taken as input by the SRT transmitter, running on the same machine as the ffmpeg encoder.
C) On the "Live Ingest" instance in the cloud, the stream is received by the SRT receiver.
D) The stream is ingested by the TV Content Capture service. Here a timestamp T2 is stamped into the stream.
E) Upon request, the TV Repackager instance in the cloud fetches segments from the Live Ingest instance and serves them to the client (via a load balancer in the cloud). A timestamp T3 is logged in the access log.
F) The client/player receives the stream and stores it in the player buffer.
G) The stream is shown on the user's screen.
A common clock-source was used to synchronize the machines in the setup.
To determine the latency from A to D, a segment was fetched from the TV Repackager and the timestamp T2 was extracted from the stream; we could then compare it against the time T1 visible in the pictures within the video. T1 has only one-second resolution, but by looking at the position of the progress bar it was possible to say that T2 − T1 was between 0.8 and 1.2 seconds, so we use 1.0 seconds as our estimate.
To determine the latency from D to E, a while-true loop repeatedly tried to fetch a segment that was about to be produced, until the request returned 200 (instead of 404). Using millisecond precision in the TV Repackager access log, we determined that the segment became available 2.35 seconds after the first frame of the segment arrived at the Live Ingest instance. From this we can conclude that 0.35 seconds was added in the repackager on top of the 2-second segment length.
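The availability probe can be sketched as a small shell function. The segment URL below is hypothetical, and we assume curl is available and that the origin answers 404 until the segment exists:

```shell
# Poll an origin URL until it returns HTTP 200; used to time how long
# after ingest a segment becomes available from the repackager.
wait_for_segment() {
  url="$1"
  while :; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    [ "$code" = "200" ] && return 0
    sleep 0.05   # poll every 50 ms, matching the log's ms resolution
  done
}

# Example (hypothetical URL):
# wait_for_segment "http://repackager.example/live/seg_0042.m4s" && date
```

Comparing the wall-clock time when the loop exits against the ingest timestamp in the access log gives the D-to-E latency.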
The latency between E and F can be estimated by simply downloading a segment; it was roughly 100 ms in the case we measured.
The Bitmovin player offers the capability to configure a target latency in the player. In total we end up with an end-to-end latency of approximately 3.5 s plus player latency; with 2-second segments we ended up at around 6 s when the player was configured to "chase" the live point.
| A → D | D → E | E → F | F → G | Total |
|---------|--------|-------|-------|------|
| 1–1.2 s | 2.35 s | 0.1 s | 2.5 s | 6 s  |

Table 1 showing the different parts of the latency break-down
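As a sanity check on the table, the measured components sum up to roughly the observed end-to-end figure (using 1.0 s as the A-to-D estimate):

```shell
# Sum the measured latency components (seconds): A->D, D->E, E->F, F->G.
total=$(awk 'BEGIN { printf "%.2f", 1.0 + 2.35 + 0.1 + 2.5 }')
echo "$total s end-to-end"
```

This lands just under the roughly 6 s observed in the player.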
Setting up an SRT connection to upload live streams to a cloud service such as the one provided by Edgeware is an easy way to get your first-mile delivery in place. Since SRT is open source and quite easy to get started with, it will probably gain more widespread support going forward, especially considering the growing number of members in the SRT Alliance.
Using a managed origin service with stream segmentation and packaging in the cloud lets you start delivering high-quality video over the internet quickly, without large upfront investments and without having to build up operations competence if you don't already have it in place.
For playback, there are several open source video players available, as well as commercial solutions such as the Bitmovin player I used in this setup.
This proof-of-concept setup shows that today it is quite easy to get video stream distribution up and running end-to-end without large investments. For a global commercial video service you would also need to add ABR transcoding and a CDN to reach large audiences with high quality and fast response times across different devices and locations, but I would conclude that SRT is an interesting option for stream delivery to commercial cloud services. In addition, we can see that fairly low latency can be reached end-to-end with technology available today.
As I write this, the stream delivered over the setup described above has been running for about 72 hours without interruptions, so the "Reliable" in Secure Reliable Transport seems well deserved.
Author: Björn Westman, Eyevinn Technology