Smart content encoding at VEVO

The Vevo Engineering team is always looking at the quality of our customer video experience since it is such a critical component of our platform. Combine that quality with rich metadata, beautiful imagery, and polished client applications, and the result is a stunning user experience. After reading a wonderful article from Netflix on this same subject, we found some time to share a similar effort we carried out over the latter half of 2015.

Vevo circa mid-2015

Most content today is encoded as an Adaptive Bit Rate (ABR) encode – this simply means we encode the same file at several bitrates and resolutions. Those parameters and more combine into profiles used by the encoder to create content we can view on our mobile devices or PCs. We, and the industry generally, also break up each of these encoded files into small chunks (say 4 seconds). By creating these 4-second mini video files we are able to switch quickly from one profile to another in response to variations in available bandwidth. For every 4-second chunk we have several bitrate/resolution options that our client applications and web experiences can select to ensure we deliver a great, uninterrupted viewing experience.
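As a rough illustration of chunk-level switching, here is a minimal Python sketch of a client picking a profile for each chunk. The ladder values, the Rendition type, the pick_rendition function and the 0.8 safety factor are all assumptions made up for this example; they are not Vevo's actual profiles or player logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # video bitrate of this profile


# Hypothetical ladder for illustration only; not Vevo's real profiles.
LADDER = [
    Rendition(360, 800),
    Rendition(540, 1400),
    Rendition(720, 2000),
    Rendition(1080, 4000),
]


def pick_rendition(ladder, measured_kbps, safety=0.8):
    """Return the highest-bitrate rendition that fits within a safety
    fraction of the measured throughput, or the lowest one if none fits."""
    budget = measured_kbps * safety
    ordered = sorted(ladder, key=lambda r: r.bitrate_kbps)
    fitting = [r for r in ordered if r.bitrate_kbps <= budget]
    return fitting[-1] if fitting else ordered[0]


# One decision per 4-second chunk, driven by (made-up) throughput samples.
throughput_samples_kbps = [6000, 3000, 1200, 500]
choices = [pick_rendition(LADDER, t).height for t in throughput_samples_kbps]
print(choices)
```

Real players typically also take buffer level and recent switching history into account; this sketch only shows the basic idea of choosing among profiles at every chunk boundary.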

How do we create these profiles? This can be subjective. We take a set of complex content (dark, fast-moving action and so on), look at it on a large screen, and then increase the bitrate until we see minimal difference between one step and the next. If you start at 500kbps and go up in increments of say 500kbps, you will notice a gradual increase in viewing quality until around 2.5Mbps. Then the curve will tend to flatten out, as shown in the graph below. This means we reach a point of diminishing returns, where increasing the bitrate gives minimal quality improvements for a specific resolution. Conversely, the quality drops off considerably once we go below about 1.5Mbps.

Fig. 1.0 : Example of Rate Distortion curve for a specific content encoded at 720p
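The point of diminishing returns can be located numerically once a curve has been measured. The following Python sketch assumes a hypothetical quality score on a 0-100 scale and made-up sample points shaped like the curve described above; the knee_bitrate function and the 4-points-per-Mbps threshold are illustrative assumptions, not the procedure Vevo used.

```python
def knee_bitrate(points, min_gain_per_mbps):
    """points: list of (bitrate_mbps, quality) sorted by bitrate.

    Return the first bitrate after which the marginal quality gain per
    extra Mbps falls below min_gain_per_mbps. If the gain never falls
    below the threshold, return the highest bitrate measured.
    """
    for (b0, q0), (b1, q1) in zip(points, points[1:]):
        slope = (q1 - q0) / (b1 - b0)
        if slope < min_gain_per_mbps:
            return b0
    return points[-1][0]


# Made-up 720p measurements in 500kbps steps (quality is a hypothetical score).
curve_720p = [
    (0.5, 40), (1.0, 62), (1.5, 76), (2.0, 84),
    (2.5, 88), (3.0, 89), (3.5, 89.5),
]

knee = knee_bitrate(curve_720p, min_gain_per_mbps=4.0)
print(knee)
```

With these sample points the gain per extra Mbps drops from 8 to 2 after 2.5Mbps, so that is where the sketch places the knee.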

The curves vary with other factors too. They are not the same for 720p and 1080p, or for a 10ft viewing experience versus watching video on your mobile device.

At Vevo, as at many other online or over-the-top video providers, we simply plot these curves for varying complexities (dark, action…) and resolutions (1080, 720…) and look for the inflection point on each curve where we meet a minimum, subjective quality. There is a trade-off between higher data rates, which are slower to buffer and require more bandwidth (which equals cost on mobile), and a fantastic user experience, which is always our number one goal.

Fig. 2.0 : Example of Rate Distortion curve for different classes of complexity at 720p

The graph clearly shows that at 2Mbps highly complex encodes are hitting our required quality bar. However, lower complexity encodes exceed that bar and are effectively wasting bandwidth. The general solution is simply to set one value of 2Mbps for 720p and encode all files at this resolution at that rate. This enables us to provide the best video experience, but we are wasting some bandwidth in the majority of cases.

In Figure 3.0 below, we have plotted these curves for video of the same complexity but at varying resolutions. You will see that the curves intersect one another. This means that there is a bitrate threshold below which perceptual quality will be better at a lower resolution. For instance, where the red line drops below the green at the 2Mbps vertical, 720 will provide better visual quality.

Fig. 3.0 : Relationship between RD curves at different resolutions for the same content
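To make the crossover concrete, the Python sketch below picks the resolution with the higher quality at a given bitrate by interpolating between measured points. The sample curves, the 0-100 quality score and the function names are assumptions for illustration; they are not read from the figure.

```python
from bisect import bisect_right


def interp(points, x):
    """Piecewise-linear interpolation over (x, y) points sorted by x,
    clamped to the first and last values outside the measured range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Made-up curves for one piece of content: (bitrate_mbps, quality).
# Quality is assumed to be judged at the same display size for both.
CURVES = {
    720: [(1.0, 70), (2.0, 82), (3.0, 86), (4.0, 88)],
    1080: [(1.0, 55), (2.0, 78), (3.0, 88), (4.0, 93)],
}


def best_resolution(curves, bitrate_mbps):
    """Return the resolution whose curve gives the highest quality."""
    return max(curves, key=lambda h: interp(curves[h], bitrate_mbps))


print(best_resolution(CURVES, 2.0), best_resolution(CURVES, 3.5))
```

In this made-up data the two curves cross between 2Mbps and 3Mbps, so 720 wins at 2Mbps and 1080 wins at 3.5Mbps.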

This is how we encoded content up until the summer of 2015 across a library of almost 200,000 music videos: a single table matching bitrates to resolutions, shown below, calibrated against some of the most complex music videos but used for all of them.

Resolution Video Bitrate (H.264)

It is a hard number to quantify, but our best estimates indicate that with this very general approach we are wasting 10-20% of the bitrate and, with it, perceptual quality for our audience. Put another way, we could deliver a higher resolution of video for the same bitrate if we could figure out a way to reduce that waste. Below is Vevo’s solution.

A new strategy

Vevo started a project to review quality at lower levels of bandwidth, with the objective of retaining the minimum quality while lowering the bandwidth in a more content-specific and dynamic way, or of finding an improved resolution for the same number of bits.

If we review the graph presented in Figure 2.0, we see that the gray line marking the target quality cuts the low complexity (green) curve at around 1Mbps, about half the rate needed by the higher complexity (red) curve. This clearly indicates that we could reduce the bandwidth quite a bit for that video. Alternatively, we could increase its resolution, in this case to 1080, considerably improving the quality of the user experience while leaving the bitrates as they are.
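The reading of Figure 2.0 can be sketched in Python as finding, for each complexity class, the lowest bitrate that reaches the target quality. The sample curves, the target of 80 on a hypothetical 0-100 scale and the function name are assumptions chosen to mirror the 2Mbps and 1Mbps figures in the text, not measured data.

```python
def min_bitrate_for_quality(points, target):
    """points: (bitrate_mbps, quality) sorted by bitrate, with quality
    non-decreasing. Return the lowest bitrate (linearly interpolated)
    that reaches the target quality, or None if it is never reached."""
    if points[0][1] >= target:
        return points[0][0]
    for (b0, q0), (b1, q1) in zip(points, points[1:]):
        if q0 < target <= q1:
            return b0 + (b1 - b0) * (target - q0) / (q1 - q0)
    return None


# Made-up 720p curves for two complexity classes.
CURVES_720P = {
    "high": [(0.5, 40), (1.0, 60), (2.0, 80), (3.0, 86)],
    "low": [(0.5, 65), (1.0, 80), (2.0, 88), (3.0, 90)],
}
TARGET_QUALITY = 80  # hypothetical quality bar (the gray line)

high = min_bitrate_for_quality(CURVES_720P["high"], TARGET_QUALITY)
low = min_bitrate_for_quality(CURVES_720P["low"], TARGET_QUALITY)
saving = 1 - low / high
print(high, low, saving)
```

With these sample curves the high complexity video needs 2Mbps and the low complexity one needs 1Mbps, a saving of one half for the same target quality.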

The next step was to move from this analysis of a general video towards a more specific approach, understanding how complexity and other factors could “hint” at a more specific set of profiles for each and every music video.

In place of a worst-case set of rules for defining encoding profiles, we classify the complexity of each music video and apply that classification to the rate distortion curves. This enables us to gather multiple hints on how to optimize our encoding for each music video.

Fig. 4.0 : Relationship between resolution-bitrate-quality-complexity

This graph illustrates the approach taken at Vevo. Each time we analyze a piece of content, we classify it in terms of its complexity and apply an optimized scale to it.

We adjust the encoding resolution/bitrate mix based on the complexity, which in turn informs the choice of encoding parameters, scaling algorithm, and visual filters, to provide the best video quality for our customers.
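A minimal Python sketch of the idea follows, under stated assumptions: a complexity score between 0 and 1 from some upstream analysis, a single threshold, and a made-up base ladder. None of these names or values come from Vevo's platform, and the real heuristics also cover encoding parameters, scaling algorithm and filters, which are not modelled here.

```python
# Resolutions available to the encoder, lowest to highest.
RESOLUTIONS = [360, 540, 720, 1080]

# Hypothetical worst-case ladder: (height, bitrate_kbps).
BASE_LADDER = [(360, 800), (540, 1400), (720, 2000), (1080, 4000)]


def ladder_for(score, base=BASE_LADDER, threshold=0.5):
    """Return a ladder for content with the given complexity score.

    High complexity content keeps the worst-case ladder. Lower
    complexity content keeps each bitrate but moves up one resolution
    step, and any resolution that ends up listed twice keeps only its
    lowest bitrate.
    """
    if score >= threshold:
        return list(base)
    bumped = {}
    for height, kbps in base:
        idx = min(RESOLUTIONS.index(height) + 1, len(RESOLUTIONS) - 1)
        new_height = RESOLUTIONS[idx]
        bumped[new_height] = min(kbps, bumped.get(new_height, kbps))
    return sorted(bumped.items())


print(ladder_for(0.2))
print(ladder_for(0.9))
```

For a low complexity video this sketch serves 720 at the bitrate previously used for 540, and 1080 at the bitrate previously used for 720, while a high complexity video keeps the original ladder.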

We deployed a new version of Vevo’s cloud encoding platform in September 2015, after a period of data gathering to determine the heuristics for choosing the optimal parameters to encode each video.

For example, in many more cases we are able to stream 540 content at 720, and most 720 content at 1080, without harming the user's experience in any way. In many cases we actually improve perceptual quality or reduce startup time for playback on devices, which of course really matters.

The Engineering team is always eager to take on the “let's do a little better” challenge, and it’s an exciting time to be working at Vevo as we research new, innovative technologies like HDR, H.265, UHD, and beyond, both to grow our expertise and, most importantly, to ensure our customers have a best-in-class video experience.

This article was written by Nick Vicars-Harris, Fabio Sonnati and David Levinson.