VP8 vs VP9 - In the context of online video delivery

This blog post is intended for software engineers working on web-based video applications who want to understand the basics, history, and practical differences between VP8 and VP9, especially in the context of online video streaming.

The key takeaways are:

  • The basics of the VP8 and VP9 codecs and a bit of their history.
  • Practical differences between VP8 and VP9 in the context of encoding and delivering video content over the internet. We will compare compression efficiency, playback quality, compute requirements, resolution handling, HDR content handling, and hardware support.
  • Key implementation-level differences, explained briefly and in plain language rather than heavy theory.

Basics of VP8 and VP9 codec

VP8 codec

On2 Technologies originally created the VP8 codec. Google then acquired On2 and released VP8 as an open, royalty-free video format. The WebM project was launched with the aim of using VP8 as the video format for HTML5.

From the very beginning, VP8 developers focused on Internet/web-based video applications. The design targets low-bandwidth scenarios and a wide range of client hardware (from mobile devices to powerful desktops), and it is optimized for the common video format on the web (4:2:0 chroma subsampling, 8-bit color depth, and resolutions up to 16383x16383 pixels).

To encode a video file with VP8 using the FFmpeg command-line tool:

ffmpeg -i input.mp4 -c:v libvpx output.webm

VP9 codec

VP9 represents a natural evolution over VP8, incorporating several enhancements and new tools to improve compression efficiency, especially for high-definition content, with a modest increase in decoding complexity. VP9 is competitive with state-of-the-art codecs like HEVC, showing significant improvements over VP8. Like VP8, VP9 is also open and royalty-free.

For most practical use cases, you should use VP9 or its successor, AV1, instead of VP8.

To encode a video file with VP9 using FFmpeg:

ffmpeg -i input.mp4 -c:v libvpx-vp9 output.webm

VP8 vs VP9 - Practical differences for web developers

Let’s compare VP8 and VP9 in terms of compression efficiency, video quality, hardware support, and adoption.

Compression efficiency

VP9 is a significant improvement over VP8 in terms of compression efficiency: it delivers better video quality at a lower bitrate. If you are streaming HD content on the web, you will almost always benefit from VP9. libvpx, the reference implementation of both codecs, provides many knobs for tuning bitrate, quality, CPU utilization while encoding, and so on.
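As a rough sketch of those knobs, the Python helper below assembles an FFmpeg command for constant-quality VP9 encoding. The function name `vp9_quality_cmd` and the default CRF value are illustrative assumptions, not recommendations from this post; `-crf` and `-b:v 0` are the options libvpx-vp9 exposes through FFmpeg for this mode.

```python
from typing import List


def vp9_quality_cmd(src: str, dst: str, crf: int = 31) -> List[str]:
    """Build an FFmpeg command for constant-quality VP9 encoding.

    With libvpx-vp9, `-crf N -b:v 0` selects constant-quality mode.
    Lower CRF values mean higher quality and larger files (range 0-63).
    The default of 31 is only an illustrative starting point.
    """
    if not 0 <= crf <= 63:
        raise ValueError("crf must be between 0 and 63")
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libvpx-vp9",
        "-crf", str(crf),  # target quality level
        "-b:v", "0",       # no bitrate target, so CRF alone drives quality
        dst,
    ]


# Print the command so it can be inspected or pasted into a shell.
print(" ".join(vp9_quality_cmd("input.mp4", "output.webm")))
```

Returning the arguments as a list keeps the helper easy to pass to `subprocess.run` without shell quoting issues.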

Playback video quality

Despite the high compression ratio, VP9 maintains impressive video quality compared to VP8. At lower bitrates, VP9 maintains video quality more effectively than VP8, reducing the blockiness and blurring that can occur with compression. This makes VP9 a preferable choice for streaming high-quality video, especially in environments where bandwidth is a constraint.

Video resolution support

VP9 can handle frame sizes up to 65536x65536 pixels, while VP8 is capped at 16383x16383 pixels. Neither limit is a constraint for today's HD or Ultra HD streams. VP9 is the better fit for those resolutions mainly because its larger block sizes and newer coding tools compress high-resolution frames more efficiently.
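The two limits come from how each bitstream stores frame dimensions. The sketch below assumes the header layout as commonly described: VP8 writes width and height as 14-bit values, and VP9 writes each dimension minus one in 16 bits. The helper names are made up for illustration.

```python
VP8_DIMENSION_BITS = 14  # VP8 stores width and height directly
VP9_DIMENSION_BITS = 16  # VP9 stores (width - 1) and (height - 1)


def max_dimension(codec: str) -> int:
    """Return the largest width or height the codec's frame header can express."""
    if codec == "vp8":
        return 2 ** VP8_DIMENSION_BITS - 1
    if codec == "vp9":
        # Storing the value minus one lets 16 bits reach 65536.
        return 2 ** VP9_DIMENSION_BITS
    raise ValueError(f"unknown codec: {codec}")


def fits(codec: str, width: int, height: int) -> bool:
    """Check whether a frame size is representable in the given codec."""
    limit = max_dimension(codec)
    return 1 <= width <= limit and 1 <= height <= limit


print(max_dimension("vp8"), max_dimension("vp9"))
print(fits("vp8", 3840, 2160), fits("vp8", 32768, 32768))
```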

Color and Frame Rate

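VP8 is limited to 8-bit color with 4:2:0 chroma subsampling, while VP9 adds profiles with 10-bit and 12-bit color depth and with 4:2:2 and 4:4:4 subsampling. Both codecs support a wide range of frame rates, but VP9's improved compression techniques allow it to handle dynamic scenes and fast motion more effectively at a given bitrate, resulting in smoother playback and clearer images.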

Encoding speed and compute requirements

If you ran the two FFmpeg commands above, you will have noticed a stark difference in execution time. This is because VP9 encoding with libvpx is much slower at its default settings. As a codec, VP9 is impressive, but encoding is slower and more CPU-intensive than with VP8. Decoding VP9 is only slightly more expensive than decoding VP8, despite the much better compression.
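Encoding time can usually be traded against compression efficiency. The sketch below builds an FFmpeg command using the libvpx-vp9 speed options `-deadline`, `-cpu-used`, and `-row-mt`. The helper name and defaults are assumptions for illustration, and the 0 to 5 range check reflects the values generally used with the `good` deadline.

```python
from typing import List


def vp9_speed_cmd(src: str, dst: str, cpu_used: int = 2,
                  row_mt: bool = True) -> List[str]:
    """Build an FFmpeg command that trades some efficiency for encoding speed.

    Higher `-cpu-used` values encode faster at the cost of quality per bit.
    `-row-mt 1` enables row-based multithreading in libvpx-vp9.
    """
    if not 0 <= cpu_used <= 5:
        raise ValueError("cpu_used is expected to be between 0 and 5 here")
    cmd = [
        "ffmpeg", "-i", src,
        "-c:v", "libvpx-vp9",
        "-deadline", "good",          # balanced quality/speed mode
        "-cpu-used", str(cpu_used),   # speed preset within that mode
    ]
    if row_mt:
        cmd += ["-row-mt", "1"]       # use more cores per frame
    cmd.append(dst)
    return cmd


print(" ".join(vp9_speed_cmd("input.mp4", "output.webm", cpu_used=4)))
```

Timing the same clip at a few `cpu_used` values is a simple way to find an acceptable balance for your content.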

Hardware Support

Decoding - VP9 has better hardware decoding support than VP8. Due to Google’s effort and collaboration with hardware vendors, most GPUs and SoCs support VP9 decoding natively. Offloading decoding to hardware results in efficient playback of VP9-encoded videos on a wide variety of devices. VP8 has limited hardware decoding support.

Encoding - Hardware-accelerated encoding never really took off for VP9. For VP8, however, the WebM project team in Finland released the world's first VP8 hardware encoder, H1, for free on Monday, March 14, 2011.

High Dynamic Range (HDR) support

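VP9 supports HDR video through its high-bit-depth profiles (Profile 2 and Profile 3, which carry 10-bit or 12-bit color). VP8 is limited to 8-bit color and does not support HDR content.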

High dynamic range (HDR) video technology represents a significant advancement in mirroring the colors and contrast that the human eye perceives, ranging from the brightest whites to the deepest blacks (dynamic range). The goal of HDR video is to replicate the realism of images from the moment they are captured by the camera through post-production, distribution, and, ultimately, display.
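As a minimal sketch, an HDR10-style VP9 encode with FFmpeg combines a 10-bit pixel format, VP9 Profile 2, and color metadata flags. The helper name is hypothetical, and the example assumes the source is already HDR (BT.2020 primaries with the PQ transfer function): these flags tag the output, they do not convert SDR to HDR.

```python
from typing import List


def vp9_hdr10_cmd(src: str, dst: str) -> List[str]:
    """Build an FFmpeg command for a 10-bit VP9 encode tagged as BT.2020/PQ.

    Assumes `src` already contains HDR video; no tone mapping is applied.
    """
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libvpx-vp9",
        "-profile:v", "2",              # VP9 profile with 10/12-bit 4:2:0
        "-pix_fmt", "yuv420p10le",      # 10-bit 4:2:0 pixel format
        "-color_primaries", "bt2020",   # wide color gamut primaries
        "-color_trc", "smpte2084",      # PQ transfer characteristics
        "-colorspace", "bt2020nc",      # BT.2020 non-constant luminance
        dst,
    ]


print(" ".join(vp9_hdr10_cmd("input_hdr.mov", "output_hdr.webm")))
```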


Browser and device support

Though both VP8 and VP9 are supported in browsers, VP9 support is more widespread if you consider hardware-accelerated decoding. To be precise, VP9 decoding is compatible with more than 2 billion devices, encompassing browsers like Chrome, Opera, Edge, Firefox, and platforms such as Android, in addition to millions of smart TVs. Android has supported VP9 since version 4.4 KitKat and iOS/iPadOS added VP9 support in iOS/iPadOS 14.

Figure: WebM browser support.

Support in WebRTC

All WebRTC-compatible browsers must support the VP8 codec (alongside H.264) as part of the specification. VP9 support in WebRTC is optional; it is available in Chrome (48+) and Firefox.
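Because VP8 is always available, a common pattern is to prefer VP9 when both peers offer it and let negotiation fall back to VP8 otherwise. In the browser this is usually done with `RTCRtpTransceiver.setCodecPreferences()`. The sketch below shows the same idea on the signaling side by reordering payload types in an SDP offer; the function name is hypothetical, and it assumes the codec's `a=rtpmap` lines appear only in the video section.

```python
import re


def prefer_codec(sdp: str, codec: str = "VP9") -> str:
    """Move the payload types of `codec` to the front of each m=video line.

    Payload order in the m= line expresses preference, so the remaining
    codecs (including VP8) stay available as fallbacks.
    """
    lines = sdp.splitlines()
    pattern = re.compile(rf"a=rtpmap:(\d+) {re.escape(codec)}/", re.IGNORECASE)
    preferred = [m.group(1) for m in map(pattern.match, lines) if m]

    out = []
    for line in lines:
        if line.startswith("m=video "):
            fields = line.split(" ")
            # fields: "m=video", port, protocol, then the payload types
            header, payloads = fields[:3], fields[3:]
            payloads = ([p for p in payloads if p in preferred] +
                        [p for p in payloads if p not in preferred])
            line = " ".join(header + payloads)
        out.append(line)
    return "\r\n".join(out) + "\r\n"


offer = (
    "v=0\r\n"
    "m=video 9 UDP/TLS/RTP/SAVPF 96 98\r\n"
    "a=rtpmap:96 VP8/90000\r\n"
    "a=rtpmap:98 VP9/90000\r\n"
)
print(prefer_codec(offer))
```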

The WebRTC API enables the creation of websites and applications that facilitate real-time communication among users, allowing for the exchange of audio and/or video, along with optional data and additional information.

Lossless mode for archival

Lossless mode is typically used for archival and storage rather than for streaming on the web. VP9 supports lossless encoding, while VP8 doesn’t. libvpx-vp9 has a lossless encoding mode that can be activated using -lossless 1.

ffmpeg -i input.mp4 -c:v libvpx-vp9 -lossless 1 output.webm

Do not re-encode an H.264 encoded file in VP9 using lossless mode. The file size will increase without actually adding any new details. So it is important to be careful with this mode, even for archival purposes.

Key implementation differences

  • Prediction Block Sizes
    Videos are made up of frames (images). An encoder doesn't process a frame as a whole. Instead, it breaks the frame down into a grid of smaller blocks, kind of like cutting a cake into smaller pieces to manage it better. This is called block decomposition.

    VP8 is built around fixed 16x16 macroblocks, while VP9 introduces variable block sizes, allowing for more flexible and efficient compression. VP9 starts with a super-block (SB) of size 64x64 and allows a recursive decomposition of these super-blocks down to 4x4 blocks, offering a total of 13 different endpoint block sizes for defining various prediction signals. This adaptability helps VP9 better manage the trade-off between compression and image quality.
  • Prediction Modes
    Prediction modes are techniques used in video compression to predict the contents of a block of pixels based on other blocks. These modes help reduce redundancy, leading to more efficient compression by encoding only the differences between the predicted and actual blocks.

    Both VP9 and VP8 include various intra-prediction modes (used within the same frame) and inter-prediction modes (using data from other frames). But VP9 extends the prediction modes used in VP8, supporting a set of 10 intra-prediction modes for block sizes up to 32x32 and four inter-prediction modes for block sizes up to 64x64 pixels, offering more adaptability in coding different video content.
  • Tile-based Parallel Processing
    Tiling in video codecs is a feature that divides the video frame into smaller, independently decodable regions called tiles. This allows for parallel processing during encoding and decoding, improving efficiency and enabling better error resilience and adaptability to network conditions.

    VP8 does not natively support tiling as part of its standard feature set. It processes the frame as a whole, which means that any error or processing requirement affects the entire frame, potentially impacting decoding efficiency and error recovery. VP9 introduces tiling, allowing a frame to be divided into several rectangular tiles that can be decoded independently. This independence is crucial for multi-threaded decoding, enabling different processor cores to decode different tiles simultaneously, significantly speeding up the decoding process.

  • Bit-stream Features for online streaming
    VP9 includes error resilience, frame-parallelism, tiling, and alternate resolution reference frames – features specific to VP9 that are designed to support internet video delivery and consumption, enhancing the streaming experience.

    The differences between VP8 and VP9 bitstream features reflect VP9's advancements in providing more efficient compression, better error handling, and enhanced adaptability to various playback scenarios, making it more suitable for a broader range of applications, particularly in environments with fluctuating network conditions.
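The block-size and tiling ideas above can be made concrete with a little arithmetic. This is a simplified sketch, not encoder code: it enumerates the 13 VP9 block sizes, counts the 64x64 super-blocks in a frame, and estimates the maximum number of tile columns, assuming the VP9 rule that a tile column is at least four super-blocks (256 pixels) wide and that the column count is a power of two. Function names are illustrative.

```python
import math
from typing import List, Tuple

SUPERBLOCK = 64        # VP9 super-block edge in pixels
MIN_TILE_WIDTH_SB = 4  # assumed minimum tile width, in super-blocks


def vp9_block_sizes() -> List[Tuple[int, int]]:
    """List block sizes reachable by recursively splitting a 64x64 super-block."""
    sizes = []
    edge = SUPERBLOCK
    while edge >= 8:
        half = edge // 2
        # Each square block may stay whole or be halved along one axis.
        sizes += [(edge, edge), (edge, half), (half, edge)]
        edge = half
    sizes.append((4, 4))  # smallest block, from splitting an 8x8 in four
    return sizes


def superblock_grid(width: int, height: int) -> Tuple[int, int]:
    """Return (columns, rows) of super-blocks needed to cover the frame."""
    return math.ceil(width / SUPERBLOCK), math.ceil(height / SUPERBLOCK)


def max_tile_columns(width: int) -> int:
    """Largest power-of-two tile column count that keeps tiles wide enough."""
    sb_cols = math.ceil(width / SUPERBLOCK)
    log2_cols = 0
    while (sb_cols >> (log2_cols + 1)) >= MIN_TILE_WIDTH_SB:
        log2_cols += 1
    return 1 << log2_cols


print(len(vp9_block_sizes()))        # 13
print(superblock_grid(1920, 1080))   # (30, 17)
print(max_tile_columns(1920))        # 4
```

For a 1080p frame this gives up to four tile columns, which is why multi-threaded VP9 decoding tends to scale better as resolution grows.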

How to choose between VP8, VP9, and other next-gen codecs

When it comes to choosing between VP8 and VP9, the choice is simple: go with VP9.

More generally, though, the choice of a codec, or of several codecs, depends on the answers to the following questions:

  • Are you inclined to utilize an open-source format, or are you open to considering proprietary formats as well?
  • Are you willing to sacrifice support in some browsers? This will depend on your application.
  • Can you encode each of your videos in the latest codecs (VP9 or its successor, AV1)? Having a universally supported fallback option like H.264 (AVC) can greatly streamline your decision-making.
  • More importantly, can you detect device capability and serve different formats to different devices? This can quickly become complex when combined with a content delivery network (CDN), because you will need to configure the CDN to cache multiple copies of the same content and serve the correct one.
This is where a tool like ImageKit can help: it can automatically convert the video format and deliver the right one based on device support.
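If you handle capability detection yourself, the server-side part can be as small as the sketch below. It assumes the client reports which MIME types it can play (for example, collected in the browser with `MediaSource.isTypeSupported()`); how that list reaches the server, the file names, and the function name are all hypothetical.

```python
from typing import Iterable, Optional

# Ordered by preference; H.264 in MP4 acts as the widely supported fallback.
RENDITIONS = [
    ('video/webm; codecs="vp9"', "video.vp9.webm"),
    ('video/webm; codecs="vp8"', "video.vp8.webm"),
    ('video/mp4; codecs="avc1.42E01E"', "video.h264.mp4"),
]


def pick_rendition(supported: Iterable[str]) -> Optional[str]:
    """Return the first preferred rendition the client says it can play."""
    supported_set = set(supported)
    for mime, filename in RENDITIONS:
        if mime in supported_set:
            return filename
    return None  # nothing playable; the caller decides how to respond


client_types = ['video/webm; codecs="vp8"', 'video/mp4; codecs="avc1.42E01E"']
print(pick_rendition(client_types))
```

When a CDN sits in front of this logic, the chosen format needs to be part of the cache key (for example, by giving each rendition its own URL) so that one device's format is not served to another.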

VP9 encoding with ImageKit

ImageKit is a media management and delivery platform for high-growth teams. It provides real-time video processing APIs to process, transform, and stream videos across devices directly from the video's URL, without worrying about intricate architecture details around encoding and browser support. You can use the forever-free plan, which is enough for many small projects.

Conclusion

  • VP9 is superior in terms of compression efficiency and HD and UHD content handling. When it comes to decoding, it's only slightly slower compared to VP8.
  • If you are building a video-on-demand application, choose VP9 or its successor, AV1.
  • Greatly simplify encoding and delivery-related architecture by using a tool like ImageKit.