If video forms a critical component of your digital asset strategy, then you need to know how to avoid lag, stutters, and freezes for your video content.
If your videos/streams regularly take longer to load – or worse, freeze – viewers are just going to move on.
There are two main reasons for poor-quality streams:
- Your streams are either using way too much bandwidth for your viewers (causing frequent buffering) or not enough (causing poor image quality).
- Your streams use an encoding standard that isn't ideal for the content you're pushing -- either for latency or compatibility.
The trick to optimizing your content, then, isn’t faster internet, but knowing which modern video encoding standard – like H.264 or H.265 – to use, where, and why.
Such video encoding technologies solve the issue with better, more efficient compression, which means you use less bandwidth for broadcasting a stable stream at the same (or better!) quality, and your viewers need less bandwidth to watch your content on any device they want, directly translating to better engagement numbers for you.
What are H.264 and H.265?
H.264 (AVC) and H.265 (HEVC) are video compression standards used in digital video recording and distribution.
What is this “compression”? Why do we need H.264 or H.265? Well, that would be because raw input from cameras and microphones is massive.
Don’t believe me? Here’s some quick math:
Bits per frame = 1920 * 1080 * 24 = 49766400
Bits per second (at 60 frames per second) = 49766400 * 60 = 2985984000, around ~375 MB per second.
File size for 5 seconds of this footage at 60 fps = 2985984000 * 5 = ~1866 MB = ~1.87 GB
That’s nearly 2 GB for only 5 seconds of footage, not counting audio!
This is not suitable for online streaming, no matter how fast your network speed is.
To overcome this problem, we compress our footage with software or hardware (“codecs”, which do the actual mathematical operations specified by these compression standards) to achieve a satisfactory quality-to-size ratio before sending them to viewers, tiny chunks at a time, ideally via an optimizing CDN.
This compression is the critical step, for both on-demand and real-time content. Let’s talk about the two most well-known of these methods in the era of HTML5 - H.264 and H.265.
Also known as AVC or Advanced Video Coding, part of the MPEG4-Part 10 spec, H.264 was developed by the ISO/IEC MPEG (Moving Picture Experts Group) and ITU-T VCEG (Video Coding Experts Group), and first made available in 2014. It is ubiquitous; standardized near-universally after the release of the Apple iPad, and found everywhere from Blu-Rays to Twitch and Netflix, to the medical and surveillance industries.
Informally known as HEVC or High Efficiency Video Coding, part of the MPEG-H spec) is the successor to H.264. Made by the same creators, the HEVC standard was first made available in 2013. Far more efficient, enabling even higher-resolution streaming and boasting even better compression that allows for the same visual fidelity at just half of H.264’s bitrate.
What do our numbers look like if we used these, then?
So, for 5 seconds of the same footage, 2.5 * 5 = ~12.5 MB
And with the default H.265 settings, you only need ~1.5 MB per second for the same quality.
So, 1.5 * 5 = ~7.5 MB
That’s a whopping 150x reduction in storage and bandwidth requirements from just using basic H.264, and 250x from using H.265.
The way the two algorithms accomplish this, is also where they differ.
What are the key differences between H.264 and H.265?
Let’s look at optimal bandwidth requirements for your video content over the internet and then break it down by the most common resolution-frame rate combinations.
These numbers use the official YouTube guidelines and adjust them for H.265, with a buffer amount of 30% in mind for both. Upstream bandwidth is almost always unstable on any ISP in any country, so you shouldn’t saturate it completely if you want stutter/lag-free streams.
|Resolution-Frame Rate combination||H.264/AVC
Optimal bandwidth needed
Optimal bandwidth needed
(1280 X 720 @ 16:9) at 60 FPS
|7200 kb/s or ~7.2 Mbps||4300 kb/s or ~4 Mbps|
(1920 x 1080 @ 16:9) at 60 FPS
|16000kb/s or ~16 Mbps||9400 kb/s or ~9 Mbps|
(3840 x 2160 @ 16:9) at 30 FPS
|32400 kb/s or ~32 Mbps||18700 kb/s or ~19 Mbps|
You can get away with lower if live streaming mostly static content – for example - chess games.
The main takeaway here, as also noted in this paper published by the IEEE, is a 50-60% reduction in storage and bandwidth requirements going from H.264/AVC to H.265/HEVC, making AVC “good enough” for anything up to 1080p content, and HEVC ideal if you want to create and broadcast UHD/4K content.
2. Compression techniques
How do codecs implementing the H.264/H.265 algorithm – for example, x264/x265 (software, open-source), or NVENC (hardware, proprietary NVIDIA) – work their magic? Well, they perform complicated mathematical processes on RAW video, reducing sizes by essentially throwing out data that the human eye cannot perceive, not at the usual resolutions content is made at, anyway.
First, each video frame is divided into a grid of pixel blocks, with low detail areas being larger (the default; 16x16 blocks), and high detail areas being smaller (transformable down to 8x8 or 4×4). Each of these is called a macroblock.
Next, the codec looks for similarities between macroblocks either
- Within the same frame (an Intra-frame, I-frame, or a keyframe) taking advantage of the similarity in luminance and chrominance between macroblocks within each frame, i.e. spatial redundancy, or
- Within the previous frame (Predicted; P-frames), or the frame before AND after it (Bidirectional; B-frames), and then relying on motion vectors to make up the difference by mathematically predicting how things move across the screen. i.e. Temporal redundancy.
…and groups them together for compression. When finished, the information between keyframes is dramatically reduced, and so is the size of the video file.
About using only I-frames and only B-frames
Using only I-frames (like Avid’s DNxHD does) would be bad for any scene that has a lot of moving objects (think big CGI-filled action sequences), while using only B-frames would be resource-heavy for both encoding AND decoding (because they need two in-memory queues to keep the encoding and decoding orders synchronized). Thus, H.264 codecs have presets and settings to fine-tune this balance to your liking.
H.265 uses Coding Tree Units (CTUs), not conventional macroblocks. You’re still using grids of pixels, though, so you can conceptually think of them as blocks, with the biggest change from H.264 being how many more options you have for the size of the block that is encoded. The lower limit is still 4x4, but now you can go all the way up to 64x64, meaning you can bump the resolution higher than with H.264, then spend far fewer bits on large areas of little detail and more on areas with fine detail – especially great for busy action scenes.
HEVC also uses spatial and temporal prediction to make up the difference for moving objects but packs a far more advanced solution for them (Merge Mode and Advanced Motion Vector Prediction). Plus, it boasts improved deblocking filters and sample adaptive offset filtering, ensuring better image quality.
All of this combined means that HEVC requires a lot more powerful hardware than AVC, and videos take a lot longer to encode, but they absolutely deliver on the promise of ~50% file size/bandwidth reduction for the same quality, with even better gains as you move up to 4K, 6K, and even 8K content, making them viable for the internet to begin with.
3. Image Quality
A picture is worth a thousand words, so here’s a side-by-side comparison of a frame from this video at 4K, transcoded at 1080p using the open-source Handbrake, at the same bitrate, using x264 and x265 respectively. With H.265, the increase in detail for scenes in motion cannot be overstated.
This metric is important because, after all, making pre-made videos available on-demand (think YouTube, Netflix, etc.) isn’t all you’d be doing as a content creator. What about use cases involving real-time feeds, like WebRTC?
H.265 (via the x265 codec) definitely delivers on the promise of ~50% smaller file sizes that can be sent faster since there is less data to send, but encoding times are 10-20x longer than the well-established H.264 (via the x264 codec).
On the decoding side (natively; ffh264 for H.264, and ffhevc for H.265), H.265 is about 30% slower.
This is a huge drawback if you make content that is live and interactive. The worse the delay between the streamer/broadcaster and the audience, the worse the live-streaming experience will be on both sides. From a business perspective, it is necessary to weed out streaming issues at the earliest so that your viewers will not only watch but could be persuaded to take a purchase decision.
The increased overhead of HEVC leads to latency issues compared to AVC/H.264.
Now, for the moment of truth.
H.264 vs. H.265: Which should I pick for high-quality buffer-free streaming?
You might assume that HEVC is the superior standard, since…well, it’s the newest one, right?
Not exactly. It depends on:
- What your content is, primarily, and
- What resolution that content needs to be at.
But here are some good rules of thumb to live by:
Do you upload on-demand content on YouTube?
Use H.265, and encode at ideally one resolution step higher than you want it to be seen at. You should ensure the highest quality uploads you can on your end. These third-party content platforms often end up transcoding your files anyway to better fit their storage/broadcast requirements (YouTube currently uses their proprietary VP9 codec for video, for example) and you’d have no control over that process.
Do you mainly do live streams?
Use H.264. For WebRTC/Twitch feeds, your latency requirements take priority over your space/bandwidth requirements. H.265, while amazing, also takes 10-20x longer to encode, which is just not going to cut it for real-time use cases. Not to mention how WebRTC is notoriously difficult to get working on some devices using H.265.
Does your content need to be 4K or higher?
H.265 all the way. This is a likely scenario if you’re working in the film/art industries and need to upload your portfolio somewhere that’s not YouTube. H.264 provides acceptable image quality at 1080p and lower, but for your needs, the higher quality the better. Plus, you’d see even better gains than the promised ~50% the higher resolutions you push (6K and 8K).
Does your content need to be accessible to everyone, no matter which device they're on (or how old these devices are)?
H.264 is the only choice here. It has had decades to mature and cultivate an ecosystem that supports it universally. Pretty much every OTT/streaming platform and device (including old cameras) supports it natively, and there have been tremendous efficiency gains in the past decade that make it work without too much hassle on older hardware.
What makes H.264 ideal for online video streaming?
As you’ve just seen, H.265 absolutely has its uses. For online video streaming, though, H.264 has been for decades and still remains, the ideal video compression standard. Let’s run down the reasons why.
- Widespread adoption
H.264 is supported by pretty much everything made since 2013 – cameras, drones, browsers – while native H.265 is limited to specific mobile chipsets, and browsers like Edge and Safari only (so no Android).
It occupied 82% of the global market of video codecs and containers till 2018 (Statista). Its market share has evidently increases since Apple rejected Adobe Flash and made the (then controversial) move to champion H.264 and HTML5 video for the iPad and connected devices.
- A ‘good enough’ quality-to-file size ratio
x264 works well enough for 95% of the video content on the internet, and the bitrate/quality ratio it offers has been good enough to be used everywhere from Blu-Ray discs to Twitch streaming, for years now. Yes, H.265 wins vs. H.264 at raw numbers, but at what cost? Its encoders are far slower, so when you modify its settings to get faster encode times, it lowers quality.
Similarly, when you set it to give you better quality, those speed gains go away. It’s far more difficult to strike a good balance between the two with H.265 than it is with H.264, which “just works” at even the baseline quality factor.
- Doesn’t need powerful hardware for encoding/decoding
H.264 and its open-source codec, x264, is simply a more well-optimized and mature software than HEVC and x265. A 50-60% reduction in final file sizes comes at the hefty cost of 10x more CPU power required, and many devices simply cannot do this efficiently. Remember, high-end PC users aren’t the only audience for video content.
- No licensing chaos
You can’t use either H.264 or H.265 without paying royalty fees. But while H.264/AVC only has 1 patent pool associated with it, HEVC has three: HEVC Advance, MPEG LA, and Velos Media. This makes the former much more attractive for commercial use cases.
Hopefully, now you have enough of a technical primer to jump into streaming, and content creation. RAW videos must be encoded/compressed into sizes and formats manageable enough for the internet, before they can be uploaded and cached on a content delivery network (CDN) that can optimize them for delivery, from servers nearest to your viewers.
Striking a balance between file sizes, visual fidelity, latency, and in a format that lets you reach as many people as you can is critical for this industry, and knowing your stuff when it comes to compression techniques like H.264 and H.265 will help you do just that.