A Crash Course in HTML 5 Video

If you want to watch Internet-delivered video on your PC, the vast majority of Web sites have settled on a single, consistent way to do that. That's the good news. The bad news is that this single, consistent delivery system is Adobe Flash, with all its security and stability issues.

But now a new way to deliver video through a browser is coming to the fore, one intended to be native to the browser itself: HTML 5's <VIDEO> tag. In this article I'll look at how the <VIDEO> tag can be used with the new generation of browsers. I'll also examine how parts of this equation -- the browsers and, to some degree, the video formats themselves -- are still very much in flux.

Online video before HTML 5

One could fill a decent-size book talking about all the formats that have been used to deliver Web video at one time or another: Microsoft's .avi and .wmv container formats and the gang of codecs delivered with them, Apple's QuickTime, RealNetworks' RealVideo and RealAudio formats, and so on. Microsoft's Silverlight also deserves mention, since it allows providers such as Netflix to distribute content with embedded copy protection -- a feature not likely to fall out of demand as long as money changes hands for video content.

However, the video delivery system that's most widely deployed right now is Adobe's Flash.

The Flash Player was, and still is, one of the few browser add-ons that almost everyone is likely to have. Browsers on Macs and PCs alike typically support Flash by default, since a growing amount of Web content in general depends on it. It could be argued that Flash has become a video-delivery system as a byproduct of its original intention, which was to bring vector-based animation to the Web.

But Flash has problems as a video delivery system. It's proprietary. It requires the use of third-party code rather than something native to the browser. It has been lambasted for its poor security and its instability. The list goes on. It's a solution, when what people have been hungering for is the solution.

Hello to the <VIDEO> tag

The history of the <VIDEO> tag starts with the Web Hypertext Application Technology Working Group (WHATWG), a consortium made up of folks from Apple, the Mozilla Foundation and Opera Software. The WHATWG was created in 2004 to focus on the development of HTML 5 as a response to what it felt was the disregard of the World Wide Web Consortium (W3C) for real-world developers vis-à-vis XHTML and the then-extant HTML standards.

The first proposals for a <VIDEO> tag were submitted to the WHATWG in 2007 by Opera Software. The idea was simple: Create a framework in which Web browsers can natively play back video without being forced to fall back on third-party plug-ins. The user gets the experience of video that just works, those hosting the video have less maintenance to perform, and everyone walks away happy.

That's the theory, anyway. The practice has been another story entirely.

The codec conundrum

When the <VIDEO> tag was first proposed in the HTML 5 draft specification, one key omission from the spec was which video (and audio) codecs would be natively supported by the browser. As a result, while there are several video codecs that can be used in conjunction with the <VIDEO> tag, browser makers are not obliged to support any one of them: It's entirely their choice which codecs to include support for.

The original plan involved specifying the Theora video and Vorbis audio codecs as a baseline that all browsers should be able to play, but this was dropped in favor of an approach where no specific codec was recommended. Instead, the WHATWG expressed a desire for a codec that could be used in an unencumbered fashion and had a better guarantee of patent indemnity than Theora/Vorbis offered at the time.

The change sparked criticism among developers and might well have been one of the motivating factors in Google's offer of the VP8 codec as another baseline codec candidate.

In the end, the following three codecs have emerged as the main contenders for <VIDEO> tag support: H.264, Theora and VP8.

H.264: Microsoft and Apple have been major proponents of the H.264 codec family, which has already been broadly implemented and supported -- not just on the Web, but in cameras, in Blu-ray discs, and many other media that need powerful, efficient compression.

What's contentious about H.264 is not the technology itself but the licensing. H.264's usage is governed by the MPEG LA group, which levies a sliding scale of fees for H.264 based on the intended use. That said, the vast majority of end users on the Web might never pay anything for using H.264, for a couple of reasons.

First, MPEG LA has stated that for the next five years it will collect no royalties for H.264 Web streams that are offered free to end users.

Second, in the cases where you're dealing with for-pay content, odds are that the usage fees have already been assumed by someone else. For example, if you're encoding stuff in Windows and uploading it to YouTube as a pay-per-view item, you pay no licensing fees for using H.264, because any costs that might be levied have already been assumed by (in this case) Microsoft and Google.

For additional information on this issue, Ed Bott of ZDNet has explained how H.264 licensing fees work and why it wouldn't be in the MPEG LA's interest to suddenly ratchet up licensing fees when the current free-to-stream provisions for Web playback are up for revision in a few years. Florian Mueller's analysis is also interesting -- he examines the MPEG LA licensing terms from the point of view of an opponent of software patents, noting that the MPEG LA's licensing scheme, while not an ideal arrangement, does serve a useful function in a world where software patents exist and must be acknowledged.

That said, companies like Mozilla have not been set at ease -- for example, according to Mike Shaver, vice president of engineering at Mozilla Corp., MPEG LA's licensing isn't flexible enough to make solid exceptions for free software. Mozilla has opted to support Theora/Vorbis directly in its Firefox browser (and will support WebM in Version 4.0), and it has no plans to add native H.264 support.

Theora: Free software proponents have advocated the open Theora video format (with its matching Vorbis audio codec), which requires no licensing fees at all and has implementations immediately ready to use. But Theora has been criticized on a number of grounds: It isn't as technologically advanced as other codecs; there isn't much material encoded in the format, so current video would have to be recoded; and Theora's patent status could be subjected to future legal challenges (something Steve Jobs has hinted at).

VP8: A more evolved version of the Theora codec family (they share common ancestors), VP8 was developed by On2 Technologies, which also created one of Flash's video codecs. Google has since purchased On2 outright, and while Google now owns the patent for VP8, it's allowing unrestricted use of the codec without licensing fees under the banner of "the WebM Project." (WebM is Google's name for VP8 video plus Vorbis audio.)

This makes VP8 sound like a sure thing, but there are two problems. The first is that there are serious questions about how polished the spec is -- a factor that has serious implications for, say, hardware devices that shoot video directly in VP8. If VP8 is going to be in flux, then cameras that shoot video in VP8 would need to be firmware-upgradable (and have updates published by their makers) to use newer, better performing versions of the codec.

Another problem is VP8's quality and compression efficiency compared to H.264. One analysis, by Jason Garrett-Glaser, a developer on the FFmpeg project, has put the quality of VP8 on a par with H.264's "baseline" spec -- in other words, good but not great, and with H.264 way out in front in certain respects. He also believes that VP8's spec relies way too much on the snippets of code provided by Google. Most specifications for a standard (like the <VIDEO> tag itself) are drafted and discussed in depth before a single line of code is written; in Garrett-Glaser's view, the only real VP8 spec we have right now is the code, a cart-before-the-horse situation.

How to add HTML 5 video to your site

The codecs you choose as your starting defaults should be dictated at least in part by what browsers are run by the majority of your visitors. Mark Pilgrim's Dive Into HTML 5 site has a detailed dissection of the competing and conflicting codecs, and it includes a handy chart that describes what current and next-generation browsers will support. Chrome is way out in the lead: The upcoming Chrome 6 will support all three major families of codec out of the box. As mentioned before, Firefox will support WebM in its upcoming Version 4.0, and it supports Theora, but not H.264, in Versions 3.5 and up. The most recent Internet Explorer 9 Platform Preview plays back H.264 natively; support for other codecs will most likely only be available as add-ons.
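If you'd rather check a particular browser yourself than rely on a chart, the page below is a minimal sketch of one way to do it. It creates a <VIDEO> element in script and asks it, through the canPlayType method, about each of the three codec families. The MIME strings are commonly used examples rather than the only valid ones, and the "results" id is just a name chosen for this illustration.

```html
<!DOCTYPE html>
<html>
<head><title>Codec check</title></head>
<body>
<ul id="results"></ul>
<script>
  // MIME types (with codec parameters) for the three families discussed above.
  // These particular codec strings are illustrative assumptions.
  var formats = {
    "H.264": 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
    "Theora": 'video/ogg; codecs="theora, vorbis"',
    "WebM (VP8)": 'video/webm; codecs="vp8, vorbis"'
  };
  var probe = document.createElement("video");
  var list = document.getElementById("results");
  for (var name in formats) {
    var item = document.createElement("li");
    // canPlayType returns "probably", "maybe", or "" (empty string means no).
    // Browsers without <video> support have no canPlayType method at all.
    var answer = probe.canPlayType ? probe.canPlayType(formats[name]) : "";
    item.appendChild(document.createTextNode(name + ": " + (answer || "no")));
    list.appendChild(item);
  }
</script>
</body>
</html>
```

Treat the answers as hints rather than guarantees: "maybe" means the browser recognizes the container but can't be sure about the codec until it tries to play the file.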

So if you're planning on adding HTML 5 <VIDEO> support to your site, what's the best way to cut through this Gordian Knot of standards? Right now, the only viable long-term answer is to hedge your bets by doing the following:

1. Encode your video in at least two different formats, with Flash being one of them as a universal worst-case fallback.

2. Set up your <VIDEO> tags to degrade gracefully, so that browsers without support for a given tier of video will fall back to whatever else is available.

3. Test your site tirelessly -- not just with multiple browsers, but with multiple versions of individual browsers and on as many different platforms as you can: desktops, laptops, smartphones, etc.
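As a rough sketch of what the first two steps can look like in markup, the snippet below lists several encodes of the same clip and nests a Flash player inside the <VIDEO> element as the last resort. The file names (clip.mp4, clip.webm, clip.ogv, player.swf) and the flashvars value are hypothetical; the parameters a Flash player expects depend entirely on which player you use. Tag names are written in lowercase here, which is equivalent, since HTML tag names are case-insensitive.

```html
<!-- Hypothetical file names; substitute your own encodes. -->
<video controls width="640" height="360">
  <!-- The browser tries each source in order and plays the first it supports. -->
  <source src="clip.mp4" type='video/mp4; codecs="avc1.42E01E, mp4a.40.2"'>
  <source src="clip.webm" type='video/webm; codecs="vp8, vorbis"'>
  <source src="clip.ogv" type='video/ogg; codecs="theora, vorbis"'>
  <!-- Browsers that don't understand <video> skip the tags above and render
       this nested content instead. -->
  <object type="application/x-shockwave-flash" data="player.swf"
          width="640" height="360">
    <param name="movie" value="player.swf">
    <!-- Player-specific and assumed for illustration only. -->
    <param name="flashvars" value="file=clip.mp4">
    <!-- Final fallback when Flash is missing too. -->
    <p>Your browser can't play this video.
       <a href="clip.mp4">Download it</a> instead.</p>
  </object>
</video>
```

The type attributes are optional, but they let a browser skip formats it can't play without first downloading part of each file.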

Conversion tools

Assuming you've decided which codecs you will use to run videos in HTML 5, you then have to convert your video into that format. There are several tools available.

H.264 tools

Because H.264 is already a broadly used standard, odds are that whatever professional-grade program you have for creating video (such as Adobe Premiere or QuickTime Pro) will support exporting in that format. That said, there are also several open-source/free H.264 encoders available: for example, the ffdshow library, packaged for Windows as the "ffdshow tryouts" codec pack, or the stand-alone programs Handbrake and Avidemux.

Note that your use of any of these tools must conform to the licensing requirements for H.264. Using an open-source implementation of H.264 doesn't absolve you of this. Generally, if you're rehosting video through a provider who already has a licensing agreement (e.g., YouTube), or you're not creating video "where there is remuneration for the title distributed," you won't have to pay anything. But you still need either to sign a license agreement with MPEG LA to use H.264 or to host your content with a third-party provider that already has one.

Theora tools

In keeping with Theora's free-and-open promise, the tools for creating Theora videos are available free of charge across multiple platforms.

An interesting place to start is the Firefogg extension for Firefox, which lets you use Firefox 3.5 and up as a front end for a Theora video converter. Feed it a video file, set a few basic options, click Save, and the conversion takes place in the browser as you watch. Be warned that the program is picky about the file format you provide: The .mov files that came from my digital camera had to be converted into .avi before they could be used. Firefogg also trades convenience for power: It's easy to use, but you can convert only one file at a time.

A more powerful but less convenient tool is the ffmpeg2theora command-line encoder utility. It's more powerful in that it gives you complete control over the encoding parameters, less convenient in that you have to supply a whole slew of switches to the program to work it. Your best bet is to use a front end of some kind, such as Theora Converter, which allows you to batch-process files and see the most important options at a glance (but be warned -- it's still in alpha). The above-mentioned Handbrake also exports to Theora.

Finally, if you use programs that export through DirectShow filters, xiph.org has a DirectShow Theora filter in both 32- and 64-bit implementations.

WebM tools

Because WebM is still very new -- especially in its current no-license-fee incarnation -- the tool set isn't as polished as it is with Theora or H.264. The WebM project's Web site lists only a few basic tools, including a DirectShow filter for Windows and a stand-alone command-line encoder called makeWebm. It's important to realize that WebM is subject to further refinement and improvement, and therefore these tools are likely to undergo refinement as WebM itself is changed.

(Incidentally, the just-released beta 1 of Firefox 4.0 supports WebM playback. Try it out for yourself: Go to www.youtube.com/html5, click "Join the HTML5 Beta," and add "&webm=1" to any search to look for WebM-encoded videos.)

Using the <VIDEO> tag

Codecs aside, the most important thing about using video in HTML 5 is the construction of the <VIDEO> tag itself. In a perfect world, you'd just need to point to the video stream in question, like this:

<VIDEO SRC="video.mov"></VIDEO>
