The problem with upcoming video codecs

Required background: Video codecs:1, Algorithms:1

An important aspect of computer algorithms is resource tradeoffs. In general, algorithms are about processing input data to produce output data. The obvious resources are the time/effort it takes to process, the accuracy of the output data, and the storage space required for the input data, the running algorithm, and the output data. Of course, tradeoffs only make sense in the context of comparing algorithms.

Consider the simple example of two sorting algorithms, merge sort and quicksort. Merge sort requires twice the runtime storage that quicksort does. On the other hand, it produces slightly more accurate results in that it is "stable" (i.e. the input order of items with equal key values is preserved in the output). Mergesort also has a guaranteed running time whereas quicksort's running time is data-dependent.

One chooses an algorithm based on what tradeoffs are acceptable/desirable for the particular application. If sorting stability is important in a given application, quicksort's tradeoffs are not acceptable. If sorting stability is not a requirement but the data to be sorted is nearly as large as available storage, merge sort's tradeoffs are not acceptable. This brings us to codecs in general and video codecs in particular.

Even more specifically, let's consider DVDs. The audio and video on a DVD is encoded as MPEG-2 (we'll ignore the encryption scheme for now). It is a lossy compression codec, meaning the accuracy of the output is guaranteed to be lower than the input. In fact, the encoding algorithm has a tunable tradeoff between accuracy of output and size of output.

For encoding, the time/effort, input storage, and runtime storage it takes are largely irrelevant since it is done once to prepare the DVD master and never again. Decoding is constrained by the available storage on a DVD (though, certainly, many movies are released to DVD with supplemental discs containing extras and such, it is pretty important that the movie itself fits on a single disc) and the processing power that can be made available in a consumer electronics component (the more processing power that is required, the higher the price of the device, not to mention power consumption and heat dissipation, thus economic realities set limits on the processing power available for decoding). The MPEG2 codec (and the DVD media format) had to be developed to have tradeoffs that satisfied these constraints.

Economic constraints change, however, and they have. The DVD media format is showing its age and there are new formats on the horizon (HD-DVD and Blu-Ray, in particular). Furthermore, there is reason to believe that in the near future video will be stored on hard drives rather than optical discs and will be delivered there over an internet connection. Typical bandwidth to the American home isn't up for speedy delivery of even MPEG2-encoded video right now, but this can be expected to improve as WiMax networks, fiber to the home, and other technologies (maybe broadband over power lines) become widely available. Meanwhile, though the cost of processing power has gone down (as it always does), there is increasing desire to view video on battery-operated devices (e.g. video iPod, cellphones, laptops, portable all-in-one DVD players). Processing power translates directly into electrical power consumption, and portable energy density isn't improving quickly. Storage, however, even storage that can be run energy efficiently (e.g. flash memory rather than 7200 or 5400 RPM hard disks), is getting cheaper and larger.

This points to a need for a different set of tradeoffs. The accuracy of MPEG2-encoded video is almost certainly sufficient for portable applications, but there is less need to restrict the encoded size of the video and more need to restrict the processing requirements for decoding it. This does not seem to be the direction of video codecs, however.

The newer video codecs, such as MPEG4, H.264 (which is part 10 of MPEG4), and Windows Media (which also involves part of MPEG4), are all about better output accuracy with lower output storage at the expense of higher processing requirements for decoding (as compared to MPEG2). This makes sense when considering yesterday's economic realities, but not tomorrow's. The video iPod plays H.264 videos. It's battery life when playing videos is advertised as 2 hours, as opposed to 14 hours when just playing music. It is unsurprising that decoding video takes more effort than decoding audio, and changing the LCD screen 30 times/second while keeping its backlight on certainly accounts for some of the power consumption, but should it really take seven times the power?

This isn't the only time that technology has been developed that does not serve the needs of the future or even the present, and it won't be the last. It is something to learn from, however.

Tags: , ,

Comments: Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?