compression
lossy techniques reduce data both through complex mathematical encoding and through the selective, intentional shedding of visual information that our eyes and brain usually ignore, and can lead to perceptible loss of picture quality. 'lossless' compression, by contrast, discards only redundant information. codecs can be implemented in hardware or software, or a combination of both. they have compression ratios ranging from a gentle 2:1 to an aggressive 100:1, making it feasible to deal with huge amounts of video data. the higher the compression ratio, the worse the resulting image: colour fidelity fades, artefacts and noise appear in the picture, the edges of objects become over-apparent, until eventually the result is unwatchable.

by the end of the 1990s, the dominant techniques were based on a three-stage algorithm known as dct [discrete cosine transform]. dct exploits the fact that adjacent pixels in a picture, whether physically close in the image [spatial] or in successive images [temporal], often carry similar values. a mathematical transform, a relative of the fourier transform, is performed on grids of 8*8 pixels [hence the blocks of visual artefacts at high compression levels]. the transform itself doesn't reduce data, but the resulting frequency coefficients are no longer equal in their information-carrying roles: it's been shown that for visual systems, the lower-frequency components are more important than the high-frequency ones. a quantisation process weights them accordingly and ejects those contributing least visual information, depending on the compression level required. for instance, losing 50 per cent of the transformed data may result in a loss of only five per cent of the visual information. finally, entropy encoding, a lossless technique, jettisons any truly unnecessary bits. initially, compression was performed by software.
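the three stages described above [transform, quantisation, entropy coding] can be sketched in a few lines of python. this is a toy illustration, not any real codec's implementation: the quantise step here simply zeroes high-frequency coefficients, where real jpeg/mpeg quantisers divide by a perceptual weighting table before the entropy-coding stage.

```python
import math

N = 8  # dct block size used by jpeg and mpeg

def dct2(block):
    """2-d dct-ii of an 8*8 block [a didactic, unoptimised sketch]."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        cu = math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)
        for v in range(N):
            cv = math.sqrt(1 / N) if v == 0 else math.sqrt(2 / N)
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = cu * cv * s
    return out

def quantise(coeffs, keep=4):
    """crudely model quantisation: zero every coefficient whose combined
    frequency index is too high [real codecs divide by a weight table]."""
    return [[c if u + v < keep else 0.0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]

# a flat 8*8 block: all its energy lands in the single dc coefficient,
# which is why smooth areas of a picture compress so cleanly
flat = [[100.0] * N for _ in range(N)]
coeffs = dct2(flat)
assert abs(coeffs[0][0] - 800.0) < 1e-9   # dc term carries everything
assert abs(coeffs[3][5]) < 1e-9           # ac terms are essentially zero
```

the energy-compaction behaviour shown in the comments is the whole point of the transform stage: it doesn't shrink the data, it rearranges it so that the quantiser can discard the least visible parts.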
limited cpu power constrained how clever an algorithm could be to perform its task in a 25th of a second, the time needed to draw a frame of full-motion video. nevertheless, avid technology and other pioneers of nle [non-linear editing] introduced pc-based editing systems at the end of the 1980s using software compression. although the video was a quarter of the resolution of broadcast tv, with washed-out colour and thick with blocky artefacts, nle signalled a revolution in production techniques.

at first it was used for off-line editing, when material is trimmed down for a programme. up to 30 hours of video may be shot for a one-hour documentary, so it's best to prepare it on cheap, non-broadcast equipment to save time in an on-line edit suite. although the quality of video offered by the first pc-based nle systems was worse than the vhs vcrs used for off-line editing, there were some advantages. like a word processor for video, they offered a faster and more creative way of working. a user could quickly cut and paste sections of video, trim them and make the many fine-tuning edits typical of the production process. what's more, importing an accurate edl [edit decision list] generated by an nle system into the on-line computer on a floppy disk was far better than having to type in a list of time-codes. not only was nle a better way to edit but, by delivering an off-line product closer to the final programme, it reduced the time needed in the on-line edit suite.

nle systems really took off in 1991, however, when hardware-assisted compression brought vhs-quality video. the first hardware-assisted video compression scheme is known as m-jpeg [motion jpeg], a derivation of the dct standard developed for still images known as jpeg. jpeg was never intended for video compression, but when c-cube introduced a codec chip in the early 1990s that could jpeg-compress as many as 30 still images a second, nle pioneers couldn't resist.
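an edl is, at heart, a list of time-codes, and the arithmetic behind them is simple. the sketch below assumes pal's 25fps and the common hh:mm:ss:ff notation; real edl formats [cmx 3600 and its relatives] also carry reel names, track assignments and transition data.

```python
FPS = 25  # pal; the article's '25th of a second' frame time

def tc_to_frames(tc):
    """'hh:mm:ss:ff' -> absolute frame count."""
    h, m, s, f = (int(part) for part in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * FPS + f

def frames_to_tc(n):
    """absolute frame count -> 'hh:mm:ss:ff'."""
    n, f = divmod(n, FPS)
    n, s = divmod(n, 60)
    h, m = divmod(n, 60)
    return f"{h:02d}:{m:02d}:{s:02d}:{f:02d}"

# one hour of pal video is 90,000 frames
assert tc_to_frames("01:00:00:00") == 90000
assert frames_to_tc(90000) == "01:00:00:00"
```

an edit's duration is just the difference of two such frame counts, which is exactly the arithmetic an on-line suite performs when it replays an imported edl.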
by squeezing data as much as 50 times, vhs-quality digital video could be handled by pcs. in time, pcs got faster and storage got cheaper, meaning less compression had to be used and better video could be edited. by compressing video by as little as 10:1, a new breed of non-linear solutions emerged in the mid-1990s. these systems were declared ready for on-line editing; that is, finished programmes could essentially be played out of the back of the box. their video was at least considered to be of broadcast quality for the sort of time- and cost-critical applications that most benefited from nle, such as news, current affairs and low-budget productions.

the introduction of this technology proved controversial. most images compressed cleanly at 10:1, but certain material, such as that with a lot of detail and areas of high contrast, was degraded. few viewers would ever notice, but broadcast engineers quickly learnt to spot the so-called ringing and blocky artefacts dct compression produced. also, in order to change the contents of the video images, to add an effect or graphic, material must first be decompressed and then recompressed. this process, though digital, is akin to an analogue generation: artefacts are added, like noise, with each cycle, in a process referred to as concatenation. sensibly designed systems render every effect in a single pass, but if several compressed systems are used in a production and broadcast environment, concatenation presents a problem. compression technology arrived just as proprietary uncompressed digital video equipment had filtered into all areas of broadcasters and video facilities. though the cost savings of compression were significant, the associated degradation in quality meant that acceptance by the engineering community was slow at first.
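the generation-loss problem described above can be illustrated with a toy quantiser. the numbers here are invented purely for illustration: one lossy compress/decompress pass is modelled as rounding a pixel value to a multiple of 16, and the 'effect' applied between passes is a two-per-cent brightness lift.

```python
def quantise(v, step=16):
    """model one lossy compress/decompress cycle as rounding to a step."""
    return round(v / step) * step

def generations(pixel, passes, step=16):
    """apply an effect, then recompress, over and over [concatenation]."""
    v = pixel
    for _ in range(passes):
        v = v * 1.02          # a small brightness lift between passes
        v = quantise(v, step)
    return v

# recompressing an already-quantised value loses nothing further...
assert quantise(quantise(37)) == quantise(37)
# ...but interleaving effects and compression drifts away from the
# single-pass result, which is why well-designed systems render every
# effect in one pass before the final compression
single_pass = quantise(100 * 1.02 ** 5)   # five effects, one compression
multi_pass = generations(100, 5)          # five compress/effect cycles
assert single_pass != multi_pass
```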
however, as compression levels dropped to under 5:1, objections began to evaporate, and even the most exacting engineer conceded that such video was comparable to the widely used betasp analogue tape. mild compression enabled sony to build its successful digital betacam format video recorder, which is now considered a gold standard. with compression a little over 2:1, so few artefacts [if any] are introduced that video goes in and out for dozens of generations apparently untouched. the cost of m-jpeg hardware has fallen steeply in the past few years, and reasonably priced pci cards capable of a 3:1 compression ratio and bundled with nle software are now readily available. useful as m-jpeg is, it wasn't designed for moving pictures. when it comes to digital distribution, where bandwidth is at a premium, the mpeg family of standards, designed specifically for video, offers significant advantages." [quote-source] mpeg "the moving picture experts group [mpeg] has defined a series of standards for compressing motion video and audio signals using dct [discrete cosine transform] compression, which provide a common world language for high-quality digital video. these use the jpeg algorithm for compressing individual frames, then eliminate the data that stays the same in successive frames. the mpeg formats are asymmetrical, meaning that it takes longer to compress a frame of video than it does to decompress it, and serious computational power is required to reduce the file size. the results, however, are impressive:
mpeg video needs less bandwidth than m-jpeg because it combines two forms of compression. m-jpeg video files are essentially a series of compressed stills: using intraframe, or spatial, compression, m-jpeg disposes of redundancy within each frame of video. mpeg does this but also utilises another process known as interframe, or temporal, compression, which eradicates redundancy between video frames. take two sequential frames of video and you'll notice very little changes in a 25th of a second, so mpeg reduces the data rate by recording changes instead of complete frames.

mpeg video streams consist of a sequence of sets of frames known as a gop [group of pictures]. each group, typically eight to 24 frames long, has only one complete frame represented in full, which is compressed using only intraframe compression. it's just like a jpeg still and is known as an i frame. around it are temporally-compressed frames, representing only change data. during encoding, powerful motion prediction techniques compare neighbouring frames and pinpoint areas of movement, defining vectors for how each will move from one frame to the next. by recording only these vectors, the data which needs to be recorded can be substantially reduced. p [predictive] frames refer only to the previous frame, while b [bi-directional] frames rely on previous and subsequent frames. this combination of compression techniques makes mpeg highly scalable: not only can the spatial compression of each i frame be cranked up, but by using longer gops with more b and p frames, data rates are pushed even lower." [quote-source] m-jpeg "jpeg is a well-known standard for compressing stills. unlike mpeg, m-jpeg compresses and stores every frame rather than only the differences between one frame and the next. thus it requires more space than mpeg, but it is more efficient when rapid scene changes are involved, and easier to edit. it is capable of a variety of compression ratios, typically between 2:1 and 12:1.
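the contrast just drawn, mpeg recording only the differences between frames where m-jpeg stores every frame whole, can be sketched with a toy per-pixel delta. real mpeg works on motion-compensated 16*16 macroblocks with vectors, not individual pixels, so treat this purely as an illustration of why low-motion material compresses so well.

```python
def delta_frame(prev, cur):
    """a toy p-frame: the [position, new value] pairs that changed."""
    return [(i, v) for i, (p, v) in enumerate(zip(prev, cur)) if p != v]

def apply_delta(prev, changes):
    """reconstruct the current frame from the previous one plus changes."""
    out = list(prev)
    for i, v in changes:
        out[i] = v
    return out

# a mostly static scene: only three of 100 'pixels' change between
# frames, so the delta is far smaller than storing the frame again
prev = [10] * 100
cur = list(prev)
cur[40:43] = [11, 12, 11]
changes = delta_frame(prev, cur)
assert apply_delta(prev, changes) == cur
assert len(changes) == 3          # 3 values recorded instead of 100
```

the same sketch also shows why an mpeg stream needs its periodic i frames: decoding any delta requires the previous frame, so a fully-specified starting point must appear at the head of each gop.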
at 5:1 or lower, it's deemed broadcast quality. higher than that, up to about 12:1, is more than acceptable for semi-professional or consumer purposes. the m-jpeg codec works best when contained in microcode on a video capture card chip. when implemented in hardware in this way, the pc's main processor is left free to concentrate on other tasks, such as maintaining the required hard disk data transfer rates. the algorithm can also be worked into a software codec, which allows video to be seamlessly edited in applications such as adobe premiere. despite its role as the workhorse of the digital video universe, the future is looking uncertain for m-jpeg. the new dv format has spread like wildfire through the professional and mid-range video market. it's totally digital, offers better picture quality than analogue-to-digital conversion can ever hope to achieve and has industry heavyweights sony and panasonic behind it. more importantly, it's custom-designed to bring real-time, high-quality video editing to the desktop pc." [quote-source] cinepak "cinepak is another asymmetric video compressor, developed jointly by apple and supermac [a company later acquired by radius]. the format outputs 320*240 [quarter screen] at 15fps with good quality, at a data rate that even slow single-speed and 2* cd-rom players can deliver. on high-performance computers, the playback rate can reach 30fps, but cinepak movies are usually recorded at intentionally low frame rates to accommodate the installed base of slower cd-rom players. scaling the window size requires additional processing power and tends to produce a pixelated [blocky] appearance. this cross-platform, software-only, scaleable codec is licensed for several video players, including microsoft video for windows and apple's quicktime. with better colour definition than other codecs, cinepak is the choice for compressing 'natural' video, i.e. video without a lot of graphics or animation."
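cinepak's numbers are easy to check. a single-speed cd-rom drive delivers about 150kbytes/second, and the sketch below [assuming 16-bit colour, i.e. two bytes per pixel] shows why quarter-screen video at 15fps needs a compression ratio in the teens to fit through it.

```python
CD_SINGLE_SPEED = 150 * 1024        # bytes per second from a 1* cd-rom drive

def uncompressed_rate(width, height, fps, bytes_per_pixel=2):
    """raw data rate of uncompressed video, in bytes per second."""
    return width * height * fps * bytes_per_pixel

def ratio_needed(width, height, fps, budget=CD_SINGLE_SPEED):
    """compression ratio required to squeeze the video into the budget."""
    return uncompressed_rate(width, height, fps) / budget

# quarter-screen at 15fps: about 2.3mbytes/second raw, so a 15:1
# squeeze is needed before a single-speed drive can keep up
assert uncompressed_rate(320, 240, 15) == 2_304_000
assert ratio_needed(320, 240, 15) == 15.0
```

doubling the frame rate to 30fps doubles the required ratio, which is exactly why cinepak titles were authored at deliberately low frame rates for the installed base of slow drives.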
[quote-source] ivi/indeo video interactive "shortly after the introduction of apple quicktime, intel responded with its indeo video interactive [ivi or indeo 4.0] codec. this format allows for scaleable software-only video playback. ivi can compress video symmetrically [in real time, larger file size] or asymmetrically [off-line, smaller file size, low data rates, highest quality]. compression times have been dramatically shortened by the new off-line quick compressor, which is up to 50 times faster than previous versions. the earlier indeo 3.1 and 3.2 codecs typically managed 320*240 at 15fps on intel 486-based computers, and scaling the window resulted in a pixelated image. the current version is optimised for pentium pro and pentium ii processors, resulting in smooth 30fps playback. indeo delivers good quality on low-end pentium-processor computers as well, employing special techniques for graceful scalability. in contrast to quicktime, which drops frames intentionally to accommodate slower computers, indeo dynamically varies image quality according to the processor power available during playback: the frame rate remains constant, with no dropped frames, trading off a degree of detail instead. additionally, indeo's 'alternate line zoom-by-two' doubles the window size by horizontal pixel doubling, then drawing a row of black pixels between each row. this smoothing technique minimises the pixelation associated with scaling the window. other innovative features include 'transparency', a compositing effect in which an object can be layered on top of video, just as a tv weatherman stands in front of a blue screen so that his image can be electronically cut out and placed on top of a background layer, the weather map. indeo's sophisticated implementation includes compositing over moving backgrounds, moving objects [sprites] across frames, and more, comprising the 'interactive' features. indeo is supported by microsoft vfw and activemovie."
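indeo's 'alternate line zoom-by-two' is simple enough to sketch directly: double each pixel horizontally, then interleave black rows. the frame below is just a list of rows of brightness values, with 0 standing for black; this is an illustration of the technique as described above, not intel's actual implementation.

```python
def zoom_by_two(frame):
    """double the window size: horizontal pixel doubling plus a black
    row drawn between each pair of source rows."""
    out = []
    for row in frame:
        doubled = [pixel for pixel in row for _ in (0, 1)]
        out.append(doubled)
        out.append([0] * len(doubled))   # the interleaved black line
    return out

# a 2*2 frame becomes 4*4: each pixel doubled, black lines between rows
small = [[5, 7],
         [9, 3]]
assert zoom_by_two(small) == [[5, 5, 7, 7],
                              [0, 0, 0, 0],
                              [9, 9, 3, 3],
                              [0, 0, 0, 0]]
```

the black rows halve the vertical drawing work and, on the crt displays of the day, read as a softening rather than the hard blockiness of straight pixel replication.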
[quote-source] quicktime "recognising the drawback of requiring a costly playback adapter, apple developed a video format that can be played without special add-on hardware. the result, quicktime, represents a major milestone for digital video. it provides a multimedia architecture that synchronises all types of digital media, including video, sound, text, graphics and music. on playback, quicktime movies gracefully drop video frames as necessary to maintain continuous sound synchronisation. such scalability was a major breakthrough, transforming the macintosh into a viable video playback platform. while early quicktime movies were typically grainy postage-stamp-size windows [160*120 pixels] with jerky motion [12fps], the format has matured to deliver full-frame [640*480], full-motion [30fps] video suitable for professional applications. due to its well-defined hardware abstraction layer, quicktime is a cross-platform standard, with versions running on windows- and nt-based pcs and unix-based workstations in addition to its native apple macintosh environment. its open architecture supports many file formats and codecs, including cinepak, indeo, motion jpeg and mpeg-1, and is extensible to support future codecs, such as dvcam." [quote-source] sorenson video "sorenson video's single biggest advantage is its ability to deliver excellent quality video at low data rates. the first mistake people usually make with sorenson is to give it too much data rate: too much data per second can 'choke' the codec on playback, making it start skipping frames as it runs out of cpu power. if you're used to compressing 320*240 movies with cinepak at 200kbytes/second, try them with sorenson video at 100kbytes/second, or even 50, and you may be surprised by the resulting quality. for the best results, always use variable bitrate [vbr] encoding with sorenson video.
this is a two-pass technique which analyses each clip to determine which sections are the hardest, then allocates bytes as efficiently as possible. it takes longer, and requires both sorenson video developer edition and media cleaner pro, but it's worth it. some clips can retain their quality at half the data rate they'd otherwise require, and transitions in particular tend to look much better at low data rates. as a point of comparison, nearly every major dvd-video title released uses variable bitrate mpeg in order to get the best results; vbr is a really good thing.

temporal compression is a real strength of the sorenson video codec. movies with relatively low motion [such as 'talking head' clips of interviews, etc] can compress extremely well. also, doubling the frame rate does not usually require doubling the data rate for comparable image quality. sorenson video takes more computing power to get a pixel to the screen than cinepak, so it's important to be realistic about frame sizes and frame rates: 320*240 at 15fps will play fine on almost all powermac g3 and pentium iii machines, while 640*480 at 30fps won't. on the bright side, sorenson-compressed movies scale up much more smoothly than cinepak. try doubling 320*240 to fullscreen for impressive results on fast powermacs. another way to keep the pixel rate lower is to take advantage of wide aspect-ratio movies: a theatrical trailer shot in 16*9 aspect ratio, properly cropped and inverse-telecine'd to its original 24fps, has around 60 per cent of the pixel rate it would if left at 320*240 and 30fps, but will look even better.

note: sorenson video doesn't need nearly as many key frames as other codecs, such as cinepak. using too many key frames often results in poorer image quality. the difference in size between sorenson video key frames and delta frames is often much greater than with other codecs: sorenson key frames are usually very large relative to the small delta frames. this is normal, and doesn't cause problems in playback." [quote-source]
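the pixel-rate arithmetic behind the sorenson advice above is worth making explicit. the helper below simply multiplies width, height and frame rate; the widescreen figures assume a 320-pixel-wide trailer cropped to 320*180 [a 16*9 frame] and inverse-telecine'd to 24fps.

```python
def pixel_rate(width, height, fps):
    """pixels the codec must deliver to the screen each second."""
    return width * height * fps

quarter_screen = pixel_rate(320, 240, 30)   # 2,304,000 pixels/second
full_screen = pixel_rate(640, 480, 30)      # four times as many pixels
widescreen = pixel_rate(320, 180, 24)       # cropped 16*9 at film rate

# fullscreen at the same frame rate quadruples the decoding work,
# which is why 640*480 at 30fps overwhelms the cpus of the day
assert full_screen == 4 * quarter_screen
# the cropped, slower-rate trailer needs only 60 per cent of the
# pixel rate of the same movie left at 320*240 and 30fps
assert widescreen / quarter_screen == 0.6
```

since sorenson playback cost scales with pixels per second, shaving the pixel rate this way buys either smoother playback or a lower data rate for the same quality.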
avi/video for windows "avi [audio video interleaved] is microsoft's generic format for digital video in windows, provided via its mci [media control interface]. avi supports a number of compression methods, running in real time or non-real time, with or without hardware assistance. unlike quicktime, the video for windows [vfw] video player is not a cross-platform technology, but then windows is the dominant operating system. the initial release, introduced in late 1992, was capable of displaying 320*240 pixels at 15fps. the small window size and slow frame rate were largely a limitation of the hardware of the day, typically a 486-based computer with 4mb of ram. today's pentium processors are capable of full-motion playback of avi files at the maximum resolution of the screen. codecs supported include cinepak, indeo and microsoft video 1." [quote-source] activemovie "activemovie, a microsoft api announced in march 1996, is receiving wide support in the computer industry as 'the next generation of cross-platform digital video technology for the desktop and the internet', according to a microsoft press release. it is being touted by industry observers as the cure for the deficiencies in microsoft's vfw and apple's quicktime. activemovie removes most of the limitations imposed by vfw, such as the small number of supported file formats, limited i/o throughput, inconsistent driver models, and the lack of driver compatibility between windows 95 and windows nt. activemovie solves these problems primarily by using the component object model [com] as its foundation, the most widely recognised implementation of which is object linking and embedding [ole]. various objects in the model control such actions as decompressing data, adjusting volume levels, and so forth.
by building activemovie on the com architecture, microsoft has provided application developers with a digital video api that has a number of benefits, such as independence from operating systems and programming languages, thus allowing the same or similar code to be used on multiple platforms. activemovie also supports more popular formats, including mpeg audio, .wav audio, mpeg video and apple quicktime video, making it especially convenient for internet and intranet application builders. moreover, activemovie is integrated with microsoft's directx technology, which allows it to automatically take advantage of video and audio acceleration hardware so that each computer performs according to its capabilities. for example, activemovie improves the video playback quality of avi and quicktime movies by using directdraw, a directx component, along with features present on many standard graphics cards. one of activemovie's most impressive features is the ability to decode mpeg video using either hardware or software, including mpeg-2. it can decode mpeg-1 entirely in software and provide high-quality playback on pentium-based systems. or, if the computer has hardware for decoding mpeg, activemovie can use directmpeg, another component of directx, to access this hardware and play back the video seamlessly. activemovie has recently been enhanced and is now called directshow. the largest enhancement in this change is that directshow supports dvd, while activemovie did not." [quote-source]
modified 20031113