Digital Media
Digital Compression Lossless
Digital Compression Lossy
With this picture of a signal digitized into a series of sample numbers, you can understand a bit how compression works. For audio, the samples tend to look like this: 12000, 12002, 12006, 12007, 12010, 12006, 12005.
As a practical matter, each sample number tends to be very close to the sample numbers that come just before and just after it in time. So one way to "compress" the audio, so it takes up less space, is to record just the changes -- for each sample, record how much it changed from the previous sample. So the above looks like 12000, +2, +4, +1, +3, -4, -1. These change numbers tend to be small, so it turns out they can be recorded more compactly (requiring fewer 0's and 1's). This is a nice example of digital compression -- recording the data in a way which takes up less space, but from which you can still recreate the original signal. In this case, the compression is lossless.
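The record-the-changes idea can be sketched in a few lines of Python. This is only an illustration using the sample numbers above; the function names `delta_encode` and `delta_decode` are made up for this example, and real audio formats pack the small change numbers into bits far more cleverly than a plain list.

```python
def delta_encode(samples):
    """Keep the first sample as-is, then store each change from the previous sample."""
    if not samples:
        return []
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the original samples by adding up the changes one at a time."""
    samples = []
    total = 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples


samples = [12000, 12002, 12006, 12007, 12010, 12006, 12005]
encoded = delta_encode(samples)
decoded = delta_decode(encoded)
print(encoded)             # [12000, 2, 4, 1, 3, -4, -1]
print(decoded == samples)  # True -- nothing was lost
```

Decoding gives back exactly the original list, which is what "lossless" means here.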
Having translated the audio into the digital domain -- a series of sample numbers -- we open the data up to all sorts of computer manipulations, since computers are cheap and effective at manipulating numbers. MP3 is another example of audio compression. MP3 is more complicated: it reduces the space required by about 10x, but it is lossy, so it discards little bits of the original signal in a way which the human auditory system tends not to notice.
JPEG Standard
Images - How Many Bytes?
JPEG Images
JPEG is a free and open standard for storing digital images, such as you would take with a digital camera. JPEG is a "lossy" compression format, detailed below, allowing an image to be adjusted, losing some detail but requiring less space in the process.
JPEG is an incredibly successful standard, allowing computers, phones, printers, TVs, email, blogs, .. to exchange image files and understand each other. Some commonly understood standard format for what is "an image" is needed, and JPEG is the most widely used one.
JPEG stands for the Joint Photographic Experts Group, a technical committee which drafted the standard, originally in 1992. I doubt it was possible at the time to understand how widespread and critical this format would become.
JPEG is a "lossy" format, meaning that the level of detail preserved when a JPEG is saved is adjustable. Say the quality levels are in the range q10, q20, .. q100, with q100 corresponding to very little compression and high visual quality, and q10 corresponding to very aggressive compression with lower visual quality. In reality the scale terminology is not exact across systems, sometimes described as 0-10, or 1-100. An image saved with q100 saves the maximum detail, but the resulting file takes up the most space. An image can be saved with a lower quality level, causing it to lose some detail, but take up less space. Or in other words, q10 is more compressed, and q100 is less compressed. JPEG is very smart about the way it loses detail, so saving at something like q70 is a normal thing to do without losing appreciable detail.
How many bytes does an image take up? The main issue is just plain size -- how many pixels. The 457 x 360 flowers image has 164520 pixels. Say each pixel takes three bytes (one for each color channel), that's 493560 bytes or about 493 KB.
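The arithmetic is easy to reproduce. Here is a small sketch, where `raw_image_bytes` is just a name chosen for this example and 3 bytes per pixel is the assumption stated above (one byte each for red, green, and blue):

```python
def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Uncompressed size: every pixel stored in full, one byte per color channel."""
    return width * height * bytes_per_pixel


pixels = 457 * 360
size = raw_image_bytes(457, 360)
print(pixels)  # 164520
print(size)    # 493560
```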
Flowers JPEG Examples
Here are versions of the flowers.jpg image with different compression levels...
Here is the image as it originally came out of my camera. I believe the camera uses about q70 compression internally. This image takes up 48 KB. The "raw" form of the image takes up 493 KB, so q70 is saving us about 10x space. Basically, this shows JPEG works quite well: giving up tiny amounts of detail for a 10x space savings.
Here is the image compressed at q50, taking up 29 KB, 60% of the size of the q70. I cannot see obvious differences between this version and the one above, although there must be some tiny differences.
Here is the same image compressed very aggressively at q10, taking up 14 KB, or about 29% of the size of the q70 version. Generally you don't want to compress this much. If you zoom way in, you can see the results of the compression in this version:
There are two things you notice in JPEGs as they shed detail:
- Block artifacts -- JPEG divides the image into 8x8 blocks. If the compression is very high, you can see the block boundaries. You can see this clearly in the upper-left flower if you zoom your browser in. What's most amazing is how these blocks are not noticeable if you glance at the image normally.
- Edge artifacts/noise -- JPEG has a hard time with crisp edges between two colors. In a more-compressed version, little "noise" speckles or distortions can appear to either side of the hard edge. Look at the left edge of the flower which is halfway down vertically, or at the very upper left flower.
Considering that the q10 version takes about 3.4x fewer bytes than the q70 version (14 KB vs. 48 KB), JPEG does a good job keeping the basic look of the scene when asked to use less space.
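To compare the flowers versions side by side, here is a quick sketch that works out the ratios from the file sizes quoted in this section. The sizes are the rounded KB figures from the text, so the results are approximate, and the dictionary name `sizes_kb` is just for this example.

```python
# File sizes in KB, as quoted in the text for the flowers image.
sizes_kb = {"raw": 493, "q70": 48, "q50": 29, "q10": 14}

# Each compressed version as a percentage of the q70 version.
q50_percent = round(sizes_kb["q50"] / sizes_kb["q70"] * 100)
q10_percent = round(sizes_kb["q10"] / sizes_kb["q70"] * 100)

# How many times smaller: raw vs. q70, and q70 vs. q10.
raw_over_q70 = round(sizes_kb["raw"] / sizes_kb["q70"], 1)
q70_over_q10 = round(sizes_kb["q70"] / sizes_kb["q10"], 1)

print(q50_percent)   # 60
print(q10_percent)   # 29
print(raw_over_q70)  # 10.3
print(q70_over_q10)  # 3.4
```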
GIF and PNG Images
GIF and PNG (Portable Network Graphics) are "lossless" image formats, recording every pixel exactly. They are used for non-photographic images, like little solid color icons. GIF is older and used to be patented. PNG is newer and performs a little better. Most recently, a form of GIF has been used for short, no-audio video clips.
Audio Formats
MP3 is the dominant audio format (good example of a "network effect", a later topic). MP3 is lossy, like JPEG. Raw CD audio takes up about 10 MB per minute (this is how it is stored on an audio CD .. no compression). MP3 gets that down to about 1 MB per minute while still sounding pretty good. As with JPEG, you can choose the level of compression, say 2 MB per minute to keep more detail, or 512 KB per minute if space is at a premium.
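Where does "about 10 MB per minute" come from? A rough sketch, using the standard audio CD parameters (44,100 samples per second, 2 bytes per sample, 2 channels for stereo). The 128 kilobits per second MP3 bitrate is an assumption picked for illustration because it is a common setting; the text above only says "about 1 MB per minute".

```python
# Standard audio CD parameters.
SAMPLES_PER_SECOND = 44100
BYTES_PER_SAMPLE = 2   # 16-bit samples
CHANNELS = 2           # stereo

cd_bytes_per_minute = SAMPLES_PER_SECOND * BYTES_PER_SAMPLE * CHANNELS * 60

# Assumed MP3 bitrate for illustration: 128 kilobits per second.
MP3_BITS_PER_SECOND = 128_000
mp3_bytes_per_minute = MP3_BITS_PER_SECOND // 8 * 60

print(cd_bytes_per_minute)   # 10584000 -- about 10 MB
print(mp3_bytes_per_minute)  # 960000 -- about 1 MB
print(round(cd_bytes_per_minute / mp3_bytes_per_minute, 1))  # 11.0
```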
MP3 is patented, and legitimately so. (Nick's opinion) Many modern software patents are ridiculous, just patenting obvious solutions. However, MP3 is legitimate: it uses complex and non-obvious techniques to get its excellent 10x compression while still sounding good. If a device or software plays or produces MP3s, a license fee is due to the patent holders, on the order of $1-$2 per instance. With each MP3 device you have owned over the years .. you have in effect paid this fee each time. Licensing of devices which can play video is similar.
Video Formats
A video is basically a series of images -- 20 to 60 per second, plus an audio "track". Video data takes up a lot of bytes, but computers have now become powerful enough to handle video. Very roughly speaking, say compressed video of about DVD quality takes about 2 GB per hour (roughly 30 MB per minute). In reality, there is a very large range of video sizes -- HD video takes more space, smaller YouTube video takes less space. Video compression is complicated and the techniques are heavily patented.
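To get a feel for why video needs so much compression, here is a back-of-the-envelope sketch. The frame size of 720 x 480 (the DVD frame size in NTSC regions), 30 frames per second, and 3 bytes per pixel are simplifying assumptions for illustration; real video pipelines store color more compactly even before compression, and the audio track is ignored here.

```python
# Assumed frame parameters for illustration.
WIDTH, HEIGHT = 720, 480
BYTES_PER_PIXEL = 3
FRAMES_PER_SECOND = 30

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
raw_bytes_per_hour = bytes_per_frame * FRAMES_PER_SECOND * 3600
raw_gb_per_hour = raw_bytes_per_hour / 1_000_000_000

# The rough compressed figure from the text: about 2 GB per hour.
compressed_gb_per_hour = 2
compressed_mb_per_minute = compressed_gb_per_hour * 1000 / 60

print(bytes_per_frame)                                 # 1036800
print(round(raw_gb_per_hour))                          # 112
print(round(compressed_mb_per_minute))                 # 33
print(round(raw_gb_per_hour / compressed_gb_per_hour)) # 56
```

Under these assumptions, uncompressed video would run over 100 GB per hour, so the compressed form is something like 50x smaller.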
MPEG (Moving Picture Experts Group) standardizes some video formats in the industry, and MPEG LA (Licensing Authority) handles collecting patent royalties, which are significant.
MPEG-2, originally released in 1995, is used in DVDs and some satellite TV systems. Compression techniques have gotten significantly better since then.
MPEG-4, and in particular the "h.264" compression system, is very good at producing good looking video with the minimum bytes. Most digital video cameras, phones, and Blu-ray discs use h.264 internally to compress and store the video data. Patent fees are paid by the manufacturer to produce an encoder or decoder in hardware or software.
h.264 Obnoxious Licensing Terms
One surprising thing about the h.264 licensing is that it does not come with an unrestricted right to distribute your own video. Say you have properly bought a video camera (paying for the patents), produced your video, and stored it on your hard drive. However, if you want to make a web site or whatever that distributes the video to many people, you may have to pay additional royalties for each minute of your video that is distributed. There may be exceptions if your video is distributed for free; however, these terms have changed over time, so really you have to consult a lawyer to see what you are permitted to do with your video. These license restrictions strike me as unusually obnoxious and certainly out of step with how you usually think of having your own data file. My guess is that MPEG did not have a lot of competition, resulting in these one-sided terms.
Open Video Format - WebM
In support of an open internet, Mozilla, Google and others have been working on a free and open "WebM" video compression scheme to compete with h.264 -- free to encode or decode video, and free to distribute the video however its owner wishes. Those are the sorts of terms under which the internet has thrived. See the WebM project.
Firefox, Chrome, and Microsoft Edge support WebM. Apple is the holdout. (Heh, maybe Apple is doing so well financially, they feel they need to behave obnoxiously. Actually I suspect Apple wants complex patent schemes to inhibit competition. Not a strategy to be proud of.)
If you were working on a project that took in video and re-distributed it ... you could fall afoul of h.264 licensing terms. That's why WebM exists. Wikipedia uses WebM for video. YouTube supports WebM.