Lossless

03 October 2017

Lossless

Lossless data compression is a method of compressing digital data (video, audio, images, documents), using which, the encoded data can be precisely recovered bit by bit. In this case, the original data is completely restored from the compressed state. This type of compression is fundamentally different from the data compression with loss (lossy data compression). For each type of digital information, as a rule, there are optimal lossless compression algorithms.

Lossless data compression is used in many applications. For example, it’s used in all file archivers. It’s also used as a component in lossy compression. Lossless compression is used when the identity of the compressed data with the original state is important. A common example is executable files and source code. Some graphic file formats (for example, PNG) use only lossless compression, while others (TIFF, MNG or GIF) can use both lossy and lossless compression.

Most lossless compression algorithms work in two stages: the first stage generates a statistical model for incoming data, the second stage displays incoming data in a bit representation, using that model to obtain frequently encountered data that is used more often.

In general terms, the meaning of lossless data compression is as follows: in the initial data the compressor finds a certain regularity and, taking it into account, generates a bit sequence that fully describes the original one. For example, to encode binary sequences in which there are many zeros and few ones, this substitution can be used:

00 → 0
01 → 10
10 → 110
11 → 111
in this case, sixteen bits:
00 01 00 00 11 10 00 00
will be converted into thirteen bits:
0 10 0 0 111 110 0 0

Such a substitution is called the prefix code, that means it has this feature: if we write a compressed bit string without spaces, we can still place spaces in it later, and, therefore, restore the original sequence: 00100110111010 → |0|0|10|0|110|111|0|10|. The most famous prefix code is the Huffman code.

No lossless compression algorithm can efficiently compress all possible types of data. For this reason, many different algorithms exist that are designed either for a specific type of input data or with specific assumptions about what kinds of redundancy the uncompressed data is likely to contain. Some of the most common lossless compression algorithms:

General purpose:

run-length encoding (RLE) is a simple scheme that provides good compression of data containing lots of runs of the same value
Huffman coding is used often in combination with other algorithms, used by Unix pack utility
prediction by partial matching (PPM) is optimized for compressing plain text
Lempel-Ziv compression (LZ77 and LZ78) is a dictionary-based algorithm that forms the basis for many other algorithms

Audio:

Apple Lossless (ALAC, Apple Lossless Audio Codec)
Free Lossless Audio Codec (FLAC)
Meridian Lossless Packing (MLP)
Monkey’s Audio (Monkey’s Audio APE)
RealPlayer (RealAudio Lossless)
TTA (True Audio Lossless)
WavPack (WavPack lossless)
WMA Lossless (Windows Media Lossless)

Image:

PNG (Portable Network Graphics)
TIFF (Tagged Image File Format)
WebP (high-density lossless or lossy compression of RGB and RGBA images)
FLIF (Free Lossless Image Format)
TGA (Truevision TGA)
PCX (PiCture eXchange)
ILBM (lossless RLE compression of Amiga IFF images)

Video:

H.264 lossless
Apple Animation (QuickTime RLE)
Autodesk Animator Codec (AASC)
Huffyuv (or HuffYUV) was written by Ben Rudiak-Gould and published under the terms of the GNU GPL as free software, meant to replace uncompressed
YCbCr as a video capture format. It uses very little CPU but takes a lot of disk space.
MSU Lossless Video Codec

3D Graphics:

OpenCTM (lossless compression of 3D triangle meshes)

Related terms:

HTML, JavaScript, Apache, CSS

References and further reading:

Lossless compression
Lossless Data Compression
Data compression theory and algorithms
Universal lossless data compression algorithms
History of Lossless Data Compression Algorithms
Big Data. All You Wanted to Know but Were Afraid to Ask
5 Ways to Use JotForm to Manage Important Data