Lossless data compression is a method of compressing digital data (video, audio, images, documents) such that the original data can be recovered exactly, bit for bit, from the compressed form. This makes it fundamentally different from lossy data compression, which discards part of the information. For each type of digital information there are, as a rule, lossless compression algorithms that suit it best.
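The bit-for-bit guarantee can be checked directly. The following is a minimal sketch that assumes Python's standard zlib module as the compressor (the article does not name one) and uses a made-up sample input:

```python
import zlib

# Hypothetical sample input: highly repetitive, so it compresses well.
original = b"lossless compression restores every bit " * 50

compressed = zlib.compress(original)    # encode
restored = zlib.decompress(compressed)  # decode

# Lossless means the round trip reproduces the input exactly.
assert restored == original
print(len(original), "bytes ->", len(compressed), "bytes")
```

The exact compressed size depends on the zlib version and settings, but the equality check holds for any input.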
Lossless data compression is used in many applications. For example, it is used in all file archivers, and it also serves as a component of lossy compression schemes. Lossless compression is required whenever the decompressed data must be identical to the original. Common examples are executable files and source code. Some graphic file formats (for example, PNG and GIF) use only lossless compression, while others (such as TIFF and MNG) can use either lossy or lossless compression.
Most lossless compression algorithms work in two stages: the first stage builds a statistical model of the incoming data, and the second stage uses that model to map the incoming data to bit sequences in such a way that frequently encountered data produces shorter output than rarely encountered data.
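The two stages can be sketched in a few lines. This example assumes the simplest possible model (plain symbol counts) and uses Huffman coding for the second stage; both choices, the function names and the sample text are illustrative assumptions, not something the article prescribes:

```python
import heapq
from collections import Counter

def build_model(data):
    """Stage 1: a statistical model, here just symbol counts."""
    return Counter(data)

def build_code(model):
    """Stage 2: derive a Huffman code so frequent symbols get short codewords."""
    if len(model) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(model)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial codeword}).
    heap = [(count, i, {sym: ""})
            for i, (sym, count) in enumerate(sorted(model.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)   # two least frequent subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "aaaaaaaabbbccd"  # hypothetical input: 'a' is frequent, 'd' is rare
model = build_model(text)
code = build_code(model)
encoded = "".join(code[s] for s in text)
print(code, len(encoded), "bits")
```

A fixed two-bit code would need 28 bits for this 14-symbol input; the code built from the model needs fewer because the most frequent symbol receives a one-bit codeword.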
In general terms, the idea behind lossless data compression is as follows: the compressor finds some regularity in the initial data and, taking it into account, generates a shorter bit sequence that still fully describes the original one. For example, to encode binary sequences in which there are many zeros and few ones, this substitution can be used:
00 → 0
01 → 10
10 → 110
11 → 111
In this case, the following sixteen bits:
00 01 00 00 11 10 00 00
will be converted into thirteen bits:
0 10 0 0 111 110 0 0
Such a substitution is called a prefix code, which means that no codeword is the beginning of another codeword. As a result, if we write a compressed bit string without spaces, we can still place the boundaries between codewords unambiguously later and therefore restore the original sequence. For example: 00100110111010 → |0|0|10|0|110|111|0|10|. The best-known prefix code is the Huffman code.
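The substitution table and the boundary-placing rule can be sketched as follows. The table is the one given in the text; the function names encode and decode are chosen for this illustration, and the input is assumed to be a string of '0' and '1' characters of even length:

```python
# Substitution table from the text: pairs of input bits -> codewords.
CODE = {"00": "0", "01": "10", "10": "110", "11": "111"}
DECODE = {codeword: pair for pair, codeword in CODE.items()}

def encode(bits):
    """Replace each pair of input bits with its codeword."""
    if len(bits) % 2:
        raise ValueError("input length must be even")
    return "".join(CODE[bits[i:i + 2]] for i in range(0, len(bits), 2))

def decode(bits):
    """Read bit by bit; emit a pair as soon as a full codeword is seen.

    This works without separators because no codeword is a prefix of another.
    """
    out, current = [], ""
    for b in bits:
        current += b
        if current in DECODE:
            out.append(DECODE[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)

packed = encode("0001000011100000")   # the sixteen-bit example from the text
print(packed, len(packed))            # 0100011111000 13
print(decode("00100110111010"))       # 0000010010110001
```

Note that this particular table only saves space when zeros dominate: an input consisting mostly of ones would grow, since the pair 11 is replaced by three bits.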
No lossless compression algorithm can shrink every possible input: there are fewer short bit strings than long ones, so an algorithm that makes some inputs shorter must make others longer. For this reason, many different algorithms exist that are designed either for a specific type of input data or with specific assumptions about what kinds of redundancy the uncompressed data is likely to contain. Some of the most common lossless compression algorithms: