Main article: Coding theory

In information theory and computer science, a code is usually considered to be an algorithm that uniquely represents symbols from some source alphabet by *encoded* strings, which may be over some other target alphabet. An extension of the code for representing sequences of symbols over the source alphabet is obtained by concatenating the encoded strings.

Before giving a mathematically precise definition, we give a brief example. The mapping

$M = \{a \mapsto 0,\ b \mapsto 01,\ c \mapsto 011\}$

is a code whose source alphabet is the set $\{a, b, c\}$ and whose target alphabet is the set $\{0, 1\}$. Using the extension of the code, the encoded string 0011001011 can be grouped into codewords as 0 011 0 01 011, and these in turn can be decoded to the sequence of source symbols *acabc*.

Using terms from formal language theory, the precise mathematical definition of this concept is as follows: Let $S$ and $T$ be two finite sets, called the source and target alphabets, respectively. A **code** $M : S \to T^*$ is a total function mapping each symbol from $S$ to a sequence of symbols over $T$. The extension of $M$ to a homomorphism of $S^*$ into $T^*$, which naturally maps each sequence of source symbols to a sequence of target symbols, is referred to as its **extension**.
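The definition can be illustrated with a short Python sketch. It assumes the example code a ↦ 0, b ↦ 01, c ↦ 011 (the mapping implied by the grouping 0 011 0 01 011 above) and represents it as a dictionary. The names `M`, `extend`, and `decode` are illustrative choices, not part of any standard library, and the decoder assumes that no codeword is empty.

```python
# Hypothetical names for illustration: M, extend, decode.
M = {"a": "0", "b": "01", "c": "011"}

def extend(code, message):
    """Extension of the code: concatenate the codeword of each source symbol."""
    return "".join(code[symbol] for symbol in message)

def decode(code, encoded):
    """Return every source sequence whose extension equals `encoded`.

    Assumes all codewords are non-empty. A uniquely decodable code
    yields at most one result for any input.
    """
    inverse = {word: symbol for symbol, word in code.items()}
    # results[i] holds all decodings of the suffix encoded[i:].
    results = {len(encoded): [""]}
    for i in range(len(encoded) - 1, -1, -1):
        results[i] = [
            inverse[word] + rest
            for word in inverse
            if encoded.startswith(word, i)
            for rest in results[i + len(word)]
        ]
    return results[0]

encoded = extend(M, "acabc")
print(encoded)             # 0011001011
print(decode(M, encoded))  # ['acabc']
```

Because the extension is a homomorphism, encoding a concatenation gives the concatenation of the encodings: `extend(M, "ac" + "abc") == extend(M, "ac") + extend(M, "abc")`. The decoder tries every possible segmentation rather than reading greedily, since in this example the codeword 0 is a prefix of 01 and 011.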

Variable-length codes

Main article: Variable-length code

In this section we consider codes that encode each source (clear text) character by a code word from some dictionary; the concatenation of such code words gives the encoded string. Variable-length codes are especially useful when clear text characters have different probabilities; see also entropy encoding.

A *prefix code* is a code with the "prefix property": no valid code word in the set is a prefix (start) of any other valid code word in the set. Huffman coding is the best-known algorithm for deriving prefix codes. Prefix codes are widely referred to as "Huffman codes", even when the code was not produced by a Huffman algorithm. Other examples of prefix codes are country calling codes, the country and publisher parts of ISBNs, and the Secondary Synchronization Codes used in the UMTS W-CDMA 3G Wireless Standard.
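The prefix property can be checked mechanically, and it is what allows an encoded string to be decoded in a single left-to-right pass. The following is a minimal sketch; the function names and the sample code `{a: 0, b: 10, c: 110}` are assumptions chosen for illustration and do not come from any particular standard.

```python
def is_prefix_code(codewords):
    """Return True if no codeword is a prefix of (or equal to) another."""
    words = sorted(codewords)
    # In lexicographic order, a word that is a prefix of some other word
    # is also a prefix of its immediate successor, so checking adjacent
    # pairs is sufficient.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def decode_prefix(code, encoded):
    """Decode left to right, emitting a symbol as soon as a codeword matches.

    Only correct for prefix codes, where the first match is the only match.
    """
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, current = [], ""
    for ch in encoded:
        current += ch
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing symbols do not form a codeword")
    return "".join(decoded)

# Hypothetical prefix code used only for this example.
code = {"a": "0", "b": "10", "c": "110"}
print(is_prefix_code(code.values()))         # True
print(is_prefix_code(["0", "01", "011"]))    # False: "0" is a prefix of "01"
print(decode_prefix(code, "0110010110"))     # acabc
```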

Kraft's inequality characterizes the sets of code word lengths that are possible in a prefix code. Every uniquely decodable code, not necessarily a prefix code, must satisfy Kraft's inequality.
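For code word lengths $l_1, \dots, l_n$ over a target alphabet of size $r$, Kraft's inequality states that $\sum_{i=1}^{n} r^{-l_i} \le 1$. A minimal sketch of the check follows; the function names are illustrative, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

def kraft_sum(lengths, alphabet_size=2):
    """Sum of alphabet_size ** (-length) over all code word lengths."""
    return sum(Fraction(1, alphabet_size ** n) for n in lengths)

def satisfies_kraft(lengths, alphabet_size=2):
    """True if a prefix code with these code word lengths can exist."""
    return kraft_sum(lengths, alphabet_size) <= 1

print(kraft_sum([1, 2, 3]))        # 7/8
print(satisfies_kraft([1, 2, 3]))  # True
print(satisfies_kraft([1, 1, 2]))  # False: the sum is 5/4
```

The lengths 1, 2, 3 are those of the binary code words 0, 01, 011, which form a uniquely decodable code; the same lengths are also realized by the prefix code 0, 10, 110. No uniquely decodable binary code has lengths 1, 1, 2.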