17. The Huffman code problem

The principle behind this technique is to design a binary character code for each character, using a variable number of bits to represent each character, so as to reduce the total length of the document.

To allow easy, unambiguous decoding, we require the code to be a prefix code: no code is a prefix of another

For example, if the codewords are {0, 01, 11, 001}, the decoding of string like 001 is ambiguous. You can interpret it as 0-01, or 001.

Solution ---- insisting on the prefix-free property: no codeword can be a prefix of another codeword, where codeword is a string of digits to represent a character in the encoding.

Details

Given a character set C and the frequency f (c) of each character c belongs to C in the document A, the problem is to find a prefix code that minimizes the total length of the document.
The tree representing such an optimal code is called a Huffman tree T . We aim to minimize

B(T,A) = sum of f(c)d(c)

——d (c) is the depth of character c in the tree T (numbers of bits used to encode character c), or the length of the codeword of character c.
——B(T,A) is the total number of bits for document A based on the Huffman tree T.

The problem is to design a tree T to minimize B(T,A).

  1. There are no nodes with only a single child.
  2. Rearranging characters within the same level of the tree T does not change B(T,A).
  3. Swapping a character at a higher level with another less-frequent character reduces B(T,A), where the level of the tree root is the lowest level

We have a greedy strategy for the Huffman coding problem:

Process

For efficient implementation of a Huffman tree, we can use a priority queue data structure - a minimum heap

The running time is O(n) +O(n) + O(nlogn) = O(nlogn)

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
【社区内容提示】社区部分内容疑似由AI辅助生成,浏览时请结合常识与多方信息审慎甄别。
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容

  • **2014真题Directions:Read the following text. Choose the be...
    又是夜半惊坐起阅读 13,493评论 0 23
  • 打嗝是因为颈椎、胸隔、肠胃 其中一个部位的神经受到刺激 引起横隔膜性收缩 一般受到寒冷刺激、吃太饱、吃饭过快、吃太...
    魏文花阅读 4,050评论 0 0
  • 回家路上听小幸运 听着听着居然哭了 多可笑 不想回家 也不想上班 在哪都痛苦 干嘛都不好 会好的吧……
    给自己的日记本阅读 1,247评论 0 0
  • 不知道从什么时候开始,在白天或者是没睡觉的晚上,总是不由自主的打哈欠。而且常常憋不住,如果是在开会或者是比较严肃的...
    谦堆雪阅读 2,047评论 0 0
  • 证据表明意见可能被误认为是我们周围的一切。周末酗酒者经常认为,只要他不喝酒,一周之内,他就不是一个酒鬼。持续驾驶吃...
    Fx_阅读 1,261评论 0 1

友情链接更多精彩内容