Huffman Codes
In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
Input Specification:
Each input file contains one test case. For each case, the first line gives an integer N (2), then followed by a line that contains all the N distinct characters and their frequencies in the following format:
c[1] f[1] c[2] f[2] ... c[N] f[N]
where c[i]
is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i]
is the frequency of c[i]
and is an integer no more than 1000. The next line gives a positive integer M (≤), then followed by M student submissions. Each student submission consists of N lines, each in the format:
c[i] code[i]
where c[i]
is the i
-th character and code[i]
is an non-empty string of no more than 63 '0's and '1's.
Output Specification:
For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.
Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.
Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
解题思路
要判断编码是否为最优编码,需要对编码进行两个方面的检验:对每一组编码来判断WPL是否为最小的,以及是否为前缀码。
下面给出第一种思路,需要构建Huffman Tree的方法,同时也是用堆来实现的。
树节点的定义如下:
1 struct Data { 2 char letter; 3 int freq; 4 }; 5 6 struct TNode { 7 Data data; 8 TNode *left, *right; 9 };
首先我们要根据给出的字符频率来构造出一颗对应的Huffman Tree,同时计算出WPL。
又因为我们是用堆来实现的,所以要构造一颗Huffman Tree我们先要把给定的频率来构建一个最小堆。
这里我们需要对堆进行定义,同时还要定义堆的相关操作。这里就直接上代码,不解释了。
1 struct Heap { 2 TNode *H; // 堆的每一个元素的数据类型是树节点TNode 3 int size; 4 int capacity; 5 }; 6 7 Heap *createMinHeap(int n) { 8 Heap *minHeap = new Heap; 9 minHeap->size = 0; 10 minHeap->capacity = n; 11 minHeap->H = new TNode[n + 1]; 12 minHeap->H[0].data.freq = -1; 13 14 for (int i = 0; i < minHeap->capacity; i++) { 15 TNode *tmp = new TNode; 16 tmp->left = tmp->right = NULL; 17 getchar(); 18 scanf("%c %d", &tmp->data.letter, &tmp->data.freq); 19 insertHeap(minHeap, tmp); 20 } 21 22 return minHeap; 23 } 24 25 void insertHeap(Heap *minHeap, TNode *treeNode) { 26 int pos = ++minHeap->size; 27 for ( ; treeNode->data.freq < minHeap->H[pos / 2].data.freq; pos /= 2) { 28 minHeap->H[pos] = minHeap->H[pos / 2]; 29 } 30 minHeap->H[pos] = *treeNode; 31 } 32 33 TNode *deleteMin(Heap *minHeap) { 34 TNode *minTreeNode = new TNode; 35 *minTreeNode = minHeap->H[1]; 36 TNode tmp = minHeap->H[minHeap->size--]; 37 38 int parent = 1, child; 39 for ( ; parent * 2 <= minHeap->size; parent = child) { 40 child = parent * 2; 41 if (child != minHeap->size && minHeap->H[child].data.freq > minHeap->H[child + 1].data.freq) child++; 42 if (tmp.data.freq < minHeap->H[child].data.freq) break; 43 else minHeap->H[parent] = minHeap->H[child]; 44 } 45 minHeap->H[parent] = tmp; 46 47 return minTreeNode; 48 }
构建好最小堆后,下一步我们需要通过这个堆来构造出对应的Huffman Tree。
就是每次从堆中弹出最小频率的那两个节点,然后把这两个节点分别插在新节点的左右边,作为左右孩子。再把新节点压入堆中。如此循环n-1次后(其中n代表节点的个数),堆中就只剩下一个元素,那个元素就是Huffman Tree的根节点,我们直接返回即可。
按照上面构造Huffman Tree的思路,相应的代码如下:
1 TNode *createHuffmanTree(Heap *minHeap) { 2 int n = minHeap->size - 1; 3 while (n--) { 4 TNode *tmp = new TNode; 5 tmp->left = deleteMin(minHeap); 6 tmp->right = deleteMin(minHeap); 7 tmp->data.freq = tmp->left->data.freq + tmp->right->data.freq; 8 9 insertHeap(minHeap, tmp); 10 } 11 12 return deleteMin(minHeap); 13 }
然后,我们要根据这颗Huffman Tree来计算WPL。我们用递归来实现计算WPL。
如果这个节点是叶子节点,那么就用当前深度乘以对应的频率,然后返回。如果不是叶子节点,就递归来计算左右子树的WPL,相加后返回。
1 int WPL(TNode *T, int depth) { 2 if (T->left == NULL && T->right == NULL) return depth * T->data.freq; // 叶子节点直接返回结果 3 else return WPL(T->left, depth + 1) + WPL(T->right, depth + 1); // 不是叶子节点,计算左右子树的WPL并返回,同时由于左右子树的深度加深一层,记得depth+1 4 }
好了,折腾了这么久,终于计算出给定频率的WPL了。
接下来我们先对每一组编码来检测其WPL是不是最小的,也就是每组编码的WPL是否与给定频率的WPL相等。
计算方法很简单,每一组编码的WPL计算公式为:
再判断codeLen是否与上面求出的给定频率的WPL相等,如果不相等,就说明这个编码不是最优编码,就不需要再判断是否为前缀码了。如果相等再去判断是否为前缀码。
这里还有个陷阱。首先我们要知道,一个最优编码的长度是不会超过n-1的。所以如果某个编码的长度大于n-1也说明该编码不是最优编码。
这里同时给出计算编码长度和判断是否是前缀码的函数:
1 bool check(TNode *huffmanTree, int n) { 2 int wpl = WPL(huffmanTree, 0); // 计算给定频率构成的Huuffman Tree的WPL 3 4 std::string code[n]; // 存放每一个字符的编码 5 int codeLen = 0; 6 bool ret = true; // 用来标记该组编码是否为最优编码 7 8 for (int i = 0; i < n; i++) { 9 char letter; 10 getchar(); 11 scanf("%c", &letter); 12 getchar(); 13 std::cin >> code[i]; 14 15 if (ret) { // 如果已经知道该组编码不是最优编码就不需要再计算编码长度了,但仍要继续输入 16 if (code[i].size() > n - 1) ret = false; // 如果某个字符的编码长度大于n-1,说明该组编码不是最优编码 17 codeLen += code[i].size() * findFreq(huffmanTree, letter); // 计算编码长度 18 } 19 } 20 21 if (ret && codeLen == wpl) { // 如果ret == true并且编码长度与WPL相同,接着判断该组编码是否为前缀码 22 TNode *T = new TNode; // 为这组编码构造一颗Huffman Tree,初始化Huffman Tree的根节点 23 T->data.freq = 0; 24 T->left = T->right = NULL; 25 26 for (int i = 0; i < n; i++) { // 有n个节点,需要判断n次 27 TNode *pre = T; // 每次判断一个字符都从根节点开始 28 29 for (std::string::iterator it = code[i].begin(); it != code[i].end(); it++) { // 对该字符的每一个编码进行判断 30 if (*it == '0') { // 如果编码是0 31 if (pre->left == NULL) { // 如果当前节点的左子树为空 32 TNode *tmp = new TNode; // 就为当前节点生成一颗左子树 33 tmp->data.freq = 0; // 该节点的频率标记为0,表示该节点还没有字符占用 34 tmp->left = tmp->right = NULL; 35 pre->left = tmp; 36 } 37 pre = pre->left; // pre指针指向左子树 38 } 39 else { // 如果编码是1 40 if (pre->right == NULL) { // 如果当前节点的右子树为空 41 TNode *tmp = new TNode; // 就为当前节点生成一颗右子树 42 tmp->data.freq = 0; // 该节点的频率标记为0,表示该节点还没有字符占用 43 tmp->left = tmp->right = NULL; 44 pre->right = tmp; 45 } 46 pre = pre->right; // pre指针指向左子树 47 } 48 } 49 50 // 读完了字符的编码后,pre指针就指向这个字符应该占用的位置 51 // 这时需要判断pre指向的这个节点是否为叶子节点,并且该节点有没有被其他字符占用 52 if (pre->left == NULL && pre->right == NULL && pre->data.freq == 0) { 53 pre->data.freq = 1; // 如果是叶子节点并且没有被占用,该字符就占用了这个节点,并把这个节点的频率标记为1 54 } 55 else { // 否则,如果这些条件中有一个不满足 56 ret = false; // 就说明该组字符不满足前缀码的要求,ret赋值为false 57 break; // 后面的字符不需要判断了,直接退出退出判断前缀码的循环 58 } 59 } 60 } 61 else { // 如果ret == false并且编码长度不等于WPL,就说明该组编码不是最优编码 62 ret = false; 63 } 64 65 return ret; 66 }
这里是通过构造一颗Huffman Tree来判断该组编码是否符合前缀码。
判断的过程如下:
有一个指向Huffman Tree根节点的指针。
- 如果编码是'0',先判断当前节点的左子树是否存在,如果不存在先生成左子树,再让指针移到左子树的节点。如果存在那么直接让指针移到左子树的节点即可。
- 如果编码是'1',先判断当前节点的右子树是否存在,如果不存在先生成右子树,再让指针移到右子树的节点。如果存在那么直接让指针移到右子树的节点即可。
读完该字符的编码后,那么此时字符应该放入这个指针指向的节点。这个节点要满足两个条件才可以放入:
- 该节点的左右孩子都为空,也就是该节点为叶子节点。
- 该节点每有被标记过,也就是说该节点没有存放其他的字符。
如果有一个条件不满足,就说明该组编码不是前缀码。
最后,给出这种方法的完整AC代码,代码量有点多。
#include <cstdio> #include <iostream> #include <string> struct Data { char letter; int freq; }; struct TNode { Data data; TNode *left, *right; }; struct Heap { TNode *H; int size; int capacity; }; Heap *createMinHeap(int n); void insertHeap(Heap *minHeap, TNode *treeNode); TNode *deleteHeap(Heap *minHeap); TNode *createHuffmanTree(Heap *minHeap); bool check(TNode *huffmanTree, int n); int WPL(TNode *T, int depth); int findFreq(TNode *huffmanTree, char letter); int main() { int n; scanf("%d", &n); Heap *minHeap = createMinHeap(n); TNode *huffmanTree = createHuffmanTree(minHeap); int m; scanf("%d", &m); for (int i = 0; i < m; i++) { bool ret = check(huffmanTree, n); printf("%s\n", ret ? "Yes" : "No"); } return 0; } Heap *createMinHeap(int n) { Heap *minHeap = new Heap; minHeap->size = 0; minHeap->capacity = n; minHeap->H = new TNode[n + 1]; minHeap->H[0].data.freq = -1; for (int i = 0; i < minHeap->capacity; i++) { TNode *tmp = new TNode; tmp->left = tmp->right = NULL; getchar(); scanf("%c %d", &tmp->data.letter, &tmp->data.freq); insertHeap(minHeap, tmp); } return minHeap; } void insertHeap(Heap *minHeap, TNode *treeNode) { int pos = ++minHeap->size; for ( ; treeNode->data.freq < minHeap->H[pos / 2].data.freq; pos /= 2) { minHeap->H[pos] = minHeap->H[pos / 2]; } minHeap->H[pos] = *treeNode; } TNode *deleteHeap(Heap *minHeap) { TNode *minTreeNode = new TNode; *minTreeNode = minHeap->H[1]; TNode tmp = minHeap->H[minHeap->size--]; int parent = 1, child; for ( ; parent * 2 <= minHeap->size; parent = child) { child = parent * 2; if (child != minHeap->size && minHeap->H[child].data.freq > minHeap->H[child + 1].data.freq) child++; if (tmp.data.freq < minHeap->H[child].data.freq) break; else minHeap->H[parent] = minHeap->H[child]; } minHeap->H[parent] = tmp; return minTreeNode; } TNode *createHuffmanTree(Heap *minHeap) { int n = minHeap->size - 1; while (n--) { TNode *tmp = new TNode; tmp->left = deleteHeap(minHeap); tmp->right = deleteHeap(minHeap); tmp->data.freq = tmp->left->data.freq + tmp->right->data.freq; insertHeap(minHeap, tmp); } return deleteHeap(minHeap); } bool check(TNode *huffmanTree, int n) { int wpl = WPL(huffmanTree, 0); std::string code[n]; int codeLen = 0; bool ret = true; for (int i = 0; i < n; i++) { char letter; getchar(); scanf("%c", &letter); getchar(); std::cin >> code[i]; if (ret) { if (code[i].size() > n - 1) ret = false; codeLen += code[i].size() * findFreq(huffmanTree, letter); } } if (ret && codeLen == wpl) { TNode *T = new TNode; T->data.freq = 0; T->left = T->right = NULL; for (int i = 0; i < n; i++) { TNode *pre = T; for (std::string::iterator it = code[i].begin(); it != code[i].end(); it++) { if (*it == '0') { if (pre->left == NULL) { TNode *tmp = new TNode; tmp->data.freq = 0; tmp->left = tmp->right = NULL; pre->left = tmp; } pre = pre->left; } else { if (pre->right == NULL) { TNode *tmp = new TNode; tmp->data.freq = 0; tmp->left = tmp->right = NULL; pre->right = tmp; } pre = pre->right; } } if (pre->left == NULL && pre->right == NULL && pre->data.freq == 0) { pre->data.freq = 1; } else { ret = false; break; } } } else { ret = false; } return ret; } int WPL(TNode *T, int depth) { if (T->left == NULL && T->right == NULL) return depth * T->data.freq; else return WPL(T->left, depth + 1) + WPL(T->right, depth + 1); } int findFreq(TNode *huffmanTree, char letter) { int ret = 0; if (huffmanTree) { if (huffmanTree->left == NULL && huffmanTree->right == NULL && huffmanTree->data.letter == letter) ret = huffmanTree->data.freq; if (ret == 0) ret = findFreq(huffmanTree->left, letter); if (ret == 0) ret = findFreq(huffmanTree->right, letter); } return ret; }AC Code1
然后,我们来对判定前缀码的代码进行改进,下面给出判定前缀码的另外一种思路,这个方法不需要构造Huffman Tree。
首先,假设现在有两个编码,如果这两个编码不满足前缀码的话,比如"110"和"1101",那么其中一个编码会与另外一个编码前的m个位置的相同(其中m是指这两个编码长度中最小的那个长度)。也就是说"110",与"1101"的前3个位置的"110"相同,就说明"110"和"1101"不满足前缀码。
我们需要对同组编码的每两个字符进行比较,需要比较的次数为 C(n, 2) = n * (n - 1) / 2 。
check函数改进的代码如下:
1 bool check(TNode *huffmanTree, int n) { 2 int wpl = WPL(huffmanTree, 0); 3 4 std::string code[n]; 5 int codeLen = 0; 6 bool ret = true; 7 8 for (int i = 0; i < n; i++) { 9 char letter; 10 getchar(); 11 scanf("%c", &letter); 12 getchar(); 13 std::cin >> code[i]; 14 15 if (ret) { 16 if (code[i].size() > n - 1) ret = false; 17 codeLen += code[i].size() * findFreq(huffmanTree, letter); 18 } 19 } 20 21 if (ret && codeLen == wpl) { // 一样的,如果ret == true并且编码长度与WPL相同,才判断该组编码是否为前缀码 22 for (int i = 0; i < n; i++) { // 每个字符都跟它之后的字符进行判断是否满足前缀码的要求 23 for (int j = i + 1; j < n; j++) { 24 // 判断某个编码是否与另外一个编码前m个位置的相同,详细请看图片 25 if (code[i].substr(0, code[j].size()) == code[j].substr(0, code[i].size())) { 26 ret = false; // 只要有一对编码的前缀相同,就说明这组的编码不满足前缀码 27 break; // 后面的字符不需要判断了,直接退出退出判断前缀码的循环 28 } 29 } 30 if (ret == false) break; 31 } 32 } 33 else { 34 ret = false; 35 } 36 37 return ret; 38 }
code[i].substr(0, code[j].size()) == code[j].substr(0, code[i].size()) ,这么做始终能够保证取到两个编码中,长度最小那个编码的全部,以及另外一个编码的前面同样长度的部分,来进行判断是否满足前缀码。
这个方法也可以通过,下面给出完整的AC代码,其中改动的部分就是check部分,其他的不变。
#include <cstdio> #include <iostream> #include <string> struct Data { char letter; int freq; }; struct TNode { Data data; TNode *left, *right; }; struct Heap { TNode *H; int size; int capacity; }; Heap *createMinHeap(int n); void insertHeap(Heap *minHeap, TNode *treeNode); TNode *deleteHeap(Heap *minHeap); TNode *createHuffmanTree(Heap *minHeap); bool check(TNode *huffmanTree, int n); int WPL(TNode *T, int depth); int findFreq(TNode *huffmanTree, char letter); int main() { int n; scanf("%d", &n); Heap *minHeap = createMinHeap(n); TNode *huffmanTree = createHuffmanTree(minHeap); int m; scanf("%d", &m); for (int i = 0; i < m; i++) { bool ret = check(huffmanTree, n); printf("%s\n", ret ? "Yes" : "No"); } return 0; } Heap *createMinHeap(int n) { Heap *minHeap = new Heap; minHeap->size = 0; minHeap->capacity = n; minHeap->H = new TNode[n + 1]; minHeap->H[0].data.freq = -1; for (int i = 0; i < minHeap->capacity; i++) { TNode *tmp = new TNode; tmp->left = tmp->right = NULL; getchar(); scanf("%c %d", &tmp->data.letter, &tmp->data.freq); insertHeap(minHeap, tmp); } return minHeap; } void insertHeap(Heap *minHeap, TNode *treeNode) { int pos = ++minHeap->size; for ( ; treeNode->data.freq < minHeap->H[pos / 2].data.freq; pos /= 2) { minHeap->H[pos] = minHeap->H[pos / 2]; } minHeap->H[pos] = *treeNode; } TNode *deleteHeap(Heap *minHeap) { TNode *minTreeNode = new TNode; *minTreeNode = minHeap->H[1]; TNode tmp = minHeap->H[minHeap->size--]; int parent = 1, child; for ( ; parent * 2 <= minHeap->size; parent = child) { child = parent * 2; if (child != minHeap->size && minHeap->H[child].data.freq > minHeap->H[child + 1].data.freq) child++; if (tmp.data.freq < minHeap->H[child].data.freq) break; else minHeap->H[parent] = minHeap->H[child]; } minHeap->H[parent] = tmp; return minTreeNode; } TNode *createHuffmanTree(Heap *minHeap) { int n = minHeap->size - 1; while (n--) { TNode *tmp = new TNode; tmp->left = deleteHeap(minHeap); tmp->right = deleteHeap(minHeap); tmp->data.freq = tmp->left->data.freq + tmp->right->data.freq; insertHeap(minHeap, tmp); } return deleteHeap(minHeap); } bool check(TNode *huffmanTree, int n) { int wpl = WPL(huffmanTree, 0); std::string code[n]; int codeLen = 0; bool ret = true; for (int i = 0; i < n; i++) { char letter; getchar(); scanf("%c", &letter); getchar(); std::cin >> code[i]; if (ret) { if (code[i].size() > n - 1) ret = false; codeLen += code[i].size() * findFreq(huffmanTree, letter); } } if (ret && codeLen == wpl) { for (int i = 0; i < n; i++) { for (int j = i + 1; j < n; j++) { if (code[i].substr(0, code[j].size()) == code[j].substr(0, code[i].size())) { ret = false; break; } } if (ret == false) break; } } else { ret = false; } return ret; } int WPL(TNode *T, int depth) { if (T->left == NULL && T->right == NULL) return depth * T->data.freq; else return WPL(T->left, depth + 1) + WPL(T->right, depth + 1); } int findFreq(TNode *huffmanTree, char letter) { int ret = 0; if (huffmanTree) { if (huffmanTree->left == NULL && huffmanTree->right == NULL && huffmanTree->data.letter == letter) ret = huffmanTree->data.freq; if (ret == 0) ret = findFreq(huffmanTree->left, letter); if (ret == 0) ret = findFreq(huffmanTree->right, letter); } return ret; }AC Code2
还可以再改进!我们不用堆,而改用优先队列,同时不需要构造任何的Huffman Tree,甚至不需要定义树节点,也可以计算出给定频率的WPL!并且代码长度也缩短许多。
这里主要说明如何通过不构造Huffman Tree来计算给定频率的WPL。其实计算WPL不一定要用深度乘以频率再求和来得到。另外一种方法是把Huffman Tree中度为2的节点存放的频率都相加起来,最后得到的结果也是WPL。这是因为叶子节点被重复计算,和用深度乘以频率的原理基本一样。
就拿题目给的测试样例来举例:
我们往优先队列里面压的就是每个字符对应的频率,而不是树节点。代码实现的过程是:我们要有一个变量来累加如上图度为2节点存放频率。每次从优先队列里弹出两个频率,这两个频率是优先队列中所包含频率里面最小的那两个,然后把这两个频率相加,相加的结果其实就对应上图度为2节点存放的频率,也就是红色的数字。然后把相加的结果累加到那个变量,同时把相加的结果压入优先队列中。其实这个累加的过程就是累加上图红色的那些数字。一直重复,直到优先队列为空,那么那个变量最后累加的结果就是我们要计算的WPL。
AC代码如下:
1 #include <cstdio> 2 #include <iostream> 3 #include <string> 4 #include <vector> 5 #include <queue> 6 #include <map> 7 using namespace std; 8 9 void readLetterFreq(map<char, int> &letterFreq, priority_queue< int, vector<int>, greater<int> > &pq, int n); 10 void checkOptimalCode(map<char, int> &letterFreq, priority_queue< int, vector<int>, greater<int> > &pq, int n); 11 int getWPL(priority_queue< int, vector<int>, greater<int> > &pq); 12 13 int main() { 14 map<char, int> letterFreq; // 用map来存储字符和对应的频率,字符映射为对应的频率 15 priority_queue< int, vector<int>, greater<int> > pq; 16 17 int n; 18 cin >> n; 19 readLetterFreq(letterFreq, pq, n); 20 checkOptimalCode(letterFreq, pq, n); 21 22 return 0; 23 } 24 25 void readLetterFreq(map<char, int> &letterFreq, priority_queue< int, vector<int>, greater<int> > &pq, int n) { 26 for (int i = 0; i < n; i++) { 27 char letter; 28 getchar(); 29 cin >> letter; // 读入字符 30 getchar(); 31 cin >> letterFreq[letter]; // 读入频率,为字符的映射 32 33 pq.push(letterFreq[letter]);// 把读入的频率压入到优先队列中 34 } 35 } 36 37 void checkOptimalCode(map<char, int> &letterFreq, priority_queue< int, vector<int>, greater<int> > &pq, int n) { 38 int wpl = getWPL(pq); // 用不构造Huffman Tree的方法来计算WPL 39 40 int m; 41 cin >> m; 42 for (int i = 0; i < m; i++) { 43 string code[n]; 44 int codeLen = 0; 45 bool ret = true; 46 47 for (int i = 0; i < n; i++) { 48 char letter; 49 getchar(); 50 cin >> letter >> code[i]; 51 52 if (ret) { 53 if (code[i].size() > n - 1) ret = false; 54 codeLen += code[i].size() * letterFreq[letter]; 55 } 56 } 57 58 if (ret && codeLen == wpl) { 59 for (int i = 0; i < n; i++) { 60 for (int j = i + 1; j < n; j++) { 61 if (code[i].substr(0, code[j].size()) == code[j].substr(0, code[i].size())) { 62 ret = false; 63 break; 64 } 65 } 66 if (ret == false) break; 67 } 68 } 69 else { 70 ret = false; 71 } 72 73 cout << (ret ? "Yes\n" : "No\n"); 74 } 75 } 76 77 int getWPL(priority_queue< int, vector<int>, greater<int> > &pq) { 78 int wpl = 0; // 用来保存累加的结果 79 while (!pq.empty()) { // 当优先队列不为空 80 int tmp = pq.top(); // 从优先队列弹出一个元素,这个元素就是最小频率 81 pq.pop(); 82 83 if (pq.empty()) break; // 如果弹出那个频元素优先队列就为空了,退出循环 84 85 tmp += pq.top(); // 如果优先队列不为空,再弹出一个元素,同时把两个频率进行相加 86 pq.pop(); 87 pq.push(tmp); // 把两个频率相加的结果压入优先队列中 88 89 wpl += tmp; // 同时,把这个相加结果进行累加,对应着累加度为2节点存放的频率 90 } 91 92 return wpl; 93 }
参考资料
浙江大学——数据结构:https://www.icourse163.org/course/ZJU-93001?tid=1461682474
priority_queue的用法:https://www.cnblogs.com/Deribs4/p/5657746.html
pta5-9 Huffman Codes (30分):https://www.cnblogs.com/Deribs4/p/4801656.html