Hashing

The task of this problem is simple: insert a sequence of distinct positive integers into a hash table, and output the positions of the input numbers. The hash function is defined to be H(key)=key%TSize where TSize is the maximum size of the hash table. Quadratic probing (with positive increments only) is used to solve the collisions.
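
As a rough illustration of the probing rule above, the sketch below lists the slots that quadratic probing with positive increments visits for one key. The function name `probePositions` and the choice to stop after `tableSize` steps are assumptions made for this example; neither comes from the problem statement.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: slots visited for `key` at steps 0, 1, ..., tableSize - 1.
// Step i lands on (H(key) + i * i) % tableSize, with H(key) = key % tableSize.
std::vector<int> probePositions(int key, int tableSize) {
    assert(key >= 0 && tableSize > 0);
    std::vector<int> positions;
    int home = key % tableSize;
    for (long long step = 0; step < tableSize; ++step) {
        positions.push_back(static_cast<int>((home + step * step) % tableSize));
    }
    return positions;
}
```

For key 15 in a table of size 5, the visited slots are 0, 1, 4, 4, 1.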

Note that the table size is better to be prime. If the maximum size given by the user is not prime, you must re-define the table size to be the smallest prime number which is larger than the size given by the user.

Input Specification:

Each input file contains one test case. For each case, the first line contains two positive numbers: MSize (≤ 10^4) and N (≤ MSize) which are the user-defined table size and the number of input numbers, respectively. Then N distinct positive integers are given in the next line. All the numbers in a line are separated by a space.

Output Specification:

For each test case, print the corresponding positions (index starts from 0) of the input numbers in one line. All the numbers in a line are separated by a space, and there must be no extra space at the end of the line. In case it is impossible to insert the number, print "-" instead.

Sample Input:

4 4
10 6 4 15

Sample Output:

0 1 4 -
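
To see where the sample output comes from: MSize = 4 is not prime, so the table size becomes 5. Keys 10, 6 and 4 land on their home slots 0, 1 and 4. Key 15 hashes to 0 and then probes 1, 4, 4, 1, all of which are occupied, so it is reported as "-". A minimal sketch that reproduces this trace follows; the names `smallestPrimeAtLeast` and `simulate` are made up for the example, and the loop stops after tableSize steps as an assumed bound.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Smallest prime >= n (trial division is enough for n <= 10^4).
int smallestPrimeAtLeast(int n) {
    if (n < 2) n = 2;
    for (;; ++n) {
        bool prime = true;
        for (int d = 2; d * d <= n; ++d) {
            if (n % d == 0) { prime = false; break; }
        }
        if (prime) return n;
    }
}

// Inserts the keys in order and returns the output line for the test case.
std::string simulate(int mSize, const std::vector<int>& keys) {
    assert(mSize > 0);
    int tableSize = smallestPrimeAtLeast(mSize);
    std::vector<bool> used(tableSize, false);
    std::string line;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        int home = keys[i] % tableSize;
        int slot = -1;  // stays -1 if no free slot is found
        for (long long step = 0; step < tableSize; ++step) {
            int pos = static_cast<int>((home + step * step) % tableSize);
            if (!used[pos]) { slot = pos; break; }
        }
        if (i > 0) line += ' ';
        if (slot == -1) {
            line += '-';
        } else {
            used[slot] = true;
            line += std::to_string(slot);
        }
    }
    return line;
}
```

Calling `simulate(4, {10, 6, 4, 15})` returns the sample output line `0 1 4 -`.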

Solution approach

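  The hard part of this problem is deciding whether an element can be inserted at all. I thought about it for a long time without working it out, so here is the conclusion directly. During quadratic probing we keep a counter cnt that is incremented by 1 on every probe. If cnt >= tableSize and no free slot has been found yet, the element cannot be inserted.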

  Here is a short proof:

  The probe offset is (p + k * tableSize)^2 % tableSize, where p + k * tableSize = cnt, p is an integer in [0, tableSize), and k is a non-negative integer.

  Expanding the square gives (p^2 + 2 * p * k * tableSize + k^2 * tableSize^2) % tableSize.

  This equals ((p^2 % tableSize) + ((2 * p * k * tableSize) % tableSize) + ((k^2 * tableSize^2) % tableSize)) % tableSize.

  Here (2 * p * k * tableSize) % tableSize = 0 and (k^2 * tableSize^2) % tableSize = 0.

  So (p + k * tableSize)^2 % tableSize = p^2 % tableSize.

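  The modular-exponent identity for a prime m, a^b % m = a^((b - 1) % (m - 1) + 1) % m, does not reduce this any further: with b = 2 the exponent is still 2 whenever m > 2, so p^2 % tableSize is generally not equal to p % tableSize (for example, 2^2 % 5 = 4).

  It is not needed, though. The identity (p + k * tableSize)^2 % tableSize = p^2 % tableSize holds both for cnt < tableSize (k = 0) and for cnt >= tableSize (k >= 1), with p in [0, tableSize). The offset at step cnt therefore depends only on p = cnt % tableSize, so every probe with cnt >= tableSize repeats one already made with cnt < tableSize. It is enough to try cnt = 0, 1, ..., tableSize - 1: if none of those probes finds a free slot, the element cannot be inserted, and continuing the loop would only repeat earlier work.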

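  The proof above ignores the position the key hashes to, but the reasoning is the same: the probed slot is (pos + cnt^2) % tableSize, where pos = key % tableSize is fixed, so the sequence of slots repeats with the same period.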
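
The claim that probes with cnt >= tableSize only revisit earlier slots can also be checked numerically. The sketch below (the name `probesRepeat` is made up for this example) verifies, for every home position pos and every cnt in [0, 2 * tableSize), that the probe at cnt + tableSize lands on the same slot as the probe at cnt.

```cpp
#include <cassert>

// Returns true if (pos + cnt^2) % tableSize is periodic in cnt with period
// tableSize, for every home position pos in [0, tableSize).
bool probesRepeat(int tableSize) {
    assert(tableSize > 0);
    for (int pos = 0; pos < tableSize; ++pos) {
        for (long long cnt = 0; cnt < 2LL * tableSize; ++cnt) {
            long long later = cnt + tableSize;
            if ((pos + cnt * cnt) % tableSize != (pos + later * later) % tableSize) {
                return false;
            }
        }
    }
    return true;
}
```

The check passes for any positive tableSize, prime or not, because the derivation only uses the fact that multiples of tableSize vanish modulo tableSize.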

  The accepted (AC) code is as follows:

#include <cstdio>

struct HashTable {
    int *table;      // table[i] == 1 if slot i is occupied, 0 otherwise
    int tableSize;
};

int nextPrime(int n);
int insert(int key, HashTable *ht);

int main() {
    int n, m;    // n: user-defined table size (MSize), m: number of keys (N)
    scanf("%d %d", &n, &m);
    HashTable *ht = new HashTable;
    ht->tableSize = nextPrime(n);
    ht->table = new int[ht->tableSize];
    for (int i = 0; i < ht->tableSize; i++) {
        ht->table[i] = 0;    // 0 marks an empty slot, 1 an occupied one; the keys themselves are not stored
    }

    for (int i = 0; i < m; i++) {
        int num;
        scanf("%d", &num);
        int ret = insert(num, ht);

        if (i) putchar(' ');
        if (ret == -1) putchar('-');
        else printf("%d", ret);
    }

    delete[] ht->table;
    delete ht;
    return 0;
}

// Returns the smallest prime that is >= n, so a prime n is kept unchanged.
int nextPrime(int n) {
    if (n <= 2) return 2;
    int i = n % 2 ? n : n + 1;    // first odd candidate >= n
    while (true) {
        int j = 3;
        for ( ; j * j <= i; j += 2) {
            if (i % j == 0) break;
        }
        if (j * j > i) break;     // no odd divisor up to sqrt(i): i is prime
        i += 2;
    }

    return i;
}

// Returns the slot where key is placed, or -1 if it cannot be inserted.
int insert(int key, HashTable *ht) {
    int curPos, newPos, cnt = 0;
    curPos = newPos = key % ht->tableSize;
    // Once cnt >= tableSize there is no slot left for key: later probes only repeat earlier ones
    while (ht->table[newPos] == 1 && cnt < ht->tableSize) {
        cnt++;
        newPos = curPos + cnt * cnt;
        newPos %= ht->tableSize;
    }

    if (cnt >= ht->tableSize) newPos = -1;
    else ht->table[newPos] = 1;

    return newPos;
}

References

  Common modular arithmetic identities: https://www.cnblogs.com/noraxu/p/12578396.html

  11-散列2 Hashing (analysis of the difficult points): https://blog.csdn.net/qq_38975493/article/details/90370475
