堆排序 VS 快速排序解决 TOP K 问题

2024-01-16 19:40:34

解决 TOP k 问题通常可采用堆排序和快速排序的思想

1. 大根堆(前 K 小) / 小根堆（前 K 大): 时间复杂度O(NlogK)

c++ STL 中提供了 priority_queue 实现堆的基本功能，比如 priority_queue <int> pq;

堆 pq 的元素都是 int 型的，priority_queue 默认使用 vector 作为堆的底层实现，pq默认是个

大根对，priority_queue <int> pq 等同于 priority_queue <int，vector<int>,less<int> > pq;

小根堆：priority_queue <int，vector<int>,greater<int> > pq1,

堆中的元素必须是可以比较大小的。

Tips : c++ std :: pair<K,V>，是可以比较大小的。比较的逻辑如下：

1 template<class _Ty1, class _Ty2>
2  inline  bool operator<(const pair<_Ty1, _Ty2>& _Left, 
3                                   const pair<_Ty1, _Ty2>& _Right)
4     {    // test if _Left < _Right for pairs
5     return (_Left.first < _Right.first ||
6         !(_Right.first < _Left.first) && _Left.second < _Right.second);
7     }

2 、用快排变形高效解决 TopK 问题：时间复杂度O(N)

注意找前 K 大/前 K 小问题不需要对整个数组进行 O(NlogN)O(NlogN) 的排序！

比如找到数组 arr 中的前 k 小个元素，使用快速排序的划分操作一次可以确定数组中一个元素的最终位置。不断缩小划分的范围，

最终的目标是确定位置 k - 1 上的最终元素。

根据快速排序思想，要是数组的 k -1 位置上的元素确定了，则 [0:k-1] 就是所求的前 K 小个元素。

求前 K 大个元素需使用划分法确定 n - k 位置上的元素，[n-k,k-1] 就是所求的前 k 大个元素。

3. 示例

方法一. 快速排序解决 TOP K 问题

 1 class Solution {
 2 public:
 3     vector<vector<int>> kClosest(vector<vector<int>>& points, int k)
 4     {
 5         const int n = points.size();
 6         vector<int> index_vec(n);
 7         for(int i = 0 ; i < n ; ++i)
 8         {
 9             index_vec[i] = i;
10         }
11         int left = 0, right = n - 1;
12         bool flag = false;
13         while(left <= right)
14         {
15             int t = partition(points,index_vec,left,right);
16             if(t == -1) break;
17             if(t == k - 1)
18             {
19                 flag = true;
20                 break;
21             }
22             else if( t > k - 1)
23             {
24                 right = t - 1;
25             }
26             else
27             {
28                 left = t + 1;
29             }
30         }
31         vector<vector<int>> res;
32         if(flag)
33         {
34             for(int i = 0;i < k; ++i)
35             {
36                 res.push_back(points[index_vec[i]]);
37             }
38         }
39         return res;
40     }
41     
42     int partition(vector<vector<int>>& points,vector<int> &index_vec,int l,int r)
43     {
44         if(l > r)
45         {
46             return -1;
47         }
48         int i = l,j = r;
49         int pivot = cal_dist(points[index_vec[l]]);
50         while( true )
51         {
52             while(i <= j && cal_dist(points[index_vec[i]]) <= pivot) ++i;
53             while(i <= j && cal_dist(points[index_vec[j]]) >= pivot) --j;
54             if(i > j)
55             {
56                 break;
57             }
58             std::swap(index_vec[i],index_vec[j]);
59         }
60         std::swap(index_vec[l],index_vec[j]);//最后一次交换 index_vec[l] 最终的位置确定
61         return j;
62     }
63     
64     inline int cal_dist(vector<int>& point)
65     {
66         return point[0] * point[0] + point[1] * point[1];
67     }
68 };

方法二. 大根堆解决 TOP K 问题

 1 //Top K we
 2   class my_less {
 3     public:
 4         bool operator()(const pair<int, int>& pr1, const pair<int, int>& pr2) {
 5             return pr1.first < pr2.first;
 6         }
 7     };
 8 
 9 class Solution {
10 public:
11     vector<vector<int>> kClosest(vector<vector<int>>& points, int K)
12     {
13         //priority_queue<pair<int, int>,vector<pair<int, int>>,my_less> q;//大顶堆
14         priority_queue<pair<int, int>> q;
15         const int n = points.size();
16         for (int i = 0; i < n; ++i)
17         {
18             int dist = points[i][0] * points[i][0] + points[i][1] * points[i][1];
19             if(q.size() < K)
20             {
21                q.push(make_pair(dist,i));
22                 continue;
23             }
24             if (dist < q.top().first)
25             {
26                 q.pop();
27                 q.emplace(dist, i);
28             }
29         }
30         vector<vector<int>> ans;
31         while (!q.empty())
32         {
33             ans.push_back(points[q.top().second]);
34             q.pop();
35         }
36         return ans;
37     }
38 };

码农公寓

解决 TOP k 问题通常可采用 堆排序 和 快速排序的思想

1. 大根堆(前 K 小) / 小根堆（前 K 大): 时间复杂度O(NlogK)

2 、用快排变形高效解决 TopK 问题：时间复杂度O(N)

注意找前 K 大/前 K 小问题不需要对整个数组进行 O(NlogN)O(NlogN) 的排序！

相关文章

解决 TOP k 问题通常可采用堆排序和快速排序的思想