Problem link: http://codeforces.com/gym/101954/problem/E
E. Locker Room
time limit per test: 2.0 s
memory limit per test: 256 MB
input: standard input
output: standard output
There are several strange rooms in Binary Casino and one of them is a locker room. You have to enter the locker room several times a day, two times being a minimum (before and after your shift). There is no key or access code needed to enter the locker room. There is a brand new security lock system instead.
The lock system presents you with a generated puzzle which you have to solve every time you want to enter the locker room. This system prevents possible intruders to enter the locker room, as the puzzle takes them a long time to solve. Only employees, after working in the casino for some time, manage to master the puzzle.
It is your second week in the casino and you have already been late three times because you didn't manage to solve the puzzle quickly enough. You therefore decided to write a program which solves the puzzle. The puzzle is as follows:
You are given a cyclic string of N lowercase English letters. You have to choose and mark substrings (continuous segments of characters) of a given length K until each character in the string is marked. Marking a substring does not change the original string and each character can be marked multiple times. The task is to print the lexicographically maximal substring among chosen substrings. In addition, the printed substring has to be lexicographically minimal possible.
For example, let "acdb" be the given string of length N=4 and let K=3. Then you can choose substrings "acd" and "bac" to mark the whole string. The puzzle solution is "bac".
Input
The first line of input contains two integers N and K (1 ≤ N ≤ 5⋅10^5, 1 ≤ K ≤ N), describing the length of the given string and the length of marked substrings. The second line contains N lowercase English letters – the given cyclic string.
Output
Output the lexicographically maximal substring among chosen substrings under the condition the result is lexicographically minimal possible.
Examples
Input
4 3
acdb
Output
bac
Input
6 2
aababa
Output
ab
Input
10 4
abaaabaaba
Output
aaba
Input
1 1
v
Output
v
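Problem summary:
From the string of length N, choose a substring s of length K that can cover the whole string.
"Cover" means the following: mark the characters of every length-K substring of the original string that is lexicographically no greater than s (a character may be marked more than once);
if every character of the original string ends up marked, then the chosen substring s covers the string.
Find the lexicographically smallest such s.
The original string is cyclic!!!
In other words, minimize the maximum.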
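Approach:
During the practice contest, my idea was to store every length-K substring in a set and test each one as a candidate answer. Since the set is sorted, the candidates can be binary searched, but each check is still O(N). The result was Memory Limit Exceeded on the OJ.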
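The contest-time idea can be sketched as follows. This is only a small reference for checking answers on tiny inputs, not the submitted code: the names `canCover` and `solveBrute` are made up for illustration, a sorted vector stands in for the set, and the check marks characters naively in O(N·K). Storing N strings of length K is exactly what makes this approach run out of memory at full constraints.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: can the cycle be fully marked using only the
// length-K windows that are lexicographically <= limit?
static bool canCover(const std::vector<std::string>& win,
                     const std::string& limit, int N, int K) {
    std::vector<char> marked(N, 0);
    for (int i = 0; i < N; ++i) {
        if (win[i] > limit) continue;          // window too large, not allowed
        for (int j = 0; j < K; ++j) marked[(i + j) % N] = 1;
    }
    for (int i = 0; i < N; ++i)
        if (!marked[i]) return false;
    return true;
}

// Hypothetical brute force: binary search over the sorted distinct windows.
std::string solveBrute(const std::string& s, int K) {
    int N = static_cast<int>(s.size());
    std::string d = s + s;                     // unroll the cycle
    std::vector<std::string> win(N);
    for (int i = 0; i < N; ++i) win[i] = d.substr(i, K);

    std::vector<std::string> cand = win;
    std::sort(cand.begin(), cand.end());
    cand.erase(std::unique(cand.begin(), cand.end()), cand.end());

    // Allowing more windows never hurts, so the predicate is monotone,
    // and the largest candidate always covers the cycle.
    int lo = 0, hi = static_cast<int>(cand.size()) - 1;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (canCover(win, cand[mid], N, K)) hi = mid;
        else lo = mid + 1;
    }
    return cand[lo];
}
```

Calling `solveBrute("acdb", 3)` reproduces the first sample, which makes the function handy for cross-checking a faster solution on random small strings.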
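The intended solution uses a suffix array:
Because the string is cyclic, append a copy of the string to itself.
Build the suffix array, then binary search on the suffix rank t.
Doubling version: O(2 * n log n)
DC3 version: O(n + n log n)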
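To make the "suffix ranks of the doubled string" concrete, here is a minimal sketch that computes them with plain prefix doubling and `std::sort`. It runs in O(n log² n), so it is slower than both versions listed above and is meant only for experimenting on small strings. The function name `suffixRanks` is an assumption of this sketch, and the ranks are 0-based with no sentinel character.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Returns rk, where rk[i] is the 0-based rank of the suffix t[i..]
// among all suffixes of t (simple prefix doubling, O(n log^2 n)).
std::vector<int> suffixRanks(const std::string& t) {
    int n = static_cast<int>(t.size());
    if (n == 0) return {};
    std::vector<int> sa(n), rk(n), tmp(n);
    std::iota(sa.begin(), sa.end(), 0);
    for (int i = 0; i < n; ++i) rk[i] = static_cast<unsigned char>(t[i]);

    for (int len = 1;; len <<= 1) {
        // Sort key for a suffix: rank of its first len characters,
        // then rank of the next len characters (-1 if past the end).
        auto key = [&](int i) {
            return std::make_pair(rk[i], i + len < n ? rk[i + len] : -1);
        };
        std::sort(sa.begin(), sa.end(),
                  [&](int a, int b) { return key(a) < key(b); });
        tmp[sa[0]] = 0;
        for (int i = 1; i < n; ++i)
            tmp[sa[i]] = tmp[sa[i - 1]] + (key(sa[i - 1]) < key(sa[i]) ? 1 : 0);
        rk = tmp;
        if (rk[sa[n - 1]] == n - 1) break;     // all ranks distinct: done
    }
    return rk;
}
```

For the first sample, `suffixRanks("acdbacdb")` gives the ranks of all suffixes of the doubled string; only the first N entries (the starts inside the original string) are needed afterwards.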
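How do we decide whether the suffix of rank t works?
Scan the original string once; whenever a suffix has rank no greater than t, extend the covered interval,
then check whether some contiguous covered interval has length at least N. If so, the whole string can be covered.
How is this interval tracked?
Keep a start pointer and an end pointer. A suffix lexicographically greater than s is skipped; otherwise the end pointer is updated.
If the current index i is already beyond the end of the contiguous interval covered so far, reset the start to i (the previous contiguous interval has been broken).
Because the string is cyclic, after every update subtract the start from the end and test whether the covered length is at least N.
If the suffix of rank t is shorter than K, it certainly cannot cover the string: when one string is a prefix of another, the longer one is lexicographically larger, so the contiguous interval formed during the check is always shorter than N.
If the suffix of rank t has length at least K, how do we make sure that only its length-K prefix is taken as the answer?
The trick is in how the interval is maintained: the end is always set to the start index of the current suffix that is no greater than s (the suffix we chose) plus K, so only the first K characters of each suffix are used.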
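The interval sweep described above can be isolated as a small standalone function. This is a sketch under assumptions: the name `coversCycle` and the `rank` vector parameter are introduced here for illustration, and `rank[i]` is taken to be the rank of the suffix of the doubled string starting at position i, for 0 ≤ i < N.

```cpp
#include <vector>

// Returns true if the windows [i, i+K) whose start i has rank[i] <= t
// form one contiguous run of length >= N, i.e. they cover the whole cycle.
bool coversCycle(const std::vector<int>& rank, int N, int K, int t) {
    int b = 0, e = -1;                 // current run is the half-open range [b, e)
    for (int i = 0; i < N; ++i) {
        if (rank[i] > t) continue;     // suffix larger than the candidate: skip
        if (e < i) b = i;              // gap before i: the old run is broken
        e = i + K;                     // only the first K characters count
        if (e - b >= N) return true;   // run wraps around the whole cycle
    }
    return false;
}
```

With the 0-based ranks 1, 5, 7, 3 of the four starts in "acdb" (taken from the doubled string "acdbacdb") and K = 3, the call with t = 3 succeeds because the windows at positions 0 and 3 chain into a run of length 6, while t = 2 fails because only the window at position 0 is allowed.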
AC code:
#include <bits/stdc++.h>
#define inc(i, j, k) for(int i = j; i <= k; i++)
#define rep(i, j, k) for(int i = j; i < k; i++)
#define F(x) ((x)/3+((x)%3==1?0:tb))
#define G(x) ((x)<tb?(x)*3+1:((x)-tb)*3+2)
#define INF 0x3f3f3f3f
#define LL long long
using namespace std;
const int maxn = int(3e6)+10; /// note: DC3 needs arrays of 3x the (doubled) string length
int wa[maxn],wb[maxn],wv[maxn],Ws[maxn];
int r[maxn], sa[maxn], Rank[maxn], height[maxn];

// c0: are the three characters starting at a and at b identical?
int c0(int *r,int a,int b)
{return r[a]==r[b]&&r[a+1]==r[b+1]&&r[a+2]==r[b+2];}

// c12: does the suffix at a (a%3==0) sort before the suffix at b (b%3==k)?
int c12(int k,int *r,int a,int b)
{if(k==2) return r[a]<r[b]||(r[a]==r[b]&&c12(1,r,a+1,b+1));
else return r[a]<r[b]||(r[a]==r[b]&&wv[a+1]<wv[b+1]);}

// Counting sort: order the indices in a by key r[a[i]] into b; keys lie in [0, m)
void Sort(int *r,int *a,int *b,int n,int m)
{
    int i;
    for(i=0;i<n;i++) wv[i]=r[a[i]];
    for(i=0;i<m;i++) Ws[i]=0;
    for(i=0;i<n;i++) Ws[wv[i]]++;
    for(i=1;i<m;i++) Ws[i]+=Ws[i-1];
    for(i=n-1;i>=0;i--) b[--Ws[wv[i]]]=a[i];
    return;
}

void dc3(int *r,int *sa,int n,int m) // parameters have the same meaning as in DA (the doubling version)
{
    int i,j,*rn=r+n,*san=sa+n,ta=0,tb=(n+1)/3,tbc=0,p;
    r[n]=r[n+1]=0;
    for(i=0;i<n;i++) if(i%3!=0) wa[tbc++]=i;
    Sort(r+2,wa,wb,tbc,m);
    Sort(r+1,wb,wa,tbc,m);
    Sort(r,wa,wb,tbc,m);
    for(p=1,rn[F(wb[0])]=0,i=1;i<tbc;i++)
        rn[F(wb[i])]=c0(r,wb[i-1],wb[i])?p-1:p++;
    if(p<tbc) dc3(rn,san,tbc,p);
    else for(i=0;i<tbc;i++) san[rn[i]]=i;
    for(i=0;i<tbc;i++) if(san[i]<tb) wb[ta++]=san[i]*3;
    if(n%3==1) wb[ta++]=n-1;
    Sort(r,wb,wa,ta,m);
    for(i=0;i<tbc;i++) wv[wb[i]=G(san[i])]=i;
    for(i=0,j=0,p=0;i<ta && j<tbc;p++)
        sa[p]=c12(wb[j]%3,r,wa[i],wb[j])?wa[i++]:wb[j++];
    for(;i<ta;p++) sa[p]=wa[i++];
    for(;j<tbc;p++) sa[p]=wb[j++];
    return;
}

// Fills Rank[] (1-based; sa[0] is the sentinel suffix) and height[] (LCP of neighbours)
void calheight(int *r, int *sa, int n)
{
    int i, j, k = 0;
    for(i = 1; i <= n; i++) Rank[sa[i]] = i;
    for(i = 0; i < n; height[Rank[i++]] = k)
        for(k?k--:0,j=sa[Rank[i]-1]; r[i+k]==r[j+k]; k++);
    return;
}

int N, K;
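string str;

// Can the whole cycle be covered using only start positions whose suffix rank is <= t?
bool check(int t)
{
    int b = 0, e = -1;              // current contiguous covered interval is [b, e)
    for(int i = 0; i < N; i++){
        if(Rank[i] > t) continue;   // suffix larger than the candidate: skip
        if(e < i) b = i;            // previous interval is broken, restart at i
        e = i+K;                    // only the first K characters are used
        if(e-b >= N) return true;
    }
    return false;
}

int main()
{
    scanf("%d %d", &N, &K);
    cin >> str;
    str+=str;                       // the string is cyclic, so double it
    int n_len = str.size();
    for(int i = 0; i < n_len; i++){
        r[i] = str[i]-'a'+1;        // letters map to 1..26
    }
    r[n_len] = 0;                   // sentinel, smaller than every letter
    dc3(r, sa, n_len+1, 128);       // keys are 0..26, so any bound above 26 works
    calheight(r, sa, n_len);
    int L = 1, R = n_len, mid, ans = n_len;
    while(L <= R){                  // binary search for the smallest feasible rank
        mid = (L+R)>>1;
        if(check(mid)){
            R = mid-1;
            ans = mid;
        }
        else L = mid+1;
    }
    string res;
    for(int i = 0; i < N; i++){
        if(Rank[i] == ans){
            res = str.substr(i, K); // answer is the length-K prefix of that suffix
            cout << res << endl;
            return 0;
        }
    }
    return 0;
}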