算法 | Dai Chao

记一次愚蠢的Leetcode刷题

一道并不怎么难的题，提交了N次才成功，究其原因是边缘情况特别多，所以一定得好好复盘，练好我一直写不清楚的二分法——我经常绕不清楚奇偶两种情况，也常常照顾不好数组越界。 ...

简单的字符串练习1

初次尝试刷题，当然要刷简单的啦。 Root :: Competitive Programming 3: The New Lower Bound of Programming Contests (Steven & Felix Halim) :: String Processing :: String Matching :: Standard 455 - Periodic Strings **问题描述：**求一个字符串的最小周期，例如"123123123"的最小周期为3。从数据规模来看，穷举法完全可行，但还是要学习一下这位老哥对KMP的巧妙运用： ...

LCP数组、后缀自动机、后缀树（？）

LCP数组定义有了后缀数组，也就有了每个后缀的大小顺序。但现在我们有另一个奇怪的需求：对于任意两个后缀，求它们的最长公共前缀(Longest Common Prefix, LCP)。啊？为什么有这个需求？我也不知道，但大佬说很有用，那就求吧。 ...

The KMP Algorithm

Not difficult to understand. I learned it from Coursera. Prefix Function A “Border” of a string, is a substring that both acts as a prefix and a suffix of that string The “Prefix Function” of a string records the length of the longest border of every prefix of that string Symbol s is a string s[:i] is a prefix of s made up by s[0], s[1], ..., s[i-1] L[i] is the length of the longest border of s[:i] In other words, L is the “Prefix Function” of s Conclusion L[i+1] <= L[i]+1, and the equality holds if and only if s[L] = s[i] The longest border of the longest border of a string, is the second longest border of that string. if some border of s[:i] has a length of len and s[len] = s[i], then there must be a border in s[:i+1] with a length of len+1 Method L[0] = 0 L[i+1] = len + 1, where len is the length of the longest border of s[:i] that satisfies s[len] = s[i] To find len, try every border of s[:i] from the longest to the shortest To find the next longest border, see Conclusion 2 If no border is found, simply compare s[0] with s[i] If s[0] = s[i], then L[i+1] = 1 Else, L[i+1] = 0 Implementation int *pf(string s) { int len=s.size(),b=0; int *p=new int[len]; p[0]=0; for(int i=1;i<len;i++) { while(b>0 && s[i]!=s[b]) b=p[b-1]; if(s[i]==s[b]) b++; p[i] = b; } return p; } KMP Symbol p is the pattern string, t is the text string L is the “Prefix Function” of p Conclusion If p matches t partially with a longest common prefix p[:i], then no matches will be found before p is shifted len(p) - L[i] positions rightwards. There are two “submatches” in this partial match, both of which are the longest border of p[:i] The left one acts as a prefix, while the right one acts as a postfix p is shifted rightwards, so does p[:i] No matches are possible before the left submatch arrives at the previous position of the right submatch The length of this “vacuum area” is len(p) - L[i] Method Make s = p + '$' + t, where ‘$’ is absent from both p and t Calculate the “Prefix Function” of s, marked as L A match is found at position i - 2 * len(p) when L[i] = len(p) Under such circumstances, the longest border of s[:i] happens to be p itself, as p is a prefix of s This border is also a substring of s Thanks to the ‘$’ sign, we can guarantee that this border never crosses the real “border” between p and t, so it is also a substring of t Thus the starting position of the match is i - (len(p) - 1) in s, and i - (len(p) - 1) - (len(p) + 1) in t Implementation int main() { string pattern, text, s; int *pf; while(cin >> pattern >> text && pattern != "") { if(pattern.size() > text.size()) { cout << "-1" << endl; continue; } s = pattern + "$" + text; pf = prefix_func(s); for(int i = pattern.size()+1; i < s.size(); i++) { if(pf[i] == pattern.size()) { cout << i - 2 * pattern.size() << " "; } } cout << endl; delete pf; } return 0; }

破解外星代码：详解后缀数组模板

朕听说，后缀数组只需寥寥二十行代码即可求出？是，陛下。快，快呈上来！是，陛下。 int wa[MAXN], wb[MAXN], wv[MAXN], ws[MAXN]; inline bool cmp(int *r, int a, int b, int len) { return r[a]==r[b] && r[a+len]==r[b+len]; } void SA(char *r, int *sa, int n, int m) { int i, j, p, *x = wa, *y = wb, *t; for (i = 0; i < m; i++) ws[i] = 0; for (i = 0; i < n; i++) ws[x[i] = r[i]]++; for (i = 1; i < m; i++) ws[i] += ws[i - 1]; for (i = n - 1; i >= 0; i--) sa[--ws[x[i]]] = i; for (j = p = 1; p < n; j <<= 1, m = p) { for (p = 0, i = n - j; i < n; i++) y[p++] = i; for (i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j; for (i = 0; i < m; i++) ws[i] = 0; for (i = 0; i < n; i++) ws[wv[i] = x[y[i]]]++; for (i = 1; i < m; i++) ws[i] += ws[i - 1]; for (i = n - 1; i >= 0; i--) sa[--ws[wv[i]]] = y[i]; for (t=x,x=y,y=t, x[sa[0]]=0, p=i=1; i<n; i++) x[sa[i]] = cmp(y,sa[i-1],sa[i],j) ? p-1 : p++; } } 陛下……陛下！快请御医！前言鉴于我也是第一次接触后缀数组，理解程度还很低，再加上类似的博客早已烂大街，因此不求读者能看懂或学到什么东西了。你还可以向我留言提出指导意见，我会非常欢迎。 ...