KMP Algorithm
Intuition
Assume we have a string BBCABCDABABCDABCDABDE
,</p>
we want to check it there is a substring ABCDABD
i
denotes the index of string, j
denotes the index of pattern
Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
i | ↓ | ||||||||||||||||||||||
String | B | B | C | A | B | C | D | A | B | A | B | C | D | A | B | C | D | A | B | D | E | ||
pattern | A | B | C | D | A | B | D | ||||||||||||||||
patternIndex | 0 | 1 | 2 | 3 | 4 | 5 | 6 | ||||||||||||||||
j | ↑ | ||||||||||||||||||||||
pattern | A | B | C | D | A | B | D | ||||||||||||||||
patternIndex | 0 | 1 | 2 | 3 | 4 | 5 | 6 | ||||||||||||||||
next j | ↑ |
After we start matching iStart = 3
, we can move to iStart
to 4
intuitively.
However, ideally, we can virtual move iStart
to 7
, comparing i = 9
and j = 2
since we alreay compared i = 7, j = 4
and i = 8, j = 5
Thus, we can create a int[] next
denotes that where j
should go when j
mismatches (s.charAt(i) != p.charAt(j)
)
Implementation
For pattern string mentioned above, we can have the following table.
pattern string | A | B | C | D | A | B | D |
---|---|---|---|---|---|---|---|
patternIndex | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
length of the common prefix and suffix | 0 | 0 | 0 | 0 | 1 | 2 | 0 |
next[] | -1 | 0 | 0 | 0 | 0 | 1 | 2 |
Thus, when a mismatch happens on j = 6
, j
should be set to next[j]
Denote k
to the length of common suffix and prefix of we already have in string, which indicates that p[0, 1, 2, ..., k - 1]
equals p[j - k, j - k + 1, ..., j - 1]
- If
p[k] == p[j]
,k
can be increased by 1.next[j + 1]
will bek
- If not,
k = next[k]
next[]
static int[] getNext(final String pattern) {
final int[] next = new int[pattern.length()];
if (pattern.length() == 0) {
return next;
}
next[0] = -1;
int k = -1, j = 0;
while (j < pattern.length() - 1) {
if (k == -1 || pattern.charAt(j) == pattern.charAt(k)) {
++k;
++j;
next[j] = k;
} else {
k = next[k];
}
}
return next;
}
substring()
在KMP
模板,需要严格遵守 int i = 0, j = 0
,并在外面比较 -> 否则,""
and ""
这个case过不去
public int substring(final String s, final String p) {
final int[] next = getNext(p);
final int sLen = s.length(), pLen = p.length();
int i = 0, j = 0;
while (i < sLen && j < pLen) {
if (j == -1 || s.charAt(i) == p.charAt(j)) {
// when mismatch
++i;
++j;
} else {
// when match
j = next[j];
}
}
if (j == pLen) {
return i - j;
} else {
return -1;
}
}
Reference
http://wiki.jikexueyuan.com/project/kmp-algorithm/define.html