KMP Algorithm

Intuition

Assume we have a string BBCABCDABABCDABCDABDE,</p> we want to check it there is a substring ABCDABD

i denotes the index of string, j denotes the index of pattern

Index 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
i
String B B C A B C D A B A B C D A B C D A B D E
pattern A B C D A B D
patternIndex 0 1 2 3 4 5 6
j
pattern A B C D A B D
patternIndex 0 1 2 3 4 5 6
next j

After we start matching iStart = 3, we can move to iStart to 4 intuitively.

However, ideally, we can virtual move iStart to 7, comparing i = 9 and j = 2 since we alreay compared i = 7, j = 4 and i = 8, j = 5

Thus, we can create a int[] next denotes that where j should go when j mismatches (s.charAt(i) != p.charAt(j))

Implementation

For pattern string mentioned above, we can have the following table.

pattern string A B C D A B D
patternIndex 0 1 2 3 4 5 6
length of the common prefix and suffix 0 0 0 0 1 2 0
next[] -1 0 0 0 0 1 2

Thus, when a mismatch happens on j = 6, j should be set to next[j] 

Denote k to the length of common suffix and prefix of we already have in string, which indicates that p[0, 1, 2, ..., k - 1] equals p[j - k, j - k + 1, ..., j - 1]

  • If p[k] == p[j], k can be increased by 1. next[j + 1] will be k
  • If not, k = next[k]

next[]

static int[] getNext(final String pattern) {
    final int[] next = new int[pattern.length()];
    if (pattern.length() == 0) {
      return next;
    }
    next[0] = -1;
    int k = -1, j = 0;
    while (j < pattern.length() - 1) {
      if (k == -1 || pattern.charAt(j) == pattern.charAt(k)) {
        ++k;
        ++j;
        next[j] = k;
      } else {
        k = next[k];
      }
    }
    return next;
}

substring()

KMP模板,需要严格遵守 int i = 0, j = 0,并在外面比较 -> 否则,"" and "" 这个case过不去

  public int substring(final String s, final String p) {
    final int[] next = getNext(p);
    final int sLen = s.length(), pLen = p.length();
    int i = 0, j = 0;
    while (i < sLen && j < pLen) {
      if (j == -1 || s.charAt(i) == p.charAt(j)) {
        // when mismatch
        ++i;
        ++j;
      } else {
        // when match
        j = next[j];
      }
    }

    if (j == pLen) {
      return i - j;
    } else {
      return -1;
    }
  }

Reference

http://wiki.jikexueyuan.com/project/kmp-algorithm/define.html

results matching ""

    No results matching ""