AC自动机
AC自动机是一个经典的多模式串匹配算法,它可以实现对主串的一次扫描来匹配多个模式串的功能。
实现AC自动机就是在Trie树的基础上加上了类似KMP的next数组。
class ACNode{
public int data;
public ACNode[] children = new ACNode[26];
public boolean isEndingChar = false;
public int length = -1;
public ACNode fail;
public ACNode(char data) {
this.data = data;
}
}
AC自动机的构造分成两步:
- 构造Trie树
- 在Trie树的基础上构建fail指针(相当于KMP的next数组)
Trie树的构造可以参考Trie树;
下面是fail指针的构建
public class AC {
private ACNode root = new ACNode('/');
public void buildFailurePointer(){
Queue<ACNode> queue = new LinkedList<ACNode>();
root.fail = null;
queue.add(root);
while(!queue.isEmpty()){
ACNode p = queue.remove();
for(int i = 0; i < 26; i++){
ACNode pc = p.children[i];
if(p.children[i] == null){
continue;
}
if(p == root){
pc.fail = root;
}
else{
ACNode q = p.fail;
while(q != null){
ACNode qc = q.children[pc.data - 'a'];
if(qc != null){
pc.fail = qc;
break;
}
q = q.fail;
}
if(q == null){
pc.fail = root;
}
}
queue.add(pc);
}
}
}
public void match(char[] text){
int n = text.length;
ACNode p = root;
for(int i = 0; i < n; i++){
int index = text[i] - 'a';
while(p.children[index] == null && p != root){
p = p.fail;
}
p = p.children[index];
if(p == null){
p = root;
}
ACNode temp = p;
while(temp != root){
if(temp.isEndingChar){
int pos = i - temp.length + 1;
System.out.println("matched from index " + pos + "; length " +temp.length);
}
temp = temp.fail;
}
}
}
}