当前位置:网站首页>Code analysis of AC automata template
Code analysis of AC automata template
2022-04-21 10:56:00 【Jinze_ L】
I still watch it today AC automata , Finally understand the template code , To sum up :
AC automata , Applied to pattern matching problem , In particular, the problem of long text and multiple templates is better than KMP Algorithm and dictionary tree have more advantages . The method is to build all templates into a large state transition diagram , Instead of creating a diagram for each template .KMP The state transition graph of the algorithm is a linear string + Composed of mismatched edges , and AC Automata is a dictionary tree + Composed of mismatched edges . Dictionary trees have great advantages in judging whether a word belongs to a dictionary , But the dictionary tree is not enough to judge how many words in a dictionary a text contains , Because each character of the text string must go through the dictionary tree from the following node of the dictionary tree ( Very similar to the violent method used in pattern string matching ), You should use AC automata .
The template code is as follows :
#include<cstdio>
#include<cstring>
#include<queue>
using namespace std;
const int maxnode=11000;
const int sigma_size=26;
struct AC_Automata
{
int ch[maxnode][sigma_size];
int val[maxnode]; // The end node of each string has a non - 0 Of val
int f[maxnode]; // fail function
int last[maxnode]; // last[i]=j surface j The word represented by the node is i The suffix of the node word , And j Nodes are word nodes
int sz;
// initialization 0 Information about the root node
void init()
{
sz=1;
memset(ch[0],0,sizeof(ch[0]));
val[0]=0;
}
//insert Responsible for construction ch And val Array
// Insert string ,v Must not 0 Represents a word node
void insert(char *s,int v)
{
int n=strlen(s),u=0;
for(int i=0; i<n; i++)
{
int id=s[i]-'a';
if(ch[u][id]==0)
{
ch[u][id]=sz;
memset(ch[sz],0,sizeof(ch[sz]));
val[sz++]=0;
}
u=ch[u][id];
}
val[u]=v;
}
//getFail Function is responsible for constructing f and last Array
void getFail()
{
queue<int> q;
last[0]=f[0]=0;
for(int i=0; i<sigma_size; i++)
{
int u=ch[0][i];
if(u)
{
f[u]=last[u]=0;
q.push(u);
}
}
/* then , Is to see how to use BFS Find the failure pointers of all nodes .
1) For the root node root Failed pointer to , We point it directly to NULL, For all child nodes under the root node , The failure pointer must point to root Of , Because when none of the characters match , Naturally, there is no shorter prefix that can match it ;
2) Insert the node with the failed pointer into the queue ;
3) Pop up one node at a time now, Ask the child node corresponding to each character of it , For the sake of convenience , We will now Of i The sub node number is now->next[i]:
a) If now->next[i] by NULL, It will be now->next[i] Point to now Of the failed pointer i Number sub node , namely now->next[i] = now->fail->next[i];
b) If now->next[i] It's not equal to NULL, You need to construct now->next[i] Failed pointer to , because a) The operation of , We know now There must be a failure pointer i Number sub node , namely now->fail->next[i], So we're going to now->next[i] The failed pointer points to it , namely now->next[i]->fail = now->fail->next[i];
4) repeat 2) Until the queue is empty ;*/
while(!q.empty())// Press BFS Sequential calculation fail
{
int r=q.front(); q.pop();
for(int i=0; i<sigma_size; i++)
{
int u=ch[r][i];
if(u==0)continue;
q.push(u);
int v=f[r];
while(v && ch[v][i]==0) v=f[v];
f[u]= ch[v][i];
last[u] = val[f[u]]?f[u]:last[f[u]];
}
}
}
// Recursive printing and node i The node number of the template word with the same suffix
// Before entering this function, ensure that val[i]>0
void print(int i)
{
if(i)
{
printf("%d\n",i);
print(last[i]);
}
}
// stay s To find out in the Which template words appear
void find(char *s)
{
int n=strlen(s),j=0;
for(int i=0; i<n; i++)
{
int id=s[i]-'a';
while(j && ch[j][id]==0) j=f[j];
j=ch[j][id];
if(val[j]) print(j);
else if(last[j]) print(last[j]);
}
}
};
AC_Automata ac; I'll continue to look at the topic tomorrow , The question is really difficult ...
版权声明
本文为[Jinze_ L]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204211048484838.html
边栏推荐
- Six practices of Windows operating system security attack and defense
- C语言中使用scanf函数时应注意的问题
- 训练总结报告
- Modern Actuarial Risk Theory 07: risk measurement
- IDEA和PyCharm启动时进入欢迎界面
- Go环境的搭建
- Wonderful preview of the event 𞓜 Vitas + Long Zhi: building the "number one player" in the game industry digitally
- Use of laravel redis
- println输入和toString方法的重写
- 伦敦金在哪里开户安全?
猜你喜欢

基于润和大禹开发板的导购系统项目方案
Go环境的搭建

Mysql 002 DDL

Bucket sorting ← C language implementation

手把手教你:基于深度学习的滚动轴承故障诊断

pgpool-II 4.3 中文手册 - 入门教程

Android学习 ① Android连接不上Mysql数据库的多种原因以及解决方式

Why programming is so difficult: starting from the deduction of stored value card

A collection of econometric models and empirical Stata codes, with top publication examples

8通道CAN FD,更强大的数据记录仪GL3400
随机推荐
JS 實現置頂輸入文本框
[非线性控制理论]1_Lyapunov直接方法
【acwing】1459. Cow Gymnastics (simulation, thinking)
基于润和大禹开发板的导购系统项目方案
js---call,apply,bind
伦敦金在哪里开户安全?
Go环境的搭建
[nonlinear control theory] 1_ Lyapunov direct method
你不知道的 parseInt?
手把手教你:基于深度学习的滚动轴承故障诊断
TypeError: The view function did not return a valid response. The function either returned None 的解决
GO语言反射机制
GO 使用channel进行同步 (channel 1)
基于润和大禹开发板的导购系统路线详解
L1-045 cosmic invincible greeting (5 points)
Tamigou enterprise M & a platform, list of legal service providers!
C#调用delphi dll接口问题
Runhe Dayu learning program
PHP文件包含:require,require_once;include,include_once
Postman setting environment variables is simple and practical