Regular expression matching method and system

A regex matching method and system, applied in the field of data processing, address the low matching performance of existing matching methods, which consume too much time and too many storage resources. The approach reduces the time consumed by data loading in the matching process.

Active Publication Date: 2011-12-01
HUAWEI TECH CO LTD

AI Technical Summary

Benefits of technology

The technology groups regular expressions (regexes) by a string they share, known as the generic string, and compiles each group into its own DFA. A data stream is first matched against the generic strings, and only the DFA correlated with a matched generic string is used for full regex matching. This shortens the data loading process, decreases the time consumed by data loading, and improves matching performance.

Problems solved by technology

Existing regex matching methods consume too much time and too many storage resources, which results in low matching performance. Large memory requirements also make the matching structures difficult to integrate onto logic chips without sacrificing functionality.


Examples

Embodiment 1

[0030]A regex matching method is provided in an embodiment of the present invention to shorten the time consumed by data loading in the regex matching process and improve the matching performance. As shown in FIG. 2, the method includes the following steps:

[0031]201. Sort multiple regexes into several regex groups, where all regexes in one regex group include a common string, which is known as a generic string.

[0032]202. Compile each regex group into a DFA, and set up a correlation between the generic string of each regex group and the DFA.

[0033]203. Match to-be-matched data streams with the generic string respectively, and use the matched generic string as a matched string.

[0034]204. Obtain a DFA corresponding to the matched string.

[0035]205. Perform regex matching for the to-be-matched data streams according to the DFA, and output a matching result.
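Steps 201 to 205 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the rule set `GROUPS` and the function names are hypothetical, and a compiled alternation from Python's backtracking `re` engine stands in for the per-group DFA, since the standard library does not expose a DFA compiler.

```python
import re

# Hypothetical rule set (step 201 already applied):
# generic string -> regexes that all contain that string.
GROUPS = {
    "ace": [r"ace\d+", r"x+ace"],
    "login": [r"login=\w+"],
}


def compile_groups(groups):
    """Step 202: compile each group and correlate it with its generic string.

    Assumption: a compiled alternation stands in for the per-group DFA.
    """
    return {
        generic: re.compile("|".join(f"(?:{p})" for p in patterns))
        for generic, patterns in groups.items()
    }


def match_stream(data, table):
    """Steps 203-205: prefilter on generic strings, then run only matched groups."""
    results = []
    for generic, matcher in table.items():
        if generic in data:              # step 203: generic string found in the data
            # step 204: `matcher` is the structure correlated with that string
            hit = matcher.search(data)   # step 205: full regex matching
            if hit:
                results.append((generic, hit.group(0)))
    return results


table = compile_groups(GROUPS)
print(match_stream("id=ace42", table))
```

In this sketch the group keyed by "login" is never consulted for the input "id=ace42", which mirrors the idea that only the DFA for a matched generic string needs to be obtained.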

[0036]The string mentioned herein refers to the meaning represented by a combination of printable characters and non-printable characters

Embodiment 2

[0038]A regex matching method is provided in an embodiment of the present invention to shorten the time consumed by data loading in the regex matching process and improve the matching performance. As shown in FIG. 3, the method includes the following steps:

[0039]301. Sort multiple regexes into several regex groups when the matching condition includes the multiple regexes, where all regexes in one regex group include a common string, and this string is used to differentiate regex groups and is known as a generic string.

[0040]In step 301, any regexes that include the same string are sorted into a group, regardless of the string itself.

[0041]For example, if a string indicating that “data needs to include ace” exists in two regexes, the two regexes are sorted into a group.
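The grouping in step 301 can be illustrated with a short Python sketch. It assumes the candidate generic strings are supplied up front and tests for them with a naive substring check on the pattern text; the function name `group_by_generic_string` and the sample patterns are hypothetical. Deciding which literal strings a regex truly requires would need actual regex parsing, which this sketch does not attempt.

```python
from collections import defaultdict


def group_by_generic_string(patterns, candidates):
    """Assign each regex to the first candidate string found in its pattern text.

    Assumption: a plain substring test on the pattern source is enough to
    decide that a regex "includes" a string. Patterns with no candidate
    are returned separately.
    """
    groups = defaultdict(list)
    ungrouped = []
    for pattern in patterns:
        for generic in candidates:
            if generic in pattern:
                groups[generic].append(pattern)
                break
        else:
            # No candidate string appears in this pattern.
            ungrouped.append(pattern)
    return dict(groups), ungrouped


patterns = [r"ace\d+", r"x+ace", r"login=\w+", r"\d{3}"]
groups, ungrouped = group_by_generic_string(patterns, ["ace", "login"])
print(groups)
print(ungrouped)
```

Here the two regexes that contain "ace" land in the same group, matching the example in [0041].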

[0042]Further, if the number of regexes that include the same string exceeds a preset threshold, these regexes are sorted into multiple groups, each group consisting of fewer regexes than the preset threshold. Therefore, i

Embodiment 3

[0072]A regex matching system is provided in an embodiment of the present invention to shorten the time consumed by data loading in the regex matching process and improve the matching performance. As shown in FIG. 4, the system includes:

[0073]a grouping and compiling device A, configured to: sort multiple regexes into several regex groups, where all regexes in one regex group include a common string, which is known as a generic string; and compile each regex group into a DFA, and set up a correlation between the generic string of each regex group and the DFA; and

[0074]a matching device B, configured to: match to-be-matched data streams with the generic string respectively, and use the matched generic string as a matched string; obtain a DFA corresponding to the matched string; and perform regex matching for the to-be-matched data streams according to the DFA, and output a matching result.
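The split between device A and device B might be modelled as two small Python classes. This is a sketch under assumptions: the class names, the `loaded` bookkeeping list, and the sample groups are hypothetical, and a compiled alternation from Python's `re` module (a backtracking engine) stands in for each DFA. The `loaded` list only records which per-group matchers were actually consulted, to make the reduced loading visible.

```python
import re
from dataclasses import dataclass, field


class GroupingAndCompilingDevice:
    """Device A: compiles each regex group and keys it by its generic string."""

    def build(self, groups):
        # Assumption: one compiled alternation per group stands in for a DFA.
        return {
            generic: re.compile("|".join(f"(?:{p})" for p in patterns))
            for generic, patterns in groups.items()
        }


@dataclass
class MatchingDevice:
    """Device B: prefilters on generic strings, then runs the correlated matcher."""

    table: dict
    loaded: list = field(default_factory=list)  # generic strings whose matcher was used

    def match(self, data):
        results = []
        for generic, matcher in self.table.items():
            if generic not in data:
                continue                  # matcher for this group is never touched
            self.loaded.append(generic)   # record which matcher was obtained
            hit = matcher.search(data)
            if hit:
                results.append(hit.group(0))
        return results


table = GroupingAndCompilingDevice().build(
    {"ace": [r"ace\d+"], "login": [r"login=\w+"]}
)
device_b = MatchingDevice(table)
print(device_b.match("user ace7"), device_b.loaded)
```

Keeping the correlation table as the only hand-off between the two classes reflects the description: device A produces the generic-string-to-DFA correlation, and device B consumes it.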

[0075]As shown in FIG. 5, the grouping and compiling device A includes:

[0076]a grouping module 501,

Abstract

The present invention discloses a regex matching method and system, and relates to the field of computer technologies. The method includes: sorting multiple regexes into several regex groups, where all regexes in one regex group include a common string, which is known as a generic string; compiling each regex group into a DFA, and setting up a correlation between the generic string of each regex group and the DFA; matching to-be-matched data streams with the generic string respectively, and using the matched generic string as a matched string; obtaining a DFA corresponding to the matched string; and performing regex matching for the to-be-matched data streams according to the DFA, and outputting a matching result. The embodiments of the present invention shorten the data loading process, decrease the time consumed by data loading, and improve the matching performance.

Application Information

Owner HUAWEI TECH CO LTD