Character string matching method and device

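A character string matching technique, applied in fields such as electrical digital data processing, special data processing applications, and instruments, that addresses the problem of large string matching error and improves accuracy.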

Status: Inactive
Publication Date: 2018-07-06
POTEVIO INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

Measuring string matching only by the similarity calculated from the edit distance gives a high accuracy for strings that are short and contain no numbers or unique names. For strings that contain numbers, that differ greatly in length, or that contain unique names, the error is too large.
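The weakness described above can be illustrated with a minimal sketch. It assumes the plain Levenshtein edit distance and a similarity normalized by the longer string's length; neither the normalization nor the function names `levenshtein` and `edit_similarity` come from the source, and the sample strings are invented for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Two strings that differ only in a number score as highly similar,
# although they may name different things.
print(edit_similarity("Room 101", "Room 102"))  # 0.875
```

Under this assumed normalization, a single differing digit costs only one edit, so the numeric difference barely lowers the score.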


Embodiment Construction

[0028] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0029] Figure 1 is a schematic flow chart of a character string matching method provided by an embodiment of the present invention. As shown in Figure 1, the method includes:

[0030] Step 101: Obtain a character string to be matched and at least one key character string corresponding to the character string to be matched, and calculate the ma...

Abstract

An embodiment of the invention provides a character string matching method and device. The method comprises the following steps: obtaining a character string to be matched and at least one key character string corresponding to the character string to be matched, and calculating the matching value of the key character string, wherein the character string to be matched comprises a first character string and a second character string; calculating the maximum prefix matching character string length of the first character string and the second character string; according to the maximum prefix matching character string length, using a preset rule to calculate a first edit distance between the first character string and the second character string; and, according to the first edit distance and the matching value, obtaining the similarity between the first character string and the second character string. The device is used to execute the method. By calculating the matching value of the key character string, calculating the first edit distance with the preset rule, and deriving the similarity of the two character strings from the first edit distance and the matching value, the embodiment improves the accuracy of character string matching.
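The second step of the abstract, computing the maximum prefix matching length of the two strings, can be sketched as follows. This is a minimal illustration that assumes the length in question is the ordinary longest common prefix; the function name `common_prefix_length` is hypothetical. The preset rule that turns this length into the first edit distance is not given in this text, so it is not shown.

```python
def common_prefix_length(first: str, second: str) -> int:
    """Count leading characters shared by both strings (assumed meaning of
    'maximum prefix matching character string length')."""
    length = 0
    for a, b in zip(first, second):
        if a != b:
            break
        length += 1
    return length


print(common_prefix_length("MARTHA", "MARHTA"))  # 3
```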

Description

Technical field

[0001] The embodiments of the present invention relate to the technical field of text classification processing, and in particular to a string matching method and device.

Background

[0002] The Jaro-Winkler algorithm calculates the similarity between two strings and is currently the mainstream algorithm for measuring the degree of string matching.

[0003] The Jaro-Winkler algorithm is calculated as shown in formula (1):

[0004] $W_{ij} = D_{ij} + l\,p\,(1 - D_{ij})$ (1)

[0005] Where: $W_{ij}$ is the edit distance between the strings to be matched $S_i$ and $S_j$; $l$ is the length of the common prefix of $S_i$ and $S_j$, with an upper limit of 4; $p$ is a constant scaling factor, $p = 0.1$; and $D_{ij}$ is the Jaro distance, a type of edit distance.

[0006] $D_{ij}$ is calculated as shown in formula (2):

[0007] [formula (2) is not present in the source text]

[0008] Where $m_{ij}$ is the number of characters of the strings to be matched $S_i$ and $S_j$ matched in...
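Formula (1) can be sketched in code as follows. Because formula (2) does not survive in this text, the `jaro` function below uses the standard textbook Jaro definition (matching window of half the longer length minus one, transpositions counted in pairs), which is an assumption and may differ in detail from the source's formula (2). The function names are hypothetical; `p = 0.1` and the prefix cap of 4 follow paragraph [0005].

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity (assumed stand-in for formula (2))."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    m = 0  # number of matching characters
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2  # transpositions
    return (m / len1 + m / len2 + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Formula (1): W = D + l * p * (1 - D), with l capped at max_prefix."""
    d = jaro(s1, s2)
    l = 0
    for a, b in zip(s1, s2):
        if a != b or l == max_prefix:
            break
        l += 1
    return d + l * p * (1 - d)


print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

For "MARTHA" and "MARHTA" the sketch finds six matches and one transposition, giving a Jaro value of about 0.9444, and the shared three-character prefix raises the result of formula (1) to about 0.9611.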

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/33, G06F16/90344
Inventor: 闫继东
Owner: POTEVIO INFORMATION TECH CO LTD