A comprehensive utilization method of text features

A text feature engineering technology, applied in the field of artificial intelligence, that addresses problems such as the low efficiency of manual classification, with the stated effects of optimizing hospital workflow, improving accuracy, and improving training results.

Pending Publication Date: 2019-04-26
JINAN INSPUR HIGH TECH TECH DEV CO LTD

AI Technical Summary

Problems solved by technology

In the face of rapidly accumulating data, although manual classification can guarantee a high accuracy rate, compared with the method of machin




Example Embodiment

[0041] The present invention will be further described below in conjunction with specific embodiments.

[0042] A comprehensive application method of text features. In this method, the corpus is processed with a single, fully consistent text preprocessing method, and a TFIDF feature engineering model and a Word2vec feature engineering model are then trained separately, yielding two different vector-matrix representations of the same corpus. The two matrices emphasize different properties: the TFIDF matrix reflects the salience of words, while the Word2vec matrix reflects the relatedness of words in context.
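As an illustration of how word vectors can become one row per document, the following minimal sketch averages the vectors of the words in each document. The source does not state how word vectors are pooled into a document representation, so the averaging step is an assumption, and the names `word_vectors` and `document_vector` and the 2-dimensional values are hypothetical stand-ins for the output of a trained Word2vec model.

```python
# Hypothetical 2-dimensional word vectors; a trained Word2vec model
# would supply real ones (typically with 100 or more dimensions).
word_vectors = {
    "fever": [1.0, 0.0],
    "cough": [0.0, 1.0],
    "wrist": [2.0, 2.0],
}


def document_vector(tokens, vectors, dim):
    """Average the vectors of known tokens; return zeros if none are known.

    Averaging is an assumption made for this sketch, not a step stated
    in the source.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# Already preprocessed (tokenized) toy documents.
docs = [["fever", "cough"], ["wrist"], ["unknown"]]

# One row per document, one column per embedding dimension.
X_w2v = [document_vector(d, word_vectors, dim=2) for d in docs]
```

The resulting matrix has the same number of rows as the corpus has documents, which is what allows it to be aligned row by row with a TFIDF matrix built from the same corpus.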

[0043] Here, TFIDF weights words by frequency, combining the raw term frequency with the inverse document frequency value; Word2vec is used, on the basis of TFIDF, to capture the relatedness of words in context.
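The term frequency and inverse document frequency described above can be sketched as follows. The source does not give the exact formula, so this sketch assumes one common variant: term frequency as the count divided by document length, and inverse document frequency as log(N / df) without smoothing. The function name `tfidf_matrix` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(tokenized_docs):
    """Return (vocabulary, matrix) with one TF-IDF row per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    vocab = sorted({tok for doc in tokenized_docs for tok in doc})
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in tokenized_docs for tok in set(doc))
    matrix = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        row = []
        for term in vocab:
            tf = counts[term] / len(doc) if doc else 0.0
            idf = math.log(n_docs / df[term])
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix


# Already preprocessed (tokenized) toy documents.
docs = [["fever", "cough", "fever"], ["fracture", "wrist"], ["fever", "wrist"]]
vocab, X_tfidf = tfidf_matrix(docs)
```

Each row of `X_tfidf` has one column per vocabulary term, so a word that is frequent in one document but rare across the corpus receives a high weight in that document's row.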

[0044] In an embodiment of the present invention, the text preprocessing method includes word segmentation and stop word removal.
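A minimal sketch of such a preprocessing step is shown below. It assumes whitespace splitting as a stand-in for a real word segmenter, and the stop-word list, the function name `preprocess`, and the sample sentences are hypothetical.

```python
# Hypothetical stop-word list; a real list would be much larger
# and specific to the language of the corpus.
STOP_WORDS = {"the", "a", "of", "is", "and"}


def preprocess(text, stop_words=STOP_WORDS):
    """Segment a text into lowercase tokens and drop stop words."""
    # Whitespace splitting stands in for a real word segmenter (assumption).
    tokens = text.lower().split()
    return [t for t in tokens if t not in stop_words]


corpus = [
    "The patient reports a fever and cough",
    "Fracture of the left wrist",
]

# The same function is applied to every document, so both feature
# models are trained on identical token sequences.
tokenized = [preprocess(doc) for doc in corpus]
```

Running one function over the whole corpus before either model is trained is what keeps the preprocessing consistent between the two feature representations.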

[0045] Then the two vector matrices obtained are simpl


Abstract

The invention discloses a comprehensive application method of text features. The method belongs to the technical field of artificial intelligence and comprises the following steps: processing a corpus with a fully consistent text preprocessing method, then separately training a TFIDF feature engineering model and a Word2vec feature engineering model to obtain two different vector-matrix representations of the same corpus; then simply splicing the two resulting vector matrices into a single vector matrix of higher dimension and training a classification task model on it. By combining the complementary strengths of TFIDF and Word2vec, the method describes both the salience of a word in a document and its contextual relatedness more comprehensively and accurately, improving the accuracy of the subsequently trained classification model.
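The splicing step can be sketched as a row-wise concatenation of the two matrices, followed by training on the combined features. The source does not name a specific classifier, so the nearest-centroid model below is only a stand-in, and all function names, feature values, and labels are hypothetical toy data.

```python
def splice(matrix_a, matrix_b):
    """Concatenate two matrices row by row into one wider matrix."""
    if len(matrix_a) != len(matrix_b):
        raise ValueError("both matrices need one row per document")
    return [list(a) + list(b) for a, b in zip(matrix_a, matrix_b)]


def train_centroids(X, y):
    """Stand-in classifier (assumption): the mean feature vector per label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest in squared distance."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


# Toy feature matrices for the same four documents (hypothetical values).
X_tfidf = [[0.4, 0.0, 0.0], [0.3, 0.1, 0.0], [0.0, 0.0, 0.5], [0.0, 0.1, 0.4]]
X_w2v = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y = ["respiratory", "respiratory", "orthopedic", "orthopedic"]

X = splice(X_tfidf, X_w2v)  # 3 TFIDF columns + 2 Word2vec columns per row
model = train_centroids(X, y)
label = predict(model, [0.35, 0.0, 0.0, 0.8, 0.2])
```

The spliced matrix keeps one row per document and simply widens it, so its dimension is the sum of the two input dimensions. Because the two feature blocks can differ in scale, a practical pipeline might normalize each block before splicing; the source does not say whether this is done.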
