Method for clustering network-based short texts
A clustering method and text clustering technology, applied in text database clustering/classification, unstructured text data retrieval, special data processing applications, etc. It addresses problems such as the small number of clustering studies, unsatisfactory clustering results, and a value that is very sensitive, and achieves high clustering accuracy, an ideal clustering effect, and strong practicability.
Example Embodiment
[0062] Example:
[0063] 1. Experiment on weight calculation with the TFIDF formula during preprocessing.
[0064] User comments obtained from Zhongguancun Online serve as the experimental data set. The traditional TFIDF formula is used first for the weight calculation. The experimental data set is segmented with ICTCLAS, the word segmentation software of the Chinese Academy of Sciences. Table 1 below shows part of the experimental texts after stop-word removal.
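[0065] [Table 1: part of the experimental texts after stop-word removal; table contents not present in the source]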
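The stop-word removal step that produces Table 1 can be sketched as follows. This is a minimal illustration only: it assumes the text has already been segmented into tokens (the source uses ICTCLAS for that, which is not called here), and the function name `remove_stop_words`, the sample tokens, and the stop-word list are all hypothetical and not taken from the source.

```python
def remove_stop_words(tokens, stop_words):
    """Return the tokens that are not in the stop-word set, preserving order."""
    return [t for t in tokens if t not in stop_words]


# Hypothetical, already-segmented comment and stop-word list (not from the source).
segmented = ["this", "phone", "is", "very", "good", "and", "the", "screen", "is", "clear"]
stop_words = {"this", "is", "very", "and", "the"}

filtered = remove_stop_words(segmented, stop_words)
print(filtered)
```

Using a set for the stop words keeps each membership check constant-time, which matters when the comment collection is large.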
[0066] We now take the first text in Table 1, after stop-word removal, and use the original TFIDF formula to calculate the weights of its feature items. The results are shown in Table 2 below.
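[0067] [Table 2: weights of the feature items of the first text under the original TFIDF formula; table contents not present in the source]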
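The weight calculation for the feature items of one text can be sketched as below. The source does not spell out its TFIDF formula here, so this sketch assumes one common textbook form: term frequency (count divided by text length) multiplied by `log(N / df)`, where `N` is the number of texts and `df` is the number of texts containing the term. The function name `tfidf_weights` and the toy corpus are hypothetical and are not the experimental data.

```python
import math
from collections import Counter


def tfidf_weights(doc_tokens, corpus):
    """Weight each term of doc_tokens as tf * log(N / df).

    Assumes doc_tokens is one of the texts in corpus, so every term
    has a document frequency of at least 1.
    """
    n_docs = len(corpus)
    # Document frequency: number of texts that contain each term.
    df = Counter()
    for doc in corpus:
        df.update(set(doc))
    tf = Counter(doc_tokens)
    total = len(doc_tokens)
    return {
        term: (count / total) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }


# Hypothetical toy corpus of segmented, stop-word-free comments.
corpus = [
    ["phone", "screen", "clear", "phone"],
    ["phone", "battery", "weak"],
    ["phone", "price", "high"],
]

weights = tfidf_weights(corpus[0], corpus)
for term, weight in sorted(weights.items()):
    print(f"{term}: {weight:.4f}")
```

In this toy corpus, "phone" occurs in every text, so its inverse document frequency is `log(3/3) = 0` and its weight is zero even though it is the most frequent term in the first text, while "screen" and "clear" receive positive weights.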
[0068] The number of texts containing each feature item of the first text shows that the item contained in the most texts is not necessarily the most important one. Although some words are contained in a large number of texts, they are not important keywords for distinguishing texts. It can be seen that