Time series based topic development clustering analysis system and method

The technology combines time series analysis with cluster analysis and applies to text database clustering/classification, other database retrieval, and network data retrieval. It addresses problems such as decreased accuracy, distance measures that cannot reflect sequence similarity, and disorder.

Active Publication Date: 2018-08-17
COMMUNICATION UNIVERSITY OF CHINA

AI Technical Summary

Benefits of technology

The technology clusters topics by how their popularity develops over time. It turns each topic's cumulative reading volume into a topic heat time series and compares topics using the S-Euc and S-DTW distances. According to the abstract, the system and method achieve high precision and a good clustering effect.

Problems solved by technology

The patent addresses the problem of comparing how topics, such as web pages and news articles, develop over time. Traditional approaches compute distances point by point between time series, which lowers accuracy and cannot reflect the similarity between sequences. Dynamic time warping (DTW) distance was designed to compare sequences of different lengths without discarding significant parts of the original data.
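To illustrate why a warping distance can compare sequences of different lengths, here is a minimal sketch of the classic dynamic time warping recurrence. It is the textbook DTW with an absolute-difference cost, not the patent's S-DTW, whose definition does not appear in this excerpt. The function name `dtw_distance` and the sample series are assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance (absolute-difference cost); NOT the patent's S-DTW."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Hypothetical series: the second is a time-stretched copy of the first.
short = [0, 2, 4, 2, 0]
stretched = [0, 0, 2, 4, 4, 2, 0]
print(dtw_distance(short, stretched))  # 0.0: the shapes match after warping
```

A point-by-point distance is not even defined for these two series because their lengths differ, whereas the warping alignment treats them as identical in shape.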



Embodiment Construction

[0042] In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It may be evident, however, that these embodiments may be practiced without these specific details.

[0043] Specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0044] Figure 3 is a flow chart of the time-series-based topic development cluster analysis method of the present invention. As shown in the figure, the cluster analysis method includes:

[0045] In step S310, topics are collected from the Internet and microblogs with a predetermined collection period T_0. Each topic includes a topic URL, a topic name, and a time series of cumulative reading volume, wherein the time series of cumulative re


Abstract

The invention provides a time-series-based topic development clustering analysis system and method. The method comprises: collecting topics to form a cumulative reading volume time series; differencing the time series to obtain a topic heat time series; determining whether each topic is in its recession period and, if not, continuing to collect it; and, if so, calculating the S-Euc and S-DTW between the topics and clustering all topics. The system comprises: a data collection unit; a data processing unit that applies forward differencing to the cumulative reading volume time series, determines whether each topic is in its recession period, stores topics that are not in the recession period in a first topic repository, and stores topics that are in the recession period in a second topic repository; a time series distance calculation unit that calculates the S-Euc and the S-DTW between the topics; and a topic clustering unit that clusters all the topics. The system and method have high precision and a good clustering effect.
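The differencing and recession-check steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the excerpt does not give the patent's recession criterion or its S-Euc definition, so `in_recession` (with its `window` and `fraction` parameters) is a hypothetical rule invented for this example, and `euclidean_distance` is the plain Euclidean distance rather than S-Euc. The sample data are also made up.

```python
def forward_difference(cumulative):
    """Forward difference: readings gained in each collection period."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def in_recession(heat, window=3, fraction=0.1):
    """HYPOTHETICAL rule, not from the patent: the topic counts as being in
    recession when its last `window` heat values all fall below
    `fraction` of its peak heat."""
    if len(heat) < window:
        return False
    threshold = fraction * max(heat)
    return all(value < threshold for value in heat[-window:])


def euclidean_distance(x, y):
    """Plain Euclidean distance between equal-length series (not S-Euc)."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return sum((p - q) ** 2 for p, q in zip(x, y)) ** 0.5


# Hypothetical cumulative reading volume sampled once per collection period.
cumulative = [0, 10, 50, 120, 150, 155, 157, 158]
heat = forward_difference(cumulative)
print(heat)                # [10, 40, 70, 30, 5, 2, 1]
print(in_recession(heat))  # True: the last three values are below 10% of the peak
```

In a pipeline shaped like the one the abstract describes, topics for which the recession check is false would keep being collected, and only finished topics would be passed on to the distance calculation and clustering stages.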


Application Information

Owner COMMUNICATION UNIVERSITY OF CHINA