Method and system for eliminating repeated pages from favorite webpages
A technique for eliminating duplicate web pages from a collection of favorite web pages, applied in the field of Internet information. It addresses the problems of increased time complexity, sensitivity to changes in punctuation character strings, and low judgment accuracy, so as to improve accuracy, reduce time complexity, and improve algorithm efficiency.
Example Embodiment
[0046] Figure 1 shows a schematic flowchart of a method for eliminating duplicate webpages from favorite webpages according to an embodiment of the present invention. As shown in the figure, the method includes the following steps:
[0047] S100: Obtain the favorites folder in which favorite webpages are saved, and obtain the source code of a favorite webpage from that folder;
[0048] S200: Extract at least part of the body content of the webpage according to the source code;
[0049] S300: Calculate the similarity between the at least part of the body content and the corresponding content of a previously favorited webpage;
[0050] S400: When the similarity is greater than or equal to a preset similarity, delete the webpage corresponding to the at least part of the body content.
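Steps S100 to S400 can be illustrated with a minimal Python sketch. It is not the patented implementation: the text above does not specify how the body content is extracted or which similarity measure is used, so the sketch assumes that body content is approximated by all text outside script and style tags, and that similarity is the ratio from the standard library's difflib.SequenceMatcher. The names extract_body_text, similarity, and deduplicate_favorites, the threshold of 0.9, and the representation of the favorites folder as a list of (url, source code) pairs are all hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _BodyText(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_body_text(source_code):
    """S200 (assumed approach): approximate the body content by the page text."""
    parser = _BodyText()
    parser.feed(source_code)
    parser.close()
    return " ".join(parser.chunks)


def similarity(a, b):
    """S300 (assumed measure): SequenceMatcher ratio in the range [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def deduplicate_favorites(favorites, threshold=0.9):
    """S100-S400 over a list of (url, source_code) pairs in favoriting order.

    Returns (kept_urls, removed_urls). A page is removed when its extracted
    text is at least `threshold` similar to a page that was kept earlier.
    """
    kept = []      # (url, extracted text) of pages retained so far
    removed = []   # urls judged to be duplicates
    for url, source_code in favorites:
        text = extract_body_text(source_code)
        if any(similarity(text, prev) >= threshold for _, prev in kept):
            removed.append(url)   # S400: similarity reached the preset value
        else:
            kept.append((url, text))
    return [url for url, _ in kept], removed
```

Comparing each new page against every page already kept takes time quadratic in the number of favorites, which is acceptable for a sketch but is the kind of cost the method aims to reduce.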
[0051] Extracting at least part of the body content of the webpage according to the source code includes the following steps:
[0