The present invention belongs to the technical field of the Internet, and in particular relates to a web crawler method and system based on improved PageRank. The method includes: (1) crawling web pages; (2) obtaining the link relationships between web pages; (3) obtaining the relationship matrix; (4) obtaining the initial probability matrix; and (5) PageRank calculation, in which, once the relationship matrix, the initial probability matrix, and the damping coefficient have been obtained, the PR value of each web page is calculated iteratively, and the iteration terminates when the probability matrix converges. The method provided by the present invention addresses the web page deception problem in which some pages repeat keywords to improve their search rankings in web crawlers. The web crawler system provided by the present invention is easy to use, stores, converts, and calculates data in readily understandable forms, and can efficiently and quickly calculate the weight of each web page. It also includes a relatively complete crawler program that demonstrates practical applications of the crawler, including picture and file download, Baidu Encyclopedia search, video playback, web page relationship visualization, and other functions.
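Steps (3) to (5) can be illustrated with a minimal Python sketch. It assumes the standard PageRank formulation, since the abstract does not specify what the improvement consists of. The function name `pagerank`, the damping coefficient of 0.85, the convergence tolerance, the uniform initial probabilities, and the uniform treatment of pages without outgoing links are all assumptions made for illustration, not details taken from the invention.

```python
def pagerank(links, damping=0.85, tol=1e-8, max_iter=1000):
    """Standard PageRank by power iteration (illustrative sketch only).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its PR value.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    if n == 0:
        return {}
    index = {page: i for i, page in enumerate(pages)}

    # Relationship matrix: m[i][j] is the probability of moving from page j to page i.
    m = [[0.0] * n for _ in range(n)]
    for src in pages:
        targets = set(links.get(src, ()))
        j = index[src]
        if targets:
            for target in targets:
                m[index[target]][j] = 1.0 / len(targets)
        else:
            # Assumption: a page with no outgoing links jumps to any page uniformly.
            for i in range(n):
                m[i][j] = 1.0 / n

    # Initial probability matrix (here a vector): every page equally likely.
    pr = [1.0 / n] * n

    for _ in range(max_iter):
        new_pr = [
            (1.0 - damping) / n + damping * sum(m[i][j] * pr[j] for j in range(n))
            for i in range(n)
        ]
        # Stop once the probabilities no longer change (L1 distance below tol).
        delta = sum(abs(a - b) for a, b in zip(new_pr, pr))
        pr = new_pr
        if delta < tol:
            break

    return dict(zip(pages, pr))


# Hypothetical three-page link graph used only as an example.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

Because each PR value here depends only on the link structure and not on page text, repeating keywords on a page does not raise its score in this sketch, which is the general property that link-based ranking relies on.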