The present invention discloses a CN-DBpedia-based entity identification and linking system and method. The system comprises an entity linking module and an entity identification module; the entity linking module comprises a synonym matching unit and an
entity linking unit; and the entity identification module comprises a tokenizer, a word probability calculation unit, and an entity discriminating unit. According to the technical scheme of the present invention, a semantic relationship between entities and words is constructed, so that a relationship with an entity can be mined even from limited context; a
machine learning-based entity recognition algorithm is combined with an unsupervised word segmentation algorithm, so that the rationality of entity name division is considered from a global perspective, the vocabulary space of word segmentation is expanded, and the word formation probability of entity words can be calculated by a more reasonable algorithm; and, by linking first and then identifying, the semantic information of the text is fully utilized in entity identification, so that better word segmentation and entity identification are realized.
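
By way of illustration only, the linking-first-then-identification flow described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the synonym table, the entity-word associations, the word probabilities, the `LINK_WEIGHT` constant, and the function names `link` and `identify` are all hypothetical, and whitespace-separated English tokens stand in for the units a Chinese tokenizer would produce. Linking is approximated by context-word overlap, and identification by a dynamic-programming segmentation in which linked spans enter the vocabulary with an assumed word formation probability.

```python
import math

# Hypothetical toy data; a real system would derive these from CN-DBpedia.
SYNONYMS = {
    "apple": ["Apple_Inc", "Apple_(fruit)"],
    "big apple": ["New_York_City"],
}
ENTITY_WORDS = {  # assumed entity-to-word semantic associations
    "Apple_Inc": {"iphone", "company", "shares"},
    "Apple_(fruit)": {"eat", "juice", "tree"},
    "New_York_City": {"city", "visit", "manhattan"},
}
WORD_PROB = {"i": 0.05, "visit": 0.01, "the": 0.08,
             "big": 0.01, "apple": 0.005, "city": 0.01}
UNKNOWN_PROB = 1e-6   # assumed probability for out-of-vocabulary tokens
LINK_WEIGHT = 0.01    # assumed scaling from link score to word formation probability


def link(tokens):
    """Synonym matching, then entity linking by context-word overlap."""
    context = set(tokens)
    links = {}
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            candidates = SYNONYMS.get(" ".join(tokens[i:j]), [])
            if not candidates:
                continue
            best = max(candidates, key=lambda e: len(ENTITY_WORDS[e] & context))
            score = len(ENTITY_WORDS[best] & context)
            if score > 0:  # keep only mentions supported by the context
                links[(i, j)] = (best, score)
    return links


def span_prob(tokens, i, j, links):
    """Word formation probability of tokens[i:j], or None if not a valid word."""
    probs = []
    if j - i == 1:
        probs.append(WORD_PROB.get(tokens[i], UNKNOWN_PROB))
    if (i, j) in links:
        probs.append(min(1.0, LINK_WEIGHT * links[(i, j)][1]))
    return max(probs) if probs else None


def identify(tokens, links):
    """Globally best segmentation; linked segments are reported as entities."""
    n = len(tokens)
    best = [(-math.inf, None)] * (n + 1)  # (log probability, start index)
    best[0] = (0.0, None)
    for j in range(1, n + 1):
        for i in range(j):
            p = span_prob(tokens, i, j, links)
            if p is None:
                continue
            score = best[i][0] + math.log(p)
            if score > best[j][0]:
                best[j] = (score, i)
    segments, j = [], n
    while j > 0:  # backtrack from the end of the text
        i = best[j][1]
        segments.append((i, j))
        j = i
    segments.reverse()
    words = [" ".join(tokens[i:j]) for i, j in segments]
    entities = [(" ".join(tokens[i:j]), links[(i, j)][0])
                for i, j in segments if (i, j) in links]
    return words, entities
```

In this sketch, calling `link` on the tokens of "i visit the big apple city" and passing the result to `identify` keeps "big apple" as a single word linked to the hypothetical entity `New_York_City`, because the context words support that link and the linked span outscores the two separate words in the global segmentation.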