A method and
system is described for modeling the content evolution of an accessed document and predicting an associated outcome for said document. The
system accesses a document but can further receive additional tags,
metadata, or related information that characterizes the nature of such text collection. The invention applies various
processing to separate the document into elements and performs semantic modeling to create a narrative model that describes the evolution of the contents of the elements in terms of their respective sequencing. This
system then uses a set of training documents with target values assigned to them to predict an associated outcome for the accessed document. The most relevant subset of a
training set can be selected by matching
metadata information that characterize the accessed document and a collection of
metadata that characterize other broad document sets. Such characterization is done using graph partitioning or other
community detection methods from metadata information that characterize the document sets and relations between multiple sets of such documents. The outcome of the method may apply to prediction of economic value of a events described by the accessed document, success measures of the
document quality, or discovery of related content with similar associated outcome to the accessed document.