Summarization is the algorithmic shortening of input content so that a limited number of words best represents the whole, which makes the text analysis process more efficient.
Lexalytics' semantic analysis software accomplishes this at the sentence level. This means that we pick the most important, representative sentences for the content and then use them for the summary, as in the diagram below:
Figure 1: Text Summarization
Representative sentences are chosen via lexical chaining.
The longest chain is assumed to best represent the content, so the first sentence of the summary is the first sentence of the longest chain. The second-longest chain is assumed to be the next-best, so the second sentence of the summary is the first sentence of the second-longest chain.
This process repeats, chain by chain, until the summary reaches the number of sentences you want.
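The selection loop described above can be sketched in a few lines of Python. This is only an illustration, not Lexalytics' implementation: real lexical chaining links semantically related words (synonyms, hypernyms, and so on), whereas this sketch assumes a much simpler stand-in in which a "chain" is just the list of sentences that repeat the same content word. The names `STOPWORDS`, `split_sentences`, `build_chains`, and `summarize` are hypothetical, as is the choice to skip a chain whose first sentence has already been selected.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-picked stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "was", "were", "is", "are", "on", "at", "of", "to", "in"}

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def build_chains(sentences):
    """Simplified 'chains': map each content word to the indices of the sentences containing it.

    Assumption: exact word repetition stands in for true lexical relatedness.
    """
    chains = defaultdict(list)
    for i, sentence in enumerate(sentences):
        # set() so a word counts once per sentence; sorted() for deterministic iteration
        for word in sorted(set(re.findall(r"[a-z']+", sentence.lower()))):
            if word not in STOPWORDS and len(word) > 2:
                chains[word].append(i)
    return chains

def summarize(text, n):
    """Pick up to n sentences: the first sentence of each chain, longest chain first."""
    sentences = split_sentences(text)
    chains = build_chains(sentences)
    # Longest chain first; ties broken by earliest first sentence, then alphabetically.
    ordered = sorted(chains.items(), key=lambda kv: (-len(kv[1]), kv[1][0], kv[0]))
    picked = []
    for _word, indices in ordered:
        if len(picked) >= n:
            break
        first = indices[0]
        if first not in picked:  # assumption: skip chains whose first sentence is already used
            picked.append(first)
    return [sentences[i] for i in picked]

text = ("The cat sat on the mat. The cat chased a mouse. "
        "A dog barked at the cat. The dog ran home. The weather was sunny.")
print(summarize(text, 2))
# ['The cat sat on the mat.', 'A dog barked at the cat.']
```

Here the "cat" chain spans three sentences and the "dog" chain spans two, so the summary opens with the first "cat" sentence and continues with the first "dog" sentence. Note that the summary follows chain rank rather than document order, which matches the description above.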