Class Section

java.lang.Object
com.lexalytics.salience.Section

public class Section
extends java.lang.Object

The Section object provides statistical information that Salience determines about the document content, including the data needed for term frequency analysis.

  • Constructor Summary

    Constructors 
    Constructor Description
    Section​(int wordCount, int sentenceCount, int objectiveCount, int subjectiveCount, int parsedCount, java.util.Map<java.lang.String,​java.lang.Integer> tokenFrequency, java.util.Map<java.lang.String,​java.lang.Integer> taggedtokenFrequency, java.util.Map<java.lang.String,​java.lang.Integer> bigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> taggedbigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> trigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> taggedtrigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> quadgramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> negators, java.util.Map<java.lang.String,​java.lang.Integer> intensifiers, java.util.Vector<Sentence> sentences, java.lang.String text, java.lang.String fingerprint, java.util.Vector<Row> rows, java.lang.String header)
    Creates a new Section object.
  • Method Summary

    Modifier and Type Method Description
    java.util.Map<java.lang.String,​java.lang.Integer> getBigramFrequency()
    Retrieves a term frequency analysis for the bigrams in the document.
    java.util.Vector<Sentence> getChunks()
    Retrieves the individual sentences in the document.
    java.lang.String getFingerprint()
    Retrieves a signature generated from the document content.
    java.lang.String getHeader()
    Retrieves the header for this section.
    java.util.Map<java.lang.String,​java.lang.Integer> getIntensifierFrequency()
    Retrieves a term frequency analysis for the intensifiers in the document.
    java.util.Map<java.lang.String,​java.lang.Integer> getNegatorFrequency()
    Retrieves a term frequency analysis for the negators in the document.
    int getObjectiveCount()
    Retrieves the number of sentences in the document that are marked as objective.
    int getParsedCount()
    Retrieves the number of sentences in the document that were grammatically parsed.
    java.util.Map<java.lang.String,​java.lang.Integer> getQuadgramFrequency()
    Retrieves a term frequency analysis for the quadgrams in the document (ideogram languages only).
    java.util.Vector<Row> getRows()
    Retrieves the rows in the document if it corresponds to a table or list.
    int getSentenceCount()
    Retrieves the number of sentences in the document.
    int getSubjectiveCount()
    Retrieves the number of sentences in the document that are marked as subjective.
    java.util.Map<java.lang.String,​java.lang.Integer> getTaggedBigramFrequency()
    Retrieves a tagged term frequency analysis for the bigrams in the document.
    java.util.Map<java.lang.String,​java.lang.Integer> getTaggedTokenFrequency()
    Retrieves a tagged term frequency analysis for the tokens in the document.
    java.util.Map<java.lang.String,​java.lang.Integer> getTaggedTrigramFrequency()
    Retrieves a tagged term frequency analysis for the trigrams in the document.
    java.lang.String getText()
    Retrieves the system's internal representation of the document.
    java.util.Map<java.lang.String,​java.lang.Integer> getTokenFrequency()
    Retrieves a term frequency analysis for the tokens in the document.
    java.util.Map<java.lang.String,​java.lang.Integer> getTrigramFrequency()
    Retrieves a term frequency analysis for the trigrams in the document.
    int getWordCount()
    Retrieves the number of words in the document.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Section

      public Section​(int wordCount, int sentenceCount, int objectiveCount, int subjectiveCount, int parsedCount, java.util.Map<java.lang.String,​java.lang.Integer> tokenFrequency, java.util.Map<java.lang.String,​java.lang.Integer> taggedtokenFrequency, java.util.Map<java.lang.String,​java.lang.Integer> bigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> taggedbigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> trigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> taggedtrigramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> quadgramFrequency, java.util.Map<java.lang.String,​java.lang.Integer> negators, java.util.Map<java.lang.String,​java.lang.Integer> intensifiers, java.util.Vector<Sentence> sentences, java.lang.String text, java.lang.String fingerprint, java.util.Vector<Row> rows, java.lang.String header)
      Creates a new Section object. This constructor is not intended for client use; these objects are created by Salience Engine when the Salience.getDocumentDetails() API method is called.
      Parameters:
      wordCount - The number of words in the document.
      sentenceCount - The number of sentences in the document.
      objectiveCount - The number of sentences containing objective content.
      subjectiveCount - The number of sentences containing subjective content.
      parsedCount - The number of sentences that grammatically parse.
      tokenFrequency - A term frequency analysis for the terms in the document.
      taggedtokenFrequency - A tagged term frequency analysis for the terms in the document.
      bigramFrequency - A term frequency analysis for the bigrams in the document.
      taggedbigramFrequency - A tagged term frequency analysis for the bigrams in the document.
      trigramFrequency - A term frequency analysis for the trigrams in the document.
      taggedtrigramFrequency - A tagged term frequency analysis for the trigrams in the document.
      quadgramFrequency - A term frequency analysis for quadgrams (only ideogram languages).
      negators - A term frequency analysis for the negators in the document.
      intensifiers - A term frequency analysis for the intensifiers in the document.
      sentences - A list of the individual sentences in the document.
      text - The parsed text of the document.
      fingerprint - A signature generated from the document content.
      rows - A list of the individual rows in the document if the document is a table or list.
      header - The header of the document.
  • Method Details

    • getText

      public final java.lang.String getText()
      Retrieves the system's internal representation of the document.
      Returns:
      A String containing the document content as parsed by Salience.
    • getFingerprint

      public final java.lang.String getFingerprint()
      Retrieves a signature generated from the document content.
      Returns:
      A String containing the calculated representation of the document content.
    • getHeader

      public final java.lang.String getHeader()
      Retrieves the header for this section.
      Returns:
      A String of the section's header.
    • getParsedCount

      public final int getParsedCount()
      Retrieves the number of sentences in the document that were grammatically parsed.
      Returns:
      The number of parsed sentences.
    • getObjectiveCount

      public final int getObjectiveCount()
      Retrieves the number of sentences in the document that are marked as objective.
      Returns:
      The number of sentences marked as objective.
    • getSubjectiveCount

      public final int getSubjectiveCount()
      Retrieves the number of sentences in the document that are marked as subjective.
      Returns:
      The number of sentences marked as subjective.
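
The sentence counts above are often easier to compare across documents as a proportion. The following is a minimal sketch, not part of the Salience API: `SectionRatios` and `subjectiveShare` are hypothetical names, and the inputs are assumed to be the plain `int` values returned by `getSubjectiveCount()` and `getSentenceCount()`. Whether every sentence is marked as either objective or subjective is not stated here, so the share is computed against the total sentence count only.

```java
// Hypothetical helper; not part of the Salience API.
final class SectionRatios {
    private SectionRatios() {}

    /**
     * Returns the fraction of sentences marked as subjective, in the range
     * 0.0 to 1.0. Returns 0.0 when there are no sentences, to avoid dividing
     * by zero.
     */
    static double subjectiveShare(int subjectiveCount, int sentenceCount) {
        if (sentenceCount <= 0) {
            return 0.0;
        }
        return (double) subjectiveCount / sentenceCount;
    }
}
```

Given a `Section` named `section` (assumed to have been obtained elsewhere), a call would look like `SectionRatios.subjectiveShare(section.getSubjectiveCount(), section.getSentenceCount())`.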
    • getTokenFrequency

      public java.util.Map<java.lang.String,​java.lang.Integer> getTokenFrequency()
      Retrieves a term frequency analysis for the tokens in the document.
      Returns:
      A Map containing the unique tokens in the document and their counts.
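
A common use of a token frequency map is to list the most frequent terms. The sketch below shows one way to do that with the standard library. `TopTerms` and `topN` are hypothetical names, not part of the Salience API, and the sample map in `main` is a stand-in for the `Map` that `getTokenFrequency()` returns; the iteration order of the returned map is not documented here, so the helper sorts explicitly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the Salience API.
final class TopTerms {
    private TopTerms() {}

    /**
     * Returns up to n entries with the highest counts.
     * Ties are broken alphabetically by term so the order is deterministic.
     */
    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> frequency, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(frequency.entrySet());
        entries.sort(
            Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.comparingByKey()));
        int limit = Math.min(Math.max(n, 0), entries.size());
        return new ArrayList<>(entries.subList(0, limit));
    }

    public static void main(String[] args) {
        // Stand-in data; in real use this would be section.getTokenFrequency().
        Map<String, Integer> tokens = new HashMap<>();
        tokens.put("service", 3);
        tokens.put("slow", 2);
        tokens.put("the", 5);

        for (Map.Entry<String, Integer> entry : topN(tokens, 2)) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

The same helper applies unchanged to the bigram, trigram, and tagged frequency maps, since they share the `Map<String, Integer>` type.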
    • getTaggedTokenFrequency

      public java.util.Map<java.lang.String,​java.lang.Integer> getTaggedTokenFrequency()
      Retrieves a tagged term frequency analysis for the tokens in the document.
      Returns:
      A Map containing the tagged unique tokens in the document and their counts.
    • getBigramFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getBigramFrequency()
      Retrieves a term frequency analysis for the bigrams in the document.
      Returns:
      A Map containing the unique bigrams (two-word phrases) in the document and their counts.
    • getTaggedBigramFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getTaggedBigramFrequency()
      Retrieves a tagged term frequency analysis for the bigrams in the document.
      Returns:
      A Map containing the tagged unique bigrams (two-word phrases) in the document and their counts.
    • getTrigramFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getTrigramFrequency()
      Retrieves a term frequency analysis for the trigrams in the document.
      Returns:
      A Map containing the unique trigrams (three-word phrases) in the document and their counts.
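
When n-gram counts from several `Section` objects need to be combined (for example, across a batch of documents), the maps can be summed key by key. This is a minimal sketch using `Map.merge` from the standard library; `FrequencyMerger` and `merge` are hypothetical names and are not part of the Salience API. It assumes the maps contain no null counts.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not part of the Salience API.
final class FrequencyMerger {
    private FrequencyMerger() {}

    /**
     * Sums the counts of identical keys across all input maps.
     * The inputs are left unmodified; the result is a new map sorted by key.
     */
    static Map<String, Integer> merge(List<Map<String, Integer>> frequencyMaps) {
        Map<String, Integer> combined = new TreeMap<>();
        for (Map<String, Integer> frequency : frequencyMaps) {
            for (Map.Entry<String, Integer> entry : frequency.entrySet()) {
                // Insert the count if the key is new, otherwise add to the running total.
                combined.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return combined;
    }
}
```

Copying into a new map rather than updating one of the inputs avoids relying on whether the maps returned by `Section` are mutable, which is not specified here.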
    • getTaggedTrigramFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getTaggedTrigramFrequency()
      Retrieves a tagged term frequency analysis for the trigrams in the document.
      Returns:
      A Map containing the tagged unique trigrams (three-word phrases) in the document and their counts.
    • getQuadgramFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getQuadgramFrequency()
      Retrieves a term frequency analysis for the quadgrams in the document (ideogram languages only). In word-based languages (such as English or French) this structure is empty.
      Returns:
      A Map containing the unique quadgrams (four-word phrases) in the document and their counts.
    • getNegatorFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getNegatorFrequency()
      Retrieves a term frequency analysis for the negators in the document.
      Returns:
      A Map containing the negators in the document and their counts.
    • getIntensifierFrequency

      public final java.util.Map<java.lang.String,​java.lang.Integer> getIntensifierFrequency()
      Retrieves a term frequency analysis for the intensifiers in the document.
      Returns:
      A Map containing the intensifiers in the document and their counts.
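
Raw negator and intensifier counts grow with document length, so a rate per 100 words can be more useful for comparison. The following is a sketch under stated assumptions: `ModifierRates` and its methods are hypothetical names, not part of the Salience API, and the inputs are assumed to be a map from `getNegatorFrequency()` or `getIntensifierFrequency()` together with the value of `getWordCount()`, with no null counts in the map.

```java
import java.util.Map;

// Hypothetical helper; not part of the Salience API.
final class ModifierRates {
    private ModifierRates() {}

    /** Sums all counts in a frequency map. */
    static int totalOccurrences(Map<String, Integer> frequency) {
        int total = 0;
        for (int count : frequency.values()) {
            total += count;
        }
        return total;
    }

    /**
     * Returns the number of occurrences per 100 words.
     * Returns 0.0 when the word count is not positive, to avoid dividing by zero.
     */
    static double perHundredWords(Map<String, Integer> frequency, int wordCount) {
        if (wordCount <= 0) {
            return 0.0;
        }
        return 100.0 * totalOccurrences(frequency) / wordCount;
    }
}
```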
    • getWordCount

      public int getWordCount()
      Retrieves the number of words in the document.
      Returns:
      The number of words in the document.
    • getSentenceCount

      public int getSentenceCount()
      Retrieves the number of sentences in the document.
      Returns:
      The number of sentences in the document.
    • getChunks

      public java.util.Vector<Sentence> getChunks()
      Retrieves the individual sentences in the document.
      Returns:
      A Vector of Sentence objects.
    • getRows

      public java.util.Vector<Row> getRows()
      Retrieves the rows in the document if it corresponds to a table or list.
      Returns:
      A Vector of Row objects.