Class ChunkToken

java.lang.Object
com.lexalytics.salience.ChunkToken

public class ChunkToken
extends java.lang.Object

A ChunkToken object represents a token from a Chunk object.

A list of ChunkToken objects is returned in Chunk results.

  • Constructor Summary

    Constructors 
    Constructor Description
    ChunkToken(java.lang.String sToken, java.lang.String sPOSTag, boolean bInvert, java.lang.String sStem, int nByteOffset, int nByteLength, int nCharOffset, int nCharLength)
    Creates a new chunk token.
  • Method Summary

    Modifier and Type Method Description
    int GetByteLength()
    Returns the byte length for the chunk (in UTF-8).
    int GetCharLength()
    Returns the character length for the chunk (actual characters).
    java.lang.String getPOSTag()
    Retrieves the part-of-speech tag for the chunk token.
    int GetStartByte()
    Returns the byte offset within the document for the chunk (in UTF-8).
    int GetStartChar()
    Returns the character offset within the document for the chunk (actual characters).
    java.lang.String getStem()
    Returns the stemmed form of the chunk.
    java.lang.String getToken()
    Retrieves the chunk token.
    boolean isInverter()
    Indicates whether the chunk token inverts sentiment.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • ChunkToken

      public ChunkToken(java.lang.String sToken, java.lang.String sPOSTag, boolean bInvert, java.lang.String sStem, int nByteOffset, int nByteLength, int nCharOffset, int nCharLength)
      Creates a new chunk token. This is not intended for client use. Chunk tokens are created by Salience Engine when Chunk objects are created.
      Parameters:
      sToken - The text of the chunk token.
      sPOSTag - The part-of-speech tag for the chunk token.
      bInvert - A flag indicating if the chunk token inverts sentiment.
      sStem - A String containing the stemmed form of the chunk.
      nByteOffset - The byte offset of the chunk (in UTF-8 representation).
      nByteLength - The length of the chunk in bytes (in UTF-8 representation).
      nCharOffset - The character offset of the chunk, counted in actual characters. Java, for example, counts some emoji as two characters, so there may be discrepancies.
      nCharLength - The character length of the chunk, counted in actual characters. Java, for example, counts some emoji as two characters, so there may be discrepancies.
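The note about emoji can be illustrated with plain Java strings. The sketch below does not use the Salience API; the class and method names are hypothetical. It only shows that String.length() counts UTF-16 code units, while codePointCount counts Unicode code points, so the two disagree for characters outside the Basic Multilingual Plane.

```java
public class CharCountDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    public static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "ok " followed by U+1F600, written as a surrogate pair.
        String text = "ok \uD83D\uDE00";
        System.out.println(utf16Units(text)); // 5
        System.out.println(codePoints(text)); // 4
    }
}
```

Whether the character offsets reported by Salience correspond exactly to code points is an assumption here, so it is worth checking against a document that contains emoji before relying on it.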
  • Method Details

    • getToken

      public java.lang.String getToken()
      Retrieves the chunk token.
      Returns:
      A String containing the text of the chunk token.
    • getPOSTag

      public java.lang.String getPOSTag()
      Retrieves the part-of-speech tag for the chunk token. The tagset used for Salience part-of-speech tags is based on the Penn Treebank tagset. The full tagset can be found on the Lexalytics developer wiki.
      Returns:
      A String containing the part-of-speech tag.
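Because the tags are based on the Penn Treebank tagset, a caller can often group them by prefix. The following is a minimal sketch with hypothetical names (PosTagDemo, Coarse, coarse) that assumes the standard Penn Treebank prefixes NN, VB, JJ and RB; the Salience tagset may add or change tags, so the full list on the developer wiki remains the authority.

```java
public class PosTagDemo {
    public enum Coarse { NOUN, VERB, ADJECTIVE, ADVERB, OTHER }

    // Maps a Penn Treebank style tag, such as the value of getPOSTag(),
    // to a coarse word class by looking at its prefix.
    public static Coarse coarse(String posTag) {
        if (posTag == null) return Coarse.OTHER;
        if (posTag.startsWith("NN")) return Coarse.NOUN;      // NN, NNS, NNP, NNPS
        if (posTag.startsWith("VB")) return Coarse.VERB;      // VB, VBD, VBG, VBN, VBP, VBZ
        if (posTag.startsWith("JJ")) return Coarse.ADJECTIVE; // JJ, JJR, JJS
        if (posTag.startsWith("RB")) return Coarse.ADVERB;    // RB, RBR, RBS
        return Coarse.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(coarse("NNS"));
        System.out.println(coarse("VBD"));
    }
}
```

In client code the argument would typically be token.getPOSTag() for each ChunkToken in a Chunk.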
    • isInverter

      public boolean isInverter()
      Indicates whether the chunk token inverts sentiment.
      Returns:
      A boolean indicating whether the chunk token inverts sentiment.
    • getStem

      public java.lang.String getStem()
      Returns the stemmed form of the chunk.
      Returns:
      A String containing the stemmed chunk.
    • GetStartByte

      public int GetStartByte()
      Returns the byte offset within the document for the chunk (in UTF-8).
      Returns:
      The byte offset of the chunk.
    • GetByteLength

      public int GetByteLength()
      Returns the byte length for the chunk (in UTF-8).
      Returns:
      The length of the chunk in bytes.
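Byte offsets and lengths refer to the UTF-8 encoding of the document, not to Java String indices. As a rough sketch (ByteSpanDemo and sliceUtf8 are hypothetical names, and the document text is assumed to be available as a String), the span can be recovered by encoding the document to UTF-8 and decoding the requested byte range:

```java
import java.nio.charset.StandardCharsets;

public class ByteSpanDemo {
    // Extracts the text covered by a UTF-8 byte offset and byte length,
    // such as the values of GetStartByte() and GetByteLength().
    public static String sliceUtf8(String document, int startByte, int byteLength) {
        byte[] utf8 = document.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, startByte, byteLength, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 occupies two bytes in UTF-8, so "bar" starts at byte 6,
        // although it starts at String index 5.
        String document = "caf\u00e9 bar";
        System.out.println(sliceUtf8(document, 6, 3)); // bar
    }
}
```

For long documents it would be cheaper to encode once and reuse the byte array across tokens.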
    • GetStartChar

      public int GetStartChar()
      Returns the character offset within the document for the chunk, counted in actual characters. Java, for example, counts some emoji as two characters, so there may be discrepancies.
      Returns:
      The character offset of the chunk.
    • GetCharLength

      public int GetCharLength()
      Returns the character length for the chunk, counted in actual characters. Java, for example, counts some emoji as two characters, so there may be discrepancies.
      Returns:
      The length of the chunk in characters.
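If the character offsets count each emoji once, they cannot be passed directly to String.substring, which indexes UTF-16 code units. The sketch below assumes that "actual characters" means Unicode code points, which the documentation above does not state explicitly; CharSpanDemo and sliceCodePoints are hypothetical names. It converts a code point offset and length into String indices with offsetByCodePoints.

```java
public class CharSpanDemo {
    // Extracts the text covered by a code point offset and length,
    // such as the values of GetStartChar() and GetCharLength(),
    // under the assumption that both are measured in code points.
    public static String sliceCodePoints(String document, int startChar, int charLength) {
        int begin = document.offsetByCodePoints(0, startChar);
        int end = document.offsetByCodePoints(begin, charLength);
        return document.substring(begin, end);
    }

    public static void main(String[] args) {
        // U+1F600 is one code point but two UTF-16 code units.
        String document = "\uD83D\uDE00 good";
        System.out.println(sliceCodePoints(document, 2, 4)); // good
        System.out.println(document.substring(2, 6));        // " goo", off by one
    }
}
```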