Sentiment Analysis – What does it mean for Florida to be positive?


Since our release of Salience 4.0, there have been a couple of discussions about entity-level sentiment and what it means. So I decided to write a post based on a chat I had with our Chief Scientist, Mike Marshall, on the subject.

Let’s start with the basic method by which a sentiment score is assigned to an entity. The sentiment score for an entity is an aggregation of the sentiment scores for the phrases that are associated with that entity.

Observation #1: Sentiment for a thing is audience-specific and content-specific

Think of recent news articles about executive compensation at companies that have been failing on Wall Street, or about profits reported by oil companies. A record profit for an oil company would normally be considered a positive statement. However, when it is measured against a backdrop of other content indicating market concerns over the price of crude oil, or consumer concerns about the price of gasoline at the pump, those phrases may weigh down the positive sentiment of the earnings report and may result in a negative sentiment for a company that reports a record profit.

Observation #2: The power of entity sentiment is in aggregation of results

If you are a stockholder, you might not agree that record profits are a bad thing. And another piece of content that focuses solely on the earnings report may rate the same company positively. To get a true measure of the sentiment for an entity, you need as many data points as possible. Aggregating them should smooth out individual results and provide a reasonable score for the entity.

Observation #3: Machines can achieve the same result as humans on repetitive tasks, and perform them faster than humans

In general, multiple humans judging sentiment will agree with each other approximately 80% to 85% of the time. Machine-generated results will agree with human-generated results at approximately the same rate. The difference, however, is the amount of content a machine can process in a given span of time, and the consistency with which the machine will make its judgments. This also supports Observation #2: a machine can supply the many data points needed to produce quality sentiment scores for extracted entities.
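
To make the aggregation idea concrete, here is a minimal Python sketch of the oil company example. It is not how Salience computes entity sentiment: the entity name, the phrases, the scores, the [-1.0, 1.0] scale, and the use of a plain average are all assumptions chosen for illustration. It only shows how phrase scores can roll up into an entity score for one piece of content, and how scores from several pieces of content can then be combined.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical phrase-level scores in [-1.0, 1.0], grouped by document.
# Each tuple is (entity, phrase, phrase_score); every value is invented.
documents = {
    "earnings_report": [
        ("OilCo", "record profit", 0.8),
        ("OilCo", "strong quarter", 0.6),
    ],
    "consumer_story": [
        ("OilCo", "record profit", 0.8),
        ("OilCo", "pain at the pump", -0.9),
        ("OilCo", "worries over crude prices", -0.7),
    ],
}


def entity_scores(phrases):
    """Average the phrase scores associated with each entity in one document."""
    by_entity = defaultdict(list)
    for entity, _phrase, score in phrases:
        by_entity[entity].append(score)
    return {entity: mean(scores) for entity, scores in by_entity.items()}


def aggregate_across_documents(docs):
    """Average the per-document entity scores over every document."""
    per_entity = defaultdict(list)
    for phrases in docs.values():
        for entity, score in entity_scores(phrases).items():
            per_entity[entity].append(score)
    return {entity: mean(scores) for entity, scores in per_entity.items()}


if __name__ == "__main__":
    for name, phrases in documents.items():
        print(name, entity_scores(phrases))
    print("overall", aggregate_across_documents(documents))
```

With these invented numbers, the earnings report alone scores the company positively, while the consumer story scores it negatively even though it contains the same "record profit" phrase. Averaging across both documents pulls the two results toward a single, more moderate score, which is the smoothing effect described in Observation #2.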

Categories: Technology, Text Analytics