Impressions from the Sentiment Analysis Symposium

I recently attended the Sentiment Analysis Symposium in NYC that Seth Grimes puts on a couple of times per year.

Facebook IPO - a high-level view of Twitter leading up to their big day...

I analyzed 98,000 tweets, roughly the past 5 days' worth of traffic, for anything mentioning facebook and (ipo OR stock).
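For the curious, here is a rough Python sketch of what that boolean filter means when applied to tweets you already have on hand. The excerpt doesn't say how the search was actually run, so the `matches_query` function, the whole-word matching rule, and the sample tweets are all assumptions made up for illustration.

```python
import re


def matches_query(text: str) -> bool:
    """Return True when text matches: facebook AND (ipo OR stock).

    Assumption: matching is case-insensitive and on whole words,
    so "stocks" or "facebooking" would not count as a hit.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return "facebook" in words and bool(words & {"ipo", "stock"})


# Made-up sample tweets, purely for illustration.
sample_tweets = [
    "Facebook IPO is tomorrow!",
    "Thinking about buying some #facebook stock",
    "Facebook is down again",
    "The stock market is wild today",
]

matching = [tweet for tweet in sample_tweets if matches_query(tweet)]
```

Only the first two sample tweets pass the filter: the third never mentions ipo or stock, and the fourth never mentions facebook.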

The Avengers: Most Popular Superhero?

I ran some analysis on about 330,000 tweets from people going to see, or having already seen, The Avengers.   In case you've been completely deprived of any sort of media recently, it's a superhero movie.

Some might say "The Superhero Movie", but, not having seen it myself yet, I'm not in a position to judge.

Facets: Automatically Extracting Opinions from Reviews and Surveys

Facets are another unique feature of Lexalytics' Salience Engine.  They provide a quick and easy way to get to actionable information about opinions given in reviews and surveys.

Concept Topics in 3 minutes

Hi!  Concept Topics are our revolutionary way to create classifiers for what used to be hard-to-classify buckets: things like politics, food, real estate, and business.  Most of our customers need to do some sort of classification, such as bucketing responses on surveys or determining which area of business is being talked about in the press.

Datasift, Twitter, and "Privacy"

Datasift just announced a fancy new service.   They have worked out a deal with Twitter where they can provide 2+ years of historical tweets.

Text Analytics: Coming soon to an application near you!

I’ve long believed that Text Analytics was going to “pop” one day and start showing up in all sorts of applications.  I talked with a company yesterday that made me believe that day may be coming soon.

Machine Learning vs. Natural Language Processing, part 1

I'm going to write a few blog articles to show how machine learning and natural language processing techniques are used in partnership inside of Lexalytics software.

What is Machine Learning?

Machine-assisted editorial curation: does it risk putting us in a "bubble of our own"?

So I watched an interesting video on some of the mechanically automated editorial work that is happening on major sites like Google and Facebook (see the video here:

Tuning Sentiment: Three Two's.

In honor of the Sentiment Analysis Symposium this week in San Francisco (you are going to be there, right?), here's a summary of best practices for tuning sentiment.   These will work for any sentiment analysis system, but you should use ours.

Because it's the best, 'natch.

1) 2 datasets:  Gather a set of documents and split it in half.
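
A minimal Python sketch of that first step might look like the following. The `split_in_half` helper is hypothetical, and shuffling before the split (with a fixed seed so the split is repeatable) is my assumption rather than something stated above.

```python
import random


def split_in_half(documents, seed=0):
    """Shuffle a copy of the documents and cut it at the midpoint.

    Assumption: a seeded shuffle is used so that neither half inherits
    the original ordering and the split can be reproduced.
    """
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Placeholder document names, purely for illustration.
documents = [f"document {i}" for i in range(10)]
first_half, second_half = split_in_half(documents)
```

With an odd number of documents, the second half ends up one document larger than the first.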