Seth Redmore's blog

Viafoura, Semantria, Amazon, and Lexalytics Hack-A-Thon!

On Saturday, April 27th, Viafoura is hosting a hack-a-thon to see what interesting ideas developers, product designers, and data scientists can come up with when given large data sets.

"Is Social Media Worthy of Text Analytics": A Response

I happened upon (well, really, was fed via LinkedIn) a blog post by Tom Anderson over at OdinText.  I've seen some of his stuff before, and he seems like a reasonable gentleman.

I was kinda surprised by this post, "Is Social Media Worthy of Text Analytics", and thought it would be worth responding to.

Happy New Year - Some stats from Twitter + New Year's Eve

Hello all and Happy New Year!

I snagged 412,700 tweets from Twitter over about 16 hours, from 2300 UTC December 31 to 1500 UTC January 1.  This was 5% of the total tweets that went out with the phrase "new year".

Without further ado... The top 50 hashtags were as follows:

All Hail Datasift!

Takers of the 3rd round of funding!  Or 2nd, or fifth, but I'm guessing it's third.   Whatever round it is, congrats to Datasift for building a really great service.

Facebook IPO - a high level view of Twitter leading up to their big day...

I analyzed 98,000 tweets, roughly the past 5 days' worth of traffic for anything mentioning facebook AND (ipo OR stock).
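
For anyone who wants to play with that kind of filter, here is a minimal sketch of the keyword logic in Python. The post doesn't say which tool or query syntax was actually used to pull the tweets, so the function name `mentions_facebook_ipo`, the case-insensitive whole-word matching, and the sample tweets are all assumptions for illustration.

```python
import re

def mentions_facebook_ipo(text: str) -> bool:
    """Hypothetical filter: facebook AND (ipo OR stock).

    Assumes case-insensitive, whole-word matching; the real
    collection rules may have differed.
    """
    words = set(re.findall(r"[a-z0-9']+", text.lower()))
    return "facebook" in words and bool(words & {"ipo", "stock"})

# Made-up sample tweets, not from the analyzed data set.
tweets = [
    "Facebook IPO is on Friday",
    "Buying some stock today",
    "facebook changed my timeline again",
    "Is Facebook stock worth it?",
]

matched = [t for t in tweets if mentions_facebook_ipo(t)]
print(matched)
```

Only the first and last sample tweets pass, since each mentions facebook together with either ipo or stock.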

The Avengers: Most Popular Superhero?

I ran some analysis on about 330,000 tweets having to do with people going to see/having seen The Avengers.   In case you've been completely deprived of any sort of media recently, it's a superhero movie.

Some might say "The Superhero Movie", but, not having seen it myself yet, I'm not in a position to judge.

Facets: Automatically Extracting Opinions from Reviews and Surveys

Facets are another unique feature of Lexalytics' Salience Engine.  They provide a quick and easy way to get to actionable information about opinions given in reviews and surveys.

Concept Topics in 3 minutes

Hi!  Concept Topics are our revolutionary way to create classifiers for what used to be hard-to-classify buckets.   Things like politics, food, real estate, and business.  Most of our customers need to do some sort of classification: bucketing responses on surveys, or determining which area of business is being talked about in the press.

Datasift, Twitter, and "Privacy"

Datasift (www.datasift.com) just announced a fancy new service.   They have worked out a deal with Twitter where they can provide 2+ years of historical tweets.