Lexalytics and Datasift and the Twitter Firehose

Nick Halstead has launched a new service called Datasift. Datasift exists to help you filter the Twitter firehose, chopping things up along any vector you can think of (and more). For example, you can do simple keyword filtering, but restricted to a polygon that defines a geographical locale around the person tweeting, and including only positive tweets.

Salience, Twitter, and other data work

Salience is roughly split into two parts: the software itself, and the supporting data directories that have dictionaries, models, etc. This post discusses work specific to the data directory itself.

Discovering relevant concepts in hotel reviews

In an earlier post, Jeff Catlin described analysis that we did around Bally's vs. Bellagio using publicly available customer reviews. We did this analysis using something called "categories", which is basically a fancy name for search strings.

Moving ahead with cost-effective text analytics

For the past year, we've experimented with using a web service to provide low-cost text analytics services, and we've learned a lot doing this.

Text Mining in Hotel Reviews: Bally's vs. Bellagio

Hotel reviews represent one of my favorite uses of text analytics. About five years ago we built a site with FAST that measured hotel reviews to build a “consensus opinion” of hotels in a narrow geographic area. The idea was to give users of the site (shown below) an idea of what people thought of various hotels in a given area (Manhattan, for example).

Salience 4.3: Opinion Mining

One of the two major new features in Salience 4.3 (releasing around June 30th) is "opinion mining". Opinion mining expands our core technology to handle indirect quotes. We've been able to extract quote-mark delimited quotes for a while now, and you could perform further analysis on those quotes (which were attached to the speaker).

Using queries to recognize entities

Entity extraction is the basis of the entire text analytics process: identify people, places, companies, and themes, and use them to better understand the content.

Lexalytics Sentiment Spectrum

Sentiment Analysis solutions are popping up everywhere these days - or so it seems. Every day there is a blog post, or Twitter post (or 100), asking how it works or arguing a point about sentiment and what exactly it means. There has been an increase in articles covering everything from automated solutions vs.

Textual Analysis of Financial News Stories

I came across an interesting blog post in Technology Review that showcased the Arizona Financial Text system (AZFin Text).
