LexaBlog: Our Sentiment about Text Analytics and Social Media
On the Evolution of Social Media
Every once in a while, it is interesting to take a look back at how technologies evolve from the mundane to the sublime. Technology, in and of itself, is rarely the sole component of the solution to a problem. It is how people use it that helps to define its worth. The various online media outlets that we have been following at Lexalytics are no different.
When we started working with content acquisition, web spidering was the primary avenue to get at interesting content. If you were a business, you had a website, your partners had websites, and your competitors had websites. And there was real business value in being able to keep an automated eye on your competitors’ websites to see if anything changed or anything new appeared. Our cell phones were also a lot bigger back then.
Then newsfeeds hit the scene. Even better than webpages, here we had outlets of focused content that would update on a periodic basis. As a business, you wanted to watch the newsfeeds relevant to your industry, keeping an ear to the ground to hear which direction the buffalo were running. And cell phones got smaller.
It wasn’t too much of a leap from newsfeeds to blogs. But, as I recall, the first time I looked at a blog it was on a friend’s website, and the main thread of the discussion was what he was going to have for lunch. Another comment I’ve heard about the early days of blogs: “Blogs were just a bunch of message forums.” The technology was in place, but where was the application of the technology? Where was the ROI? Over time, blogs have come to feature content that is more and more relevant to business. Businesses themselves are now using blogs as a mechanism for a more dynamic dialog with customers and the general public.
I first came across Twitter about a year ago. And again, at the time, it was not in a business context. An article in the Wall Street Journal chronicled the content of the service as it was back then. But as with the others that came before it, business is starting to pay attention to Twitter. Why? First, a tweet about waiting on the phone with customer service could easily be a tweet about your company’s customer service. That tweet about a cool new product might be about your company’s product. Twitter is becoming an outlet for customer opinions and discussions, not only in the tweets that occur on the service itself, but also as a gateway to the websites (and blogs!) that are linked in tweets. Social media experts such as Chris Brogan are counseling businesses on why services like Twitter matter and how they can use Twitter to connect with their customers.
We’re including mechanisms to gather content from Twitter and analyze what we find for our customers, to help them tap into and understand this additional source of information. Social media is evolving: it is finding a place in business, and finding a place to make an impact.
PS. My cell phone has gotten bigger again, but now I can access Twitter on it.
- Carl Lambrecht's blog


Comments
Great post. We’ve also found it pretty easy to build a system that gets 70-80% accuracy in almost no time at all.
Sentiment accuracy is also interesting, and measuring it becomes all the more difficult when you consider sentiment as a spectrum rather than a simple agree/disagree.
For example, when dealing with financial sentiment measurements as we do, you can be a little wrong, or you can be REALLY wrong.
Getting the right polarity 80% of the time is great, but you also need to consider which 20% you missed. Humans who disagree will usually still agree on the highly polarized articles. We expect people to disagree on the more intricate cases.
In our experience, even if the overall percentage agreement is the same for human-human and human-computer comparisons, computers are much more likely to throw articles that humans would all agree on into the wrong bucket.
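To make the "a little wrong versus REALLY wrong" distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the five-point scale, the label names, and the tiny data set are invented, and this is not a description of how any particular system is evaluated. It simply compares plain polarity accuracy with an error measure that charges more for large misses.

```python
# Hypothetical five-point sentiment scale; the labels and data below are
# invented purely for illustration.
SCALE = {
    "very_negative": -2,
    "negative": -1,
    "neutral": 0,
    "positive": 1,
    "very_positive": 2,
}


def polarity(label):
    """Collapse a graded label to -1, 0, or +1."""
    value = SCALE[label]
    return (value > 0) - (value < 0)


def polarity_accuracy(gold, predicted):
    """Fraction of items whose positive/neutral/negative direction matches."""
    hits = sum(polarity(g) == polarity(p) for g, p in zip(gold, predicted))
    return hits / len(gold)


def mean_absolute_error(gold, predicted):
    """Average distance on the scale, so big misses cost more than small ones."""
    total = sum(abs(SCALE[g] - SCALE[p]) for g, p in zip(gold, predicted))
    return total / len(gold)


gold = ["very_positive", "very_negative", "positive", "negative", "neutral"]
# System A is a little wrong once: "positive" is called "neutral".
system_a = ["very_positive", "very_negative", "neutral", "negative", "neutral"]
# System B is really wrong once: "very_negative" is called "very_positive".
system_b = ["very_positive", "very_positive", "positive", "negative", "neutral"]

for name, predicted in (("A", system_a), ("B", system_b)):
    print(name, polarity_accuracy(gold, predicted),
          mean_absolute_error(gold, predicted))
```

On this made-up data both systems get the polarity right 80% of the time, yet the average error on the scale is 0.2 for system A and 0.8 for system B, because B's single miss lands on a highly polarized article.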
Thanks for the feedback, Bryce.
You make a good point about ranges, and it would be something that I would love to test with some external data, but I’m just not aware of any good sources.
Internally, the model-based system supports a range of ‘sentiment classes’ rather than just Positive / Negative, and of course our phrase-based model produces actual scores, so the higher (or lower) the score, the more positive (or negative) it thinks a document is.