LexaBlog

Our Sentiment about Text Analytics and Social Media

Submitted by Christine Sierra on Tue, 2009-10-06 04:00

If you've got tweets, we've got sentiment. And themes. And most mentioned people. And spam lists. In fact, the only issue we've run into is that Twitter won't give us all the data to analyze. All 100 gazillion tweets would be fascinating to analyze automatically, but they just don't seem to be there yet. Or, perhaps, they are building out their revenue model to sell us the data. Either way, don't fret. Just as in reputation management, analyzing every single tweet can be both time consuming and incredibly inefficient. The average sentiment across many tweets often tells you more than any individual tweet.

As our CEO Jeff Catlin mentioned recently on ZDNet: "Sentiment measurement is at the forefront of much business analysis these days, but in some ways Twitter seems as if it was designed from the ground up to defeat any automated sentiment engine. For instance, there isn't much sentence structure in tweets, and what's there is often wrong. And many of the tweets are just tinyurl or bit.ly links with absolutely no content contained in the URL itself. Given these challenges, is monitoring and measuring sentiment in Twitter a hopeless chore? Fortunately the answer is No. Even though there are some challenges to automated scoring of Twitter content, there are also some advantages to processing tweets and in particular the tone within Twitter. The beauty of Twitter is that there is very little grey area in tweets. You're either posting some source of information, posting an opinion you have, or replying to another informative or opinion-oriented tweet."

With the volumes of online data growing at an unbelievable rate, decreasing processing time and implementing automation become key to getting the job done. And from that automation comes incredible value, such as all the concepts and themes associated with a particular topic, not just the ones with the most hashtags. And who is talking about those topics?
And who else is mentioned with those topics? The value is not always in the number of mentions, though in some respects that is helpful, but in the context surrounding the tweets and how businesses can use them. In the coming days we will be attending the Inbound Marketing Summit in Boston, where we'll have a demo of our Twitter topic tracking system available. We aren't formally releasing a site or promoting a new product, but we welcome conversations about Twitter Topics: what is useful and what is just fluff. Text analytics doesn't have to be just about processing Word documents and research reports - it is just as helpful in processing tweets, customer comments and other smaller documents.

Submitted by Christine Sierra on Fri, 2009-10-02 04:00

It seems lately that there are more and more companies offering sentiment solutions to a variety of markets. Everything from health care to customer service to financial services and reputation management. But in spite of this, very few prospects seem to really understand what the technology will and won't do for them. Let's start with some basic questions to help you understand more about sentiment:

What does it mean to measure sentiment? How do I know if I really need to use it?

That depends entirely on the intentions of the user and the content being measured. If you're looking at customer review data (let's say hotel reviews in this case), then you may be interested in the sentiment of each review for the hotel. Were people happy with their stay at this hotel? This would be an example of document sentiment. It tells you whether the overall review was good or bad, but offers little insight into the details of each review. In this case, processing large amounts of data about the same topic works well.

If, however, you're reading a publication like Consumer Reports, then you're probably thinking more about how the different hotels stack up against one another. You'd like to do some comparison. In this case, the overall document sentiment wouldn't be of much help because the document will have some good and some bad content mixed within it. What the reader really cares about in this kind of content is the tone for each specific hotel that's being described in the document, and the reasons why. Were the beds comfy? How was the shower pressure? Is the staff friendly? In some cases the beds may have been comfortable but the staff rude, which can sway the sentiment of a review. Depending on what is important to you, you'd want to extract the sentiment of each entity. This is known as entity-level sentiment.
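To make the difference concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the tiny word list, the sentence-by-sentence attribution and the function names are hypothetical, and this is not how any particular sentiment engine works.

```python
import re

# Hypothetical toy lexicon (word -> tone weight); an assumption for illustration only.
LEXICON = {"comfy": 1, "comfortable": 1, "friendly": 1, "great": 1,
           "rude": -1, "dirty": -1, "weak": -1, "noisy": -1}


def score_text(text):
    """Sum the lexicon weights of the words found in a piece of text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def document_sentiment(text):
    """One score for the whole document."""
    return score_text(text)


def entity_sentiment(text, entities):
    """Attribute each sentence's tone to the entities named in that sentence."""
    scores = {entity: 0 for entity in entities}
    for sentence in re.split(r"[.!?]+", text):
        lowered = sentence.lower()
        for entity in entities:
            if entity.lower() in lowered:
                scores[entity] += score_text(sentence)
    return scores


# Made-up comparison text in the spirit of the hotel example.
review = ("The beds at Hotel Alpha were comfy and the staff friendly. "
          "Hotel Beta had rude staff and a noisy lobby.")

doc_score = document_sentiment(review)
per_hotel = entity_sentiment(review, ["Hotel Alpha", "Hotel Beta"])
print(doc_score, per_hotel)
```

In this made-up comparison the good and bad words cancel out at the document level, while the per-hotel scores still separate the two hotels.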

What really matters in sentiment analysis? Is it the accuracy or the automation?

Again, it depends on your needs and goals for using sentiment analysis. An example we often use where a technology-based automated solution really shines is in financial services, where the trends across a collection of stories are what users are most interested in. They care less about the accuracy of every document detail, and more about the sentiment across a corpus of data that needs to be processed quickly. Financial services is definitely one of the up and coming industrial uses of sentiment because the technology tends to perform better than humans in processing large collections of content.

Reputation Management is another industry where automated sentiment analysis shines bright, but where accuracy comes under more scrutiny. It could be said that automated sentiment analysis was born in this space, and was invented because of the amount of time people spent hand measuring the tone around products and brands. While Reputation Management is currently the biggest market for the technology, it's probably not the best example of accuracy. It's hard enough to get humans to agree with humans on the tone for a specific story, but to get people to agree with a computer is even harder.

I bring up these two contrasting uses because it's important for people to think about their specific needs and requirements before they jump into using any vendor's solution. Make sure the solution you're looking at is well-suited for the problem you're trying to solve. With more sentiment analysis offerings hitting the market, and after 6 years as a company processing unstructured text and watching online content take hold, it's interesting to see sentiment become something of a commodity. It challenges all the providers to do a better job in all aspects of the technology. However, it's a fact that analysis of good, bad and neutral isn't as easy as 1, 2, 3.
Ask for a proof of concept before making a decision and make sure the solution is right for you and your business.

Submitted by Jeff Catlin on Tue, 2009-09-08 04:00

Ever since the New York Times article about sentiment scoring, published a couple of weeks ago, there has been a pretty constant stream of people jumping in and demonizing automated sentiment or trying to peddle its eventual takeover of the free world. It felt a lot like political media coverage to me: lots of opinions, but very few of them taking an honest look at the real problems and real solutions. Kudos to Nathan Gilliatt for putting out the list of many of these posts (see the list here). It's obvious that we all have axes to grind and software or services to sell, but focusing on the accuracy of automated sentiment is the wrong place to go.

To remove any doubts, let me state for the record that a machine-based system will never score any random piece of social content as well as a human will. People simply have too much context in their brains, and there is no way a machine is going to match that. Given the preceding, one may ask whether I believe that automated sentiment is doomed to failure. The answer is NO. We need to use automated sentiment to do the things that you can't do with humans.

I'll illustrate my point by focusing on one of the recent posts about sentiment, Sentiment analysis for online content: Honest? from CyTRAP Labs. The post wasn't particularly favorable to automated sentiment, but was one of the first posts I've read that asked the right question... Is this story relevant? If you're monitoring social media sources, then what you really want to know is: What's happening that I need to worry about? Text analytics and automated sentiment are very good at answering these "trend spotting" questions. In fact, machines are an essential piece in providing trend spotting. Sentiment analysis has made rapid inroads in the financial services industry because users don't care about the tone of each story; they care about the effect of a bunch of stories on the market as a whole.
A number of our financial services customers are making lots of money trading equities based in part on sentiment trends, and that should say something about its validity.

If your job is to monitor a brand in social media, then the trends and patterns are what you should be worried about, and automated analysis is great for this. If an automated system is only 70% accurate, it's still going to get the overall trend (up or down) correct for a given brand, and then the humans should always step in and provide the detailed analysis of that trend, including the identification and correction of the posts where the machine got it wrong. Let automated sentiment point the way, but trust humans to provide the detailed analysis that requires a few neurons. Automated systems will never beat humans on a story by story basis, so let's stop worrying about that and use them to provide services that humans can't afford to do.
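The 70% point can be checked with a little arithmetic. The sketch below is a simplification with assumptions that are mine, not measurements: only two classes (positive and negative), and mistakes that are equally likely in either direction. The function name and the sample shares are hypothetical.

```python
def observed_positive_share(true_share, accuracy):
    """Expected share of items an imperfect scorer labels positive.

    Assumes two classes and symmetric errors: a truly positive item is
    labelled positive with probability `accuracy`, and a truly negative
    item is mislabelled positive with probability `1 - accuracy`.
    """
    return accuracy * true_share + (1 - accuracy) * (1 - true_share)


accuracy = 0.70  # the figure used in the post

# Made-up example: the true positive share for a brand falls from 60% to 45%.
last_month = observed_positive_share(0.60, accuracy)
this_month = observed_positive_share(0.45, accuracy)

trend = "down" if this_month < last_month else "up"
print(round(last_month, 2), round(this_month, 2), trend)
```

Under these assumptions the measured swing is smaller than the real one (15 points shrink to 6), but its direction is preserved for any accuracy above 50%. That is the sense in which an imperfect scorer can still point the way.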

Submitted by Christine Sierra on Wed, 2009-08-26 04:00

Learn how Search and Text Analytics fit together in the enterprise.

**Courtesy of Network World Video Library - NetworkWorld.tv

Submitted by Christine Sierra on Thu, 2009-08-20 04:00

Earlier this year Jeff outlined a few things you should consider when investigating a reputation management solution. Since we often get asked for our opinion on this topic, I thought it would be good to outline those "questions to ask" again.

1. Where does the content come from? Good analysis starts with the content. Please, please, please don't get wowed by the pretty pictures. Real insight comes from looking for patterns and trends in large and varied content sets. Make sure your vendor can tell you how they acquire their mainstream, blog, and social media content. Ask the hard questions about where their data comes from, are there any potential copyright issues that could alter access to this information in the future, and do they have any agreements in place to go after content from the likes of Facebook or MySpace.

2. Can they customize for your industry? Nowadays, it's not enough to just monitor the passing mentions of you or your competitors. Insight comes from digging deeper to figure out what people are commenting or worried about, not what your marketing folks think they *might* be worried about. Whether a solution is based on a search engine or a text analytics engine, make sure it can discover what's driving the discussion about your industry. You need to go beyond measuring the penetration of your marketing message, because after some analysis, that may not be what people online are talking about.

3. What's the sentiment of my brand? Sentiment: it's the new beige (yes, this is good for us, since we have a sentiment engine). We've noticed in the last 12 months that sentiment has become one of those checklist items in brand and social media monitoring. I suspect it has come about due to the economic ups and downs, and the ever increasing reach of consumer generated content. Companies have to know if they're getting trashed out in cyberspace, and because of the volumes, the only way to do this is with an automated sentiment engine. Your vendor may not use our engine, but whatever they use, make sure that they can measure sentiment at the item (company, brand, product) level. Measuring sentiment at the document level is fine, and may provide some of the needed insight, but if the content is comparing two brands, then you want to go beyond the document level and into the actual comparisons. And, yes, we do recommend humans play a role in the sentiment analysis process. Automation is an added benefit in the process.

4. Can I touch it? The first generation of reputation management systems tended to have large account management teams behind them to build out and manage customers' reports. The customers couldn't go in and adjust the reports themselves because the systems weren't exactly user-friendly. This is fine if you have deep pockets for all the services work, but in today's world that doesn't seem to be the norm. Many of the newer solutions that are available, or are being built, allow the customer to build and manage their own reports. This is no small undertaking, but it does provide users a cost effective way to gain the insight that something like Google Reader can't provide.

Naturally, selecting a provider is not as easy as asking these 4 simple questions. There are many other questions about integration, update frequency and test-driving the solution. But if you can answer these 4 to your satisfaction, then chances are the solution you're considering can at least help get you started with reputation management.

Submitted by Jeff Catlin on Wed, 2009-08-12 04:00

So, it's been a while since I penned a blog post, and in this case that's a good thing, because it's been a pretty busy summer. As I haven't blogged in a while, I thought it would be a good time for a "State of the State" sort of post, so without further delay...
Historically, Lexalytics has tried to cast a pretty wide net into the OEM world, but most of our success has come in the Reputation Management space. I'm happy to report that this appears to be changing. We're still doing very well in Rep Mgmt, but we are seeing an ever increasing percentage of our leads in other areas like financial services and Customer Satisfaction. The really good news on this is that it appears to be due to the maturing of the industry, and not some specific marketing program we're running. There are more and more prospects showing up at our door who have specific "text analytics" needs, and this bodes well for the future. In spite of a tough environment, it looks like we'll grow the business at least 10% this year, which in the new world order of "Flat is the new up" is a pretty solid performance.
On the technical side, we're rolling out a number of interesting new things this fall, the most important of which is a web services layer and SaaS version of our Salience engine for smaller companies that need some high end text processing capabilities but don't have the budgets to bring the engine in-house. We're using this SaaS service ourselves to roll out a new Excel plug-in that will bring lightweight text analytics (entities, themes, sentiment) to anyone with Excel, and at a price (<$100/month) that just about anyone can afford.

Submitted by Christine Sierra on Tue, 2009-07-21 04:00

Sentiment is usually categorized into three buckets: positive, negative and neutral. It often gets presented looking something like this:

 

Sounds pretty simple, right? If content has good words in it, then it's positive. And if the words aren't so nice...well, that can be bad. However, when applying sentiment to any set of content, there is always the chance that what you may think of as good could, in fact, be bad. How about when you automate that process?

When we're asked about the "absolute" of automated sentiment, we often use an example from one of our technology customers; they had content that our sentiment engine thought should have been tagged as negative, but was actually positive for them. In their case, they were applying sentiment to product-related content and during the analysis several of the documents included the words "Error Message". In a traditional sentiment situation, anything relating to an "error" would be considered negative, so the engine tagged it as such. After the results were presented, the analysts reviewing the results disagreed with the sentiment engine and concluded that the documents containing "Error Message" were positive. How could that be? Had our automated sentiment gotten it wrong? No. Our software was fine, but since this client believed it to be a good thing that an "Error Message" was thrown when their product failed, they thought of this content as positive. If nothing had been presented in product failure situations, then they would have believed it to be a negative thing. So, something perceived as bad by the software was in fact good to the client.

This is a rare instance, but we use this example to show that sentiment can be subjective, depending on the situation and the content being analyzed. And while automated sentiment helps to expedite the processing time, can be over 80% accurate, and is good in situations where you are weeding out the bulk of neutral content, it is often up to the individual company to dictate how to apply the spectrum of positive - neutral - negative.
If you are thinking about applying sentiment to the content you are analyzing, you should know that Lexalytics provides you with both a sentiment score and a confidence score. That is important because it allows you to determine where the good and bad thresholds fall in your world - AND - we let you know how confident we are about our assessment of the good, the bad, and the neutral. When considering sentiment solutions, be wary of the simple red, yellow, green methodology. Without some freedom to move those scales, you may find your analysis will be at the mercy of the technology and you may not always agree with the results.
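Here is a rough sketch of what "moving the scales" could look like in practice. It is not the Lexalytics API: the function name, the score range of -1 to 1, the confidence range of 0 to 1 and the default cutoffs are all assumptions made up for illustration.

```python
def bucket(score, confidence, positive_at=0.2, negative_at=-0.2, min_confidence=0.5):
    """Turn a sentiment score and a confidence score into a label.

    All cutoffs are caller-adjustable; the defaults are hypothetical.
    Items scored with low confidence are set aside for a human to review.
    """
    if confidence < min_confidence:
        return "needs_review"
    if score >= positive_at:
        return "positive"
    if score <= negative_at:
        return "negative"
    return "neutral"


# The same item can land in different buckets depending on where you set the bar.
print(bucket(0.35, 0.9))                   # default cutoffs
print(bucket(0.35, 0.9, positive_at=0.5))  # a stricter idea of "good"
print(bucket(-0.6, 0.3))                   # strong score, weak confidence
```

The point of the design is that the thresholds belong to the person doing the analysis, not to the engine. A fixed red, yellow, green display bakes in one set of cutoffs for everyone.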

Submitted by Christine Sierra on Wed, 2009-07-08 04:00

Jeff Catlin provided ZDNet's Jennifer Leggio with a guest post on why companies need to be thinking about Twitter speak and how it can be analyzed. Check it out here.

Submitted by Christine Sierra on Wed, 2009-07-01 04:00

In a former life, I worked for Thomson Financial in Boston - First Call to be exact. Over the years there, I learned to enjoy the ebb and flow of earnings season - watching the markets move based on quantitative analysis. There was little analysis of research and text back then. Boy, things have changed. Today, Lexalytics is working with the grown-up version of Thomson Financial, now known as Thomson Reuters. Our software is used in their algorithmic trading platform and incorporates sentiment into the trading process. The sentiment derived from the aggregated content flowing through a trading system has proven to be extremely useful in that environment.

We also recently announced our relationship with First Coverage, which has a web service called "The Community". It essentially provides a collaboration platform for the buy-side and sell-side to help money managers filter out the noise and focus on the data that matter most to their holdings. It's very cool, especially if you are a quant-type, but more importantly it reinforces the shift from strictly numbers-driven analysis and trading to a healthy mix of numbers and text analysis.

We've known for a while that information stored in unstructured data can be helpful in monitoring corporate brands and reputation. In the PR/Marketing world there is pressure to be "exact" regarding sentiment for every document, but as Jeff noted last week in a Q&A session with PRWeek: "My belief is that over time the PR industry will begin to look at the aggregate statistics and when they do so, automated sentiment will be a perfect fit because it's consistent and accurate across large blocks of content." In the world of market trading and research, text and sentiment analysis technology provides the insight and measurement capabilities needed by financial services organizations to discover, react, and respond to market opinions. This is just one of the expanding markets we've seen in the past year embracing text analytics capabilities.

Submitted by Tim Mohler on Thu, 2009-06-18 04:00

Recently I had one of those unfortunate circumstances where a customer of ours was unhappy with the results of the sentiment produced by Salience and contacted us to help tune the engine. The document set was a couple hundred articles mentioning a particular company. About 25% of the mentions scored by Salience as negative were rated by humans as mostly neutral in tone.
As I dug into the articles, it became clear that most of the articles where Salience and the humans disagreed were articles where the company in question was co-mentioned with lots of disappointing economic news - "credit crisis", "in a poor economic climate" and the like were frequent phrases. In addition, most of those articles had only 1 or 2 mentions of the company in question. Although there was often little grammatical connection between the negative phrases and the company, without any positive elements, the company score wound up weakly negative.

It seemed clear that this company was the victim of guilt by association. It was the "correct" result from an engine perspective, but not from a human one. What we decided to do was to introduce a scoring step that took into account the number of mentions and the number of scoring phrases - an output from Salience called "evidence." Essentially, what the client decided to do was to score as neutral any article that contained only a single mention of the company and only a single scoring phrase - regardless of what Salience returned as the score.
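The rule is simple enough to sketch in a few lines of Python. The function and parameter names below are hypothetical (this is not the Salience API), and treating 0.0 as the neutral score is an assumption; the counts stand in for the mention count and the "evidence" output described above.

```python
NEUTRAL = 0.0  # assumed representation of a neutral score


def adjust_score(engine_score, mentions, scoring_phrases):
    """Override the engine's score with neutral for passing mentions.

    An article with at most one mention of the company and at most one
    scoring phrase gives the engine too little to go on, so it is scored
    neutral regardless of what the engine returned.
    """
    if mentions <= 1 and scoring_phrases <= 1:
        return NEUTRAL
    return engine_score


print(adjust_score(-0.3, mentions=1, scoring_phrases=1))  # passing mention
print(adjust_score(-0.3, mentions=4, scoring_phrases=6))  # enough evidence, score kept
```

Because the check runs after the engine has scored the article, it can be tuned or removed without touching the engine itself.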

The idea is that, at bottom, machine scoring of tone is a statistical guess and when there is little to base the guess on, you're likely to be wrong. Humans react poorly to something scored as negative in these cases but are more accepting of a neutral score for these "passing mentions" even if they themselves might score it as a positive or negative. Another way of looking at it is that humans accept that sentiment is a continuum and what they really don't like is the machine to be a polar opposite.

After we did that (and a few other steps similar in nature) we managed to get the document set into quite close agreement with a human. What about other documents, though? Had we succumbed to the dreaded "overfitting" problem? To determine that, we assembled another group of several hundred documents picked at random from the rest of the set, had the same group of humans score them, and compared their scores with those from the new algorithm. We found that in fact the agreement with humans went up by more than 10% compared to the earlier scoring algorithm. Also important was the near elimination of "polar opposite" scores.
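For anyone who wants to run the same kind of check, here is one way to measure agreement and count polar opposites. It is a sketch only: the function name is hypothetical, and the labels are made-up toy data, not the customer's documents or our actual results.

```python
def compare(machine_labels, human_labels):
    """Return (agreement rate, number of polar-opposite pairs).

    A pair is a polar opposite when one side says "positive" and the
    other says "negative".
    """
    if not machine_labels or len(machine_labels) != len(human_labels):
        raise ValueError("label lists must be non-empty and the same length")
    pairs = list(zip(machine_labels, human_labels))
    agree = sum(1 for machine, human in pairs if machine == human)
    opposite = sum(1 for machine, human in pairs
                   if {machine, human} == {"positive", "negative"})
    return agree / len(pairs), opposite


# Toy held-out sample: human labels, then the old and new algorithm outputs.
human = ["neutral", "neutral", "positive", "negative", "positive"]
old = ["negative", "negative", "positive", "negative", "negative"]
new = ["neutral", "neutral", "positive", "negative", "neutral"]

print(compare(old, human))
print(compare(new, human))
```

Scoring a fresh sample that played no part in the tuning is what guards against overfitting. Tracking polar opposites separately matters because, as noted above, those are the disagreements people object to most.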

Short Documents

But what about Twitter, you might ask, or other kinds of short documents where you probably will only ever get a single mention? Should we score all those as neutral regardless? We weren't scoring Twitter and the like in this instance, but it's a valid criticism. My instinct is that with Twitter type content the original problem would not occur often. In other words, with only 140 characters, you won't have a lot of extraneous sentiment from poor economic news cluttering up the message.