At Lexalytics, we try to follow the news in all of the industries that affect and are affected by text analytics, chief among them social media monitoring. An interesting bit of news came out last week from Radian6: they reported that last year was their first profitable year. This is bigger news than one might think, because there are so many players in the social media monitoring space that I have always worried none of them could garner enough market share to become viable, money-making companies. Reaching this milestone says something about how Radian6 runs their business, but it also says something about the space in general. Social media and all the things it's used for:

- Brand awareness
- Disaster discovery
- Advertising & customer engagement

have quickly jumped the fence from novelties to mainstream features that almost all large companies worry about (and some mid-sized ones as well). Not only are companies like Radian6 prospering, but established players like Facebook are growing their traffic numbers to the point that they are seeing more page views than Google. Users are forcing companies to listen to them, and the smart companies are engaging monitoring firms like Radian6, ScoutLabs and others to do that listening. Some may wonder why this transformation has happened so quickly (I'd say in less than 2 years). The answer is that pretty much everybody wins, so why wouldn't social media become a de facto part of life for consumers and businesses? The consumers get more of a say, so they're happy, and the companies that choose to listen gain a direct connection to their customers, and thereby a means to interact and, hopefully, increase sales and revenue.
Now for something a bit out of the norm for me... a post about the enterprise search market. I don't know if everyone has noticed, but the enterprise search market has changed dramatically in the last 18 months or so. For example, when Microsoft bought FAST a couple of years ago, I expected that they would pull FAST into Microsoft and adjust its business model, and so they have. FAST is much more focused on SharePoint these days and will, over time, cease supporting Linux as an operating system. If you're Microsoft this is a pretty reasonable move, as SharePoint has a massive user base and has historically been weak at search. Unfortunately, this isn't a good thing for FAST's biggest clients, who can't realistically migrate to SharePoint. So, what are these users to do? Autonomy and Endeca are the obvious choices, but Autonomy has been moving away from the enterprise search space for the last few years, and Endeca, which has historically focused on eCommerce applications, now appears to be moving toward enterprise search but isn't there yet. So, where does this leave all the companies with massive content search problems? At the moment there doesn't seem to be a clear answer, but as near as I can tell there are two candidates vying for the crown. One, as I alluded to above, is Endeca, which seems to realize that the FAST acquisition has opened up the high-end market. The other is Lucene/SOLR, which is making noticeable inroads in the enterprise search world. It's too early to tell how all this will play out, but my hope is that Endeca makes a full-on play for the high-end enterprise market by continuing to upgrade their core search offering [DISCLOSURE: Endeca is a reseller of Lexalytics software, so we have a vested interest in their success].
In the other corner is Lucene/SOLR, which is a great engine and a perfect solution for companies willing to spend the time and money on a fully customized solution. However, until it reaches critical mass, there will be companies that are uncomfortable with it because it's open source. Lucene/SOLR may well overcome this stigma and become the engine that takes over the high-end enterprise market, but it'll be a while before we know for sure. I'll keep an eye on it and post on this topic again in 4 or 5 months to see if the landscape has changed, but for now keep your eyes on Endeca and Lucene.
You know you've hit the right spot when Microsoft starts telling your story. I just watched an interesting TED webcast by Gary Flake of Microsoft explaining Pivot, an application the labs group has developed: http://blogs.msdn.com/searchblog/archive/2010/03/05/microsoft-pivot-future-of-browsing-searching.aspx The basic premise is that we shouldn't fear the explosion of information that's occurring; we should embrace it as an opportunity to look at the big picture. The key to the demonstration is that by looking at the aggregate of information and what that content has in common, we can spot patterns and trends that would otherwise be invisible. The only thing missing from the presentation is the recognition that figuring out the right ways to sort this sea of information is the tricky piece; how do I know that I'll see interesting things if I sort by "Tour de France" after I see a cover story on Lance Armstrong? They may have ignored it because they haven't got that piece figured out yet, but the truth is that text analytics is a perfect tool for exactly that purpose. The best way to swim around in a big sea of content is to give users a high-level idea of what's in the content and then adjust the sort columns dynamically based on the content they're looking at. For example, if I search for "car recalls" I should be presented with the best ways to sift into the data from all of the recall stories, so I'd expect sort options like "Toyota", "acceleration" or "Congress". Everyone realizes that content volumes aren't going down, so it stands to reason that this sort of application will have to become mainstream, because simple search just won't narrow down the choices enough. It's great to see Microsoft pitching this story, and hopefully Pivot will become a bit more generic over time, because it's a really nice-looking application with a lot of potential.
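To make the idea concrete, here's a minimal sketch of how dynamically chosen sort options might be derived: count the most common distinctive terms across a result set and offer the top few as facets. This is not how Pivot works internally; the term extraction here is a simple frequency count standing in for real text analytics, and the sample "car recalls" snippets are invented.

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "for", "and", "to", "on", "is", "are"}

def suggest_facets(documents, top_n=3):
    """Count non-stopword terms across a result set and return the
    most frequent ones as candidate sort facets."""
    counts = Counter()
    for doc in documents:
        for word in doc.lower().replace(",", " ").split():
            if word not in STOPWORDS and len(word) > 2:
                counts[word] += 1
    return [term for term, _ in counts.most_common(top_n)]

# Hypothetical snippets from a "car recalls" result set
results = [
    "Toyota recall widens as acceleration complaints mount",
    "Congress questions Toyota over acceleration recall",
    "Toyota executives testify before Congress on recalls",
]

print(suggest_facets(results))  # → ['toyota', 'recall', 'acceleration']
```

A production system would use entity and theme extraction rather than raw word counts, but the principle is the same: the facets come out of the content itself rather than a fixed schema.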
I am fascinated by the naysayers who repeatedly claim that automated sentiment software, web sites and services are garbage. According to them, the software almost always gets it wrong; it doesn't interpret sarcasm; it's not 100% accurate. The school of thought is that you can't extract true sentiment unless you have a staff of humans reading and tagging every potential story about your company as negative, neutral or positive. Really? I mean, can you honestly say that automated sentiment is useless and wrong ALL of the time? There are certainly some examples out there of solutions that are far from useful, but is that a reflection of the entire industry? I started to think of some other automated things that I rely on in my daily life and wondered if they are always perfect, if they always interpret things the way I do. Or, perhaps, they make my job easier so that, as a human, I can focus on other tasks that matter and filter out items that would otherwise take up time and distract me. Here's what I came up with:
Spam filter software: Wouldn't I love this to be 100% accurate and catch every spam email that hits my inbox. But while it doesn't always filter out every offensive message, it is much more efficient than opening and reading every email to determine for myself whether it's legit.
Search engines: How come when I type "best pizza in Boston" into Google Search I get a list of 7 establishments scattered around the city? Can't there only be one "best"? Doesn't it understand that? But luckily I agree with all the yummy choices it returns so I'd say it did a pretty good job.
Spell checker: I make typos. All the time. Every day. Luckily, there are these little squiggly lines that show up under my misspelled words to let me know there may be something wrong. Sometimes the suggestion the software makes is wrong, or the word I wanted isn't even listed as a suggestion. However, most of the time, when I type "teh" and mean "the", it gets caught. Obviously I'm having fun with this post, and I certainly don't mean to imply that any of the above need to be perfect; I'm happy they work most of the time. But the bigger picture is that these examples all have to do with words - how we express them, how we input them, and how we receive them. If you are vocal with your expectation that text-based sentiment software needs to be accurate and right 100% of the time, you are bound to be disappointed more often than not. If you are processing a lot of information and need to streamline the process by concentrating on the extremes, then explore what automated systems can do for you. It seems easier at times to focus on what they can't do instead of what they can. And don't believe claims of absolute accuracy, especially with sentiment and text analytics. Computers can only process so much from text, and text is made up of typed words; like the spell checker example above, typed words are by no means perfect. At least mine aren't.
The top 10 companies, with the sentiment scores, from Pogue:
| Amazon Com Inc | 2 | 10 | 9 | 21 |
Somebody doesn't like Verizon, eh? This raises the question of which companies they tend to talk about the most… Here it is, the top 20: None of these come as a surprise; in fact, none of the top 50 really come as a surprise. Verizon has by far the highest percentage of negative commentary out of all of the companies mentioned. Enough of that. Just what sorts of things are they discussing in the blogs? By turning off companies and looking at a list of top themes, we get the following… The top 10 themes (aggregated) are: Let's see how they differ by journalist. Mossberg's themes (stacked sentiment): Clearly, he's happy about the direction that storage is taking. Pogue's themes (stacked sentiment): No strong sentiment with any of his themes, but the man believes in the power of the phone. In the last bit of this analysis, let's take a look at the themes that are associated with Apple (for both journalists). You can see that the connections are strongest between the concepts of "phone", "software", and "screen" (and, oh boy, doesn't Apple always have nice screens on their stuff?). For Mossberg, the picture changes slightly: there's still a tight connection to software, but the connection to phone is weaker (not surprising). What's interesting is that he seems to be writing a lot about the connection between Apple and music (hence the tight connection to the theme of "music"). And there's that storage thing again ("hard disk"). And here's Pogue's: in Pogue's connection map with Apple, you can see the prevalence of phone and phone-related themes (cellphone, phone call, cell), and an almost complete absence of things related to music. One interesting thing to note is that the theme "music play" – probably "music player" – has a lot of negative in both maps, indicating some issues there. So, why are they Fanboiz of Apple's? I don't know. I do know that I'm sitting here writing this on an 8-core Mac Pro with a 30″ studio monitor and a 17″ MacBook Pro open next to me, so I don't have a lot of room to make fun of anybody. I do wonder if Apple is really such an important force in personal technology that they deserve mention in 25% of their articles, though. I will probably look at some geekier (and I mean that with all the love in my heart) tech sites like Engadget and Gizmodo next, and we'll see if a similar proportion occurs there.
The Super Bowl brings inevitable rehashing and armchair quarterbacking at all the advertising agencies about how they could have done better. There are statistics and measurements galore, from Nielsen to USA Today, about popularity and viewing. I was curious about what people were saying about the ads, so I pointed Lexascope at the blogosphere and news feeds, and below is what it told me. We didn't do a big scientific-sounding study with lots of important-seeming partners; we just snagged a bunch of blog and news content and let Lexascope read it and tell us what's up. Surprisingly enough, given the huffing and puffing around Tim Tebow's "pro-life" ad, Google's ad turned out the most pundits in the blogosphere and got the most news coverage. The top 3 were: Google's "Parisian Romance", Focus on the Family's "Tim Tebow", and Audi's "Green Police". I'm not saying these were the best, just that they generated the most conversation. Google's ad has generated a number of parodies - perhaps this is the real genius of the spot, in that it is going viral in a very different way than normal: viral with parodies, not necessarily with the original spot. Here's one of them. Here's the connection map:
A few very interesting things come up.
The USA Today "Super Bowl Ad Meter" showed that viewers' favorite ad was the Snickers ad. My question is this: What's better - to be rated as the viewers' favorite ad, or to generate the most discussion? This, of course, is where the art and science of PR mix together; it probably depends on the tone of the conversation and the folks you were trying to reach.
Jeff recently shared his thoughts on Text Analytics Market Growth with Seth Grimes for his report on B-Eye Network: Text Analytics Opportunities and Challenges for 2010 (free registration required). It's a well-written report, outlining the thoughts and expectations for 2010 from industry leaders and innovators in the Text Analytics industry. Here is Jeff's excerpt, providing some insight on how text analytics and sentiment will play out in 2010: Market Growth Lexalytics CEO Jeff Catlin affirms other respondents' themes. He sees search as a particular growth area and makes other points regarding text-analytics market growth:
• Text analytics [TA] will become a mainstream feature set in enterprise search applications (though not by name). We've seen a steady march toward this in 2009, and it's most notable in how accepted TA features are by the general public. When I'm at a party now and tell someone what we do, they say "Oh yeah, I read something about that sort of stuff last month" as opposed to the "Huh???" that I used to get. The effect of this is that there are a lot more opportunities for TA in enterprise applications, and I suspect it will mean that one or two of the players may get picked up by a big company.
• Sentiment will complete its transition to a "checklist" feature that everyone who works in this space will have to provide. All of the vendors (big and small) will claim to have sentiment. The consumers of this technology will also get a bit more educated - we're seeing this in RFP requests for particular capabilities of sentiment - which will help separate the wheat from the chaff. Unfortunately for us, sentiment won't be a totally differentiating feature that you can hang a business on anymore, as there will be lots of competition on the sentiment front.
• The [differentiation between] larger TA players and the niche players will become even more obvious. The bigger players will integrate a number of useful and useable semantic features into their engines which will help with things like ad hoc classification, concept roll-up, and relationship [extraction].
• On the business side, we expect 2010 to be a "Home Run" year for all the TA vendors with growth rates of 75% to 200% not out of the norm. This is partly due to the mainstreaming of the technology, which is opening up a lot of additional verticals.
About a year ago, our CTO Mike Marshall did some accuracy testing on sentiment using our software. This wasn't so much to showcase Lexalytics capabilities as it was to show that accuracy using automated sentiment can be helpful in the business process if done correctly.
One thing we do know for sure is that computers don't change their minds about the sentiment of a given piece of text. If you run the same piece of text through the software 100 times, it will come back with the same results every time. Humans, on the other hand, have the capacity to change their minds - and to disagree with each other - on the same piece of text. But that's okay. At Lexalytics we've never suggested you take human analysis out of the equation when it comes to analyzing unstructured content. In fact, our hope has always been to help the humans be more productive. The goal is to remove the neutral content, so the focus can be on the extremes within the content - the really positive or the really negative. I was recently surprised by a statement from Forrester Principal Analyst Suresh Vital that "in talking to clients who have deployed some form of sentiment analysis, accuracy rests at about 50 percent." If this were true in our client base, we'd sadly be out of business. I hope that as more and more companies enter the sentiment analysis arena, they continue to test and retest their models. Below is Mike's analysis from earlier in 2009:
Experience has also shown us that human analysts tend to agree about 80% of the time, which means that you are always going to find documents where you disagree with the machine. Having said all that, customers still like to be told a baseline number; it's human nature, after all, to want to know how something will perform. So I thought I would do a little test using the new model-based system on a known set of data. As recommended on the Text Analytics mailing list, I used the Movie Review Data put together by Pang and Lee for their various sentiment papers. This data consists of 2000 documents (1000 positive, 1000 negative), and I sliced it into a training set of 1800 documents (900 positive and 900 negative) and a test set of the remaining 200. It took about 45 seconds to train the model, and then I ran the test set against it (using a quick PHP script). Bearing in mind that this is still experimental and that we plan to make more tweaks to the model, I was pleasantly surprised (ok, I was more than pleasantly surprised) at the results. Our overall accuracy was 81.5%, with 81 of the positive documents correctly identified and 82 of the negative ones. This is right in the magic space for human agreement. For fun, I then ran the same 200 test documents against our phrase-based sentiment system, expecting a far lower score, but again we performed better than I expected, scoring 70.5% accuracy. With a domain-specific dictionary I'm sure that score could be pushed up towards 80% as well. So what does all that tell us? It tells us that for specific domain sets you can get very high accuracy levels, though if you ran, say, financial content against the movie-trained database, the results would be far different. It also tells us that the phrase-based sentiment technique produces good results even in its base state against a wide range of content sources (we normally process news-related data, after all).
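For readers who want to try the same train/hold-out/score methodology themselves, here's a toy sketch in Python. To be clear, this is not Lexalytics' model (the post doesn't describe its internals); it's a minimal naive Bayes classifier with add-one smoothing, trained on a few invented sentences standing in for the 1,800 training reviews and scored on a tiny held-out set standing in for the remaining 200.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs. Returns per-label word counts
    and a count of documents per label (for the priors)."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    label_counts = Counter()
    for text, label in docs:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training set standing in for the 1,800 movie reviews
train_docs = [
    ("a wonderful moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("dull plot and terrible acting", "neg"),
    ("a boring terrible mess", "neg"),
]
# Invented held-out set standing in for the remaining 200
test_docs = [
    ("a great film", "pos"),
    ("terrible boring plot", "neg"),
]

wc, lc = train(train_docs)
correct = sum(classify(text, wc, lc) == gold for text, gold in test_docs)
print(f"accuracy: {correct / len(test_docs):.1%}")  # prints "accuracy: 100.0%"
```

The point of the sketch is the procedure, not the score: hold out data the model never saw, score it, and remember that a number earned on movie reviews says little about how the same model would fare on financial content.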
So, would you agree?