LexaBlog

Our Sentiment about Text Analytics and Social Media

Submitted by Jeff Catlin on Tue, 2012-01-24 12:47

So I watched an interesting video on some of the automated editorial work that is happening on major sites like Google and Facebook (see the video here: http://www.youtube.com/watch?v=bOE1HFEL8XA). What’s interesting about this talk isn’t the idea that these sites are making decisions about what content you will and won’t see; it’s that they aren’t even telling their users that these decisions are being made.

I found it fascinating and kind of scary that a Facebook newsfeed would automatically prune out all of this guy’s “Republican” content in favor of his “Democratic” content simply because he clicked on the Democratic links more frequently. I already worry that as a society we spend too much time seeking out opinions like our own and far too little time seeking out dissenting opinions.

Now we have the technology helping us down this road.

As a company that provides technology that is often used for this sort of automated editorial work, I feel it’s important to examine the effects of our work to ensure that we’re not doing more harm than good. Summarization and Dominant Ideas are the sort of features that are absolutely required in the world we live in. There is simply too much information flowing by for us to read all of it, so using technology to reduce the stream to a manageable volume isn’t just convenient, it’s absolutely necessary.

The trick is to make sure people understand the potential negatives so that they can make intelligent decisions about how to acquire and digest content.

Submitted by Seth Redmore on Mon, 2011-11-07 07:12

In honor of the Sentiment Analysis Symposium this week in San Francisco (you are going to be there, right?), here's a summary of best practices for tuning sentiment.   These will work for any sentiment analysis system, but you should use ours.

Because it's the best, 'natch.

1) 2 datasets:  Gather a set of documents and split it in half. 

2) 2 people:  Have 2 people tag each dataset for sentiment, and have 2 people participate in the process of scoring sentiment-bearing phrases. That way you can mitigate the risk of slanting the tuning too much towards one person's biases.

3) 2 tests:  After re-modelling or modifying the scores of sentiment-bearing phrases, test against the half of your dataset that you did not use for tuning.  Check to see how well it agrees with the tagging that the two of you assigned.  Then, for your second test, run it against the first half to make sure that you didn't make things worse.  You probably didn't, but "mistakes can be made".
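If it helps to see the three steps as code, here is a minimal sketch in Python. None of this is our API: the function names, the "pos"/"neg"/"neu" labels, the toy phrase-weight scorer, and the choice to score only against documents where both taggers agree are all assumptions made for illustration.

```python
import random


def split_in_half(docs, seed=0):
    """Shuffle a copy of docs and return (tuning_half, held_out_half)."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def agreement(labels_a, labels_b):
    """Fraction of documents on which the two taggers chose the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both taggers must label the same documents")
    if not labels_a:
        return 0.0
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def consensus(labels_a, labels_b):
    """Map document index -> label, keeping only documents both taggers agree on.

    Dropping the disagreements is an assumption of this sketch, not a rule.
    """
    return {i: a for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a == b}


def make_scorer(phrase_weights):
    """Build a toy scorer: sum the weights of the phrases found in the text."""
    def score(text):
        lowered = text.lower()
        total = sum(w for phrase, w in phrase_weights.items() if phrase in lowered)
        if total > 0:
            return "pos"
        if total < 0:
            return "neg"
        return "neu"
    return score


def accuracy(score_fn, docs, gold):
    """Fraction of consensus-labelled documents the scorer gets right."""
    if not gold:
        return 0.0
    return sum(score_fn(docs[i]) == label for i, label in gold.items()) / len(gold)
```

In use, you would tune the phrase weights while looking only at the first half, call accuracy() on the held-out half for the first test, and then call it again on the first half to confirm the tuning did not make things worse. A low agreement() value between the two taggers is a hint that the tagging guidelines need tightening before any tuning starts.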

Hope to see you in San Francisco in a few days...

Happy Sentimenting!

Submitted by Seth Redmore on Tue, 2011-10-25 18:18

In about 6 minutes, I'll show you how easy it is to configure a concept topic that classifies documents by two different classifiers:

1) Is a country mentioned that is in the Middle East?
2) Are there weapons mentioned?  

(Watch in fullscreen unless you have bionic vision and can see what I'm typing in that tiny window below.)
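For readers who want the shape of the two questions above in code, here is a rough Python sketch. To be loud about it: this is a plain keyword lookup standing in for the idea, not how concept topics work and not our API. The word lists and the classify() name are made up for illustration, and the lists are deliberately incomplete.

```python
import re

# Illustrative, incomplete word lists (assumptions, not a real taxonomy).
MIDDLE_EAST = {"iran", "iraq", "israel", "jordan", "lebanon", "syria", "yemen"}
WEAPONS = {"rifle", "missile", "grenade", "artillery", "mortar"}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def classify(text):
    """Answer both questions for one document with simple set intersection."""
    words = tokens(text)
    return {
        "middle_east_country": bool(words & MIDDLE_EAST),
        "weapons": bool(words & WEAPONS),
    }
```

A lookup like this only matches single words that someone typed into the list by hand, so it misses multi-word names and anything not enumerated. Maintaining lists like these is exactly the chore the demo is meant to spare you.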

Submitted by Seth Redmore on Thu, 2011-10-06 22:07

I got kinda dissed in the comments for being too vague in my last blog post, with a demand for a video/demo.

We have heard and are responding!

We're working on more, but check out this quick demo of concept topics.

Check this out full-screen so you can see what I'm typing.

More to come.  (To read more on the ins and outs of categorization, snag the categorization/classification/concepts/tags whitepaper).

Submitted by Seth Redmore on Tue, 2011-10-04 22:18

Yes, yes, we've been very quiet.  That's because we've been working on really cool stuff, and now it's released to the marketplace.

Salience Five is our most important release since our last release!  :)

Seriously, though, we've introduced some groundbreaking new features to help you understand what's being said in all that text.

First and foremost, we are the very first text analytics company to truly harness the power of Wikipedia™ in our algorithms.  We've created the world's first Concept Matrix, a distillation of 640,000 Wikipedia™ articles into 1.1 million concepts that we can understand, compare, and extract, using the 56 million links we've discovered between them in Wikipedia.

So what?  Concept Topics.  That's what.  Say goodbye to tedious and expensive taxonomy management when all you want to do is categorize all Tweets mentioning any kind of food, or tag any article about "crime", or classify any article about "natural disasters".  Concept Topics are going to change how you categorize content. 

So what else? Document Collection Processing.  We're moving beyond things like clustering to provide meaningful analysis that leverages the semantic and conceptual similarities inherent in a collection of related documents.

Which brings us to:  Salience Facets and Concept-based Facet Rollups.  Salience Facets are new to Salience Five.  These are not "search facets", even though they could be used as search facets - they are more than clustering-based search facets.   Salience Facets represent a completely new way of extracting meaning from text.

Whew.  That's a lot of new stuff.  We'll be giving interesting examples of how this can be used over the coming weeks.

Submitted by Jeff Catlin on Wed, 2011-03-30 19:38

Exciting news in the social media monitoring world today with the acquisition of Radian6 by Salesforce.com for some nice money. The move makes all the sense in the world as it gives Salesforce.com a reach into the Marketing/PR world rather than just sales. Radian6 was already the most widely recognized name in the social media monitoring space, and as an independent operator was a logical choice for acquisition. Buzz in the industry has been that they would be acquired sometime this year, but I never expected Salesforce.com to be the purchaser. Going forward I suspect that this will change the SMM marketplace quite a bit.

There is now an 800 lb gorilla in the game, with the money and brand to really put pressure on all of the other SMM vendors. It’s going to be interesting to watch this space over the next 6 months and see if there are any other big mergers or acquisitions.

Hitting closer to home, this is important for Lexalytics because Radian6 is one of our newer customers, and we’re obviously very interested in seeing the relationship succeed and expand. It’s an exciting world out there, and this acquisition should ramp up interest in social media monitoring even more. If a company like Salesforce.com is willing to pay $318M for a social media monitoring company, then they clearly believe that global brands have figured out the need to watch social media, and that need will translate into more business for them.

Submitted by Seth Redmore on Tue, 2011-03-08 15:43

One of our partners up in the Great White North has taken the leap and is the first to roll out our French language pack. To really sum up what this brings to their customers: Now, an English-only speaker can have access to rigorous metrics from the French language content of interest. That's pretty cool! Link to MediaVantage release

Submitted by Seth Redmore on Tue, 2011-02-22 23:37

Quick quiz - how many of the companies in this article rely on Lexalytics for sentiment analysis technology?

Computerworld: Sentiment Analysis Comes of Age

Answer: 3/5 technology providers (including ourselves, duh), and 3/6 of the social media monitoring companies. Awesome-tastic.

Oh - they're: Endeca, Cymfony, evolve24/Maritz, Radian6, DNA13 (now MediaVantage)

Submitted by Seth Redmore on Tue, 2011-01-18 22:36

Our buddies over at MediaVantage (http://www.mediavantage.com) have just announced some significant improvements to their already really good service. Along with improvements to their content stream for social media (including both more stuff and better filtering) and an improved search interface, they've launched some stuff that's near and dear to my heart - automated tonality scoring. That would be where our stuff comes in. So, check out MediaVantage, and know that you're using the best sentiment scoring system available.

Here's the release. Here's a video about the new release.

Submitted by Seth Redmore on Thu, 2010-12-23 19:46

I came across the following article today, our last day before the Christmas holiday break. If ever an article deserved a blog post, it's this one.

Expert: Roseville Galleria paid no attention to social media for flash mob

The article talks about a flash mob that occurred in a mall in Roseville, CA. This resulted in overloading in the food court, some "popping sounds", and the "floor shifting". There is a certain amount of question as to the "purity" of the flash behind this flash mob, given that the mall supposedly issued a press release about it 3 days ahead of time, along with other things. The author attempts to make the case that by monitoring social media, the mall could have gotten a handle on how many people were going to come and Do Something About It.

Ok, I agree that monitoring for buzz is a useful tool to prepare yourself, and social media is a great way to keep your finger on the pulse. And, yes, if you were able to compare the amount of buzz this was getting to, say, some other event that you'd announced, you could plan for extra security and such. However, watching the pulse of social media does NOTHING for you if you aren't prepared in other operational ways to deal with situations.

I think the question that should be asked is "Why wasn't the mall equipped to deal with a larger than expected crowd in ways other than 'evacuate everyone'?" I would suggest that they first need to really review their emergency preparedness plans. And perhaps part of a better process is a better listening strategy. I bet they listen to the weather, and I also bet that their parking lots are free of snow for the shoppers. That's because the operational weight is set behind the folks keeping the parking lot open for people to park. They need similar thinking for other aspects of their operations.

We can give you really, really great data. We can help you get visibility into areas of your business that you haven't previously seen into. However, *you should have at least some ideas about what you plan to do with the results before engaging in any kind of monitoring/listening program*. Listening should not be a thing in and of itself. Even though listening is important, if all you do is listen, you aren't doing anything about what you're hearing.