“Is Social Media Worthy of Text Analytics”: A Response

I happened upon (well, really, was fed via LinkedIn) a blog post by Tom Anderson over at OdinText.  I’ve seen some of his stuff before, and he seems like a reasonable gentleman.

I was kinda surprised by this post, “Is Social Media Worthy of Text Analytics”, and thought it would be worth responding to.

To outline what he’s saying (it’s short, go check it out), it comes down to this:

  • Coca Cola found that they can’t use social media to predict short term revenue.
  • Twitter is lagging, not leading
  • Not many people tweet, so, you’re getting a distorted sample
  • And those that are tweeting are trying to sell you on their expertise in managing social media, because, really – who tweets about Coke?  To wit:  “The fact that Twitter even scores as many mentions as it does for products like “Coca-Cola”, which most regular consumers would be unlikely to ever think about any given week, is that there are so many want to be social media marketing guru’s on Twitter and blogs trying to analyze others marketing campaigns – further proving what a peculiar sample blogs and twitter is.”

Well, I haven’t read the original source about how they were trying to predict revenue, so, I can’t really comment on that first bullet.  I’m not sure that using it to “predict short term revenue” is as interesting as using Twitter to “find places and events at which people are drinking coke” and market to those folks.

I disagree with the second bullet – it really depends on what you’re looking at.  If you’re looking for a reaction, sure, it’s obviously lagging.  But what if you’re looking for second-order or future effects (like people talking about what they’re going to do this weekend)?  Brand mentions might be mostly lagging, but I’m not even sure about that.

I totally agree with the third bullet – Twitter is a self-selecting group with a large set of biases, I’m sure.

The fourth bullet was the one that I took real exception to.  So, here’s what I did – I collected 24 hours’ worth of English tweets about Coke or Coca Cola, using the system over at Datasift.

26k tweets.  Ok, it was more like 23 hours, but I was impatient and kinda lazy, and just wanted to do this.

Let’s look at what people are talking about… This is a really quick and dirty look at the top themes (important noun phrases), and the sentiment of the themes themselves over time.  This is completely unfiltered.  Color is sentiment, size is number of tweets, and no, you don’t get a legend because you’ve been very bad.

overall themes from coke

I’m not seeing anything in there about marketing.  Even delving into the verbatims (so to speak) doesn’t show much about marketing or social media monitoring (except for those “crowdsourcing” and “photo booth” tie-up bits). But I do see a lot of people talking about coke.  🙂

Let’s do a bit more digging…  Here’s the top hashtags:

Overall Hashtags

That “IAmSoMiddleClass” bit is an Indian thing, apparently.  The more you know…  We’ll get back to the #addicted in a minute.
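
If you want to try a hashtag count like this yourself, here's a minimal sketch in Python.  To be clear, this is not the tooling I used for the charts; it assumes you already have the tweet text as plain strings, and the function name and sample tweets are made up for illustration.

```python
import re
from collections import Counter

# A hashtag here is simply '#' followed by word characters (a simplification).
HASHTAG = re.compile(r"#(\w+)")

def top_hashtags(tweets, n=10):
    """Return the n most common hashtags, counting each tag once per tweet."""
    counts = Counter()
    for text in tweets:
        # Lower-case so that #Coke and #coke are counted together.
        counts.update({tag.lower() for tag in HASHTAG.findall(text)})
    return counts.most_common(n)

# Hypothetical sample data, not real tweets from the collection.
sample = [
    "Lunch with a #Coke #coke",
    "Third can today #coke #addicted",
    "no hashtags here",
]
print(top_hashtags(sample, n=2))
```

Counting each tag once per tweet keeps one over-enthusiastic tweeter from inflating a hashtag; whether that's the right call depends on what you're trying to measure.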

Datasift has some tech that lets me get gender for some tweets.  Cool.  Let’s use that!  Not enough of the tweets are demographically classified to make a pretty picture, so, let’s look at charts…

Here’s the ladies:

female themes coke

(Maybe they’re retail dudes, too, but it seems more likely that they’re ladies.  No hating on the choice of pronouns, k?)

coke gigi hill

Speaking of dudes, what do they have to say?

male themes coke

Yes, more references to drugs, and, well the dudes like to talk about “sausage”.  Hm.  Note that we could probably do some word sense disambiguation between the “coke” that is referred to as Bolivian Marching Powder, and the coke that my son likes, but, this is quick and dirty, like I said.
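
Just to show what I mean by disambiguation, here's a very crude keyword-cue sketch in Python.  It's an illustration only: the cue word lists are my own guesses, not anything tuned on this data, and a real word sense disambiguation approach would use a lot more context than a bag of words.

```python
import re

# Hypothetical cue words; both lists are assumptions made up for this sketch.
DRINK_CUES = {"diet", "drink", "drinking", "can", "bottle", "zero", "cola", "thirsty"}
DRUG_CUES = {"snort", "snorting", "line", "lines", "gram", "dealer", "cocaine"}

def coke_sense(text):
    """Guess whether 'coke' in a tweet means the drink or the drug."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    drink_hits = len(words & DRINK_CUES)
    drug_hits = len(words & DRUG_CUES)
    if drink_hits > drug_hits:
        return "drink"
    if drug_hits > drink_hits:
        return "drug"
    # No cues, or a tie: don't pretend to know.
    return "unknown"

print(coke_sense("Drinking a Diet Coke at lunch"))
print(coke_sense("coke"))
```

Something this simple will miss plenty of tweets, which is why it returns "unknown" rather than guessing when the cues don't point either way.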

And for completeness’ sake, let’s do the same thing for hashtags, ladies first:

female coke hashtags

Note the fact that there are 2 instances of “addict” on there.  (And, yes, they’re talking about Diet Coke, not nose candy.)

Let’s look at the men:

male hashtags coke

Check that out, not even a word about “addiction”, and the whole “IAmSoMiddleClass” thing is concentrated with men.
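
For the curious, splitting hashtag counts by gender could look something like the sketch below.  The record layout is an assumption: I'm pretending each tweet is a dict with "text" and "gender" keys, which is not necessarily how Datasift delivers its demographic data, and tweets without a gender label are simply skipped.

```python
import re
from collections import Counter, defaultdict

def hashtags_by_gender(records, n=5):
    """Return the top n hashtags per gender label, skipping unlabeled tweets."""
    counts = defaultdict(Counter)
    for rec in records:
        gender = rec.get("gender")  # hypothetical field name
        if gender is None:
            continue  # most tweets have no demographic label
        tags = {t.lower() for t in re.findall(r"#(\w+)", rec["text"])}
        counts[gender].update(tags)
    return {g: c.most_common(n) for g, c in counts.items()}

# Hypothetical records, made up for illustration.
records = [
    {"text": "#addicted to diet coke", "gender": "female"},
    {"text": "coke with dinner #IAmSoMiddleClass", "gender": "male"},
    {"text": "#coke", "gender": None},
]
print(hashtags_by_gender(records))
```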

So, my point, if I were to have one, is that real people are really talking about real stuff on Twitter.  Even for brands like Coca Cola, someone is talking, right now, about where/how/when/why they’re drinking one.  And in that information comes the possibility to learn/understand/market/connect/sell.

Categories: Natural Language Processing, Sentiment Analysis, Social Media, Text Analytics