Stop what you’re doing and watch this immediately. (Warning: profuse profanity).
I love this new sketch from Key and Peele. It’s funny because it’s so completely relatable (minus the bat full of nails). We’ve all misunderstood or misinterpreted a text, an email, or an IM. The results can be confusing, catastrophic (bat full of nails), or just downright embarrassing. It’s a prime example of something that I think is important to keep in mind when discussing text analytics.
Humans kind of suck at interpreting text.
Not all text, of course, but short, choppy pieces of information like texts or tweets are especially difficult, because they lack the context that tells us so much about what’s being said.
When we test Salience against real live humans analyzing text sentiment, we find that inter-rater agreement is only around 80%. Inter-rater agreement is a measure of how much the raters (real live humans) agree on the particular thing they are rating. That means a whopping 20% of the time, people are interpreting the sentiment of a piece of text in significantly different ways (chilling at the bar vs. BAT FULL OF NAILS).
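To make that number concrete, here is a minimal sketch of the simplest way to measure agreement between two raters: the share of items they label the same way. The labels below are invented for illustration, and this is not necessarily how the 80% figure above was computed; more careful studies often use a chance-corrected statistic such as Cohen's kappa.

```python
def percent_agreement(rater_a, rater_b):
    """Return the fraction of items on which two raters gave the same label."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must label the same number of items")
    if not rater_a:
        raise ValueError("Need at least one labeled item")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


# Hypothetical sentiment labels from two humans reading the same ten texts.
rater_a = ["pos", "neg", "neg", "pos", "neu", "pos", "neg", "neu", "pos", "neg"]
rater_b = ["pos", "neg", "pos", "pos", "neu", "pos", "neg", "neg", "pos", "neg"]

agreement = percent_agreement(rater_a, rater_b)
print(f"Agreement: {agreement:.0%}")  # the raters differ on 2 of 10 items
```

In this made-up sample the two raters disagree on two of the ten texts, which lands on the same 80% agreement described above.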
Until we reach a technological singularity in the distant future where computer programs surpass the human brain’s ability to synthesize and comprehend language, we’re stuck with technology that can’t meaningfully score higher than about 80% accuracy. Human judgment is the only yardstick we have, and the humans themselves only agree about 80% of the time.
Accuracy gets complicated
If two people interpret wildly different sentiment from a piece of text, which interpretation is the accurate one? Sentiment, as it turns out, is a pretty personal measurement.
So how do you correct for this whole personal sentiment thing without falling into an existential angst about the futility of communication and the ultimate meaninglessness of language?
Personalize your text mining tools
If you know what you interpret as positive or negative, you can fine-tune your text mining tool so that it interprets it that way as well. You can also tune for high precision (fewer false alarms), which helps when pinpointing social media trends, or for high recall (fewer misses), so you don’t overlook a single possible customer support request.
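The precision versus recall trade-off is easier to see with numbers. The sketch below assumes a hypothetical classifier that gives each message a score between 0 and 1 for "this is a support request"; the scores, labels, and thresholds are all invented for illustration and do not come from any particular tool.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall when flagging every score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(predicted, labels))        # correctly flagged
    fp = sum(p and not l for p, l in zip(predicted, labels))    # flagged by mistake
    fn = sum((not p) and l for p, l in zip(predicted, labels))  # missed
    # When nothing is flagged (or nothing is relevant), report 0.0 by convention here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical classifier scores and the true answers (True = real support request).
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, True, False, False]

strict = precision_recall(scores, labels, threshold=0.85)
lenient = precision_recall(scores, labels, threshold=0.35)

print("strict  (0.85): precision=%.2f recall=%.2f" % strict)
print("lenient (0.35): precision=%.2f recall=%.2f" % lenient)
```

With the strict threshold, everything flagged is a real request but half the real requests are missed. With the lenient one, nothing is missed, at the cost of a few false alarms. Which end you lean toward depends on whether a false alarm or a miss hurts more.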
For more on maximizing accuracy, check out Seth Redmore’s presentation from his recent talk at the Terrapin Conference on Consumer Text Analytics in London.
For more Key and Peele, watch this similarly amazing sketch.