Text Analytics Fails! – Ifs, Ands, and Buts

Sometimes technology doesn’t work quite how you’d like it to.

Conditional sentiment is hard for a machine to evaluate. When one part of a sentence expresses sentiment that depends on a judgement in another part, Salience can only parse the relationship properly if it is very clear and strong. For instance, straightforward negations (“This isn’t good”) are relatively easy to handle, but anything with an “if” in it gets much hazier. Here’s an example that Salience rated positively:

“I would be more proud of my company if I was more fairly compensated.”

That’s clearly not a positive, but how would Salience know? We have a conditional mood (“would be”) and a conditional clause (“if…”). Should this be negative? Should Salience take anything in an “if” clause and ignore it? I’m sure you can easily think of examples where the “if” clause flips the sentiment the other way (“If wishes were horses, beggars would ride”) or where it should be taken at face value (“If I hear you complain one more time, I’m turning this car around!”).
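
To make the failure concrete, here is a minimal Python sketch of a lexicon-based scorer with simple negation handling. It is not how Salience works internally; the word list, the weights, and the names `LEXICON`, `NEGATORS`, `score`, and `has_conditional` are all made up for illustration. It only shows why a negation is easy to catch while an “if” sentence full of positive words comes out positive.

```python
import re

# Toy lexicon and weights, invented for this example (not Salience's dictionary).
LEXICON = {"good": 1.0, "proud": 1.0, "fairly": 0.5, "dangerous": -1.0}
NEGATORS = {"not", "isn't", "never", "no"}


def tokenize(text):
    # Lowercase, normalize curly apostrophes, and keep words like "isn't" whole.
    return re.findall(r"[a-z']+", text.lower().replace("’", "'"))


def score(text):
    """Sum the lexicon weights; a negator flips the next sentiment word."""
    total = 0.0
    negate = False
    for token in tokenize(text):
        if token in NEGATORS:
            negate = True
            continue
        if token in LEXICON:
            value = LEXICON[token]
            total += -value if negate else value
            negate = False
    return total


def has_conditional(text):
    """Flag sentences containing "if"; this says nothing about how to score them."""
    return "if" in tokenize(text)


negation = "This isn't good"
conditional = "I would be more proud of my company if I was more fairly compensated."

print(score(negation))               # -1.0: the negation is caught
print(score(conditional))            # 1.5: "proud" and "fairly" both count as positive
print(has_conditional(conditional))  # True
```

Spotting the “if” is the easy part. Deciding whether to ignore the clause, flip it, or take it at face value is where the sketch, like the real thing, runs out of road.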

But wait! There’s bonus fail!

There are lots of ways to condition a sentence. Here’s one from a recent project:

Dirige mais rápido, estamos num bairro perigoso. -> “Drive faster, we’re in a dangerous neighborhood.”

Salience rates “drive faster” as positive, and “dangerous neighborhood” as negative. The overall document sentiment is negative, which is correct, but much more weakly negative than it should be.
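
One way to see how a sentence like this ends up only weakly negative is to picture the document score as some kind of average of the phrase scores. The sketch below assumes exactly that. The averaging rule, the numbers, and the `document_score` helper are hypothetical, chosen only to mirror the signs described above, and are not taken from Salience.

```python
def document_score(phrase_scores):
    """Average the phrase-level scores into one document-level score."""
    if not phrase_scores:
        return 0.0
    return sum(phrase_scores.values()) / len(phrase_scores)


# Hypothetical phrase scores: one positive phrase, one negative phrase.
phrases = {
    "Dirige mais rápido": 0.5,
    "bairro perigoso": -1.0,
}

overall = document_score(phrases)
print(overall)  # -0.25: negative, but diluted by the "positive" imperative
```

A human reads the first clause as urgency caused by the second, so the two should reinforce each other. A plain average treats them as independent votes and lets them partly cancel.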

There is a degree of reasoning involved here that humans perform smoothly and that text analytics software just doesn’t do. But even a human might get tripped up by the first example. There’s a reason people re-phrase double negatives in conversation to make the meaning clearer.

Want more text analytics fails?
Text Analytics Fails! Pieces of Miranda Kerr

Text Analytics Fails! A Romantic Getaway
