Who Wrote Macbeth? 3 Things Text Mining Can Tell Us About Authors


Text mining is a powerful tool that can tell us quite a lot not only about the words themselves, but about who wrote them. For instance:

Whether They’re Lying

Nowadays many customers rely on, or at least take into consideration, online reviews before making purchases or visiting restaurants and hotels. This has given rise to an entirely new shady business – fake review generation. Thousands of freelance writers earn money every day writing positive or negative reviews of goods and services they’ve never actually experienced. It gets even more troubling when you discover that most people have only a 50/50 chance of detecting a fake review. That’s why researchers have begun developing algorithms using text analytics and machine learning to detect deception. One such algorithm is able to tell if a hotel review is real or fake with 90% accuracy.

If They Have Alzheimer’s

Researchers at the University of Toronto applied text mining tools to the works of Agatha Christie in an effort to solve the mystery surrounding her later life. The prolific author was rumored to have become erratic and withdrawn in her old age, and there has been plenty of speculation that Christie suffered from Alzheimer’s or some other neurodegenerative disease. While nothing can be proven conclusively, the analysis shows a significant decrease in the complexity of her writing over time. Her vocabulary became smaller, and she became more repetitive and more likely to use vague words such as “thing” or “something” – changes that would be consistent with an author suffering from a neurodegenerative disease.
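To make those three signals concrete, here is a minimal Python sketch of how vocabulary size, repetition, and vague-word use might be measured for a passage. It is not the Toronto researchers' method: the function names, the type-token ratio as a stand-in for repetitiveness, and every vague word beyond “thing” and “something” are assumptions made for illustration.

```python
import re

# Illustrative list only: the article names "thing" and "something";
# the other entries are assumptions added for this sketch.
VAGUE_WORDS = {"thing", "things", "something", "anything"}


def tokenize(text):
    """Lowercase the text and return its word tokens (letters, optional apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def writing_profile(text):
    """Return simple complexity measures for one passage of text."""
    words = tokenize(text)
    total = len(words)
    if total == 0:
        return {"vocabulary_size": 0, "type_token_ratio": 0.0, "vague_rate": 0.0}
    vocabulary = set(words)
    vague_count = sum(1 for word in words if word in VAGUE_WORDS)
    return {
        # Number of distinct words used.
        "vocabulary_size": len(vocabulary),
        # Distinct words divided by total words; lower means more repetition.
        "type_token_ratio": len(vocabulary) / total,
        # Share of all words that come from the vague-word list.
        "vague_rate": vague_count / total,
    }


if __name__ == "__main__":
    sample = "The thing was a thing and something was the thing."
    print(writing_profile(sample))
```

One caveat if you try this on real novels: the type-token ratio naturally falls as a text gets longer, so it only makes sense to compare samples of the same length, for example the first 50,000 words of each book.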

Who Wrote Macbeth (or at least, who didn’t)

Researchers attempted to solve the famous Shakespeare authorship debate by using text analysis to compare Shakespeare with his contemporaries Sir Francis Bacon, Christopher Marlowe, and Edward de Vere, all of whom have been proposed as the true author of Shakespeare’s work. By analyzing character usage, word length, and the percentage of unique words used, they were able to definitively rule out both Bacon and Marlowe. The study seems to suggest Edward de Vere as the most likely candidate, but unfortunately, the corpus of his work is so small that accuracy cannot be guaranteed. Personally, I choose to believe in the one and only Billy Shakespeare.
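For the curious, the three features mentioned here are easy to compute. The Python sketch below is an illustration rather than the study's actual code: the function names are made up, “character usage” is interpreted as relative letter frequency, and the Euclidean distance used to compare two authors is an assumed metric, since the article does not say how the researchers compared the profiles.

```python
import re
import string
from collections import Counter


def style_features(text):
    """Compute letter frequencies, mean word length, and percent unique words."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    letters = [char for word in words for char in word]
    letter_counts = Counter(letters)
    return {
        # Relative frequency of each letter a-z (one reading of "character usage").
        "letter_freq": {
            char: letter_counts[char] / len(letters)
            for char in string.ascii_lowercase
        },
        # Average number of letters per word.
        "mean_word_length": len(letters) / len(words),
        # Distinct words as a percentage of all words.
        "pct_unique_words": 100.0 * len(set(words)) / len(words),
    }


def letter_distance(features_a, features_b):
    """Euclidean distance between two letter-frequency profiles (assumed metric)."""
    return sum(
        (features_a["letter_freq"][char] - features_b["letter_freq"][char]) ** 2
        for char in string.ascii_lowercase
    ) ** 0.5


if __name__ == "__main__":
    candidate = style_features("To be, or not to be, that is the question.")
    reference = style_features("Was this the face that launched a thousand ships?")
    print(candidate["mean_word_length"], candidate["pct_unique_words"])
    print(letter_distance(candidate, reference))
```

A smaller distance means two samples use letters in more similar proportions. With only a handful of lines per author, though, these numbers are too noisy to mean much, which is the same small-corpus problem the researchers ran into with de Vere.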

