Analyzing Authorship and the Upstart Crow


April 23 marks the 450th anniversary of the birthday of one William Shakespeare of Stratford-upon-Avon, who not only wrote a multitude of plays and sonnets that are still being studied and performed, but also invented an astounding number of words and phrases that are now in common use. In the past two centuries, the legendary figure has come under scrutiny. Unable to reconcile Shakespeare’s supposedly rudimentary education with his wildly eloquent writing, some began to propose that Shakespeare was not the man everyone thought him to be. And thus began the great Shakespearean Authorship Question.

For a long time, the debate was waged in the trenches of history and subjective literary analysis, but some shrewd young data scientists have applied text analysis and data mining techniques in order to shed further light on this continuing debate. 

Researchers from Dartmouth used text analysis to compare Shakespeare with a few of his contemporaries: Sir Francis Bacon, Christopher Marlowe, and Edward de Vere, Earl of Oxford, all of whom have been proposed as alternative authors of Shakespeare’s work. By analyzing character usage, word length, and the percentage of unique words used, they were able to definitively rule out both Bacon and Marlowe as probable authors. The study seems to suggest Edward de Vere as the most likely candidate, but unfortunately, the corpus of his work is so small that accuracy cannot be guaranteed.
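For the curious, here is a rough idea of what measuring those three signals might look like in Python. This is only a minimal sketch and not the Dartmouth researchers' actual method: the function name `stylometric_features`, the simple regex tokenizer, and the choice to report relative letter frequencies are all assumptions made for illustration.

```python
import re
from collections import Counter


def stylometric_features(text):
    """Compute three simple style measures for a text (illustrative only)."""
    lowered = text.lower()

    # Assumed tokenizer: runs of ASCII letters, optionally with one
    # internal apostrophe (so "don't" counts as a single word).
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", lowered)

    # Character usage: relative frequency of each alphabetic character.
    letters = [c for c in lowered if c.isalpha()]
    letter_counts = Counter(letters)
    total_letters = len(letters)
    char_freq = (
        {ch: n / total_letters for ch, n in letter_counts.items()}
        if total_letters
        else {}
    )

    # Word length: mean number of characters per word.
    mean_word_length = sum(len(w) for w in words) / len(words) if words else 0.0

    # Vocabulary richness: share of words that are distinct.
    unique_word_ratio = len(set(words)) / len(words) if words else 0.0

    return {
        "word_count": len(words),
        "char_freq": char_freq,
        "mean_word_length": mean_word_length,
        "unique_word_ratio": unique_word_ratio,
    }


if __name__ == "__main__":
    sample = "To be, or not to be"
    features = stylometric_features(sample)
    print(features["word_count"])
    print(round(features["mean_word_length"], 3))
    print(round(features["unique_word_ratio"], 3))
```

Comparing authors would then mean running this over each writer's collected works and seeing whose numbers sit closest to Shakespeare's. The share of unique words tends to fall as a text gets longer, so a small corpus like de Vere's is hard to compare fairly against a large one.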

In any case, the debate over Shakespeare’s authenticity will likely rage on, unless some new key evidence is discovered or Shakespeare himself rises from the grave. As for myself, I’m hoping that he was nothing more or less than what he seemed: a writer, an actor, and an upstart crow.

Categories: Analysis, Text Analytics