The first presidential debate of the 2016 general election was one of the most-watched in modern history. Throughout this infamously polarizing election season there’s been a fixation on undecided voters. I try to be an informed voter. I’ve watched multi-channel coverage and stomached as much of the party conventions as I could. The news cycles, the stump speeches, the conventions—it all bled together into “on message” chaos. And the debate wasn’t much better.
So I decided to apply our NLP and text analytics engine to each candidate’s responses at Hofstra. One feature that requires little to no tuning is theme extraction. Themes are extracted through part-of-speech patterns and scored using a technique called lexical chaining. The result is a set of fragments from the content that represent the main topics of the conversation.
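To make the idea concrete, here is a minimal sketch of pattern-based theme extraction. It is not our engine's implementation. It assumes the text has already been part-of-speech tagged with Penn Treebank-style tags, it uses one hypothetical pattern (optional adjectives followed by nouns), and it replaces lexical chaining with a much cruder stand-in: a theme scores higher when its words recur across the other candidate themes. The function names and the sample sentence are made up for illustration.

```python
from collections import Counter

# Assumption: input is already POS-tagged with Penn Treebank-style tags.
ADJ = {"JJ", "JJR", "JJS"}
NOUN = {"NN", "NNS", "NNP", "NNPS"}


def candidate_themes(tagged):
    """Collect multiword phrases matching: zero or more adjectives, then one or more nouns."""
    themes = []
    i, n = 0, len(tagged)
    while i < n:
        j = i
        while j < n and tagged[j][1] in ADJ:   # leading adjectives
            j += 1
        k = j
        while k < n and tagged[k][1] in NOUN:  # following nouns
            k += 1
        if k > j and k - i >= 2:               # needs a noun and at least two words
            themes.append(" ".join(word.lower() for word, _ in tagged[i:k]))
            i = k
        else:
            i += 1
    return themes


def score_themes(themes):
    """Crude stand-in for lexical chaining: reward themes whose words recur.

    Real lexical chaining links related words (synonyms, for instance);
    this sketch only counts exact word repetition across candidate themes.
    """
    word_counts = Counter(w for t in themes for w in t.split())
    scores = {t: sum(word_counts[w] for w in t.split()) for t in set(themes)}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical, hand-tagged sample sentence.
sample = [
    ("Small", "JJ"), ("businesses", "NNS"), ("need", "VBP"), ("a", "DT"),
    ("national", "JJ"), ("minimum", "JJ"), ("wage", "NN"), ("and", "CC"),
    ("small", "JJ"), ("businesses", "NNS"), ("create", "VBP"), ("jobs", "NNS"),
]
print(score_themes(candidate_themes(sample)))
# [('small businesses', 4), ('national minimum wage', 3)]
```

The repeated phrase rises to the top, which is the general behavior you want from a theme scorer, even though this version knows nothing about word meaning.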
I started by identifying the direct questions moderator Lester Holt asked the candidates. Initially, I took all comments from each candidate in between the questions. Invariably, as the debate unfolded, the answers to a particular question meandered away from the moderator’s topic toward platform pivots and verbal attacks on the opponent. The conversation in the debate was also fluid, with a lot of interruptions and interjections.
To reduce the impact of these aspects of the debate, I took the moderator’s question and only the first response from each candidate. I made two adjustments to our out-of-the-box theme extraction. First, I made minor adjustments to the part-of-speech patterns so that they best captured the moderator’s themes. Second, I filtered on theme stopwords, which are words that rule out themes that match the pattern but are not highly informative (e.g., “next” in “next segment” would be a theme stopword). However, the analysis of the moderator’s questions and both candidates’ responses was performed in exactly the same manner.
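The selection step can be sketched as follows. This is an illustration under assumptions, not the code I ran: it assumes the transcript is a list of (speaker, text) turns, that the indices of the moderator's direct questions have been marked by hand, and that a theme is dropped if it contains any theme stopword. The function names, speaker labels, and the one-word stopword list are hypothetical.

```python
def first_responses(turns, question_idx, candidates=("CLINTON", "TRUMP")):
    """For each marked question, keep only each candidate's first turn
    before the next marked question. Later turns and interjections are ignored."""
    bounds = sorted(question_idx) + [len(turns)]
    results = []
    for start, end in zip(bounds, bounds[1:]):
        firsts = {}
        for speaker, text in turns[start + 1:end]:
            if speaker in candidates and speaker not in firsts:
                firsts[speaker] = text
        results.append({"question": turns[start][1], "responses": firsts})
    return results


# Assumption: a tiny stopword list built from the single example in the post.
THEME_STOPWORDS = {"next"}


def drop_stopword_themes(themes, stopwords=THEME_STOPWORDS):
    """Remove any theme that contains a theme stopword."""
    return [t for t in themes if not (set(t.lower().split()) & stopwords)]


# Placeholder transcript; the texts are stand-ins, not real quotes.
turns = [
    ("HOLT", "Q1"), ("CLINTON", "A1"), ("TRUMP", "B1"), ("CLINTON", "A2"),
    ("HOLT", "Q2"), ("TRUMP", "B2"), ("HOLT", "follow-up"), ("CLINTON", "A3"),
]
for block in first_responses(turns, question_idx=[0, 4]):
    print(block)
print(drop_stopword_themes(["next segment", "job growth"]))
```

Running the question and both first responses through the same theme extraction afterwards keeps the comparison even-handed.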
In the first question, Holt asked the candidates about the economy, specifically about job growth and Americans’ incomes. He asked about “money in the pockets” of Americans “living paycheck-to-paycheck.”
The top-ranked themes from Hillary Clinton’s response show that her initial response was mostly germane to the question. Her answer focused on “small businesses” and a “national minimum wage” increase, using “innovation and technology” and plans that the next President can “put into action.”
Donald Trump’s response was less clear. The strongest themes from his answer were about a friend of his talking about “what’s happening in Mexico” and that jobs are “fleeing the country.”
Interestingly, both of these answers were modified versions of the candidates’ “stump speeches” (the standard material candidates repeat at their events).
One question that got a lot of attention on social media and by the pundits dealt with potential cyber-attacks from foreign actors. Holt’s themes for this question are that “cyber attacks” are part of “a 21st century war happening every day.”
Clinton’s answers, again, hit on the themes of the question, though she did diverge into a critique of her opponent. Still, the themes she touched on dealt with public- and private-sector information, the defense of American citizens, and her view that there’s “no doubt” this is a serious problem.
Trump, on the other hand, devolved into what pundits call “word salad.” Clinton generally provided apposite responses to the question, going off topic only occasionally. Trump’s answer contained fewer relevant themes, and those that did emerge were tangential at best. Trump was mocked for referring to the issue as “the cyber,” but the relevant themes we found weren’t much better, the most cogent one being “we’ve lost control of things.”
Finally, Holt asked both candidates if they would respect the outcome of the election. Clinton’s answers hit on almost the exact same themes as the question, specifically the “will of the voters” and accepting “the outcome” of the election. Trump’s answer, however, didn’t address the question at all. The top-ranked theme for this response came from his warning that “we are a nation that is seriously troubled.”
Since this is a new mode of evaluating the debates, we can’t provide the kind of top-line result that opinion polling does. There’s no exact percentage, but our initial analysis suggests that Clinton’s answers were more on-topic than Trump’s. Regardless, this is but one data point, an experiment that started with a random question I had. There is more work to be done.
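One rough way to put a number on “on-topic” would be to measure how many of the words in the question’s themes reappear in the answer’s themes. To be clear, this is not something the analysis above computed. It is a sketch of one possible follow-up, with a hypothetical function name and toy inputs that are not the debate’s actual themes.

```python
def theme_words(themes):
    """Flatten a list of theme phrases into a set of lowercase words."""
    return {w for t in themes for w in t.lower().split()}


def on_topic_share(question_themes, answer_themes):
    """Share of the question's theme words that also appear in the answer's themes.

    Assumption: exact word overlap is a fair proxy for topical overlap.
    It ignores synonyms and paraphrase, so treat it as a lower bound at best.
    """
    q = theme_words(question_themes)
    a = theme_words(answer_themes)
    return len(q & a) / len(q) if q else 0.0


# Toy inputs for illustration only.
question = ["job growth", "american incomes"]
answer = ["small businesses", "job growth"]
print(on_topic_share(question, answer))  # 0.5
```

A score like this would still need validation against human judgment before it could stand in for a top-line result.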
I am going to run this analysis on the other presidential debates and see what we discover. Perhaps this is a new way to evaluate what candidates say. Using technology like this, we could identify whether a “Jobs” speech or “National Security” speech actually addresses the issues we as invested citizens care about. Applying data science to politics is one way we can hold our elected representatives accountable.