3 Ways to Analyze the State of the Union with Text Analytics

To most people, analyzing last week’s State of the Union address is something politicians and talking heads do. It involves talking about policy, politics, and the occasional presidential zinger. But when you add text analytics to the mix, the picture becomes a lot bigger and a lot more interesting. Here are three different ways text analytics was used to analyze the State of the Union:

1. Word Frequency

Ben Schmidt, a faculty member in the Northeastern NU Lab for Texts, Maps, and Networks, put the language of the 2015 State of the Union address in the context of every other State of the Union ever given. Among his many fascinating discoveries were the 84 words in President Obama’s speech that had never before been used in a State of the Union, including “lesbian”, “Instagram”, and “innovators”. Check out his State of the Union visualization, where you can see how often any word or two-word phrase has been used in every State of the Union ever given.
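To make the idea concrete, here is a minimal Python sketch of this kind of word-frequency comparison. It is not Schmidt's actual method: the function names (`tokenize`, `phrase_counts`, `new_words`), the simple tokenizer, and the toy sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and pull out runs of letters and straight apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def phrase_counts(text, n=1):
    """Count every n-word phrase (n=1 for single words, n=2 for two-word phrases)."""
    tokens = tokenize(text)
    phrases = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(phrases)

def new_words(speech, earlier_speeches):
    """Return the words in `speech` that appear in none of `earlier_speeches`."""
    seen = set()
    for earlier in earlier_speeches:
        seen.update(tokenize(earlier))
    return sorted(set(tokenize(speech)) - seen)

# Toy stand-ins for real transcripts (illustrative only).
earlier = ["The state of our union is strong.", "We must invest in our schools."]
latest = "Innovators share our strong schools on Instagram."

print(new_words(latest, earlier))
print(phrase_counts(latest, n=2).most_common(3))
```

With real transcripts loaded in place of the toy strings, the same two functions would give both the list of first-time words and the per-speech counts behind a frequency chart.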


2. Readability

In a different approach, Vocativ used the Flesch-Kincaid readability test to analyze not only the readability level of each of President Obama’s speeches, but also the average scores for every president dating back to Woodrow Wilson. According to their findings, this year’s State of the Union speech scored a 10.0, or tenth-grade reading level, the highest of any State of the Union address he’s given so far. The analysis also shows a downward trend in the sophistication of the language in State of the Union speeches over the years. Woodrow Wilson averaged a 15, FDR a 13, and Reagan an 11, for example. For more on the downward trend in presidential speech readability levels, check out this fascinating interactive graph of over 600 speeches in the last 200 years (the original graph is no longer on the Vocativ site, but a similar one appears in the linked Guardian article).
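For readers curious how such a score is computed, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The Python sketch below applies that formula. The syllable counter is a rough vowel-group heuristic and the sentence splitter is naive, both assumptions made for illustration, so its numbers will not match Vocativ's exactly.

```python
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    """Estimate the U.S. school grade level needed to read `text`."""
    # Naive sentence split on runs of ., ! or ? (abbreviations will be miscounted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

simple = "The cat sat on the mat."
dense = "Presidential administrations prioritize infrastructure investment."

print(round(flesch_kincaid_grade(simple), 2))
print(round(flesch_kincaid_grade(dense), 2))
```

Short sentences built from short words pull the score down, which is why very simple text can even score below zero, while long sentences full of polysyllabic words push it up.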


3. Ideology

Real Clear Politics published a piece analyzing the political tone of 34 presidential speeches, starting with President Reagan in 1981. They were able to chart each speech ideologically (on a left-right political scale) based on its content. Their analysis shows President Clinton’s speeches shifted significantly to the right during his second term. Interestingly, President Obama’s speeches have become more liberal in the past six years, although the article points out that, according to their analysis, this president has been more ideologically consistent in the content of his speeches than Presidents George W. Bush or Bill Clinton.

There are countless ways to explore unstructured content using text analytics.