Exploring Parsey McParseface

Development of AI/ML for NLP is advancing rapidly. One recent project that caught my eye was Google’s “Parsey McParseface”. Let’s briefly review what it is, and then take Parsey for a spin.

Why are we talking about this?

When people ask me what I do for work, I could say that I work in artificial intelligence and machine learning for text analytics and natural language processing. But that often gets a blank stare and an awkward sip of a drink before the conversation shifts to another subject. So, I usually just say, “I teach computers how to read.”

In the right circles, though, awareness of AI/ML for NLP is taking center-stage. Amidst the steady buzz about chatbots and digital assistants is a general discussion about how we can train computers to understand human-generated communication.

This recent flurry of activity makes life very exciting for those of us in this business. After all, these are the challenges we’ve been working on for years. But at the same time, we filter out the noise to understand what other folks in the industry are doing and how our approaches align with theirs.

Getting our hands dirty with Parsey McParseface

Google has been making advances in machine-learning techniques with their TensorFlow framework, and earlier this month announced the availability of an open-source natural language understanding framework implemented in TensorFlow called SyntaxNet. As a demonstration of the capabilities of SyntaxNet, Google developed an English parser called Parsey McParseface. The announcement on the Google Research blog is an interesting read because it does a good job of explaining why this work is difficult, and difficult to get right.

At this point, we’ve got a complex machine-learning framework, we’ve got an open-source language understanding framework, and we’ve got an English language parser with a funny name. What can we do with all this? Let’s get our hands dirty and take Parsey for a spin. I chose the “easy” route and downloaded an image of SyntaxNet to build and run in Docker. This gives me a container set up with all the bits needed to try out Parsey. My first step was to run the example given in the documentation:

echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh

Sending individual bits of text into a demo shell script is a bit clunky, but that’s likely to improve over time (and it’s easy enough to script around; see the sketch after the parse output below). This command generates a boatload of output, and at the bottom I get this:

Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct
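As mentioned above, echoing one sentence at a time into demo.sh gets clunky. Here is a rough sketch of scripting around that from Python. It assumes you are inside the SyntaxNet container with syntaxnet/demo.sh reachable exactly as in the command above, and that the script reads newline-separated sentences from stdin the way the echo example suggests; the helper function is mine, not part of SyntaxNet.

import subprocess

def parse_sentences(sentences):
    # Pipe newline-separated sentences through the SyntaxNet demo script
    # in a single call and return its raw output. Assumes we are running
    # inside the SyntaxNet container, with syntaxnet/demo.sh available
    # as in the command shown earlier.
    result = subprocess.run(
        ["syntaxnet/demo.sh"],
        input="\n".join(sentences),
        capture_output=True,
        text=True,
    )
    return result.stdout

print(parse_sentences([
    "Bob brought the pizza to Alice.",
    "Alice thanked Bob for the pizza.",
]))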

Parsey McParseface is trained on the Penn Treebank, so the part-of-speech tags look familiar, and the tree shows us which parts of the sentence are related and how. Let’s compare that to Salience, since comparing this new approach with our own is an important part of this exercise.

In the Salience Demo application, I enter the same content and select the “Chunk Tagged Text” view. This shows Salience’s part-of-speech tags along with a similar grouping of related words into “chunks.” Salience produces the following output:

[Bob_NNP brought_VBN] [the_DT pizza_NN] [to_TO Alice_NNP] [._.]
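That bracketed format is easy to pick apart if you want to work with it programmatically: each bracketed group is a chunk, and each token carries its part-of-speech tag after an underscore. Here is a quick sketch that reads the line shown above; it simply parses that text format and is not a Salience API.

import re

def read_chunks(tagged):
    # Turn chunk-tagged text like "[Bob_NNP brought_VBN] [the_DT pizza_NN]"
    # into a list of chunks, each a list of (token, tag) pairs.
    chunks = []
    for group in re.findall(r"\[([^\]]+)\]", tagged):
        chunk = []
        for item in group.split():
            token, _, tag = item.rpartition("_")
            chunk.append((token, tag))
        chunks.append(chunk)
    return chunks

print(read_chunks("[Bob_NNP brought_VBN] [the_DT pizza_NN] [to_TO Alice_NNP] [._.]"))
# [[('Bob', 'NNP'), ('brought', 'VBN')], [('the', 'DT'), ('pizza', 'NN')], ...]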

The majority of the part-of-speech tags match, and the groupings look roughly similar, though the notable piece missing from the Salience output is the dependency tree. Salience actually implements a lightweight dependency parse internally, and uses it for higher-level functions such as sentiment analysis and named entity extraction. But for performance reasons, it takes a much simpler approach, and performance is another perspective that needs to be considered. After initialization, the output from Parsey McParseface shows the following:

INFO:tensorflow:Processed 1 documents
INFO:tensorflow:Total processed documents: 1
INFO:tensorflow:num correct tokens: 0
INFO:tensorflow:total tokens: 7
INFO:tensorflow:Seconds elapsed in evaluation: 0.12, eval metric: 0.00%
INFO:tensorflow:Processed 1 documents
INFO:tensorflow:Total processed documents: 1
INFO:tensorflow:num correct tokens: 1
INFO:tensorflow:total tokens: 6
INFO:tensorflow:Seconds elapsed in evaluation: 0.43, eval metric: 16.67%
INFO:tensorflow:Read 1 documents

This tells us that overall, evaluation of this sample sentence took 0.55 seconds. By comparison, Salience performs “text preparation” (tokenization, part-of-speech tagging, and chunking) and generates POS-tagged output on the same sentence in 0.059 seconds.
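Timings like these will vary from run to run and from machine to machine, so treat them as rough comparisons rather than benchmarks. If you want to measure for yourself, here is a minimal timing harness; parse_text is a stand-in for whichever parser you wrap, not a real API from either system.

import time

def average_seconds(parse_text, text, runs=5):
    # Average wall-clock time of a parse call over a few runs.
    # parse_text is a placeholder for whatever parser is being tested.
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        parse_text(text)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)

# A do-nothing "parser" just to show the harness in action.
print("%.3f seconds" % average_seconds(lambda text: text.split(), "Bob brought the pizza to Alice."))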

What can we do with the Gettysburg Address?

Let’s look at another example, something with a bit more meat on the bones. What happens if we process the text of the Gettysburg Address? Parsey splits the content into three “documents,” and parses the 271 words in 1.86 seconds. Salience processes the same content in 0.093 seconds.

At the end of the day, however, parsing sample sentences and the Gettysburg Address is all nice and academic. What we really want to do is use this technology for some practical purpose. Voice of Customer and Social Media Monitoring analytics are two major, and majorly difficult, applications of natural language processing. If you thought plain English was hard to understand, try Twitter!

#MyOnePhoneCallGoesTo JAKE from @StateFarm because he’s #HereToHelp ?

Input: # MyOnePhoneCallGoesTo JAKE from @ StateFarm because he 's # HereToHelp
Parse:
JAKE NN ROOT
 +-- MyOnePhoneCallGoesTo NNP nn
 |   +-- # NN nn
 +-- from IN prep
 |   +-- StateFarm NNP pobj
 |       +-- @ NNP nn
 +-- HereToHelp NN dep
     +-- because IN mark
     +-- he PRP nsubj
     +-- 's VBZ cop
     +-- # NN nn

Performance-wise, Parsey McParseface generates this parse of the tweet in 0.32 seconds, but it has misinterpreted some of the elements that are unique to Twitter, such as hashtags and mentions. Salience generates this parse of the same content in 0.07 seconds:

[#MyOnePhoneCallGoesTo_HASHTAG] [JAKE_NNP] [from_IN] [@StateFarm_MENTION] [because_IN he_PRP] ['s_VBZ] [#HereToHelp_HASHTAG]
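The difference starts with tokenization: Salience keeps the hashtags and the mention together as single, typed tokens, while Parsey’s tokenizer splits the # and @ off as separate words. To give a feel for what Twitter-aware tokenization involves, here is an illustrative regex sketch that keeps hashtags and mentions intact; it is not how Salience is implemented.

import re

# Keep hashtags and @mentions as single tokens, and fall back to simple
# word and punctuation tokens for everything else. Illustrative only,
# not Salience's implementation.
TWITTER_TOKEN = re.compile(r"#\w+|@\w+|\w+(?:'\w+)?|[^\w\s]")

def tweet_tokens(text):
    return TWITTER_TOKEN.findall(text)

print(tweet_tokens("#MyOnePhoneCallGoesTo JAKE from @StateFarm because he's #HereToHelp ?"))
# ['#MyOnePhoneCallGoesTo', 'JAKE', 'from', '@StateFarm', 'because', "he's", '#HereToHelp', '?']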

This is where the rubber really meets the road. A syntactic English parser is a core component in any text analytics system. However, there is a lot that needs to be built on top of it to take advantage of its capabilities for higher-level functions such as entity extraction (including Twitter mentions and hashtags) and sentiment analysis. I came across another article online that draws the same conclusion. In it, Matthew Honnibal offers a great analogy: a syntactic parser is a drill bit. A better parser is a better drill bit, but by itself it does not give you any oil. Having a better drill bit can improve the overall system, but it is one piece playing a specific role, with other pieces that depend on it to perform their own functions.