The hit film Deadpool surprised everyone by becoming the second-highest-grossing R-rated film, behind only The Passion of the Christ, and in doing so it created an interesting problem.
Marvel Studios holds most of the market share for superhero movies. Yet while laying the foundation for its takeover of this market, it sold off one of its largest properties to 20th Century Fox: the X-Men. Along with that team and its villains, Fox ended up with the rights to all of Marvel’s mutant characters. This was key.
In an earlier film that comic book movie fans will not speak of (X-Men Origins: Wolverine), Fox cast Ryan Reynolds as a cult-favorite character named Deadpool. This Deadpool was much more family-friendly than his comic book alter ego, and the performance fell flat.
So when making a Deadpool film, the studio emphasized keeping the character as crude and irreverent as he is in the comic books. But this presents a problem when analyzing viewer response to the film: crass, harsh vocabulary, with no small amount of sarcasm, can confuse how a machine model measures response. So we trained a (pretty entertaining) codebook specifically to deal with these unconventionally worded reviews. Then we ran a quick analysis of tweets in the immediate aftermath of the release. Things worked mostly as we expected, but a few tweets were still confusing.
Almost none of these tweets can be shared here because of their content. But what’s interesting is that the bulk of the vulgarity, which is typically used in a negative way, trended positive. Even references to the vulgarity were read as positive: one fan tweeted “#Deadpool is brash and vulgar,” which would not typically be favorable, but this is exactly what the studio was going for.
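To make the idea concrete, here is a minimal sketch of how a domain-specific codebook can flip the polarity of words that a general-purpose lexicon treats as negative. It is only an illustration, not our production approach: the word lists, the scores, and the names `BASE_LEXICON`, `DEADPOOL_OVERRIDES`, and `score` are all made up for this example.

```python
import re

# Hypothetical general-purpose polarity scores (invented for illustration).
BASE_LEXICON = {
    "brash": -0.5,
    "vulgar": -0.8,
    "crude": -0.6,
    "hilarious": 0.8,
    "boring": -0.7,
}

# Hypothetical domain overrides: for this film, words that usually read as
# criticism are treated as praise.
DEADPOOL_OVERRIDES = {
    "brash": 0.5,
    "vulgar": 0.6,
    "crude": 0.5,
}


def score(text, overrides=None):
    """Sum word polarities, letting domain overrides replace base scores."""
    lexicon = {**BASE_LEXICON, **(overrides or {})}
    # Lowercase and keep only alphabetic tokens (hashtag symbols are dropped).
    tokens = re.findall(r"[a-z']+", text.lower())
    # Words missing from the lexicon contribute nothing to the total.
    return sum(lexicon.get(token, 0.0) for token in tokens)


tweet = "#Deadpool is brash and vulgar"
general_score = score(tweet)                     # negative under the base lexicon
domain_score = score(tweet, DEADPOOL_OVERRIDES)  # positive with the overrides
```

With the base lexicon alone, the tweet scores as negative; with the overrides applied, the same words count in the film’s favor. A simple word-sum like this does nothing for sarcasm or context, which is part of why the real problem is harder.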
If Deadpool hadn’t been a financial success (the sequel was greenlit almost immediately after the early release), this would be an aberration. Yet considering that Deadpool out-grossed a number of PG-13 X-Men films, we’re likely to see a lot more superhero films in this tone. Expect Marvel audiences to hold the applause in the future, opting instead for humor, sarcasm, and obscenities.
This sort of case is interesting because it shows how quickly the language customers use can change in response to a breakthrough no one saw coming. It highlights why text analytics is so difficult to get right. Language is not rigid; it’s democratic and messy. The simple truth is that sarcasm, double entendre, and swear words aren’t easy to analyze. But then again, that’s why you’ve got Lexalytics.