Starting a new AI project in 2019? Don’t fall prey to the hype. The new year will see a separation of the wheat from the chaff, and an AI fail at this stage could spell disaster for your project, team, and company.
This article includes classic stories of recent, high-profile AI fails, as well as information and advice from veteran data scientists and executives to help you build an effective AI solution:
- AI Failures From IBM, Microsoft and Apple
- “9 More Ways to Fail With AI” by the Chief Data Officer at Abe.ai
- Why Maintenance is Critical to Avoiding an Embarrassing AI Failure
- How to Get Real Value from Artificial Intelligence
Full disclosure if you’re new to Lexalytics: we provide a business intelligence platform that uses AI and machine learning for natural language processing and text analytics. But the stories and the advice presented here are relevant for anyone involved in AI/machine learning.
Fail: IBM’s “Watson for Oncology” Cancelled After $62 million and Unsafe Treatment Recommendations
No AI project captures the “moonshot” attitude of big tech companies quite like Watson for Oncology. In 2013, IBM partnered with The University of Texas MD Anderson Cancer Center to develop a new “Oncology Expert Advisor” system. The goal? Nothing less than to cure cancer.
The first line of the press release boldly declares, “MD Anderson is using the IBM Watson cognitive computing system for its mission to eradicate cancer.” IBM’s role was to enable clinicians to “uncover valuable insights from the cancer center’s rich patient and research databases.”
So, how’d that go?
“This product is a piece of sh–.”
In July 2018, StatNews reviewed internal IBM documents and found that IBM’s Watson was giving erroneous, downright dangerous cancer treatment advice.
According to StatNews, the documents (internal slide decks) largely place the blame on IBM’s engineers. Evidently, they trained the software on a small number of hypothetical cancer patients, rather than real patient data.
The result? Medical specialists and customers identified “multiple examples of unsafe and incorrect treatment recommendations,” including one case where Watson suggested that doctors give a cancer patient with severe bleeding a drug that could worsen the bleeding.
From this Verge article:
“This product is a piece of s—,” one doctor at Jupiter Hospital in Florida told IBM executives, according to the documents. “We bought it for marketing and with hopes that you would achieve the vision. We can’t use it for most cases.”
In February 2017, Forbes reported that MD Anderson had “benched” the Watson for Oncology project. A special report from University of Texas auditors said that MD Anderson had spent more than $62 million without reaching their goals.
Fail: Microsoft’s AI Chatbot Corrupted by Twitter Trolls
Microsoft made big headlines when they announced their new chatbot. Writing with the slang-laden voice of a teenager, Tay could automatically reply to people and engage in “casual and playful conversation” on Twitter.
Some of Tay’s early tweets, pulled from this Verge article:
@HereIsYan omg totes exhausted.
swagulated too hard today.
— TayTweets (@TayandYou) March 23, 2016
@themximum damn. tbh i was kinda distracted..u got me.
— TayTweets (@TayandYou) March 23, 2016
@ArtsRawr like some og kush dank
— TayTweets (@TayandYou) March 23, 2016
Tay grew from Microsoft’s efforts to improve their “conversational understanding”. To that end, Tay used machine learning and AI. As more people talked with Tay, Microsoft claimed, the chatbot would learn how to write more naturally and hold better conversations.
Microsoft won’t say exactly how the algorithms worked, of course. Perhaps because of what happened next.
Less than 24 hours after Tay launched, internet trolls had thoroughly “corrupted” the chatbot’s personality.
By flooding the bot with a deluge of racist, misogynistic, and anti-semitic tweets, Twitter users turned Tay – a chatbot that the Verge described as “a robot parrot with an internet connection” – into a mouthpiece for a terrifying ideology.
Microsoft claimed that their training process for Tay included “relevant public data” that had been cleaned and filtered. But clearly they hadn’t planned for failure, at least not this kind of catastrophe.
After a cursory effort to clean up Tay’s timeline, Microsoft pulled the plug on their unfortunate AI chatbot.
Fail: Apple’s Face ID Defeated by a 3D Mask
Apple released the iPhone X (10? Ten? Eks?) to mixed but generally positive reviews. The phone’s shiniest new feature was Face ID, a facial recognition system that replaced the fingerprint reader as your primary passcode.
Apple said that Face ID used the iPhone X’s advanced front-facing camera and machine learning to create a 3-dimensional map of your face. The machine learning/AI component helped the system adapt to cosmetic changes (such as putting on make-up, donning a pair of glasses, or wrapping a scarf around your neck) without compromising on security.
But a week after the iPhone X’s launch, hackers were already claiming to beat Face ID using 3D-printed masks. Vietnam-based security firm Bkav found that they could successfully unlock a Face ID-equipped iPhone by gluing 2D “eyes” to a 3D mask. The mask, made of stone powder, cost around $200. The eyes were simple, printed infrared images.
Bkav’s claims, outlined in a blog post, gained widespread attention, not least because Apple had already written that Face ID was designed to protect against “spoofing by masks or other techniques” using “sophisticated anti-spoofing neural networks”.
Not everyone was convinced by Bkav’s work. Publications such as Wired had already tried and failed to beat Face ID using masks. And Wired’s own article on Bkav’s announcement included some skepticism from Marc Rogers, a researcher for security firm Cloudflare. But the work – and this glimpse into the weakness of AI – is fascinating.
In one story, Facebook had to shut down their “Bob” and “Alice” chatbots after the computers started talking to each other in their own language. And that’s just the beginning. Srishti continues with more examples from Mitra, Uber and Amazon.
Together, these 5 AI failures cover chatbots, political gaffes, autonomous driving accidents, facial recognition mix-ups, and angry neighbors.
Srishti argues that these failures suggest companies should be more cautious and diligent when implementing AI systems.
9 More Ways to Fail With AI
Francesco’s list is comprehensive, funny, and thought-provoking. It features some classic paths to failure, such as “Cut R&D to save money” and “Work without a clear vision”. But, Francesco says, “there is a plethora of ways to fail with AI”.
My favorite is #2, “Operate in a technology bubble.”
As Francesco points out, AI doesn’t always fail due to technical problems. Sometimes, the problem is a lack of social need or interest.
“Artificial intelligence technologies cannot be built in isolation from the social circumstances that make them necessary,” Francesco writes.
This is a fantastic point. In the rush to stay ahead of the technology curve, companies often fail to consider the impact of their inherent biases. This is particularly dangerous for companies working in data analytics for healthcare, biotechnology, financial services and law.
“Operating in a bubble and ignoring the current needs of society is a sure path to failure.” – Francesco Gadaleta
Francesco’s list is a must-read for any executive, developer or data scientist looking to add AI to their technology stack.
Why Maintenance is Critical to Avoiding an Embarrassing AI Failure
Plan for failure; work on your reaction times; adopt a change management model. Manifesto of a management consulting firm? No, it’s veteran data scientist Paul Barba writing for KDnuggets.
Just like a car, Paul explains, an AI can tick along for a while on its own. But failing to maintain it can destroy your project or product, and maybe even your company.
As cars become more complex, insurance companies advise owners to keep up with preventative maintenance before the cost of repairs becomes staggering. Similarly, as an AI grows more complex, the risks and costs of AI failure grow larger. And the longer you wait to repair your AI, the more expensive it’ll be.
“Through auditing, quantitative measuring and proactive organizational responsiveness, you can avoid the equivalent of blowing an AI gasket.” – Paul Barba
Just like your car, an AI requires maintenance to remain robust and valuable. And just like your car, you may be faced with a sudden, catastrophic failure if you don’t keep it up-to-date.
In this article, Paul explains how data scientists can avoid AI failure by maintaining it with new training data, methods and models.
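To make the idea of “quantitative measuring” concrete, here is a minimal Python sketch of one way a team might watch for model decay. It is not taken from Paul’s article: the `AccuracyMonitor` class, its parameters, and the threshold values are all hypothetical, and it assumes you can collect fresh, human-verified labels for a sample of live predictions.

```python
from collections import deque


class AccuracyMonitor:
    """Hypothetical helper: tracks rolling accuracy over recent labeled predictions."""

    def __init__(self, baseline, tolerance=0.05, window=100):
        self.baseline = baseline      # accuracy measured when the model was deployed
        self.tolerance = tolerance    # how far accuracy may drop before we raise a flag
        self.results = deque(maxlen=window)  # True/False for each recent prediction

    def record(self, predicted, actual):
        # Store whether the model's prediction matched the verified label.
        self.results.append(predicted == actual)

    def rolling_accuracy(self):
        # Returns None until at least one labeled prediction has been recorded.
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_retraining(self):
        accuracy = self.rolling_accuracy()
        if accuracy is None:
            return False
        return accuracy < self.baseline - self.tolerance


# Illustrative data only: 7 correct and 3 incorrect sentiment predictions.
monitor = AccuracyMonitor(baseline=0.90, tolerance=0.05, window=10)
for predicted, actual in [("pos", "pos")] * 7 + [("pos", "neg")] * 3:
    monitor.record(predicted, actual)

if monitor.needs_retraining():
    print("Rolling accuracy has dropped below the baseline; schedule maintenance.")
```

A check like this is the equivalent of a dashboard warning light: it won’t fix the model, but it tells you when to bring it in for new training data, methods or models before a small problem becomes an expensive one.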
How to Get Real Value from Artificial Intelligence in 2019
Big AI projects, such as Watson for Oncology and self-driving cars, get most of the press coverage. But as the past few years have shown, moonshots like these are the most likely to fail. And when they fail, they fail spectacularly (as we’ve been discussing).
How, then, can you build an AI system that actually succeeds? The answer is deceptively simple:
Focus on solving a real business problem.
Our own CEO, Jeff Catlin, has spent the past 15 years watching AI and machine learning get over-hyped and under-delivered. In this article on Forbes, he examines a number of ways businesses can use AI to:
- Predict customer churn
- Create better surveys
- Read and handle online reviews
- Craft effective messaging
“Building a business case for AI isn’t so different from building one for any other business problem,” Catlin writes. “First, identify a need and a desired outcome (automation and efficiency are common drivers of successful AI projects). Then undertake a feasibility assessment.”
The key is to look for business use cases where AI is already in action, or where it’s emerging as an effective solution.
Jeff puts it best: “With the right business case and the right data, AI can deliver powerful time and cost savings, as well as valuable insights you can use to improve your business.”
Read Jeff’s article on Forbes: Using AI to Solve a Business Problem