Third Unsolved Problem in Data Science and Analytics
We’ve been exploring unsolved problems, and looked at two of them.
A third unsolved problem is our hope that AI can grasp meaning. We want to create sentiment analysis, for example. And of course, people are sort of doing it today. Lone Star has done it. Some successful quant traders do it.
We can take natural language as input, and we can mimic natural language as an algorithm’s output. We can build systems which equate ideas, not just words. We can build systems which “know” that a “patent lawyer” is roughly the same thing as an “intellectual property attorney”.
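As a rough illustration of equating ideas rather than words, here is a minimal sketch in Python. The concept table, its labels, and the function names are all hypothetical, made up for this example; a real system would learn these groupings from data instead of reading a hand-written list.

```python
# Hypothetical, hand-built concept table (an assumption for illustration only).
CONCEPTS = {
    "legal.ip_counsel": {
        "patent lawyer",
        "patent attorney",
        "intellectual property attorney",
    },
    "legal.tax_counsel": {
        "tax lawyer",
        "tax attorney",
    },
}


def concept_of(phrase):
    """Return the concept label for a phrase, or None if it is unknown."""
    key = " ".join(phrase.lower().split())  # normalize case and spacing
    for concept, phrases in CONCEPTS.items():
        if key in phrases:
            return concept
    return None


def same_idea(a, b):
    """True only when both phrases map to the same known concept."""
    ca, cb = concept_of(a), concept_of(b)
    return ca is not None and ca == cb


print(same_idea("Patent Lawyer", "intellectual property attorney"))
print(same_idea("patent lawyer", "tax attorney"))
```

A table like this only "knows" what someone typed into it. Any phrase outside the list comes back as unknown.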
But our success is very limited.
We can’t truly detect Sentiment, Truth and Accuracy in our analysis of what humans say, do, and write.
Roger Schank, who was a leading light in AI at Stanford, proposed the “Groucho Test”. Can your AI “get” a Groucho joke? For example:
Those are my principles, and if you don’t like them… well, I have others.
Outside of a dog, a book is a man’s best friend… Inside of a dog it’s too dark to read.
The secret of life is honesty and fair dealing. If you can fake that, you’ve got it made.
And, a personal favorite – Quote me as saying I was mis-quoted.
These kinds of sayings make sense to us. We know when Groucho is being silly, sarcastic, cynical or just being a crook. And he loved playing the role of cynical crook.
Humans can do this pretty well, and when we can’t, we usually KNOW we are confused. An unsolved problem in analytics is doing this well, including knowing when our algorithms are confused.
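To make "knowing when we are confused" concrete, here is a toy sketch in Python of a sentiment scorer that is allowed to answer "unsure". The word lists, the `margin` parameter, and the function name are assumptions invented for this example. It is a simple word-counting stand-in, not a description of how any production system works.

```python
# Tiny hypothetical word lists (assumptions for illustration only).
POSITIVE = {"love", "great", "honest", "best", "friend"}
NEGATIVE = {"hate", "awful", "fake", "worst", "crook"}


def sentiment(text, margin=2):
    """Label text, but abstain when the evidence is thin or mixed."""
    # Lowercase each word and strip surrounding punctuation.
    words = [w.strip(".,!?;:…\"'“”‘’").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    # Abstain unless one side leads by at least `margin` words.
    if abs(pos - neg) < margin:
        return "unsure"
    return "positive" if pos > neg else "negative"


print(sentiment("I love this great book"))
print(sentiment("The secret of life is honesty and fair dealing. "
                "If you can fake that, you've got it made."))
```

The first call has two positive words and no negative ones, so it is labeled positive. The Groucho line has only one matching word, so the scorer abstains. That abstention says nothing about the joke itself. It only shows the scorer admitting it has too little to go on.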
If we take the ideas of sentiment analysis, truth and veracity, then the stamps shown here are a real challenge. If it’s hard to deal with words, these images are VERY hard.
Here we have Marx and Lennon. Groucho and John… not Karl and Vladimir.
This is a pun working on several different levels. The odds any existing AI can decode them all are low. We’ve done some testing of this kind of image understanding. It is a very hard problem.
Perhaps as important, can you tell whether this is a real stamp? It turns out that’s a complicated question, so neither you nor an AI can say with 100% certainty whether the statement “this is a valid postage stamp” is true.
Since we can’t do it either, we should expect it will be a while before we can train an algorithm to do it. This also suggests the problem is connected to our first two problems: detecting dirty data and dealing with uncertainty.
So… Sentiment, Truth and Accuracy: unsolved problem number three.
About Lone Star Analysis
Lone Star Analysis enables customers to make insightful decisions faster than their competitors. We are a predictive guide bridging the gap between data and action. Prescient insights support confident decisions for customers in Oil & Gas, Transportation & Logistics, Industrial Products & Services, Aerospace & Defense, and the Public Sector.
Lone Star delivers fast time to value, supporting customers’ planning and ongoing management needs. Utilizing our TruNavigator® software platform, Lone Star brings proven modeling tools and analysis that improve customers’ top line, by winning more business, and improve the bottom line, by quickly enabling operational efficiency, cost reduction, and performance improvement. Our trusted AnalyticsOS℠ software solutions support our customers’ real-time predictive analytics needs when continuous operational performance optimization, cost minimization, safety improvement, and risk reduction are important.
Headquartered in Dallas, Texas, Lone Star is found on the web at http://www.Lone-Star.com.