Machine Learning? I Don’t Think Those Words Mean What You Think They Mean

I find more and more people use the term “machine learning” when they really mean to say “modeling.” I guess that is like calling all types of data activities — with big and small data — “Big Data.” And that’s OK.

Languages are developed to communicate with other human beings more effectively. If most people use the term to include broader meanings than the myopic definition of the words in question, and if there is no trouble understanding each other that way, who cares? I’m not here to defend the purity of the meaning, but to monetize big or small data assets.

The term “Big Data” is not even a thing in most organizations with ample amounts of data anymore, but there are many exceptions, too. I visit other countries for data and analytics consulting, and those two words still work like “open sesame” to some boardrooms. Why would I blame words for having multiple meanings? The English dictionary is filled with such colloquial examples.

I recently learned that famous magic words “Hocus Pocus” came from the Latin phrase “hoc est corpus,” which means “This is the body (of Christ)” as spoken during Holy Communion in Roman Catholic Churches. So much for the olden-day priests only speaking in Latin to sound holier; ordinary people understood the process as magic — turning a piece of bread into the body of Christ — and started applying the phrase to all kinds of magic tricks.

However, if such transformations of words start causing confusion, we all need to be more specific. Especially when the words are about specific technical procedures (not magic). Going back to my opening statement, what does “machine learning” mean to you?

  • If spoken among data scientists, I guess that could mean a very specific way to describe modeling techniques that include Supervised Learning, Unsupervised Learning, Reinforcement Learning, Deep Learning, or any other types of Neural Net modeling, indicating specific methods to construct models that serve predetermined purposes.
  • If used by decision-makers, I think it could mean that the speaker wants minimal involvement of data scientists or modelers in the end, and automate the model development process as much as possible. As in “Let’s set up Machine Learning to classify all the inbound calls into manageable categories of inquiries,” for instance. In that case, the key point would be “automation.”
  • If used by marketing or sales; well, now, we are talking about a really broad set of meanings. It could mean that the buyers of the service will require minimal human intervention to achieve goals. That the buyer doesn’t even have to think too much (as the toolset would just work). Or, it could mean that it will run faster than existing ways of modeling (or pattern recognition). Or, they meant to say “modeling,” but they somehow thought that it sounded antiquated. Or, it could just mean that “I don’t even know why I said Machine Learning, but I said it because everyone else is saying it” (refer to “Why Buzzwords Suck”).

I recently interviewed a candidate fresh out of a PhD program for a data scientist position, whose resume is filled with “Machine Learning.” But when we dug a little deeper into actual projects he finished for school work or internship programs, I found out that most of his models were indeed good, old regression models. So I asked why he substituted words like that, and his answer was staggering; he said his graduate school guided him that way.

Why Marketers Need to Know What Words Mean

Now, I’m not even sure whom to blame in a situation like this, where even academia has fallen under the weight of buzzwords. After all, the schools are just trying to help their students get high-paying jobs before the summer is over. I guess then the blame is on the hiring managers who are trying to recruit candidates based on buzzwords, not necessarily knowing what they should look for in the candidates.

And that is a big problem. This is why even non-technical people must understand basic meanings of technical terms that they are using; especially when they are hiring employees or procuring outsourcing vendors to perform specific tasks. Otherwise, some poor souls would spend countless hours to finish things that don’t mean anything for the bottom-line. In a capitalistic economy, we play with data for only two reasons:

  1. to increase revenue, or
  2. to reduce cost.

If it’s all the same for the bottom line, why should a non-technician care about the “how the job is done” part?

Why It Sucks When Marketers Demand What They Don’t Understand

I’ve been saying that marketers or decision-makers should not be bad patients. Bad patients won’t listen to doctors; and further, they will actually command doctors to prescribe certain medications without testing or validation. I guess that is one way to kill themselves, but what about the poor, unfortunate doctor?

We see that in the data and analytics business all of the time. I met a client who just wanted to have our team build neural net models for him. Why? Why not insist on a random forest method? I think he thought that “neural net” sounded cool. But when I heard his “business” problems out, he definitely needed something different as a solution. He didn’t have the data infrastructure to support any automated solutions; he wanted to know what went on in the modeling process (neural net models are, for practical purposes, black boxes); he didn’t have enough data to implement such things at the beginning stage; and the projected gains (from employing models) wouldn’t cover the cost of such implementation for the first couple of years.

What he needed was a short-term proof of concept, where data structure must be changed to be more “analytics-ready.” (It was far from it.) And the models should be built by human analysts, so that everyone would learn more about the data and methodology along the way.

Imagine a junior analyst fresh out of school, whose resume is filled with buzzwords, meeting with a client like that. He wouldn’t fight back, but would take the order verbatim and build neural net models, whether they helped in achieving the business goals or not. Then the procurer of the service would still be blaming the concept of machine learning itself. Because bad patients will never blame themselves.

Even advanced data scientists sometimes lose the battle with clients who insist on implementing Machine Learning when the solution is something else. And such clients are generally the ones who want to know every little detail, including how the models are constructed. I’ve seen data scientists who’d implemented machine learning algorithms (for practical reasons, such as automation and speed gain), and reverse-engineered the models, using traditional regression techniques, only to showcase what variables were driving the results.
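That reverse-engineering step can be sketched in a few lines. What follows is a minimal illustration, not anyone’s production method: `black_box_score` is a made-up stand-in for an opaque machine-learning model, and the three input variables (recency, frequency, spend) and their ranges are assumptions chosen only for the example. The idea is to score a sample with the black box, then fit a plain linear regression on standardized inputs so that the coefficients hint at which variables drive the scores.

```python
import random


def black_box_score(recency, frequency, spend):
    # Hypothetical opaque model (an assumption for this sketch):
    # nonlinear in spend, with a frequency-recency interaction.
    return (2.0 * frequency - 0.8 * recency
            + 0.3 * spend ** 0.5 + 0.05 * frequency * recency)


def standardize(column):
    # Rescale to mean 0 and standard deviation 1 so coefficients are comparable.
    mean = sum(column) / len(column)
    sd = (sum((v - mean) ** 2 for v in column) / len(column)) ** 0.5
    return [(v - mean) / sd for v in column]


def solve(a, b):
    # Solve a x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_surrogate(columns, scores):
    # Ordinary least squares of the black-box scores on standardized inputs.
    # Returns [intercept, coefficient per column].
    z = [standardize(col) for col in columns]
    rows = [[1.0] + [col[i] for col in z] for i in range(len(scores))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(p)]
    return solve(xtx, xty)


# Simulated customers; the ranges are arbitrary illustration values.
rng = random.Random(0)
recency = [rng.uniform(0, 12) for _ in range(500)]
frequency = [rng.uniform(0, 10) for _ in range(500)]
spend = [rng.uniform(0, 400) for _ in range(500)]
scores = [black_box_score(r, f, s) for r, f, s in zip(recency, frequency, spend)]

names = ["recency", "frequency", "spend"]
coefs = fit_surrogate([recency, frequency, spend], scores)

# Rank variables by the absolute size of their standardized coefficient.
drivers = sorted(zip(names, coefs[1:]), key=lambda pair: abs(pair[1]), reverse=True)
for name, coef in drivers:
    print(f"{name}: {coef:+.2f}")
```

A surrogate like this only approximates the black box, so it is a communication aid for clients who want to see the drivers, not a replacement for the model itself.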

One can say that such is the virtue of a senior-level data scientist. But then what if the analyst is very green? Actually some decision-makers may like that, as a more junior-level person won’t fight back too hard. Only after a project goes south will those “order takers” be blamed (as in “those analysts didn’t know what they were doing”).

Conclusion

Data and analytics businesses will continually evolve, but the math and the human factors won’t change much. What will change, however, is that we will have fewer and fewer middlemen between the decision-makers (who are not necessarily well-versed in data and analytics) and human analysts or machines (who are not necessarily well-versed in sales or marketing). And it will all be in the name of automation, or more specifically, Machine Learning or AI.

In that future, the person who orders the machine around, ready or not, will be responsible for bad results and ineffective implementations. That means everyone needs to be more logical. Maybe not as much as a Vulcan, but somewhere between a hardcore coder and a touchy-feely marketer. And they must be more aware of capabilities and limitations of technologies and techniques; and, more importantly, they should not blindly trust machine-based solutions.

The scary part is that those who say things like “Just automate the whole thing with AI, somehow” will be the first in line to be replaced by the machines. That future is not far away.

Author: Stephen H. Yu

Stephen H. Yu is a world-class database marketer. He has a proven track record in comprehensive strategic planning and tactical execution, effectively bridging the gap between the marketing and technology world with a balanced view obtained from more than 30 years of experience in best practices of database marketing. Currently, Yu is president and chief consultant at Willow Data Strategy. Previously, he was the head of analytics and insights at eClerx, and VP, Data Strategy & Analytics at Infogroup. Prior to that, Yu was the founding CTO of I-Behavior Inc., which pioneered the use of SKU-level behavioral data. “As a long-time data player with plenty of battle experiences, I would like to share my thoughts and knowledge that I obtained from being a bridge person between the marketing world and the technology world. In the end, data and analytics are just tools for decision-makers; let’s think about what we should be (or shouldn’t be) doing with them first. And the tools must be wielded properly to meet the goals, so let me share some useful tricks in database design, data refinement process and analytics.” Reach him at stephen.yu@willowdatastrategy.com.
