Data Analytics Projects Only Benefit Marketers When Properly Applied

I recently read a report that only about 20% of all analytics project work turns out to be beneficial to businesses. Such waste. Still, is that solely the fault of data scientists? After all, even effective medicine is rendered useless if the patient refuses to take it.

Then again, why would users reject the results of analytics work? At the risk of gross simplification, allow me to break it down into two categories: cases where project goals do not align with the business goals, and cases where good intelligence gets wasted due to a lack of capability, procedure, or will to implement follow-up actions. Basically, poor planning on the front end, and poor execution on the back end.

Results of analytics projects often get ignored if the project goal doesn’t serve the general strategy or specific needs of the business. To put it in a different way, projects stemming from the analyst’s intellectual curiosity may or may not align with business interests. Some math geek may be fascinated by the elegance of mathematical precision or complexity of solutions, but such intrigue rarely translates directly into monetization of data assets.

In business, faster and simpler answers are far more actionable and valuable. If I ask business people whether they want an answer with an 80% confidence level in the next 2 days, or an answer with 95% certainty in 4 weeks, the great majority would choose the quicker but less-than-perfect answer. Why? Because the keyword in all this is “actionable,” not “certainty.”

Analysts who would like to maintain a distance from immediate business needs should instead pursue pure science in the world of academia (a noble cause, without a doubt). In business settings, however, we play with data only to make tangible differences, as in dollars, cents, minutes or seconds. Once such differences in philosophy are accepted and understood by all involved parties, then the real question is: What kind of answers are most needed to improve business results?

Setting Analytics Projects Up for Success

Defining the problem statement is the hardest part for many analysts. Even the ones who are well-trained often struggle with the goal setting process. Why? Because in school, the professor in charge provides the problems to solve, and students submit solutions to them.

In business, analysts must understand the intentions of decision makers (i.e., their clients), deciphering not-so-logical general statements and anecdotes. Yeah, sure, we need to attract more high-value customers, but how would we express such value via mathematical statements? What would the end result look like, and how will it be deployed to make any difference in the end?

If unchecked, many analytics projects move forward purely based on the analysts’ assumptions, or worse, procedural convenience factors. For example, if the goal of the project is to rank a customer list in the order of responsiveness to certain product offers, then to build models like that, one may employ all kinds of transactional, behavioral, response, and demographic data.

All these data types come with different strengths and weaknesses, and even different missing data ratios. In cases like this, I’ve encountered many (far too many) analysts who would just omit the whole population with missing demographic data from the development universe. Sometimes such omission adds up to over 30% of the whole. What, are we never going to reach out to those souls just because they lack some peripheral data points?

Good luck convincing the stakeholders who want to use the entire list for various channel promotions. “Sorry, we can provide model scores for only 70% of your valuable list,” is not going to cut it.
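
One way to keep the full list scorable, sketched here under stated assumptions: instead of dropping records with missing demographics, carry an explicit "missing" flag into the development file so the model can treat the gap as a signal of its own. The field names (`income`, `age`, `orders`) and the placeholder value of 0 are hypothetical, chosen only for illustration.

```python
def add_missing_flags(records, fields):
    """Return copies of records with a 0/1 missing flag per field.

    Missing values (None) are replaced with 0 so the record stays in the
    development universe; the flag lets a model treat "missing" as its
    own signal instead of losing the record.
    """
    prepared = []
    for record in records:
        row = dict(record)  # copy, so the input records are not modified
        for field in fields:
            is_missing = row.get(field) is None
            row[field + "_missing"] = 1 if is_missing else 0
            if is_missing:
                row[field] = 0  # placeholder value; the flag carries the meaning
        prepared.append(row)
    return prepared


# Hypothetical customer records; two of three lack some demographic data.
customers = [
    {"id": 1, "income": 85000, "age": 41, "orders": 7},
    {"id": 2, "income": None, "age": 33, "orders": 2},
    {"id": 3, "income": None, "age": None, "orders": 5},
]

prepared = add_missing_flags(customers, ["income", "age"])
print(len(prepared))  # every record remains in the file
```

Whether a flag, a separate model, or some other treatment is appropriate depends on the data at hand; the point of the sketch is only that the record count does not shrink.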

More than a few times, I received questions about what analysts should do when they have to reach deep into lower model groups (of response models) to meet the demand of marketers, knowing that the bottom half won’t perform well. My response would be to forget about the model — no matter how elegant it may be — and develop heuristic rules to eliminate obvious non-targets in the prospect universe. If the model gets to be used, it is almost certain that the modeler in charge will be blamed for mediocre or bad performance, anyway.

Then I firmly warn them to ask about typical campaign size “before” one starts building some fancy models. What is the point of building a response model when the emailer would blast emails as much as he wants? To prove that the analyst is well-versed in building complex response models? What difference would it ever make in the “real” world? With that energy, it would be far more prudent to build a series of personas and product affinity models to personalize messages and offers.

Supporting Analytics Results With Marketing

Now, let’s pause for a moment and think about the second major reason why the results of analytics are not utilized. Assume that the analytics team developed a series of personas and product affinity models to customize offers on a personal level. Does the marketing team have the ability to display different offers to different targets? Via email, websites, and/or print media? In other words, do they have capabilities and resources to show “a picture of two wine glasses filled with attractive looking red wine” to people who scored high scores in the “Wine Enthusiast” model?

I’ve encountered too many situations where marketers look concerned — rather than getting excited — when talking about personas for personalization. Not because they care about what analysts must go through to produce a series of models, but because they lack creative assets and technical capabilities to make it all happen.

They often complain about a lack of budget to develop multiple versions of creatives, a lack of proper digital asset management tools, a lack of campaign management tools that allow complex versioning, a lack of ability to serve dynamic content on websites, etc. There is no shortage of reasons why something “cannot” be done.

But, even in a situation like that, it is not the job of a data scientist to suggest increasing investments in various areas, especially when “other” departments have to cough up the money. No one gets to command unlimited resources, and every department has its own priorities. What analytics professionals must do is to figure out all kinds of limitations beyond the little world of analytics, and prioritize the work in terms of actionability.

Consider what can be done with minimal changes in the marketing ecosystem. For the preservation of both the analytics and marketing departments, which efforts will bring tangible results immediately? Basically, what will we be able to brag about in front of CEOs and CFOs?

When to Put Analytics Projects First

Prioritization of analytics projects should never be done solely based on data availability, ease of data crunching or modeling, or “geek” factors. It should be done in terms of potential value of the result, immediate actionability, and most importantly, alignment with overall business objectives.

The fact that only about 20% of analytics work yields business value means that 80% of the work was never even necessary. Sure, data geeks deserve to have some fun once in a while, but the fun factor doesn’t pay for the systems, toolsets, data maintenance, and salaries.

Without proper problem statements on the front-end and follow-up actions on the back-end, no amount of analytical activities would produce any value for businesses. That is why data and analytics professionals must act as translators between the business world and the technical world. Without that critical consulting layer, it becomes the-luck-of-the-draw when prioritizing projects.

To stay on target, always start with a proper analytics roadmap covering everything from ideation to application stages. To be valued and appreciated, data scientists must act as business consultants, as well.


Marketers Find the Least-Wrong Answers Via Modeling

Why do marketers still build models when we have ample amounts of data everywhere? Because we will never have every piece of data about everything. We just don’t know what we don’t know.

Okay, then — we don’t get to know about everything, but what are the data that we possess telling us?

We build models to answer that question. Even scientists who wonder about the mysteries of the universe and multiverses use models for their research.

I have been emphasizing the importance of modeling in marketing through this column for a long time. If I may briefly summarize a few benefits here:

  • Models Fill in the Gaps, covering those annoying “unknowns.” We may not know for sure if someone has an affinity for luxury gift items, but we can say that “Yes, with data that we have, she is very likely to have such an affinity.” With a little help from the models, the “unknowns” turn into “potentials.”
  • Models Summarize Complex Data into simple-to-use “scores.” No one has time to dissect hundreds of data variables every time we make a decision. Model scores provide simple answers, such as “Someone likely to be a bargain-seeker.” Such a model may include 10 to 20 variables, but the users don’t need to worry about those details at the time of decision-making. Just find suitable offers for the targets, based on affinities and personas (which are just forms of models).
  • Models are Far More Accurate Than Human Intuition. Even smart people can’t imagine interactions among just two or three variables in their heads. Complex multivariate interaction detection is a job for a computer.
  • Models Provide Consistent Results. Human decision-makers may get lucky once in a while, but it will be hard for them to keep up with machines. Mathematics does not fluctuate much in terms of performance, provided that the data feeds are consistent and accurate.
  • Models Reveal Hidden Patterns in data. When faced with hundreds of data variables, humans often resort to what they are accustomed to (often fewer than four to five factors). Machines indiscriminately find new patterns, relentlessly looking for the best suitable answers.
  • Models Help Expand the Targeting Universe. If you want a broader target, just go after slightly lower score targets. You can even measure the risk factors while in such an expansion mode. That is not possible with some man-made rules.
  • When Done Right, Models Save Time and Effort. Marketing automation gets simpler, too, as even machines can tell high and low scores apart easily. But the keywords here are “when done right.”

There are many benefits of modeling, even in the age of abundant data. The goal of any data application is to help in the decision-making process, not to aid in hoarding the data and bragging about it. Do you want to get to accurate, consistent, and simple answers, fast? Don’t fight against modeling; embrace it. Try it. And if it doesn’t work, try it in another way, as even the worst model often beats man-made rules, easily.

But this time, I’m not writing this article just to promote the benefits of modeling again. Assuming that you embrace the idea already, let’s now talk about the limitations of it. With any technique, users must be fully aware of the downsides of it.

It Mimics Existing Patterns

By definition, models identify and mimic the patterns in the existing data. That means, if the environment changes drastically, all models built in the old world will be rendered useless.

For example, if there are significant changes in the supply chain in a retail business, product affinity models built for old lines of products won’t work anymore (even if products may look similar). More globally, if there were major disruptions, such as a market crash or proliferation of new technologies, none of the old assumptions would continue to be applicable.

The famous economics phrase Ceteris paribus — all other things being equal — governs conventional modeling. If you want your models to be far more adaptive, then consider total automation of modeling through machine learning. But I still suggest trying a few test models in an old-fashioned way, before getting into a full automation mode.

If the Target Is Off, Everything Is Off

If the target mark is hung on a wrong spot, no sharpshooter will be able to hit the real target. A missile without a proper guidance system is worse than not having one at all. Setting the right target for a model is the most critical and difficult part in the whole process, requiring not only technical knowledge, but also deep understanding of the business at stake, the nature of available data, and the deployment mechanism at the application stage.

This is why modeling is often called “half science, half art.” A model is only as accurate as the target definition of the model. (For further details on this complex subject, refer to “Art of Targeting”).

The Model Is Only as Good as the Input Data

No model can be saved if there are serious errors or inconsistencies in the data. It is not just about bluntly wrong data. If the nature of the data is not consistent between the model development sample and the practical pool of data (where the model will be applied and used), the model in question will be useless.

This is why the “Analytics Sandbox” is important. Such a sandbox environment is essential — not just for simplification of model development, but also for consistent application of models. Most mishaps happen before or after the model development stage, mostly due to data inconsistencies in terms of shapes and forms, and less due to sheer data errors (not that erroneous data is acceptable).

The consistency factor matters a lot: If some data variables are “consistently” off, they may still possess some predictive power. I would even go as far as stating that consistency matters more than sheer accuracy.
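
A small illustration of that point, using made-up numbers: a predictor that is off by the same rule on every record (here, inflated by 10% plus a fixed amount) keeps exactly the same linear relationship with the outcome, while a predictor whose errors vary from record to record loses much of it. The variable names and values are hypothetical, and Pearson correlation stands in for "predictive power" only as a rough proxy.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: a "true" predictor and a spending outcome.
true_income = [30, 45, 52, 61, 75, 88, 94, 120]
spending = [2, 3, 3, 5, 6, 6, 8, 9]

# Consistently off: every value inflated by 10% plus a fixed 5.
consistently_off = [1.1 * v + 5 for v in true_income]

# Inconsistently off: the error changes sign and size record by record.
errors = [40, -35, 30, -45, 25, -50, 35, -60]
inconsistently_off = [v + e for v, e in zip(true_income, errors)]

r_true = pearson(true_income, spending)
r_consistent = pearson(consistently_off, spending)
r_inconsistent = pearson(inconsistently_off, spending)

print(round(r_true, 3), round(r_consistent, 3), round(r_inconsistent, 3))
```

The first two correlations come out the same, because a fixed scale-and-shift does not change how records line up against the outcome; the third is noticeably weaker.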

Accuracy Is a Relative Term

Users often forget this important fact, but model scores aren’t pinpoint accurate all of the time. Some models are sharper than others, too.

A model score is just the best estimate with the existing data. In other words, we should take model scores as the least-wrong answers in a given situation.

So, when I say it is accurate, I mean to say a model is more accurate than human intuition based on a few basic data points.

Therefore, the user must always consider the risk of being wrong. Now, being wrong about “Who is more likely to respond to this 15% discount offer?” is a lot less grave than being wrong about “Who is more likely to be diabetic?”

In fact, if I personally faced such a situation, I wouldn’t even recommend building the latter model, as the cost of being wrong is simply too high. (People are very sensitive about their medical information.) Some things just should not be estimated.

Even with innocuous models, such as product affinities and user propensities, users should never treat them as facts. Don’t act like you “know” the target, simply because some model scores are available to you. Always approach your target with a gentle nudge; as in, “I don’t know for sure if you would be interested in our new line of skin care products, but would you want to hear more about it?” Such gentle approaches always sound friendlier than acting like you “know” something about them for sure. That seems just rude on the receiving end, and recipients of blunt messages may even think that you are indeed creepy.

Users sometimes make bold moves with an illusion that data and analytics always provide the right answers. Maybe the worst fallacy in the modern age is the belief that anything a computer spits out is always correct.

Users Abuse Models

Last month, I shared seven ways users abuse models and ruin the results (refer to “Don’t Ruin Good Models by Abusing Them”). As an evangelist of modeling techniques, I always try to prevent abuse cases, but they still happen in the application stages. All good intentions of models go out the window if they are used for the wrong reasons or in the wrong settings.

I am not at all saying that anyone should back out of using models in their marketing practices because of the shortfalls that I listed here. Nonetheless, to be consistently successful, users must be aware of the limitations of models as well, especially if they are about to go into full marketing automation. With improper application of models, you may end up automating bad or wrong practices really fast. For the sake of customers on the receiving end (not just for the safety of your position in the marketing industry), please be more careful with this sharp-edged tool called modeling.

Don’t Ruin Good Models by Abusing Them, Marketers

Modern-day 1:1 marketing is all about precision targeting, using all available data. And precision targeting is not possible with a few popular variables selected based on human intuition.

If human intuition takes over the process, every targeting logic would start with income and age. But let me ask you this: Do you really think that the differences between Mercedes and Lexus buyers are just income and age? If that is too tricky, how about the differences between travelers of luxury cruise lines and buyers of luxury cars? Would that be explained by income and age?

I’m sorry to break it to you bluntly, but all of those targets are rich. To come up with more effective targeting logic, you must dig deeper through data for other clues. And that’s where algorithmic solutions come into play.

I’ve worked with many smart people over the years, but I’ve never met a human who is capable of seeing through interactions among complex data variables without a computer. Some may understand two- or even three-dimensional interactions, when presented in a graphic format, but never more than that. Conversely, a simple regression model routinely incorporates 10 to 20 variables, and provides us with rank orders in forms of simple scores. Forget the next generation AI algorithms; humans have been solidly beaten by computers for decades when it comes to precision targeting.
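
To make the idea of "many variables in, one score out" concrete, here is a minimal sketch of how a logistic regression score could be computed and used for rank ordering. The three predictors, their coefficients, and the intercept are hypothetical, invented for illustration; a real model would carry 10 to 20 predictors with weights estimated from data.

```python
import math


def score(record, weights, intercept):
    """Logistic regression score: a value between 0 and 1."""
    linear = intercept + sum(weights[name] * record[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients, not taken from any real model.
weights = {
    "orders_last_12m": 0.40,
    "months_since_last_order": -0.15,
    "email_clicks_last_90d": 0.25,
}
intercept = -2.0

prospects = [
    {"id": "A", "orders_last_12m": 6, "months_since_last_order": 1, "email_clicks_last_90d": 4},
    {"id": "B", "orders_last_12m": 1, "months_since_last_order": 10, "email_clicks_last_90d": 0},
    {"id": "C", "orders_last_12m": 3, "months_since_last_order": 4, "email_clicks_last_90d": 2},
]

# Rank order from the highest to the lowest score.
ranked = sorted(prospects, key=lambda p: score(p, weights, intercept), reverse=True)
print([p["id"] for p in ranked])  # → ['A', 'C', 'B']
```

The user of the score only sees the rank order; the weighting of the individual variables stays inside the model.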

So, when you have a dire need for more accurate targeting (i.e., you want to be mostly right, not mostly wrong), and have an ample amount of data (i.e., more data variables than you can easily handle), don’t even hesitate to go with statistical models. Resistance is simply futile. In the age of abundant data, we need models more than ever, as they convert mounds of data into digestible answers to questions. (For an extended list of benefits, refer to one of my early articles, “Why Model?”)

But today, I am not writing this article to convince non-believers to become believers in statistical models. Quite frankly, I just don’t care if someone still is a non-believer in this day and age. It’s his loss, not mine. This not-so-short article is for existing users of models, who may have ruined them by abusing them from time to time.

As a data and analytics consultant, I get called in when campaign results are less than satisfactory, even when statistical models were actively employed in the target selection process. The most common expression I hear in such cases is, “The model didn’t work.” But when I dig through the whole process, I often find that the model algorithm is the only error-free item. How ironic.

I’ve talked about “analytics-readiness” so many times already. And, yes, inadequate sets of input data can definitely ruin models. But there are, unfortunately, also many ways users wreck perfectly adequate models “after” they were developed and validated. Allow me to introduce a few major ones.

Using the Model in a Wrong Universe

Without a doubt, setting a wrong target will lead to an unusable model. Just as important as the “target definition” is the “comparison universe.” If you are building a response model, for example, responders (i.e., targets) will be compared to non-responders (i.e., non-targets). If you are off in one of those, the whole model will be wrong, because a model is nothing but a mathematical expression of differences between the two dichotomous groups. This is why setting a proper comparison universe (generally, a sample out of the pool of names that you are using for the campaign) is as important as setting the right target.

Further, let’s say that you want to use models within preset universes, based on region, age, gender, income, past spending level, certain number of clicks, or any other segmentation rules. Such universe definitions — mostly about exclusion of obvious non-targets — should be determined “before” the model development phase. When such divisions are made, applying the model built for one universe (e.g., a regional model for the Mid-Atlantic) to another universe (e.g., the Pacific Northwest region) will not provide good results, other than with some dumb luck.

Ignoring the Design Principle of the Model

Like buildings or cars, models are built for specific purposes. If I may list a few examples:

  • “Future customer value estimation, in dollars”
  • “Propensity to purchase in response to discount offers via email”
  • “Product affinity for a certain product category”
  • “Loyalty vs. churn prediction”
  • “Likelihood to be a bargain-seeker”
  • Etc.

This list could be as long as what you want as a marketer.

However, things start to go wrong when the user starts ignoring (or forgetting) the original purpose of the model. Years back, my team built a model for a luxury cruise line for a very specific purpose. The brand was very reputable, so it had no trouble filling in staterooms with balconies at a higher price point. But it did have some challenges filling in inside staterooms at a relatively high price of entry, which was equivalent to a window room on a less fancy ship. So, the goal was to find cruisers who would take up inside staterooms, for the brand value, on Europe-bound ships that depart U.S. ports between Thanksgiving and Christmas. A very specific target? You bet.

Troubles arose because it worked all too well for the cruise line. So, without any further consultation with any analysts, they started using that model for other purposes. We got phone calls only after the attempt failed miserably. Now, is that really the fault of the model? Sure, you can heat up your house with a kitchen oven, but don’t blame the manufacturer when it breaks down by abusing it like that. I really don’t think the warranty applies there.

Playing With Selection Rules

Some marketers are compelled to add more rules after the fact, probably out of sheer enthusiasm for success. For instance, a person in charge of a campaign may come up with an idea at the last minute, and add a few rules on top of the model selection, as in “Let’s send mails only to male prospects in the high-score group.” What this means is that he just added the strongest variable on top of a good model, which may include 15 to 20 variables, all carefully weighted by a seasoned statistician. This type of practice may not lead to a total disaster, but the effectiveness of the model in question is definitely diluted by the post-selection rules.

When the bad results start to come in, again, don’t blame the modeler for it. Because “you” essentially redesigned the model by adding new variables on top of existing predictors. Unfortunately, this type of last-minute meddling is quite common. If you have a good reason to do any “post-selection,” please talk to the analyst before the model is built, so that she can incorporate the rule as a “pre-selection” logic. She may give you multiple models fitted for multiple universes, too.

Realigning the Model Groups in Arbitrary Ways

Model scores are just long numbers — with eight or nine decimal places, in general. It is hard to use sheer numeric values like that, so kind modelers generally break the scored universe into 10 or 20 equal groups. (We call them decile or demi-decile groups.)

For instance, each decile group would represent 10% of the development and validation samples. When applied to the campaign universe, resultant score groups should not deviate too much from that 10% mark.

If you see big bumps in model group sizes, it is a clear sign that something went wrong in scoring, there were significant changes in the input variables, or the model is losing its effectiveness, over time.
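
The check described above can be sketched as follows: fix the decile cutoffs on the development sample, apply those same cutoffs to the campaign file, and flag any group whose share strays too far from 10%. The synthetic scores and the tolerance of 3 percentage points are assumptions made for illustration, not a recommended standard.

```python
def decile_cutoffs(dev_scores):
    """Nine score cutoffs that split the development sample into 10 equal groups."""
    ordered = sorted(dev_scores, reverse=True)
    n = len(ordered)
    # The k-th cutoff is the lowest score still inside the top k tenths.
    return [ordered[(n * k) // 10 - 1] for k in range(1, 10)]


def assign_decile(score, cutoffs):
    """Decile 1 holds the highest scores, decile 10 the lowest."""
    for decile, cutoff in enumerate(cutoffs, start=1):
        if score >= cutoff:
            return decile
    return 10


def group_shares(scores, cutoffs):
    """Share of records landing in each decile (index 0 is decile 1)."""
    counts = [0] * 10
    for s in scores:
        counts[assign_decile(s, cutoffs) - 1] += 1
    return [c / len(scores) for c in counts]


def drifted_groups(shares, expected=0.10, tolerance=0.03):
    """Deciles whose share deviates from the expected 10% by more than tolerance."""
    return [i + 1 for i, share in enumerate(shares) if abs(share - expected) > tolerance]


# Made-up scores: the development sample is evenly spread ...
dev_scores = [i / 1000 for i in range(1000)]
cutoffs = decile_cutoffs(dev_scores)

# ... while the campaign file is shifted toward lower scores.
campaign_scores = [(i / 1000) ** 2 for i in range(1000)]

dev_shares = group_shares(dev_scores, cutoffs)
campaign_shares = group_shares(campaign_scores, cutoffs)

print(drifted_groups(dev_shares))       # → []
print(drifted_groups(campaign_shares))  # deciles that no longer hold about 10%
```

Note that the cutoffs are never recomputed on the campaign file. Recomputing them would force every group back to 10% and hide exactly the warning sign this check is meant to surface.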

I’ve seen cases where users just realigned the model score groups after the fact, simply because groups were not showing an equal 10% break anymore. That is like covering serious wounds with makeup. Did the model work after that? Please take a wild guess.

Using Expired Models

Models do have limited shelf-lives. Models lose their predictive power over time, as market conditions, business models, data sources, data procurement methods, and target profiles all inevitably go through changes.

If you detect signs of lagging results or wide fluctuations in model group distribution (e.g., showing only 3% in the top decile, which is supposed to be around 10%), it is time to review the model. In mild cases, modelers may be able to refit the model. But in this day and age of fast computers and automation, I recommend full redevelopment of the model in question at the first sign of trouble.

Ignoring the ‘Level’ of Prediction

A model for target marketing is meant to rank the records from high to low scores, according to the design principle. If you built an affinity model for “Likely to be an early adopter,” a high score means the target is more likely to be an early adopter, and a low score means she’s less likely to be one. Now, the level of the record matters here. What are you really ranking, anyway?

The most common ones are individual and household levels. It is possible to build a model on an email level, as one individual may have multiple email addresses. If you are in a telecom business, you may not even care for the household-level identity, as the “house” may be the target, regardless of who lives in there.

In the application stage, matching the “level” of prediction is important. For household models, it is safe to assume that almost all predictors in the model are on a household level. Applying such models on a different level may negatively affect the model performance. A definite “no” is using a household-level score for an address, not knowing who lives there. One may think “How different will the new mover be from the old resident?” But considering the wide variety of demographic variables commonly used in models, it is something that no modeler would recommend. If the model employed any transaction or behavioral data, don’t even think about switching levels like that. You’d be better off building a regional model (such as a ZIP model) using only geo-demographic data.

Applying Average Scores to Non-Matches or Non-Scorable Records

Sometimes, scores are missing because of non-matches in the data append process, or strict universe definition using pre-selection rules. It can be tempting to apply some “average” score to cover the missing ones, but that is a big no-no, as well. Statisticians may perform such imputation on a variable level to fill missing values, but not with model scores.

If you really must have a score for every record, build separate models for the non-match or non-select universes, using whatever data are available (if there are any). In CRM models, no one should drop records just because they fail to match to demographic files, as the main drivers of such models would be transaction and behavioral data. Let missing values play out in the model (refer to “Missing Data Can Be Meaningful”).
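
A minimal sketch of that routing idea, under stated assumptions: each record is scored by the model built for its own universe, and a record with no applicable model is left unscored rather than given an average. The two model functions, their formulas, and the field names are hypothetical stand-ins for separately developed models.

```python
def score_record(record, models):
    """Score a record with the model built for its own universe.

    Returns None when no model covers the record, rather than
    substituting an average score.
    """
    universe = "matched" if record.get("demographics") is not None else "non_match"
    model = models.get(universe)
    if model is None:
        return None
    return model(record)


# Hypothetical stand-ins for two separately developed models.
def matched_model(record):
    # Uses transaction data plus appended demographics.
    return min(1.0, 0.05 * record["orders"] + 0.002 * record["demographics"]["age"])


def non_match_model(record):
    # Built only on data available without the demographic append.
    return min(1.0, 0.06 * record["orders"])


models = {"matched": matched_model, "non_match": non_match_model}

records = [
    {"id": 1, "orders": 4, "demographics": {"age": 50}},
    {"id": 2, "orders": 4, "demographics": None},
]
scores = {r["id"]: score_record(r, models) for r in records}
print(scores)
```

If only the matched-universe model exists, the second record simply comes back as `None`, which keeps the gap visible instead of masking it with a made-up value.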

For prospecting, once you set up a pre-selection universe (hopefully, after some profile analysis), don’t look back and just go with a “scored” universe. Records with missing scores are generally not salvageable, in practice.

Go Forth and Do Good Business With Models, Marketers

As you can see, there are many ways to mess up a good model. A model is not an extension of rudimentary selection rules, so please do NOT treat it that way. Basically, do not put diesel fuel in a gasoline car, and hope to God that the engine will run smoothly. And when — not if — the engine stalls, don’t blame the engineer.

Models may be built, but the work is not nearly done until they are properly applied and deployed in live campaigns. When in doubt, always consult with the analyst in charge; hopefully, before the drop date.

When You Fail, Don’t Blame Data Scientists First — or Models

Last month, I talked about ways marketing automation projects go south (refer to “Why Many Marketing Automation Projects Go South”). This time, let’s be more specific about modeling, which is an essential element in converting mounds of data into actionable solutions to challenges.

Without modeling, all automation efforts would remain at the level of rudimentary rules. And that is one of the fastest routes to automate wrong processes, leading to disappointing results in the name of marketing automation.

Nonetheless, when statistically sound models are employed, users tend to blame the models first when the results are less than satisfactory. As a consultant, I often get called in when clients suspect the model performance. More often than not, however, I find that the model in question was the only thing that was done correctly in a long series of processes, from data manipulation and target setting to model scoring and deployment. I guess it is just easier to blame some black box, but most errors happen before and after modeling.

A model is nothing but an algorithmic expression measuring likelihood of an object resembling — or not resembling — the target. As in, “I don’t know for sure, but that household is very likely to purchase high-end home electronics products,” only based on the information that we get to have. Or on a larger scale, “How many top-line TV sets over 65 inches will we sell during the Christmas shopping season this year?” Again, only based on past sales history, current marcom spending, some campaign results, and a few other factors — like seasonality and virality rate.

These are made-up examples, of course, but I tried to make them as specific and realistic as possible here. Because when people think that a model went wrong, often it is because a wrong question was asked in the first place. Those “dumb” algorithms, unfortunately, only provide answers to specific questions. If a wrong question is presented? The result would seem off, too.

That is why the first step in analytics should be “formulating a question,” not data-crunching. Jumping into a data lake (or any other form of data depository, for that matter) without a clear definition of goals and specific targets is often a shortcut to the demise of the initiative itself. Imagine a case where one starts building a house without a blueprint. Just as a house is not a random pile of building materials, a model is not an arbitrary combination of raw data.

I can even argue formulating the question is so difficult and critical, that it is the deciding factor dividing analysts into seasoned data scientists and junior number-crunchers. Defining proper problem statements is challenging, because:

  • business goals are often far from perfectly constructed logical statements, and
  • available data are most likely incomplete or inadequate for advanced analytics.

Basically, good data players must be able to translate all those wishful marketing goals into mathematical expressions, only using the data handed to them. Such skill is far beyond knowledge in regression models or machine learning.

That is why we must follow these specific steps for data-based solutioning:

[Figure: The roadmap data scientists use. Credit: Stephen H. Yu]
  1. Formulating Questions: Again, this is the most critical step of all. What are the immediate issues and pain points? For what type of marketing functions, and in what context? How will the solution be applied, by whom, and through what channel? What are the specific domains where the solution is needed? I will share more details on how to ask these questions later in this series, but having a specific set of goals must be the first step. Without proper goal-setting, one can’t even define success criteria against which the results would be measured.
  2. Data Discovery: It is useless to dream up a solution with data that are not even available. So, what is available, and what kind of shape are they in? Check the inventory of transaction history; third-party data, such as demographic and geo-demographic data; campaign history and response data (often not in one place); user interaction data; survey data; marcom spending and budget; product information, etc. Now, dig through everything, but don’t waste time trying to salvage everything, either. Depending on the goal, some data may not even be necessary. Too many projects get stuck right here, not moving forward an inch. The goal isn’t having a perfect data depository — CDP, Data Lake, or whatever — but providing answers to questions posed in Step 1.
  3. Data Transformation: You will find that most data sources are NOT “analytics-ready,” no matter how clean and organized they may seem (they are often NOT well-organized, either). Disparate data sources must be merged and consolidated, inconsistent data must be standardized and categorized, different levels of information must be summarized onto the level of prediction (e.g., product, email, individual, or household levels), and intelligent predictors must be methodically created. Otherwise, the modelers would spend the majority of their time fixing and massaging the data. I often call this step creating an “Analytics Sandbox,” where all “necessary” data are in pristine condition, ready for any type of advanced analytics.
  4. Analytics/Model Development: This is where algorithms are created, considering all available data. This is the highlight of this analytics journey, and key to proper marketing automation. Ironically, this is the easiest part to automate, in comparison to previous steps and post-analytics steps. But only if the right questions — and right targets — are clearly defined, and data are ready for this critical step. This is why one shouldn’t just blame the models or modelers when the results aren’t good enough. There is no magic algorithm that can save ill-defined goals and unusable messy data.
  5. Knowledge Share: The models may be built, but the game isn’t over yet. It is one thing to develop algorithms with a few hundred thousand record samples, and it’s quite another to apply them to millions of live data records. There are many things that can go wrong here. Even slight differences in data values, categorization rules, or even the missing data ratio can render well-developed models ineffective. There are good reasons why many vendors charge high prices for model scoring. Once the scoring is done and proven correct, resultant model scores must be shared with all relevant systems, through which decisions are made and campaigns are deployed.
  6. Application of Insights: Just because model scores are available, it doesn’t mean that decision-makers and campaign managers will use them. They may not even know that such things are available to them; or, even if they do, they may not know how to use them. For instance, let’s say that there is a score for “likely to respond to emails with no discount offer” (to weed out habitual bargain-seekers) for millions of individuals. What do those scores mean? The lower the better, or the higher the better? If 10 is the best score, is seven good enough? What if we need to mail to the whole universe? Can we differentiate offers, depending on other model scores — such as, “likely to respond to free-shipping offers”? Do we even have enough creative materials to do something like that? Without proper applications, no amount of mathematical work will seem useful. This is why someone in charge of data and analytics must serve as an “evangelist of analytics,” continually educating and convincing the end-users.
  7. Impact Analysis: Now, one must ask the ultimate question, “Did it work?” And “If it did, what elements worked (and didn’t work)?” Like all scientific approaches, marketing analytics and applications are about small successes and improvements, with continual hypothesizing and learning from past trials and mistakes. I’m sure you remember the age-old term “Closed-loop” marketing. All data and analytics solutions must be seen as continuous efforts, not some one-off thing that you try once or twice and forget about. No solution will just double your revenue overnight; that is more like wishful thinking than a data-based solution.
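
To make Step 3 a bit more concrete, here is a minimal sketch of summarizing transaction-level data onto the level of prediction (individuals, in this case). The records, field names and the `summarize_by_customer` function are all made up for illustration; a real Analytics Sandbox would involve far more sources and far messier data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level records:
# (customer_id, order_date, category, amount)
transactions = [
    ("C1", date(2018, 1, 5), "electronics", 1200.0),
    ("C1", date(2018, 3, 9), "accessories", 40.0),
    ("C2", date(2017, 11, 20), "electronics", 300.0),
]


def summarize_by_customer(rows, as_of):
    """Roll transaction rows up to one record per customer (the level of prediction)."""
    grouped = defaultdict(list)
    for customer_id, order_date, category, amount in rows:
        grouped[customer_id].append((order_date, category, amount))

    summary = {}
    for customer_id, items in grouped.items():
        last_date = max(d for d, _, _ in items)
        total = sum(a for _, _, a in items)
        summary[customer_id] = {
            "order_count": len(items),
            "total_spend": total,
            "avg_spend_per_order": total / len(items),
            "days_since_last_order": (as_of - last_date).days,
            "categories": sorted({c for _, c, _ in items}),
        }
    return summary


customer_summary = summarize_by_customer(transactions, as_of=date(2018, 4, 1))
for customer_id, record in customer_summary.items():
    print(customer_id, record)
```

Each customer ends up as one row of candidate predictors, which is the shape most modeling work expects.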

As you can see, there are many “before” and “after” steps around modeling and algorithmic solutioning. This is why one should not just blame the data scientist when things don’t work out as expected, and why even casual users must be aware of basic ins and outs of analytics. Users must understand that they should not employ models or solutions outside of their original design specifications, either. There simply is no way to provide answers to illogical questions, now or in the future.
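
One of those “after” steps is checking live data against the development sample before scoring. The sketch below is only an illustration under my own assumptions: the field names, the 5% tolerance and the `scoring_issues` function are hypothetical, and a production check would cover many more conditions.

```python
def missing_ratio(records, field):
    """Share of records where the field is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def scoring_issues(dev_sample, live_data, fields, tolerance=0.05):
    """List differences between development and live data that may hurt scoring."""
    issues = []
    for field in fields:
        dev_missing = missing_ratio(dev_sample, field)
        live_missing = missing_ratio(live_data, field)
        # Flag a shift in the missing data ratio (tolerance is an assumed threshold).
        if abs(dev_missing - live_missing) > tolerance:
            issues.append(
                f"{field}: missing ratio {dev_missing:.0%} in development "
                f"vs {live_missing:.0%} live"
            )
        # Flag category codes that the model never saw during development.
        dev_values = {r[field] for r in dev_sample if isinstance(r.get(field), str)}
        live_values = {r[field] for r in live_data if isinstance(r.get(field), str)}
        unseen = sorted(live_values - dev_values)
        if unseen:
            issues.append(f"{field}: categories not seen in development: {unseen}")
    return issues


# Hypothetical samples; "b" mimics a changed categorization rule.
dev_sample = [
    {"income_band": "A", "age": 34},
    {"income_band": "B", "age": 51},
    {"income_band": "A", "age": None},
    {"income_band": "C", "age": 45},
]
live_data = [
    {"income_band": "A", "age": None},
    {"income_band": "b", "age": None},
    {"income_band": "C", "age": 29},
    {"income_band": "A", "age": None},
]

issues = scoring_issues(dev_sample, live_data, ["income_band", "age"])
for issue in issues:
    print(issue)
```

Neither issue here has anything to do with the algorithm itself, which is the point: the model can be perfectly fine and still produce poor scores.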

Use People-Oriented Marketing: Because Products Change, But People Rarely Do

In 1:1 marketing, product-level targeting is “almost” taken for granted. I say almost, because most so-called personalized messages are product-based, rarely people-oriented marketing. Even from mighty Amazon, we see rudimentary product recommendations as soon as we buy something. As in: “Oh, you just bought a yoga mat! We will send you absolutely everything that is related to yoga on a weekly basis until you opt out of email promotions completely. Because we won’t quit first.”

How nice of them. Taking care of my needs so thoroughly.

Annoying as they may be, both marketers and consumers tolerate such practices. For marketers, the money talks. Even rudimentary product recommendations — all in the name of personalization — work much better than no targeting at all. Ain’t the bar really low here, in the age of abundant data and technologies? Yes, such a product recommendation is a hit-or-miss, but who cares? Those “hits” will still generate revenue.

For consumers, aren’t we all well-trained to ignore annoying commercials when we want to? And who knows? I may end up buying a decent set of yoga mat cleaners with a touch of lavender scent because of such emails. Though we all know purchase of that item will start a whole new series of product offerings.

Now, marketers may want to call this type of collaborative filtering an active form of personalization, but it isn’t. It is still a very reactive form of marketing, at the tail end of another purchase. It may not be as passive as waiting for someone to type in keywords, but product recommendations are a mixture of reactive and active (because you may send out a series of emails) forms of marketing.

And I’m not devaluing such endeavors, either. After all, it works, and it generates revenue. All I am saying is that marketers should recognize that a reactive product recommendation is only a part of personalization efforts.

As I have been writing for five years now, 1:1 marketing is about effectively deciding:

  1. whom to contact, and
  2. what to offer.

Part One is good old targeting for outbound efforts, and there are a wide variety of techniques for it, starting with rules that marketers made up, basic segmentation, and all of the way to sophisticated modeling.

The second part is a little tricky; not because we don’t know how to list relevant products based on past purchases, but because it is not easy to support multiple versions of creatives when there is no immediate shopping basket to copy (like cases for recent purchases or abandoned carts).

In between unlimited product choices and relevant offers, we must walk the fine lines among:

  1. dynamic display technology,
  2. content and creative library,
  3. data (hopefully clean and refined), and
  4. analytics in forms of segments, models or personas (refer to “Key Elements of Complete Personalization”).

If specific product categories are not available (i.e., a real indicator that a buyer is interested in certain items), we must get the category correct at the minimum, using modeling techniques. I call them personas, and some may call them archetypes. (But they are NOT segments. Refer to “Segments vs. Personas”).

Using the personas, it is not too difficult to map proper products to potential buyers. In fact, marketers are free to use their imaginations when they do such mapping. Plus, while inferred, these model scores are never missing, unlike those hard-to-get “real” data. No need to worry about targeting only a small part of potential buyers.

What should a marketer offer to fashionistas? To trendsetters? To bargain seekers? To active, on-the-go types? To seasonal buyers? To big spenders? Even for a niche brand, we can create 10 to 20 personas that represent key product categories and behavioral types, and the deployment of personalized messages becomes much simpler.
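
Here is a rough sketch of what such persona-to-offer mapping might look like. The persona names, the 1-to-10 score scale, the cutoff of 6 and the offers themselves are all assumptions for illustration, not anyone’s actual campaign rules.

```python
# Hypothetical persona scores per individual (assumed scale: 1 to 10, higher
# means a closer resemblance to the persona).
persona_scores = {
    "P001": {"fashionista": 9, "bargain_seeker": 3, "big_spender": 7},
    "P002": {"fashionista": 2, "bargain_seeker": 8, "big_spender": 1},
    "P003": {"fashionista": 4, "bargain_seeker": 4, "big_spender": 3},
}

# A rule table set by the marketer: which offer goes with which persona.
offer_by_persona = {
    "fashionista": "new-arrivals lookbook",
    "bargain_seeker": "clearance preview",
    "big_spender": "premium line invitation",
}

DEFAULT_OFFER = "general newsletter"


def pick_offer(scores, min_score=6):
    """Return (persona, offer) for the strongest persona, or (None, default offer)
    when no persona score is high enough to act on."""
    persona, score = max(scores.items(), key=lambda kv: kv[1])
    if score < min_score:
        return None, DEFAULT_OFFER
    return persona, offer_by_persona[persona]


assignments = {pid: pick_offer(scores) for pid, scores in persona_scores.items()}
for pid, (persona, offer) in assignments.items():
    print(pid, persona, offer)
```

Because model scores are never missing, every individual gets an assignment, even if it is only the default one.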

And it gets better. Imagine a situation where you have to launch a new product or a product line. It gets tricky for the fashion industry, and even trickier for tech companies that are bold enough to launch something that didn’t exist before, such as a new line of really expensive smartphones. Who among the fans of cutting-edge technologies would actually shell out over a grand for a “phone”? This kind of question applies not just to manufacturers, but every merchant who sells peripherals for such phones.

Let’s imagine that a marketer would go with an old marketing plan for “similar” products that were introduced in the past. They could be similar in terms of “newness” and some basic features, but what if they differ in terms of specific functionality, look-and-feel, price point and even the way users would use them? Trying to copy some old targeting methods may lead to big misses, as even consumers hear about them from time to time.

Such mishaps happen because marketers see consumers as simple extensions of products. Pulling out old tricks may work in some cases, but even if just a few product attributes are different, it won’t work.

Luckily for geeks like us, an individual’s behavior does not change so fast. Sure, we all age a bit every year; but in comparison to products in the market, humans do not transform so suddenly. Simply put, early adopters will remain early adopters, and bargain seekers will continue to be bargain seekers. Spending level on certain product categories won’t change drastically, either.

Our interests and hobbies do change; but again, not so fast. It took me about two to three years to turn from an avid golfer to a non-golfer. And all golf retailers caught up with my inactivity and stopped sending golf offers.

So, if marketers set up personas that “they” need to push their products, and update them periodically (say once a year), they can gain tremendous momentum in reaching out to customers and prospects more proactively. If they just rely on specific product purchases to trigger a series of product recommendations, outreach programs will remain at the level of general promotions.

Further, even inbound visits can be personalized better (granted that you identified the visitor) using the personas and a set of rules in terms of what product goes well with what persona.

The reason why models work well — man-made or machine-built — is because human behavior is predictable with reasonable consistency. We are all extensions of our past behaviors to a greater degree than the evolution rate of products and technologies.

Years ago, we had a heated internal discussion about whether we should create a new series of product categories from VHS to DVD. I argued that such new formats would not change human behavior that much. In fact, genres matter more than video format for the prediction of future purchases. “Godfather” fans will buy the movie again on DVD, and then again in Blu-ray. Now it is some type of ultra-high-definition download from some cloud somewhere. Through all of this, movie collectors remain movie collectors for their favorite types of movies. In other words, products changed, but not human attributes.

That was what I argued then, and I still stand by it. So, all the analytical efforts must be geared toward humans, not products. In coming days, that may be the shortest path to fake human friendliness using AI and machine-made models.


Replacing Unskilled Data Marketers With AI

People react to words like “machine learning” or “artificial intelligence” very differently, depending on their interests and levels of understanding of technology. Some get scared, and among them are smart people like Elon Musk or the late Stephen Hawking. Others, including data marketers who lack strategic skills, may react based on a vague fear of becoming irrelevant, thinking that a machine will replace them in the job market soon.

On the contrary, I find that most marketers welcome terms like machine learning. Many think that, in the near future, computers will automatically perform all the number-crunching and just tell them what to do. In marketing environments where “Do more with less” is the norm, the idea of machines making decisions for them may sound attractive to many marketers. How great would it be if some super-duper-computer did all of the hard work for us? The trouble is that the folks who think like that will be the first ones to be replaced by the machines.

Modern marketing is closely tied into the world of data and analytics (the operative word being “modern,” as there are plenty of marketers still going with their gut feelings). There are countless types of data and analytics applications influencing operations management, R&D or even training programs for world-class athletes, but most of the funding for analytical activities is indeed related to marketing. I’d go even further and claim that most data-related work is profit-driven; either to make more money for organizations or to cut costs in running businesses. In other words, without the bottom-line profit, why bother with any of this geeky stuff?

Yet, many marketers aren’t interested in analytics and some even have fears of lots of numbers being thrown at them. A set of numbers that would excite analytical minds would scare off many marketers. For the record, I blame such an attitude on school systems and jock cultures that have been devaluing the importance of mathematics. It is no accident that most “nerdy” analysts nowadays are from foreign places, where people who are really good at math are not ridiculed among other teenage students but praised or even worshiped.

The joke is that those geeky analysts will be replaced by machines first, as any semi-complex analytical work is delegated to them already. Or will they?

I find it ironic that marketers who have a strong aversion to words like “advanced analytics” or “modeling” would freely embrace machine learning or AI. Because that is like saying you don’t like music, unless it is played by machines. What do they think machine learning is? Some “thinking-slave” that will do all of the work without complaint or asking too many questions?

Machine learning is one of many ways of modeling, whether it is for prediction or pattern recognition. It just became more attractive to the business community as computing power increased over time to accommodate heavy iterations of calculations, and because words like neural net models were replaced by easier sounding “machine learning.”

To wield such machines, nonetheless, one must possess “some” idea about how they work and what they require. Otherwise, it would be like a musically illiterate person trying to produce a piece of music all automatically. Yes, I’ve heard that now there are algorithms that can compose music or write novels on their own, but I would argue that such formulaic music will be a filler in a hotel elevator, at best. If emotionally moving another human being is the goal, one can’t eliminate all human factors out of the equation.

Machines are to automate things that humans already know how to do. And it takes ample amounts of “man-hours” to train the machine, even for the relatively simple task of telling the difference between dogs and cats in pictures. And some other human would have decided that such a task would be meaningful for other humans. Of course, once the machines are set up to learn on their own, a huge momentum will kick in and millions of pictures will be sorted out automatically.

And as such evolution goes on, a whole lot of people may lose their jobs. But not the ones who know how to set the machines up and give them purposes for such work.

Let’s Take a Breath Here

Dialing back to something much simpler: Operations. In automating reports and creating custom messages for target audiences, the goals must be set by stakeholders and machines must be tweaked for such purposes at the beginning. Someday soon, AI will reach the level where it can operate with very general guidelines; but at least for now, requesters must provide logical instructions.

Let’s say a set of reports come out of the computer for the use of marketing analysis. “What reports to show”-type decisions are still being made by humans, but producing useful intelligence in an automated fashion isn’t a difficult task these days. Then what? The users still have to make sense out of all of those reports. Then they must decide what to do about the findings.

There are folks who hope that machines will tell them exactly what to do out of such intel. The first part may come close to their expectation sometime soon, if not already for some. Think of machines producing tidbits like “Hi, human: It looks like over 80% of your customers who shopped last year never came back,” or “The top 10% of your customers, in terms of lifetime spending level, account for over 70% of your yearly revenue, but about half of them show days between transactions far longer than a year.” By the way, mimicking human speech isn’t easy, but if all these numbers are sitting somewhere in the computer, yes, it is possible to expect something like this out of machines.
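
Producing tidbits like these is indeed mechanical once the summary data exist. Below is a minimal sketch with ten made-up customers; the field names and both functions are hypothetical, and for simplicity it ranks customers and measures revenue share on the same lifetime spending figure.

```python
# Hypothetical customer-level summary records.
customers = [
    {"id": "C01", "lifetime_spend": 7500.0, "bought_last_year": True, "bought_this_year": True},
    {"id": "C02", "lifetime_spend": 400.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C03", "lifetime_spend": 350.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C04", "lifetime_spend": 300.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C05", "lifetime_spend": 300.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C06", "lifetime_spend": 250.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C07", "lifetime_spend": 250.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C08", "lifetime_spend": 250.0, "bought_last_year": True, "bought_this_year": False},
    {"id": "C09", "lifetime_spend": 200.0, "bought_last_year": False, "bought_this_year": True},
    {"id": "C10", "lifetime_spend": 200.0, "bought_last_year": False, "bought_this_year": True},
]


def lapsed_share(rows):
    """Share of last year's shoppers who did not buy again this year."""
    last_year = [r for r in rows if r["bought_last_year"]]
    if not last_year:
        return 0.0
    lapsed = sum(1 for r in last_year if not r["bought_this_year"])
    return lapsed / len(last_year)


def top_decile_spend_share(rows):
    """Share of total spending that comes from the top 10% of customers by spend."""
    ranked = sorted(rows, key=lambda r: r["lifetime_spend"], reverse=True)
    top_n = max(1, len(ranked) // 10)
    total = sum(r["lifetime_spend"] for r in ranked)
    if not total:
        return 0.0
    return sum(r["lifetime_spend"] for r in ranked[:top_n]) / total


print(f"{lapsed_share(customers):.1%} of last year's shoppers never came back.")
print(f"The top 10% of customers account for {top_decile_spend_share(customers):.1%} of spending.")
```

Notice that the code has no idea which of these two facts matters more to the business. That part is still on us.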

The hard part for the machines would be picking five to six of the most important tidbits out of hundreds, if not thousands of other “facts,” as that requires understanding of business goals. But we can fake even that type of decision-making by assuming most businesses are about “increasing revenue by acquiring new valuable customers, and retaining them for as long as possible.”

Then the really hard part would be deciding what to do about it. What should you do to make your valuable customers come back? Answering that type of question requires not only an analytical mindset, but a deep understanding in human psychology and business acumen. Analytics consultants are generally multi-dimensional thinkers, and the one-trick ponies who just spit out formulaic answers do not last too long. The same rule would apply to machines, and we may call those one-dimensional machines “posers” too (refer to “Don’t Hire Data Posers”).

But let’s say that by entering thousands of business cases with final solutions and results as a training set into machines, we finally get to have such machine intelligence. Would we be free from having to “think” even a bit?

The short answer is that, like I said in the beginning, such folks who don’t want to analyze anything will become irrelevant even sooner. Why would we need illogical people when the machines are much cheaper and smarter? Besides, even future computers shown in science fiction movies will require “logical” inquiries to function properly. “Asking the right question” will remain a human function, even in a faraway future. And the logical mindset is a result of mathematical training with some aptitude for it, much like musical abilities.

The word “illiterate” used to mean folks who didn’t know how to read and write. In the age of machines, “logic” is the new language. So, dear humans, do not give up on math, if self-preservation is an instinct that you possess. I am not asking everyone to get a degree in mathematics, but I am insisting that we all must learn about ways of scientific approaches to problem-solving and logical methods of defining inquiries. In the future, people who can wield machines will be in secure places — whether they are coders or not — while new breeds of logically illiterate people will be replaced by the machines, one-by-one.

So, before you freely invite advanced thinking machines into your marketing operations, think carefully if you are either the one who gives purpose to such machines (by understanding what’s at stake, and what those numbers all mean), or one who can train machines to solve those pre-defined (by humans) problems.

I am not talking about some doomsday scenario of machines killing people to take over the world; but like any historical events that are described as “revolutions,” this machine revolution will have real impact on our lives. And like anything, it will be good for some, and bad for others. I am saying that data illiterates who would say things like, “I don’t understand what all those numbers mean,” may be ignored by machines — just like they are by smartass analysts. (But maybe without the annoying attitudes.)

Data Geeks Must Learn to Speak to Clients

This piece is for aspiring data scientists, analysts or consultants (or any other cool title du jour in this data and analytics business). Then again, people who spend even a single dime on a data project must remember this, as well: “The main goal of any analytical endeavor is to make differences in business.”

To this, some may say “Duh, keep stating the obvious.” But I am stating the obvious, as too many data initiatives are either for the sake of playing with data at hand, or for the “cool factor” among fellow data geeks. One may sustain such a position for a couple of years if he is lucky, but sooner or later, someone who is paying for all of the data stuff will ask where the money is going. In short, no one will pay for all of those servers, analytical tools and analysts’ salaries so that a bunch of geeks have some fun with data. If you just want the fun part, then maybe you should just stay in academia “paying” tuition for such an experience.

Not too long ago, I encountered a promising resume in a deep pile. Seemingly, this candidate had very impressive credentials. A PhD in statistics from a reputable school, hands-on analytics experience in multiple industries (so he claimed), knowledge in multiple types of statistical techniques, and proficiency in various computing languages and toolsets. But the interview couldn’t have gone worse.

When the candidate was going on and on about minute details of his mathematical journey for a rather ordinary modeling project, I interrupted and asked a very simple question: “Why did you build that model?” Unbelievably, he couldn’t answer that question, and kept resorting back to the methodology part. Unfortunately for him, I was not looking for a statistician, but an analytics consultant. There was just no way that I would put such a mechanical person in front of a client without risking losing the deal entirely.

When I interview to fill a client-facing position, I am not just looking for technical skills. What I am really looking for is an ability to break down business challenges into tangible analytics projects to meet tangible business goals.

In fact, in the near future, this will be all that is left for us humans to do: “To define the problem statement in the business context.” Machines will do all of the tedious data prep work and mathematical crunching after that. (Well, with some guidance from humans, but not requiring line-by-line instructions by many.) Now, if number-crunching is the only skill one is selling, well then, he is asking to be replaced by machines sooner than others.

From my experience, I see that the overlap between a business analyst and a statistical analyst is surprisingly small. Further, let me go on and say that most graduates with degrees in statistics are utterly ill-prepared for the real world challenges. Why?

Once I read an article somewhere (I do not recall the name of the publication or the author) that colleges are not really helping future data scientists in a practical manner, as:

  1. all of the datasets for school projects are completely clean and free of missing data, and
  2. professors set the goals and targets of modeling exercises.

I completely agree with this statement, as I have never seen a totally clean dataset since my school days (which was a long time ago in a galaxy far far away), and defining the target of any model is the most difficult challenge in any modeling project. In fact, for most hands-on analysts, data preparation and target definition are the work. If the target is hung on a wrong place, no amount of cool algorithms will save the day.

Yet, kids graduate from school thinking that they are ready to take on such challenges in the real world on Day One. Sorry to break it to them this way, but no, mathematical skills do not directly translate into the ability to solve problems in the business world. Such training will definitely give them an upper hand in the job market, though, as no math-illiterate should be called an analyst.

Last summer, my team hired two promising interns, mainly to build a talent pool for the following year. Both were very bright kids, indeed, and we gave them two seemingly straightforward modeling projects. The first assignment was to build a model to approximate customer loyalty in a B2B setting. I don’t remember the second assignment, as they spent the entire summer searching for the definition of a “loyal customer” to go after. They couldn’t even begin the modeling part. So more senior members in the team had to do that fun part after they went back to school. (For more details about this project, refer to “The Secret Sauce for B2B Loyalty Marketing.”)

Of course, we as a team knew what we were doing all along, but I wanted to teach these youngsters how to approach a project from the very beginning, as no client will define the target for consultants and vendors. Technical specs? You’re supposed to write that spec from scratch.

In fact, determining if we even need a model to reach the business goal was a test in itself. Why build a model at all? Because it’s a cool thing on your resume? With what data? For what specific success metrics? If “selling more things by treating valuable customers properly” is the goal, then why not build a customer value model first? Why the loyalty model? Because clients just said so? Why not product propensity models, if there are specific products to push? Why not build multiple models and cover all bases while we’re at it? If so, will we build a one-size-fits-all model in one shot, or should we consider separating the universe for distinct segments in the footprint? If so, how would you determine such segments then? (Ah, that “segmentation of the universe” part was where the interns were stuck.)

Boy, did I wish schools spent more time doing these types of problem-solving exercises with their students. Yes, kids will be uncomfortable as these questions do NOT have clear yes or no answers to them. But in business, there rarely are clear answers to our questions. Converting such ambiguity into measurable and quantifiable answers (such as probability that a certain customer will respond to a certain offer, or sales projection of a particular product line for the next two quarters with limited data) is the required skill. Prescribing the right approach and methodology to solve long- and short-term challenges is the job, not just manipulating data and building algorithms.

In other words, mathematical elegance may be a differentiating factor between a mediocre and excellent analyst, but such is not the end goal. Then what should aspiring analysts keep in mind?

In the business world, the goals of data or analytical work are really clear-cut and simple. We work with the data to (1) increase revenue, (2) decrease cost (hence, maximizing profit), or (3) minimize risks. That’s it.

From that point, a good analyst should:

  • Define clear problem statements (even when ambiguity is all around)
  • Set tangible and attainable goals employing a phased approach (i.e., a series of small successes leading to achievement of long-term goals)
  • Examine quality of available data, and get them ready for advanced analytics (as most datasets are NOT model-ready)
  • Consider specific methodologies best fit to solve goals in each phase (as assumptions and conditions may change drastically for each progression, and one brute-force methodology may not work well in the end)
  • Set the order of operation (as sequence of events does matter in any complex project)
  • Determine success metrics, and think about how to “sell” the results to sponsors of the project (even before any data or math work begins)
  • Go about modeling or any other statistical work (only if the project calls for it)
  • Share knowledge with others and make sure resultant model scores and other findings are available to users through their favorite toolsets (even if the users are non-believers of analytics)
  • Continuously monitor the results and re-fit the models for improvement

As you can see here, even in this simplified list, modeling is just an “optional” step in the whole process. No one should build models because they know how to do it. You’re not in school anymore, where the goal is to get an A at the end of the semester. In the real world (although using this term makes me sound like a geezer), data players are supposed to make money with data, with or without advanced techniques. Methodologies? They are just colors on a palette, and you don’t have to use all of them.

For the folks who are in a position to hire math geeks to maximize the value of data, simply ask candidates why they would do anything. If the candidate actually pauses and tries to think from the business perspective, then she is displaying some potential to be a business partner in the future. If the candidate keeps answering this simple question with technical jargon, cut the interview short, unless you have a natural curiosity about the mechanics of models and analytics, and your department’s success is measured only in complexity and elegance of solutions. But I highly doubt that such a goal would rank above increasing profit for the organization in the end.

Why Modeling Beats Rule-Based Segmentation

I cringe when I hear that “rule-based” segments are sitting on top of so-called state-of-the-art campaign engines. This is the year 2018 A.D. It’s the age of abundant data with an ample number of tools and options to harness their true powers. And marketers are still making up rules now? It’s time for marketers to embrace modeling.

I wonder what most of the rules marketers are using are made of. Recency? Certainly, but how recent is recent enough?

Frequency? Sure, why not? The more the merrier, right? But in what timeframe? Are you counting transactions, orders or items? Or just some “events”?

Monetary? Hmm, that’s tricky. Are we using an individual-level lifetime total amount, value of the last transaction, average spending per transaction, average spending amount per year, or what? Don’t tell me you don’t even have individual-level summary data. No customer is just a reflection of her last transaction.

Actually, if a company is using some RFM (Recency, Frequency, Monetary Value) data for targeting, that is not so bad. At least it’s taking a look at what actually happened in terms of monetary transactions, not just clicks and page views, along with basic demographic data.
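To make the recency, frequency and monetary questions above concrete, here is a minimal Python sketch of an individual-level RFM summary. The transaction records, the field layout and the reference date are all invented for illustration; each choice marked in the comments (counting orders rather than items, using a lifetime total) is just one of the options discussed above, not a recommendation.

```python
from datetime import date

# Hypothetical transaction log: (customer_id, order_date, amount)
transactions = [
    ("A", date(2018, 1, 15), 120.0),
    ("A", date(2018, 3, 2), 80.0),
    ("B", date(2017, 6, 30), 400.0),
]

def rfm_summary(rows, as_of):
    """Collapse transaction rows into one summary record per customer."""
    summary = {}
    for customer_id, order_date, amount in rows:
        rec = summary.setdefault(
            customer_id,
            {"last_order": order_date, "frequency": 0, "monetary": 0.0},
        )
        rec["last_order"] = max(rec["last_order"], order_date)
        rec["frequency"] += 1      # counting orders here, not items or events
        rec["monetary"] += amount  # lifetime total, one of several options
    for rec in summary.values():
        rec["recency_days"] = (as_of - rec["last_order"]).days
        rec["avg_order"] = rec["monetary"] / rec["frequency"]
    return summary

rfm = rfm_summary(transactions, as_of=date(2018, 4, 1))
```

Even this toy version forces the decisions that rule-makers tend to skip: what counts as a transaction, over what window, and which dollar figure represents the customer.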

I have been talking about “employing all available data” for targeting and customer insights for some time now. So allow me to pick a different bone today. Let’s forget the data part, and talk about the methodology. When machines can build models super-fast, aversion to modeling only limits the users. After all, I am not asking any marketers to get a degree in statistics. I am just asking them to consider modeling techniques, as this data industry has moved forward from the days when some basic RFM rule sets used to get a passing grade.

Let’s look at the specific reasons why marketers should consider modeling techniques more seriously and ditch rule-based segmentation.

Reason No. 1: Variable Selection

We are surrounded by data, as every move that anyone makes is digitized now. When you describe a buyer, you may need to evaluate hundreds, if not thousands, of data points. Even if you are just using a simple set of demographic data without any behavioral data, we are talking about over 100 variables to consider out of the gate.

Let’s say you want to build a rule to find a good segment for the sale of luxury cruises. How would you pick the most predictive variable for that one purpose? Income and age? That is not a bad start, but that is like using just two colors out of a crayon box containing 80 colors.

Case in point: Do you really believe that the main difference between luxury cruisers and luxury car buyers is “income”? Guess what, those buyers are all rich. You must dig much deeper than that.

Marketers often choose variables that they can easily understand and visualize. Unfortunately, the goal of the targeting exercise should be effectiveness of targeting, not easy comprehension by the marketer.

We often find obscure variables in models. They may “seem” obscure, as a human being would never have instinctively picked them. But mathematics doesn’t care for our opinions. In modeling, variables are picked for their predictive power, nothing else. The bonus is that this is exactly how new patterns are discovered.

We hear tidbits such as “People who tend to watch more romantic comedies are more likely to rent cars over the weekend,” “Aggressive investors are less likely to visit family restaurants” or “High-value customers for a certain teenage apparel company are more likely to be seasonal buyers with high item counts per customer, but relatively lower transaction counts.”

These are the contributing factors found through rigorous mathematical exercises, not someone’s imagination or intuition. But they should always make sense in the end (unless of course, there were errors). Picking the right predictor is indeed the most important step in modeling.

Reason No. 2: Weight Factor

Let’s say that by chance, a user stumbled upon a set of useful predictors of certain customer behavior. Let’s go back to the last example of the teenage apparel company’s high-value customer model. In that one sentence, I listed: seasonality (expressed in number of transactions by month, regardless of year), item counts per customer (with time limits, such as the past 36 months), and number of transactions per customer.

In real life, there would be a far greater number of variables that would pass the initial variable selection process. But for simplicity’s sake, let’s just review these three variables.

Now tell me, which one of these three variables is the most important predictor of this high-value customer model? (Please don’t say they are all equally important.) Model scores are made of selected variables multiplied by the weight of each, as not all predictors carry the same level of predictive power. Some may even be “negatively” correlated to the ideal behavior that we are going after. In this example alone, we saw that the number of items was positively related to high value, while the number of transactions was negatively related. When we investigated this “strange” correlation further, we found out that most of the high-value customers had been trained by the marketer to wait for a big sale, and then buy lots of items in one transaction.

The main trouble with the “rule-based” segmentation or targeting exercise is that human beings put arbitrary weight (or importance) on each variable, even if “the right” variables were picked — mostly by chance — in the first place.

The modeling process reveals the actual balance among all important predictors, with the sole purpose of maximizing predictive power. Conversely, I have never met a person who can “imagine” the dynamics of two or three variables, let alone 10 to 20 (the typical number of variables in models).
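The idea that a model score is a weighted sum of predictors can be sketched in a few lines of Python. The weights, the intercept and the two sample customers below are invented for illustration; in practice the weights come out of a fitted model, not out of anyone’s head. The signs simply mirror the apparel example: item counts help, transaction counts hurt.

```python
# Hypothetical weights, as if produced by a fitted model (values are invented).
# A negative weight means the variable is negatively related to high value.
WEIGHTS = {
    "seasonality_index": 0.8,
    "items_per_customer": 0.5,
    "transactions_per_customer": -0.3,
}
INTERCEPT = 1.0

def model_score(customer):
    """Weighted sum of predictors: intercept + sum(weight * value)."""
    return INTERCEPT + sum(w * customer[name] for name, w in WEIGHTS.items())

# Buys many items in a couple of big-sale transactions.
sale_shopper = {
    "seasonality_index": 2.0,
    "items_per_customer": 12,
    "transactions_per_customer": 2,
}
# Buys a few items spread over many transactions.
frequent_browser = {
    "seasonality_index": 0.5,
    "items_per_customer": 4,
    "transactions_per_customer": 9,
}
```

With these made-up numbers, the sale shopper scores well above the frequent browser, which is exactly the kind of result a hand-written rule favoring “more transactions” would have gotten backward.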

Forget about the recent emergence of machine learning; with or without human statisticians, modeling techniques have been beating rudimentary rules by end-users for decades. If solely left to humans, the No. 1 predictor of any human behavior would be the income of the target. But that is just a reflection of human perception and a one-dimensional way of looking at a complex composition of human behavior. You don’t believe you can explain the difference between a Lexus buyer and a Mercedes buyer with just income, do you?

Reason No. 3: Banding

Much data is composed of numbers and figures. The rest are called categorical variables (i.e., data that cannot be added or subtracted, such as product category or channel description).

Let’s assume that income (not my first pick, as you can see) is found to be predictive for mid- to high-scale female accessory buyers. Surely, different ranges of income would behave differently in such models. If the income is too low, they won’t be able to afford such items. Too high, then the buyer may have moved on to even more expensive handbags. So, the middle ground may seem to be the ideal target. The trouble is that now you have to describe that middle group in terms of actual dollars. Exactly where does that ideal range begin and end? To make it even more complicated, what about regional biases in buying power? Can one set of banding explain the whole thing? We’ve gone way past any intuitive grouping.
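A minimal sketch of what banding looks like in practice, assuming a list of (income, responded) pairs. The cut points, labels and sample records are hypothetical; the point is that a modeler compares response rates across many candidate cut points rather than guessing one set.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical cut points in dollars; a modeler would test many alternatives.
CUTS = [50_000, 100_000, 200_000]
LABELS = ["under 50K", "50K-100K", "100K-200K", "200K and over"]

def income_band(income):
    """Return the label of the band that contains this income."""
    return LABELS[bisect_right(CUTS, income)]

def response_rate_by_band(records):
    """records: iterable of (income, responded) pairs, responded is 0 or 1."""
    counts = defaultdict(lambda: [0, 0])  # band -> [responders, total]
    for income, responded in records:
        band = counts[income_band(income)]
        band[0] += responded
        band[1] += 1
    return {label: r / n for label, (r, n) in counts.items()}

# Invented sample data.
sample = [
    (30_000, 0), (75_000, 1), (90_000, 0),
    (150_000, 1), (150_000, 1), (400_000, 0),
]
rates = response_rate_by_band(sample)
```

Changing `CUTS` and rerunning is trivial for a machine and tedious for a person, and that is before regional differences in buying power enter the picture.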

Moving on to categorical variables, one of the most predictive variables in any B2B modeling is the SIC code. There are thousands of variations in any one field, and they are definitely not numbers (although they look like them). How would one go about putting them into ideal groups to describe the target (such as “loyal customers”)?

If you are selling expensive computer servers, you may put “Agricultural, Fishing and Mining” in a low-priority group. Then, how about all those variations in huge groups, such as “Retail,” “Business Service” or “Manufacturing,” with hundreds of sub-categories? Let’s just say that I’ve never met a human being who went beyond the initial two-digit SIC code in their heads. Good luck creating an effective group with that one variable with rudimentary methods.

Grouping “values” that move together in terms of predictability is not simple. In fact, that is exactly why computers were invented. Don’t struggle with such jobs.
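As a rough illustration of letting the computer do the grouping, the sketch below assigns each code to a tier by its observed response rate. The thresholds, the tier names and the sample records are all assumptions made up for this example; a real exercise would also guard against codes with too few observations.

```python
from collections import defaultdict

def group_codes_by_rate(records, thresholds=(0.05, 0.15)):
    """Assign each code to a 'low', 'mid' or 'high' group by response rate.

    records: iterable of (code, responded) pairs, responded is 0 or 1.
    The thresholds are arbitrary illustrations, not recommended values.
    """
    tally = defaultdict(lambda: [0, 0])  # code -> [responders, total]
    for code, responded in records:
        tally[code][0] += responded
        tally[code][1] += 1
    low_cut, high_cut = thresholds
    groups = {}
    for code, (responders, total) in tally.items():
        rate = responders / total
        if rate >= high_cut:
            groups[code] = "high"
        elif rate >= low_cut:
            groups[code] = "mid"
        else:
            groups[code] = "low"
    return groups

# Invented sample: three four-digit codes with different response patterns.
sample_codes = (
    [("7372", 1)] * 2 + [("7372", 0)] * 2   # 2 of 4 responded
    + [("5411", 1)] + [("5411", 0)] * 9      # 1 of 10 responded
    + [("0111", 0)] * 5                      # 0 of 5 responded
)
code_groups = group_codes_by_rate(sample_codes)
```

The same loop works whether there are three codes or three thousand, which is the whole argument for not doing this in your head.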

These are just a few reasons why we must rely on advanced modeling techniques to navigate through complex data. The benefits of modeling are plenty (refer to “Why Model?”). Compared to our gut feelings, statistical models are much more accurate and consistent. They also reveal previously unseen patterns in data. Because they are summarized answers to specific questions, users do not have to consider hundreds of factors, but just one model score at a time. In the current marketing environment, when things move at light speed, who has time to consider hundreds of data points in real time? Machine learning, leading to full automation, is just a natural extension of modeling.

Each model score is a summary of hundreds of contributing factors. “Responsiveness to email campaigns for a European cruise vacation” is a complex question to answer, especially when we all go through daily data overload. But if the answer is in the form of a simple score (say, one through 10), any user who understands “high is good, low is bad” can make a sound decision at the time of campaign execution.
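One common way to arrive at a one-through-10 score like the one described above is to rank customers by raw model output and cut the ranking into tenths. The sketch below assumes that approach; the function name is hypothetical, and ties are broken by sort order rather than handled specially.

```python
def decile_scores(raw_scores):
    """Map raw scores to 1-10 by rank, with 10 for the highest-scoring tenth."""
    n = len(raw_scores)
    # Indices of customers ordered from lowest to highest raw score.
    order = sorted(range(n), key=lambda i: raw_scores[i])
    deciles = [0] * n
    for rank, i in enumerate(order):
        deciles[i] = rank * 10 // n + 1
    return deciles

# Invented raw model outputs for ten customers.
raw = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
scores = decile_scores(raw)
```

The user at campaign time never sees the raw numbers or the variables behind them, only the “high is good, low is bad” score.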

Marketers already have ample amounts of data and advanced campaign tools. Running such machines with some man-made segmentation rules from the last century is a real shame. No one is asking marketers to become seasoned data scientists; they just need to be more open to advanced techniques. With firm commitments, we can always hire experts, or in the near future, machines that will do the mathematical jobs for us. But marketers must move out of old-fashioned rule-based marketing first.

The Secret Sauce for B2B Loyalty Marketing

Who’s likely to be your valuable customer? What will their value be in the next few years? How long will they continue to do business with you? Which ones are in vulnerable positions, and who’s likely to churn in the next three months? Wouldn’t it be great if you could identify who’s vulnerable among your valuable customers “before” they actually stop doing business with you?

[Image: “business-agreement,” Creative Commons license. Credit: Kevin Johnston via Flickr]

Properly measuring customer loyalty is often a difficult task in a multichannel B2B marketing environment. The first question is often, “Where should we start digging when there are many data silos?” Before embarking on a massive data consolidation project throughout the organization, we suggest defining the problem statements by breaking down what customer loyalty means to you first, as that exercise will narrow down the list of data assets to be dealt with.

Marketers often rely on surveys to measure loyalty. Net Promoter Score, for example, is a good way to measure customer loyalty for the brand. But if you want to be proactive about each customer, you will need to know the loyalty score for everyone in your base. And asking “everyone” is cost-prohibitive and impractical. On top of that, the respondents may not be completely honest about their intentions, especially when it comes to monetary transactions.

That’s where modeling techniques come in. Without asking direct questions, what are the leading indicators of loyalty or churn? What specific behaviors lead to longevity of the relationship or complete attrition? In answering those questions, past behavior is often proven to be a better predictor of future behavior than survey data, as what people say they would do and what they actually do are indeed different.

Modeling is also beneficial because it fills inevitable data gaps. No matter how much data you may have collected, you will never know everything about everyone in your base. Models are tools that make the most of available data assets, summarizing complex datasets into forms of answers to questions. How loyal is Company XYZ? The loyalty model score will express that in a numeric form, such as a score between one and 10 for every entity in question. That would be a lot simpler than setting up rules by digging through a long data dictionary.

Our team recently developed a loyalty model for a leading computing service company in the U.S. The purposes of the modeling exercise were two-fold:

  1. Find a group of customers who are likely to be loyal customers, and
  2. Find the “vulnerable” segment in the base.

This way, the client can treat “potentially” loyal customers as such even before they show all of the signs of loyalty. At the opposite end of the spectrum, the client can proactively contact vulnerable customers if their present or future value is high (you would need a customer value model for that). We would call that the “valuable-vulnerable” segment.
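Combining the two scores into segments can be sketched as a simple lookup. This assumes both the loyalty score and the value score are already on a one-through-10 scale; the cutoffs and the segment names other than “valuable-vulnerable” are hypothetical choices for illustration.

```python
def segment(loyalty_score, value_score, low=3, high=8):
    """Classify a customer from two 1-10 scores.

    The cutoffs `low` and `high` are illustrative assumptions.
    """
    if loyalty_score <= low and value_score >= high:
        return "valuable-vulnerable"  # worth a proactive contact
    if loyalty_score >= high:
        return "loyal"
    if loyalty_score <= low:
        return "vulnerable"
    return "middle"
```

For example, `segment(2, 9)` returns `"valuable-vulnerable"`, while `segment(2, 4)` returns `"vulnerable"`: the same low loyalty score calls for different treatment depending on value.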

More properly, we could have built a separate churn model, but that would have required long historical data in the form of time-series variables (processes for those can be time-consuming and costly). To get to the answer fast with the minimal data that we had access to, we chose to build one loyalty model, making sure that the bottom scores could be used to measure vulnerability, while the top scores indicate loyalty.

What did we need to build this model? Again, to provide a “usable” answer in the shortest time, we only used the past three years of transaction history, along with some third-party firmographic data. We considered promotion and response-history data, technical support data, non-transactional engagement data and client-initiated activity data, but we pushed them out for future enhancement due to difficulties in data procurement.

To define what “loyal” means in mathematical terms for modeling, we considered multiple options, as that word can mean lots of different things. Depending on the purpose, it could mean high value, frequent buyer, tenured customers, or other measurements of loyalty and levels of engagement. Because we were starting with the basic transaction data, we examined many possible combinations of RFM data.

In doing so, we observed that many indicators of loyalty behave radically differently among different segments, defined by spending level in this instance, which is a clear sign that separate models are required. In other cases, such overarching segments can be defined based on region, product line or target groups, too.

So we divided the base into small, medium and large segments, based on annual spending level, then started examining other types of indicators of loyalty for target definition. If we had some survey data, we could have used them to define what “loyal” means. In this case, we mixed combinations of recency and frequency factors, and each segment ended up with a different target definition. For the first round, we defined loyal customers as those with a last transaction date within the past 12 months and total transaction counts within the top 10 to 15 percent range, where the governing idea was to have target universes that are neither “too big” nor “too small.” During this exercise, we concluded that the small segment of big spenders could simply be deemed loyal, and we didn’t need a model to discriminate further.
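The first-round target definition described above (recent activity plus a top-ranked transaction count within one spending segment) might be coded roughly as follows. This is a sketch under stated assumptions, not the team’s actual implementation: the record layout, the function name and the sample customers are invented, and 15 percent is simply the upper end of the range mentioned.

```python
from datetime import date, timedelta

def flag_loyal(customers, as_of, top_percent=15, recency_days=365):
    """Return ids of customers meeting both target conditions.

    customers: list of dicts with 'id', 'last_order' (date) and 'orders' (int),
    all drawn from ONE spending segment.
    """
    counts = sorted((c["orders"] for c in customers), reverse=True)
    k = max(1, len(counts) * top_percent // 100)
    cutoff = counts[k - 1]  # order count of the k-th ranked customer
    earliest = as_of - timedelta(days=recency_days)
    return {
        c["id"]
        for c in customers
        if c["last_order"] >= earliest and c["orders"] >= cutoff
    }

# Invented segment of 20 customers with 1 to 20 orders each; everyone
# purchased recently except the heaviest buyer, who lapsed two years ago.
today = date(2018, 6, 1)
segment_customers = [
    {"id": i, "orders": i, "last_order": today - timedelta(days=30)}
    for i in range(1, 21)
]
segment_customers[-1]["last_order"] = today - timedelta(days=730)

loyal_ids = flag_loyal(segment_customers, as_of=today)
```

Adjusting `top_percent` per segment is how one would keep each target universe from getting too big or too small.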

[Figure: B2B loyalty marketing chart. Credit: Stephen H. Yu]

As expected, models built for small- and medium-level spenders were quite different, in terms of usage of data and weight assigned to each variable. For example, even for the same product category purchases, a recency variable (weeks since the last transaction within the category) showed up as a leading indicator for one model, while various bands of categorical spending levels were important factors for the other. Common variables, such as industry classification code (SIC code), also behaved very differently, validating our decision to build separate models for each spending level segment.

How I Leveraged My 5-Year-Old to Prepare for AI

Over the span of my career, I have had opportunities to mentor future data-driven business leaders. The advice I used to give primarily revolved around the hottest analytical tools and certifications and how to tell stories through data. Five years ago, however, my advice evolved in a very dramatic way, based on a reasonably benign event.

My wife, our two daughters and I were on a multi-state road trip. Early on, we decided to make a pit stop. My wife gave the girls $5 each to buy goodies for the road, with no conditions. Unleashed from the shackles of healthy snacking, my older daughter set about making the most of her newfound economic freedom. Analytically inclined, my oldest began optimizing for the right combination of quantity, quality and taste that would provide her with the maximum overall satisfaction. My younger daughter (five years old at the time) quickly picked up her favorite fruit candy, asked my wife for a suggestion and purchased that, as well. Eager to get back on the road, I asked my oldest to finalize her decision quickly. My request was met with a look of sheer horror and frustration as she frantically searched for the optimal basket of goods that $5 would buy. Hoping that the optimal solution was only minutes away, she begged for more time, to no avail.

Back on the road, my younger daughter offered my wife a substantial portion of the candy she had recommended. Astonished, my wife said, “Sweetie, if you share that with me, you will have less for the trip.” To which my daughter replied, “That’s okay, Mom. I know you like these candies. Can I have another five dollars?” To which my wife uncharacteristically replied: “Of course!” Shocked at this turn of events, my older daughter protested, “What? No fair, you can do that!?”

Data Is an Equal Opportunity Enabler

I often think about that incident, especially when I am trying to help clients achieve better results through analytics. This incident is a great allegorical example of why data-driven decisions, when done well, can improve specific results, but many times fail to change the overall game. A 2015 study by KPMG identified operational efficiencies as the primary beneficiary of data and analytics in the near horizon, and a more recent study in HBR also confirms that most data and analytics success is still focused on low-hanging operational opportunities. In both reports, business leaders also recognize the transformational opportunities of data and analytics. However, they also identify an acute need for new and unique skill sets to make those transformational changes a reality.

This brings me back to the car ride. Before you assume this is a lesson about how customer empathy beats algorithms, I can assure you it is not. Not only has my younger daughter’s strategy failed on several other occasions, but I have also seen plenty of well-researched market advice from customer-centric strategy firms fail, as well. Nor do I believe this anecdote implies optimization leads to strategic myopia. (This is also not about which kid I am betting on, as they both manage to amaze and worry me in equal doses.) Instead, the lesson for me is that while analytical rigor can be foundational to disruptive innovation, the optimal solutions algorithms provide only reflect the audacity of the optimizer’s vision.

The body of recent research on successful disruptors dispels the belief that they are solely the product of a brilliant idea conceived by a highly intuitive visionary. Instead, their very existence is often an optimization exercise involving many experiments. Not only do successful new entrants go through many failed iterations, but they also emerge through the crucible of other competing ventures with similar industry-disrupting objectives. Once emerged and unleashed, there is still no guarantee that the new ventures are the absolutely optimal solution. One needs only to think of MySpace, AOL and Yahoo if there is doubt. As a result, the body of knowledge on innovation is now centering on the concept of failing fast, failing early and failing often. A critical component of the “failing for success” strategy involves testing, measuring, and optimizing rapidly and regularly, but it also involves having a broad view of the playing field and the bravery to challenge existing assumptions.

AI Whisperers Wanted

The career implications of these trends for data-driven talent are significant. As analytics takes a central role in strategic business functions, it does not necessarily mean that my fellow quant jocks will rule the future. This is because traditional optimization algorithms are just beginning to transition into artificial intelligence-based solutions with the ability to learn on their own, and at some point human talent will no longer be needed to build models. If you are in analytics today, it will be important to keep up with the evolution of AI solutions, but even more critical is developing your analytical creativity and bravery.