Data Deep Dive: The Art of Targeting

Even if you own a sniper rifle (and I’m not judging), if you aim at the wrong place, you will never hit the target. Obvious, right? But that happens all the time in the world of marketing, even when advanced analytics and predictive modeling techniques are routinely employed. How is that possible? Well, the marketing world is not like an Army shooting range where the silhouette of the target is conveniently hung at the predetermined location; it is more like the “Twilight Zone,” where things are not what they seem. Marketers who fail to hit the real target often blame the guns, which in this case are targeting tools, such as models and segmentations. But let me ask, was the target properly defined in the first place?

In my previous columns, I talked about the importance of predictive analytics in modern marketing (refer to “Why Model?”) for various reasons, such as targeting accuracy, consistency, deeper use of data, and most importantly in the age of Big Data, concise nature of model scores where tons of data are packed into ready-for-use formats. Now, even the marketers who bought into these ideas often make mistakes by relinquishing the important duty of target definition solely to analysts and statisticians, who do not necessarily possess the power to read the marketers’ minds. Targeting is often called “half-art and half-science.” And it should be looked at from multiple angles, starting with the marketer’s point of view. Therefore, even marketers who are slightly (or, in many cases, severely) allergic to mathematics should come one step closer to the world of analytics and modeling. Don’t be too scared, as I am not asking you to be a rifle designer or sniper here; I am only talking about hanging the target in the right place so that others can shoot at it.

Let us start by reviewing what statistical models are: A model is a mathematical expression of “differences” between dichotomous groups, which, in marketing, are often referred to as “targets” and “non-targets.” Let’s say a marketer wants to target “high-value customers.” To build a model to describe such targets, we also need to define “non-high-value customers.” In marketing, popular targets are often expressed as “repeat buyers,” “responders to certain campaigns,” “big-time spenders,” “long-term, high-value customers,” “troubled customers,” etc. for specific products and channels. Now, for all those targets, we also need to define “bizarro” or “anti-” versions of them. One may think that they are just the “remainders” of the target. But, unfortunately, it is not that simple; the definition of the whole universe must be set first to even bring up the concept of the remainders. In many cases, defining “non-buyers” is much more difficult than defining “buyers,” because a lack of purchase information does not guarantee that the individual in question is indeed a non-buyer. Maybe the data collection was never complete. Maybe he used a different channel to respond. Maybe his wife bought the item for him. Maybe you don’t have access to the entire pool of names that represents the “universe.”

Remember T, C, & M
That is why we need to examine the following three elements carefully when discussing statistical models with marketers who are not necessarily statisticians:

  1. Target,
  2. Comparison Universe, and
  3. Methodology.

I call them “TCM” in short, so that I don’t leave out any element in exploratory conversations. Defining a proper target is the obvious first step. Defining and obtaining data for the comparison universe is equally important, though it can be challenging; without it, you’d have nothing against which to compare the target. Again, a model is an algorithm that expresses the differences between two non-overlapping groups. So, yes, you need both Superman and Bizarro-Superman (who always seems more elusive than his counterpart). And that one important variable that differentiates the target and non-target is called the “dependent variable” in modeling.
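
The T/C/dependent-variable relationship can be sketched in a few lines of code. This is a minimal illustration only; the customer records, field names and the five-order cutoff are all hypothetical, invented for the example.

```python
# Sketch: build a model universe from a Target (T) and a Comparison (C)
# group, flagging each record with a 1/0 dependent variable.
# All data and the threshold are made up for illustration.

customers = [
    {"id": 1, "orders_last_year": 7},
    {"id": 2, "orders_last_year": 0},
    {"id": 3, "orders_last_year": 2},
    {"id": 4, "orders_last_year": 9},
    {"id": 5, "orders_last_year": 0},
]

TARGET_THRESHOLD = 5  # "five-plus" buyers are the target (T)

model_universe = []
for c in customers:
    if c["orders_last_year"] >= TARGET_THRESHOLD:
        model_universe.append({**c, "dependent_var": 1})  # T
    elif c["orders_last_year"] == 0:
        model_universe.append({**c, "dependent_var": 0})  # C: non-buyers only
    # one-to-four-time buyers fall outside this model universe entirely

print([(c["id"], c["dependent_var"]) for c in model_universe])
# → [(1, 1), (2, 0), (4, 1), (5, 0)]
```

Note that the methodology (the M in TCM) never appears here; whatever algorithm is eventually fitted, it learns only the difference between the 1s and the 0s defined above.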

The third element in our discussion is the methodology. I am sure you may have heard of terms like logistic regression, stepwise regression, neural net, decision trees, CHAID analysis, genetic algorithm, etc., etc. Here is my advice to marketers and end-users:

  • State your goals and use cases clearly, and let the analyst pick the proper methodology that suits your goals.
  • Don’t be a bad patient who walks into a doctor’s office demanding a specific prescription before the doctor even examines you.

Besides, for all intents and purposes, the methodology itself matters the least in comparison with an erroneously defined target and the comparison universes. Differences in methodologies are often measured in fractions. A combination of a wrong target and wrong universe definition ends up as a shotgun, if not an artillery barrage. That doesn’t sound so precise, does it? We should be talking about a sniper rifle here.

Clear Goals Leading to Definitions of Target and Comparison
So, let’s roll up our sleeves and dig deeper into defining targets. Allow me to use an example, as you will be able to picture the process better that way. Let’s just say that, for general marketing purposes, you want to build a model targeting “frequent flyers.” One may ask whether that means for business or for pleasure, but let’s just say that such data are hard to obtain at this moment. (Finding the “reasons” is always much more difficult than counting the number of transactions.) And it was collectively decided that it would simply be beneficial to know who is more likely to be a frequent flyer, in general. Such knowledge could be very useful for many applications, not just for the travel industry, but for other affiliated services, such as credit cards or publications. Plus, analytics is about making the best of what you’ve got, not waiting for some perfect dataset.

Now, here is the first challenge:

  • When it comes to flying, how frequent is frequent enough for you? Five times a year, 10 times, 20 times or even more?
  • Over how many years?
  • Would you consider actual miles traveled, or just number of issued tickets?
  • How large are the audiences in those brackets?

If you decided that five times a year is a not-so-big or not-so-small target (yes, sizes do matter) that also fits the goal of the model (you don’t want to target only super-elites, as they could be too rare or too distinct, almost like outliers), to whom are they going to be compared? Everyone who flew less than five times last year? How about people who didn’t fly at all last year?

Actually, one option is to compare people who flew more than five times against people who didn’t fly at all last year, but wouldn’t that model be too much like a plain “flyer” model? Or, will that option provide more vivid distinction among the general population? Or, one analyst may raise her hand and say “to hell with all these breaks and let’s just build a model using the number of times flown last year as the continuous target.” The crazy part is this: None of these options are right or wrong, but each combination of target and comparison will certainly yield very different-looking models.
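The competing definitions above can be written out as simple filters, which makes the trade-off tangible: each choice of T and C produces a different model universe before any statistics even begin. The flight counts below are invented for illustration.

```python
# Sketch: three ways to cut the same "frequent flyer" data into
# target (T) and comparison (C) universes. Data are hypothetical.

flights_last_year = [0, 0, 1, 2, 3, 5, 6, 8, 12, 20]

# Option A: five-plus flyers vs. everyone who flew fewer than five times
t_a = [f for f in flights_last_year if f >= 5]
c_a = [f for f in flights_last_year if f < 5]

# Option B: five-plus flyers vs. non-flyers only (1-to-4-time flyers excluded)
t_b = [f for f in flights_last_year if f >= 5]
c_b = [f for f in flights_last_year if f == 0]

# Option C: no break at all -- model the flight count as a continuous target
continuous_target = flights_last_year

print(len(t_a), "vs.", len(c_a))  # Option A: 5 vs. 5
print(len(t_b), "vs.", len(c_b))  # Option B: 5 vs. 2
```

Option B throws away the murky middle, which sharpens the contrast but shrinks the comparison universe; Option A keeps everyone but blurs the line. Neither is wrong, and that is precisely the point.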

Then what should a marketer do in a situation like this? Again, clearly state the goal and what is more important to you. If this is for general travel-related merchandising, then the goal should be more about distinguishing likely frequent flyers out of the general population; therefore, comparing five-plus flyers against non-flyers—ignoring the one-to-four-time flyers—makes sense. If this project is for an airline to target potential gold or platinum members, using people who don’t even fly as the comparison makes little or no sense. Of course, in a situation like this, the analyst in charge (or data scientist, as we refer to them these days) must come halfway and prescribe exactly what target and comparison definitions would be most effective for that particular user. That requires lots of preliminary data exploration, and it is not all science, but half art.

Now, if I may provide a shortcut in defining the comparison universe, just draw a representative sample from “the pool of names that are eligible for your marketing efforts.” The key word here is “eligible.” For example, many businesses operate within certain areas with certain restrictions or predetermined targeting criteria. It would make no sense to use a U.S. population sample for models for supermarket chains, telecommunications, or utility companies with designated footprints. If the business in question is selling female apparel items, first eliminate the male population from the comparison universe (but I’d leave “unknown” genders in the mix, so that the model can work its magic in that shady ground). You must remember, however, that all this means you need different models when you change the prospecting universe, even if the target definition remains unchanged. Because the model algorithm is the expression of the difference between T and C, you need a new model if you swap out the C part, even if you leave the T alone.
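An eligibility filter like this is usually just a few predicates applied before sampling. The sketch below assumes a women's apparel marketer with a regional footprint; the states, genders and rules are hypothetical stand-ins.

```python
# Sketch: draw the comparison universe only from names "eligible" for
# the marketing effort. Footprint and gender rules are hypothetical.

prospects = [
    {"id": 1, "state": "NY", "gender": "F"},
    {"id": 2, "state": "CA", "gender": "M"},
    {"id": 3, "state": "NY", "gender": "U"},  # unknown gender stays in the mix
    {"id": 4, "state": "TX", "gender": "F"},
]

FOOTPRINT = {"NY", "NJ", "CT"}  # e.g., a regional chain's service area

def eligible(p):
    # Drop known males, keep unknowns, and stay within the footprint.
    return p["state"] in FOOTPRINT and p["gender"] != "M"

comparison_universe = [p for p in prospects if eligible(p)]
print([p["id"] for p in comparison_universe])  # → [1, 3]
```

Change the `eligible` predicate and you have changed C, which means you need a new model even if T stays exactly the same.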

Multiple Targets
Sometimes it gets twisted the other way around, where the comparison universe is relatively stable (i.e., your prospecting universe is stable) but there could be multiple targets (i.e., multiple Ts, like T1, T2, etc.) in your customer base.

Let me elaborate with a real-life example. A while back, we were helping a company that sells expensive auto accessories for luxury cars. The client, following his intuition, casually told us that he only cares for big spenders whose average order sizes are more than $300. Now, the trouble with this statement is that:

  1. Such a universe could be too small to be used effectively as a target for models, and
  2. High spenders do not tend to purchase often, so we may end up leaving out the majority of the potential target buyers in the whole process.

This is exactly why some type of customer profiling must precede the actual target definition. A series of simple distribution reports clearly revealed that this particular client was dealing with a dual-universe situation, where the first group (or segment) is made of infrequent but high-dollar spenders whose average orders were even greater than $300, and the second group is made of very frequent buyers whose average order sizes are well below the $100 mark. If we had ignored this finding, or worse, neglected to run preliminary reports and just relied on our client’s wishful thinking, we would have created a “phantom” target, which is just an average of these dual universes. A model designed for such a phantom target will yield phantom results. The solution? If you find two distinct targets (as in T1 and T2), just bite the bullet and develop two separate models (T1 vs. C and T2 vs. C).
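A toy version of that profiling step shows how the phantom arises: average the two groups together and you describe a customer who does not exist. The order amounts and cutoffs below are invented, not the client's actual figures.

```python
# Sketch: a tiny distribution report exposing a dual-universe target.
# All numbers are hypothetical.

customers = [
    {"id": 1, "avg_order": 350, "orders": 1},
    {"id": 2, "avg_order": 60,  "orders": 12},
    {"id": 3, "avg_order": 420, "orders": 2},
    {"id": 4, "avg_order": 80,  "orders": 9},
    {"id": 5, "avg_order": 55,  "orders": 15},
]

# T1: infrequent, high-dollar spenders
t1 = [c for c in customers if c["avg_order"] > 300]
# T2: very frequent, low-dollar buyers
t2 = [c for c in customers if c["avg_order"] < 100 and c["orders"] >= 9]

# The "phantom" target: one overall average that describes neither group.
phantom_avg = sum(c["avg_order"] for c in customers) / len(customers)
print(len(t1), len(t2), phantom_avg)  # → 2 3 193.0
```

No real customer here spends anywhere near $193 per order, which is why a model aimed at that phantom average would chase shadows; T1 vs. C and T2 vs. C each deserve their own model.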

Multi-step Approach
There are still other reasons why you may need multiple models. Let’s talk about the case of “target within a target.” Some may relate this idea to a “drill-down” concept, and it can be very useful when the prospecting universe is very large and the marketer is trying to reach only the top 1 percent (which can still be very large, if the pool contains hundreds of millions of people). Correctly finding even the top 5 percent in any universe is difficult enough. So what I suggest in this case is to build two models in sequence to get to the “Best of the Best” in a stepwise fashion.

  • The first model would be more like an “elimination” model, where obviously not-so-desirable prospects would be removed from the process, and
  • The second-step model would be designed to go after the best prospects among survivors of the first step.

Again, models are expressions of differences between targets and non-targets, so if the first model eliminates the bottom 80 percent to 90 percent of the universe and leaves the rest as the new comparison universe, you need a separate model—for sure. And lots of interesting things happen at the later stage, where new variables start to show up in algorithms, or important variables in the first step lose steam in later steps. While a bit cumbersome during deployment, the multi-step approach ensures precision targeting, much like a sniper rifle at close range.
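The two-step selection above can be sketched as a pipeline. Real model scores would come from two separately fitted algorithms; here random numbers stand in for both, purely to show the mechanics of eliminate-then-refine.

```python
# Sketch: "target within a target" in two steps. Random scores are
# stand-ins for two hypothetical models; only the mechanics matter here.
import random

random.seed(42)  # deterministic toy data

universe = [{"id": i, "step1_score": random.random()} for i in range(1000)]

# Step 1: the elimination model removes the bottom ~80 percent.
universe.sort(key=lambda p: p["step1_score"], reverse=True)
survivors = universe[: len(universe) // 5]

# Step 2: a second model, built against the survivors as the NEW
# comparison universe, goes after the best of the best.
for p in survivors:
    p["step2_score"] = random.random()  # stand-in for the second model
survivors.sort(key=lambda p: p["step2_score"], reverse=True)
best_of_best = survivors[:10]

print(len(survivors), len(best_of_best))  # → 200 10
```

The key design point is that `step2_score` is computed only within the survivor pool; scoring the full universe with the second model would quietly reintroduce the comparison group the first step was supposed to remove.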

I also suggest this type of multi-step process when clients are attempting to use the result of segmentation analysis as a selection tool. Segmentation techniques are useful as descriptive analytics. But as a targeting tool, they are just too much like a shotgun approach. It is one thing to describe groups of people such as “young working mothers,” “up-and-coming,” and “empty-nesters with big savings” and use them as references when carving out messages tailored toward them. But it is quite another to target such large groups as if the population within a particular segment is completely homogeneous in terms of susceptibility to specific offers or products. Surely, the difference between a Mercedes buyer and a Lexus buyer ain’t income and age, which may have been the main differentiator for segmentation. So, in the interest of maintaining a common theme throughout the marketing campaigns, I’d say such segments are good first steps. But for further precision targeting, you may need a model or two within each segment, depending on the size, channel to be employed and nature of offers.

Another case where the multi-step approach is useful is when the marketing and sales processes are naturally broken down into multiple steps. For typical B-to-B marketing, one may start the campaign by mass mailing or email (I’d say that step also requires modeling). And when responses start coming in, the sales team can take over and start contacting responders through more personal channels to close the deal. Such sales efforts are obviously very time-consuming, so we may build a “value” model measuring the potential value of the mail or email responders and start contacting them in a hierarchical order. Again, as the available pool of prospects gets smaller and smaller, the nature of targeting changes as well, requiring different types of models.

This type of funnel approach is also very useful in online marketing, as the natural steps involved in email or banner marketing go through lifecycles, such as blasting, delivery, impression, clickthrough, browsing, shopping, investigation, shopping basket, checkout (Yeah! Conversion!) and repeat purchases. Obviously, not all steps require aggressive or precision targeting. But I’d say, at the minimum, initial blast, clickthrough and conversion should be looked at separately. For any lifetime value analysis, yes, the repeat purchase is a key step, one that, unfortunately, is often neglected by many marketers and data collectors.

Inversely Related Targets
More complex cases are when some of these multiple response and conversion steps are “inversely” related. For example, many responders to invitation-to-apply type credit card offers are often people with not-so-great credit. Well, if one has a good credit score, would all these credit card companies have left them alone? So, in a case like that, it becomes very tricky to find good responders who are also credit-worthy in the vast pool of a prospect universe.

I wouldn’t go as far as saying that it is like finding a needle in a haystack, but it is certainly not easy. Now, I’ve met folks who go after the likely responders with potential to be approved as a single target. It really is a philosophical difference, but I much prefer building two separate models in a situation like this:

  • One model designed to measure responsiveness, and
  • Another to measure likelihood to be approved.

The major benefit of having separate models is that each model will be able to employ different types and sources of data variables. A more practical benefit for the users is that the marketers will be able to pick and choose what is more important to them at the time of campaign execution. They will obviously go to the top corner bracket first, where both scores are high (i.e., potential responders who are likely to be approved). But as they dial the selection down, they will be able to test responsiveness and credit-worthiness separately.
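With two independent scores on file, campaign selection becomes a matter of setting two dials. The scores and cutoffs below are hypothetical; the point is that each dimension can be loosened or tightened without rebuilding anything.

```python
# Sketch: selecting from two independent model scores at campaign time.
# Scores and cutoffs are made up for illustration.

prospects = [
    {"id": 1, "response_score": 0.9, "approval_score": 0.3},
    {"id": 2, "response_score": 0.8, "approval_score": 0.9},
    {"id": 3, "response_score": 0.4, "approval_score": 0.95},
    {"id": 4, "response_score": 0.2, "approval_score": 0.1},
]

def select(pool, min_response, min_approval):
    # Keep prospects clearing BOTH cutoffs, in file order.
    return [p["id"] for p in pool
            if p["response_score"] >= min_response
            and p["approval_score"] >= min_approval]

# The top corner bracket: likely responders who are also likely approved.
print(select(prospects, 0.7, 0.7))  # → [2]

# Dial the approval cutoff down to test responsiveness on its own.
print(select(prospects, 0.7, 0.2))  # → [1, 2]
```

A single blended "respond-and-approve" model would bake one fixed trade-off into the score; keeping the models separate leaves that trade-off in the marketer's hands.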

Mixing Multiple Model Scores
Even when multiple models are developed with completely different intentions, mixing them up will produce very interesting results. Imagine you have access to scores for “High-Value Customer Model” and “Attrition Model.” If you cross these scores in a simple 2×2 matrix, you can easily create a useful segment in one corner called “Valuable Vulnerable” (a term that my mentor created a long time ago). Yes, one score is predicting who is likely to drop your service, but who cares if that customer shows little or no value to your business? Take care of the valuable customers first.
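Crossing the two scores described above takes only a small amount of code. This sketch assumes hypothetical value and attrition scores on a 0-to-1 scale with a 0.5 cutoff; the quadrant labels other than "Valuable Vulnerable" are my own invented placeholders.

```python
# Sketch: crossing a value score and an attrition score into a 2x2
# matrix. Scores, the 0.5 cutoff and most labels are hypothetical.

customers = [
    {"id": 1, "value_score": 0.90, "attrition_score": 0.80},
    {"id": 2, "value_score": 0.20, "attrition_score": 0.90},
    {"id": 3, "value_score": 0.85, "attrition_score": 0.10},
    {"id": 4, "value_score": 0.30, "attrition_score": 0.20},
]

def quadrant(c, cut=0.5):
    high_value = c["value_score"] >= cut
    high_risk = c["attrition_score"] >= cut
    if high_value and high_risk:
        return "Valuable Vulnerable"   # take care of these first
    if high_value:
        return "Valuable Stable"
    if high_risk:
        return "Low-Value Vulnerable"  # at risk, but worth the rescue cost?
    return "Low-Value Stable"

print({c["id"]: quadrant(c) for c in customers})
```

The cutoffs need not be 0.5, of course; in practice they would be set from score distributions (deciles, for instance), but the mechanics of the cross are the same.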

This type of mixing and matching becomes really interesting if you have lots of pre-developed models. During my tenure at a large data compiling company, we built more than 120 models for all kinds of consumer characteristics for general use. I remember the real fun began when we started mixing multiple models, like combining a “NASCAR Fan” model with a “College Football Fan” model; a “Leaning Conservative” model with an “NRA Donor” model; an “Organic Food” one with a “Cook for Fun” model or a “Wine Enthusiast” model; a “Foreign Vacation” model with a “Luxury Hotel” model or a “Cruise” model; a “Safety and Security Conscious” model or a “Home Improvement” model with a “Homeowner” model, etc., etc.

You see, no one is one dimensional, and we proved it with mathematics.

No One is One-dimensional
Obviously, these examples are just excerpts from a long playbook for the art of targeting. My intention is to emphasize that marketers must consider target, comparison and methodology separately; a combination of these three elements yields the most fitting solution for each challenge, way beyond what some popular toolsets or new statistical methodologies presented at technical conferences can accomplish. In fact, when marketers are able to define the target in a logical fashion with help from trained analysts and data scientists, the effectiveness of modeling and subsequent marketing campaigns increases dramatically. Creating and maintaining an analytics department or hiring an outsourced analytics vendor isn’t enough.

One may be concerned about the idea of building multiple models so casually, but let me remind you that it is the reality in which we already reside, anyway. I am saying this, as I’ve seen too many marketers who try to fix everything with just one hammer, and the results weren’t ideal—to say the least.

It is a shame that we still treat people with one-dimensional tools, such as segmentations and clusters, in this age of ubiquitous and abundant data. Nobody is one-dimensional, and we must embrace that reality sooner rather than later. That calls for rapid model development and deployment, using everything that we’ve got.

Arguing about how difficult it is to build one or two more models here and there is so last century.

Loyalty Programs? We Don’t Need No Stinkin’ Loyalty Programs!

Without fear of (much) argument, it’s a fair statement to say that all companies want, and try to generate and achieve, optimum loyalty from their customer bases. They should want this, because study after study shows the financial rewards of having loyal customers. Some companies reach this goal through superior value delivery, built on quality products and services, and positive, consistent customer experiences. For the past several decades, many companies have relied on customer loyalty cards or programs, by which they can track purchase behavior and give rewards for repeat and volume buying activity.

Customer loyalty programs are especially popular among retailers. Over the years, retailers have found these programs to be powerful business tools within their highly competitive markets. But some retailers have completely disavowed loyalty programs, either never initiating them in the first place or canceling them in favor of reduced pricing. In fact, this has become something of a trend. What’s behind it?

Let’s start with the biggest retailer—Walmart. The company has long claimed that a loyalty program isn’t needed because its prices are so low. Walmart believes that loyalty programs can, indeed, provide excellent information about customers who participate; however, as one Walmart executive put it: “… some of the loyalty programs are very expensive, and we don’t think that serves everyday low cost and everyday low price.” Lower-than-competition everyday prices have been Walmart’s merchandising and marketing mantra since its inception. But, at least for groceries and sundry products, that often isn’t the case. Supermarket chains like Save-A-Lot and Aldi, neither of which has a loyalty program, will often beat Walmart’s item-for-item pricing by a significant margin. And other competitors can use their loyalty programs to selectively pick products, and individual customers, for special pricing—which undermines Walmart.

As for generating customer purchase data, Walmart has a “scan &amp; go” app for mobile devices, which allows customers to scan their own items as they shop; this provides the company with valuable information on what customers are purchasing, the length of time they’re shopping in the store, and what offers and coupons might drive future purchases. Walmart uses additional methods of understanding individual customer purchases. One of these is Walmart credit cards. Another is reloadable MasterCard and Visa debit cards. A third is “Bluebird,” a prepaid debit card that functions as Walmart customers’ alternative to a checking account, with which they can make deposits, pay bills—and shop at Walmart. Like Tesco is already doing in the U.K., Walmart has been considering development of its own bank, which would provide even more customer data.

Asda, a Walmart-owned supermarket chain in the U.K., also has no loyalty program. It is the second-largest supermarket company, behind Tesco; and, as in the U.S., newer low-priced chains, such as Aldi, are actively competing with Asda. In place of a loyalty program, Asda believes it provides customers with what they want most: a “great multichannel retail experience.” The chain, according to executives, focuses on the key fundamentals: prices, quality, convenience and service. Alex Chrusczcz, Asda’s head of insights and pricing, offers two explanations of how the organization is endeavoring to build customer loyalty:

  • “Aspire to treat customers equally, or you’ll create a fractured brand and shopping experience. If you have someone paying one price and another customer with a coupon paying a different price, the perception of the brand is becoming fractured. Make sure it’s consistent.”
  • “Be pragmatic in terms of technology and analytics. They aren’t a silver bullet. Use these tools and combine them with the experience of your team.”

From my perspective, the second explanation is common sense; however, the first statement is really questionable—even counterintuitive, if a subordinate goal of loyalty behavior is to help drive customer-centricity. Simply put, all customers are not equal in value; and marketing strategies that treat them as such often produce lower revenue.

In the U.S., regional supermarket chain Publix has no loyalty program. As a result, the company doesn’t have the ability to track, at a household level, what customers are and aren’t purchasing in its stores. What Publix does instead of loyalty cards is try alternative approaches to building sales. One of these, for example, was a test program in which shoppers could set up an online account to digitally clip coupons; then, in the Publix store, the discounts they’d set up online could be applied automatically by typing in their phone numbers. Publix also has a BOGO program for its own brands, and accepts competitors’ coupons in its stores.

Some retailers do more than emphasize the sales and service fundamentals. They build genuine passion for, and bonding with, the brand by creating a more human, emotional connection. And, though there are few organizations like this, retailers such as Trader Joe’s are the exception that proves the rule. Trader Joe’s has no customer loyalty program. What it has is enthusiasm, achieved through differentiated, ever-changing customer experiences, enhanced by upbeat, helpful employees. This has enabled Trader Joe’s to generate sales per square foot that are double those of Whole Foods. So, another way of stating that Trader Joe’s creates loyalty behavior without a program is to say: The shopping experience is, de facto, the loyalty program.

Now, we come to retailers which had customer loyalty programs, usually of long-standing, and elected to discontinue them. Actually, much of this has been done by one organization, Cerberus Capital Group, the early 2013 purchaser of multiple regional retail supermarket chains from Supervalu (Shaw’s, Acme, Star, Albertson’s and Jewel-Osco). Calling the new positioning “card-free savings,” and reflective of the first strategy stated above by Asda, each of the chains issued statements with themes like “We want buying to be simple for all, so that every (name of company) customer gets the same price whether a loyalty card has been used or not.” Additionally, and again like Asda, these chains have said they will go back to the basics: clean stores, well-stocked shelves, reduced checkout time, clearly marked sale items and creation of a more customer-focused culture. Some of their executives have also theorized that the chains will now adopt a more local-level approach, rather than customer-level, to their decision-making, and that individual store managers will now be more actively involved in driving successful performance.

So, the chains acquired by Cerberus appear to believe that “sunsetting,” or eliminating, these programs is a calculated risk, and that they will still find good ways of providing value to retain more loyal customers, as well as incentives for those with the potential to move beyond infrequent purchasing. Most analysts, however, felt that Cerberus eliminated the programs largely because the chains it purchased were either not mining card data, or not effectively analyzing and applying this material for better marketing and merchandising, thus making the loyalty systems too expensive to maintain.

Cerberus has entered into takeover discussions with California-based Safeway, which also owns Vons and Pavilion. If this sale takes place, it’s a good bet that these chains will also drop their reward cards, because Cerberus-owned supermarkets clearly don’t need, or want, no stinkin’ loyalty programs.