Don’t Ruin Good Models by Abusing Them, Marketers

Modern-day 1:1 marketing is all about precision targeting, using all available data. And precision targeting is not possible with a few popular variables selected based on human intuition.

If human intuition took over the process, every piece of targeting logic would start with income and age. But let me ask you this: Do you really think the differences between Mercedes and Lexus buyers come down to income and age? If that is too tricky, how about the differences between travelers on luxury cruise lines and buyers of luxury cars? Would those be explained by income and age?

I’m sorry to break it to you bluntly, but all of those targets are rich. To come up with more effective targeting logic, you must dig deeper through data for other clues. And that’s where algorithmic solutions come into play.

I’ve worked with many smart people over the years, but I’ve never met a human who is capable of seeing through interactions among complex data variables without a computer. Some may understand two- or even three-dimensional interactions, when presented in a graphic format, but never more than that. By contrast, a simple regression model routinely incorporates 10 to 20 variables and provides us with rank orders in the form of simple scores. Forget next-generation AI algorithms; humans have been solidly beaten by computers for decades when it comes to precision targeting.

So, when you have a dire need for more accurate targeting (i.e., you want to be mostly right, not mostly wrong) and have an ample amount of data (i.e., more data variables than you can easily handle), don’t even hesitate to go with statistical models. Resistance is simply futile. In the age of abundant data, we need models more than ever, as they convert mounds of data into digestible answers to questions. (For an extended list of benefits, refer to one of my earlier articles, “Why Model?”)

But today, I am not writing this article to convince non-believers to become believers in statistical models. Quite frankly, I just don’t care if someone still is a non-believer in this day and age. It’s his loss, not mine. This not-so-short article is for existing users of models, who may have ruined them by abusing them from time to time.

As a data and analytics consultant, I get called in when campaign results are less than satisfactory; even when statistical models were actively employed in the target selection process. The most common expression I hear in such cases is, “The model didn’t work.” But when I dig through the whole process, I often find that the model algorithm is the only error-free item. How ironic.

I’ve talked about “analytics-readiness” so many times already, and, yes, inadequate sets of input data can definitely ruin models. So allow me to summarize the ways users wreck perfectly adequate models “after” they have been developed and validated. Unfortunately, there are many; let me introduce a few major ones.

Using the Model in a Wrong Universe

Without a doubt, setting a wrong target will lead to an unusable model. Now, an equally important factor as the “target definition” is the “comparison universe.” If you are building a response model, for example, responders (i.e., targets) will be compared to non-responders (i.e., non-targets). If you are off in one of those, the whole model will be wrong — because a model is nothing but a mathematical expression of differences between the two dichotomous groups. This is why setting a proper comparison universe — generally, a sample out of the pool of names that you are using for the campaign — is equally as important as setting the right target.
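
To make the target-versus-comparison idea concrete, here is a minimal sketch in Python (pandas and scikit-learn), assuming a hypothetical campaign-history table with a binary `responded` flag; the column names and toy numbers are illustrative, not a prescription of any particular methodology.

```python
# Minimal sketch: target vs. comparison universe for a response model.
# Assumes a campaign-history table of names actually contacted, with a
# binary "responded" flag and a few illustrative predictors.
import pandas as pd
from sklearn.linear_model import LogisticRegression

campaign_df = pd.DataFrame({
    "tenure_months":    [3, 48, 12, 60, 7, 24, 36, 2],
    "past_purchases":   [0, 9, 2, 12, 1, 4, 6, 0],
    "email_clicks_90d": [1, 7, 0, 5, 2, 3, 4, 0],
    "responded":        [0, 1, 0, 1, 0, 1, 1, 0],  # target = 1, comparison = 0
})

# Targets (responders) are compared against non-responders drawn from the
# SAME pool of names used for the campaign -- not some unrelated universe.
X = campaign_df[["tenure_months", "past_purchases", "email_clicks_90d"]]
y = campaign_df["responded"]

# The model is simply a mathematical expression of the differences between
# the two dichotomous groups, expressed as a score for each record.
model = LogisticRegression().fit(X, y)
campaign_df["response_score"] = model.predict_proba(X)[:, 1]
print(campaign_df.sort_values("response_score", ascending=False))
```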

Further, let’s say that you want to use models within preset universes, based on region, age, gender, income, past spending level, certain number of clicks, or any other segmentation rules. Such universe definitions — mostly about exclusion of obvious non-targets — should be determined “before” the model development phase. When such divisions are made, applying the model built for one universe (e.g., a regional model for the Mid-Atlantic) to another universe (e.g., the Pacific Northwest region) will not provide good results, other than with some dumb luck.

Ignoring the Design Principle of the Model

Like buildings or cars, models are built for specific purposes. If I may list a few examples:

  • “Future customer value estimation, in dollars”
  • “Propensity to purchase in response to discount offers via email”
  • “Product affinity for a certain product category”
  • “Loyalty vs. churn prediction”
  • “Likelihood to be a bargain-seeker”
  • Etc.

This list could be as long as what you want as a marketer.

However, things start to go wrong when the user starts ignoring (or forgetting) the original purpose of the model. Years back, my team built a model for a luxury cruise line for a very specific purpose. The brand was very reputable, so it had no trouble filling staterooms with balconies at a higher price point. But it did have some challenges filling inside staterooms at a relatively high price of entry, which was equivalent to a window room on a less fancy ship. So, the goal was to find cruisers who would take up inside staterooms, for the brand value, on Europe-bound ships departing U.S. ports between Thanksgiving and Christmas. A very specific target? You bet.

Trouble arose because the model worked all too well for the cruise line. So, without any further consultation with any analysts, they started using that model for other purposes. We got phone calls only after the attempt failed miserably. Now, is that really the fault of the model? Sure, you can heat up your house with a kitchen oven, but don’t blame the manufacturer when it breaks down from that kind of abuse. I really don’t think the warranty applies there.

Playing With Selection Rules

Some marketers are compelled to add more rules after the fact, probably out of sheer enthusiasm for success. For instance, a person in charge of a campaign may come up with an idea at the last minute and add a few rules on top of the model selection, as in “Let’s send mail only to male prospects in the high-score group.” What this means is that he just added the strongest variable on top of a good model, which may include 15 to 20 variables, all carefully weighted by a seasoned statistician. This type of practice may not lead to a total disaster, but the effectiveness of the model in question is definitely diluted by the post-selection rules.

When the bad results start to come in, again, don’t blame the modeler for it. Because “you” essentially redesigned the model by adding new variables on top of existing predictors. Unfortunately, this type of last-minute meddling is quite common. If you have a good reason to do any “post-selection,” please talk to the analyst before the model is built, so that she can incorporate the rule as a “pre-selection” logic. She may give you multiple models fitted for multiple universes, too.
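
For those who like to see the difference spelled out, here is a minimal sketch in Python (pandas) contrasting a last-minute post-selection rule with a proper pre-selection universe; the DataFrame, the gender rule and the scores are all hypothetical.

```python
# Minimal sketch: post-selection vs. pre-selection, on a hypothetical
# scored file with a "model_score" column and a "gender" field.
import pandas as pd

scored_df = pd.DataFrame({
    "customer_id": range(1, 9),
    "gender":      ["M", "F", "M", "F", "M", "F", "M", "F"],
    "model_score": [0.91, 0.88, 0.75, 0.72, 0.55, 0.40, 0.31, 0.10],
})

# Post-selection (what to avoid): the model was built on the full universe,
# but a new rule is slapped on top of the scores at the last minute.
post_selected = scored_df[scored_df["gender"] == "M"].nlargest(3, "model_score")

# Pre-selection (what to ask for): define the male-only universe FIRST, and
# have the analyst build and validate the model on that universe instead.
male_universe = scored_df[scored_df["gender"] == "M"]
# ...model development and scoring would then happen on `male_universe`.
print(post_selected)
```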

Realigning the Model Groups in Arbitrary Ways

Model scores are just long numbers — with eight or nine decimal places, in general. It is hard to use sheer numeric values like that, so kind modelers generally break the scored universe into 10 or 20 equal groups. (We call them decile or demi-decile groups.)

For instance, each decile group would represent 10% of the development and validation samples. When applied to the campaign universe, resultant score groups should not deviate too much from that 10% mark.
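
Here is a minimal sketch in Python (pandas) of how a scored file might be broken into decile groups; the random scores are purely illustrative.

```python
# Minimal sketch: breaking a scored universe into 10 equal decile groups.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
scored_df = pd.DataFrame({"model_score": rng.random(10_000)})  # illustrative scores

# pd.qcut cuts on score quantiles, so each decile holds roughly 10% of records.
# Label 1 = best (highest scores), 10 = worst, per common usage.
scored_df["decile"] = pd.qcut(
    scored_df["model_score"], q=10, labels=range(10, 0, -1)
).astype(int)

# Each group should sit near the 10% mark on development and validation samples.
print(scored_df["decile"].value_counts(normalize=True).sort_index())
```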

If you see big bumps in model group sizes, it is a clear sign that something went wrong in the scoring, that there were significant changes in the input variables, or that the model is losing its effectiveness over time.

I’ve seen cases where users just realigned the model score groups after the fact, simply because the groups were not showing an equal 10% break anymore. That is like covering serious wounds with makeup. Did the model work after that? Please take a wild guess.

Using Expired Models

Models do have limited shelf-lives. Models lose their predictive power over time, as market conditions, business models, data sources, data procurement methods, and target profiles all inevitably go through changes.

If you detect signs of lagging results or wide fluctuations in the model group distribution (e.g., showing only 3% in the top decile, which is supposed to be around 10%), it is time to review the model. In mild cases, modelers may be able to refit the model. But in this day and age of fast computers and automation, I recommend full redevelopment of the model in question at the first sign of trouble.
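
A simple health check might look like the following Python (pandas) sketch; the decile counts are made up to show a top decile that has collapsed to roughly 3%.

```python
# Minimal sketch: comparing a live scoring run's decile distribution to the
# ~10% per group expected from development. Counts below are illustrative.
import pandas as pd

live_deciles = pd.Series(
    [1] * 30 + [2] * 110 + [3] * 120 + [4] * 105 + [5] * 100 +
    [6] * 95 + [7] * 110 + [8] * 115 + [9] * 105 + [10] * 110
)  # decile 1 = top model group

observed = live_deciles.value_counts(normalize=True).sort_index()
drift = (observed - 0.10).abs()

# Flag deciles that have wandered far from the expected 10% share.
flagged = drift[drift > 0.03]
if not flagged.empty:
    print("Time to review (refit or redevelop) the model; groups off target:")
    print(flagged.round(3))
```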

Ignoring the ‘Level’ of Prediction

A model for target marketing exists to rank records from high to low scores, according to its design principle. If you built an affinity model for “Likely to be an early adopter,” a high score means the target is more likely to be an early adopter, and a low score means she’s less likely to be one. Now, the level of the record matters here. What are you really ranking, anyway?

The most common ones are the individual and household levels. It is possible to build a model on an email level, as one individual may have multiple email addresses. If you are in the telecom business, you may not even care about household-level identity, as the “house” may be the target, regardless of who lives there.

In the application stage, matching the “level” of prediction is important. For household models, it is safe to assume that almost all predictors in the model are on a household level. Applying such models on a different level may negatively affect the model performance. A definite “no” is using a household-level score for an address without knowing who lives there. One may think, “How different will the new mover be from the old resident?” But considering the wide variety of demographic variables commonly used in models, it is something that no modeler would recommend. If the model employed any transaction or behavioral data, don’t even think about switching levels like that. You’d be better off building a regional model (such as a ZIP-level model) using only geo-demographic data.
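
A quick sanity check before scoring might look like this Python (pandas) sketch; the `household_id` field and the toy rows are hypothetical.

```python
# Minimal sketch: confirming the scoring file matches the model's level
# (here, a household-level model applied to a file keyed by household_id).
import pandas as pd

scoring_df = pd.DataFrame({
    "household_id":   [101, 101, 102, 103],  # 101 appears twice: individual-level rows!
    "individual_id":  [1, 2, 3, 4],
    "hh_income_band": ["C", "C", "A", "B"],
})

dupes = int(scoring_df["household_id"].duplicated().sum())
if dupes:
    raise ValueError(
        f"{dupes} duplicate household_id row(s): summarize the file to the "
        "household level before applying a household-level model."
    )
```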

Applying Average Scores to Non-Matches or Non-Scorable Records

Sometimes, scores are missing because of non-matches in the data append process, or strict universe definition using pre-selection rules. It can be tempting to apply some “average” score to cover the missing ones, but that is a big no-no, as well. Statisticians may perform such imputation on a variable level to fill missing values, but not with model scores.

If you really have to have a score for every record, build separate models for non-match or non-select universes, using any available data (if there are any to be used). In CRM models, no one should just drop non-matches into demographic files, as the main drivers of such models would be transaction and behavioral data. Let missing values play out in the model (refer to “Missing Data Can Be Meaningful”).
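
In code, the “don’t impute an average score” advice might look like this Python (pandas) sketch, where `model_score` is assumed to be missing for non-matches or records outside the pre-selection universe.

```python
# Minimal sketch: handling non-scorable records without faking an average score.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "model_score": [0.82, np.nan, 0.45, np.nan, 0.67],  # NaN = non-match/non-select
})

# Do NOT do this:
# df["model_score"] = df["model_score"].fillna(df["model_score"].mean())

# Instead, split the file: rank and select from the scored universe, and set
# aside the non-scorable records for a separate model (if other data exist)
# or exclude them from the selection.
scored       = df[df["model_score"].notna()].sort_values("model_score", ascending=False)
non_scorable = df[df["model_score"].isna()]
print(len(scored), "scorable;", len(non_scorable), "set aside")
```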

For prospecting, once you set up a pre-selection universe (hopefully, after some profile analysis), don’t look back and just go with a “scored” universe. Records with missing scores are generally not salvageable, in practice.

Go Forth and Do Good Business With Models, Marketers

As you can see, there are many ways to mess up a good model. A model is not an extension of rudimentary selection rules, so please do NOT treat it that way. Basically, do not put diesel fuel in a gasoline car, and hope to God that the engine will run smoothly. And when — not if — the engine stalls, don’t blame the engineer.

Models may be built, but the work is not nearly done until they are properly applied and deployed in live campaigns. When in doubt, always consult with the analyst in charge; hopefully, before the drop date.

How to Outsource Analytics

In this series, I have been emphasizing the importance of statistical modeling in almost every article. While there are plenty of benefits of using statistical models in a more traditional sense (refer to “Why Model?”), in the days when “too much” data is the main challenge, I would dare to say that the most important function of statistical models is that they summarize complex data into simple-to-use “scores.”

The next important feature would be that models fill in the gaps, transforming “unknowns” to “potentials.” You see, even in the age of ubiquitous data, no one will ever know everything about everybody. For instance, out of 100,000 people you have permission to contact, only a fraction will be “known” wine enthusiasts. With modeling, we can assign scores for “likelihood of being a wine enthusiast” to everyone in the base. Sure, models are not 100 percent accurate, but I’ll take “70 percent chance of afternoon shower” over not knowing the weather forecast for the day of the company picnic.

I’ve already explained other benefits of modeling in detail earlier in this series, but if I may cut it really short, models will help marketers:

1. In deciding whom to engage, as they cannot afford to spam the world and annoy everyone who can read, and

2. In determining what to offer once they decide to engage someone, as consumers are savvier than ever and they will ignore and discard any irrelevant message, no matter how good it may look.

OK, then. I hope you are sold on this idea by now. The next question is, who is going to do all that mathematical work? In a country where jocks rule over geeks, it is clear to me that many folks are more afraid of mathematics than public speaking, which, in its own right, ranks higher than death in terms of the fear factor for many people. If I may paraphrase “Seinfeld,” many folks are figuratively more afraid of giving a eulogy than being in the coffin at a funeral. And thanks to a sub-par math education in the U.S. (and I am not joking about this, having graduated high school on foreign soil), yes, the fear of math tops them all. Scary, huh?

But that’s OK. This is a big world, and there are plenty of people who are really good at mathematics and statistics. That is why I purposefully never got into the mechanics of modeling techniques and related programming issues in this series. Instead, I have been emphasizing how to formulate questions, how to express business goals in a more logical fashion and where to invest to create analytics-ready environments. Then the next question is, “How will you find the right math geeks who can make all your dreams come true?”

If you have a plan to create an internal analytics team, there are a few things to consider before committing to that idea. Too many organizations just hire one or two statisticians, dump all the raw data onto them, and hope to God that they will figure some ways to make money with data, somehow. Good luck with that idea, as:

1. I’ve seen so many failed attempts like that (frankly, I’d be shocked if it actually worked), and

2. I am sure God doesn’t micromanage statistical units.

(Similarly, I am almost certain that she doesn’t care much for football or baseball scores of certain teams, either. You don’t think God cares more for the Red Sox than the Yankees, do ya?)

The first challenge is locating good candidates. If you post any online ad for “Statistical Analysts,” you will receive a few hundred resumes per day. But the hiring process is not that simple, as you should ask the right questions to figure out who is the real deal and who is a poser (and there are many posers out there). Even among qualified candidates with ample statistical knowledge, there are differences between the “Doers” and the “Vendor Managers.” Depending on your organizational goal, you must differentiate the two.

Then the next challenge is keeping the team intact. In general, mathematicians and statisticians are not solely motivated by money; they also want constant challenges. Like any smart and creative folks, they will simply pack up and leave if “they” determine that the job is boring. Just a couple of modeling projects a year with some rudimentary sets of data? Meh. Boring! Promises of upward mobility only work for a fraction of them, as the majority would rather deal with numbers and figures, showing no interest in managing other human beings. So, coming up with interesting and challenging projects, which will also benefit the whole organization, becomes a job in itself. If there are not enough challenges, the smart ones will quit on you first. They also need constant mentoring, as even the smartest statisticians will not know everything about the challenges associated with marketing, target audiences and the business world, in general. (If you stumble upon a statistician who is even remotely curious about how her salary is paid for, start with her.)

Further, you would need to invest in setting up an analytical environment, as well. That includes software, hardware and supporting staff. Toolsets are becoming much cheaper, but they are not exactly free yet. In fact, some famous statistical software, such as SAS, could be quite expensive year after year, although there are plenty of alternatives now. And they need an “analytics-ready” data environment, as I have emphasized countless times in this series (refer to “Chicken or the Egg? Data or Analytics?” and “Marketing and IT; Cats and Dogs”). Such data preparation work is not for statisticians, and most of them are not even good at cleaning up dirty data, anyway. That means you will need different types of developers/programmers on the analytics team. I pointed out that analytical projects call for a cohesive team, not some super-duper analyst who can do it all (refer to “How to Be a Good Data Scientist”).

By now you would say “Jeez Louise, enough already,” as all this is just too much to manage to build just a few models. Suddenly, outsourcing may sound like a great idea. Then you would realize there are many things to consider when outsourcing analytical work.

First, where would you go? Everyone in the data industry and their cousins claim that they can take care of analytics. But in reality, it is a scary place where many who have “analytics” in their taglines do not even touch “predictive analytics.”

Analytics is a word that is abused as much as “Big Data,” so we really need to differentiate its meanings. “Analytics” may mean:

  • Business Intelligence (BI) Reporting: This is mostly about the present, such as the display of key success metrics and dashboard reporting. While it is very important to know about the current state of business, much of so-called “analytics” unfortunately stops right here. Yes, it is good to have a dashboard in your car now, but do you know where you should be going?
  • Descriptive Analytics: This is about how the targets “look.” Common techniques such as profiling, segmentation and clustering fall under this category. These techniques are mainly for describing the target audience to enhance and optimize messages to them. But using these segments as a selection mechanism is not recommended, while many dare to do exactly that (more on this subject in future articles).
  • Predictive Modeling: This is about answering questions about the future. Who would be more likely to behave certain ways? What communication channels will be most effective for whom? How much is the potential spending level of a prospect? Who is more likely to be a loyal and profitable customer? What are their preferences? Response models, various types of cloning models, value models, revenue models, attrition models, etc., all fall under this category, and they require hardcore statistical skills. Plus, as I emphasized earlier, these model scores compact large amounts of complex data into nice bite-size packages.
  • Optimization: This is mostly about budget allocation and attribution. Marketing agencies (or media buyers) generally deal with channel optimization and spending analysis, at times using econometrics models. This type of statistical work calls for different types of expertise, but many still insist on calling it simply “analytics.”

Let’s say that for the purpose of customer-level targeting and personalization, we decided to outsource the “predictive” modeling projects. What are our options?

We may consider:

  • Individual Consultants: In-house consultants are dedicated to your business for the duration of the contract, guaranteeing full access like an employee. But they are there for you only temporarily, with one foot out the door all the time. And when they do leave, all the knowledge walks away with them. Depending on the rate, the costs can add up.
  • Standalone Analytical Service Providers: Analytical work is all they do, so you get focused professionals with broad technical and institutional knowledge. Many of them are entrepreneurs, but that may work against you, as they could often be understaffed and stretched thin. They also tend to charge for every little step, with not many freebies. They are generally open to use any type of data, but the majority of them do not have secure sources of third-party data, which could be essential for certain types of analytics involving prospecting.
  • Database Service Providers: Almost all data compilers and brokers have statistical units, as they need to fill in the gap within their data assets with statistical techniques. (You didn’t think that they knew everyone’s income or age, did you?) For that reason, they have deep knowledge in all types of data, as well as in many industry verticals. They provide a one-stop shop environment with deep resource pools and a variety of data processing capabilities. However, they may not be as agile as smaller analytical shops, and analytics units may be tucked away somewhere within large and complex organizations. They also tend to emphasize the use of their own data, as after all, their main cash cows are their data assets.
  • Direct Marketing Agencies: Agencies are very strategic, as they touch all aspects of marketing and control creative processes through segmentation. Many large agencies boast full-scale analytical units, capable of all types of analytics that I explained earlier. But some agencies have very small teams, stretched really thin—just barely handling the reporting aspect, not any advanced analytics. Some just admit that predictive analytics is not part of their core competencies, and they may outsource such projects (not that it is a bad thing).

As you can see here, there is no clear-cut answer to “with whom you should work.” Basically, you will need to check out all types of analysts and service providers to determine the partner best suited for your long- and short-term business purposes, not just analytical goals. Often, many marketers just go with the lowest bidder. But pricing is just one of many elements to be considered. Here, allow me to introduce “10 Essential Items to Consider When Outsourcing Analytics.”

1. Consulting Capabilities: I put this on the top of the list, as being a translator between the marketing and the technology world is the most important differentiator (refer to “How to Be a Good Data Scientist”). They must understand the business goals and marketing needs, prescribe suitable solutions, convert such goals into mathematical expressions and define targets, making the best of available data. If they lack strategic vision to set up the data roadmap, statistical knowledge alone will not be enough to achieve the goals. And such business goals vary greatly depending on the industry, channel usage and related success metrics. Good consultants always ask questions first, while sub-par ones will try to force-fit marketers’ goals into their toolsets and methodologies.

Translating marketing goals into specific courses of action is a skill in itself. A good analytical partner should be capable of building a data roadmap (not just statistical steps) with a deep understanding of the business impact of resultant models. They should be able to break down larger goals into smaller steps, creating proper phased approaches. The plan may call for multiple models, all kinds of pre- and post-selection rules, or even external data acquisition, while remaining sensitive to overall costs.

The target definition is the core of all these considerations, which requires years of experience and industry knowledge. Simply, the wrong or inadequate targeting decision leads to disastrous results, no matter how sound the mathematical work is (refer to “Art of Targeting”).

Another important quality of a good analytical partner is the ability to create usefulness out of seemingly chaotic and unstructured data environments. Modeling is not about waiting for the perfect set of data, but about making the best of available data. In many modeling bake-offs, the winners are often decided by the creative usage of provided data, not just statistical techniques.

Finally, the consultative approach is important, as models do not exist in a vacuum; they have to fit into the marketing engine. Beware of the ones who want to change the world around their precious algorithms, as they are geeks, not strategists. And the ones who understand the entire marketing cycle will give advice on what the next phase should be, as marketing efforts must be perpetual, not transient.

So, how will you find consultants? Ask the following questions:

  • Are they “listening” to you?
  • Can they repeat “your” goals in their own words?
  • Do their roadmaps cover both short- and long-term goals?
  • Are they confident enough to correct you?
  • Do they understand “non-statistical” elements in marketing?
  • Have they “been there, done that” for real, or just in theories?

2. Data Processing Capabilities: I know that some people look down upon the word “processing.” But data manipulation is the most important key step “before” any type of advanced analytics even begins. Simply, “garbage-in, garbage out.” And unfortunately, most datasets are completely unsuitable for analytics and modeling. In general, easily more than 80 percent of model development time goes into “fixing” the data, as most are unstructured and unrefined. I have been repeatedly emphasizing the importance of a “model-ready” (or “analytics-ready”) environment for that reason.

However, the reality dictates that the majority of databases are indeed NOT model-ready, and most of them are not even close to it. Well, someone has to clean up the mess. And in this data business, the last one who touches the dataset becomes responsible for all the errors and mistakes made to it thus far. I know it is not fair, but that is why we need to look at the potential partner’s ability to handle large and really messy data, not just the statistical savviness displayed in glossy presentations.

Yes, that dirty work includes data conversion, edit/hygiene, categorization/tagging, data summarization and variable creation, encompassing all kinds of numeric, character and freeform data (refer to “Beyond RFM Data” and “Freeform Data Aren’t Exactly Free”). It is not the most glorious part of this business, but data consistency is the key to successful implementation of any advanced analytics. So, if a model-ready environment is not available, someone had better know how to make the best of whatever is given. I have seen too many meltdowns in “before” and “after” modeling steps due to inconsistencies in databases.

So, grill the candidates with the following questions:

  • Do they support file conversions, edits, categorization and summarization?
  • How big a dataset is too big, and how many files/tables are too many for them?
  • How much free-form data is too much for them?
  • Can they share sample model variables that they have created in the past?

3. Track Records in the Industry: It can be argued that industry knowledge is even more crucial to success than statistical know-how, as nuances are often “Lost in Translation” without relevant industry experience. In fact, some may not even be able to carry on a proper conversation with a client without it, leading to all kinds of wrong assumptions. I have seen a case where “real” rocket scientists messed up models for credit card campaigns.

The No. 1 reason why industry experience is important is that everyone’s success metrics are unique. Just to name a few, financial services (banking, credit card, insurance, investment, etc.), travel and hospitality, entertainment, packaged goods, online and offline retail, catalogs, publication, telecommunications/utilities, non-profit and political organizations all call for different types of analytics and models, as their business models and the way they interact with target audiences are vastly different. For example, building a model (or a database, for that matter) for businesses that hand over merchandise “before” they collect money is fundamentally different from doing so for the ones where the exchange happens simultaneously. Even a simple concept like payment date or transaction date cannot be treated the same way. For retailers, recent dates could be better for business, but for a subscription business, older dates may carry more weight. And these are just some examples with “dates,” before touching any dollar figures or other fun stuff.

Then the job gets even more complicated, if we further divide all of these industries by B-to-B vs. B-to-C, where available data do not even look similar. On top of that, divisional ROI metrics may be completely different, and even terminology and culture may play a role in all of this. When you are a consultant, you really don’t want to stop the flow of a meeting to clarify some unfamiliar acronyms, as you are supposed to know them all.

So, always demand specific industry references and examine client rosters, if allowed. (Many clients specifically ask vendors not to use their names as references.) Basically, watch out for the ones who push one-size-fits-all, cookie-cutter solutions. You deserve way more than that.

4. Types of Models Supported: Speaking of cookie-cutter stuff, we need to be concerned with types of models that the outsourcing partner would support. Sure, nobody employs every technique, and no one can be good at everything. But we need to watch out for the “One-trick Ponies.”

This could be a tricky issue, as we are going into a more technical domain. Plus, marketers should not self-prescribe specific techniques; instead, they should clearly state their business goals (refer to “Marketing and IT; Cats and Dogs”). Some of the modeling goals are:

  • Rank and select prospect names
  • Lead scoring
  • Cross-sell/upsell
  • Segment the universe for messaging strategy
  • Pinpoint the attrition point
  • Assign lifetime values for prospects and customers
  • Optimize media/channel spending
  • Create new product packages
  • Detect fraud
  • Etc.

Unless you have successfully dealt with the outsourcing partner in the past (or you have a degree in statistics), do not blurt out words like Neural-net, CHAID, Cluster Analysis, Multiple Regression, Discriminant Function Analysis, etc. That would be like demanding specific medication before your new doctor even asks about your symptoms. The key is meeting your business goals, not fulfilling buzzwords. Let them present their methodology “after” the goal discussion. Nevertheless, see if the potential partner is pushing one or two specific techniques or solutions all the time.

5. Speed of Execution: In modern marketing, speed to action is king. Speed wins, and speed gains respect. However, when it comes to modeling or other advanced analytics, you may be shocked by the wide range of time estimates provided by each outsourcing vendor. To be fair, they are covering themselves, mainly because they have no idea what kind of messy data they will receive. As I mentioned earlier, pre-model data preparation and manipulation are critical components, and they are the most time-consuming part of all, especially when available data are in bad shape. Post-model scoring, audit and usage support may elongate the timeline. The key is to differentiate such pre- and post-modeling processes in the time estimate.

Even for pure modeling elements, time estimates vary greatly, depending on the complexity of assignments. Surely, a simple cloning model with basic demographic data would be much easier to execute than the ones that involve ample amounts of transaction- and event-level data, coming from all types of channels. If time-series elements are added, it will definitely be more complex. Typical clustering work is known to take longer than regression models with clear target definitions. If multiple models are required for the project, it will obviously take more time to finish the whole job.

Now, the interesting thing about building a model is that analysts don’t really finish it, but they just run out of time—much like the way marketers work on PowerPoint presentations. The commonality is that we can basically tweak models or decks forever, but we have to stop at some point.

However, with all kinds of automated tools and macros, model development time has decreased dramatically in past decades. We really came a long way since the first application of statistical techniques to marketing, and no one should be quoting a 1980s timeline in this century. But some still do. I know vendors are trained to follow the guideline “always under-promise and over-deliver,” but still.

An interesting aspect of this dilemma is that we can negotiate the timeline by asking for simpler and less sophisticated versions with diminished accuracy. If, hypothetically, it takes a week to be 98 percent accurate, but it only takes a day to be 90 percent accurate, what would you pick? That should be the business decision.

So, what is a general guideline? Again, it really depends on many factors, but allow me to share a version of it:

  • Pre-modeling Processing

– Data Conversions: from half a day to weeks

– Data Append/Enhancement: between overnight and two days

– Data Edit and Summarization: Data-dependent

  • Modeling: Ranges from half a day to weeks

– Depends on type, number of models and complexity

  • Scoring: from half a day to one week

– Mainly depends on number of records and state of the database to be scored

I know these are wide ranges, but watch out for the ones that routinely quote 30 days or more for simple clone models. They may not know what they are doing, or worse, they may be some mathematical perfectionists who don’t understand the marketing needs.

6. Pricing Structure: Some marketers would put this on top of the checklist, or worse, use the pricing factor as the only criterion. Obviously, I disagree. (Full disclosure: I have been on the service side of the fence during my entire career.) Yes, every project must make economic sense in the end, but the budget should not and cannot be the sole deciding factor in choosing an outsourcing partner. There are many specialists under famous brand names who command top dollars, and then there are many data vendors who throw in “free” models, disrupting the ecosystem. Either way, one should not jump to conclusions too fast, as there is no free lunch, after all. In any case, I strongly recommend that no one start the meeting with pricing questions (hence, this article). When you get to the pricing part, ask what the price includes, as the analytical journey could be a series of long and winding roads. Some of the biggest factors that need to be considered are:

  • Multiple Model Discounts—Less for second or third models within a project?
  • Pre-developed (off-the-shelf) Models—These can be “much” cheaper than custom models, while not custom-fitted.
  • Acquisition vs. CRM—Employing client-specific variables certainly increases the cost.
  • Regression Models vs. Other Types—At times, types of techniques may affect the price.
  • Clustering and Segmentations—They are generally priced much higher than target-specific models.

Again, it really depends on the complexity factor more than anything else, and the pre- and post-modeling process must be estimated and priced separately. Non-modeling charges often add up fast, and you should ask for unit prices and minimum charges for each step.

Scoring charges can be expensive over time, too, so negotiate for discounts for routine scoring of the same models. Some may offer all-inclusive package pricing for everything. The important thing is that you must be consistent with the checklist when shopping around with multiple candidates.

7. Documentation: When you pay for a custom model (not pre-developed, off-the-shelf ones), you get to own the algorithm. Because algorithms are not tangible items, the knowledge must be transferred through model documents. Beware of the ones who offer “black-box” solutions with comments like, “Oh, it will work, so trust us.”

Good model documents must include the following, at the minimum:

  • Target and Comparison Universe Definitions: What was the target variable (or “dependent” variable) and how was it defined? How was the comparison universe defined? Was there any “pre-selection” for either of the universes? These are the most important factors in any model—even more than the mechanics of the model itself.
  • List of Variables: What are the “independent” variables? How were they transformed or binned? From where did they originate? Often, these model variables describe the nature of the model, and they should make intuitive sense.
  • Model Algorithms: What is the actual algorithm? What are the assigned weights for each independent variable?
  • Gains Chart: We need to examine potential effectiveness of the model. What are the “gains” for each model group, from top to bottom (e.g., 320 percent gain at the top model group in comparison to the whole universe)? How fast do such gains decrease as we move down the scale? How do the gains factors compare against the validation sample? A graphic representation would be nice, too.
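
If you want to check a gains chart yourself, here is a minimal Python (pandas) sketch, assuming a validation sample with a `decile` column (1 = top model group) and a binary `responded` flag; the response rates are simulated for illustration.

```python
# Minimal sketch: a gains chart by decile from a validation sample.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
validation = pd.DataFrame({"decile": np.repeat(np.arange(1, 11), 1000)})
# Simulated response rates that decay from the top decile downward.
validation["responded"] = rng.binomial(1, 0.32 / validation["decile"])

overall_rate = validation["responded"].mean()
gains = (
    validation.groupby("decile")["responded"].mean()
    .to_frame("response_rate")
    .assign(gain_pct=lambda g: 100 * g["response_rate"] / overall_rate)
)
gains["cum_pct_of_responders"] = (
    100 * validation.groupby("decile")["responded"].sum().cumsum()
    / validation["responded"].sum()
)
# gain_pct above 100 means the group outperforms the universe average
# (e.g., 320 would be a 320 percent gain at the top group).
print(gains.round(1))
```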

For custom models, it is customary to have a formal model presentation, full documentation and scoring script in designated programming languages. In addition, if client files are provided, ask for a waterfall report that details input and output counts of each step. After the model scoring, it is also customary for the vendor to provide a scored universe count by model group. You will be shocked to find out that many so-called analytical vendors do not provide thorough documentation. Therefore, it is recommended to ask for sample documents upfront.

8. Scoring Validation: Models are built and presented properly, but the job is not done until the models are applied to the universe from which the names are ranked and selected for campaigns. I have seen too many major meltdowns at this stage. Simply, it is one thing to develop models with a few hundred thousand record samples, but it is quite another to apply the algorithm to millions of records. I am not saying that the scoring job always falls onto the developers, as you may have an internal team or a separate vendor for such ongoing processes. But do not let the model developer completely leave the building until everything checks out.

The model should have been validated against the validation sample by then, but live scoring may reveal all kinds of inconsistencies. You may also want to back-test the algorithms with past campaign results, as well. In short, many things go wrong “after” the modeling steps. When I hear customers complaining about models, I often find that the modeling is the only part that was done properly, and “before” and “after” steps were all messed up. Further, even machines misunderstand each other, as any differences in platform or scripting language may cause discrepancies. Or, maybe there was no technical error, but missing values may have caused inconsistencies (refer to “Missing Data Can Be Meaningful”). Nonetheless, the model developers would have the best insight as to what could have gone wrong, so make sure that they are available for questions after models are presented and delivered.

9. Back-end Analysis: Good analytics is all about applying learnings from past campaigns—good or bad—to new iterations of efforts. We often call it “closed-loop” marketing, while many marketers often neglect to follow up. Any respectable analytics shop must be aware of it, though they may classify such work separately from modeling or other analytical projects. At the minimum, you need to check out whether they even offer such services. In fact, so-called “match-back analysis” is not as simple as just matching campaign files against responders in this omnichannel environment. When many channels are employed at the same time, allocation of credit (i.e., “what worked?”) may call for all kinds of business rules or even dedicated models.

While you are at it, ask for a cheaper version of “canned” reports, as well, as custom back-end analysis can be even more costly than the modeling job itself, over time. Pre-developed reports may not include all the ROI metrics that you’re looking for (e.g., open, clickthrough and conversion rates, plus revenue and orders per piece mailed, per order, per display, per email, per conversion, etc.). So ask for sample reports upfront.

If you start breaking down all these figures by data source, campaign, time series, model group, offer, creative, targeting criteria, channel, ad server, publisher, keywords, etc., it can get unwieldy really fast. So contain yourself, as no one can understand 100-page reports, anyway. See if the analysts can guide you with such planning, as well. Lastly, if you are really into ROI analysis, get ready to share the “cost” side of the equation with the selected partner. Some jobs are on the marketers.

10. Ongoing Support: Models have a finite shelf life, as all kinds of changes happen in the real world. Seasonality may be a factor, or the business model or strategy may have changed. Fluctuations in data availability and quality further complicate the matter. Basically, assumptions like “all things being equal” only happen in textbooks, so marketers must plan for periodic reviews of models and business rules.

A sure sign of trouble is decreasing effectiveness of models. When in doubt, consult the developers; they may recommend a refit or complete redevelopment of the models. Quarterly reviews would be ideal, but if the cost becomes an issue, start with six-month or yearly reviews; just never go more than a year without any review. Some vendors may offer discounts for redevelopment, so ask for the price quote upfront.

I know this is a long list of things to check, but picking the right partner is very important, as it often becomes a long-term relationship. And you may find it strange that I didn’t even list “technical capabilities” at all. That is because:

1. Many marketers are not equipped to dig deep into the technical realm anyway, and

2. The difference between the most mathematically sound models and the ones from the opposite end of the spectrum is not nearly as critical as other factors I listed in this article.

In other words, even the worst model in the bake-off would be much better than no model, if these other business criteria are well-considered. So, happy shopping with this list, and I hope you find the right partner. Employing analytics is not optional when living in a sea of data.

It’s All About Ranking

The decision-making process is really all about ranking. As a marketer, to whom should you be talking first? What product should you offer through what channel? As a businessperson, whom should you hire among all the candidates? As an investor, what stocks or bonds should you purchase? As a vacationer, where should you visit first?

Yes, “choice” is the keyword in all of these questions. And if you picked Paris over other places as an answer to the last question, you just made a choice based on some ranking order in your mind. The world is big, and there could have been many factors that contributed to that decision, such as culture, art, cuisine, attractions, weather, hotels, airlines, prices, deals, distance, convenience, language, etc., and I am pretty sure that not all factors carried the same weight for you. For example, if you put more weight on “cuisine,” I can see why London would lose a few points to Paris in that ranking order.

As a citizen, for whom should I vote? That’s the choice based on your ranking among candidates, too. Call me overly analytical (and I am), but I see the difference in political stances as differences in “weights” for many political (and sometimes not-so-political) factors, such as economy, foreign policy, defense, education, tax policy, entitlement programs, environmental issues, social issues, religious views, local policies, etc. Every voter puts different weights on these factors, and the sum of them becomes the score for each candidate in their minds. No one thinks that education is not important, but among all these factors, how much weight should it receive? Well, that is different for everybody; hence, the political differences.

I didn’t bring this up to start a political debate, but rather to point out that the decision-making process is based on ranking, and the ranking scores are made of many factors with different weights. And that is how the statistical models are designed in a nutshell (so, that means the models are “nuts”?). Analysts call those factors “independent variables,” which describe the target.

In my past columns, I talked about the importance of statistical models in the age of Big Data (refer to “Why Model?”), and why marketing databases must be “model-ready” (refer to “Chicken or the Egg? Data or Analytics?”). Now let’s dig a little deeper into the design of the “model-ready” marketing databases. And surprise! That is also all about “ranking.”

Let’s step back into the marketing world, where folks are not easily offended by the subject matter. If I give you a spreadsheet that contains thousands of leads for your business, you wouldn’t be able to tell easily which ones are the “Glengarry Glen Ross” leads that came from Downtown, along with those infamous steak knives. What choice would you have then? Call everyone on the list? I guess you could start picking names out of a hat. If you think a little more about it, you may filter the list by first name, as first names may reflect the decade in which their owners were born. Or start calling folks who live in towns that sound affluent. Heck, you can start calling them in alphabetical order, but the point is that you would “sort” the list somehow.

Now, if the list came with some other valuable information, such as income, age, gender, education level, socio-economic status, housing type, number of children, etc., you may be able to pick and choose which variables you would use to sort the list. You may start calling the high-income folks first. Not all product sales are positively related to income, but it is an easy way to start the process. Then, you would throw in other variables to break the ties in rich areas. I don’t know what you’re selling, but maybe you would want folks who live in a single-family house with kids. And sometimes, your “gut” feeling may lead you to the right place. But only sometimes. And only when the size of the list is not in the millions.

If the list was not for prospecting calls, but for a CRM application where you also need to analyze past transaction and interaction history, the list of the factors (or variables) that you need to consider would be literally nauseating. Imagine the list contains all kinds of dollars, dates, products, channels and other related numbers and figures in a seemingly endless series of columns. You’d have to scroll to the right for quite some time just to see what’s included in the chart.

In situations like that, how nice would it be if some analyst threw in just two model scores, for responsiveness to your product and the potential value of each customer, for example? The analysts may have considered hundreds (or thousands) of variables to derive such scores for you, and all you need to know is that the higher the score, the more likely the lead will be responsive or have higher potential value. For your convenience, the analyst may have converted all those numbers with many decimal places into easy-to-understand 1-10 or 1-20 scales. That would be nice, wouldn’t it? Now you can just start calling the folks in model group No. 1.

But let me throw in a curveball here. Let’s go back to the list with all those transaction data attached, but without the model scores. You may say, “Hey, that’s OK, because I’ve been doing alright without any help from a statistician so far, and I’ll just use the past dollar amount as their primary value and sort the list by it.” And that is a fine plan, in many cases. Then, when you look deeper into the list, you find out there are multiple entries for the same name all over the place. How can you sort the list of leads if the list is not even on an individual level? Welcome to the world of relational databases, where every transaction deserves an entry in a table.

Relational databases are optimized to store every transaction and retrieve them efficiently. In a relational database, tables are connected by match keys, and many times, tables are connected in what we call “1-to-many” relationships. Imagine a shopping basket. There is a buyer, and we need to record the buyer’s ID number, name, address, account number, status, etc. Each buyer may have multiple transactions, and for each transaction, we now have to record the date, dollar amount, payment method, etc. Further, if the buyer put multiple items in a shopping basket, that transaction, in turn, is in yet another 1-to-many relationship to the item table. You see, in order to record everything that just happened, this relational structure is very useful. If you are the person who has to create the shipping package, yes, you need to know all the item details, transaction value and the buyer’s information, including the shipping and billing address. Database designers love this completeness so much, they even call this structure the “normal” state.

But the trouble with the relational structure is that each line is describing transactions or items, not the buyers. Sure, one can “filter” people out by interrogating every line in the transaction table, say “Select buyers who had any transaction over $100 in past 12 months.” That is what I call rudimentary filtering, but once we start asking complex questions such as, “What is the buyer’s average transaction amount for past 12 months in the outdoor sports category, and what is the overall future value of the customers through online channels?” then you will need what we call “Buyer-centric” portraits, not transaction or item-centric records. Better yet, if I ask you to rank every customer in the order of such future value, well, good luck doing that when all the tables are describing transactions, not people. That would be exactly like the case where you have multiple lines for one individual when you need to sort the leads from high value to low.

So, how do we remedy this? We need to summarize the database at the individual level if we want to sort the leads at the individual level. If the goal is to rank households, email addresses, companies, business sites or products, then the summarization should be done at those levels, too. Database designers call this the “de-normalization” process, and the tables tend to get “wide” along the way, but that is the necessary step in order to rank the entities properly.
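
As a concrete illustration, here is a minimal Python (pandas) sketch of rolling a 1-to-many transaction table up to one row per buyer; the table, the “outdoor” category and the 12-month window are hypothetical.

```python
# Minimal sketch of "de-normalization": summarizing a 1-to-many transaction
# table to one row per buyer, ready for ranking and modeling.
import pandas as pd

transactions = pd.DataFrame({
    "buyer_id":   [1, 1, 1, 2, 2, 3],
    "order_date": pd.to_datetime(["2024-01-05", "2024-03-12", "2023-02-01",
                                  "2024-06-20", "2024-07-01", "2022-11-30"]),
    "category":   ["outdoor", "outdoor", "apparel", "outdoor", "apparel", "outdoor"],
    "amount":     [120.0, 80.0, 45.0, 200.0, 60.0, 35.0],
})

as_of = pd.Timestamp("2024-12-31")
last_12m = transactions[transactions["order_date"] > as_of - pd.DateOffset(months=12)]

# Keep category-specific amounts in their own column, then summarize per buyer.
last_12m = last_12m.assign(
    outdoor_amount=last_12m["amount"].where(last_12m["category"] == "outdoor")
)
buyer_level = last_12m.groupby("buyer_id").agg(
    txn_count_12m=("amount", "size"),
    total_spend_12m=("amount", "sum"),
    avg_outdoor_amount_12m=("outdoor_amount", "mean"),
)
# Buyers with no activity in the window drop out here; in practice they would
# be re-attached to the buyer table with zero or missing summaries.
print(buyer_level)
```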

Now, the starting point in all the summarizations is proper identification numbers for those levels. It won’t be possible to summarize any table on a household level without a reliable household ID. One may think that such things are given, but I would have to disagree. I’ve seen so many so-called “state of the art” (another cliché that makes me nauseous) databases that do not have consistent IDs of any kind. If your database managers say they are using “plain name” or “email address” fields for matching or summarization, be afraid. Be very afraid. As a starter, you know how many email addresses one person may have. To add to that, consider how many people move around each year.

Things get worse in regard to ranking by model scores when it comes to “unstructured” databases. We see more and more of those, as the data sources are getting into uncharted territories, and the size of the databases is growing exponentially. There, all these bits and pieces of data are sitting on mysterious “clouds” as entries on their own. Here again, it is one thing to select or filter based on collected data, but ranking based on some statistical modeling is simply not possible in such a structure (or lack thereof). Just ask the database managers how many 24-month active customers they really have, considering a great many people move in that time period and change their addresses, creating multiple entries. If you get an answer like “2 million-ish,” well, that’s another scary moment. (Refer to “Cheat Sheet: Is Your Database Marketing Ready?”)

In order to develop models using variables that are descriptors of customers, not transactions, we must convert those relational or unstructured data into a structure that matches the level at which you would like to rank the records. Even temporarily. As databases get bigger and bigger and storage gets cheaper and cheaper, I’d say that the temporary time period could be, well, indefinite. And because the word “data-mart” is overused and confusing to many, let me just call that place the “Analytical Sandbox.” Sandboxes are fun, and yes, all kinds of fun stuff for marketers and analysts happens there.

The Analytical Sandbox is where samples are created for model development; actual models are built; models are scored for every record—no matter how many there are—without hiccups; targets are easily sorted and selected by model scores; reports are created in meaningful and consistent ways (consistency is even more important than sheer accuracy in what we do); and analytical languages such as SAS, SPSS or R are spoken without being frowned upon by other computing folks. Here, analysts will spend their time pondering target definitions and methodologies, not database structures and incomplete data fields. Have you heard of the fancy term “in-database scoring”? This is where that happens, too.

And what comes out of the Analytical Sandbox and back into the world of relational database or unstructured databases—IT folks often ask this question—is going to be very simple. Instead of having to move mountains of data back and forth, all the variables will be in forms of model scores, providing answers to marketing questions, without any missing values (by definition, every record can be scored by models). While the scores are packing tons of information in them, the sizes could be as small as a couple bytes or even less. Even if you carry over a few hundred affinity scores for 100 million people (or any other types of entities), I wouldn’t call the resultant file large, as it would be as small as a few video files, really.

In my future columns, I will explain how to create model-ready (and human-ready) variables using all kinds of numeric, character or free-form data. In Exhibit A, you will see what we call traditional analytical activities colored in dark blue on the right-hand side. In order to make those processes really hum, we must follow all the steps that are on the left-hand side of that big cylinder in the middle. This is where garbage-in, garbage-out situations are prevented: all the data get collected in a uniform fashion, properly converted, edited and standardized by uniform rules, categorized based on preset meta-tables, consolidated with consistent IDs, and summarized to desired levels, and meaningful variables are created for more advanced analytics.

Even more than statistical methodologies, consistent and creative variables in the form of “descriptors” of the target audience make or break the marketing plan. Many people think that purchasing expensive analytical software will provide all the answers. But lest we forget, fancy software only covers the right-hand side of Exhibit A, not all of it. Creating a consistent template for all useful information in a uniform fashion is the key to maximizing the power of analytics. If you look into any modeling bakeoff in the industry, you will see that the differences in methodologies are measured in fractions. Conversely, inconsistent and incomplete data create disasters in the real world. And in many cases, companies can’t even attempt advanced analytics while sitting on mountains of data, due to structural inadequacies.

I firmly believe the Big Data movement should be about

  1. getting rid of the noise, and
  2. providing simple answers to decision-makers.

Bragging about size and speed alone will not bring us to the next level, which is to “humanize” the data. At the end of the day (another cliché that I hate), it is all about supporting decision-making processes, and decision-making is all about ranking different options. So, in the interest of keeping it simple, let’s start by creating an analytical haven (call it that, in case you think “sandbox” sounds too juvenile) where all those rankings become easy.

Data Deep Dive: The Art of Targeting

Even if you own a sniper rifle (and I’m not judging), if you aim at the wrong place, you will never hit the target. Obvious, right? But that happens all the time in the world of marketing, even when advanced analytics and predictive modeling techniques are routinely employed. How is that possible? Well, the marketing world is not like an Army shooting range where the silhouette of the target is conveniently hung at the predetermined location, but it is more like the “Twilight Zone,” where things are not what they seem. Marketers who failed to hit the real target often blame the guns, which in this case are targeting tools, such as models and segmentations. But let me ask, was the target properly defined in the first place?

In my previous columns, I talked about the importance of predictive analytics in modern marketing (refer to “Why Model?”) for various reasons, such as targeting accuracy, consistency, deeper use of data, and, most importantly in the age of Big Data, the concise nature of model scores, where tons of data are packed into ready-for-use formats. Now, even the marketers who bought into these ideas often make mistakes by relinquishing the important duty of target definition solely to analysts and statisticians, who do not necessarily possess the power to read the marketers’ minds. Targeting is often called “half-art and half-science,” and it should be looked at from multiple angles, starting with the marketer’s point of view. Therefore, even marketers who are slightly (or, in many cases, severely) allergic to mathematics should come one step closer to the world of analytics and modeling. Don’t be too scared, as I am not asking you to be a rifle designer or sniper here; I am only talking about hanging the target in the right place so that others can shoot at it.

Let us start by reviewing what statistical models are: A model is a mathematical expression of “differences” between dichotomous groups, which, in marketing, are often referred to as “targets” and “non-targets.” Let’s say a marketer wants to target “high-value customers.” To build a model to describe such targets, we need to define “non-high-value customers” as well. In marketing, popular targets are often expressed as “repeat buyers,” “responders to certain campaigns,” “big-time spenders,” “long-term, high-value customers,” “troubled customers,” etc., for specific products and channels. Now, for all those targets, we also need to define “bizarro” or “anti-” versions of them. One may think that they are just the “remainders” of the target. But, unfortunately, it is not that simple; the definition of the whole universe must be set first to even bring up the concept of the remainders. In many cases, defining “non-buyers” is much more difficult than defining “buyers,” because a lack of purchase information does not guarantee that the individual in question is indeed a non-buyer. Maybe the data collection was never complete. Maybe he used a different channel to respond. Maybe his wife bought the item for him. Maybe you don’t have access to the entire pool of names that represents the “universe.”

Remember T, C, & M
That is why we need to examine the following three elements carefully when discussing statistical models with marketers who are not necessarily statisticians:

  1. Target,
  2. Comparison Universe, and
  3. Methodology.

I call them “TCM” in short, so that I don’t leave out any element in exploratory conversations. Defining a proper target is the obvious first step. Defining and obtaining data for the comparison universe is equally important, though it can be challenging; without it, you’d have nothing against which to compare the target. Again, a model is an algorithm that expresses differences between two non-overlapping groups. So, yes, you need both Superman and Bizarro-Superman (who always seems more elusive than his counterpart). And that one important variable that differentiates the target and non-target is called the “dependent variable” in modeling.
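
For the analytically curious, here is a minimal, hypothetical sketch of how that dependent variable gets coded once T and C are agreed upon. The DataFrame “universe” and the boolean flag “is_target” are illustrative names of my own, not part of any standard recipe.

    import pandas as pd

    def build_dependent_variable(universe: pd.DataFrame, is_target: pd.Series) -> pd.Series:
        """Code the dependent variable: 1 for the target (T), 0 for the comparison universe (C)."""
        dv = is_target.reindex(universe.index).fillna(False).astype(int)
        dv.name = "dependent_variable"   # 1 = Superman, 0 = Bizarro-Superman
        return dv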

The third element in our discussion is the methodology. I am sure you have heard of terms like logistic regression, stepwise regression, neural net, decision trees, CHAID analysis, genetic algorithm, etc., etc. Here is my advice to marketers and end-users:

  • State your goals and use cases clearly, and let the analyst pick the proper methodology that suits them.
  • Don’t be a bad patient who walks into a doctor’s office demanding a specific prescription before the doctor even examines you.

Besides, for all intents and purposes, the methodology itself matters the least in comparison with an erroneously defined target or comparison universe. Differences in methodologies are often measured in fractions. A combination of a wrong target and a wrong universe definition, on the other hand, ends up as a shotgun, if not an artillery barrage. That doesn’t sound so precise, does it? We should be talking about a sniper rifle here.

Clear Goals Leading to Definitions of Target and Comparison
So, let’s roll up our sleeves and dig deeper into defining targets. Allow me to use an example, as you will be able to picture the process better that way. Let’s just say that, for general marketing purposes, you want to build a model targeting “frequent flyers.” One may ask whether that means flying for business or for pleasure, but let’s just say that such data are hard to obtain at this moment. (Finding the “reasons” is always much more difficult than counting the number of transactions.) And it was collectively decided that it would simply be beneficial to know who is more likely to be a frequent flyer, in general. Such knowledge could be very useful for many applications, not just in the travel industry, but for affiliated services, such as credit cards or publications. Plus, analytics is about making the best of what you’ve got, not waiting for some perfect dataset.

Now, here is the first challenge:

  • When it comes to flying, how frequent is frequent enough for you? Five times a year, 10 times, 20 times or even more?
  • Over how many years?
  • Would you consider actual miles traveled, or just number of issued tickets?
  • How large are the audiences in those brackets?

If you decided that five times a year is a not-so-big or not-so-small target (yes, sizes do matter) that also fits the goal of the model (you don’t want to target only super-elites, as they could be too rare or too distinct, almost like outliers), to whom are they going to be compared? Everyone who flew less than five times last year? How about people who didn’t fly at all last year?

Actually, one option is to compare people who flew five or more times against people who didn’t fly at all last year, but wouldn’t that model be too much like a plain “flyer” model? Or will that option provide more vivid distinction among the general population? Or one analyst may raise her hand and say, “To hell with all these breaks; let’s just build a model using the number of times flown last year as a continuous target.” The crazy part is this: None of these options are right or wrong, but each combination of target and comparison will certainly yield very different-looking models.

Then what should a marketer do in a situation like this? Again, clearly state the goal and what is more important to you. If this is for general travel-related merchandising, then the goal should be more about distinguishing likely frequent flyers from the general population; therefore, comparing five-plus flyers against non-flyers (ignoring the one-to-four-time flyers) makes sense. If this project is for an airline targeting potential gold or platinum members, using people who don’t even fly as the comparison makes little or no sense. Of course, in a situation like this, the analyst in charge (or data scientist, the way we refer to them these days) must come halfway and prescribe exactly which target and comparison definitions would be most effective for that particular user. That requires lots of preliminary data exploration, and it is not all science, but half art.
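
As a purely hypothetical illustration of that choice, the merchandising version of the frequent-flyer target could be assembled like this; the column name “flights_last_year” and the cut-off of five simply echo the example above and are not prescriptions.

    import pandas as pd

    def frequent_flyer_modeling_set(universe: pd.DataFrame) -> pd.DataFrame:
        """T = flew five or more times last year; C = didn't fly at all; 1-to-4-time flyers are set aside."""
        flights = universe["flights_last_year"].fillna(0)
        modeling_set = universe[(flights >= 5) | (flights == 0)].copy()   # drop the in-between flyers
        modeling_set["dependent_variable"] = (flights.loc[modeling_set.index] >= 5).astype(int)
        return modeling_set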

Now, if I may provide a shortcut in defining the comparison universe, just draw a representative sample from “the pool of names that are eligible for your marketing efforts.” The key word here is “eligible.” For example, many businesses operate within certain areas, with certain restrictions or predetermined targeting criteria. It would make no sense to use a U.S. population sample for models for supermarket chains, telecommunications companies, or utilities with designated footprints. If the business in question sells female apparel, first eliminate the male population from the comparison universe (but I’d leave “unknown” genders in the mix, so that the model can work its magic in that gray area). You must remember, however, that all this means you need different models when you change the prospecting universe, even if the target definition remains unchanged. Because the model algorithm is the expression of the difference between T and C, you need a new model if you swap out the C part, even if you leave the T alone.

Multiple Targets
Sometimes it gets twisted the other way around, where the comparison universe is relatively stable (i.e., your prospecting universe is stable) but there could be multiple targets (i.e., multiple Ts, like T1, T2, etc.) in your customer base.

Let me elaborate with a real-life example. A while back, we were helping a company that sells expensive auto accessories for luxury cars. The client, following his intuition, casually told us that he only cared about big spenders whose average order sizes were more than $300. Now, the trouble with this statement is that:

  1. Such a universe could be too small to be used effectively as a target for models, and
  2. High spenders do not tend to purchase often, so we may end up leaving out the majority of the potential target buyers in the whole process.

This is exactly why some type of customer profiling must precede the actual target definition. A series of simple distribution reports clearly revealed that this particular client was dealing with a dual-universe situation, where the first group (or segment) was made of infrequent but high-dollar spenders whose average orders were even greater than $300, and the second group was made of very frequent buyers whose average order sizes were well below the $100 mark. If we had ignored this finding, or worse, neglected to run preliminary reports and just relied on our client’s wishful thinking, we would have created a “phantom” target, which is just an average of these dual universes. A model designed for such a phantom target will yield phantom results. The solution? If you find two distinct targets (as in T1 and T2), just bite the bullet and develop two separate models (T1 vs. C and T2 vs. C).
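
A rough sketch of that dual-target fix might look like the following, under assumed thresholds and column names (“avg_order,” “orders_per_year”) that merely echo the example, and assuming clean numeric predictors. Logistic regression stands in here for whichever methodology the analyst actually prefers.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    def fit_dual_target_models(buyers: pd.DataFrame, comparison: pd.DataFrame, predictors: list) -> dict:
        """Build T1 vs. C and T2 vs. C as two separate models against the same comparison universe."""
        t1 = buyers[(buyers["avg_order"] > 300) & (buyers["orders_per_year"] <= 2)]   # infrequent big spenders
        t2 = buyers[(buyers["avg_order"] < 100) & (buyers["orders_per_year"] >= 6)]   # frequent small spenders
        models = {}
        for name, target in [("high_dollar", t1), ("high_frequency", t2)]:
            X = pd.concat([target[predictors], comparison[predictors]], ignore_index=True)
            y = pd.Series([1] * len(target) + [0] * len(comparison))
            models[name] = LogisticRegression(max_iter=1000).fit(X, y)
        return models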

Multi-step Approach
There are still other reasons why you may need multiple models. Let’s talk about the case of a “target within a target.” Some may relate this idea to a “drill-down” concept, and it can be very useful when the prospecting universe is very large and the marketer is trying to reach only the top 1 percent (which can still be very large, if the pool contains hundreds of millions of people). Correctly finding the top 5 percent in any universe is difficult enough. So what I suggest in this case is to build two models in sequence to get to the “Best of the Best” in a stepwise fashion.

  • The first model would be more like an “elimination” model, where obviously not-so-desirable prospects would be removed from the process, and
  • The second-step model would be designed to go after the best prospects among survivors of the first step.

Again, models are expressions of differences between targets and non-targets, so if the first model eliminates the bottom 80 percent to 90 percent of the universe and leaves the rest as the new comparison universe, you need a separate model, for sure. And lots of interesting things happen at the later stage, where new variables start to show up in the algorithms, or important variables from the first step lose steam in later steps. While a bit cumbersome during deployment, the multi-step approach ensures precision targeting, much like a sniper rifle at close range.
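
Here is a minimal sketch of that drill-down selection, assuming both sets of scores already exist and share the universe’s index, with the second set coming from a separate model trained only on step-one survivors. The 15 percent and 1 percent cut-offs are placeholders, not recommendations.

    import pandas as pd

    def two_step_selection(universe: pd.DataFrame, step1_scores: pd.Series, step2_scores: pd.Series,
                           survivor_share: float = 0.15, best_share: float = 0.01) -> pd.DataFrame:
        """Step 1 eliminates the obvious non-prospects; step 2 picks the Best of the Best among survivors."""
        survivors = universe.loc[step1_scores.rank(pct=True) >= 1 - survivor_share]
        best = survivors.loc[step2_scores.loc[survivors.index].rank(pct=True) >= 1 - best_share]
        return best   # best_share here is the share of survivors kept, not of the original universe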

I also suggest this type of multi-step process when clients are attempting to use the result of segmentation analysis as a selection tool. Segmentation techniques are useful as descriptive analytics. But as a targeting tool, they are just too much like a shotgun approach. It is one thing to describe groups of people such as “young working mothers,” “up-and-coming,” and “empty-nesters with big savings” and use them as references when carving out messages tailored toward them. But it is quite another to target such large groups as if the population within a particular segment is completely homogeneous in terms of susceptibility to specific offers or products. Surely, the difference between a Mercedes buyer and a Lexus buyer ain’t income and age, which may have been the main differentiator for segmentation. So, in the interest of maintaining a common theme throughout the marketing campaigns, I’d say such segments are good first steps. But for further precision targeting, you may need a model or two within each segment, depending on the size, channel to be employed and nature of offers.

Another case where the multi-step approach is useful is when the marketing and sales processes are naturally broken down into multiple steps. For typical B-to-B marketing, one may start the campaign by mass mailing or email (I’d say that step also requires modeling). And when responses start coming in, the sales team can take over and start contacting responders through more personal channels to close the deal. Such sales efforts are obviously very time-consuming, so we may build a “value” model measuring the potential value of the mail or email responders and start contacting them in a hierarchical order. Again, as the available pool of prospects gets smaller and smaller, the nature of targeting changes as well, requiring different types of models.

This type of funnel approach is also very useful in online marketing, as email or banner marketing naturally moves through lifecycle steps, such as blasting, delivery, impression, clickthrough, browsing, shopping, investigation, shopping basket, checkout (Yeah! Conversion!) and repeat purchases. Obviously, not all steps require aggressive or precision targeting. But I’d say, at the minimum, the initial blast, clickthrough and conversion should be looked at separately. For any lifetime value analysis, yes, the repeat purchase is a key step, which, unfortunately, is often neglected by many marketers and data collectors.

Inversely Related Targets
More complex cases are when some of these multiple response and conversion steps are “inversely” related. For example, responders to invitation-to-apply-type credit card offers are often people with not-so-great credit. Well, if one had a good credit score, would all these credit card companies have left that person alone? So, in a case like that, it becomes very tricky to find good responders who are also credit-worthy in the vast pool of a prospect universe.

I wouldn’t go as far as saying that it is like finding a needle in a haystack, but it is certainly not easy. Now, I’ve met folks who go after the likely responders with potential to be approved as a single target. It really is a philosophical difference, but I much prefer building two separate models in a situation like this:

  • One model designed to measure responsiveness, and
  • Another to measure likelihood to be approved.

The major benefit of having separate models is that each model can employ different types and sources of data variables. A more practical benefit for the users is that the marketers will be able to pick and choose what is more important to them at the time of campaign execution. They will obviously go to the top corner bracket, where both scores are high (i.e., potential responders who are likely to be approved). But as they dial the selection down, they will be able to test responsiveness and credit-worthiness separately, as sketched below.
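
A hypothetical sketch of that dial, with made-up column names “response_score” and “approval_score,” might look like this:

    import pandas as pd

    def select_prospects(scores: pd.DataFrame,
                         response_cutoff_pct: float = 0.80,
                         approval_cutoff_pct: float = 0.80) -> pd.DataFrame:
        """Start in the corner where both scores are high; relax either percentile cut-off independently."""
        resp_cut = scores["response_score"].quantile(response_cutoff_pct)
        appr_cut = scores["approval_score"].quantile(approval_cutoff_pct)
        return scores[(scores["response_score"] >= resp_cut) & (scores["approval_score"] >= appr_cut)]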

Mixing Multiple Model Scores
Even when multiple models are developed with completely different intentions, mixing them up can produce very interesting results. Imagine you have access to scores from a “High-Value Customer” model and an “Attrition” model. If you cross these scores in a simple 2×2 matrix, you can easily create a useful segment in one corner called “Valuable Vulnerable” (a term that my mentor coined a long time ago). Yes, one score predicts who is likely to drop your service, but who cares if that customer shows little or no value to your business? Take care of the valuable customers first.
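
As a minimal sketch (the score names and the median cut-offs are assumptions for illustration), that cross-tab could be set up like this:

    import pandas as pd

    def label_quadrants(scores: pd.DataFrame) -> pd.Series:
        """Cross a value score and an attrition score into a simple 2x2 segmentation."""
        high_value = scores["high_value_score"] >= scores["high_value_score"].median()
        high_risk = scores["attrition_score"] >= scores["attrition_score"].median()
        quadrant = pd.Series("Low Value, Stable", index=scores.index)
        quadrant[high_value & high_risk] = "Valuable Vulnerable"    # take care of these folks first
        quadrant[high_value & ~high_risk] = "Valuable, Stable"
        quadrant[~high_value & high_risk] = "Low Value, At Risk"
        return quadrant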

This type of mixing and matching becomes really interesting if you have lots of pre-developed models. During my tenure at a large data compiling company, we built more than 120 models for all kinds of consumer characteristics for general use. I remember the real fun began when we started mixing multiple models, like combining a “NASCAR Fan” model with a “College Football Fan” model; a “Leaning Conservative” model with an “NRA Donor” model; an “Organic Food” one with a “Cook for Fun” model or a “Wine Enthusiast” model; a “Foreign Vacation” model with a “Luxury Hotel” model or a “Cruise” model; a “Safety and Security Conscious” model or a “Home Improvement” model with a “Homeowner” model, etc., etc.

You see, no one is one dimensional, and we proved it with mathematics.

No One is One-dimensional
Obviously, these examples are just excerpts from a long playbook for the art of targeting. My intention is to emphasize that marketers must consider the target, the comparison and the methodology separately; a combination of these three elements yields the most fitting solution for each challenge, way beyond what popular toolsets or the new statistical methodologies presented at technical conferences can accomplish. In fact, when marketers are able to define the target in a logical fashion with help from trained analysts and data scientists, the effectiveness of modeling and subsequent marketing campaigns increases dramatically. Creating and maintaining an analytics department, or hiring an outsourced analytics vendor, isn’t enough.

One may be concerned about the idea of building multiple models so casually, but let me remind you that it is the reality in which we already reside, anyway. I am saying this, as I’ve seen too many marketers who try to fix everything with just one hammer, and the results weren’t ideal—to say the least.

It is a shame that we still treat people with one-dimensional tools, such as segmentations and clusters, in this age of ubiquitous and abundant data. Nobody is one-dimensional, and we must embrace that reality sooner rather than later. That calls for rapid model development and deployment, using everything that we’ve got.

Arguing about how difficult it is to build one or two more models here and there is so last century.

Why Model?

Why model? Uh, because someone is ridiculously good looking, like Derek Zoolander? No, seriously, why model when we have so much data around?

The short answer is because we will never know the whole truth. That would be the philosophical answer. Physicists construct models to make new quantum field theories more attractive theoretically and more testable physically. If a scientist already knows the secrets of the universe, well, then that person is on a first-name basis with God Almighty, and he or she doesn’t need any models to describe things like particles or strings. And the rest of us should just hope the scientist isn’t one of those evil beings in “Star Trek.”

Another answer to “why model?” is that we don’t really know the future, not even the immediate future. If some object is moving in a certain direction at a certain velocity, we can safely guess where it will end up in one hour. Then again, nothing in this universe is just one-dimensional like that, and there could be a snowstorm brewing in its path, messing up the whole trajectory. And that weather “forecast” that predicted the snowstorm is the result of some serious modeling, isn’t it?

What does all this mean for marketers who are not necessarily masters of mathematics, statistics or theoretical physics? Plenty, actually. And the use of models in marketing goes way back to the days of punch cards and mainframes. If you are too young to know what those things are, well, congratulations on your youth, and let’s just say that it was around the time when humans first stepped on the moon, using a crude rocket ship equipped with less computing power than an inexpensive passenger car of today.

Anyhow, in that ancient time, some smart folks in the publishing industry figured that they would save tons of money if they could correctly “guess” who the potential buyers were “before” they dropped any expensive mail pieces. Even with basic regression models—and they only had one or two chances to get it right with glacially slow tools before the all-too-important Christmas season came around every year—they could safely cut the mail quantity by 80 percent to 90 percent. The savings added up really fast by not talking to everyone.

Fast-forward to the 21st Century. There is still beauty in knowing who the potential buyers are before we start engaging anyone. As I wrote in my previous columns, analytics should answer:

1. To whom you should be talking; and
2. What you should offer once you’ve decided to engage someone.

At least the first part will be taken care of by knowing who is more likely to respond to you.

But in these days, when the cost of contacting a person through various channels is dropping rapidly, deciding to whom to talk can’t be the only reason for all this statistical work. Of course not. There are plenty more reasons why being a statistician (or a data scientist, nowadays) is one of the best career choices of this century.

Here is a quick list of benefits of employing statistical models in marketing. Basically, models are constructed to:

  • Reduce cost by contacting prospects more wisely
  • Increase targeting accuracy
  • Maintain consistent results
  • Reveal hidden patterns in data
  • Automate marketing procedures by being more repeatable
  • Expand the prospect universe while minimizing the risk
  • Fill in the gaps and summarize complex data into an easy-to-use format—A must in the age of Big Data
  • Stay relevant to your customers and prospects

We talked enough about the first point, so let’s jump to the second one. It is hard to argue about the “targeting accuracy” part, though there still are plenty of non-believers in this day and age. Why are statistical models more accurate than someone’s gut feeling or sheer guesswork? Let’s just say that in my years of dealing with lots of smart people, I have not met anyone who can think about more than two to three variables at the same time, not to mention potential interactions among them. Maybe some are very experienced in using RFM and demographic data. Maybe they have been reasonably successful with choices of variables handed down to them by their predecessors. But can they really go head-to-head against carefully constructed statistical models?

What is a statistical model, and how is it built? In short, a model is a mathematical expression of “differences” between dichotomous groups. Too much of a mouthful? Just imagine two groups of people who do not overlap. They may be buyers vs. non-buyers; responders vs. non-responders; credit-worthy vs. not-credit-worthy; loyal customers vs. attrition-bound, etc. The first step in modeling is to define the target, and that is the most important step of all. If the target is hanging in the wrong place, you will be shooting at the wrong place, no matter how good your rifle is.

And the target should be expressed in mathematical terms, as computers can’t read our minds, not just yet. Defining the target is a job in itself:

  • If you’re going after frequent flyers, how frequent is frequent enough for you? Five times a year or 10 times a year? Or somewhere in between? Or should it remain continuous?
  • What if the target is too small or too large? What then?
  • If you are looking for more valuable prospects, how would you express that? In terms of average spending, lifetime spending or sheer number of transactions?
  • What if there is an inverse relationship between frequency and dollar spending (i.e., high spenders shopping infrequently)?
  • And what would be the borderline number to be “valuable” in all this?

Once the target is set, after much pondering, then the job is to select the variables that describe the “differences” between the two groups. For example, I know how much marketers love to use income variables in various situations. But if that popular variable does not explain the differences between the two groups (target and non-target), the mathematics will mercilessly throw it out. This rigorous exercise of examining hundreds or even thousands of variables is one of the most critical steps, during which many variables go through various types of transformations. Statisticians have different preferences in terms of ideal numbers of variables in a model, while non-statisticians like us don’t need to be too concerned, as long as the resultant model works. Who cares if a cat is white or black, as long as it catches mice?

Not all selected variables are equally important in model algorithms, either. More powerful variables are assigned higher weights, and the sum of these weighted values is what we call the model score. Now, non-statisticians who have been slightly allergic to math since the third grade only need to know that the higher the score, the more likely the record in question is to be like the target. To make the matter even simpler, let’s just say that you want higher scores over lower scores. If you are a salesperson, just call the high-score prospects first. And would you care how many variables are packed into that score, as long as you get the good “Glengarry Glen Ross” leads on top?
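
If you are curious what that weighted sum looks like in its simplest form, here is a toy sketch. The variable names and weights are entirely made up, and in a real model the weighted sum would typically sit inside a link function chosen by the statistician; the point is only that the score is a single number built from many weighted inputs.

    # Hypothetical weights from a fitted model; higher weight = more powerful variable.
    weights = {"intercept": -2.0, "visits_last_90_days": 0.8,
               "pct_households_with_septic_tanks": 0.3, "tenure_years": 0.1}

    def model_score(record: dict) -> float:
        """Sum of weighted variable values; sort prospects by this number, descending, and call from the top."""
        score = weights["intercept"]
        for variable, weight in weights.items():
            if variable != "intercept":
                score += weight * record.get(variable, 0.0)   # missing values simply default to 0 here
        return score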

So, let me ask again. Does this sound like something a rudimentary selection rule with two to three variables can beat when it comes to identifying the right target? Maybe someone can get lucky once or twice, but not consistently.

That leads to the next point, “consistency.” Because models do not rely on a few popular variables, they are far less volatile than simple selection rules or queries. In this age of Big Data, there are more transactional and behavioral data in the mix than ever, and they are far more volatile than demographic and geo-demographic data. Put simply, people’s purchasing behavior and preferences change much faster than family composition or income, and that volatility calls for more statistical work. Plus, all facets of marketing are now about measurable results (ah, that dreaded ROI, or “Roy,” as I call it), and businesses call for consistent hitters over one-hit wonders.

“Revealing hidden patterns in data” is my favorite. When marketers are presented with thousands of variables, I see a majority of them just sticking to a few popular ones all the time. Some basic recency and frequency data are there, and among hundreds of demographic variables, the list often stops after income, age, gender, presence of children, and some regional variables. But seriously, do you think that the difference between a luxury car buyer and an SUV buyer is just income and age? You see, these variables are just the ones that human minds are accustomed to. Mathematics has no such preconceived notions. Sticking to a few popular variables is like children repeatedly using three favorite colors out of a whole box of crayons.

I once saw a neighborhood-level U.S. Census variable called “% Households with Septic Tanks” in a model built for a high-end furniture catalog. Really, the variable was “percentage of houses with septic tanks in the neighborhood.” Then I realized it made a lot of sense. That variable was revealing how far away the neighborhood was from populous city centers: the higher the percentage of septic tanks, the further the residents lived from the city center. And maybe those folks who live in sparsely populated areas are more likely to shop for furniture through catalogs than the folks who live closer to commercial areas.

This is where we all have that “aha” moment. But you and I would never pick that variable in anything that we do, not in a million years, no matter how effective it may be in finding the target prospects. The word “septic” may scare some people off at “hello.” In any case, modeling procedures reveal hidden connections like that all the time, and that is a very important function in data-rich environments. Otherwise, we will not know what we can throw out without fear, and the databases will just keep getting larger and more unusable.

Moving on to the next points, “Repeatable” and “Expandable” are somewhat related. Let’s say a marketer has been using a very innovative selection logic that she came across almost by accident. In pursuing special types of wealthy people, she stumbled upon a piece of data called “owner of swimming pool.” Now, she may have even had a few good runs with it, too. But eventually, that success will lead to the questions of:

1. Having to repeat that success again and again; and
2. Having to expand that universe, when the “known” universe of swimming pool owners becomes depleted or saturated.

Ah, the chagrin of a one-hit-wonder begins.

Use of statistical models, with the help of multiple variables and scalable scoring, avoids all of those issues. You want to expand the prospect universe? No trouble. Just dial down the scores on the scale a little further. We can even measure the risk of reaching into the lower-scoring groups. And you don’t have to worry about coverage issues related to a few variables, as those won’t be the only ones in the model. Want to automate the selection process? No problem there, as using a score, which is a summary of key predictors, is far simpler than having to carry a long list of data variables into any automated system.
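
One common way to measure that risk is a simple decile (gains) report on a scored and validated sample. Here is a minimal sketch, assuming you have each record’s model score and its actual response flag on hand; the function and column names are mine, not any standard tool’s.

    import pandas as pd

    def decile_report(scores: pd.Series, responded: pd.Series) -> pd.DataFrame:
        """Decile 1 = highest-scoring 10 percent; watch the response rate fall as you dial down the list."""
        df = pd.DataFrame({"score": scores, "responded": responded})
        df["decile"] = pd.qcut(-df["score"].rank(method="first"), 10, labels=list(range(1, 11)))
        report = (df.groupby("decile", observed=False)["responded"]
                    .agg(["count", "sum", "mean"])
                    .rename(columns={"count": "names", "sum": "responses", "mean": "response_rate"}))
        return report

Reading down the response_rate column tells you exactly how much performance you give up for each additional decile you mail.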

Now, that leads to the next point, “Filling in the gaps and summarizing complex data into an easy-to-use format.” In the age of ubiquitous and “Big” data, this is the single most important point, way beyond the previous examples for traditional 1-to-1 marketing applications. We are definitely going through massive data overloads everywhere, and someone had better refine the data and provide some usable answers.

As I mentioned earlier, we build models because we will never know the whole truth. I believe that the Big Data movement should be all about:

1. Filtering the noise from valuable information; and
2. Filling the gaps.

“Gaps,” you say? Believe me, there are plenty of gaps in any dataset, big or small.

When information continues to get piled on, the resultant database may look big. And such databases are physically large. But in marketing, as I have repeatedly emphasized in my previous columns, the data must be realigned to “buyer-centric” formats, with every data point describing each individual, as marketing is all about people.

Sure, you may have tons of mobile phone-related data. In fact, it could be quite huge in size. But let me turn that upside down for you (more like sideways-up, in practice). Now, try to describe everyone in your footprint in terms of certain activities. Say, “every smartphone owner who used more than 80 percent of his or her monthly data allowance, on average, for the past 12 months, regardless of the carrier.” Hey, don’t blame me for asking these questions just because it’s inconvenient for data handlers to answer them. Some marketers would certainly benefit from information like that, and no one cares about bits and pieces of data on their own, other than as interesting tidbits at a party.

Here’s the main trouble when you start asking buyer-related questions like that. Once we try to look at the world from the “buyer-centric” point of view, we will realize there are tons of missing data (i.e., a whole bunch of people with not much information). It may be that you will never get this kind of data from all carriers. Maybe not everyone is tracked this way. In terms of individuals, you may end up with less than 10 percent in the database with mobile information attached to them. In fact, many interesting variables may have less than 1 percent coverage. Holes are everywhere in so-called Big Data.

Models can fill in those blanks for you. For all those data compilers who sell age and income data for every household in the country, do you believe that they really “know” everyone’s age and income? A good majority of the information is based on carefully constructed models. And there is nothing wrong with that.

If we don’t get to “know” something, we can still get to a “likelihood” score of “being like” that something. And in that world, every measurement is on a scale, with no missing values. For example, the higher the score of a model built for a telecommunications company, the more likely the prospect is to sign up for a high-speed data plan, or for international long-distance services, depending on the purpose of the model. Or the more likely the person is to buy sports packages via cable or satellite. Or to subscribe to premium movie channels. Etc., etc. With scores like these, a marketer can initiate the conversation with (not just talk at) a particular prospect, customized product packages in hand.

And that leads us to the final point in all this, “Staying relevant to your customers and prospects.” That is what Big Data should be all about—at least for us marketers. We know plenty about a lot of people. And they are asking us why we are still so random about marketing messages. With all these data that are literally floating around, marketers can do so much better. But not without statistical models that fill in the gaps and turn pieces of data into marketing-ready answers.

So, why model? Because a big pile of information doesn’t provide answers on its own, and that pile has more holes than Swiss cheese if you look closely. That’s my final answer.