Marketing Machines — Possible or Pipedream?

True data-driven marketing is still “just a dream” for many marketers, rather than a reality. Under this vision, systems data-mine autonomously and present fresh, actionable insights at your desktop in the morning.

For about 99 percent of marketers, this may sound too good to be true — and in all candor, it usually is.

But it is important to recognize that the intelligent application of mathematics and statistics, and the creation of purpose-specific algorithms, have been quietly creating value for years now. Yet the typical marketer still struggles to find enough time to get the mail out, or to execute well-thought-out website marketing experiments against a control (see “Analytics Isn’t Reporting”).

So there have never been more skeptics of the legitimate power of the intelligent application of data, even as C-suite expectations of a data strategy that creates competitive advantage grow. Sound like your experience, industry or career? Sure it does.

But as investment continues to grind higher and competition grows, progress continues to be made.

The Amazon of Data Is, of Course, Amazon
You may know that Amazon.com elected to release to the public some technology it uses internally in making recommendations and determining what you’d be likely to buy and when. Amazon took the same tool-set it uses and published it on Amazon Web Services. “Pretty neat,” you might say …

Because we get so many questions about how Amazon does it, and how all of this actually works, we’ll break down the AWS Machine Learning and Prediction tool-set so that qualified organizations have an idea of what’s possible.

For the purposes of this article, a “qualified organization” is one that has development talent, experience with data and at least a basic working knowledge of statistical methods. Of course, experience developing models is very helpful, as well.

We call these “requirements” because Amazon’s tools, and every tool like them (Google has a similar tool-set for the Google Cloud Platform), require significant programming to use. They also have a learning curve for inexperienced developers and for organizations that haven’t developed competencies in structuring and transforming their data into a form that is readily ingested and workable with these tools.

What AWS Tools Do
AWS offers a “Machine Learning” and “Prediction” tool-set. These are two related components: Machine Learning ingests large amounts of data and identifies patterns in that data, and Prediction applies the resulting model to new records, scoring each one. A typical example is extracting promotional history and responses, and using them to identify which customers are most likely to respond to a marketing promotion or offer.

When Should You Use Machine Learning and Prediction?
Generally speaking, machine learning works best when a simple “logic-based” algorithm doesn’t work, or doesn’t work consistently. With simple (or even complex) logic, you define a set of rules or requirements that determine the decision the algorithm makes. This is also called a deterministic, or rule-based, approach.

If there are a lot of variables, say hundreds or more, you can’t realistically develop “brute force” rules that cover every scenario you’d need to create value. You may determine a buyer’s favorite color with a simple rule that says if the majority of their purchases are red, then they like red. But each purchase is influenced by more than just color… there is style, season, price and category of product, material, size and discount, to name a few. As the permutations of these combinations of variables grow more complex, a simple deterministic rule-based approach breaks down and produces predictions that fail more and more of the time.

If and when business rules begin to collide with one another and discrepancies require more rules to manage these logical collisions, Machine Learning can help sort through your data in ways rule-based algorithms cannot.

“In short, you can’t realistically create or code all the permutations and business logic cost-effectively.”

If your data set is very large and the diversity of variables is high, any “brute force” approach is destined to fail. Running through a set of rules on a sample of a few thousand cases may still work. But what if you have millions of raw records? That can happen even without a multi-million record customer file, given that we may be looking at the colors and other attributes of items purchased over a period of years. Machine Learning can help make the task scalable, and when you’re using Amazon’s computing power to do it, scale becomes the easy part.
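
To make the contrast concrete, here is a minimal, hypothetical sketch in Python: a hand-written rule handles one attribute easily, but once several attributes interact, a learning algorithm (a scikit-learn random forest here, purely as an illustration) finds the patterns for you. The tiny dataset and column names are invented for the example.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    # Invented purchase history; "responded" is whether the customer answered an offer.
    purchases = pd.DataFrame({
        "color":     ["red", "red", "blue", "red", "green", "blue"],
        "category":  ["shoes", "bags", "shoes", "coats", "shoes", "bags"],
        "price":     [40.0, 90.0, 55.0, 120.0, 35.0, 75.0],
        "responded": [1, 0, 0, 1, 0, 1],
    })

    # Deterministic, rule-based approach: one hand-written rule per scenario.
    favorite_color = purchases["color"].mode().iloc[0]   # "most purchases are red, so they like red"

    # As style, season, price, category, size, discount, ... interact, you would need a new
    # rule for every combination. A learning algorithm estimates the relationships instead.
    X = pd.get_dummies(purchases[["color", "category", "price"]])
    y = purchases["responded"]
    model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    print(model.predict_proba(X)[:, 1])   # estimated response likelihood per purchase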

Here’s An Overview of How the Prediction Process Works
So here’s an executive-level overview of how we use Machine Learning, and how it works if you build your solution on top of AWS, or Google’s developer APIs.

1. Problem Definition — Begin With the End in Mind: Here’s the step too many really don’t get right. If you’re going to venture into Machine Learning with AWS, or anywhere else, first you must define the core problems or opportunities you wish to pursue. You’ll have to do so by describing what you can observe (through your data) and the “answer” a model is expected to predict.

2. Data Preparation: Your data is going to go into a “training algorithm” where the tools will identify patterns in the data that will ultimately be used to predict the answers you’re looking for on a like dataset. Look at your data before it goes in. Be curious. Do some logical testing on it. If it is not adding up to the common sense “sniff test,” odds are very good it won’t add up later, either.

3. Transformation: Input variables and the answers you seek from models, also called the “target,” are rarely tidy enough to be used to train an effective, predictive model. So you have some heavy lifting to do to get the data into new variables, “transforming” it into a more prediction-friendly input. For example, you may have a set of transactions that a customer had with your brand, but you need to summarize that into a count of transactions for that customer and an average time between purchases. These two new fields will be more predictive and useful (a minimal sketch of this appears after step 6 below). A command of logic and statistics helps make these calls, as does experience.

4. Implement a Learning Algorithm: Your input variables have to be fed into an algorithm that can sort and find patterns in your data — also called a “learning algorithm.” These algorithms are specialized to help establish models (statistical relationships) and evaluate the quality of the models on data that was held out from model building.

5. Run the Model: We generate predictions against a new or holdout sample in the same format, from the same source of data. You can’t run this predictive model on the same sample you used to build the model. This begins the iterative process.

6. Iterate … Then Do It Again: As with any process where you’re engineering new outcomes for the first time, this process is generally iterative. It’s usually not realistic to expect a killer result on the first pass. You’ll likely massage inputs and training methods a number of times before the output starts looking good. More on what a good output looks like in a future column, though. For now, you need to know that the first product won’t likely be the final product.
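
To ground steps 2 through 5, here is a minimal sketch, assuming a pandas DataFrame of raw transactions and scikit-learn. The file name and columns (customer_id, order_date, amount, responded) are hypothetical, and a real AWS or Google Cloud implementation would call those platforms’ own APIs rather than local libraries.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    # Hypothetical raw transaction extract.
    tx = pd.read_csv("transactions.csv", parse_dates=["order_date"])

    # Step 2 - look at the data before it goes in: basic "sniff" tests.
    print(tx.describe())
    print(tx.isna().mean())   # share of missing values per column

    # Step 3 - transform transaction-level rows into customer-level inputs.
    tx = tx.sort_values(["customer_id", "order_date"])
    gap_days = tx.groupby("customer_id")["order_date"].diff().dt.days
    features = tx.assign(gap_days=gap_days).groupby("customer_id").agg(
        n_orders=("order_date", "count"),
        avg_days_between=("gap_days", "mean"),
        total_spend=("amount", "sum"),
        responded=("responded", "max"),   # the "target" the model is asked to predict
    ).fillna({"avg_days_between": 0})

    # Step 4 - feed the inputs to a learning algorithm, holding data out for validation.
    X = features[["n_orders", "avg_days_between", "total_spend"]]
    y = features["responded"]
    X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.3, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # Step 5 - score the holdout sample, never the rows the model was built on.
    scores = model.predict_proba(X_hold)[:, 1]
    print("holdout AUC:", roc_auc_score(y_hold, scores))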

The Bottom Line — Easier Still Isn’t Quite Easy for the Average Marketing Organization
While Amazon and Google may be among the easiest websites to use, and have made tremendous contributions to the proliferation of data science by providing structure and programming tools with which organizations can develop new capabilities, using AWS for Machine Learning and Prediction is not for the creative marketer, or even the “traditional” Web marketer.

There is also a rising category of upstarts in data-driven and database marketing apps that add intelligence to the process and can provide marketers with a significant head-start in advancing their marketing intelligence.

Data Science requires a combination of technical, mathematics/statistics and marketing/business skills. This combination is in great demand the world over, and so it’s not easy to hire top contributors to implement all of this. But for organizations with the programming bench, or external experienced business partners, tools like AWS and Google Cloud Platform can provide a substantial leap forward in using data to make superior decisions.

Remember, the outputs of the predictive process don’t have to be “right” 100 percent of the time — and they won’t be. They only need to make the numbers break in your favor enough to have a material impact on your revenue and profit now — and over time.

After all, that’s what the data science discipline is really all about.

It’s All About Ranking

The decision-making process is really all about ranking. As a marketer, to whom should you be talking first? What product should you offer through what channel? As a businessperson, whom should you hire among all the candidates? As an investor, what stocks or bonds should you purchase? As a vacationer, where should you visit first?

Yes, “choice” is the keyword in all of these questions. And if you picked Paris over other places as an answer to the last question, you just made a choice based on some ranking order in your mind. The world is big, and there could have been many factors that contributed to that decision, such as culture, art, cuisine, attractions, weather, hotels, airlines, prices, deals, distance, convenience, language, etc., and I am pretty sure that not all factors carried the same weight for you. For example, if you put more weight on “cuisine,” I can see why London would lose a few points to Paris in that ranking order.

As a citizen, for whom should I vote? That’s the choice based on your ranking among candidates, too. Call me overly analytical (and I am), but I see the difference in political stances as differences in “weights” for many political (and sometimes not-so-political) factors, such as economy, foreign policy, defense, education, tax policy, entitlement programs, environmental issues, social issues, religious views, local policies, etc. Every voter puts different weights on these factors, and the sum of them becomes the score for each candidate in their minds. No one thinks that education is not important, but among all these factors, how much weight should it receive? Well, that is different for everybody; hence, the political differences.

I didn’t bring this up to start a political debate, but rather to point out that the decision-making process is based on ranking, and the ranking scores are made of many factors with different weights. And that is how the statistical models are designed in a nutshell (so, that means the models are “nuts”?). Analysts call those factors “independent variables,” which describe the target.
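
A toy example of that weighted-sum ranking in Python, with factors, weights and ratings invented purely to show how different weights produce different rankings:

    # Invented weights and ratings; change the weights and the ranking changes with them.
    weights = {"cuisine": 0.4, "art": 0.3, "distance": 0.2, "price": 0.1}
    cities = {
        "Paris":  {"cuisine": 9, "art": 9, "distance": 5, "price": 4},
        "London": {"cuisine": 7, "art": 8, "distance": 5, "price": 3},
    }
    scores = {city: sum(weights[f] * ratings[f] for f in weights)
              for city, ratings in cities.items()}
    for city, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        print(city, round(score, 2))   # higher score, higher rank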

In my past columns, I talked about the importance of statistical models in the age of Big Data (refer to “Why Model?”), and why marketing databases must be “model-ready” (refer to “Chicken or the Egg? Data or Analytics?”). Now let’s dig a little deeper into the design of the “model-ready” marketing databases. And surprise! That is also all about “ranking.”

Let’s step back into the marketing world, where folks are not easily offended by the subject matter. If I give you a spreadsheet that contains thousands of leads for your business, you wouldn’t easily be able to tell which ones are the “Glengarry Glen Ross” leads that came from Downtown, along with those infamous steak knives. What choice would you have then? Call everyone on the list? I guess you could start picking names out of a hat. If you think a little more about it, you may filter the list by first name, as first names may reflect the decade in which their owners were born. Or start calling folks who live in towns that sound affluent. Heck, you could start calling them in alphabetical order, but the point is that you would “sort” the list somehow.

Now, if the list came with some other valuable information, such as income, age, gender, education level, socio-economic status, housing type, number of children, etc., you may be able to pick and choose which variables you would use to sort the list. You may start calling the high-income folks first. Not all product sales are positively related to income, but it is an easy way to start the process. Then, you would throw in other variables to break the ties in rich areas. I don’t know what you’re selling, but maybe you would want folks who live in a single-family house with kids. And sometimes, your “gut” feeling may lead you to the right place. But only sometimes. And only when the size of the list is not in the millions.
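
In code, that kind of manual prioritization is nothing more than a multi-key sort and a filter. A minimal sketch, assuming a pandas DataFrame of leads with hypothetical columns income, num_children and housing_type:

    import pandas as pd

    # Hypothetical lead list; the file and column names are invented for illustration.
    leads = pd.read_csv("leads.csv")

    # Start with high-income folks, then break ties with household details.
    ranked = leads.sort_values(by=["income", "num_children"], ascending=[False, False])
    single_family = ranked[ranked["housing_type"] == "single_family"]
    print(single_family.head(20))   # the first batch of calls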

If the list was not for prospecting calls, but for a CRM application where you also need to analyze past transaction and interaction history, the list of the factors (or variables) that you need to consider would be literally nauseating. Imagine the list contains all kinds of dollars, dates, products, channels and other related numbers and figures in a seemingly endless series of columns. You’d have to scroll to the right for quite some time just to see what’s included in the chart.

In situations like that, how nice would it be if some analyst threw in just two model scores, for responsiveness to your product and the potential value of each customer, for example? The analysts may have considered hundreds (or thousands) of variables to derive such scores for you, and all you need to know is that the higher the score, the more likely the lead will be responsive or have higher potential value. For your convenience, the analyst may have converted all those numbers with many decimal places into easy-to-understand 1-10 or 1-20 scales. That would be nice, wouldn’t it? Now you can just start calling the folks in model group No. 1.
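
That conversion into a 1-10 scale is typically just a decile break on the raw score. A minimal sketch with invented scores, using pandas:

    import pandas as pd

    # Invented raw model scores with many decimal places.
    customers = pd.DataFrame({"score": [0.9121, 0.1534, 0.6218, 0.3377, 0.8452,
                                        0.0711, 0.5503, 0.4289, 0.7645, 0.2960]})

    # qcut splits the file into 10 equal-sized groups; label 1 marks the best decile.
    customers["model_group"] = pd.qcut(customers["score"], 10,
                                       labels=list(range(10, 0, -1))).astype(int)
    print(customers.sort_values("model_group"))   # start calling model group No. 1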

But let me throw in a curveball here. Let’s go back to the list with all those transaction data attached, but without the model scores. You may say, “Hey, that’s OK, because I’ve been doing alright without any help from a statistician so far, and I’ll just use the past dollar amount as their primary value and sort the list by it.” And that is a fine plan, in many cases. Then, when you look deeper into the list, you find out there are multiple entries for the same name all over the place. How can you sort the list of leads if the list is not even on an individual level? Welcome to the world of relational databases, where every transaction deserves an entry in a table.

Relational databases are optimized to store every transaction and retrieve them efficiently. In a relational database, tables are connected by match keys, and many times, tables are connected in what we call “1-to-many” relationships. Imagine a shopping basket. There is a buyer, and we need to record the buyer’s ID number, name, address, account number, status, etc. Each buyer may have multiple transactions, and for each transaction, we now have to record the date, dollar amount, payment method, etc. Further, if the buyer put multiple items in a shopping basket, that transaction, in turn, is in yet another 1-to-many relationship to the item table. You see, in order to record everything that just happened, this relational structure is very useful. If you are the person who has to create the shipping package, yes, you need to know all the item details, transaction value and the buyer’s information, including the shipping and billing address. Database designers love this completeness so much, they even call this structure the “normal” state.

But the trouble with the relational structure is that each line describes transactions or items, not the buyers. Sure, one can “filter” people out by interrogating every line in the transaction table, say “Select buyers who had any transaction over $100 in the past 12 months.” That is what I call rudimentary filtering. But once we start asking complex questions such as, “What is the buyer’s average transaction amount for the past 12 months in the outdoor sports category, and what is the overall future value of the customer through online channels?” then you will need what we call “buyer-centric” portraits, not transaction- or item-centric records. Better yet, if I ask you to rank every customer in order of such future value, well, good luck doing that when all the tables are describing transactions, not people. That would be exactly like the case where you have multiple lines for one individual when you need to sort the leads from high value to low.

So, how do we remedy this? We need to summarize the database on an individual level, if we would like to sort the leads on an individual level. If the goal is to rank households, email addresses, companies, business sites or products, then the summarization should be done on those levels, too. Database designers call this the “de-normalization” process, and the tables tend to get “wide” along the way, but that is the necessary step in order to rank the entities properly.
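
Here is a minimal sketch of that summarization, with an invented 1-to-many transaction table flattened into one wide row per buyer using pandas; the table and column names are illustrative only, not from any particular database.

    import pandas as pd

    # Invented 1-to-many transaction table; in a real database this would come
    # from joining the buyer, transaction and item tables.
    tx = pd.DataFrame({
        "buyer_id":   [101, 101, 102, 102, 102],
        "order_date": pd.to_datetime(["2024-01-05", "2024-03-10", "2024-02-01",
                                      "2024-02-20", "2024-06-15"]),
        "category":   ["outdoor", "apparel", "outdoor", "outdoor", "apparel"],
        "amount":     [120.0, 45.0, 80.0, 60.0, 30.0],
    })

    # One wide row per buyer: average spend by category, plus an order count.
    portrait = tx.pivot_table(index="buyer_id", columns="category",
                              values="amount", aggfunc="mean").add_prefix("avg_")
    portrait["n_orders"] = tx.groupby("buyer_id").size()
    print(portrait)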

Now, the starting point of all this summarization is a proper identification number for each of those levels. It won’t be possible to summarize any table on a household level without a reliable household ID. One may think that such things are a given, but I would have to disagree. I’ve seen so many so-called “state of the art” (another cliché that makes me nauseous) databases that do not have consistent IDs of any kind. If your database managers say they are using “plain name” or “email address” fields for matching or summarization, be afraid. Be very afraid. For starters, you know how many email addresses one person may have. To add to that, consider how many people move around each year.

Things get worse, in regard to ranking by model scores, when it comes to “unstructured” databases. We see more and more of those as data sources reach into uncharted territories and the size of databases grows exponentially. There, all these bits and pieces of data sit on mysterious “clouds” as entries of their own. Here again, it is one thing to select or filter based on collected data, but ranking based on statistical modeling is simply not possible in such a structure (or lack thereof). Just ask the database managers how many 24-month active customers they really have, considering that a great many people move in that time period and change their addresses, creating multiple entries. If you get an answer like “2 million-ish,” well, that’s another scary moment. (Refer to “Cheat Sheet: Is Your Database Marketing Ready?”)
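
The gap behind that “2 million-ish” answer is easy to expose once a consolidated individual-level ID exists. A minimal sketch, assuming a hypothetical extract with an invented consolidated_id match key:

    import pandas as pd

    # Hypothetical extract of "24-month active" records; consolidated_id is an
    # invented individual-level match key produced by upstream consolidation.
    actives = pd.read_csv("active_24_month.csv")
    print("raw entries:         ", len(actives))   # inflated by moves and duplicate emails
    print("distinct individuals:", actives["consolidated_id"].nunique())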

In order to develop models using variables that are descriptors of customers, not transactions, we must convert that relational or unstructured data into a structure that matches the level by which you would like to rank the records. Even temporarily. As the size of databases gets bigger and bigger and storage gets cheaper and cheaper, I’d say that the temporary time period could be, well, indefinite. And because the word “data-mart” is overused and confusing to many, let me just call that place the “Analytical Sandbox.” Sandboxes are fun, and yes, all kinds of fun stuff for marketers and analysts happen there.

The Analytical Sandbox is where samples are created for model development; actual models are built; models are scored for every record (no matter how many there are) without hiccups; targets are easily sorted and selected by model scores; reports are created in meaningful and consistent ways (consistency is even more important than sheer accuracy in what we do); and analytical languages such as SAS, SPSS or R are spoken without being frowned upon by other computing folks. Here, analysts will spend their time pondering target definitions and methodologies, not database structures and incomplete data fields. Have you heard the fancy term “in-database scoring”? This is where that happens, too.

And what comes out of the Analytical Sandbox and back into the world of relational or unstructured databases (IT folks often ask this question) is going to be very simple. Instead of having to move mountains of data back and forth, all the variables will be in the form of model scores, providing answers to marketing questions, without any missing values (by definition, every record can be scored by models). While the scores pack tons of information into them, their sizes could be as small as a couple of bytes each, or even less. Even if you carry over a few hundred affinity scores for 100 million people (or any other type of entity), I wouldn’t call the resultant file large, as it would be as small as a few video files, really.

In my future columns, I will explain how to create model-ready (and human-ready) variables using all kinds of numeric, character or free-form data. In Exhibit A, you will see what we call traditional analytical activities colored in dark blue on the right-hand side. In order to make those processes really hum, we must follow all the steps on the left-hand side of that big cylinder in the middle. This is where garbage-in-garbage-out situations are prevented: all the data get collected in a uniform fashion, properly converted, edited and standardized by uniform rules, categorized based on preset meta-tables, consolidated with consistent IDs, summarized to desired levels, and turned into meaningful variables for more advanced analytics.

Even more than statistical methodologies, consistent and creative variables in the form of “descriptors” of the target audience make or break the marketing plan. Many people think that purchasing expensive analytical software will provide all the answers. But lest we forget, fancy software only addresses the right-hand side of Exhibit A, not all of it. Creating a consistent template for all useful information in a uniform fashion is the key to maximizing the power of analytics. If you look into any modeling bakeoff in the industry, you will see that the differences among methodologies are measured in fractions. Conversely, inconsistent and incomplete data create disasters in the real world. And in many cases, companies can’t even attempt advanced analytics while sitting on mountains of data, due to structural inadequacies.

I firmly believe the Big Data movement should be about

  1. getting rid of the noise, and
  2. providing simple answers to decision-makers.

Bragging about size and speed alone will not bring us to the next level, which is to “humanize” the data. At the end of the day (another cliché that I hate), it is all about supporting the decision-making process, and the decision-making process is all about ranking different options. So, in the interest of keeping it simple, let’s start by creating an analytical haven where all those rankings become easy, in case you think that the sandbox is too juvenile.

B-to-B Prospecting Data Just Keeps Getting Better

The most reliable and scalable approach to finding new B-to-B customers is outbound communications, whether by mail, phone or email, to potential prospects, using rented or purchased lists. B-to-B marketers typically select targets from prospecting lists based on such traditional variables as industry, company size, and job role or title. But new research (opens as a pdf) indicates that B-to-B prospecting data is much more detailed these days, and includes a plethora of variables to choose from—for refining your targeting, or for building predictive models—to pick your targets even more effectively.

My colleague Bernice Grossman and I recently conducted a new study (opens as a pdf) indicating that B-to-B marketers now have the opportunity to target prospects more efficiently than ever before. In fact, you might say that business marketers now have access to prospecting data as rich and varied as that available in consumer markets.

To get an understanding of the depth of data available to B-to-B marketers for prospecting, we invited a set of reputable vendors to open their vaults and share details about the nature and quantity of the fields they offer. Seven vendors participated, giving us a nice range of data sources, including both compiled lists and response lists.

We provided each vendor with a set of 30 variables that B-to-B marketers often use, including not only company size and industry, but also elements like the year the company was established, fiscal year end, Fortune Magazine ranking, SOHO (small office/home office) business indicator, growing/shrinking indicator, and other useful variables that can give marketers insight into the relative likelihood of a prospect’s conversion to a customer. We learned that some vendors provide all these data elements on most of the accounts on their files, while others offer only a few.

We also asked the participating vendors to tell us what other fields they make available, and this is where things got interesting. In response to our request for sample records on five well-known firms, the reported results included as many as 100 lines per firm. Furthermore, two of the vendors, Harte-Hanks and HG Data, supply details about installed technology, and their fields thus run into the thousands. The quantity was so vast that we published it in a supplementary spreadsheet, so that our research report itself would be kept to a readable size.

Some of the more intriguing fields now available to marketers include:

  • Spending levels on legal services, insurance, advertising, accounting services, utilities and office equipment (Infogroup)
  • Self-identifying keywords used on the company website (ALC)
  • Technology usage “intensity” score, by product (HG Data)
  • Out-of-business indicator, plus credit rating and parent/subsidiary linkages (Salesforce.com)
  • Company SWOT analysis (OneSource)
  • Whether the company conducts e-commerce (ALC)
  • List of company competitors (OneSource)
  • Biographies of company contacts (OneSource)
  • Employees who travel internationally (Harte-Hanks)
  • Employees who use mobile technology (Harte-Hanks)
  • Links to LinkedIn profiles of company managers (Stirista)
  • Executive race, religion, country of origin and second language (Stirista)

Imagine what marketers could do with a treasure trove of data elements like these to help identify high-potential prospects.

As a matter of fact, we asked the vendors to tell us which fields their clients find most valuable for predictive purposes. Several fresh and interesting ideas surfaced:

  • A venture capital trigger, from OneSource, indicating that a firm has received fresh funding and thus has budget to spend.
  • Tech purchase likelihood scores from Harte-Hanks, built from internal models and appended to enhance the profile of each account.
  • A “prospectability” score custom-modeled by OneSource to match target accounts with specific sales efforts.
  • PRISM-like business clusters offered by Salesforce.com (appended from D&B), which provide a simple profile for gaining customer insights and finding look-alikes.
  • “Call status code,” Infogroup’s assessment of the authenticity of the company record, based on Infogroup’s ongoing phone-based data verification program.

We conclude from this study that B-to-B prospecting data is richer and more varied than most marketers would have thought. We recommend that marketers evaluate several vendors to see which best suit their needs, and conduct a comparative test before buying.

Readers who would like to see our past studies on the quality and quantity of prospecting data available in business markets can access them here. Bernice and I are always open to ideas for future studies. We welcome your feedback and suggestions.

A version of this article appeared in Biznology, the digital marketing blog.