Creating a Persona Menu (for You)

I have been writing about the importance of using modeling techniques for personalization for some time now (refer to “Personalization Is About the Person” and “Segments vs. Personas”). If I may summarize the whole idea down to a 15-second pitch:

  • We need modeling because we will never know everything about everybody; and
  • Selfishly for marketers, it is much simpler to assign personas to product groups and related contents than to have to deal with an obscene amount of customer data and a long list of content details at the same time.

Simply, personas are like menu items, each representing key characteristics of target customers that marketers need to know to push their products.

One may say, “Hey, I just put SKU-level data into some personalization engine!” To which I must ask, “Do you also put unrefined oil into your beloved automobile?” I didn’t think so. Not that ruining some personalization engine will break anyone’s heart. But it may annoy the heck out of your customers by treating them as extensions of their immediate purchases, not as living, breathing human beings.

I’ve actually met someone from a software company at a conference who claimed to be able to create hundreds of thousands of combinations of SKU-level transaction data and content data. If you have a few hundred thousand SKUs and tens of thousands of pictures and creative items, well, the number of combinations will be quite large. Not exactly the number of stars in the universe, but quite unmanageable, enough for marketers to just “let go” and leave it all to the machine on a default setting. So, even if someone automated the process of combining such data (with some built-in rules, I’m sure), how would any marketer – and recipients of messages – make sense out of it all?

That type of shotgun approach is the mother of all of those annoying “personalizations,” like offers of the very same items that you just purchased. For such rudimentary methods, it might actually be a great achievement to offer a yoga mat to someone who just bought a yoga mat. Hey, they are in the same category after all, categorically speaking, right?

The key to humanization of marketing messages is to make them about the customers, not about marketers, products or channels. And that kind of high-level personalization requires, well, a real human touch. That means, each block of information must be bite-sized so that human beings – i.e., marketers – can process and consume it easily.

When I first came to America (a long time ago), it wasn’t so easy to go through menu items in a typical diner. Too many items! How can I pick just “one” of those items that matches my appetite and mood of the day? Now imagine a menu that goes on for hundreds of thousands of lines. And you have to act fast on it, too.

Personas, or archetypes as some may call them, are the bridges between obscene amounts of data points and yet another large set of pictures and content. The idea is to have a manageable number of personas to make it easier for us to match the right content to the right target.

I bet most content libraries are not crazy big, but large enough. But on that side, it is what it is. You will not cut out some valuable digital assets just because the inventory got big. So, we have to make the personal data – especially behavioral and transactional data – more compact to facilitate easy assignment, as in “Show this picture of a glass of red wine next to a juicy steak” to a persona called “Wine Enthusiast” or “Fine Dining.” The assignment itself would be as simple as reserving room for a persona designation in the content library (if you don’t even have a content library, we need to talk).

Then, how would you come up with the right list of personas for “you”? Having done this a few times for many companies in various industries on a national level, I have some tips to share.

  1. Be Product-Centric: Anyone who has been reading my articles about personalization will be surprised by this one, as I have been screaming “customer-centric marketing” all along. But, in the end, we are doing all of this to sell more of our products to customers. Think about the products you want to push, then think about the types of characteristics that you would love to know about customers to push those products in a relevant way.

Trying to sell cutting-edge products? Then you may need personas such as “Early adopter.” Selling value-based items? You may want “Bargain-seekers.” Pushing travel items? Try “Frequent business traveler” or “Family vacation” personas. Dealing with high net-worth people? Well, go beyond simple income-select and try “Globetrotter,” “Luxury car,” “Heavy stock investor,” etc., depending on what you are selling. By the way, these luxury personas may or may not be related to one another, as human beings are much more complex than their income levels.

  2. Be Creative: Models can be built if you have data for “some” people who have actually behaved in a certain way to be used as targets. That limitation aside, you can be as creative as you want to be.

For example, if you are in the telecommunications industry, expand the typical triple-play offering, and dig deeper into “why” people would need broadband service. Is it because someone is an “Avid gamer,” “Heavy VOIP user,” “Frequent international caller,” part of a “Big family,” “Home office worker” and/or “On-demand movie watcher”? If you can differentiate these traits, you don’t have to push broadband Internet services with brute force. You can now show reasons why they need over 100 megabits per second service.

If you are dealing with mostly female customers (who are, by the way, responsible for the bulk of economic activities on a national level), one can imagine categories that start with various health and beauty items, going all of the way to yoga and fitness personas. In between those, add any persona that is an ideal target for the products you are trying to sell, be it “Fashion enthusiast,” “Children’s interests,” “Gardening enthusiast,” “Organic food,” “Weight watchers,” “Gourmet Cooking,” “Family entertainment,” etc. The key is to describe the buyer, not the product.

  3. Start Small, but be bolder as the list grows: In the beginning, you may have to prove that personalization using model-based personas really works. Yes, building a persona is as simple as building a propensity model (in essence, they are exactly those), but that doesn’t mean that you start the effort with 50 personas. Pick the product that you really want to push, or characteristics that you need to know in order to resonate with your core customers, and build a few personas as a starter (say five to 10). You may find some data limitations along the way, but as you go through the list, your team (or analytics partners) will definitely gain momentum.

Then you can be bold. I’ve seen retailers who routinely maintain over 100 personas for just one major product category. And I’ll bet that list didn’t grow that big overnight, either.

Also, when you are in an expansion mode, just add items when in doubt. Think about the users of those personas, not mathematical differences among models. Do you know the difference between Kung Pao Chicken and Diced Chicken with Hot Peppers? Just peanuts on top. But restaurants have them both because customers expect to see them.

Similarly, there may be only slight differences between “Conservative Investor” and “Annuity Investor” personas. But the users of those personas may grab one or the other because of their targeting need at the moment. Or whatever inspired their marketing spirit. Think in terms of user-friendliness, not mathematical purity.

  4. Do Not Go Out of Control: When I was leading a product development team in a prominent data compiling company in the U.S., our team developed about 140 personas covering the entire country for various behavioral categories, including investment, travel, sports (both active participation and being a fan of), telecomm, donation, politics, etc. One of our competitors tried to copy that idea, and failed miserably. Why? It had built too many models.

For instance, if you are building personas for the cruise industry in general, you may need just “Luxury cruise” and “Family cruise” for starters. Those are good enough for initial prospecting. Then, if you must get deeper into cross-selling for coveted “onboard spending,” then you may get into “Adventure-seeker,” “Family entertainment,” “Gourmet,” “Wine enthusiast,” “Shopping expedition,” “Luxury entertainment,” “Silver years,” “Young parents,” etc., for customization of offers.

My old copycats with too many models had developed separate models for “each” cruise fleet and brand. How were they going to use all of that? One brand at a time, with one company as a user group? Why not build a custom model as needed, then? Surely that would be more effective if the model is to target a specific brand or fleet. Anyway, my competitors ended up building a few thousand models, for any known brand out there in every industry, seriously limiting the chance those personas would be used by marketers.

As I mentioned in the beginning, this is about matching offers (or content) to the right people at the right time. If you go out of control, it will be very difficult to do that kind of match-making. If your persona list is just big for the sake of being big, well, how is that any different from using the raw data? You’ve got to know when to stop, too. The key is “not too small, and not too big,” for humans and machines alike.

  5. Update Periodically: Like any menu, persona lists go out of date. Some items may not have been used actively. Some may become obsolete as business models and core product lines go through changes. And models do go stale, as well. You may not have to review this all of the time, and there will be staple menu items, like spaghetti with meatballs in a restaurant. But it will be prudent to go through the menu once in a while. If not because of the product, then because of people’s attitudes about it changing.
  6. Evangelize: It would be a shame if the data and analytics people did all of this work and marketers didn’t use it fully. These personas are in essence mathematical summaries of “lots of” data in compact forms. They can be used in targeting (for selecting the right target for specific product offers), and for personalization of offers and messages based on dominant characteristics of the target (e.g., show different pictures to “Adventure-seeker” and “Family entertainment” personas, even if they are about to board the same ship). Continuously educate your fellow marketers that using personas is as easy as using any other type of data, except that they are compressed model scores with no missing values.

The personalization game is complex. It may look easy if you just buy an off-the-shelf personalization engine, set up some rules with unrefined data and let it run. While it’s better than sending a uniform message to everyone, that kind of rudimentary approach is far less than ideal, not to mention the annoyance factor.

To maximize the power of all available data and the personalization engine itself, we must compress the data in forms of personas. Resultant messaging will be far more relevant to your target audience as, for one, a persona is a built-in mechanism for the personal touch. If you set the menu up as a bridge between data and people, that is.

Segments vs. Personas

Personalization may mean different things to marketers, but we may break it down to, one, reacting to what you specifically know about the target and, two, proactively personalizing messages and offers based on both explicit and implicit data.

The first one is more like “OK, the target prospect is clicking a whole lot in the hiking gear section, so show him more related products right now.” This type of activity requires technical know-how regarding Web and mobile display techniques, and there are lots of big and small companies that specialize in that arena. Simply put, what good is all this talk about data and analytics, if one doesn’t know how to display personalized messages to the target customer? If you “know” that the customer is looking for hiking gear, by any means, usher him to the proper section. There are plenty of commercial versions of “product-to-product” matching algorithms available, too. We can dissect the data trail that the consumer left behind later.

All those transaction data trails become integral parts of the “Customer-360” (yet another buzzword of the day). Once that type of customer-centric view (a must for proper personalization) becomes a reality, however, marketers often realize “Oh jeez, we really do not know everything about everyone.” That is when the analytics must get into a higher gear, as we need to project what is known to us to the unknown territory, effectively filling in the gaps in the data. I’d say that is the single most important function of statistical modeling in the age of abundant, but never complete, data. Omniscience is a state that we will never reach.

Then the next natural question is: How are we going to fill in such gaps? In such situations, many marketers jump into an autopilot mode to use what we have been calling “segmentation” since the ’70s and ’80s (depending on how advanced one was back then). But is it still a desirable behavior in this day and age?

As “data-driven” personalization goes, no, using a segmentation technique is not a bad thing at all. It is a heck of a lot more effective than using raw data for customized messaging. As consumers, we all laugh at some ridiculous product suggestions, even by so-called reputable merchants, and that happens because they often enter raw SKU-level data into some commercial personalization engines.

If we get to have access to segments called “rich and comfortable retirees” or “young and upcoming professionals,” why not make the most of them? We can certainly use such information to personalize our offers and messages. It is just that we can do a lot better than that now.

The traditional segmentation technique has its limitations, as it tends to pin the target into one segment at a time. Surely, we all somewhat look like our neighbors, but are we so predictably uniform? Why should anyone be pigeonholed into one segment, and be labeled along with millions of others in that group? Even for rich and prestigious-sounding segments, it may be insulting to treat every member equally, as if they all enjoy the same type of luxury travel and put their money into the same investment vehicles. Simply put, in the real world, they do not.

Every individual possesses multiple dominant characteristics. For that reason alone, it is much more prudent to develop multiple personas and line them around the target consumer. The idea is the opposite of “group them first, and label them later”-type segmentation. It is more like “Build separate personas for all relevant behaviors, then find dominant characteristics for one person at a time.” With modeling techniques and modern computing power, we can certainly do that. There already are retailers who routinely use more than 100 personas for personalized campaigns and treatments.

The following chart compares traditional clustering/segmentation techniques to model-based personas:

[Chart: traditional clustering/segmentation techniques compared with model-based personas]

This segment vs. persona question comes up every time I talk about analytics-based personalization. It is understandable, as segmentation is an age-old technique with long mileage. Marketers feel comfortable around the concept, as segments have been the common language among creative types, IT folks and geeky analytical kinds. But I must point out that the segments are primarily designed for “general” message groups, not for individual-level personalization with wider varieties.

Plus, as I described in the chart, personas are more updatable, as they are much more agile than a clunky segmentation tool. I’ve seen segmentation tools that boast of more than 70 to 90 segments. But the more specific they become, the harder it is to update all of those with any consistency.

Conversely, personas are built for one behavior/propensity at a time, so it is much easier to update and maintain them. If the model scores seem to be drifting away from the original validation, just update the problematic ones, not the whole menu.
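The selective update described above could be sketched as follows. This is only a minimal illustration: the function name, the validation metric (a hypothetical lift figure per persona), the sample numbers and the 10 percent tolerance are all assumptions, not part of any particular toolset.

```python
def personas_to_refit(baseline, current, tolerance=0.10):
    """List personas whose current validation metric has fallen more than
    `tolerance` (as a fraction) below the value recorded at build time."""
    stale = []
    for persona, original in baseline.items():
        drop = (original - current[persona]) / original
        if drop > tolerance:
            stale.append(persona)
    return stale


# Hypothetical validation lift per persona: at build time vs. today.
baseline = {"Wine enthusiast": 3.0, "Bargain-seeker": 2.5, "Adventure-seeker": 2.8}
current = {"Wine enthusiast": 2.9, "Bargain-seeker": 1.9, "Adventure-seeker": 2.7}

# Only the personas that drifted get refit; the rest of the menu stays as is.
refit_list = personas_to_refit(baseline, current)
print(refit_list)
```

With these made-up numbers, only one of the three personas crosses the tolerance, which is the point: the menu gets touched up one item at a time.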

In the end, the personalization game is about which message and product offer resonates with the customers better. Without even talking about technical details, we know that more agile and flexible tools would have advantages in that game. And as I mentioned many times in this series, matching the right product and offer to the right person is a job anyone can do without a degree in mathematics. Just bring your common sense and let your imagination fly. After all, that is how copywriters imagine their target: by looking at the segment descriptions. That part isn’t any different from looking at the descriptions of personas instead; you will just have more flexibility in that matchmaking business.

Data Atrophy

Not all data are created equal. There are one-dimensional demographic and firmographic data, then there are more colorful behavioral data. The former is about how the targets look, and the latter is more about what they do, like what they click, browse, purchase and say. On top of these, if we are lucky, we may have access to attitudinal data, which are about what the target is thinking about. If we get to have all three types of data about the customers and prospects, prediction business will definitely get to the next level (refer to “Big Data Must Get Smaller”). But the reality is that it is very difficult to know everything about anyone, and that is why analytics is really about making the best of what we know. Predictive modeling is useful not only because it predicts the future, but also because it fills gaps in data. And even in the age of abundant data, there are many holes, as we will never have a complete set of information (refer to “Why Model?”).

Among these data types, some are more useful for prediction than others. Behavioral data definitely possess more predictive power than simple demographic data. But alas, they are harder to come by. It could be that the target is new to the environment, so she may not have left much data behind at all. Maybe she just looked around and didn’t buy anything yet. Or she is very privacy-conscious and diligent about erasing her behavioral trails on the net or otherwise. Maybe she explicitly opted out of being traced at all, giving up much of the convenience factors of being known by the merchants. Then the data coverage comes into the equation, and that is why analysts rely on demographic and geo-demographic data for their readily available nature. Much of such data can easily be purchased and appended on a household or individual level, at least in the U.S. If we get to have some hint of identity of the target, there are ways to merge disparate data sets together.

What if we don’t get to know who is leaving data trails? Again, it could be about the privacy concerns of the target, or the manner by which the data are collected. Some data collectors avoid personally identifiable information, such as name, address or email, as they do not want to be seen as the Big Brother. Even if collectors get to have access to such PII, they do not share it with outsiders, to maintain dominance and to avoid the data privacy issue altogether. And there are many instances where that “who” part is completely out of reach. Movement data would be an example of that.

Weaving multiple types of data together is often the main source of trouble when it comes to predictive analytics. I have been talking about the importance of a 360-degree view of a customer for proper personalization and attribution, but the main show-stopper there is often the inability to merge data sources with confidence, not the lack of technology or statistical skills. That would be the horizontal challenge when dealing with multiple types of data.

Then there is the time factor. Like living organisms, data get old and wither away, too. Let’s call it the “data atrophy” challenge. Data players must be mindful about it, as outdated information is often worse than not having any at all for the decision-making or prediction business.

Now, not all data types deteriorate at the same rate. The shelf-life of demographic data is far longer than that of behavioral data. For example, people’s income levels or housing size do not change overnight, while usefulness of what we call “hotline” data evaporates much faster. If you get to know that someone is searching for a new car, how long will he be in the market? What if it is about a ticket or pay-per-view purchase for tonight’s ball game? Data that are extremely valuable this minute could be totally irrelevant within the next hour.
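One way to picture different rates of atrophy is a freshness weight that shrinks with the age of the data. The sketch below is purely illustrative: the exponential half-life form is an assumption (nothing above prescribes a decay function), and the data types and half-life values are hypothetical.

```python
def freshness(age_days, half_life_days):
    """Weight between 0 and 1 that halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


# Hypothetical half-lives, in days, for three kinds of data.
HALF_LIFE_DAYS = {
    "income_level": 730,          # demographic: changes slowly
    "car_shopping": 30,           # "hotline" behavioral signal
    "game_ticket_tonight": 0.25,  # relevant for a few hours only
}

# How much is each data point still worth 30 days after collection?
weights = {
    data_type: freshness(30, half_life)
    for data_type, half_life in HALF_LIFE_DAYS.items()
}
for data_type, weight in weights.items():
    print(f"{data_type}: {weight:.3f}")
```

Under these assumed values, the month-old income figure keeps nearly all of its weight, the car-shopping signal keeps half, and the ball game signal is effectively worthless.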

No One Is One-Dimensional

If anyone says to your face “You’re one-dimensional,” you would be rightfully offended by such a statement. It would almost sound like “You are so simple that I just figured you out.” Along that line of thinking, you should be mad at most marketers, as they treat consumers as one-dimensional subjects. Even advanced marketers who claim that they pursue personalized marketing routinely treat customers as if they belong to “1” segment along with millions of other people. Sort of like drones with similar characteristics. Some may title such segments with other names, like “clusters” or “cohorts.” But no matter. That is how personalization works most times, and that is why most consumers are not impressed with so-called personalized messages.

Here is how segments are built through cluster analysis. Unlike regression models, clusters are built without clear “target” (or dependent) variables (refer to “Data Deep Dive: The Art of Targeting”). Considering all available variables, statisticians group the universe with commonly shared characteristics. A common analogy is that they throw spaghetti noodles on the wall, and see which ones stick together. Analysts can control the number of segments and closeness (or “stickiness”) of resultant groups. I have seen major banks grouping their customers into six to seven major segments. Most commercial clustering products by data compilers maintain 50 to 60 segments or cohorts (I am not going to name names here, but I am sure you have heard of most of them). I was personally involved in a project where we divided every town in the U.S. into 108 distinctive clusters using consumer, business and geo-demographic variables. The number of segments may vary greatly, depending on the purpose.

Once distinctive segments are created through a mathematical process, then the real fun begins. The creators get to describe characteristics of each segment in plain English, and group smaller segments into higher-level “super” clusters. Some creative companies name each cluster with whimsical titles or dominant first names of each cluster (for copyright reasons, I wouldn’t use actual names, but again, I’m sure marketers have heard about them). To identify dominating characteristics of people within each cluster, analysts use various measurements to compare them against the whole universe. For instance, if a cluster shows an above-average index of post-college graduates, then they may call it “highly educated.” If analysts see a high index-value of luxury car owners, then they may label the whole cluster with some luxurious-sounding name.
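The index comparison described above can be sketched in a few lines. This is a minimal illustration, assuming the common convention that an index of 100 means “same rate as the whole universe”; the function name, the sample records and the cutoff of 120 are hypothetical.

```python
def trait_index(cluster, universe, trait):
    """Index of a trait within a cluster relative to the whole universe
    (100 = same rate as the universe)."""
    cluster_rate = sum(1 for person in cluster if person[trait]) / len(cluster)
    universe_rate = sum(1 for person in universe if person[trait]) / len(universe)
    return 100 * cluster_rate / universe_rate


# Hypothetical records: True means the person owns a luxury car.
universe = [{"luxury_car": owns} for owns in [True] * 10 + [False] * 90]
cluster = [{"luxury_car": owns} for owns in [True] * 6 + [False] * 14]

index = trait_index(cluster, universe, "luxury_car")
label = "luxury" if index >= 120 else "average"  # 120 is an assumed cutoff
print(round(index), label)
```

Note that in this made-up cluster, 14 of the 20 members do not own a luxury car, yet the whole cluster still earns the luxurious-sounding label because its rate is well above the universe average.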

Segmentation is an age-old technique and, of course, it still has its place in marketing. Let me make it clear that using segments for target marketing is much better than not using anything at all. It also provides a common language among various players in marketing, binding clients and vendors together. Marketing agencies, who cannot realistically create an unlimited number of copies, may prepare a set number of creatives for major segments that their clients are targeting. With descriptions of segments in front of them, copywriters may write as if they are talking to the target directly. Surely, writing copy for a “Family-oriented young couple with dual income” would be easier than doing so for some anonymous target.

However, the trouble begins when marketers start using such a “descriptive” tool for targeting purposes. Just because there is a higher-than-average index value of a certain characteristic in a segment, is it justified to treat thousands, or sometimes millions, of people in the target group the same way? Surely, not everyone in the “luxury” segment is about luxury automobiles or vacations. It is just that the cluster that someone happened to have belonged to through some statistical process has a higher-than-average concentration of such folks.

Then how do we overcome such shortcomings of a popular method? I suggest we reverse the way we look at the behavioral indices completely. The traditional method defines the clusters first, and then the analysts put descriptions looking at various behavioral and demographic indices. For promotions for specific products or services, they may examine more than 50, sometimes more than a few hundred index values. Only to label everyone in a segment the same way.

Instead, for targeting and personalization, marketers should commission independent models for every type of behavioral or demographic characteristic that may matter for their campaigns. So, instead of using one “luxury segment,” we should build multiple models. For example, for travel businesses such as airlines or cruise lines, we may consider the following series of model-based “personas”:

  • Foreign vacationers
  • Luxury vacationers
  • Frequent business travelers
  • Frequent flyers
  • Budget-conscious travelers
  • Family vacationers
  • Travelers with young children
  • Frequent theme park visitors
  • Bargain-seekers
  • Adventure-seekers
  • Wine enthusiasts
  • Gourmets
  • Brand-loyal travelers
  • Point collectors
  • etc.

This way, we can describe “everyone” in the target universe in a multi-dimensional way. Surely, not everyone is about everything. That is why we need a system under which one person may score high in multiple categories at the same time. We all have tendencies to be bargain seekers, but everyone has a different threshold for it (i.e., what length of trouble would you go through for a 10 percent discount?). If you have multiple descriptors for everybody, you can find the most dominant characteristics for one person at a time. Yes, one may have high scores in “luxury vacationers,” “frequent flyers” and “frequent business travelers” models, but which characteristic has the highest score for “him”?

Imagine having assigned scores for these “personas” for everyone. I may score nine out of nine in “frequent flyer” (and that is for certain, as I am writing this on a plane again), score six out of nine in “luxury vacation,” and score two out of nine in “family vacationers” (as my kids are not young anymore). If you have one chance to show me something that resonates with me this second, what would be the offer? Even a machine can decide the outcome with a scoring system like this. Now imagine doing it for millions of people, all customized.
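The machine’s decision in that example can be sketched in a few lines. This is a minimal illustration: the one-to-nine scores come from the paragraph above, while the function name and the persona-to-offer lookup are hypothetical.

```python
def dominant_persona(scores):
    """Return the persona with the highest score for one person."""
    return max(scores, key=scores.get)


# Scores on a one-to-nine scale, as in the example above.
author_scores = {
    "frequent flyer": 9,
    "luxury vacation": 6,
    "family vacationers": 2,
}

# Hypothetical lookup from persona to the creative or offer to show.
offers = {
    "frequent flyer": "bonus miles on the next booking",
    "luxury vacation": "suite upgrade package",
    "family vacationers": "kids-fly-free promotion",
}

best = dominant_persona(author_scores)
print(best, "->", offers[best])
```

Doing the same for millions of people is just a loop over everyone’s scores, since every person has a score for every persona.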

Last month, I wrote that personalization is not an option anymore, and further, marketers should aspire to personalize their messages for most people, most times, through all channels, instead of personalizing only for some people sometimes through some channels (refer to “Road to Personalization”). Because “personas” based on statistical models will not have any missing values, we can achieve that ambitious goal with this technique.

With new modeling techniques and software, this is just a matter of commitment now. We are not operating in the ’80s anymore, and it is time to move ahead from simple segmentation methods. Yes, using segments would be much better than no targeting at all. But with a few more tweaks, we can build more than 20 personas in the same time that we would spend for developing segments using a clustering technique, which isn’t exactly cheap even nowadays.

Another downside of a clustering technique is that, once the statistical work is done, it is very difficult to update the formula without changing existing marketing schemas. By nature, segments are very static. It is no secret that even some data compilers chose to stay with old models, as they are afraid of creating inconsistencies with newly updated ones. Some are more than a decade old.

Conversely, it is very easy to update personas, as it is not much different from refitting the models one at a time. And we don’t have to update the whole series every time, either. Just watch out for the ones that do not validate very well over time. With real machine learning techniques around the corner, we can even consider automating the whole process, from model update to deployment of messages through every channel.

The hard part would be imagining the categories of personas, but I suggest starting small with essential categories, and then building upon them. Surely, teenage apparel companies would have a very different list than business service companies that sell their services to other businesses. Start with obvious ones, like bargain seekers, high-value customers and specific key product targets.

Connecting personas to actual creatives will require some work in the beginning, too. However, if you plan the categories with set creatives in mind from the get-go, it won’t be so difficult. Again, start small and see how it goes, along with some A/B testing. Ten categories will be plenty for many businesses. But having more than 100 personas won’t take up much space in supporting databases, either. Once the system gets stable, marketers can automate much of the process, as most commercial software can take these personas like any other raw variable.

So, if your marketing team is committed enough to have purchased personalization engines for various channels, get out of the old segmentation method and consider building model-based personas. After all, no one is one-dimensional, and everyone deserves personalized offers and messages in this day of abundant data and machine power. This is not 1984 anymore.

Exciting New Tools for B-to-B Prospecting

Finding new customers is a lot easier these days, what with innovative, digitally based ways to capture and collect data. Early examples of this exciting new trend in prospecting were Jigsaw, a business card swapping tool that allowed salespeople to trade contacts, and ZoomInfo, which scrapes corporate websites for information about businesspeople and merges the information into a vast pool of data for analysis and lead generation campaigns. New ways to find prospects continue to come on the scene, seemingly every day.

One big new development is the trend away from static name/address lists, and towards dynamic sourcing of prospect names complete with valuable indicators of buying readiness culled from their actual behavior online. Companies such as InsideView and Leadspace are developing solutions in this area. Leadspace’s process begins with constructing an ideal buyer persona by analyzing the marketer’s best customers, which can be executed by uploading a few hundred records of name, company name and email address. Then, Leadspace scours the Internet, social networks and scores of contact databases for look-alikes and immediately delivers prospect names, fresh contact information and additional data about their professional activities.

Another dynamic data sourcing supplier with a new approach is Lattice, which also analyzes current customer data to build predictive models for prospecting, cross-sell and churn prevention. The difference from Leadspace is that Lattice builds the client models using their own massive “data cloud” of B-to-B buyer behavior, fed by 35 data sources like LexisNexis, Infogroup, D&B, and the US Government Patent Office. CMO Brian Kardon says Lattice has identified some interesting variables that are useful in prospecting, for example:

  • Juniper Networks found that a company that has recently “signed a lease for a new building” is likely to need new networks and routers.
  • American Express’s foreign exchange software division identified that a company that has “opened an office in a foreign country” is likely to need foreign exchange help.
  • Autodesk searches for companies that post job descriptions online seeking “design engineers with CAD/CAM experience.”

Lattice faces competition from Mintigo and Infer, which are also offering prospect scoring models—more evidence of the growing opportunity for marketers to take advantage of new data sources and applications.

Another new approach is using so-called business signals to identify opportunity. As described by Avention’s Hank Weghorst, business signals can be any variable that characterizes a business. Are they growing? Near an airport? Unionized? Minority owned? Susceptible to hurricane damage? The data points are available today, and can be harnessed for what Weghorst calls “hyper segmentation.” Avention’s database of information flowing from 70 suppliers, overlaid by data analytics services, intends to identify targets for sales, marketing and research.

Social networks, especially LinkedIn, are rapidly becoming a source of marketing data. For years, marketers have mined LinkedIn data by hand, often using low-cost offshore resources to gather targets in niche categories. Recently, a gaggle of new companies—like eGrabber and Social123—are experimenting with ways to bring social media data into CRM systems and marketing databases, to populate and enhance customer and prospect records.

Then there’s 6Sense, which identifies prospective accounts that are likely to be in the market for particular products, based on the online behavior of their employees, anonymous or identifiable. 6Sense analyzes billions of rows of third-party data, from trade publishers, blogs and forums, looking for indications of purchase intent. If Cisco is looking to promote networking hardware, for example, 6Sense will come back with a set of accounts that are demonstrating an interest in that category, and identify where they are in their buying process, from awareness to purchase. The account data will be populated with contacts, indicating their likely role in the purchase decision, and an estimate of the likely deal size. The data is delivered in real time to whatever CRM or marketing automation system the client wants, according to CEO and founder Amanda Kahlow.

Just to whet your appetite further, have a look at CrowdFlower, a start-up company in San Francisco, which sends your customer and prospect records to a network of over five million individual contributors in 90 countries, to analyze, clean or collect the information at scale. Crowd sourcing can be very useful for adding information to, and checking on the validity and accuracy of, your data. CrowdFlower has developed an application that lets you manage the data enrichment or validity exercises yourself. This means that you can develop programs to acquire new fields whenever your business changes and still take advantage of their worldwide network of individuals who actually look at each record.

The world of B-to-B data is changing quickly, with exciting new technologies and data sources becoming available at a record pace. Marketers can expect plenty of new opportunities for reaching customers and prospects efficiently.

A version of this article appeared in Biznology, the digital marketing blog.

Beyond RFM Data

In the world of predictive analytics, transaction data are the king of the hill. The master of the domain. The protector of the realm. Why? Because they are hands-down the most powerful predictors. If I may borrow the phrase that my mentor coined for our cooperative venture more than a decade ago (before anyone even uttered the words “Big Data”), “The past behavior is the best predictor of the future behavior.” Indeed. Back then, we had built a platform that nowadays could easily have qualified as Big Data. The platform predicted people’s future behaviors on a massive scale, and it worked really well, so I still stand by that statement.

How so? At the risk of sounding like a pompous mathematical smartypants (I’m really not), it is because people do not change that much, or if so, not so rapidly. Every move you make is on some predictive curve. What you have been buying, clicking, browsing, smelling or coveting somehow leads to the next move. Well, not all the time. (Maybe you just like to “look” at pretty shoes?) But with enough data, we can calculate the probability with some confidence that you would be an outdoors type, or a golfer, or a relaxing type on a cruise ship, or a risk-averse investor, or a wine enthusiast, or into fashion, or a passionate gardener, or a sci-fi geek, or a professional wrestling fan. Beyond the affinity scores listed here, we can predict the future value of each customer or prospect and possible attrition points, as well. And behind all those predictive models (and I have seen countless algorithms), the leading predictors are mostly transaction data, if you are lucky enough to get your hands on them. In the age of ubiquitous data and at the dawn of the “Internet of Things,” more marketers will be in that lucky group if they are diligent about data collection and refinement. Yes, in the near future, even a refrigerator will be able to order groceries, but don’t forget that only the collection mechanism will be different there. We still have to collect, refine and analyze the transaction data.

Last month, I talked about three major types of data (refer to “Big Data Must Get Smaller”), which are:
1. Descriptive Data
2. Behavioral Data (mostly Transaction Data)
3. Attitudinal Data.

If you gain access to all three elements with decent coverage, you will have tremendous predictive power when it comes to human behaviors. Unfortunately, it is really difficult to accumulate attitudinal data on a large scale with individual-level details (i.e., knowing who’s behind all those sentiments). Behavioral data, mostly in the form of transaction data, are also not easy to collect and maintain (non-transaction behavioral data are even bigger and harder to handle), but I’d say they are definitely worth the effort, as most of what we call Big Data falls under this category. Conversely, one can just purchase descriptive data, which are what we generally call demographic or firmographic data, from data compilers or brokers. The sellers (there are many) will even do the data-append processing for you and they may also throw in a few free profile reports with it.

Now, when we start talking about the transaction data, many marketers will respond “Oh, you mean RFM data?” Well, that is not completely off-base, because “Recency, Frequency and Monetary” data certainly occupy important positions in the family of transaction data. But they hardly are the whole thing, and the term is misused as frequently as “Big Data.” Transaction data are so much more than simple RFM variables.

RFM Data Is Just a Good Start
The term RFM should be used more as a checklist for marketers, not as design guidelines (or limitations, in many cases) for data professionals. How recently did this particular customer purchase our product, how frequently did she do that, and how much money did she spend with us? Answering these questions is a good start, but stopping there would seriously limit the potential of transaction data. Further, this line of questioning would lead the interrogation efforts to simple “filtering,” as in: “Select all customers who purchased anything with a price tag over $100 more than once in the past 12 months.” Many data users may think that this query is somewhat complex, but it really is just a one-dimensional view of the universe. And unfortunately, no customer is one-dimensional. And this query is just one slice of truth from the marketer’s point of view, not the customer’s. If you want to get really deep, the view must be “buyer-centric,” not product-, channel-, division-, seller- or company-centric. And the database structure should reflect that view (refer to “It’s All About Ranking,” where the concept of “Analytical Sandbox” is introduced).
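To show how little that “complex” query really does, here is a minimal sketch of it in Python. The rows, the field layout and the function name are all made up for illustration; the only thing taken from the text is the rule itself (more than one purchase over $100 in the past 12 months).

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical item-level rows: (customer_id, purchase_date, item_price)
purchases = [
    ("C1", date(2014, 3, 1), 120.00),
    ("C1", date(2014, 6, 15), 250.00),
    ("C2", date(2014, 5, 2), 150.00),
    ("C2", date(2014, 5, 9), 35.00),
    ("C3", date(2012, 1, 10), 300.00),
    ("C3", date(2012, 2, 11), 400.00),
]

def repeat_big_ticket_buyers(rows, as_of, min_price=100.00, window_days=365):
    """Customers with more than one purchase priced over min_price in the window."""
    cutoff = as_of - timedelta(days=window_days)
    counts = Counter(
        cust for cust, when, price in rows
        if price > min_price and cutoff <= when <= as_of
    )
    return sorted(cust for cust, n in counts.items() if n > 1)

selected = repeat_big_ticket_buyers(purchases, as_of=date(2014, 7, 3))
print(selected)
```

The result is a yes-or-no list of IDs. It says nothing else about who those buyers are, which is exactly the limitation of filtering.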

Transaction data by definition describe the transactions, not the buyers. If you would like to describe a buyer or if you are trying to predict the buyer’s future behavior, you need to convert the transaction data into “descriptors of the buyers” first. What is the difference? It is the same data looked at through a different window—front vs. side window—but the effect is huge.

Even if we think about just one simple transaction with one item, instead of describing the shopping basket as “transaction happened on July 3, 2014, containing Coldplay’s latest CD ‘Ghost Stories’ priced at $11.88,” a buyer-centric description would read: “A recent CD buyer in the Rock genre with an average spending level in the music category under $20.” The trick is to describe the buyer, not the product or the transaction. If that customer has many orders and items in his purchase history (let’s say he downloaded a few songs to his portable devices, as well), the description of the buyer would become much richer. If you collect all of his past purchase history, it gets even more colorful, as in: “A recent music CD or MP3 buyer in rock, classical and jazz genres with 24-month purchases totaling 13 orders containing 16 items, with total spending in the $100-$150 range and an $11 average order size.” Of course you would store all this using many different variables (such as genre indicators, number of orders, number of items, total dollars spent during the past 24 months, average order amount and number of weeks since last purchase in the music category, etc.). But the point is that the story would come out this way when you change the perspective.

Creating a Buyer-Centric Portrait
The whole process of creating a buyer-centric portrait starts with data summarization (or de-normalization). A typical structure of the table (or database) that needs to capture every transaction detail, such as transaction date and amount, would require an entry for every transaction, and the database designers call it the “normal” state. As I explained in my previous article (“Ranking is the key”), if you would like to rank in terms of customer value, the data record must be on a customer level, as well. If you are ranking households or companies, you would then need to summarize the data on those levels, too.

Now, this summarization (or de-normalization) is not a process of eliminating duplicate entries of names, as you wouldn’t want to throw away any transaction details. If there are multiple orders per person, what is the total number of orders? What is the total amount of spending on an individual level? What would be average spending level per transaction, or per year? If you are allowed to have only one line of entry per person, how would you summarize the purchase dates, as you cannot just add them up? In that case, you can start with the first and last transaction date of each customer. Now, when you have the first and last transaction date for every customer, what would be the tenure of each customer and what would be the number of days since the last purchase? How many days, on average, are there in between orders then? Yes, all these figures are related to basic RFM metrics, but they are far more colorful this way.

The attached exhibit displays a very simple example of a before and after picture of such a summarization process. On the left-hand side, there resides a typical order table containing customer ID, order number, order date and transaction amount. If a customer has multiple orders in a given period, an equal number of lines are required to record the transaction details. In real life, other order-level information, such as payment method (very predictive, by the way), tax amount, discount or coupon amount and, if applicable, shipping amount would be on this table, as well.

On the right-hand side of the chart, you will find there is only one line per customer. As I mentioned in my previous columns, establishing a consistent and accurate customer ID cannot be neglected, for this reason alone. How would you rely on the summary data if one person may have multiple IDs? The customer may have moved to a new address, or shopped from multiple stores or sites, or there could have been errors in data collection. Relying on email address is a big no-no, as we all carry many email addresses. That is why the first step of building a functional marketing database is to go through the data hygiene and consolidation process. (There are many data processing vendors and software packages for it.) Once a persistent customer (or individual) ID system is in place, you can add up the numbers to create customer-level statistics, such as total orders, total dollars, and first and last order dates, as you see in the chart.
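For readers who prefer code to charts, the before-and-after can be sketched roughly as follows. The order rows and field names here are invented for illustration (they are not the figures from the exhibit), and the sketch assumes the customer ID has already been cleaned and consolidated.

```python
from datetime import date

# Hypothetical order table: one row per order (the "before" picture).
orders = [
    {"customer_id": "A001", "order_no": 1001, "order_date": date(2013, 11, 2), "amount": 45.00},
    {"customer_id": "A001", "order_no": 1017, "order_date": date(2014, 2, 14), "amount": 30.00},
    {"customer_id": "A001", "order_no": 1042, "order_date": date(2014, 6, 30), "amount": 75.00},
    {"customer_id": "B002", "order_no": 1020, "order_date": date(2014, 3, 5), "amount": 120.00},
]

def summarize_by_customer(order_rows):
    """Collapse order-level rows into one summary record per customer ID."""
    summary = {}
    for row in order_rows:
        rec = summary.setdefault(row["customer_id"], {
            "total_orders": 0,
            "total_dollars": 0.0,
            "first_order_date": row["order_date"],
            "last_order_date": row["order_date"],
        })
        rec["total_orders"] += 1
        rec["total_dollars"] += row["amount"]
        rec["first_order_date"] = min(rec["first_order_date"], row["order_date"])
        rec["last_order_date"] = max(rec["last_order_date"], row["order_date"])
    return summary

# One line per customer (the "after" picture).
customer_summary = summarize_by_customer(orders)
for customer_id, rec in sorted(customer_summary.items()):
    print(customer_id, rec)
```

Note that no order detail is thrown away in the source table; the summary is an additional, customer-level view built on top of it.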

Remember R, F, M, P and C
The real fun begins when you combine these numeric summary figures with product, channel and other important categorical variables. Because product (or service) and channel are the most distinctive dividers of customer behaviors, let’s just add P and C to the famous RFM (remember, we are using RFM just as a checklist here), and call it R, F, M, P and C.

Product (rather, product category) is an important separator, as people often show completely different spending behavior for different types of products. For example, you can send me fancy-shmancy fashion catalogs all you want, but I won’t look at them with an intention of purchase, as most men will look at the models and not what they are wearing. So my active purchase history in the sports, home electronics or music categories won’t mean anything in the fashion category. In other words, those so-called “hotline” names should be treated differently for different categories.

Channel information is also important, as there are active online buyers who would never buy certain items, such as apparel or home furnishing products, without physically touching them first. For example, even in the same categories, I would buy guitar strings or golf balls online. But I would not purchase a guitar or a driver without trying them out first. Now, when I say channel, I mean the channel that the customer used to make the purchase, not the channel through which the marketer chose to communicate with him. Channel information should be treated as a two-way street, as no marketer “owns” a customer through a particular channel (refer to “The Future of Online is Offline”).

As an exercise, let’s go back to the basic RFM data and create some actual variables. For “each” customer, we can start with basic RFM measures, as exhibited in the chart:

· Number of Transactions
· Total Dollar Amount
· Number of Days (or Weeks) since the Last Transaction
· Number of Days (or Weeks) since the First Transaction

Notice that the days are counted from today’s point of view (practically the day the database is updated), as the actual date’s significance changes as time goes by (e.g., a day in February would feel different when looked back on from April vs. November). “Recency” is a relative concept; therefore, we should relativize the time measurements to express it.
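A small sketch of that relativization, assuming the first and last transaction dates are already stored per customer. The function and field names are my own, and the as-of date stands in for the day the database is updated.

```python
from datetime import date

def time_measures(first_order_date, last_order_date, as_of):
    """Express stored dates as distances from the database update date."""
    days_since_last = (as_of - last_order_date).days
    days_since_first = (as_of - first_order_date).days
    return {
        "days_since_last": days_since_last,
        "weeks_since_last": days_since_last // 7,   # whole weeks only
        "days_since_first": days_since_first,
        "weeks_since_first": days_since_first // 7,
    }

# The same February purchase, looked back on from April vs. November.
february_order = date(2014, 2, 10)
seen_from_april = time_measures(february_order, february_order, as_of=date(2014, 4, 10))
seen_from_november = time_measures(february_order, february_order, as_of=date(2014, 11, 10))
print(seen_from_april["days_since_last"], seen_from_november["days_since_last"])
```

The stored date never changes, but the variable that goes into a model does, every time the database is refreshed.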

From these basic figures, we can derive other related variables, such as:

· Average Dollar Amount per Customer
· Average Dollar Amount per Transaction
· Average Dollar Amount per Year
· Lifetime Highest Amount per Item
· Lifetime Lowest Amount per Transaction
· Average Number of Days Between Transactions
· Etc., etc…
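As a rough sketch, several of the derived variables above can be computed from one customer’s transaction history like this. The names are hypothetical, and two choices are my assumptions rather than anything prescribed here: the yearly average is annualized over tenure with a one-year floor, and the highest and lowest amounts are taken per transaction because the sample data carry no item-level detail.

```python
from datetime import date

def derived_measures(transactions, as_of):
    """transactions: non-empty list of (order_date, amount) for ONE customer."""
    dates = sorted(d for d, _ in transactions)
    amounts = [a for _, a in transactions]
    n = len(transactions)
    total = sum(amounts)
    tenure_days = (as_of - dates[0]).days      # days since first transaction
    span_days = (dates[-1] - dates[0]).days    # first-to-last transaction span
    return {
        "avg_amount_per_transaction": total / n,
        # Assumption: annualize over tenure, never over less than one year.
        "avg_amount_per_year": total / max(tenure_days / 365.0, 1.0),
        "highest_transaction_amount": max(amounts),
        "lowest_transaction_amount": min(amounts),
        # Undefined for a single transaction, so return None in that case.
        "avg_days_between_transactions": span_days / (n - 1) if n > 1 else None,
    }

history = [
    (date(2013, 7, 3), 40.00),
    (date(2014, 1, 3), 60.00),
    (date(2014, 7, 3), 80.00),
]
measures = derived_measures(history, as_of=date(2014, 7, 3))
print(measures)
```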

Now, imagine you have all these measurements by channels, such as retail, Web, catalog, phone or mail-in, and separately by product categories. If you imagine a gigantic spreadsheet, the summarized table would have fewer rows, but a seemingly endless number of columns. I will discuss categorical and non-numeric variables in future articles. But for this exercise, let’s just imagine having these sets of variables for all major product categories. The result is that the recency factor now becomes more like “Weeks since Last Online Order,” not just any order. Frequency measurements would be more like “Number of Transactions in Dietary Supplement Category,” not just for any product. Monetary values can be expressed in “Average Spending Level in Outdoor Sports Category through Online Channel,” not just the customer’s average dollar amount, in general.
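Here is one way that gigantic spreadsheet might be sketched, with one row per customer and one column per measure-and-cut combination. The channel and category codes, the column naming scheme and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (customer_id, order_date, channel, category, amount)
orders = [
    ("A001", date(2014, 6, 1), "web", "outdoor_sports", 80.00),
    ("A001", date(2014, 6, 19), "web", "outdoor_sports", 40.00),
    ("A001", date(2014, 5, 1), "retail", "dietary_supplement", 25.00),
    ("B002", date(2014, 4, 3), "catalog", "dietary_supplement", 60.00),
]

def wide_summary(rows, as_of):
    """One dict per customer; one key per measure per channel/category cut."""
    cells = defaultdict(lambda: {"n": 0, "dollars": 0.0, "last": None})
    for cust, when, channel, category, amount in rows:
        # Each order counts toward its channel, its category, and the pair.
        for cut in (channel, category, f"{category}_{channel}"):
            cell = cells[(cust, cut)]
            cell["n"] += 1
            cell["dollars"] += amount
            if cell["last"] is None or when > cell["last"]:
                cell["last"] = when
    wide = defaultdict(dict)
    for (cust, cut), cell in cells.items():
        wide[cust][f"num_trans_{cut}"] = cell["n"]
        wide[cust][f"avg_amount_{cut}"] = cell["dollars"] / cell["n"]
        wide[cust][f"weeks_since_last_{cut}"] = (as_of - cell["last"]).days // 7
    return dict(wide)

wide = wide_summary(orders, as_of=date(2014, 7, 3))
print(wide["A001"]["weeks_since_last_web"])
print(wide["A001"]["avg_amount_outdoor_sports_web"])
```

With only two channels and two categories, customer A001 already carries more than a dozen columns, which gives a feel for how quickly the column count grows.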

Why stop there? We may slice and dice the data by offer type, customer status, payment method or time intervals (e.g., lifetime, 24-month, 48-month, etc.) as well. I am not saying that all the RFM variables should be cut out this way, but having “Number of Transactions by Payment Method,” for example, could be very revealing about the customer, as everybody uses multiple payment methods, while some may never use a debit card for a large purchase. All these little measurements become building blocks in predictive modeling. Now, too many variables can also be troublesome. And knowing the balance (i.e., knowing where to stop) comes from experience and preliminary analysis. That is when experts and analysts should be consulted for this type of uniform variable creation. Nevertheless, the point is that RFM variables are not just three simple measures that happen to be a part of the larger transaction data menu. And we didn’t even touch non-transaction-based behavioral elements, such as clicks, views, miles or minutes.

The Time Factor
So, if such data summarization is so useful for analytics and modeling, should we always include everything that has been collected since the inception of the database? The answer is yes and no. Sorry for being cryptic here, but it really depends on what your product is all about; how the buyers would relate to it; and what you, as a marketer, are trying to achieve. As for going back forever, there is a danger in that kind of data hoarding, as “Life-to-Date” data always favors tenured customers over new customers who have a relatively short history. In reality, many new customers may have more potential in terms of value than a tenured customer with lots of transaction records from a long time ago, but with no recent activity. That is why we need to create a level playing field in terms of time limit.

If a “Life-to-Date” summary is not ideal for predictive analytics, then where should you place the cutoff line? If you are selling cars or home furnishing products, you may need to look at a 4- to 5-year history. If your products are consumables with relatively short purchase cycles, then a 1-year examination would be enough. If your product is seasonal in nature (like gardening, vacation or heavily holiday-related items), then you may have to look at a minimum of two consecutive years of history to capture seasonal patterns. If you have mixed seasonality or longevity of products (e.g., selling golf balls and golf club sets through the same store or site), then you may have to summarize the data with multiple timelines, where the above metrics would be separated by 12 months, 24 months, 48 months, etc. If you have lifetime value models or any time-series models in the plan, then you may have to break the timeline down even more finely. Again, this is where you may need professional guidance, but marketers’ input is equally important.
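The multiple-timeline idea can be sketched as below, using the 12-, 24- and 48-month cuts mentioned above. The helper names and sample histories are hypothetical, and the month arithmetic is a simplification (it clamps the day of the month to 28 to avoid invalid dates).

```python
from datetime import date

WINDOWS_IN_MONTHS = (12, 24, 48)

def months_back(as_of, months):
    """Same day of month, `months` earlier; day clamped to 28 to stay valid."""
    total = as_of.year * 12 + (as_of.month - 1) - months
    return date(total // 12, total % 12 + 1, min(as_of.day, 28))

def windowed_summary(transactions, as_of):
    """transactions: list of (order_date, amount) for ONE customer."""
    out = {}
    for months in WINDOWS_IN_MONTHS:
        start = months_back(as_of, months)
        in_window = [amt for when, amt in transactions if start < when <= as_of]
        out[f"num_trans_{months}mo"] = len(in_window)
        out[f"total_dollars_{months}mo"] = sum(in_window)
    return out

as_of = date(2014, 7, 3)
# A tenured customer with no recent activity vs. a new, active customer.
tenured = [(date(2011, 1, 15), 500.00), (date(2011, 3, 20), 400.00), (date(2011, 9, 1), 300.00)]
newcomer = [(date(2014, 5, 1), 150.00), (date(2014, 6, 20), 100.00)]

tenured_summary = windowed_summary(tenured, as_of)
newcomer_summary = windowed_summary(newcomer, as_of)
print(tenured_summary)
print(newcomer_summary)
```

On a life-to-date basis the tenured customer looks far more valuable. In the 12-month window the newcomer is the only one with any activity, which is the level playing field described above.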

Analytical Sandbox
Lastly, who should be doing all of this data summary work? I talked about the concept of the “Analytical Sandbox,” where all types of data conversion, hygiene, transformation, categorization and summarization are done in a consistent manner, and analytical activities, such as sampling, profiling, modeling and scoring are done with proper toolsets like SAS, R or SPSS (refer to “It’s All About Ranking”). The short and final answer is this: Do not leave that to analysts or statisticians. They are the main players in that playground, not the architects or developers of it. If you are serious about employing analytics for your business, plan to build the Analytical Sandbox along with the team of analysts.

My goal as a database designer has always been serving the analysts and statisticians with “model-ready” datasets on silver platters. My promise to them has been that the modelers would spend no time fixing the data. Instead, they would be spending their valuable time thinking about the targets and statistical methodologies to fulfill the marketing goals. After all, answers that we seek come out of those mighty—but often elusive—algorithms, and the algorithms are made of data variables. So, in the interest of getting the proper answers fast, we must build lots of building blocks first. And no, simple RFM variables won’t cut it.