Customer Value: Narrowcasting vs. Broadcasting

Virtually every brand we’ve met with in the last few months is hungry for new customers: The war for the customer is on. For more on growing your customer base, see our recent article, “Bigger is Better: How to Scale Up Customer Acquisition Smarter.”

Many organizations are hooked on customer acquisition: to hit the sales plan, they need new customers in large numbers. For most brands, kicking the “acquisition addiction” is about as easy as kicking any other habit. Try going without coffee suddenly and see how your head feels; reducing a business’s dependence on customer acquisition as the means to hit revenue and profit targets is not very different.

Organizations that need ever-larger numbers of new customers to achieve growth goals will eventually find that the cost of acquiring each incremental net new customer becomes prohibitive.

Broadcast vs. Narrowcast
The traditional model for advertising and customer acquisition has essentially been a broadcast approach: reaching a large audience that broadly resembles the customer the brand believes to be a fit. Contrast this with what is sometimes described as a “narrowcasting” strategy. Narrowcasting uses customer intelligence to understand a large number of discrete dimensions of each consumer, and it leverages statistical methods to validate how accurately those dimensions predict and target the right customers.

The chart below, depicting the value of customers acquired through traditional broadcast campaigns both upfront and over time, helps illustrate why “broadcast” strategies for customer acquisition alone aren’t enough.

[Chart: value of customers acquired through a broadcast campaign, at acquisition and over time]

Broadcast Acquisition Strategies Lack Focus on Customer Value
A large number of customers has been acquired in the trailing 13-month window. The challenge is that this cohort was acquired without adequate consideration of the right target.

Consider that the target value for average-or-better customers is around $500. In the example above, the marketer has acquired a large number of customers who lag in their economic contribution to the business. The acquisition metrics may look good; this was a large campaign that produced several hundred thousand customers over its duration. But the average value of those customers is quite low.

Low Customer Value Manifests Itself, Even if Acquisition Volume Is High
When sales targets are rising, it becomes harder to justify the high cost of customer acquisition if the customers previously acquired are underperforming. This creates a very common bind for marketers: the only way to “make the number” is to acquire more and more.

The most competitive, highest-quality businesses acquire steadily and maintain a robust customer base whose economic contribution is materially higher. Consequently, profits are higher, and the business is fundamentally better.

Oftentimes, “broadcast” advertising approaches define the target with a single criterion like age, income or geography. This can be effective, especially when the media is bought at a good value. However, “effective” is almost always defined as “number of customers acquired.” This, of course, is a reasonable way to judge the performance of the marketing – at least by traditional standards.

There is another way to measure the success of the campaign that is only just beginning to be understood by many traditional “broadcast” marketers: customer value. The chart above shows that this cohort of acquired customers had relatively low economic value.

Root Causes of Low Customer Value
What are the causes of low value? It would be fair to start with the ongoing marketing and relationship with the customer. Bad service could keep customers from returning. Poor quality could lead to excessive returns. Over-promotion could drive down value. Getting the message and frequency wrong could lead to underperformance of the cohort. These are all viable reasons for lower value that need to be rationally and methodically ruled out prior to looking elsewhere.

Therefore, if operational issues are not evident – either through organizational KPI tracking, or simply by monitoring Twitter – then a marketing professional needs to start looking at three things:

  1. The Target (and Media)
  2. The Offer (and Message)
  3. The Creative

Given the target is historically responsible for up to 70 percent of the success of advertising, this is the first place a professional data-driven marketer would look.

Target Definition Determines the Customer You Acquire, and It Drives Customer Value
An often overlooked fact: target definition means not just focusing effort and advertising spend on the consumers who are most likely to convert and become customers; it also determines what kind of customers they have the potential to become.

In conversations with CMOs, we often discuss “the target customer” or the “ideal customer” they wish to introduce their brand to. The descriptions of course vary by brand and product, and those target definitions are often qualitative in nature. In fact, only about 30 percent of the CMOs we engage with regularly are focused on using hard data to define their customer base. While qualitative descriptors are helpful and create a vocabulary for discussing and defining who the customer is, they are often sculpted to align with media descriptors that make targeting “big and simple.”

“While simplifying is good business, when simplicity masks underlying business model challenges, a deeper look will ultimately be required, if not forced on the organization.”

While we would not dispute that those descriptors of a valued consumer have a place, they do fall short of true target definition. Ideally, the process of defining the customer a brand wishes to pursue must begin with a thorough inventory of the customers it already has, and a substantial enhancement of those customer records that provides rich metrics on affluence, age, ethnicity, urbanicity, purchasing behaviors, credit history, geographics and demographics, net worth, income, online purchasing, offline purchasing and potentially a great deal more.

Sex and the Schoolboy: Predictive Modeling – Who’s Doing It? Who’s Doing It Right?

Forgive the borrowed interest, but predictive modeling is to marketers as sex is to schoolboys. They’re all talking about it, but few are doing it. And among those who are, fewer are doing it right. In customer relationship marketing (CRM), predictive modeling uses data to predict the likelihood of a customer taking a specific action. It’s a three-step process:


1. Examine the characteristics of the customers who took a desired action

2. Compare them against the characteristics of customers who didn’t take that action

3. Determine which characteristics are most predictive of the customer taking the action and the value or degree to which each variable is predictive

Predictive modeling is useful in allocating CRM resources efficiently. If a model predicts that certain customers are less likely to respond to a specific offer, then fewer resources can be allocated to those customers, allowing more resources to be allocated to those who are more likely to respond.
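To make the three-step process above concrete, here is a minimal sketch in Python using scikit-learn. The data and field names (age, income, purchases_12m, responded) are entirely made up for illustration; it simply contrasts responders with non-responders, fits a logistic regression to see which characteristics are predictive and by how much, and then uses the resulting scores to focus resources on the likeliest responders.

    # Minimal sketch of the three-step process, on synthetic data.
    # Field names (age, income, purchases_12m, responded) are hypothetical.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(42)
    n = 5000
    customers = pd.DataFrame({
        "age": rng.integers(18, 80, n),
        "income": rng.normal(60000, 20000, n),
        "purchases_12m": rng.poisson(3, n),
    })
    # Steps 1 and 2: a known outcome separates those who took the action from those who didn't.
    logit = -4 + 0.02 * (customers["age"] - 45) + 0.4 * customers["purchases_12m"]
    customers["responded"] = rng.random(n) < 1 / (1 + np.exp(-logit))

    # Step 3: which characteristics are predictive, and to what degree?
    X = customers[["age", "income", "purchases_12m"]]
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(X, customers["responded"])
    weights = model.named_steps["logisticregression"].coef_[0]
    print(dict(zip(X.columns, weights)))        # direction and relative weight of each predictor

    # Allocate resources to those most likely to respond.
    customers["score"] = model.predict_proba(X)[:, 1]
    mail_list = customers.sort_values("score", ascending=False).head(1000)

In practice, of course, the outcome comes from actual campaign history rather than a simulated one; the mechanics are the same.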

Data Inputs
A predictive model will only be as good as the input data that’s used in the modeling process. You need the data that define the dependent variable; that is, the outcome the model is trying to predict (such as response to a particular offer). You’ll also need the data that define the independent variables, or the characteristics that will be predictive of the desired outcome (such as age, income, purchase history, etc.). Attitudinal and behavioral data may also be predictive, such as an expressed interest in weight loss, fitness, healthy eating, etc.

The more variables that are fed into the model at the beginning, the more likely the modeling process will identify relevant predictors. Modeling is an iterative process, and those variables that are not at all predictive will fall out in the early iterations, leaving those that are most predictive for more precise analysis in later iterations. The danger in not having enough independent variables to model is that the resultant model will only explain a portion of the desired outcome.

For example, a predictive model created to determine the factors affecting physician prescribing of a particular brand was inconclusive, because there weren’t enough independent variables to explain the outcome fully. In a standard regression analysis, the number of RXs written in a specific timeframe was set as the dependent variable. There were only three independent variables available: sales calls, physician samples and direct mail promotions to physicians. And while each of the three variables turned out to have a positive effect on prescriptions written, the R-squared value of the regression equation was only 0.44, meaning that these variables explained just 44 percent of the variance in RXs. The other 56 percent of the variance comes from factors that were not included in the model input.
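As a hedged illustration of how that variance-explained figure is read (synthetic numbers, not the actual study data), the sketch below regresses a simulated RX count on three promotion variables and prints the R-squared, which plays the role of the 0.44 above:

    # Sketch of the physician example with synthetic numbers; only the mechanics are real.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500
    sales_calls = rng.poisson(6, n)
    samples = rng.poisson(10, n)
    direct_mail = rng.poisson(4, n)
    unexplained = rng.normal(0, 8, n)                 # everything the three variables cannot capture
    rx_written = 5 + 1.2 * sales_calls + 0.5 * samples + 0.8 * direct_mail + unexplained

    X = sm.add_constant(np.column_stack([sales_calls, samples, direct_mail]))
    fit = sm.OLS(rx_written, X).fit()
    print(fit.params)     # each promotion variable shows a positive effect
    print(fit.rsquared)   # share of the variance in RXs explained by these inputs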

Sample Size
Larger samples will produce more robust models than smaller ones. Some modelers recommend a minimum data set of 10,000 records, 500 of those with the desired outcome. Others report acceptable results with as few as 100 records with the desired outcome. But in general, size matters.

Regardless, it is important to hold out a validation sample from the modeling process. That allows the model to be applied to the hold-out sample to validate its ability to predict the desired outcome.
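A minimal sketch of that practice, continuing the hypothetical customers table from the earlier example: hold out 30 percent of the records, fit on the rest, and check whether the model still ranks the holdout well.

    # Hold out a validation sample, fit on the remainder, then validate on the holdout.
    # Assumes the synthetic `customers` DataFrame from the earlier sketch.
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    predictors = ["age", "income", "purchases_12m"]
    train, holdout = train_test_split(
        customers, test_size=0.3, random_state=1, stratify=customers["responded"]
    )
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(train[predictors], train["responded"])

    holdout_scores = model.predict_proba(holdout[predictors])[:, 1]
    print(roc_auc_score(holdout["responded"], holdout_scores))   # how well it ranks unseen records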

Important First Steps

1. Define Your Outcome. What do you want the model to do for your business? Predict likelihood to opt-in? Predict likelihood to respond to a particular offer? Your objective will drive the data set that you need to define the dependent variable. For example, if you’re looking to predict likelihood to respond to a particular offer, you’ll need to have prospects who responded and prospects who didn’t in order to discriminate between them.

2. Gather the Data to Model. This requires tapping into several data sources, including your CRM database, as well as external sources where you can get data appended (see below).

3. Set the Timeframe. Determine the time period for the data you will analyze. For example, if you’re looking to model likelihood to respond, the start and end points for the data should be far enough in the past that you have a sufficient sample of responders and non-responders.

4. Examine Variables Individually. Some variables will not be correlated with the outcome, and these can be eliminated prior to building the model.
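For step 4, a quick univariate screen is often enough to spot variables with no relationship to the outcome. A minimal sketch, again assuming the hypothetical customers table from the earlier examples:

    # Examine candidate variables one at a time before modeling.
    # Variables with a negligible univariate association can be set aside early.
    candidate_vars = ["age", "income", "purchases_12m"]
    for var in candidate_vars:
        corr = customers[var].corr(customers["responded"].astype(float))
        print(f"{var}: correlation with response = {corr:.3f}")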

Data Sources
Independent variable data may include:

  • In-house database fields
  • Data overlays (demographics, HH income, lifestyle interests, presence of children, marital status, etc.) from a data provider such as Experian, Epsilon or Acxiom

Don’t Try This at Home
While you can do regression analysis in Microsoft Excel, if you’re going to invest a lot of promotion budget in the outcome, you should definitely leave the number crunching to the professionals. Expert modelers know how to analyze modeling results and make adjustments where necessary.

Missing Data Can Be Meaningful

No matter how big the Big Data gets, we will never know everything about everything. Well, according to the super-duper computer called “Deep Thought” in the movie “The Hitchhiker’s Guide to the Galaxy” (don’t bother to watch it if you don’t care for the British sense of humour), the answer to “The Ultimate Question of Life, the Universe, and Everything” is “42.” Coincidentally, that is also my favorite number to bet on (I have my reasons), but I highly doubt that even that huge fictitious computer with unlimited access to “everything” provided that numeric answer with conviction after 7½ million years of computing and checking. At best, that “42” is an estimated figure of a sort, based on some fancy algorithm. And in the movie, even Deep Thought pointed out that “the answer is meaningless, because the beings who instructed it never actually knew what the Question was.” Ha! Isn’t that what I have been saying all along? For any type of analytics to be meaningful, one must properly define the question first. And what to do with the answer that comes out of an algorithm is entirely up to us humans, or in the business world, the decision-makers. (Who are probably human.)

Analytics is about making the best of what we know. Good analysts do not wait for a perfect dataset (it will never come by, anyway). And businesspeople have no patience to wait for anything. Big Data is big because we digitize everything, and everything that is digitized is stored somewhere in forms of data. For example, even if we collect mobile device usage data from just pockets of the population with certain brands of mobile services in a particular area, the sheer size of the resultant dataset becomes really big, really fast. And most unstructured databases are designed to collect and store what is known. If you flip that around to see if you know every little behavior through mobile devices for “everyone,” you will be shocked to see how small the size of the population associated with meaningful data really is. Let’s imagine that we can describe human beings with 1,000 variables coming from all sorts of sources, out of 200 million people. How many would have even 10 percent of the 1,000 variables filled with some useful information? Not many, and definitely not 100 percent. Well, we have more data than ever in the history of mankind, but still not for every case for everyone.

In my previous columns, I pointed out that decision-making is about ranking different options, and that to rank anything properly, we must employ predictive analytics (refer to “It’s All About Ranking”). And for ranking based on the scores resulting from predictive models to be effective, the datasets must be summarized to the level that is to be ranked (e.g., individuals, households, companies, emails, etc.). That is why transaction or event-level datasets must be transformed into “buyer-centric” portraits before any modeling activity begins. Again, it is not about the transaction or the products; it is about the buyers, if you are doing all this to do business with people.

The trouble with buyer- or individual-centric databases is that such a transformation of the data structure creates lots of holes. Even if you have meticulously collected every transaction record that matters (and that will be the day), if someone did not buy a certain item, any variable that is created based on the purchase record of that particular item will have nothing to report for that person. Likewise, if you have a whole series of variables to differentiate online and offline channel behaviors, what would the online portion contain if the consumer in question never bought anything through the Web? Absolutely nothing. But in the business of predictive analytics, what did not happen is as important as what happened. Even a simple concept of “response” is only meaningful when compared to “non-response,” and the difference between the two groups becomes the basis for the “response” model algorithm.

Capturing the Meanings Behind Missing Data
Missing data are all around us. And there are many reasons why they are missing, too. It could be that there is nothing to report, as in aforementioned examples. Or, there could be errors in data collection—and there are lots of those, too. Maybe you don’t have access to certain pockets of data due to corporate, legal, confidentiality or privacy reasons. Or, maybe records did not match properly when you tried to merge disparate datasets or append external data. These things happen all the time. And, in fact, I have never seen any dataset without a missing value since I left school (and that was a long time ago). In school, the professors just made up fictitious datasets to emphasize certain phenomena as examples. In real life, databases have more holes than Swiss cheese. In marketing databases? Forget about it. We all make do with what we know, even in this day and age.

Then, let’s ask a philosophical question here:

  • If missing data are inevitable, what do we do about it?
  • How would we record them in databases?
  • Should we just leave them alone?
  • Or should we try to fill in the gaps?
  • If so, how?

The answer to all this is definitely not 42, but I’ll tell you this: Even missing data have meanings, and not all missing data are created equal, either.

Furthermore, missing data often contain interesting stories behind them. For example, certain demographic variables may be missing only for extremely wealthy people and very poor people, as their residency data are generally not exposed (for different reasons, of course). And that, in itself, is a story. Likewise, some data may be missing in certain geographic areas or for certain age groups. Collection of certain types of data may be illegal in some states. “Not” having any data on online shopping behavior or mobile activity may mean something interesting for your business, if we dig deeper into it without falling into the trap of predicting legal or corporate boundaries, instead of predicting consumer behaviors.

In terms of how to deal with missing data, let’s start with numeric data, such as dollars, days, counters, etc. Some numeric data simply may not be there, if there is no associated transaction to report. Now, if they are about “total dollar spending” and “number of transactions” in a certain category, for example, they can be initiated as zero and remain as zero in cases like this. The counter simply did not start clicking, and it can be reported as zero if nothing happened.

Some numbers are incalculable, though. If you are calculating “Average Amount per Online Transaction,” and if there is no online transaction for a particular customer, that is a situation for mathematical singularity—as we can’t divide anything by zero. In such cases, the average amount should be recorded as: “.”, blank, or any value that represents a pure missing value. But it should never be recorded as zero. And that is the key in dealing with missing numeric information; that zero should be reserved for real zeros, and nothing else.
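A small sketch of that rule, with hypothetical per-customer fields: counters that truly never started stay at zero, while a ratio with no underlying transactions is stored as a missing value, never as zero.

    # Zeros are reserved for real zeros; incalculable averages stay missing (NaN).
    import numpy as np
    import pandas as pd

    summary = pd.DataFrame({
        "customer_id": ["A", "B"],
        "online_orders": [4, 0],          # true zero: customer B never ordered online
        "online_dollars": [220.0, 0.0],   # true zero for the same reason
    })
    summary["avg_online_order"] = np.where(
        summary["online_orders"] > 0,
        summary["online_dollars"] / summary["online_orders"],
        np.nan,                           # 0/0 is not zero; record it as missing
    )
    print(summary)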

I have seen too many cases where missing numeric values are filled with zeros, and I must say that such a practice is definitely frowned-upon. If you have to pick just one takeaway from this article, that’s it. Like I emphasized, not all missing values are the same, and zero is not the way you record them. Zeros should never represent lack of information.

Take the example of a popular demographic variable, “Number of Children in the Household.” This is a very predictive variable—not just for purchase behavior of children’s products, but for many other things. Now, it is a simple number, but it should never be treated as a simple variable—as, in this case, lack of information is not evidence of non-existence. Let’s say that you are purchasing this data from a third-party data compiler (or a data broker). If you don’t see a positive number in that field, it could be because:

  1. The household in question really does not have a child;
  2. Even the data-collector doesn’t have the information; or
  3. The data collector has the information, but the household record did not match to the vendor’s record, for some reason.

If that field contains a number like 1, 2 or 3, that’s easy, as they will represent the number of children in that household. But the zero should be reserved for cases where the data collector has a positive confirmation that the household in question indeed does not have any children. If it is unknown, it should be marked as blank, or “.” (many statistical software packages, such as SAS, record missing values this way), or as “U” (though an alpha character should not be in a numeric field).

If it is a case of a non-match to the external data source, then there should be a separate indicator for it. The fact that the record did not match a professional data compiler’s list may mean something. And I’ve seen cases where such non-matching indicators are fed into model algorithms along with other valid data, as in the case where missing indicators of income display the same directional tendency as high-income households.
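A small sketch of that coding discipline, with made-up field names: zero is stored only for confirmed zeros, while “the vendor had no information” and “the record did not match the vendor” each get their own indicator.

    # Keep "no children," "unknown" and "non-match" distinguishable.
    import pandas as pd

    households = pd.DataFrame({
        "household_id": [101, 102, 103, 104],
        "matched_to_vendor": [True, True, True, False],
        "children_reported": [2, 0, None, None],   # None = no information from the vendor
    })
    households["num_children"] = households["children_reported"]   # 0 survives only as a confirmed zero
    households["children_unknown"] = (
        households["matched_to_vendor"] & households["children_reported"].isna()
    )
    households["vendor_nonmatch"] = ~households["matched_to_vendor"]   # may carry its own predictive signal
    print(households)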

Now, if the data compiler in question boldly inputs zeros for the cases of unknowns? Take a deep breath, fire the vendor, and don’t deal with the company again, as it is a sign that its representatives do not know what they are doing in the data business. I have done so in the past, and you can do it, too. (More on how to shop for external data in future articles.)

For non-numeric categorical data, similar rules apply. Some values could be truly “blank,” and those should be treated separately from “Unknown,” or “Not Available.” As a practice, let’s list all kinds of possible missing values in codes, texts or other character fields:

  • ” “—blank or “null”
  • “N/A,” “Not Available,” or “Not Applicable”
  • “Unknown”
  • “Other”—If it is originating from some type of multiple choice survey or pull-down menu
  • “Not Answered” or “Not Provided”—This indicates that the subjects were asked, but they refused to answer. Very different from “Unknown.”
  • “0”—In this case, the answer can be expressed in numbers. Again, only for known zeros.
  • “Non-match”—Not matched to other internal or external data sources
  • Etc.

It is entirely possible that all these values may be highly correlated to each other and move along the same predictive direction. However, there are many cases where they do not. And if they are combined into just one value, such as zero or blank, we will never be able to detect such nuances. In fact, I’ve seen many cases where one or more of these missing indicators move together with other “known” values in models. Again, missing data have meanings, too.
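A minimal sketch of keeping those codes separate, using a hypothetical survey field: each missing-value code becomes its own indicator, so the model can tell whether or not they move together.

    # One indicator per value, including the various "missing" codes.
    import pandas as pd

    survey = pd.DataFrame({
        "diet_pref": ["Vegetarian", "Unknown", "Not Answered", "Non-match", "Other", "Vegetarian"]
    })
    indicators = pd.get_dummies(survey["diet_pref"], prefix="diet")
    print(indicators.columns.tolist())
    # Collapsing Unknown, Not Answered and Non-match into one blank would erase those distinctions.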

Filling in the Gaps
Nonetheless, missing data do not have to be left as missing, blank or unknown all the time. With statistical modeling techniques, we can fill in the gaps with projected values. You didn’t think that all those data compilers really knew the income level of every household in the country, did you? It is not a big secret that many of those figures are modeled with other available data.

Such inferred statistics are everywhere. Popular variables, such as householder age, home owner/renter indicator, housing value, household income or—in the case of business data—the number of employees and sales volume contain modeled values. And there is nothing wrong with that, in the world where no one really knows everything about everything. If you understand the limitations of modeling techniques, it is quite alright to employ modeled values—which are much better alternatives to highly educated guesses—in decision-making processes. We just need to be a little careful, as models often fail to predict extreme values, such as household incomes over $500,000/year, or specific figures, such as incomes of $87,500. But “ranges” of household income, for example, can be predicted at a high confidence level, though it technically requires many separate algorithms and carefully constructed input variables in various phases. But such technicality is an issue that professional number crunchers should deal with, like in any other predictive businesses. Decision-makers should just be aware of the reality of real and inferred data.

Such imputation practices can be applied to any data source, not just compiled databases by professional data brokers. Statisticians often impute values when they encounter missing values, and there are many different methods of imputation. I haven’t met two statisticians who completely agree with each other when it comes to imputation methodologies, though. That is why it is important for an organization to have a unified rule for each variable regarding its imputation method (or lack thereof). When multiple analysts employ different methods, it often becomes the very source of inconsistent or erroneous results at the application stage. It is always more prudent to have the calculation done upfront, and store the inferred values in a consistent manner in the main database.
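A minimal sketch of such a unified rule, with hypothetical names: the imputation happens once, during the database update, and an indicator flag keeps real and inferred values distinguishable downstream.

    # One agreed-upon imputation rule, applied consistently at update time, with a flag.
    import pandas as pd

    def impute_income(df: pd.DataFrame, rule_value: float) -> pd.DataFrame:
        out = df.copy()
        out["income_inferred"] = out["household_income"].isna()     # remember which values are inferred
        out["household_income"] = out["household_income"].fillna(rule_value)
        return out

    records = pd.DataFrame({"household_income": [52000.0, None, 73000.0]})
    records = impute_income(records, rule_value=58000.0)             # the pre-agreed value (modeled, median, etc.)
    print(records)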

In terms of how that is done, there could be a long debate among the mathematical geeks. Will it be a simple average of non-missing values? If such a method is to be employed, what is the minimum required fill-rate of the variable in question? Surely, you do not want to project 95 percent of the population with 5 percent known values? Or will the missing values be replaced with modeled values, as in previous examples? If so, what would be the source of target data? What about potential biases that may exist because of data collection practices and their limitations? What should be the target definition? In what kind of ranges? Or should the target definition remain as a continuous figure? How would you differentiate modeled and real values in the database? Would you embed indicators for inferred values? Or would you forego such flags in the name of speed and convenience for users?

The important matter is not the rules or methodologies, but the consistency of them throughout the organization and the databases. That way, all users and analysts will have the same starting point, no matter what the analytical purposes are. There could be a long debate in terms of what methodology should be employed and deployed. But once the dust settles, all data fields should be treated by pre-determined rules during the database update processes, avoiding costly errors in the downstream. All too often, inconsistent imputation methods lead to inconsistent results.

If, by some chance, individual statisticians end up with freedom to come up with their own ways to fill in the blanks, then the model-scoring code in question must include missing value imputation algorithms without an exception, granted that such practice will elongate the model application processes and significantly increase chances for errors. It is also important that non-statistical users should be educated about the basics of missing data and associated imputation methods, so that everyone who has access to the database shares a common understanding of what they are dealing with. That list includes external data providers and partners, and it is strongly recommended that data dictionaries must include employed imputation rules wherever applicable.

Keep an Eye on the Missing Rate
Often, we find out that the missing rate of certain variables has gone out of control only because models become ineffective and campaigns start to yield disappointing results. Conversely, it can be stated that fluctuations in missing data ratios greatly affect the predictive power of models or any related statistical works. It goes without saying that a consistent influx of fresh data matters more than the construction and the quality of models and algorithms. It is a classic case of a garbage-in-garbage-out scenario, and that is why good data governance practices must include a time-series comparison of the missing rate of every critical variable in the database. If, all of a sudden, an important predictor’s fill-rate drops below a certain point, no analyst in this world can sustain the predictive power of the model algorithm, unless it is rebuilt with a whole new set of variables. The shelf life of models is definitely finite, but nothing deteriorates the effectiveness of models faster than inconsistent data. And a fluctuating missing rate is a good indicator of such an inconsistency.

Likewise, if the model score distribution starts to deviate from the original model curve from the development and validation samples, it is prudent to check the missing rate of every variable used in the model. Any sudden changes in model score distribution are a good indicator that something undesirable is going on in the database (more on model quality control in future columns).
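A small governance sketch along those lines, with hypothetical table and column names: one routine tracks the fill rate of critical variables at every update, and another compares the current score distribution against the development baseline.

    # Track missing rates and score-distribution drift as part of data governance.
    import pandas as pd

    def missing_rate_report(snapshot: pd.DataFrame, critical_vars: list) -> pd.Series:
        # Share of records with no value, per critical variable, for one database update.
        return snapshot[critical_vars].isna().mean()

    def score_drift(current_scores: pd.Series, baseline_scores: pd.Series) -> float:
        # Compare decile cut-points of current scores against the development/validation
        # baseline; a large gap suggests something changed upstream (often the fill rates).
        deciles = [i / 10 for i in range(1, 10)]
        return (current_scores.quantile(deciles) - baseline_scores.quantile(deciles)).abs().max()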

These few guidelines regarding the treatment of missing data will add more flavors to statistical models and analytics in general. In turn, proper handling of missing data will prolong the predictive power of models, as well. Missing data have hidden meanings, but they are revealed only when they are treated properly. And we need to do that until the day we get to know everything about everything. Unless you are just happy with that answer of “42.”

Beyond RFM Data

In the world of predictive analytics, the transaction data is the king of the hill. The master of the domain. The protector of the realm. Why? Because they are hands-down the most powerful predictors. If I may borrow the term that my mentor coined for our cooperative venture more than a decade ago (before anyone even uttered the word “Big Data”), “The past behavior is the best predictor of the future behavior.” Indeed. Back then, we had built a platform that nowadays could easily have qualified as Big Data. The platform predicted people’s future behaviors on a massive scale, and it worked really well, so I still stand by that statement.

How so? At the risk of sounding like a pompous mathematical smartypants (I’m really not), it is because people do not change that much, or if so, not so rapidly. Every move you make is on some predictive curve. What you’ve been buying, clicking, browsing, smelling or coveting somehow leads to the next move. Well, not all the time. (Maybe you just like to “look” at pretty shoes?) But with enough data, we can calculate with some confidence the probability that you would be an outdoors type, or a golfer, or a relaxing type on a cruise ship, or a risk-averse investor, or a wine enthusiast, or into fashion, or a passionate gardener, or a sci-fi geek, or a professional wrestling fan. Beyond the affinity scores listed here, we can predict the future value of each customer or prospect and possible attrition points, as well. And behind all those predictive models (and I have seen countless algorithms), the leading predictors are mostly transaction data, if you are lucky enough to get your hands on them. In the age of ubiquitous data and at the dawn of the “Internet of Things,” more marketers will be in that lucky group if they are diligent about data collection and refinement. Yes, in the near future, even a refrigerator will be able to order groceries, but don’t forget that only the collection mechanism will be different there. We still have to collect, refine and analyze the transaction data.

Last month, I talked about three major types of data (refer to “Big Data Must Get Smaller“), which are:
1. Descriptive Data
2. Behavioral Data (mostly Transaction Data)
3. Attitudinal Data

If you gain access to all three elements with decent coverage, you will have tremendous predictive power when it comes to human behaviors. Unfortunately, it is really difficult to accumulate attitudinal data on a large scale with individual-level details (i.e., knowing who’s behind all those sentiments). Behavioral data, mostly in forms of transaction data, are also not easy to collect and maintain (non-transaction behavioral data are even bigger and harder to handle), but I’d say it is definitely worth the effort, as most of what we call Big Data fall under this category. Conversely, one can just purchase descriptive data, which are what we generally call demographic or firmographic data, from data compilers or brokers. The sellers (there are many) will even do the data-append processing for you and they may also throw in a few free profile reports with it.

Now, when we start talking about the transaction data, many marketers will respond “Oh, you mean RFM data?” Well, that is not completely off-base, because “Recency, Frequency and Monetary” data certainly occupy important positions in the family of transaction data. But they hardly are the whole thing, and the term is misused as frequently as “Big Data.” Transaction data are so much more than simple RFM variables.

RFM Data Is Just a Good Start
The term RFM should be used more as a checklist for marketers, not as design guidelines—or limitations in many cases—for data professionals. How recently did this particular customer purchase our product, and how frequently did she do that and how much money did she spend with us? Answering these questions is a good start, but stopping there would seriously limit the potential of transaction data. Further, this line of questioning would lead the interrogation efforts to simple “filtering,” as in: “Select all customers who purchased anything with a price tag over $100 more than once in past 12 months.” Many data users may think that this query is somewhat complex, but it really is just a one-dimensional view of the universe. And unfortunately, no customer is one-dimensional. And this query is just one slice of truth from the marketer’s point of view, not the customer’s. If you want to get really deep, the view must be “buyer-centric,” not product-, channel-, division-, seller- or company-centric. And the database structure should reflect that view (refer to “It’s All About Ranking,” where the concept of “Analytical Sandbox” is introduced).

Transaction data by definition describe the transactions, not the buyers. If you would like to describe a buyer or if you are trying to predict the buyer’s future behavior, you need to convert the transaction data into “descriptors of the buyers” first. What is the difference? It is the same data looked at through a different window—front vs. side window—but the effect is huge.

Even if we think about just one simple transaction with one item, instead of describing the shopping basket as “transaction happened on July 3, 2014, containing Coldplay’s latest CD ‘Ghost Stories’ priced at $11.88,” a buyer-centric description would read: “A recent CD buyer in the rock genre with an average spending level in the music category under $20.” The trick is to describe the buyer, not the product or the transaction. If that customer has many orders and items in his purchase history (let’s say he downloaded a few songs to his portable devices, as well), the description of the buyer would become much richer. If you collect all of his past purchase history, it gets even more colorful, as in: “A recent music CD or MP3 buyer in rock, classical and jazz genres with 24-month purchases totaling 13 orders containing 16 items, with total spending in the $100-$150 range and an $11 average order size.” Of course you would store all this using many different variables (such as genre indicators, number of orders, number of items, total dollars spent during the past 24 months, average order amount and number of weeks since last purchase in the music category, etc.). But the point is that the story would come out this way when you change the perspective.

Creating a Buyer-Centric Portrait
The whole process of creating a buyer-centric portrait starts with data summarization (or de-normalization). A typical structure of the table (or database) that needs to capture every transaction detail, such as transaction date and amount, would require an entry for every transaction, and database designers call it the “normal” state. As I explained in my previous article (refer to “It’s All About Ranking”), if you would like to rank in terms of customer value, the data record must be on a customer level, as well. If you are ranking households or companies, you would then need to summarize the data on those levels, too.

Now, this summarization (or de-normalization) is not a process of eliminating duplicate entries of names, as you wouldn’t want to throw away any transaction details. If there are multiple orders per person, what is the total number of orders? What is the total amount of spending on an individual level? What would be average spending level per transaction, or per year? If you are allowed to have only one line of entry per person, how would you summarize the purchase dates, as you cannot just add them up? In that case, you can start with the first and last transaction date of each customer. Now, when you have the first and last transaction date for every customer, what would be the tenure of each customer and what would be the number of days since the last purchase? How many days, on average, are there in between orders then? Yes, all these figures are related to basic RFM metrics, but they are far more colorful this way.

The attached exhibit displays a very simple example of a before-and-after picture of such a summarization process. On the left-hand side, there resides a typical order table containing customer ID, order number, order date and transaction amount. If a customer has multiple orders in a given period, an equal number of lines are required to record the transaction details. In real life, other order-level information, such as payment method (very predictive, by the way), tax amount, discount or coupon amount and, if applicable, shipping amount would be on this table, as well.

On the right-hand side of the chart, you will find there is only one line per customer. As I mentioned in my previous columns, establishing consistent and accurate customer ID cannot be neglected—for this reason alone. How would you rely on the summary data if one person may have multiple IDs? The customer may have moved to a new address, or shopped from multiple stores or sites, or there could have been errors in data collections. Relying on email address is a big no-no, as we all carry many email addresses. That is why the first step of building a functional marketing database is to go through the data hygiene and consolidation process. (There are many data processing vendors and software packages for it.) Once a persistent customer (or individual) ID system is in place, you can add up the numbers to create customer-level statistics, such as total orders, total dollars, and first and last order dates, as you see in the chart.
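A minimal sketch of that summarization step in Python, mirroring the exhibit (column names are hypothetical): the order-level table on the left collapses to one row per customer ID, without losing the totals or the first and last order dates.

    # De-normalize an order-level table into one line per customer.
    import pandas as pd

    orders = pd.DataFrame({
        "customer_id": ["C001", "C001", "C002", "C001", "C003"],
        "order_date": pd.to_datetime(
            ["2014-01-05", "2014-03-12", "2014-02-20", "2014-06-30", "2014-04-02"]
        ),
        "order_amount": [35.00, 120.50, 15.99, 60.00, 240.00],
    })

    customer_level = orders.groupby("customer_id").agg(
        total_orders=("order_amount", "size"),
        total_dollars=("order_amount", "sum"),
        first_order=("order_date", "min"),
        last_order=("order_date", "max"),
    )
    customer_level["avg_order_amount"] = (
        customer_level["total_dollars"] / customer_level["total_orders"]
    )
    print(customer_level)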

Remember R, F, M, P and C
The real fun begins when you combine these numeric summary figures with product, channel and other important categorical variables. Because product (or service) and channel are the most distinctive dividers of customer behaviors, let’s just add P and C to the famous RFM (remember, we are using RFM just as a checklist here), and call it R, F, M, P and C.

Product (rather, product category) is an important separator, as people often show completely different spending behavior for different types of products. For example, you can send me fancy-shmancy fashion catalogs all you want, but I won’t look at them with an intention to purchase, as most men will look at the models and not what they are wearing. So my active purchase history in the sports, home electronics or music categories won’t mean anything in the fashion category. In other words, those so-called “hotline” names should be treated differently for different categories.

Channel information is also important, as there are active online buyers who would never buy certain items, such as apparel or home furnishing products, without physically touching them first. For example, even in the same categories, I would buy guitar strings or golf balls online. But I would not purchase a guitar or a driver without trying them out first. Now, when I say channel, I mean the channel that the customer used to make the purchase, not the channel through which the marketer chose to communicate with him. Channel information should be treated as a two-way street, as no marketer “owns” a customer through a particular channel (refer to “The Future of Online is Offline“).

As an exercise, let’s go back to the basic RFM data and create some actual variables. For “each” customer, we can start with basic RFM measures, as exhibited in the chart:

· Number of Transactions
· Total Dollar Amount
· Number of Days (or Weeks) since the Last Transaction
· Number of Days (or Weeks) since the First Transaction

Notice that the days are counted from today’s point of view (practically the day the database is updated), as the actual date’s significance changes as time goes by (e.g., a day in February would feel different when looked back on from April vs. November). “Recency” is a relative concept; therefore, we should relativize the time measurements to express it.

From these basic figures, we can derive other related variables, such as:

· Average Dollar Amount per Customer
· Average Dollar Amount per Transaction
· Average Dollar Amount per Year
· Lifetime Highest Amount per Item
· Lifetime Lowest Amount per Transaction
· Average Number of Days Between Transactions
· Etc., etc…

Now, imagine you have all these measurements by channels, such as retail, Web, catalog, phone or mail-in, and separately by product categories. If you imagine a gigantic spreadsheet, the summarized table would have fewer numbers of rows, but a seemingly endless number of columns. I will discuss categorical and non-numeric variables in future articles. But for this exercise, let’s just imagine having these sets of variables for all major product categories. The result is that the recency factor now becomes more like “Weeks since Last Online Order”—not just any order. Frequency measurements would be more like “Number of Transactions in Dietary Supplement Category”—not just for any product. Monetary values can be expressed in “Average Spending Level in Outdoor Sports Category through Online Channel”—not just the customer’s average dollar amount, in general.

Why stop there? We may slice and dice the data by offer type, customer status, payment method or time intervals (e.g., lifetime, 24-month, 48-month, etc.) as well. I am not saying that all the RFM variables should be cut out this way, but having “Number of Transactions by Payment Method,” for example, could be very revealing about the customer, as everybody uses multiple payment methods, while some may never use a debit card for a large purchase. All these little measurements become building blocks in predictive modeling. Now, too many variables can also be troublesome. And knowing the balance (i.e., knowing where to stop) comes from experience and preliminary analysis. That is when experts and analysts should be consulted for this type of uniform variable creation. Nevertheless, the point is that RFM variables are not just three simple measures that happen to be a part of the larger transaction data menu. And we didn’t even touch non-transaction-based behavioral elements, such as clicks, views, miles or minutes.
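To sketch what those channel- and category-specific measures might look like (continuing the hypothetical orders table from the earlier sketch, with a channel column added), the same summary can be computed per customer and channel and then pivoted wide, so that “weeks since last online order” becomes its own column:

    # R, F and M broken out by channel; the same pattern applies to product category,
    # payment method or any other categorical divider. Names are hypothetical.
    import pandas as pd

    orders = pd.DataFrame({
        "customer_id": ["C001", "C001", "C002", "C001"],
        "channel": ["web", "retail", "web", "web"],
        "order_date": pd.to_datetime(["2014-01-05", "2014-03-12", "2014-02-20", "2014-06-30"]),
        "order_amount": [35.00, 120.50, 15.99, 60.00],
    })
    asof = pd.Timestamp("2014-07-15")       # "today": the day the database is updated

    by_channel = orders.groupby(["customer_id", "channel"]).agg(
        transactions=("order_amount", "size"),
        total_dollars=("order_amount", "sum"),
        last_order=("order_date", "max"),
    )
    by_channel["weeks_since_last_order"] = (asof - by_channel["last_order"]).dt.days // 7
    wide = by_channel.drop(columns="last_order").unstack("channel")
    print(wide)     # e.g. ("weeks_since_last_order", "web") reads as weeks since last online order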

The Time Factor
So, if such data summarization is so useful for analytics and modeling, should we always include everything that has been collected since the inception of the database? The answer is yes and no. Sorry for being cryptic here, but it really depends on what your product is all about; how the buyers would relate to it; and what you, as a marketer, are trying to achieve. As for going back forever, there is a danger in that kind of data hoarding, as “Life-to-Date” data always favors tenured customers over new customers who have a relatively short history. In reality, many new customers may have more potential in terms of value than a tenured customer with lots of transaction records from a long time ago, but with no recent activity. That is why we need to create a level playing field in terms of time limit.

If a “Life-to-Date” summary is not ideal for predictive analytics, then where should you place the cutoff line? If you are selling cars or home furnishing products, we may need to look at a 4- to 5-year history. If your products are consumables with relatively short purchase cycles, then a 1-year examination would be enough. If your product is seasonal in nature—like gardening, vacation or heavily holiday-related items, then you may have to look at a minimum of two consecutive years of history to capture seasonal patterns. If you have mixed seasonality or longevity of products (e.g., selling golf balls and golf clubs sets through the same store or site), then you may have to summarize the data with multiple timelines, where the above metrics would be separated by 12 months, 24 months, 48 months, etc. If you have lifetime value models or any time-series models in the plan, then you may have to break the timeline down even more finely. Again, this is where you may need professional guidance, but marketers’ input is equally important.
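A small sketch of the multiple-timeline idea, reusing the hypothetical orders table from the earlier sketches: the same summary is computed over trailing 12-, 24- and 48-month windows, so new and tenured customers are compared on a more level playing field.

    # Summarize the same transactions over several trailing windows.
    import pandas as pd

    def window_summary(orders: pd.DataFrame, asof: pd.Timestamp, months: int) -> pd.DataFrame:
        cutoff = asof - pd.DateOffset(months=months)
        recent = orders[orders["order_date"] > cutoff]
        return recent.groupby("customer_id")["order_amount"].agg(
            **{f"orders_{months}m": "size", f"dollars_{months}m": "sum"}
        )

    asof = pd.Timestamp("2014-07-15")
    summaries = [window_summary(orders, asof, m) for m in (12, 24, 48)]
    customer_level = pd.concat(summaries, axis=1).fillna(0)   # no activity in a window is a true zero here
    print(customer_level)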

Analytical Sandbox
Lastly, who should be doing all of this data summary work? I talked about the concept of the “Analytical Sandbox,” where all types of data conversion, hygiene, transformation, categorization and summarization are done in a consistent manner, and analytical activities, such as sampling, profiling, modeling and scoring are done with proper toolsets like SAS, R or SPSS (refer to “It’s All About Ranking“). The short and final answer is this: Do not leave that to analysts or statisticians. They are the main players in that playground, not the architects or developers of it. If you are serious about employing analytics for your business, plan to build the Analytical Sandbox along with the team of analysts.

My goal as a database designer has always been serving the analysts and statisticians with “model-ready” datasets on silver platters. My promise to them has been that the modelers would spend no time fixing the data. Instead, they would be spending their valuable time thinking about the targets and statistical methodologies to fulfill the marketing goals. After all, answers that we seek come out of those mighty—but often elusive—algorithms, and the algorithms are made of data variables. So, in the interest of getting the proper answers fast, we must build lots of building blocks first. And no, simple RFM variables won’t cut it.

Big Data Must Get Smaller

Like many folks who worked in the data business for a long time, I don’t even like the words “Big Data.” Yeah, data is big now, I get it. But so what? Faster and bigger have been the theme in the computing business since the first calculator was invented. In fact, I don’t appreciate the common definition of Big Data that is often expressed in the three Vs: volume, velocity and variety. So, if any kind of data are big and fast, it’s all good? I don’t think so. If you have lots of “dumb” data all over the place, how does that help you? Well, as much as all the clutter that’s been piled on in your basement since 1971. It may yield some profit on an online auction site one day. Who knows? Maybe some collector will pay good money for some obscure Coltrane or Moody Blues albums that you never even touched since your last turntable (Ooh, what is that?) died on you. Those oversized album jackets were really cool though, weren’t they?

Seriously, the word “Big” only emphasizes the size element, and that is a sure way to miss the essence of the data business. And many folks are missing even that little point by calling all decision-making activities that involve even small-sized data “Big Data.” It is entirely possible that this data stuff seems all new to someone, but the data-based decision-making process has been with us for a very long time. If you use that “B” word to differentiate old-fashioned data analytics of yesteryear and the ridiculously large datasets of the present day, yes, that is a proper usage of it. But we all know most people do not mean it that way. One side benefit of this bloated and hyped-up buzzword is that data professionals like myself no longer have to spend 20 minutes explaining what we do for a living; we simply utter the words “Big Data,” though that is a lot like a grandmother declaring all her grandchildren work on computers for a living. Better yet, that magic “B” word sometimes opens doors to new business opportunities (or at least a chance to grab a microphone in non-data-related meetings and conferences) that data geeks of the past never dreamed of.

So, I guess it is not all that bad. But lest we forget, all hypes lead to overinvestments, all overinvestments lead to disappointments, and all disappointments lead to purging of related personnel and vendors that bear that hyped-up dirty word in their titles or division names. If this Big Data stuff does not yield significant profit (or reduction in cost), I am certain that those investment bubbles will burst soon enough. Yes, some data folks may be lucky enough to milk it for another two or three years, but brace for impact if all those collected data do not lead to some serious dollar signs. I know how much storage and processing costs have decreased in recent years, but they ain’t totally free, and related man-hours aren’t exactly cheap, either. Also, if this whole data business is a new concept to an organization, any money spent on the promise of Big Data easily becomes a liability for the reluctant bunch.

This is why I open up my speeches and lectures with this question: “Have you made any money with this Big Data stuff yet?” Surely, you didn’t spend all that money to provide faster toys and nicer playgrounds to IT folks? Maybe the head of IT had some fun with it, but let’s ask that question of CFOs, not CTOs, CIOs or CDOs. I know some colleagues (i.e., fellow data geeks) who are already thinking about a new name for this—“decision-making activities, based on data and analytics”—because many of us will still be doing that “data stuff” even after Big Data ceases to be cool on judgment day. Yeah, that Gangnam Style dance was fun for a while, but who still jumps around like a horse?

Now, if you ask me (though nobody has yet), I’d say Big Data should have been “Smart Data,” “Intelligent Data” or something to that extent. Because data must provide insights. Answers to questions. Guidance to decision-makers. To data professionals, piles of data—especially ones that are fragmented, unstructured and unformatted, no matter what kind of fancy names the operating system and underlying database technology may bear—are just a good start. For non-data-professionals, unrefined data—whether they are big or small—will remain distant and obscure. Offering mounds of raw data to end-users is like providing a painting kit when someone wants a picture on the wall. Bragging about the size of the data with impressive-sounding new measurements that end with “bytes” is like counting grains of rice in California in front of a hungry man.

Big Data must get smaller. People want yes/no answers to their specific questions. If such clarity is not possible, probability figures to such questions should be provided; as in, “There’s an 80 percent chance of thunderstorms on the day of the company golf outing,” “An above-average chance to close a deal with a certain prospect” or “Potential value of a customer who is repeatedly complaining about something on the phone.” It is about easy-to-understand answers to business questions, not a quintillion bytes of data stored in some obscure cloud somewhere. As I stated at the end of my last column, the Big Data movement should be about (1) Getting rid of the noise, and (2) Providing simple answers to decision-makers. And getting to such answers is indeed the process of making data smaller and smaller.

In my past columns, I talked about the benefits of statistical models in the age of Big Data, as they are the best way to compact big and complex information in forms of simple answers (refer to “Why Model?”). Models built to predict (or point out) who is more likely to be into outdoor sports, to be a risk-averse investor, to go on a cruise vacation, to be a member of a discount club, to buy children’s products, to be a big-time donor or to be a NASCAR fan, are all providing specific answers to specific questions, while each model score is a result of serious reduction of information, often compressing thousands of variables into one answer. That simplification process in itself provides incredible value to decision-makers, as most wouldn’t know where to cut out unnecessary information to answer specific questions. Using mathematical techniques, we can cut down the noise with conviction.

In model development, “Variable Reduction” is the first major step after the target variable is determined (refer to “The Art of Targeting“). It is often the most rigorous and laborious exercise in the whole model development process, where the characteristics of models are often determined as each statistician has his or her unique approach to it. Now, I am not about to initiate a debate about the best statistical method for variable reduction (I haven’t met two statisticians who completely agree with each other in terms of methodologies), but I happened to know that many effective statistical analysts separate variables in terms of data types and treat them differently. In other words, not all data variables are created equal. So, what are the major types of data that database designers and decision-makers (i.e., non-mathematical types) should be aware of?

In the business of predictive analytics for marketing, the following three types of data make up three dimensions of a target individual’s portrait:

  1. Descriptive Data
  2. Transaction Data / Behavioral Data
  3. Attitudinal Data

In other words, if we get to know all three aspects of a person, it will be much easier to predict what the person is about and/or what the person will do. Why do we need these three dimensions? If an individual has a high income and lives in a highly valued home (a demographic element, which is descriptive), and if he is an avid golfer (a behavioral element, often derived from his purchase history), can we just assume that he is politically conservative (an attitudinal element)? Well, not really, and not all the time. Sometimes we have to stop and ask what the person’s attitude and outlook on life are all about. Now, because it is not practical to ask everyone in the country about every subject, we often build models to predict the attitudinal aspect with available data. If you got a phone call from a political party that “assumed” your political stance, that incident was probably not random or accidental. As I have emphasized many times, analytics is about making the best of what is available, as there is no such thing as a complete dataset, even in this age of ubiquitous data. Nonetheless, these three dimensions of the data spectrum each occupy a unique and distinct place in the business of predictive analytics.

So, in the interest of obtaining, maintaining and utilizing all possible types of data (or, conversely, reducing the size of data with conviction by knowing what to ignore), let us dig a little deeper:

Descriptive Data
Generally, demographic data, such as people’s income, age, number of children, housing size, dwelling type, occupation, etc., fall under this category. For B-to-B applications, “firmographic” data, such as number of employees, sales volume, year started, industry type, etc., would be considered descriptive data. It is about what the targets “look like” and, generally, these data are frozen in the present time. Many prominent data compilers (or data brokers, as the U.S. government calls them) collect, compile and refine the data and make hundreds of variables available to users in various industry sectors. They also fill in the blanks using predictive modeling techniques. In other words, the compilers may not know the income range of every household, but using statistical techniques and other available data, such as age, home ownership, housing value and many other variables, they provide their best estimates where values are missing. People often have an allergic reaction to such data compilation practices, citing privacy concerns, but these types of data are not about looking up one person at a time; they are about analyzing and targeting groups (or segments) of individuals and households. In terms of predictive power, they are quite effective, and the results are very consistent. The best part is that most of the variables are available for every household in the country, whether they are actual or inferred.
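As a rough sketch of that “fill in the blanks” idea (the compilers’ actual methods are proprietary and far more sophisticated; the table and column names below are invented), missing values can be estimated from other descriptive variables:

```python
# Assumed illustration: infer missing income values from other descriptive
# variables with a simple model. All figures here are made up.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

households = pd.DataFrame({
    "age":        [42, 55, 31, 67, 48],
    "home_value": [310000, 540000, 220000, 450000, 380000],
    "income":     [85000, 140000, None, 110000, None],  # some values missing
})

known = households.dropna(subset=["income"])
unknown = households[households["income"].isna()]

model = RandomForestRegressor(random_state=0).fit(
    known[["age", "home_value"]], known["income"]
)

# Inferred (not actual) values stand in where the compiler has no data.
households.loc[unknown.index, "income"] = model.predict(
    unknown[["age", "home_value"]]
)
print(households)
```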

Other types of descriptive data include geo-demographic data, and the Census data from the U.S. Census Bureau fall under this category. These datasets are organized by geographic denominations, such as Census Block Group, Census Tract, County or ZIP Code Tabulation Area (ZCTA, much like postal ZIP codes, but not exactly the same). Although they are not available at an individual or household level, the Census data are very useful in predictive modeling, as every target record can be enhanced with them, even when name and address are not available, and the data themselves are very stable. The downside is that, while the datasets are free through the Census Bureau, the raw datasets contain more than 40,000 variables. Plus, due to budget cuts and changes in survey methods during the past decade, the sample size (yes, they sample) has decreased significantly, rendering some variables useless at lower geographic denominations, such as Census Block Group. There are professional data companies that have narrowed down the list of variables to manageable sizes (300 to 400 variables) and filled in the missing values. Because they are geo-level data, variables come in the form of percentages, averages or median values of elements, such as gender, race, age, language, occupation, education level, real estate value, etc. (as in percent male, percent Asian, percent white-collar professionals, average income, median school years, median rent, etc.).
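Mechanically, that enhancement is a simple join on a geographic key. Here is a minimal sketch with invented block-group codes and invented geo-level variables:

```python
# Assumed illustration of appending geo-level, Census-style variables to
# customer records by joining on a block-group code. All values are invented.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "block_group": ["360610112001", "360610113002", "360610112001"],
})

census_bg = pd.DataFrame({
    "block_group":         ["360610112001", "360610113002"],
    "pct_white_collar":    [0.61, 0.34],
    "median_home_value":   [525000, 310000],
    "median_school_years": [15.1, 12.8],
})

# Every record receives the neighborhood-level values of its block group.
enhanced = customers.merge(census_bg, on="block_group", how="left")
print(enhanced)
```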

There are many instances where marketers cannot pinpoint the identity of a person due to privacy issues or challenges in data collection, and the Census data play the role of an effective substitute for individual- or household-level demographic data. In predictive analytics, duller variables that are available nearly all the time are often more valuable than precise information with limited availability.

Transaction Data/Behavioral Data
While descriptive data are about what the targets look like, behavioral data are about what they actually did. Often, behavioral data come in the form of transactions, so many just call them transaction data. What marketers commonly refer to as RFM (Recency, Frequency and Monetary) data fall under this category. In terms of predictive power, they are truly at the top of the food chain. Yes, we can build models to guess who potential golfers are with demographic data, such as age, gender, income, occupation, housing value and other neighborhood-level information, but if you get to “know” that someone buys a box of golf balls every six weeks or so, why guess? Further, models built with transaction data can even predict the nature of future purchases, in terms of monetary value and frequency intervals. Unfortunately, many who have access to RFM data use them only for rudimentary filtering, as in “select everyone who spent more than $200 in a gift category during the past 12 months,” or something like that. But we can do so much more with rich transaction data at every stage of the marketing life cycle: prospecting, cultivating, retaining and winning back.
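For readers who want to see the mechanics, here is a minimal sketch of deriving RFM variables from a raw transaction log; the customer IDs, dates and amounts are made up:

```python
# Assumed illustration: build Recency, Frequency and Monetary variables
# from a toy transaction log.
import pandas as pd

transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "order_date": pd.to_datetime([
        "2013-01-05", "2013-06-20", "2013-03-15",
        "2013-04-02", "2013-06-28", "2012-12-01",
    ]),
    "amount": [45.0, 60.0, 200.0, 35.0, 75.0, 500.0],
})

as_of = pd.Timestamp("2013-07-01")  # the "as of" date for recency

rfm = transactions.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (as_of - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)
print(rfm)
```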

Other types of behavioral data include non-transaction data, such as click data, page views, abandoned shopping baskets or movement data. This type of behavioral data is getting a lot of attention, as it is truly “big.” The data were out of reach for many decision-makers before the emergence of new technologies to capture and store them. In terms of predictive power, nevertheless, they are not as powerful as real transaction data. Non-transaction data may provide directional guidance, as they are what some data geeks call “a-camera-on-everyone’s-shoulder” type of data. But we all know that there is a clear dividing line between people’s intentions and their commitments. And it can be very costly to follow every breath you take, every move you make, and every step you take. Due to their distinct characteristics, transaction data and non-transaction data must be managed separately. And if used together in models, they should be clearly labeled, so the analysts will never treat them the same way by accident. You really don’t want to mix intentions and commitments.
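One simple way to keep that labeling honest (my own convention for illustration, not a standard) is to prefix variable names by data type before the two sets are combined:

```python
# Assumed illustration: prefix transaction ("commitment") and clickstream
# ("intention") variables so they can never be confused in a modeling frame.
import pandas as pd

trans_vars = pd.DataFrame({"golf_spend_12m": [850, 0], "orders_12m": [6, 0]})
click_vars = pd.DataFrame({"golf_page_views_30d": [12, 40],
                           "cart_abandons_30d": [1, 3]})

trans_vars = trans_vars.add_prefix("txn_")   # committed behavior
click_vars = click_vars.add_prefix("clk_")   # expressed intention

modeling_frame = pd.concat([trans_vars, click_vars], axis=1)
print(modeling_frame.columns.tolist())
```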

The trouble with behavioral data is that (1) they are difficult to compile and manage, (2) they get big, sometimes really big, (3) they are generally confined within divisions or companies, and (4) they are not easy to analyze. In fact, most of the examples that I have used in this series are about transaction data. Now, No. 3 here can be really troublesome, as it equates to availability (or lack thereof). Yes, you may know everything that happened with your customers, but do you know where else they are shopping? Fortunately, there are co-op companies that can answer that question, as they compile transaction data across multiple merchants and sources. And combined data can be exponentially more powerful than data in silos. Now, because transaction data are not always available for every person in a database, analysts often combine behavioral data and descriptive data in their models. Transaction data usually become the dominant predictors in such cases, while descriptive data play supporting roles, filling in the gaps and smoothing out the predictive curves.

As I have stated repeatedly, predictive analytics in marketing is all about finding out (1) whom to engage, and (2) if you decide to engage someone, what to offer to that person. Using carefully collected transaction data for most of their customers, some supermarket chains have achieved 100 percent customization rates for their coupon books. That means no two coupon books are exactly the same, which is quite an impressive accomplishment. That is transaction data in action, and it is a great example of “Big Data” (or rather, “Smart Data”).

Attitudinal Data
In the past, attitudinal data came from surveys, primary research and focus groups. Now, basically all social media channels function as gigantic focus groups. Through virtual places such as Facebook, Twitter and other social media networks, people freely volunteer what they think and feel about certain products and services, and many marketers are learning how to “listen” to them. Sentiment analysis falls under this category of analytics, and many automatically think of this type of analytics when they hear “Big Data.”
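As a toy sketch of the simplest form of sentiment scoring, consider a lexicon-based count of positive and negative words (real sentiment tools are far more nuanced; the word lists below are illustrative only):

```python
# Assumed illustration: score free-form posts by counting lexicon hits.
POSITIVE = {"love", "great", "impressive", "recommend"}
NEGATIVE = {"hate", "terrible", "broken", "refund"}

def sentiment_score(text: str) -> int:
    """Positive-word count minus negative-word count."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

posts = [
    "Love the new dishwasher and its great cycle times",
    "Terrible support and I want a refund",
]
for p in posts:
    print(sentiment_score(p), p)
```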

The trouble with social data is:

  1. We often do not know who’s behind the statements in question, and
  2. They are in silos, and it is not easy to combine such data with transaction or demographic data, as their sources usually cannot be identified.

Yes, we can see that a certain political candidate is trending high after an impressive speech, but how would we connect that piece of information to the people who will actually donate money to the candidate’s causes? If we can find out “where” the target is via an IP address and related ZIP codes, we may be able to connect the voter to geo-demographic data, such as the Census. But, generally, personally identifiable information (PII) is accessible only to the data compilers, if they even bothered to collect it.

Therefore, most such studies are at a macro level, citing trends and directions, and the analysts in that field are quite different from the micro-level analysts who deal with behavioral and descriptive data. Now, the former provide important insights regarding the “why” part of the equation, which is often the hardest thing to predict, while the latter provide answers to “who, what, where and when.” (“Who” is the easiest to answer, and “when” is the hardest.) The “why” part may dictate the product-development side of the decision-making process at the conceptual stage (as in, “Why would customers care for a new type of dishwasher?”), while “who, what, where and when” are more about selling the developed products (as in, “Let’s sell those dishwashers in the most effective ways.”). So, it can be argued that these different types of data call for different types of analytics at different cycles of the decision-making process.

Obviously, there are more types of data out there. But for marketing applications dealing with humans, these three types of data complete the buyers’ portraits. Now, depending on what marketers are trying to do with the data, they can prioritize where to invest first and what to ignore (for now). If they are early in the marketing cycle, trying to develop a new product for the future, they need to understand why people want something and behave in certain ways. If signing up as many new customers as possible is the immediate goal, finding out who and where the ideal prospects are becomes the most pressing task. If maximizing customer value is the ongoing objective, then you’d better start analyzing transaction data more seriously. If preventing attrition is the goal, then you will have to line up the transaction data in time-series format for further analysis.
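As a minimal sketch of what “lining up transaction data in time-series format” might look like (my reading of the idea, with invented data), each customer becomes one row with one column per month, so a tapering spending pattern can be spotted or modeled:

```python
# Assumed illustration: pivot a toy transaction log into monthly spend per
# customer for attrition-style analysis.
import pandas as pd

transactions = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2],
    "order_date": pd.to_datetime([
        "2013-01-10", "2013-02-14", "2013-05-01", "2013-03-03", "2013-03-28",
    ]),
    "amount": [120.0, 80.0, 40.0, 300.0, 150.0],
})

monthly = (
    transactions
    .assign(month=transactions["order_date"].dt.to_period("M"))
    .pivot_table(index="customer_id", columns="month",
                 values="amount", aggfunc="sum", fill_value=0)
)
print(monthly)  # one row per customer, one column per month
```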

Business goals must dictate the analytics, the analytics call for specific types of data to meet those goals, and the supporting datasets should be in “analytics-ready” formats. Not the other way around, where the business is constrained by the limitations of its analytics, and the analytics are hampered by inadequate, cluttered data. That business-first hierarchy should be the main theme of effective data management, and with clear goals and a proper data strategy you will know, as a decision-maker and not necessarily as a mathematical analyst, where to invest first and what data to ignore. That is the first step toward making Big Data smaller. Don’t be impressed by the size of the data alone; sheer volume often blurs the big picture, and not all data are created equal.