They say data is an asset. I say it, too. If collected data are wielded properly, they can definitely lead to financial gains, either through a revenue increase or cost reduction. But that doesn’t mean that possessing large amounts of data guarantees large dollar figures for the collector. Data governance matters, because the operative words in my statement are “wielded properly,” as I have been emphasizing for years through this column.
Plus, collecting data also comes with risks. When sensitive data fall into the wrong hands, the result is often a direct financial burden for the data collector. In some countries, an assumed guardian of sensitive data may face legal charges for mishandling them. Even in the United States, which is known as the “freest” country for businesses when it comes to data usage, a data breach or clear abuse of data can lead to a publicity nightmare for the organization; or worse, large legal settlements after long and costly litigation. Even in the most innocuous cases, mistreatment of sensitive data may lead to serious damage to the brand image.
The phrase is not even cool in the business community anymore, but “Big Data” worked like a magic word only a few years ago. In my opinion, that word “big” in Big Data misled many organizations and decision-makers. It basically gave a wrong notion that “big” is indeed “good” in the data business.
What is “good,” in a pure business sense? Simply, more money. What was the popular definition of Big Data back then? Three Vs, as in volume, velocity and variety. So, if varieties of data in large volumes move around really fast, will that automatically be good for businesses? We know the answer by now: a large amount of unstructured, unorganized and unrefined data could just be a burden to the holder, not to mention the security concerns listed earlier.
Unfortunately, with the popularity of Big Data and emergence of cloud computing, many organizations started to hoard data with a hope that collected data would turn into gold one day. Here, I am saying “hoarding” with all of the negative connotations that come with the word.
Hoarders are the people who are not able to throw away anything, even garbage. Data hoarders are the same way. Most datasets are huge because the collector does not know what to throw out. If you ask any hoarder why he keeps so many items in the house, the most common answer would be “because you never know when you need them.” Data hoarders keep every piece of data indefinitely for the same reason.
Only Keep Useful Data
But if you are playing with data for business purposes, you should know what pieces of data are useful for decision-making. The sponsor of any data activity must have clear objectives to begin with. Analysts would then find out what kind of data are necessary to meet those goals, through various statistical analyses and cumulative knowledge.
Actually, good analysts do know that not all data are created equal, and some are more useful than others. Why do you think that the notion of a Data Lake became popular following the Big Data hype? Further, I have been emphasizing the importance of an even more concise data environment (I call it an “Analytics Sandbox”), because the lake water in the Data Lake is still not drinkable. Data must get smaller through data refinement and analytics to be beneficial for decision-makers (refer to “Big Data Must Get Smaller”).
Nonetheless, organizations continue to hoard data, because no one wants to be responsible for purging data that may be useful someday. Government agencies may have some good reasons to maintain large amounts of data, because the cost of losing or misplacing data about some terrorist activities is too high. Even in that case, however, we should collectively be concerned if the most sensitive data about us, such as our biometric data, reside in some government agency’s server somewhere, without clear and immediate purposes. In cities like London or Paris, cameras are on every street corner, linked to facial recognition algorithms. We tolerate that because the benefit outweighs the risk (so we think). But that doesn’t mean that we don’t need to be concerned about data breaches or abuse.
Hoarding Data Gives Brands the Temptation to Be Creepy
If the data are collected by businesses for their financial gains, then the subjects of such data collection (i.e., consumers) should question who gave them the right to collect data about every breath we take, every move we make and every claim we stake. It is one thing to retain data about mutual transactions, but it is quite another to collect data on our movement or whereabouts, unilaterally. In other words, it is one thing to be remembered (for better service and recommendation in the future), but it is another to be stalked (remember “Every Breath You Take” is a song about a stalker).
Have you heard a story about a stalker who successfully courted the subject as a result of stalking? Why do marketers think that they will sell more of their products by stalking their customers and prospects? Since when did being totally creepy, as in “I know where you are and what you’re doing right now,” become an acceptable marketing tactic? (Refer to “Don’t Do It Just Because You Can.”)
In fact, even if you do possess such data, in the interest of “not” being creepy, you must make your message more innocuous. For example, don’t act like you are offering an item because you “know” that the target looked around similar items recently. That kind of creepy approach may work once in a while, but let’s not call that a good sales tactic.
Instead, sellers should make gentle nudges. Don’t say “I know you are looking for this particular skin care item.” The response to that would be “Who the hell are you, and how do you know that?” Instead, do say “Would you be interested in our new product for people with sensitive skin?” The desirable response would be “Hey, I was just looking for something like that!”
The difference between creepy stalking and gentle nudging is huge from the receiving end.
Through many articles about personalization, I have been emphasizing the use of model-based personas, as they pack so much information in the form of answers to questions and cover the gap of missing data (as we’d never know everything about everyone). If I may add one more benefit of modeling, it converts data into probabilities. Raw data are about “I know she is looking for a particular high-end skin care item,” where coverage of such data is seriously limited, anyway. Conversely, model scores are about “Her score for high-end beauty products is 8 on a 10-point scale,” even if we may not have concrete data about that specific interest.
Now, users who only have access to the model score (which is “dull” information, in comparison to “sharp” data about some verified behavior) would be less tempted to say “Oh, I know you did this.” Even for non-geeky types, the difference between “Is” and “Likely to be” is vast.
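To make the “dull” score idea concrete, here is a minimal sketch in Python. It assumes a model has already produced a propensity between 0 and 1 for each interest; the function names, the field names and the even bucketing into a 10-point scale are illustrative assumptions, not a prescribed method. (Many shops would use rank-based deciles instead.)

```python
from __future__ import annotations


def to_scale_score(probability: float, scale: int = 10) -> int:
    """Map a model probability in [0, 1] to an integer score from 1 to `scale`."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # Even-width buckets; clamp so that exactly 1.0 lands in the top bucket.
    return min(int(probability * scale) + 1, scale)


def marketer_view(customer_id: str, propensities: dict[str, float]) -> dict:
    """Expose only the dulled scores, never the raw behavior behind them.

    `propensities` is a hypothetical mapping of interest name to model output.
    """
    return {
        "customer_id": customer_id,
        "scores": {name: to_scale_score(p) for name, p in propensities.items()},
    }


if __name__ == "__main__":
    # A propensity of 0.78 becomes a score of 8 on the 10-point scale.
    print(marketer_view("c-001", {"high_end_beauty": 0.78}))
```

The point of the design is that the person writing the offer sees “likely to be,” not “is”: the browsing log that fed the model never reaches the campaign tool.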
If converting sharp data into innocuous probability scores through modeling is too much for you to start with, then at least categorize the data, and expose data points to users that way. Yes, we are living in the world of SKU-level product suggestion (like Amazon does), but as a consumer, have you ever “liked” such blunt suggestions, anyway? Marketers do it because such personalization does better than not doing anything at all, but such a practice is hardly ideal for many reasons (being creepy is one of them; refer to “Personalization Is About the Person”).
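Categorizing before exposing can be as simple as the following Python sketch. The SKU codes and the lookup table are made up for illustration; in practice, the mapping would come from your own product catalog.

```python
from __future__ import annotations

from collections import Counter

# Hypothetical SKU-to-category lookup; a real one would come from the product catalog.
SKU_TO_CATEGORY = {
    "SKU-1001": "skin care",
    "SKU-1002": "skin care",
    "SKU-2001": "hair care",
}


def category_interests(viewed_skus: list[str]) -> dict[str, int]:
    """Collapse SKU-level views into category-level counts.

    SKUs missing from the lookup are grouped under "other".
    """
    counts = Counter(SKU_TO_CATEGORY.get(sku, "other") for sku in viewed_skus)
    return dict(counts)


if __name__ == "__main__":
    print(category_interests(["SKU-1001", "SKU-1002", "SKU-2001"]))
```

Users of the output see “interested in skin care,” which is enough for a gentle nudge, without seeing exactly which item the person looked at.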
The saddest part in all this is that most marketers don’t even know how to fully utilize what they collected. I’ve seen too many organizations that are still stuck with using a few popular data variables repeatedly, while hoarding data indiscriminately. Why risk all of those privacy and security concerns, not to mention the data maintenance cost, if that is the case?
Have a Goal for All of That Data
If analytics is part of the process, then the analysts will tell you with conviction that you don’t need all those data points for certain types of prediction. For instance, why risk losing a bunch of credit card numbers, when the credit card type or payment method is all you need to predict responses and propensities on a customer level?
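As a rough illustration of keeping the card type and dropping the number, consider this Python sketch. The record layout and function names are hypothetical, and the prefix rules are deliberately simplified (a few well-known leading digits only), so treat it as a sketch rather than a complete card-brand detector.

```python
from __future__ import annotations


def card_type(card_number: str) -> str:
    """Guess the card brand from leading digits (simplified, not exhaustive)."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    if digits.startswith(("34", "37")):
        return "amex"
    if digits.startswith("4"):
        return "visa"
    if digits[:2].isdigit() and 51 <= int(digits[:2]) <= 55:
        return "mastercard"
    return "other"


def minimize_payment_record(record: dict) -> dict:
    """Return a copy of the record with the card number replaced by its type."""
    slim = {key: value for key, value in record.items() if key != "card_number"}
    slim["card_type"] = card_type(record.get("card_number", ""))
    return slim


if __name__ == "__main__":
    raw = {"customer_id": "c-001", "card_number": "4111 1111 1111 1111"}
    # The analytics table keeps the brand only; the number never leaves this step.
    print(minimize_payment_record(raw))
```

The model gets the variable it needs, and there is no card number sitting in the analytics environment to lose.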
Of course, the organization must first decide what types of models and predictions are necessary to meet its goals. But that is the beginning part of the whole analytics game, anyway. Analytics is not about catering to the wishful thinking of data hoarders; it should be a goal-oriented activity, with carefully selected and refined data for clear purposes.
A goal-oriented mindset is even more important in the age of machine learning and automation, because we should never automate bad behaviors. Imagine a powerful marketing automation engine in the hands of data hoarders. Forget about organizational inefficiency. As a consumer, don’t you get a chill down your spine just imagining how creepy the outcome would be? Well, maybe we don’t really have to imagine it, as we all get bombarded with ineffective and not-so-personal offers every day.
So, marketers, have clear purposes in data activities, and do not become mindless data hoarders. If you do possess data, wield them properly with analytics. And while you are at it, purge pieces of data that do not fit your goals. That “you never know” attitude really doesn’t help anyone. After all, you are supposed to know your own goals and what data and methodologies will get you there.