Smart Data – Not Big Data

As a concerned data professional, I am already plotting an exit strategy from this Big Data hype. Because like any bubble, it will surely burst. That inevitable doomsday could be a couple of years away, but I can feel it coming. At the risk of sounding too much like Yoda the Jedi Grand Master, all hypes lead to over-investments, all over-investments lead to disappointments, and all disappointments lead to blames. Yes, in a few years, lots of blames will go around, and lots of heads will roll.

As a concerned data professional, I am already plotting an exit strategy from this Big Data hype. Because like any bubble, it will surely burst. That inevitable doomsday could be a couple of years away, but I can feel it coming. At the risk of sounding too much like Yoda the Jedi Grand Master, all hypes lead to over-investments, all over-investments lead to disappointments, and all disappointments lead to blames. Yes, in a few years, lots of blames will go around, and lots of heads will roll.

So, why would I stay on the troubled side? Well, because, for now, this Big Data thing is creating lots of opportunities, too. I am writing this on my way back from Seoul, Korea, where I presented this Big Data idea nine times in just two short weeks, trotting from large venues to small gatherings. Just a few years back, I used to have a hard time explaining what I do for living. Now, I just have to say “Hey, I do this Big Data thing,” and the doors start to open. In my experience, this is the best “Open Sesame” moment for all data specialists. But it will last only if we play it right.

Nonetheless, I also know that I will somehow continue to make living setting data strategies, fixing bad data, designing databases and leading analytical activities, even after the hype cools down. Just with a different title, under a different banner. I’ve seen buzzwords come and go, and this data business has been carried on by the people who cut through each hype (and gargantuan amount of BS along with it) and create real revenue-generating opportunities. At the end of the day (I apologize for using this cliché), it is all about the bottom line, whether it comes from a revenue increase or cost reduction. It is never about the buzzwords that may have created the business opportunities in the first place; it has always been more about the substance that turned those opportunities into money-making machines. And substance needs no fancy title or buzzwords attached to it.

Have you heard Google or Amazon calling themselves a “Big Data” companies? They are the ones with sick amounts of data, but they also know that it is not about the sheer amount of data, but it is all about the user experience. “Wannabes” who are not able to understand the core values often hang onto buzzwords and hypes. As if Big Data, Cloud Computing or coding language du jour will come and save the day. But they are just words.

Even the name “Big Data” is all wrong, as it implies that bigger is always better. The 3 Vs of Big Data—volume, velocity and variety—are also misleading. That could be a meaningful distinction for existing data players, but for decision-makers, it gives a notion that size and speed are the ultimate quest. But for the users, small is better. They don’t have time to analyze big sets of data. They need small answers in fun size packages. Plus, why is big and fast new? Since the invention of modern computers, has there been any year when the processing speed did not get faster and storage capacity did not get bigger?

Lest we forget, it is the software industry that came up with this Big Data thing. It was created as a marketing tagline. We should have read it as, “Yes, we can now process really large amounts of data, too,” not as, “Big Data will make all your dreams come true.” If you are in the business of selling toolsets, of course, that is how you present your product. If guitar companies keep emphasizing how hard it is to be a decent guitar player, would that help their businesses? It is a lot more effective to say, “Hey, this is the same guitar that your guitar hero plays!” But you don’t become Jeff Beck just because you bought a white Fender Stratocaster with a rosewood neck. The real hard work begins “after” you purchase a decent guitar. However, this obvious connection is often lost in the data business. Toolsets never provide solutions on their own. They may make your life easier, but you’d still have to formulate the question in a logical fashion, and still have to make decisions based on provided data. And harnessing meanings out of mounds of data requires training of your mind, much like the way musicians practice incessantly.

So, before business people even consider venturing into this Big Data hype, they should ask themselves “Why data?” What are burning questions that you are trying to solve with the data? If you can’t answer this simple question, then don’t jump into it. Forget about it. Don’t get into it just because everyone else seems to be getting into it. Yeah, it’s a big party, but why are you going there? Besides, if you formulate the question properly, often you will find that you don’t need Big Data all the time. If fact, Big Data can be a terrible detour if your question can be answered by “small” data. But that happens all the time, because people approach their business questions through the processes set by the toolsets. Big Data should be about the business, not about the IT or data.

Smart Data, Not Big Data
So, how do we get over this hype? All too often, perception rules, and a replacement word becomes necessary to summarize the essence of the concept for the general public. In my opinion, “Big Data” should have been “Smart Data.” Piles of unorganized dumb data aren’t worth a damn thing. Imagine a warehouse full of boxes with no labels, collecting dust since 1943. Would you be impressed with the sheer size of the warehouse? Great, the ark that Indiana Jones procured (or did he?) may be stored in there somewhere. But if no one knows where it is—or even if it can be located, if no one knows what to do with it—who cares?

Then, how do data get smarter? Smart data are bite-sized answers to questions. A thousand variables could have been considered to provide the weather forecast that calls for a “70 percent chance of scattered showers in the afternoon,” but that one line that we hear is the smart piece of data. Not the list of all the variables that went into the formula that created that answer. Emphasizing the raw data would be like giving paints and brushes to a person who wants a picture on the wall. As in, “Hey, here are all the ingredients, so why don’t you paint the picture and hang it on the wall?” Unfortunately, that is how the Big Data movement looks now. And too often, even the ingredients aren’t all that great.

I visit many companies only to find that the databases in question are just messy piles of unorganized and unstructured data. And please do not assume that such disarrays are good for my business. I’d rather spend my time harnessing meanings out of data and creating values, not taking care of someone else’s mess all the time. Really smart data are small, concise, clean and organized. Big Data should only be seen in “Behind the Scenes” types of documentaries for manias, not for everyday decision-makers.

I have been already saying that Big Data must get smaller for some time (refer to “Big Data Must Get Smaller“) and I would repeat it until it becomes a movement on its own. The Big Data movement must be about:

  1. Cutting down the noise
  2. Providing the answers

There is too much noise in the data, and cutting it out is the first step toward making the data smaller and smarter. The trouble is that the definition of “noise” is not static. Rock music that I grew up with was certainly a noise to my parents’ generation. In turn, some music that my kids listen to is pure noise to me. Likewise, “product color,” which is essential for a database designed for an inventory management system, may or may not be noise if the goal is to sell more apparel items. In such cases, more important variables could be style, brand, price range, target gender, etc., but color could be just peripheral information at best, or even noise (as in, “Uh, she isn’t going to buy just red shoes all the time?”). How do we then determine the differences? First, set the clear goals (as in, “Why are we playing with the data to begin with?”), define the goals using logical expressions, and let mathematics take care of it. Now you can drop the noise with conviction (even if it may look important to human minds).

If we continue with that mathematical path, we would reach the second part, which is “providing answers to the question.” And the smart answers are in the forms of yes/no, probability figures or some type of scores. Like in the weather forecast example, the question would be “chance of rain on a certain day” and the answer would be “70 percent.” Statistical modeling is not easy or simple, but it is the essential part of making the data smarter, as models are the most effective way to summarize complex and abundant data into compact forms (refer to “Why Model?”).

Most people do not have degrees in mathematics or statistics, but they all know what to do with a piece of information such as “70 percent chance of rain” on the day of a company outing. Some may complain that it is not a definite yes/no answer, but all would agree that providing information in this form is more humane than dumping all the raw data onto users. Sales folks are not necessarily mathematicians, but they would certainly appreciate scores attached to each lead, as in “more or less likely to close.” No, that is not a definite answer, but now sales people can start calling the leads in the order of relative importance to them.

So, all the Big Data players and data scientists must try to “humanize” the data, instead of bragging about the size of the data, making things more complex, and providing irrelevant pieces of raw data to users. Make things simpler, not more complex. Some may think that complexity is their job security, but I strongly disagree. That is a sure way to bring down this Big Data movement to the ground. We are already living in a complex world, and we certainly do not need more complications around us (more on “How to be a good data scientist” in a future article).

It’s About the Users, Too
On the flip side, the decision-makers must change their attitude about the data, as well.

1. Define the goals first: The main theme of this series has been that the Big Data movement is about the business, not IT or data. But I’ve seen too many business folks who would so willingly take a hands-off approach to data. They just fund the database; do not define clear business goals to developers; and hope to God that someday, somehow, some genius will show up and clear up the mess for them. Guess what? That cavalry is never coming if you are not even praying properly. If you do not know what problems you want to solve with data, don’t even get started; you will get to nowhere really slowly, bleeding lots of money and time along the way.

2. Take the data seriously: You don’t have to be a scientist to have a scientific mind. It is not ideal if someone blindly subscribes anything computers spew out (there are lots of inaccurate information in databases; refer to “Not All Databases Are Created Equal.”). But too many people do not take data seriously and continue to follow their gut feelings. Even if your customer profile coming out of a serious analysis does not match with your preconceived notions, do not blindly reject it; instead, treat it as a newly found gold mine. Gut feelings are even more overrated than Big Data.

3. Be logical: Illogical questions do not lead anywhere. There is no toolset that reads minds—at least not yet. Even if we get to have such amazing computers—as seen on “Star Trek” or in other science fiction movies—you would still have to ask questions in a logical fashion for them to be effective. I am not asking decision-makers to learn how to code (or be like Mr. Spock or his loyal follower, Dr. Sheldon Cooper), but to have some basic understanding of logical expressions and try to learn how analysts communicate with computers. This is not data geek vs. non-geek world anymore; we all have to be a little geekier. Knowing Boolean expressions may not be as cool as being able to throw a curve ball, but it is necessary to survive in the age of information overload.

4. Shoot for small successes: Start with a small proof of concept before fully investing in large data initiatives. Even with a small project, one gets to touch all necessary steps to finish the job. Understanding the flow of information is as important as each specific step, as most breakdowns occur in between steps, due to lack of proper connections. There was Gemini program before Apollo missions. Learn how to dock spaceships in space before plotting the chart to the moon. Often, over-investments are committed when the discussion is led by IT. Outsource even major components in the beginning, as the initial goal should be mastering the flow of things.

5. Be buyer-centric: No customer is bound by the channel of the marketer’s choice, and yet, may businesses act exactly that way. No one is an online person just because she did not refuse your email promotions yet (refer to “The Future of Online is Offline“). No buyer is just one dimensional. So get out of brand-, division-, product- or channel-centric mindsets. Even well-designed, buyer-centric marketing databases become ineffective if users are trapped in their channel- or division-centric attitudes, as in “These email promotions must flow!” or “I own this product line!” The more data we collect, the more chances marketers will gain to impress their customers and prospects. Do not waste those opportunities by imposing your own myopic views on them. Big Data movement is not there to fortify marketers’ bad habits. Thanks to the size of the data and speed of machines, we are now capable of disappointing a lot of people really fast.

What Did This Hype Change?
So, what did this Big Data hype change? First off, it changed people’s attitudes about the data. Some are no longer afraid of large amounts of information being thrown at them, and some actually started using them in their decision-making processes. Many realized that we are surrounded by numbers everywhere, not just in marketing, but also in politics, media, national security, health care and the criminal justice system.

Conversely, some people became more afraid—often with good reasons. But even more often, people react based on pure fear that their personal information is being actively exploited without their consent. While data geeks are rejoicing in the age of open source and cloud computing, many more are looking at this hype with deep suspicions, and they boldly reject storing any personal data in those obscure “clouds.” There are some people who don’t even sign up for EZ Pass and voluntarily stay on the long lane to pay tolls in the old, but untraceable way.

Nevertheless, not all is lost in this hype. The data got really big, and types of data that were previously unavailable, such as mobile and social data, became available to many marketers. Focus groups are now the size of Twitter followers of the company or a subject matter. The collection rate of POS (point of service) data has been increasingly steady, and some data players became virtuosi in using such fresh and abundant data to impress their customers (though some crossed that “creepy” line inadvertently). Different types of data are being used together now, and such merging activities will compound the predictive power even further. Analysts are dealing with less missing data, though no dataset would ever be totally complete. Developers in open source environments are now able to move really fast with new toolsets that would just run on any device. Simply, things that our forefathers of direct marketing used to take six months to complete can be done in few hours, and in the near future, maybe within a few seconds.

And that may be a good thing and a bad thing. If we do this right, without creating too many angry consumers and without burning holes in our budgets, we are currently in a position to achieve great many things in terms of predicting the future and making everyone’s lives a little more convenient. If we screw it up badly, we will end up creating lots of angry customers by abusing sensitive data and, at the same time, wasting a whole lot of investors’ money. Then this Big Data thing will go down in history as a great money-eating hype.

We should never do things just because we can; data is a powerful tool that can hurt real people. Do not even get into it if you don’t have a clear goal in terms of what to do with the data; it is not some piece of furniture that you buy just because your neighbor bought it. Living with data is a lifestyle change, and it requires a long-term commitment; it is not some fad that you try once and give up. It is a continuous loop where people’s responses to marketer’s data-based activities create even more data to be analyzed. And that is the only way it keeps getting better.

There Is No Big Data
And all that has nothing to do with “Big.” If done right, small data can do plenty. And in fact, most companies’ transaction data for the past few years would easily fit in an iPhone. It is about what to do with the data, and that goal must be set from a business point of view. This is not just a new playground for data geeks, who may care more for new hip technologies that sound cool in their little circle.

I recently went to Brazil to speak at a data conference called QIBRAS, and I was pleasantly surprised that the main theme of it was the quality of the data, not the size of the data. Well, at least somewhere in the world, people are approaching this whole thing without the “Big” hype. And if you look around, you will not find any successful data players calling this thing “Big Data.” They just deal with small and large data as part of their businesses. There is no buzzword, fanfare or a big banner there. Because when something is just part of your everyday business, you don’t even care what you call it. You just do. And to those masters of data, there is no Big Data. If Google all of a sudden starts calling itself a Big Data company, it would be so uncool, as that word would seriously limit it. Think about that.

‘Forgotten’ Unsubscribes – Is This a New Trend?

With Black Friday now behind me, I ran a quick count and found 131 emails sent by retailers with whom I had unsubscribed. I was more than a little surprised to have received this many emails and wondered: Are these retailers counting on me having forgotten I had unsubscribed? Is this a new trend?

With Black Friday now behind me, I ran a quick count and found 131 emails sent by retailers with whom I had unsubscribed. I was more than a little surprised to have received this many emails and wondered: Are these retailers counting on me having forgotten I had unsubscribed? Is this a new trend?

The CAN-SPAM Act is very clear on the issue of how businesses should present and handle unsubscribes. It reads in part, you cannot charge a fee, require the recipient to give any personally identifying information beyond an email address, or make the recipient take any step other than sending a reply email or visiting a single page on a website as a condition for honoring an opt-out request.” In other words, it should be easy and it should be permanent. The retailers who have sent me an email in the last few days have done far more damage than good – though I admit, my diligence in tracking unsubscribes goes well beyond that of the typical subscriber—most people probably do forget having unsubscribed.

I’ve divided my 131 Black Friday marketing emails into three categories (remember, these are not business correspondence messages or transactional messages, for which opt-out rules differ in the US, as well as Canada and the EU):

  1. Retailers with whom I had done business, but not subscribed (permitted to send transactional messages only).
  2. Retailers with whom I had done business, subscribed, and later unsubscribed (permitted to send transactional and marketing messages until revoked).
  3. New retailers with whom I had concluded business and explicitly opted out of marketing messages at the time of transaction (permitted to send transactional message only).

Of these emails:

  • 6 provided no unsubscribe link or information (which is allowed by the CAN-SPAM Act, if they are using the reply-to process for unsubscribing)
  • 26 provided an unsubscribe link requiring me to visit a web page to set my preferences
  • 19 provided both an unsubscribe link and a preferences link

Past Relationships
So let’s take a look at these vendors’ approaches and assess the value of each:

Several years ago I bought hosting services from Glob@t. On the 28th and again today, I received emails from this vendor. I unsubscribed from their messages just once when our relationship ended, and yet Black Friday seemed to have provided the perfect opportunity—as deemed by their marketing department – to reactivate an unsubscribed name and send a message.

In this case their message actually did exactly as they hoped: I became re-engaged. Of course, they had no idea, but yesterday I spent three hours on a tech-support call with my current vendor, and had decided to start shopping hosting vendors. Glob@t’s email came at an opportune time, but that’s not to say I wasn’t annoyed by it—I certainly was. Nonetheless, I clicked the link to check out their hosting packages, and after checking pricing, I returned to the email to unsubscribe. I will monitor their messages to ensure I remain unsubscribed this time around.

New Relationships
Three weeks ago I made a purchase from eBay of a hard-to-find item, which launched an onslaught of emails. I have received one or more every day since the date of purchase and in each I have clicked the unsubscribe link. Their unsubscribe text at the bottom of those emails reads in part,

Learn more to protect yourself from spoof (fake) e-mails.

eBay Inc. sent this e-mail to you at [myemailaddress] because your Notification Preferences indicate that you want to receive general email promotions.

If you do not wish to receive further communications like this, please click here to unsubscribe. Alternatively, you can change your Notification Preferences in My eBay by clicking here. Please note that it may take up to 10 days to process your request.

What I find interesting about their unsubscribe text is the presentation. By starting out with a “learn more about spoofing” link, they have attempted to befriend me by offering tips on protecting myself. They are my concerned about me—or so it would seem.

Next they offer to unsubscribe me by clicking the link and when I do click it, I receive an unsubscribe confirmation and information on how to re-subscribe should I wish to.

Their unsubscribe text does let me know it may take up to ten days to process my request, but I have to wonder: Why is this? Every company using an email-automation system knows unsubscribes are immediate. What’s up with the ten-day delay? My guess is they hope within the next ten days they will be able to send me an email that will re-engage me. (Terrible idea.)

After more than ten days of continuing to receive one or more emails every day, I clicked the set my preferences link, which requires—you guessed it—a log in. The purchase I made was completed as a guest. I did not wish then, nor do I wish now, to create an account with them. I’ve had one cause (ever) to make a purchase from them, and didn’t see it happening again. If it did, I could make a decision at that time about whether or not an account would be necessary. This too is an annoying approach: require the user to create an account to unsubscribe. (Terrible idea number two.)

After two weeks of emails, I’m now so irritated by their entire process it will adversely affect my decision to ever buy from them again, even if the item I am seeking is less expensive, more available, or even exclusively available. I will remember their lack of respect for my wishes and it will deter me. I guess they’re not as friendly as they first seemed.

As marketers, staying engaged with your constituents is more than betting on their short-term memory loss. It’s about honoring the relationship and their wishes. I remembered Glob@t and would have come back to their site when my vendor shopping began, but knowing they do not honor my unsubscribe status has tainted my view of their business practices. My purchase from eBay was exactly the right product, delivered on time, and in great condition. My positive experience would have led me back to them at some point in the future, but their emailing practices have put them on my own do-not-call list. If the new trend is to make a brand more memorable by being annoying, I opt out.