Putting Data to Use

The value of data does not depend on its size or shape. It depends on how useful the data are for decision-making. Some data geeks may not agree with me, but they are generally not the ones who fund the maintenance of spick-and-span clean data in a warehouse or in the cloud.

Lead a Horse to Water

From the business perspective, no one would invest an obscene amount of money in someone’s hobby. Sorry for stating the obvious, but data must be used by everyday decision-makers to have any value.

I have shared ways to evaluate various types of data in this series (refer to “Not All Databases Are Created Equal,” where I explained nine evaluation criteria), and even that article, written for businesspeople, can be considered too technical. If I may really simplify it, data is worthless if no one is using it.

Data and information are modern-day currency; piling them up in a safe does not increase their value. Even big shots like CIOs, CDOs and CTOs must eventually answer to CEOs and CFOs regarding return on investment. Without exception, that value is measured in dollars, pounds, euros or yuan, never in terabytes, megabits per second, instructions per second or any other techy measurement. What incremental revenue or extra savings did all of those data and analytics activities create? Or, the even shorter question in a typical boardroom would be: "What have the data done for the business lately?"

Like any field that requires some level of expertise to get things done, there are all kinds of organizations when it comes to data usage. Some are absolutely clueless — even nowadays — and some are equipped with cutting-edge techniques and support systems. But even the ones that brag about terabytes of data flowing through their so-called “state of the art” (another cliché that I hate) systems often admit that data utilization is not on par with the state of the data itself.

Unfortunately, no amount of investment in data platforms and toolsets can force users to change the way they make decisions. They have to “feel” that using data is easy and beneficial to them. That is why most job descriptions for CDOs include “evangelization” of data and analytics throughout the organization. And often, that is the most difficult part of their job. Another good old cliché would be “You can lead a horse to water, but you can’t make it drink.” Really?

I completely disagree with that statement. First, decision-makers are not horses; and secondly, we can help them use the data by putting it into bite-size packages. And let’s not even call those packages names that reflect the processes employed. When we consume any other product, how often do we care about the process? It’s not just that we don’t want to know what is in the hot dog; the same is true of even high-tech products, such as smartphones. We just want them to work, don’t we? Sure, some enthusiasts may want to understand everything about their beloved gadgets, but most people couldn’t care less about all of the hardships that the designers and manufacturers have gone through.

In fact, I tell fellow analysts to spare clients and colleagues all of the details, assumptions and chagrins when they talk about any analysis. Get to the point fast. Tell them the major implications and next steps, in the form of multiple choices if necessary. Have the detailed answers in your back pocket, but share them only when requested. Explain the benefits of model scores without uttering words like “regression” or “decision tree.”

Patients Aren’t Ready for Treatment?

In my job of being “a guy who finds money-making opportunities using data,” I get to meet all kinds of businesspeople in various industries. Thanks to the business trend around analytics (and to that infamous “Big Data” fad), I don’t have to spend a long time explaining what I do anymore; I just say I am in the field of analytics, or, to sound a bit fancier, data science. Then most marketers seem to understand where the conversation will go from there. Things are never that simple in real life, though, as there are many types of analytics — business intelligence, descriptive analytics, predictive analytics, optimization, forecasting, etc., even at a high level — but figuring out what type of solution should be prescribed is THE job of a consultant, anyway (refer to “Prescriptive Analytics at All Stages”).

The key to an effective prescription is to listen to the client first. Why do they lose sleep at night? What are their key success metrics? What are the immediate pain points? What are their long-term goals? And how would we get there within the limits of the provided resources and put out the fire at the same time? Building a sound data and analytics roadmap is critical, as no one wants to have an “Oh dang, we should have done that a year ago!” moment after a complex data project is well on its way. Reconstruction in any line of business is costly, and unfortunately, it happens all of the time, as many marketers and decision-makers often jump into the data pool out of desperation under organizational pressure (or under false promises by toolset providers, as in “all your dreams will come true with this piece of technology”). It is a sad sight when users realize that they don’t know how to swim only after they have jumped in.

Why does that happen all of the time? At the risk of sounding like a pompous doctor, I must say that it is quite often the patient’s fault, too; there are lots of bad patients. When it comes to the data and analytics business, not all marketers are experts in it, though some are. Most have a mid-level understanding, and they actually know when to call in for help. And there are complete novices, too. Now, regardless of their level of understanding, bad patients are the ones who show up with self-prescribed solutions and won’t hear about any other options or precautions. Once, I even met a client who demanded a neural-net model right after we exchanged pleasantries. My response? “Whoa, hold your horses for a minute here — why do you think you need one?” (Though I didn’t quite say it like that.) Maybe you just came back from some expensive analytics conference, but can we talk about your business case first? After that conversation, I could understand why doctors wouldn’t appreciate patients who trust WebMD over the living, breathing doctors in front of them.

Then there are opposite types of cases, too. Some marketers are so insecure about the state of their data assets (or their level of understanding) that they wouldn’t even want to hear about any solutions that sound even remotely complex or difficult, although they may be in desperate need of them. A typical response is something like “Our datasets are so messy that we can’t possibly entertain anything statistical.” You know what that sounds like? It sounds like a patient refusing any surgical treatment in an ER because “he” is not ready for it. No, doctors should be ready to perform the surgery, not the patient.

Messy datasets are surely no excuse for not taking the right path. If we had to wait for a perfect set of data all of the time, there wouldn’t be any need for statisticians or data scientists. In fact, we need such specialists precisely because most data sets are messy and incomplete, and they need to be enhanced by statistical techniques.

Analytics is about making the best of what we have. Cleaning dirty and messy data is part of the job, and should never be an excuse for not doing the right thing. If anyone assumes that simple reports don’t require data cleansing steps because the results look simple, nothing could be further from the truth. Most reporting errors stem from dirty data, and most datasets — big or small, new or old — are not ready to be just plugged into analytical engines.
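To make that point concrete, here is a minimal sketch of the kind of routine cleansing most raw datasets need before any report or model can trust them. The field names, records and rules below are purely illustrative assumptions, not something from a real project: normalize the key field, drop unusable and duplicate records, and impute a missing value with the median of what is known.

```python
# A hypothetical, messy customer file: inconsistent casing and whitespace,
# a duplicate record, a missing value, and a row with no usable key.
from statistics import median

raw_rows = [
    {"email": " Jane@Example.COM ", "revenue": "120.50"},
    {"email": "jane@example.com",   "revenue": "120.5"},  # duplicate record
    {"email": "BOB@example.com",    "revenue": None},     # missing value
    {"email": "",                   "revenue": "55"},     # no key -> unusable
    {"email": "kim@example.com",    "revenue": "80"},
]

def clean(rows):
    """Normalize keys, drop duplicates and keyless rows, impute missing values."""
    seen, out = set(), []
    for row in rows:
        # Normalize the key field: trim whitespace and lowercase.
        email = (row.get("email") or "").strip().lower()
        if not email or email in seen:   # skip unusable and duplicate records
            continue
        seen.add(email)
        rev = row.get("revenue")
        out.append({"email": email,
                    "revenue": float(rev) if rev is not None else None})
    # Impute missing revenue with the median of the known values.
    known = [r["revenue"] for r in out if r["revenue"] is not None]
    fill = median(known) if known else 0.0
    for r in out:
        if r["revenue"] is None:
            r["revenue"] = fill
    return out

cleaned = clean(raw_rows)
print(cleaned)  # three usable records; bob's revenue imputed with the median
```

None of this is statistically glamorous, which is exactly the point: the unglamorous steps are what make the simple report, and the fancy model, trustworthy.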

Besides, different types of analytics are needed because there are so many variations of business challenges, and no analytics is supposed to happen in some preset order. In other words, we get into predictive modeling because the business calls for it, not because a marketer finished some basic Reporting 101 class and now wants to move on to an Analytics 202 course. I often argue that deriving insights out of a series of simple reports can be a lot more difficult than building models or managing complex data. Conversely, regardless of their level of sophistication, marketers are not supposed to get into advanced analytics out of sheer intellectual curiosity. Every data and analytics activity must be justified by business purposes, carefully following the strategic data roadmap, not the difficulty level of the task.