Stop Blaming Marketing Problems on Software


I often hear statements like “Our client has a Tableau problem.” Or it is something about Hadoop or data platforms, as in “We have an issue with Hadoop.” What did it do, use offensive language? I wonder what the real issue is.

In any case, such general statements don’t help much. I guess a medical doctor feels the same way when she hears that her patient has a headache. What does that even mean, headache? What kind of headache? Persistent or sporadic? Throbbing or sharp pain? All over, or one-sided? Or do you just want to avoid conversations with your spouse?

Symptoms are not always related to root causes. Why would marketers think they have a problem with Tableau? Isn’t that a reporting and display tool? Unless one doesn’t like the way a bubble chart comes out, nothing really is a Tableau problem.

More often than not, reporting issues trace back to the data. What could be the major issues with the report? Inaccuracy, inconsistency or just plain suckiness? If the data on the report don’t make any sense, we must dig deeper. And let’s not forget that reporting tools are not even designed to handle heavy-duty data manipulation. But if the report doesn’t make any sense or is hard to understand — well, then — let’s blame the designer of the report, not the toolset.

For the record, I do not represent analytical toolset companies like SAS, SPSS or Tableau. Maybe they should share some blame, because they must have sold the toolset as an almighty data mining tool that just does it all. But I am addressing the issue this way because, at least for now, forming proper questions, defining problem statements, data modeling (for analytics), report design and, most importantly, deriving insights from reports remain solidly human functions.

Let’s break it down further. When faced with a large amount of unrefined, unstructured and uncategorized data, we must indeed fix the data first. Let’s not even think about blaming the data storage platforms like Hadoop, MongoDB or Teradata here. That would be like blaming rice storage facilities for not being able to refine rice for human consumption. In other words, we should not put too much of a burden on the data collection and storage systems when it comes to data refinement.

Data refinement should be dealt with as a separate entry altogether; between data collection (such as Hadoop) and data delivery (such as Tableau), each requiring different skillsets and expertise. Such data refinement work includes:

  • Data Hygiene and Editing: As no data source is immaculate. In fact, many analysts waste their valuable time fixing dirty data (and performing the steps listed below).
  • Data Categorization and Tagging: As uncategorized freeform data must be put into buckets and properly tagged for advanced analytics (refer to “Free Form Data Are Not Exactly Free”).
  • Data Consolidation: As disparate data sources must be “merged” (to create a “360-degree view of the customer” around a person, for example), or “concatenated” (to increase coverage by adding similar types of data).
  • Data Summarization and Variable Creation: To transform data to describe different levels (transaction, emails, customers, companies, etc.), as in converting transaction or event-level data into “descriptors of individual customers” (refer to “Beyond RFM Data”).
  • Missing Value Treatment: As no data will ever be fully complete, we need to fill in the gaps either with statistical models or business rules (refer to “Missing Data Can Be Meaningful”).
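To make these steps concrete, here is a minimal sketch of a refinement pipeline using pandas. Everything in it is a hypothetical assumption for illustration: the column names (`cust_id`, `channel`, `amount`), the sample rows, and the business rules (median fill for missing spend, most-frequent channel as a customer descriptor). It is not a prescribed implementation, only a demonstration that hygiene, tagging, missing-value treatment and summarization are distinct operations that happen between collection and reporting.

```python
# A sketch of data refinement: hygiene, categorization, missing-value
# treatment, and summarization from transaction level to customer level.
import pandas as pd

# Raw transaction-level data with typical problems: an exact duplicate
# row, inconsistent casing in a free-form field, and missing values.
raw = pd.DataFrame({
    "cust_id": [1, 1, 2, 2, 2, 3],
    "channel": ["email", "email", "WEB", "web", None, "web"],
    "amount":  [20.0, 20.0, 35.5, 12.0, 8.0, None],
})

# Hygiene/editing: drop exact duplicate rows.
clean = raw.drop_duplicates()

# Categorization/tagging: normalize free-form labels into buckets.
clean["channel"] = clean["channel"].str.lower().fillna("unknown")

# Missing-value treatment: fill missing spend with a business rule
# (here, the median of observed amounts).
clean["amount"] = clean["amount"].fillna(clean["amount"].median())

# Summarization/variable creation: convert transaction-level rows
# into descriptors of individual customers.
customers = clean.groupby("cust_id").agg(
    total_spend=("amount", "sum"),
    num_txns=("amount", "count"),
    top_channel=("channel", lambda s: s.mode().iloc[0]),
).reset_index()

print(customers)
```

Data consolidation (merging disparate sources on a customer key) would be a `merge` step on top of this, but the point stands either way: each of these operations is its own job, done before any reporting tool ever sees the data.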

If the salesperson who sold you the reporting toolset promised that the product would do all of these things, well, just ignore him. Even in the age of AI, these steps must be performed by separate machines (or teams) trained for specific tasks. Simply put, machines are not that smart yet; an AI trained for “recognition” won’t be able to “predict” and fill in the blanks for you. Nor does that mean human analysts should do all of this by hand, all by themselves.

Nonetheless, the steps listed here must be completed before the reporting or any other analytical work even begins. We can even say that the reporting step is the simplest one of all. But only if the reports are designed properly first. And that is the catch.

No amount of pretty charts is meaningful if there is no story behind them. That would be like watching a movie filled with so-called state-of-the-art special effects but no character development or viable storyline. That may work as a trailer, but that’s about it. Now, if you are an analyst having to present findings to a client or your boss, you don’t want to be the one who loses steam five minutes after the meeting begins. A 40-page PowerPoint deck? So what? What does all of that mean? What are we supposed to do about it?

Don’t Hire Data Posers

There are data geeks and there are data scientists. Then there are data plumbers, and there are total posers. In this modern world where the line between “real” and “fake” is ever-blurrier, some may not even care for such differences.


Call me old-school, but at least in some fields, I believe that “the ability to do things” still matters. Analytics is one of those fields. When it comes to data and analytics, you either know how to do it, or you don’t know how to do it. The difference is as clear as a person who can play a musical instrument and one who is tone-deaf.

Unfortunately, there is no clear way to tell the difference in this data and analytics field. It’s not like we can line up contestants and ask them to sing and be judged here. Furthermore, “posers” often have louder voices — armed with fancy visuals and so-called automated toolsets.

I’ve been to many conferences and sat through countless presentations in my lifetime. It may sound harsh for me to criticize fellow data players and presenters, but let me just come out and say it: A great many presenters and panelists at conferences are posers.

How do I know that? Easy. I asked them. For example, when I stalked some panelists after a session in which they preached the best practices of personalization, the answers were often “Well, it is not like we do all those things for real …” Sometimes I didn’t even have to ask, as I could tell something was seriously broken in their data and promotion chain just by observing their marketing messages as a customer.

The bad news for the users of information (and for consumers, for that matter) is that it takes a long time to figure out that things are not going well. Conversely, we can all tell who is tone-deaf as soon as a singer opens her mouth. It is so hard to tell the difference between a data scientist (i.e., an analyst who derives insights and next steps from mounds of data) and a data plumber (i.e., an analyst who moves big and small data around all day long and thinks that is his job) that, I admit, it sometimes takes me a few months (generally after some near meltdowns) to figure it out.