Don’t Do It Just Because You Can

Don’t do it just because you can. No kidding. By the way, I could have gone with Ben Parker’s “With great power comes great responsibility” line, but I didn’t, as it has become an over-quoted cliché. Plus, I’m not much of a fan of “Spiderman.” Actually, I’m kidding this time. (Not the “Spiderman” part, as I’m more of a fan of “Thor.”) But the real reason is that any geek with moderate coding skills, or any overzealous marketer with access to some data, can do real damage to real human beings without any superpowers to speak of. Granted, we wouldn’t go so far as to call it permanent damage, but I must say that some marketing messages and practices are really annoying and invasive. Enough to classify them as “junk mail” or “spam.” Yeah, I said that, knowing full well that those words are forbidden in the industry in which I built my career.

All jokes aside, I received a call from my mother a few years ago asking me if this “urgent” letter, which said her car warranty would expire if she did not act “right now” (along with a few exclamation marks), was something to which she must respond immediately. Many of us by now are impervious to such fake urgency or outrageous claims (like “You’ve just won $10,000,000!!!”). But I then realized that there still are plenty of folks who would spend their hard-earned dollars based on such misleading messages. What really made me mad, other than the fact that my own mother was the target in this case, was that someone must have actually targeted her based on her age, ethnicity, housing value and, of course, the make and model of her automobile. I’ve been doing this job for too long to be unaware of the data variables and techniques that must have played a part for my mother to receive a series of such letters. Basically, some jerk must have created a segment that could be named “old and gullible.” Without a doubt, this is a classic example of what should not be done just because one can.

One might dismiss it as an isolated case of a questionable practice by questionable individuals with questionable moral integrity, but can we honestly say that? I, who know the ins and outs of direct marketing practices quite well, have fallen into such traps more than a few times, where a supposedly one-time order mysteriously turned into a continuity program without my consent, followed by an extremely cumbersome cancellation process. Further, when I receive calls or emails from shady merchants with dubious offers, I can safely assume my information changed hands in suspicious ways, if not through outright illegal routes.

Even without the criminal elements, as data become more ubiquitous and targeting techniques become more precise, an accumulation of seemingly inoffensive actions by innocuous data geeks can cause a big ripple in the offline (i.e., “real”) world. I am sure many of my fellow marketers remember the news about a reputable retail chain a few years ago: It accurately predicted pregnancies in households based on product purchase patterns, and sent customized marketing messages featuring pregnancy-related products accordingly. It subsequently became a big controversy, as such a targeted message was how one particular head of household found out his teenage daughter was indeed pregnant. An unintended consequence? You bet.

I actually saw the presentation of the instigating statisticians at a predictive analytics conference before the whole incident hit the wire. At the time, the presenters were unaware of the consequences of their actions, so they proudly shared the methodologies they employed with the audience. But when I heard what they were actually trying to predict, I immediately turned my head to look at the lead statistician on my then-analytical team sitting next to me, and saw the same concerned look that must have been on my face, as well. And our concern was definitely not about the techniques, as we knew how to do the same when provided with similar sets of data. It was about the human consequences that such a prediction could bring, not just to the eventual targets, but also to the predictors and their fellow analysts in the industry, who would all be lumped together as evil scientists by outsiders. In predictive analytics, there is a price for being wrong; and at times, there is a price to pay for being right, too. Like I said, we shouldn’t do things just because we can.

Analysts do not have superpowers individually, but when technology and ample amounts of data are conjoined, the results can be quite influential and powerful, much like the way bombs can be built with common materials available at any hardware store. Ironically, all this time I have been evangelizing that data and technology should be wielded together to make big and dumb data smaller and smarter. But providing answers to decision-makers in ready-to-use formats, hence “humanizing” the data, may have its downside, too. Simply put, “easy to use” can easily become “easy to abuse.” After all, humans are fallible creatures with ample amounts of greed and ambition. Even without any obvious bad intentions, it is sometimes very difficult to contemplate all angles, especially when it comes to those sensitive and squeamish humans.

I talked about the social consequences of the data business last month (refer to “How to Be a Good Data Scientist”), and that is why I emphasized that anyone who is about to get into this data field must possess a deep understanding of both technology and human nature. That little sensor in your stomach that tells you “Oh, I have a bad feeling about this” may not come to everyone naturally, but we all need to be equipped with such safeguards, like angels on our shoulders.

Hindsight is always 20/20, but apparently, those smart analysts who did that pregnancy prediction only thought about the techniques and the bottom line, but did not consider all the human factors. And they should have. Or, if not them, their manager should have. Or their partners in the marketing department should have. Or their public relations people should have. Heck, “someone” in their organization should have, alright? Just like we do not casually approach a woman on the street who “seems” pregnant and say “You must be pregnant.” Only socially inept people would do that.

People consider certain matters extremely private, in case some data geeks didn’t realize that. If I might add, the same goes for ailments such as erectile dysfunction or constipation, or any other personal business related to body parts that are considered private. Unless you are a doctor in an examining room, don’t say things like “You look old, so you must have a hard time having sex, right?” It is already bad enough that we can’t even watch golf tournaments on TV without those commercials that assume golf fans need help in that department. (By the way, having “two” bathtubs “outside” the house at dusk doesn’t make any sense either, when the effect of the drug can last for hours, for heaven’s sake. Maybe the man lost interest because the tubs were too damn heavy?)

While it may vary from culture to culture, we all have some understanding of social boundaries in casual settings. When you are talking to a complete stranger on a plane ride, for example, you know exactly how much information you would feel comfortable sharing with that person. And when someone crosses the line, we call that person inappropriate, or “creepy.” Unfortunately, that creepy line is set differently for each person we encounter (I am sure people like George Clooney or Scarlett Johansson have a really high threshold for what might be considered creepy), but I think we can all agree that such a shady area can at least be loosely defined. Therefore, when we deal with large amounts of data affecting a great many people, imagine a rather large common area of such creepiness/shadiness, and do not ever cross it. In other words, when in doubt, don’t go for it.

Now, as a lifelong database marketer, I am not siding with over-the-top privacy zealots either, as most of them do not understand the nature of data work and can’t tell the difference between informed (and mutually beneficial) messages and Big Brother-like nosiness. This targeting business is never about looking up an individual’s record one at a time, but more about finding correlations between users and products and doing some good matchmaking in mass numbers. In other words, we don’t care what questionable sites anyone visits, and honest data players would not steal or abuse information with bad intent. I have heard about waiters who steal credit card numbers from their customers with swiping devices, but would you condemn the entire restaurant industry for that? Yes, there are thieves in any part of society, but not all data players are hackers, just like not all waiters are thieves. Statistically speaking, much like flying being the safest form of travel, I can even argue that handing over your physical credit card to a stranger is more dangerous than entering the credit card number on a website. It just looks much worse when things go wrong, as such incidents affect a great many people all at once, just like when a plane crashes.

Years back, I used to frequent a Japanese restaurant near my office. The owner, who doubled as the head sushi chef, was not the nosy type, so he waited more than a year to ask me what I did for a living. He had never heard anything about database marketing, direct marketing or CRM (there was no “Big Data” on the horizon at that time), so I had to find a simple way to explain what I do. Since he was a sushi chef with some local reputation, I presumed that he would know the personal preferences of many frequently visiting customers (or “high-value customers,” as marketers call them). He would know exactly who likes what kind of fish and which cuts, who doesn’t like raw shellfish, who is allergic to what, who has less of a tolerance for wasabi, and who would indulge in exotic fish roes. When I asked him if that was the case, his answer was a simple “yes.” Any diligent sushi chef would care for his or her customers that much. And I said, “Now imagine that you can provide such customized services to millions of people, with the help of computers and collected data.” He immediately understood the benefits of using data and analytics, and murmured, “Ah so …”

Now let’s turn the tables for a second here. From the customer’s point of view, yes, it is very convenient for me that my favorite sushi chef knows exactly how I like my sushi. The same goes for the local barista who knows how I take my coffee every morning. Such knowledge is clearly mutually beneficial. But what if those business owners or service providers started asking about my personal finances or about my grown daughter in a “creepy” way? I wouldn’t care if they carried the best yellowtail in town or served the best cup of coffee in the world. I would cease all my interactions with them immediately. Sorry, they’ve just crossed that creepy line.

Years ago, I had more than a few chances to sit closely with Lester Wunderman, widely known as “The Father of Direct Marketing,” as the venture called I-Behavior, in which I participated as one of the founders, actually originated from an idea on a napkin from Lester and his friends. Having previously worked at an agency that still bears his name, and having only seen him from behind a podium until I was introduced to him on one cool autumn afternoon in 1999, meeting him at a small round table and exchanging ideas with the master was like an unknown guitar enthusiast having a jam session with Eric Clapton. What was most amazing was that, at the beginning of the dot.com boom, he was completely unfazed by all those new ideas that were flying around at the time, and he precisely pointed out why most of them would not succeed at all. I do not need to quote early 21st-century history to point out that his prediction was indeed accurate. When everyone was chasing the latest bit of technology for quick bucks, he was at least a decade ahead of all of those young bucks, already thinking about the human side of the equation. Now, I will not reveal his age out of respect, but let’s just say that almost all of the people in his age group would describe the occupations of their offspring as “Oh, she just works on a computer all the time …” I can only wish that I will remain that sharp when I am his age.

One day, Wunderman very casually shared a draft of the “Consumer Bill of Rights for Online Engagement” with a small group of people who happened to be in his office. I was one of the lucky souls who heard about his idea firsthand, and I remember feeling that he was spot-on with every point, as usual. I read it again recently just as this Big Data hype is reaching its peak, just like the dot.com boom was moving with a force that could change the world back then. In many ways, such tidal waves do end up changing the world. But lest we forget, such shifts inevitably affect living, breathing human beings along the way. And for any movement guided by technology to sustain its velocity, people who are at the helm of the enabling technology must stay sensitive toward the needs of the rest of the human collective. In short, there is not much to gain by annoying and frustrating the masses.

Allow me to share Lester Wunderman’s “Consumer Bill of Rights for Online Engagement” verbatim, as it appeared in the second edition of his book “Being Direct”:

  1. Tell me clearly who you are and why you are contacting me.
  2. Tell me clearly what you are—or are not—going to do with the information I give.
  3. Don’t pretend that you know me personally. You don’t know me; you know some things about me.
  4. Don’t assume that we have a relationship.
  5. Don’t assume that I want to have a relationship with you.
  6. Make it easy for me to say “yes” and “no.”
  7. When I say “no,” accept that I mean not this, not now.
  8. Help me budget not only my money, but also my TIME.
  9. My time is valuable, don’t waste it.
  10. Make my shopping experience easier.
  11. Don’t communicate with me just because you can.
  12. If you do all of that, maybe we will then have the basis for a relationship!

So, after more than 15 years of the so-called digital revolution, how many of these are we violating almost routinely? Based on the look of my inboxes and the sites that I visit, quite a lot, and all the time. As I mentioned in my earlier article, “The Future of Online is Offline,” I really get offended when even seasoned marketers use terms like “online person.” I do not become an online person simply because I happen to stumble onto some stupid website and forget to uncheck some pre-checked boxes. I am not some casual object at which some email division of a company can shoot to meet its top-down sales projections.

Oh, and good luck with that kind of mindless mass emailing; your base will soon be saturated, and you will learn that irrelevant messages are bad for the senders, too. Proof? How is it that the conversion rate of a typical campaign has not increased dramatically during the past 40 years or so? Forget about open or click-through rates; pay attention to the good old conversion rate. You know, the one that measures actual sales. Don’t we have superior databases and technologies now? Why is anyone still bragging about mailing “more” in this century? Have you heard about “targeted” or “personalized” messages? Aren’t there lots and lots of toolsets for that?

As the technology advances, it becomes that much easier and faster to offend people. If the majority of data handlers continue to abuse their power, stemming from the data in their custody, the communication channels will soon run dry. Or worse, if abusive practices continue, the whole channel could be shut down by some legislation, as we have witnessed in the downfall of the outbound telemarketing channel. Unfortunately, a few bad apples will make things a lot worse a lot faster, but I see that even reputable companies do things just because they can. All the time, repeatedly.

Furthermore, in this day and age of abundant data, not offending someone or not violating rules isn’t good enough. In fact, to paraphrase comedian Chris Rock, only losers brag about doing things that they are supposed to do in the first place. The direct marketing industry has long been bragging about the self-governing nature of its tightly knit (and often incestuous) network, but as tools get cheaper and sharper by the day, we all need to be even more careful wielding this data weaponry. Because someday soon, we as consumers will be seeing messages everywhere around us, maybe projected directly onto our retinas, not just in our inboxes. Personal touch? Yes, in the creepiest way, if done wrong.

Visionaries like Lester Wunderman were concerned about the abusive nature of online communication from the very beginning. We should all read his words again, and think twice about the social and human consequences of our actions. Google, from its inception, encapsulated a similar idea by simply stating its organizational objective as “Don’t be evil.” That does not mean it will stop pursuing profit or cease to collect data. I think it means that Google will always try to be mindful about the influence of its actions on real people, who may not be in a position to control the data, but are instead the subjects of data collection.

I am not saying all of this out of some romantic altruism; rather, I am emphasizing the human side of the data business to preserve the forward momentum of the Big Data movement, even though I do not much care for its name. Because I still believe, even from a consumer’s point of view, that a great amount of efficiency can be achieved by using data and technology properly. No one can deny that modern life in general is much more convenient thanks to them. We do not often get lost on the streets, we can translate foreign languages on the fly, and we can talk to people on the other side of the globe while looking at their faces. We are much better informed about the products and services that we care about, and we can look up and order anything we want while walking down the street. And heck, we get suggestions before we even think about what we need.

But we can think of many negative effects of data, as well. It goes without saying that data handlers must protect the data from falling into the wrong hands, which may have criminal intentions. Absolutely. That is like banks having to protect their vaults. Going a few steps further, if marketers want to retain the privilege of holding ample amounts of consumer information and using such knowledge for their benefit, they must never cross that creepy line. If the Consumer Bill of Rights is too much for you to retain, just remember this one line: “Don’t be creepy.”

Missing Data Can Be Meaningful

No matter how big the Big Data gets, we will never know everything about everything. Well, according to the super-duper computer called “Deep Thought” in the movie “The Hitchhiker’s Guide to the Galaxy” (don’t bother to watch it if you don’t care for the British sense of humour), the answer to “The Ultimate Question of Life, the Universe, and Everything” is “42.” Coincidentally, that is also my favorite number to bet on (I have my reasons), but I highly doubt that even that huge fictitious computer with unlimited access to “everything” provided that numeric answer with conviction after 7½ million years of computing and checking. At best, that “42” is an estimated figure of a sort, based on some fancy algorithm. And in the movie, even Deep Thought pointed out that “the answer is meaningless, because the beings who instructed it never actually knew what the Question was.” Ha! Isn’t that what I have been saying all along? For any type of analytics to be meaningful, one must properly define the question first. And what to do with the answer that comes out of an algorithm is entirely up to us humans, or in the business world, the decision-makers. (Who are probably human.)

Analytics is about making the best of what we know. Good analysts do not wait for a perfect dataset (one will never come along, anyway). And businesspeople have no patience to wait for anything. Big Data is big because we digitize everything, and everything that is digitized is stored somewhere in the form of data. For example, even if we collect mobile device usage data from just pockets of the population with certain brands of mobile services in a particular area, the sheer size of the resultant dataset becomes really big, really fast. And most unstructured databases are designed to collect and store what is known. If you flip that around and ask whether we know every little behavior through mobile devices for “everyone,” you will be shocked to see how small the population associated with meaningful data really is. Let’s imagine that we can describe human beings with 1,000 variables coming from all sorts of sources, out of 200 million people. How many would have even 10 percent of the 1,000 variables filled with some useful information? Not many, and definitely not 100 percent. Well, we have more data than ever in the history of mankind, but still not for every case for everyone.

In my previous columns, I pointed out that decision-making is about ranking different options, and to rank anything properly, we must employ predictive analytics (refer to “It’s All About Ranking”). And for ranking based on the scores resulting from predictive models to be effective, the datasets must be summarized to the level that is to be ranked (e.g., individuals, households, companies, emails, etc.). That is why transaction- or event-level datasets must be transformed into “buyer-centric” portraits before any modeling activity begins. Again, it is not about the transactions or the products; it is about the buyers, if you are doing all this to do business with people.

The trouble with buyer- or individual-centric databases is that such a transformation of the data structure creates lots of holes. Even if you have meticulously collected every transaction record that matters (and that will be the day), if someone did not buy a certain item, any variable that is created based on the purchase record of that particular item will have nothing to report for that person. Likewise, if you have a whole series of variables to differentiate online and offline channel behaviors, what would the online portion contain if the consumer in question never bought anything through the Web? Absolutely nothing. But in the business of predictive analytics, what did not happen is as important as what happened. Even a simple concept of “response” is only meaningful when compared to “non-response,” and the difference between the two groups becomes the basis for the “response” model algorithm.
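
To make those “holes” concrete, here is a minimal sketch in Python with pandas, using hypothetical field names, of a transaction-level table being summarized into a buyer-centric view:

```python
import pandas as pd

# Hypothetical transaction-level records: one row per purchase event
transactions = pd.DataFrame({
    "customer_id": [101, 101, 102, 103],
    "channel": ["online", "offline", "offline", "offline"],
    "amount": [50.0, 120.0, 80.0, 40.0],
})

# Summarize to the buyer level: total dollars per channel, per customer
buyer_view = transactions.pivot_table(
    index="customer_id", columns="channel", values="amount", aggfunc="sum"
)

print(buyer_view)
# channel      offline  online
# customer_id
# 101            120.0    50.0
# 102             80.0     NaN  <- never bought online: a hole, not a zero
# 103             40.0     NaN
```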

Capturing the Meanings Behind Missing Data
Missing data are all around us. And there are many reasons why they are missing, too. It could be that there is nothing to report, as in aforementioned examples. Or, there could be errors in data collection—and there are lots of those, too. Maybe you don’t have access to certain pockets of data due to corporate, legal, confidentiality or privacy reasons. Or, maybe records did not match properly when you tried to merge disparate datasets or append external data. These things happen all the time. And, in fact, I have never seen any dataset without a missing value since I left school (and that was a long time ago). In school, the professors just made up fictitious datasets to emphasize certain phenomena as examples. In real life, databases have more holes than Swiss cheese. In marketing databases? Forget about it. We all make do with what we know, even in this day and age.

Then, let’s ask a few philosophical questions here:

  • If missing data are inevitable, what do we do about it?
  • How would we record them in databases?
  • Should we just leave them alone?
  • Or should we try to fill in the gaps?
  • If so, how?

The answer to all this is definitely not 42, but I’ll tell you this: Even missing data have meanings, and not all missing data are created equal, either.

Furthermore, missing data often have interesting stories behind them. For example, certain demographic variables may be missing only for extremely wealthy people and very poor people, as their residency data are generally not exposed (for different reasons, of course). And that, in itself, is a story. Likewise, some data may be missing in certain geographic areas or for certain age groups. Collection of certain types of data may be illegal in some states. “Not” having any data on online shopping behavior or mobile activity may mean something interesting for your business, if we dig deeper into it without falling into the trap of predicting legal or corporate boundaries instead of consumer behaviors.

In terms of how to deal with missing data, let’s start with numeric data, such as dollars, days, counters, etc. Some numeric data simply may not be there, if there is no associated transaction to report. Now, if they are about “total dollar spending” and “number of transactions” in a certain category, for example, they can be initiated as zero and remain as zero in cases like this. The counter simply did not start clicking, and it can be reported as zero if nothing happened.

Some numbers are incalculable, though. If you are calculating “Average Amount per Online Transaction,” and there is no online transaction for a particular customer, that is a mathematical singularity—we can’t divide anything by zero. In such cases, the average amount should be recorded as “.”, blank, or any other value that represents a pure missing value. But it should never be recorded as zero. And that is the key to dealing with missing numeric information: Zero should be reserved for real zeros, and nothing else.
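
A minimal sketch of that rule in code (pandas again; the fields are hypothetical): counters legitimately start at zero, while incalculable averages stay as true missing values:

```python
import numpy as np
import pandas as pd

buyers = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "online_dollars": [150.0, 0.0, 0.0],    # real zeros: nothing was spent
    "online_transactions": [3, 0, 0],       # counters start at zero
})

# Average amount per online transaction is undefined when the count is zero;
# record it as NaN (a true missing value), never as zero
buyers["avg_online_amount"] = (
    buyers["online_dollars"] / buyers["online_transactions"].replace(0, np.nan)
)
print(buyers["avg_online_amount"].tolist())  # [50.0, nan, nan]
```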

I have seen too many cases where missing numeric values are filled with zeros, and I must say that such a practice is definitely frowned upon. If you have to pick just one takeaway from this article, that’s it. Like I emphasized, not all missing values are the same, and zero is not the way to record them. Zeros should never represent a lack of information.

Take the example of a popular demographic variable, “Number of Children in the Household.” This is a very predictive variable—not just for the purchase behavior of children’s products, but for many other things. Now, it is a simple number, but it should never be treated as a simple variable—as, in this case, lack of information is not evidence of non-existence. Let’s say that you are purchasing this data from a third-party data compiler (or a data broker). If you don’t see a positive number in that field, it could be because:

  1. The household in question really does not have a child;
  2. Even the data-collector doesn’t have the information; or
  3. The data collector has the information, but the household record did not match to the vendor’s record, for some reason.

If that field contains a number like 1, 2 or 3, that’s easy, as it will represent the number of children in that household. But zero should be reserved for cases where the data collector has positive confirmation that the household in question indeed does not have any children. If it is unknown, it should be marked as blank or “.” (many statistical software packages, such as SAS, record missing values this way), or as “U” (though an alpha character should not be in a numeric field).
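
Here is a sketch of how those three cases can be kept apart when loading such a field; the raw codes and column names are my own illustration, not any particular vendor’s layout:

```python
import numpy as np
import pandas as pd

# Hypothetical raw feed from a data compiler: one row per household
raw = pd.DataFrame({
    "household_id": [1, 2, 3, 4],
    "children_raw": ["2", "0", "", None],  # "" = compiler doesn't know
    "matched": [True, True, True, False],  # False = record didn't match
})

def decode_children(row):
    if not row["matched"]:
        return np.nan                      # case 3: non-match, keep missing
    if row["children_raw"] in ("", None):
        return np.nan                      # case 2: unknown, never a zero
    return float(row["children_raw"])      # case 1: real counts, incl. true zeros

raw["num_children"] = raw.apply(decode_children, axis=1)
raw["children_nonmatch"] = ~raw["matched"]  # separate indicator for non-matches
```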

If it is a case of a non-match to the external data source, then there should be a separate indicator for it. The fact that the record did not match a professional data compiler’s list may mean something. And I’ve seen cases where such non-match indicators made it into model algorithms along with other valid data, as in the case where missing indicators for income display the same directional tendency as high-income households.

Now, if the data compiler in question boldly inputs zeros for the cases of unknowns? Take a deep breath, fire the vendor, and don’t deal with the company again, as it is a sign that its representatives do not know what they are doing in the data business. I have done so in the past, and you can do it, too. (More on how to shop for external data in future articles.)

For non-numeric categorical data, similar rules apply. Some values could be truly “blank,” and those should be treated separately from “Unknown,” or “Not Available.” As a practice, let’s list all kinds of possible missing values in codes, texts or other character fields:

  • “ ”—blank or “null”
  • “N/A,” “Not Available,” or “Not Applicable”
  • “Unknown”
  • “Other”—If it is originating from some type of multiple choice survey or pull-down menu
  • “Not Answered” or “Not Provided”—This indicates that the subjects were asked, but they refused to answer. Very different from “Unknown.”
  • “0”—In this case, the answer can be expressed in numbers. Again, only for known zeros.
  • “Non-match”—Not matched to other internal or external data sources
  • Etc.

It is entirely possible that all these values may be highly correlated to each other and move along the same predictive direction. However, there are many cases where they do not. And if they are combined into just one value, such as zero or blank, we will never be able to detect such nuances. In fact, I’ve seen many cases where one or more of these missing indicators move together with other “known” values in models. Again, missing data have meanings, too.
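
When those codes are preserved as distinct values, letting a model weigh each one separately is trivial; a sketch with hypothetical codes:

```python
import pandas as pd

# Hypothetical categorical field with several flavors of "missing"
contact_pref = pd.Series(
    ["email", "phone", "unknown", "not_answered", "non_match", "email"],
    name="contact_pref",
)

# One indicator column per value: "unknown" and "not_answered" remain
# separate, so a model can assign each its own weight
indicators = pd.get_dummies(contact_pref, prefix="pref")
print(list(indicators.columns))
# ['pref_email', 'pref_non_match', 'pref_not_answered', 'pref_phone', 'pref_unknown']
```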

Filling in the Gaps
Nonetheless, missing data do not have to be left as missing, blank or unknown all the time. With statistical modeling techniques, we can fill in the gaps with projected values. You didn’t think that all those data compilers really knew the income level of every household in the country, did you? It is not a big secret that many of those figures are modeled with other available data.

Such inferred statistics are everywhere. Popular variables, such as householder age, homeowner/renter indicator, housing value, household income or—in the case of business data—the number of employees and sales volume, contain modeled values. And there is nothing wrong with that, in a world where no one really knows everything about everything. If you understand the limitations of modeling techniques, it is quite alright to employ modeled values—which are much better alternatives to highly educated guesses—in decision-making processes. We just need to be a little careful, as models often fail to predict extreme values, such as household incomes over $500,000/year, or specific figures, such as incomes of exactly $87,500. But “ranges” of household income, for example, can be predicted at a high confidence level, though doing so technically requires many separate algorithms and carefully constructed input variables in various phases. But such technicality is an issue that professional number crunchers should deal with, as in any other predictive business. Decision-makers should just be aware of the mix of real and inferred data.

Such imputation practices can be applied to any data source, not just databases compiled by professional data brokers. Statisticians often impute values when they encounter missing values, and there are many different methods of imputation. I haven’t met two statisticians who completely agree with each other when it comes to imputation methodologies, though. That is why it is important for an organization to have a unified rule for each variable regarding its imputation method (or lack thereof). When multiple analysts employ different methods, that often becomes the very source of inconsistent or erroneous results at the application stage. It is always more prudent to have the calculation done upfront, and to store the inferred values in a consistent manner in the main database.
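
As one illustration (mean imputation is just one of many possible rules), scikit-learn’s SimpleImputer can fill numeric gaps and keep indicator flags for what was originally missing; whatever rule you choose, the point is to fit it once and reuse that same fitted rule everywhere:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical numeric variable with missing values
income = np.array([[55000.0], [np.nan], [72000.0], [np.nan]])

# Fit the preset rule once: mean imputation plus a missing-value indicator
imputer = SimpleImputer(strategy="mean", add_indicator=True)
filled = imputer.fit_transform(income)
print(filled)
# [[55000.     0.]
#  [63500.     1.]   <- imputed, and flagged as originally missing
#  [72000.     0.]
#  [63500.     1.]]

# Persist the fitted imputer (e.g., with joblib) and apply the same object
# during every database update, so all users share identical values.
```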

In terms of how that is done, there could be a long debate among the mathematical geeks. Will it be a simple average of non-missing values? If such a method is to be employed, what is the minimum required fill-rate of the variable in question? Surely, you do not want to project 95 percent of the population with 5 percent known values? Or will the missing values be replaced with modeled values, as in previous examples? If so, what would be the source of target data? What about potential biases that may exist because of data collection practices and their limitations? What should be the target definition? In what kind of ranges? Or should the target definition remain as a continuous figure? How would you differentiate modeled and real values in the database? Would you embed indicators for inferred values? Or would you forego such flags in the name of speed and convenience for users?

The important matter is not the rules or methodologies, but their consistency throughout the organization and the databases. That way, all users and analysts will have the same starting point, no matter what the analytical purposes are. There could be a long debate in terms of what methodology should be employed and deployed. But once the dust settles, all data fields should be treated by pre-determined rules during the database update processes, avoiding costly errors downstream. All too often, inconsistent imputation methods lead to inconsistent results.

If, by some chance, individual statisticians end up with the freedom to come up with their own ways to fill in the blanks, then the model-scoring code in question must include missing-value imputation algorithms without exception, granted that such a practice will lengthen the model application process and significantly increase the chances for errors. It is also important that non-statistical users be educated about the basics of missing data and the associated imputation methods, so that everyone who has access to the database shares a common understanding of what they are dealing with. That list includes external data providers and partners, and it is strongly recommended that data dictionaries include the imputation rules employed, wherever applicable.

Keep an Eye on the Missing Rate
Often, we find out that the missing rate of certain variables has gone out of control only when models become ineffective and campaigns start to yield disappointing results. Conversely, it can be stated that fluctuations in missing data ratios greatly affect the predictive power of models or any related statistical work. It goes without saying that a consistent influx of fresh data matters more than the construction and the quality of models and algorithms. It is a classic case of a garbage-in-garbage-out scenario, and that is why good data governance practices must include a time-series comparison of the missing rate of every critical variable in the database. If, all of a sudden, an important predictor’s fill rate drops below a certain point, no analyst in this world can sustain the predictive power of the model algorithm, unless it is rebuilt with a whole new set of variables. The shelf life of models is definitely finite, but nothing deteriorates the effectiveness of models faster than inconsistent data. And a fluctuating missing rate is a good indicator of such an inconsistency.
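
The monitoring itself does not have to be elaborate; here is a sketch that compares each update cycle’s fill rates against a baseline snapshot and flags large drops (the threshold and names are illustrative):

```python
import pandas as pd

def fill_rates(df: pd.DataFrame) -> pd.Series:
    """Share of non-missing values for every column in the table."""
    return df.notna().mean()

def fill_rate_drops(baseline: pd.Series, current: pd.Series,
                    max_drop: float = 0.10) -> pd.Series:
    """Variables whose fill rate fell more than max_drop since the
    baseline (e.g., the snapshot used for model development)."""
    drop = (baseline - current).dropna()
    return drop[drop > max_drop].sort_values(ascending=False)

# Usage: store fill_rates(dev_snapshot) once; after each update, any
# variable returned by fill_rate_drops() is a prime suspect when
# model scores start to deteriorate.
```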

Likewise, if the model score distribution starts to deviate from the original model curve from the development and validation samples, it is prudent to check the missing rate of every variable used in the model. Any sudden changes in model score distribution are a good indicator that something undesirable is going on in the database (more on model quality control in future columns).
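
One standard way to quantify such a deviation (my suggestion; use whatever measure your organization prefers) is the population stability index, which compares the current score distribution against the development sample’s. A sketch:

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray,
                               bins: int = 10) -> float:
    """PSI between development-time scores (expected) and current scores
    (actual); a common rule of thumb treats > 0.25 as a serious shift."""
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf        # cover the full range
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)           # guard against log(0)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))
```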

These few guidelines regarding the treatment of missing data will add more flavors to statistical models and analytics in general. In turn, proper handling of missing data will prolong the predictive power of models, as well. Missing data have hidden meanings, but they are revealed only when they are treated properly. And we need to do that until the day we get to know everything about everything. Unless you are just happy with that answer of “42.”

Chicken or the Egg? Data or Analytics?

I just saw an online discussion about the role of a chief data officer, whether it should be more about data or analytics. My initial response to that question is “neither.” A chief data officer must represent the business first. And I had the same answer when such a title didn’t even exist and CTOs or other types of executives covered that role in data-rich environments. As soon as an executive with a seemingly technical title starts representing the technology, that business is doomed. (Unless, of course, the business itself is about having fun with the technology. How nice!)

Nonetheless, if I really have to pick just one out of the two choices, I would definitely pick the analytics over data, as that is the key to providing answers to business questions. Data and databases must be supporting that critical role of analytics, not the other way around. Unfortunately, many organizations are completely backward about it, where analysts are confined within the limitations of database structures and affiliated technologies, and the business owners and decision-makers are dictated to by the analysts and analytical tool sets. It should be the business first, then the analytics. And all databases—especially marketing databases—should be optimized for analytical activities.

In my previous columns, I talked about the importance of marketing databases and statistical modeling in the age of Big Data; not all repositories of information are necessarily marketing databases, and statistical modeling is the best way to harness marketing answers out of mounds of accumulated data. That begs the next question: Is your marketing database model-ready?

When I talk about the benefits of statistical modeling in data-rich environments (refer to my previous column titled “Why Model?”), I often encounter folks who list reasons why they do not employ modeling as part of their normal marketing activities. If I may share a few examples here:

  • Target universe is too small: Depending on the industry, the prospect universe and customer base are sometimes very small in size, so one may decide to engage everyone in the target group. But do you know what to offer to each of your prospects? Customized offers should be based on some serious analytics.
  • Predictive data not available: This may have been true years back, but not in this day and age. Either there is a major failure in data collection, or collected data are too unstructured to yield any meaningful answers. Aren’t we living in the age of Big Data? Surely we should all dig deeper.
  • 1-to-1 marketing channels not in plan: As I repeatedly said in my previous columns, “every” channel is, or soon will be, a 1-to-1 channel. Every audience is secretly screaming, “Entertain us!” And customized customer engagement efforts should be based on modeling, segmentation and profiling.
  • Budget doesn’t allow modeling: If the budget is too tight, a marketer may opt for a software solution instead of hiring a team of statisticians. Remember that cookie-cutter models out of software packages are still better than someone’s intuitive selection rules (i.e., someone’s “gut” feeling).
  • The whole modeling process is just too painful: Hmm, I hear you. The whole process could be long and difficult. Now, why do you think it is so painful?

Like a good doctor, a consultant should be able to identify root causes based on pain points. So let’s hear some complaints:

  • It is not easy to find “best” customers for targeting
  • Modelers are fixing data all the time
  • Models end up relying on a few popular variables, anyway
  • Analysts are asking for more data all the time
  • It takes too long to develop and implement models
  • There are serious inconsistencies when models are applied to the database
  • Results are disappointing
  • Etc., etc…

I often get called in when model-based marketing efforts yield disappointing results. More often than not, the opening statement in such meetings is “The model did not work.” Really? What is interesting is that in more than nine out of 10 such cases, the models are the only elements that seem to have been done properly. Everything else—from pre-modeling steps, such as data hygiene, conversion, categorization and summarization; to post-modeling steps, such as score application and validation—often turns out to be the root cause of all the troubles, resulting in the pain points listed here.

When I speak at marketing conferences about this subject of the “model-ready” environment, I always ask if there are statisticians and analysts in the audience. Then I ask what percentage of their time goes into non-statistical activities, such as data preparation and remedying data errors. The absolute majority of them say they spend 80 percent to 90 percent of their time fixing the data, devoting the rest to the model development work. You don’t need me to tell you that something is terribly wrong with this picture. And I am pretty sure that none of those analysts got their PhDs and master’s degrees in statistics to spend most of their waking hours fixing data. Yeah, I know from experience that, in this data business, the last guy who happens to touch the dataset always ends up being responsible for all the errors made to the file thus far, but still. No wonder it is so often said that one of the key elements of being a successful data scientist is programming skill.

When you provide datasets filled with unstructured, incomplete and/or missing data, diligent analysts will devote their time to remedying the situation and making the best out of what they have received. I myself often tell newcomers that analytics is really about making the best of what you’ve got. The trouble is that such data preparation work calls for a different set of skills that have nothing to do with statistics or analytics, and most analysts are not that great at programming, nor are they trained for it.

Even if they were able to create a set of sensible variables to play with, here comes the bigger trouble: What they have just fixed is only a “sample” of the database, while the models must be applied to the whole thing later. Modern databases often contain hundreds of millions of records, and no analyst in his or her right mind uses the whole base to develop any models. Even if the sample is as large as a few million records (overkill, for sure), that would hardly be the entire picture. The real trouble is that no model is useful unless the resultant model scores are available on every record in the database. It is one thing to fix a sample of a few hundred thousand records. Now try to apply that model algorithm to 200 million entries. You see all those interesting variables that analysts created and fixed in the sample universe? All of that must be redone in the real database with hundreds of millions of lines.

Sure, it is not impossible to include all the instructions of variable conversion, reformat, edit and summarization in the model-scoring program. But such a practice is the No. 1 cause of errors, inconsistencies and serious delays. Yes, it is not impossible to steer a car with your knees while texting with your hands, but I wouldn’t call that the best practice.

That is why marketing databases must be model-ready, where sampling and scoring become a routine with minimal data transformation. When I design a marketing database, I always put the analysts on top of the user list. Sure, non-statistical types will still be able to run queries and reports out of it, but those activities should be secondary as they are lower-level functions (i.e., simpler and easier) compared to being “model-ready.”

Here is a list of prerequisites for being model-ready (which will be explained in detail in my future columns):

  • All tables linked or merged properly and consistently
  • Data summarized to consistent levels such as individuals, households, email entries or products (depending on the ranking priority by the users)
  • All numeric fields standardized, where missing data and zero values are separated
  • All categorical data edited and categorized according to preset business rules
  • Missing data imputed by a standardized set of rules
  • All external data variables appended properly

Basically, the whole database should be as pristine as the sample datasets that analysts play with. That way, sampling should take only a few seconds, and applying the resultant model algorithms to the whole base becomes simply the computer’s job, not some nerve-wracking, nail-biting, all-night babysitting suspense for every update cycle.
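
To illustrate the idea (the specific rules below are placeholders, not anyone’s actual production rules), the cleansing logic can live in the update process itself, so every sample and every scoring run starts from identical, already-cleansed fields:

```python
import pandas as pd

def make_model_ready(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the organization's preset rules once, at database update time."""
    df = raw.copy()

    # Counter-like numerics: zero is a real zero, so fill it in
    df["total_transactions"] = df["total_transactions"].fillna(0)

    # Other numerics: impute by the preset rule, but keep an indicator
    df["income_missing"] = df["income"].isna()
    df["income"] = df["income"].fillna(df["income"].median())  # illustrative rule

    # Categoricals: edit into preset categories, keep "missing" distinct
    df["state"] = df["state"].str.upper().fillna("UNKNOWN")
    return df

# Run once per update cycle; analysts then sample from the result, and
# scoring jobs read the very same fields, with no per-project re-fixing.
```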

In my co-op database days, we designed and implemented the core database with this model-ready philosophy, where all samples were presented to the analysts on silver platters, with absolutely no need to fix the data any further. Analysts devoted their time to pondering target definitions and statistical methodologies. This way, each analyst was able to build about eight to 10 “custom” models—not cookie-cutter models—per “day,” and all models were applied to the entire database of more than 200 million individuals at the end of each day (I hear that they are even more efficient these days). Now, for the folks who are accustomed to a 30-day model implementation cycle (I’ve seen cycles as long as six months), this may sound like total science fiction. And I am not even saying that all companies need to build and implement that many models every day, as that would hardly be a core business for them, anyway.

In any case, this type of practice has been in use since way before the words “Big Data” were even uttered by anyone, and I would say that such discipline is required even more desperately now. Everyone is screaming for immediate answers to their questions, and the questions should be answered in the form of model scores, which are the most effective and concise summations of all available data. This so-called “in-database” modeling and scoring practice starts with a “model-ready” database structure. In the upcoming issues, I will share the detailed ways to get there.

So, here is the answer for the chicken-or-the-egg question. It is the business posing the questions first and foremost, then the analytics providing answers to those questions, where databases are optimized to support such analytical activities including predictive modeling. For the chicken example, with the ultimate goal of all living creatures being procreation of their species, I’d say eggs are just a means to that end. Therefore, for a business-minded chicken, yeah, definitely the chicken before the egg. Not that I’ve seen too many logical chickens.

A New Year THINKABOUT!

Happy January—the month of all sorts of resolution making! It’s hard to resist the desire to start anew with a clean slate each year. Something in us likes that blank blackboard/screen feel and the “do-overness” ability that comes with a turn or click of the calendar. But whether or not the act of resolution making resonates with you, I do advocate the practice of taking a pause for a New Year ThinkAbout with your brand leaders to reflect together on two powerful verbs. Ask yourselves these questions:

  1. How well did you WOO and WOW your customers last year?
  2. What are your plans to live out these verbs in a fresh and meaningful way this year?

WOO and WOW: Six letters with all sorts of magnificent brand potential. Short and simple little verbs that can easily get lost in the day-to-day shuffle of omnichannel strategy creation, personnel issues, financial plan execution and competitive activities springing up all around you. But these two verbs should be at the forefront of your best brand thinking. Here’s why:

• Wooing is a full-time, year-round, relationship-building branding activity. When brands forget to woo (that is, continually win over) both potential new customers and, of course, existing ones throughout all their touchpoint interactions, those customers can turn elsewhere. When customers feel their business (and time and attention and wallets!) is taken for granted, unappreciated or even assumed, they can start to slip away. You may not even notice at first … it may be subtle: one less purchase from you, one extra month between transactions.

• Wowing is a full-time, year-round, relationship-building branding activity. When brands fail to keep pace with their customers’ needs, when they keep doing more of the same, when they don’t stay a step ahead of their competitors or disrupt their own successes, they stop wowing customers. Customers get bored, fatigued and, even worse, distracted by those competitive brands that are indeed wowing.

So, who is your Chief Wooing Officer? Who is your Chief Wowing Officer? What’s their action plan for 2014? Better yet, why not have a ThinkAbout and make wooing and wowing a full-time, company-wide, all-brand-ambassadors’ initiative this year?

7 Steps to a Better B-to-B Landing Page

Despite years of practice with digital campaigns, B-to-B marketers still have trouble getting their landing pages to work as hard as they could. I am not sure why, since there’s nothing more important to capturing the responses from outbound messages and kicking off a relationship with prospects. You could say the landing page is where your campaign pays off. But I am still seeing obvious errors. So herewith I offer a seven-point checklist of landing page best practices. And I invite readers to add some of their own recommendations.

1. Connect the landing page directly to the outbound message. When respondents click through to the landing page, they should experience a seamless flow from one to the other. The outbound message—whether an SEM ad, an email, a direct mail piece or even a print ad—should act like the teaser, motivating the recipient to click or type in the landing page URL. The role of the landing page is to close the deal, the same way a salesperson asks for the order. So the two formats should act as one, working together to move the prospect along. If they are disjointed—whether through design or copy inconsistency—the momentum is lost.

2. Create a fresh landing page for each variable in your campaign. OK, I know this means work. But the effort that goes into the outbound message should be equaled or exceeded when crafting the response vehicle. If you are doing an A/B test on your creative or your offer, you need two landing pages. Plan for it.

3. Mobile-enable your landing page. No excuses. The dramatic rise in tablet and smartphone use cannot be ignored. As any direct marketer will tell you: Don’t get in the way. If you put up any obstacles, your response rate will inevitably be lower. A landing page that is engineered for ease of use on mobile devices is no longer a nice-to-have; it’s a must.

4. Prepopulate the form where possible. If your outbound message includes digital information about the respondents, don’t make them retype their data.

5. Ask for the minimal amount of information you need to take the next step in the relationship. The more elements you require, the lower your response rate. So ask yourself, “How will asking for this piece of information change the way I deal with the inquiry?” If the answer is, “It won’t,” then hold that query for a later stage in the relationship.

6. Develop a culture of constant testing. Any response vehicle benefits from continuous improvement. Your landing page is the perfect place to test copy, offer, layout and other variables, like the number of data elements you ask for. Do it, don’t duck it.

7. Follow landing page design best practices. HubSpot offers some excellent tips in this area. Remember that the purpose of a landing page is to drive an action. So everything you do (the copy, the offer, the layout, the graphics) must focus on that end.

I welcome your ideas on how to improve landing page results.

A version of this post appeared in Biznology, the digital marketing blog.

My 9 Insider Tips to Build Your Email List For Low or No Cost!

Whether you’re an entrepreneur, corporation or online publisher, the power of the lead is critical in growing your business … and your email list. Leads, also known as prospects, are typically the entry level point of the sales funnel.

A popular business model among online publishers is to bring in leads at the “free” level (i.e., a report, e-newsletter, webinar, white paper, etc.), add those names to their house list and then, typically over the course of 30 to 90 days (the bonding time), convert those leads into paying customers. This practice is known as lead generation, name collection or list building.

Today, I’m going to share with you some proven online marketing methods I’ve used and had great success with at some of the top publishers in America. And bonus … many of these tactics are low- or no-cost. Here’s my list, in no particular order:

Power eAcquisition Polls. In my last blog post, I wrote about using polls for lead generation. Incorporating a poll on your website or placing one on another site is a great way to build your list. It’s important to spend time thinking about your poll question—something that is a hot topic, controversial and relevant to the locations where you’re placing your poll. You want to pull people in with your headline and make the poll entertaining. Your answers should be multiple choice and include an “other” field, which encourages participants to engage with your question. I’ve found this “other” field to be a fantastic way to make the poll interactive. Many people are passionate about certain subjects and won’t mind giving you their two cents. Then, to show appreciation for taking the poll, tell participants they are getting a bonus report and a free e-newsletter subscription (which they can opt out of at any time). And of course, make sure to mention—and link to—your privacy/anti-spam policy. After you kick off your list-building efforts, make sure you start tracking them so you can quantify the time and resources spent. This involves working with your webmaster to set up tracking URLs specific to each website you’re advertising on. It also means looking at Google Analytics for your website and corresponding landing pages to see traffic and referring page sources.
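
For those tracking URLs, one widely used convention is Google Analytics’ UTM parameters; here is a small sketch that builds one URL per placement (the site and campaign names are made up):

```python
from urllib.parse import urlencode

def tracking_url(base_url: str, source: str, medium: str, campaign: str) -> str:
    """Append Google Analytics UTM parameters so each poll placement
    shows up as its own referral source in your reports."""
    params = urlencode({
        "utm_source": source,        # which site hosts the poll
        "utm_medium": medium,        # e.g., "poll"
        "utm_campaign": campaign,    # name of this list-building effort
    })
    return f"{base_url}?{params}"

print(tracking_url("https://example.com/free-report",
                   "partner-site-a", "poll", "list-build-poll"))
# https://example.com/free-report?utm_source=partner-site-a&utm_medium=poll&utm_campaign=list-build-poll
```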

Teleseminars or Webinars. This is a great way to collect qualified names. Promote a free, relevant and value-oriented teleseminar or webinar to targeted prospects. You can promote it through several organic (free) tactics, such as LinkedIn Groups/Events, Facebook Events, Twitter, online press releases and affiliate marketing/joint ventures. Remember, this is for lead generation, not bonding. So your goal is to cast a wide net outside of your existing list, create visibility and get new names. Your value proposition should be actionable, relevant information that your target audience would find useful and worth giving their email address for. The trick is to promote the event in as many places as possible without incurring advertising costs; then your only costs may be the setup of the conference call (multiple lines, 800#) or webinar platform. And, in case you were wondering, I have been involved with teleseminars that used non-toll-free numbers, and response rates were not greatly impacted.

Co-registration. Co-Reg is another way to collect names, but it involves a nominal fee. Co-Reg is when you place a small ad on another publisher’s site after some sort of transaction (whether a sales or lead-gen offer). So, for instance, after someone signs up to the AOL Travel eNewsletter, a Thank You page comes up with a list of sponsors the reader may find interesting as well—other free e-newsletter offers. The text ad is usually accompanied by a small graphic image representing the sponsor. The key here is to pick publishers and Co-Reg placements that are synergistic to your own publication and offer. Another important note: follow up with these names quickly, because they forget who you are and go cold quite fast. I suggest a dedicated autoresponder series for bonding and monetization (a minimal sketch follows). Co-Reg efforts can cost you around $1 to $3 per valid email address.
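To illustrate that dedicated autoresponder series, here is a minimal sketch; the day offsets and subject lines are assumptions you would replace with your own bonding and monetization sequence, and send_email() is a hypothetical stand-in for your email service provider’s API.

    # Minimal sketch of a co-reg follow-up (autoresponder) schedule.
    # Day offsets and subjects are illustrative assumptions only;
    # send_email() stands in for a real ESP call.
    from datetime import date, timedelta

    DRIP_SCHEDULE = [
        (0, "Welcome! Here's your free report"),
        (2, "Did you see this? (bonding content)"),
        (7, "Our most popular article this week"),
        (14, "A special offer for new subscribers"),
    ]

    def send_email(address, subject):
        print("queue -> " + address + ": " + subject)  # hypothetical ESP call

    def schedule_followups(address, signup):
        for offset, subject in DRIP_SCHEDULE:
            send_date = signup + timedelta(days=offset)
            # A real system would persist these sends; we just print the plan.
            print(send_date, end=": ")
            send_email(address, subject)

    schedule_followups("new.lead@example.com", date.today())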

Frienemy Marketing. This includes JVs (joint ventures), affiliate marketing, guest editorials, editorial contributions and reciprocal ad swaps (for lead generation or revenue sharing). This tactic is extremely effective and cost-efficient. The key here is having some kind of leverage, then approaching publishers who may want your content or a cross-marketing opportunity to your current list (note: This only works if you have a list of decent size that another publisher will find attractive). In exchange for content or revenue-share efforts, you and the other publisher agree to reciprocate either e-news ads or solo emails to each other’s lists, thereby sending a message to a targeted, relevant list for free. Well, if you agree on a rev share, it’s free as far as ad costs, but you are giving that publisher a split of your net revenues.

SONAR Marketing. I’ve written about this many times, but can’t stress it enough. Content is king, and you can leverage it via what I call “SONAR.” It’s an organic (free) online strategy that works with the search engines. It’s a comprehensive method of repurposing, reusing, distributing and synchronizing the release of relevant, original content (whether text, audio or video) to targeted online channels based on your audience. SONAR represents the following online distribution platforms:

  • S: Syndicate partners, content syndication networks and user-generated content sites
  • O: Online press releases
  • N: Network (social) communities
  • A: Article directories
  • R: Relevant posts to blogs, forums and bulletin boards.

SONAR works hand-in-hand with your existing search engine marketing (SEM), social media marketing (SMM) and search engine optimization (SEO) tactics.

Search Engine Marketing. It’s a shame more marketers don’t see the value of SEO or SEM. In order to drive as much organic traffic as possible to your website, you need to make sure your site is optimized for the correct keywords and your target audience. Once you optimize your site with title tags, meta descriptions, meta keywords and relevant, keyword-dense content, you need to make sure you have revised your site to harness the traffic that will be coming. That means adding eye-catching email collection boxes to your home page (and keeping them static on all your subpages), relevant banners and obvious links to e-comm webpages. You don’t want to miss a single opportunity to turn traffic into leads or sales. (A quick spot-check sketch follows.)
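As a quick spot check of those on-page basics, here is a minimal sketch using the third-party requests and beautifulsoup4 packages; the URL is a hypothetical placeholder, and these three checks are illustrative, not a full SEO audit.

    # Minimal on-page spot check: title tag, meta description, meta keywords.
    # Requires the third-party requests and beautifulsoup4 packages.
    # The URL is a hypothetical placeholder.
    import requests
    from bs4 import BeautifulSoup

    url = "https://www.example.com/"  # hypothetical
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    title = soup.title.string.strip() if soup.title and soup.title.string else None
    description = soup.find("meta", attrs={"name": "description"})
    keywords = soup.find("meta", attrs={"name": "keywords"})

    print("Title:", title or "MISSING")
    print("Meta description:", description.get("content") if description else "MISSING")
    print("Meta keywords:", keywords.get("content") if keywords else "MISSING")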

Smart Media Buying. To complement your free online efforts, you may want to consider targeted, low-cost media buys (paid online advertising) in the form of text ads, banner ads, blog ads or list rentals (i.e., e-news sponsorships or solo emails). You’re paying for the placement in these locations, so you must make sure you have strong promotional copy and offers for the best results possible. High-traffic blogs are a high-performing, low-cost way to test new creatives. I like the BlogAds.com network, where you can buy placements a la carte and search by genre.

Pay Per Click (PPC). Many people try pay per click only to spend thousands of dollars with little to show for it. Creating a successful PPC campaign is an art—one that I’ve had success with. You must make sure you have a strong text ad and landing page and that the ad is keyword-dense. You must also have a compelling offer, and make sure you do your keyword research. Picking the correct keywords that coincide with your actual ad and landing page is crucial. You don’t want to pick keywords that are too vague, too competitive or unpopular. You also need to be active with your campaign management, which includes bid amounts and daily budget. All these things—bid, budget, keywords, popularity and placement—will determine the success of the campaign. And most campaigns are trial and error, taking anywhere from three to six weeks to optimize. (The break-even math is sketched below.)
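To make the bid-and-budget arithmetic concrete, here is a minimal sketch of the break-even math behind a PPC campaign; every figure below is hypothetical.

    # Minimal sketch of break-even PPC math. All figures are hypothetical.
    revenue_per_sale = 49.00   # what one conversion is worth to you
    conversion_rate = 0.02     # 2% of landing page clicks convert
    max_cpc = revenue_per_sale * conversion_rate  # break-even cost per click

    daily_budget = 25.00
    clicks_per_day = daily_budget / max_cpc

    print("Break-even max bid: $%.2f per click" % max_cpc)
    print("About %.0f clicks/day at that bid on a $%.2f daily budget"
          % (clicks_per_day, daily_budget))
    # Bidding below max_cpc leaves margin; bidding above it loses money even
    # when the keywords, ad copy and landing page are all performing.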

Viral Marketing. Make sure you have a “forward to friend” feature in your e-newsletter to encourage viral marketing. It’s also important to have a content syndication blurb in your newsletter; this encourages other websites, publishers, editors and bloggers to republish and share your content, as long as they give you author attribution and a backlink to your site (which helps in SEM).

The following, in my personal experience, doesn’t work for quality list building …

Sweepstakes and Giveaways. You’ve seen the offers: Win a free TV, iPhone or similar in exchange for your email address. This gets the volume, but the leads are usually poor quality or unqualified (irrelevant). The numbers may look good on the front end, but when you dig deeper, your list is likely compromised with deliverability issues (high bounce rates), inactives and bad emails. This is because the leads are not targeted and the offer isn’t synergistic with the company. With lead generation efforts, it should be quality over quantity.

Email appends. According to Wikipedia, email appending, also known as e-appending, is a marketing practice that involves taking known customer data (first name, last name and postal address) and matching it against a vendor’s database to obtain email addresses. The purpose is to grow one’s email subscriber list with the intent of sending customers information via email instead of through traditional direct “snail” mail. The problem with this, in my direct experience, is that on the front end your list initially grows, but these names are not typically qualified or interested. At one company where I worked, we tracked a group of email-append cohorts over the course of a year to see what percentage would “convert” to paying customers. Nearly 75 percent of the names dropped off the file during that year and never converted. Email appending is a controversial tactic, with critics claiming that sending email to people who never explicitly opted in is against best practices. In my opinion, it’s a waste of time and money.

Mythbusters: Digital, Mail and Green Marketing Payback

In this week’s “Marketing Sustainability,” I’ve invited the newly named chair of the Direct Marketing Association Committee on the Environment and Social Responsibility—Adam Freedgood of New York-based Quadriga Art—to share with readers a “myths v. facts” discussion on sustainability and marketing, presented recently at the DMA2012 conference in Las Vegas, NV. —Chet Dalzell

The “Mythbusters” of Discovery Channel’s hit show get to blow things up while putting myths to the test of science. At the Direct Marketing Association’s annual marketing conference, I paid tribute to personal heroes Jamie and Adam (the real TV Mythbusters) by blowing up some green marketing myths that have infiltrated both consumer and agency attitudes toward sustainable marketing practice. If left unchecked, today’s common green myths can sacrifice campaign integrity, leave profitable sustainability solutions untapped, alienate consumers and contribute to environmental harm. A 30-minute town square session called “Mythbusters: Green Marketing Edition” debunked and discussed a dozen print, digital and multichannel myths, resulting in new opportunities to drive profitability from more sustainable campaign execution.

The troubling truth about green marketing myths is that they appeal to our aspirations and can quickly become ingrained in business practice. For example, “going green costs more,” “digital is greener than print,” “you can save a tree by not printing this article,” and “storing your data in the cloud means fluffy white beams of clean energy will power your campaign data storage, forever.”

Marketing missteps can grant mythological status to simple misconceptions virtually overnight. Consider the classic “go green, go paperless.” This little beauty appeared out of nowhere and now graces billing statements everywhere. There is no quantifiable environmental benefit attached to the claim, which creates risk to brand integrity. Unsupported green claims violate the Federal Trade Commission’s “Green Guides” enacted earlier this year. The “go paperless” phrase subordinates marketing best practice, opting instead for a greedy grab at the small subset of consumers who attach significant value to a brand’s environmental attributes. A direct response mechanism that acknowledges basic consumer preferences would do just fine.

The evolution of product stewardship regulation, rising resource costs and consumer preferences support the business case for infusing sustainability in all aspects of marketing best practice. The full myth-busting presentation is a Jeopardy-style game board rendered interactively in PowerPoint, available to download here.

Here are a few green marketing myths we debunked that offer urgent, profitable insights for print, digital and multichannel marketers:

Myth 1: “Delivering products and services online, or in the cloud, represents a shift toward environmentally friendly communications, compared with print-based media.”

Reality: This myth is busted. Digital communications shift the tangible environmental impact of marketing campaigns away from the apparent resource requirements associated with paper, transport and end-of-life impacts of print campaigns. By way of fossil fuel-powered data centers that are largely out of sight and out of mind, digital carries a surprising set of environmental hazards. A September 2012 New York Times article highlights the growing connection between data centers and air pollution due to massive energy requirements and dirty fossil-based power inputs. The digital devices used to create and deliver online content to consumers contain toxic heavy metals and petroleum-based plastics. Electronic devices are too toxic for our landfills but are recycled at an abysmal rate. According to the Electronics Takeback Coalition, the U.S. generates more than 3 million tons of “e-waste” annually but recycles only 15 percent.

Myth 2: The United States Postal Service (USPS) has struggled to implement comprehensive sustainability strategies due to declining mail volume and the related shortage of revenue available to invest in green activities.

Reality: Myth busted. The USPS is a prime example of an organization that has embraced the business case for sustainability by making extensive investments in greening most aspects of the organization’s operations. USPS has applied a “triple bottom line” approach to sustainability—the perspective that investments in green business must perform on dimensions of profitability, environmental sustainability and stakeholder impacts. Through postal facility energy efficiency retrofits and attention to sustainability at all levels of operations, USPS has saved $400 million since 2007, according to its sustainability report. Through some 400 employee green teams, USPS employs a bottom-up approach to sustainability that produces substantial cost and energy savings.

Myth 3: Green initiatives have a long, three- to five-year payback period, placing them at odds with other organizational priorities, such as investments in fast-paced digital marketing infrastructure.

Reality: Myth busted. While some sustainability measures, such as building energy efficiency retrofits, carry a payback period of several years depending on financing and incentives, there are innovative approaches to sustainability for direct marketers that yield much faster financial gains. For example, performing a packaging design audit that identifies downsized product packages and renewable materials can produce immediate savings while dramatically reducing environmental impact. Consolidating IT infrastructure and applying best practices in data center efficiency and server virtualization produces fast financial returns for firms operating in-house data centers. Lastly, innovative programs that engage customers and suppliers in sustainability also produce quick gains with minimal investment. Starbucks’s “beta cup” competition mobilized a global audience of packaging designers, students and inventors in search of more sustainable coffee cups. The design submissions confronted a key sustainability issue head-on, allowing the chain to engage stakeholders in the solution.

Adam Freedgood is a sustainable business strategy specialist and director of business development at global nonprofit direct marketing firm Quadriga Art in New York City. Reach him on Twitter @thegreenophobe or email adam@freedgood.com.

Backlink Pruning: A Staple ‘Best’ Practice, Especially in Penguin’s Aftermath

Many direct marketers are familiar with the practice of list hygiene. In a nutshell, it’s going through your email file, looking at inactive, duplicate or bad emails, and removing them or “purging them” from your list.

Having a “clean” list means it’s more relevant and responsive.

The same holds true for backlinks … especially in light of recent Google algorithm updates like last year’s Farmer/Panda and this year’s Penguin, which penalize websites for low-quality, irrelevant content and backlinks.

It’s always a best practice, from a search engine hygiene standpoint, to monitor and “prune” your backlinks to make sure you don’t have spammy or irrelevant websites linking back to you.

And now more than ever, with Google’s latest update, it’s prudent to check your own website’s backlinks to ensure those who are linking to you are relevant and synergistic to your own site’s content.

Here’s what you need to know (and do!):

First, check out some of the free online tools that do this, known as “backlink checkers.” There are many out there: simply type a search for “free backlink checker tool” and see which one appeals best to you.

Second, after you plug your website’s URL into the backlink checker tool, go down the results list and see who’s linking back to you. Note: This is a laborious process, but well worth it, especially if you’ve noticed your traffic and SERP placement drop recently and suspect that Penguin is to blame.

Next, identify the sites that appear to be irrelevant and unrelated to your website—a site in a totally different industry or one that is blatantly spam. Then it’s simply a manual process of visiting each bad backlink’s website and contacting the site to remove the link going to yours. (A first-pass triage script is sketched below.)
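If your backlink checker lets you export its results (many do, typically as a CSV file), a short script can make the first pass through the list for you. This is a minimal sketch under stated assumptions: the file name, the source_url column and the relevant-keyword list are all placeholders you would adapt, and only a human review can make the final call on relevance.

    # Minimal sketch: first-pass triage of an exported backlink list.
    # Assumes a CSV export with a "source_url" column; the file name and
    # keyword list are hypothetical and should be adapted to your niche.
    import csv
    from urllib.parse import urlparse

    RELEVANT_KEYWORDS = ["marketing", "publishing", "newsletter"]  # your topics

    def looks_relevant(domain):
        return any(word in domain for word in RELEVANT_KEYWORDS)

    with open("backlinks_export.csv", newline="") as f:
        for row in csv.DictReader(f):
            domain = urlparse(row["source_url"]).netloc.lower()
            if not looks_relevant(domain):
                # Flag for manual review; a keyword match is only a heuristic.
                print("REVIEW:", domain, row["source_url"])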

If you happen to find dozens of irrelevant and potentially harmful websites, for the sake of time management it’s best to create one form letter and send it to each site, asking it to remove its backlink to your site in an effort to avoid or recover from a Google penalty.

List the specifics about the irrelevant URL, such as where it can be found (its entire URL), where it links to (which page on your site) and any anchor text. Your goal is to give the other website as much useful information as possible so they can easily find the link and remove it from their site.
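Here is a minimal sketch of generating that form letter from a link’s specifics; the template wording and the example link data are placeholders for your own.

    # Minimal sketch: fill a link-removal form letter from a link's specifics.
    # The template wording and the example data below are placeholders.
    from string import Template

    LETTER = Template("""Hello,

    While auditing our site's backlinks, we found a link to us on your page:
      Page:        $page_url
      Links to:    $target_url
      Anchor text: $anchor_text

    We would appreciate it if you could remove this link. Thank you!
    """)

    link = {
        "page_url": "https://irrelevant-site.example/some-page",  # where the link lives
        "target_url": "https://www.example.com/article",          # your page it points to
        "anchor_text": "cheap widgets",
    }
    print(LETTER.substitute(link))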

Sometimes, it’s easy to find contact information for the irrelevant backlink’s website owner. You simply visit the corresponding website link and search their site for contact information or a “Contact Us” page.

Other times it’s a bit harder, and you may need to do a bit of sleuthing and use some additional free tools to help determine the website’s owner. Such tools are:

  • Domaintools.com: If you want to find out who owns the site your link is on,
    visit Domain Tools or type “whois.sc” in front of a URL.
  • C-Class Checker: If you have a list of all the links you want to get rid of,
    you can run them through a bulk C-class checker to see how many of them
    are on the same C-class.
  • SpyonWeb: If you only have one URL to work with, this tool lets you find out
    what other domains it is associated with. Just put in a website URL, IP
    address or even the Google Analytics or AdSense code, and you can find all
    of the websites that are connected to it.

Keep a record of all efforts to contact “bad link” site owners, as it will show Google you’ve been making a good-faith effort to get rid of these irrelevant links.

If you received notification from Google or found that the Panda or Penguin updates have affected your website’s rank and SERP visibility and believe there may have been an error of some sort, there is some recourse …

Google has a quick and easy form you can fill out to pinpoint search terms that you believe you shouldn’t be penalized for.

Good luck!

Myths and Misconceptions: The Real Truth About Content Marketing and the Search Engines: Part II

[Editor’s note: This is Part Two of a two-part series.]

Lately, I’ve been hearing a lot of people saying things such as: “Google doesn’t like content or article marketing since they changed their algorithms” and “article directories are not useful for search engine marketing and link-building efforts anymore.”

I like to remind people of a few fundamental rules of online marketing, specifically involving content, that virtually never change and are extremely helpful to know (and do!) … Previously, I provided the first three rules; here are the last three:

4. Targeted Link-Building. Links, whether one-way or reciprocal, are still links. Quality links help SEO, and that is indisputable. But, again, there are ground rules for doing it right within best practices … and for doing it wrong. Links should be quality links, and by that I mean on sites that have relevant content and a synergistic audience to your own. It should also be a site with a good traffic rank. I prefer to do link building manually and strategically. I research sites that are synergistic in all ways to the site I’m working with (whether for one-way or reciprocal links). Doing it manually allows more targeted selection and control over where you want your links to go. Manual selection and distribution can also lead to other opportunities down the road with those sites you’re building relationships with, including cross-marketing or editorial efforts such as editorial contributions, revenue shares and more. In my view, this approach is both link building and relationship building.

5. Location, Location, Location. Where you link to is important. When doing SONAR or content marketing, I always tell clients to deep link—that is, not just link to their home page—which, to me, doesn’t make sense anyway, as there are too many distractions on a home page. Readers need a simple, direct call to action. Keep them focused. It’s always smarter to link to your source article, which should be on one of your subpages, such as the newsletter archive page or press release page. Now you have a connection. The article/content excerpt you pushed out appears in the SERPs (search engine result pages) and links to the full version on your archive or press page. You’ve satisfied the searcher’s expectations by not doing a “bait and switch.” There’s relevance and continuity. And to help monetize that traffic, the newsletter archive or press Web page you’re driving it to should contain fixed elements to “harness” the traffic for list growth and cross-selling, such as fixed lead-gen boxes, text ads, banner ads, editorial notes and more. These elements should blend with your overall format: not too obnoxious, but easily seen.

6. Catalyst Content. It’s always important to make sure you publish your content on your own website first … I call this your “catalyst content.” This is the driving source from which all other inbound marketing flows and around which it is focused. Your website articles should be dated and formatted like a news feed or blog. Also, posting timely press releases will work favorably, as they will be viewed by Google and human readers as the latest news (again, favorable to Google’s latest “freshness” update). At the same time, send your content out via email (i.e., an ezine) to your in-house list before external marketing channels see it. This helps from an SEO standpoint, but it also helps with credibility and bonding with your subscribers and regular website visitors, as they should get your information before the masses.

There you go. My best practices for marketing with content. I don’t practice or condone “black hat” marketing tactics. I’ve always been lucky enough to work for top publishers and clients who put out great, original content.

It really does all boil down to the quality of the content when you talk about any form of article and search engine marketing. Content is king, and when you have strong editorial, along with being a “creatively strategic” thinker, you don’t need to engage in “black hat” or questionable SEO/SEM.

Algorithms are always changing. It’s good to be aware of the latest news, trends and techniques, but also not to put all your eggs in one basket and build your entire online marketing strategy on the “current” algorithms. Using solid content, analyzing your website’s visitor and usage patterns, and keeping general best practices in mind are staple components that will always play an important role in content marketing.