The Question Is the Answer

Fans of the long-running TV show “Jeopardy!” know that contestants must state their answers in the form of a question. Having watched the show many times over the years, I am struck by how many domains of knowledge allow answers to be stated as questions.

This question and answer format has come to SEO as the featured snippet. These snippets, generated automatically by Google from the organic results, provide users quick answers to their questions. Sample questions that trigger a snippet are: “best chicken and dumplings recipe,” “what to wear to a funeral,” “how to remove a tick” and “when to use a semicolon.” The featured answer snippet includes a direct link to the source and shows up above any of the other organic results. For SEOs, this is new ground to capture.

To be the featured snippet is to achieve a rank 0, so to speak. Is there an advantage to attaining this? How is it accomplished?

Why Have These Featured Snippets Proliferated?

As users migrate to mobile devices with smaller screens, search is changing to meet their needs. Gone is the user sitting at a desktop plowing through link after link for information on “how to remove a tick.” Chances are, the searcher is out on a hike or walking across the lawn and realizes that one of these disease-carrying pests has latched onto their body. A quick search on an ever-present phone will yield accurate instructions for removal.

The rapid growth of voice-activated search through Siri, Alexa and Cortana has brought a more conversational tone to search. “Siri, find me the best chicken and dumplings recipe.” These devices will continue to improve, and so, too, must search. User behavior will demand it.

When Google first brought out the featured snippet, SEOs thought that it might be little more than a test or would only apply to certain types of information. It is not a test, and as “Jeopardy!” has shown us, a question and answer format can apply to many domains of information. Google has continued to expand the featured snippet with related snippets (headlined as — People also ask) that delve deeper into the topic at hand. Explore these, and you will find that layers and layers of instant information unspool before your eyes.

Is There an Advantage?

When the featured snippet first showed up on search pages, there were concerns that Google was seizing a site’s content, displaying it and removing the impetus for the user to come to the site. Experience has shown that the featured snippet provides an added impetus for the user to click through and get more information. It is as if the user has hit a rich vein of ore and wants to dig out more quality information. Sites that are featured enjoy strong traffic generated by the snippets.

How to Be Featured?

How to be featured is the challenge. This is one of the many places where content and SEO must come together. It is dreaming to expect a page with little chance of ranking, mired in Page Four or Five of the search results, to magically pop up in the featured snippets for a competitive keyword question. However, a quick review of top-ranking pages — Page One or so — will give you some idea as to where potential lies. The next step is to generate questions that might fit with the pages. If your pages were built for users to find information, this task should, in fact, come quite easily.

  • Why did you build it?
  • Who did you build it for?
  • When do you expect users to find it?
  • How will they use the page?
  • What benefit will they glean from it?

As you may have noted, each of the phrases above is in the form of a question. It is not hard to generate questions. Then, make sure that the question and its attendant answer are infused into your content and watch the results.

Benchmarking: There’s No Such Thing as an Average 2% Response Rate

It seems easy enough to answer the question: How do you know if a marketing campaign measures up?

Often enough, there are predefined business objectives, acceptable margins for profit and cost, and a marketing return on investment that is straightforward enough to calculate. If one knows all of these markers, then one can tell whether a marketing campaign, or even a single tactic, is making the grade.
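That ROI arithmetic really is straightforward. A minimal sketch, with hypothetical campaign figures standing in for real numbers:

```python
def marketing_roi(revenue, cost):
    """Return marketing ROI as a fraction: campaign-attributed profit over campaign cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (revenue - cost) / cost

# Hypothetical campaign: $120,000 in attributed revenue on $40,000 of spend.
roi = marketing_roi(120_000, 40_000)
print(f"ROI: {roi:.0%}")  # ROI: 200%
```

The hard part, as the rest of this post argues, is not the division; it is agreeing on which revenue and which costs belong in the numerator and denominator.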

But managing client expectations (whether they’re internal or external) is sometimes fuzzier. And a marketing execution doesn’t always go according to plan, prompting investigations into what might have gone wrong. (I’m still surprised at how underutilized testing is, for example.)

On the happier end of the spectrum, stellar results might prompt a whole other set of questions: “Did we really beat the long-standing control? This campaign performed gangbusters; how does it measure up to the efforts of our industry peers? Is this campaign award-worthy?”

As a public relations professional in the world of direct response, I’ve often been asked to help an agency or marketing client understand how good or bad a particular marketing result might be. When the question is about results that are less than expected, there is often internal wrangling about the creative, the list and/or the strategy — any of which might be the culprit. When the results are fantastic, clients often want to know whether they are beating whatever the competition may be up to.

In both scenarios, the go-to options include various industry research sources. Anyone who has a subscription to the Who’s Mailing What! archive (direct mail, email), or taps eMarketer or Econsultancy (digital and mobile information), or steps up to Gartner, Forrester and the like for subscriptions to qualitative reporting, certainly has access to great data and idea stores.

I personally keep a copy of the annually published “DMA Statistical Fact Book” and the “DMA Response Rate Report” close at hand. The 2015 edition of the “DMA Response Rate Report” was recently published and is available at the DMA Bookstore. Both are understandably Direct Marketing Association top-sellers.

The “DMA Response Rate Report” aggregates data from respondents, providing a true benchmarking resource. And it breaks response data out by media and by industry (selling cars is not selling clothes), which gives marketers a helpful guide to what to shoot for and expect. It’s worth a whole other post to delve into its insights, but IWCO Direct and SeQuel Response recently offered some. A quick inspection of the report can let marketers know what they might expect from an otherwise well-executed campaign.
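Once you have the report’s figure for your medium and industry, the comparison itself is simple arithmetic. A sketch (the response counts and the 2% benchmark below are hypothetical placeholders, not figures from the DMA report):

```python
def response_rate(responses, quantity_mailed):
    """Response rate as a fraction of pieces mailed."""
    return responses / quantity_mailed

def vs_benchmark(rate, benchmark):
    """How far above (+) or below (-) the benchmark a rate falls, in percentage points."""
    return (rate - benchmark) * 100

# Hypothetical: 1,150 responses from 50,000 pieces, against a 2.0% benchmark.
rate = response_rate(1_150, 50_000)
print(f"{rate:.1%} vs. benchmark: {vs_benchmark(rate, 0.02):+.1f} pts")
```

The point of the post’s title is that the benchmark argument should come from the report’s media-by-industry breakout, not from a folk-wisdom “average 2%.”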

And I’m happy to say to some clients, too, as another benchmark, that they should enter the International ECHO Awards. It’s perhaps the best way to be recognized for achievement (beyond the paycheck). With judges inspecting the world’s best in data-driven advertising, an ECHO trophy says that a marketing team, agency or organization knows its stuff. This year’s competition deadline for entering is July 10, and DMA is offering a Webinar on May 19 to give tips and insights from the judges themselves (speaking will be yours truly, joined by fellow Target Marketing blogger Carolyn Goodman of Goodman Marketing Partners and Smithsonian’s Karen Rice Gardiner). Have only five minutes to spare? You can always hear directly from Carolyn here about the entry process.

Enter early and often! I’d love to point to your campaign as a “benchmark” later this year.

How to Outsource Analytics

In this series, I have been emphasizing the importance of statistical modeling in almost every article. While there are plenty of benefits of using statistical models in a more traditional sense (refer to “Why Model?”), in the days when “too much” data is the main challenge, I would dare to say that the most important function of statistical models is that they summarize complex data into simple-to-use “scores.”

The next important feature would be that models fill in the gaps, transforming “unknowns” to “potentials.” You see, even in the age of ubiquitous data, no one will ever know everything about everybody. For instance, out of 100,000 people you have permission to contact, only a fraction will be “known” wine enthusiasts. With modeling, we can assign scores for “likelihood of being a wine enthusiast” to everyone in the base. Sure, models are not 100 percent accurate, but I’ll take “70 percent chance of afternoon shower” over not knowing the weather forecast for the day of the company picnic.
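Turning “unknowns” into “potentials” is exactly what a propensity model does. A minimal sketch with scikit-learn, where the features and the “wine enthusiast” flag are hypothetical stand-ins for real customer data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical training set: the small fraction of contacts whose
# wine-enthusiast status is actually known.
X_known = rng.normal(size=(500, 3))  # e.g. age band, income index, past purchases
y_known = (X_known[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X_known, y_known)

# Score everyone else: unknowns become "likelihood of being a wine enthusiast."
X_unknown = rng.normal(size=(100_000, 3))
scores = model.predict_proba(X_unknown)[:, 1]
print(scores[:3])  # probabilities between 0 and 1 for the first three contacts
```

The scores are not certainties, which is the author’s point: a 70 percent chance beats not knowing at all.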

I’ve already explained other benefits of modeling in detail earlier in this series, but if I may cut it really short, models will help marketers:

1. In deciding whom to engage, as they cannot afford to spam the world and annoy everyone who can read, and

2. In determining what to offer once they decide to engage someone, as consumers are savvier than ever and they will ignore and discard any irrelevant message, no matter how good it may look.

OK, then. I hope you are sold on this idea by now. The next question is, who is going to do all that mathematical work? In a country where jocks rule over geeks, it is clear to me that many folks are more afraid of mathematics than public speaking; which, in its own right, ranks higher than death in terms of the fear factor for many people. If I may paraphrase “Seinfeld,” many folks are figuratively more afraid of giving a eulogy than being in the coffin at a funeral. And thanks to a sub-par math education in the U.S. (and I am not joking about this, having graduated high school on foreign soil), yes, the fear of math tops them all. Scary, huh?

But that’s OK. This is a big world, and there are plenty of people who are really good at mathematics and statistics. That is why I purposefully never got into the mechanics of modeling techniques and related programming issues in this series. Instead, I have been emphasizing how to formulate questions, how to express business goals in a more logical fashion and where to invest to create analytics-ready environments. Then the next question is, “How will you find the right math geeks who can make all your dreams come true?”

If you have a plan to create an internal analytics team, there are a few things to consider before committing to that idea. Too many organizations just hire one or two statisticians, dump all the raw data onto them, and hope to God that they will somehow figure out ways to make money with data. Good luck with that idea, as:

1. I’ve seen so many failed attempts like that (actually, I’d be shocked if it actually worked), and

2. I am sure God doesn’t micromanage statistical units.

(Similarly, I am almost certain that she doesn’t care much for football or baseball scores of certain teams, either. You don’t think God cares more for the Red Sox than the Yankees, do ya?)

The first challenge is locating good candidates. If you post any online ad for “Statistical Analysts,” you will receive a few hundred resumes per day. But the hiring process is not that simple, as you should ask the right questions to figure out who is the real deal, and who is a poser (and there are many posers out there). Even among qualified candidates with ample statistical knowledge, there are differences between the “Doers” and “Vendor Managers.” Depending on your organizational goal, you must differentiate the two.

Then the next challenge is keeping the team intact. In general, mathematicians and statisticians are not solely motivated by money; they also want constant challenges. Like all smart and creative folks, they will simply pack up and leave if “they” determine that the job is boring. Just a couple of modeling projects a year with some rudimentary sets of data? Meh. Boring! Promises of upward mobility only work for a fraction of them, as the majority would rather deal with numbers and figures, showing no interest in managing other human beings. So, coming up with interesting and challenging projects that also benefit the whole organization becomes a job in itself. If there are not enough challenges, the smart ones will quit on you first. They also need constant mentoring, as even the smartest statisticians will not know everything about the challenges associated with marketing, target audiences and the business world in general. (If you stumble onto a statistician who is even remotely curious about how her salary is paid for, start with her.)

Further, you would need to invest in setting up an analytical environment as well. That includes software, hardware and other supporting staff. Toolsets are becoming much cheaper, but they are not exactly free yet. In fact, some famous statistical software, such as SAS, could be quite expensive year after year, although there are plenty of alternatives now. And they need an “analytics-ready” data environment, as I emphasized countless times in this series (refer to “Chicken or the Egg? Data or Analytics?” and “Marketing and IT; Cats and Dogs”). Such data preparation work is not for statisticians, and most of them are not even good at cleaning up dirty data, anyway. That means you will need different types of developers/programmers on the analytics team. I pointed out that analytical projects call for a cohesive team, not some super-duper analyst who can do it all (refer to “How to Be a Good Data Scientist”).

By now you would say “Jeez Louise, enough already,” as all this is just too much to manage to build just a few models. Suddenly, outsourcing may sound like a great idea. Then you would realize there are many things to consider when outsourcing analytical work.

First, where would you go? Everyone in the data industry and their cousins claim that they can take care of analytics. But in reality, it is a scary place where many who have “analytics” in their taglines do not even touch “predictive analytics.”

Analytics is a word that is abused as much as “Big Data,” so we really need to differentiate them. “Analytics” may mean:

  • Business Intelligence (BI) Reporting: This is mostly about the present, such as the display of key success metrics and dashboard reporting. While it is very important to know about the current state of business, much of so-called “analytics” unfortunately stops right here. Yes, it is good to have a dashboard in your car now, but do you know where you should be going?
  • Descriptive Analytics: This is about how the targets “look.” Common techniques such as profiling, segmentation and clustering fall under this category. These techniques are mainly for describing the target audience to enhance and optimize messages to them. But using these segments as a selection mechanism is not recommended, while many dare to do exactly that (more on this subject in future articles).
  • Predictive Modeling: This is about answering questions about the future. Who would be more likely to behave in certain ways? What communication channels will be most effective for whom? How much is the potential spending level of a prospect? Who is more likely to be a loyal and profitable customer? What are their preferences? Response models, various types of cloning models, value models, revenue models, attrition models, etc. all fall under this category, and they require hardcore statistical skills. Plus, as I emphasized earlier, these model scores compact large amounts of complex data into nice bite-size packages.
  • Optimization: This is mostly about budget allocation and attribution. Marketing agencies (or media buyers) generally deal with channel optimization and spending analysis, at times using econometrics models. This type of statistical work calls for different types of expertise, but many still insist on calling it simply “analytics.”
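The line between descriptive and predictive work shows up even in toy code. A clustering pass like the sketch below describes segments; it does not, by itself, predict who will respond (the data here is random and purely illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
customers = rng.normal(size=(1_000, 4))  # hypothetical behavioral features

# Descriptive: group lookalike customers into segments for messaging strategy.
segments = KMeans(n_clusters=5, n_init=10, random_state=1).fit_predict(customers)
print(np.bincount(segments))  # how many customers fall in each segment
```

Each customer gets a segment label, which is useful for tailoring the message, but a label is not a likelihood; selecting names by segment instead of by a predictive score is exactly the misuse warned about above.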

Let’s say that for the purpose of customer-level targeting and personalization, we decided to outsource the “predictive” modeling projects. What are our options?

We may consider:

  • Individual Consultants: In-house consultants are dedicated to your business for the duration of the contract, guaranteeing full access like an employee. But they are there for you only temporarily, with one foot out the door all the time. And when they do leave, all the knowledge walks away with them. Depending on the rate, the costs can add up.
  • Standalone Analytical Service Providers: Analytical work is all they do, so you get focused professionals with broad technical and institutional knowledge. Many of them are entrepreneurs, but that may work against you, as they could often be understaffed and stretched thin. They also tend to charge for every little step, with not many freebies. They are generally open to use any type of data, but the majority of them do not have secure sources of third-party data, which could be essential for certain types of analytics involving prospecting.
  • Database Service Providers: Almost all data compilers and brokers have statistical units, as they need to fill in the gap within their data assets with statistical techniques. (You didn’t think that they knew everyone’s income or age, did you?) For that reason, they have deep knowledge in all types of data, as well as in many industry verticals. They provide a one-stop shop environment with deep resource pools and a variety of data processing capabilities. However, they may not be as agile as smaller analytical shops, and analytics units may be tucked away somewhere within large and complex organizations. They also tend to emphasize the use of their own data, as after all, their main cash cows are their data assets.
  • Direct Marketing Agencies: Agencies are very strategic, as they touch all aspects of marketing and control creative processes through segmentation. Many large agencies boast full-scale analytical units, capable of all types of analytics that I explained earlier. But some agencies have very small teams, stretched really thin—just barely handling the reporting aspect, not any advanced analytics. Some just admit that predictive analytics is not part of their core competencies, and they may outsource such projects (not that it is a bad thing).

As you can see here, there is no clear-cut answer to the question of whom you should work with. Basically, you will need to check out all types of analysts and service providers to determine the partner best suited to your long- and short-term business purposes, not just your analytical goals. Often, many marketers just go with the lowest bidder. But pricing is just one of many elements to be considered. Here, allow me to introduce “10 Essential Items to Consider When Outsourcing Analytics.”

1. Consulting Capabilities: I put this on the top of the list, as being a translator between the marketing and the technology world is the most important differentiator (refer to “How to Be a Good Data Scientist”). They must understand the business goals and marketing needs, prescribe suitable solutions, convert such goals into mathematical expressions and define targets, making the best of available data. If they lack strategic vision to set up the data roadmap, statistical knowledge alone will not be enough to achieve the goals. And such business goals vary greatly depending on the industry, channel usage and related success metrics. Good consultants always ask questions first, while sub-par ones will try to force-fit marketers’ goals into their toolsets and methodologies.

Translating marketing goals into specific courses of action is a skill in itself. A good analytical partner should be capable of building a data roadmap (not just statistical steps) with a deep understanding of the business impact of resultant models. They should be able to break down larger goals into smaller steps, creating proper phased approaches. The plan may call for multiple models, all kinds of pre- and post-selection rules, or even external data acquisition, while remaining sensitive to overall costs.

The target definition is the core of all these considerations, which requires years of experience and industry knowledge. Simply, the wrong or inadequate targeting decision leads to disastrous results, no matter how sound the mathematical work is (refer to “Art of Targeting”).

Another important quality of a good analytical partner is the ability to create usefulness out of seemingly chaotic and unstructured data environments. Modeling is not about waiting for the perfect set of data, but about making the best of available data. In many modeling bake-offs, the winners are often decided by the creative usage of provided data, not just statistical techniques.

Finally, the consultative approach is important, as models do not exist in a vacuum, but they have to fit into the marketing engine. Be aware of the ones who want to change the world around their precious algorithms, as they are geeks not strategists. And the ones who understand the entire marketing cycle will give advice on what the next phase should be, as marketing efforts must be perpetual, not transient.

So, how will you find consultants? Ask the following questions:

  • Are they “listening” to you?
  • Can they repeat “your” goals in their own words?
  • Do their roadmaps cover both short- and long-term goals?
  • Are they confident enough to correct you?
  • Do they understand “non-statistical” elements in marketing?
  • Have they “been there, done that” for real, or just in theories?

2. Data Processing Capabilities: I know that some people look down upon the word “processing.” But data manipulation is the most important key step “before” any type of advanced analytics even begins. Simply, “garbage-in, garbage out.” And unfortunately, most datasets are completely unsuitable for analytics and modeling. In general, easily more than 80 percent of model development time goes into “fixing” the data, as most are unstructured and unrefined. I have been repeatedly emphasizing the importance of a “model-ready” (or “analytics-ready”) environment for that reason.

However, the reality dictates that the majority of databases are indeed NOT model-ready, and most of them are not even close to it. Well, someone has to clean up the mess. And in this data business, the last one who touches the dataset becomes responsible for all the errors and mistakes made to it thus far. I know it is not fair, but that is why we need to look at the potential partner’s ability to handle large and really messy data, not just the statistical savviness displayed in glossy presentations.

Yes, that dirty work includes data conversion, edit/hygiene, categorization/tagging, data summarization and variable creation, encompassing all kinds of numeric, character and freeform data (refer to “Beyond RFM Data” and “Freeform Data Aren’t Exactly Free”). It is not the most glorious part of this business, but data consistency is the key to successful implementation of any advanced analytics. So, if a model-ready environment is not available, someone had better know how to make the best of whatever is given. I have seen too many meltdowns in “before” and “after” modeling steps due to inconsistencies in databases.
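A taste of that dirty work in pandas: hygiene, categorization and summarizing transactions into model-ready variables. The field names and values are hypothetical:

```python
import pandas as pd

# Hypothetical raw transaction feed, warts and all.
raw = pd.DataFrame({
    "cust_id": [1, 1, 2, 2, 2, 3],
    "amount":  ["19.99", "5.00", None, "42.50", "bad", "7.25"],
    "channel": ["WEB", "web ", "Store", "WEB", "store", None],
})

# Hygiene: coerce amounts to numbers, normalize category labels.
raw["amount"] = pd.to_numeric(raw["amount"], errors="coerce").fillna(0.0)
raw["channel"] = raw["channel"].str.strip().str.lower().fillna("unknown")

# Summarization: one row per customer, with derived model variables.
model_ready = raw.groupby("cust_id").agg(
    total_spend=("amount", "sum"),
    num_txns=("amount", "size"),
    web_share=("channel", lambda c: (c == "web").mean()),
)
print(model_ready)
```

Multiply this by hundreds of fields and dozens of inconsistent source files, and the 80 percent figure quoted above stops sounding like an exaggeration.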

So, grill the candidates with the following questions:

  • Do they support file conversions, edits, categorization and summarization?
  • How big a dataset is too big, and how many files/tables are too many for them?
  • How much freeform data is too much for them?
  • What sample model variables have they created in the past?

3. Track Records in the Industry: It can be argued that industry knowledge is even more crucial to success than statistical know-how, as nuances are often “Lost in Translation” without relevant industry experience. In fact, some may not even be able to carry on a proper conversation with a client without it, leading to all kinds of wrong assumptions. I have seen a case where “real” rocket scientists messed up models for credit card campaigns.

The No. 1 reason why industry experience is important is that everyone’s success metrics are unique. Just to name a few, financial services (banking, credit card, insurance, investment, etc.), travel and hospitality, entertainment, packaged goods, online and offline retail, catalogs, publication, telecommunications/utilities, non-profit and political organizations all call for different types of analytics and models, as their business models and the way they interact with target audiences are vastly different. For example, building a model (or a database, for that matter) for businesses where they hand over merchandise “before” they collect money is fundamentally different from the ones where the exchange happens simultaneously. Even a simple concept of payment date or transaction date cannot be treated the same way. For retailers, recent dates could be better for business, but for subscription business, older dates may carry more weight. And these are just some examples with “dates,” before touching any dollar figures or other fun stuff.

Then the job gets even more complicated, if we further divide all of these industries by B-to-B vs. B-to-C, where available data do not even look similar. On top of that, divisional ROI metrics may be completely different, and even terminology and culture may play a role in all of this. When you are a consultant, you really don’t want to stop the flow of a meeting to clarify some unfamiliar acronyms, as you are supposed to know them all.

So, always demand specific industry references and examine client rosters, if allowed. (Many clients specifically ask vendors not to use their names as references.) Basically, watch out for the ones who push one-size-fits-all cookie-cutter solutions. You deserve way more than that.

4. Types of Models Supported: Speaking of cookie-cutter stuff, we need to be concerned with types of models that the outsourcing partner would support. Sure, nobody employs every technique, and no one can be good at everything. But we need to watch out for the “One-trick Ponies.”

This could be a tricky issue, as we are going into a more technical domain. Plus, marketers should not self-prescribe with specific techniques, instead of clearly stating their business goals (refer to “Marketing and IT; Cats and Dogs”). Some of the modeling goals are:

  • Rank and select prospect names
  • Lead scoring
  • Cross-sell/upsell
  • Segment the universe for messaging strategy
  • Pinpoint the attrition point
  • Assign lifetime values for prospects and customers
  • Optimize media/channel spending
  • Create new product packages
  • Detect fraud
  • Etc.

Unless you have successfully dealt with the outsourcing partner in the past (or you have a degree in statistics), do not blurt out words like Neural-net, CHAID, Cluster Analysis, Multiple Regression, Discriminant Function Analysis, etc. That would be like demanding specific medication before your new doctor even asks about your symptoms. The key is meeting your business goals, not fulfilling buzzwords. Let them present their methodology “after” the goal discussion. Nevertheless, see if the potential partner is pushing one or two specific techniques or solutions all the time.

5. Speed of Execution: In modern marketing, speed to action is king. Speed wins, and speed gains respect. However, when it comes to modeling or other advanced analytics, you may be shocked by the wide range of time estimates provided by each outsourcing vendor. To be fair, they are covering themselves, mainly because they have no idea what kind of messy data they will receive. As I mentioned earlier, pre-model data preparation and manipulation are critical components, and they are the most time-consuming part of all, especially when available data are in bad shape. Post-model scoring, audit and usage support may elongate the timeline. The key is to differentiate such pre- and post-modeling processes in the time estimate.

Even for pure modeling elements, time estimates vary greatly, depending on the complexity of assignments. Surely, a simple cloning model with basic demographic data would be much easier to execute than the ones that involve ample amounts of transaction- and event-level data, coming from all types of channels. If time-series elements are added, it will definitely be more complex. Typical clustering work is known to take longer than regression models with clear target definitions. If multiple models are required for the project, it will obviously take more time to finish the whole job.

Now, the interesting thing about building a model is that analysts don’t really finish it, but they just run out of time—much like the way marketers work on PowerPoint presentations. The commonality is that we can basically tweak models or decks forever, but we have to stop at some point.

However, with all kinds of automated tools and macros, model development time has decreased dramatically in past decades. We really came a long way since the first application of statistical techniques to marketing, and no one should be quoting a 1980s timeline in this century. But some still do. I know vendors are trained to follow the guideline “always under-promise and over-deliver,” but still.

An interesting aspect of this dilemma is that we can negotiate the timeline by asking for simpler and less sophisticated versions with diminished accuracy. If, hypothetically, it takes a week to be 98 percent accurate, but it only takes a day to be 90 percent accurate, what would you pick? That should be the business decision.

So, what is a general guideline? Again, it really depends on many factors, but allow me to share a version of it:

  • Pre-modeling Processing

– Data Conversions: from half a day to weeks

– Data Append/Enhancement: between overnight and two days

– Data Edit and Summarization: Data-dependent

  • Modeling: Ranges from half a day to weeks

– Depends on type, number of models and complexity

  • Scoring: from half a day to one week

– Mainly depends on number of records and state of the database to be scored

I know these are wide ranges, but watch out for the ones that routinely quote 30 days or more for simple clone models. They may not know what they are doing, or worse, they may be some mathematical perfectionists who don’t understand the marketing needs.

6. Pricing Structure: Some marketers would put this on top of the checklist, or worse, use the pricing factor as the only criterion. Obviously, I disagree. (Full disclosure: I have been on the service side of the fence during my entire career.) Yes, every project must make economic sense in the end, but the budget should not and cannot be the sole deciding factor in choosing an outsourcing partner. There are many specialists under famous brand names who command top dollars, and then there are many data vendors who throw in “free” models, disrupting the ecosystem. Either way, one should not jump to conclusions too fast, as there is no free lunch, after all. In any case, I strongly recommend that no one start the meeting with pricing questions (hence, this article). When you get to the pricing part, ask what the price includes, as the analytical journey could be a series of long and winding roads. Some of the biggest factors that need to be considered are:

  • Multiple Model Discounts—Is there a discount for second or third models within a project?
  • Pre-developed (off-the-shelf) Models—These can be “much” cheaper than custom models, though they are not custom-fitted.
  • Acquisition vs. CRM—Employing client-specific variables certainly increases the cost.
  • Regression Models vs. Other Types—At times, types of techniques may affect the price.
  • Clustering and Segmentations—They are generally priced much higher than target-specific models.

Again, it really depends on the complexity factor more than anything else, and the pre- and post-modeling process must be estimated and priced separately. Non-modeling charges often add up fast, and you should ask for unit prices and minimum charges for each step.

Scoring charges in time can be expensive, too, so negotiate for discounts for routine scoring of the same models. Some may offer all-inclusive package pricing for everything. The important thing is that you must be consistent with the checklist when shopping around with multiple candidates.

7. Documentation: When you pay for a custom model (not pre-developed, off-the-shelf ones), you get to own the algorithm. Because algorithms are not tangible items, that knowledge must be captured in model documents. Beware of vendors who offer “black-box” solutions with comments like, “Oh, it will work, so trust us.”

Good model documents must include the following, at the minimum:

  • Target and Comparison Universe Definitions: What was the target variable (or “dependent” variable) and how was it defined? How was the comparison universe defined? Was there any “pre-selection” for either of the universes? These are the most important factors in any model—even more than the mechanics of the model itself.
  • List of Variables: What are the “independent” variables? How were they transformed or binned? From where did they originate? Often, these model variables describe the nature of the model, and they should make intuitive sense.
  • Model Algorithms: What is the actual algorithm? What are the assigned weights for each independent variable?
  • Gains Chart: We need to examine potential effectiveness of the model. What are the “gains” for each model group, from top to bottom (e.g., 320 percent gain at the top model group in comparison to the whole universe)? How fast do such gains decrease as we move down the scale? How do the gains factors compare against the validation sample? A graphic representation would be nice, too.
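To make the gains arithmetic concrete, here is a minimal sketch (a hypothetical Python example with toy data, not any vendor’s actual method): each model group’s response rate is divided by the overall response rate, with 100 representing the average.

```python
# Hypothetical gains-chart sketch: rank records by model score,
# split into equal groups, and compare each group's response rate
# to the overall rate (100 = average, 320 = a 320 percent gain).

def gains_chart(scored, groups=5):
    """scored: list of (score, responded) pairs; returns gain per group."""
    ranked = sorted(scored, key=lambda r: r[0], reverse=True)
    overall_rate = sum(resp for _, resp in ranked) / len(ranked)
    size = len(ranked) // groups
    gains = []
    for g in range(groups):
        chunk = ranked[g * size:(g + 1) * size]
        rate = sum(resp for _, resp in chunk) / len(chunk)
        gains.append(round(rate / overall_rate * 100))
    return gains

# Toy sample: higher scores tend to respond more often
data = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
        (0.4, 0), (0.3, 1), (0.2, 0), (0.1, 0), (0.05, 0)]
print(gains_chart(data))
```

A validation sample would be scored and charted the same way, and the two charts compared group by group.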

For custom models, it is customary to have a formal model presentation, full documentation and scoring script in designated programming languages. In addition, if client files are provided, ask for a waterfall report that details input and output counts of each step. After the model scoring, it is also customary for the vendor to provide a scored universe count by model group. You will be shocked to find out that many so-called analytical vendors do not provide thorough documentation. Therefore, it is recommended to ask for sample documents upfront.

8. Scoring Validation: Models are built and presented properly, but the job is not done until the models are applied to the universe from which the names are ranked and selected for campaigns. I have seen too many major meltdowns at this stage. Simply, it is one thing to develop models with a few hundred thousand record samples, but it is quite another to apply the algorithm to millions of records. I am not saying that the scoring job always falls onto the developers, as you may have an internal team or a separate vendor for such ongoing processes. But do not let the model developer completely leave the building until everything checks out.

The model should have been validated against the validation sample by then, but live scoring may reveal all kinds of inconsistencies. You may also want to back-test the algorithms with past campaign results. In short, many things go wrong “after” the modeling steps. When I hear customers complaining about models, I often find that the modeling was the only part done properly, and the “before” and “after” steps were all messed up. Further, even machines misunderstand each other, as any difference in platform or scripting language may cause discrepancies. Or maybe there was no technical error, but missing values caused inconsistencies (refer to “Missing Data Can Be Meaningful”). Nonetheless, the model developers will have the best insight into what could have gone wrong, so make sure that they are available for questions after models are presented and delivered.

9. Back-end Analysis: Good analytics is all about applying learnings from past campaigns—good or bad—to new iterations of efforts. We often call it “closed-loop marketing,” though many marketers neglect to follow up. Any respectable analytics shop must be aware of it, even if they classify such work separately from modeling or other analytical projects. At the minimum, check whether they even offer such services. In fact, so-called “match-back analysis” is not as simple as just matching campaign files against responders in this omnichannel environment. When many channels are employed at the same time, allocation of credit (i.e., “what worked?”) may call for all kinds of business rules or even dedicated models.

While you are at it, ask for a cheaper version of “canned” reports as well, as custom back-end analysis can, over time, be even more costly than the modeling job itself. Pre-developed reports may not include all the ROI metrics that you’re looking for (e.g., open, clickthrough and conversion rates, plus revenue and orders per mailed piece, per order, per display, per email, per conversion, etc.). So ask for sample reports upfront.
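As a rough sketch of the arithmetic behind such a report (the field names and figures here are invented for illustration, not any standard report layout):

```python
# Illustrative back-end metrics for one campaign; all inputs are made up.

def campaign_metrics(mailed, opens, clicks, orders, revenue):
    return {
        "open_rate": opens / mailed,
        "clickthrough_rate": clicks / opens if opens else 0.0,
        "conversion_rate": orders / clicks if clicks else 0.0,
        "revenue_per_mailed": revenue / mailed,
        "revenue_per_order": revenue / orders if orders else 0.0,
    }

m = campaign_metrics(mailed=10_000, opens=2_000, clicks=400,
                     orders=50, revenue=5_000.0)
print(m["open_rate"])          # 0.2
print(m["revenue_per_order"])  # 100.0
```

The hard part, as noted above, is not the division; it is allocating credit across channels before these numerators are even counted.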

If you start breaking down all these figures by data source, campaign, time series, model group, offer, creative, targeting criteria, channel, ad server, publisher, keywords, etc., it can be unwieldy really fast. So contain yourself, as no one can understand 100-page reports, anyway. See if the analysts can guide you with such planning, as well. Lastly, if you are so into ROI analysis, get ready to share the “cost” side of the equation with the selected partner. Some jobs are on the marketers.

10. Ongoing Support: Models have a finite shelf life, as all kinds of changes happen in the real world. Seasonality may be a factor, or the business model or strategy may have changed. Fluctuations in data availability and quality further complicate the matter. Basically assumptions like “all things being equal” only happen in textbooks, so marketers must plan for periodic review of models and business rules.

A sure sign of trouble is decreasing effectiveness of models. When in doubt, consult the developers; they may recommend a re-fit or a complete redevelopment of the models. Quarterly reviews would be ideal, but if cost becomes an issue, start with six-month or yearly reviews, and never go more than a year without any review. Some vendors may offer discounts for redevelopment, so ask for the price quote upfront.

I know this is a long list of things to check, but picking the right partner is very important, as it often becomes a long-term relationship. And you may find it strange that I didn’t even list “technical capabilities” at all. That is because:

1. Many marketers are not equipped to dig deep into the technical realm anyway, and

2. The difference between the most mathematically sound models and the ones from the opposite end of the spectrum is not nearly as critical as other factors I listed in this article.

In other words, even the worst model in the bake-off would be much better than no model, if these other business criteria are well-considered. So, happy shopping with this list, and I hope you find the right partner. Employing analytics is not an option when living in the sea of data.

Must Love Dogs, and Other Content Marketing Advice

Content Marketing is a lot like dating. If you create your dating profile based on what you think potential life mates might be interested in, but don’t accurately reflect who you really are, then your first date will probably be a short one.

After all, if you’re an active sports enthusiast who loves dogs and isn’t afraid to speak your mind, then why would you pretend to be otherwise? Do you think that nobody will wink at you online if you’re honest about yourself? Do you think “tricking” someone into asking you out has any possibility of turning into a long-term relationship?

Many businesses continue to get poor results for their content marketing efforts because they’re attempting to be something that they’re not. When Google’s algorithm discovers that your content has a lot of bounces because it does NOT really answer a Google inquiry on a topic, your search result gets moved to the back of the pack. There’s no “gaming” the system by stuffing keywords in your meta tag—Google is simply trying to figure out what a page is all about so they can serve up an authentic answer to the search inquiry.

I keep going back to the story of Marcus Sheridan, the pool company owner who started writing a blog answering his customers’ questions. As a result, his pool company is thriving and his website gets more traffic than any other pool company site in the world—and Marcus started an online consulting business to help other companies achieve similar results.

The secret to his success? Answering every single question a consumer could possibly have about buying a fiberglass pool in a frank and personable way. Now when a consumer asks Google a question about fiberglass pools, Marcus’ site is at the top of the organic search results because web traffic clicks and time spent on his site tell Google that his answers are the most “helpful” and relevant to the question being asked. Marcus gets an “A” for Content Marketing.

But why do most businesses still get an “F” for their attempts?

Primarily because they’re afraid: afraid to answer questions honestly out of fear that it might make their product or service look bad; afraid that they won’t look like they know what they’re talking about; afraid that the competition will read their content and “steal” their answers or ideas; afraid that someone will read their content, then shop elsewhere to find the same solution… only cheaper.

But Marcus wasn’t afraid. He had deep experience in the pool business, and was happy to share it with anyone who asked. He knew that by demonstrating his knowledge he would attract more inquiries, interest and referrals, because at the end of the day, we all love to do business with people who know what they’re talking about—people who give us confidence because we know we’ve made the right decision by purchasing from an expert.

I recently read a great quote from Phil Darby—a pioneer in new branding—who said, “You won’t build relationships by talking about yourself all the time.”

You couldn’t be more right, Phil, and just like in dating, no one wants to sit with someone who drones on and on about themselves.

Great content adds value to a topic; brings a fresh perspective to an issue, or provides advice and counsel on how to solve a problem—all without the chest pounding.

And, if you continuously post content to your site and distribute it through other social media channels, that will help your SEO efforts because, according to Searchmetrics, seven of the 10 most important factors in SEO ranking now come from social media. Whether you post it on LinkedIn or Google+, tweet about it, or link to a Facebook post, all these efforts help optimize your search results.

Take a fresh look at your content—is it authentic? Does it truly help the reader gain new knowledge or insight on a topic? Or is it just the lipstick on your pig?

Smart Data – Not Big Data

As a concerned data professional, I am already plotting an exit strategy from this Big Data hype. Because like any bubble, it will surely burst. That inevitable doomsday could be a couple of years away, but I can feel it coming. At the risk of sounding too much like Yoda the Jedi Grand Master, all hypes lead to over-investments, all over-investments lead to disappointments, and all disappointments lead to blames. Yes, in a few years, lots of blames will go around, and lots of heads will roll.

So, why would I stay on the troubled side? Well, because, for now, this Big Data thing is creating lots of opportunities, too. I am writing this on my way back from Seoul, Korea, where I presented this Big Data idea nine times in just two short weeks, trotting from large venues to small gatherings. Just a few years back, I used to have a hard time explaining what I do for living. Now, I just have to say “Hey, I do this Big Data thing,” and the doors start to open. In my experience, this is the best “Open Sesame” moment for all data specialists. But it will last only if we play it right.

Nonetheless, I also know that I will somehow continue to make a living setting data strategies, fixing bad data, designing databases and leading analytical activities, even after the hype cools down. Just with a different title, under a different banner. I’ve seen buzzwords come and go, and this data business has been carried on by the people who cut through each hype (and the gargantuan amount of BS along with it) and create real revenue-generating opportunities. At the end of the day (I apologize for using this cliché), it is all about the bottom line, whether it comes from a revenue increase or cost reduction. It is never about the buzzwords that may have created the business opportunities in the first place; it has always been about the substance that turned those opportunities into money-making machines. And substance needs no fancy title or buzzwords attached to it.

Have you heard Google or Amazon calling themselves “Big Data” companies? They are the ones with sick amounts of data, but they also know that it is not about the sheer amount of data; it is all about the user experience. “Wannabes” who cannot grasp the core values hang onto buzzwords and hypes, as if Big Data, cloud computing or the coding language du jour will come and save the day. But they are just words.

Even the name “Big Data” is all wrong, as it implies that bigger is always better. The 3 Vs of Big Data—volume, velocity and variety—are also misleading. That may be a meaningful distinction for existing data players, but for decision-makers, it gives the notion that size and speed are the ultimate quest. For the users, though, small is better. They don’t have time to analyze big sets of data. They need small answers in fun-size packages. Plus, why are big and fast new? Since the invention of modern computers, has there been any year when processing speed did not get faster and storage capacity did not get bigger?

Lest we forget, it is the software industry that came up with this Big Data thing. It was created as a marketing tagline. We should have read it as, “Yes, we can now process really large amounts of data, too,” not as, “Big Data will make all your dreams come true.” If you are in the business of selling toolsets, of course, that is how you present your product. If guitar companies keep emphasizing how hard it is to be a decent guitar player, would that help their businesses? It is a lot more effective to say, “Hey, this is the same guitar that your guitar hero plays!” But you don’t become Jeff Beck just because you bought a white Fender Stratocaster with a rosewood neck. The real hard work begins “after” you purchase a decent guitar. However, this obvious connection is often lost in the data business. Toolsets never provide solutions on their own. They may make your life easier, but you’d still have to formulate the question in a logical fashion, and still have to make decisions based on provided data. And harnessing meanings out of mounds of data requires training of your mind, much like the way musicians practice incessantly.

So, before business people even consider venturing into this Big Data hype, they should ask themselves, “Why data?” What are the burning questions that you are trying to solve with the data? If you can’t answer this simple question, then don’t jump into it. Forget about it. Don’t get into it just because everyone else seems to be getting into it. Yeah, it’s a big party, but why are you going there? Besides, if you formulate the question properly, you will often find that you don’t need Big Data all the time. In fact, Big Data can be a terrible detour if your question can be answered by “small” data. But that happens all the time, because people approach their business questions through the processes set by the toolsets. Big Data should be about the business, not about the IT or data.

Smart Data, Not Big Data
So, how do we get over this hype? All too often, perception rules, and a replacement word becomes necessary to summarize the essence of the concept for the general public. In my opinion, “Big Data” should have been “Smart Data.” Piles of unorganized dumb data aren’t worth a damn thing. Imagine a warehouse full of boxes with no labels, collecting dust since 1943. Would you be impressed with the sheer size of the warehouse? Great, the ark that Indiana Jones procured (or did he?) may be stored in there somewhere. But if no one knows where it is—or even if it can be located, if no one knows what to do with it—who cares?

Then, how do data get smarter? Smart data are bite-sized answers to questions. A thousand variables could have been considered to provide the weather forecast that calls for a “70 percent chance of scattered showers in the afternoon,” but that one line that we hear is the smart piece of data. Not the list of all the variables that went into the formula that created that answer. Emphasizing the raw data would be like giving paints and brushes to a person who wants a picture on the wall. As in, “Hey, here are all the ingredients, so why don’t you paint the picture and hang it on the wall?” Unfortunately, that is how the Big Data movement looks now. And too often, even the ingredients aren’t all that great.

I visit many companies only to find that the databases in question are just messy piles of unorganized and unstructured data. And please do not assume that such disarray is good for my business. I’d rather spend my time harnessing meanings out of data and creating value, not cleaning up someone else’s mess all the time. Really smart data are small, concise, clean and organized. Big Data should only be seen in “Behind the Scenes” types of documentaries for maniacs, not for everyday decision-makers.

I have been saying for some time that Big Data must get smaller (refer to “Big Data Must Get Smaller”), and I will repeat it until it becomes a movement of its own. The Big Data movement must be about:

  1. Cutting down the noise
  2. Providing the answers

There is too much noise in the data, and cutting it out is the first step toward making the data smaller and smarter. The trouble is that the definition of “noise” is not static. The rock music I grew up with was certainly noise to my parents’ generation. In turn, some music that my kids listen to is pure noise to me. Likewise, “product color,” which is essential for a database designed for an inventory management system, may or may not be noise if the goal is to sell more apparel items. In such cases, more important variables could be style, brand, price range, target gender, etc., and color could be just peripheral information at best, or even noise (as in, “She isn’t going to buy just red shoes all the time, is she?”). How do we then determine the differences? First, set clear goals (as in, “Why are we playing with the data to begin with?”), define the goals using logical expressions, and let mathematics take care of it. Now you can drop the noise with conviction (even if it may look important to human minds).

If we continue with that mathematical path, we would reach the second part, which is “providing answers to the question.” And the smart answers are in the forms of yes/no, probability figures or some type of scores. Like in the weather forecast example, the question would be “chance of rain on a certain day” and the answer would be “70 percent.” Statistical modeling is not easy or simple, but it is the essential part of making the data smarter, as models are the most effective way to summarize complex and abundant data into compact forms (refer to “Why Model?”).

Most people do not have degrees in mathematics or statistics, but they all know what to do with a piece of information such as “70 percent chance of rain” on the day of a company outing. Some may complain that it is not a definite yes/no answer, but all would agree that providing information in this form is more humane than dumping all the raw data onto users. Sales folks are not necessarily mathematicians, but they would certainly appreciate scores attached to each lead, as in “more or less likely to close.” No, that is not a definite answer, but now sales people can start calling the leads in the order of relative importance to them.

So, all the Big Data players and data scientists must try to “humanize” the data, instead of bragging about the size of the data, making things more complex, and providing irrelevant pieces of raw data to users. Make things simpler, not more complex. Some may think that complexity is their job security, but I strongly disagree. That is a sure way to bring down this Big Data movement to the ground. We are already living in a complex world, and we certainly do not need more complications around us (more on “How to be a good data scientist” in a future article).

It’s About the Users, Too
On the flip side, the decision-makers must change their attitude about the data, as well.

1. Define the goals first: The main theme of this series has been that the Big Data movement is about the business, not IT or data. But I’ve seen too many business folks who so willingly take a hands-off approach to data. They just fund the database; do not define clear business goals to developers; and hope to God that someday, somehow, some genius will show up and clean up the mess for them. Guess what? That cavalry is never coming if you are not even praying properly. If you do not know what problems you want to solve with data, don’t even get started; you will get nowhere, really slowly, bleeding lots of money and time along the way.

2. Take the data seriously: You don’t have to be a scientist to have a scientific mind. It is not ideal if someone blindly subscribes to anything computers spew out (there is lots of inaccurate information in databases; refer to “Not All Databases Are Created Equal”). But too many people do not take data seriously and continue to follow their gut feelings. Even if the customer profile coming out of a serious analysis does not match your preconceived notions, do not blindly reject it; instead, treat it as a newly found gold mine. Gut feelings are even more overrated than Big Data.

3. Be logical: Illogical questions do not lead anywhere. There is no toolset that reads minds—at least not yet. Even if we get to have such amazing computers—as seen on “Star Trek” or in other science fiction movies—you would still have to ask questions in a logical fashion for them to be effective. I am not asking decision-makers to learn how to code (or to be like Mr. Spock or his loyal follower, Dr. Sheldon Cooper), but to have some basic understanding of logical expressions and to try to learn how analysts communicate with computers. This is not a data-geek-vs.-non-geek world anymore; we all have to be a little geekier. Knowing Boolean expressions may not be as cool as being able to throw a curveball, but it is necessary to survive in the age of information overload.
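The logical expressions in question are nothing exotic. A toy example in Python (all names and fields here are invented for illustration):

```python
# Boolean selection: "customers who bought online AND live in the Northeast."
customers = [
    {"name": "Kim",  "bought_online": True,  "region": "NE"},
    {"name": "Lee",  "bought_online": False, "region": "NE"},
    {"name": "Park", "bought_online": True,  "region": "SW"},
]
targets = [c["name"] for c in customers
           if c["bought_online"] and c["region"] == "NE"]
print(targets)  # ['Kim']
```

Being able to read a condition like that aloud is roughly the level of literacy the point above asks of decision-makers.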

4. Shoot for small successes: Start with a small proof of concept before fully investing in large data initiatives. Even with a small project, one gets to touch all the necessary steps to finish the job. Understanding the flow of information is as important as each specific step, as most breakdowns occur between steps, due to a lack of proper connections. There was the Gemini program before the Apollo missions. Learn how to dock spaceships in space before plotting the course to the moon. Often, over-investments are committed when the discussion is led by IT. Outsource even major components in the beginning, as the initial goal should be mastering the flow of things.

5. Be buyer-centric: No customer is bound by the channel of the marketer’s choice, and yet, many businesses act exactly that way. No one is an online person just because she did not refuse your email promotions yet (refer to “The Future of Online is Offline“). No buyer is just one-dimensional. So get out of brand-, division-, product- or channel-centric mindsets. Even well-designed, buyer-centric marketing databases become ineffective if users are trapped in their channel- or division-centric attitudes, as in “These email promotions must flow!” or “I own this product line!” The more data we collect, the more chances marketers gain to impress their customers and prospects. Do not waste those opportunities by imposing your own myopic views on them. The Big Data movement is not there to fortify marketers’ bad habits. Thanks to the size of the data and the speed of machines, we are now capable of disappointing a lot of people really fast.

What Did This Hype Change?
So, what did this Big Data hype change? First off, it changed people’s attitudes about the data. Some are no longer afraid of large amounts of information being thrown at them, and some actually started using them in their decision-making processes. Many realized that we are surrounded by numbers everywhere, not just in marketing, but also in politics, media, national security, health care and the criminal justice system.

Conversely, some people became more afraid—often with good reason. But even more often, people react out of pure fear that their personal information is being actively exploited without their consent. While data geeks are rejoicing in the age of open source and cloud computing, many more are looking at this hype with deep suspicion, and they boldly refuse to store any personal data in those obscure “clouds.” Some people don’t even sign up for E-ZPass and voluntarily stay in the long lane to pay tolls the old, but untraceable, way.

Nevertheless, not all is lost in this hype. The data got really big, and types of data that were previously unavailable, such as mobile and social data, became available to many marketers. Focus groups are now the size of a company’s or a subject matter’s Twitter following. The collection rate of POS (point-of-sale) data has been increasingly steady, and some data players became virtuosi in using such fresh and abundant data to impress their customers (though some crossed that “creepy” line inadvertently). Different types of data are being used together now, and such merging activities will compound the predictive power even further. Analysts are dealing with less missing data, though no dataset will ever be totally complete. Developers in open-source environments are now able to move really fast with new toolsets that run on any device. Simply put, things that our forefathers of direct marketing used to take six months to complete can be done in a few hours, and in the near future, maybe within a few seconds.

And that may be a good thing and a bad thing. If we do this right, without creating too many angry consumers and without burning holes in our budgets, we are in a position to achieve a great many things in terms of predicting the future and making everyone’s lives a little more convenient. If we screw it up badly, we will end up creating lots of angry customers by abusing sensitive data and, at the same time, wasting a whole lot of investors’ money. Then this Big Data thing will go down in history as a great money-eating hype.

We should never do things just because we can; data is a powerful tool that can hurt real people. Do not even get into it if you don’t have a clear goal in terms of what to do with the data; it is not some piece of furniture that you buy just because your neighbor bought it. Living with data is a lifestyle change, and it requires a long-term commitment; it is not some fad that you try once and give up. It is a continuous loop where people’s responses to marketer’s data-based activities create even more data to be analyzed. And that is the only way it keeps getting better.

There Is No Big Data
And all that has nothing to do with “Big.” If done right, small data can do plenty. And in fact, most companies’ transaction data for the past few years would easily fit in an iPhone. It is about what to do with the data, and that goal must be set from a business point of view. This is not just a new playground for data geeks, who may care more for new hip technologies that sound cool in their little circle.

I recently went to Brazil to speak at a data conference called QIBRAS, and I was pleasantly surprised that the main theme of it was the quality of the data, not the size of the data. Well, at least somewhere in the world, people are approaching this whole thing without the “Big” hype. And if you look around, you will not find any successful data players calling this thing “Big Data.” They just deal with small and large data as part of their businesses. There is no buzzword, fanfare or a big banner there. Because when something is just part of your everyday business, you don’t even care what you call it. You just do. And to those masters of data, there is no Big Data. If Google all of a sudden starts calling itself a Big Data company, it would be so uncool, as that word would seriously limit it. Think about that.

Missing Data Can Be Meaningful

No matter how big the Big Data gets, we will never know everything about everything. Well, according to the super-duper computer called “Deep Thought” in the movie “The Hitchhiker’s Guide to the Galaxy” (don’t bother to watch it if you don’t care for the British sense of humour), the answer to “The Ultimate Question of Life, the Universe, and Everything” is “42.” Coincidentally, that is also my favorite number to bet on (I have my reasons), but I highly doubt that even that huge fictitious computer with unlimited access to “everything” provided that numeric answer with conviction after 7½ million years of computing and checking. At best, that “42” is an estimated figure of a sort, based on some fancy algorithm. And in the movie, even Deep Thought pointed out that “the answer is meaningless, because the beings who instructed it never actually knew what the Question was.” Ha! Isn’t that what I have been saying all along? For any type of analytics to be meaningful, one must properly define the question first. And what to do with the answer that comes out of an algorithm is entirely up to us humans, or in the business world, the decision-makers. (Who are probably human.)

Analytics is about making the best of what we know. Good analysts do not wait for a perfect dataset (it will never come by, anyway). And businesspeople have no patience to wait for anything. Big Data is big because we digitize everything, and everything that is digitized is stored somewhere in forms of data. For example, even if we collect mobile device usage data from just pockets of the population with certain brands of mobile services in a particular area, the sheer size of the resultant dataset becomes really big, really fast. And most unstructured databases are designed to collect and store what is known. If you flip that around to see if you know every little behavior through mobile devices for “everyone,” you will be shocked to see how small the size of the population associated with meaningful data really is. Let’s imagine that we can describe human beings with 1,000 variables coming from all sorts of sources, out of 200 million people. How many would have even 10 percent of the 1,000 variables filled with some useful information? Not many, and definitely not 100 percent. Well, we have more data than ever in the history of mankind, but still not for every case for everyone.

In my previous columns, I pointed out that decision-making is about ranking different options, and that to rank anything properly, we must employ predictive analytics (refer to “It’s All About Ranking“). And for ranking based on the scores resulting from predictive models to be effective, the datasets must be summarized to the level that is to be ranked (e.g., individuals, households, companies, emails, etc.). That is why transaction- or event-level datasets must be transformed into “buyer-centric” portraits before any modeling activity begins. Again, it is not about the transactions or the products; it is about the buyers, if you are doing all this to do business with people.

The trouble with buyer- or individual-centric databases is that such transformation of data structure creates lots of holes. Even if you have meticulously collected every transaction record that matters (and that will be the day), if someone did not buy a certain item, any variable that is created based on the purchase record of that particular item will have nothing to report for that person. Likewise, if you have a whole series of variables to differentiate online and offline channel behaviors, what would the online portion contain if the consumer in question never bought anything through the Web? Absolutely nothing. But in the business of predictive analytics, what did not happen is as important as what happened. Even a simple concept of “response” is only meaningful when compared to “non-response,” and the difference between the two groups becomes the basis for the “response” model algorithm.

Capturing the Meanings Behind Missing Data
Missing data are all around us. And there are many reasons why they are missing, too. It could be that there is nothing to report, as in aforementioned examples. Or, there could be errors in data collection—and there are lots of those, too. Maybe you don’t have access to certain pockets of data due to corporate, legal, confidentiality or privacy reasons. Or, maybe records did not match properly when you tried to merge disparate datasets or append external data. These things happen all the time. And, in fact, I have never seen any dataset without a missing value since I left school (and that was a long time ago). In school, the professors just made up fictitious datasets to emphasize certain phenomena as examples. In real life, databases have more holes than Swiss cheese. In marketing databases? Forget about it. We all make do with what we know, even in this day and age.

Then, let’s ask some philosophical questions here:

  • If missing data are inevitable, what do we do about it?
  • How would we record them in databases?
  • Should we just leave them alone?
  • Or should we try to fill in the gaps?
  • If so, how?

The answer to all this is definitely not 42, but I’ll tell you this: Even missing data have meanings, and not all missing data are created equal, either.

Furthermore, missing data often contain interesting stories behind them. For example, certain demographic variables may be missing only for extremely wealthy people and very poor people, as their residency data are generally not exposed (for different reasons, of course). And that, in itself, is a story. Likewise, some data may be missing in certain geographic areas or for certain age groups. Collection of certain types of data may be illegal in some states. “Not” having any data on online shopping behavior or mobile activity may mean something interesting for your business, if we dig deeper into it without falling into the trap of predicting legal or corporate boundaries, instead of predicting consumer behaviors.

In terms of how to deal with missing data, let’s start with numeric data, such as dollars, days, counters, etc. Some numeric data simply may not be there, if there is no associated transaction to report. Now, if they are about “total dollar spending” and “number of transactions” in a certain category, for example, they can be initiated as zero and remain as zero in cases like this. The counter simply did not start clicking, and it can be reported as zero if nothing happened.

Some numbers are incalculable, though. If you are calculating “Average Amount per Online Transaction,” and if there is no online transaction for a particular customer, that is a situation for mathematical singularity—as we can’t divide anything by zero. In such cases, the average amount should be recorded as: “.”, blank, or any value that represents a pure missing value. But it should never be recorded as zero. And that is the key in dealing with missing numeric information; that zero should be reserved for real zeros, and nothing else.
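The rule above can be sketched in a few lines. This is a minimal, hypothetical example (the field names are illustrative, not from any real schema): totals and counters start at a real zero, while the incalculable average is stored as a true missing value, never zero.

```python
# Hypothetical sketch: summarizing online transactions into a buyer-centric
# record, keeping "true zero" and "unknown" strictly separate.

def summarize_online(transactions):
    """transactions: list of dollar amounts for one buyer's online purchases."""
    total = sum(transactions)   # zero is a real zero here: nothing was spent
    count = len(transactions)   # the counter legitimately starts at zero
    # Average per transaction is incalculable when count is zero (division
    # by zero); record a pure missing value (None), never zero.
    avg = total / count if count > 0 else None
    return {"total_online": total, "num_online_txn": count, "avg_online_txn": avg}

buyer_a = summarize_online([120.0, 80.0])  # avg is a real 100.0
buyer_b = summarize_online([])             # avg is None, not 0
```

A downstream model can then treat buyer_b's missing average as "never bought online," which carries very different information than "average purchase of $0."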

I have seen too many cases where missing numeric values are filled with zeros, and I must say that such a practice is definitely frowned upon. If you have to pick just one takeaway from this article, that’s it. Like I emphasized, not all missing values are the same, and zero is not the way to record them. Zeros should never represent a lack of information.

Take the example of a popular demographic variable, “Number of Children in the Household.” This is a very predictive variable—not just for purchase behavior of children’s products, but for many other things. Now, it is a simple number, but it should never be treated as a simple variable—as, in this case, lack of information is not evidence of non-existence. Let’s say that you are purchasing this data from a third-party data compiler (or a data broker). If you don’t see a positive number in that field, it could be because:

  1. The household in question really does not have a child;
  2. Even the data-collector doesn’t have the information; or
  3. The data collector has the information, but the household record did not match to the vendor’s record, for some reason.

If that field contains a number like 1, 2 or 3, that’s easy, as it represents the number of children in that household. But zero should be reserved for cases where the data collector has positive confirmation that the household in question indeed does not have any children. If it is unknown, it should be marked as blank or “.” (many statistical software packages, such as SAS, record missing values this way), or as “U” (though an alpha character should not be in a numeric field).

If it is a case of a non-match to the external data source, then there should be a separate indicator for it. The fact that the record did not match a professional data compiler’s list may mean something. And I’ve seen cases where such non-match indicators made it into model algorithms along with other valid data, as in the case where missing indicators for income display the same directional tendency as high-income households.
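The three cases above—a confirmed zero, a pure unknown, and a non-match—can be kept distinct with a lookup sketch like the following. All field and record names here are hypothetical, invented for illustration:

```python
# Sketch of the three-way distinction for "Number of Children in the Household":
# confirmed zero vs. compiler-unknown vs. no match to the vendor's list.

def append_children_field(household_id, vendor_records):
    """Look up a household in a (hypothetical) compiled list and code the result."""
    rec = vendor_records.get(household_id)
    if rec is None:
        # Case 3: record did not match the vendor's list -- flag it, never fake a zero
        return {"num_children": None, "vendor_match": 0}
    if rec["num_children"] is None:
        # Case 2: matched, but even the data collector doesn't have the information
        return {"num_children": None, "vendor_match": 1}
    # Case 1: a confirmed value -- including a real, positively confirmed zero
    return {"num_children": rec["num_children"], "vendor_match": 1}

vendor = {
    "hh_001": {"num_children": 2},
    "hh_002": {"num_children": 0},     # positively confirmed childless household
    "hh_003": {"num_children": None},  # compiler has no information
}
```

The separate `vendor_match` flag preserves the non-match signal on its own, so a modeler can later test whether it moves with other variables, as in the income example above.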

Now, if the data compiler in question boldly inputs zeros for the cases of unknowns? Take a deep breath, fire the vendor, and don’t deal with the company again, as it is a sign that its representatives do not know what they are doing in the data business. I have done so in the past, and you can do it, too. (More on how to shop for external data in future articles.)

For non-numeric categorical data, similar rules apply. Some values could be truly “blank,” and those should be treated separately from “Unknown,” or “Not Available.” As a practice, let’s list all kinds of possible missing values in codes, texts or other character fields:

  • ” “—blank or “null”
  • “N/A,” “Not Available,” or “Not Applicable”
  • “Unknown”
  • “Other”—If it is originating from some type of multiple choice survey or pull-down menu
  • “Not Answered” or “Not Provided”—This indicates that the subjects were asked, but they refused to answer. Very different from “Unknown.”
  • “0”—In this case, the answer can be expressed in numbers. Again, only for known zeros.
  • “Non-match”—Not matched to other internal or external data sources
  • Etc.

It is entirely possible that all these values may be highly correlated to each other and move along the same predictive direction. However, there are many cases where they do not. And if they are combined into just one value, such as zero or blank, we will never be able to detect such nuances. In fact, I’ve seen many cases where one or more of these missing indicators move together with other “known” values in models. Again, missing data have meanings, too.
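As a concrete sketch, the categorical codes listed above can be kept as distinct categories in the database instead of being collapsed into one blank. The mapping values below are illustrative assumptions, not a standard:

```python
# Minimal sketch: preserve each flavor of "missing" as its own category,
# so nuances like "refused to answer" vs. "unknown" survive into modeling.

MISSING_CODES = {
    "": "BLANK",                  # blank or null
    "N/A": "NOT_APPLICABLE",
    "Unknown": "UNKNOWN",
    "Other": "OTHER",             # e.g., from a multiple-choice survey
    "Not Answered": "REFUSED",    # asked but declined -- not the same as unknown
    "0": "KNOWN_ZERO",            # only for positively confirmed zeros
    "Non-match": "NON_MATCH",     # failed to match an internal or external source
}

def normalize(value):
    """Map a raw field value to a distinct category, preserving each nuance."""
    return MISSING_CODES.get(value, "VALID")  # anything else is a real value

assert normalize("Not Answered") == "REFUSED"
assert normalize("Unknown") == "UNKNOWN"  # the two are never merged
```

Because each category keeps its own code, an analyst can later test whether any of them carry predictive signal on their own, rather than losing them all in a single blank.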

Filling in the Gaps
Nonetheless, missing data do not have to be left as missing, blank or unknown all the time. With statistical modeling techniques, we can fill in the gaps with projected values. You didn’t think that all those data compilers really knew the income level of every household in the country, did you? It is not a big secret that many of those figures are modeled with other available data.

Such inferred statistics are everywhere. Popular variables, such as householder age, home owner/renter indicator, housing value, household income or—in the case of business data—the number of employees and sales volume contain modeled values. And there is nothing wrong with that, in the world where no one really knows everything about everything. If you understand the limitations of modeling techniques, it is quite alright to employ modeled values—which are much better alternatives to highly educated guesses—in decision-making processes. We just need to be a little careful, as models often fail to predict extreme values, such as household incomes over $500,000/year, or specific figures, such as incomes of $87,500. But “ranges” of household income, for example, can be predicted at a high confidence level, though it technically requires many separate algorithms and carefully constructed input variables in various phases. But such technicality is an issue that professional number crunchers should deal with, like in any other predictive businesses. Decision-makers should just be aware of the reality of real and inferred data.

Such imputation practices can be applied to any data source, not just databases compiled by professional data brokers. Statisticians often impute values when they encounter missing values, and there are many different methods of imputation. I haven’t met two statisticians who completely agree with each other when it comes to imputation methodologies, though. That is why it is important for an organization to have a unified rule for each variable regarding its imputation method (or lack thereof). When multiple analysts employ different methods, it often becomes the very source of inconsistent or erroneous results at the application stage. It is always more prudent to have the calculation done upfront, and to store the inferred values in a consistent manner in the main database.

In terms of how that is done, there could be a long debate among the mathematical geeks. Will it be a simple average of non-missing values? If such a method is to be employed, what is the minimum required fill-rate of the variable in question? Surely, you do not want to project 95 percent of the population with 5 percent known values? Or will the missing values be replaced with modeled values, as in previous examples? If so, what would be the source of target data? What about potential biases that may exist because of data collection practices and their limitations? What should be the target definition? In what kind of ranges? Or should the target definition remain as a continuous figure? How would you differentiate modeled and real values in the database? Would you embed indicators for inferred values? Or would you forego such flags in the name of speed and convenience for users?
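To make those questions concrete, here is one possible answer among many, sketched under stated assumptions: impute the mean of the non-missing values only when the variable's fill rate clears a minimum threshold, and store a separate indicator flag for every inferred value. The function name and threshold are illustrative, not a prescribed methodology:

```python
# One of many possible imputation rules: mean-fill with a minimum fill-rate
# guard, plus an embedded flag distinguishing real values from inferred ones.

def impute_mean(values, min_fill_rate=0.5):
    """values: one variable's column, with None for missing.
    Returns (filled_values, inferred_flags); raises if the data are too sparse."""
    known = [v for v in values if v is not None]
    fill_rate = len(known) / len(values)
    if fill_rate < min_fill_rate:
        # Too sparse to project -- e.g., don't infer 95% of records from 5% known
        raise ValueError(f"fill rate {fill_rate:.0%} below threshold")
    mean = sum(known) / len(known)
    filled = [v if v is not None else mean for v in values]
    flags = [0 if v is not None else 1 for v in values]  # 1 = modeled, not real
    return filled, flags

filled, flags = impute_mean([10.0, None, 30.0, 20.0])
# The flags record exactly which entries carry inferred rather than real values.
```

Whether to keep such flags, what threshold to use, and whether mean-filling is even appropriate for a given variable are precisely the debates described above; the point is that whatever rule wins should be applied once, upfront, for everyone.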

The important matter is not the rules or methodologies, but the consistency of them throughout the organization and the databases. That way, all users and analysts will have the same starting point, no matter what the analytical purposes are. There could be a long debate in terms of what methodology should be employed and deployed. But once the dust settles, all data fields should be treated by pre-determined rules during the database update processes, avoiding costly errors in the downstream. All too often, inconsistent imputation methods lead to inconsistent results.

If, by some chance, individual statisticians end up with the freedom to come up with their own ways to fill in the blanks, then the model-scoring code in question must include missing-value imputation algorithms without exception, granted that such a practice will elongate the model application process and significantly increase the chance of errors. It is also important that non-statistical users be educated about the basics of missing data and the associated imputation methods, so that everyone who has access to the database shares a common understanding of what they are dealing with. That list includes external data providers and partners, and it is strongly recommended that data dictionaries include the employed imputation rules wherever applicable.

Keep an Eye on the Missing Rate
Often, we only find out that the missing rate of certain variables has gone out of control because models become ineffective and campaigns start to yield disappointing results. Conversely, it can be stated that fluctuations in missing data ratios greatly affect the predictive power of models or any related statistical works. It goes without saying that a consistent influx of fresh data matters more than the construction and the quality of models and algorithms. It is a classic case of a garbage-in-garbage-out scenario, and that is why good data governance practices must include a time-series comparison of the missing rate of every critical variable in the database. If, all of a sudden, an important predictor’s fill rate drops below a certain point, no analyst in this world can sustain the predictive power of the model algorithm, unless it is rebuilt with a whole new set of variables. The shelf life of models is definitely finite, but nothing deteriorates the effectiveness of models faster than inconsistent data. And a fluctuating missing rate is a good indicator of such an inconsistency.
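A governance check like the one described can be as simple as the following sketch: compute each variable's missing rate at every database update and flag any sudden jump. The threshold and structure here are illustrative assumptions; in practice each critical variable would get its own tolerance:

```python
# Sketch of a data-governance check: track each variable's missing rate
# across database updates and flag sudden fill-rate drops.

def missing_rate(column):
    """Share of missing (None) values in one variable's column."""
    return sum(1 for v in column if v is None) / len(column)

def flag_fill_rate_drops(history, jump_threshold=0.10):
    """history: {variable_name: [rate_at_t0, rate_at_t1, ...]} over updates.
    Returns variables whose missing rate jumped more than the threshold
    between the last two updates."""
    flagged = []
    for var, rates in history.items():
        if len(rates) >= 2 and rates[-1] - rates[-2] > jump_threshold:
            flagged.append(var)  # a predictor feeding live models needs review
    return flagged

history = {
    "income": [0.12, 0.13, 0.35],  # sudden drop in fill rate -- investigate
    "age":    [0.05, 0.05, 0.06],  # stable
}
```

Run against this hypothetical history, only "income" would be flagged, long before a campaign has to fail to reveal the problem.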

Likewise, if the model score distribution starts to deviate from the original model curve from the development and validation samples, it is prudent to check the missing rate of every variable used in the model. Any sudden changes in model score distribution are a good indicator that something undesirable is going on in the database (more on model quality control in future columns).

These few guidelines regarding the treatment of missing data will add more flavors to statistical models and analytics in general. In turn, proper handling of missing data will prolong the predictive power of models, as well. Missing data have hidden meanings, but they are revealed only when they are treated properly. And we need to do that until the day we get to know everything about everything. Unless you are just happy with that answer of “42.”

Big Data Must Get Smaller

Like many folks who worked in the data business for a long time, I don’t even like the words “Big Data.” Yeah, data is big now, I get it. But so what? Faster and bigger have been the theme in the computing business since the first calculator was invented. In fact, I don’t appreciate the common definition of Big Data that is often expressed in the three Vs: volume, velocity and variety. So, if any kind of data are big and fast, it’s all good? I don’t think so. If you have lots of “dumb” data all over the place, how does that help you? Well, as much as all the clutter that’s been piled on in your basement since 1971. It may yield some profit on an online auction site one day. Who knows? Maybe some collector will pay good money for some obscure Coltrane or Moody Blues albums that you never even touched since your last turntable (Ooh, what is that?) died on you. Those oversized album jackets were really cool though, weren’t they?
Seriously, the word “Big” only emphasizes the size element, and that is a sure way to miss the essence of the data business. And many folks are missing even that little point by calling all decision-making activities that involve even small-sized data “Big Data.” It is entirely possible that this data stuff seems all new to someone, but the data-based decision-making process has been with us for a very long time. If you use that “B” word to differentiate old-fashioned data analytics of yesteryear from the ridiculously large datasets of the present day, yes, that is a proper usage of it. But we all know most people do not mean it that way. One side benefit of this bloated and hyped-up buzzword is that data professionals like myself no longer have to spend 20 minutes explaining what we do for a living; we can simply utter the words “Big Data,” though that is a lot like a grandmother declaring that all her grandchildren work on computers for a living. Better yet, that magic “B” word sometimes opens doors to new business opportunities (or at least a chance to grab a microphone in non-data-related meetings and conferences) that data geeks of the past never dreamed of.

So, I guess it is not all that bad. But lest we forget, all hype leads to overinvestment, all overinvestment leads to disappointment, and all disappointment leads to the purging of related personnel and vendors that bear that hyped-up dirty word in their titles or division names. If this Big Data stuff does not yield significant profit (or reduction in cost), I am certain that those investment bubbles will burst soon enough. Yes, some data folks may be lucky enough to milk it for another two or three years, but brace for impact if all those collected data do not lead to some serious dollar signs. I know how storage and processing costs have decreased significantly in recent years, but they ain’t totally free, and related man-hours aren’t exactly cheap, either. Also, if this whole data business is a new concept to an organization, any money spent on the promise of Big Data easily becomes a liability for the reluctant bunch.

This is why I open up my speeches and lectures with this question: “Have you made any money with this Big Data stuff yet?” Surely, you didn’t spend all that money to provide faster toys and nicer playgrounds to IT folks? Maybe the head of IT had some fun with it, but let’s ask that question of CFOs, not CTOs, CIOs or CDOs. I know some colleagues (i.e., fellow data geeks) who are already thinking about a new name for this—”decision-making activities, based on data and analytics”—because many of us will still be doing that “data stuff” even after Big Data ceases to be cool after the judgment day. Yeah, that Gangnam Style dance was fun for a while, but who still jumps around like a horse?

Now, if you ask me (though nobody did yet), I’d say Big Data should have been “Smart Data,” “Intelligent Data” or something to that extent. Because data must provide insights. Answers to questions. Guidance to decision-makers. To data professionals, piles of data—especially ones that are fragmented, unstructured and unformatted, no matter what kind of fancy names the operating system and underlying database technology may bear—are just a good start. For non-data-professionals, unrefined data—whether they are big or small—would remain distant and obscure. Offering mounds of raw data to end-users is like providing a painting kit when someone wants a picture on the wall. Bragging about the size of the data with impressive-sounding new measurements that end with “bytes” is like counting grains of rice in California in front of a hungry man.

Big Data must get smaller. People want yes/no answers to their specific questions. If such clarity is not possible, probability figures to such questions should be provided; as in, “There’s an 80 percent chance of thunderstorms on the day of the company golf outing,” “An above-average chance to close a deal with a certain prospect” or “Potential value of a customer who is repeatedly complaining about something on the phone.” It is about easy-to-understand answers to business questions, not a quintillion bytes of data stored in some obscure cloud somewhere. As I stated at the end of my last column, the Big Data movement should be about (1) Getting rid of the noise, and (2) Providing simple answers to decision-makers. And getting to such answers is indeed the process of making data smaller and smaller.

In my past columns, I talked about the benefits of statistical models in the age of Big Data, as they are the best way to compact big and complex information into simple answers (refer to “Why Model?”). Models built to predict (or point out) who is more likely to be into outdoor sports, to be a risk-averse investor, to go on a cruise vacation, to be a member of a discount club, to buy children’s products, to be a big-time donor or to be a NASCAR fan are all providing specific answers to specific questions, while each model score is the result of a serious reduction of information, often compressing thousands of variables into one answer. That simplification process in itself provides incredible value to decision-makers, as most wouldn’t know where to cut out unnecessary information to answer specific questions. Using mathematical techniques, we can cut down the noise with conviction.

In model development, “Variable Reduction” is the first major step after the target variable is determined (refer to “The Art of Targeting“). It is often the most rigorous and laborious exercise in the whole model development process, where the characteristics of models are often determined as each statistician has his or her unique approach to it. Now, I am not about to initiate a debate about the best statistical method for variable reduction (I haven’t met two statisticians who completely agree with each other in terms of methodologies), but I happened to know that many effective statistical analysts separate variables in terms of data types and treat them differently. In other words, not all data variables are created equal. So, what are the major types of data that database designers and decision-makers (i.e., non-mathematical types) should be aware of?

In the business of predictive analytics for marketing, the following three types of data make up three dimensions of a target individual’s portrait:

  1. Descriptive Data
  2. Transaction Data / Behavioral Data
  3. Attitudinal Data

In other words, if we get to know all three aspects of a person, it will be much easier to predict what the person is about and/or what the person will do. Why do we need these three dimensions? If an individual has a high income and is living in a highly valued home (demographic element, which is descriptive); and if he is an avid golfer (behavioral element often derived from his purchase history), can we just assume that he is politically conservative (attitudinal element)? Well, not really, and not all the time. Sometimes we have to stop and ask what the person’s attitude and outlook on life is all about. Now, because it is not practical to ask everyone in the country about every subject, we often build models to predict the attitudinal aspect with available data. If you got a phone call from a political party that “assumes” your political stance, that incident was probably not random or accidental. Like I emphasized many times, analytics is about making the best of what is available, as there is no such thing as a complete dataset, even in this age of ubiquitous data. Nonetheless, these three dimensions of the data spectrum occupy a unique and distinct place in the business of predictive analytics.

So, in the interest of obtaining, maintaining and utilizing all possible types of data—or, conversely, reducing the size of data with conviction by knowing what to ignore, let us dig a little deeper:

Descriptive Data
Generally, demographic data—such as people’s income, age, number of children, housing size, dwelling type, occupation, etc.—fall under this category. For B-to-B applications, “firmographic” data—such as number of employees, sales volume, year started, industry type, etc.—would be considered descriptive data. It is about what the targets “look like” and, generally, they are frozen in the present time. Many prominent data compilers (or data brokers, as the U.S. government calls them) collect, compile and refine the data and make hundreds of variables available to users in various industry sectors. They also fill in the blanks using predictive modeling techniques. In other words, the compilers may not know the income range of every household, but using statistical techniques and other available data—such as age, home ownership, housing value, and many other variables—they provide their best estimates in case of missing values. People often have an allergic reaction to such data compilation practices, citing privacy concerns, but these types of data are not about looking up one person at a time; they are about analyzing and targeting groups (or segments) of individuals and households. In terms of predictive power, they are quite effective and the results are very consistent. The best part is that most of the variables are available for every household in the country, whether they are actual or inferred.

Other types of descriptive data include geo-demographic data, and the Census Data by the U.S. Census Bureau fall under this category. These datasets are organized by geographic denominations such as Census Block Group, Census Tract, County or ZIP Code Tabulation Area (ZCTA, much like postal ZIP codes, but not exactly the same). Although they are not available on an individual or a household level, the Census data are very useful in predictive modeling, as every target record can be enhanced with them, even when name and address are not available, and the data themselves are very stable. The downside is that while the datasets are free through the Census Bureau, the raw datasets contain more than 40,000 variables. Plus, due to budget cuts and changes in survey methods during the past decade, the sample size (yes, they sample) decreased significantly, rendering some variables useless at lower geographic denominations, such as Census Block Group. There are professional data companies that have narrowed down the list of variables to manageable sizes (300 to 400 variables) and filled in the missing values. Because they are geo-level data, variables are in the forms of percentages, averages or median values of elements, such as gender, race, age, language, occupation, education level, real estate value, etc. (as in, percent male, percent Asian, percent white-collar professionals, average income, median school years, median rent, etc.).

There are many instances where marketers cannot pinpoint the identity of a person due to privacy issues or challenges in data collection, and the Census Data play a role of effective substitute for individual- or household-level demographic data. In predictive analytics, duller variables that are available nearly all the time are often more valuable than precise information with limited availability.

Transaction Data/Behavioral Data
While descriptive data are about what the targets look like, behavioral data are about what they actually did. Often, behavioral data are in forms of transactions, so many just call them transaction data. What marketers commonly refer to as RFM (Recency, Frequency and Monetary) data fall under this category. In terms of predictive power, they are truly at the top of the food chain. Yes, we can build models to guess who potential golfers are with demographic data, such as age, gender, income, occupation, housing value and other neighborhood-level information, but if you get to “know” that someone buys a box of golf balls every six weeks or so, why guess? Further, models built with transaction data can even predict the nature of future purchases, in terms of monetary value and frequency intervals. Unfortunately, many who have access to RFM data are using them only in rudimentary filtering, as in “select everyone who spends more than $200 in a gift category during the past 12 months,” or something like that. But we can do so much more with rich transaction data in every stage of the marketing life cycle for prospecting, cultivating, retaining and winning back.
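The RFM roll-up described above, from a transaction-level table to a buyer-centric record, can be sketched as follows. Field names are assumptions made for illustration; note how a buyer with no transactions gets a real zero for frequency and monetary but a true missing value for recency:

```python
# Minimal RFM (Recency, Frequency, Monetary) summary per buyer,
# rolled up from transaction-level records.

from datetime import date

def rfm(transactions, as_of):
    """transactions: list of (purchase_date, amount) tuples for one buyer."""
    if not transactions:
        # No purchases: frequency/monetary are real zeros, but recency is
        # incalculable -- record a true missing value, never zero
        return {"recency_days": None, "frequency": 0, "monetary": 0.0}
    last_purchase = max(d for d, _ in transactions)
    return {
        "recency_days": (as_of - last_purchase).days,
        "frequency": len(transactions),
        "monetary": sum(amount for _, amount in transactions),
    }

txns = [(date(2014, 3, 1), 45.0), (date(2014, 5, 15), 60.0)]
summary = rfm(txns, as_of=date(2014, 6, 1))
# 17 days since last purchase, 2 transactions, $105 total
```

Once every buyer carries a record like this, the rudimentary filter in the text ("more than $200 in the past 12 months") becomes a one-line selection, and the same fields can feed far richer models.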

Other types of behavioral data include non-transaction data, such as click data, page views, abandoned shopping baskets or movement data. This type of behavioral data is getting a lot of attention as it is truly “big.” The data have been out of reach for many decision-makers before the emergence of new technology to capture and store them. In terms of predictability, nevertheless, they are not as powerful as real transaction data. These non-transaction data may provide directional guidance, as they are what some data geeks call “a-camera-on-everyone’s-shoulder” type of data. But we all know that there is a clear dividing line between people’s intentions and their commitments. And it can be very costly to follow every breath you take, every move you make, and every step you take. Due to their distinct characteristics, transaction data and non-transaction data must be managed separately. And if used together in models, they should be clearly labeled, so the analysts will never treat them the same way by accident. You really don’t want to mix intentions and commitments.

The trouble with behavioral data is that (1) they are difficult to compile and manage; (2) they get big—sometimes really big; (3) they are generally confined within divisions or companies; and (4) they are not easy to analyze. In fact, most of the examples that I used in this series are about transaction data. Now, No. 3 here can be really troublesome, as it equates to availability (or lack thereof). Yes, you may know everything that happened with your customers, but do you know where else they are shopping? Fortunately, there are co-op companies that can answer that question, as they are compilers of transaction data across multiple merchants and sources. And combined data can be exponentially more powerful than data in silos. Now, because transaction data are not always available for every person in databases, analysts often combine behavioral data and descriptive data in their models. Transaction data usually become the dominant predictors in such cases, while descriptive data play the supporting role of filling in the gaps and smoothing out the predictive curves.

As I have stated repeatedly, predictive analytics in marketing is all about finding out (1) whom to engage, and (2) if you decide to engage someone, what to offer to that person. Using carefully collected transaction data for most of their customers, some supermarket chains have achieved 100 percent customization rates for their coupon books. That means no two coupon books are exactly the same, which is quite an impressive accomplishment. And that is all transaction data in action, and it is a great example of “Big Data” (or, rather, “Smart Data”).
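
The mechanics behind that kind of coupon-book customization can be sketched very simply: count each customer’s purchases by category, then pick offers for their top categories. The function and data below are hypothetical illustrations, not any chain’s actual system.

```python
from collections import Counter

def personalize_coupons(purchases, coupon_catalog, n=3):
    """Pick coupon offers for each customer's n most-purchased categories.

    purchases: list of (customer_id, category) transaction records.
    coupon_catalog: dict mapping category -> coupon offer text.
    Returns a dict of customer_id -> list of coupon offers.
    """
    # Tally purchase counts per category, per customer.
    history = {}
    for customer, category in purchases:
        history.setdefault(customer, Counter())[category] += 1

    # Build each customer's book from their top categories.
    books = {}
    for customer, counts in history.items():
        top = [cat for cat, _ in counts.most_common(n) if cat in coupon_catalog]
        books[customer] = [coupon_catalog[cat] for cat in top]
    return books

purchases = [
    ("c1", "dairy"), ("c1", "dairy"), ("c1", "produce"),
    ("c2", "bakery"), ("c2", "meat"), ("c2", "bakery"),
]
catalog = {"dairy": "$1 off milk", "produce": "10% off fruit",
           "bakery": "$0.50 off bread", "meat": "$2 off steak"}
books = personalize_coupons(purchases, catalog, n=2)
```

With distinct purchase histories, no two customers get the same book, which is the whole point of the exercise.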

Attitudinal Data
In the past, attitudinal data came from surveys, primary research and focus groups. Now, basically all social media channels function as gigantic focus groups. Through virtual places, such as Facebook, Twitter or other social media networks, people are freely volunteering what they think and feel about certain products and services, and many marketers are learning how to “listen” to them. Sentiment analysis falls under that category of analytics, and many automatically think of this type of analytics when they hear “Big Data.”
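
At its crudest, that kind of “listening” is just scoring posts against positive and negative word lists. Production sentiment tools are far more sophisticated, so treat this as a toy sketch with made-up word lists:

```python
# Hypothetical keyword lists; real sentiment models use far richer features.
POSITIVE = {"love", "great", "excellent", "happy", "recommend"}
NEGATIVE = {"hate", "terrible", "awful", "broken", "refund"}

def sentiment_score(post):
    """Return (positive - negative) keyword count for one post."""
    words = {w.strip(".,!?").lower() for w in post.split()}
    return len(words & POSITIVE) - len(words & NEGATIVE)

posts = ["I love this blender, great motor!",
         "Terrible battery, I want a refund."]
scores = [sentiment_score(p) for p in posts]
```

Even this toy version illustrates the macro-level nature of social data: you get a trend line of sentiment, but no identity behind any individual score.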

The trouble with social data is:

  1. We often do not know who’s behind the statements in question, and
  2. They are in silos, and it is not easy to combine such data with transaction or demographic data, due to the lack of identifiable information about their sources.

Yes, we can see that a certain political candidate is trending high after an impressive speech, but how would we connect that piece of information to the people who will actually donate money to the candidate’s causes? If we can find out “where” the target is via an IP address and related ZIP codes, we may be able to connect the voter to geo-demographic data, such as the Census. But, generally, personally identifiable information (PII) is only accessible by the data compilers, if they even bothered to collect it.

Therefore, most such studies are on a macro level, citing trends and directions, and the analysts in that field are quite different from the micro-level analysts who deal with behavioral data and descriptive data. Now, the former provide important insights regarding the “why” part of the equation, which is often the hardest thing to predict; while the latter provide answers to “who, what, where and when.” (“Who” is the easiest to answer, and “when” is the hardest.) That “why” part may dictate the product development part of the decision-making process at the conceptual stage (as in, “Why would customers care for a new type of dishwasher?”), while “who, what, where and when” are more about selling the developed products (as in, “Let’s sell those dishwashers in the most effective ways.”). So, it can be argued that these different types of data call for different types of analytics at different cycles in the decision-making process.

Obviously, there are more types of data out there. But for marketing applications dealing with humans, these three types of data complete the buyers’ portraits. Now, depending on what marketers are trying to do with the data, they can prioritize where to invest first and what to ignore (for now). If they are early in the marketing cycle trying to develop a new product for the future, they need to understand why people want something and behave in certain ways. If signing up as many new customers as possible is the immediate goal, finding out who and where the ideal prospects are becomes the most imminent task. If maximizing the customer value is the ongoing objective, then you’d better start analyzing transaction data more seriously. If preventing attrition is the goal, then you will have to line up the transaction data in time series format for further analysis.
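
That last case, lining up transaction data in time series for attrition analysis, reduces at its simplest to asking when each customer last bought anything. A minimal sketch, with hypothetical names and a hypothetical 90-day inactivity cutoff:

```python
from datetime import date

def flag_attrition_risk(transactions, as_of, inactive_days=90):
    """Flag customers whose most recent purchase is older than the cutoff.

    transactions: list of (customer_id, purchase_date) tuples.
    Returns a sorted list of at-risk customer ids.
    """
    # Find each customer's most recent purchase date.
    last_seen = {}
    for customer, when in transactions:
        if customer not in last_seen or when > last_seen[customer]:
            last_seen[customer] = when
    # Anyone quiet longer than the cutoff is an attrition risk.
    return sorted(c for c, when in last_seen.items()
                  if (as_of - when).days > inactive_days)

transactions = [
    ("c1", date(2015, 1, 5)), ("c1", date(2015, 6, 1)),
    ("c2", date(2015, 2, 10)),
]
at_risk = flag_attrition_risk(transactions, as_of=date(2015, 7, 1))
```

Real attrition models go much further, looking at spend velocity and inter-purchase intervals, but the prerequisite is the same: transactions organized in time order per customer.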

The business goals must dictate the analytics, the analytics call for specific types of data to meet those goals, and the supporting datasets should be in “analytics-ready” formats. Not the other way around, where the business is dictated by the limitations of analytics, and analytics are hampered by inadequate, cluttered data. That type of business-oriented hierarchy should be the main theme of effective data management, and with clear goals and a proper data strategy, you will know where to invest first and what data to ignore as a decision-maker, not necessarily as a mathematical analyst. And that is the first step toward making Big Data smaller. Don’t be impressed by the size of the data, as size often blurs the big picture, and not all data are created equal.

Collaborating With Sales for Sales

I presented the Bottoms-Up Marketing webinar a couple of weeks ago, and following the event found the same question had been submitted by a number of attendees. The question? How does a marketer get sales to follow up with leads? I came away feeling I had done a poor job of helping the audience understand: It’s not “How do you get sales to do what you want?” It’s “How do you give sales something they want to work with?”

The premise of bottom-up marketing is that we marketers are only half the equation. Yes, our skills and expertise are critical to the campaign design and architecting process. But when the sales funnel requires a closer, we must turn to the experience of our sales and CSR teams to understand the traditional process our business has used to convert leads to customers.

When a marketer asks the question, “How do I make sales do their job?” I immediately know this is an organization where marketing and the closers are firmly pitted against one another, and conversation and collaboration are a thing of the past—if they ever existed. It’s a terrible question, and it says much about how you see yourself and your department in the sales funnel. If this is you, prepare yourself for a chewing out.

Resolution of discord comes only where there is conversation and compromise.

Identifying prospects and warming leads without the input of the very people who close those leads is like writing a script without considering the audience. Oh sure, you can do it, but how many people from your audience will buy a ticket to your next event if you write only for yourself?

We marketers know better than to act as an audience (focus group) of one. Our job is to develop content for our mass audience. The people within our business with the best understanding of that audience are the closing team. Our closers, be they sales, CSRs or another department, have a front-row seat to what our customers need, want and require, and you would do well to pay attention.

Stop wondering how you can manipulate your sales team and start involving them.

At the very beginning—when you are brainstorming your next campaign—start at the bottom of the sales funnel by meeting with your closers to get their insight on crafting a digitized version of their warming process. You will not be able to duplicate all of their functions—and as they are people who bring unique personalities to the closing process, you shouldn’t try—but ask your sales team about resources and processes and contribute where you can. Move the easy rocks—use nurture emails to provide instantaneous responses for form completions while setting the stage for a sales call, provide links to videos, enroll them in a demo—do the rote work that capitalizes on your automated-campaign processes.

Our closers excel in so many areas we marketers guess, struggle, test and analyze—all in a never-ending effort to learn more about:

  • Finding prospects
  • Distilling prospects to leads
  • Determining which leads are qualified leads
  • Nurturing leads through the sales funnel
  • Converting leads to customers

Take the shortcut. Your closers already have a great deal of this insight and are usually willing to impart at least some of it to you.

Look at it from their point of view: If you were in sales and the marketing department was delivering you qualified/hot leads, wouldn’t you rather process those than start anew with a cold call? Of course you would. So do they.

So how do you make the closers do their job and close the sales you give them? Invite them to participate—from the bottom up.

LinkedIn Prospecting: How Much Time Should You Spend?

“How much time should you spend on LinkedIn each week?”

It’s a noble question. I understand why you ask it. But worrying about time is a dangerous place to start.

True, we live in a world where we have limited time for new ideas. But saying, “You should spend X hours per week on LinkedIn” would be disingenuous.

Because there is no credible answer to the question. Instead, the best starting point is simple: Get more leads, faster, by creating a LinkedIn prospecting system.

You will be effective—regardless of how much time you invest on LinkedIn!

Where to Start With LinkedIn
Here’s the skinny: The more success you have with LinkedIn prospecting, the more time you’ll want to invest in it.

So invest time, first, in making sure you experience a little bit of success. Start by making the most out of every minute you commit.

Learn a systematic approach to:

  • Attract potential leads to connect with you
  • Spark questions about what you sell in buyers’ minds
  • Help prospects self-qualify faster

Let’s start today. Pick one of the above as a goal. Let’s commit to taking the first step toward a better LinkedIn prospecting system.

The Problem: Lack of a Good System
Most sales reps struggle with LinkedIn prospecting because they don’t use a system. Or the process they’re committed to doesn’t work.

For example, we’ve been told (by “experts”) to invest time on LinkedIn by:

  • publishing (blogging on the LinkedIn platform)
  • polishing your profile with new features
  • sharing knowledge with Connections and in Groups

Publishing on LinkedIn’s platform, sharing knowledge and polishing your profile might be effective—if they’re part of a system. These tactics, alone, are not enough. If you’ve already tried them, you know what I mean!

A Direct Response Copywriting System
“How can I get customers to view content on my profile and be so excited they contact me?”

That’s a better question! One that leads us toward a proven system. A system to get customers curious about you. A better way to provoke response from buyers.

Content that makes customers respond does one thing really well: It uses direct response copywriting to make potential buyers think, “Yes, yes, YES … I can take action on that. In fact, I’ll probably get results from taking this advice.”

Most importantly, buyers must conclude their thoughts with an urge.

“How can I get my hands on more of those kinds of insights/tips?”

This simple idea (using a direct or subtle call to action) is the difference between wasting time on LinkedIn and effectively prospecting with it.

Response is what drives success. It’s what gets you paid. Invest time on LinkedIn with a system that grabs customers’ attention and gets them to respond.

Remember, publishing content on LinkedIn’s blogging platform or posting interesting updates will not work. Not without the direct response element.

More Success = More Time
The more success you have with LinkedIn prospecting, the more time you’ll want to invest in it. It only makes sense to invest time, first, in making sure you experience a little bit of success. Success that you can increase, systematically.

Start your LinkedIn prospecting journey by making the most out of every minute. Commit to LinkedIn, but resist worrying about how many hours per week to invest.

Instead, invest time for a few months in optimizing a prospecting approach. Use this proven system to get started. I guarantee you won’t worry so much about how much time you’re investing. In fact, you will probably want to invest more time in LinkedIn prospecting!

Good luck. Let me know how it goes for you.

The Email Hierarchy of Needs: Deliverability is the Foundation

If you’re not getting the most out of your email messaging, you might not be asking the right questions.

I can’t count how many times I’ve been asked, “What’s the best day of the week to send email?” “What’s the best time of day to send email?” “Which is the best email provider?” These questions are much less important than the big ones: “Is my email getting to my subscribers?” “Can subscribers read my emails on their mobile device?” “Do subscribers want to receive my email or are they hitting ‘spam’?”

Many times, companies want to run before they walk. There are times when being first to market, or shipping a beta version of a product, is more important than getting it perfect the first time. However, if you take that approach with email messaging, you’d better make sure you have your fundamentals squared away first. What does it matter what time the email is sent if it lands in the “spam” folder anyway? It doesn’t matter what email provider you use if you keep mailing outdated lists.

The foundation: Deliverability and inbox placement
In the end, none of your email messaging efforts will make any impact if the subscriber doesn’t receive the email. The first barrier to overcome in email marketing is deliverability. Email services, ISPs that provide email services and the software on which subscribers view emails have an arsenal of anti-spam tactics they use to keep your email from getting to subscribers. In a world of spammers, phishers and corporate network admins trying to increase productivity by filtering distracting emails, the odds are stacked against your email message ever being delivered to your subscribers. There are a number of factors that contribute to your deliverability and inbox placement, including the following:

Sending platform
This is the reason marketers use email service providers (ESPs) instead of sending emails via Outlook or Gmail. Brands also use ESPs instead of letting their developers with no email experience say, “we’ll build it.” Email delivery is complex.

The configuration of the mail transfer agent, the proper processing of bounces and unsubscribes, the feedback loops necessary to track and opt out spam complaints, and the proper throttle rates per domain take a team. This is where the question “What is the best ESP?” becomes interesting. All successful ESPs must have this piece down to a science. The first question I ask an emerging ESP is how many people are on its deliverability team. If the answer is “we all just pitch in” (that’s a real answer I received once), then I stay away.

Your data
The single most important thing you have control of to optimize deliverability is good data practices. This means list hygiene and validation to eliminate malformed and undeliverable email addresses. It means opting out subscribers who ask to be unsubscribed. It means regularly mailing your entire list, having clean and transparent opt-in practices, and keeping your database clean and centralized to allow you to target subscribers based on their actions and preferences.
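
The hygiene step above can be automated at its most basic level: drop obviously malformed addresses and anyone on your suppression list before every send. This is a deliberately simple sketch with a made-up regex and function names; real validation services and ESP suppression handling do far more.

```python
import re

# Deliberately loose format check -- real-world validation is more involved.
ADDRESS_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def clean_list(addresses, suppressed):
    """Drop malformed addresses and anyone on the suppression list."""
    cleaned = []
    for addr in addresses:
        addr = addr.strip().lower()  # normalize before comparing
        if ADDRESS_RE.match(addr) and addr not in suppressed:
            cleaned.append(addr)
    return cleaned

raw = ["Ann@example.com ", "bad-address", "bob@example.com", "eve@@example.com"]
suppressed = {"bob@example.com"}  # unsubscribes and spam complainers
mailable = clean_list(raw, suppressed)
```

Running a pass like this before every campaign keeps hard bounces and complaint rates down, which is exactly what the receiving ISPs watch when deciding whether you reach the inbox.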

Your creative
A terrible email message alone won’t land your message in the spam folder, but it certainly won’t help. Email can be marked as spam for a combination of things: content, IP reputation, from name/domain, etc. If you’re spamming people, your email won’t get delivered, even if your content doesn’t have “FREE” or “Viagra” in it. If you send emails that people open and click on like crazy and nobody ever hits “this is spam,” you can say free (almost) as much as you want. Most companies are somewhere in between. Test prior to sending. Usually one “free” won’t kill your deliverability.

Of course, this oversimplifies the complex issue of email deliverability into a few basic tenets. Spam filters are updated regularly in an attempt to thwart the efforts of spammers. Companies will have the most success getting their emails delivered by respecting the permission and preferences of their subscribers, and by working with a reputable ESP that has a deliverability team to tackle the technical aspects of bounce handling and email send settings.