# Beware the Small Sample

Everyone knows that flipping a fair coin produces 50 percent heads and 50 percent tails. But that holds only when the number of tries is large enough to generate the 50-50 outcome. In fact, in a small sample of four coin flips, the share of flips after a head that come up tails averages almost 60 percent. More about that later.

I’ve seen all types of people fall prey to the lure of making decisions based on small samples: They include undergraduates in my Advertising Research classes, brand managers in Fortune 500 companies, even seasoned direct marketers. Some examples:

• Undergrads planning a focus group with 10 college students predict that they will be able to report what percentage of college students are aware of or use a particular product. Of course, we can forgive the undergrads. They’re in class to learn that when you talk to 10 students at Temple, you’re going to find out what those 10 people at Temple think, not what 20 million undergraduates across the U.S. think.
• Brand managers exposing three focus groups of 10 people (30 subjects in total) to different creative concepts and concluding that because 70 percent of the people interviewed like a particular concept, it must be a winner. This is a bit more egregious, because we would expect marketers with MBAs to understand that when 21 out of 30 people think something, the 90 percent confidence range reaches as low as 17, slightly more than half rather than almost three quarters.
• Direct marketers testing two creative executions against each other in cells of 25,000 and declaring one execution the winner when it generated 125 responses (0.5 percent) vs. 100 responses (0.4 percent). Here, there’s a 90 percent chance that the “winning” cell could have had as few as 108 responses and the loser could have had as many as 145.
(See the “Statistical Variation Tables” in “Direct Marketing – Strategy, Planning and Execution” by Ed Nash for confirmation of these estimates).
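
The ranges quoted above can be roughly reproduced with a normal approximation to the binomial. This is only a sketch under that assumption, not the method behind the published tables, so exact figures need not match them. The function name `count_interval_90` and the constant `Z_90` are made up for illustration.

```python
import math

Z_90 = 1.645  # two-sided 90 percent critical value of the standard normal


def count_interval_90(successes, n):
    """Approximate 90% range for a count of successes out of n.

    Assumption: normal approximation to the binomial, which is crude
    for small n but good enough to show how wide the range is.
    """
    p = successes / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    low = max(0.0, p - Z_90 * se)
    high = min(1.0, p + Z_90 * se)
    return low * n, high * n


for label, successes, n in [
    ("focus groups", 21, 30),
    ("winning cell", 125, 25_000),
    ("losing cell", 100, 25_000),
]:
    low, high = count_interval_90(successes, n)
    print(f"{label}: {successes} of {n} -> roughly {low:.1f} to {high:.1f}")
```

Under this approximation the lower bound for the focus groups lands near 17 of 30, and the ranges for the two direct mail cells overlap, which is why neither result supports a confident call.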

These errors in judgement are the result of sampling error: the sample being studied is not large enough to be representative of the entire population. The temptation to draw false conclusions is particularly strong in qualitative research, which is useful for gaining insights into people’s attitudes, beliefs and motivations, but is not intended to determine what percentage of the population has those particular traits. Qualitative researchers always caveat their focus group findings with an appropriate disclaimer, but that doesn’t always stop their clients from hearing what they want to hear.

So what about the potential outcomes of four coin tosses?

A set of four tosses produces 16 possible outcomes from HHHH to TTTT. Steven Landsburg of the University of Rochester created a table that shows the probability of a “head” being followed by another head for every possible combination of the four flips. In the small sample of four coin flips, there are 14 possible outcomes where a head can be followed by another head (i.e., excluding TTTH and TTTT).

If we add up the share of heads that are followed by another head in each of the 14 sequences where that is possible (i.e., 100 percent for HHHH, 67 percent for HHHT, 50 percent for HHTH, 0 percent for HTHT, etc.), the result is 567 percent. Dividing that number by 14 (the number of possible outcomes where a head can be followed by a head), we get 40.5 percent, which is the average percentage of the time a head will be followed by another head in four coin flips. (See the New York Times piece “Gamblers, Scientists and the Mysterious Hot Hand” for a table of the potential outcomes and a deeper explanation of the phenomenon.)
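
The arithmetic can be checked by brute force. The sketch below enumerates all 16 sequences, scores each one by the share of its heads (in the first three positions) that are followed by another head, and averages the scores over the sequences where that share is defined. The helper name `share_heads_after_head` is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def share_heads_after_head(seq):
    """Share of flips that follow a head and are themselves heads.

    Returns None when no head occurs in the first three positions,
    because then there is nothing to score (TTTH and TTTT).
    """
    followers = [seq[i + 1] for i in range(len(seq) - 1) if seq[i] == "H"]
    if not followers:
        return None
    return Fraction(followers.count("H"), len(followers))


sequences = ["".join(flips) for flips in product("HT", repeat=4)]
shares = {seq: share_heads_after_head(seq) for seq in sequences}
eligible = {seq: share for seq, share in shares.items() if share is not None}

total = sum(eligible.values())    # sum of the per-sequence shares
average = total / len(eligible)   # average over the eligible sequences

print(len(eligible), "eligible sequences")
print(f"sum of shares: {float(total):.1%}")
print(f"average share: {float(average):.1%}")
```

Exact fractions are used so the total comes out as 17/3 (about 567 percent) and the average as 17/42 (about 40.5 percent) with no rounding drift.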

What’s at play here is selection bias. Examining all the potential outcomes from HHHH to TTTT, the 24 flips that immediately follow a head split evenly: 12 heads and 12 tails. But when we score each of the 14 sequences where a head can follow another head separately and then average those scores, a head is followed by another head only about 40 percent of the time.
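
The even split is easy to confirm by pooling every flip that follows a head across all 16 sequences, instead of averaging sequence by sequence. A minimal sketch, with variable names chosen only for illustration:

```python
from itertools import product

heads_after_head = 0
tails_after_head = 0

# Walk every adjacent pair of flips in every four-flip sequence and
# tally what comes right after each head.
for seq in product("HT", repeat=4):
    for current, following in zip(seq, seq[1:]):
        if current == "H":
            if following == "H":
                heads_after_head += 1
            else:
                tails_after_head += 1

print("heads after a head:", heads_after_head)
print("tails after a head:", tails_after_head)
```

Pooled this way, each flip counts equally and the coin looks fair. The lower figure appears only when each sequence is given equal weight, so sequences with a single eligible flip count as much as sequences with three.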

So be careful about what conclusions you draw from samples that are too small to represent the entire population. And if you tally what follows each head in a game of four coin flips, expect tails to show up more than half the time on average, even though every individual flip is still 50-50.

## Author: Chuck McLeester

Chuck McLeester's blog explores issues about marketing and marketing measurement. He is a marketing strategist and analyst with experience in healthcare, pharmaceuticals, financial services, pet products, travel/hospitality, publishing and other categories. He spent several years as a client-side direct marketer and 25 years on the agency side developing expertise in direct, digital, and relationship marketing. Now he consults with marketers and advertising agencies to create measurable marketing programs.

## One thought on “Beware the Small Sample”

1. Douglas Kelly says:

It’s about time someone like you talks about this. I’ve been in marketing research for more than 30 years, and I’ve seen people who simply don’t understand statistics presume that they do. The results are usually systemically problematic, and the conclusions drawn are always wrong.

Any survey or polling without N = 1,200 people, the very minimum, is not worth looking at. I never accept the results of a study without also having the cross tabulations and the questionnaire.

The propensity of people (amateurs) to read a one-line finding and conclude anything is . . . well, amateurish. Yet the news media and other talking heads do it all the time.