Properly measuring customer loyalty is often a difficult task in a multichannel B2B marketing environment. The first question is often, “Where should we start digging when there are many data silos?” Before embarking on a massive data consolidation project throughout the organization, we suggest first defining the problem statements by breaking down what customer loyalty means to you, as that exercise will narrow down the list of data assets to be dealt with.
Who’s likely to be your valuable customer? What will their value be in the next few years? How long will they continue to do business with you? Which ones are in vulnerable positions, and who’s likely to churn in the next three months? Wouldn’t it be great if you could identify who’s vulnerable among your valuable customers “before” they actually stop doing business with you?
Marketers often rely on surveys to measure loyalty. Net Promoter Score, for example, is a good way to measure customer loyalty for the brand. But if you want to be proactive about each customer, you will need to know the loyalty score for everyone in your base. And asking “everyone” is cost-prohibitive and impractical. On top of that, the respondents may not be completely honest about their intentions, especially when it comes to monetary transactions.
That’s where modeling techniques come in. Without asking direct questions, what are the leading indicators of loyalty or churn? What specific behaviors lead to longevity of the relationship or complete attrition? In answering those questions, past behavior is often proven to be a better predictor of future behavior than survey data, as what people say they would do and what they actually do are indeed different.
Modeling is also beneficial because it fills inevitable data gaps. No matter how much data you may have collected, you will never know everything about everyone in your base. Models are tools that make the most of available data assets, summarizing complex datasets into answers to questions. How loyal is Company XYZ? The loyalty model score will express that in numeric form, such as a score between one and 10 for every entity in question. That is a lot simpler than setting up rules by digging through a long data dictionary.
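To illustrate what "a score between one and 10 for every entity" can look like in practice, here is a minimal sketch that ranks raw model outputs and maps them to integer scores. The function name `to_decile_scores`, the company names and the raw values are all hypothetical, and rank-based binning is only one common way to produce such a score, not necessarily the method used in the project described here.

```python
def to_decile_scores(raw_scores):
    """Map raw model outputs to integer scores from 1 (lowest) to 10 (highest).

    raw_scores: dict of entity name -> raw model output (higher = more loyal).
    Entities are ranked, and each rank is placed into one of ten equal bands.
    """
    n = len(raw_scores)
    # Sort names by raw score, lowest first; ties are broken by name so the
    # result is deterministic.
    ranked = sorted(raw_scores, key=lambda name: (raw_scores[name], name))
    return {name: (rank * 10) // n + 1 for rank, name in enumerate(ranked)}


# Hypothetical raw outputs for ten companies.
raw = {
    "Company A": 0.91, "Company B": 0.15, "Company C": 0.48,
    "Company D": 0.73, "Company E": 0.05, "Company F": 0.33,
    "Company G": 0.58, "Company H": 0.27, "Company I": 0.82,
    "Company XYZ": 0.66,
}
scores = to_decile_scores(raw)
print(scores["Company XYZ"])
```

With only ten entities, each one lands in its own band; on a real customer base, each score would cover roughly a tenth of the entities.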
Our team recently developed a loyalty model for a leading computing service company in the U.S. The purposes of the modeling exercise were two-fold:
- Find a group of customers who are likely to be loyal customers, and
- Find the “vulnerable” segment in the base.
This way, the client can treat “potentially” loyal customers even before they show all of the signs of loyalty. At the opposite end of the spectrum, the client can proactively contact vulnerable customers if their present or future value is high (measuring that requires a separate customer value model). We would call that the “valuable-vulnerable” segment.
Building a separate churn model would have been the more proper approach, but that would have required long historical data in the form of time-series variables (processes for those can be time-consuming and costly). To get to the answer fast with the minimal data that we had access to, we chose to build one loyalty model, making sure that the bottom scores could be used to measure vulnerability, while the top scores indicate loyalty.
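As a rough sketch of how the “valuable-vulnerable” segment could be flagged once both scores exist, the example below combines a loyalty score with a value score. The `Customer` class, the `segment` function and all three cutoffs are hypothetical; in practice, the thresholds would be set by looking at the score distributions and the capacity of the contact program.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    loyalty_score: int  # 1 (most vulnerable) to 10 (most loyal)
    value_score: int    # 1 (lowest value) to 10 (highest), from a separate value model


def segment(customer, loyal_min=8, vulnerable_max=3, valuable_min=8):
    """Assign a treatment segment; all cutoffs are illustrative assumptions."""
    if customer.loyalty_score <= vulnerable_max:
        # Low loyalty score: split by value to decide who gets proactive contact.
        if customer.value_score >= valuable_min:
            return "valuable-vulnerable"
        return "vulnerable"
    if customer.loyalty_score >= loyal_min:
        return "potentially loyal"
    return "middle"


# Hypothetical customers.
base = [
    Customer("Company A", loyalty_score=9, value_score=6),
    Customer("Company B", loyalty_score=2, value_score=9),
    Customer("Company C", loyalty_score=3, value_score=4),
    Customer("Company D", loyalty_score=5, value_score=8),
]
segments = {c.name: segment(c) for c in base}
print(segments)
```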
What did we need to build this model? Again, to provide a “usable” answer in the shortest time, we only used the past three years of transaction history, along with some third-party firmographic data. We considered promotion and response-history data, technical support data, non-transactional engagement data and client-initiated activity data, but we deferred them to future enhancements due to difficulties in data procurement.
To define what “loyal” means in mathematical terms for modeling, we considered multiple options, as that word can mean lots of different things. Depending on the purpose, it could mean high-value customers, frequent buyers, tenured customers, or other measurements of loyalty and levels of engagement. Because we were starting with basic transaction data, we examined many possible combinations of RFM (recency, frequency, monetary) data.
In doing so, we observed that many indicators of loyalty behave radically differently among different segments, defined by spending level in this instance, which is a clear sign that separate models are required. In other cases, such overarching segments can be defined by region, product line or target group, too.
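For readers less familiar with RFM, the sketch below shows one simple way to summarize raw transaction records into recency, frequency and monetary variables per customer. The record layout `(customer_id, date, amount)`, the function name `rfm_summary` and the sample data are assumptions for illustration, not the actual data structure used in this project.

```python
from collections import defaultdict
from datetime import date


def rfm_summary(transactions, as_of):
    """Summarize (customer_id, date, amount) records into RFM variables.

    recency_days: days between the last transaction and the as_of date
    frequency:    number of transactions
    monetary:     total amount spent
    """
    grouped = defaultdict(list)
    for customer_id, tx_date, amount in transactions:
        grouped[customer_id].append((tx_date, amount))

    summary = {}
    for customer_id, rows in grouped.items():
        last_date = max(tx_date for tx_date, _ in rows)
        summary[customer_id] = {
            "recency_days": (as_of - last_date).days,
            "frequency": len(rows),
            "monetary": sum(amount for _, amount in rows),
        }
    return summary


# Hypothetical transaction history.
history = [
    ("C001", date(2023, 1, 10), 1200),
    ("C001", date(2023, 6, 1), 800),
    ("C002", date(2022, 3, 15), 5000),
]
rfm = rfm_summary(history, as_of=date(2023, 7, 1))
print(rfm["C001"])
```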
So we divided the base into small, medium and large segments, based on annual spending level, then started examining other types of indicators of loyalty for target definition. If we had had some survey data, we could have used it to define what “loyal” means. In this case, we mixed combinations of recency and frequency factors, and each segment ended up with a different target definition. For the first round, we defined loyal customers as those with a last transaction date within the past 12 months and a total transaction count in the top 10 to 15 percent range, where the governing idea was to have target universes that are “not too big” and “not too small.” During this exercise, we concluded that the small segment of big spenders could be deemed loyal as a whole, and we didn’t need a model to discriminate further within it.
As expected, the models built for small- and medium-level spenders were quite different in terms of the data used and the weight assigned to each variable. For example, even for purchases in the same product category, a recency variable (weeks since the last transaction within the category) showed up as a leading indicator in one model, while various bands of categorical spending levels were important factors in the other. Common variables, such as industry classification code (SIC code), also behaved very differently, validating our decision to build separate models for each spending-level segment.
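The segmentation and first-round target definition described above can be sketched roughly as follows. The spending cutoffs, the function names and the sample data are hypothetical; only the 12-month recency window and the top 10 to 15 percent frequency range come from the definition above, and the sketch uses 15 percent as one point in that range.

```python
def spending_segment(annual_spend, medium_min=10_000, large_min=100_000):
    """Assign a spending segment; the dollar cutoffs are illustrative assumptions."""
    if annual_spend >= large_min:
        return "large"
    if annual_spend >= medium_min:
        return "medium"
    return "small"


def flag_loyal(customers, max_recency_days=365, top_percent=15):
    """Return the IDs that meet the loyalty target within one segment.

    customers: dict of id -> {"recency_days": int, "frequency": int}
    A customer is flagged when the last transaction falls within
    max_recency_days and the transaction count is at or above the cutoff
    that marks the top `top_percent` of the segment.
    """
    if not customers:
        return set()
    counts = sorted((c["frequency"] for c in customers.values()), reverse=True)
    # Integer arithmetic avoids floating-point surprises in the cutoff position.
    k = max(1, len(counts) * top_percent // 100)
    cutoff = counts[k - 1]
    return {
        cid for cid, c in customers.items()
        if c["recency_days"] <= max_recency_days and c["frequency"] >= cutoff
    }


# Hypothetical segment of 20 customers with transaction counts 1 through 20.
seg = {f"c{i}": {"recency_days": 30, "frequency": i} for i in range(1, 21)}
seg["c20"]["recency_days"] = 400  # frequent buyer, but inactive for over a year
loyal = flag_loyal(seg)
print(sorted(loyal))
```

Running `flag_loyal` separately per segment is what allows each segment to end up with its own cutoff, and adjusting `top_percent` is a quick way to check whether the resulting target universe is too big or too small.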