This piece is for aspiring data scientists, analysts or consultants (or any other cool title du jour in this data and analytics business). Then again, people who spend even a single dime on a data project must remember this, as well: “The main goal of any analytical endeavor is to make differences in business.”
To this, some may say “Duh, keep stating the obvious.” But I am stating the obvious, as too many data initiatives are either for the sake of playing with data at hand, or for the “cool factor” among fellow data geeks. One may sustain such a position for a couple of years if he is lucky, but sooner or later, someone who is paying for all of the data stuff will ask where the money is going. In short, no one will pay for all of those servers, analytical tools and analysts’ salaries so that a bunch of geeks have some fun with data. If you just want the fun part, then maybe you should just stay in academia “paying” tuition for such an experience.
Not too long ago, I encountered a promising resume in a deep pile. Seemingly, this candidate had very impressive credentials. A PhD in statistics from a reputable school, hands-on analytics experience in multiple industries (so he claimed), knowledge in multiple types of statistical techniques, and proficiency in various computing languages and toolsets. But the interview couldn’t have gone worse.
When the candidate was going on and on about minute details of his mathematical journey for a rather ordinary modeling project, I interrupted and asked a very simple question: “Why did you build that model?” Unbelievably, he couldn’t answer that question, and kept reverting to the methodology part. Unfortunately for him, I was not looking for a statistician, but an analytics consultant. There was just no way that I would put such a mechanical person in front of a client without risking losing the deal entirely.
When I interview to fill a client-facing position, I am not just looking for technical skills. What I am really looking for is an ability to break down business challenges into tangible analytics projects to meet tangible business goals.
In fact, in the near future, this will be all that is left for us humans to do: “To define the problem statement in the business context.” Machines will do all of the tedious data prep work and mathematical crunching after that. (Well, with some guidance from humans, but not requiring line-by-line instructions by many.) Now, if number-crunching is the only skill one is selling, well then, he is asking to be replaced by machines sooner than others.
From my experience, I see that the overlap between a business analyst and a statistical analyst is surprisingly small. Further, let me go on and say that most graduates with degrees in statistics are utterly ill-prepared for real-world challenges. Why?
Once I read an article somewhere (I do not recall the name of the publication or the author) arguing that colleges are not really helping future data scientists in a practical manner, as:
- all of the datasets for school projects are completely clean and free of missing data, and
- professors set the goals and targets of modeling exercises.
I completely agree with this statement, as I have never seen a totally clean dataset since my school days (which were a long time ago in a galaxy far, far away), and defining the target of any model is the most difficult challenge in any modeling project. In fact, for most hands-on analysts, data preparation and target definition are the work. If the target is hung in the wrong place, no amount of cool algorithms will save the day.
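To make that concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the four customer records, the field names, and the two candidate definitions of "loyal" (order frequency versus tenure) are all hypothetical. It simply shows that real data arrives with holes, and that two reasonable-sounding target definitions can label very different customers.

```python
# Hypothetical customer records; None marks a missing value.
customers = [
    {"id": "A", "orders_last_year": 12, "years_active": 5, "total_spend": 4000.0},
    {"id": "B", "orders_last_year": 2, "years_active": 8, "total_spend": None},
    {"id": "C", "orders_last_year": 15, "years_active": 1, "total_spend": 9000.0},
    {"id": "D", "orders_last_year": None, "years_active": 6, "total_spend": 1200.0},
]


def missing_counts(rows):
    """Count missing (None) values per field."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (value is None)
    return counts


def loyal_by_frequency(row, min_orders=10):
    """Hypothetical definition 1: loyal means frequent recent orders."""
    orders = row["orders_last_year"]
    return orders is not None and orders >= min_orders


def loyal_by_tenure(row, min_years=5):
    """Hypothetical definition 2: loyal means a long relationship."""
    years = row["years_active"]
    return years is not None and years >= min_years


frequency_target = {r["id"] for r in customers if loyal_by_frequency(r)}
tenure_target = {r["id"] for r in customers if loyal_by_tenure(r)}

print("Missing values per field:", missing_counts(customers))
print("Loyal by frequency:", sorted(frequency_target))
print("Loyal by tenure:", sorted(tenure_target))
print("Labeled loyal under both:", sorted(frequency_target & tenure_target))
```

In this toy data, only one customer is "loyal" under both definitions, so a model trained on one target would chase a largely different group than a model trained on the other. Which definition is right is a business question, not a mathematical one.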
Yet, kids graduate from school thinking that they are ready to take on such challenges in the real world on Day One. Sorry to break it to them this way, but no, mathematical skills do not directly translate into the ability to solve problems in the business world. Such training will definitely give them an upper hand in the job market, though, as no math-illiterate should be called an analyst.
Last summer, my team hired two promising interns, mainly to build a talent pool for the following year. Both were very bright kids, indeed, and we gave them two seemingly straightforward modeling projects. The first assignment was to build a model to approximate customer loyalty in a B2B setting. I don’t remember the second assignment, as they spent the entire summer searching for the definition of a “loyal customer” to go after. They couldn’t even begin the modeling part. So more senior members of the team had to do that fun part after they went back to school. (For more details about this project, refer to “The Secret Sauce for B2B Loyalty Marketing.”)
Of course, we as a team knew what we were doing all along, but I wanted to teach these youngsters how to approach a project from the very beginning, as no client will define the target for consultants and vendors. Technical specs? You’re supposed to write that spec from scratch.
In fact, determining if we even need a model to reach the business goal was a test in itself. Why build a model at all? Because it’s a cool thing on your resume? With what data? For what specific success metrics? If “selling more things by treating valuable customers properly” is the goal, then why not build a customer value model first? Why the loyalty model? Because clients just said so? Why not product propensity models, if there are specific products to push? Why not build multiple models and cover all bases while we’re at it? If so, will we build a one-size-fits-all model in one shot, or should we consider separating the universe for distinct segments in the footprint? If so, how would you determine such segments then? (Ah, that “segmentation of the universe” part was where the interns were stuck.)
Boy, did I wish schools spent more time doing these types of problem-solving exercises with their students. Yes, kids will be uncomfortable as these questions do NOT have clear yes or no answers to them. But in business, there rarely are clear answers to our questions. Converting such ambiguity into measurable and quantifiable answers (such as probability that a certain customer will respond to a certain offer, or sales projection of a particular product line for the next two quarters with limited data) is the required skill. Prescribing the right approach and methodology to solve long- and short-term challenges is the job, not just manipulating data and building algorithms.
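As a rough illustration of turning a fuzzy question ("who is likely to respond?") into a number, here is a small Python sketch. The campaign history, segment names, and the choice to use a plain historical response rate as the probability estimate are all assumptions made for this example. A real project would need far more data and likely a proper model.

```python
from collections import defaultdict

# Hypothetical past campaign results: (segment, responded)
history = [
    ("high_value", True), ("high_value", False),
    ("high_value", True), ("high_value", True),
    ("low_value", False), ("low_value", False),
    ("low_value", True), ("low_value", False),
]


def response_rate_by_segment(records):
    """Estimate response probability per segment as responders / offers sent."""
    sent = defaultdict(int)
    responded = defaultdict(int)
    for segment, did_respond in records:
        sent[segment] += 1
        responded[segment] += int(did_respond)
    return {segment: responded[segment] / sent[segment] for segment in sent}


rates = response_rate_by_segment(history)
for segment, rate in sorted(rates.items()):
    print(f"{segment}: estimated response probability {rate:.2f}")
```

Even a crude estimate like this gives sponsors something measurable to argue about, which is more than an ambiguous goal ever will.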
In other words, mathematical elegance may be a differentiating factor between a mediocre and excellent analyst, but such is not the end goal. Then what should aspiring analysts keep in mind?
In the business world, the goals of data or analytical work are really clear-cut and simple. We work with the data to (1) increase revenue, (2) decrease cost (hence, maximizing profit), or (3) minimize risks. That’s it.
From that point, a good analyst should:
- Define clear problem statements (even when ambiguity is all around)
- Set tangible and attainable goals employing a phased approach (i.e., a series of small successes leading to achievement of long-term goals)
- Examine quality of available data, and get them ready for advanced analytics (as most datasets are NOT model-ready)
- Consider specific methodologies best fit to solve goals in each phase (as assumptions and conditions may change drastically for each progression, and one brute-force methodology may not work well in the end)
- Set the order of operation (as sequence of events does matter in any complex project)
- Determine success metrics, and think about how to “sell” the results to sponsors of the project (even before any data or math work begins)
- Go about modeling or any other statistical work (only if the project calls for it)
- Share knowledge with others and make sure resultant model scores and other findings are available to users through their favorite toolsets (even if the users are non-believers of analytics)
- Continuously monitor the results and re-fit the models for improvement
As you can see here, even in this simplified list, modeling is just an “optional” step in the whole process. No one should build models because they know how to do it. You’re not in school anymore, where the goal is to get an A at the end of the semester. In the real world (although using this term makes me sound like a geezer), data players are supposed to make money with data, with or without advanced techniques. Methodologies? They are just colors on a palette, and you don’t have to use all of them.
For the folks who are in a position to hire math geeks to maximize the value of data, simply ask them “why they would do anything.” If the candidate actually pauses and tries to think from the business perspective, then she is actually displaying some potential to be a business partner in the future. If the candidate keeps dropping technical jargon in response to this simple question, cut the interview short, unless you have a natural curiosity about the mechanics of models and analytics, and your department’s success is measured only in the complexity and elegance of solutions. But I highly doubt that such a goal would rank above increasing profit for the organization in the end.