The “streetlight effect”: A man is looking for his keys under a streetlight.
Policeman: “Where did you lose them?”
Man: “In the alley, near the door to the bar.”
Policeman: “Why are you looking here?”
Man: “The light’s better.”
This is related to the more general “Statistical Type 4 Error” – asking the wrong question, and to the “Shiny Toy Syndrome” – an attraction to captivating new technologies, without regard to the problem that needs solving. Such phenomena help explain why analytics projects in organizations often fail.
Note: The following blog was written by Miriam Friedel and comes from Elder Research, Inc., a data science and analytics consulting and training company with which we have just joined forces. We’re excited!
If your organization is new to data science and predictive analytics, it can be difficult to know where to start. In our 25 years of experience at Elder Research, we have found that there is often a mismatch between what companies think they should do with analytics versus what will provide the most value. While the specific problem to tackle varies by industry and business, we have found that choosing the right problem and focusing on a few key guidelines at the outset helps us deliver business value and gain support for analytics from key stakeholders.
Focus On the Business Question
For companies new to analytics, the vast array of tools and techniques available may seem overwhelming. Should I choose R or Python? What is the advantage of a neural network over a random forest? Am I using the right database? While these are reasonable questions, they are not the most important ones. The most relevant questions an organization can ask when starting an analytics effort center around business understanding and value. To wit: what business question do I hope to answer and what data do I have with which to answer it? Beginning here will do more to ensure success than deciding which tool, technology, or algorithm to use.
In the effort to identify the right business question, always consider the follow-on question: “How will the results be used?” Once you have built a predictive model or injected new descriptive statistics into an existing business process, how will that change the actions you take? For example, consider the goal of growing a credit business. You could start by building a predictive model to identify individuals most likely to respond to a credit card offer. Once you have this information, what do you do with it? You wouldn’t automatically approve every high scorer: you want customers to be profitable, so you would also want to check their credit worthiness. This line of reasoning immediately suggests that your predictive model should have incorporated credit scoring in the first place, making it more actionable from the outset. Or, you could build a second model focused on credit worthiness and pair it, in a decision rule, with the model focused on response. This type of framing, before considerations of data, tools, and algorithms, can prove extremely helpful in identifying the correct problem to solve. Ask “then what?” and you will avoid a lot of false starts. [1]
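As a rough illustration of pairing two models in a decision rule, the sketch below assumes each applicant already has a score from a response model and a score from a credit worthiness model, both scaled to the range 0 to 1. The class name, field names, thresholds, and sample values are all hypothetical, chosen only to show the shape of the rule; they are not drawn from any real scoring system.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    response_score: float  # hypothetical response-model output, 0 to 1
    credit_score: float    # hypothetical credit-worthiness-model output, 0 to 1


def should_send_offer(applicant, min_response=0.5, min_credit=0.7):
    """Decision rule: make an offer only if BOTH scores clear their thresholds.

    The threshold values are illustrative assumptions, not recommendations.
    """
    return (applicant.response_score >= min_response
            and applicant.credit_score >= min_credit)


# Made-up applicants for illustration only.
applicants = [
    Applicant("A", response_score=0.9, credit_score=0.4),   # likely to respond, poor credit
    Applicant("B", response_score=0.8, credit_score=0.85),  # likely to respond, good credit
    Applicant("C", response_score=0.2, credit_score=0.95),  # unlikely to respond
]

selected = [a.name for a in applicants if should_send_offer(a)]
print(selected)
```

A response model used alone would rank applicant A highest, while the paired rule passes over A because of the low credit score. That gap is what the “then what?” question is meant to expose before any modeling begins.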
Don’t Wait for Perfect Data
Most organizations today have vast amounts of data. Often it is stored in relational databases, but it can also be found in survey results, physicians’ notes, CSV files, and software usage logs, to name a few. Available data is almost always messy; collected and stored for a purpose other than analytics, it must be parsed, cleaned, and transformed into a format suitable for modeling and visualization. Further, your data will never contain every relevant predictor for the business problem you are trying to solve, especially since the things that drive human behavior are so varied and complex. Nevertheless, insights are almost always available with the data you have, even if it isn’t the data you wish you had. Connecting data understanding with a clearly defined business problem can often be a challenge, but it is well worth the effort. Documenting assumptions and risks along the way will shed new light on the data available, inform data layer expansions and enhancements, and, perhaps more importantly, provide new insights into existing business processes. Refining the connection between the business question to be answered and the data available to answer it clarifies the analytics strategy and roadmap that an organization can use to reach a state of pervasive analytics.
Beware of Shiny Objects
Just as it is necessary to begin with imperfect data, it is also important to start with the technologies already in place within your organization. While there is an ever-increasing set of tools and technologies for the practice of data science, success with analytics does not hinge on making a large investment in commercial software or immediately migrating everything to the cloud. Begin with what you have and augment it as appropriate.
Start Small
Often, organizations have grand visions of what they would like to do with advanced analytics. These visions are often costly in both team effort and budget. Overreaching early on can cause frustration, take too long to show return on investment, and ultimately stymie development of a data-driven decision-making culture within an organization. Instead, focus on low-hanging fruit. Determine some quick wins that can be achieved with the available data, tools, and personnel in a three-to-six-month period. Showing the value of analytics creates internal buy-in from key stakeholders and generates momentum for more advanced and integrated analytics. Success will earn the support to go higher with confidence.
The world of data science is expanding rapidly, and it can be difficult to identify the best ways to leverage an ever-evolving set of tools and techniques. By focusing on relevant business questions, working with available resources, and starting small and progressing gradually, your organization can achieve success with analytics.
[1] Editor’s note: Seemingly fine distinctions in one’s goal can make a huge difference in the bottom line. Not long ago a certain bank financially incentivized its new president to “grow the credit business”. The leader did so with gusto (using response models) and was well-compensated. But he had ignored customer quality (that is, credit worthiness). When stock analysts realized that the new customers were largely unprofitable, and reported so, the bank’s stock price took a huge hit, dropping the bank’s market capitalization by over a billion dollars. – Dr. John Elder, IV