Where Data-Driven Decision-Making Can Go Wrong
When presented with evidence, business leaders too often go in one of two directions: either taking it as gospel or dismissing it altogether. Both approaches are misguided. Leaders should instead organize discussions that thoughtfully evaluate seemingly relevant evidence and its applicability to the situation at hand.

Let’s say you’re leading a meeting about the hourly pay of your company’s warehouse employees. For several years it has automatically been increased by small amounts to keep up with inflation. Citing a study of a large company that found that higher pay improved productivity so much that it boosted profits, someone on your team advocates for a different approach: a substantial raise of $2 an hour for all workers in the warehouse. What would you do?
In the scenario just described you should pose a series of questions aimed at assessing the potential impact of wage increases on your company specifically. You might ask:

  • Can you tell us more about the setting of the research to help us evaluate whether it applies to our warehouse employees?
  • How do our wages stack up against those of other employers competing for our workers, and how does that compare with the study?
  • Was an experiment conducted? If not, what approach was used to understand whether higher wages were driving the productivity change or simply reflecting it?
  • What measures of productivity were used, and how long were the effects measured?
  • What other analyses or data might be relevant?

Of course, tone matters. These questions must be asked in a genuine spirit of curiosity, with a desire to learn and get sound recommendations.
Whether evidence comes from an outside study or internal data, walking through it thoroughly before making major decisions is crucial. In our interactions with companies—including data-heavy tech firms—we’ve noticed that this practice isn’t consistently followed.
Too often predetermined beliefs, problematic comparisons, and groupthink dominate discussions. Research from psychology and economics suggests that biases—such as base rate neglect, the tendency to overlook general statistical information in favor of specific case information or anecdotes, and confirmation bias, the propensity to seek out and overweight results that support your existing beliefs—also hinder the systematic weighing of evidence.
But companies don’t have to fall into this pattern. Drawing on our research, work with companies, and teaching experience (including executive education classes in leadership and business analytics and a recent MBA course called Data-Driven Leadership), we have developed an approach general managers can apply to discussions of data so that they can make better decisions.
Pressure-Test the Link Between Cause and Effect
Will search engine advertisements increase sales? Will allowing employees to work remotely reduce turnover?
These are questions about cause and effect, and they are exactly the kind that data analytics can help answer. In fact, researchers have studied both in detail.

However, managers frequently misinterpret how the findings of those and other studies apply to their own business situation. When making decisions, managers should consider internal validity—whether an analysis accurately answers a question in the context in which it was studied. They should also consider external validity—the extent to which they can generalize results from one context to another.
That will help them avoid making five common mistakes:
1. Conflating causation with correlation.
Even though most people know that correlation doesn’t equal causation, this error is surprisingly prevalent.
Take eBay’s advertising strategy. For years the company had advertised on search engines such as Google, looking to grow demand by attracting more customers.
A consulting report concluded that the ads were effective, noting that when more ads were shown in a market, the total value of purchases on eBay was higher.

Alas, the report had reached the wrong conclusion about those ads. With the help of an experiment conducted by a team of economists led by Steven Tadelis of the University of California, Berkeley, eBay realized that the correlation arose because the ads targeted people who were already likely to visit eBay and ran in markets where demand would have been expected to spike even without them.
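To see how targeting alone can produce that kind of pattern, consider a minimal simulation sketch. The numbers are invented, not eBay's, and the mechanism is deliberately simplified: ads chase demand, and demand, not ads, drives sales.

```python
# A minimal sketch with hypothetical numbers: ads are targeted at high-demand
# markets and have no causal effect on sales at all.
import numpy as np

rng = np.random.default_rng(0)
n_markets = 1_000

demand = rng.normal(100, 20, n_markets)           # latent purchase intent in each market
ads_shown = demand + rng.normal(0, 5, n_markets)  # ad volume follows demand (targeting)
sales = demand + rng.normal(0, 10, n_markets)     # sales are driven by demand alone

print(np.corrcoef(ads_shown, sales)[0, 1])  # strong positive correlation (roughly 0.85)
# A naive read of this correlation: "more ads mean more sales." The true ad effect here is zero.
```

Only something like the experiment eBay eventually ran, which varied ads independently of demand, can tell the two stories apart.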
To understand causality, delve into how the study in question was conducted. For instance, was it a randomized controlled trial, in which the researchers randomly assigned people to two groups: one that was subjected to a test condition and a control group that was not?

That’s often considered the gold standard for assessing cause and effect, though such experiments aren’t always feasible or practical. Perhaps the researchers relied on a natural experiment, observing the effects of an event or a policy change on specific groups.
For example, a study might examine the impact of a benefit whose recipients were chosen by lottery, which allows researchers to compare how the benefit changed the circumstances or behavior of those who won the lottery with that of those who didn’t win.
Researchers who don’t have access to planned or natural experiments may instead control for potential confounding factors (variables that affect both the factor being studied and the outcome of interest) in their data analysis, though this can be challenging in practice. For instance, if you were assessing the impact of a training program on productivity, you’d want to control for prior experience and other characteristics that might affect both who gets trained and how productive people are.
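As a rough illustration of what controlling for a confounder looks like, here is a sketch on simulated data; the variable names and effect sizes are invented for this example. Prior experience drives both who opts into training and how productive workers are, so a naive comparison of trained and untrained workers overstates the training effect, while a regression that includes experience recovers it.

```python
# A sketch on simulated data: prior experience is a confounder that drives both
# training uptake and productivity. The true training effect is set to 2.0.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 2_000
experience = rng.normal(5, 2, n)                                # years of prior experience
trained = (experience + rng.normal(0, 2, n) > 5).astype(float)  # experienced workers opt in more often
productivity = 2.0 * trained + 3.0 * experience + rng.normal(0, 4, n)

naive = sm.OLS(productivity, sm.add_constant(trained)).fit()
adjusted = sm.OLS(productivity, sm.add_constant(np.column_stack([trained, experience]))).fit()

print(naive.params[1])     # well above 2.0: the naive estimate absorbs the experience gap
print(adjusted.params[1])  # close to the true effect of 2.0
```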
2. Underestimating the importance of sample size.
Imagine two hospitals: a large one that handles thousands of births each year, and a small one with a few hundred births annually. Which hospital do you think would have more days where more than 60% of the babies born were boys?
The answer is the small hospital: with fewer births each day, the share of boys fluctuates more widely around 50%, so days with an extreme split are more common. Smaller samples produce noisier estimates.
Psychologists Daniel Kahneman and Amos Tversky, in their canonical work on biases and heuristics, found that most people got the answer wrong, with more than half saying, “About the same.” People tend to underappreciate the effect that sample size has on the precision of an estimate.
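The arithmetic behind the puzzle is easy to check with a quick simulation; the daily birth counts below are illustrative, chosen only to make the contrast visible.

```python
# A quick simulation of the two-hospital puzzle. The daily birth counts are
# illustrative; only the relative sizes matter.
import numpy as np

rng = np.random.default_rng(2)
days = 365
large_births, small_births = 45, 15   # births per day at each hospital

large_share = rng.binomial(large_births, 0.5, days) / large_births
small_share = rng.binomial(small_births, 0.5, days) / small_births

print((large_share > 0.60).sum())   # days with more than 60% boys at the large hospital
print((small_share > 0.60).sum())   # roughly two to three times as many at the small one
```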

This common error can lead to bad decisions. Whether you’re trying to figure out how much to trust online reviews, how to interpret productivity trends, or how much weight to put on the results of an advertising experiment, the size of the sample being analyzed is important to consider.


When evaluating effects, it can be helpful to ask not only about the sample size but also about the confidence interval: the range of values within which the true effect is likely to fall, together with a statement of how confident one can be that it does. The answers should shape the conversation about which course of action you’ll take.
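As a simple illustration, here is a sketch that turns hypothetical experiment results into a confidence interval for the difference in conversion rates; the counts are invented for this example.

```python
# A minimal sketch with hypothetical experiment results: 1,200 of 10,000 treated
# users converted versus 1,100 of 10,000 in the control group.
import numpy as np
from scipy import stats

x_treat, n_treat, x_ctrl, n_ctrl = 1_200, 10_000, 1_100, 10_000
p_treat, p_ctrl = x_treat / n_treat, x_ctrl / n_ctrl
lift = p_treat - p_ctrl

# Normal-approximation (Wald) standard error and 95% confidence interval
se = np.sqrt(p_treat * (1 - p_treat) / n_treat + p_ctrl * (1 - p_ctrl) / n_ctrl)
z = stats.norm.ppf(0.975)
print(f"Estimated lift: {lift:.3f}")
print(f"95% CI: ({lift - z * se:.3f}, {lift + z * se:.3f})")
# Here the interval is roughly (0.001, 0.019): the lift is probably positive, but it
# could be close to zero -- a very different conversation than "one point of lift."
```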
3. Focusing on the wrong outcomes.
In their classic 1992 HBR article “The Balanced Scorecard: Measures That Drive Performance,” Robert S. Kaplan and David P. Norton opened with a simple observation: “What you measure is what you get.” Although their article predates the era of modern analytics, that idea is more apt than ever.
Experiments and predictive analytics often focus on outcomes that are easy to measure rather than on those that business leaders truly care about but are difficult or impractical to ascertain. As a result, outcome metrics often don’t fully capture broader performance in company operations.
Let’s return to the example of wage increases. Costs are easily measured, while boosts in productivity can be difficult to quantify.
That can lead managers to focus narrowly on the cost of better pay and fail to appreciate the potential gains. A broader analysis would take an approach like the one seen in a study by economists Natalia Emanuel and Emma Harrington. They set out to understand the implications of warehouse pay levels set by a large online retailer.

The researchers examined changes in productivity after a 2019 pay increase for warehouse workers and found that improvements in productivity and turnover were so large that the wage increases more than paid for themselves. They found similar results when they looked at the effects of higher pay on the productivity and turnover of customer service employees.

It’s also important to make sure that the outcome being studied is a good proxy for the actual organizational goal in question. Some company experiments track results for just a few days and assume that they’re robust evidence of what the longer-term effect would be. With certain questions and contexts, a short time frame may not be sufficient.

One company that works to avoid this problem is Amazon: It invests heavily in exploring the longer-term costs and benefits of possible product changes. There are many ways to assess the relevance and interpretation of outcomes, ranging from clear discussions about limitations to formal analyses of the link between short-term effects and longer-term ones.
To really learn from any data set, you need to ask basic questions like, What outcomes were measured, and did we include all that are relevant to the decision we have to make? Were they broad enough to capture key intended and unintended consequences? Were they tracked for an appropriate period of time?
4. Misjudging generalizability.
With the example of the warehouse wage increase, a vital question is what the results from one set of warehouses imply for a different set.
Moreover, a company may wish to know how the results apply to, say, restaurant or retail employees.
We have seen business leaders make missteps in both directions, either over- or underestimating the generalizability of findings.

For instance, when the senior vice president of engineering at a major tech company told us about his company’s rule against looking at university grades in engineer hiring decisions, we asked about the rationale. He said that Google had “proved that grades don’t matter”—referring to a Google executive’s comment he had read somewhere, claiming there wasn’t a relationship between school grades and career outcomes. By taking that piece of information as gospel, he ignored potential limitations to both its internal and its external validity.

When you’re assessing generalizability, it can be helpful to discuss the mechanisms that might explain the results and whether they apply in other contexts. You might ask things like, How similar is the setting of this study to that of our business? Does the context or period of the analysis make it more or less relevant to our decision? What is the composition of the sample being studied, and how does it influence the applicability of the results? Does the effect vary across subgroups?
5. Overweighting a specific result.
Relying on a single empirical finding without a systematic discussion of it can be just as unwise as dismissing the evidence as irrelevant to your situation.

It’s worth checking for additional research on the subject. Conducting an experiment or further analysis with your own organization can be another good option. Questions to ask include, Are there other analyses that validate the results and the approach? What additional data might we collect, and would the benefit of gathering more evidence outweigh the cost of that effort?
Start by Speaking Up
In 1906, Sir Francis Galton famously analyzed data on a contest at a livestock fair in which people guessed the weight of an ox. Though individual guesses were all over the map, the average of the guesses was nearly spot-on—demonstrating the wisdom of the crowd. Harnessing that wisdom can be challenging, however. Collective intelligence is best when mechanisms are in place to promote active and diverse participation. Otherwise, crowds can also amplify bias—especially when they’re homogeneous in viewpoint.
To overcome bias, business leaders can invite contributors with diverse perspectives to a conversation, ask them to challenge and build on ideas, and ensure that discussions are probing and draw on high-quality data. (See “What You Don’t Know About Making Decisions,” by David A. Garvin and Michael Roberto, HBR, September 2001.) Encouraging dissent and constructive criticism can help combat groupthink, make it easier to anticipate unintended consequences, and help teams avoid giving too much weight to leaders’ opinions. Leaders also must push people to consider the impact of decisions on various stakeholders and deliberately break out of siloed perspectives.
(See the sidebar “How to Avoid Predictable Errors” for a summary of the five common pitfalls in interpreting analyses and the questions that will help you steer clear of them.)
These kinds of discussions can help ensure the thoughtful weighing of evidence. But all too often they get derailed even when they would be productive. Countless studies have shown that hierarchies can lead people to withhold dissenting views and that discussion participants tend to shy away from sharing potentially relevant data or asking probing questions when they don’t experience psychological safety—the belief that candor is expected and won’t be punished. Without psychological safety, the approach we’ve described is less likely to work.
Teams benefit when their members feel that offering up data, ideas, concerns, and alternative views will be valued by their peers and managers alike. Most important, in many discussions, participants should view asking probing questions as part of their job. Much has been written about how to build psychological safety in a team. (See “Why Employees Are Afraid to Speak,” by James R. Detert and Amy C. Edmondson, HBR, May 2007.) But it’s especially critical to establish it in a team that seeks to use evidence to make business decisions—so that the fear of raising unpopular findings doesn’t cause members to miss critical data.
The chilling effect of low psychological safety was evident in the response to experimental research at Facebook that looked at whether showing more positive versus negative posts affected users’ emotions. In 2014, in the aftermath of public backlash to the research—which arose partly because people didn’t know that Facebook was running these types of experiments—CEO Mark Zuckerberg pulled the plug on ongoing external-facing research projects.
That deterred employees from undertaking experiments that might explore Facebook’s social impact proactively. More recently, Zuckerberg has changed course and expressed renewed interest in external research. However, had he created an atmosphere where Facebook executives felt able to thoughtfully discuss the negative effects of social media a decade ago, the company might have avoided some of its recent reputational challenges related to misinformation and its effects on user well-being.
From Data to Decisions
Decision-making in the face of uncertainty is necessarily iterative; it requires regular pauses for reflection on both information and process. Effective teams will learn from data, adjust plans accordingly, and deliberately work on improving their discussions.
Taking the time to discuss the nuances of analyses—including sample size and composition, the outcomes being measured, the approach to separating causation from correlation, and the extent to which results might generalize from one setting to another—is vital to understanding how evidence can, or can’t, inform a specific decision.
When carefully considered, each empirical result presents a piece of a puzzle, helping businesses figure out whether and when different changes are likely to have an effect. Such discussions will also set the stage for organizations to be more rigorous about data collection.
Even in the best of worlds, evidence is rarely definitive, and how a business move will play out is uncertain. You can nonetheless aspire to make thoughtful choices based on information you have or might obtain. By employing a systematic approach to its collection, analysis, and interpretation, you can more effectively reap the benefits of the ever-increasing mountain of internal and external data and make better decisions.