Do makeup calls exist in sports?
Monty McCutchen still remembers his first accuser. Not by name, but by time and place. Long before he summited sport’s most thankless profession, back in 1980s small-town Texas, he was a novice ref in a stuffy high school gym surrounded by freshman boys and fuming dads. “I didn't know what I was doing,” he admits, other than making some much-needed cash. But he saw a foul, so he blew his whistle and raised his arm, and that’s when he heard it.
“C’mon!” a parent hollered. “No makeup calls!”
Like every ref, McCutchen heard those two profane words plenty more over the years, “from the very beginnings” of his career all the way to nine NBA Finals. At the college level, makeup-call accusations greet “every other call we make,” J.D. Collins, the NCAA men’s basketball officiating coordinator, says with an incredulous chuckle. They come from broadcasters, superstars and grade-school coaches. They span continents and cultures, languages and generations, and several sports. They represent a simple, widely held perception: that officials “make up” for faulty calls against one team by calling equally dubious fouls against the other team.
But that perception, according to former NBA and college referees, is “bull****.”
It’s “buffoonery.” It’s a “myth.” It’s a “non-issue.”
“I don't even want to say the words,” says Penny Davis, the NCAA’s coordinator of women’s basketball officiating. “Because it's not relevant.” She and others insist that, at every reasonably high level of sport, officials consider each play on its own merits.
Despite impassioned assurances from leagues and refs that the accusations are blasphemous, or even libelous, researchers have found evidence of potential bias. A study of four decades of top-flight German soccer found, for example, that referees who’d awarded a penalty to one team were far less likely to award a second penalty to the same team, and more likely to award one to the opposing team. An analysis of more than 1 million NBA possessions from 2006 to 2011, meanwhile, found that the frequency of offensive fouls, traveling violations and offensive three-second calls increased significantly on the possessions after those same calls had been made against the other team.
What those studies show, however, are not makeup calls in the traditional sense. They aren’t phantom fouls consciously concocted to right past wrongs. Economists, psychologists and even some referees have postulated that they’re byproducts of subconscious instincts — innate tendencies that may lead referees, perhaps influenced by furious players and frenzied crowds, to give more marginal calls to a previously aggrieved team.
Some refs and league directors acknowledge these instincts.
“Every referee is human,” says John Adams, the former NCAA men’s basketball officiating coordinator. Humans want to be liked, not hated. Many referees want to be seen as impartial. “In trying not to show bias,” explains former NFL official Jeff Triplette, they “actually may overcompensate,” and over-scrutinize one team after a questionable call against the other.
“Human nature says that's what we're going to do,” Triplette says.
But “the best officials,” he continues, are the ones who don’t. The savviest leagues train refs to guard against these subconscious biases. They then scrutinize every official’s every move. They evaluate a ref’s ability to block out noise. And crucially, they retain or promote only the ones who can.
In doing so, they believe, they are purging high-profile sports of the “makeup call,” and disproving this age-old perception that, they hope, will someday die a slow, silent death.
‘An eye toward fairness’
The origins of refereeing’s most infamous but elusive concept are murky. The perception, though, wasn’t baseless. When Larry Pedowitz, a former federal prosecutor, spent 14 months investigating the NBA’s officials, he heard in interviews that, prior to 2003, “if a referee recognized that he or his crew had made an incorrect call, a referee might whistle a ‘make-up’ call soon thereafter.”
Officiating reform distanced the NBA from these “game management calls.” But more recently, data and referee testimony have refined the concept. Longtime NHL ref Kerry Fraser detailed the most controversial version of it in 2013, not long after his decorated 30-year career ended. No hockey referee “invents” penalties, Fraser confirmed, but if he knew he’d erred and granted one team an undeserved power play, he might jump at the chance to whistle a fringe penalty on the opponent.
“The hard truth is that while every referee's attempted objective is to maintain a ‘consistent’ standard, he might alter that standard to grab a quick penalty with an eye toward fairness,” Fraser wrote.
Many officials across sports would dispute that characterization. But data seemed to support it. A 2015 FiveThirtyEight analysis of 10 NHL seasons showed that penalty disparities had an “enormous” effect on subsequent calls. Around that time, Paul Gift, an economics professor at Pepperdine University, was studying those 1 million NBA possessions, and his findings jibed with a similar hypothesis: that “referees increase scrutiny on one team following a potentially difficult decision against the opposing team.”
That increased awareness, though, isn’t necessarily intentional. It could stem from a referee’s subconscious desire to avoid the ire of one team. “If I call so many penalties against one team, I can’t miss one against the other team,” former NHL ref Tim Peel explained last year. “Now, does that mean, I’ve called six penalties against Detroit, and now I’m gonna go look for a call against Nashville? Not at all. … What happens is, your antennas better be up. And you go, ‘I better not miss a penalty’ [on the opponent].”
The searching antenna would likely show up in data — in the NHL, and anywhere judgment calls exist. Steven Angel, the NBA’s head of game analytics and strategy, knows this well. “The thing with offensive fouls is, there are always screens, there's always contact in the game, and very rarely are they perfectly legal,” Angel explains. “So you could find a call if you wanted one. And chances are referees are never looking to make those calls, but nevertheless, if they're a little more focused on that kind of play, it's possible, they're gonna be a little more alerted to something.”
Angel’s explanation for that focus, though, isn’t makeup calls. He found Gift’s paper “fascinating,” but after his NBA referee analytics team conducted its own analysis using game film, he came to a separate conclusion. He attributes the trend to “priming” — the concept that, if officials have just called an offensive foul at one end, that type of call will be, “suddenly, front of mind.”
More generally, Angel says, the NBA has studied error sequencing. The league lined up all incorrect calls in its massive dataset, and calculated the probability that a given error harms the same team as the previous error. It found that, on “the vast majority of plays,” the probability is roughly 50%, which would seem to refute perceptions about makeup calls.
“While we’d be naïve to say that no back-to-back calls are ever made – especially subconsciously – with a sense of balance in mind [referees are human],” Angel wrote in a follow-up email, “we do not believe it is systemic or the primary reason for these patterns.”
And whatever the reasons, they’re the type of patterns that the NBA teaches its officials to combat.
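For readers who want to see the arithmetic behind that error-sequencing check, here is a minimal sketch. It is not the NBA's actual method or data: the function name, the "H"/"A" labels and the sample sequence are all hypothetical. It lines up incorrect calls in order and measures how often an error harms the same team as the error before it. If errors were balanced and independent, the rate should sit near 50%. A systematic makeup pattern would pull it below that.

```python
from typing import Sequence


def same_team_repeat_rate(harmed: Sequence[str]) -> float:
    """Fraction of consecutive error pairs that harm the same team.

    `harmed` lists, in game order, which team each incorrect call hurt.
    """
    pairs = list(zip(harmed, harmed[1:]))
    if not pairs:
        raise ValueError("need at least two errors to compare")
    same = sum(1 for prev, cur in pairs if prev == cur)
    return same / len(pairs)


# Hypothetical sequence: "H" = error hurt the home team, "A" = the away team.
errors = ["H", "A", "A", "H", "H", "A", "H", "A", "A"]
rate = same_team_repeat_rate(errors)
print(f"{rate:.1%} of errors harmed the same team as the previous error")
```

A sample this small proves nothing on its own. The league's version, as Angel describes it, draws on its full dataset of graded calls.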
Learning to resist the urge
McCutchen, who in 2017 left the court to become the NBA’s SVP of referee training and development, doesn’t shy away from “makeup calls” as a subject. Whereas some leagues and refs declined interview requests, he and Angel discussed it for 50 minutes.
He hates the term. He avoids those two dirty words. “Because,” he says, “really what you're talking about is, can you uphold standards up against all of the outside noise that an NBA referee has to deal with?
“You have to guard, through training, [against] these natural human tendencies,” McCutchen says. “Training is what overcomes that.”
He knows that the tendencies exist. Most referees feel the urge when they first don a whistle, whether for an AAU game or the local high school’s JV. “The first time you make a call and you go, ‘Wow, that was bad,’ when you go to the other end, your brain, that's the way it works,” Collins, the NCAA men’s coordinator, explains. “You're like, ‘Well, I gotta be more lenient here because of that.’”
As the magnitude of games increases, pressure does too. And referees feel it. “To think that large home crowds, high-profile coaches, don't somehow get in [a referee’s] subconscious would be silly,” Adams says. “I mean, I think they do.”
Over time, those same refs learn to resist those urges and that pressure. They develop mistake recovery strategies and techniques to rid themselves of guilt over blown calls. They ritualize staying in the moment. Some practice mindfulness. Several who spoke with Yahoo Sports said they constantly talked to themselves. One-word cues from their conscious mind could train their subconscious to avoid bias.
“Focus on this play,” Triplette used to tell himself.
Others focus on controlling their breathing. Some divide games into four-minute segments, and promise to bury mistakes from previous segments in the past. Others write on wristbands, or find an object in the sky and visualize their questionable calls sailing away.
McCutchen, meanwhile, preaches fundamentals. “You don't referee based off the past, and you don't based on a hope for the future,” he says. Working with NBA refs, he homes in on the mechanics of identifying potential violations on dozens of specific play types. The stronger a referee’s focus on those mechanics, McCutchen reasons, the less susceptible he or she will be to the noise.
“And hopefully, from Year 1 to 8, you've eliminated through training a lot of those tendencies that lead to outside noise helping drive decisions — which we want to avoid,” he says. “We really are heavy, heavy, heavy on this idea of training as a way of overcoming natural inclinations.”
Impact of a grading system
At sport’s grassroots, where refs and umps are volunteers or untrained part-timers, these natural inclinations will always run rampant. Experimental studies have demonstrated as much. Two decades ago, German researchers showed video clips of a Real Madrid match, including three potential fouls in the penalty box, to 115 soccer referees and players, and asked them to make decisions. Not a single participant awarded two penalties to the same team. And if, on the other hand, they awarded one, they were more likely to give a later penalty to the opponent.
That is because in that controlled environment, as in youth leagues across the globe, there were no warnings about subconscious bias. There were no grades, no scrutiny, no video reviews, no consequences for mistakes.
Once upon a time, professional sports weren’t all that different. Rookie referees learned not from psychologists, but from veteran colleagues over postgame beers. Their games weren’t always on TV. Their performance reviews traveled by word of mouth. The NBA, back in the 1950s and ’60s, had no way to legislate the mythical makeup call out of its game.
That changed under David Stern. Besieged by complaints from team owners about supposedly incompetent officiating, the late commissioner wanted empirical data on referee performance — in part for evaluation and development, in part for rebuttals. The league designed the first of several predecessors to its current review system. Nowadays, officials are graded on roughly 500 calls and non-calls in each game. Analysts comb through them, possession by possession, spending 6-8 hours on each 48-minute chunk of basketball. They record every error a referee makes.
Similar, albeit less robust, systems govern college basketball and other sports. Promotions and postseason assignments often depend on what they reveal. Call accuracy is the crucial variable. Anything below 90% jeopardizes employment. Errors and qualitative deficiencies lead to demotions and decreased pay.
And it’s in this context that refs and their bosses scoff at the notion of makeup calls.
“If you make one mistake, and you make a makeup call, that's called two mistakes,” says former NBA official Ted Bernhardt.
“If a guy misses a play at one end, and goes to the other end and misses another play, that dramatically affects his ratings, his possibility for playoffs,” says longtime NBA ref and officiating guru Ed Rush. “So what fool would say, ‘Oh, I missed that call, let me go miss another one?’ That'll knock me right out of the playoffs.”
Most makeup calls, McCutchen and Angel said, would show up in data and on tape. McCutchen reviews the tape and can identify specific triggers. “When I see pressure points being applied in a game — meaning, I see a coach complain four times straight down the floor about a ‘forearm, forearm, forearm,’ and they're over there on the sideline [gesturing], and then the weakest forearm you've ever seen in your life is called — I'm absolutely holding that official accountable,” he says.
The financial incentives, psychologists believe, can at least partially offset the subconscious instincts. And thanks to training, “if there is a temptation to succumb to that social bias, the more experience you have as a referee, the more insulated you are to that temptation,” says Stuart Carrington, a British psychologist and author of a book on refereeing. “So it's actually unlikely that at the highest level, that happens as frequently. Can't say it doesn't happen. But it's significantly less likely.”
And if it does happen? If referees do succumb to instincts? They’re significantly less likely to last at elite levels.
At amateur levels, unchecked by professional programs or video reviews, the makeup call will likely persist. “The lower you are on the totem pole, the more you worry about evening things up,” Adams says. “So maybe it happens more in high school, or at junior college, or at NAIA, or in Division III.”
But at the top? Among the best of the best?
“If a person does that,” Rush says of makeup calls, “it's called ‘unemployment.’ They're gone. Period. No ifs, ands or buts about it.”