The Hidden Bias in Performance Reviews

We’ve all been looking for the grand gesture to solve the problem of diversity, equity, and inclusion in one fell swoop. It doesn’t exist.

Instead, what I have found does exist is a series of 1% changes that, with persistence, can help root out the bias that too often subverts our ideals of meritocracy. I call these changes bias interrupters: evidence-based, metrics-driven tools.

If a company faces diversity challenges, typically it’s because bias is constantly being transmitted, day after day, through its basic business systems: the hiring process, the way access is granted to valued opportunities, or performance evaluations.

Because performance evaluations leave a written record, they’ve been a gold mine for researchers studying bias. In our study of a Wall Street law firm, we found women’s successes discounted: Women got more positive comments than men on their performance evaluations, but men got higher ratings; men’s positive comments predicted high rankings, but women’s did not.

A 2020 study of tech found the same thing: Women with the same feedback got lower evaluations.

Any process, including performance evaluations, that requires informal horse-trading unmoored from data is a petri dish for bias. Calibration meetings where candidates are rank-ordered allow bias to run rampant unless they’re conducted with evidence, consistent grading rubrics to be certain everyone is assessed on the same things, and care to ensure that harsher standards are not applied to particular groups.

To help with this, start by tracking data about the rankings assigned by the graders, and then separately track rankings assigned after data is debated. And decide in advance what the weight of specific factors should be. Often, in calibration meetings, Candidate 1’s higher scores on X will be used to justify a higher overall ranking while ignoring his low score on Y. Candidate 2’s higher scores on Z will be used to justify a high ranking while ignoring his poor performance on X. Candidate 3 will have all her scores counted and be rewarded only if she lacks any weakness. This is precisely where prove-it-again bias creeps in.
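As a rough illustration of that tracking, the Python sketch below applies weights fixed before the meeting to every candidate's scores, then measures how far each group's rankings moved once the discussion was over. The factor names X, Y, and Z come from the example above. The weights, group labels, scores, and function names are hypothetical, invented only for illustration.

```python
from collections import defaultdict

# Hypothetical weights, agreed on before the calibration meeting.
WEIGHTS = {"X": 0.5, "Y": 0.3, "Z": 0.2}

def weighted_score(scores):
    """Apply the same pre-agreed weights to every candidate's scores."""
    return sum(WEIGHTS[factor] * scores[factor] for factor in WEIGHTS)

def rank_by(records, key):
    """Return {name: rank}, where rank 1 goes to the highest key(record)."""
    ordered = sorted(records, key=key, reverse=True)
    return {r["name"]: position for position, r in enumerate(ordered, start=1)}

def average_rank_shift_by_group(records):
    """Average (post-meeting rank minus pre-meeting rank) for each group.

    A positive value means the group tended to move down the list
    during the calibration discussion.
    """
    pre = rank_by(records, key=lambda r: weighted_score(r["scores"]))
    shifts = defaultdict(list)
    for r in records:
        shifts[r["group"]].append(r["post_rank"] - pre[r["name"]])
    return {group: sum(v) / len(v) for group, v in shifts.items()}

# Made-up data: "post_rank" is the ranking assigned after the debate.
records = [
    {"name": "Candidate 1", "group": "A",
     "scores": {"X": 5, "Y": 2, "Z": 3}, "post_rank": 1},
    {"name": "Candidate 2", "group": "A",
     "scores": {"X": 2, "Y": 3, "Z": 5}, "post_rank": 2},
    {"name": "Candidate 3", "group": "B",
     "scores": {"X": 4, "Y": 4, "Z": 4}, "post_rank": 3},
]

shift = average_rank_shift_by_group(records)
print(shift)
```

In this invented data, Candidate 3 ranks first on the pre-agreed weights but third after the meeting. A gap like that, repeated across one group, is a signal to go back and ask what evidence justified the change.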

However, there are simple ways to interrupt bias in performance evaluations:

Appoint and Train People to Interrupt Bias

Train HR professionals or senior managers to spot various types of bias. Our research shows that formal training, where these issues are discussed aloud, is more impactful than just handing out worksheets.

Also, have trained bias interrupters read performance evaluations before they become final. If the evaluations of a given supervisor show consistent patterns of bias, the trained bias interrupter should intervene. Without making a heavy-handed judgment, they can ask the supervisor to take another look at their evaluations.

Redesign the Evaluation Form

First, provide clear and specific performance criteria, and ask for evidence from the rating period that justifies any numerical rating. This approach is powerful.

Second, separate out performance and potential. This is important because white men, but not other groups, tend to be judged on potential and given the benefit of the doubt.

Third, separately assess personal style issues that need to be addressed and skill sets that need to be developed. This will make it easy to spot if women and people of color are faulted for issues of personal style that white men get a pass for.

Provide Guidelines for Self-Evaluations

Giving guidance for self-evaluations helps ensure that everyone knows how to promote themselves effectively and sends the message they are expected to do so. For instance, make it clear that everyone is expected to provide examples of work they’re especially proud of. Ask everyone to give an example of an area where they could improve.

Also have the people you’ve trained to be bias interrupters play an active role in calibration meetings, and use rubrics. Unstructured calibration meetings are likely to lead to bias, as are forced rankings. Abandon forced rankings, or at least provide a rubric so that everyone is judged objectively and in writing on the same criteria, and then look for demographic patterns. Pay close attention to whether objective requirements are waived for some groups more than others.
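One simple way to check for such a pattern is to log each decision along with whether a stated requirement was waived, then compare waiver rates across groups. The Python sketch below assumes a hypothetical log of (group, waived) pairs. The group labels and data are made up, and a real review would also need enough decisions per group for the comparison to mean anything.

```python
from collections import Counter

def waiver_rates(decisions):
    """decisions: iterable of (group, requirement_waived) pairs.

    Returns {group: share of decisions in which a requirement was waived}.
    """
    totals, waived = Counter(), Counter()
    for group, was_waived in decisions:
        totals[group] += 1
        if was_waived:
            waived[group] += 1
    return {group: waived[group] / totals[group] for group in totals}

# Made-up log of calibration decisions.
decisions = [
    ("A", True), ("A", True), ("A", False), ("A", False),
    ("B", False), ("B", True), ("B", False), ("B", False),
]

rates = waiver_rates(decisions)
print(rates)
```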

Don’t Eliminate Your Performance Appraisal System!

Women and people of color already get less feedback, and less honest feedback. Replacing formal appraisals with feedback offered on the fly will exacerbate the problem.

At the same time, feedback must be honest. An experiment found that women were told “gendered white lies,” in that people avoided telling them hard truths. Our data suggests this effect may be widespread, and particularly strong for Black women. In architecture, for example, they are more than three times as likely as white men to report getting less honest feedback than their colleagues. And in law, only about a fifth of white men, but 40% of men of color, report they don’t receive constructive feedback.

Reprinted by permission of Harvard Business Review Press. Adapted from Bias Interrupted: Creating Inclusion For Real and For Good by Joan C. Williams. Copyright 2021 Joan Williams. All rights reserved.