Analytics is all about aggregates. We look at the behavior of many employees and summarize it in single numbers such as “cases of misconduct,” “average absenteeism,” or “total cost of benefits.”
This aggregation works well when we are looking at reasonably large numbers of employees. However, as you cut data more finely, you risk running into problems.
For example, imagine you are looking at the number of cases of misconduct. Perhaps there were 75, up about 10% from last year. To dig deeper, you do a cut by department and find that one department accounted for 12 of those cases, more than any other department, so you begin to wonder whether the department manager is doing something wrong.
Keen to understand what’s really happening, you explore the data further and come across something remarkable: Eight of the 12 misconduct cases involved employees who were hired in the last three years.
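To make the two cuts concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the department names, the tenure values, and the record layout are invented so that the totals match the figures in the example (75 cases, 12 in one department, 8 of those among recent hires). It is not drawn from any real data set or HR system.

```python
from collections import Counter

# Hypothetical case records: (department, years_since_hire).
# All values are made up to mirror the counts in the example above.
def build_cases():
    cases = []
    cases += [("Dept A", 1)] * 8   # recent hires in the flagged department
    cases += [("Dept A", 6)] * 4   # longer-tenured employees, same department
    for dept in ["Dept B", "Dept C", "Dept D", "Dept E",
                 "Dept F", "Dept G", "Dept H"]:
        cases += [(dept, 5)] * 9   # 7 other departments x 9 cases = 63
    return cases

def cut_by_department(cases):
    """First cut: number of cases per department."""
    return Counter(dept for dept, _ in cases)

def cut_by_tenure(cases, department, recent_years=3):
    """Second cut: within one department, count cases involving
    employees hired within the last `recent_years` years."""
    in_dept = [years for dept, years in cases if dept == department]
    recent = sum(1 for years in in_dept if years <= recent_years)
    return recent, len(in_dept)

cases = build_cases()
by_dept = cut_by_department(cases)
top_dept, top_count = by_dept.most_common(1)[0]
recent, total_in_dept = cut_by_tenure(cases, top_dept)

print(f"Total cases: {len(cases)}")
print(f"Highest department: {top_dept} with {top_count} cases")
print(f"Recent hires among them: {recent} of {total_in_dept}")
```

Note how quickly the group shrinks: two cuts take you from 75 cases to 8, a number small enough that each case can be discussed by name.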
Now you feel you have learned something important. By exploring various data cuts, you’ve found a real hot spot of misconduct. You present this to the department manager. What will they say?
“Show me the names of the eight people you are talking about,” the manager might say. It’s not an unreasonable request.
Often, when the manager looks at those eight names, they will come up with a unique explanation for what happened in each case. At the aggregate level, it appears there is truly something wrong with the amount of misconduct among recent hires in that department. But at the individual-case level, each situation seems justified and unconnected to all the others.
Who is right? Is it the analyst who found a statistical outlier or the manager who can explain each instance?
There is no general answer to that. What we do know is that, from the manager’s perspective, each case is likely to look unique and unconnected. It’s not the role of the analyst to debate that. The analyst has flagged a disturbing anomaly, but it’s up to the manager to decide whether this finding points to a real problem.
However, this decision cannot be limited to the immediate department manager, who is too close to the situation to make an unbiased judgment. We need an HR business partner or the manager’s manager to determine if there is some common thread — such as inadequate screening of new hires — between the cases. If so, then that is something to be addressed.
The takeaway for analysts is to be aware of this chasm between aggregate and individual data and to recognize that, once they reach it, their work is done. When the manager starts talking about individual cases, it’s important to bring in other people who are closer to the situation to decide whether underlying issues are causing the flagged anomaly.