
Is Your Hiring Test a Joke??


When something looks good on the surface but is completely without merit, it is called a joke.

You might not have thought of this before, but many hiring tests fit that bill. I’m talking about tests that deliver numbers and data that look good on the surface, but do nothing to predict candidate job success — in other words, scores do a better job predicting vendor sales than employee performance.

Let me explain why, beginning with how professionals develop a hiring test.

What Works: Professional standards

Professionals always start with a job theory that sounds something like this: “I believe factor-X affects job performance.”

Next, they draft items to measure factor-X and give their test to hundreds of people, tweaking and tuning the items along the way. Then they use one or more methods to test whether scores are directly associated with job performance; for example, they might give their test to everyone upon hiring, ignore the scores, and later compare test scores to job performance.

This is called predictive validity. They could also give their test to people already on the job and compare test scores to job performance. This is called concurrent validity. Both methods have their pros and cons.
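
For readers who like to see the arithmetic, here is a minimal sketch of a predictive-validity check in Python, using entirely made-up numbers: collect test scores at hire, wait, then correlate them with later performance ratings. A test that predicts nothing will produce a correlation near zero.

```python
# Minimal sketch of a predictive-validity check (hypothetical data).
# Scores were collected at hire; ratings came from supervisors months later.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

test_scores_at_hire = [62, 71, 55, 80, 68, 74, 59, 85, 66, 77]
performance_ratings = [3.1, 3.8, 2.6, 4.2, 3.4, 3.9, 2.9, 4.5, 3.2, 4.0]

print(f"Predictive validity: r = {pearson_r(test_scores_at_hire, performance_ratings):.2f}")
```

A concurrent study runs the same calculation, except both columns come from people already on the job rather than from new hires.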

Drafting a stable, solid, and trustworthy hiring test takes months of writing, editing, running studies, and systematically examining the guts of the test at both the item and factor level. This is the only way to know whether test scores consistently and accurately predict job performance.

Bad joke examples

A while ago, I reviewed a test supposedly developed for retail hires.

The vendor’s own test manual showed scores predicted nothing. Not shrinkage. Not theft. Not turnover. Not performance. Zilch…nada…nothing!

Still, the vendor claimed, with a straight face, that it “could be helpful” for hiring. You know, like claiming it predicts job performance even though it doesn’t?

Another time, I was asked by a proud author to look at their web test. I intentionally answered every multiple-choice question with the same letter (a technique for seeing whether it would produce junk scores).

After the vendor told me the test results described me exactly, I explained what I had done. Then I went on to explain the kind of work necessary before the test could be considered professional. They replied that their investors would never stand for that.

Wouldn’t it be nice to have, you know, accuracy?

In a final example, a user claimed a certain well-known test would predict management success based on ego-drive. He maintained this trait was desirable for managers.

I said that was a nice thought, but if I were rejected for having a low ego-drive score, I would want to see proof that ego-drive was necessary for the job, and then I would demand to see a study showing my score predicted job performance.

We did not talk much after that. I guess I was being downright unreasonable by expecting a test user to show that scores predicted job performance.

Developing a Joke Test: Begin with ignorance

Ignorance is not a permanent condition. It can be fixed. So why do people think, without taking a single class in identifying job skills, measuring job performance, or psychometrics, that they know how to develop a hiring test that meets professional standards?

Developing a test takes cooperative organizations, patient candidates, honesty, accuracy, and a boatload of statistical work. In fact, here is a link to a book on how professionals do it: http://www.apa.org/science/programs/testing/standards.aspx. If you think you want to develop a test, or fix the one you market now, read this book thoroughly.

If you only want to buy a good test, ask your vendor for proof that he/she followed the standards. If the vendor has never heard of them, or claims they are too complicated for the average person, then the test is probably bogus!

Developing a Joke Test: Assume personality scores = skill

I once attended a course on the DISC where the instructor mentioned it was often used to hire salespeople. What? DISC factors predict job performance? DISC scores are just differences in how people answer questions, NOT differences in performance!

Not only is DISC scoring weird, its “either/or” scoring method requires rejecting one factor every time another is chosen, so two people can provide completely different answers but get the same score! Furthermore, its theory was originally based on soldier behavior under combat conditions. And just because the vendor thinks all salespeople should be pushy, does that mean all customers enjoy dealing with salespeople who are high D’s?
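
To make the either/or problem concrete, here is a small sketch of forced-choice tallying. The items, answers, and factor labels are hypothetical, not actual DISC content; the point is simply that two people who answer the items in completely different ways can end up with the same profile.

```python
# Sketch of forced-choice ("either/or") tallying with hypothetical items.
# Each answer credits one factor and, by design, rejects the alternatives.
from collections import Counter

def tally(choices):
    """Count how often each factor was picked across the items."""
    return Counter(choices)

person_a = ["D", "D", "I", "S", "C", "D", "I", "S"]
person_b = ["I", "S", "D", "D", "D", "C", "S", "I"]  # different item-by-item answers

print(tally(person_a))  # Counter({'D': 3, 'I': 2, 'S': 2, 'C': 1})
print(tally(person_b))  # identical tallies, so an identical "profile"
```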

Personality score differences are not skill differences.

Developing a Joke Test: Average everything

Averages are particularly insidious because they look job-credible.

For example, a vendor gives a generic (usually homegrown) test to 100 truck drivers, or 200 salespeople, or some other job title, averages the scores, and exclaims his/her test scores predict success in driving a truck, selling, or in some other occupation!

Are all the people in the sample equally competent? Did they all earn high marks for job performance or low turnover? Are all the truck drivers in the group doing identical work? How might you explain why some individual truck drivers score exactly the same as individuals in other jobs?

Remember that, on average, a person with one foot in a fire and the other in a bowl of ice is perfectly comfortable. Of course, a disreputable test vendor is perfectly comfortable selling junk because he/she does not know, or does not care, what those averages hide.
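
Here is the fire-and-ice problem in miniature, using made-up numbers: two groups with identical average scores can contain completely different individuals, and the average describes almost nobody in the second group.

```python
# Two made-up groups of "truck drivers" with the same average test score
# but very different individuals hiding underneath that average.
import statistics

tight_group = [69, 70, 70, 71, 70]   # everyone scores about the same
wild_group = [40, 95, 55, 100, 60]   # one foot in the fire, one in the ice

print(statistics.mean(tight_group), statistics.mean(wild_group))    # 70 and 70
print(statistics.stdev(tight_group), statistics.stdev(wild_group))  # about 0.7 vs. 26
```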

Developing a Joke Test: Toss and stick

Imagine giving a test to a high-performing group of employees, averaging their scores, and using the mean as the job target.

Whoa! The state of job prediction science just regressed to casting lots.

This technique is plagued with problems: the vendor assumes each factor affects job performance; average scores hide individual differences; people in the low group are often ignored; and, the biggest joke of all, the differences probably happened by chance. I had one vendor tell me that “Toss-and-Stick” was just another way to confirm a test works. I must have missed that class in grad school.
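
If you want to see how easily chance manufactures an "ideal profile," here is a tiny made-up simulation: the scores have nothing whatsoever to do with performance, yet the "top performer" group still drifts away from everyone else on every run.

```python
# Sketch: scores generated with NO connection to performance still produce
# a "high-performer profile" that differs from the rest purely by chance.
import random

random.seed(7)
for trial in range(5):
    scores = [random.gauss(50, 10) for _ in range(60)]
    random.shuffle(scores)                      # "top performer" labels are random
    top_performers, everyone_else = scores[:15], scores[15:]
    diff = sum(top_performers) / 15 - sum(everyone_else) / 45
    print(f"run {trial + 1}: 'ideal profile' sits {diff:+.1f} points from everyone else")
```

Stick with any one of those runs as a hiring target and you have a "standard" that will happily reject perfectly good candidates.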

Developing a Joke Test: Circus acts

Let me introduce you to Prof. Bertram Forer. Forer gave his college students a personality test, but instead of giving back their actual scores, he gave each student an identical report gathered from several horoscopes.

Using a 0 to 5 agreement scale, students averaged 4.26. In other words, although entirely different people received the same personality description, virtually all individuals agreed it described them to a “T.”

This experiment was later termed the Barnum Effect (also called the Forer Effect), after P.T. Barnum, who always made sure he had something for everyone. Junk test vendors take advantage of it when people get so excited about their test scores in a training workshop that they want to take the test into the hiring/promotion arena.

Another user Circus Act is the “one-off” effect. That is, some users tend to think their recollection of one or two exceptions makes the rule.

This often sounds like, “That can’t be right. I knew someone who…” That’s bad human judgment at work, and a great reason why people need to base hiring/promotion decisions on hard test facts. And let’s not forget, interviews are tests — verbal ones. They have something to measure, they use questions, and they have right/wrong answers.

Developing a Joke Test: Summary

The marketplace is filled with junk and deception: wrong-headed vendors seek more sales; trainers and managers mistakenly think training tests predict job performance; professional test practices are treated with ignorance and disrespect; occupational averages wrongly predict performance; meaningless organizational groupings and averages predict nothing; and so forth.

Think about it: when someone uses or sells an unprofessional test, they are really saying, “I don’t care how many careers are ruined by my bogus test scores, or how much money is lost by making a bad hire, so long as I can keep claiming these inaccurate tests help make better hiring decisions.”

Are you laughing yet?

R. Wendell Williams, Ph.D., is Managing Director of ScientificSelection.com. He specializes in helping organizations develop job competencies, measure applicant skills, implement performance management programs, develop performance appraisal systems, make promotion decisions, and develop Web-enabled hiring sites. Contact him at rww@scientificselection.com.
  • Jonathan Wilson

    Thank you for writing this. You are right and the matter is important.  I don’t think it will make many test vendors or HR customers happy, but it should reassure many who have not been hired, or who have been hired and found themselves unsuited to their role. 

  • Jacque Vilet

    Thank you SO MUCH for this article, although I fear it will fall on deaf ears within HR.

    This is not a “hot” topic like social media and employee engagement. But it should be; I see it as a ticking bomb. With an MS in psychology, I have more than a little knowledge of testing.

    Not to sound too harsh, but testing is the easy way out. Instead of spending more time doing probing interviews, give a test, see the score, and bingo, you have your hiring answer! Salesmen use a lot of big, impressive words, and more importantly, HR does not know the right questions to ask.

    I wonder how many lawsuits companies will have to face on this issue before they recognize the fallacy of testing.

    There is no substitute for thorough interviewing.

     

    • Rww

      Just to clarify the point, Jacque, interviews are another form of test, as are application blanks, candidate sources, résumés, and so forth. The best form of interview is the behavioral event.

  • Rita Allen

    I’d like to learn more about ScientificSelection.com, but was unable to get to the web site through the link or via Google search.

    • Rww

      The website is up and running. The host did a software upgrade on their server that did not play well with my code. All is fixed. Sorry for the confusion.

  • Rww

    Hi Rita… Thanks for the heads-up. I’ve contacted my vendor and he is seeing what happened to the web host.

    Wendell
