Are code tests a good idea for interviewing software engineers?

How effective are code tests in the software engineer interviewing process? This is a Googleable proposition with mixed conclusions. Many engineers grumble at the kinds of coding tests that have become standard, which are usually highly contrived algorithmic puzzles that rarely reflect the role’s actual projects in any discernible way. Here are some key areas to think about as you ponder instituting, revamping, or removing a code test.

A viable litmus test

In the 1960s, US Secretary of Defense Robert McNamara decided the best way to conduct the Vietnam War was to use data. Within months, thousands of pages of data per day were being sent to Washington, including a map that assigned each Vietnamese village a percentage score for how supportive it was of American forces. In practice, the difference between “55% in favor” and “45% in favor” could mean the difference between peace and attack. This became known as the McNamara Fallacy, because it attempted to distill something extremely complicated and subjective into a single quantitative measurement. This is how I feel about code tests. What does it mean for someone to be a “B+ Java Engineer” or have a “90% React” score? And should a business really care what that score is?

Does a coding test simulate the kinds of challenges most engineers will run into in their day-to-day work? Development work involves a lot of collaboration, and it most frequently happens on code that has already been written or has some framework or architecture in place. It is also well known that most engineers spend the bulk of their time Googling things, reading StackOverflow, etc. The typical code test does not simulate this, and many code tests these days even require camera supervision and secure applications to prevent candidates from using Google at all. How can this apples-to-oranges assessment possibly measure success on the job with any precision?

You could develop a code test that gives a candidate an existing codebase and then asks them to fix bugs or add features. This would be a closer approximation of real-world development, but it still misses any collaborative aspect. It also limits the candidate’s ability to ask questions, which is something they would have to do frequently in their regular job. Particularly when working with legacy systems, a lot of code that looks buggy has actually become a baked-in feature that users and other developers depend on.

False positives and false negatives

No interview process will be perfect, and neither will any coding test or any piece of software. Recruiting is a funnel, and you must have some tolerance for hiring candidates that you shouldn’t have and for passing on candidates that you shouldn’t have. The challenge becomes minimizing these false positives and false negatives, especially the former. With coding tests, I think that both happen to an unacceptable degree. People get rejected due to code tests all the time but go on to different companies and offer tremendous value; this is not a rarity. People also get accepted due to code tests all the time and don’t work out. At a former company where we adapted ThoughtWorks’ take-home coding test, the engineers who had the best code test results went on to introduce the most bugs and needed the most coaching. Most of them did not last, and they created many problems for the business. If your company has a standard code test, do you think every engineer meets your company’s standards? I doubt it. You can probably think of a number of engineers who aren’t contributing valuable, quality code.
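To make the funnel arithmetic concrete, here is a minimal Python sketch. Every number in it is a made-up assumption for illustration, not data from any company, and the function name and its parameters are hypothetical. It only shows how a pass/fail test that is imperfect in both directions splits a candidate pool into the four outcomes.

```python
def screen_outcomes(candidates, strong_fraction, pass_rate_strong, pass_rate_weak):
    """Split a candidate pool into the four outcomes of a pass/fail code test.

    All inputs are hypothetical: the share of candidates who would really
    work out, and how often each group passes the test.
    """
    strong = candidates * strong_fraction
    weak = candidates - strong
    return {
        "true_positives": strong * pass_rate_strong,         # would work out, passed
        "false_negatives": strong * (1 - pass_rate_strong),  # would work out, rejected
        "false_positives": weak * pass_rate_weak,            # would not work out, passed
        "true_negatives": weak * (1 - pass_rate_weak),       # would not work out, rejected
    }


# Illustrative assumptions only, not measurements.
outcomes = screen_outcomes(
    candidates=100,
    strong_fraction=0.2,
    pass_rate_strong=0.7,
    pass_rate_weak=0.3,
)
passed = outcomes["true_positives"] + outcomes["false_positives"]
share_weak_among_passed = outcomes["false_positives"] / passed

print(outcomes)
print(f"Share of passers who would not work out: {share_weak_among_passed:.0%}")
```

With these invented inputs, well over half of the people who pass are false positives, even though the test favors strong candidates. Different assumptions give different results; the point is only that the mix of the pool matters as much as the test.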

Selection bias

Like many other scoring-based systems, code tests can be gamed, manipulated, or mastered. An engineer who has no familial obligations may have plenty of spare time to go on LeetCode, learn all the coding problems, and become really good at cracking them, or to read a Java or systems textbook. What about an engineer who does not have much free time outside of work? How do they prepare? Must they sacrifice their personal time? Even so, how will they be able to compete on a level field with others? The answer is that, all else held equal, the person with more time will be more likely to pass the test, and this is unfair. It exacerbates a system that encourages melting the boundaries between work and non-work.

Hindsight being 20/20, at a former company I regularly screened out folks who had families or were older because they tended to use older frameworks and developed less interesting code. We regularly accepted candidates who were young, male, and white. This contributed to the company developing a white tech bro culture, exacerbating problems with racial, gender, and age inclusion.

You can address some of this availability bias by giving candidates take-home tests, but many of those concerns remain. You can even pay candidates to take the code test (whether they pass or fail) as an additional incentive. But you will always be biasing towards younger engineers who are less likely to have families, and against older engineers who have had less exposure to these sorts of code tests, have less time and interest in practicing them, or are less able to carve out 4-10 hours for a take-home test.

Incentivization

Taken to scale, what effects will coding tests have on the engineering community at large? The obvious answer is that they incentivize engineers to learn how to game coding tests instead of learning a new language, framework, or technology, improving their coding skills, or learning about project management, system design, or leadership. If individuals in the community have the desire to learn outside of work, how does the community benefit from incentivizing them to spend those cycles on prepping for coding tests? If the tests really are algorithmic puzzles, or Googleable questions, then everybody loses: employers, engineers, and technology users. As an employer, consider your role in incentivizing the community.

Expense

Most software engineers on most teams are already stretched thin. You can automate the grading of a code test into pass/fail, but I’ve already explained why I think that is a dubious proposition. Meanwhile, if the exam is take-home, you can’t be certain that the candidate worked on it and followed the rules. You could blindly accept a code test, but what part of this mimics the real world, in which engineers have to go through a code review process? Finally, you could do in-person digital or whiteboard exercises, but that is incredibly far from the real-world process of an engineer sitting at their desk, Googling and coding. And there is no getting around the fact that in all of these situations you will have to allocate engineer time to reviewing or conducting code tests and interviews.

More importantly, the people writing your code test are not experts at designing tests. Organizations like College Board and ETS spend tremendous amounts of money developing and continuously tuning their exams. You are an expert at designing software and leading engineering teams - you are not an expert test maker. The test you make will reflect your own biases and shortcomings, and those will skew the results. It is likely you will spend a large amount of time crafting an exam that exam-writing experts would pick apart.

So does the expense of test design, administration, and evaluation really pay off? If you are paying engineers to review code tests, let’s say 100 of them, what placement rate and success rate will you really have, and how much does that cost? You will still get false positives, and you’ll have lost people as false negatives. This brings us back to the question engineers always have to ask themselves whenever they do anything: “what is it I’m really trying to accomplish here?” Do you just feel uncomfortable hiring an engineer without watching them code? Do you really think you can boil down an engineer’s abilities to a test that you wrote in your free time? And if code tests are inherently flawed just like most written tests, are you really better off grading code tests instead of letting the rest of the interview process determine whether you move forward or not? I doubt it. And anyway, you will still want to do oral interviews with engineers. It seems like your engineering team will have to become a full-service recruitment team, so when will they actually do the work they were hired to do?
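One rough way to put a number on that question is a back-of-the-envelope calculation like the Python sketch below. The function name, the parameters, and every input value are hypothetical assumptions chosen for illustration; substitute your own figures. It counts only engineer time spent reviewing tests, so test design, interviews, and the cost of bad hires would all come on top.

```python
def review_cost_per_successful_hire(
    tests_reviewed,
    review_hours_per_test,
    engineer_hourly_cost,
    placement_rate,
    success_rate,
):
    """Estimate engineer review cost per hire who actually works out.

    placement_rate: assumed fraction of tested candidates who end up hired.
    success_rate: assumed fraction of those hires who work out.
    Returns (total_review_cost, successful_hires, cost_per_successful_hire).
    """
    total_review_cost = tests_reviewed * review_hours_per_test * engineer_hourly_cost
    successful_hires = tests_reviewed * placement_rate * success_rate
    if successful_hires == 0:
        # No successful hires: the cost per successful hire is undefined.
        return total_review_cost, 0.0, None
    return total_review_cost, successful_hires, total_review_cost / successful_hires


# Illustrative assumptions only: 100 tests, 1.5 review hours each,
# 100 (in your currency) per engineer hour, 5% placed, 60% of hires work out.
review_cost, hires, cost_each = review_cost_per_successful_hire(
    tests_reviewed=100,
    review_hours_per_test=1.5,
    engineer_hourly_cost=100,
    placement_rate=0.05,
    success_rate=0.6,
)
print(f"Review cost: {review_cost:.0f}")
print(f"Successful hires: {hires:.1f}")
print(f"Review cost per successful hire: {cost_each:.0f}")
```

The useful part is not the output but being forced to write down a placement rate and a success rate at all. If you can’t estimate them, you can’t say whether the test pays for itself.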

My recommendation

I am against whiteboard coding tests and digital code tests, period. I would consider take-home tests if they are limited in scope, timeboxed, and compensated. But I believe two of the most important characteristics in a good engineer are their willingness to learn and their ability to communicate. Neither is assessed objectively in a coding interview, and the rest can be learned and taught. I think it is much less likely for a candidate to be able to bullshit their way through an oral interview, where they have to describe a project in detail along with specific bugs and features they implemented, than to bullshit a coding test.

If you are very bullish on coding tests, experiment with the idea of giving the candidate an existing codebase and asking them to fix bugs or add features, or experiment with having them design and build a small product end to end. I’d also let their GitHub serve as a stand-in for a coding test. I think companies should have policies that allow engineers to take some of their code with them as part of their portfolio. This may seem terrifying to some people, but it is what people do already anyway. Every other role in a tech company recruits based in part on a candidate’s portfolio (e.g. business analysts and product managers provide past roadmaps, UX engineers bring screenshots of their past designs, etc).

Architects, accountants, lawyers, doctors, realtors, and investment brokers all have to take written exams. None of those fields resembles software engineering at all. Why should our certification process mimic the certification process of unrelated fields?