Reward Difficulty, not Complexity

Douglas Gresham
Aug 11, 2022


A recurring conversation in performance reviews and calibrations is: “this is good work, but is it <Level> scope?”

One of the ways we deal with this is to talk about how complex the deliverable is, referring to huge design documents or RFCs, trawling through reams of hard-to-parse code, and so on. There is a huge problem with this:

If you say you will reward complexity, people will build complex solutions to simple problems.

This is the opposite of what good looks like. What you should want is simple solutions to difficult problems.

Talking about Difficulty

Photo by Michał Parzuchowski on Unsplash

I wrote previously about performance reviews and stressed the need for frameworks for talking about things like Misses or your company competencies in a way that was clear and consistent. In this case, what we need is a way to talk about how difficult a problem was to solve, without resorting to talking about complexity or just using gut feeling.

My non-exhaustive list of ways in which something might be difficult:

  • Level of Effort: how long did it take? Something which would reasonably take 10 engineers 6 months to do is more difficult than something which takes 2 engineers a week to do. There’s more planning, more time for things to go wrong, more time for circumstances to change underneath you.
  • Ambiguity: was the solution obvious, or were there lots of different possibilities with different pros and cons? Did the person have to do a bunch of up-front research to figure out the path forward? Were there a lot of risks they had to handle along the way?
  • Technical Breadth: was this a change made to a single service, or to multiple interdependent ones? Was it only in Java, or did they also have to use Python or C++? Was it only in a distributed system, or were they also working on a front-end component or a data pipeline?
  • Technical Depth: did the problem require deep domain expertise? A classic example would be tuning JVM parameters for better performance, which requires knowledge of garbage collection, memory allocation, and performance characteristics of the service in question — but results in very small code changes.
  • Number of Contributors: how many people were involved? Did you have to break down tasks for others? In general, the more people working on something, the more difficult it is.
  • Stakeholders: how many people outside of your immediate team care about the project, and how many were involved in decision-making processes? The more there are, the harder they are to manage.
  • Business Impact: how important was this to the business? This is where metrics come into play.
  • Blast Radius: if something went wrong, how bad would it be? Similar to impact, but on the flip side of things. This is especially useful if the project is about mitigating incidents or regulatory compliance rather than delivering business value, but it’s also helpful to express the additional difficulty which comes with working on something that could bring down all the services at your company (and the resultant extra care you have to take).

All of these are “all else being equal”; sizing these things is often hard and subjective, and it’s usually impossible to directly compare different projects which hit different subsets of these criteria. My intention in listing them out is purely to have some vocabulary to talk about them rather than resorting to “this is complex”.

Levels of Accountability

Photo by Natalie Pedigo on Unsplash

It’s possible for someone to work on a project which hits many of the above criteria, but is that person actually on the hook for solving those difficulties? Here are some things to watch for.

Leadership: if someone is claiming they “led” a project, you need to get deep into specifics. Did they propose the idea, work with product and design to flesh out requirements, write/present a design review, do most/all of the implementation, direct others in getting the implementation done, manage stakeholders, or contribute in some other way? I’ve seen “leading a project” used to describe every combination of these, including none. Without details, feedback and performance reviews get very hand-wavy.

Outcomes over activities: organising a scrum-of-scrums meeting between contributing teams is all very well, but shouldn’t be rewarded in itself. What came of those meetings? What’s the evidence that they made things better? Decision logs (especially when capturing changed direction) and peer feedback are helpful.

Autonomy: a person handed a fully-scoped task that results in a million dollars of incremental revenue gets credit for the delivery, but not for the impact; if, however, they had spotted the issue, proposed the fix and delivered it, they should get credit for the impact too.

Supervision: did the person require a manager or senior peer to oversee the work to make sure it got done on time and to the right level of quality? Or were they fire-and-forget with full confidence that they’d flag up any blockers or issues in a timely fashion?

The Process Trap

Photo by UX Indonesia on Unsplash

An antipattern is to rely on key markers which can indicate difficulty, but which are neither the only evidence of it nor immune to manipulation. The most common example of this is in design reviews.

How it works is: a few engineers get their ratings dropped or promotions denied on the basis of difficulty (or scope or complexity or some other word meaning the same thing), where this is expressed as “where’s the design review for this?”.

Design documents are great when doing performance or promotion evaluations — you can see someone’s thought process laid out on paper, you can see how well they’ve covered all the bases, you can make your own judgement on how difficult the problem was, and you can see if they’ve done the right things culturally like inviting others to critique the document.

However, the feedback gets to their manager and subsequently to the report in the form “you haven’t authored and presented a design document”. This in turn causes them, and anyone else they talk to about the topic, to start authoring design documents for everything because that’s what they believe gets rewarded.

I’ve seen this go as far as inflating a ticket into a full-blown design document, complete with sample code which could be copy-pasted into your IDE (in the file and at the line number the design doc specifies, no less), and then arranging a review meeting with all the staff+ engineers in the space attending. This is not a good use of anyone’s time.

Look for key markers by all means, but don’t point to their absence as the problem — talk to the facets of difficulty instead.



Douglas Gresham

He/him. Currently Director of Engineering @ Skyscanner; formerly Google and FB.