What is CIEDE2000? The color difference formula behind our scoring
Why we score color guesses with CIEDE2000 instead of RGB distance, how the formula corrects for perceptual quirks, and what it means for the round-by-round score you see.
Almost every quick “match the colour” tool scores its guesses the same way: convert both colours to RGB, treat the three numbers as coordinates, and measure the straight-line distance between them. The result is a single number that tells you how far apart the two colours are. The problem is that the number is misleading, often by a lot. Two colours can be close in RGB and feel completely different to your eyes; two others can be far apart in RGB and feel almost identical. RGB distance measures the wrong thing.
The Color Memory Game scores every guess with CIEDE2000, the perceptual colour difference formula recommended by the International Commission on Illumination (CIE). It’s slower to compute and harder to explain, but it produces numbers that match how human observers actually rank colour pairs. This article walks through why that’s necessary and what CIEDE2000 corrects for.
The naive approach: RGB distance
Pick any two colours. In RGB, each is a triple — say red at (255, 0, 0) and orange at (255, 165, 0). The Euclidean distance between them is roughly 165. Compare that to red (255, 0, 0) and a slightly different red (240, 0, 0): distance 15. The math says red and orange are about eleven times farther apart than two reds. Sounds reasonable.
Now try a different pair. Pure green (0, 255, 0) versus a slightly less saturated green (40, 255, 40): distance about 57. To your eye, these two greens are almost indistinguishable. Compare that to a deep navy (0, 0, 50) and a slightly different navy (0, 0, 90): distance 40. To your eye, those two navies are also nearly identical. RGB says one pair is 1.4× farther apart than the other, but visually they’re both small differences. The numbers don’t reflect how the colours actually look.
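The arithmetic above is easy to reproduce. Here is a minimal Python sketch of naive RGB scoring; the function name `rgb_distance` is ours, chosen for illustration, and is not taken from any particular tool.

```python
import math

def rgb_distance(c1, c2):
    """Straight-line (Euclidean) distance between two 8-bit RGB triples."""
    return math.dist(c1, c2)

# The four pairs discussed in the text.
pairs = {
    "red vs orange": ((255, 0, 0), (255, 165, 0)),
    "red vs nearby red": ((255, 0, 0), (240, 0, 0)),
    "green vs paler green": ((0, 255, 0), (40, 255, 40)),
    "navy vs nearby navy": ((0, 0, 50), (0, 0, 90)),
}

for name, (c1, c2) in pairs.items():
    print(f"{name}: {rgb_distance(c1, c2):.1f}")
```

Nothing in this function knows anything about human vision, which is exactly the weakness the next paragraph describes.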
The deeper problem: human vision isn’t equally sensitive to every direction in colour space. We can spot tiny shifts in green and yellow but barely notice equivalent shifts in deep blue. RGB treats every channel as equal, so it overstates colour differences in regions where our eyes are dull and understates them in regions where our eyes are sharp. A scoring system built on RGB distance will reward and penalise the wrong things.
Going perceptual with CIELAB
In 1976 the CIE published a colour space called L*a*b* (also written CIELAB). Instead of red, green, and blue, its three axes are designed to map onto how humans actually see colour: L* is lightness (black to white), a* runs from green to red, and b* runs from blue to yellow. The space was built around the idea that equal numerical distance should mean equal perceptual difference: if you move 1 unit in any direction, the change should look about the same to a human observer.
Once two colours are in CIELAB, the difference between them can be measured the same way as in RGB: as a straight-line Euclidean distance. The result is called delta-E, written ΔE. CIE76, the first version of the formula, was a big step up from RGB distance. As a rule of thumb, a ΔE of 1 in CIELAB sits roughly at the threshold of human perception; a ΔE of 5 is “clearly different”; a ΔE of 10 is “not really the same colour”.
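To make that concrete, here is a sketch of the conversion from sRGB to CIELAB followed by the CIE76 distance. It assumes 8-bit sRGB input and the D65 reference white, which are the usual choices for screen colours. The function names are ours, and this is not necessarily how the game's own code is organised.

```python
import math

# D65 reference white (2-degree observer), the white point sRGB is defined against.
WHITE_D65 = (0.95047, 1.00000, 1.08883)

def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (L*, a*, b*)."""
    # 1. Undo the sRGB transfer curve to get linear light in [0, 1].
    def linearize(v):
        v /= 255.0
        return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(v) for v in rgb)

    # 2. Linear sRGB -> CIE XYZ, using the standard sRGB/D65 matrix.
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b

    # 3. XYZ -> L*a*b*: cube-root compression relative to the white point,
    #    with a linear segment near black.
    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29
    fx, fy, fz = (f(v / w) for v, w in zip((x, y, z), WHITE_D65))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))

def delta_e_76(lab1, lab2):
    """CIE76 colour difference: Euclidean distance in CIELAB."""
    return math.dist(lab1, lab2)

red = srgb_to_lab((255, 0, 0))      # roughly L* 53.2, a* 80.1, b* 67.2
orange = srgb_to_lab((255, 165, 0))
print(f"red vs orange, CIE76: {delta_e_76(red, orange):.1f}")
```

The only perceptual modelling here lives in the colour space itself; the distance step is the same Euclidean formula used for RGB.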
Why CIE76 wasn’t enough
The promise of CIELAB — that equal numerical distance equals equal perceptual difference — turned out to be only approximately true. Researchers in the 1980s and 90s catalogued specific places where CIE76 over- or under-stated visual differences. A few of the biggest:
Saturated regions are overstated. A 5-unit shift between two highly saturated colours feels much smaller to a human than a 5-unit shift between two muted ones. CIE76 treats them as identical, so it reports the saturated pair as more different than it looks.
Hue rotations near neutral feel huge. If you nudge a near-grey colour from slightly-red to slightly-green, the small numerical change in CIELAB corresponds to a perceptual jump that goes way beyond what the formula reports.
Lightness sensitivity isn’t flat. Humans notice small lightness differences in mid-greys far more than in very dark or very bright colours. CIE76 weights every part of the L axis equally.
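Two of these effects can be seen in isolation. CIEDE2000, covered in the next section, divides the lightness and chroma differences by weighting functions usually written S_L and S_C. The sketch below evaluates just those two weights as defined in the published formula; the helper names `s_l` and `s_c` are ours. A larger weight means the same raw CIELAB difference counts for less.

```python
import math

def s_l(mean_lightness):
    """CIEDE2000 lightness weight: 1.0 at L* = 50, growing toward black and white."""
    d2 = (mean_lightness - 50) ** 2
    return 1 + 0.015 * d2 / math.sqrt(20 + d2)

def s_c(mean_chroma):
    """CIEDE2000 chroma weight: grows linearly with the pair's mean chroma."""
    return 1 + 0.045 * mean_chroma

for lightness in (10, 50, 90):
    print(f"L* = {lightness}: S_L = {s_l(lightness):.2f}")

for chroma in (5, 40, 80):
    print(f"C  = {chroma}: S_C = {s_c(chroma):.2f}")
```

At a chroma of 80 the weight is 4.6, so a 5-unit chroma shift between two vivid colours contributes about as much as a 1-unit shift would near grey.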
CIEDE2000: corrections all the way down
CIEDE2000, published in 2001, layers a series of corrections on top of CIE76 to fix those specific problems. Without going into the equations: the formula reweights lightness, chroma, and hue differences depending on where in colour space the comparison is happening, adds a rotation term that couples chroma and hue differences in the blue region (where CIELAB's equal-difference contours are tilted), and rescales the a* axis for near-neutral colours, which fixes the hue rotation problem above. The output is still a single ΔE number, but the number is more meaningful: a CIEDE2000 ΔE of 1 corresponds much more reliably to roughly one just-noticeable difference for a human observer, wherever in colour space the comparison sits.
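For readers who do want the equations, here is a compact Python sketch that follows the published CIEDE2000 formula step by step. It takes two CIELAB triples and assumes the default parametric factors kL = kC = kH = 1. Variable names are ours, and this is an illustration rather than the game's actual source.

```python
import math

def ciede2000(lab1, lab2, kL=1.0, kC=1.0, kH=1.0):
    """CIEDE2000 colour difference between two (L*, a*, b*) triples."""
    L1, a1, b1 = lab1
    L2, a2, b2 = lab2

    # Near-neutral correction: stretch a* when the pair's mean chroma is low.
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    G = 0.5 * (1 - math.sqrt(c_bar**7 / (c_bar**7 + 25**7)))
    a1p, a2p = (1 + G) * a1, (1 + G) * a2
    C1p, C2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360 if C1p else 0.0
    h2p = math.degrees(math.atan2(b2, a2p)) % 360 if C2p else 0.0

    # Differences in lightness, chroma, and hue.
    dL = L2 - L1
    dC = C2p - C1p
    dh = 0.0
    if C1p * C2p != 0:
        dh = h2p - h1p
        if dh > 180:
            dh -= 360
        elif dh < -180:
            dh += 360
    dH = 2 * math.sqrt(C1p * C2p) * math.sin(math.radians(dh / 2))

    # Means, taking care with the hue angle wrapping around 360 degrees.
    L_bar = (L1 + L2) / 2
    C_bar = (C1p + C2p) / 2
    if C1p * C2p == 0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        h_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        h_bar = (h1p + h2p + 360) / 2
    else:
        h_bar = (h1p + h2p - 360) / 2

    # Position-dependent weights for each component.
    T = (1 - 0.17 * math.cos(math.radians(h_bar - 30))
           + 0.24 * math.cos(math.radians(2 * h_bar))
           + 0.32 * math.cos(math.radians(3 * h_bar + 6))
           - 0.20 * math.cos(math.radians(4 * h_bar - 63)))
    S_L = 1 + 0.015 * (L_bar - 50) ** 2 / math.sqrt(20 + (L_bar - 50) ** 2)
    S_C = 1 + 0.045 * C_bar
    S_H = 1 + 0.015 * C_bar * T

    # Rotation term: only significant for saturated colours with hue near 275 degrees (blue).
    d_theta = 30 * math.exp(-(((h_bar - 275) / 25) ** 2))
    R_C = 2 * math.sqrt(C_bar**7 / (C_bar**7 + 25**7))
    R_T = -math.sin(math.radians(2 * d_theta)) * R_C

    tL, tC, tH = dL / (kL * S_L), dC / (kC * S_C), dH / (kH * S_H)
    return math.sqrt(tL**2 + tC**2 + tH**2 + R_T * tC * tH)

# A saturated blue pair: CIE76 would report about 4.0, CIEDE2000 about 2.04.
print(ciede2000((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485)))
```

The example pair sits in the saturated blue region, where both the chroma weight and the rotation term are active, which is why the CIEDE2000 result is roughly half the plain CIELAB distance.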
It’s the formula used by serious colour-critical industries — print, paint, textiles, broadcast colour grading — when they need to know whether two samples are “the same”. We use it because we want our scoring to feel honest. A 7/10 in our game should mean the guess is genuinely close, not just numerically close in some arbitrary space.
From ΔE to your round score
After computing CIEDE2000 between the target colour and your guess, we convert that ΔE into a 0–10 round score using a sigmoid curve. A ΔE of 0 (perfect match) maps to 10. A ΔE of around 7 — a clearly visible difference — maps to roughly 5. ΔE values above about 25 (a totally wrong colour family) map to scores near 0. The curve is shaped to feel intuitive: a small mistake costs you a small amount, a hue-family error costs you a lot.
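The exact curve and its constants are not spelled out here, so the following is only a hypothetical sketch: one sigmoid-shaped function that hits the three anchor points described above (ΔE 0 gives 10, ΔE 7 gives 5, ΔE 25 and beyond gives a score near 0). The function name `round_score` and the parameters `midpoint` and `steepness` are assumptions made for illustration.

```python
def round_score(delta_e, midpoint=7.0, steepness=3.0):
    """Map a CIEDE2000 difference to a 0-10 score.

    Hypothetical curve: `midpoint` is the delta-E that scores 5, and
    `steepness` controls how quickly the score falls away around it.
    Both values are illustrative, not the game's real constants.
    """
    if delta_e <= 0:
        return 10.0
    return 10.0 / (1.0 + (delta_e / midpoint) ** steepness)

for de in (0, 2, 4, 7, 12, 25):
    print(f"delta-E {de:>2}: score {round_score(de):.1f}")
```

With these illustrative parameters the curve is nearly flat for very small errors, falls fastest around the midpoint, and flattens out again for wildly wrong guesses, which is the "small mistakes are cheap" behaviour the text describes.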
On top of that we apply a small adjustment for hue accuracy specifically. Two colours can have the same ΔE but a noticeably different feel if one matches the hue family and one doesn’t. Our score adds a recovery bonus for nailing the hue and a small penalty for missing it on highly saturated targets, where hue errors are especially obvious. The result is a score that tracks what your eye is telling you about the comparison, rather than what a simpler formula would have produced.
What this means as a player
A few things follow from CIEDE2000-based scoring. The hue family matters most — getting it wrong is what costs you the most points, because a wrong hue is what looks the most wrong to a human observer. Saturation and brightness are forgiving on dim, near-neutral colours and unforgiving on vivid ones, because that’s how human vision behaves. And small errors are cheap; a guess that feels “basically right” will score in the 7–9 range even when the slider numbers are visibly different.
If you’ve ever wondered why two seemingly equal-quality guesses scored differently, the answer is almost always that CIEDE2000 caught something your eye agreed with but your slider readout didn’t: usually a hue shift that was small on the sliders but perceptually large, or a saturation gap on a vivid colour that was small in numbers but clearly visible.