The goal-difficulty algorithm is a method for ranking how hard a goal was to score, measured at the moment the ball is struck. It scores eight inputs: shot location, defensive pressure, goalkeeper position, body part, ball movement, touches before the shot, build-up complexity, and game state. Unlike expected goals (xG), which measures probability, it measures the technical difficulty of the finish itself.
Every "Goal of the Year" award since 2005 has been wrong.
Not wrong in the sense that the wrong player won. Wrong in the sense that the question itself was never properly asked. The Puskás Award, the BBC Goal of the Season, FIFA's various year-end lists. All of them rank goals the way pundits rank goals. Visual spectacle. Emotional moment. Player narrative. Tribal memory. Bicycle kicks beat solo runs. Finals beat group stages. Goals scored by famous players beat goals scored by anyone else.
None of those weights have anything to do with how difficult a goal actually was to score.
This piece lays out a public methodology for fixing that. Build it yourself if you want to. The math is here.
What "great goal" actually means
There are two questions hiding inside the phrase "great goal."
The first is: how unlikely was it that this shot resulted in a goal, given the situation? That is expected goals (xG), and it has been a solved problem in football analytics since roughly 2012. Every major Opta and StatsBomb derivative has a version of it. A header from six yards in front of an open net is high xG. A 25-yard volley with two defenders closing is low xG. xG measures probability.
It does not measure difficulty.
The second question, the one nobody operationalizes, is: given the constraints the player was under at the moment of contact, how many players in the world could have made that shot? That is a difficulty question, and it is what separates an expected goal from a goal you remember a decade later.
A penalty has among the lowest difficulty in football. A bicycle kick from 18 yards with a defender on your back has among the highest. xG cannot tell you that. The methodology below can.
The eight inputs
A goal-difficulty score takes eight inputs at the moment the ball is struck.
Shot location. Distance and angle to goal, measured to the nearest foot. This is the spine of any xG model and the foundation here too. Closer and more central is easier. The relationship is non-linear; an extra ten yards from 20 to 30 costs far more difficulty than the same distance added from 6 to 16.
Defensive pressure. The number of defenders within three yards of the ball at strike, plus a binary flag for whether the closest defender was actively committing to the block. A clear sight on goal is one thing. A clear sight with a center-back ducking into the lane is another.
Goalkeeper position. Distance off the line at strike, plus a measure of how much of the goal frame was actually unguarded given the keeper's set. A keeper out of position turns a hard shot into an open net. A keeper set and square shrinks the target by half.
Body part and contact type. Weak foot, header, bicycle, scorpion, outside-of-the-boot. Each gets a difficulty coefficient calibrated against historical conversion rates. Bicycle kicks convert at roughly one-fifth the rate of standard strong-foot strikes from comparable positions. That ratio is the coefficient.
Ball speed and movement at strike. A ball arriving fast, on the half-volley, dipping, or spinning is harder to hit cleanly than a stationary or rolling ball. Tracking data captures this. So does careful video review when tracking is unavailable.
Touches before the shot. A solo run with multiple touches against committed defenders is fundamentally harder than a first-time finish from a cutback. Both can be brilliant. Only one carries individual-creation difficulty.
Build-up complexity. This is the team-level input. Number of passes in the sequence, opposition pressing intensity broken, vertical distance covered. A goal that emerges from a 15-pass move through a high press carries a different kind of difficulty than a turnover-and-strike. Both belong on a great-goals list. They belong there for different reasons.
Game state. Score line, time remaining, competition stage, opponent quality. This is the only context input. It does not change how hard the goal was to score. It changes how much the goal mattered. The two get scored separately and then combined.
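Several of the inputs above reduce to simple geometry or ratios once positions are known. The following is a minimal Python sketch of three of them, not a definitive implementation. The coordinate system (yards, goal centre at the origin), measuring distance to the goal centre, the function names, and the sample positions are all assumptions made for illustration. Only the three-yard pressure radius and the idea of a conversion-rate ratio per contact type come from the descriptions above.

```python
import math

GOAL_WIDTH_YD = 8.0        # a full-size goal is 8 yards between the posts
PRESSURE_RADIUS_YD = 3.0   # the three-yard radius named in the text


def shot_geometry(x, y):
    """Return (distance_yd, opening_angle_deg) for a ball at (x, y).

    Assumed coordinates, in yards: goal centre at (0, 0), posts at
    x = -4 and x = +4, y = distance out from the goal line.
    """
    distance = math.hypot(x, y)
    half = GOAL_WIDTH_YD / 2
    # Angle between the lines from the ball to each post.
    angle = math.atan2(half - x, y) - math.atan2(-half - x, y)
    return distance, math.degrees(angle)


def defenders_in_range(ball, defenders, radius=PRESSURE_RADIUS_YD):
    """Count defenders within `radius` yards of the ball at strike."""
    return sum(1 for d in defenders if math.dist(ball, d) <= radius)


def conversion_ratio(contact_rate, baseline_rate):
    """Conversion rate of a contact type relative to a baseline strike.

    Lower values mean a harder contact type; 0.2 corresponds to
    converting at one-fifth the baseline rate.
    """
    return contact_rate / baseline_rate


# Illustrative values only: a strike from 12 yards out, dead centre,
# with three hypothetical defender positions.
distance, angle = shot_geometry(0.0, 12.0)
pressure = defenders_in_range(
    (0.0, 12.0), [(1.0, 12.0), (0.0, 16.0), (2.0, 14.0)]
)
```

The block-commitment flag, the goalkeeper's coverage of the frame, and ball movement are left out on purpose: they depend on tracking data or video annotation that a geometric sketch cannot supply.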
Weighting without pundits
The trap most rankings fall into is weighting by panel vote. Pick a committee, give them a spreadsheet, watch them argue. The output reflects the committee, not the goals.
A defensible methodology weights each input by how much it predicts a goal not being scored from comparable situations. Run the historical dataset. Find the marginal contribution of each input to the conversion rate. That marginal contribution is the weight.
Distance from goal carries the heaviest weight because it has the strongest empirical effect on conversion. Body part (bicycle, header, weak foot) carries a meaningful but smaller weight. Build-up complexity carries a small weight in raw difficulty and a larger weight in a combined "memorability" score that includes game state.
The weights are not subjective. They are fit to data and updated as new data arrives.
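One plausible way to read "marginal contribution" is as the coefficient each input receives in a logistic regression that predicts a miss. The text does not name an estimator, so that choice is an assumption, as are the function name, the two-feature layout, and the learning-rate settings. The rows below are invented purely so the sketch runs; they are not real shot data.

```python
import math


def fit_miss_model(rows, missed, lr=0.1, epochs=2000):
    """Fit logistic regression P(miss | inputs) by batch gradient descent.

    rows   : list of feature lists, one per shot (already scaled)
    missed : list of 1 (no goal) or 0 (goal), one per shot
    Returns (bias, weights).
    """
    n, k = len(rows), len(rows[0])
    bias, weights = 0.0, [0.0] * k
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * k
        for x, y in zip(rows, missed):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted miss probability
            err = p - y
            grad_b += err
            for j in range(k):
                grad_w[j] += err * x[j]
        # Step against the average gradient of the log-loss.
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return bias, weights


# Invented toy shots: [distance scaled to 0..1, is_bicycle flag].
rows = [
    [0.2, 0], [0.3, 0], [0.4, 0], [0.5, 0], [0.6, 0], [0.8, 0],
    [0.9, 0], [1.0, 0], [0.3, 1], [0.4, 1], [0.2, 1], [0.5, 1],
]
missed = [0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1]

bias, weights = fit_miss_model(rows, missed)
distance_weight, bicycle_weight = weights
```

A positive weight means the input makes a miss more likely, which is the sense in which it adds difficulty. Refitting as new shots arrive is what keeps the weights tied to data rather than to a panel.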
Validation
A model that says every Messi solo run is the greatest goal ever is broken in a useful way: it reveals a real pattern. A model that says every penalty is great is broken in a useless way. Validation tells you which kind of broken you have.
The honest validation is forward-blind. Build the model on goals scored before a cutoff date. Score every goal in the test period, none of which the model has seen. Compare against three things: bookmaker prop markets where they exist, expert consensus where it converges, and the model's own internal stability when you bootstrap the training set. Fail any of the three and the model is decoration.
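The cutoff split and the bootstrap stability check can be sketched in a few lines of Python. This is an illustration under stated assumptions: the record layout, the function names, and the stand-in `toy_fit` are hypothetical, and the dated goals are invented. A real check would pass in the actual model-fitting routine. The comparisons against prop markets and expert consensus need external data and are not shown.

```python
import random
from datetime import date


def forward_blind_split(goals, cutoff):
    """Train on goals scored before `cutoff`; test on everything after."""
    train = [g for g in goals if g["date"] < cutoff]
    test = [g for g in goals if g["date"] >= cutoff]
    return train, test


def bootstrap_weight_spread(train, fit, n_resamples=200, seed=0):
    """Refit on resampled training sets; return (min, max) per weight.

    `fit` is any callable that takes a list of goals and returns a list
    of weights. A narrow spread suggests the weights are stable.
    """
    rng = random.Random(seed)
    fits = []
    for _ in range(n_resamples):
        # Sample with replacement, same size as the training set.
        sample = [rng.choice(train) for _ in train]
        fits.append(fit(sample))
    # Transpose so each tuple holds one weight across all resamples.
    return [(min(ws), max(ws)) for ws in zip(*fits)]


def toy_fit(sample):
    # Hypothetical stand-in for a real model fit: a single "weight"
    # equal to the mean shot distance in the sample.
    return [sum(g["distance_yd"] for g in sample) / len(sample)]


# Invented records, for illustration only.
goals = [
    {"date": date(2019, 3, 2), "distance_yd": 8.0},
    {"date": date(2020, 9, 19), "distance_yd": 22.0},
    {"date": date(2021, 1, 10), "distance_yd": 14.0},
    {"date": date(2021, 11, 6), "distance_yd": 30.0},
    {"date": date(2022, 4, 23), "distance_yd": 11.0},
    {"date": date(2023, 2, 5), "distance_yd": 18.0},
]

train, test = forward_blind_split(goals, cutoff=date(2022, 1, 1))
spread = bootstrap_weight_spread(train, toy_fit)
```

Fixing the random seed keeps the stability check reproducible, so two people running it on the same training set see the same spread.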
What changes when you apply it
A few things change that pundits will not like.
Messi's catalogue dominates differently than the highlight reels suggest. His top goals by difficulty are not the famous solo runs against weak opposition. They are the tight-angle finishes against set defenses with two defenders closing, the goals nobody clips because the camera does not love them.
Several iconic goals fall. Zidane's volley in the 2002 Champions League final is a great goal. By pure difficulty it is not in the top fifty of this century. The technique was exceptional. The defensive pressure was minimal and the goalkeeper was set deep. The reason it is iconic is the moment. The methodology splits those two things apart and scores them separately.
Several forgotten goals rise. A handful of low-stakes league goals scored under absurd conditions outscore famous final-winners. They never get clipped. The algorithm finds them anyway. That is the point.
The methodology is public
The full input list, the weighting approach, and the validation framework are above. Anyone with tracking data and a willingness to do the work can build a version of this. Nobody has, in public, because the incentives in football media reward narrative and tribal alignment, not method.
The methodology is the open layer. The data pipeline, the live rankings, the model maintained over time. That is a separate question. This piece exists so the next "Goal of the Year" debate can at least be argued on shared ground.
If you have read this far and you want to see the rankings the methodology produces, those are in the rest of the series. The first one is the Messi versus Ronaldo question, settled in 2023 and ignored ever since.
Frequently Asked Questions
What is the goal-difficulty algorithm?
The goal-difficulty algorithm ranks goals by how hard they were to score rather than how good they look. It measures eight inputs at the moment of contact, including shot location, defensive pressure, goalkeeper position, and body part, and weights each input by how strongly it predicts a goal not being scored from comparable situations.
Is goal difficulty the same as expected goals (xG)?
No. Expected goals measures the probability that a shot becomes a goal from a given position. Goal difficulty measures how hard the finish was to execute given the constraints the player faced. A penalty has high xG and low difficulty. A bicycle kick under pressure has low xG and high difficulty. They answer different questions.
How is a goal-difficulty score calculated?
Each of the eight inputs is weighted by its marginal contribution to whether comparable shots are missed. Distance from goal carries the heaviest weight because it has the strongest empirical effect on conversion. Body part carries a meaningful but smaller weight. Game state is scored separately, because it changes how much a goal mattered, not how hard it was to score.
Can you objectively rank the best goals?
You can objectively rank how difficult goals were to score, using measurable inputs at the moment of contact. You cannot objectively rank how memorable or important they were, because those depend on context and emotion. Most "greatest goal" lists conflate the two. The goal-difficulty algorithm separates them, scoring difficulty on its own and game state separately.
Why are most "goal of the year" awards biased?
Awards like the FIFA Puskás Award are decided by public vote and panel selection, which measure popularity, not difficulty. Goals scored in widely broadcast leagues win more often because more people see and share them. Free kicks and long-range strikes win more often because they photograph well, even though they are frequently easier than tight-angle finishes through traffic.