// the idea
A predictor that explains itself
The 2026 World Cup runs across Canada, the USA, and Mexico from June 11th. Forty-eight teams, 104 matches, a month of football. I wanted a hobby project to run alongside it — not just a bracket predictor, but one that narrates its own shifts as results come in. Claude writes the match summaries; a deterministic model handles the maths.
The site is fully static. No backend, no server-side rendering, no live API calls from the browser. A small Python pipeline runs on my laptop after each match, regenerates the JSON files, and I upload them to the host. Total cost across the tournament: about a dollar fifty.
// the pipeline
One JSON file in, four JSON files out
The match-day workflow is deliberately spartan. Edit a results file, run a script, upload the resulting JSONs. Everything in between is automated.
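As a rough illustration of that workflow, here is a minimal sketch of a regenerate step. The output file names, the `{"matches": [...]}` shape of the results file, and the placeholder payload are all assumptions made for the example; the real script fills these files with model output.

```python
import json
from pathlib import Path

# Hypothetical output names: the post only says that four JSON files come out.
OUTPUT_NAMES = ["bracket.json", "contenders.json", "teams.json", "summaries.json"]


def run_pipeline(results_path: Path, out_dir: Path) -> list[Path]:
    """Read the hand-edited results file and rewrite every output file."""
    # Assumed input shape: {"matches": [ ...one entry per played match... ]}
    results = json.loads(results_path.read_text(encoding="utf-8"))
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for name in OUTPUT_NAMES:
        # Placeholder payload; a real run would store simulation output here.
        payload = {"file": name, "matches_played": len(results["matches"])}
        path = out_dir / name
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
        written.append(path)
    return written


# Example call (paths are hypothetical):
# run_pipeline(Path("results.json"), Path("site/data"))
```

Rewriting every output from scratch on each run keeps the step idempotent: rerunning it after fixing a typo in the results file simply overwrites the previous files.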
The static front-end is vanilla HTML/CSS/JS. No framework, no build step. The browser fetches five JSON files and renders the bracket, the contender list, and the per-team pages. Hosting is plain static file serving.
// the model
Elo, Poisson, Monte Carlo
The numerical model is intentionally simple. Three layers, each doing one job.
- Elo ratings: Each team carries a pre-tournament strength rating. Hosts get a bonus. Ratings update after every match using a margin-of-victory multiplier.
- Poisson goals: The Elo gap between two teams converts to expected goals (a lambda value) for each side. A Poisson distribution then gives win/draw/loss probabilities for any unplayed match.
- Monte Carlo: 10,000 full tournament simulations per update. Each run plays every remaining match, builds the group tables, and runs the knockout bracket. The percentages on the live site are just frequencies across those 10,000 simulated futures.
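The first two layers can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the site's actual code: the K-factor, the step-shaped margin-of-victory multiplier, the host bonus, and the Elo-to-lambda mapping (`base`, `scale`) are illustrative values the post does not specify.

```python
import math

HOST_BONUS = 100.0  # assumption: rating points added to a host's rating


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def mov_multiplier(goal_diff: int) -> float:
    """Assumed step function: bigger wins move ratings further."""
    n = abs(goal_diff)
    if n <= 1:
        return 1.0
    if n == 2:
        return 1.5
    return (11 + n) / 8.0


def update_elo(rating_a, rating_b, goals_a, goals_b, k=40.0):
    """Return both ratings after one match (k is an assumed value)."""
    actual = 1.0 if goals_a > goals_b else 0.5 if goals_a == goals_b else 0.0
    surprise = actual - expected_score(rating_a, rating_b)
    delta = k * mov_multiplier(goals_a - goals_b) * surprise
    return rating_a + delta, rating_b - delta


def goal_rates(rating_a, rating_b, base=1.3, scale=800.0):
    """Assumed mapping from the Elo gap to expected goals for each side."""
    diff = rating_a - rating_b
    return base * 10 ** (diff / scale), base * 10 ** (-diff / scale)


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(lam_a: float, lam_b: float, max_goals: int = 10):
    """Win/draw/loss probabilities for A from two independent Poissons."""
    win = draw = loss = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            p = poisson_pmf(i, lam_a) * poisson_pmf(j, lam_b)
            if i > j:
                win += p
            elif i == j:
                draw += p
            else:
                loss += p
    total = win + draw + loss  # renormalise the truncated score grid
    return win / total, draw / total, loss / total
```

In this sketch a host's bonus would be applied by passing `rating + HOST_BONUS` into `goal_rates` and `expected_score`, for example `outcome_probs(*goal_rates(1750 + HOST_BONUS, 1700))` with made-up ratings.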
No injuries. No suspensions. No tactical matchups. No weather. No referee history. The model has the score and a strength rating; that's it. The predictor's honesty about that is the point of the experiment.
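To show how little the simulation layer needs, here is a minimal Monte Carlo sketch for a single four-team group. It is deliberately narrower than the real thing, which plays every remaining match and the knockout bracket. The ratings, the Elo-to-goals mapping, and the random tiebreak are assumptions made for the example.

```python
import math
import random
from collections import Counter
from itertools import combinations


def goal_rates(rating_a, rating_b, base=1.3, scale=800.0):
    """Assumed mapping from the Elo gap to expected goals for each side."""
    diff = rating_a - rating_b
    return base * 10 ** (diff / scale), base * 10 ** (-diff / scale)


def sample_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson-distributed goal count (Knuth's method)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_group(ratings: dict, rng: random.Random) -> list:
    """Play one round robin and return the teams in finishing order."""
    points = dict.fromkeys(ratings, 0)
    goal_diff = dict.fromkeys(ratings, 0)
    for a, b in combinations(ratings, 2):
        lam_a, lam_b = goal_rates(ratings[a], ratings[b])
        goals_a, goals_b = sample_poisson(lam_a, rng), sample_poisson(lam_b, rng)
        goal_diff[a] += goals_a - goals_b
        goal_diff[b] += goals_b - goals_a
        if goals_a > goals_b:
            points[a] += 3
        elif goals_b > goals_a:
            points[b] += 3
        else:
            points[a] += 1
            points[b] += 1
    # Simplified ranking: points, then goal difference, then a coin flip.
    return sorted(
        ratings,
        key=lambda t: (points[t], goal_diff[t], rng.random()),
        reverse=True,
    )


def top_two_frequencies(ratings: dict, runs: int = 10_000, seed: int = 0) -> dict:
    """Share of simulated groups in which each team finishes in the top two."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(runs):
        counts.update(simulate_group(ratings, rng)[:2])
    return {team: counts[team] / runs for team in ratings}
```

Calling `top_two_frequencies({"A": 1900, "B": 1700, "C": 1600, "D": 1400})` with these invented ratings returns one frequency per team. Nothing but ratings goes in, which is exactly the limitation described above.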
// claude's role
Narration from data, not from journalism
After each match finishes, the script computes the structured context — pre-match probabilities, Elo deltas, tournament-odds shifts, group state — and passes it to Claude with a tight system prompt. The prompt has a banned-cliche list and three voice modes picked automatically.
- expected: Favourite won as predicted. Dry, observational, focused on group implications.
- upset: Underdog beat a >65% favourite. The shock is the story; pre-match probability and tournament-odds collapse take centre stage.
- knockout: Any knockout-stage match. Weightier; a team's tournament has just ended or extended.
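The automatic pick can be sketched as a small function. The function name and its parameters are hypothetical. The 65% threshold comes from the list above. Falling back to the expected voice for anything that is neither a knockout match nor an upset (draws, for instance) is an assumption, since the post does not say how those cases are handled.

```python
def pick_voice_mode(
    is_knockout: bool,
    favourite_prob: float,
    favourite_lost: bool,
    upset_threshold: float = 0.65,
) -> str:
    """Choose the narration voice from the pre-match numbers and the result."""
    # Any knockout-stage match takes the knockout voice, whatever the odds.
    if is_knockout:
        return "knockout"
    # An underdog beating a favourite rated above the threshold is an upset.
    if favourite_lost and favourite_prob > upset_threshold:
        return "upset"
    # Assumption: every other group-stage result uses the expected voice.
    return "expected"
```

For example, `pick_voice_mode(False, 0.72, True)` returns `"upset"`, while the same result in a knockout round returns `"knockout"`.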
Crucially: Claude doesn't read journalism. There's no scraping of BBC Sport or ESPN match reports. Every summary is written from the model's numerical output alone. That keeps it ethically clean and it forces a particular kind of writing — observational, odds-aware, never claiming to know how the football actually looked.
"Mexico edged South Africa 2-1 in the Group A opener — a result the model gave them only 54%. Two goals were enough to take the points, but the bracket has shifted noticeably: Mexico's path to the quarter-finals just got 6 percentage points easier, while South Africa's hopes of a third-place qualifier dropped to 22%."
// what it costs to run
Two pennies a match
Sonnet 4.6 at ~2,000 input tokens and ~150 output tokens per summary works out to roughly $0.006 per call. The matchday narrative is closer to $0.015.
Across 104 matches plus matchday round-ups, the entire tournament costs about $1.00 to $1.50 in API spend. Hosting is whatever I'm already paying for cyber-wyse. There's no runtime cost — visitors hit static files, not the Claude API.
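The arithmetic is simple enough to check in a few lines. The per-call figures are the ones quoted above. The number of matchday round-ups is not given in this post, so it is left as a parameter, and the 30 used in the example is only an illustrative guess.

```python
MATCHES = 104
COST_PER_MATCH_SUMMARY = 0.006       # dollars, figure quoted above
COST_PER_MATCHDAY_NARRATIVE = 0.015  # dollars, figure quoted above


def tournament_cost(matchday_roundups: int) -> float:
    """Total API spend in dollars for all summaries plus the round-ups."""
    summaries = MATCHES * COST_PER_MATCH_SUMMARY
    roundups = matchday_roundups * COST_PER_MATCHDAY_NARRATIVE
    return summaries + roundups


# Match summaries alone come to about $0.62; 30 round-ups (a guess) add $0.45.
print(f"${tournament_cost(30):.2f}")
```

Any reruns or discarded drafts would add to this, which is one reason to treat the total as a range and not a single number.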
// the disclaimer
Not betting advice. Not even close.
The site is plastered with this and I'll repeat it here: do not use this to inform betting decisions. The model is a toy. The probabilities on the site will routinely diverge from bookmakers' lines because bookmakers price in vastly more information — injuries, suspensions, lineups, recent form, weather — and add their own margin.
This is a hobby project demonstrating what AI-assisted prediction can and can't see. Half the interesting output will be the post-tournament retrospective: where did the model and Claude get things right? Where did they whiff? What patterns of error emerge when an LLM narrates from numbers alone?
National Council on Problem Gambling: ncpgambling.org · 1-800-GAMBLER, 24/7.