
What Forecasting Tournaments Teach About Better Judgment

Forecasting tournaments show how probability practice, feedback and collaboration can improve confidence calibration.


Introduction

Forecasting tournaments are one of the clearest demonstrations that judgement under uncertainty can be improved through deliberate practice rather than treated as a fixed talent. Participants are asked to assign numerical probabilities to questions about future events before the outcomes are known, their predictions are scored after resolution, and they receive repeated feedback on both accuracy and confidence. Over time, this process exposes systematic overconfidence, rewards well-calibrated judgement, and encourages better habits of reasoning. Rather than treating forecasting as guesswork, tournaments turn it into a measurable learning exercise in which probabilities, evidence, revisions and outcomes are all recorded. Research from large forecasting competitions, particularly those associated with the Intelligence Advanced Research Projects Activity (IARPA) and the Good Judgment Project, shows that structured training, regular feedback, collaborative discussion and careful aggregation can produce substantially better-calibrated forecasts than conventional expert judgement alone. [journals.sagepub.com+2iarpa.gov]


Why tournaments record probabilities before outcomes

The defining feature of a forecasting tournament is that predictions are made before anyone knows the answer. Each forecast must specify a probability: for example, a 70% chance that a peace agreement will be signed within six months, or a 20% chance that inflation will exceed a given threshold. Once the question resolves, the forecast is compared with reality using a proper scoring rule, most commonly the Brier score, which rewards both accuracy and honest expression of uncertainty rather than confident guessing. [Wharton Faculty Platform]
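As a rough illustration of how such scoring works, the sketch below computes a Brier score in its common single-outcome form: the mean squared difference between each stated probability and the 0-or-1 outcome, where lower is better and 0 is perfect. The function name `brier_score` and the example numbers are assumptions made for illustration. Some tournaments report the original two-category form of the score, which ranges from 0 to 2 and doubles the values shown here.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes.

    forecasts: probabilities between 0 and 1 that the event happens.
    outcomes:  1 if the event happened, 0 if it did not.
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty lists")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# A 70% forecast that came true and a 20% forecast that did not:
hedged = brier_score([0.7, 0.2], [1, 0])        # (0.09 + 0.04) / 2, about 0.065

# Total certainty in the wrong direction gets the worst possible score:
confident_wrong = brier_score([1.0, 0.0], [0, 1])  # 1.0

print(round(hedged, 3), confident_wrong)
```

Because errors are squared, moving from 90% to 100% on a question that then fails is penalised much more heavily than the same ten-point move near 50%, which is why confident guessing is a poor strategy under this rule.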

This structure solves several common problems in everyday judgement.

  • Hindsight bias: predictions cannot be unconsciously rewritten after the event.
  • Outcome bias: a sound decision is not mistaken for a poor one simply because chance intervened.
  • Vague confidence: words such as “likely” or “probably” are replaced by explicit probabilities that can later be evaluated.

Because hundreds of questions accumulate over months or years, participants receive a statistically meaningful picture of their calibration. Someone who routinely assigns 90% confidence to uncertain questions but is correct only 70% of the time will quickly see evidence of overconfidence. Conversely, someone whose 60% predictions succeed close to six times in ten is well calibrated, even if they are occasionally wrong. [journals.sagepub.com+2Wharton Faculty Platform]
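A calibration check of this kind can be sketched in a few lines. The example below groups forecasts by stated probability, rounded to the nearest 10%, and reports how often the event actually occurred in each group. The function name `calibration_table`, the bucket width and the data are illustrative assumptions chosen to mirror the 90% and 60% cases described above.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Map each stated probability level to (count, observed hit rate)."""
    buckets = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        buckets[round(p, 1)].append(o)  # bucket by nearest 10%
    return {
        level: (len(hits), sum(hits) / len(hits))
        for level, hits in sorted(buckets.items())
    }


# Ten forecasts at 90% of which seven came true,
# and ten forecasts at 60% of which six came true.
forecasts = [0.9] * 10 + [0.6] * 10
outcomes = [1] * 7 + [0] * 3 + [1] * 6 + [0] * 4

table = calibration_table(forecasts, outcomes)
for level, (count, hit_rate) in table.items():
    print(f"stated {level:.0%}: {count} forecasts, {hit_rate:.0%} came true")
```

Here the 60% bucket is well calibrated while the 90% bucket shows overconfidence. With only ten forecasts per bucket the gap could still be noise, which is why the volume of questions in a tournament matters.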

How feedback and collaboration improve forecasts

One of the most important findings from major forecasting tournaments is that improvement comes from repeated cycles of prediction, feedback and revision rather than from one-off instruction.

The Good Judgment Project, which won IARPA’s multi-year geopolitical forecasting competition, combined several elements:

  • brief training in probabilistic reasoning;
  • continual scoring of forecasts;
  • opportunities to revise predictions as new evidence emerged;
  • carefully designed collaborative teams;
  • statistical aggregation of multiple forecasts. [goodjudgment.com+2learnmoore.org]

Participants were encouraged to update forecasts whenever meaningful evidence appeared instead of defending their original judgement. Subsequent analyses found that better forecasters tended to make more frequent, incremental updates rather than large swings driven by single news events. This behaviour improved both calibration and overall forecasting accuracy because beliefs stayed aligned with changing evidence instead of becoming anchored to first impressions. [ResearchGate]

Collaboration also mattered, but not because groups automatically outperform individuals. The most successful teams were trained to challenge assumptions constructively, explain reasoning clearly and evaluate evidence rather than authority or confidence. Later analyses of Good Judgment Project data found that compromise or aggregated forecasts often exceeded the performance of individual forecasts, illustrating the practical value of combining partially independent judgements. [learnmoore.org+2PMC]


What made the IARPA tournaments distinctive?

The IARPA Aggregative Contingent Estimation (ACE) programme was designed not merely to identify good forecasters but to test competing methods for improving judgement. Multiple research teams used different combinations of training, team structures, aggregation algorithms and selection methods while forecasting hundreds of real geopolitical events over several years. [iarpa.gov]

The results challenged the assumption that forecasting skill is mostly innate or dependent on privileged information.

Research emerging from the tournament showed that:

  • modest training in probabilistic reasoning improved performance;
  • identifying consistently well-calibrated forecasters produced further gains;
  • collaborative forecasting outperformed many independent approaches;
  • sophisticated aggregation methods improved on simple averages;
  • the best-performing forecasters remained consistently better than most participants over long periods rather than succeeding through luck alone. [goodjudgment.com+3goodjudgment.com+3ResearchGate]
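The article does not specify which aggregation algorithms were used, so the following is only a sketch of the general idea. It compares a simple average with one approach discussed in the forecasting literature, "extremizing": averaging forecasts in log-odds space and then pushing the result away from 50%. The function names, the exponent `a = 2.0` and the clipping threshold `eps` are hypothetical values chosen for illustration. In practice such parameters would be fitted to past data.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


def mean_forecast(probs):
    """Simple unweighted average of the individual probabilities."""
    return sum(probs) / len(probs)


def extremized_forecast(probs, a=2.0, eps=0.01):
    """Average in log-odds space, then scale by a (a > 1 extremizes).

    Probabilities are clipped to [eps, 1 - eps] so that a single
    forecast of exactly 0 or 1 cannot produce infinite log-odds.
    """
    clipped = [min(max(p, eps), 1 - eps) for p in probs]
    mean_log_odds = sum(logit(p) for p in clipped) / len(clipped)
    return inv_logit(a * mean_log_odds)


team = [0.6, 0.7, 0.8]
print(round(mean_forecast(team), 3))        # about 0.7
print(round(extremized_forecast(team), 3))  # roughly 0.85, further from 50%
```

The rationale usually given for extremizing is that when several partially independent forecasters all lean the same way, the group collectively holds more evidence than any one member, so the pooled forecast can justify more confidence than the plain average expresses. Whether that holds for a particular team is an empirical question that scoring against outcomes can answer.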

These findings helped popularise the idea of “superforecasters”: individuals who consistently produced unusually accurate and well-calibrated probability estimates across many unrelated topics. Importantly, research suggested that their advantage reflected disciplined reasoning habits, active updating and careful calibration more than specialised domain expertise alone. [Wharton Faculty Platform+2journals.sagepub.com]

What ordinary teams can borrow from the model

Most organisations do not need a formal forecasting tournament to benefit from its principles. The core learning mechanisms are surprisingly portable.

A practical version can include:

  1. Recording numerical probabilities before important decisions.
  2. Defining objective resolution criteria in advance.
  3. Reviewing outcomes after enough cases have accumulated.
  4. Scoring forecasts consistently rather than relying on memory.
  5. Discussing why forecasts changed, not simply whether they were correct.
  6. Encouraging revision when evidence changes instead of treating updates as admissions of failure.
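The steps above can be sketched as a small forecast log. This is a minimal illustration rather than a prescribed tool: the class name `Forecast`, its fields and the example question are all hypothetical. It records the resolution criteria up front, keeps every revision with its reason, and scores the forecast once the question resolves. Scoring only the latest probability is a simplifying assumption; a team could equally score every revision over the life of the question.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Forecast:
    question: str
    resolution_criteria: str          # defined before any probability is given
    history: list = field(default_factory=list)  # (date, probability, reason)
    outcome: Optional[bool] = None    # None until the question resolves

    def update(self, when, probability, reason):
        """Record a new probability together with why it changed."""
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        if self.outcome is not None:
            raise ValueError("question already resolved")
        self.history.append((when, probability, reason))

    def resolve(self, outcome):
        """Close the question with the observed outcome."""
        if not self.history:
            raise ValueError("no forecast was recorded before resolution")
        self.outcome = outcome

    def brier(self):
        """Squared error of the latest probability, or None if unresolved."""
        if self.outcome is None:
            return None
        latest_probability = self.history[-1][1]
        return (latest_probability - float(self.outcome)) ** 2


release = Forecast(
    question="Will the release ship by 30 June?",
    resolution_criteria="Release is tagged in the repository on or before 30 June.",
)
release.update(date(2024, 4, 1), 0.70, "Initial estimate from the project plan")
release.update(date(2024, 5, 1), 0.55, "Two key features slipped a sprint")
release.resolve(False)
print(round(release.brier(), 4))  # squared error of the final 55% forecast
```

Keeping the reason alongside each revision supports the later discussion of why forecasts changed, and refusing updates after resolution guards against forecasts being rewritten in hindsight.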

For example, a product team might estimate the probability that a software release will meet its deadline, a sales team might forecast quarterly revenue ranges, or a management group might estimate the chance that a regulatory approval will arrive within a specified period. After several months, calibration can be assessed by comparing stated probabilities with actual outcomes rather than by relying on subjective impressions.

Equally important is separating the quality of reasoning from the eventual result. A carefully justified 40% forecast that fails may represent better judgement than an unjustified 95% prediction that succeeds by chance. Forecasting tournaments repeatedly reinforce this distinction because participants are evaluated across many predictions instead of memorable anecdotes. [journals.sagepub.com+2Wharton Faculty Platform]
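A short calculation shows why scoring across many predictions separates judgement from luck. The sketch below assumes, purely for illustration, that events of this kind really do occur 40% of the time, and compares the long-run average Brier score (single-outcome form, lower is better) of a forecaster who says 40% with one who says 95%. The function name `expected_brier` and the assumed true rate are hypothetical.

```python
def expected_brier(forecast, true_probability):
    """Long-run average of (forecast - outcome)^2 for a 0/1 outcome.

    With probability true_probability the outcome is 1,
    otherwise it is 0.
    """
    score_if_event = (forecast - 1) ** 2
    score_if_no_event = forecast ** 2
    return (true_probability * score_if_event
            + (1 - true_probability) * score_if_no_event)


# Assumed for illustration: such events happen 40% of the time.
careful = expected_brier(0.40, 0.40)    # 0.4 * 0.36 + 0.6 * 0.16, about 0.24
reckless = expected_brier(0.95, 0.40)   # 0.4 * 0.0025 + 0.6 * 0.9025, about 0.5425

print(round(careful, 4), round(reckless, 4))
```

On any single question the 95% forecaster can look better: a 95% forecast that succeeds scores 0.0025, while a 40% forecast that fails scores 0.16. Averaged over many such questions, however, the overconfident forecaster's score is more than twice as bad under these assumptions.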


Limits and lessons

Forecasting tournaments are not a universal solution. Many important decisions concern unique situations with poorly defined outcomes, limited feedback or very long time horizons. Calibration is also easier for binary questions than for complex strategic choices involving multiple interacting uncertainties.

There are methodological cautions as well. Winning a tournament does not necessarily identify the single “best” forecaster because chance still plays a role in rankings, particularly when many contestants perform at similarly high levels. Researchers have also questioned how easily tournament results transfer to environments with fewer questions, scarce data or specialised domains. [arXiv]

Even so, the central lesson has proved remarkably robust. When people express uncertainty numerically, receive honest feedback, update beliefs in response to evidence and learn from repeated scoring, their confidence becomes better matched to reality. Forecasting tournaments therefore serve as practical laboratories for improving confidence calibration, not because they eliminate uncertainty but because they make uncertainty measurable, discussable and ultimately learnable. [journals.sagepub.com+2goodjudgment.com]


Endnotes

  1. Source: journals.sagepub.com
    Title: Forecasting Tournaments
    Link: https://journals.sagepub.com/doi/10.1177/0963721414534257
    Source snippet

    Tetlock, Barbara A....4 Aug 2014 — Forecasting tournaments are level-playing-field competitions that reveal which individuals, teams, or...

  2. Source: iarpa.gov
    Link: https://www.iarpa.gov/research-programs/ace
    Source snippet

    ACEThe goal of the ACE Program is to dramatically enhance the accuracy, precision, and timeliness of intelligence forecasts for a broad r...

  3. Source: goodjudgment.com
    Link: https://goodjudgment.com/wp-content/uploads/2018/12/jdm16511.pdf
    Source snippet

    The impact of training and practice on judgmental accuracy...by W Chang · 2016 · Cited by 147 — Ad- ditional details on the forecasting...

  4. Source: learnmoore.org
    Link: https://learnmoore.org/papers/Mellers%20et%20al%202014.pdf
    Source snippet

    Mellers et al 2014.pdfby B Mellers · 2014 · Cited by 434 — They were taught strategies for explaining their forecasts to others, offering...

  5. Source: researchgate.net
    Link: https://www.researchgate.net/publication/340292724_Small_steps_to_accuracy_Incremental_belief_updaters_are_better_forecasters
    Source snippet

    ResearchGate(PDF) Small steps to accuracy: Incremental belief updaters...This article explores how real-world forecasters who vary in sk...

  6. Source: pmc.ncbi.nlm.nih.gov
    Title: Compromising improves forecasting
    Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10189590/
    Source snippet

    improves forecasting - PMC - NIHby DN Ferreiro · 2023 · Cited by 4 — We test this by analysing 5 years of data from the Good Judgement Pr...

  7. Source: researchgate.net
    Link: https://www.researchgate.net/publication/277087515_Identifying_and_Cultivating_Superforecasters_as_a_Method_of_Improving_Probabilistic_Predictions
    Source snippet

    (PDF) Identifying and Cultivating Superforecasters as a...25 May 2015 — Effective aggregation mechanisms are central to crowd-forecastin...

    Published: May 2015

  8. Source: goodjudgment.com
    Link: https://goodjudgment.com/resources/the-superforecasters-track-record/
    Source snippet

    The Superforecasters' Track RecordSuperforecasters beat all competing research teams in the IARPA ACE tournament by 35-72%. Good Judgment...

  9. Source: goodjudgment.com
    Link: https://goodjudgment.com/about/the-science-of-superforecasting/
    Source snippet

    The Science Of SuperforecastingGood Judgment research discovered four keys to accurate forecasting: talent-spotting, training, teaming, a...

  10. Source: arxiv.org
    Title: A Prediction Tournament Paradox
    Link: https://arxiv.org/abs/1903.02131
    Source snippet

    A Prediction Tournament ParadoxMarch 5, 2019...

    Published: March 5, 2019

  11. Source: pmc.ncbi.nlm.nih.gov
    Title: The superforecasting hypothesis is challenged under real-life scarcity
    Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC7333631/
    Source snippet

    Superforecasting reality check: Evidence from a small pool of...by I Katsagounos · 2020 · Cited by 24 — The study contributes to the...

  12. Source: iarpa.gov
    Link: https://www.iarpa.gov/
    Source snippet

    Intelligence Advanced Research projects Activity...IARPA invests in research programs to tackle some of the Intelligence Communi...

  13. Source: researchgate.net
    Link: https://www.researchgate.net/figure/Mean-standardized-Brier-scores-for-superforecasters-Supers-and-the-two-comparison_fig1_277087515
    Source snippet

    hich reality is coded as 1 for the event and 0 otherwise), ranging from 0 (...Read more...

  14. Source: researchgate.net
    Link: https://www.researchgate.net/publication/274992096_Forecasting_Tournaments_Tools_for_Increasing_Transparency_and_Improving_the_Quality_of_Debate
    Source snippet

    s on the feasibility of improving judgmental accuracy and on the best methods...Read more...

  15. Source: goodjudgment.com
    Link: https://goodjudgment.com/about/
    Source snippet

    asting questions across the political, economic and social spectrum.Read more...

  16. Source: goodjudgment.com
    Link: https://goodjudgment.com/resources/the-superforecasters-track-record/the-first-championship-season/
    Source snippet

    ontrol group by more than 50%. This is the largest...Read more...

  17. Source: goodjudgment.com
    Title: They can forecast outcomes 300 days prior to resolution better than their
    Link: https://goodjudgment.com/superforecasters-still-creme-de-la-creme-six-years-on/
    Source snippet

    Superforecasters: Still Crème de la Crème Six Years OnSuperforecasters are significantly more accurate than their forecasting peers...

  18. Source: arxiv.org
    Link: https://arxiv.org/pdf/2602.19520
    Source snippet

    Domain-Specific Calibration Dynamics in Prediction Marketsby NA Le · 2026 · Cited by 2 — Tetlock and Gardner [65] demonstrated that struc...

  19. Source: arxiv.org
    Link: https://arxiv.org/html/2602.19520v1
    Source snippet

    Decomposing Crowd Wisdom: Domain-Specific Calibration...23 Feb 2026 — structured forecasting tournaments can identify “superforecasters”...

  20. Source: youtube.com
    Title: Engaging Minds with Philip Tetlock and Barbara Mellers in New York City,
    Link: https://www.youtube.com/watch?v=cLg8AdJG1v8
    Source snippet

    The Good Judgment Project - Know It ALL...

    Published: December 3, 2011

  21. Source: youtube.com
    Title: The Good Judgment Project
    Link: https://www.youtube.com/watch?v=9yT9V-LvWdA
    Source snippet

    VN1 Forecasting Competition: Winning Solutions (AI Forecasting Academy)...

  22. Source: faculty.wharton.upenn.edu
    Title: 2015 superforecasters
    Link: https://faculty.wharton.upenn.edu/wp-content/uploads/2015/07/2015—superforecasters.pdf
    Source snippet

    Wharton Faculty Platform2015---superforecasters.pdf - Wharton Faculty Platformby B Mellers · 2015 · Cited by 323 — Brier scores are the a...

  23. Source: emergentmind.com
    Link: https://www.emergentmind.com/topics/superforecasters
    Source snippet

    Metrics and Methods20 Feb 2026 — They use granular Bayesian reasoning, problem decomposition, and continuous recalibration to refine prob...

  24. Source: Wikipedia
    Title: The Good Judgment Project
    Link: https://en.wikipedia.org/wiki/The_Good_Judgment_Project
    Source snippet

    The Good Judgment ProjectPredictions are scored using Brier scores. The top forecasters in GJP are "reportedly 30% better than intelli...

  25. Source: Wikipedia
    Title: Intelligence Advanced Research Projects Activity
    Link: https://en.wikipedia.org/wiki/Intelligence_Advanced_Research_Projects_Activity
    Source snippet

    Intelligence Advanced Research Projects ActivityIARPA funds academic and industry research across a broad range of technical areas, in...

  26. Source: repository.upenn.edu
    Title: eduexposure to similar vs
    Link: https://repository.upenn.edu/bitstreams/988cd5cf-7bf6-479b-8d66-3cb7e6816168/download
    Source snippet

    diverse perspectives in forecastingby K Chen · 2023 — Forecasting tournaments are competitions in which participants attempt to make the...

  27. Source: repository.upenn.edu
    Link: https://repository.upenn.edu/server/api/core/bitstreams/d182dd97-ce71-4d0e-9393-c44514f78036/content
    Source snippet

    Good Judgment Projectby G Forecasting · Cited by 1 — When scores are calculated to assess how correct the predictions were, there is a fu...

  28. Source: dni.gov
    Link: https://www.dni.gov/index.php/careers/special-programs/iarpa
    Source snippet

    IARPA | Office of the Director of National IntelligenceIARPA is capable of quickly responding to new priorities, emerging challenges, sci...

  29. Source: linkedin.com
    Link: https://www.linkedin.com/company/iarpa-odni

Additional References

  1. Source: andrewclark.co.uk
    Link: https://andrewclark.co.uk/all-media/superforecasting
    Source snippet

    SuperForecastingThe Brier score is a way to measure how good your predictions are. It looks at both calibration (how accurate your predic...

  2. Source: osf.io
    Link: https://osf.io/download/n5czv
    Source snippet

    ACE and HFC forecasting tournaments, the Brier score was the core metric. That score had to be adjusted to respect...Read more...

  3. Source: youtube.com
    Link: https://www.youtube.com/watch?v=dQKFaYofqGE

  4. Source: newsroom.haas.berkeley.edu
    Title: harnessing the wisdom of the crowd to forecast world events
    Link: https://newsroom.haas.berkeley.edu/harnessing-the-wisdom-of-the-crowd-to-forecast-world-events/
    Source snippet

    Wisdom of the Crowd for Forecast Accuracy14 Jun 2017 — Prof. Don Moore found a way to dramatically improve forecast accuracy by training...

  5. Source: forum.effectivealtruism.org
    Title: two directions for research on forecasting and decision
    Link: https://forum.effectivealtruism.org/posts/dsG5SYjhPqnxhystM/two-directions-for-research-on-forecasting-and-decision
    Source snippet

    directions for research on forecasting and decision...11 Mar 2023 — Forecasting tournaments have shown that a set of methods for good ju...

  6. Source: forum.effectivealtruism.org
    Title: evidence on good forecasting practices from the good 1
    Link: https://forum.effectivealtruism.org/posts/W94KjunX3hXAtZvXJ/evidence-on-good-forecasting-practices-from-the-good-1
    Source snippet

    on good forecasting practices from the...15 Feb 2019 — For superforecasters, rounding to the nearest 0.10 produced significantly worse B...

  7. Source: lifeitself.org
    Link: https://lifeitself.org/blog/notes-on-tetlock-and-gardners-superforecasting
    Source snippet

    Superforecasting, Tetlock and Gardner (Notes)Brier score = Sum of square error between prediction probability and actual outcome (e.g...

  8. Source: arbresearch.com
    Link: https://arbresearch.com/files/comparing_forecasters.pdf
    Source snippet

    ment is 0.52 (SD: 0.11).” No better than predicting 50% on all...Read more...

  9. Source: reddit.com
    Link: https://www.reddit.com/r/ObscurePatentDangers/comments/1mwm9p4/biometric_recognition_and_identification_at/
    Source snippet

    veloped by IARPA, aims to enhance the U.S. Intelligence Community's...

  10. Source: commoncog.com
    Title: how do you evaluate your own predictions
    Link: https://commoncog.com/how-do-you-evaluate-your-own-predictions/
    Source snippet

    ?17 Dec 2019 — This post provides a comprehensive summary of the technique that Tetlock and Gardner presents in Superforecasting.Read more...
