When the Metric Is the Weak Link

Introduction

Weakest-link thinking is not only about checking facts. It is also about checking whether the evidence actually measures the thing you care about. A conclusion can appear rigorous because it rests on precise numbers, yet still be weak if those numbers are poor proxies for the underlying question. This is a measurement assumption: the often-unstated belief that a chosen metric faithfully represents reality.

The danger is that numbers create confidence. Dashboards, percentages and rankings can make an argument feel objective even when the metric captures only what is easy to count rather than what matters. Before asking whether a result is statistically impressive, ask a simpler question: if this metric moved, would the real-world outcome we care about necessarily move with it? That question often reveals the weakest link.

When numbers quietly redefine the problem

Metrics do more than record reality. They also define what success looks like.

Suppose a company claims its customer support has improved because average call time has fallen by 30%. That statistic assumes that shorter calls represent better service. Yet customers may simply be ending calls without resolving their problems. The metric has silently changed the question from “Did customers receive effective help?” to “How quickly were calls completed?”

Measurement experts describe this as a problem of construct validity: whether a measure genuinely represents the concept it claims to measure. A measure can be reliable, producing consistent numbers, while still lacking validity because it captures the wrong construct. [cur.org]

Weakest-link thinking therefore asks:

What exactly is this metric measuring?
What important parts of the real objective are missing?
Could the number improve while the real outcome stays the same or even worsens?

If the answer to the last question is yes, the metric itself becomes the fragile assumption supporting the conclusion.
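The call-centre example can be made concrete with a minimal sketch. The call records below are invented for illustration, and the helper names (`average_duration`, `resolution_rate`, `proxy_diverges`) are hypothetical rather than taken from any real support system. The point is only that the headline metric and the real objective are computed separately, so a divergence between them becomes visible.

```python
from statistics import mean

# Hypothetical call records: (duration in minutes, was the problem resolved?)
calls_before = [(10, True), (12, True), (8, True), (10, False)]
calls_after = [(7, True), (6, False), (8, False), (7, True)]


def average_duration(calls):
    """Headline metric: mean call length in minutes."""
    return mean(duration for duration, _ in calls)


def resolution_rate(calls):
    """Closer to the real objective: share of calls that resolved the problem."""
    return sum(1 for _, resolved in calls if resolved) / len(calls)


def proxy_diverges(before, after):
    """True when calls became shorter while fewer problems were resolved."""
    faster = average_duration(after) < average_duration(before)
    worse = resolution_rate(after) < resolution_rate(before)
    return faster and worse


print(average_duration(calls_before), average_duration(calls_after))
print(resolution_rate(calls_before), resolution_rate(calls_after))
print(proxy_diverges(calls_before, calls_after))
```

In this invented data, average call time falls from 10 to 7 minutes, the 30% improvement from the example, while the share of resolved calls falls from 75% to 50%. Resolution rate is itself only a proxy, so a check like this narrows the gap between metric and objective rather than closing it.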

How metrics smuggle in definitions

Every metric contains hidden choices about what counts and what does not.

For example:

Employee productivity may be defined as tasks completed, assuming all tasks have similar value.
School quality may be defined through examination scores, assuming test performance captures learning.
Scientific impact may be defined through citation counts, assuming citations reflect research quality.
Customer engagement may be defined through app logins, assuming opening an application reflects meaningful use.

These are not purely technical decisions. They embed definitions about success.

Once those definitions disappear into dashboards, people often stop questioning them. Debate shifts from “Are we measuring the right thing?” to “Why did the score change?”

Weakest-link thinking deliberately reverses that order by examining the measurement before interpreting the result.

Checking whether the measure matches the decision

A useful test is to compare the decision being made with the information the metric actually provides.

| Decision | Metric | Hidden assumption |
| --- | --- | --- |
| Should we keep this employee? | Annual performance rating | Ratings predict future contribution. |
| Should we launch this feature? | Number of clicks | Clicks reflect user value rather than curiosity or confusion. |
| Is customer loyalty improving? | Monthly active users | Returning users represent satisfied users. |
| Is treatment effective? | Short-term symptom score | The score reflects meaningful long-term health. |

The more important the decision, the stronger the evidence needed that the metric corresponds to the underlying objective.

Measurement specialists emphasise gathering multiple forms of validity evidence rather than assuming that a familiar measure automatically represents the intended concept. [Psicothema]

Common failures in churn, hiring and performance claims

Churn: activity is not loyalty

Many businesses monitor customer churn using behavioural measures such as logins, purchases or subscription renewals.

Each is useful, but each captures something different.

A customer might:

log in because cancelling is difficult;
renew automatically despite intending to leave;
reduce spending while technically remaining a customer.

If management concludes that “customer loyalty is improving” solely because monthly churn fell, the weakest assumption is that lower measured churn represents genuine customer commitment rather than contractual friction or inertia.

Supporting evidence might include customer satisfaction, repeat voluntary purchases, referral rates and qualitative feedback instead of relying on one behavioural proxy.
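A minimal sketch of that gap, under stated assumptions: the `Customer` record and its `wants_to_leave` field are hypothetical (for instance, an answer from an exit-intent survey), and the five customers are invented. It compares measured churn with the share of customers who stayed and did not say they want to leave.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    retained: bool        # still subscribed at the end of the period
    wants_to_leave: bool  # hypothetical survey answer, assumed available


# Invented data for illustration only.
customers = [
    Customer(retained=True, wants_to_leave=False),
    Customer(retained=True, wants_to_leave=True),   # auto-renewed, intends to go
    Customer(retained=True, wants_to_leave=True),   # cancelling is too difficult
    Customer(retained=True, wants_to_leave=False),
    Customer(retained=False, wants_to_leave=True),
]


def measured_churn(customers):
    """Behavioural proxy: share of customers who actually left."""
    return sum(not c.retained for c in customers) / len(customers)


def committed_share(customers):
    """Share who stayed and did not report wanting to leave."""
    committed = [c for c in customers if c.retained and not c.wants_to_leave]
    return len(committed) / len(customers)


print(measured_churn(customers))   # share that left
print(committed_share(customers))  # share that stayed willingly
```

Here measured churn is 20%, which looks like 80% loyalty, yet only 40% of customers stayed without wanting to leave. The survey field is one possible complementary signal; referral rates or voluntary repeat purchases could play the same role.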

Hiring: interview performance is not job performance

Organisations frequently infer future success from interview scores.

That inference assumes interview behaviour reflects the skills required in the role.

For some jobs this relationship is reasonably strong, particularly when interviews are structured and closely tied to job requirements. For others, especially when interviews reward confidence, fluency or personal similarity, interview scores may exaggerate future performance.

The weak link is rarely the arithmetic used to combine interviewer ratings. It is the assumption that interview success measures future effectiveness.
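One way to test that assumption, sketched here with invented numbers, is to compare interview scores with later performance ratings for the same people. The two data sets and the labels "structured" and "unstructured" are hypothetical, and five data points are far too few for a real conclusion; the sketch only shows the shape of the check.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented interview scores and later performance ratings (1-5 scale).
interview_scores = [1, 2, 3, 4, 5]
performance_structured = [2, 2, 3, 4, 5]    # hypothetical structured interviews
performance_unstructured = [3, 5, 2, 4, 3]  # hypothetical unstructured interviews

r_structured = pearson(interview_scores, performance_structured)
r_unstructured = pearson(interview_scores, performance_unstructured)

print(round(r_structured, 2), round(r_unstructured, 2))
```

A correlation near zero, as in the second invented data set, would suggest that the interview score carries little information about later performance, however carefully the interviewer ratings were averaged.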

Performance: what gets rewarded changes behaviour

Performance systems often reward measurable outputs because they are easier to track.

Examples include:

tickets closed;
sales calls completed;
papers published;
patients seen;
software features shipped.

Employees naturally optimise whatever receives attention.

If quality, creativity or long-term improvement are omitted from measurement, they may decline while performance metrics continue to rise.

This illustrates a broader pattern described by Goodhart’s Law: when a measure becomes the target itself, its usefulness as a measure tends to deteriorate because people adapt their behaviour to maximise the metric rather than the underlying objective. [PMC, Splunk]

Why impressive statistics can still support weak conclusions

A sophisticated statistical analysis cannot repair an invalid measure.

Researchers sometimes distinguish between questions such as:

Did we estimate the number accurately?
Did we estimate the right number?

The second question is often more important.

A survey might estimate customer satisfaction with remarkable precision, but if respondents interpret the questions differently from what researchers intended, highly accurate calculations merely produce highly accurate measurements of the wrong thing. Measurement validity research repeatedly stresses that trustworthy conclusions depend on both accurate measurement and appropriate interpretation. [cur.org]

Weakest-link thinking therefore resists being impressed by precision alone. Confidence intervals, significance tests and decimal places matter only after establishing that the metric itself represents the intended concept.
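The difference between a precise estimate and a valid one can be sketched numerically. Everything below is an assumption made for illustration: the 400 survey responses are invented, and `true_satisfaction` is treated as known only so the gap can be shown, which is never the case in practice.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical: 400 answers (1-5 scale) to a question meant to capture
# satisfaction, but which respondents read as "was the agent polite?".
measured = [4, 5] * 200

# Assumed for illustration only; in real work this value is unobserved.
true_satisfaction = 3.0


def confidence_interval(sample, z=1.96):
    """Approximate 95% confidence interval for the sample mean."""
    centre = mean(sample)
    half_width = z * stdev(sample) / sqrt(len(sample))
    return centre - half_width, centre + half_width


low, high = confidence_interval(measured)
interval_is_narrow = (high - low) < 0.2
interval_misses_target = not (low <= true_satisfaction <= high)

print(round(low, 2), round(high, 2))
print(interval_is_narrow, interval_misses_target)
```

The interval is tight around 4.5 because the sample is large and consistent, yet it sits far from the assumed true value. Collecting more responses would narrow the interval further without moving it any closer to the concept of interest.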

Practical questions that expose weak metrics

Before accepting a metric as evidence, ask:

What real-world concept is this supposed to represent?
Could this number improve while the underlying outcome stays unchanged?
What incentives does this metric create?
What important outcome is left unmeasured?
Would an independent observer describe success the same way as this metric does?
Does another measure point in a different direction?

These questions focus attention on the assumption connecting the metric to the conclusion rather than the calculation itself.
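The last question in the list, whether another measure points in a different direction, lends itself to a simple check. The sketch below assumes a hypothetical `readings` dictionary of before and after values, with a flag saying whether higher is better for each metric; the metric names and numbers are invented.

```python
def direction(before, after, higher_is_better=True):
    """Return 1 if the metric improved, -1 if it worsened, 0 if unchanged."""
    change = after - before
    if change == 0:
        return 0
    improved = (change > 0) == higher_is_better
    return 1 if improved else -1


def disagreeing(readings, headline):
    """Names of metrics whose direction differs from the headline metric's."""
    headline_direction = direction(*readings[headline])
    return [
        name
        for name, values in readings.items()
        if name != headline and direction(*values) != headline_direction
    ]


# Invented values: (before, after, higher_is_better)
readings = {
    "avg_call_minutes": (10, 7, False),
    "resolution_rate": (0.75, 0.5, True),
    "repeat_contact_rate": (0.1, 0.25, False),
}

print(disagreeing(readings, "avg_call_minutes"))
```

With these invented values the headline metric improves while both complementary measures worsen, which is a prompt to examine the measurement before interpreting the result. Agreement between metrics does not prove validity, since several proxies can share the same blind spot.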

The key takeaway

Measurement assumptions often become invisible because numbers appear objective. Yet every metric is a model of reality rather than reality itself. The weakest link is frequently not the data but the unexamined belief that the chosen measure captures what truly matters.

The most robust conclusions rely on metrics that have demonstrated validity, are interpreted alongside complementary evidence, and remain tied to the real decision instead of becoming the decision’s substitute. When a conclusion looks unusually strong because it is backed by precise numbers, one of the first questions should be whether those numbers are measuring the right thing at all. [PMC]

Endnotes

Source: cur.org
Title: So, You Want to Measure Something?
Author: TT Tuma
Link: https://www.cur.org/journal-article/so-you-want-to-measure-something-an-introduction-to-measurement-validity-in-educational-research/

Source: pmc.ncbi.nlm.nih.gov
Title: Constructing Validity: New Developments in Creating
Author: LA Clark (2019)
Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC6754793/

Source: psicothema.com
Title: Validity evidence based on test content
Author: S Sireci (2014)
Published: January 15, 2014
Link: https://www.psicothema.com/pdf/4167.pdf

Source: pmc.ncbi.nlm.nih.gov
Title: Reliance on metrics is a fundamental challenge for AI
Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC9122957/

Source: splunk.com
Title: What is Goodhart's Law?
Published: 6 Mar 2024
Link: https://www.splunk.com/en_us/blog/learn/goodharts-law.html

Source: Wikipedia
Title: Goodhart’s law
Link: https://en.wikipedia.org/wiki/Goodhart%27s_law

Further Reading

How to Measure Anything

The Tyranny of Metrics

Thinking, Fast and Slow

The Signal and the Noise
