Why it matters

Canonical methodology failures, briefly explained · companion reference to Paper Review

When Paper Review flags an issue, it tags the failure mode in canonical terms. The pages below explain each tag in 200-400 words — what it is, why a journal reviewer cares, and how to fix it. These aren't comprehensive treatments; they're the briefing you'd give a smart co-author who hadn't heard the term.

Statistics & design

p-hacking — selectively reporting whichever analysis returned p < 0.05.
multiple comparisons — across 20 tests at α = 0.05, about 1 is expected to come out "significant" by chance even when no real effect exists.
underpowered sample — too few subjects to detect the effect you claim.
HARKing — Hypothesizing After the Results are Known: presenting a post hoc hypothesis as if it had been stated in advance.
garden of forking paths — many analysis choices, only one reported.
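
The arithmetic behind the multiple comparisons entry can be checked in a few lines. This is a minimal sketch, assuming 20 independent tests in which every null hypothesis is true and a threshold of α = 0.05; the function names are illustrative and not part of Paper Review.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of 'significant' results when every null is true."""
    return n_tests * alpha


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n = 20  # assumed number of tests, matching the example in the list above
    print(f"expected false positives: {expected_false_positives(n):.2f}")
    print(f"chance of at least one:   {familywise_error_rate(n):.3f}")
    print(f"Bonferroni per-test alpha: {bonferroni_threshold(n):.4f}")
```

Under these assumptions the expected count of false positives is 1 and the chance of at least one is roughly 64%. Dividing α by the number of tests (the Bonferroni correction) is one simple, conservative way to bring the family-wise rate back under 5%.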

Machine learning

test-train contamination — test-set examples leaked into the training data.
hyperparameter asymmetry — tuning the proposed method but not baselines.
cherry-picked seeds — single-run reporting that hides variance.
missing ablations — claims without ablating each architectural choice.
LLM self-judge bias — using a judge from the same model family as the system being evaluated.
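
One basic check for test-train contamination is to look for test examples that also appear in the training data. This is a minimal sketch, assuming both splits are lists of strings; `normalize` and `find_leaked_examples` are hypothetical helper names, and exact matching after normalisation only catches verbatim duplicates, not paraphrases or near-duplicates.

```python
from typing import Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not hide a match."""
    return " ".join(text.lower().split())


def find_leaked_examples(train: Iterable[str], test: Iterable[str]) -> List[str]:
    """Return the test examples whose normalised text also appears in the training split."""
    train_keys = {normalize(example) for example in train}
    return [example for example in test if normalize(example) in train_keys]


if __name__ == "__main__":
    # Toy data for illustration only.
    train_split = ["The cat sat.", "Hello world"]
    test_split = ["hello   WORLD", "A new sentence"]
    print(find_leaked_examples(train_split, test_split))
```

An empty result from a check like this does not prove the splits are clean; it only rules out the simplest form of leakage.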

Sampling & generalisability

WEIRD samples — generalising from Western, Educated, Industrialised, Rich, Democratic samples to all humans.
blinding not evidenced — claiming blinding without methods support.

Citations & integrity

hallucinated citation — references that look real but aren't.
figure presentation — broken axes, missing error bars, color-only encoding.
trial not registered — clinical claim without a registry entry.

Get a Paper Review — $9