Introduction: The Crisis in Statistical Science
Statistical science and its applications face a deep crisis. Methods once considered robust are revealing severe practical weaknesses, contributing to a replication crisis across natural and social sciences. This work proposes that examining the foundations of inference and model-building offers a path forward.
The Method of Multiple Working Hypotheses
Thomas Chrowder Chamberlin's 19th-century method advocates generating multiple, often contradictory, potential explanations for any phenomenon under investigation. This guards against unconscious bias toward a single preferred hypothesis and enables multi-causal analysis.
Strong Inference Framework
John R. Platt's "Strong Inference" builds on Chamberlin's work, demanding that scientists:
- Derive alternate hypotheses
- Run experiments based on those hypotheses
- Use results to exclude or refine hypotheses
The Akaikean Framework
George E. P. Box's famous observation that "all models are wrong, but some are useful" guides this approach. The Akaike Information Criterion (AIC) provides a tool for evaluating model utility through predictive accuracy while penalizing complexity to prevent overfitting.
The Six-Step Akaikean Inference Framework
1. Hypothesizing
Generate multiple working hypotheses using Chamberlin's method to prevent bias toward single explanations.
2. Model-Building
Translate verbal hypotheses into mathematical models through creative judgment and domain expertise.
3. Fitting
Adjust model parameters to observed data while maintaining theoretical foundations.
4. Evaluating
Measure model fit and complexity using AIC to balance predictive accuracy with parsimony.
5. Choosing
Select best models using appropriate statistical paradigms as inference engines.
6. Concluding
Interpret mathematical results within subject-matter context using plausible reasoning and scientific judgment.
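The six steps can be miniaturized in code. The following is a minimal sketch, not a prescribed implementation: two working hypotheses (no trend vs. a linear trend) are translated into models, fitted to a small synthetic dataset by least squares, evaluated with AIC under Gaussian errors, and the lower-scoring model is chosen. The data and names are invented for illustration.

```python
import math

# Synthetic observations (hypothetical): a response measured at five levels.
x = [1, 2, 3, 4, 5]
y = [1.2, 1.9, 3.1, 3.9, 5.2]
n = len(y)

def gaussian_log_lik(residuals):
    """Maximized Gaussian log-likelihood, plugging in the MLE of the variance."""
    sigma2 = sum(r * r for r in residuals) / len(residuals)
    return -0.5 * len(residuals) * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_lik, k):
    """AIC = 2k - 2 log L."""
    return 2 * k - 2 * log_lik

# Hypothesis 1: no trend (parameters: mean, variance -> k = 2).
mean_y = sum(y) / n
aic_constant = aic(gaussian_log_lik([yi - mean_y for yi in y]), k=2)

# Hypothesis 2: linear trend (parameters: slope, intercept, variance -> k = 3).
mean_x = sum(x) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
aic_linear = aic(gaussian_log_lik([yi - (intercept + slope * xi)
                                   for xi, yi in zip(x, y)]), k=3)

# Choosing: the model with the lower AIC is preferred.
print("constant:", round(aic_constant, 2), "linear:", round(aic_linear, 2))
```

The concluding step, interpreting why the trend model wins and what the slope means in subject-matter terms, remains a matter of judgment that no criterion automates.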
Information-Theoretic Background
The Akaike Information Criterion emerged from a rich history of information theory, spanning from thermodynamics to cryptography to modern statistical science.
Ludwig Boltzmann developed entropy as a measure of disorder in physical systems, laying groundwork for information theory.
Leo Szilard extended Boltzmann's entropy to measure information, identifying the elementary binary choice with an entropy of S = k log 2.
Ralph Hartley developed H = n log s for telecommunications, focusing on practical transmission problems.
Alan Turing and Jack Good used information measures for codebreaking at Bletchley Park.
Claude Shannon published "A Mathematical Theory of Communication," establishing information entropy H(X) = -Σ P(xi) log P(xi).
Solomon Kullback and Richard Leibler developed a measure of distance between statistical distributions.
Hirotugu Akaike published AIC = 2k - 2 log L, combining model fit with complexity penalties.
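Two quantities in this timeline are short enough to compute directly. A minimal sketch of Shannon entropy and the Kullback-Leibler divergence, both in bits; the example distributions are arbitrary:

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

fair = [0.5, 0.5]     # one fair binary choice: exactly 1 bit
biased = [0.9, 0.1]   # a predictable source carries less entropy

print(entropy(fair), entropy(biased))
print(kl_divergence(biased, fair))  # "distance" of biased from fair
```

KL divergence is the quantity AIC estimates: the expected information lost when a candidate model is used in place of the true data-generating distribution.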
The AIC Formula
AIC = 2k - 2 log L
Where k is the number of estimated parameters and L is the maximized likelihood of the model given the observed data.
This balances model fit (log-likelihood) against complexity (parameter count), implementing Occam's Razor in statistical model selection.
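Once each candidate model's maximized log-likelihood is known, the formula applies directly. A small sketch using made-up log-likelihoods and parameter counts; the Akaike-weight step, a standard follow-on summary rather than part of the formula itself, converts AIC differences into relative support:

```python
import math

def aic(log_lik, k):
    """AIC = 2k - 2 log L."""
    return 2 * k - 2 * log_lik

def akaike_weights(aics):
    """Relative support for each model, from its AIC difference to the best."""
    best = min(aics)
    rel = [math.exp(-(a - best) / 2) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical candidates: (max log-likelihood, parameter count).
# Fit improves slightly with extra parameters, but the penalty grows faster.
scores = [aic(-45.2, k=2), aic(-44.9, k=3), aic(-44.8, k=5)]
weights = akaike_weights(scores)
print(scores)   # lowest AIC wins: the 2-parameter model
print(weights)
```

Note how the 5-parameter model fits best in raw likelihood yet ranks last: the complexity penalty is doing exactly the Occam's Razor work described above.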
Philosophies of Science
Understanding how to interpret AIC requires grappling with fundamental questions about the nature and purpose of science itself.
Four Major Philosophical Positions
1. Realism
Science discovers truths about the world. Quantitative tools provide evidence toward the truth of statements or functionally good explanations.
2. Empiricism
Truth about the world beyond observation is unattainable. Science aims instead for empirical adequacy in explaining observable phenomena, tested through experiment and analysis.
3. Instrumentalism
Science generates good predictions. Value comes from predictive accuracy and conceptual advancement, not truth claims.
4. Anarchism (Feyerabend)
No single method defines science. Progress requires skepticism toward all guidelines and paradigms.
Paradigms and Revolutions
Thomas Kuhn's "The Structure of Scientific Revolutions" describes science as alternating between normal activity within a paradigm and revolutionary paradigm shifts. Karl Popper countered that scientists should actively seek to falsify accepted theories.
AIC and Philosophical Frameworks
The Akaike Information Criterion can be viewed as instrumentalist (seeking good prediction models), but it can be used within various philosophical frameworks. The key insight is that statistical tools themselves need not be tied to single philosophies.
Models and Meaning
Moving from philosophical abstraction to practical application requires understanding how scientific hypotheses become mathematical models.
From Subject Matter to Statistical Models
This translation involves professional judgment and field-specific expertise. Researchers must convert verbal descriptions and domain terminology into formal mathematical representations.
The Challenge of Uncertainty
Models must represent fundamental uncertainty. Different approaches include:
Approaches to Modeling Uncertainty
- Precise Probability: Traditional statistical models with point estimates
- Interval Approaches: Dempster-Shafer evidence theory
- Fuzzy Sets: Non-binary membership functions
- Imprecise Probability: Sets of probability distributions
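The last approach is easy to illustrate. A minimal sketch of the imprecise-probability idea: instead of committing to one distribution, carry a set of plausible distributions and report the lower and upper probability an event receives across the set. The distributions here are hypothetical, invented for illustration only.

```python
# Three candidate distributions an analyst considers plausible (hypothetical).
candidate_dists = [
    {"rain": 0.2, "dry": 0.8},
    {"rain": 0.35, "dry": 0.65},
    {"rain": 0.5, "dry": 0.5},
]

def probability_bounds(event, dists):
    """Lower and upper probability of an event over a set of distributions."""
    values = [d[event] for d in dists]
    return min(values), max(values)

lower, upper = probability_bounds("rain", candidate_dists)
print(f"P(rain) lies in [{lower}, {upper}]")
```

A precise-probability model would collapse this interval to a single number; the width of the interval is itself a representation of how much the analyst does not know.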
Realism vs. Instrumentalism in Modeling
Two contrasting approaches emerge:
Milton Friedman (Instrumentalist): Model assumptions need not be true if predictions are accurate. Focus on outputs, not premises.
Herbert Simon (Realist): Models must follow from empirically valid composition laws. Premises and their connections must be true.
Means of Inference
Statistical inference fundamentally deals with uncertainty and unknowns. The choice of statistical paradigm shapes how we interpret evidence and reach conclusions.
The Nature of Uncertainty
Different statistical schools offer distinct answers to fundamental questions about uncertainty:
Bayesian
Uncertainty reflects subjective beliefs. Procedures provide support for/against hypotheses based on prior knowledge.
Frequentist
Uncertainty emerges from long-run sampling. Validity comes from repeated random sampling from populations.
Likelihoodist
Focus on likelihood functions. Information relevant to models is contained in the likelihood.
Information-Theoretic
Use AIC and similar criteria to select the model that best balances goodness of fit with parsimony.
The Inference Process
Inference proceeds through interconnected steps:
Six Steps of Statistical Inference
- Hypothesizing: Develop theories using multiple working hypotheses
- Model-building: Translate theories into mathematics
- Fitting: Adjust parameters to data
- Evaluating: Measure fit and complexity with AIC
- Choosing: Apply statistical paradigms as "inference engines"
- Concluding: Interpret mathematical results in subject-specific terms
Engines of Inference
Statistical paradigms function like engines - taking models as inputs and producing interpretable outputs. Just as mechanical engines combine simple machines into complex systems, statistical inference combines basic calculations into comprehensive analytical frameworks.
Ends of Inference
Moving from statistical models to scientific conclusions requires bridging the gap between mathematical results and domain-specific meaning.
The Translation Challenge
Consider a botanist studying water temperature and grass growth. After collecting data and finding that 72°F ± 1°F optimizes growth with 95% confidence, what does this mean scientifically? The mathematical result must be interpreted within botanical knowledge and practical constraints.
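The botanist's analysis can be miniaturized. Below, synthetic growth data, invented to peak near 72°F, are fitted with a quadratic response curve and the estimated optimum is read off the parabola's vertex; a full analysis would add the confidence interval. Everything here is an illustrative assumption, not real botanical data.

```python
import numpy as np

# Hypothetical measurements: grass growth (cm) at several water temps (°F).
temps  = np.array([60.0, 64.0, 68.0, 72.0, 76.0, 80.0, 84.0])
growth = np.array([2.0, 3.1, 3.9, 4.2, 3.8, 3.0, 1.9])

# Model the hypothesis "growth peaks at some optimum" as a quadratic curve.
a, b, c = np.polyfit(temps, growth, deg=2)

# The fitted optimum is the vertex of the parabola: -b / (2a).
optimum = -b / (2 * a)
print(f"estimated optimal temperature: {optimum:.1f} °F")
```

The print statement is where the mathematics stops; whether 72°F is achievable in a greenhouse, and whether the quadratic form was botanically sensible in the first place, are judgments the model cannot make.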
Historical Foundations
Blaise Pascal and Pierre de Fermat developed probability for gambling: if a game ends early, how should winnings be distributed based on each player's chances of victory?
Thomas Bayes created inverse probability for updating beliefs about unobserved events based on new evidence.
Ronald Fisher developed "objective" significance testing, contrasting it with "subjective" Bayesian approaches.
Jerzy Neyman and Egon Pearson advocated deciding between competing hypotheses using likelihood ratios, rather than testing a lone null hypothesis.
The Statistics Wars
Despite appearing as a simple Bayesian vs. Frequentist dichotomy, statistical inference involves multiple competing paradigms:
Major Statistical Paradigms
- Bayesian: Update beliefs using prior information and new evidence
- Frequentist: Test hypotheses against null using long-run sampling properties
- Likelihoodist: Base inference on likelihood functions alone
- Hybrid approaches: Combine elements from multiple paradigms
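The paradigms differ in interpretation, but all of them can be run on the same data. A sketch that pushes one invented dataset, 60 heads in 100 coin flips, through four inference engines; the uniform prior and all numbers are illustrative assumptions:

```python
import math

n, heads = 100, 60  # hypothetical data: 60 heads in 100 flips

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Frequentist: exact two-sided p-value against the null p = 0.5.
p_value = sum(binom_pmf(k, n, 0.5) for k in range(n + 1)
              if binom_pmf(k, n, 0.5) <= binom_pmf(heads, n, 0.5))

# Likelihoodist: likelihood ratio of the MLE (p = 0.6) to the null (p = 0.5).
lik_ratio = binom_pmf(heads, n, 0.6) / binom_pmf(heads, n, 0.5)

# Bayesian: a uniform Beta(1, 1) prior yields a Beta(1 + heads, 1 + tails)
# posterior; its mean is one point summary of updated belief.
posterior_mean = (1 + heads) / (2 + n)

# Information-theoretic: AIC for "p free" (k = 1) vs. "p fixed at 0.5" (k = 0).
aic_free  = 2 * 1 - 2 * math.log(binom_pmf(heads, n, 0.6))
aic_fixed = 2 * 0 - 2 * math.log(binom_pmf(heads, n, 0.5))

print(p_value, lik_ratio, posterior_mean, aic_free < aic_fixed)
```

Each engine emits a different kind of output, a p-value, a ratio of support, a posterior belief, a model ranking, and the engines can even disagree on borderline data; reconciling them is a matter of judgment, not computation.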
Moving Beyond Statistical Wars
This thesis advocates using AIC for model evaluation while remaining agnostic about inference paradigms. Researchers should:
- Use multiple working hypotheses
- Evaluate models with AIC
- Select inference engines based on problem requirements
- Judge results on logical justification and plausibility
The final step - from model selection to scientific conclusions - requires creativity, judgment, and plausibility assessments that cannot be fully systematized.
Conclusion: A Unified Approach
The Akaikean Framework
This approach is philosophically Akaikean (evaluating models on fit and complexity) but not methodologically restrictive (using only AIC-type statistics). The crisis in inferential statistics is procedural, not fundamental to the tools themselves.
Beyond Statistical Rituals
As Gerd Gigerenzer argues, we must move beyond "statistical rituals" toward genuine acts of judgment, creativity, and plausible reasoning. Statistical science requires logical reasoning that cannot be judged by any single paradigm alone.
Key Principles
- Epistemic Humility: Recognize that this is one approach among many
- Creative Judgment: Emphasize researcher creativity in hypothesizing, testing, and concluding
- Plausible Reasoning: Judge inferences on logical appropriateness, not paradigm purity
- Systematic Process: Follow structured steps while maintaining flexibility
The Role of Creativity
Hirotugu Akaike advocated for plausibility "for the evaluation of verbally defined models." The translation from hypothesis to statistical modeling is creative, as is the movement from model selection to inferential conclusions.
Future Directions
This framework applies beyond traditional statistics to machine learning, bridging computational pattern recognition with robust statistical inference. While not claiming to solve the problem of inference entirely, it offers a practical, unified approach to scientific reasoning under uncertainty.
The method provides a foundation for empirical sciences while acknowledging the irreducible role of human judgment in moving from evidence to conclusions. In Isaac Newton's spirit of "standing on the shoulders of giants," this work builds on centuries of developments in probability, information theory, and the philosophy of science.