Introduction: The Crisis in Statistical Science
Statistical science and its applications face a deep crisis. Methods once considered robust are revealing severe practical weaknesses, contributing to a replication crisis across natural and social sciences. This work proposes that examining the foundations of inference and model-building offers a path forward.
The Method of Multiple Working Hypotheses
Thomas Chrowder Chamberlin's 19th-century method advocates generating multiple, often contradictory, potential explanations for any phenomenon under investigation. This guards against unconscious bias toward a single preferred hypothesis and enables multi-causal analysis.
Strong Inference Framework
John R. Platt's "Strong Inference" builds on Chamberlin's work, demanding that scientists:
- Derive alternate hypotheses
- Run experiments based on those hypotheses
- Use results to exclude or refine hypotheses
The Akaikean Framework
George E. P. Box's famous observation that "all models are wrong, but some are useful" guides this approach. The Akaike Information Criterion (AIC) provides a tool for evaluating model utility through predictive accuracy while penalizing complexity to prevent overfitting.
The Six-Step Akaikean Inference Framework
1. Hypothesizing
Generate multiple working hypotheses using Chamberlin's method to prevent bias toward single explanations.
2. Model-Building
Translate verbal hypotheses into mathematical models through creative judgment and domain expertise.
3. Fitting
Adjust model parameters to observed data while maintaining theoretical foundations.
4. Evaluating
Measure model fit and complexity using AIC to balance predictive accuracy with parsimony.
5. Choosing
Select best models using appropriate statistical paradigms as inference engines.
6. Concluding
Interpret mathematical results within subject-matter context using plausible reasoning and scientific judgment.
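The six steps can be miniaturized in code. The following is a minimal sketch, not a prescribed implementation: two working hypotheses (no trend vs. a linear trend) are translated into models, fitted to a small synthetic dataset by least squares, evaluated with AIC under Gaussian errors, and the lower-scoring model is chosen. The data and names are invented for illustration.

```python
import math

# Synthetic observations (hypothetical): a response measured at five levels.
x = [1, 2, 3, 4, 5]
y = [1.2, 1.9, 3.1, 3.9, 5.2]
n = len(y)

def gaussian_log_lik(residuals):
    """Maximized Gaussian log-likelihood, plugging in the MLE of the variance."""
    sigma2 = sum(r * r for r in residuals) / len(residuals)
    return -0.5 * len(residuals) * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_lik, k):
    """AIC = 2k - 2 log L."""
    return 2 * k - 2 * log_lik

# Hypothesis 1: no trend (parameters: mean, variance -> k = 2).
mean_y = sum(y) / n
aic_constant = aic(gaussian_log_lik([yi - mean_y for yi in y]), k=2)

# Hypothesis 2: linear trend (parameters: slope, intercept, variance -> k = 3).
mean_x = sum(x) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
aic_linear = aic(gaussian_log_lik([yi - (intercept + slope * xi)
                                   for xi, yi in zip(x, y)]), k=3)

# Choosing: the model with the lower AIC is preferred.
print("constant:", round(aic_constant, 2), "linear:", round(aic_linear, 2))
```

The concluding step, interpreting why the trend model wins and what the slope means in subject-matter terms, remains a matter of judgment that no criterion automates.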
Information-Theoretic Background
The Akaike Information Criterion emerged from a rich history of information theory, spanning from thermodynamics to cryptography to modern statistical science.
Ludwig Boltzmann developed entropy as a measure of disorder in physical systems, laying groundwork for information theory.
Leo Szilard extended Boltzmann's entropy to measure information, identifying the elementary binary choice with an entropy of S = k log 2.
Ralph Hartley developed H = n log s for telecommunications, focusing on practical transmission problems.
Alan Turing and Jack Good used information measures for codebreaking at Bletchley Park.
Claude Shannon published "A Mathematical Theory of Communication," establishing information entropy H(X) = -Σ P(xi) log P(xi).
Solomon Kullback and Richard Leibler developed a measure of distance between statistical distributions.
Hirotugu Akaike published AIC = 2k - 2 log L, combining model fit with complexity penalties.
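Two quantities in this timeline are short enough to compute directly. A minimal sketch of Shannon entropy and the Kullback-Leibler divergence, both in bits; the example distributions are arbitrary:

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum p(x) log2 p(x), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

fair = [0.5, 0.5]     # one fair binary choice: exactly 1 bit
biased = [0.9, 0.1]   # a predictable source carries less entropy

print(entropy(fair), entropy(biased))
print(kl_divergence(biased, fair))  # "distance" of biased from fair
```

KL divergence is the quantity AIC estimates: the expected information lost when a candidate model is used in place of the true data-generating distribution.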
The AIC Formula
AIC = 2k - 2 log L
Where k is the number of estimated parameters and L is the maximized likelihood of the model given the observed data.
This balances model fit (log-likelihood) against complexity (parameter count), implementing Occam's Razor in statistical model selection.
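Once each candidate model's maximized log-likelihood is known, the formula applies directly. A small sketch using made-up log-likelihoods and parameter counts; the Akaike-weight step, a standard follow-on summary rather than part of the formula itself, converts AIC differences into relative support:

```python
import math

def aic(log_lik, k):
    """AIC = 2k - 2 log L."""
    return 2 * k - 2 * log_lik

def akaike_weights(aics):
    """Relative support for each model, from its AIC difference to the best."""
    best = min(aics)
    rel = [math.exp(-(a - best) / 2) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical candidates: (max log-likelihood, parameter count).
# Fit improves slightly with extra parameters, but the penalty grows faster.
scores = [aic(-45.2, k=2), aic(-44.9, k=3), aic(-44.8, k=5)]
weights = akaike_weights(scores)
print(scores)   # lowest AIC wins: the 2-parameter model
print(weights)
```

Note how the 5-parameter model fits best in raw likelihood yet ranks last: the complexity penalty is doing exactly the Occam's Razor work described above.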
Philosophies of Science
Understanding how to interpret AIC requires grappling with fundamental questions about the nature and purpose of science itself.
Four Major Philosophical Positions
1. Realism
Science discovers truths about the world. Quantitative tools provide evidence toward the truth of statements or functionally good explanations.
2. Empiricism
Truth about the world beyond observation is unattainable. Science aims instead for empirical adequacy in explaining observable phenomena, tested through experiment and analysis.
3. Instrumentalism
Science generates good predictions. Value comes from predictive accuracy and conceptual advancement, not truth claims.
4. Anarchism (Feyerabend)
No single method defines science. Progress requires skepticism toward all guidelines and paradigms.
Paradigms and Revolutions
Thomas Kuhn's "The Structure of Scientific Revolutions" describes science as alternating between normal activity within a paradigm and revolutionary paradigm shifts. Karl Popper countered that scientists should actively seek to falsify accepted theories.
AIC and Philosophical Frameworks
The Akaike Information Criterion can be viewed as instrumentalist (seeking good prediction models), but it can be used within various philosophical frameworks. The key insight is that statistical tools themselves need not be tied to single philosophies.
Models and Meaning
Moving from philosophical abstraction to practical application requires understanding how scientific hypotheses become mathematical models.
From Subject Matter to Statistical Models
This translation involves professional judgment and field-specific expertise. Researchers must convert verbal descriptions and domain terminology into formal mathematical representations.
The Challenge of Uncertainty
Models must represent fundamental uncertainty. Different approaches include:
Approaches to Modeling Uncertainty
- Precise Probability: Traditional statistical models with point estimates
- Interval Approaches: Dempster-Shafer evidence theory
- Fuzzy Sets: Non-binary membership functions
- Imprecise Probability: Sets of probability distributions
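The last approach is easy to illustrate. A minimal sketch of the imprecise-probability idea: instead of committing to one distribution, carry a set of plausible distributions and report the lower and upper probability an event receives across the set. The distributions here are hypothetical, invented for illustration only.

```python
# Three candidate distributions an analyst considers plausible (hypothetical).
candidate_dists = [
    {"rain": 0.2, "dry": 0.8},
    {"rain": 0.35, "dry": 0.65},
    {"rain": 0.5, "dry": 0.5},
]

def probability_bounds(event, dists):
    """Lower and upper probability of an event over a set of distributions."""
    values = [d[event] for d in dists]
    return min(values), max(values)

lower, upper = probability_bounds("rain", candidate_dists)
print(f"P(rain) lies in [{lower}, {upper}]")
```

A precise-probability model would collapse this interval to a single number; the width of the interval is itself a representation of how much the analyst does not know.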
Realism vs. Instrumentalism in Modeling
Two contrasting approaches emerge:
Milton Friedman (Instrumentalist): Model assumptions need not be true if predictions are accurate. Focus on outputs, not premises.
Herbert Simon (Realist): Models must follow from empirically valid composition laws. Premises and their connections must be true.
Means of Inference
Statistical inference fundamentally deals with uncertainty and unknowns. The choice of statistical paradigm shapes how we interpret evidence and reach conclusions.
The Nature of Uncertainty
Different statistical schools offer distinct answers to fundamental questions about uncertainty:
Bayesian
Uncertainty reflects subjective beliefs. Procedures provide support for/against hypotheses based on prior knowledge.
Frequentist
Uncertainty emerges from long-run sampling. Validity comes from repeated random sampling from populations.
Likelihoodist
Focus on likelihood functions. Information relevant to models is contained in the likelihood.
Information-Theoretic
Use AIC and similar criteria to select the model that best balances goodness of fit with parsimony.
The Inference Process
Inference proceeds through interconnected steps:
Six Steps of Statistical Inference
- Hypothesizing: Develop theories using multiple working hypotheses
- Model-building: Translate theories into mathematics
- Fitting: Adjust parameters to data
- Evaluating: Measure fit and complexity with AIC
- Choosing: Apply statistical paradigms as "inference engines"
- Concluding: Interpret mathematical results in subject-specific terms
Engines of Inference
Statistical paradigms function like engines - taking models as inputs and producing interpretable outputs. Just as mechanical engines combine simple machines into complex systems, statistical inference combines basic calculations into comprehensive analytical frameworks.
Ends of Inference
Moving from statistical models to scientific conclusions requires bridging the gap between mathematical results and domain-specific meaning.
The Translation Challenge
Consider a botanist studying water temperature and grass growth. After collecting data and finding that 72°F ± 1°F optimizes growth with 95% confidence, what does this mean scientifically? The mathematical result must be interpreted within botanical knowledge and practical constraints.
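The botanist's analysis can be miniaturized. Below, synthetic growth data, invented to peak near 72°F, are fitted with a quadratic response curve and the estimated optimum is read off the parabola's vertex; a full analysis would add the confidence interval. Everything here is an illustrative assumption, not real botanical data.

```python
import numpy as np

# Hypothetical measurements: grass growth (cm) at several water temps (°F).
temps  = np.array([60.0, 64.0, 68.0, 72.0, 76.0, 80.0, 84.0])
growth = np.array([2.0, 3.1, 3.9, 4.2, 3.8, 3.0, 1.9])

# Model the hypothesis "growth peaks at some optimum" as a quadratic curve.
a, b, c = np.polyfit(temps, growth, deg=2)

# The fitted optimum is the vertex of the parabola: -b / (2a).
optimum = -b / (2 * a)
print(f"estimated optimal temperature: {optimum:.1f} °F")
```

The print statement is where the mathematics stops; whether 72°F is achievable in a greenhouse, and whether the quadratic form was botanically sensible in the first place, are judgments the model cannot make.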
Historical Foundations
Blaise Pascal and Pierre de Fermat developed probability for gambling: if a game ends early, how should winnings be distributed based on each player's chances of victory?
Thomas Bayes created inverse probability for updating beliefs about unobserved events based on new evidence.
Ronald Fisher developed "objective" significance testing, contrasting it with "subjective" Bayesian approaches.
Jerzy Neyman and Egon Pearson advocated deciding between competing hypotheses using likelihood ratios, rather than testing a lone null hypothesis.
The Statistics Wars
Despite appearing as a simple Bayesian vs. Frequentist dichotomy, statistical inference involves multiple competing paradigms:
Major Statistical Paradigms
- Bayesian: Update beliefs using prior information and new evidence
- Frequentist: Test hypotheses against null using long-run sampling properties
- Likelihoodist: Base inference on likelihood functions alone
- Hybrid approaches: Combine elements from multiple paradigms
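The paradigms differ in interpretation, but all of them can be run on the same data. A sketch that pushes one invented dataset, 60 heads in 100 coin flips, through four inference engines; the uniform prior and all numbers are illustrative assumptions:

```python
import math

n, heads = 100, 60  # hypothetical data: 60 heads in 100 flips

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Frequentist: exact two-sided p-value against the null p = 0.5.
p_value = sum(binom_pmf(k, n, 0.5) for k in range(n + 1)
              if binom_pmf(k, n, 0.5) <= binom_pmf(heads, n, 0.5))

# Likelihoodist: likelihood ratio of the MLE (p = 0.6) to the null (p = 0.5).
lik_ratio = binom_pmf(heads, n, 0.6) / binom_pmf(heads, n, 0.5)

# Bayesian: a uniform Beta(1, 1) prior yields a Beta(1 + heads, 1 + tails)
# posterior; its mean is one point summary of updated belief.
posterior_mean = (1 + heads) / (2 + n)

# Information-theoretic: AIC for "p free" (k = 1) vs. "p fixed at 0.5" (k = 0).
aic_free  = 2 * 1 - 2 * math.log(binom_pmf(heads, n, 0.6))
aic_fixed = 2 * 0 - 2 * math.log(binom_pmf(heads, n, 0.5))

print(p_value, lik_ratio, posterior_mean, aic_free < aic_fixed)
```

Each engine emits a different kind of output, a p-value, a ratio of support, a posterior belief, a model ranking, and the engines can even disagree on borderline data; reconciling them is a matter of judgment, not computation.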
Moving Beyond Statistical Wars
This thesis advocates using AIC for model evaluation while remaining agnostic about inference paradigms. Researchers should:
- Use multiple working hypotheses
- Evaluate models with AIC
- Select inference engines based on problem requirements
- Judge results on logical justification and plausibility
The final step - from model selection to scientific conclusions - requires creativity, judgment, and plausibility assessments that cannot be fully systematized.
Conclusion: A Unified Approach
The Akaikean Framework
This approach is philosophically Akaikean (evaluating models on fit and complexity) but not methodologically restrictive (using only AIC-type statistics). The crisis in inferential statistics is procedural, not fundamental to the tools themselves.
Beyond Statistical Rituals
As Gerd Gigerenzer argues, we must move beyond "statistical rituals" toward genuine acts of judgment, creativity, and plausible reasoning. Statistical science requires logical reasoning that cannot be judged by any single paradigm alone.
Key Principles
- Epistemic Humility: Recognize that this is one approach among many
- Creative Judgment: Emphasize researcher creativity in hypothesizing, testing, and concluding
- Plausible Reasoning: Judge inferences on logical appropriateness, not paradigm purity
- Systematic Process: Follow structured steps while maintaining flexibility
The Role of Creativity
Hirotugu Akaike advocated for plausibility "for the evaluation of verbally defined models." The translation from hypothesis to statistical modeling is creative, as is the movement from model selection to inferential conclusions.
Future Directions
This framework applies beyond traditional statistics to machine learning, bridging computational pattern recognition with robust statistical inference. While not claiming to solve the problem of inference entirely, it offers a practical, unified approach to scientific reasoning under uncertainty.
The method provides a foundation for empirical sciences while acknowledging the irreducible role of human judgment in moving from evidence to conclusions. In Isaac Newton's spirit of "standing on the shoulders of giants," this work builds on centuries of developments in probability, information theory, and the philosophy of science.