Thomas Matcham has a first-class master's degree in mathematics from Imperial College London. He has worked as a data scientist in various capacities for 14 years, and has founded two businesses focused on the applications of machine learning. He currently works as a contractor in London. In this video, Tom gives a presentation on causal inference in statistics, and where inferential statistics falls short in establishing conviction in the world.

Summary

This video features a presentation and discussion on causality in statistics, focusing on the challenges and methods of establishing causal relationships when the tools of inferential statistics only measure association. The guest speaker, Thomas Matcham, a UK-based data scientist with 14 years of experience and a strong mathematical background, gives a deep dive into causal inference, a subfield of statistics concerned with when and how associational data can support causal conclusions. The conversation highlights the fundamental limitation of traditional inferential statistics, which deals with associations rather than causation, and introduces the tools and frameworks (randomized controlled trials, causal graphs, and Judea Pearl’s do-calculus) that enable researchers to make more rigorous causal claims.

Thomas emphasizes that inferential statistics, historically and by design, avoids making definitive causal statements. Causality can only be approached with additional structure, such as experimental design or causal inference techniques. The discussion includes examples illustrating the pitfalls of assuming causality from correlation and the importance of understanding alternative causal pathways, confounding variables, and colliders. They explore practical scenarios like cholesterol reduction’s effect on heart disease and agricultural fertilizer’s impact on crop yield, demonstrating how causal inference can help clarify these relationships.
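The confounding and collider pitfalls mentioned above can be illustrated with a small simulation. This is a minimal sketch, not something shown in the video: the variable names, the effect sizes, and the selection threshold are all assumptions chosen for illustration, and only Python's standard library is used.

```python
import random


def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# Confounder (assumed structure): Z causes both X and Y; X has NO effect on Y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]
confounded_corr = corr(x, y)  # clearly positive despite no X -> Y arrow

# Collider (assumed structure): A and B are independent, and both cause C.
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]
c = [ai + bi + rng.gauss(0, 1) for ai, bi in zip(a, b)]
marginal_corr = corr(a, b)  # close to zero in the full sample

# Selecting on the collider (keeping only rows with large C) induces
# a spurious negative association between A and B.
selected = [(ai, bi) for ai, bi, ci in zip(a, b, c) if ci > 1.0]
collider_corr = corr([s[0] for s in selected], [s[1] for s in selected])

print(f"confounded X-Y correlation:     {confounded_corr:.2f}")
print(f"A-B correlation, full sample:   {marginal_corr:.2f}")
print(f"A-B correlation, selected on C: {collider_corr:.2f}")
```

In this toy setup the first correlation is produced entirely by the shared cause, and the last one entirely by the selection step, so neither reflects a causal link between the two variables being compared.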

Key concepts such as the limitations of R-squared values in implying causality, the role of d-separation in causal graphs, and the importance of falsification in scientific reasoning are explained. The talk also touches on the cultural and scientific challenges faced by researchers, including the tendency to seek confirmation rather than falsification of hypotheses and the influence of biases and incentives in scientific research. Finally, Thomas encourages a more rigorous and humble approach to interpreting scientific data, advocating for causal inference as a powerful tool to clarify true causal mechanisms and avoid misleading conclusions.
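One way to see why R-squared cannot establish causality is that, for a simple linear regression, it is identical whichever variable is treated as the outcome, so it carries no information about direction. The sketch below also shows a genuine causal effect producing a low R-squared. The data-generating process (X causes Y with coefficient 0.3 plus unit-variance noise) is an assumption made up for this illustration.

```python
import random


def r_squared(xs, ys):
    """R^2 of the least-squares line that predicts ys from xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10_000

# Assumed process: X really does cause Y, but weakly and with a lot of noise.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.3 * xi + rng.gauss(0, 1) for xi in x]

r2_y_on_x = r_squared(x, y)  # regress the effect on the cause
r2_x_on_y = r_squared(y, x)  # regress the cause on the effect

print(f"R^2 predicting Y from X: {r2_y_on_x:.4f}")
print(f"R^2 predicting X from Y: {r2_x_on_y:.4f}")
```

Both numbers agree up to floating-point error, and both are small even though the causal effect is real by construction.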

Highlights

  • 🔍 Inferential statistics primarily measures association, not causation.
  • ⚠️ Correlation (e.g., high R-squared) does not imply causality.
  • 🧩 Causal inference uses additional structure (experiments, causal graphs) to rigorously analyze causality.
  • 📊 Randomized controlled trials are a classical but imperfect method to study causal effects.
  • 🔄 D-separation in causal graphs identifies which variables a model implies are independent (possibly after conditioning on others) and which paths must be blocked to remove confounding.
  • 🧪 Falsification remains central: causal models can only be supported by failing to falsify predictions.
  • 🔄 Scientific culture often favors proving hypotheses rather than disproving them, risking bias in causal claims.

Key Insights

  • 🔗 Association vs. Causation: Inferential statistics traditionally focuses on associations—relationships where variables move together—but these do not necessarily indicate that one causes the other. This fundamental limitation means that statistical significance or correlation coefficients, including R-squared values, cannot alone justify causal claims. Thomas clarifies that even perfect correlations can arise without causality, and causal relationships can exist with noisy, imperfect correlations.

  • 🧪 Randomized Controlled Trials (RCTs) as a Gold Standard with Limits: RCTs randomize the assignment of treatments to control for confounding variables, theoretically isolating the causal effect of an intervention. However, Thomas points out that even well-designed RCTs may fail to rule out all confounders or hidden causal pathways. For instance, dietary interventions may affect heart disease via unexpected mechanisms such as weight loss or blood pressure changes, rather than directly through cholesterol reduction.

  • 🌐 Causal Graphs and d-Separation: Causal inference uses directed graphs to represent cause-effect relationships among variables, both visually and mathematically. A key concept is d-separation, a graphical criterion for deciding whether two variables are independent given a set of conditioning variables, and therefore which paths are “blocked” when isolating a causal effect. This framework lets researchers translate intuitive causal hypotheses into testable probabilistic statements, bridging the gap between qualitative causal ideas and quantitative data analysis.

  • 🔄 Falsification as Scientific Principle: In causal inference, researchers begin with a causal model, derive predictions about probabilistic dependencies and independencies, and then test these against observed data. If the data contradict a predicted dependency or independency, the model is falsified, in line with Karl Popper’s philosophy of science. This iterative process refines causal understanding and helps avoid confirmation bias.

  • 🧩 Complexity of Real-World Causality: The presentation’s worked example on fertilizer and crop yield illustrates the complexity of causal reasoning. Variables such as past crop yield, soil properties, and farming practices create intricate causal relationships. Without controlling for all confounding factors and understanding the causal graph, simple associations can be misleading. Only when the researcher controls the intervention (e.g., randomly assigning fertilizer) can a causal claim be more confidently made.

  • ⚠️ Misuse of Statistical Measures in Scientific Debates: The discussion highlights the common misuse of statistics in fields like nutrition and medicine, where studies often overstate causal conclusions based on associative data. Thomas and Eddie note the problem of scientists attempting to confirm rather than falsify hypotheses, sometimes influenced by funding, ideology, or publication pressures. This results in a proliferation of conflicting or inconclusive studies that confuse public understanding.

  • 📚 Emerging Importance of Causal Inference: Judea Pearl’s do-calculus and the formal causal inference toolbox have revolutionized how statisticians and scientists conceptualize and analyze causality. Though initially controversial, these methods are becoming increasingly accepted and provide a rigorous language to discuss and test causal hypotheses. They offer hope for improving scientific rigor in many fields, including epidemiology, economics, and social sciences.
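The fertilizer and crop yield example from the insights above can be sketched as a simulation that compares an observational comparison, a confounder-adjusted comparison, and a randomized assignment. Everything here is a hypothetical setup rather than the worked example from the video: soil quality is assumed to be the only confounder, and the effect size, probabilities, and function names are invented for illustration.

```python
import random

TRUE_EFFECT = 2.0  # assumed causal effect of fertilizer on yield (arbitrary units)


def mean(values):
    return sum(values) / len(values)


def simulate(rng, n, randomized):
    """Return rows of (good_soil, fertilized, crop_yield) for n fields."""
    rows = []
    for _ in range(n):
        good_soil = rng.random() < 0.5
        if randomized:
            fertilized = rng.random() < 0.5  # coin flip, independent of soil
        else:
            # Assumption: farmers fertilize good soil far more often.
            fertilized = rng.random() < (0.8 if good_soil else 0.2)
        crop_yield = (10.0 + 3.0 * good_soil
                      + TRUE_EFFECT * fertilized + rng.gauss(0, 1))
        rows.append((good_soil, fertilized, crop_yield))
    return rows


def naive_diff(rows):
    """Difference in mean yield between fertilized and unfertilized fields."""
    treated = [y for _, f, y in rows if f]
    control = [y for _, f, y in rows if not f]
    return mean(treated) - mean(control)


def adjusted_diff(rows):
    """Stratify on soil quality, then average the within-stratum differences."""
    total = 0.0
    for soil in (True, False):
        stratum = [r for r in rows if r[0] == soil]
        total += naive_diff(stratum) * len(stratum) / len(rows)
    return total


rng = random.Random(42)  # fixed seed so the run is repeatable
observational = simulate(rng, 20_000, randomized=False)
trial = simulate(rng, 20_000, randomized=True)

naive_obs = naive_diff(observational)        # biased upward by soil quality
adjusted_obs = adjusted_diff(observational)  # close to TRUE_EFFECT
naive_rct = naive_diff(trial)                # close to TRUE_EFFECT

print(f"observational, unadjusted: {naive_obs:.2f}")
print(f"observational, adjusted:   {adjusted_obs:.2f}")
print(f"randomized trial:          {naive_rct:.2f}")
```

The adjustment only works here because the simulated causal graph is known and its single confounder is measured. With real data that is exactly the assumption that has to be argued for, which is why controlling the intervention directly gives a more defensible causal claim.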

Additional Context and Reflection

This video serves as both an educational primer and a critical reflection on the scientific method in statistics. It stresses that understanding causality requires more than observing data patterns; it requires carefully designed experiments, well-constructed causal models, and an openness to falsification rather than confirmation. The conversation also acknowledges the difficulty of translating intuitive causal ideas into formal mathematical terms, an ongoing challenge for researchers and communicators alike.

The discussion’s relevance extends beyond statistics to broader scientific literacy. In an age of data abundance and rapid dissemination of research, the ability to critically evaluate causal claims is essential for scientists, policymakers, and the public. This video encourages viewers to question simplistic interpretations of scientific data and appreciate the nuanced, rigorous work involved in uncovering causal truths.

jet@hackertalks.com, 4 hours ago (last edited 4 hours ago)

Yeah, we could go into depth on how to make a model to try to control for a correlation in epidemiology, but I don't think there are enough people interested in basically becoming a hobby statistician.

I would like to dissect Mendelian randomization, but it's going to have a super niche audience as well.

The only people who this information would be useful to are the very same people who won't engage in good faith discussions anyway.

this post was submitted on 18 Aug 2025