Many disciplines use probability theory, including mathematical statistics, philosophy, and physics. The goal of this seminar, which began at Rutgers University in 2016, was to help people in these disciplines learn from each other and from the history of probability and its applications.
The Seminar on History and Foundations of Probability and Statistics is now inactive. This page is an archive of the former website "foundationsofprobabilityseminar.com" which documented schedules and abstracts from Fall 2021 to Spring 2024. Previous seminars from 2016 through Spring 2021 are archived at http://www.harrycrane.com/seminar.html by Harry Crane.
See below for a menu by semester, with links to each semester's list of dates, titles, and authors, and further links to abstracts, all on this webpage. Some seminars also have links to YouTube videos.
The most recent organizers of the seminar were Xueyin Zhang (Berkeley, Philosophy), Glenn Shafer (Rutgers, Business and Statistics), Barry Loewer (Rutgers, Philosophy), Virgil Murthy (CMU, Philosophy), and Tripp Roberts (Rice, Statistics). Previous organizers include Harry Crane, Ruobin Gong, Eddy Keming Chen, Isaac Wilhelm, and Dimitris Tsementzis.
Videos for some seminars are available on the Foundations of Probability Panel Discussions YouTube channel.
(See the links below for a listing of seminars by date, with title and author, and a link to abstracts.)
(Spring 2022 seminars include six panel discussions and three non-panel seminars.)
Feb. 15, 2024: Good Guesses: The Conjunction Fallacy and the Tradeoff between Accuracy and Informativity, Kevin Dorst, MIT
Feb. 29, 2024: Probing the qualitative-quantitative divide in probability logics, Krzysztof Mierzewski, CMU
May 2, 2024: Updating by maximizing expected accuracy in infinite non-partitional settings, Kenny Easwaran, UC Irvine
Abstract: The conjunction fallacy is the well-documented empirical finding that subjects sometimes rate a conjunction A&B as more probable than one of its conjuncts, A. Most explanations appeal in some way to the fact that B has a high probability. But Tentori et al. (2013) have recently challenged such approaches, reporting experiments which find that (1) when B is confirmed by relevant evidence despite having low probability, the fallacy is common, and (2) when B has a high probability but has not been confirmed by relevant evidence, the fallacy is less common. They conclude that degree of confirmation, rather than probability, is the central determinant of the conjunction fallacy. In this paper, we address a confound in these experiments: Tentori et al. (2013) failed to control for the fact that their (1)-situations make B conversationally relevant, while their (2)-situations do not. Hence their results are consistent with the hypothesis that conversationally relevant high probability is an important driver of the conjunction fallacy. Inspired by recent theoretical work that appeals to conversational relevance to explain the conjunction fallacy, we report on two experiments that control for this issue by making B relevant without changing its degree of probability or confirmation. We find that doing so increases the rate of the fallacy in (2)-situations, leading to fallacy rates comparable to those in (1)-situations. This suggests that (non-probabilistic) conversational relevance indeed plays a role in the conjunction fallacy, and paves the way toward further work on the interplay between relevance and confirmation.
Abstract: Several notable approaches to probability, going back at least to Keynes (1921), de Finetti (1937), and Koopman (1940), assign a special importance to qualitative, comparative judgments of probability ("event A is at least as probable as event B"). The difference between qualitative and explicitly quantitative probabilistic reasoning is intuitive, and one can readily identify paradigmatic instances of each. It is less clear, however, whether there are any natural structural features that track the difference between inference involving comparative probability judgments on the one hand, and explicitly numerical probabilistic reasoning on the other. Are there any salient dividing lines that can help us understand the relationship between the two, as well as classify intermediate forms of inference lying in between the two extremes? In this talk, based on joint work with Duligur Ibeling, Thomas Icard, and Milan Mossé, I will explore this question from the perspective of probability logics.
Probability logics can represent probabilistic reasoning at different levels of grain, ranging from the more "qualitative" logic of purely comparative probability to explicitly "quantitative" languages involving arbitrary polynomials over probability terms. As I will explain, when classifying these systems in terms of expressivity, computational complexity, and axiomatisation, what emerges as a robust dividing line is the distinction between systems that encode merely additive reasoning from those that encode additive and multiplicative reasoning. I will show that this distinction tracks a divide in computational complexity (NP-complete vs. ETR-complete) and in the kind of algebraic tools needed for a complete axiomatisation (hyperplane separation theorems vs. real algebraic geometry). I will present new completeness results and a result on the non-finite-axiomatisability of comparative probability, and I will conclude with some overlooked issues concerning the axiomatisation of comparative conditional probability. One lesson from this investigation is that, for the multiplicative probability logics as well as the additive ones, the paradigmatically "qualitative" systems are neither simpler in terms of computational complexity nor in terms of axiomatisation, while losing in expressive power to their explicitly numerical counterparts.
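For orientation, here is a schematic illustration (not drawn from the talk) of the levels of grain the abstract describes: a purely comparative formula, an additive (linear) inequality over probability terms, and a multiplicative (polynomial) constraint of the kind needed to express independence.

```latex
% Illustrative formulas at three levels of grain (schematic):
\text{comparative:}\quad A \succeq B \quad (\text{``}A\text{ is at least as probable as }B\text{''}),
\qquad
\text{additive:}\quad P(A) + P(B) \ge 1,
\qquad
\text{multiplicative:}\quad P(A \cap B) = P(A)\cdot P(B).
```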
Abstract: Greaves and Wallace (2006) justify Bayesian conditionalization as the update plan that maximizes expected accuracy, for an agent considering finitely many possibilities, who is about to undergo a learning event where the potential propositions that she might learn form a partition. In recent years, several philosophers have generalized this argument to less idealized circumstances. Some authors (Easwaran (2013b); Nielsen (2022)) relax finiteness, while others (Carr (2021); Gallow (2021); Isaacs and Russell (2022); Schultheis (2023)) relax partitionality. In this paper, we show how to do both at once. We give novel philosophical justifications of the use of σ-algebras in the infinite setting, and argue for a different interpretation of the “signals” in the non-partitional setting. We show that the resulting update plan mitigates some problems that arise when only relaxing finiteness, but not partitionality, such as the Borel-Kolmogorov paradox.
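As a rough reminder of the finite, partitional framework being generalized (a schematic gloss, not the paper's own notation): an update plan assigns a posterior to each cell of the evidence partition, and Greaves and Wallace show that, for a strictly proper accuracy measure, expected accuracy is maximized by the conditionalization plan.

```latex
% Schematic Greaves-Wallace setup: update plan f assigns a credence function f_E
% to each cell E of the evidence partition \mathcal{E}; its expected accuracy is
\mathrm{EA}(f) \;=\; \sum_{E \in \mathcal{E}} \sum_{w \in E} P(w)\,\mathcal{A}(f_E, w),
% which, for strictly proper \mathcal{A}, is maximized by f_E = P(\cdot \mid E).
```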
Sept. 21, 2023: The Flawed Genius of William Playfair, David Bellhouse, Western University
Sept. 28, 2023: Totality, Regularity, and Cardinality in Probability Theory, Paolo Mancosu, Berkeley, Philosophy
Oct. 5, 2023: Probabilistic Systems and the Stochastic-Quantum Theorem, Jacob Barandes, Harvard
Oct. 19, 2023: One Way Testing by Betting Can Improve Data Analysis: Optional Continuation, Glenn Shafer, Rutgers
Nov. 9, 2023: Chance Combinatorics, John Norton, Pittsburgh
Abstract: William Playfair worked variously as a statistician, economist, engineer, banker, scam artist, and political propagandist, among other activities. He began with much promise working for Boulton and Watt of steam engine fame, saw many ups and downs throughout his career that ranged from being moderately well off to bankruptcy and imprisonment, and ended his life in poverty. His flaw was his inability to deal with his personal and professional finances. His lasting contributions are in statistics, where he is known as the father of statistical graphics, and in economics, where he is known for his contributions to the first posthumous edition of Adam Smith’s Wealth of Nations.
In this talk, I will focus on Playfair’s graphs, which were motivated by political and economic issues of his day. I will also interweave biographical information with a discussion of the graphs. Playfair invented time series line graphs, as well as the now ubiquitous bar chart and the pie chart. He also introduced innovative ways to display multivariate data. Many of the graphical elements that Playfair used are now standard in statistical graphics. For example, Playfair pioneered the use of colour in his graphs at a time when colour was rarely used in the printing process. I have left out spying for the British government as one of Playfair’s activities; I will address this claim as part of the talk.
Abstract: A probability space is given by a triple (Ω, 𝔍, P) where Ω is a set called the sample space, 𝔍 is a σ-algebra of subsets of Ω, and P is a probability function from 𝔍 to the interval [0, 1]. The standard Kolmogorovian approach to probability theory on infinite sample spaces is neither regular nor total. Totality, expressed set-theoretically, is the request that every subset of the sample space Ω is measurable, i.e. has a probability value. Regularity, expressed set-theoretically, is the request that only the empty set gets probability 0. Mathematical and philosophical interest in non-Kolmogorovian approaches to probability theory in the last decade has been motivated by the possibility to satisfy totality and regularity in non-Archimedean contexts (Wenmackers and Horsten 2013, Benci, Horsten, Wenmackers 2018). Much of the mathematical discussion has been focused on the cardinalities of the sample space, the algebra of events, and the range (Hájek 2011, Pruss 2013, Hofweber 2014). In this talk I will present some new results characterizing the relation between completeness and regularity in a variety of probabilistic settings and I will give necessary and sufficient conditions relating regularity and the cardinalities of the sample space, the algebra of events, and the range of the probability function, thereby improving on the results hitherto available in the literature. This is joint work with Guillaume Massas (Group in Logic, UC Berkeley).
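Restated in symbols (the same definitions given in the abstract above):

```latex
% For a probability space (\Omega, \mathfrak{J}, P):
\text{Totality:}\quad \mathfrak{J} = \mathcal{P}(\Omega)
\quad\text{(every subset of } \Omega \text{ has a probability value)};
\qquad
\text{Regularity:}\quad \forall A \in \mathfrak{J}:\; P(A) = 0 \iff A = \emptyset.
```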
Bibliography
Benci, V., Horsten, L., and Wenmackers, S. (2018), “Infinitesimal Probabilities”, The British Journal for the Philosophy of Science, 69, 509–552.
Hájek, A. (2011), “Staying Regular?”, unpublished typescript.
Hofweber, T. (2014), “Cardinality arguments against regular probability measures”, Thought, 3, 166–175.
Pruss, A. R. (2013), “Probability, regularity, and cardinality”, Philosophy of Science, 80, 231–240.
Wenmackers, S., and Horsten, L. (2013), “Fair infinite lotteries”, Synthese, 190, 37–61.
Abstract: On the one hand, scientists across disciplines act as though various kinds of phenomena physically occur, perhaps according to probabilistic laws. On the other hand, textbook quantum theory is an instrumentalist recipe whose only predictions refer to measurement outcomes, measurement-outcome probabilities, and statistical averages of measurement outcomes over measurement-outcome probabilities. The conceptual gap between these two pictures makes it difficult to see how to reconcile them.
In this talk, I will present a new theorem that establishes a precise equivalence between quantum theory and a highly general class of stochastic processes, called generalized stochastic systems, that are defined on configuration spaces rather than on Hilbert spaces. From a foundational perspective, some of the mysterious features of quantum theory – including Hilbert spaces over the complex numbers, linear-unitary time evolution, the Born rule, interference, and noncommutativity – then become the output of a theorem based on the simpler and more transparent premises of ordinary probability theory. From a somewhat more practical perspective, the stochastic-quantum theorem leads to a new formulation of quantum theory, alongside the Hilbert-space, path-integral, and phase-space formulations, potentially opens up new methods for using quantum computers to simulate stochastic processes beyond the Markov approximation, and may have implications for how we think about quantum gravity.
This talk is based on two papers:
1. https://arxiv.org/abs/2309.03085 / http://philsci-archive.pitt.edu/22502
2. https://arxiv.org/abs/2302.10778 / http://philsci-archive.pitt.edu/21774
Abstract: When testing a statistical hypothesis, is it legitimate to deliberate on the basis of initial data about whether and how to collect and analyze further data? My 2019 book with Vladimir Vovk, Game-Theoretic Foundations for Probability and Finance, says YES, provided that you are testing by betting and do not risk more capital than initially committed. Standard statistical theory does not allow such optional continuation. Related paper: https://arxiv.org/abs/2308.14959
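A minimal sketch of why optional continuation is legitimate in this framework (standard facts about betting scores, not a summary of the talk): if each stage of the analysis yields a nonnegative betting score whose conditional expectation under the hypothesis is at most 1, scores from successive stages may be multiplied, and the probability that the running product ever gets large is controlled.

```latex
% Betting scores E_1, E_2, ... with \mathbb{E}_H[E_i \mid \text{past}] \le 1 under hypothesis H.
% Optional continuation: the running product is itself a betting score, and
E_{(n)} = \prod_{i=1}^{n} E_i, \qquad
\mathbb{E}_H\!\left[E_{(n)}\right] \le 1, \qquad
\Pr_H\!\left(\sup_n E_{(n)} \ge 1/\alpha\right) \le \alpha .
```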
Abstract: Seventeenth century “chance combinatorics” was a self-contained theory. It had an objective notion of chance derived from physical devices with chance properties, such as die casts, combinatorics to count chances and, to interpret their significance, a rule for converting these counts into fair wagers. It lacked a notion of chance as a measure of belief, a precise way to connect chance counts with frequencies and a way to compare chances across different games. These omissions were not needed for the theory’s interpretation of chance counts: determining which are fair wagers. The theory provided a model for how indefinitenesses could be treated with mathematical precision in a special case and stimulated efforts to seek a broader theory.
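As a concrete illustration of counting chances and converting the count into a fair wager (a textbook example, not taken from the talk):

```latex
% Two fair dice: 36 equally possible chances, 6 of which give a sum of 7.
\Pr(\text{sum} = 7) \;=\; \frac{6}{36} \;=\; \frac{1}{6},
\qquad \text{so a fair wager on a sum of 7 pays 5 to 1.}
```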
Prof. Norton is a Distinguished Professor in the Department of History and Philosophy of Science at the University of Pittsburgh, where he works on the history and philosophy of physics and probability. He is well known for his analysis of Einstein's notebooks, his theory of thought experiments, and the Norton dome, which exhibits indeterminism in Newtonian mechanics. His well-reviewed recent book The Material Theory of Induction was the inaugural volume in the BSPS Open book series.
January 23, 2023: Inference, Optimal Stopping and New Hypotheses, Boris Babic, University of Toronto
February 27, 2023: Symmetry of Value, Zachary Goodsell, USC
March 20, 2023: Von Mises, Popper, and the Cournot Principle, Tessa Murthy, Carnegie Mellon University
March 27, 2023: Probability from Symmetry, Ezra Rubenstein, Berkeley
April 3, 2023: Why the Concept of Statistical Inference Is Incoherent, and Why We Still Love It, Michael Acree, Senior Statistician (Retired), University of California, San Francisco
April 17, 2023: Parity, Probability and Approximate Difference, Kit Fine, NYU
April 24, 2023: Algorithmic Randomness and Probabilistic Laws, (with Jeffrey A. Barrett, UC Irvine) Eddy Keming Chen, UC San Diego
May 1, 2023: Entropy and Subjectivism, Anubav Vasudevan, Chicago
Abstract (joint work with Anil Gaba, Ilia Tsetlin, and Robert L. Winkler):
In this project we address the Bayesian problem of new hypotheses and attempt to reframe normative constraints for rational inference in situations characterized by substantial unawareness about the possible outcomes. In particular, we first argue that we can meaningfully distinguish two types of learning scenarios: problem framing and problem solving. Problem solving is the sort of thing we do when we have quite a bit of information about a problem—enough to identify the relevant outcomes, and sometimes to even put some meaningful prior probabilities on them. Problem framing is what happens when we encounter an issue for more or less the first time. For example, one steps into an organic chemistry class without any background knowledge of the underlying subject matter. In cases like these, it's unreasonable to expect such an agent to be aware of what she might learn, let alone to place probabilities on possible learning outcomes.
Problem framing, we will suggest, is the "hard" problem of learning. Problem solving is "easy." And while Bayesianism (or, more specifically, traditional Bayesian confirmation theory in philosophy of science) is often pitched as a theory of rational learning, it is a persuasive normative theory of problem solving only. For framing problems, we need to look elsewhere. We will propose a slightly different set of normative criteria by drawing on principles of optimal stopping. Instead of having an agent do something they are not in a position to do (place probabilities on unanticipated hypotheses), we will have them focus on more tractable aspects of the learning situation (e.g., evaluate the importance of the problem, the upside if it proves fruitful, the downside if it is a waste of time). This will yield different normative criteria for rational inference. We will also consider whether this is better thought of as a non-Bayesian approach to learning, or just a reframing of the traditional Bayesian paradigm.
Abstract: Expected utility theory problematically falls silent on the comparison of some prospects whose possible payoffs have unbounded utility. However, natural ways to extend expected utility theory can quickly lead to inconsistency. This talk considers the effect of the affine invariance principle, which says that an affine transformation of the utility of outcomes preserves the ordering of prospects. The affine invariance principle is shown to have surprising consequences in contexts where utility is unbounded, but is also shown to be consistent with some natural extensions of expected utility theory to settings with unbounded utility. Some philosophical motivations for accepting the affine invariance principle are also considered.
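In symbols, a schematic rendering of the principle as described above (not the talk's own formalism): an increasing affine rescaling of the utilities of outcomes should leave the ordering of prospects unchanged.

```latex
% Affine invariance (schematic), for prospects X, Y and outcome utilities u:
u'(o) = a\,u(o) + b \;\; (a > 0)
\quad\Longrightarrow\quad
\big(X \succeq Y \text{ under } u'\big) \iff \big(X \succeq Y \text{ under } u\big).
```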
Abstract: Adherents to non-frequentist metaphysics of probability have often claimed that the axioms of frequentist theories can be derived as theorems of alternate characterizations of probability by means of statistical convergence theorems. One representative such attempt is Karl Popper's "bridge" between propensities and frequencies, discussed in his books Logic of Scientific Discovery and Realism and the Aim of Science. What makes Popper's argument particularly interesting is that he seemed much more sympathetic to the frequentists than the other theorists who promoted similar claims. In particular, while Richard von Mises criticized the use of the Law of Large Numbers in interderivability arguments, Popper was clearly aware of this worry—he even joined von Mises in criticizing his contemporary Fréchet's more heavy-handed application of it. What Popper thought set his version of the bridge argument apart was his use of the almost-sure strong law in place of the weak law. The SLLN has a measure-zero exclusion clause, which Popper claimed could be unproblematically interpreted as probability zero. While in other contexts he agreed with von Mises that taking low propensity to entail low frequency requires frequentist assumptions, the zero case, according to Popper, was special. To defend this claim, he relied on a contextually odd application of the Cournot principle.
In this project I investigate two related, understudied elements of the Popper/von Mises dispute. First, I provide an explanation of the mutual misunderstanding between von Mises and Popper about the admissibility of the claim that measure zero sets are probability zero sets. Popper takes von Mises as levying the criticism that claims along the lines that "low propensity means common in a long sequence of trials" are inaccurate (a claim von Mises elsewhere makes) when in fact von Mises is instead concerned that such claims are circular or fundamentally frequentist. This explains, but does not entirely justify, Popper's appeal to the Cournot principle. Second, I relay the worry that the use of convergence theorems in the context of propensities requires more auxiliary information than similar uses in frequentist theories. The SLLN, for example, requires that subsequent trials satisfy independence conditions (usually i.i.d.). I provide a charitable interpretation of Popper's project that better justifies that experimental iterations satisfy the antecedent of the SLLN, and also makes sense of Popper's references to Doob's impossibility theorem. I conclude by reflecting that though this historical investigation paints Popper's approach as more statistically informed than is commonly thought, it does not get him entirely out of trouble.
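For reference, the strong law at issue, in its standard form with the "measure-zero exclusion clause" made explicit (a textbook statement, not part of the abstract):

```latex
% Strong law of large numbers for i.i.d. X_1, X_2, ... with \mathbb{E}[X_1] = \mu finite:
\Pr\!\left(\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right) = 1,
% i.e. the exceptional set of outcome sequences has measure zero.
```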
Abstract: At the heart of the ‘classical’ approach to probability is the idea that the probability of a proposition quantifies the proportion of possible worlds in which it is true. This idea has a proud history, but it has fallen on hard times. I aim to rejuvenate it. Firstly, I show how the metaphysics of quantity together with the physical laws might yield well-defined relative sizes for sets of physically possible worlds. Secondly, I argue that these relative sizes are apt to reconcile the credence-guiding and frequency-explaining roles of probability.
Abstract: By the 17th century the Italian insurance industry was coming to use the word probability to refer to informal assessments of risk, and Pascal and Fermat soon began the calculus of gambling. But the latter never used the word probability, framing their inquiries instead in terms of expectation—the focus of insurance. At the end of the 17th century Bernoulli tried to bring these two lines together. A revolution in the historiography of probability occurred in 1978 with a paper by Shafer. Virtually all attention to Bernoulli’s Ars Conjectandi had focused only on the closing pages, where he famously proved the weak law of large numbers; Shafer was the first to notice, at least since the 18th century, Bernoulli’s struggle to integrate the two concepts. He also attributes the success of Bernoulli’s dualistic concept to Bernoulli’s widow and son having withheld publication for 8 years following his death, and to eulogies having created the impression that Bernoulli had succeeded in his ambition of applying the calculus of chances to “civil, moral, and economic matters.” Lambert improved Bernoulli’s formula for the combination of probabilities 50 years later, but did not address the question of a metric for epistemic probability, or the meaningfulness of combining them with chances. I suggest that no such meaningful combination is possible. But Bernoulli’s attempted integration promised that all uncertainty could be quantified, and the promise was philosophical heroin to the world of Hume. Laplace’s Rule of Succession was of no use to scientists, but was beloved of philosophers for over a century. Criticisms of it on metaphysical grounds led to Fisher’s approach, 150 years later, based on intervening developments in astronomy. His theory still desperately straddled the tension between aleatory and epistemic probabilities. Jerzy Neyman set about to purify Fisher’s theory of the epistemic elements, and was led to scrap all reference to inference, leaving him with a theory of statistical decision making, with principal application to quality control in manufacturing. Bayesians, meanwhile, were eager to retain epistemic reference, but wanted epistemic probabilities to be measured on the same scale as aleatory probabilities. That left them with, among other problems, the inability to represent ignorance meaningfully. Efforts to adhere consistently to the requirements of epistemic probability impel us to abandon reference to statistics, as efforts to adhere to the requirements of an aleatory conception impel us to abandon reference to inference. Statistics and inference, I conclude, have really nothing more to do with each other than do ethics and trigonometry.
Abstract: I present a theory of parity or imprecision in which it is represented by an approximate difference in the value of two items or an approximate ratio in the credence of two propositions.
Abstract: We consider two ways one might use algorithmic randomness to characterize a probabilistic law. The first is a generative chance* law. Such laws involve a nonstandard notion of chance. The second is a probabilistic* constraining law. Such laws impose relative frequency and randomness constraints that every physically possible world must satisfy. While each notion has virtues, we argue that the latter has advantages over the former. It supports a unified governing account of non-Humean laws and provides independently motivated solutions to issues in the Humean best-system account. On both notions, we have a much tighter connection between probabilistic laws and their corresponding sets of possible worlds. Certain histories permitted by traditional probabilistic laws are ruled out as physically impossible. As a result, such laws avoid one variety of empirical underdetermination, but the approach reveals other varieties of underdetermination that are typically overlooked.
Paper link: https://arxiv.org/abs/2303.01411
Abstract: Shafer (1985) issued a stark warning to subjective Bayesians against the dangers of unqualified appeals to the rule of conditionalization outside the context of a well-defined information protocol. In this talk, I will explain how the failure to heed Shafer's warning has led to equally unjustified appeals to other, more general rules for probabilistic updating – in particular, the rule of relative entropy maximization. I will explain how certain puzzles related to maximum entropy reasoning – such as Shimony's puzzle and the Judy Benjamin Problem – can be viewed as generalized versions of the problems facing conditionalization that prompted Shafer's initial critique. I will conclude the talk by extracting from these examples some general philosophical lessons relating to subjective interpretations of probability.
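For orientation, a standard formulation of the updating rule mentioned above (not specific to the talk): updating by relative entropy maximization moves from a prior P to the distribution in the constraint set that minimizes the Kullback-Leibler divergence from P; when the constraint is simply that an event E is certain, this reduces to conditionalization on E.

```latex
% Updating on a constraint set \mathcal{C} by minimizing relative entropy to the prior P:
Q^{*} = \arg\min_{Q \in \mathcal{C}} D(Q \,\|\, P)
      = \arg\min_{Q \in \mathcal{C}} \sum_{\omega} Q(\omega)\,\log\frac{Q(\omega)}{P(\omega)},
\qquad
\mathcal{C} = \{Q : Q(E) = 1\} \;\Rightarrow\; Q^{*} = P(\cdot \mid E).
```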
Sept. 19, 2022: Bayesianism via Likelihood Theory, Hartry Field, NYU
Oct. 3, 2022: That Does Not Compute: David Lewis on Credence and Chance, Gordon Belot, University of Michigan
Oct. 31, 2022: Bootstrapping Objective Probabilities for Evolutionary Biology, Marshall Abrams, University of Alabama at Birmingham
Nov. 28, 2022: Beyond Neyman-Pearson, Peter Grünwald, CWI/Amsterdam
Abstract: Bayesianism is best introduced not by the usual arguments for it but by its applications. And a good way to focus on its applications is by comparison to the likelihood ratio approach to the confirmation of statistical hypotheses. The latter turns out to be a much more general theory than is usually appreciated, but does have limitations that a recognizably Bayesian approach overcomes. An advantage of approaching Bayesianism via likelihood theory is that doing so leads naturally to a questioning of some harmful idealizations built into crude forms of Bayesianism.
Abstract: Following Lewis, many philosophers hold reductionist accounts of chance (on which claims about chance are to be understood as claims that certain patterns of events are instantiated) and maintain that rationality requires that credence should defer to chance (in the sense that one's credence in an event, conditional on the chance of that event being x, should be x). It is a shortcoming of an account of chance if it implies that this norm of rationality is unsatisfiable by computable agents. Here it is shown, using elementary considerations from the theories of inductive learning and of algorithmic randomness, that this shortcoming is more common than one might have hoped.
Abstract: While there are difficult philosophical problems concerning quantum mechanics that are relevant to our understanding of the nature of quantum mechanical probabilities, it's clear that (a) the probabilities seem to be fundamental to quantum mechanics, and (b) the numerical values of probabilities are given by the mathematical theory of quantum mechanics (perhaps via some extension such as GRW). There are also philosophical problems concerning the nature of probabilities that arise in statistical mechanics, but the mathematical theories of statistical mechanics that are relevant to these problems usually constrain the numerical values of these probabilities fairly narrowly.
The situation is different in evolutionary biology, where widespread uses of models and frequentist statistical inference in empirical research seem to depend on the fact that evolving populations realize (something like) objective probabilities. (These might be objective imprecise probabilities.) But we have had little guidance about how to think about the fundamental nature of these "evolutionary" probabilities.
There is a common problem with all of these proposals: there is an enormous amount of heterogeneity among the interacting components in an evolving population in an environment, with many interactions between different levels. In my book, I call this property of evolutionary processes "lumpy complexity". This kind of complex causal structure makes evolving populations extremely difficult to understand well, and difficult to model except very approximately. This makes it difficult to spell out details of the above proposals in a way that allows us to understand what gives rise to the numerical values of probabilities in evolutionary biology.
I argue that at least in many cases, there is a different sort of explanation of the probabilistic character of biological populations in environments. This strategy can explain some important numerical values of relevant probabilities, and sidesteps the sorts of problems highlighted above. It is a strategy that is not available for explaining objective probability in physical sciences. I outline the strategy below.
Empirical researchers have argued that many animals engage in a kind of random walk known as a Lévy walk or a Lévy flight: at each time step, the direction of motion is randomly chosen, and the length d of travel in that direction has probability proportional to d^(−μ), where μ is near a particular value (usually 2).
Modeling arguments and a small amount of empirical research have been used to argue for the "Lévy Flight Foraging hypothesis" (LFF), which says that when food (or other resources) are sparsely distributed, it's adaptive for organisms to search randomly for the food by following a Lévy walk with parameter μ near 2. That is, the LFF is the claim that Lévy flight foraging, or more precisely foraging using Lévy walks, is the result of natural selection on internal mechanisms because that pattern of foraging is adaptive.
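A minimal simulation sketch of the random walk just described (illustrative only; the exponent and minimum step length below are arbitrary choices, not values from the talk):

```python
import numpy as np

def levy_walk(n_steps=1000, mu=2.0, d_min=1.0, seed=0):
    """Simulate a 2-D Levy walk: uniform random direction at each step,
    step length with density proportional to d**(-mu) for d >= d_min."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n_steps)
    u = 1.0 - rng.uniform(size=n_steps)            # uniform on (0, 1]
    lengths = d_min * u ** (-1.0 / (mu - 1.0))     # inverse-transform sampling
    steps = lengths[:, None] * np.column_stack((np.cos(angles), np.sin(angles)))
    return np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))

path = levy_walk()
print(path[-1])   # final position after 1000 steps
```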
My argument can be outlined as follows.
Where such an explanation of evolutionary probabilities is applicable (and I argue that it is more widely applicable than one might think), we have an explanation of objective probability (or objective imprecise probability, or behavior that seems to involve probability) in evolving populations. This explanation avoids the problems of proposals that evolutionary probabilities should be understood as directly resulting from quantum mechanical, statistical mechanical, or microconstant/MM-CCS probabilities.
There are some reasons to think that human behavior is sometimes randomized by internal mechanisms that result either from natural selection or learning. If this is correct, people who interact with those whose behavior is randomized by such a mechanism would in turn experience randomized consequences of that behavior. Thus the explanatory strategy that I describe for evolutionary biology may also be relevant in social sciences.
Video – link to YouTube video for this seminar
Abstract: A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning instead an e-value, an alternative measure of evidence that has recently started to attract attention. With p-values, we cannot use an extreme observation (e.g. p << alpha) for getting better frequentist decisions. With e-values we can, since they provide Type-I risk control in a generalized Neyman-Pearson setting with the decision task (a general loss function) determined post-hoc, after observation of the data—thereby providing a handle on the perennial issue of 'roving alpha's'. When Type-II risks are taken into consideration, the only admissible decision rules in this post-hoc setting turn out to be e-value-based. This provides e-values with an additional interpretation on top of their original one in terms of bets.
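As a reminder of the basic object involved (a standard definition, not a summary of the talk's results): an e-variable is a nonnegative statistic whose expected value under the null is at most 1, so Markov's inequality controls the Type-I error of rejecting when it is large.

```latex
% An e-variable for null hypothesis H is a nonnegative statistic E with
\mathbb{E}_{H}[E] \le 1,
\qquad\text{so for any fixed } \alpha \in (0,1):\quad
\Pr_{H}\!\left(E \ge 1/\alpha\right) \le \alpha .
```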
We also propose to replace confidence intervals and distributions by the *e-posterior*, which provides valid post-hoc frequentist uncertainty assessments irrespective of prior correctness: if the prior is chosen badly, e-intervals get wide rather than wrong, suggesting e-posterior credible intervals as a safer alternative for Bayes credible intervals. The resulting *quasi-conditional paradigm* addresses foundational and practical issues in statistical inference.
Six seminars in Spring 2022 were panel discussions; three were single-author seminars. Abstracts for panel discussions include questions for the panel, some include references, and all list panel members with links to their professional web pages.
January 24, 2022: Should decision theory guide the formulation of the principal principle?
Jenann Ismael, philosophy, Columbia
Itzhak Gilboa, economics, Tel Aviv & HEC Paris
Stephen Senn, statistics, consultant
Sherrilyn Roush, philosophy, UCLA
January 31, 2022: Are probability distributions for potential responses real and necessary for causal inference?
Philip Dawid, statistics, Cambridge emeritus
Ilya Shpitser, computer science, Johns Hopkins
Thomas Richardson, statistics, Washington
Sylvia Wenmackers, philosophy, KU Leuven
February 7, 2022: Are the repeated sampling principle and Cournot’s principle frequentist?
Marshall Abrams, philosophy, Alabama Birmingham
Ruobin Gong, statistics, Rutgers
Alistair Wilson, philosophy, Birmingham UK
Harry Crane, statistics, Rutgers
February 14, 2022: Cournot’s principle and the best-system interpretation of probability.
Isaac Wilhelm, philosophy, National University of Singapore
Ryan Martin, statistics, NC State
Alan Hájek, philosophy, Australian National University
Snow Xueyin Zhang, philosophy, NYU (philosophy, Berkeley, as of 2023)
February 28, 2022: Objective probability at different levels of knowledge.
Alex Meehan, philosophy, Yale
Monique Jeanblanc, mathematics, Evry
Barry Loewer, philosophy, Rutgers
Tahir Choulli, mathematics, Alberta
March 7, 2022: Game-theoretic probability in physics.
Glenn Shafer, business and statistics, Rutgers
Dustin Lazarovici, philosophy, Lausanne
Leah Henderson, philosophy, Groningen
Eddy Keming Chen, philosophy, UC San Diego
March 21, 2022: Interpreting Carnap: Seven Decades Later, Sandy Zabell, Northwestern
April 4, 2022: Extended probabilities and their application to statistical inference, Michele Caprio, Duke
April 11, 2022: Two Birds with One Coin: Convex Optimization and Confidence Sequences with Coin-Betting, Francesco Orabona, Boston University
Video – link to YouTube video for this seminar
Jenann Ismael, philosophy, Columbia
Itzhak Gilboa, economics, Tel Aviv & HEC Paris
Stephen Senn, statistics, consultant
Sherrilyn Roush, philosophy, UCLA
In 1980, the philosopher David Lewis gave a name to the principle that you should adopt known objective probabilities as subjective probabilities. He called it the principal principle. Lewis’s principle is mentioned in 16 different articles in the Stanford Encyclopedia of Philosophy, and it has been cited 287 times in philosophy journals indexed by JSTOR. But it has never been cited in statistics journals indexed by JSTOR.
The exact formulation of Lewis’s principal principle is a topic of continuing discussion. Lewis revised his initial formulation in his 1980 paper and later replaced it with a new principle. Other philosophers, including several who have spoken in our seminar, have proposed variations or qualifications. At first glance, Typ’s reasoning seems relevant to the discussion. Is it?
Questions for the panel:
References
Video – link to YouTube video for this seminar
Philip Dawid, statistics, Cambridge emeritus
Ilya Shpitser, computer science, Johns Hopkins
Thomas Richardson, statistics, Washington
Sylvia Wenmackers, philosophy, KU Leuven
Many statistical models hypothesize a joint probability distribution for an individual’s potential responses to different treatments even when only one treatment can be given. These models include structural equation and functional models as well as the potential response model popularized by Don Rubin in the 1970s. For more than two decades, Phil Dawid has advocated the more parsimonious decision-theoretic approach, in which a probability distribution for the individual’s response is given conditional on each possible treatment but no joint distribution is hypothesized.
Joint distributions for potential responses allow us to work fully within the familiar framework of probability measures. Their disadvantage, Dawid argues, is the confusion introduced by meaningless assumptions (such as the “treatment-unit additivity” assumption) and meaningless quantities (such as the variance of the difference between an individual’s responses to two possible treatments). Discussion of confounding also becomes confusing in the context of probability distributions for unobservable and meaningless quantities. Dawid’s solution is to generalize the theory of conditional independence and DAGs from probability measures to decision models.
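A small worked illustration of the kind of quantity at issue (a textbook observation, not part of the panel materials): with potential responses Y(0) and Y(1), each unit reveals only one of the two, so the variance of the individual treatment effect involves a covariance that the data cannot identify.

```latex
% Variance of the individual treatment effect Y(1) - Y(0):
\operatorname{Var}\!\big(Y(1) - Y(0)\big)
= \operatorname{Var}\!\big(Y(1)\big) + \operatorname{Var}\!\big(Y(0)\big)
  - 2\operatorname{Cov}\!\big(Y(1), Y(0)\big).
% The two marginal variances are estimable from a randomized experiment;
% the covariance is not, since no unit ever reveals both Y(1) and Y(0).
```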
Dawid and other authors distinguish between studying the effects of causes (causal investigation as a guide to action) and studying the causes of effects (assignment of responsibility for outcomes already observed). In the latter case, Dawid is more open to “counterfactual” questions that the joint distribution of potential outcomes might help us answer, such as “given what happened when Mr. Defendant did A, what would likely have happened if he had done B instead.”
Questions for the panel:
References
Video – link to YouTube video for this seminar
Marshall Abrams, philosophy, Alabama Birmingham
Ruobin Gong, statistics, Rutgers
Alistair Wilson, philosophy, Birmingham UK
Harry Crane, statistics, Rutgers
Historically, the probability calculus began with repeated trials: throws of a fair die, etc. Independent and identically distributed (iid) random variables still populate elementary textbooks, but most statistical models permit variation and dependence in probabilities. Already in 1816, Pierre Simon Laplace explained that nature does not follow any constant probability law; its error laws vary with the nature of measurement instruments and with all the circumstances that accompany them. In 1960, Jerzy Neyman explained that scientific applications had moved into a phase of dynamic indeterminism, in which stochastic processes replace iid models.
Statisticians who call themselves “frequentists” have proposed two competing principles to support the inferences about parameters in stochastic processes and other complex probability models.
Cox and Hinkley coined the name “repeated sampling principle” in 1974. The name “Cournot’s principle” was first current in the 1950s. But both ideas are much older. When interpreted as pragmatic instructions, the two principles are more or less equivalent. But they can also be interpreted as philosophical justifications – even as explanations of the meaning of probability models – and then they seem very different.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Isaac Wilhelm, philosophy, National University of Singapore
Ryan Martin, statistics, NC State
Alan Hájek, philosophy, Australian National University
Snow Xueyin Zhang, philosophy, NYU (philosophy, Berkeley, as of 2023)
Following his scholastic predecessors, Jacob Bernoulli equated high probability with practical certainty. Theoretical and applied probabilists and statisticians have done the same ever since. In 1843, Antoine Augustin Cournot suggested that this is the only way that mathematical probability can be connected with phenomena. In 1910, Aleksandr Aleksandrovich Chuprov called the equation of high probability with practical certainty Cournot’s lemma. In the early 1940s, Émile Borel called it the only law of chance. In 1949, Maurice Fréchet called it Cournot’s principle.
The Bernoullian (non-Bayesian) statistician uses Cournot’s principle in three ways.
When the model is complex, only some events of high probability can happen. In fact, what does happen, described in detail, will have negligible or zero probability. (This is the fearsome lottery paradox.) So careful statements of Cournot’s principle always limit the events close to zero or one that are taken into consideration. Borel insisted that these events be “remarkable in some respect” and specified in advance. Richard von Mises, Abraham Wald, Jean Ville, and Alonzo Church, working in the idealization of an infinite number of trials, limited the events by requiring that they be definable in some language or computable in some sense. Andrei Kolmogorov brought this back to finite reality by giving a coherent definition of complexity (and simplicity) for elements of finite sets. The model predicts only those events of high probability that are simply described.
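For a sense of scale on the point about detailed descriptions (a standard illustration, not from the panel description): each fully specified outcome of a long sequence of fair coin tosses has vanishingly small probability, even though exactly one such outcome occurs.

```latex
% Any particular sequence s of 1000 fair coin tosses:
\Pr(s) = 2^{-1000} \approx 9.3 \times 10^{-302}.
```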
The best-system interpretation of objective probability goes back to the work of David Lewis in the 1990s. Lewis proposed that a system of probabilities should be considered objective if it provides the best description of our world, where “best” involves balancing simplicity, strength (how much is predicted), and fit (how high a probability is given to what happens). Critics have pointed out that this balance has remained nebulous.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Alex Meehan, philosophy, Yale
Monique Jeanblanc, mathematics, Evry
Barry Loewer, philosophy, Rutgers
Tahir Choulli, mathematics, Alberta
In his 1843 book on probability, Cournot argued that objective probabilities can be consistent with God’s omniscience. As he saw the matter, truly objective probabilities are the probabilities of a superior intelligence at the upper limit of what might be achieved by human-like intelligence. As he explained,
Surely the word chance designates not a substantial cause, but an idea: the idea of the combination of many systems of causes or facts that develop, each in its own series, each independently of the others. An intelligence superior to man would differ from man only in erring less often or not at all in the use of this idea. It would not be liable to consider series independent when they actually influence each other in the causal order; inversely, it would not imagine a dependence between causes that are actually independent. It would distinguish with greater reliability, or even with rigorous exactness, the part due to chance in the evolution of successive phenomena. . . . In a word, it would push farther and apply better the science of those mathematical relations, all tied to the idea of chance, that become laws of nature in the order of phenomena.
In the 1970s and 1980s, the consistency of probabilities at different levels of knowledge was studied within measure-theoretic probability by Paul-André Meyer’s Strasbourg seminar. One question studied was whether the semimartingale property is preserved when a filtration is enlarged. A discrete-time process S is a semimartingale with respect to a filtration F if it is the sum of two processes, say S = A + B, where A is predictable with respect to F (each of its values is determined by the information available one step earlier) and B is a martingale with respect to F.
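In symbols, a standard discrete-time statement of the decomposition just described:

```latex
% Doob-type decomposition of S with respect to the filtration F = (\mathcal{F}_n):
S_n = A_n + B_n,
\qquad A_n \text{ is } \mathcal{F}_{n-1}\text{-measurable (predictable)},
\qquad \mathbb{E}\!\left[B_n \mid \mathcal{F}_{n-1}\right] = B_{n-1} \text{ (martingale)}.
```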
An observer whose knowledge is represented by a larger filtration F* knows more and so can predict more, but if the remaining unpredictable part is still a martingale, then the probabilities with respect to F can be considered just as objective as those with respect to F*. In discrete time, the semimartingale property is always preserved, but in continuous time regularity conditions are needed.
The most extreme enlargement of any filtration is the filtration that knows the world’s entire trajectory at the outset. In Cournot’s picture, this would be God’s filtration. Every process is predictable and hence a semimartingale with respect to this filtration.
In recent decades, the enlargement of filtrations has been used to study insider trading and default risk in mathematical finance.
The philosophers Barry Loewer and David Albert, inspired largely by statistical mechanics, have proposed a picture roughly similar to Cournot’s, in which the superior intelligence is represented by “standard Lebesgue measure over the physically possible microstates” consistent with a description of the universe right after the Big Bang, and God is replaced by David Lewis’s Humean mosaic. Loewer has called this the “Mentaculus vision”.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Glenn Shafer, business and statistics, Rutgers
Dustin Lazarovici, philosophy, Lausanne
Leah Henderson, philosophy, Groningen
Eddy Keming Chen, philosophy, UC San Diego
In measure-theoretic probability, expected values (prices) may change through time, but the way in which they change is set in advance by a comprehensive probability measure. Game-theoretic probability generalizes this by introducing players who may take actions and set prices as a betting game proceeds. In the case of discrete time, this means that a probability tree is replaced by a decision tree. Statistical testing of the prices is still possible. But instead of testing a comprehensive probability measure by checking that a simple event of small probability specified in advance does not happen (Cournot’s principle), we test the prices, or the forecaster who sets them, by checking that a simple betting strategy specified in advance does not multiply its capital by a large factor. See Glenn Shafer’s “Testing by betting” (2021).
Because it provides explicitly for a decision maker, game-theoretic probability accommodates John von Neumann’s axioms for quantum mechanics in a straightforward way, as explained on pp. 189-191 of Shafer and Vovk’s 2001 book. As further explained on pp. 216-217 of their 2019 book, it also allows a programmer to test whether a quantum computer is performing correctly.
Gurevich and Vovk have argued that the game-theoretic formulation helps take the mystery out of the role of negative probabilities in Wigner’s quasi-probability distribution. The mystery dissolves because the distribution is nothing more than a rule that tells a forecaster how to set prices.
Advocates of Cournot’s principle in statistical mechanics have pointed out that the Gibbs distribution is needed only to rule out events to which it gives probability near zero (see, e.g., Goldstein et al. 2020). Any measure equivalent in the sense of absolute continuity would do the same job; the Gibbs distribution is distinguished from the others mainly by its implausible assumption of particles’ initial independence of each other. Ken Huira (2021) has argued that the game-theoretic formulation can again provide clarification. The distribution does not “generate” outcomes; it merely sets prices.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Abstract: In 1950 and 1952 Rudolf Carnap published his Logical Foundations of Probability and The Continuum of Inductive Methods, setting out his views on probability and inductive inference. Given his prestige, it is not surprising that these together sparked both interest and controversy. This talk surveys that discussion and debate: what criticisms were advanced, how did Carnap's views evolve, how has his program been advanced? A primary thesis of this talk is that many of these issues can be most readily understood from the standpoint of modern probability, rather than Carnap's initial logical and sentential approach.
Video – link to YouTube video for this seminar
Abstract: We propose a new, more general definition of extended probability measures. We study their properties and provide a behavioral interpretation. We use them in an inference procedure, whose environment is canonically represented by the probability space (Ω,F,P), when both P and the composition of Ω are unknown. We develop an ex-ante analysis – taking place before the statistical analysis requiring knowledge of Ω – in which we progressively learn the true composition of Ω. We provide an example in the field of ecology.
Video – link to YouTube video for this seminar
Abstract: Consider the following two problems. The first one is calculating valid and numerically sharp confidence sequences for the unknown expectation of a bounded random variable. The second problem is optimizing an arbitrary convex function with the smallest number of accesses to its stochastic gradients. Surprisingly, we will show that both problems can be solved through a reduction to a simple gambling game: betting money on the outcomes of a coin.
First, we will explain that the problem of betting money on a coin can be solved with optimal algorithms from the universal compression/gambling literature. These algorithms guarantee an exponential growth rate of the wealth of the gambler, even without stochastic assumptions on the coin. In turn, this exponential wealth will allow us to design a reduction to obtain state-of-the-art valid confidence sequences for the expectation of bounded random variables. In particular, our confidence sequences are never vacuous, even with a single sample. Moreover, another reduction will allow us to convert the same betting algorithm into an optimal online convex optimization algorithm.
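As a toy illustration of the betting mechanics described above (a sketch under simplifying assumptions; the betting fraction below is a simple placeholder, not the universal-gambling algorithms from the talk): wealth accumulated by betting on coin outcomes with a fraction that depends only on past data is a nonnegative martingale under the null, so by Ville's inequality a large wealth is evidence against the hypothesized bias. Inverting such tests over candidate biases is what yields a confidence sequence.

```python
import numpy as np

def betting_wealth(xs, p0=0.5):
    """Test H0: coin bias = p0 by betting; returns the wealth path.
    The betting fraction uses only past outcomes, so under H0 the wealth
    is a nonnegative martingale: P(sup wealth >= 1/alpha) <= alpha (Ville)."""
    history, w, wealth = [], 1.0, [1.0]
    m = max(p0, 1.0 - p0)
    for x in xs:                                   # x in {0, 1}
        z_past = sum(h - p0 for h in history)
        lam = float(np.clip(z_past / (len(history) + 1), -1.0, 1.0))
        w *= 1.0 + lam * (x - p0) / m              # fair bet under H0, stays >= 0
        wealth.append(w)
        history.append(x)
    return np.array(wealth)

rng = np.random.default_rng(1)
xs = rng.binomial(1, 0.7, size=500)                # data from a biased coin
wealth_path = betting_wealth(xs, p0=0.5)
print(wealth_path[-1], wealth_path.max() >= 20)    # wealth >= 20 rejects at level 0.05
```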
Emphasis will be given to the history of these ideas in the fields of information theory, game-theoretic probability, and online learning.
September 20, 2021: Descriptive probability, Glenn Shafer, Rutgers
October 4, 2021: Pseudochance vs. true chance in complex systems, Marshall Abrams, University of Alabama Birmingham
October 18, 2021: From domain-specific probability models to evaluation of model-based probability forecasts, Tze Leung Lai, Stanford
November 1, 2021: Borel and Bertrand, Snow Zhang, New York University
November 8, 2021: Measuring severity in statistical inference, Ruobin Gong, Rutgers
November 15, 2021: The Typical Principle, Isaac Wilhelm, National University of Singapore
November 22, 2021: A framework for non-fundamental chance, Alexander Meehan, Yale
November 29, 2021: ALL-IN meta-analysis, Judith ter Schure, University of Leiden
December 6, 2021: Exploitability and accuracy, Kevin Blackwell, Bristol
Abstract: As the late Berkeley statisticians Leo Breiman (1928-2005) and David Freedman (1938-2008) taught us, most statistical models in the social sciences, especially regression models, are invalid. Their conclusions are at best descriptive.
Some sociologists and other applied statisticians have proposed treating these dubious regression studies as purely descriptive, refraining from significance tests and confidence statements. This is seldom done, because people want some sense of the precision of the regression parameters.
Descriptive intervals for parameter values can be obtained if we imagine parameter values betting against each other using Kelly betting. This allows us to interpret relative likelihood as relative predictive success. See the working paper at http://probabilityandfinance.com/articles/59.pdf.
Abstract: Pseudorandom number generating algorithms play a variety of roles in scientific practice and raise a number of philosophical issues, but they have received little attention from philosophers of science. In this talk I focus on pseudorandom number generating algorithm implementations (PRNGs) in simulations used to model natural processes such as evolving biological populations. I argue that successful practices involving such simulations, and reflection on the modeled processes, provide reasons to think that many natural processes involve what I call “pseudochance”, which is an analogue of chance or objective probability, and which is what is realized by PRNGs. Pseudochance contrasts with what I call “true chance,” the kind of objective probability that many objective interpretations of probability claim to describe. On my view, when philosophers speak of “chance” or “objective probability,” they have probably intended the term to refer to true chance, but have often applied it to systems that, I would argue, plausibly exhibit only pseudochance.
Abstract: This talk is based on a related survey/position paper on domain-specific probability definitions and stochastic models, which have been in the probability/statistics literature from 1933 to today (and in other disciplines). It concludes with the martingale approach to the evaluation of model-based probability forecasts and gives some empirical examples.
Abstract: The Borel-Kolmogorov paradox is the phenomenon that, sometimes, when P(E)=0, the probability of H given E appears to depend on the specification of the sigma algebra from which E is drawn. One popular diagnosis of the paradox is that it reveals a surprising fact about conditional probability: when P(E)=0, the value of P(H|E) depends on the choice of a sigma algebra. As Kolmogorov himself put it, "[the paradox shows that] the concept of a conditional probability with regard to an isolated given hypothesis whose probability equals 0 is inadmissible" (p.51, 1956).
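The classic concrete instance (a standard textbook example, not drawn from the talk): condition a uniformly distributed point on the sphere on its lying on a particular great circle. Treated as the equator (a limit of thin bands), the circle carries a uniform conditional distribution; treated as a meridian (a limit of thin lunes), it does not.

```latex
% Uniform point on the unit sphere, conditioned on a great circle.
% Conditioning on the equator (latitude = 0) gives a uniform distribution over longitude.
% Conditioning on a meridian (longitude fixed) gives latitude density
f(\phi) \;=\; \tfrac{1}{2}\cos\phi, \qquad \phi \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right],
% which is not uniform: the "same" null event is treated differently depending on
% the family of conditioning events (the sigma-algebra) it is referred to.
```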
This talk has two parts. The negative part raises some problems for the Kolmogorov-inspired relativistic conception of rational conditional probability. The positive part proposes an alternative diagnosis of the paradox: it is an instance of a more familiar problem — the problem of (conditional) priors. I conclude by applying my diagnosis to a different debate in the foundations of probability: the issue of countable additivity.
Abstract: Severity (Mayo, 2018) is a principle of statistical hypothesis testing. It assesses the hypothesis test in relation to the claim it makes, and the data on which the claim is based. Specifically, a claim C passes a severe test with the data at hand, if with high probability the test would have found flaws with C if present, and yet it does not.
In this talk, I discuss how the concept of severity can be extended beyond frequentist hypothesis testing to general statistical inference tasks. Reflecting the Popperian notion of falsifiability, severity seeks to establish a stochastic version of modus tollens, as a measure of strength of probabilistic inference. Severity measures the extent to which the inference resulting from an inferential strategy is warranted, in relation to the body of evidence at hand. If the current available evidence leads a method to infer something about the world, then were it not the case, would the method still have inferred it? I discuss the formulation of severity and its properties, and demonstrate its assessment and interpretation in examples that follow either the frequentist or Bayesian traditions as well as beyond. A connection with significance function (Fraser, 1991) and confidence distribution (Xie & Singh, 2013) is drawn. These tools and connections may enable the assessment of severity in a wide range of modern applications that call for evidence-based scientific decision making.
Video – link to YouTube video for this seminar
Abstract: If a proposition is typically true, then so long as you have no evidence to the contrary, you should believe that proposition; or so I argue here. In this paper, I propose and defend a principle of rationality—call it the `Typical Principle'—which links rational belief to facts about what is typical. As I show, this principle avoids several problems that other, seemingly similar principles face. And as I show, in many cases, this principle implies the verdicts of the Principal Principle: much of what the Principal Principle says about rational credence, in other words, follows from the Typical Principle.
Abstract: Many authors have sought to make sense of special science probabilities as objective. An appeal of this approach is that it could provide a straightforward realist account of the success of chance models in actual science. But the view faces several challenges and questions, including (1) what unites these various special science probabilities as objective chances?, and (2) how autonomous are they from the fundamental dynamical chances? In this talk I propose a general framework for theorizing about chance that lets us make some headway on questions (1) and (2), without presuming a special metaphysical or physical commitment like the Humean Best Systems Analysis or Mentaculus. A key ingredient of the framework is the Parent Principle, which reduces to the Principal Principle and New Principle in relevant special cases, and induces coherence constraints between chances at different levels. Indeed I show that if we accept the Parent Principle together with standard constraints on rational credence, then a Mentaculus-type view already follows from two relatively straightforward assumptions about our world’s chances.
Abstract: The game-theoretic approach to statistical inference has major philosophical and mathematical advantages. Is it useful in statistical practice as well? This talk covers its use in the design of clinical trials that aim to contribute to an ALL-IN meta-analysis, for Anytime, Live and Leading Interim meta-analysis. We deal with practical problems in clinical trials, e.g. recruitment issues and lack of power, and reflect on ongoing debates in clinical trials and meta-analysis, e.g. about representativeness and random-effects modelling. We conclude that game-theoretic statistical inference has much to offer especially since it clarifies our goals and assumptions in clinical trial research. See https://arxiv.org/abs/2109.12141 for more information.
Abstract: I’ll begin with a very basic, very fast summary of the approach for assessing the accuracy of sets of desirable gambles being developed by Jason Konek; the notions of accuracy that I will discuss in this talk are very much in the same vein, although I will flag some important differences. (This introduction will also include an even briefer introduction to sets of desirable gambles, for anyone who isn’t familiar with them.)
Next, I motivate an interpretation of the “Falsity” score: how exploitable (by better-informed agents) does an agent’s credal state render them?
I will then go through the assumptions and results of an approach to accuracy that I termed “Select-A-Size Accuracy”, which for two-dimensional sets of desirable gambles (gambles on a binary partition), provides exactly what we want: a family of accuracy scores such that (1) according to every member of this family, every incoherent set of desirable gambles is accuracy-dominated; and (2) for every coherent set of desirable gambles, there is some element of this family which renders that set of desirable gambles not merely accuracy-undominated but Imprecisely Immodest.
…But this approach doesn’t generalize to higher dimensions; I briefly discuss why that is.
Finally, I’ll present some early developments of a game-theoretic approach to accuracy which is closely related to the exploitability notion used in constructing Select-A-Size Accuracy. I will also gesture in the direction of, but not really discuss, another development of this same exploitability notion: Arthur Van Camp’s accuracy order.