Many disciplines use probability theory, including mathematical statistics, philosophy, and physics. The goal of this seminar, which began at Rutgers University in 2016, was to help people in these disciplines learn from each other and from the history of probability and its applications.
The Seminar on History and Foundations of Probability and Statistics is now inactive. This page is an archive of the former website "foundationsofprobabilityseminar.com" which documented schedules and abstracts from Fall 2021 to Spring 2024. Previous seminars from 2016 through Spring 2021 are archived at http://www.harrycrane.com/seminar.html by Harry Crane.
See below for a menu by semester, with links to each semester's list of dates, titles, and authors, and further links to abstracts, all on this webpage. Some seminars also have links to YouTube videos.
The most recent organizers of the seminar were Xueyin Zhang (Berkeley, Philosophy), Glenn Shafer (Rutgers, Business and Statistics), Barry Loewer (Rutgers, Philosophy), Virgil Murthy (CMU, Philosophy), and Tripp Roberts (Rice, Statistics). Previous organizers include Harry Crane, Ruobin Gong, Eddy Keming Chen, Isaac Wilhelm, and Dimitris Tsementzis.
Videos for some seminars are available on the Foundations of Probability Panel Discussions YouTube channel.
(See the links below for a listing of seminars by date, with title and author, and a link to abstracts.)
(Spring 2022 seminars include six panel discussions and three non-panel seminars.)
Feb. 15, 2024: Good Guesses: The Conjunction Fallacy and the Tradeoff between Accuracy and Informativity, Kevin Dorst, MIT
Feb. 29, 2024: Probing the qualitative-quantitative divide in probability logics, Krzysztof Mierzewski, CMU
May 2, 2024: Updating by maximizing expected accuracy in infinite non-partitional settings, Kenny Easwaran, UC Irvine
Abstract: The conjunction fallacy is the well-documented empirical finding that subjects sometimes rate a conjunction A&B as more probable than one of its conjuncts, A. Most explanations appeal in some way to the fact that B has a high probability. But Tentori et al. (2013) have recently challenged such approaches, reporting experiments which find that (1) when B is confirmed by relevant evidence despite having low probability, the fallacy is common, and (2) when B has a high probability but has not been confirmed by relevant evidence, the fallacy is less common. They conclude that degree of confirmation, rather than probability, is the central determinant of the conjunction fallacy. In this paper, we address a confound in these experiments: Tentori et al. (2013) failed to control for the fact that their (1)-situations make B conversationally relevant, while their (2)-situations do not. Hence their results are consistent with the hypothesis that conversationally relevant high probability is an important driver of the conjunction fallacy. Inspired by recent theoretical work that appeals to conversational relevance to explain the conjunction fallacy, we report on two experiments that control for this issue by making B relevant without changing its degree of probability or confirmation. We find that doing so increases the rate of the fallacy in (2)-situations, leading to fallacy rates comparable to those in (1)-situations. This suggests that (non-probabilistic) conversational relevance indeed plays a role in the conjunction fallacy, and paves the way toward further work on the interplay between relevance and confirmation.
Abstract: Several notable approaches to probability, going back at least to Keynes (1921), de Finetti (1937), and Koopman (1940), assign a special importance to qualitative, comparative judgments of probability ("event A is at least as probable as event B"). The difference between qualitative and explicitly quantitative probabilistic reasoning is intuitive, and one can readily identify paradigmatic instances of each. It is less clear, however, whether there are any natural structural features that track the difference between inference involving comparative probability judgments on the one hand, and explicitly numerical probabilistic reasoning on the other. Are there any salient dividing lines that can help us understand the relationship between the two, as well as classify intermediate forms of inference lying in between the two extremes? In this talk, based on joint work with Duligur Ibeling, Thomas Icard, and Milan Mossé, I will explore this question from the perspective of probability logics.
Probability logics can represent probabilistic reasoning at different levels of grain, ranging from the more "qualitative" logic of purely comparative probability to explicitly "quantitative" languages involving arbitrary polynomials over probability terms. As I will explain, when classifying these systems in terms of expressivity, computational complexity, and axiomatisation, what emerges as a robust dividing line is the distinction between systems that encode merely additive reasoning from those that encode additive and multiplicative reasoning. I will show that this distinction tracks a divide in computational complexity (NP-complete vs. ETR-complete) and in the kind of algebraic tools needed for a complete axiomatisation (hyperplane separation theorems vs. real algebraic geometry). I will present new completeness results and a result on the non-finite-axiomatisability of comparative probability, and I will conclude with some overlooked issues concerning the axiomatisation of comparative conditional probability. One lesson from this investigation is that, for the multiplicative probability logics as well as the additive ones, the paradigmatically "qualitative" systems are neither simpler in terms of computational complexity nor in terms of axiomatisation, while losing in expressive power to their explicitly numerical counterparts.
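For orientation, here is a schematic illustration (not drawn from the talk) of the levels of grain the abstract describes: a purely comparative formula, an additive (linear) inequality over probability terms, and a multiplicative (polynomial) constraint of the kind needed to express independence.

```latex
% Illustrative formulas at three levels of grain (schematic):
\text{comparative:}\quad A \succeq B \quad (\text{``}A\text{ is at least as probable as }B\text{''}),
\qquad
\text{additive:}\quad P(A) + P(B) \ge 1,
\qquad
\text{multiplicative:}\quad P(A \cap B) = P(A)\cdot P(B).
```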
Abstract: Greaves and Wallace (2006) justify Bayesian conditionalization as the update plan that maximizes expected accuracy, for an agent considering finitely many possibilities, who is about to undergo a learning event where the potential propositions that she might learn form a partition. In recent years, several philosophers have generalized this argument to less idealized circumstances. Some authors (Easwaran (2013b); Nielsen (2022)) relax finiteness, while others (Carr (2021); Gallow (2021); Isaacs and Russell (2022); Schultheis (2023)) relax partitionality. In this paper, we show how to do both at once. We give novel philosophical justifications of the use of σ-algebras in the infinite setting, and argue for a different interpretation of the “signals” in the non-partitional setting. We show that the resulting update plan mitigates some problems that arise when only relaxing finiteness, but not partitionality, such as the Borel-Kolmogorov paradox.
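As a rough reminder of the finite, partitional framework being generalized (a schematic gloss, not the paper's own notation): an update plan assigns a posterior to each cell of the evidence partition, and Greaves and Wallace show that, for a strictly proper accuracy measure, expected accuracy is maximized by the conditionalization plan.

```latex
% Schematic Greaves-Wallace setup: update plan f assigns a credence function f_E
% to each cell E of the evidence partition \mathcal{E}; its expected accuracy is
\mathrm{EA}(f) \;=\; \sum_{E \in \mathcal{E}} \sum_{w \in E} P(w)\,\mathcal{A}(f_E, w),
% which, for strictly proper \mathcal{A}, is maximized by f_E = P(\cdot \mid E).
```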
Sept. 21, 2023: The Flawed Genius of William Playfair, David Bellhouse, Western University
Sept. 28, 2023: Totality, Regularity, and Cardinality in Probability Theory, Paolo Mancosu, Berkeley, Philosophy
Oct. 5, 2023: Probabilistic Systems and the Stochastic-Quantum Theorem, Jacob Barandes, Harvard
Oct. 19, 2023: One Way Testing by Betting Can Improve Data Analysis: Optional Continuation, Glenn Shafer, Rutgers
Nov. 9, 2023: Chance Combinatorics, John Norton, Pittsburgh
Abstract: William Playfair worked variously as a statistician, economist, engineer, banker, scam artist, and political propagandist, among other activities. He began with much promise working for Boulton and Watt of steam engine fame, saw many ups and downs throughout his career that ranged from being moderately well off to bankruptcy and imprisonment, and ended his life in poverty. His flaw was his inability to deal with his personal and professional finances. His lasting contributions are in statistics, where he is known as the father of statistical graphics, and in economics, where he is known for his contributions to the first posthumous edition of Adam Smith’s Wealth of Nations.
In this talk, I will focus on Playfair’s graphs, which were motivated by political and economic issues of his day. I will also interweave biographical information with a discussion of the graphs. Playfair invented time series line graphs, as well as the now ubiquitous bar chart and the pie chart. He also introduced innovative ways to display multivariate data. Many of the graphical elements that Playfair used are now standard in statistical graphics. For example, Playfair pioneered the use of colour in his graphs at a time when colour was rarely used in the printing process. I have left out spying for the British government as one of Playfair’s activities; I will address this claim as part of the talk.
Abstract: A probability space is given by a triple (Ω, 𝔍, P) where Ω is a set called the sample space, 𝔍 is a σ-algebra of subsets of Ω, and P is a probability function from 𝔍 to the interval [0, 1]. The standard Kolmogorovian approach to probability theory on infinite sample spaces is neither regular nor total. Totality, expressed set-theoretically, is the request that every subset of the sample space Ω is measurable, i.e. has a probability value. Regularity, expressed set-theoretically, is the request that only the empty set gets probability 0. Mathematical and philosophical interest in non-Kolmogorovian approaches to probability theory in the last decade has been motivated by the possibility to satisfy totality and regularity in non-Archimedean contexts (Wenmackers and Horsten 2013, Benci, Horsten, Wenmackers 2018). Much of the mathematical discussion has been focused on the cardinalities of the sample space, the algebra of events, and the range (Hájek 2011, Pruss 2013, Hofweber 2014). In this talk I will present some new results characterizing the relation between completeness and regularity in a variety of probabilistic settings and I will give necessary and sufficient conditions relating regularity and the cardinalities of the sample space, the algebra of events, and the range of the probability function, thereby improving on the results hitherto available in the literature. This is joint work with Guillaume Massas (Group in Logic, UC Berkeley).
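Restated in symbols (the same definitions given in the abstract above):

```latex
% For a probability space (\Omega, \mathfrak{J}, P):
\text{Totality:}\quad \mathfrak{J} = \mathcal{P}(\Omega)
\quad\text{(every subset of } \Omega \text{ has a probability value)};
\qquad
\text{Regularity:}\quad \forall A \in \mathfrak{J}:\; P(A) = 0 \iff A = \emptyset.
```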
Bibliography
Benci, V., Horsten, L., and Wenmackers, S. (2018), “Infinitesimal Probabilities”, The British Journal for the Philosophy of Science, 69, 509–552.
Hájek, A. (2011), “Staying Regular?”, unpublished typescript.
Hofweber, T. (2014), “Cardinality arguments against regular probability measures”, Thought, 3, 166–175.
Pruss, A. R. (2013), “Probability, regularity, and cardinality”, Philosophy of Science, 80, 231–240.
Wenmackers, S., and Horsten, L. (2013), “Fair infinite lotteries”, Synthese, 190, 37–61.
Abstract: On the one hand, scientists across disciplines act as though various kinds of phenomena physically occur, perhaps according to probabilistic laws. On the other hand, textbook quantum theory is an instrumentalist recipe whose only predictions refer to measurement outcomes, measurement-outcome probabilities, and statistical averages of measurement outcomes over measurement-outcome probabilities. The conceptual gap between these two pictures makes it difficult to see how to reconcile them.
In this talk, I will present a new theorem that establishes a precise equivalence between quantum theory and a highly general class of stochastic processes, called generalized stochastic systems, that are defined on configuration spaces rather than on Hilbert spaces. From a foundational perspective, some of the mysterious features of quantum theory – including Hilbert spaces over the complex numbers, linear-unitary time evolution, the Born rule, interference, and noncommutativity – then become the output of a theorem based on the simpler and more transparent premises of ordinary probability theory. From a somewhat more practical perspective, the stochastic-quantum theorem leads to a new formulation of quantum theory, alongside the Hilbert-space, path-integral, and phase-space formulations, potentially opens up new methods for using quantum computers to simulate stochastic processes beyond the Markov approximation, and may have implications for how we think about quantum gravity.
This talk is based on two papers:
1. https://arxiv.org/abs/2309.03085 / http://philsci-archive.pitt.edu/22502
2. https://arxiv.org/abs/2302.10778 / http://philsci-archive.pitt.edu/21774
Abstract: When testing a statistical hypothesis, is it legitimate to deliberate on the basis of initial data about whether and how to collect and analyze further data? My 2019 book with Vladimir Vovk, Game-Theoretic Foundations for Probability and Finance, says YES, provided that you are testing by betting and do not risk more capital than initially committed. Standard statistical theory does not allow such optional continuation. Related paper: https://arxiv.org/abs/2308.14959
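A minimal sketch of why optional continuation is legitimate in this framework (standard facts about betting scores, not a summary of the talk): if each stage of the analysis yields a nonnegative betting score whose conditional expectation under the hypothesis is at most 1, scores from successive stages may be multiplied, and the probability that the running product ever gets large is controlled.

```latex
% Betting scores E_1, E_2, ... with \mathbb{E}_H[E_i \mid \text{past}] \le 1 under hypothesis H.
% Optional continuation: the running product is itself a betting score, and
E_{(n)} = \prod_{i=1}^{n} E_i, \qquad
\mathbb{E}_H\!\left[E_{(n)}\right] \le 1, \qquad
\Pr_H\!\left(\sup_n E_{(n)} \ge 1/\alpha\right) \le \alpha .
```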
Abstract: Seventeenth century “chance combinatorics” was a self-contained theory. It had an objective notion of chance derived from physical devices with chance properties, such as die casts, combinatorics to count chances and, to interpret their significance, a rule for converting these counts into fair wagers. It lacked a notion of chance as a measure of belief, a precise way to connect chance counts with frequencies and a way to compare chances across different games. These omissions were not needed for the theory’s interpretation of chance counts: determining which are fair wagers. The theory provided a model for how indefinitenesses could be treated with mathematical precision in a special case and stimulated efforts to seek a broader theory.
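As a concrete illustration of counting chances and converting the count into a fair wager (a textbook example, not taken from the talk):

```latex
% Two fair dice: 36 equally possible chances, 6 of which give a sum of 7.
\Pr(\text{sum} = 7) \;=\; \frac{6}{36} \;=\; \frac{1}{6},
\qquad \text{so a fair wager on a sum of 7 pays 5 to 1.}
```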
Prof. Norton is a Distinguished Professor in the Department of History and Philosophy of Science at the University of Pittsburgh, where he works on the history and philosophy of physics and probability. He is well known for his analysis of Einstein's notebooks, his theory of thought experiments, and the Norton dome, which exhibits indeterminism in Newtonian mechanics. His well-reviewed recent book The Material Theory of Induction was the inaugural volume in the BSPS Open book series.
January 23, 2023: Inference, Optimal Stopping and New Hypotheses, Boris Babic, University of Toronto
February 27, 2023: Symmetry of Value, Zachary Goodsell, USC
March 20, 2023: Von Mises, Popper, and the Cournot Principle, Tessa Murthy, Carnegie Mellon University
March 27, 2023: Probability from Symmetry, Ezra Rubenstein, Berkeley
April 3, 2023: Why the Concept of Statistical Inference Is Incoherent, and Why We Still Love It, Michael Acree, Senior Statistician (Retired), University of California, San Francisco
April 17, 2023: Parity, Probability and Approximate Difference, Kit Fine, NYU
April 24, 2023: Algorithmic Randomness and Probabilistic Laws, (with Jeffrey A. Barrett, UC Irvine) Eddy Keming Chen, UC San Diego
May 1, 2023: Entropy and Subjectivism, Anubav Vasudevan, Chicago
Abstract (joint work with Anil Gaba, Ilia Tsetlin, and Robert L. Winkler):
In this project we address the Bayesian problem of new hypotheses and attempt to reframe normative constraints for rational inference in situations characterized by substantial unawareness about the possible outcomes. In particular, we first argue that we can meaningfully distinguish two types of learning scenarios: problem framing and problem solving. Problem solving is the sort of thing we do when we have quite a bit of information about a problem—enough to identify the relevant outcomes, and sometimes to even put some meaningful prior probabilities on them. Problem framing is what happens when we encounter an issue for more or less the first time. For example, one steps into an organic chemistry class without any background knowledge of the underlying subject matter. In cases like these, it's unreasonable to expect such an agent to be aware of what she might learn, let alone to place probabilities on possible learning outcomes.
Problem framing, we will suggest, is the "hard" problem of learning. Problem solving is "easy." And while Bayesianism (or, more specifically, traditional Bayesian confirmation theory in philosophy of science) is often pitched as a theory of rational learning, it is a persuasive normative theory of problem solving only. For framing problems, we need to look elsewhere. We will propose a slightly different set of normative criteria by drawing on principles of optimal stopping. Instead of having an agent do something they are not in a position to do (place probabilities on unanticipated hypotheses), we will have them focus on more tractable aspects of the learning situation (e.g., evaluate the importance of the problem, the upside if it proves fruitful, the downside if it is a waste of time). This will yield different normative criteria for rational inference. We will also consider whether this is better thought of as a non-Bayesian approach to learning, or just a reframing of the traditional Bayesian paradigm.
Abstract: Expected utility theory problematically falls silent on the comparison of some prospects whose possible payoffs have unbounded utility. However, natural ways to extend expected utility theory can quickly lead to inconsistency. This talk considers the effect of the affine invariance principle, which says that an affine transformation of the utility of outcomes preserves the ordering of prospects. The affine invariance principle is shown to have surprising consequences in contexts where utility is unbounded, but is also shown to be consistent with some natural extensions of expected utility theory to settings with unbounded utility. Some philosophical motivations for accepting the affine invariance principle are also considered.
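In symbols, a schematic rendering of the principle as described above (not the talk's own formalism): an increasing affine rescaling of the utilities of outcomes should leave the ordering of prospects unchanged.

```latex
% Affine invariance (schematic), for prospects X, Y and outcome utilities u:
u'(o) = a\,u(o) + b \;\; (a > 0)
\quad\Longrightarrow\quad
\big(X \succeq Y \text{ under } u'\big) \iff \big(X \succeq Y \text{ under } u\big).
```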
Abstract: Adherents to non-frequentist metaphysics of probability have often claimed that the axioms of frequentist theories can be derived as theorems of alternate characterizations of probability by means of statistical convergence theorems. One representative such attempt is Karl Popper's "bridge" between propensities and frequencies, discussed in his books Logic of Scientific Discovery and Realism and the Aim of Science. What makes Popper's argument particularly interesting is that he seemed much more sympathetic to the frequentists than the other theorists who promoted similar claims. In particular, while Richard von Mises criticized the use of the Law of Large Numbers in interderivability arguments, Popper was clearly aware of this worry—he even joined von Mises in criticizing his contemporary Fréchet's more heavy-handed application of it. What Popper thought set his version of the bridge argument apart was his use of the almost-sure strong law in place of the weak law. The SLLN has a measure-zero exclusion clause, which Popper claimed could be unproblematically interpreted as probability zero. While in other contexts he agreed with von Mises that taking low propensity to entail low frequency requires frequentist assumptions, the zero case, according to Popper, was special. To defend this claim, he relied on a contextually odd application of the Cournot principle.
In this project I investigate two related, understudied elements of the Popper/von Mises dispute. First, I provide an explanation of the mutual misunderstanding between von Mises and Popper about the admissibility of the claim that measure zero sets are probability zero sets. Popper takes von Mises as levying the criticism that claims along the lines that "low propensity means common in a long sequence of trials" are inaccurate (a claim von Mises elsewhere makes) when in fact von Mises is instead concerned that such claims are circular or fundamentally frequentist. This explains, but does not entirely justify, Popper's appeal to the Cournot principle. Second, I relay the worry that the use of convergence theorems in the context of propensities requires more auxiliary information than similar uses in frequentist theories. The SLLN, for example, requires that subsequent trials satisfy independence conditions (usually i.i.d.). I provide a charitable interpretation of Popper's project that better justifies that experimental iterations satisfy the antecedent of the SLLN, and also makes sense of Popper's references to Doob's impossibility theorem. I conclude by reflecting that though this historical investigation paints Popper's approach as more statistically informed than is commonly thought, it does not get him entirely out of trouble.
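For reference, the strong law at issue, in its standard form with the "measure-zero exclusion clause" made explicit (a textbook statement, not part of the abstract):

```latex
% Strong law of large numbers for i.i.d. X_1, X_2, ... with \mathbb{E}[X_1] = \mu finite:
\Pr\!\left(\lim_{n\to\infty} \frac{1}{n}\sum_{i=1}^{n} X_i = \mu\right) = 1,
% i.e. the exceptional set of outcome sequences has measure zero.
```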
Abstract: At the heart of the ‘classical’ approach to probability is the idea that the probability of a proposition quantifies the proportion of possible worlds in which it is true. This idea has a proud history, but it has fallen on hard times. I aim to rejuvenate it. Firstly, I show how the metaphysics of quantity together with the physical laws might yield well-defined relative sizes for sets of physically possible worlds. Secondly, I argue that these relative sizes are apt to reconcile the credence-guiding and frequency-explaining roles of probability.
Abstract: By the 17th century the Italian insurance industry was coming to use the word probability to refer to informal assessments of risk, and Pascal and Fermat soon began the calculus of gambling. But the latter never used the word probability, framing their inquiries instead in terms of expectation—the focus of insurance. At the end of the 17th century Bernoulli tried to bring these two lines together. A revolution in the historiography of probability occurred in 1978 with a paper by Shafer. Virtually all attention to Bernoulli’s Ars Conjectandi had focused only on the closing pages, where he famously proved the weak law of large numbers; Shafer was the first to notice, at least since the 18th century, Bernoulli’s struggle to integrate the two concepts. He also attributes the success of Bernoulli’s dualistic concept to Bernoulli’s widow and son having withheld publication for 8 years following his death, and to eulogies having created the impression that Bernoulli had succeeded in his ambition of applying the calculus of chances to “civil, moral, and economic matters.” Lambert improved Bernoulli’s formula for the combination of probabilities 50 years later, but did not address the question of a metric for epistemic probability, or the meaningfulness of combining them with chances. I suggest that no such meaningful combination is possible. But Bernoulli’s attempted integration promised that all uncertainty could be quantified, and the promise was philosophical heroin to the world of Hume. Laplace’s Rule of Succession was of no use to scientists, but was beloved of philosophers for over a century. Criticisms of it on metaphysical grounds led to Fisher’s approach, 150 years later, based on intervening developments in astronomy. His theory still desperately straddled the tension between aleatory and epistemic probabilities. Jerzy Neyman set about to purify Fisher’s theory of the epistemic elements, and was led to scrap all reference to inference, leaving him with a theory of statistical decision making, with principal application to quality control in manufacturing. Bayesians, meanwhile, were eager to retain epistemic reference, but wanted epistemic probabilities to be measured on the same scale as aleatory probabilities. That left them with, among other problems, the inability to represent ignorance meaningfully. Efforts to adhere consistently to the requirements of epistemic probability impel us to abandon reference to statistics, as efforts to adhere to the requirements of an aleatory conception impel us to abandon reference to inference. Statistics and inference, I conclude, have really nothing more to do with each other than do ethics and trigonometry.
Abstract: I present a theory of parity or imprecision in which it is represented by an approximate difference in the value of two items or an approximate ratio in the credence of two propositions.
Abstract: We consider two ways one might use algorithmic randomness to characterize a probabilistic law. The first is a generative chance* law. Such laws involve a nonstandard notion of chance. The second is a probabilistic* constraining law. Such laws impose relative frequency and randomness constraints that every physically possible world must satisfy. While each notion has virtues, we argue that the latter has advantages over the former. It supports a unified governing account of non-Humean laws and provides independently motivated solutions to issues in the Humean best-system account. On both notions, we have a much tighter connection between probabilistic laws and their corresponding sets of possible worlds. Certain histories permitted by traditional probabilistic laws are ruled out as physically impossible. As a result, such laws avoid one variety of empirical underdetermination, but the approach reveals other varieties of underdetermination that are typically overlooked.
Paper link: https://arxiv.org/abs/2303.01411
Abstract: Shafer (1985) issued a stark warning to subjective Bayesians against the dangers of unqualified appeals to the rule of conditionalization outside the context of a well-defined information protocol. In this talk, I will explain how the failure to heed Shafer's warning has led to equally unjustified appeals to other, more general rules for probabilistic updating – in particular, the rule of relative entropy maximization. I will explain how certain puzzles related to maximum entropy reasoning – such as Shimony's puzzle and the Judy Benjamin Problem – can be viewed as generalized versions of the problems facing conditionalization that prompted Shafer's initial critique. I will conclude the talk by extracting from these examples some general philosophical lessons relating to subjective interpretations of probability.
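For orientation, a standard formulation of the updating rule mentioned above (not specific to the talk): updating by relative entropy maximization moves from a prior P to the distribution in the constraint set that minimizes the Kullback-Leibler divergence from P; when the constraint is simply that an event E is certain, this reduces to conditionalization on E.

```latex
% Updating on a constraint set \mathcal{C} by minimizing relative entropy to the prior P:
Q^{*} = \arg\min_{Q \in \mathcal{C}} D(Q \,\|\, P)
      = \arg\min_{Q \in \mathcal{C}} \sum_{\omega} Q(\omega)\,\log\frac{Q(\omega)}{P(\omega)},
\qquad
\mathcal{C} = \{Q : Q(E) = 1\} \;\Rightarrow\; Q^{*} = P(\cdot \mid E).
```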
Sept. 19, 2022: Bayesianism via Likelihood Theory, Hartry Field, NYU
Oct. 3, 2022: That Does Not Compute: David Lewis on Credence and Chance, Gordon Belot, University of Michigan
Oct. 31, 2022: Bootstrapping Objective Probabilities for Evolutionary Biology, Marshall Abrams, University of Alabama at Birmingham
Nov. 28, 2022: Beyond Neyman-Pearson, Peter Grünwald, CWI/Amsterdam
Abstract: Bayesianism is best introduced not by the usual arguments for it but by its applications. And a good way to focus on its applications is by comparison to the likelihood ratio approach to the confirmation of statistical hypotheses. The latter turns out to be a much more general theory than is usually appreciated, but does have limitations that a recognizably Bayesian approach overcomes. An advantage of approaching Bayesianism via likelihood theory is that doing so leads naturally to a questioning of some harmful idealizations built into crude forms of Bayesianism.
Abstract: Following Lewis, many philosophers hold reductionist accounts of chance (on which claims about chance are to be understood as claims that certain patterns of events are instantiated) and maintain that rationality requires that credence should defer to chance (in the sense that one's credence in an event, conditional on the chance of that event being x, should be x). It is a shortcoming of an account of chance if it implies that this norm of rationality is unsatisfiable by computable agents. Here it is shown, using elementary considerations from the theories of inductive learning and of algorithmic randomness, that this shortcoming is more common than one might have hoped.
Abstract: While there are difficult philosophical problems concerning quantum mechanics that are relevant to our understanding of the nature of quantum mechanical probabilities, it's clear that (a) the probabilities seem to be fundamental to quantum mechanics, and (b) the numerical values of probabilities are given by the mathematical theory of quantum mechanics (perhaps via some extension such as GRW). There are also philosophical problems concerning the nature of probabilities that arise in statistical mechanics, but the mathematical theories of statistical mechanics that are relevant to these problems usually constrain the numerical values of these probabilities fairly narrowly.
The situation is different in evolutionary biology, where widespread uses of models and frequentist statistical inference in empirical research seem to depend on the fact that evolving populations realize (something like) objective probabilities. (These might be objective imprecise probabilities.) But we have had little guidance about how to think about the fundamental nature of these "evolutionary" probabilities.
There is a common problem with all of these proposals: there is an enormous amount of heterogeneity among the interacting components in an evolving population in an environment, with many interactions between different levels. In my book, I call this property of evolutionary processes "lumpy complexity". This kind of complex causal structure makes evolving populations extremely difficult to understand well, and difficult to model except very approximately. This makes it difficult to spell out details of the above proposals in a way that allows us to understand what gives rise to the numerical values of probabilities in evolutionary biology.
I argue that at least in many cases, there is a different sort of explanation of the probabilistic character of biological populations in environments. This strategy can explain some important numerical values of relevant probabilities, and sidesteps the sorts of problems highlighted above. It is a strategy that is not available for explaining objective probability in physical sciences. I outline the strategy below.
Empirical researchers have argued that many animals engage in a kind of random walk known as a Lévy walk or a Lévy flight: at each time step, the direction of motion is randomly chosen, and the length d of travel in that direction has probability proportional to d^(−μ), where μ is near a particular value (usually 2).
Modeling arguments and a small amount of empirical research have been used to argue for the "Lévy Flight Foraging hypothesis" (LFF), which says that when food (or other resources) are sparsely distributed, it's adaptive for organisms to search randomly for the food by following a Lévy walk with parameter μ near 2. That is, the LFF is the claim that Lévy flight foraging, or more precisely foraging using Lévy walks, is the result of natural selection on internal mechanisms because that pattern of foraging is adaptive.
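A minimal simulation sketch of the random walk just described (illustrative only; the exponent and minimum step length below are arbitrary choices, not values from the talk):

```python
import numpy as np

def levy_walk(n_steps=1000, mu=2.0, d_min=1.0, seed=0):
    """Simulate a 2-D Levy walk: uniform random direction at each step,
    step length with density proportional to d**(-mu) for d >= d_min."""
    rng = np.random.default_rng(seed)
    angles = rng.uniform(0.0, 2.0 * np.pi, size=n_steps)
    u = 1.0 - rng.uniform(size=n_steps)            # uniform on (0, 1]
    lengths = d_min * u ** (-1.0 / (mu - 1.0))     # inverse-transform sampling
    steps = lengths[:, None] * np.column_stack((np.cos(angles), np.sin(angles)))
    return np.vstack(([0.0, 0.0], np.cumsum(steps, axis=0)))

path = levy_walk()
print(path[-1])   # final position after 1000 steps
```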
My argument can be outlined as follows.
Where such an explanation of evolutionary probabilities is applicable (and I argue that it is more widely applicable than one might think), we have an explanation of objective probability (or objective imprecise probability, or behavior that seems to involve probability) in evolving populations. This explanation avoids the problems of proposals that evolutionary probabilities should be understood as directly resulting from quantum mechanical, statistical mechanical, or microconstant/MM-CCS probabilities.
There are some reasons to think that human behavior is sometimes randomized by internal mechanisms that result either from natural selection or learning. If this is correct, people who interact with those whose behavior is randomized by such a mechanism would in turn experience randomized consequences of that behavior. Thus the explanatory strategy that I describe for evolutionary biology may also be relevant in social sciences.
Video – link to YouTube video for this seminar
Abstract: A standard practice in statistical hypothesis testing is to mention the p-value alongside the accept/reject decision. We show the advantages of mentioning instead an e-value, an alternative measure of evidence that has recently started to attract attention. With p-values, we cannot use an extreme observation (e.g. p << alpha) for getting better frequentist decisions. With e-values we can, since they provide Type-I risk control in a generalized Neyman-Pearson setting with the decision task (a general loss function) determined post-hoc, after observation of the data—thereby providing a handle on the perennial issue of 'roving alpha's'. When Type-II risks are taken into consideration, the only admissible decision rules in this post-hoc setting turn out to be e-value-based. This provides e-values with an additional interpretation on top of their original one in terms of bets.
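As a reminder of the basic object involved (a standard definition, not a summary of the talk's results): an e-variable is a nonnegative statistic whose expected value under the null is at most 1, so Markov's inequality controls the Type-I error of rejecting when it is large.

```latex
% An e-variable for null hypothesis H is a nonnegative statistic E with
\mathbb{E}_{H}[E] \le 1,
\qquad\text{so for any fixed } \alpha \in (0,1):\quad
\Pr_{H}\!\left(E \ge 1/\alpha\right) \le \alpha .
```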
We also propose to replace confidence intervals and distributions by the *e-posterior*, which provides valid post-hoc frequentist uncertainty assessments irrespective of prior correctness: if the prior is chosen badly, e-intervals get wide rather than wrong, suggesting e-posterior credible intervals as a safer alternative for Bayes credible intervals. The resulting *quasi-conditional paradigm* addresses foundational and practical issues in statistical inference.
Six seminars in Spring 2022 were panel discussions; three were single-author seminars. Abstracts for panel discussions include questions for the panel, some include references, and all list panel members with links to their professional web pages.
January 24, 2022: Should decision theory guide the formulation of the principal principle?
Jenann Ismael, philosophy, Columbia
Itzhak Gilboa, economics, Tel Aviv & HEC Paris
Stephen Senn, statistics, consultant
Sherrilyn Roush, philosophy, UCLA
January 31, 2022: Are probability distributions for potential responses real and necessary for causal inference?
Philip Dawid, statistics, Cambridge emeritus
Ilya Shpitser, computer science, Johns Hopkins
Thomas Richardson, statistics, Washington
Sylvia Wenmackers, philosophy, KU Leuven
February 7, 2022: Are the repeated sampling principle and Cournot’s principle frequentist?
Marshall Abrams, philosophy, Alabama Birmingham
Ruobin Gong, statistics, Rutgers
Alistair Wilson, philosophy, Birmingham UK
Harry Crane, statistics, Rutgers
February 14, 2022: Cournot’s principle and the best-system interpretation of probability.
Isaac Wilhelm, philosophy, National University of Singapore
Ryan Martin, statistics, NC State
Alan Hájek, philosophy, Australian National University
Snow Xueyin Zhang, philosophy, NYU (philosophy, Berkeley, as of 2023)
February 28, 2022: Objective probability at different levels of knowledge.
Alex Meehan, philosophy, Yale
Monique Jeanblanc, mathematics, Evry
Barry Loewer, philosophy, Rutgers
Tahir Choulli, mathematics, Alberta
March 7, 2022: Game-theoretic probability in physics.
Glenn Shafer, business and statistics, Rutgers
Dustin Lazarovici, philosophy, Lausanne
Leah Henderson, philosophy, Groningen
Eddy Keming Chen, philosophy, UC San Diego
March 21, 2022: Interpreting Carnap: Seven Decades Later, Sandy Zabell, Northwestern
April 4, 2022: Extended probabilities and their application to statistical inference, Michele Caprio, Duke
April 11, 2022: Two Birds with One Coin: Convex Optimization and Confidence Sequences with Coin-Betting, Francesco Orabona, Boston University
Video – link to YouTube video for this seminar
Jenann Ismael, philosophy, Columbia
Itzhak Gilboa, economics, Tel Aviv & HEC Paris
Stephen Senn, statistics, consultant
Sherrilyn Roush, philosophy, UCLA
In 1980, the philosopher David Lewis gave a name to the principle that you should adopt known objective probabilities as subjective probabilities. He called it the principal principle. Lewis’s principle is mentioned in 16 different articles in the Stanford Encyclopedia of Philosophy, and it has been cited 287 times in philosophy journals indexed by JSTOR. But it has never been cited in statistics journals indexed by JSTOR.
The exact formulation of Lewis’s principal principle is a topic of continuing discussion. Lewis revised his initial formulation in his 1980 paper and later replaced it with a new principle. Other philosophers, including several who have spoken in our seminar, have proposed variations or qualifications. At first glance, Typ’s reasoning seems relevant to the discussion. Is it?
Questions for the panel:
References
Video – link to YouTube video for this seminar
Philip Dawid, statistics, Cambridge emeritus
Ilya Shpitser, computer science, Johns Hopkins
Thomas Richardson, statistics, Washington
Sylvia Wenmackers, philosophy, KU Leuven
Many statistical models hypothesize a joint probability distribution for an individual’s potential responses to different treatments even when only one treatment can be given. These models include structural equation and functional models as well as the potential response model popularized by Don Rubin in the 1970s. For more than two decades, Phil Dawid has advocated the more parsimonious decision-theoretic approach, in which a probability distribution for the individual’s response is given conditional on each possible treatment but no joint distribution is hypothesized.
Joint distributions for potential responses allow us to work fully within the familiar framework of probability measures. Their disadvantage, Dawid argues, is the confusion introduced by meaningless assumptions (such as the “treatment-unit additivity” assumption) and meaningless quantities (such as the variance of the difference between an individual’s responses to two possible treatments). Discussion of confounding also becomes confusing in the context of probability distributions for unobservable and meaningless quantities. Dawid’s solution is to generalize the theory of conditional independence and DAGs from probability measures to decision models.
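A small worked illustration of the kind of quantity at issue (a textbook observation, not part of the panel materials): with potential responses Y(0) and Y(1), each unit reveals only one of the two, so the variance of the individual treatment effect involves a covariance that the data cannot identify.

```latex
% Variance of the individual treatment effect Y(1) - Y(0):
\operatorname{Var}\!\big(Y(1) - Y(0)\big)
= \operatorname{Var}\!\big(Y(1)\big) + \operatorname{Var}\!\big(Y(0)\big)
  - 2\operatorname{Cov}\!\big(Y(1), Y(0)\big).
% The two marginal variances are estimable from a randomized experiment;
% the covariance is not, since no unit ever reveals both Y(1) and Y(0).
```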
Dawid and other authors distinguish between studying the effects of causes (causal investigation as a guide to action) and studying the causes of effects (assignment of responsibility for outcomes already observed). In the latter case, Dawid is more open to “counterfactual” questions that the joint distribution of potential outcomes might help us answer, such as “given what happened when Mr. Defendant did A, what would likely have happened if he had done B instead.”
Questions for the panel:
References
Video – link to YouTube video for this seminar
Marshall Abrams, philosophy, Alabama Birmingham
Ruobin Gong, statistics, Rutgers
Alistair Wilson, philosophy, Birmingham UK
Harry Crane, statistics, Rutgers
Historically, the probability calculus began with repeated trials: throws of a fair die, etc. Independent and identically distributed (iid) random variables still populate elementary textbooks, but most statistical models permit variation and dependence in probabilities. Already in 1816, Pierre Simon Laplace explained that nature does not follow any constant probability law; its error laws vary with the nature of measurement instruments and with all the circumstances that accompany them. In 1960, Jerzy Neyman explained that scientific applications had moved into a phase of dynamic indeterminism, in which stochastic processes replace iid models.
Statisticians who call themselves “frequentists” have proposed two competing principles to support the inferences about parameters in stochastic processes and other complex probability models.
Cox and Hinkley coined the name “repeated sampling principle” in 1974. The name “Cournot’s principle” was first current in the 1950s. But both ideas are much older. When interpreted as pragmatic instructions, the two principles are more or less equivalent. But they can also be interpreted as philosophical justifications – even as explanations of the meaning of probability models – and then they seem very different.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Isaac Wilhelm, philosophy, National University of Singapore
Ryan Martin, statistics, NC State
Alan Hájek, philosophy, Australian National University
Snow Xueyin Zhang, philosophy, NYU (philosophy, Berkeley, as of 2023)
Following his scholastic predecessors, Jacob Bernoulli equated high probability with practical certainty. Theoretical and applied probabilists and statisticians have done the same ever since. In 1843, Antoine Augustin Cournot suggested that this is the only way that mathematical probability can be connected with phenomena. In 1910, Aleksandr Aleksandrovich Chuprov called the equation of high probability with practical certainty Cournot’s lemma. In the early 1940s, Émile Borel called it the only law of chance. In 1949, Maurice Fréchet called it Cournot’s principle.
The Bernoullian (non-Bayesian) statistician uses Cournot’s principle in three ways.
When the model is complex, only some events of high probability can happen. In fact, what does happen, described in detail, will have negligible or zero probability. (This is the fearsome lottery paradox.) So careful statements of Cournot’s principle always limit the events close to zero or one that are taken into consideration. Borel insisted that these events be “remarkable in some respect” and specified in advance. Richard von Mises, Abraham Wald, Jean Ville, and Alonzo Church, working in the idealization of an infinite number of trials, limited the events by requiring that they be definable in some language or computable in some sense. Andrei Kolmogorov brought this back to finite reality by giving a coherent definition of complexity (and simplicity) for elements of finite sets. The model predicts only those events of high probability that are simply described.
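For a sense of scale on the point about detailed descriptions (a standard illustration, not from the panel description): each fully specified outcome of a long sequence of fair coin tosses has vanishingly small probability, even though exactly one such outcome occurs.

```latex
% Any particular sequence s of 1000 fair coin tosses:
\Pr(s) = 2^{-1000} \approx 9.3 \times 10^{-302}.
```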
The best-system interpretation of objective probability goes back to the work of David Lewis in the 1990s. Lewis proposed that a system of probabilities should be considered objective if it provides the best description of our world, where “best” involves balancing simplicity, strength (how much is predicted), and fit (how high a probability is given to what happens). Critics have pointed out that this balance has remained nebulous.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Alex Meehan, philosophy, Yale
Monique Jeanblanc, mathematics, Evry
Barry Loewer, philosophy, Rutgers
Tahir Choulli, mathematics, Alberta
In his 1843 book on probability, Cournot argued that objective probabilities can be consistent with God’s omniscience. As he saw the matter, truly objective probabilities are the probabilities of a superior intelligence at the upper limit of what might be achieved by human-like intelligence. As he explained,
Surely the word chance designates not a substantial cause, but an idea: the idea of the combination of many systems of causes or facts that develop, each in its own series, each independently of the others. An intelligence superior to man would differ from man only in erring less often or not at all in the use of this idea. It would not be liable to consider series independent when they actually influence each other in the causal order; inversely, it would not imagine a dependence between causes that are actually independent. It would distinguish with greater reliability, or even with rigorous exactness, the part due to chance in the evolution of successive phenomena. . . . In a word, it would push farther and apply better the science of those mathematical relations, all tied to the idea of chance, that become laws of nature in the order of phenomena.
In the 1970s and 1980s, the consistency of probabilities at different levels of knowledge was studied within measure-theoretic probability by Paul-André Meyer’s Strasbourg seminar. One question studied was whether the semimartingale property is preserved when a filtration is enlarged. A discrete-time process S is a semimartingale with respect to a filtration F if it is the sum of two processes, say S = A + B, where A is predictable with respect to F (each of its values is determined by the information available one step earlier) and B is a martingale with respect to F.
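In symbols, a standard discrete-time statement of the decomposition just described:

```latex
% Doob-type decomposition of S with respect to the filtration F = (\mathcal{F}_n):
S_n = A_n + B_n,
\qquad A_n \text{ is } \mathcal{F}_{n-1}\text{-measurable (predictable)},
\qquad \mathbb{E}\!\left[B_n \mid \mathcal{F}_{n-1}\right] = B_{n-1} \text{ (martingale)}.
```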
An observer whose knowledge is represented by a larger filtration F* knows more and so can predict more, but if the remaining unpredictable part is still a martingale, then the probabilities with respect to F can be considered just as objective as those with respect to F*. In discrete time, the semimartingale property is always preserved, but in continuous time regularity conditions are needed.
The most extreme enlargement of any filtration is the filtration that knows the world’s entire trajectory at the outset. In Cournot’s picture, this would be God’s filtration. Every process is predictable and hence a semimartingale with respect to this filtration.
In recent decades, the enlargement of filtrations has been used to study insider trading and default risk in mathematical finance.
The philosophers Barry Loewer and David Albert, inspired largely by statistical mechanics, have proposed a picture roughly similar to Cournot’s, in which the superior intelligence is represented by “standard Lebesgue measure over the physically possible microstates” consistent with a description of the universe right after the Big Bang, and God is replaced by David Lewis’s Humean mosaic. Loewer has called this the “Mentaculus vision”.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Glenn Shafer, business and statistics, Rutgers
Dustin Lazarovici, philosophy, Lausanne
Leah Henderson, philosophy, Groningen
Eddy Keming Chen, philosophy, UC San Diego
In measure-theoretic probability, expected values (prices) may change through time, but the way in which they change is set in advance by a comprehensive probability measure. Game-theoretic probability generalizes this by introducing players who may take actions and set prices as a betting game proceeds. In the case of discrete time, this means that a probability tree is replaced by a decision tree. Statistical testing of the prices is still possible. But instead of testing a comprehensive probability measure by checking that a simple event of small probability specified in advance does not happen (Cournot’s principle), we test the prices, or the forecaster who sets them, by checking that a simple betting strategy specified in advance does not multiply its capital by a large factor. See Glenn Shafer’s “Testing by betting” (2021).
Because it provides explicitly for a decision maker, game-theoretic probability accommodates John von Neumann’s axioms for quantum mechanics in a straightforward way, as explained on pp. 189-191 of Shafer and Vovk’s 2001 book. As further explained on pp. 216-217 of their 2019 book, it also allows a programmer to test whether a quantum computer is performing correctly.
Gurevich and Vovk have argued that the game-theoretic formulation helps take the mystery out of the role of negative probabilities in Wigner’s quasi-probability distribution. The mystery dissolves because the distribution is nothing more than a rule that tells a forecaster how to set prices.
Advocates of Cournot’s principle in statistical mechanics have pointed out that the Gibbs distribution is needed only to rule out events to which it gives probability near zero (see, e.g., Goldstein et al. 2020). Any measure equivalent in the sense of absolute continuity would do the same job; the Gibbs distribution is distinguished from the others mainly by its implausible assumption of particles’ initial independence of each other. Ken Huira (2021) has argued that the game-theoretic formulation can again provide clarification. The distribution does not “generate” outcomes; it merely sets prices.
Questions for the panel:
References
Video – link to YouTube video for this seminar
Abstract: In 1950 and 1952 Rudolf Carnap published his Logical Foundations of Probability and The Continuum of Inductive Methods, setting out his views on probability and inductive inference. Given his prestige, it is not surprising that these together sparked both interest and controversy. This talk surveys that discussion and debate: what criticisms were advanced, how did Carnap's views evolve, how has his program been advanced? A primary thesis of this talk is that many of these issues can be most readily understood from the standpoint of modern probability, rather than Carnap's initial logical and sentential approach.
Video – link to YouTube video for this seminar
Abstract: We propose a new, more general definition of extended probability measures. We study their properties and provide a behavioral interpretation. We use them in an inference procedure, whose environment is canonically represented by the probability space (Ω,F,P), when both P and the composition of Ω are unknown. We develop an ex-ante analysis – taking place before the statistical analysis requiring knowledge of Ω – in which we progressively learn the true composition of Ω. We provide an example in the field of ecology.
Video – link to YouTube video for this seminar
Abstract: Consider the following two problems. The first one is calculating valid and numerically sharp confidence sequences for the unknown expectation of a bounded random variable. The second problem is optimizing an arbitrary convex function with the smallest number of accesses to its stochastic gradients. Surprisingly, we will show that both problems can be solved through a reduction to a simple gambling game: betting money on the outcomes of a coin.
First, we will explain that the problem of betting money on a coin can be solved with optimal algorithms from the universal compression/gambling literature. These algorithms guarantee an exponential growth rate of the wealth of the gambler, even without stochastic assumptions on the coin. In turn, this exponential wealth will allow us to design a reduction to obtain state-of-the-art valid confidence sequences for the expectation of bounded random variables. In particular, our confidence sequences are never vacuous, even with a single sample. Moreover, another reduction will allow us to convert the same betting algorithm into an optimal online convex optimization algorithm.
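As a toy illustration of the betting mechanics described above (a sketch under simplifying assumptions; the betting fraction below is a simple placeholder, not the universal-gambling algorithms from the talk): wealth accumulated by betting on coin outcomes with a fraction that depends only on past data is a nonnegative martingale under the null, so by Ville's inequality a large wealth is evidence against the hypothesized bias. Inverting such tests over candidate biases is what yields a confidence sequence.

```python
import numpy as np

def betting_wealth(xs, p0=0.5):
    """Test H0: coin bias = p0 by betting; returns the wealth path.
    The betting fraction uses only past outcomes, so under H0 the wealth
    is a nonnegative martingale: P(sup wealth >= 1/alpha) <= alpha (Ville)."""
    history, w, wealth = [], 1.0, [1.0]
    m = max(p0, 1.0 - p0)
    for x in xs:                                   # x in {0, 1}
        z_past = sum(h - p0 for h in history)
        lam = float(np.clip(z_past / (len(history) + 1), -1.0, 1.0))
        w *= 1.0 + lam * (x - p0) / m              # fair bet under H0, stays >= 0
        wealth.append(w)
        history.append(x)
    return np.array(wealth)

rng = np.random.default_rng(1)
xs = rng.binomial(1, 0.7, size=500)                # data from a biased coin
wealth_path = betting_wealth(xs, p0=0.5)
print(wealth_path[-1], wealth_path.max() >= 20)    # wealth >= 20 rejects at level 0.05
```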
Emphasis will be given to the history of these ideas in the fields of information theory, game-theoretic probability, and online learning.
September 20, 2021: Descriptive probability, Glenn Shafer, Rutgers
October 4, 2021: Pseudochance vs. true chance in complex systems, Marshall Abrams, University of Alabama Birmingham
October 18, 2021: From domain-specific probability models to evaluation of model-based probability forecasts, Tze Leung Lai, Stanford
November 1, 2021: Borel and Bertrand, Snow Zhang, New York University
November 8, 2021: Measuring severity in statistical inference, Ruobin Gong, Rutgers
November 15, 2021: The Typical Principle, Isaac Wilhelm, National University of Singapore
November 22, 2021: A framework for non-fundamental chance, Alexander Meehan, Yale
November 29, 2021: ALL-IN meta-analysis, Judith ter Schure, University of Leiden
December 6, 2021: Exploitability and accuracy, Kevin Blackwell, Bristol
Abstract: As the late Berkeley statisticians Leo Breiman (1928-2005) and David Freedman (1938-2008) taught us, most statistical models in the social sciences, especially regression models, are invalid. Their conclusions are at best descriptive.
Some sociologists and other applied statisticians have proposed treating these dubious regression studies as purely descriptive, refraining from significance tests and confidence statements. This is seldom done, because people want some sense of the precision of the regression parameters.
Descriptive intervals for parameter values can be obtained if we imagine parameter values betting against each other using Kelly betting. This allows us to interpret relative likelihood as relative predictive success. See the working paper at http://probabilityandfinance.com/articles/59.pdf.
Abstract: Pseudorandom number generating algorithms play a variety of roles in scientific practice and raise a number of philosophical issues, but they have received little attention from philosophers of science. In this talk I focus on pseudorandom number generating algorithm implementations (PRNGs) in simulations used to model natural processes such as evolving biological populations. I argue that successful practices involving such simulations, and reflection on the modeled processes, provide reasons to think that many natural processes involve what I call “pseudochance”, which is an analogue of chance or objective probability, and which is what is realized by PRNGs. Pseudochance contrasts with what I call “true chance,” the kind of objective probability that many objective interpretations of probability claim to describe. On my view, when philosophers speak of “chance” or “objective probability,” they have probably intended the term to refer to true chance, but have often applied it to systems that, I would argue, plausibly exhibit only pseudochance.
Abstract: This talk is based on a related survey/position paper on domain-specific probability definitions and stochastic models, which have been in the probability/statistics literature from 1933 to today (and in other disciplines). It concludes with the martingale approach to the evaluation of model-based probability forecasts and gives some empirical examples.
Abstract: The Borel-Kolmogorov paradox is the phenomenon that, sometimes, when P(E)=0, the probability of H given E appears to depend on the specification of the sigma algebra from which E is drawn. One popular diagnosis of the paradox is that it reveals a surprising fact about conditional probability: when P(E)=0, the value of P(H|E) depends on the choice of a sigma algebra. As Kolmogorov himself put it, "[the paradox shows that] the concept of a conditional probability with regard to an isolated given hypothesis whose probability equals 0 is inadmissible" (p.51, 1956).
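The classic concrete instance (a standard textbook example, not drawn from the talk): condition a uniformly distributed point on the sphere on its lying on a particular great circle. Treated as the equator (a limit of thin bands), the circle carries a uniform conditional distribution; treated as a meridian (a limit of thin lunes), it does not.

```latex
% Uniform point on the unit sphere, conditioned on a great circle.
% Conditioning on the equator (latitude = 0) gives a uniform distribution over longitude.
% Conditioning on a meridian (longitude fixed) gives latitude density
f(\phi) \;=\; \tfrac{1}{2}\cos\phi, \qquad \phi \in \left[-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right],
% which is not uniform: the "same" null event is treated differently depending on
% the family of conditioning events (the sigma-algebra) it is referred to.
```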
This talk has two parts. The negative part raises some problems for the Kolmogorov-inspired relativistic conception of rational conditional probability. The positive part proposes an alternative diagnosis of the paradox: it is an instance of a more familiar problem — the problem of (conditional) priors. I conclude by applying my diagnosis to a different debate in the foundations of probability: the issue of countable additivity.
Abstract: Severity (Mayo, 2018) is a principle of statistical hypothesis testing. It assesses the hypothesis test in relation to the claim it makes, and the data on which the claim is based. Specifically, a claim C passes a severe test with the data at hand, if with high probability the test would have found flaws with C if present, and yet it does not.
In this talk, I discuss how the concept of severity can be extended beyond frequentist hypothesis testing to general statistical inference tasks. Reflecting the Popperian notion of falsifiability, severity seeks to establish a stochastic version of modus tollens, as a measure of strength of probabilistic inference. Severity measures the extent to which the inference resulting from an inferential strategy is warranted, in relation to the body of evidence at hand. If the current available evidence leads a method to infer something about the world, then were it not the case, would the method still have inferred it? I discuss the formulation of severity and its properties, and demonstrate its assessment and interpretation in examples that follow either the frequentist or Bayesian traditions as well as beyond. A connection with significance function (Fraser, 1991) and confidence distribution (Xie & Singh, 2013) is drawn. These tools and connections may enable the assessment of severity in a wide range of modern applications that call for evidence-based scientific decision making.
Video – link to YouTube video for this seminar
Abstract: If a proposition is typically true, then so long as you have no evidence to the contrary, you should believe that proposition; or so I argue here. In this paper, I propose and defend a principle of rationality—call it the `Typical Principle'—which links rational belief to facts about what is typical. As I show, this principle avoids several problems that other, seemingly similar principles face. And as I show, in many cases, this principle implies the verdicts of the Principal Principle: much of what the Principal Principle says about rational credence, in other words, follows from the Typical Principle.
Abstract: Many authors have sought to make sense of special science probabilities as objective. An appeal of this approach is that it could provide a straightforward realist account of the success of chance models in actual science. But the view faces several challenges and questions, including (1) what unites these various special science probabilities as objective chances?, and (2) how autonomous are they from the fundamental dynamical chances? In this talk I propose a general framework for theorizing about chance that lets us make some headway on questions (1) and (2), without presuming a special metaphysical or physical commitment like the Humean Best Systems Analysis or Mentaculus. A key ingredient of the framework is the Parent Principle, which reduces to the Principal Principle and New Principle in relevant special cases, and induces coherence constraints between chances at different levels. Indeed I show that if we accept the Parent Principle together with standard constraints on rational credence, then a Mentaculus-type view already follows from two relatively straightforward assumptions about our world’s chances.
Abstract: The game-theoretic approach to statistical inference has major philosophical and mathematical advantages. Is it useful in statistical practice as well? This talk covers its use in the design of clinical trials that aim to contribute to an ALL-IN meta-analysis, for Anytime, Live and Leading Interim meta-analysis. We deal with practical problems in clinical trials, e.g. recruitment issues and lack of power, and reflect on ongoing debates in clinical trials and meta-analysis, e.g. about representativeness and random-effects modelling. We conclude that game-theoretic statistical inference has much to offer especially since it clarifies our goals and assumptions in clinical trial research. See https://arxiv.org/abs/2109.12141 for more information.
Abstract: I’ll begin with a very basic, very fast summary of the approach for assessing the accuracy of sets of desirable gambles being developed by Jason Konek; the notions of accuracy that I will discuss in this talk are very much in the same vein, although I will flag some important differences. (This introduction will also include an even briefer introduction to sets of desirable gambles, for anyone who isn’t familiar with them.)
Next, I motivate an interpretation of the “Falsity” score: how exploitable (by better-informed agents) does an agent’s credal state render them?
I will then go through the assumptions and results of an approach to accuracy that I termed “Select-A-Size Accuracy”, which for two-dimensional sets of desirable gambles (gambles on a binary partition), provides exactly what we want: a family of accuracy scores such that (1) according to every member of this family, every incoherent set of desirable gambles is accuracy-dominated; and (2) for every coherent set of desirable gambles, there is some element of this family which renders that set of desirable gambles not merely accuracy-undominated but Imprecisely Immodest.
…But this approach doesn’t generalize to higher dimensions; I briefly discuss why that is.
Finally, I’ll present some early developments of a game-theoretic approach to accuracy which is closely related to the exploitability notion used in constructing Select-A-Size Accuracy. I will also gesture in the direction of, but not really discuss, another development of this same exploitability notion: Arthur Van Camp’s accuracy order.