Author: Alejandro Pérez Carballo (University of Massachusetts, Amherst)
How to Cite: Pérez Carballo, A. (2023) “Generalized Immodesty Principles in Epistemic Utility Theory”, Ergo an Open Access Journal of Philosophy. 10(0). doi: https://doi.org/10.3998/ergo.4661
According to one of the better-known constraints on epistemic utility functions, each probabilistically coherent function should be immodest in a particular sense: for any probabilistically coherent credence function $p$ and any alternative $q$ to $p$, the expected epistemic utility of $p$ relative to $p$ should be greater than that of $q$ relative to $p$. This constraint, often known as Strict Propriety, is usually motivated by appealing to a combination of two independent claims. The first is a certain kind of admissibility principle: that any probabilistically coherent function can sometimes be epistemically rational. The second is an abstract principle linking epistemic utility and rationality: that an epistemically rational credence function should always expect itself to be epistemically better than any of its alternatives. If we assume, as most typically do, that the alternatives to any probabilistically coherent function are all and only those credence functions with the same domain, these two principles arguably entail Strict Propriety.
What happens if we enlarge the class of alternatives to include a wider range of probability functions, including some with a different domain? This would strengthen the principle linking epistemic utility and rationality: it would no longer suffice, for a credence function to be deemed epistemically rational, that it expects itself to be doing better, epistemically, than credence functions with the same domain. And this stronger principle would arguably give us a more plausible theory of epistemic rationality, at least on some ways of widening the range of alternatives. Suppose an agent with a credence function defined over a collection of propositions takes herself to be doing better, epistemically, than she would be by having another credence function defined over the same collection of propositions. But suppose she thinks she would be doing better, epistemically, having a credence function defined over a smaller collection of propositions—perhaps she thinks she would be doing better, epistemically, not having certain defective concepts and thus that she would be doing better, epistemically, simply not having propositions with those concepts as constituents in the domain of her credence function. Such an agent would seem to be irrational in much the same way as an agent who thinks she would be doing better, epistemically, by assigning different credences to the propositions she assigns credence to.3
Now, my interest here is not in the question of which principle linking epistemic utility and rationality is the right one. Rather, I am interested in understanding how strong a principle we can consistently endorse: I am interested in the kinds of constraints on epistemic utility functions that come from different views on how epistemic utility and epistemic rationality are related to one another. So I start by considering the strongest version of a principle linking epistemic utility and rationality, one that says that an epistemically rational credence function should take itself to be doing better than any other credence function, regardless of its domain. As we will see, the resulting immodesty constraint is far too strong, in that, perhaps surprisingly, it cannot be satisfied by any reasonable epistemic utility function; that this is so is a consequence of the main results in this paper (Subsections 3.1 and 3.2).
I then consider different possible ways of weakening this principle, study the resulting constraints on epistemic utility functions and their relationship to one another, and establish a few characterization results for the class of epistemic utility functions satisfying these constraints (Subsection 3.3). Before concluding, I discuss (Section 4) how my results relate to recent work on the question whether epistemic utility theory is incompatible with imprecise, or ‘mushy’, credences.
Fix a collection $W$ of possible worlds and a finite partition $\Pi$ of $W$: a collection of pairwise disjoint, jointly exhaustive subsets of $W$, which we call cells.
I will say that a real-valued function $p$ defined over $\Pi$ is coherent iff for each $s \in \Pi$, $p(s) \in [0, 1]$, and $\sum_{s \in \Pi} p(s) = 1$. A coherent function over $\Pi$ uniquely determines a probability function over the Boolean closure of $\Pi$. Accordingly, and slightly abusing notation, I will refer to coherent functions over $\Pi$ as probability functions over $\Pi$.
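Let $\mathcal{P}_\Pi$ denote the collection of probability functions over $\Pi$. An epistemic utility function (for $\Pi$) is a function $\mathcal{U}$ on $\mathcal{P}_\Pi \times W$ such that for each $p \in \mathcal{P}_\Pi$, $\mathcal{U}(p, \cdot)$ is $\Pi$-measurable, where a function $f$ on $W$ is $\Pi$-measurable iff for each value $x$, $f^{-1}(x)$ is in the Boolean closure of $\Pi$.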
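Throughout, I assume that epistemic utility functions are bounded above (for each $p \in \mathcal{P}_\Pi$ there exists a finite $M$ such that $\mathcal{U}(p, w) \leq M$ for all $w \in W$) and truth-directed in the following sense: for all $p, q \in \mathcal{P}_\Pi$ and $w \in W$, if (i) for any proposition $A$ in $\Pi$, $p(A)$ is at least as close as $q(A)$ is to the truth-value of $A$ in $w$, and (ii) for some proposition $A$ in $\Pi$, $p(A)$ is strictly closer than $q(A)$ is to $A$'s truth-value in $w$, then $\mathcal{U}(p, w) > \mathcal{U}(q, w)$.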
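By definition, for fixed $\mathcal{U}$ and $q \in \mathcal{P}_\Pi$, $\mathcal{U}(q, \cdot)$ is a discrete random variable. Accordingly, for $p \in \mathcal{P}_\Pi$ I will let $E_p[\mathcal{U}(q)]$ denote the expectation of $\mathcal{U}(q, \cdot)$ relative to $p$, so that
$$E_p[\mathcal{U}(q)] = \sum_{s \in \Pi} p(s)\, \mathcal{U}(q, c(s)),$$
where $c$ is a choice function, in that for each partition $\Pi$ and each $s \in \Pi$, $c(s) \in s$. The $\Pi$-measurability of $\mathcal{U}(q, \cdot)$ ensures that our definition does not depend on our choice function. Indeed, I will simply write $\mathcal{U}(q, s)$ to denote $\mathcal{U}(q, c(s))$, so that $E_p[\mathcal{U}(q)] = \sum_{s \in \Pi} p(s)\, \mathcal{U}(q, s)$.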
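I will say that an epistemic utility function $\mathcal{U}$ for $\Pi$ is proper iff for each $p, q \in \mathcal{P}_\Pi$,
$$E_p[\mathcal{U}(p)] \geq E_p[\mathcal{U}(q)].$$
I will say that $\mathcal{U}$ is strictly proper iff for each $p, q \in \mathcal{P}_\Pi$ with $q \neq p$, the above inequality is always strict. (When $\mathcal{U}$ is proper but not strictly proper I will sometimes say that $\mathcal{U}$ is weakly proper.) A variety of characterization results can be found in the literature; see especially Gneiting and Raftery (2007).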
Strictly proper epistemic utility functions have been the subject of considerable interest. In discussions of how to reward a forecaster’s predictions, strictly proper functions are of interest because they reward honesty—someone whose forecasts will be rewarded using a strictly proper epistemic utility function cannot expect to do better than by reporting her true credences (Brier 1950; Savage 1971). In general discussions of epistemic value, strictly proper functions are of interest because they incorporate a certain kind of immodesty—if your epistemic values are represented by a strictly proper epistemic utility function and you are rational, you will never expect any other credence function to be doing better, epistemically, than your own (Joyce 2009; Gibbard 2008; Horowitz 2014; Greaves & Wallace 2006; inter alia).9
And in discussions of justifications of Probabilism—the requirement on degrees of belief functions that they satisfy the axioms of the probability calculus—strictly proper utility functions have played a starring role in a range of dominance results to the effect that probabilistic credences strictly dominate non-probabilistic credences and are never dominated by any other credence function (Joyce 1998; 2009; Leitgeb & Pettigrew 2010; Pettigrew 2016; Predd, Seiringer, Lieb, Osherson, Poor, & Kulkarni 2009).10
One natural question to ask is how to generalize the framework of epistemic utility theory to allow for comparisons of probability functions defined over distinct algebras of propositions. And given such a generalization, an equally natural question is how to generalize the notion of (strict) propriety. Let me take each of these questions in turn.
Let $\mathfrak{P}$ denote the collection of finite partitions of $W$. For $\Pi, \Pi' \in \mathfrak{P}$, say that $\Pi'$ is a refinement of $\Pi$ iff for each $s' \in \Pi'$ there is $s \in \Pi$ such that $s' \subseteq s$. If $\Pi'$ is a refinement of $\Pi$, I will say that $\Pi$ is a coarsening of $\Pi'$. Of course, the refinement relation induces a partial ordering over $\mathfrak{P}$, which I will denote by $\preceq$, where $\Pi' \preceq \Pi$ iff $\Pi'$ is a refinement of $\Pi$. In fact, the resulting partially ordered set $(\mathfrak{P}, \preceq)$ constitutes a lattice, in that any finite, nonempty subset $S$ of $\mathfrak{P}$ admits of an infimum (a coarsest partition that is a refinement of all elements of $S$) and a supremum (a coarsening of each element of $S$ that refines any other partition that coarsens each element of $S$).
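Define now $\mathcal{P} = \bigcup_{\Pi} \mathcal{P}_\Pi$, where $\Pi$ ranges over the finite partitions of $W$, and, for a given $p \in \mathcal{P}$, let $\Pi_p$ denote the domain of $p$. If $\Pi'$ is a refinement of $\Pi$, $p \in \mathcal{P}_\Pi$, and $p' \in \mathcal{P}_{\Pi'}$, I will say that $p'$ is an extension of $p$ to $\Pi'$ (and $p$ is a restriction of $p'$ to $\Pi$) iff for each $s \in \Pi$,
$$p(s) = \sum_{s' \subseteq s} p'(s'),$$
where $s'$ ranges over elements of $\Pi'$. I will say that $p$ is a restriction of $p'$ (and $p'$ an extension of $p$) iff $\Pi_{p'}$ is a refinement of $\Pi_p$ and $p$ is a restriction of $p'$ to $\Pi_p$.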
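A generalized epistemic utility function is a real-valued function $\mathcal{U}$ defined over $\mathcal{P} \times W$ (where $\mathcal{P}$ collects the probability functions over all finite partitions of $W$) such that for each finite partition $\Pi$, the restriction $\mathcal{U}_\Pi$ of $\mathcal{U}$ to $\mathcal{P}_\Pi \times W$ is a truth-directed epistemic utility function for $\Pi$. I will say that a generalized epistemic utility function $\mathcal{U}$ is partition-wise proper iff for each $\Pi$, the restriction $\mathcal{U}_\Pi$ of $\mathcal{U}$ to $\mathcal{P}_\Pi \times W$ is proper. I will say that $\mathcal{U}$ is (partition-wise) strictly proper iff $\mathcal{U}_\Pi$ is strictly proper for all $\Pi$.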
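It is straightforward to define generalized epistemic utility functions that are partition-wise proper. For example, take the generalized version of the Brier score, defined by
$$\mathcal{B}(p, w) = -\sum_{s \in \Pi_p} \bigl(p(s) - \mathbb{1}_s(w)\bigr)^2,$$
where $\Pi_p$ is the domain of $p$ and $\mathbb{1}_s(w)$ equals $1$ if $w \in s$ and $0$ otherwise. It is easy to check that $\mathcal{B}$ is a generalized epistemic utility function that is partition-wise strictly proper. Indeed, for any family of functions $(\mathcal{U}_\Pi)$ such that $\mathcal{U}_\Pi$ is a strictly proper, truth-directed epistemic utility function for $\Pi$ for each finite partition $\Pi$, the function $\bigcup_\Pi \mathcal{U}_\Pi$ is a generalized epistemic utility function that is partition-wise strictly proper.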
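As a rough numerical illustration of strict propriety on a fixed partition, the following Python sketch compares a credence function's self-expectation under a Brier-style score with the expectation it assigns to each alternative on a grid. The representation (worlds as integers, cells as frozensets, credence functions as dicts) and the names `brier`, `expectation`, and `smallest_gap` are assumptions made for this illustration, as is the unnormalized form of the score; a grid check is of course no substitute for a proof.

```python
def brier(q, w):
    """Unnormalized Brier utility of q (dict: cell -> probability) at world w."""
    return -sum((prob - (1.0 if w in cell else 0.0)) ** 2 for cell, prob in q.items())

def expectation(p, q):
    """Expectation of brier(q, .) relative to p, assuming p's cells refine q's."""
    # Any world in a cell of p serves as a representative, because the score
    # is constant on the cells of p whenever p's partition refines q's.
    return sum(prob * brier(q, next(iter(cell))) for cell, prob in p.items())

cells = (frozenset({0}), frozenset({1}), frozenset({2}))
p = dict(zip(cells, (0.5, 0.3, 0.2)))

# Smallest advantage of p over any other grid point q on the same partition.
smallest_gap = min(
    expectation(p, p) - expectation(p, dict(zip(cells, (i / 10, j / 10, (10 - i - j) / 10))))
    for i in range(11)
    for j in range(11 - i)
    if (i, j) != (5, 3)  # skip q = p itself
)
print(smallest_gap > 0)
```

For this score the gap equals the squared Euclidean distance between the two credence vectors, which is why it is positive for every alternative distinct from `p`.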
If we are working with a fixed partition and only considering probability functions defined over that partition, a strictly proper epistemic utility function for that partition ensures the kind of immodesty that is allegedly a feature of epistemic rationality (Lewis 1971). And in the context of elicitation, strictly proper epistemic utility functions for a given partition can be used to devise systems of penalties and rewards that ensure the kind of honest reporting of forecasts over that partition that made epistemic utility functions, or scoring rules, play the starring role in a wide body of literature.12
Once we relax the assumption that we are working with a fixed partition, however, partition-wise strict propriety does not suffice to ensure immodesty, nor to encourage honest reporting. To see why, first note that for any $q \in \mathcal{P}$, our assumptions so far allow us to define the expectation of $\mathcal{U}(q)$ relative to any $p$ defined over a refinement of $\Pi_q$, and in fact, where $p'$ is the restriction of $p$ to the domain of $q$, we have:
$$E_p[\mathcal{U}(q)] = E_{p'}[\mathcal{U}(q)]. \tag{1}$$
We can now see, using the Brier score as our epistemic utility function, that any probability function that is not maximally opinionated—any probability function that assigns values other than 0 or 1 to some propositions—will assign a greater expected epistemic utility to a probability function other than itself.14 (Consequently, if we do not fix a partition but allow a forecaster to choose which partition to report her forecasts on, she will expect to do better by reporting a strict restriction of her credence function as long as her credence function is not maximally opinionated.15)
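Example 1. Suppose $p \in \mathcal{P}$ is not maximally opinionated. Let $\Pi$ be a coarsening of $\Pi_p$ such that the restriction $p'$ of $p$ to $\Pi$ is maximally opinionated. Note now that for $s \in \Pi$ with $p'(s) = 1$ and any $w \in s$, $\mathcal{B}(p', w) = 0$, and hence that
$$E_p[\mathcal{B}(p')] = 0.$$
Now let $s \in \Pi_p$ be such that $0 < p(s) < 1$, and let $w \in s$. By definition,
$$\mathcal{B}(p, w) \leq -(1 - p(s))^2 < 0,$$
and thus
$$E_p[\mathcal{B}(p)] < 0 = E_p[\mathcal{B}(p')].$$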
□
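The comparison in Example 1 can be reproduced numerically. The sketch below is an illustration under assumed conventions (worlds as integers, cells as frozensets, credence functions as dicts, an unnormalized Brier-style score); the names `brier`, `expectation`, `own`, and `coarse` are hypothetical. It takes a credence function over three cells and compares its self-expectation with the expectation it assigns to its restriction to the one-cell partition.

```python
def brier(q, w):
    """Unnormalized Brier utility of q (dict: cell -> probability) at world w."""
    return -sum((prob - (1.0 if w in cell else 0.0)) ** 2 for cell, prob in q.items())

def expectation(p, q):
    """Expectation of brier(q, .) relative to p, assuming p's cells refine q's."""
    return sum(prob * brier(q, next(iter(cell))) for cell, prob in p.items())

fine = (frozenset({0}), frozenset({1}), frozenset({2}))
p = dict(zip(fine, (0.5, 0.3, 0.2)))

# Restriction of p to the trivial partition: it is maximally opinionated.
restriction = {frozenset({0, 1, 2}): 1.0}

own = expectation(p, p)               # sum of squared probabilities minus one
coarse = expectation(p, restriction)  # the restriction incurs no penalty at any world
print(own, coarse, coarse > own)
```

Under these assumptions `p` expects its own restriction to do strictly better than it does itself, which is the failure of immodesty the example describes.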
An interesting question, then, is whether there are epistemic utility functions that capture the relevant kind of immodesty once we consider probability functions defined over any partition. In other words, the question is whether there are epistemic utility functions such that, for any probability function $p$, $p$ 'takes itself' to be doing better than any other in terms of epistemic utility. But in order to answer this question, of course, we need to make clear what it is for some probability function to 'take itself' to be doing better than another in terms of epistemic utility. After all, we cannot just use the familiar notion of expectation here since, in general, for given $p, q \in \mathcal{P}$, the expectation of $\mathcal{U}(q)$ relative to $p$ is not well-defined.
Before turning to this question, let me introduce a few more pieces of terminology. Fix $p \in \mathcal{P}$ and let $\Pi$ be some finite partition of $W$. I will denote by $[p]_\Pi$ the collection of all extensions of $p$ to the coarsest common refinement of $\Pi_p$ and $\Pi$; thus, each $p'$ in $[p]_\Pi$ will be an extension of $p$ whose domain refines both $\Pi_p$ and $\Pi$. Slightly abusing notation, for a given $p$ and $q$, I will use $[p]_q$ as shorthand for $[p]_{\Pi_q}$. (Note that if $\Pi$ is a refinement of $\Pi_p$, $[p]_\Pi$ is just the set of extensions of $p$ to $\Pi$, and that if $\Pi$ is a coarsening of $\Pi_p$, $[p]_\Pi$ is just the singleton $\{p\}$, since the coarsest common refinement is then $\Pi_p$ itself.)
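It will be convenient to also have at our disposal three different quantities which (albeit imperfectly) summarize some of the information about how $p$ and $q$ compare relative to members of $[p]_q$. First, define the lower expectation of $\mathcal{U}(q)$ relative to $p$, which I denote by $\underline{E}_p[\mathcal{U}(q)]$, by
$$\underline{E}_p[\mathcal{U}(q)] = \inf_{p' \in [p]_q} E_{p'}[\mathcal{U}(q)].$$
Similarly, define the upper expectation of $\mathcal{U}(q)$ relative to $p$, which I denote by $\overline{E}_p[\mathcal{U}(q)]$, by
$$\overline{E}_p[\mathcal{U}(q)] = \sup_{p' \in [p]_q} E_{p'}[\mathcal{U}(q)].$$
Finally, for $\alpha \in [0, 1]$, we can define the $\alpha$-expectation of $\mathcal{U}(q)$ relative to $p$, which I denote by $E^\alpha_p[\mathcal{U}(q)]$, by
$$E^\alpha_p[\mathcal{U}(q)] = \alpha\, \underline{E}_p[\mathcal{U}(q)] + (1 - \alpha)\, \overline{E}_p[\mathcal{U}(q)].$$
Intuitively, the lower expectation of $\mathcal{U}(q)$ relative to $p$ can be thought of as $p$'s worst-case estimate for the value of $q$; similarly, the upper expectation of $\mathcal{U}(q)$ relative to $p$ can be thought of as $p$'s best-case estimate for the value of $q$. (For a given $\alpha$, the $\alpha$-expectation of $\mathcal{U}(q)$ relative to $p$ is a weighted average of the two estimates.)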
Clearly,
with equality if , in which case
Also note that for any ,
(2)
so that for any , we have:
(3)
Given all of these resources, we have two ways of formulating a generalized immodesty principle. Say that an epistemic utility function $\mathcal{U}$ is universally $\overline{E}$-proper iff for each $p, q \in \mathcal{P}$,
$$E_p[\mathcal{U}(p)] \geq \overline{E}_p[\mathcal{U}(q)],$$
and strictly universally $\overline{E}$-proper iff the above inequality is strict whenever $q \neq p$. Say that it is universally $\underline{E}$-proper iff for each $p, q \in \mathcal{P}$,
$$E_p[\mathcal{U}(p)] \geq \underline{E}_p[\mathcal{U}(q)],$$
and strictly universally $\underline{E}$-proper iff the above inequality is strict whenever $q \neq p$. The two generalized immodesty principles I will consider are (strict) universal $\overline{E}$-propriety, the claim that all epistemic utility functions must be (strictly) universally $\overline{E}$-proper, and (strict) universal $\underline{E}$-propriety, the claim that all epistemic utility functions must be (strictly) universally $\underline{E}$-proper. My question will be whether there are any epistemic utility functions that satisfy any of these principles.
Before turning to this question, I want to spend some time explaining why these two principles stand out among other plausible generalizations as worthy of our attention. (Those who find $\overline{E}$-propriety and $\underline{E}$-propriety independently interesting are welcome to skip to the next section.)
One way to think about immodesty is as the claim that epistemic utility functions should make all coherent credence functions immodest in the following sense: an agent with that credence function will think her own credence function is choice-worthy—and perhaps uniquely so—among alternative credence functions she could have and relative to that epistemic utility function. When the alternatives all have a well-defined expectation, and on the assumption that an option is choice-worthy if it maximizes expected utility, immodesty thus understood amounts to the claim that any epistemic utility function should be proper or strictly proper. So in order to formulate generalizations of immodesty to the case where alternative credence functions lack a well-defined expectation, we need to consider alternative ways of identifying when a credence function is choice-worthy among a given set of alternatives.
The literature on decision-making with imprecise probabilities contains a number of options we can make use of: rules for deciding between options whose outcomes depend on the state of the world when we do not have well-defined credences for each of the relevant states of the world.19 Each of them can be used to formulate a way to understand what it is for a credence function to take itself to be choice-worthy when the alternatives include all credence functions regardless of their domain, and accordingly to formulate a generalized immodesty principle.20
First, we could say that takes itself to be choice-worthy iff it has greater expectation relative to all members of :
(4)
Alternatively, we could say that takes itself to be choice-worthy iff there is no other option that gets greater expectation relative to all members of :
(5)
We could instead say that takes itself to be choice-worthy iff
(6)
Or that takes itself to be choice-worthy iff
(7)
We could also say that takes itself to be choice-worthy iff
(8)
Finally, we could say that takes itself to be choice-worthy iff for a given ,
(9)
For each of these ways of understanding what it is for to take itself to be choice-worthy, we could have a generalized version of weak propriety. Now, any objection to using one of the above principles—the detailed formulation of the principles to the more general decision-theoretic setting need not concern us here—can arguably be used to object to a particular way of making precise the fully general version of immodesty.21 But since it remains largely an open question whether any of the objections to the above principles are decisive, I want to remain neutral as to which is the best way of characterizing a fully general immodesty principle.
Fortunately, these generalizations are not logically independent of one another. To see that, start by fixing and noting that the supremum and infimum in the definitions of upper and lower expectations can be replaced with a maximum and a minimum. (This follows from the fact that is compact in .22) It follows from this and the observation in (3) that (5) and (7) are equivalent to each other; that (4), (6), and (8) are equivalent to each other; and that (6) entails (7). As a result, (7) is weaker than all of (4), (5), (6), and (8). Further, since for a fixed , any counterexample to (7) is itself a counterexample to (6), we have that the weakest form of immodesty we could hope for is given by (7): if does not satisfy (7) for all , it cannot satisfy any of the other generalizations.
Similarly, it follows from these observations that (6) is the strongest generalization of immodesty from among those we have considered. In short, the most we can hope for when formulating a generalized immodesty principle is essentially the requirement that all epistemic utility functions satisfy (6), that is, universal $\overline{E}$-propriety; but at the very least, we want a generalized immodesty principle equivalent to the claim that all epistemic utility functions satisfy (7), that is, universal $\underline{E}$-propriety. The question now is whether there are epistemic utility functions satisfying either of these generalizations.
I begin by asking whether there are any universally $\overline{E}$-proper epistemic utility functions. The answer, perhaps unsurprisingly, is no, at least if we restrict our attention to strictly partition-wise proper epistemic utility functions.
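Say that an epistemic utility function $\mathcal{U}$ is downwards proper iff for each $p \in \mathcal{P}$ and each $q$ defined over a coarsening of $\Pi_p$,
$$E_p[\mathcal{U}(p)] \geq E_p[\mathcal{U}(q)],$$
and strictly downwards proper iff the above inequality is strict whenever $q \neq p$. Say that $\mathcal{U}$ is upwards $\overline{E}$-proper iff for each $p \in \mathcal{P}$ and each $q$ defined over a refinement of $\Pi_p$,
$$E_p[\mathcal{U}(p)] \geq \overline{E}_p[\mathcal{U}(q)],$$
and strictly upwards $\overline{E}$-proper iff the above inequality is strict whenever $q \neq p$.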
Using these definitions we can make a few simple observations. First, and most clearly, (strict) downwards propriety and (strict) upwards $\overline{E}$-propriety individually suffice for (strict) partition-wise propriety. Second, for partition-wise proper epistemic utility functions, (strict) downwards propriety (resp. (strict) upwards $\overline{E}$-propriety) can be established by looking only at comparisons between credence functions and their restrictions (resp. extensions).
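Fact 3.1. Suppose $\mathcal{U}$ is partition-wise proper. Then:
(i) $\mathcal{U}$ is downwards proper iff for each $p \in \mathcal{P}$ and each restriction $p'$ of $p$, $E_p[\mathcal{U}(p)] \geq E_p[\mathcal{U}(p')]$; $\mathcal{U}$ is strictly downwards proper iff $E_p[\mathcal{U}(p)] > E_p[\mathcal{U}(p')]$ whenever $p'$ is a restriction of $p$ and $p' \neq p$.
(ii) $\mathcal{U}$ is upwards $\overline{E}$-proper iff for each $p \in \mathcal{P}$ and each extension $p'$ of $p$, $E_p[\mathcal{U}(p)] \geq \overline{E}_p[\mathcal{U}(p')]$; $\mathcal{U}$ is strictly upwards $\overline{E}$-proper iff $E_p[\mathcal{U}(p)] > \overline{E}_p[\mathcal{U}(p')]$ whenever $p'$ is an extension of $p$ and $p' \neq p$.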
Proof. Only the right-to-left direction of each biconditional is non-trivial, and that of (i) follows immediately from (1) and the fact that if is defined over a coarsening of and is the restriction of to , partition-wise propriety ensures that .
For the right-to-left direction of (ii), simply note that for and with , our assumptions ensure that for each extension of to ,
which ensures .
□
Say that an extension $p'$ of $p$ is opinionated iff for each $s \in \Pi_p$ there is $s' \in \Pi_{p'}$ with $s' \subseteq s$ and $p'(s') = p(s)$; in other words, an extension $p'$ is opinionated if for each cell $s$ of $\Pi_p$, $p'$ assigns all of the probability $p$ assigns to $s$ to a single one of its subsets in $\Pi_{p'}$. In order to determine the value of the upper or lower expectation of any $q$ defined over a refinement of $\Pi_p$, all we need to look at are the opinionated extensions of $p$ defined over $\Pi_q$.
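Fact 3.2. Fix an epistemic utility function $\mathcal{U}$, a probability function $p \in \mathcal{P}$ and any $q$ defined over a refinement of $\Pi_p$. There are opinionated extensions $p^+$ and $p^-$ of $p$ defined over $\Pi_q$ such that
$$E_{p^+}[\mathcal{U}(q)] = \overline{E}_p[\mathcal{U}(q)] \quad \text{and} \quad E_{p^-}[\mathcal{U}(q)] = \underline{E}_p[\mathcal{U}(q)].$$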
Proof. For each , pick , with and , such that for all with
and let and be the unique opinionated extensions of such that for all , and .
□
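Fact 3.2 suggests a simple way to compute upper and lower expectations in finite cases: enumerate the opinionated extensions and take the largest and smallest expectation. The Python sketch below does this for a Brier-style score. It is an illustration only, and it assumes that worlds are integers, cells are frozensets, credence functions are dicts, and that the partition of `q` refines that of `p`; the names `brier`, `expectation`, `opinionated_extensions`, and `upper_lower` are hypothetical.

```python
from itertools import product

def brier(q, w):
    """Unnormalized Brier utility of q (dict: cell -> probability) at world w."""
    return -sum((prob - (1.0 if w in cell else 0.0)) ** 2 for cell, prob in q.items())

def expectation(p, q):
    """Expectation of brier(q, .) relative to p, assuming p's cells refine q's."""
    return sum(prob * brier(q, next(iter(cell))) for cell, prob in p.items())

def opinionated_extensions(p, fine_cells):
    """Yield each extension of p that puts all of p(s) on a single sub-cell of s."""
    coarse_cells = list(p)
    options = [[t for t in fine_cells if t <= s] for s in coarse_cells]
    for choice in product(*options):
        ext = {t: 0.0 for t in fine_cells}
        for s, t in zip(coarse_cells, choice):
            ext[t] = p[s]
        yield ext

def upper_lower(p, q):
    """Upper and lower expectation of q relative to p, via opinionated extensions."""
    values = [expectation(ext, q) for ext in opinionated_extensions(p, list(q))]
    return max(values), min(values)

p = {frozenset({0, 1}): 0.6, frozenset({2}): 0.4}
q = {frozenset({0}): 0.5, frozenset({1}): 0.1, frozenset({2}): 0.4}
upper, lower = upper_lower(p, q)
print(upper, lower)
```

The enumeration suffices because the expectation is linear in the extension, and every extension is a mixture of opinionated ones, so the extreme values are attained at opinionated extensions.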
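A consequence of the last two results is that, to determine whether $\mathcal{U}$ is upwards $\overline{E}$-proper, we do not really need to compute upper expectations.
Corollary 3.3. A partition-wise proper epistemic utility function $\mathcal{U}$ is upwards $\overline{E}$-proper (resp. strictly upwards $\overline{E}$-proper) iff for each $p \in \mathcal{P}$ and each opinionated extension $p'$ of $p$,
$$E_p[\mathcal{U}(p)] \geq E_{p'}[\mathcal{U}(p')]$$
(resp. the inequality is strict when $p' \neq p$).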
Proof. The left-to-right direction follows immediately from (1) and the definition of upper expectation. For the right-to-left direction, take and fix defined over a refinement of . From Fact 3.2, we know that there is an opinionated extension of to such that .
But by assumption,
(resp. the above inequality is strict when ), and from partition-wise propriety we know that
We can thus conclude that (resp. when ). From Fact 3.1, we conclude that is upwards -proper (resp. strictly upwards -proper).
□
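Corollary 3.4. A partition-wise proper epistemic utility function $\mathcal{U}$ is upwards $\overline{E}$-proper (resp. strictly upwards $\overline{E}$-proper) iff for each $p \in \mathcal{P}$ and each extension $p'$ of $p$, $E_p[\mathcal{U}(p)] \geq E_{p'}[\mathcal{U}(p')]$ (resp. $E_p[\mathcal{U}(p)] > E_{p'}[\mathcal{U}(p')]$ when $p' \neq p$).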
Proof. The right-to-left direction is an immediate consequence of Corollary 3.3. For the converse, simply note that if is an opinionated extension of such that
the definition of upper expectation entails that , so the left-to-right direction of Corollary 3.3. yields the desired result.
□
Now, a natural question to ask is whether there are epistemic utility functions that are both (strictly) upwards $\overline{E}$-proper and (strictly) downwards proper. But this just turns out to be the question whether there are universally $\overline{E}$-proper epistemic utility functions.
Fact 3.5. An epistemic utility function is universally $\overline{E}$-proper (resp. strictly universally $\overline{E}$-proper) iff it is downwards proper (resp. strictly downwards proper) and upwards $\overline{E}$-proper (resp. strictly upwards $\overline{E}$-proper).
Proof. The left-to-right direction is immediate. For the right-to-left direction, suppose is downwards proper and upwards -proper and fix . Let be an arbitrary probability function in , so that is an extension of to the coarsest partition that refines both and . From the fact that is upwards -proper, together with Corollary 3.4, we know that
And since is downwards proper, we know that
We thus have that for any in , , which entails , as desired. If is both strictly upwards -proper and strictly downwards proper, then for any we know that cannot equal both and , and thus either we have or ; either way, we can conclude that , and thus that , as desired.
□
And as announced above, there just are no strictly universally $\overline{E}$-proper epistemic utility functions.
Theorem 3.6. There are no strictly universally $\overline{E}$-proper epistemic utility functions.
Proof. Suppose is strictly downwards proper. Fix and with , and let be some extension of to . Strict downwards propriety entails . And combined with the definition of upper-expectation and (1), this entails
which shows that is not strictly upwards -proper.
□
Finally, we can strengthen Theorem 3.6 if we restrict ourselves to the class of partition-wise strictly proper epistemic utility functions.
Theorem 3.7. There are no universally $\overline{E}$-proper epistemic utility functions that are strictly partition-wise proper.
Proof. Let be strictly partition-wise proper, and suppose is downwards proper. Pick and let be a refinement of such that for all , . Let be an extension of to that is not opinionated and pick and such that and . From Fact 3.2 and the fact that is downwards proper we know, again using (1) and the definition of upper-expectation, that there is an opinionated extension of defined over such that
Since by construction , strict partition-wise propriety ensures that
Putting all of this together and using the definition of upper-expectation, we have that there is an extension of such that
which shows that is not upwards -proper.
□
The next question to ask is whether there are any universally $\underline{E}$-proper epistemic utility functions. If we require that epistemic utility functions be continuous, the answer to this question also turns out to be no; again, at least if we restrict ourselves to the class of strictly partition-wise proper epistemic utility functions.
Much like in the previous section, I will define an analogue of upwards $\overline{E}$-propriety that relies on the lower expectation, rather than on the upper expectation, in the obvious way: $\mathcal{U}$ is upwards $\underline{E}$-proper iff for each $p \in \mathcal{P}$ and each $q$ defined over a refinement of $\Pi_p$,
$$E_p[\mathcal{U}(p)] \geq \underline{E}_p[\mathcal{U}(q)].$$
$\mathcal{U}$ is strictly upwards $\underline{E}$-proper iff the above inequality is strict whenever $q \neq p$.
Before asking whether there are strictly universally $\underline{E}$-proper epistemic utility functions, we could ask whether there are any epistemic utility functions that are both strictly downwards proper and strictly upwards $\underline{E}$-proper. If we restrict ourselves to the class of continuous epistemic utility functions, we can answer this question in the negative.
Theorem 3.8. If $\mathcal{U}$ is continuous and strictly downwards proper, then it is not upwards $\underline{E}$-proper.
So we can conclude that if $\mathcal{U}$ is continuous, it is not strictly universally $\underline{E}$-proper.
Corollary 3.9. There are no continuous, strictly universally $\underline{E}$-proper epistemic utility functions.
□
Proof of Theorem 3.8. This result is a straightforward consequence of the following lemma (essentially due to Grünwald & Dawid 2004), a proof of which is in the appendix.
Lemma 3.10. Suppose is continuous and partition-wise proper. For any and any , there is some such that, for all , and all
Suppose now is continuous and strictly downwards proper, fix , and let with . Lemma 3.10 ensures that there is with
And since is strictly downwards proper and is a restriction of , we can conclude that
which means is not upwards -proper.
□
Before concluding this subsection, let me note two consequences of Lemma 3.10, which serve as counterparts to Corollary 3.4 and Fact 3.5.
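Fact 3.11. Suppose $\mathcal{U}$ is partition-wise proper and continuous. Then $\mathcal{U}$ is upwards $\underline{E}$-proper (resp. strictly upwards $\underline{E}$-proper) iff for each $p \in \mathcal{P}$ and each refinement $\Pi$ of $\Pi_p$ there is $p' \in [p]_\Pi$ such that $E_p[\mathcal{U}(p)] \geq E_{p'}[\mathcal{U}(p')]$ (resp. $E_p[\mathcal{U}(p)] > E_{p'}[\mathcal{U}(p')]$ if $\Pi \neq \Pi_p$).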
Proof. From Lemma 3.10, we know that for each and each there is such that
The left-to-right direction now follows immediately (simply let ). For the right-to-left direction, simply note that for each and we have with (resp. ). But of course,
where the last equality follows from Lemma 3.10. We can thus conclude that is upwards -proper (resp. strictly upwards -proper).
□
Fact 3.12. A continuous epistemic utility function is (strictly) universally $\underline{E}$-proper iff it is (strictly) upwards $\underline{E}$-proper and (strictly) downwards proper.
Proof. Again, we only need to show the right-to-left direction, so fix . From Fact 3.11 and the fact that is continuous and upwards -proper, we know that there is such that . But the fact that is downwards proper entails that , so that
as desired. If is strictly upwards -proper and strictly downwards proper, then repeat the above reasoning after first assuming is neither a refinement nor a coarsening of , so that is either different from or from .
□
We have seen that there are no strictly universally $\overline{E}$-proper or $\underline{E}$-proper epistemic utility functions. But we can easily find examples of downwards proper and upwards $\overline{E}$-proper (and hence upwards $\underline{E}$-proper) epistemic utility functions.
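Say that an epistemic utility function $\mathcal{U}$ is an additive accuracy measure iff there is a function $u$ on $[0, 1] \times \{0, 1\}$ such that
$$\mathcal{U}(p, w) = \sum_{s \in \Pi_p} u\bigl(p(s), \mathbb{1}_s(w)\bigr),$$
where $\mathbb{1}_s(w)$ equals $1$ if $w \in s$ and $0$ otherwise.
Say that a function $u$ on $[0, 1] \times \{0, 1\}$ is proper iff for all $x, y \in [0, 1]$,
$$x\, u(x, 1) + (1 - x)\, u(x, 0) \geq x\, u(y, 1) + (1 - x)\, u(y, 0),$$
and say that $u$ is strictly proper iff the above inequality is strict whenever $y \neq x$.
If $\mathcal{U}$ is an additive accuracy measure, I will call $u$ its local accuracy measure. It is easy to see that an additive accuracy measure is partition-wise proper (resp. strictly proper) iff its local accuracy measure is proper (resp. strictly proper).
For a given local accuracy measure $u$ I will call $\hat{u}$ its self-expectation function, where
$$\hat{u}(x) = x\, u(x, 1) + (1 - x)\, u(x, 0).$$
The linearity of expectation ensures that if $\mathcal{U}$ is an additive accuracy measure with local accuracy measure $u$,
$$E_p[\mathcal{U}(p)] = \sum_{s \in \Pi_p} \hat{u}(p(s)).$$
From this we can easily derive the following characterization result.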
Theorem 3.13. An additive accuracy measure with a proper local accuracy measure $u$ is downwards proper (resp. strictly downwards proper) iff its self-expectation function $\hat{u}$ is subadditive, in that for $x, y \in [0, 1]$ with $x + y \leq 1$,
$$\hat{u}(x + y) \leq \hat{u}(x) + \hat{u}(y)$$
(resp. strictly subadditive, in that the above inequality is always strict).
□
Proof. For the left-to-right direction, start by taking a three celled partition and let be a coarsening of (and hence, ). Fix with and let be the unique probability function in assigning to and to . Let be the restriction of to and note that
and
Since is downwards proper, we know that , and hence that , as required.
For the converse, fix and with . In other words, is a coarsening of such that there are with such that . Let be the restriction of to , set , , and note that, letting range over elements of ,
and
Since is subadditive, we conclude that . A simple inductive argument on the size of shows that for any and any restriction of , , as required.
Parallel reasoning shows that, for proper , strict subadditivity is equivalent to strict downwards propriety.
□
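The subadditivity condition in Theorem 3.13 is easy to probe numerically. The Python sketch below checks it on a grid for two local measures: the Brier-style local measure, and a shifted variant that adds a constant to each cell's score. The shifted variant is a hypothetical example introduced only for contrast, and the names `self_expectation` and `is_subadditive`, the grid size, and the tolerance are likewise assumptions of this illustration.

```python
def self_expectation(local, x):
    """x * local(x, 1) + (1 - x) * local(x, 0)."""
    return x * local(x, 1) + (1 - x) * local(x, 0)

def brier_local(x, i):
    """Brier-style local measure: negative squared distance from the truth-value i."""
    return -(x - i) ** 2

def shifted_brier_local(x, i):
    """Hypothetical variant: the same penalty plus a constant reward of 1 per cell."""
    return 1.0 - (x - i) ** 2

def is_subadditive(local, steps=50, tol=1e-12):
    """Check f(x + y) <= f(x) + f(y) on a grid of x, y >= 0 with x + y <= 1."""
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            combined = self_expectation(local, (i + j) / steps)
            separate = self_expectation(local, i / steps) + self_expectation(local, j / steps)
            if combined > separate + tol:
                return False
    return True

print(is_subadditive(brier_local), is_subadditive(shifted_brier_local))
```

The Brier-style self-expectation fails the check, in line with Example 1, while the shifted variant passes it on this grid, since the per-cell constant rewards finer partitions.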
As we saw in Example 1, the (generalized) Brier score is not downwards proper, but the (generalized version of the) well-known spherical score (which, like the Brier score, is an additive accuracy measure) is strictly downwards proper.
Example 2. Define by
and let
Clearly, the restriction of to any partition is just the familiar spherical score, which is strictly proper, so that is strictly partition-wise proper. But is also strictly downwards proper.
To see why, note that for any ,
where is the Euclidean norm.
Since the Euclidean norm is a norm, it satisfies the triangle inequality, and thus for any with ,
which means is strictly subadditive and thus that is strictly downwards proper.
□
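We also need not look far to find an example of an upwards $\overline{E}$-proper additive accuracy measure.
Example 3. Let
$$\mathcal{L}(p, w) = \log p(s_w),$$
where $s_w$ is the unique $s \in \Pi_p$ with $w \in s$. As is well known, $\mathcal{L}$ is partition-wise proper. But it is also upwards $\overline{E}$-proper. To see that, first note that for any $p \in \mathcal{P}$ (with the convention $0 \log 0 = 0$),
$$E_p[\mathcal{L}(p)] = \sum_{s \in \Pi_p} p(s) \log p(s).$$
Fix now $p \in \mathcal{P}$, let $p'$ be an opinionated extension of $p$, and for each $s \in \Pi_p$ let $s'$ denote the unique $s' \in \Pi_{p'}$ with $s' \subseteq s$ and $p'(s') = p(s)$. Note now that
$$E_{p'}[\mathcal{L}(p')] = \sum_{s \in \Pi_p} p'(s') \log p'(s').$$
And of course,
$$\sum_{s \in \Pi_p} p'(s') \log p'(s') = \sum_{s \in \Pi_p} p(s) \log p(s) = E_p[\mathcal{L}(p)].$$
From Corollary 3.3 we conclude that $\mathcal{L}$ is upwards $\overline{E}$-proper, and thus upwards $\underline{E}$-proper.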
□
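The key step in Example 3, that an opinionated extension has the same self-expected log score as the function it extends, can be checked on a small case. The Python sketch below is an illustration under assumed conventions (worlds as integers, cells as frozensets, credence functions as dicts, and $0 \log 0 = 0$); the names `log_score`, `expectation`, `coarse`, `opinionated`, and `spread` are hypothetical.

```python
import math

def log_score(q, w):
    """Log utility of q at w: the log of the probability q gives to w's cell."""
    prob = next(pr for cell, pr in q.items() if w in cell)
    return math.log(prob) if prob > 0 else float("-inf")

def expectation(p, q):
    """Expectation of log_score(q, .) relative to p, assuming p's cells refine q's."""
    # Zero-probability cells are skipped, which implements 0 * log 0 = 0.
    return sum(pr * log_score(q, next(iter(cell))) for cell, pr in p.items() if pr > 0)

coarse = {frozenset({0, 1}): 0.6, frozenset({2}): 0.4}
opinionated = {frozenset({0}): 0.6, frozenset({1}): 0.0, frozenset({2}): 0.4}
spread = {frozenset({0}): 0.3, frozenset({1}): 0.3, frozenset({2}): 0.4}

own = expectation(coarse, coarse)
print(own, expectation(opinionated, opinionated), expectation(spread, spread))
```

In this case the opinionated extension matches the self-expectation of `coarse`, while the non-opinionated extension `spread` has a strictly lower self-expectation.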
Note that the log score is also an additive accuracy measure with local accuracy measure $l$, where
$$l(x, 1) = \log x \quad \text{and} \quad l(x, 0) = 0.$$
Note too that $l(0, 0) = 0$. Interestingly, any additive accuracy measure whose local accuracy measure $u$ satisfies $u(0, 0) \leq 0$ will be upwards $\overline{E}$-proper, as the following makes clear.
Theorem 3.14. An additive accuracy measure $\mathcal{U}$ is upwards $\overline{E}$-proper (resp. strictly upwards $\overline{E}$-proper) iff $u(0, 0) \leq 0$ (resp. $u(0, 0) < 0$), where $u$ is $\mathcal{U}$'s local accuracy measure.
Proof. Suppose is an upwards -proper additive accuracy measure with local accuracy measure . Take a two cell partition of . Let be the unique probability function in that assigns probability to and let be the unique probability function defined over the trivial partition . Of course is an opinionated extension of , so that the upwards -propriety of and Fact 3.2 entails . But , and , and thus u(0,0) ≤ 0. A similar argument shows that if is strictly upwards -proper, then u(0,0) < 0.
To establish the other direction, fix and let be an opinionated extension of . For each , let be the unique such that and , and let .
Note now that
And since clearly
u(0,0) ≤ 0 (resp. u(0,0) < 0) entails that (resp. , and hence, using Fact 3.2, we can conclude that is upwards -proper (resp. strictly upwards -proper).
□
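The role of u(0,0) in this proof can be seen in a small numerical sketch. It is illustrative only and rests on several assumptions: worlds are identified with cells of the finer partition, an additive accuracy measure sums a local score `u(x, v)` over cells, and the Brier-style local score below stands in for the measure of Example 3, whose formula is not reproduced in the text. The names `additive_score`, `self_expected`, and `shifted` are hypothetical. The sketch checks that adding m zero-probability cells changes a function's self-expected score by exactly m times u(0,0).

```python
import math

def brier_local(x, v):
    """Brier-style local accuracy of credence x in a cell with truth value v (0 or 1)."""
    return -(v - x) ** 2

def shifted(x, v):
    """Hypothetical local score with u(0, 0) < 0: each credence assignment carries a small cost."""
    return brier_local(x, v) - 0.1

def additive_score(u, p, true_cell):
    """Additive accuracy of p at a world lying in cell `true_cell`."""
    return sum(u(x, 1 if j == true_cell else 0) for j, x in enumerate(p))

def self_expected(u, p):
    """Expected additive accuracy of p by p's own lights."""
    return sum(p[i] * additive_score(u, p, i) for i in range(len(p)))

p = [0.7, 0.3]                # defined over a two-cell partition
p_ext = [0.7, 0.0, 0.3, 0.0]  # opinionated extension: each cell split in two, all mass on one half

for u in (brier_local, shifted):
    gap = self_expected(u, p_ext) - self_expected(u, p)
    extra_cells = len(p_ext) - len(p)
    print(u.__name__, round(gap, 10), round(extra_cells * u(0, 0), 10))
```

With the Brier-style score the gap vanishes, so the extension gains nothing; with the shifted score the extension does strictly worse by its own lights. That is the pattern the theorem's condition on u(0,0) tracks.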
Now, it is well known28 that if a local accuracy measure is proper (resp. strictly proper), the function is convex (resp. strictly convex), in the sense that for each and ,
(resp. the above inequality is always strict). And it is well-known (see, e.g., Bruckner 1962) that for a convex over , (resp. ) iff is superadditive (resp. strictly superadditive), in the sense that for with ,
(resp. the above inequality is always strict). Since by definition of , we can put all these observations together with Theorem 3.14, to establish the following analogue of Theorem 3.13:
Corollary 3.15. An additive accuracy measure is upwards -proper (resp. strictly upwards -proper) iff its local accuracy measure is proper and is superadditive (resp. strictly superadditive).
□
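The link between convexity, the value at zero, and superadditivity can be illustrated on a concrete case. The sketch below assumes that the convex function at issue is the self-expected local score f(x) = x·u(x,1) + (1−x)·u(x,0); this is a guess at the elided definition, chosen because it makes f(0) = u(0,0). It also assumes a Brier-style local score. Both checks run on a finite grid, so they illustrate the claim rather than prove it.

```python
def brier_local(x, v):
    """Brier-style local accuracy of credence x in a cell with truth value v (0 or 1)."""
    return -(v - x) ** 2

def f(x, u=brier_local):
    """Self-expected local score of credence x (assumed form of the convex function)."""
    return x * u(x, 1) + (1 - x) * u(x, 0)

grid = [i / 20 for i in range(21)]
TOL = 1e-12  # slack for floating-point rounding

# Midpoint convexity on the grid: f((x + y) / 2) <= (f(x) + f(y)) / 2
convex = all(f((x + y) / 2) <= (f(x) + f(y)) / 2 + TOL for x in grid for y in grid)

# Superadditivity: f(x + y) >= f(x) + f(y) whenever x + y <= 1
superadditive = all(
    f(x + y) >= f(x) + f(y) - TOL
    for x in grid for y in grid if x + y <= 1
)

print(f(0), convex, superadditive)
```

For this local score f(x) reduces to −x(1−x), so f(x+y) − f(x) − f(y) = 2xy, which is non-negative, and f(0) = 0, in line with the weak (non-strict) case.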
To conclude this section, let me state one final characterization result, this time for the class of upwards -proper additive accuracy measures.
Theorem 3.16. A continuous, additive accuracy measure with a strictly proper local accuracy measure is upwards -proper (resp. strictly upwards -proper) iff for each there are with and (resp. ).
Proof. We know from Fact 3.11 that is upwards -proper iff for each and each there is such that
and that is strictly upwards -proper iff the above inequality is always strict whenever . For the left-to-right direction, assume is upwards -proper (resp. strictly upwards -proper). Given , take defined over a two-celled partition with and take a three-celled refinement of . From Fact 3.11 we know that there is some such that (resp. ). Let and , and note that the above inequality entails that .
For the right-to-left direction, start by fixing , with , and let and be such that . Let and fix with such that . Let be the unique extension of to that assigns probability to , and note that, letting range over ,
and
whence . A simple induction argument on the size of allows us to conclude that for each there is with , and thus that is upwards -proper. Parallel reasoning shows that if for each there are with such that , is strictly upwards -proper.
□
Surprisingly, it follows from this that for additive accuracy measures, upwards -propriety and -propriety coincide:
Corollary 3.17. A continuous, additive accuracy measure with a strictly proper local accuracy measure is upwards -proper (resp. strictly upwards -proper) iff it is upwards -proper (resp. strictly upwards -proper).
Proof. Apply Theorem 3.16 with , to conclude that if is upwards -proper (resp. strictly upwards -proper), then , since
Using Theorem 3.14, we conclude that is upwards -proper. Strictly parallel reasoning shows that if is strictly -proper, then it is strictly upwards -proper.
□
According to the standard, Bayesian picture we have been taking for granted, an agent’s epistemic state can be adequately represented with a single probability function. But many think this is a mistake: on their view, an agent’s epistemic state is best represented not with a single probability function but with a set thereof. This view can model any agent the more standard Bayesian picture can equally well—identify each probability function with its singleton set. But it is, at least on the face of it, more flexible. It can, for example, represent the kind of epistemic state most of us are arguably in with respect to the proposition that the last person to arrive in Australia in the year 2000 was wearing a white shirt: a state that seems hard to represent by assigning any one number to that proposition.
Grant that proponents of this dissenting view are right—grant, in other words, that one can be in the kind of epistemic state that is better modeled with a set of probability functions than with a single probability function.29 An interesting question is whether it is ever epistemically rational to be in the kind of state that cannot be aptly represented with a unique probability function.
There has been much debate around this question and it is not my purpose here to take a stance either way.30 But a family of related and interesting results that emerged from this debate bear some resemblance to the results established in this paper and it is worth clarifying exactly how they differ from my results.31
In the literature on epistemic utility theory, it is by and large taken for granted that something like the following principle captures an important relationship between epistemic utility and epistemic rationality:32
Dominance: If for any world , the epistemic utility of at is no greater than that of at , and if for some world , the epistemic utility of at is strictly less than that of at , then is epistemically irrational.
So, much attention has been paid to the question of what kinds of reasonable epistemic utility functions can be defined that allow us to compare the epistemic utility of a ‘precise’ credence function at a world with that of an ‘imprecise’ one. Here we think of sets of probability functions as ‘imprecise’ or ‘indeterminate’ credence functions, since for many propositions they do not determine a unique degree of credence.
For example, generalizing some results in Schoenfield (2017), Berger and Das (2020) have argued that for any imprecise credence function there is a precise credence function that is at least as accurate relative to any world—at least given some assumptions about what a measure of accuracy must be like. And this, at least if we think that epistemic utility functions should be measures of accuracy, arguably shows that no epistemic utility function can be strictly upwards -proper, and a fortiori that no epistemic utility function can be strictly universally -proper or strictly universally -proper.
To see why, note that for a fixed and a refinement of , we can identify any probability function defined over with an imprecise probability function defined over —essentially, we identify with (see footnote 17). Berger and Das’s results can then be used to show that on any reasonable measure of accuracy, for any defined over there is some defined over that is as accurate as relative to any state of the world. Thus, if we identify epistemic utility functions with measures of accuracy satisfying their constraints, their result can be used to show that for any , any refinement of , and any defined over , there is defined over such that for any , the epistemic utility of at equals that of at . And this in turn would suffice to show that there are no strictly upwards -proper epistemic utility functions.
Now, we can first observe that in a sense my results are more general, in that they do not make any substantive assumptions about epistemic utility functions—at most, we assume that epistemic utility functions are continuous and truth-directed.33 I do not, for instance, assume that epistemic utility is atomistic (I do not assume that the epistemic utility of a credence function at a world is determined by the utility of the individual credence assignments that make up that credence function at that world), nor that it is extensional (I do not assume that the epistemic utility of a credence function at a world is independent of the content of the propositions it assigns credence to).34
But there is a more significant difference between my results and those from the literature on imprecise probability functions. The question at the center of impossibility results for imprecise probability functions takes as given a fixed partition and asks whether there are reasonable ways of measuring accuracy or epistemic utility for that partition that will sometimes have imprecise probability functions doing better than precise probability functions. And one assumption all in the literature seem to take for granted—an assumption which seems perfectly natural given the presuppositions of the question—is that any reasonable measure of accuracy for a given partition should satisfy the following constraint:35
Perfection: For any , there is a credence function that has maximal epistemic utility with respect to : the epistemic utility of any credence function, precise or not, defined over and different from , is strictly less than that of .
To get a handle on what Perfection says, it helps to focus on a simple case with a two-cell partition . Take some world and consider the credence function that assigns to and to . It seems natural to say that, with respect to , any credence function different from has lower epistemic utility, at , than has. In particular, relative to , any (non-trivially) imprecise credence function—any set of probability functions defined over with more than one element—is worse, epistemically and with respect to π*, than . After all, relative to , has a legitimate claim to being as good as it gets, epistemically with respect to .
Now, in this paper I have not trafficked in anything quite like the notion of epistemic utility relative to a partition. So it is not completely straightforward to translate Perfection into a constraint on the kind of epistemic utility functions we have been interested in. But there is a somewhat natural way to recast Perfection into a constraint on generalized epistemic utility functions in my sense. And once we see what that constraint amounts to, we will see both that it is not quite so plausible (as a constraint on generalized epistemic utility functions) and that my results do not depend on it.
Recall that in comparing the discussion of imprecise probability functions over a partition with my discussion of credence functions whose domain does not include elements of that partition, I identified a (precise) credence function defined over a coarsening of with an imprecise credence function defined over . Hence, saying that is better, epistemically and relative to , than any imprecise credence function defined over the same domain as , entails that for any coarsening of , is better, relative to , than any other credence function defined over that coarsening. So in my framework, Perfection amounts to the claim that relative to any , any non-trivial coarsening of , and any , there is some credence function defined over that is better, epistemically relative to , than any credence function defined over . Equivalently, in my framework Perfection amounts to the claim that for any partition, any world , and any refinement of , there is a credence function defined over that refinement that is better relative to than any credence function defined over :
Refinement: For any , any refinement of , and any , there is defined over such that for any defined over , .
Now, it should be clear that my results do not depend on anything like Refinement. After all, Refinement rules out as admissible any upwards -proper epistemic utility function, whereas my assumptions are compatible with the admissibility of such epistemic utility functions. So, strictly speaking, this is another sense in which my results are more general. But it is worth highlighting that, whereas in the discussion of imprecise probability functions something like Refinement may well be uncontroversial, in the present context it is far from it.
The constraint imposed by Refinement is incompatible with thinking of some refinements as an unalloyed epistemic bad: if epistemic utility satisfies Refinement, there can be no proposition such that you are epistemically worse off no matter what when you come to form an opinion on that proposition. Whether it be a proposition about phlogiston, or about miasma, Refinement entails that it is always in principle possible to do better, epistemically, by forming a view on that proposition.
Of course, it may be that this is the right way to think about epistemic utility, but it is certainly not obviously the right way to think about it. One might, for example, think that there is an ideal language for theorizing about the world, and that the ideal epistemic state is the one that is maximally accurate with respect to propositions expressible in that ideal language and simply fails to even entertain hypotheses that cannot be formulated in that language. If that’s how we think about epistemic utility, we will want to reject Refinement.36
At any rate, it is not my goal here to suggest that the right way to think about epistemic utility is incompatible with Refinement. But I do want to point out that it is yet another substantive assumption about epistemic utility that is required for the impossibility results mentioned above to go through. In contrast, my results make no substantive assumptions about epistemic utility. Rather, they establish that no matter how we think of epistemic utility, there are hard limits on the degree of immodesty we can expect to come from epistemic rationality.37
In contexts where probability functions are stipulated to all be defined over a fixed domain, strictly proper epistemic utility functions arguably capture a certain kind of immodesty. Once we move on to contexts where probability functions are allowed to be defined over different domains, strictly proper epistemic utility functions do not capture the relevant sense of immodesty. My question was whether there was a way of characterizing immodesty in this general setting. I considered a variety of strong, generalized immodesty principles and showed that, under minimal assumptions, no epistemic utility function satisfies any of these stronger immodesty principles.
I also considered some very weak generalizations of strict propriety and showed that some of the familiar epistemic utility functions satisfy one or another of these weak immodesty principles. One interesting question left outstanding is how strong an immodesty principle can be imposed without ruling out every reasonable epistemic utility function. In particular, one interesting question is whether there are immodesty principles that distinguish among partitions—say, immodesty principles that say that for any partition of a certain kind, all credence functions defined over that partition take themselves to be doing better, in terms of epistemic utility, than any of their restrictions without thereby taking themselves to be worse than any of their extensions.
I have not, of course, argued that epistemic utility functions ought to satisfy any of these stronger immodesty principles. But it is at the very least not obvious that strict partition-wise propriety suffices to capture the sense in which epistemic rationality is said to be immodest. What else, if anything, suffices to capture that kind of immodesty is a question for some other time.
is also not downwards proper.
(Simply fix an enumeration of and identify each with the vector .) An epistemic utility function is continuous iff for each and , the function is continuous.
For helpful conversations, comments, and advice, I am grateful to Kenny Easwaran, Richard Pettigrew, Itai Sher, and Henry Swift. Special thanks to Chris Meacham who, in addition to indulging me in many conversations about the material in this paper, went through an earlier draft of the paper with great care. Last but not least, thanks are also due to two anonymous referees for this journal for their many generous and extremely helpful comments. Much of this paper was written while I was a fellow at the Center for Advanced Study in the Behavioral Sciences at Stanford University: I’m grateful to the Center for its financial support.
1 Berger, Dominik and Nilanjan Das (2020). Accuracy and Credal Imprecision. Noûs, 54 (3), 666–703. http://doi.org/10.1111/nous.12274
2 Bradley, Seamus (2018). A Counterexample to Three Imprecise Decision Theories. Theoria, 85 (1), 18–30. http://doi.org/10.1111/theo.12170
3 Bradley, Seamus (2019). Imprecise Probabilities. In Edward N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy (Spring 2019 ed.). https://plato.stanford.edu/archives/spr2019/entries/imprecise-probabilities/
4 Brier, Glenn W. (1950). Verification of Forecasts Expressed in Terms of Probability. Monthly Weather Review, 78 (1), 1–3. http://doi.org/10.1175/1520-0493(1950)078<0001:vofeit>2.0.co;2
5 Bruckner, A. M. (1962). Tests for the Superadditivity of Functions. Proceedings of the American Mathematical Society, 13 (1), 126–30. http://doi.org/10.2307/2033788
6 Campbell-Moore, Catrin and Benjamin A. Levinstein (2021). Strict Propriety Is Weak. Analysis, 81 (1), 8–13. http://doi.org/10.1093/analys/anaa001
7 Carr, Jennifer (2015). Epistemic Expansions. Res Philosophica, 92 (2), 217–36. http://doi.org/10.11612/resphil.2015.92.2.4
8 Ellsberg, Daniel (1961). Risk, Ambiguity, and the Savage Axioms. The Quarterly Journal of Economics, 75 (4), 643–69. http://doi.org/10.2307/1884324
9 Gibbard, Allan (2008). Rational Credence and the Value of Truth. In Tamar Szabó Gendler and John Hawthorne (Eds.), Oxford Studies in Epistemology (Vol. 2, 143–64). Oxford University Press.
10 Gilboa, Itzhak (1987). Expected Utility with Purely Subjective Non-Additive Probabilities. Journal of Mathematical Economics, 16 (1), 65–88.
11 Gilboa, Itzhak and David Schmeidler (1989). Maxmin Expected Utility with Non-Unique Prior. Journal of Mathematical Economics, 18 (2), 141–53. http://doi.org/10.1016/0304-4068(89)90018-9
12 Gneiting, Tilmann and Adrian E. Raftery (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102 (477), 359–78. http://doi.org/10.1198/016214506000001437
13 Greaves, Hilary and David Wallace (2006). Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility. Mind, 115 (459), 607–32.
14 Grünwald, Peter D. and A. Philip Dawid (2004). Game Theory, Maximum Entropy, Minimum Discrepancy and Robust Bayesian Decision Theory. The Annals of Statistics, 32 (4), 1367–433.
15 Halpern, Joseph Y. (2003). Reasoning about Uncertainty. The MIT Press.
16 Horowitz, Sophie (2014). Immoderately Rational. Philosophical Studies, 167 (1), 41–56.
17 Hurwicz, Leonid (1951). The Generalized Bayes Minimax Principle: A Criterion for Decision Making Under Uncertainty. Cowles Commission Discussion Paper 335.
18 Joyce, James M. (1998). A Nonpragmatic Vindication of Probabilism. Philosophy of Science, 65 (4), 575–603.
19 Joyce, James M. (2005). How Probabilities Reflect Evidence. Philosophical Perspectives, 19 (1), 153–78.
20 Joyce, James M. (2009). Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief. In Franz Huber and Christoph Schmidt-Petri (Eds.), Degrees of Belief (263–97), Vol. 342 of Synthese Library. Springer Netherlands.
21 Joyce, James M. (2010). A Defense of Imprecise Credences in Inference and Decision Making. Philosophical Perspectives, 24 (1), 281–323. http://doi.org/10.1111/j.1520-8583.2010.00194.x
22 Kechris, Alexander (1995). Classical Descriptive Set Theory. Springer.
23 Konek, Jason (in press). Epistemic Conservativity and Imprecise Credence. Philosophy and Phenomenological Research.
24 Leitgeb, Hannes and Richard Pettigrew (2010). An Objective Justification of Bayesianism I: Measuring Inaccuracy. Philosophy of Science, 77 (2), 201–35. http://doi.org/10.1086/651317
25 Levi, Isaac (1974). On Indeterminate Probabilities. The Journal of Philosophy, 71 (13), 391–418. http://doi.org/10.2307/2025161
26 Lewis, David (1971). Immodest Inductive Methods. Philosophy of Science, 38 (1), 54–63. http://doi.org/10.1086/288339
27 Mayo-Wilson, Conor and Gregory Wheeler (2016). Scoring Imprecise Credences: A Mildly Immodest Proposal. Philosophy and Phenomenological Research, 93 (1), 55–78. http://doi.org/10.1111/phpr.12256
28 Mertens, Jean-François, Sylvain Sorin, and Shmuel Zamir (2015). Repeated Games. Econometric Society Monographs. Cambridge University Press.
29 Nielsen, Michael (2022). On the Best Accuracy Arguments for Probabilism. Philosophy of Science, 89 (3), 621–30. http://doi.org/10.1017/psa.2021.43
30 Pérez Carballo, Alejandro (2023). Downwards Propriety in Epistemic Utility Theory. Mind, 132 (525), 30–62. http://doi.org/10.1093/mind/fzac011
31 Pettigrew, Richard (2016). Accuracy and the Laws of Credence. Oxford University Press.
32 Pettigrew, Richard (2018). The Population Ethics of Belief: In Search of an Epistemic Theory X. Noûs, 52 (2), 336–72.
33 Predd, J. B., R. Seiringer, E. H. Lieb, D. N. Osherson, H. V. Poor, and S. R. Kulkarni (2009). Probabilistic Coherence and Proper Scoring Rules. IEEE Transactions on Information Theory, 55 (10), 4786–92.
34 Ramoni, Marco and Paola Sebastiani (2001). Robust Bayes Classifiers. Artificial Intelligence, 125 (1–2), 209–26. http://doi.org/10.1016/s0004-3702(00)00085-0
35 Satia, Jay K. and Roy E. Lave (1973). Markovian Decision Processes with Uncertain Transition Probabilities. Operations Research, 21 (3), 728–40.
36 Savage, Leonard J. (1951). The Theory of Statistical Decision. Journal of the American Statistical Association, 46 (253), 55–67. http://doi.org/10.1080/01621459.1951.10500768
37 Savage, Leonard J. (1971). Elicitation of Personal Probabilities and Expectations. Journal of the American Statistical Association, 66 (336), 783–801.
38 Schoenfield, Miriam (2017). The Accuracy and Rationality of Imprecise Credences. Noûs, 51 (4), 667–85.
39 Seidenfeld, Teddy (1988). Decision Theory Without “Independence” or Without “Ordering”. Economics and Philosophy, 4 (2), 267–90. http://doi.org/10.1017/s0266267100001085
40 Seidenfeld, Teddy, Mark J. Schervish, and Joseph B. Kadane (2012). Forecasting with Imprecise Probabilities. International Journal of Approximate Reasoning, 53 (8), 1248–61.
41 Sion, Maurice (1958). On General Minimax Theorems. Pacific Journal of Mathematics, 8 (1), 171–76. http://doi.org/10.2140/pjm.1958.8.171
42 Talbot, Brian (2019). Repugnant Accuracy. Noûs, 53 (3), 540–63.
43 Troffaes, Matthias C. M. (2007). Decision Making Under Uncertainty Using Imprecise Probabilities. International Journal of Approximate Reasoning, 45 (1), 17–29.
44 van Fraassen, Bas C. (1990). Figures in a Probability Landscape. In J. Michael Dunn and Anil Gupta (Eds.), Truth or Consequences: Essays in Honor of Nuel Belnap (345–56). Kluwer Academic.
45 Walley, Peter (1991). Statistical Reasoning with Imprecise Probabilities. Chapman and Hall.
46 White, Roger (2010). Evidential Symmetry and Mushy Credence. In Tamar Szabó Gendler and John Hawthorne (Eds.), Oxford Studies in Epistemology (Vol. 3, 161–86). Oxford University Press.
My proof of Lemma 3.10 will rely on a fundamental result in game theory, which I will simply state without proof.38 Before stating the result, I need to introduce some minimal background.
A two-person, zero-sum game (henceforth, a game) is a triple , where is the set of pure strategies for player I, is the set of pure strategies for player II, and is a payoff function. When player I chooses to play and player II chooses to play , player I gets from player II if and gives player II if (nothing is exchanged if , and let’s not bother to think of an ‘intuitive’ interpretation of a situation in which is non-finite).
The lower value of the game, is defined as
This is the maximum payoff that player I can guarantee, since for each ,
is the best player I can do. The upper value of the game, is analogously defined as
In general,
We say that has a value iff
If a game has a value, we say that player I has an optimal strategy iff there is that achieves
Similarly, we say that player II has an optimal strategy iff there is achieving
If the game has a value and both players have an optimal strategy, the pair of optimal strategies corresponds in an intuitive way to an equilibrium in the game: a pair of strategies such that neither player prefers unilaterally deviating from it. Such a pair of strategies is called a saddle-point. Thus, a saddle point in a game is a pair of strategies such that for all , , .
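For reference, the notions just introduced can be written out in the standard notation for zero-sum games. The symbols used here ($S$ and $T$ for the two strategy sets, $g$ for the payoff function, $\underline{v}$ and $\overline{v}$ for the lower and upper values) are assumed for illustration and are not taken from the text.

```latex
% Lower and upper values, and the inequality that holds in general
\underline{v} = \sup_{s \in S} \inf_{t \in T} g(s,t), \qquad
\overline{v} = \inf_{t \in T} \sup_{s \in S} g(s,t), \qquad
\underline{v} \le \overline{v}.

% The game has a value iff the two coincide
\underline{v} = \overline{v}.

% Saddle-point condition for a pair (s^*, t^*)
g(s, t^*) \le g(s^*, t^*) \le g(s^*, t)
\quad \text{for all } s \in S,\ t \in T.
```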
Not all games have a value. Some of the foundational results in game theory allow us to characterize classes of games that have a value. I will be relying on one such result for the proof of Lemma 3.10.
Recall that a function on a vector space that takes values in is convex iff for each , whenever the term on the left hand side is well-defined. We say that is concave iff is convex, and that is affine iff it is both convex and concave.
If is a topological space, we say that a function is upper-semi-continuous (or u.s.c.) iff , and for each , the set is closed in . We say that is lower-semi-continuous (or l.s.c.) iff and for each , the set is closed in . (Here we follow Mertens, Sorin, and Zamir 2015.) Of course, is u.s.c. iff is l.s.c. The result below is essentially Sion’s minimax theorem (Sion 1958).39
Theorem A.1. Let and be two convex topological spaces and suppose is concave and u.s.c. on the first argument, and convex and l.s.c. on the second—that is, for any and , and are concave, u.s.c. functions of and (respectively). Then the game has a value. If and are compact, then the game has a saddle-point.
□
We can apply Theorem A.1 to show that, whereas many games of interest do not contain a saddle-point, if we allow players to randomize their choice of strategy, the resulting game does have a saddle-point. Let me explain.
For any compact , let denote the space of all Borel probability functions on —the space of all countably additive probability functions on whose domain is the smallest -algebra that contains all the open subsets of . Of course, is a convex set, and from the fact that is a compact subset of Euclidean space, we know that is compact.40
Now, fix , with and compact. We say that is a mixed extension of iff (resp. ) is a closed and convex subset of (resp. ) and, for any , ,
Since each (resp. ) can be identified with the unique probability function (resp. ) that assigns probability one to (resp. ), I will abuse notation and think of as also defined over elements of .
If is continuous and , is concave (since linear) and u.s.c. on the first argument and convex (since linear) and l.s.c. on the second. And since any closed subset of a compact topological space is compact, we know that and are compact and convex topological spaces. So we can apply Theorem A.1 to show that any mixed extension has a saddle-point.
Corollary A.2. Suppose and are compact and is separately continuous. If , then any mixed extension of has a saddle-point.
□
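A standard textbook case illustrates the contrast between a game and its mixed extension. The sketch below uses matching pennies, which is not discussed in the text and is chosen here purely for illustration; the function names are hypothetical. It computes the lower and upper values over pure strategies, which differ, and then checks the saddle-point inequalities for the uniform mixtures on a grid of mixed strategies.

```python
from fractions import Fraction

# Matching pennies: G[i][j] is the amount player I receives from player II.
G = [[1, -1], [-1, 1]]

def lower_value(g):
    """Max over rows of the row minimum: what player I can guarantee with a pure strategy."""
    return max(min(row) for row in g)

def upper_value(g):
    """Min over columns of the column maximum: the cap player II can enforce with a pure strategy."""
    return min(max(col) for col in zip(*g))

def mixed_payoff(g, sigma, tau):
    """Expected payoff to player I when rows are drawn from sigma and columns from tau."""
    return sum(sigma[i] * tau[j] * g[i][j]
               for i in range(len(g)) for j in range(len(g[0])))

def is_saddle(g, sigma_star, tau_star, steps=10):
    """Check payoff(s, tau*) <= payoff(sigma*, tau*) <= payoff(sigma*, t) on a grid of two-point mixtures."""
    v = mixed_payoff(g, sigma_star, tau_star)
    mixes = [(Fraction(k, steps), 1 - Fraction(k, steps)) for k in range(steps + 1)]
    return all(mixed_payoff(g, s, tau_star) <= v <= mixed_payoff(g, sigma_star, t)
               for s in mixes for t in mixes)

half = (Fraction(1, 2), Fraction(1, 2))
pure_heads = (Fraction(1), Fraction(0))

print(lower_value(G), upper_value(G))    # pure strategies: the two values differ
print(is_saddle(G, half, half))          # uniform mixtures satisfy the saddle-point inequalities
print(is_saddle(G, pure_heads, pure_heads))
```

Because the mixed payoff is linear in each argument, checking the grid (which includes the pure strategies as endpoints) is enough in this two-by-two case; a general check would need a different method.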
I can finally present the proof of Lemma 3.10.
Proof of Lemma 3.10. Suppose is a continuous epistemic utility function that is partition-wise strictly proper. Fix a probability function and a refinement of . We define a game as follows. First, let , fix an enumeration of , and let be those elements of of the form for . Abusing notation, I will use , etc. to denote elements of , even though I will think of them as members of . Next let
Again abusing notation, I will use to denote the elements of in the obvious way (with corresponding to that assigning probability to ). Finally, let
Note that and are compact subsets of , and since < ∞, we know that any mixed extension of has a saddle point.
Our next step is to define a particular mixed extension of and apply Corollary A.2. Before doing so, however, let me make a couple of observations. First, any element of corresponds to a probability function over . I will use , etc. to denote elements of , and will continue to abuse notation and use , , etc. to denote the element of that assigns probability to the corresponding element of . Second, any element of corresponds to a probability function over . I will thus abuse notation and use , , etc. to denote elements of .
Let now and , and note that is indeed a subset of . Moreover, both and are closed and convex subsets of and (respectively), so that is indeed a mixed extension of , with
and accordingly .
From Corollary A.2, we know that our game has a saddle point . We claim that this saddle point is in fact of the form (, ). To see why, note that since is an optimal mixed strategy for player I, it follows that for any ,
and thus that
But since is strictly partition-wise proper,
Summing up, we have a saddle point of the form (, ), and thus we know that for any and any ,
In other words,
(10)
and
(11)
But note that (10) entails both
(12)
by definition, and
(13)
since is partition-wise proper.
Hence, from (11), (12), and (13), we have that for any and any ,
as desired.
□