Sunday, 12 February 2017

Chance Neutrality and the Swamping Problem for Reliabilism

Reliabilism about justified belief comes in two varieties: process reliabilism and indicator reliabilism. According to process reliabilism, a belief is justified if it is formed by a process that is likely to produce truths; according to indicator reliabilism, a belief is justified if it is likely to be true given the ground on which the belief is based. Both are natural accounts of justification for a veritist, who holds that the sole fundamental source of epistemic value for a belief is its truth.

Against veritists who are reliabilists, opponents raise the Swamping Problem. This begins with the observation that we prefer a justified true belief to an unjustified true belief: we ascribe greater value to the former than to the latter, and we would prefer to have the former. But, if reliabilism is true, this means that we prefer a belief that is true and had a high chance of being true over a belief that is true and had a low chance of being true. For a veritist, this means that we prefer a belief that has maximal epistemic value and had a high chance of having maximal epistemic value over a belief that has maximal epistemic value and had a low chance of having maximal epistemic value. And this is irrational, or so the objection goes. It is only rational to value a high chance of maximal utility when the actual utility is not known; once the actual utility is known, this 'swamps' any consideration of the chance of that utility. For instance, suppose I find a lottery ticket on the street; I know that it comes either from a 10-ticket lottery or from a 100-ticket lottery; both lotteries pay out the same amount to the holder of the winning ticket; and I know the outcome of neither lottery. Then it is rational for me to hope that the ticket I hold belongs to the smaller lottery, since that would maximise my chance of winning and thus maximise the expected utility of the ticket. But once I know that the lottery ticket I found is the winning ticket, it is irrational to prefer that it came from the smaller lottery --- my knowledge that it's the winner 'swamps' the information about how likely it was to be the winner. This is known variously as the Swamping Problem or the Value Problem for reliabilism about justification (Zagzebski 2003, Kvanvig 2003).
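To put numbers on the lottery example (the payout figure is mine, for illustration): suppose the winning ticket pays £1000. Before I learn the outcome, the expected utility of the found ticket is $\frac{1}{10} \times 1000 = 100$ if it comes from the 10-ticket lottery, and $\frac{1}{100} \times 1000 = 10$ if it comes from the 100-ticket lottery. So, while I am ignorant of the outcome, it is rational to hope that the ticket comes from the smaller lottery. But once I learn that my ticket is the winner, its value is 1000 on either hypothesis: there is nothing left for the chance information to add.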

The central assumption of the swamping problem is a principle that, in a different context, H. Orri Stefánsson and Richard Bradley call Chance Neutrality (Stefánsson & Bradley 2015). They state it precisely within the framework of Richard Jeffrey's decision theory (Jeffrey 1983). In that framework, we have a desirability function $V$ and a credence function $c$, both of which are defined on an algebra of propositions $\mathcal{F}$. $V(A)$ measures how strongly our agent desires $A$, or how greatly she values it. $c(A)$ measures how strongly she believes $A$, or her credence in $A$. The central principle of the decision theory is this:

Desirability  If the propositions $A_1$, $\ldots$, $A_n$ form a partition of the proposition $X$, then $$V(X) = \sum^n_{i=1} c(A_i | X) V(A_i)$$
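For illustration: suppose $A_1$ and $A_2$ partition $X$, with $c(A_1 | X) = 0.25$ and $c(A_2 | X) = 0.75$, and suppose $V(A_1) = 8$ and $V(A_2) = 0$. Then Desirability gives $$V(X) = 0.25 \times 8 + 0.75 \times 0 = 2$$ That is, the desirability of a proposition is the average desirability of the ways it might be realised, weighted by the agent's credence in each way, conditional on the proposition.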

Now, suppose the algebra on which $V$ and $c$ are defined includes some propositions that concern the objective probabilities of other propositions in the algebra.  Then:

Chance Neutrality  Suppose $X$ is in the partition $X_1$, $\ldots$, $X_n$. And suppose $0 \leq \alpha_1, \ldots, \alpha_n \leq 1$ and $\sum^n_{i=1} \alpha_i = 1$. Then $$V(X\ \&\ \bigwedge^n_{i=1} \mbox{Objective probability of $X_i$ is $\alpha_i$}) = V(X)$$

That is, information about the outcome of the chance process that picks between $X_1$, $\ldots$, $X_n$ `swamps' information about the chance process in our evaluation, which is recorded in $V$. A simple consequence of this: if $0 \leq \alpha_1, \alpha'_1, \ldots, \alpha_n, \alpha'_n \leq 1$ and $\sum^n_{i=1} \alpha_i = 1$ and $\sum^n_{i=1} \alpha'_i = 1$, then

$V(X\ \&\ \bigwedge^n_{i=1} \mbox{Objective probability of $X_i$ is $\alpha_i$}) = $
$V(X\ \&\ \bigwedge^n_{i=1} \mbox{Objective probability of $X_i$ is $\alpha'_i$})$

Now consider the particular case of this that is used in the Swamping Problem. I believe $X$ on the basis of ground $g$. I assign greater value to $X$ being true and justified than I do to $X$ being true and unjustified. That is, given the reliabilist's account of justification, if $\alpha$ is a probability that lies above the threshold for justification and $\alpha'$ is a probability that lies below that threshold --- for the veritist, $\alpha' < \frac{W}{R+W} < \alpha$, where $R > 0$ is the epistemic value of getting it right and $-W < 0$ the epistemic value of getting it wrong, as in the previous post --- then

$V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha'$}) <$
$V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$})$

And of course this violates Chance Neutrality.

Thus, the Swamping Problem stands or falls with the status of Chance Neutrality. Is it a requirement of rationality? Stefánsson and Bradley argue that it is not (Section 3, Stefánsson & Bradley 2015). They show that, in the presence of the Principal Principle, Chance Neutrality entails a principle called Linearity; and they claim that Linearity is not a requirement of rationality. If it is permissible to violate Linearity, then it cannot be a requirement to satisfy a principle that entails it. So Chance Neutrality is not a requirement of rationality.

In this context, the Principal Principle runs as follows:

Principal Principle $$c(X_i | \bigwedge^n_{j=1} \mbox{Objective probability of $X_j$ is $\alpha_j$}) = \alpha_i$$

That is, an agent's credence in $X_i$, conditional on information that gives the objective probability of $X_i$ and other members of a partition to which it belongs, should be equal to the objective probability of $X_i$. And Linearity is the following principle:

Linearity $$V(\bigwedge^n_{i=1} \mbox{Objective probability of $X_i$ is $\alpha_i$}) = \sum^n_{i=1} \alpha_iV(X_i)$$

That is, an agent should value a lottery at the expected value of its outcome. Now, as is well known, real agents often violate Linearity (Buchak 2013). The most famous violations are known as the Allais preferences (Allais 1953). Suppose there are 100 tickets numbered 1 to 100. One ticket will be drawn and you will be given a prize depending on which option you have chosen from $L_1$, $\ldots$, $L_4$:
  • $L_1$: if ticket 1-89, £1m; if ticket 90-99, £1m; if ticket 100, £1m.
  • $L_2$: if ticket 1-89, £1m; if ticket 90-99, £5m; if ticket 100, £0m.
  • $L_3$: if ticket 1-89, £0m; if ticket 90-99, £1m; if ticket 100, £1m.
  • $L_4$: if ticket 1-89, £0m; if ticket 90-99, £5m; if ticket 100, £0m.
I know that each ticket has an equal chance of winning --- thus, by the Principal Principle, $c(\mbox{Ticket $n$ wins}) = \frac{1}{100}$. Now, it turns out that many people have preferences recorded in the following desirability function $V$: $$V(L_1) > V(L_2) \mbox{ and } V(L_3) < V(L_4)$$

When there is an option that guarantees them a high payout (£1m), they prefer it over an option that carries a 1% chance of nothing (£0), even if that option also provides a 10% chance of a much greater payout (£5m). On the other hand, when there is no guarantee of a high payout, they prefer the chance of the much greater payout (£5m), even if there is also a slightly greater chance of nothing (£0). The problem is that there is no way to assign values to $V(£0\mathrm{m})$, $V(£1\mathrm{m})$, and $V(£5\mathrm{m})$ so that $V$ satisfies Linearity together with these inequalities. Suppose, for a reductio, that there is. By Linearity,
$$V(L_1) = 0.89V(£1\mathrm{m}) + 0.1 V(£1\mathrm{m}) + 0.01 V(£1\mathrm{m})$$
$$V(L_2) = 0.89V(£1\mathrm{m}) + 0.1 V(£5\mathrm{m}) + 0.01 V(£0\mathrm{m}) $$
Then, since $V(L_1) > V(L_2)$, we have: $$0.1 V(£1\mathrm{m}) + 0.01 V(£1\mathrm{m}) > 0.1 V(£5\mathrm{m}) + 0.01 V(£0\mathrm{m})$$ But also by Linearity, $$V(L_3) = 0.89V(£0\mathrm{m}) + 0.1 V(£1\mathrm{m}) + 0.01 V(£1\mathrm{m})$$
$$V(L_4) = 0.89V(£0\mathrm{m}) + 0.1 V(£5\mathrm{m}) + 0.01 V(£0\mathrm{m})$$
Then, since $V(L_3) < V(L_4)$, we have: $$0.1 V(£1\mathrm{m}) + 0.01 V(£1\mathrm{m}) < 0.1 V(£5\mathrm{m}) + 0.01 V(£0\mathrm{m})$$
And this gives a contradiction. In general, an agent violates Linearity when she has any risk-averse or risk-seeking preferences.
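For those who like to check such things mechanically, here is a small Python sketch (my own illustration, not part of Buchak's or Allais' arguments) of the fact driving the reductio: under Linearity, $V(L_1) - V(L_2)$ and $V(L_3) - V(L_4)$ are the very same function of the three prize values, so they can never differ in sign, whatever desirabilities we assign to the prizes.

    import random

    # Under Linearity, both differences equal 0.11*v1 - 0.10*v5 - 0.01*v0,
    # so V(L1) > V(L2) forces V(L3) > V(L4): the Allais preferences are ruled out.
    for _ in range(10_000):
        # v0, v1, v5: arbitrary desirabilities for the prizes £0m, £1m, £5m
        v0, v1, v5 = (random.uniform(-100, 100) for _ in range(3))
        V_L1 = 0.89 * v1 + 0.10 * v1 + 0.01 * v1
        V_L2 = 0.89 * v1 + 0.10 * v5 + 0.01 * v0
        V_L3 = 0.89 * v0 + 0.10 * v1 + 0.01 * v1
        V_L4 = 0.89 * v0 + 0.10 * v5 + 0.01 * v0
        assert abs((V_L1 - V_L2) - (V_L3 - V_L4)) < 1e-9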

Stefánsson and Bradley show that, in the presence of the Principal Principle, Chance Neutrality entails Linearity; and they argue that there are rational violations of Linearity (such as the Allais preferences); so they conclude that there are rational violations of Chance Neutrality. So far, so good for the reliabilist: the Swamping Problem assumes that Chance Neutrality is a requirement of rationality; and we have seen that it is not. However, reliabilism is not out of the woods yet. After all, the veritist's version of reliabilism in fact assumes Linearity! They say that a belief is justified if it is likely to be true. And they say this because a belief that is likely to be true has high expected epistemic value on the veritist's account of epistemic value. And so they connect justification to epistemic value by taking the value of a belief to be its expected epistemic value --- that is, they assume Linearity. Thus, if the only rational violations of Chance Neutrality are also rational violations of Linearity, then the Swamping Problem is revived. In particular, if Linearity entails Chance Neutrality, then reliabilism cannot solve the Swamping Problem.

Fortunately, even in the presence of the Principal Principle, Linearity does not entail Chance Neutrality. Together, the Principal Principle and Desirability entail:

$V(\mbox{Objective probability of $X$ given I have $g$ is $\alpha$}) =$

$\alpha V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$}) + $

$(1-\alpha) V(\overline{X}\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$})$

And Linearity entails:

 $V(\mbox{Objective probability of $X$ given I have $g$ is $\alpha$}) = \alpha V(X) + (1-\alpha) V(\overline{X})$

So
$\alpha V(X) + (1-\alpha) V(\overline{X}) =$

$\alpha V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$}) + $

$(1-\alpha) V(\overline{X}\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$})$

And, whatever the values of $V(X)$ and $V(\overline{X})$, there are values of $$V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$})$$ and $$V(\overline{X}\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $\alpha$})$$
such that the above equation holds. Thus, it is at least possible to adhere to Linearity, yet violate Chance Neutrality. Of course, this does not show that the agent who adheres to Linearity but violates Chance Neutrality is rational. But, now that the intuitive appeal of Chance Neutrality is undermined, the burden is on those who raise the Swamping Problem to explain why such cases are irrational.
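To see this concretely (the numbers are mine, chosen only for vividness): let $V(X) = 1$ and $V(\overline{X}) = 0$. Linearity then requires $$V(\mbox{Objective probability of $X$ given I have $g$ is $0.8$}) = 0.8 \qquad V(\mbox{Objective probability of $X$ given I have $g$ is $0.2$}) = 0.2$$ And the equation above is satisfied if we set $$V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $0.8$}) = 0.9 \qquad V(\overline{X}\ \&\ \ldots) = 0.4$$ since $0.8 \times 0.9 + 0.2 \times 0.4 = 0.8$, and $$V(X\ \&\ \mbox{Objective probability of $X$ given I have $g$ is $0.2$}) = 0.6 \qquad V(\overline{X}\ \&\ \ldots) = 0.1$$ since $0.2 \times 0.6 + 0.8 \times 0.1 = 0.2$. This agent satisfies Linearity while valuing the true belief more when it had the higher chance of truth ($0.9 > 0.6$) -- precisely the violation of Chance Neutrality that the reliabilist needs.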

References


  • Allais, M. (1953). Le comportement de l'homme rationnel devant le risque: critique des postulats et axiomes de l'école américaine. Econometrica, 21(4), 503–546.
  • Buchak, L. (2013). Risk and Rationality. Oxford University Press.
  • Jeffrey, R. C. (1983). The Logic of Decision (2nd ed.). Chicago: University of Chicago Press.
  • Kvanvig, J. (2003). The Value of Knowledge and the Pursuit of Understanding. Cambridge: Cambridge University Press.
  • Stefánsson, H. O., & Bradley, R. (2015). How Valuable Are Chances? Philosophy of Science, 82, 602–625.
  • Zagzebski, L. (2003). The search for the source of epistemic good. Metaphilosophy, 34(1–2), 12–28.

Monday, 6 February 2017

What is justified credence?

Aafira and Halim are both 90% confident that it will be sunny tomorrow. Aafira bases her credence on her observation of the weather today and her past experience of the weather on days that follow days like today -- around nine out of ten of them have been sunny. Halim bases his credence on wishful thinking -- he's arranged a garden party for tomorrow and he desperately wants the weather to be pleasant. Aafira, it seems, is justified in her credence, while Halim is not. Just as one of your full or categorical beliefs might be justified if it is based on visual perception under good conditions, or on memories of recent important events, or on testimony from experts, so might one of your credences be; and just as one of your full beliefs might be unjustified if it is based on wishful thinking, or biased stereotypical associations, or testimony from ideologically driven news outlets, so might your credences be. In this post, I'm looking for an account of justified credence -- in particular, I seek necessary and sufficient conditions for a credence to be justified. Our account will be reliabilist.

Reliabilism about justified beliefs comes in two varieties: process reliabilism and indicator reliabilism. Roughly, process reliabilism says that a belief is justified if it is formed by a reliable process, while indicator reliabilism says that a belief is justified if it is based on a ground that renders it likely. Reliabilism about justified credence also comes in two varieties; indeed, it comes in the same two varieties. And, indeed, of the two existing proposals, Jeff Dunn's is a version of process reliabilism (paper) while Weng Hong Tang offers a version of indicator reliabilism (paper). As we will see, both face the same objection. If they are right about what justification is, it is mysterious why we care about justification, for neither of the accounts connects justification to a source of epistemic value.  We will call this the Connection Problem.

I begin by describing Dunn's process reliabilism and Tang's indicator reliabilism. I argue that, understood correctly, they are, in fact, extensionally equivalent. That is, Dunn and Tang reach the top of the same mountain, albeit by different routes. However, I argue that both face the Connection Problem. In response, I offer my own version of reliabilism, which is both process and indicator, and I argue that it solves that problem. Furthermore, I show that it is also extensionally equivalent to Dunn's reliabilism and Tang's.

Reliabilism and Dunn on reliable credence


Let us begin with Dunn's process reliabilism for justified credences. Now, to be clear, Dunn takes himself only to be providing an account of reliability for credence-forming processes. He doesn't necessarily endorse the other two conjuncts of reliabilism, which say that a credence is justified if it is reliable, and that a credence is reliable if formed by a reliable process. Instead, Dunn speculates that perhaps being reliably formed is but one of the epistemic virtues, and he wonders whether all of the epistemic virtues are required for justification. Nonetheless, I will consider a version of reliabilism for justified credences that is based on Dunn's account of reliable credence. For reasons that will become clear, I will call this the calibrationist version of process reliabilism for justified credence. Dunn rejects it based on what I will call below the Graining Problem. As we will see, I think we can answer that objection.

For Dunn, a credence-forming process is perfectly reliable if it is well calibrated. Here's what it means for a process $\rho$ to be well calibrated:
  • First, we construct a set of all and only the outputs of the process $\rho$ in the actual world and in nearby counterfactual scenarios. An output of $\rho$ consists of a credence $x$ in a proposition $X$ at a particular time $t$ in a particular possible world $w$ -- so we represent it by the tuple $(x, X, w, t)$. If $w$ is a nearby world and $t$ a nearby time, we call $(x, X, w, t)$ a nearby output. Let $O_\rho$ be the set of nearby outputs -- that is, the set of tuples $(x, X, w, t)$, where $w$ is a nearby world, $t$ is a nearby time, and $\rho$ assigns credence $x$ to proposition $X$ in world $w$ at time $t$.
  • Second, we say that the truth-ratio of $\rho$ for credence $x$ is the proportion of nearby outputs $(x, X, w, t)$ in $O_\rho$ such that $X$ is true at $w$ and $t$.
  • Finally, we say that $\rho$ is well calibrated (or nearly so) if, for each credence $x$ that $\rho$ assigns, $x$ is equal to (or approximately equal to) the truth-ratio of $\rho$ for $x$.
For instance, suppose a process only ever assigns credence 0.6 or 0.7. And suppose that, 60% of the time that it assigns 0.6 in the actual world or a nearby world it assigns it to a proposition that is true; and 70% of the time it assigns 0.7 it assigns it to a true proposition. If, on the other hand, 59% of the time that it assigns 0.6 in the actual world or a nearby world it assigns it to a proposition that is true, while 71% of the time it assigns 0.7 it assigns it to a true proposition, then that process is not well calibrated, but it is nearly well calibrated. But if 23% of the time that it assigns 0.6 in the actual world or a nearby world it assigns it to a proposition that is true, while 95% of the time it assigns 0.7 it assigns it to a true proposition, then that process is not even nearly well calibrated.
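To make the calibration test concrete, here is a minimal Python sketch (my own, not Dunn's; the tolerance parameter is a stand-in for 'approximately'):

    # An output of a process is represented here simply as a pair
    # (credence assigned, truth value of the proposition in that world).

    def truth_ratio(outputs, x):
        # Proportion of outputs with credence x whose proposition is true.
        relevant = [truth for credence, truth in outputs if credence == x]
        return sum(relevant) / len(relevant)

    def nearly_well_calibrated(outputs, tolerance=0.02):
        # Each credence the process assigns must be within `tolerance`
        # of its truth-ratio.
        credences = {credence for credence, _ in outputs}
        return all(abs(x - truth_ratio(outputs, x)) <= tolerance for x in credences)

    # The first process from the example: 60% of its 0.6-assignments are true,
    # and 70% of its 0.7-assignments are true.
    outputs = ([(0.6, True)] * 60 + [(0.6, False)] * 40
               + [(0.7, True)] * 70 + [(0.7, False)] * 30)
    print(nearly_well_calibrated(outputs))  # True: perfectly calibrated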

This, then, is Dunn's calibrationist account of the reliability of a credence-forming process. Any version of reliabilism about justified credences that is based on it requires two further ingredients. First, we must use the account to say when an individual credence is reliable; second, we must add the claim that a credence is justified iff it is reliable. Both of these moves create problems. We will address them below. But first it will be useful to present Tang's version of indicator reliabilism for justified credence. It will provide an important clue that helps us solve one of the problems that Dunn's account faces. And, having it in hand, it will be easier to see how these two accounts end up coinciding.

Tang's indicator reliabilism for justified credence


According to indicator reliabilism for justified belief, a belief is justified if the ground on which it is based is a good indicator of the truth of that belief. Thus, beliefs formed on the basis of visual experiences tend to be justified because the fact that the agent had the visual experience in question makes it likely that the belief they based on it is true. Wishful thinking, on the other hand, usually does not give rise to justified belief because the fact that an agent hopes that a particular proposition will be true -- which in this case is the ground of their belief -- does not make it likely that the proposition is true.

Tang seeks to extend this account of justified belief to the case of credence. Here is his first attempt at an account:

Tang's Indicator Reliabilism for Justified Credence (first pass)  A credence of $x$ in $X$ by an agent $S$ is justified iff
(TIC1-$\alpha$) $S$ has ground $g$;
(TIC2-$\alpha$) the credence $x$ in $X$ by $S$ is based on ground $g$;
(TIC3-$\alpha$) the objective probability of $X$ given that the agent has ground $g$ approximates or equals $x$ -- we write this $P(X | \mbox{$S$ has $g$}) \approx x$.

Thus, just as an agent's full belief in a proposition is justified if its ground makes the objective probability of that proposition close to 1, a credence $x$ in a proposition is justified if its ground makes the objective probability of that proposition close to $x$. There is a substantial problem here in identifying exactly to which notion of objective probability Tang wishes to appeal. But we will leave that aside for the moment, other than to say that he conceives of it along the lines of hypothetical frequentism -- that is, the objective probability of $X$ given $Y$ is the hypothetical frequency with which propositions like $X$ are true when propositions like $Y$ are true.

However, as Tang notes, as stated, his version of indicator reliabilism faces a problem. Suppose I am presented with an empty urn. I watch as it is filled with 100 balls, numbered 1 to 100, half of which are white, and half of which are black. I shake the urn vigorously and extract a ball. It's number 73 and it's white. I look at its colour and the numeral printed on it. I have a visual experience of a white ball with '73' on it. On the basis of my visual experience of the numeral alone, I assign credence 0.5 to the proposition that ball 73 is white. According to Tang's first version of indicator reliabilism for justified credence, my credence is justified. My ground is the visual experience of the number on the ball; I have that ground; I base my credence on that ground; and the objective probability that ball 73 is white given that I have a visual experience of the numeral '73' printed on it is 50% -- after all, half the balls are white. Of course, the problem is that I have not used my total evidence -- or, in the language of grounds, I have not based my belief on my most inclusive ground. I had the visual experience of the numeral on the ball as a ground; but I also had the visual experience of the numeral on the ball and the colour of the ball as a ground. The resulting credence is unjustified because the objective probability that ball 73 is white given that I have the more inclusive ground is not 0.5 -- it is close to 1, since my visual system is so reliable. This leads Tang to amend his account of justified credence as follows:

Tang's Indicator Reliabilism for Justified Credence  A credence of $x$ in $X$ by an agent $S$ is justified iff
(TIC1) $S$ has ground $g$;
(TIC2) the credence $x$ in $X$ by $S$ is based on ground $g$;
(TIC3) the objective probability of $X$ given that the agent has ground $g$ approximates or equals $x$ -- that is, $P(X | \mbox{$S$ has $g$}) \approx x$;
(TIC4) there is no more inclusive ground $g'$ such that (i) $S$ has $g'$ and (ii) the objective probability of $X$ given that the agent has ground $g'$ does not equal or approximate $x$ -- that is, $P(X | \mbox{$S$ has $g'$}) \not \approx x$.

This, then, is Tang's version of indicator reliabilism for justified credences.
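To see the amended account in action, return to the urn. Let $g$ be my visual experience of the numeral '73' alone, and let $g'$ be the more inclusive ground consisting of my visual experience of the numeral together with my visual experience of the colour. My credence 0.5 that ball 73 is white, based on $g$, satisfies (TIC1)-(TIC3), since $P(\mbox{ball 73 is white} | \mbox{I have $g$}) = 0.5$. But it violates (TIC4): I also have the more inclusive ground $g'$, and $P(\mbox{ball 73 is white} | \mbox{I have $g'$}) \approx 1 \not\approx 0.5$. So the amended account returns the right verdict: the credence is unjustified.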

Same mountain, different routes


Thus, we have now seen Dunn's process reliabilism and Tang's indicator reliabilism for justified credences. Is either correct? If so, which? In one sense, both are correct; in another, neither is. Less mysteriously: as we will see in this section, Dunn's process reliabilism and Tang's indicator reliabilism are extensionally equivalent -- that is, the same credences are justified on both. What's more, as we will see in the final section, both are extensionally equivalent to the correct account of justified credence, which is thus a version of both process and indicator reliabilism. However, while they get the extension right, they do so for the wrong reasons. A justified credence is not justified because it is formed by a well calibrated process; and it is not justified because it matches the objective chance given its grounds. Thus, Dunn and Tang delimit the correct extension, but they use the wrong intension. In the final section of this post, I will offer what I take to be the correct intension. But first, let's see why it is that the routes that Dunn and Tang take lead them both to the top of the same mountain.

We begin with Dunn's calibrationist account of the reliability of a credence-forming process. As we noted above, any version of reliabilism about justified credences that is based on this account requires two further ingredients. First, we must use the calibrationist account of reliable credence-forming processes to say when an individual credence is reliable. The natural answer: when it is formed by a reliable credence-forming process. But then we must be able to identify, for a given credence, the process of which it is an output. The problem is that, for any credence, there are a great many processes of which it might be the output. I have a visual experience of a piece of red cloth on my desk, and I form a high credence that there is a piece of red cloth on my desk. Is this credence the output of a process that assigns a high credence that there is a piece of red cloth on my desk whenever I have that visual experience? Or is it the output of a process that assigns a high credence that there is a piece of red cloth on my desk whenever I have that visual experience and the lighting conditions in my office are good, while it assigns a middling credence that there is a piece of red cloth on my desk whenever I have that visual experience and the lighting conditions in my office are bad? It is easy to see that this matters. The first process is poorly calibrated, and thus unreliable on Dunn's account; the second process is better calibrated and thus more reliable on Dunn's account. This is the so-called Generality Problem, and it is a challenge that faces any version of reliabilism. I will offer a version of Juan Comesaña's solution to this problem below -- as we will see, that solution also clears the way for a natural solution to the Graining Problem, which we consider next.

Dunn provides an account of when a credence-forming process is reliable. And, once we have a solution to the Generality Problem, we can use that to say when a credence is reliable -- it is reliable when formed by a reliable credence-forming process. Finally, to complete the version of process reliabilism about justified credence that we are basing on Dunn's account, we just need the claim that a credence is justified iff it is reliable. But this too faces a problem, which we call the Graining Problem. As we did above, suppose I am presented with an empty urn. I watch as it is filled with 100 balls, numbered 1 to 100, half of which are white, and half of which are black. I shake the urn vigorously and extract a ball. I look at its colour and the numeral printed on it. I have two processes at my disposal. Process 1 takes my visual experience of the numeral only, say '$n$', and assigns credence 0.5 to the proposition that ball $n$ is white. Process 2 takes my visual experience of the numeral, '$n$', and my visual experience of the colour of the ball, and assigns credence 1 to the proposition that ball $n$ is white if my visual experience is of a white ball, and assigns credence 1 to the proposition that ball $n$ is black if my visual experience is of a black ball. Note that both processes are well calibrated (or nearly so, if we allow that my visual system is very slightly fallible). But we would usually judge the credence formed by the second to be better justified than the credence formed by the first. Indeed, we would typically say that a Process 1 credence is unjustified, while a Process 2 credence is justified. Thus, being formed by a well calibrated or nearly well calibrated process is not sufficient for justification. And, if reliability is calibration, then reliability is not justification and reliabilism fails. It is this problem that leads Dunn to reject reliabilism about justified credence. However, as we will see below, I think he is a little hasty.

Let us consider the Generality Problem first. To this problem, Juan Comesaña offers the following solution (paper). Every account of doxastic justification -- that is, every account of when a given doxastic attitude of a particular agent is justified for that agent -- must recognize that two agents may have the same doxastic attitude and the same evidence while the doxastic attitude of one is justified and the doxastic attitude of the other is not, because their doxastic attitudes are not based on the same evidence. The first might base her belief on the total evidence, for instance, whilst the second ignores that evidence and bases his belief purely on wishful thinking. Thus, Comesaña claims, every theory of justification needs a notion of the grounds or the basis of a doxastic attitude. But, once we have that, a solution to the Generality Problem is very close. Comesaña spells out the solution for process reliabilism about full beliefs:

Well-Founded Process Reliabilism for Justified Full Beliefs  A belief that $X$ by an agent $S$ is justified iff
(WPB1) $S$ has ground $g$;
(WPB2) the belief that $X$ by $S$ is based on ground $g$;
(WPB3) the process producing a belief that $X$ based on ground $g$ is a reliable process.

This is easily adapted to the credal case:

Well-Founded Process Reliabilism for Justified Credences  A credence of $x$ in $X$ by an agent $S$ is justified iff
(WPC1) $S$ has ground $g$;
(WPC2) the credence $x$ in $X$ by $S$ is based on ground $g$;
(WPC3) the process producing a credence of $x$ in $X$ based on ground $g$ is a reliable process.

Let us now try to apply Comesaña's solution to the Generality Problem to help Dunn's calibrationist reliabilism about justified credences. Recall: according to Dunn, a process $\rho$ is reliable if it is well calibrated (or nearly so). Consider the process producing a credence of $x$ in $X$ based on ground $g$ -- for convenience, we'll write it $\rho^g_{X,x}$. There is only one credence that it assigns, namely $x$. So it is well calibrated if the truth-ratio of $\rho^g_{X,x}$ for $x$ is equal to $x$. Now, $O_{\rho^g_{X,x}}$ is the set of tuples $(x, X, w, t)$ where $w$ is a nearby world and $t$ a nearby time at which $\rho^g_{X,x}$ assigns credence $x$ to proposition $X$. But, by the definition of $\rho^g_{X,x}$, those are the nearby worlds and nearby times at which the agent has the ground $g$. Thus, the truth-ratio of $\rho^g_{X,x}$ for $x$ is the proportion of those nearby worlds and times at which the agent has the ground $g$ at which $X$ is true. And that, it seems to me, is something like the objective probability of $X$ conditional on the agent having ground $g$, at least given a hypothetical frequentist account of objective probability of the sort that Tang favours. As above, we denote the objective probability of $X$ conditional on the agent $S$ having grounds $g$ as follows: $P(X | \mbox{$S$ has $g$})$. Thus, $P(X | \mbox{$S$ has $g$})$ is the truth-ratio of $\rho^g_{X,x}$ for $x$. And thus, a credence $x$ in $X$ based on ground $g$ is reliable iff $x$ is close to $P(X | \mbox{$S$ has $g$})$. That is,

Well-Founded Calibrationist Process Reliabilism for Justified Credences (first attempt) A credence of $x$ in $X$ by an agent $S$ is justified iff
(WCPC1) $S$ has ground $g$;
(WCPC2) the credence $x$ in $X$ by $S$ is based on ground $g$;
(WCPC3) the process producing a credence of $x$ in $X$ based on ground $g$ is a (nearly) well calibrated process -- that is, $P(X | \mbox{$S$ has $g$}) \approx x$.

But now compare Well-Founded Calibrationist Process Reliabilism, based on Dunn's account of reliable processes and Comesaña's solution to the Generality Problem, with Tang's first attempt at Indicator Reliabilism. Consider the necessary and sufficient conditions that each imposes for justification: TIC1-$\alpha$ = WCPC1; TIC2-$\alpha$ = WCPC2; TIC3-$\alpha$ = WCPC3. Thus, these are the same account. However, as we saw above, Tang's first attempt to formulate indicator reliabilism for justified credence fails because it counts as justified a credence that is not based on an agent's total evidence; and we also saw that, once the Generality Problem is solved for Dunn's calibrationist process reliabilism, it faces a similar problem, namely, the Graining Problem from above. Tang amends his version of indicator reliabilism by adding the fourth condition TIC4 from above. Might we amend Dunn's calibrationist process reliabilism in a similar way?

Well-Founded Calibrationist Process Reliabilism for Justified Credences  A credence of $x$ in $X$ by an agent $S$ is justified iff
(WCPC1) $S$ has ground $g$;
(WCPC2) the credence $x$ in $X$ by $S$ is based on ground $g$;
(WCPC3) the process producing a credence of $x$ in $X$ based on ground $g$ is a (nearly) well calibrated process -- that is, $P(X | \mbox{$S$ has $g$}) \approx x$;
(WCPC4) there is no more inclusive ground $g'$ and credence $x' \not \approx x$, such that the process producing a credence of $x'$ in $X$ based on ground $g'$ is a (nearly) well calibrated process -- that is, $P(X | \mbox{$S$ has $g'$}) \approx x'$.

Since TIC4 is equivalent to WCPC4, this final version of process reliabilism for justified credences is equivalent to Tang's final version of his indicator reliabilism for justified credences. Thus, Dunn and Tang have reached the top of the same mountain, albeit by different routes.
  

The third route up the mountain


Once we have addressed certain problems with the calibrationist version of process reliabilism for justified credence, we see that it agrees with the current best version of indicator reliabilism. This gives us a little hope that both have hit upon the correct account of justification. In the end, I will conclude that both have indeed hit upon the correct extension of the concept of justified credence. But they have done so for the wrong reasons, for they have not hit upon the correct intension.

There are two sorts of route you might take when pursuing an account of justification for a given sort of doxastic attitude, such as a credence or a full belief. You might look to intuitions concerning particular cases and try to discern a set of necessary and sufficient conditions that sort these cases in the same way that your intuitions do; or, you might begin with an account of epistemic value, assume that justification must be linked in some natural way to the promotion of epistemic value, and then provide an account of justification that vindicates that assumption. Dunn and Tang have each taken a route of the first sort; I will follow a route of the second sort.

I will adopt the veritist's account of epistemic value. That is, I take accuracy to be the sole fundamental source of epistemic value for a credence, where a credence in a true proposition is more accurate the higher it is; a credence in a false proposition is more accurate the lower it is. Given this account of epistemic value, what is the natural account of justification? Well, at first sight, there are two: one is process reliabilist; the other is indicator reliabilist. But, in a twist that should come as little surprise given the conclusions of the previous section, it will turn out that these two accounts coincide, and indeed coincide with the final versions of Dunn's and Tang's accounts that we reached above. Thus, I too will reach the top of the same mountain, but by yet another route.

Epistemic value version of indicator reliabilism


In the case of full beliefs, indicator reliabilism says this: a belief in $X$ by $S$ on the basis of grounds $g$ is justified iff the objective probability of $X$ given that $S$ has grounds $g$ is high --- that is, close to 1. Tang generalises this to the case of credence, but I think he generalises in the wrong direction; that is, he takes the wrong feature to be salient and uses that to formulate his indicator reliabilism for justified credence. He takes the general form of indicator reliabilism to be something like this: a doxastic attitude $s$ towards $X$ by $S$ on the basis of grounds $g$ is justified iff the attitude $s$ 'matches' the objective probability of $X$ given that $S$ has grounds $g$. And he takes the categorical attitude of belief in $X$ to 'match' high objective probability of $X$, and credence $x$ in $X$ to 'match' objective probability of $x$ that $X$. The problem with this account is that it leaves mysterious why justification is valuable. Unless we say that matching objective probabilities is somehow epistemically valuable in itself, it isn't clear why we should want to have justified doxastic attitudes in this sense.

I contend instead that the general form of indicator reliabilism is this:

Indicator reliabilism for justified doxastic attitude (epistemic value version)  Doxastic attitude $s$ towards proposition $X$ by agent $S$ is justified iff
(EIA1) $S$ has $g$;
(EIA2) $s$ in $X$ by $S$ is based on $g$;
(EIA3) if  $g' \subseteq g$ is a ground that $S$ has, then for every doxastic attitude $s'$ of the same sort as $s$, the expected epistemic value of attitude $s'$ towards $X$ given that $S$ has $g'$ is at most (or not much above) the expected epistemic value of attitude $s$ towards $X$ given that $S$ has $g'$.

Thus, attitude $s$ towards $X$ by $S$ is justified if $s$ is based on a ground $g$ that $S$ has, and $s$ is the attitude towards $X$ that has highest (or nearly highest) expected epistemic value relative to the most inclusive grounds that $S$ has.

Let's consider this in the full belief case. We have:

Indicator reliabilism for justified belief (epistemic value version)  A belief in proposition $X$ by agent $S$ is justified iff
(EIB1) $S$ has $g$;
(EIB2) the belief in $X$ by $S$ is based on $g$;
(EIB3) if  $g' \subseteq g$ is a ground that $S$ has, then
  1. the expected epistemic value of disbelief in $X$, given that $S$ has $g'$, is at most (or not much above) the expected epistemic value of belief in $X$, given that $S$ has $g'$;
  2. the expected epistemic value of suspension in $X$, given that $S$ has $g'$, is at most (or not much above) the expected epistemic value of belief in $X$, given that $S$ has $g'$.

To complete this, we need only an account of epistemic value. Here, the veritist's account of epistemic value runs as follows. There are three categorical doxastic attitudes towards a given proposition: belief, disbelief, and suspension of judgment. If the proposition is true, belief has greatest epistemic value, then suspension of judgment, then disbelief. If it is false, the order is reversed. It is natural to say that a belief in a truth and disbelief in a falsehood have the same high epistemic value -- following Kenny Easwaran (paper), we denote this $R$ (for `getting it Right'), and assume $R >0$. And it is natural to say that a disbelief in a truth and belief in a falsehood have the same low epistemic value -- again following Easwaran, we denote this $-W$ (for `getting it Wrong'), and assume $W > 0$. And finally it is natural to say that suspension of belief in a truth has the same epistemic value as suspension of belief in a falsehood, and both have epistemic value 0. We assume that $W > R$, just as Easwaran does. Now, suppose proposition $X$ has objective probability $p$. Then the expected epistemic utility of different categorical doxastic attitudes towards $X$ is given below:
  • Expected epistemic value of belief in $X$ = $p\cdot R + (1-p)\cdot(-W)$.
  • Expected epistemic value of suspension in $X$ = $p\cdot 0 + (1-p)\cdot 0$.
  • Expected epistemic value of disbelief in $X$ = $p\cdot (-W) + (1-p)\cdot R$.
Thus, belief in $X$ has greatest epistemic value amongst the possible categorical doxastic attitudes to $X$ if $p > \frac{W}{R+W}$;  disbelief in $X$ has greatest epistemic value if $p < \frac{R}{R+W}$; and suspension in $X$ has greatest value if $\frac{R}{R+W} < p < \frac{W}{R+W}$ (at $p = \frac{W}{R+W}$, belief ties with suspension; at $p = \frac{R}{R+W}$, disbelief ties with suspension). With this in hand, we have the following version of indicator reliabilism for justified beliefs:

Indicator reliabilism for justified belief (veritist version)  A belief in $X$ by agent $S$ is justified iff
(EIB1$^*$) $S$ has $g$;
(EIB2$^*$) the belief in $X$ by $S$ is based on $g$;
(EIB3$^*$) the objective probability of $X$ given that $S$ has $g$ is (nearly) greater than $\frac{W}{R+W}$;
(EIB4$^*$) there is no more inclusive ground $g'$ such that (a) $S$ has $g'$ and (b) the objective probability of $X$ given that $S$ has $g'$ is not (nearly) greater than $\frac{W}{R+W}$.

And of course this is simply a more explicit version of the standard version of indicator reliabilism. It is more explicit because it gives a particular threshold above which the objective probability of $X$ given that $S$ has $g$ counts as 'high', and above which (or not much below which) the belief in $X$ by $S$ counts as justified --- that threshold is $\frac{W}{R+W}$.
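Here is a small Python sketch of the veritist's calculation above (my own illustration; the values $R = 1$ and $W = 2$ are arbitrary, subject to the assumption that $W > R$):

    R, W = 1.0, 2.0  # value of getting it Right / disvalue of getting it Wrong

    def expected_value(attitude, p):
        # Expected epistemic value of a categorical attitude towards X
        # when the objective probability of X is p.
        if attitude == "belief":
            return p * R + (1 - p) * (-W)
        if attitude == "suspension":
            return 0.0
        if attitude == "disbelief":
            return p * (-W) + (1 - p) * R

    def best_attitude(p):
        return max(("belief", "suspension", "disbelief"),
                   key=lambda a: expected_value(a, p))

    print(best_attitude(0.2))  # disbelief:  0.2 < R/(R+W) = 1/3
    print(best_attitude(0.5))  # suspension: 1/3 < 0.5 < W/(R+W) = 2/3
    print(best_attitude(0.9))  # belief:     0.9 > W/(R+W) = 2/3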

Note that this epistemic value version of indicator reliabilism for justified doxastic states also gives a straightforward account of when a suspension of judgment is justified. Simply replace (EIB3$^*$) and (EIB4$^*$) with:

(EIS3$^*$) the objective probability of $X$ given that $S$ has $g$ is (nearly) between $\frac{R}{R+W}$ and $\frac{W}{R+W}$;
(EIS4$^*$) there is no more inclusive ground $g'$ such that (a) $S$ has $g'$ and (b) the objective probability of $X$ given that $S$ has $g'$ is not (nearly) between $\frac{R}{R+W}$ and $\frac{W}{R+W}$.

And when a disbelief is justified. This time, replace (EIB3$^*$) and (EIB4$^*$)  with:

(EID3$^*$) the objective probability of $X$ given that $S$ has $g$ is (nearly) less than $\frac{R}{R+W}$;
(EID4$^*$) there is no more inclusive ground $g'$ such that (a) $S$ has $g'$ and (b) the objective probability of $X$ given that $S$ has $g'$ is not (nearly) less than $\frac{R}{R+W}$.

Next, let's turn to indicator reliabilism for justified credence. Here's the epistemic value version:

Indicator reliabilism for justified credence (epistemic value version) A credence of $x$ in proposition $X$ by agent $S$ is justified iff
(EIC1) $S$ has $g$;
(EIC2) credence $x$ in $X$ by $S$ is based on $g$;
(EIC3) if $g' \subseteq g$ is a ground that $S$ has, then for every credence $x'$, the expected epistemic value of credence $x'$ in $X$ given that $S$ has $g'$ is at most (or not much above) the expected epistemic value of credence $x$ in $X$ given that $S$ has $g'$.

Again, to complete this, we need an account of epistemic value for credences. As noted above, the veritist holds that the sole fundamental source of epistemic value for credences is their accuracy. There is a lot to be said about different potential measures of the accuracy of a credence -- see, for instance, Jim Joyce's 2009 paper 'Accuracy and Coherence', chapters 3 & 4 of my 2016 book Accuracy and the Laws of Credence, or Ben Levinstein's forthcoming paper 'A Pragmatist's Guide to Epistemic Utility'. But here I will say only this: we assume that those measures are continuous and strictly proper. That is: (i) the accuracy of a credence is a continuous function of that credence; and (ii) any probability $x$ in a proposition $X$ expects credence $x$ to be more accurate than it expects any other credence $x' \neq x$ in $X$ to be. These two assumptions are widespread in the literature on accuracy-first epistemology, and they are required for many of the central arguments in that area. Given veritism and the continuity and strict propriety of the accuracy measures, (EIC3) is provably equivalent to the conjunction of:

(EIC3$^*$) the objective probability of $X$ given that the agent has ground $g$ approximates or equals $x$ -- that is, $P(X | \mbox{$S$ has $g$}) \approx x$;
(EIC4$^*$) there is no more inclusive ground $g'$ such that (i) $S$ has $g'$ and (ii) the objective probability of $X$ given that the agent has ground $g'$ does not equal or approximate $x$ -- that is, $P(X | \mbox{$S$ has $g'$}) \not \approx x$.
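The equivalence turns on strict propriety. For instance, on the quadratic (Brier) measure -- which I use here purely for illustration -- the accuracy of credence $y$ in $X$ is $-(1-y)^2$ if $X$ is true and $-y^2$ if $X$ is false. The expected accuracy of credence $y$ by the lights of probability $x$ is then $$-x(1-y)^2 - (1-x)y^2$$ whose derivative with respect to $y$ is $2(x-y)$. So expected accuracy is uniquely maximised at $y = x$: maximising expected epistemic value relative to a ground just is matching the objective probability given that ground.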

But of course EIC3$^*$ = TIC3 and EIC4$^*$ = TIC4 from above. Thus, the veritist version of indicator reliabilism for justified credences is equivalent to Tang's indicator reliabilism, and thus to the calibrationist version of process reliabilism.

Epistemic value version of process reliabilism


Next, let's turn to process reliabilism. How might we give an epistemic value version of that? The mistake made by the calibrationist version of process reliabilism is of the same sort as the mistake made by Tang in his formulation of indicator reliabilism -- both generalise from the case of full beliefs in the wrong way by mistaking an accidental feature for the salient feature. For the calibrationist, a full belief is justified if it is formed by a reliable process, and a process is reliable if a high proportion of the beliefs it produces are true. Now, notice that there is a sense in which such a process is calibrated: a belief is associated with a high degree of confidence, and that matches, at least approximately, the high truth-ratio of the process. In fact, we want to say that this process is belief-reliable. For it is possible for a process to be reliable in its formation of beliefs, but not in its formation of disbeliefs. So a process is disbelief-reliable if a high proportion of the disbeliefs it produces are false. And we might say that a process is suspension-reliable if a middling proportion of the suspensions it forms are true and a middling proportion are false. In each case, we think that, corresponding to each sort of categorical doxastic attitude $s$, there is a fitting proportion $x$ such that a process is $s$-reliable if $x$ is (approximately) the proportion of truths amongst the propositions to which it assigns $s$. Applying this in the credal case gives us the calibrationist version of process reliabilism that we have already met -- a credence $x$ in $X$ is justified if it is formed by a process whose truth-ratio for a given credence is equal to that credence. However, being the product of a belief-reliable process is not the feature of a belief in virtue of which it is justified. Rather, a belief is justified if it is the product of a process that has high expected epistemic value.

Process reliabilism for justified doxastic attitude (epistemic value version)  Doxastic attitude $s$ towards proposition $X$ by agent $S$ is justified iff
(EPA1-$\beta$) $s$ is produced by a process $\rho$;
(EPA2-$\beta$) If $\rho'$ is a process that is available to $S$, then the expected epistemic value of $\rho'$ is at most (or not much more than) the expected epistemic value of $\rho$.

That is, a doxastic attitude is justified for an agent if it is the output of a process that maximizes or nearly maximizes expected epistemic value amongst all processes that are available to her. To complete this account, we must say which processes count as available to an agent. To answer this, recall Comesaña's solution to the Generality Problem. On this solution, the only processes that interest us have the form, process producing doxastic attitude $s$ towards $X$ on basis of ground $g$. Clearly, a process of this form is available to an agent exactly when the agent has ground $g$. This gives

Process Reliabilism about Justified Doxastic Attitudes (Epistemic value version) Attitude $s$ towards proposition $X$ by $S$ is justified iff
(EPA1-$\alpha$) $s$ is produced by process $\rho^g_{s, X}$;
(EPA2-$\alpha$) If  $g' \subseteq g$ is a ground that $S$ has, then for every doxastic attitude $s'$, the expected epistemic value of process $\rho^{g'}_{s', X}$ is at most (or not much more than) the expected epistemic value of process $\rho^{g}_{s, X}$.

Thus, in the case of full beliefs, we have:

Process reliabilism for justified belief (epistemic value version)  A belief in proposition $X$ by agent $S$ is justified iff
(EPB1) Belief in $X$ is produced by process $\rho^g_{\mathrm{bel}, X}$;
(EPB2) if  $g' \subseteq g$ is a ground that $S$ has, then
  1. the expected epistemic value of process $\rho^{g'}_{\mathrm{dis}, X}$ is at most (or not much more than) the expected epistemic value of process $\rho^g_{\mathrm{bel}, X}$;
  2. the expected epistemic value of process $\rho^{g'}_{\mathrm{sus}, X}$ is at most (or not much more than) the expected epistemic value of process $\rho^g_{\mathrm{bel}, X}$.

And it is easy to see that (EPB1) = (EIB1) + (EIB2), since belief in $X$ is produced by process $\rho^g_{\mathrm{bel}, X}$ iff $S$ has ground $g$ and a belief in $X$ by $S$ is based on $g$. Also, (EPB2) is equivalent to (EIB3). Thus, as for the epistemic version of indicator reliabilism, we get:

Process reliabilism for justified belief (veritist version)  A belief in $X$ by agent $S$ is justified iff
(EPB1) $S$ has $g$;
(EPB2) the belief in $X$ by $S$ is based on $g$;
(EPB3) the objective probability of $X$ given that $S$ has $g$ is (nearly) greater than $\frac{W}{R+W}$;
(EPB4) there is no more inclusive ground $g'$ such that (a) $S$ has $g'$ and (b) the objective probability of $X$ given that $S$ has $g'$ is not (nearly) greater than $\frac{W}{R+W}$.

Next, consider how the epistemic value version of process reliabilism applies to credences.

Process reliabilism for justified credence (epistemic value version)  A credence of $x$ in proposition $X$ by agent $S$ is justified iff
(EPC1) the credence of $x$ in $X$ is produced by process $\rho^g_{x, X}$;
(EPC2) if $g' \subseteq g$ is a ground that $S$ has and $x'$ is a credence, then the expected epistemic value of process $\rho^{g'}_{x', X}$ is at most (or not much more than) the expected epistemic value of process $\rho^g_{x, X}$.

As before, we see that (EPC1) is equivalent to (EIC1) + (EIC2). And, providing the measure of accuracy is strictly proper and continuous, we get that (EPC2) is equivalent to (EIC3). So, once again, we arrive at the same summit. The routes taken by Tang, Dunn, and the epistemic value versions of process and indicator reliabilism lead to the same spot, namely, the following account of justified credence:

Reliabilism for justified credence (epistemic value version)  A credence of $x$ in proposition $X$ by agent $S$ is justified iff
(ERC1) $S$ has $g$;
(ERC2) credence $x$ in $X$ by $S$ is based on $g$;
(ERC3) the objective probability of $X$ given that the agent has ground $g$ approximates or equals $x$ -- that is, $P(X | \mbox{$S$ has $g$}) \approx x$;
(ERC4) there is no more inclusive ground $g'$ such that (i) $S$ has $g'$ and (ii) the objective probability of $X$ given that the agent has ground $g'$ does not equal or approximate $x$ -- that is, $P(X | \mbox{$S$ has $g'$}) \not \approx x$.


Tuesday, 31 January 2017

Fifth Reasoning Club Conference @ Turin EXTENDED DEADLINE

The Fifth Reasoning Club Conference will take place at the Center for Logic, Language, and Cognition in Turin on May 18-19, 2017.

Keynote speakers:

Branden FITELSON (Northeastern University, Boston)
Jeanne PEIJNENBURG (University of Groningen)
Katya TENTORI (University of Trento)
Paul EGRÉ (Institut Jean Nicod, Paris)

Organizing committee: Gustavo Cevolani (Turin), Vincenzo Crupi (Turin), Jason Konek (Kent), and Paolo Maffezioli (Turin).

 
CALL FOR ABSTRACTS

The submission deadline for the Fifth Reasoning Club Conference has been EXTENDED to 15 February 2017. The final decision on submissions will be made by 15 March 2017.

All PhD candidates and early career researchers with interests in reasoning and inference, broadly construed, are encouraged to submit an abstract of up to 500 words (prepared for blind review) via EasyChair at https://easychair.org/conferences/?conf=rcc17. We especially encourage members of groups that are underrepresented in philosophy to submit. We are committed to promoting diversity in our final programme.

Grants will be available to help cover travel costs for contributed speakers. To apply for a travel grant, please send a CV and a short travel budget estimate in a single pdf file to reasoningclubconference2017@gmail.com.

More information is available at http://www.llc.unito.it/notizie/reasoning-club-2017-llc-call-papers-now-open. For any queries please contact Vincenzo Crupi (vincenzo.crupi@unito.it) or Jason Konek (J.Konek@kent.ac.uk).

The Reasoning Club is a network of institutes, centres, departments, and groups addressing research topics connected to reasoning, inference, and methodology broadly construed. It issues the monthly gazette The Reasoner.

Earlier editions of the meeting were held in Brussels, Pisa, Kent, and Manchester.

Saturday, 21 January 2017

More on the Principal Principle and the Principle of Indifference

Last week, I posted about a recent paper by James Hawthorne, Jürgen Landes, Christian Wallmann, and Jon Williamson called 'The Principal Principle implies the Principle of Indifference', which was published in the British Journal for the Philosophy of Science in 2015. In that post, I read the HLWW paper a particular way. I took their argument to run roughly as follows:

The Principal Principle, as Lewis stated it, includes an admissibility condition. Any adequate account of admissibility should entail Conditions 1 and 2 (see below). Together with Conditions 1 and 2, the Principal Principle entails the Principle of Indifference. Thus, the Principal Principle entails the Principle of Indifference.

Read like this, my response to the argument ran thus:

There is an account of admissibility -- namely, Levi-admissibility -- that is adequate and on which Condition 2 is not generally true. Levi-admissibility is adequate since it has all of the features that Lewis required of admissibility, and it is very natural when we consider a close relative of Lewis' Principal Principle, namely, Levi's Principal Principle, which follows from Lewis' Principal Principle given some natural assumptions about admissibility that Lewis accepts.

However, there is another reading of the HLWW argument, and indeed it seems that some of H, L, W, and W favour it. On this alternative reading, Conditions 1 and 2 are not assumed to follow from any adequate account of admissibility; indeed, they are not taken to be consequences of the Principal Principle at all. Rather, they are intended to be plausible further constraints on credences that are independent of the Principal Principle. Thus, on this reading, the conclusion of the HLWW argument is not that the Principal Principle implies the Principle of Indifference. Rather, it is that the Principal Principle, together with two further norms (namely, Conditions 1 and 2), implies the Principle of Indifference.

In this post, I will raise an objection to this alternative argument.

The HLWW argument turns on a mathematical theorem. It takes certain constraints -- (I), (II), (III) below -- and shows that, if an agent's credence function satisfies those constraints, then it must satisfy a particular instance of the Principle of Indifference.

Theorem 1 If there is $0 < x < 1$ such that
(I) $P(F | X) = P(F)$
(II) $P(A | FX) = x$
(III) $P(A | X (A \leftrightarrow F)) = x$
then
(IV) $P(F) = 0.5$.
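Here is a sketch of the proof. Note that, besides (I)-(III), the derivation appeals to $P(A | X) = x$, which the Principal Principle supplies in the intended application below, where $X$ is the chance proposition $C^A_x$. Write $a = P(AF | X)$, $b = P(A\overline{F} | X)$, $c = P(\overline{A}F | X)$, $d = P(\overline{A}\,\overline{F} | X)$, and $q = P(F | X)$. Then (II) gives $a = xq$ and $c = (1-x)q$; (III) gives $a = x(a + d)$, and so $d = (1-x)q$; and $P(A | X) = x$ gives $b = x - a = x(1-q)$. Since $a + b + c + d = 1$, $$x + 2(1-x)q = 1$$ and so, since $x < 1$, $q = \frac{1}{2}$. Finally, (I) turns this into (IV): $P(F) = P(F | X) = \frac{1}{2}$.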

Now, the instance of the Principle of Indifference that HLWW wish to infer using this theorem is this:

Principle of Indifference (atomic case) Suppose $F$ is an atomic proposition and $P_0$ is our agent's initial credence function. Then $P_0(F) = 0.5$.

Thus, to obtain this from Theorem 1, we need the following: for each atomic $F$, there are $A$, $X$, and $0 < x < 1$ that satisfy (I), (II), and (III). Conditions 1 and 2 are intended to secure this, but I think the argument is clearest if we argue for them directly, using the considerations found in HLWW.

Thus, suppose $F$ is atomic. Then the idea is this. Pick a proposition $X$ with two features: (a) if you were to learn $X$ and nothing more as your first piece of evidence, it would place a very strict constraint on your credence in $A$ --- it would require you to have credence $x$ in $A$; (b) $X$ provides no information about $F$ nor about the relationship between $A$ and $F$. Now, providing that $A$ is not logically related to $F$, we might take $X$ to be the proposition $C^A_x$ that says that the objective chance of $A$ is $x$. By the Principal Principle, $C^A_x$ has the first feature (a): $P_0(A | X) = x$. What's more, since $A$ is logically independent of $F$, $C^A_x$ also has the second feature (b): in the absence of further evidence, and in particular evidence about the relationship between $A$ and $F$, $C^A_x$ provides no information about $F$ nor about the relationship between $A$ and $F$.

Now, with $A$, $X$, $x$ in hand, we appeal to two principles concerning the way that we should respond to evidence:

(Ev1): If your credence function is $P$ and your evidence does not provide any information about the connection between $B$ and $C$, then $P(B | C) = P(B)$.

In slogan form, this says: Ignorance entails irrelevance.

(Ev2): If you have strong evidence concerning $B$ and no evidence concerning $C$, then $P(B | B \leftrightarrow C) = P(B)$.

In slogan form, as we will see: Credences supported by stronger evidence are more resilient.

Now, from (Ev1), we immediately obtain (I) for our agent's initial credence function $P_0$ with $F$ atomic and $X = C^A_x$. After all, if you have no evidence, your evidence certainly does not provide any information about the connection between $C^A_x$ and $F$.

From (Ev1) and the Principal Principle, we obtain (II) for $P_0$ with $F$ atomic and $X = C^A_x$. Suppose you first learn $C^A_x$ as evidence. So your credence function is $P_1(-) = P_0(-|C^A_x)$. Now, by hypothesis, $C^A_x$ provides no information about the connection between $F$ and $A$. Then, by (Ev1), $P_1(A | F) = P_1(A)$. So $P_0(A | F\ \&\ C^A_x) = P_0(A | C^A_x)$. And, by the Principal Principle, $P_0(A | C^A_x) = x$. So $P_0(A | F\ \&\ C^A_x) = x$.

Finally, from (Ev2) and the Principal Principle, we obtain (III) for $P_0$ with $F$ atomic and $X = C^A_x$. Again, suppose you learn $C^A_x$. So $P_1(-) = P_0(-|C^A_x)$. You thus have strong evidence concerning $A$ and no evidence concerning $F$. Thus, by (Ev2), $P_1(A | A \leftrightarrow F) = P_1(A)$. That is, $P_0(A | C^A_x\ \&\ (A \leftrightarrow F)) = P_0(A | C^A_x)$. And by the Principal Principle, $P_0(A | C^A_x) = x$. So $P_0(A | C^A_x\ \&\ (A \leftrightarrow F)) = x$.

Thus, the plausibility of the HLWW argument turns on the plausibility of (Ev1) and (Ev2). Unfortunately, both beg the question concerning the Principle of Indifference. As a result, they cannot be assumed in a justification of that norm. Let's consider each in turn.

First, (Ev1). If your evidence does not provide any information about the connection between $B$ and $C$, then this evidence leaves open the possibility that $B$ is positively relevant to $C$; it leaves open the possibility that $B$ is negatively relevant to $C$; and it leaves open the possibility that $B$ is irrelevant to $C$. But (Ev1) demands that we deny the first two possibilities and take $B$ to be irrelevant to $C$. Why? Without further argument, it seems that we would be equally justified in taking $B$ to be positively relevant to $C$, and equally justified in taking $B$ to be negatively relevant to $C$.

Second, (Ev2). The idea is this: When I learn that two propositions, $B$ and $C$, are equivalent, there are many ways I might respond. I might retain my prior credence in $B$ and bring my credence in $C$ into line with that. Or I might retain my prior credence in $C$ and bring my credence in $B$ into line with that. Or I might do many other things. (Ev2) says that, if I have strong evidence concerning $B$ and no evidence concerning $C$, then I should opt for the first response and retain my prior credence in $B$ -- which was formed in response to the strong evidence concerning $B$ -- and bring my credence in $C$ into line with that -- since my prior credence in $C$ was, in any case, formed in response to no relevant evidence at all.

Now, on the face of it, this seems like a reasonable constraint on our response to evidence. It says, essentially, that credence formed in response to stronger evidence should be more resilient than credence formed in response to weaker evidence. And, as a limiting case, credence formed in response to strong evidence, such as evidence about the chances, should be maximally resilient when compared to credence formed in response to no evidence. (Note that a similar way of thinking might give an alternative motivation for (II), since this is also a principle of resilient credence.)

However, unfortunately, (Ev2) threatens to be inconsistent. After all, it is easy to suppose that there are propositions $B$, $C$, and $D$ such that you have strong evidence concerning $B$, but no evidence concerning $C$ or $D$ or $C\ \&\ D$ or $C\ \&\ \neg D$. But, in that situation, (Ev2) entails:

  • $P(B | B \leftrightarrow C) = P(B)$
  • $P(B | B \leftrightarrow (C\ \&\ D)) = P(B)$
  • $P(B | B \leftrightarrow (C\ \&\ \neg D)) = P(B)$
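
These three constraints are jointly unsatisfiable, at least when $0 < P(B) < 1$ and the relevant conditional probabilities are all defined. Here is a quick derivation (my own reconstruction). For any proposition $E$, $$P(B | B \leftrightarrow E) = \frac{P(BE)}{P(BE) + P(\neg B \neg E)} = P(B) \mbox{ iff } P(E | B) + P(E | \neg B) = 1$$ Apply this with $E = C\ \&\ D$ and with $E = C\ \&\ \neg D$, and sum the results: since $C$ is the disjunction of $C\ \&\ D$ and $C\ \&\ \neg D$, this gives $P(C | B) + P(C | \neg B) = 2$. But applying it with $E = C$ gives $P(C | B) + P(C | \neg B) = 1$. Contradiction.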

To avoid this inconsistency, the defender of (Ev2) must say that, in fact, our lack of evidence concerning $C$, $D$, $C\ \&\ D$ and $C\ \&\ \neg D$ indeed counts as no evidence concerning $C$ and $D$, but does count as evidence concerning $C\ \&\ D$ and $C\ \&\ \neg D$. How might they do that? Well, they might note that, while $C$ and $D$ are each true in half the possible worlds, since they are atomic, $C\ \&\ D$ and $C\ \&\ \neg D$ are true only in a quarter of the possible worlds. And thus a lack of evidence is in fact evidence against them. But of course this line of argument appeals to the Principle of Indifference. Only if you think that every world should receive equal credence will you think that a lack of evidence counts as no evidence for a proposition that is true at half of the possible worlds, but counts as genuine evidence against a proposition that is true at only a quarter of the worlds.

Thus, I conclude that the HLWW argument fails. While (Ev1) and (Ev2) may be true, we cannot appeal to them in order to justify the Principle of Indifference, since they can only be defended by appealing to the Principle of Indifference itself.

Tuesday, 17 January 2017

The Principal Principle does not imply the Principle of Indifference

Recently, James Hawthorne, Jürgen Landes, Christian Wallmann, and Jon Williamson published a paper in the British Journal for the Philosophy of Science in which they claim that the Principal Principle entails the Principle of Indifference -- indeed, the paper is called 'The Principal Principle implies the Principle of Indifference'. In this post, I argue that it does not.

All Bayesian epistemologists agree on two claims. The first, which we might call Precise Credences, says that an agent's doxastic state at a given time $t$ in her epistemic life can be represented by a single credence function $P_t$, which assigns to each proposition $A$ about which she has an opinion a precise numerical value $P_t(A)$ that is at least 0 and at most 1. $P_t(A)$ is the agent's credence in $A$ at $t$. It measures how strongly she believes $A$ at $t$, or how confident she is at $t$ that $A$ is true. The second point of agreement, which is typically known as Probabilism, says that an agent's credence function at a given time should be a probability function: that is, for all times $t$, $P_t(\top) = 1$ for any tautology $\top$, $P_t(\bot) = 0$ for any contradiction $\bot$, and $P_t(A \vee B) = P_t(A) + P_t(B) - P_t(AB)$ for any propositions $A$ and $B$.
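
For concreteness, here is a toy model of these two norms (my own illustration, with made-up numbers, not anything from HLWW): on a finite space of possible worlds, a credence function generated by nonnegative weights summing to 1 satisfies Probabilism automatically.

```python
# A toy credence function on a finite algebra (an illustrative sketch).
# Worlds are labels; a proposition is the set of worlds at which it is true.

worlds = {"w1", "w2", "w3", "w4"}
weight = {"w1": 0.125, "w2": 0.25, "w3": 0.5, "w4": 0.125}  # sums to 1

def P(proposition):
    """The agent's precise credence in a proposition (a set of worlds)."""
    return sum(weight[w] for w in proposition)

A = {"w1", "w2"}
B = {"w2", "w3"}

assert P(worlds) == 1     # a tautology gets credence 1
assert P(set()) == 0      # a contradiction gets credence 0
assert P(A.union(B)) == P(A) + P(B) - P(A.intersection(B))  # additivity
```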

So Precise Credences and Probabilism form the core of Bayesian epistemology. But, beyond these two norms, there is little agreement between its adherents. Bayesian epistemologists disagree along (at least) two dimensions. First, they disagree about the correct norms concerning updating on evidence learned with certainty --- some say they are diachronic norms concerning how an agent should in fact update; others say that there are only synchronic norms concerning how an agent should plan to update; and others think there are no norms concerning updating at all. Second, they disagree about the stringency of the synchronic norms that don't concern updating. Our concern here is with the latter. Some candidate norms of this sort: the Principal Principle, which says how an agent's credences in propositions concerning the objective chances should relate to her credences in other propositions (Lewis 1980); the Reflection Principle, which says how an agent's current credences in propositions concerning her future credences should relate to her current credences in other propositions (van Fraassen 1984, Briggs 2009); and the Principle of Indifference, which says, roughly, that an agent with no evidence should divide her credences equally over all possibilities (Keynes 1921, Carnap 1950, Jaynes 2003, Williamson 2010, Pettigrew 2014). Those we might call Radical Subjective Bayesians adhere to Precise Credences and Probabilism, but reject the Principal Principle, the Reflection Principle, and the Principle of Indifference. Those we might call Moderate Subjective Bayesians adhere to Precise Credences, Probabilism, and the Principal Principle (and also, quite often, the Reflection Principle), but they reject the Principle of Indifference. And the Objective Bayesians accept all of the principles.

In a recent paper, Hawthorne et al. (2015) (henceforth, HLWW) argue that Moderate Subjective Bayesianism is an inconsistent position, because the Principal Principle (and, indeed, the Reflection Principle) entails the Principle of Indifference. Thus, it is inconsistent to accept the former and reject the latter. We must either reject the Principal Principle, as the Radical Subjective Bayesian does, or accept it together with the Principle of Indifference, as the Objective Bayesian does.

Notoriously, as Lewis originally stated it, the Principal Principle includes an admissibility condition (Lewis 1980, 266-7). Equally notoriously, Lewis did not provide a precise account of this condition, thereby leaving his formulation of the principle similarly imprecise. HLWW do not give a precise account either. But they do appeal to two principles that they take to follow intuitively from the Principal Principle. And from these two principles, together with the Principal Principle itself, they derive what they take to be an instance of the Principle of Indifference. The first principle to which they appeal --- their Condition 1 --- is in fact provable, as they note. The second --- their Condition 2 --- is not. Indeed, as we will see, on the correct understanding of admissibility, it is false. Thus, the HLWW argument fails. What's more, its conclusion is not true. It is possible to satisfy the Principal Principle without satisfying the Principle of Indifference, as we will see below. Moderate Subjective Bayesianism is a coherent position.


Introducing the Principal Principle


We begin by introducing the Principal Principle. To aid our statement, let me introduce a piece of notation. Given a proposition $A$ and a real number $0 \leq x \leq 1$, let $C^A_x$ be the following proposition: The current objective chance of $A$ is $x$. And we will let $P_0$ be the credence function of our agent at the very beginning of her epistemic life --- when she is, as Lewis would say, a superbaby; that is, she is not yet in receipt of any evidence. Then, as Lewis originally formulates the Principal Principle, it says this:

Lewis' Principal Principle Suppose $A$, $E$ are propositions and $0 \leq x \leq 1$. Then it should be the case that $$P_0(A | C^A_xE) = x $$providing (i) $P_0(C^A_xE) > 0$, and (ii) $E$ is admissible for $A$.

In this version, the principle applies to an agent only at the beginning of her epistemic life; it governs her initial credence function. In this situation, the principle says, her credence in a proposition $A$ conditional on the conjunction of some proposition $E$ and a chance proposition that says that the chance of $A$ is $x$ should be $x$, providing the conditional probability is well-defined and $E$ is admissible for $A$.

The motivation for the admissibility condition is this. Suppose $E$ entails $A$. Then we surely don't want to demand that $P_0(A | C^A_xE) = x$. After all, if $x < 1$, then such a demand would conflict with Probabilism, since it is a consequence of Probabilism that, if $E$ entails $A$, then $P_0(A | C^A_xE) = 1$. Thus, we must at least restrict the Principal Principle so that it does not apply when $E$ entails $A$. But there are other cases in which the Principal Principle should not be imposed, even if such an application would not be outright inconsistent with other norms such as Probabilism. For instance, suppose that $E$ entails that the chance of $A$ at some time in the future is $x' \neq x$. Then, again, we don't want to require that $P_0(A | C^A_xE) = x$. The moral is this: if $E$ contains information about $A$ that overrides the information that the current chance of $A$ gives about $A$, then it is inadmissible. Clearly any proposition that logically entails $A$ provides information that overrides the current chance information about $A$; and so does a proposition that entails something about the future chance of $A$. So much for propositions that are inadmissible. Are there any we can be sure are admissible? According to Lewis, there are, namely, propositions solely concerning the past or the present. Thus, Lewis does not give a precise account of admissibility: he gives a heuristic --- $E$ is admissible for $A$ if $E$ does not provide information about $A$ that overrides the information contained in propositions about the current chance of $A$ --- and he gives examples of propositions that do and do not provide such information --- I've recalled some of Lewis' examples here.

Now, as Lewis himself noted, the Principal Principle has implausible consequences when the chances are self-undermining --- that is, when the chances assign a positive probability to outcomes in which the chances are different. This happens, for instance, for Lewis' own favoured account of chance, the Humean account or Best System Analysis. This led to reformulations of the Principal Principle, such as Thau's and Hall's New Principle (Lewis 1994, Thau 1994, Hall 1994) and Ismael's General Recipe (Ismael 2008). HLWW say nothing explicitly about whether or not chances are self-undermining. But, since they are interested in investigating the Principal Principle and not the New Principle or the General Recipe, I take them to assume that chances are not self-undermining. I will do likewise.

The HLWW argument


However imprecise Lewis' account of admissibility is, HLWW take it to be precise enough to allow us to be confident of the following principles:

Condition 1  If
(1a) $E$ is admissible for $A$, and
(1b) $C^A_xE$ contains no information that renders $F$ relevant to $A$,
then
(1c) $EF$ is admissible for $A$.

Now, HLWW propose to make (1b) precise as follows: $$P_0(A | FC^A_xE) = P_0(A | C^A_xE)$$ That is, $C^A_xE$ contains no information that renders $F$ relevant to $A$ just in case $C^A_xE$ renders $A$ probabilistically independent of $F$. With that explication in hand, Condition 1 now actually follows logically from Lewis' Principal Principle, as HLWW note. After all, by (1a) and Lewis' Principal Principle, $P_0(A | C^A_xE) = x$. And, by the explication of (1b), $P_0(A | C^A_xE) = P_0(A | FC^A_xE)$. Daisychaining these identities together, we have $P_0(A | FC^A_xE) = x$, which is (1c).

Condition 2 If
(2a) $E$ is admissible for $A$, and
(2b) $C^A_xE$ contains no information that renders $F$ relevant to $A$,
then
(2c) $E(A \leftrightarrow F)$ is admissible for $A$.

This is not provable. Indeed, as we will see below, it is false. Nonetheless, together with Lewis' Principal Principle, Conditions 1 and 2 entail a constraint on an agent's credence function that HLWW take to be the constraint imposed by the Principle of Indifference.

Proposition 1 Suppose Lewis' Principal Principle together with Conditions 1 and 2 hold. And suppose that there are propositions $A$, $E$, and $F$ and $0 < x < 1$ such that $E$ is admissible for $A$. Suppose further that $F$ is atomic and contingent. Then

(i) If $C^A_xE$ contains no information that renders $F$ relevant to $A$, then the following is required of the agent's initial credence function: $P_0(F | C^A_xE) = 0.5.$

(ii) If $C^A_xE$ contains no information whatsoever about $F$ (so that $P_0(F | C^A_xE) = P_0(F)$), then the following is required of the agent's initial credence function: $P_0(F) = 0.5$.
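
In sketch (my own reconstruction of the proof, assuming the relevant conditional probabilities are defined): let $X = C^A_xE$. Since $E$ is admissible for $A$, Lewis' Principal Principle gives $P_0(A | X) = x$. Given the antecedent of (i), Condition 1 makes $EF$ admissible for $A$, so $P_0(A | FX) = x$; and Condition 2 makes $E(A \leftrightarrow F)$ admissible for $A$, so $P_0(A | X(A \leftrightarrow F)) = x$. The first two identities render $A$ probabilistically independent of $F$ given $X$, so, writing $f = P_0(F | X)$, $$P_0(A | X(A \leftrightarrow F)) = \frac{xf}{xf + (1-x)(1-f)} = x$$ and, since $0 < x < 1$, this forces $f = 0.5$, which is (i). Adding the assumption of (ii), that $P_0(F | C^A_xE) = P_0(F)$, then yields $P_0(F) = 0.5$.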

HLWW take Proposition 1 to show that the Principle of Indifference follows from the Principal Principle. After all, Condition 1 is simply a theorem. And they take Condition 2 to be a consequence of the Principal Principle, given the correct understanding of admissibility. So if you assume the Principal Principle, you get all of the hypotheses of Proposition 1. However, as we will see in the next two sections, Condition 2 is in fact false.

Levi's Principal Principle and Levi-Admissibility


Above, we stated the Principal Principle as follows:

Lewis' Principal Principle $P_0(A | C^A_xE) = x$, providing (i) $P_0(C^A_xE) > 0$,  and (ii) $E$ is admissible for $A$.

Now suppose we make the following assumption about admissibility:

Current Chance Admissibility Propositions about the current objective chances are admissible.

Thus, for instance, $P_0(A | C^A_xC^B_y) = x$, providing $P_0(C^A_xC^B_y) > 0$, which also ensures that $C^A_x$ and $C^B_y$ are compatible.

Now, if $ch$ is a probability function defined over all the propositions about which the agent has an opinion, let $C_{ch}$ be the proposition that says that the objective chances are given by $ch$. Then it follows from the Principal Principle and Current Chance Admissibility that $P_0(A | C_{ch}) = ch(A)$: after all, $C_{ch}$ entails $C^A_{ch(A)}$, so $C^A_{ch(A)}C_{ch} = C_{ch}$, and $C_{ch}$ is admissible by Current Chance Admissibility. And from this it also follows (applying the identity to $AE$ and to $E$ and taking the ratio) that:

Levi's Principal Principle (Bogdan 1984, Pettigrew 2012) $P_0(A | C_{ch}E) = ch(A | E)$, providing $P_0(C_{ch}E), ch(E) > 0$.

This is a version of the Principal Principle that makes no mention of admissibility. From it, something close to Lewis' Principal Principle follows: If $P_0(C^{A|E}_x E) > 0$, then $$P_0(A | C^{A|E}_x E) = x$$ where $C^{A|E}_x$ is the proposition: The current objective chance of $A$ conditional on $E$ is $x$. What's more, while Levi's version does not mention admissibility, since it applies equally when the proposition $E$ is not admissible, it does suggest a precise account of admissibility. And it is possible to show that, if we take the version of Lewis' Principal Principle that results from understanding admissibility in this way, it is a consequence of Levi's Principal Principle.
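
To see why the Lewis-style principle displayed above follows from Levi's version, note that (assuming, for simplicity, that there are at most countably many possible chance functions) $C^{A|E}_x$ is equivalent to the disjunction of those $C_{ch}$ with $ch(A | E) = x$. Partitioning by these, and noting that each such $C_{ch}$ entails $C^{A|E}_x$, Levi's Principal Principle gives $$P_0(A | C^{A|E}_xE) = \sum_{ch\,:\,ch(A|E) = x} P_0(C_{ch} | C^{A|E}_xE)\, P_0(A | C_{ch}E) = \sum_{ch\,:\,ch(A|E) = x} P_0(C_{ch} | C^{A|E}_xE)\, x = x$$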

Levi-Admissibility $E$ is Levi-admissible for $A$ if, for all possible chance functions $ch$, $ch(A | E) = ch(A)$.

That is, on this account, $E$ is admissible for $A$ if every possible chance function renders $A$ and $E$ stochastically independent. Three points are worthy of note:
  1. All propositions providing future information about the chance of $A$ or information about the truth value of $A$ are Levi-inadmissible, since $A$ will be stochastically dependent on such propositions according to all possible current chance functions. So this account of admissibility agrees with the examples of clearly inadmissible propositions that we gave above.
  2. All propositions solely about the past are Levi-admissible, since all such propositions will now be true or false and will be assigned chance 1 or 0 accordingly by all possible current chance functions. So this account of admissibility agrees with the examples of clearly admissible propositions that we gave above.
  3. If $E$ is Levi-admissible for $A$, then $P_0(A | C^A_xE) = P_0(A | C^{A|E}_xE) = x$. That is, Lewis' Principal Principle follows from Levi's version if we understand Lewis' notion of admissibility as Levi-admissibility.
Taken together, (1), (2), and (3) entail that Levi-admissibility has all of the features that Lewis wished admissibility to have.

Now, although Levi's account of admissibility recovers Lewis' examples, it might seem to be too demanding. Suppose, for instance, that $A$ is a proposition concerning the toss of a coin in Quito --- it says that it will land heads --- while $E$ is a proposition concerning tomorrow's weather in Addis Ababa --- it says that it will rain. Then, intuitively, $E$ is admissible for $A$. But $E$ is not Levi-admissible for $A$. After all, we are considering an agent at the beginning of her epistemic life. And so there are certainly possible chance functions --- probability functions that, for all she knows, give the objective chances --- that do not render $E$ and $A$ stochastically independent.

In fact, on closer inspection, the Levi-admissibility verdict is exactly right. Consider my credence in $A$ conditional on $E$ and the chance hypothesis $C^A_{0.5}$, which says that the coin in Quito is fair and so the unconditional chance of $A$ is 0.5. Amongst the chance functions that are epistemically possible for me, some make $E$ irrelevant to $A$, some make it positively relevant to $A$ and some make it negatively relevant to $A$. Indeed, we might suppose that the possible chances of $A$ conditional on $E$ run the full gamut of values from 0 to 1. In that case, surely we don't want to say that $E$ is admissible for $A$ and thereby impose, via the Principal Principle, the demand that our agent's credence in $A$ conditional on $E$ and $C^A_{0.5}$ is 0.5. After all, if I choose to place most of my prior credence on the chance hypotheses on which $E$ is positively relevant to $A$, then my credence in $A$ conditional on $E$ and $C^A_{0.5}$ should not be 0.5 --- it should be something greater than 0.5. If I choose to place most of my prior credence on the chance hypotheses on which $E$ is negatively relevant to $A$, then my credence in $A$ conditional on $E$ and $C^A_{0.5}$ should not be 0.5 --- it should be something less than 0.5. Of course, we might think that it is irrational for our agent, a superbaby with no evidence one way or the other, to favour the positive relevance hypotheses over those that posit neutral relevance and negative relevance. We might think that she should spread her credences equally over all of the possibilities, in which case their effects will cancel out, and her credence in $A$ conditional on $E$ and $C^A_{0.5}$ will indeed be 0.5. But of course to do this is to assume the Principle of Indifference and beg the question.
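
To make this concrete, here is a toy numerical version of the point (my own illustration, with made-up numbers): two epistemically possible chance functions, each giving $A$ chance 0.5, one making $E$ positively relevant to $A$, one making it irrelevant. A prior that favours the first violates the demand that treating $E$ as admissible would impose.

```python
# Two epistemically possible chance hypotheses over worlds (A, E), both
# giving A chance 0.5, differing on the relevance of E to A (made-up numbers).

def ch_positive(A, E):   # E positively relevant to A: ch(A | E) = 0.8
    table = {(True, True): 0.4, (True, False): 0.1,
             (False, True): 0.1, (False, False): 0.4}
    return table[(A, E)]

def ch_neutral(A, E):    # E irrelevant to A: ch(A | E) = ch(A) = 0.5
    return 0.25

# A superbaby prior that happens to favour the positive-relevance hypothesis.
# P0 lives on pairs (chance hypothesis, world); C^A_{0.5} is true at every pair.
weights = [(ch_positive, 0.9), (ch_neutral, 0.1)]

def P0(pred):
    return sum(w * ch(A, E)
               for ch, w in weights
               for A in (True, False)
               for E in (True, False)
               if pred(A, E))

# P0(A | E & C^A_{0.5}) comes out as roughly 0.77, not 0.5.
print(P0(lambda A, E: A and E) / P0(lambda A, E: E))
```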

The failure of Condition 2


With this precise account of admissibility in hand, we can now test to see whether or not it vindicates Condition 2 --- recall, HLWW claim that this is a consequence of the Principal Principle. As we saw above, Condition 2 runs as follows:

Condition 2 If
(2a) $E$ is admissible for $A$, and
(2b) $C^A_xE$ contains no information that renders $F$ relevant to $A$,
then
(2c) $E(A \leftrightarrow F)$ is admissible for $A$.

Now suppose that Lewis' Principal Principle is true, and assume that admissibility means Levi-admissibility. Then this is equivalent to:

Condition 2$^*$ If $ch$ is a possible chance function, and
(2a$^*$) $ch(A | E) = ch(A)$, and
(2b$^*$) $ch(A | FE) = ch(A | E)$,
then
(2c$^*$) $ch(A | E(A \leftrightarrow F)) = ch(A)$.

However, this is false. Indeed, we can show the following:

Proposition 2 For any value $0 \leq y \leq 1$, there is a chance function $ch$ such that (2a$^*$) and (2b$^*$) hold, but $$ch(A | E(A \leftrightarrow F)) = y$$

Thus, (2a$^*$) and (2b$^*$) impose no constraints whatsoever on the chance of $A$ conditional on $E(A \leftrightarrow F)$.
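
In fact, a witness is easy to construct. Here is a minimal numerical sketch (my own construction, with made-up numbers): take $ch(A) = 0.5$, let $A$ be independent of $E$, and, given $E$, let $F$ be independent of $A$ with $ch(F | E) = y$. Then (2a$^*$) and (2b$^*$) hold, and $ch(A | E(A \leftrightarrow F))$ comes out as exactly $y$.

```python
# A numerical witness for Proposition 2 (an illustrative sketch).
# Worlds are triples (A, E, F); ch is a dict of probabilities.

from itertools import product

def make_chance(y, a=0.5, e=0.5):
    """A chance function with A independent of E, and F independent of A
    given E, where the chance of F inside E is y."""
    ch = {}
    for A, E, F in product([True, False], repeat=3):
        p_A = a if A else 1 - a
        p_E = e if E else 1 - e
        f = y if E else 0.5          # chance of F is y inside E, 0.5 outside
        p_F = f if F else 1 - f
        ch[(A, E, F)] = p_A * p_E * p_F
    return ch

def prob(ch, pred):
    return sum(p for w, p in ch.items() if pred(*w))

def cond(ch, pred, given):
    joint = prob(ch, lambda A, E, F: pred(A, E, F) and given(A, E, F))
    return joint / prob(ch, given)

ch = make_chance(y=0.3)
print(cond(ch, lambda A, E, F: A, lambda A, E, F: E))               # (2a*): 0.5 = ch(A)
print(cond(ch, lambda A, E, F: A, lambda A, E, F: E and F))         # (2b*): 0.5
print(cond(ch, lambda A, E, F: A, lambda A, E, F: E and (A == F)))  # 0.3 = y
```

The trick is that (2b$^*$) only requires $F$ to be independent of $A$ given $E$; the chance of $F$ given $E$ itself is left completely free, and it is that value which fixes $ch(A | E(A \leftrightarrow F))$. And with $y \neq 0.5 = ch(A)$, (2c$^*$) fails.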

Thus, it is possible that $E$ is Levi-admissible for $A$ and that $C^A_xE$ contains no information that renders $F$ relevant to $A$, and yet $E(A \leftrightarrow F)$ is not Levi-admissible for $A$. So Condition 2 is false and the HLWW argument fails.

Levi's Principal Principle and the Principle of Indifference


Of course, the failure of an argument does not entail the falsity of its conclusion. It might yet be the case that the Principal Principle entails the Principle of Indifference, even if the HLWW argument does not show that. But in fact we can show that this is not true. To see this, we note a sufficient condition for satisfying Levi's Principal Principle:

Proposition 3 Suppose $C$ is the set of all possible chance functions. Then, if $P_0$ is in the convex hull of $C$, $P_0(A | C_{ch}E) = ch(A | E)$, providing $P_0(C_{ch}E), ch(E) > 0$.

Now, if Levi's Principal Principle entails the Principle of Indifference, and the Principle of Indifference entails that every atomic proposition has probability 0.5, then it follows that every member of the convex hull of the set of possible chance functions must assign probability 0.5 to every atomic proposition. But it is easy to see that this is not true. Let $F$ be the atomic proposition that says that a sample of uranium will decay at some point in the next hour. In the absence of evidence, the possible chances of $F$ range over the full unit interval from 0 to 1. Thus, there are members of the convex hull of the set of possible chance functions that assign probabilities other than 0.5 to $F$. And, by Proposition 3, these members will satisfy Levi's Principal Principle.
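
Here is that counterexample in miniature (my own illustration, with made-up numbers; the real example lets the possible chances span the whole interval): suppose the only possible chances of $F$ are 0.2 and 0.8, and let $P_0$ mix them with weights 0.9 and 0.1.

```python
# A sketch of Proposition 3 in miniature.  P0 is a convex mixture of two
# chance hypotheses for the atomic F, defined on pairs (hypothesis, world).

weights = {0.2: 0.9, 0.8: 0.1}    # prior over the two chance hypotheses

# P0(chance of F is p, F) = weights[p] * p, and so on.
P0 = {(p, F): w * (p if F else 1 - p)
      for p, w in weights.items() for F in (True, False)}

# Levi's Principal Principle holds: P0(F | C_ch) = ch(F) for each hypothesis.
for p in weights:
    print(P0[(p, True)] / (P0[(p, True)] + P0[(p, False)]))   # ~0.2, then ~0.8

# ...but the Principle of Indifference fails: P0(F) is about 0.26, not 0.5.
print(P0[(0.2, True)] + P0[(0.8, True)])
```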

Applying Levi's Principal Principle


A possible objection: Levi's Principal Principle is all well and good in theory, but it is not applicable. Suppose we are interested in a proposition $A$; and we have collected evidence $E$. How might we apply Levi's Principal Principle in order to set our credence in $A$? In the case of Lewis' version of the principle, we need only know the chance of $A$ and the fact that $E$ is admissible for $A$, and we often know both of these. But, in order to apply Levi's version, we must know the chance of $A$ conditional on our evidence $E$. And, at least for large and varied bodies of evidence, we never know this. Or so the objection goes.

But the objection fails. In fact, Levi's Principal Principle may be applied in those cases. You don't have to know the chance of $A$ conditional on $E$ in order to set your credence in $A$ when you have evidence $E$. You simply have to have opinions about the different possible values that that conditional chance might take. You then apply Levi's Principal Principle, together with the Law of Total Probability, which jointly entail that your credence in $A$ given $E$ should be your expectation of the chance of $A$ given $E$. Of course, neither Levi's Principal Principle nor the Law of Total Probability will tell you how to set your credences in the different possible values that the conditional chance of $A$ given $E$ might take. But that's not a problem for the Moderate Subjective Bayesian, who doesn't expect her evidence to pin down a unique credal response. Only the Objective Bayesian would expect that. You pick your probability distribution over those possible conditional chance values and Levi's Principal Principle does the rest via the Law of Total Probability.
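
In symbols, assuming for simplicity that there are at most countably many possible chance functions: $$P_0(A | E) = \sum_{ch} P_0(C_{ch} | E)\, P_0(A | C_{ch}E) = \sum_{ch} P_0(C_{ch} | E)\, ch(A | E)$$ The first identity is the Law of Total Probability; the second is Levi's Principal Principle; and the right-hand side is just your expectation of the conditional chance of $A$ given $E$, computed using the credences $P_0(C_{ch} | E)$ that you have chosen.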

Conclusion



The HLWW argument purports to show that the Principal Principle entails the Principle of Indifference. But it fails because, on the correct understanding of admissibility, Condition 2 is not a consequence of the Principal Principle; and indeed it is false. What's more, we can see that there are credence functions that satisfy the correct version of the Principal Principle --- namely, Levi's Principal Principle --- that do not satisfy the Principle of Indifference. The logical space is therefore safe once again for Moderate Subjective Bayesians, that is, those who accept Precise Credences, Probabilism, the Principal Principle (and perhaps the Reflection Principle), but who deny the Principle of Indifference.


References

  • Bogdan, R. (Ed.) (1984). Henry E. Kyburg, Jr. and Isaac Levi. Dordrecht: Reidel.
  • Briggs, R. (2009). Distorted Reflection. Philosophical Review, 118(1), 59–85.
  • Carnap, R. (1950). Logical Foundations of Probability. Chicago: University of Chicago Press.
  • Hall, N. (1994). Correcting the Guide to Objective Chance. Mind, 103, 505–518.
  • Hawthorne, J., Landes, J., Wallmann, C., & Williamson, J. (2015). The Principal Principle Implies the Principle of Indifference. The British Journal for the Philosophy of Science.
  • Ismael, J. (2008). Raid! Dissolving the Big, Bad Bug. Noûs, 42(2), 292–307.
  • Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge, UK: Cambridge University Press.
  • Keynes, J. M. (1921). A Treatise on Probability. London: Macmillan.
  • Lewis, D. (1980). A Subjectivist’s Guide to Objective Chance. In R. C. Jeffrey (Ed.) Studies in Inductive Logic and Probability, vol. II. Berkeley: University of California Press.
  • Lewis, D. (1994). Humean Supervenience Debugged. Mind, 103, 473–490.
  • Pettigrew, R. (2012). Accuracy, Chance, and the Principal Principle. Philosophical Review, 121(2), 241–275.
  • Pettigrew, R. (2014). Accuracy, Risk, and the Principle of Indifference. Philosophy and Phenomenological Research.
  • Thau, M. (1994). Undermining and Admissibility. Mind, 103, 491–504.
  • van Fraassen, B. C. (1984). Belief and the Will. Journal of Philosophy, 81, 235–56.
  • Williamson, J. (2010). In Defence of Objective Bayesianism. Oxford: Oxford University Press.