Isaac Levi on Rationality, Deliberation and Prediction (2/3)

This is the second of a three-part post on the philosopher Isaac Levi’s account of the relationship between deliberation and prediction in decision theory, an account which is an essential part of Levi’s more general theory of rationality. Levi’s views potentially have tremendous implications for economists, especially regarding the current use of game theory. These views are developed in particular in several essays collected in his book The Covenant of Reason, especially “Rationality, prediction and autonomous choice”, “Consequentialism and sequential choice” and “Prediction, deliberation and correlated equilibrium”. The first post presented and discussed Levi’s main thesis that “deliberation crowds out prediction”. This post discusses some implications of this thesis for decision theory and game theory, specifically the equivalence between games in dynamic form and in normal form. On the same basis, the third post will evaluate the relevance of the correlated equilibrium concept for Bayesianism in the context of strategic interactions. The three posts are collected in a single pdf file here.

 

In his article “Consequentialism and sequential choice”, Isaac Levi builds on his “deliberation crowds out prediction” thesis to discuss Peter Hammond’s account of consequentialism in decision theory, presented in the paper “Consequentialist Foundations for Expected Utility”. Hammond contends that consequentialism (to be defined below) implies several properties for decision problems, especially (i) the formal equivalence between decision problems in sequential (or extensive) form and strategic (or normal) form and (ii) ordinality of preferences over options (i.e. acts and consequences). Though Levi and Hammond are essentially concerned with one-person decision problems, the discussion is also relevant from a game-theoretic perspective, as both properties are generally assumed in game theory. This post will focus on point (i).

First, what is consequentialism? Levi distinguishes between three forms: weak consequentialism (WC), strong consequentialism (SC) and Hammond’s consequentialism (HC). According to Levi, while only HC entails point (i), both SC and HC entail point (ii). Levi contends however that none of them is defensible once we take into account the “deliberation crowds out prediction” thesis. We may define these various forms of consequentialism on the basis of the notation introduced in the preceding post. Recall that any decision problem D corresponds to a triple < A, S, C > with A the set of acts (defined as functions from states to consequences), S the set of states of nature and C the set of consequences. A probability distribution over S is defined by the function p(.) representing the decision-maker DM’s subjective beliefs, while a cardinal utility function u(.) defined over C represents DM’s preferences. The definitions of WC and SC are then the following:

Weakly consequentialist representation – A representation of D is weakly consequentialist if, for each a ∈ A, an unconditional utility value u(c) is ascribed to any element c of the subset Ca ⊆ C, where we allow for a ∈ Ca. If a is not the sole element of Ca, then the representation is nontrivially weakly consequentialist.

(WC)   Any decision problem D has a weakly consequentialist representation.

Strongly consequentialist representation – A representation of D is strongly consequentialist if (i) it is nontrivially weakly consequentialist and (ii) given the set of consequence-propositions C, if ca and cb are two identical propositions, then the conjunctions a∧ca and b∧cb are such that u(a∧ca) = u(b∧cb).

(SC)     Any decision problem D has a strongly consequentialist representation.

WC thus holds that it is always possible to represent a decision problem as a set of acts, where an unconditional utility value is ascribed to every consequence each act may lead to, and where an act itself can be counted among its own consequences. As Levi notes, WC formulated this way is indisputable.* SC has been endorsed by Savage and most contemporary decision theorists. The difference with WC lies in the fact that SC maintains a strict separation between acts and consequences: the utility value of any consequence c is independent of the act a that brought it about. SC thus seems to exclude various forms of “procedural” accounts of decision problems. Actually, I am not sure that the contrast between WC and SC is as important as Levi suggests, for all that is required for SC is a sufficiently rich set of consequences C to guarantee the required independence.

According to Levi, HC is stronger than SC. This is due to the fact that while SC does not entail that sequential form and strategic form decision problems are equivalent, HC makes this equivalence its constitutive characteristic. To see this, we have to refine our definition of a decision problem to account for the specificity of the sequential form. A sequential decision problem SD is constituted by a set N of nodes n, with a subset N(D) of decision nodes (where DM makes choices), a subset N(C) of chance nodes (representing uncertainty) and a subset N(T) of terminal nodes. All elements of N(T) are consequence-propositions and therefore we may simply assume that N(T) = C. N(D) is itself partitioned into information sets I, where two nodes n and n’ in the same I are indistinguishable for DM. For each n ∈ N(D), DM has subjective beliefs measured by the probability function p(.|I), which indicates DM’s belief of being at node n given that he knows he is in I. The conditional probabilities p(.|I) are of course generated from the unconditional probabilities p(.) that DM holds at each node n ∈ N(C). The triple < N(D), N(C), N(T) > defines a tree T. Following Levi, I will however simplify the discussion by assuming perfect information and thus N(C) = ∅. Now, define the behavior norm B(T, n) as, for any tree T and any decision node n in T, the set of admissible options (choices) among the options available at that node. Denote T(n) the subtree starting from any decision node n. A strategy (or act) specifies at least one admissible option at every reachable decision node, i.e. B(T(n), n) must be non-empty for each n ∈ N(D). Given that N(T) = C, write C(T(n)) for the subset of consequences (terminal nodes) that are reachable in the subtree T(n), and B[C(T(n)), n] for the set of consequences DM would regard as admissible if all elements of C(T(n)) were directly available at decision node n. B[C(T(n)), n] is therefore the set of admissible consequences in the strategic form equivalent of SD as defined by the (sub)tree T(n). Finally, write φ(T(n)) for the set of admissible consequences in the sequential form decision problem SD. HC is then defined as follows:

(HC)   φ(T(n)) = B[C(T(n)), n]

In words, HC states that the kind of representation (sequential or strategic) of a decision problem SD is irrelevant to the determination of the set of admissible consequences. Moreover, since we have assumed perfect information, it is straightforward that this also holds for admissible acts, which, in sequential form decision problems, correspond to exhaustive plans of action.
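To make these definitions a bit more concrete, here is a minimal Python sketch (my own illustration, not Hammond’s or Levi’s formalism verbatim) of a perfect-information tree in which the behavior norm simply retains the utility-maximizing options; all names (Terminal, Decision, strategic_admissible, sequential_admissible) are hypothetical helpers introduced for the example.

```python
# A minimal sketch, not Hammond's or Levi's formalism verbatim: a perfect-information
# decision tree whose terminal nodes carry utility values, with a behavior norm that
# simply retains the utility-maximizing options.
from dataclasses import dataclass
from typing import Dict, List, Set, Union

@dataclass
class Terminal:
    label: str        # a consequence-proposition c
    utility: float    # u(c)

@dataclass
class Decision:
    options: Dict[str, "Node"]   # options available at this decision node

Node = Union[Terminal, Decision]

def reachable(node: Node) -> List[Terminal]:
    """C(T(n)): the consequences (terminal nodes) reachable in the subtree T(n)."""
    if isinstance(node, Terminal):
        return [node]
    return [c for child in node.options.values() for c in reachable(child)]

def strategic_admissible(node: Node) -> Set[str]:
    """B[C(T(n)), n]: what DM would deem admissible if every reachable consequence
    were directly available at node n (the strategic-form reduction)."""
    cs = reachable(node)
    best = max(c.utility for c in cs)
    return {c.label for c in cs if c.utility == best}

def sequential_admissible(node: Node) -> Set[str]:
    """phi(T(n)): the consequences reached when DM picks an admissible option
    at every decision node of the sequential-form problem (backward induction)."""
    if isinstance(node, Terminal):
        return {node.label}
    def value(child: Node) -> float:
        if isinstance(child, Terminal):
            return child.utility
        return max(value(g) for g in child.options.values())
    best = max(value(child) for child in node.options.values())
    admissible = [child for child in node.options.values() if value(child) == best]
    return set().union(*(sequential_admissible(child) for child in admissible))
```

For a DM who maximizes utility at every node, the two functions return the same set on any such tree, which is exactly what HC asserts; Levi’s example below shows how the equality can break down once DM’s prediction of his own future behavior enters the picture.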

Levi argues that assuming this equivalence is too strong and cannot be an implication of consequentialism. This objection is of course grounded in the “deliberation crowds out prediction” thesis. Consider a DM faced with a decision problem SD with two decision nodes. At node 1, DM has the choice between consuming a drug (a1) or abstaining (b1). If he abstains, the decision problem ends; but if he consumes, he then has the choice at node 2 between continuing to take the drug and becoming an addict (a2) or stopping and avoiding addiction (c2). Suppose that DM’s preferences are such that u(c2) > u(b1) > u(a2). DM’s available acts (or strategies) are therefore (a1, a2), (a1, c2), (b1, a2) and (b1, c2).** Consider the strategic form representation of this decision problem, where DM has to make a choice once and for all regarding the whole decision path. Arguably, the only admissible consequence is c2 and therefore the only admissible act is (a1, c2). Assume however that if DM were to choose a1, he would fall prey to temptation at node 2 and would not be able to refrain from continuing to consume the drug. In other words, at node 2, only option a2 would actually be available. Suppose that DM knows this at node 1.*** A sophisticated DM will then anticipate his inability to resist temptation and will choose to abstain (b1) at node 1. It follows that a sophisticated DM will choose (b1, a2) in the extensive form of SD, thus violating HC (but not SC).
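Using the sketch above, the example can be rendered as follows (the utility numbers are purely illustrative, chosen so that u(c2) > u(b1) > u(a2); the temptation is modelled, on one possible reading of Levi’s point, by pruning option c2 from the tree DM predicts he will actually face at node 2):

```python
# Illustrative utilities: u(c2) = 3 > u(b1) = 2 > u(a2) = 1.
node2_as_planned = Decision({"a2": Terminal("a2", 1), "c2": Terminal("c2", 3)})
node2_under_temptation = Decision({"a2": Terminal("a2", 1)})   # c2 no longer available

planned_tree = Decision({"a1": node2_as_planned, "b1": Terminal("b1", 2)})
predicted_tree = Decision({"a1": node2_under_temptation, "b1": Terminal("b1", 2)})

# Strategic form, computed on the full (planned) tree: only c2 is admissible,
# so the only admissible act is (a1, c2).
print(strategic_admissible(planned_tree))      # {'c2'}

# A sophisticated DM deliberates on the tree he predicts he will actually face:
# backward induction recommends abstaining at node 1, i.e. an act of the form (b1, .).
print(sequential_admissible(predicted_tree))   # {'b1'}
```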

What is implicit behind Levi’s claim is that, while it makes perfect sense for DM to ascribe probabilities (including probability 1 or 0) to his future choices at subsequent decision nodes, this cannot be the case for his current choice over acts, i.e. exhaustive plans of action over the whole decision path. For if it were the case, then as (a1, c2) is the only admissible act, he would have to ascribe probability 0 to all acts but (a1, c2) (recall Levi’s claim 2 in the previous post). But then, that would also imply that only (a1, c2) is feasible (Levi’s claim 1), while this is actually not the case.**** Levi’s point is thus that the choices at node 1 and at node 2 are qualitatively different: at node 1, DM has to deliberate over the implications of choosing the various options at his disposal given his beliefs over what he would do at node 2. In other words, DM’s choice at node 1 requires him to deliberate on the basis of a prediction about his own future behavior. In turn, at node 2, DM’s choice will involve a similar kind of deliberation. The reduction of the extensive form into the strategic form is only possible if one conflates these two choices and thus ignores the asymmetry between deliberation and prediction.

Levi’s argument is also relevant from a game-theoretic perspective, as the current standard view is that a formal equivalence between strategic form and extensive form games holds. This issue is particularly significant for the study of the rationality and epistemic conditions sustaining various solution concepts. A standard assumption in game theory is that the players have knowledge (or full belief) of their own strategy choices. Evaluating the rationality of their choices, both in strategic and extensive form games, however requires determining what the players believe (or would have believed) in counterfactual situations arising from different strategy choices. For instance, it is now well established that common belief in rationality does not entail the backward induction solution in perfect information games or rationalizability in strategic form games. Dealing with these issues necessitates a heavy conceptual apparatus. However, as recently argued by the economist Giacomo Bonanno, not viewing one’s own strategy choices as objects of belief or knowledge allows for an easier study of extensive-form games that avoids dealing with counterfactuals. Beyond the technical considerations, if one subscribes to the “deliberation crowds out prediction” thesis, this is an alternative path worth exploring.

Notes

* Note that this has far-reaching implications for moral philosophy and ethics, as moral decision problems are a strict subset of decision problems: all moral decision problems can be represented within a weakly consequentialist frame.

** Acts (b1, a2) and (b1, c2) are of course equivalent in terms of consequences, as DM will never actually have to make a choice at node 2. Still, in some cases it is essential to determine what DM would do in counterfactual scenarios in order to evaluate his rationality.

*** Alternatively, we may suppose that DM has at node 1 a probabilistic belief over his ability to resist temptation at node 2. This can be implemented simply by adding a chance node before node 1 that determines the utility values of the augmented set of consequences and/or the options available at node 2, and by assuming that DM ignores the result of the chance move.

**** I think that Levi’s example is not fully convincing, however. One may argue that since action c2 is assumed to be unavailable at node 2, acts (a1, c2) and (b1, c2) should also be regarded as unavailable. The resulting reduced version of the strategic form decision problem would then lead to the same result as the sequential form. This does not change even if we assume that DM is uncertain regarding his ability to resist temptation (see the preceding note): the resulting expected utilities of acts would trivially lead to the same result in the strategic and in the sequential forms. Contrary to what Levi argues, it is not clear that this would violate HC.

Isaac Levi on Rationality, Deliberation and Prediction (1/3)

This is the first of a three-part post on the philosopher Isaac Levi’s account of the relationship between deliberation and prediction in decision theory, an account which is an essential part of Levi’s more general theory of rationality. Levi’s views potentially have tremendous implications for economists, especially regarding the current use of game theory. These views are developed in particular in several essays collected in his book The Covenant of Reason, especially “Rationality, prediction and autonomous choice”, “Consequentialism and sequential choice” and “Prediction, deliberation and correlated equilibrium”. The first post presents and discusses Levi’s main thesis that “deliberation crowds out prediction”. The next two posts will discuss some implications of this thesis for decision theory and game theory, specifically (i) the equivalence between games in dynamic form and in normal form and (ii) the relevance of the correlated equilibrium concept for Bayesianism in the context of strategic interactions. The three posts are collected in a single pdf file here.

The determination of principles of rational choice has been the main subject of decision theory since its early development at the beginning of the 20th century. From its beginnings, decision theory has pursued two different and somewhat conflicting goals: on the one hand, to describe and explain how people actually make choices; on the other hand, to determine how people should make choices and what choices they should make. While the former goal corresponds to what can be called “positive decision theory”, the latter is constitutive of “normative decision theory”. Most decision theorists, especially the proponents of “Bayesian” decision theory, have agreed that decision theory cannot but be at least partially normative. Indeed, while today Bayesian decision theory is generally not regarded as an accurate account of how individuals actually make choices, most decision theorists remain convinced that it is still relevant as a normative theory of rational decision-making. It is in this context that Isaac Levi’s claim that “deliberation crowds out prediction” should be discussed.

In this post, I will confine the discussion to the restrictive framework of Bayesian decision theory, though Levi’s account applies more generally to any form of decision theory that adheres to consequentialism. Consequentialism will be more fully discussed in the second post of this series. Consider any decision problem D in which an agent DM has to make a choice over a set of options whose consequences are not necessarily known for sure. Bayesians will generally model D as a triple < A, S, C > where A is the set of acts a, S the set of states of nature s and C the set of consequences c. In the most general form of Bayesian decision theory, any a, s and c may be regarded as a proposition to which truth-values might be assigned. In Savage’s specific version of Bayesian decision theory, acts are conceived as functions from states to consequences, i.e. a: S → C or c = a(s). In this framework, it is useful to see acts as DM’s objects of choice, i.e. the elements over which he has direct control, while states may be interpreted as all the features of D over which DM has no direct control. Consequences are simply the result of the combination of an act (chosen by DM) and a state (not chosen by DM). Still following Savage, it is standard to assume that DM has (subjective) beliefs over which state s actually holds. These beliefs are captured by a probability function p(.) with ∑_s p(s) = 1 for a finite state space. Moreover, each consequence c is assigned a utility value u(c) representing DM’s preferences over consequences. A Bayesian DM will then choose the act that maximizes his expected utility given his subjective beliefs and his preferences, i.e.

max_a Eu(a) = ∑_s p(s|a)·u(a(s)) = ∑_s p(s|a)·u(c).
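As a purely illustrative sketch of this maximization (the two-act, two-state problem and all the numbers below are made up for the example):

```python
# Two acts, two states, act-dependent state probabilities p(s|a) and
# utilities u(a(s)) over the resulting consequences. Numbers are illustrative.
acts, states = ["a", "b"], ["s1", "s2"]

p = {("s1", "a"): 0.8, ("s2", "a"): 0.2,     # p(s|a)
     ("s1", "b"): 0.5, ("s2", "b"): 0.5}     # p(s|b)

u = {("a", "s1"): 10, ("a", "s2"): 0,        # u(a(s))
     ("b", "s1"): 6,  ("b", "s2"): 6}

def expected_utility(act: str) -> float:
    return sum(p[(s, act)] * u[(act, s)] for s in states)

best = max(acts, key=expected_utility)
print({a: expected_utility(a) for a in acts}, "->", best)   # {'a': 8.0, 'b': 6.0} -> a
```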

Two things are worth noting. First, the probabilities that enter into the expected utility computation are conditional probabilities of states given acts. We should indeed allow for the possibility that the probabilities of states depend on the act performed. The nature of the relationship between states and acts represented by these conditional probabilities is the main point of contention between causal and evidential decision theorists. Second, as is well known, in Savage’s version of Bayesian decision theory we start from a complete ordering representing DM’s preferences over acts and, given a set of axioms, it is shown that we can derive a unique probability function p(.) and a cardinal utility function u(.) unique up to positive affine transformation. It is important to recognize that Savage’s account is essentially behaviorist: it merely shows that, given that DM’s preferences and beliefs satisfy certain properties, his choice can be represented as the maximization of some function with certain uniqueness properties. Not all Bayesian decision theorists share Savage’s behaviorist commitment.

I have just stated that in Savage’s account, DM ascribes probabilities to states, utilities to consequences and hence expected utilities to acts. However, if acts, states and consequences are all understood as propositions (as argued by Richard Jeffrey and Levi, among others), then there is nothing in principle prohibiting the ascription of utilities to states and of probabilities to both consequences and acts. It is this last possibility (ascribing probabilities to acts) that is the focus of Levi’s claim that deliberation crowds out prediction. In particular, does it make sense for DM to have unconditional probabilities over the set A? How could such probabilities be interpreted from the perspective of DM’s deliberation in D? If we take a third-person perspective, ascribing probabilities to DM’s objects of choice seems not particularly contentious. It makes perfect sense for me to say, for instance, “I believe that you will start smoking again before the end of the month with probability p”. Ascribing probabilities to others’ choices is an essential part of our daily activity of predicting others’ choices. Moreover, probability ascription may be a way to explain and rationalize others’ behavior. The point of course is that these are my probabilities, not yours. The issue here is whether a deliberating agent has to, or even can, ascribe such probabilities to his own actions, acknowledging that such probabilities are in any case not relevant in the expected utility computation.

Levi has been (with Wolfgang Spohn) the most forceful opponent of such a possibility. He basically claims that the principles of rationality that underlie any theory of decision-making (including Bayesian ones) cannot at the same time serve as explanatory and predictive tools and as normative principles guiding rational behavior. In other words, insofar as the deliberating agent is using rationality principles to make the best choice, he cannot at the same time use these principles to predict his own behavior at the very moment he is making his choice.* This is the essence of the “deliberation crowds out prediction” slogan. To understand Levi’s position, it is necessary to delve into some of the technical details underlying the general argument. A paper by the philosopher Wlodek Rabinowicz does a great job of reconstructing this argument (see also this paper by James Joyce). A crucial premise is that, following de Finetti, Levi considers belief ascription as fully constituted by the elicitation of betting rates: DM’s belief over some event E is determined by, and corresponds to, what DM would consider the fair price of a gamble that pays off if E obtains.** Consider this example: I propose that you pay y$ (the cost or price of the bet) to participate in the following bet: if Spain wins the Olympic gold medal in basketball at Rio this month, I pay you back your y$ plus x$; otherwise I pay you nothing. Thus x is the net gain of the bet and x+y is called the stake of the bet. Now, the fair price y*$ of the bet corresponds to the amount for which you are indifferent between taking and not taking the bet. Suppose that x = 100 and that y* = 5. Your betting rate for this gamble is then y*/(x+y*) = 5/105 ≈ 0.048, i.e. you believe that Spain will win with probability less than 0.05. This is the traditional way beliefs are determined in Bayesian decision theory. Now, Levi’s argument is that such a procedure cannot be applied to beliefs over acts on pain of inconsistency. The argument relies on two claims:

(1)       If DM is certain that he will not perform some action a, then a is not regarded as part of the feasible acts by DM.

(2)       If DM assigns probabilities to acts, then he must assign probability 0 to acts he regards as inadmissible, i.e. which do not maximize expected utility.

Together, (1) and (2) entail that all feasible acts (those figuring in the set A) are admissible (i.e. maximize expected utility), in which case deliberation is unnecessary for DM. If that is the case, however, it means that principles of rationality cannot be used as normative principles in the deliberation process. While claim (1) is relatively transparent (even if it is disputable), claim (2) is less straightforward. Consider therefore the following illustration.

DM has a choice between two feasible acts a and b with Eu(a) > Eu(b), i.e. only a is admissible. Suppose that DM assigns probabilities p(a) and p(b) according to the procedure presented above. We present DM with a fair bet B on a where the price is y* and the stake is x+y*. As the bet is fair, y* is the fair price and y*/(x+y*) = p(a) is the betting rate measuring DM’s belief. Now, DM has four feasible options:

Take the bet and choose a (B&a)

Do not take the bet and choose a (notB&a)

Take the bet and choose b (B&b)

Do not take the bet and choose b (notB&b)

As taking the bet and choosing a guarantees a sure (net) gain of x for DM, it is easy to see that B&a strictly dominates notB&a. Similarly, as taking the bet and choosing b guarantees a sure loss of y*, notB&b strictly dominates B&b. The choice is therefore between B&a and notB&b, and clearly Eu(a) + x > Eu(b). DM is thus certain to take the bet and to win it, so the bet is worth its full stake to him: the fair price for B is y* = x + y* and hence p(a) = y*/(x+y*) = 1 and p(b) = 1 – p(a) = 0. The inadmissible option b has probability 0 and is thus regarded as unfeasible by DM (claim 1). No deliberation is needed: if DM predicts his own choice, only a is regarded as feasible.
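Here is a numerical rendering of this dominance argument, with made-up values; it restates the reasoning above rather than adding anything to it:

```python
# Illustrative values: Eu(a) = 10, Eu(b) = 4, and a bet on a with stake = x + y
# (pay the price y upfront; receive the stake back if a is performed, nothing otherwise).
Eu_a, Eu_b, stake = 10.0, 4.0, 105.0

def value(option: str, y: float) -> float:
    return {
        "B&a":    Eu_a + (stake - y),   # the bet is won for sure: net gain x = stake - y
        "notB&a": Eu_a,
        "B&b":    Eu_b - y,             # the bet is lost for sure: the price y is forfeited
        "notB&b": Eu_b,
    }[option]

# Whatever the price y (up to the stake), B&a dominates notB&a and notB&b dominates B&b:
for y in (5.0, 50.0, 104.9):
    assert value("B&a", y) >= value("notB&a", y)
    assert value("notB&b", y) > value("B&b", y)

# Being certain to win, DM is willing to pay any price up to the full stake,
# so the elicited fair price is y* = stake and the betting rate equals 1:
y_star = stake
print(y_star / stake)   # 1.0  -> p(a) = 1, p(b) = 0
```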

Levi’s argument is by no means indisputable, and the papers by Rabinowicz and Joyce referred to above do a great job of exposing its weaknesses. In the next two posts, I will however take it for granted and discuss some of its implications for decision theory and game theory.

 Notes

* As I will discuss in the second post, Levi considers that there is nothing contradictory or problematic in the assumption that one may be able to predict his future choices.

** A gamble’s fair price is the price at which DM is indifferent between buying the bet and selling the bet.

A Short Note on Newcomb’s and Meta-Newcomb’s Paradoxes

[Update: As I suspected, the original computations were false. This has been corrected with a new and more straightforward result!]

For some reason, I have been thinking about the famous Newcomb’s paradox and I have come up with a “solution”, though I cannot tell whether it has already been proposed in the vast literature on the topic. The basic idea is that a consistent Bayesian decision-maker should have a subjective belief over the nature of the “Oracle” which, in the original statement of the paradox, is deemed to predict perfectly your choice of taking either one or two boxes. In particular, one has to set a probability for the event that the Oracle is truly omniscient, i.e. that he is able to foresee your choice. Another, more philosophical way to state the problem is for the decision-maker to assign a probability to Determinism being true (i.e. the Oracle is omniscient) versus the Free Will hypothesis being true (i.e. the Oracle cannot predict your choice).

Consider the following table depicting the decision problem corresponding to Newcomb’s paradox:

[Table: the payoff matrix of the Newcomb decision problem]

Here, p denotes the probability that the Oracle guesses that you will pick One Box (and thus puts 1 000 000$ in the opaque box), under the assumption that the Free Will hypothesis is true. Of course, as it is traditionally stated, Newcomb’s paradox implies that p is a conditional probability (p = 1 if you choose One Box, p = 0 if you choose Two Boxes), but this is the case only in the event that Determinism is true. If the Free Will hypothesis is true, then p is an unconditional probability, as argued by causal decision theorists.

Denote s the probability of the event “Determinism” and 1 – s the resulting probability of the event “Free Will”. It is rational for the Bayesian decision-maker to choose One Box if his expected gain from taking one box, g(1B), is higher than his expected gain from taking two boxes, g(2B), hence if

s > 1/1000.

Interestingly, One Box is the correct choice even if one puts a very small probability on Determinism being the correct hypothesis. Note that this threshold is independent of the value of p. If one has observed a sufficient number of trials where the Oracle has made the correct guess, then one has strong reasons to choose One Box, even if one endorses causal decision theory!
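For completeness, here is a quick numerical check of this threshold and of its independence from p; it is a sketch assuming the standard payoffs of 1 000 000$ in the opaque box and 1 000$ in the transparent one, which is how I read the table above:

```python
# Expected gains under the two hypotheses, weighted by s (assumed payoffs:
# 1_000_000$ in the opaque box, 1_000$ in the transparent one).
def g_one_box(s: float, p: float) -> float:
    return s * 1_000_000 + (1 - s) * p * 1_000_000

def g_two_boxes(s: float, p: float) -> float:
    return s * 1_000 + (1 - s) * (p * 1_000_000 + 1_000)

# The difference g(1B) - g(2B) simplifies to s*1_000_000 - 1_000: the p-terms cancel,
# so the 1/1000 threshold does not depend on p.
for p in (0.0, 0.3, 1.0):
    assert g_one_box(0.002, p) > g_two_boxes(0.002, p)      # s just above 1/1000
    assert g_one_box(0.0005, p) < g_two_boxes(0.0005, p)    # s below 1/1000
```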

Now consider the less well-known “Meta-Newcomb paradox” proposed by the philosopher Nick Bostrom. Bostrom introduces the paradox in the following way:

There are two boxes in front of you and you are asked to choose between taking only box B or taking both box A and box B. Box A contains $ 1,000. Box B will contain either nothing or $ 1,000,000. What B will contain is (or will be) determined by Predictor, who has an excellent track record of predicting your choices. There are two possibilities. Either Predictor has already made his move by predicting your choice and putting a million dollars in B iff he predicted that you will take only B (like in the standard Newcomb problem); or else Predictor has not yet made his move but will wait and observe what box you choose and then put a million dollars in B iff you take only B. In cases like this, Predictor makes his move before the subject roughly half of the time. However, there is a Metapredictor, who has an excellent track record of predicting Predictor’s choices as well as your own. You know all this. Metapredictor informs you of the following truth functional: Either you choose A and B, and Predictor will make his move after you make your choice; or else you choose only B, and Predictor has already made his choice. Now, what do you choose?

Bostrom argues that this leads to a conundrum for the causal decision theorist:

If you think you will choose two boxes then you have reason to think that your choice will causally influence what’s in the boxes, and hence that you ought to take only one box. But if you think you will take only one box then you should think that your choice will not affect the contents, and thus you would be led back to the decision to take both boxes; and so on ad infinitum.

The point is that if you believe the Meta-oracle, then by choosing Two Boxes you have good reasons to think that your choice will causally influence the “guess” of the Oracle (he will not put 1 000 000$ in the opaque box) and therefore, by causal decision theory, you have to choose One Box. However, if you believe the Meta-oracle, then by choosing One Box you have good reasons to think that your choice will not causally influence the guess of the Oracle. In this case, causal decision theory recommends choosing Two Boxes, as in the standard Newcomb’s paradox.

The above reasoning seems to work for the Meta-Newcomb paradox as well, even though the computations are slightly more complicated. The following tree represents the decision problem if the Determinism hypothesis is true:

[Figure: decision tree under the Determinism hypothesis]

Here, “Before” and “After” denote the events where the Oracle predicts your choice and observes your choice, respectively. The green path and the red path in the tree correspond to the truth functional stated by the Meta-oracle. The second tree depicts the decision problem if the Free Will hypothesis is true.

[Figure: decision tree under the Free Will hypothesis]

It is similar to the first one except for small but important differences: in case the Oracle predicts your choice (i.e. he makes his guess before you choose), your payoff depends on the (subjective) probability p that he makes the right guess; moreover, the Oracle is now a genuine player in an imperfect information game, with q the decision-maker’s belief that the Oracle has already made his choice (note that if Determinism is true, q is irrelevant for exactly the same reason as probability p in Newcomb’s paradox). Here, the green and red paths depict the decision-maker’s best responses.

Assume in the latter case that q = ½, as suggested in Bostrom’s statement of the problem. Denote s the probability that Determinism is true and thus that the Meta-oracle as well as the Oracle are omniscient. I will spare you the computations, but (if I have not made mistakes) it can be shown that it is optimal for the Bayesian decision-maker to choose One Box whenever s ≥ 0. Without fixing q, we have s > 1 – 999/(1000q). Therefore, even if you are a causal decision theorist and you strongly believe in Free Will, you should play as if you believed in Determinism!