Chapter 1 Utility Theory

It is not the purpose here to develop the concept of utility completely or with the most generality. Nor are the results derived from the most primitive set of assumptions. All of this can be found in standard textbooks on microeconomics or game theory. Here rigor is tempered with an eye for simplicity of presentation.

1.1 Utility Functions and Preference Orderings

A utility function is not presumed as a primitive in economic theory. What is assumed is that each consumer can “value” various possible bundles of consumption goods in terms of his own subjective preferences.

For concreteness it is assumed that there are $n$ goods. Exactly what the “goods” are or what distinguishes separate goods is irrelevant for the development. The reader should have some general idea that will suffice to convey the intuition of the development. Typically, goods are different consumption items such as wheat or corn. Formally, we often label as distinct goods the consumption of the same physical good at different times or in different states of nature. Usually, in finance we lump all physical goods together into a single consumption commodity distinguished only by the time and state of nature when it is consumed. Quite often this latter distinction is also ignored. This assumption does little damage to the effects of concern in finance, namely, the tradeoffs over time and risk.

The $n$-vector $x$ denotes a particular bundle or complex of $x_i$ units of good $i$. Each consumer selects his consumption $x$ from a particular set $X$. We shall always take this set to be convex and closed.

Preferences are described by the preordering relation $\succsim$.
The statement
$$x \succsim z \tag{1.1}$$
is read “$x$ is weakly preferred to $z$” or “$x$ is as good as $z$.” This preordering also induces the related concepts of strict preference $\succ$ and indifference $\sim$, which are defined as
$$x \succ z \ \text{ if } x \succsim z \text{ but not } z \succsim x, \qquad x \sim z \ \text{ if } x \succsim z \text{ and } z \succsim x, \tag{1.2}$$
and read as “$x$ is (strictly) preferred to $z$” and “$x$ is equivalent to $z$.”

The following properties of the preordering are assumed.

Axiom 1 (completeness) For every pair of vectors $x \in X$ and $z \in X$ either $x \succsim z$ or $z \succsim x$.

Axiom 2 (reflexivity) For every vector $x \in X$, $x \succsim x$.

Axiom 3 (transitivity) If $x \succsim y$ and $y \succsim z$, then $x \succsim z$.

Axioms are supposed to be self-evident truths. The reflexivity axiom certainly is. The completeness axiom would also appear to be; however, when choices are made under uncertainty, many commonly used preference functions do not provide complete orderings over all possible choices. (See, for example, the discussion of the St. Petersburg paradox later in this chapter.) The transitivity axiom also seems intuitive, although among certain choices, each with many distinct attributes, we could imagine comparisons that were not transitive, as illustrated by Arrow’s famous voter paradox. This issue does not loom large in finance, where comparisons are most often one dimensional.

Unfortunately, these three axioms are insufficient to guarantee the existence of an ordinal utility function, which describes the preferences in the preordering relation. An ordinal utility function is a function $\Upsilon$ from $X$ into the real numbers with the properties
$$\Upsilon(x) > \Upsilon(z) \Leftrightarrow x \succ z, \tag{1.3a}$$
$$\Upsilon(x) = \Upsilon(z) \Leftrightarrow x \sim z. \tag{1.3b}$$

Examples of preorderings that satisfy these three axioms but do not admit representation by an ordinal utility function are the lexicographic preorderings. Under these preferences the relative importance of certain goods is immeasurable. For example, for $n = 2$ a lexicographic preordering is $x \succ z$ if either $x_1 > z_1$, or $x_1 = z_1$ and $x_2 > z_2$.
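The two-good lexicographic rule just stated can be written as a short comparison. This is only an illustrative sketch: the function names `lex_strictly_prefers` and `lex_weakly_prefers` and the sample bundles are assumptions introduced here, not part of the text.

```python
def lex_strictly_prefers(x, z):
    """True if bundle x is strictly preferred to z under the two-good
    lexicographic rule: x1 > z1, or x1 == z1 and x2 > z2."""
    return x[0] > z[0] or (x[0] == z[0] and x[1] > z[1])


def lex_weakly_prefers(x, z):
    """True if x is weakly preferred to z: strictly preferred, or identical."""
    return lex_strictly_prefers(x, z) or tuple(x) == tuple(z)


# A bundle with slightly more of good 1 wins, whatever the amounts of good 2.
print(lex_strictly_prefers((1.0, 0.0), (0.999, 1e6)))
# With equal amounts of good 1, good 2 decides the comparison.
print(lex_strictly_prefers((1.0, 2.0), (1.0, 1.0)))
```

Any two bundles can be compared this way, so the ordering is complete even though, as the text notes, no ordinal utility function represents it.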
The complexes $x$ and $z$ are equivalent only if $x = z$. In this case the first good is immeasurably more important than the second, since no amount of the latter can make up for a shortfall in the former.

To guarantee the existence of a utility function, we require a fourth axiom. The one generally adopted is the continuity axiom because it also guarantees that the utility function is continuous.

Axiom 4 (continuity) For every $x \in X$, the two subsets of all strictly preferred and all strictly worse complexes are both open.

The lexicographic preordering does not satisfy this axiom because the set of complexes strictly preferred to $(x_1^*, x_2^*)$ includes the boundary points with $x_1 = x_1^*$ and $x_2 > x_2^*$, so it is not open. Continuity of the utility function is guaranteed by Axiom 4 because the openness of the preferred and inferior sets requires that it take on all values close to $\Upsilon(x^*)$ in a neighborhood of $x^*$.

With just these four axioms the existence of a continuous ordinal utility function over $X$ consistent with the preordering can be demonstrated. Here we simply state the result. The interested reader is referred to texts such as Introduction to Equilibrium Analysis by Hildenbrand and Kirman and Games and Decisions by Luce and Raiffa.

Theorem 1 For any preordering satisfying Axioms 1-4 defined over a closed, convex set of complexes $X$, there exists a continuous utility function $\Upsilon$ mapping $X$ into the real line with the property in (3).

1.2 Properties of Ordinal Utility Functions

The derived utility function is an ordinal one and, apart from the continuity guaranteed by the continuity axiom (Axiom 4), contains no more information than the preordering relation as indicated in (3). No meaning can be attached to the utility level other than that inherent in the “greater than” relation in arithmetic. It is not correct to say $x$ is twice as good as $z$ if $\Upsilon(x) = 2\Upsilon(z)$.
Likewise, the conclusion that $x$ is more of an improvement over $y$ than the latter is over $z$ because $\Upsilon(x) - \Upsilon(y) > \Upsilon(y) - \Upsilon(z)$ is also faulty. In this respect if a particular utility function $\Upsilon(x)$ is a valid representation of some preordering, then so is $\Phi(x) \equiv \theta[\Upsilon(x)]$, where $\theta(\cdot)$ is any strictly increasing function. We shall later introduce cardinal utility functions for which this is not true.

To proceed further to the development of consumer demand theory, we now assume that

Assumption 1 The function $\Upsilon(x)$ is twice differentiable, increasing, and strictly concave.

This assumption guarantees that all of the first partial derivatives are positive everywhere, except possibly at the upper boundaries of the feasible set. Therefore, a marginal increase in income can always be profitably spent on any good to increase utility. The assumption of strict concavity guarantees that the indifference surfaces, defined later, are strictly concave upwards. That is, the set of all complexes preferred to a given complex must be strictly convex. This property is used in showing that a consumer’s optimal choice is unique.

The differentiability assumption is again one of technical convenience. It does forbid, for example, the strict complementarity utility function $\Upsilon(x_1, x_2) = \min(x_1, x_2)$, which is not differentiable whenever $x_1 = x_2$. (See Figure 1.2.) On the other hand, it allows us to employ the very useful concept of marginal utility.

A utility function can be characterized by its indifference surfaces (see Figures 1.1 and 1.2). These capture all that is relevant in a given preordering but are invariant to any strictly increasing transformation. An indifference surface is the set of all complexes of equal utility; that is, $\{x \in X \mid x \sim x^\circ\}$ or $\{x \in X \mid \Upsilon(x) = \Upsilon^\circ\}$. The directional slopes of the indifference surface are given by the marginal rates of (commodity) substitution. Using the implicit function theorem gives
$$-\left.\frac{dx_i}{dx_j}\right|_{\Upsilon} = \frac{\partial\Upsilon/\partial x_j}{\partial\Upsilon/\partial x_i} \equiv \frac{\Upsilon_j}{\Upsilon_i}. \tag{1.4}$$
For the equivalent utility function $\Phi = \theta[\Upsilon]$ the indifference surfaces are the same since
$$-\left.\frac{dx_i}{dx_j}\right|_{\Phi} = \frac{\theta'\Upsilon_j}{\theta'\Upsilon_i} = -\left.\frac{dx_i}{dx_j}\right|_{\Upsilon}. \tag{1.5}$$

[Figure 1.1 Indifference Curves. Axes: Quantity of Good One ($x_1$), Quantity of Good Two ($x_2$); an arrow marks Increasing Utility.]

[Figure 1.2 Strict Complements. Axes: Quantity of Good One ($x_1$), Quantity of Good Two ($x_2$); an arrow marks Increasing Utility.]

1.3 Properties of Some Commonly Used Ordinal Utility Functions

An important simplifying property of utility is preferential independence. Two subsets of goods are preferentially independent of their complements if the conditional preference ordering when varying the amounts of the goods in each subset does not depend on the allocation in the complement. In other words, for $x$ partitioned as $(y, z)$,
$$[(y_1, z_0) \succsim (y_2, z_0)] \Rightarrow [(y_1, z) \succsim (y_2, z)] \quad \text{for all } y_1, y_2, z, \tag{1.6a}$$
$$[(y_0, z_1) \succsim (y_0, z_2)] \Rightarrow [(y, z_1) \succsim (y, z_2)] \quad \text{for all } y, z_1, z_2. \tag{1.6b}$$

Preferential independence can also be stated in terms of marginal rates of substitution. Two subsets with more than one good each are preferentially independent if the marginal rates of substitution within each subset (and the indifference curves) do not depend upon the allocation in the complement.

A preferentially independent utility function can be written as a monotone transform of an additive form
$$\Upsilon(x) = \theta[a(y) + b(z)] \tag{1.7}$$
since marginal rates of substitution are
$$-\left.\frac{dy_i}{dy_j}\right|_{\Upsilon} = \frac{\theta'(\cdot)\,\partial a/\partial y_j}{\theta'(\cdot)\,\partial a/\partial y_i}, \tag{1.8}$$
which are independent of $z$; we can write a similar expression for the marginal rates of substitution within the subset $z$.

If a utility function over three or more goods is mutually preferentially independent (all subsets are preferentially independent), then the utility function has a completely additively separable or additive form
$$\Upsilon(x) = \sum \upsilon_i(x_i). \tag{1.9}$$
Mutual preferential independence is obviously a necessary condition for additivity as well.
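The claim in (1.8), that marginal rates of substitution within $y$ do not depend on $z$ or on the transform $\theta$, can be checked numerically. The sketch below assumes a hypothetical utility with $a(y) = \log y_1 + 2\log y_2$, $b(z) = \sqrt{z}$, and $\theta = \exp$; these functional forms and the function names are illustrative choices, not taken from the text.

```python
import math


def utility(y1, y2, z, theta=math.exp):
    """Hypothetical preferentially independent utility theta[a(y) + b(z)]
    with a(y) = log(y1) + 2*log(y2) and b(z) = sqrt(z)."""
    return theta(math.log(y1) + 2.0 * math.log(y2) + math.sqrt(z))


def mrs_y1_y2(y1, y2, z, theta=math.exp, h=1e-6):
    """Marginal rate of substitution -dy1/dy2 = (dU/dy2) / (dU/dy1),
    estimated with central finite differences."""
    u1 = (utility(y1 + h, y2, z, theta) - utility(y1 - h, y2, z, theta)) / (2 * h)
    u2 = (utility(y1, y2 + h, z, theta) - utility(y1, y2 - h, z, theta)) / (2 * h)
    return u2 / u1


# Same (y1, y2), different allocations z in the complement.
for z in (0.5, 4.0, 9.0):
    print(z, mrs_y1_y2(2.0, 3.0, z))

# A different increasing transform theta leaves the rate unchanged as well.
print(mrs_y1_y2(2.0, 3.0, 4.0, theta=lambda s: s ** 3))
```

For this assumed form the analytic rate is $2y_1/y_2$, so every printed value should be close to $4/3$ up to finite-difference error.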
One commonly used form of additive utility function is a sum of power functions. These utility functions can all be written as
$$\Upsilon(x) = \frac{\sum a_i x_i^{\gamma}}{\gamma}. \tag{1.10}$$
These include linear ($\gamma = 1$) and log-linear ($\gamma = 0$), $\Upsilon(x) = \sum a_i \log(x_i)$, as limiting cases. An equivalent representation for the log-linear utility function is
$$\Phi(x) = \exp[\Upsilon(x)] = \prod x_i^{a_i}. \tag{1.11}$$
This is the commonly employed Cobb-Douglas utility function.

Each of these utility functions is homothetic as well as additive. That is, the marginal rates of substitution depend only on the relative allocation among the goods. For linear utility the marginal rates of substitution are constant:
$$-\left.\frac{dx_i}{dx_j}\right|_{\Upsilon} = \frac{a_j}{a_i}. \tag{1.12}$$
Log-linear utility is characterized by constant proportionate rates of substitution
$$-\left.\frac{dx_i/x_i}{dx_j/x_j}\right|_{\Upsilon} = \frac{a_j}{a_i}. \tag{1.13}$$
In general, the marginal rates of substitution are
$$-\left.\frac{dx_i}{dx_j}\right|_{\Upsilon} = \frac{a_j}{a_i}\left(\frac{x_j}{x_i}\right)^{\gamma - 1}, \tag{1.14}$$
which depends only on the ratio of consumption of the two goods.

For $\gamma = 1$ the goods are perfect substitutes, and the indifference curves are straight lines. Another special case is $\gamma \to -\infty$, in which the marginal rate of substitution $-dx_i/dx_j$ is infinite if $x_i > x_j$ and zero for $x_i < x_j$. This is the strict complementarity utility function mentioned earlier.

In the intermediate case $\gamma < 0$, the indifference surfaces are bounded away from the axes. For a level of utility $\Upsilon_0$ the indifference surface is asymptotic to $x_i = (\gamma\Upsilon_0/a_i)^{1/\gamma}$. Thus, a limit on the availability of any single good places an upper bound on utility. For $0 < \gamma < 1$ the indifference surfaces cut each axis at the point $x_i = (\gamma\Upsilon_0/a_i)^{1/\gamma}$, which follows from setting all other goods to zero in (10), and for $\gamma = 0$ the indifference surfaces are asymptotic to the axes so no single good can limit utility.

1.4 The Consumer’s Allocation Problem

The standard problem of consumer choice is for consumers to choose the most preferred complex in the feasible set.
In terms of utility functions they are to maximize utility subject to their budget constraints. Formally, for a fixed vector $p$ of prices and a given feasible set, the consumer with wealth $W$ and utility function $\Upsilon$ solves the problem
$$\max \Upsilon(x) \quad \text{subject to } p'x \le W. \tag{1.15}$$
In analyzing this problem we assume that the feasible set satisfies the following description.

Assumption 2 The set $X$ is closed, convex, and bounded below by $x^l$. It is unbounded above, so that if $x^0 \in X$, then $x \in X$ when $x \ge x^0$. $X$ contains the null vector.

Because $X$ contains the null vector a feasible solution to (15) exists for $p > 0$ and $W > 0$. Since it is unbounded above, the optimum will not be constrained by the scarcity of any good, and since it is bounded below the consumer cannot sell unlimited quantities of certain “goods” (e.g., labor services) to finance unlimited purchases of other goods.

Theorem 2 Under Assumptions 1 and 2 the consumer’s allocation problem possesses a unique, slack-free solution $x^*$ for any positive price vector, $p > 0$, and positive wealth.

Proof. The existence of a solution is guaranteed because the physically available set $X$ is closed and bounded below by assumption, $x_i \ge x_i^l$. The set $E$ of economically feasible complexes is bounded above. (Each $x_i \le (W - \sum_{j \ne i} x_j^l p_j)/p_i$.) Thus, the set of feasible complexes $F$, the intersection of $X$ and $E$, is closed and bounded; that is, it is compact. Furthermore, it is not empty since $x = 0$ is an element. Since $\Upsilon(x)$ is continuous, it must attain a maximum on the set $F$.

If the solution is not slack free, $p'x^* < W$, then the slack $s \equiv W - p'x^*$ can be allocated in a new complex $x \equiv x^* + \delta$, where $\delta_i = s/(np_i) > 0$. This new complex is feasible since $p'x = W$, and it must be preferred to $x^*$ (since $x > x^*$), so $x^*$ cannot be the optimum.

Now suppose that the optimum is not unique. If there is a second optimum $x^0 \ne x^*$, then $\Upsilon(x^0) = \Upsilon(x^*)$.
But $F$ is convex since $X$ is, so $x = (x^0 + x^*)/2$ is in $F$, and, by strict concavity, $\Upsilon(x) > \Upsilon(x^*)$, which is a violation. Q.E.D.

Note that existence did not require Assumption 1 apart from continuity of $\Upsilon$, which can be derived from the axioms. The absence of slack required the utility function to be increasing. Uniqueness used, in addition, the strict concavity.

To determine the solution to the consumer’s problem we form the Lagrangian $L \equiv \Upsilon(x) + \lambda(W - p'x)$. Since we know the optimal solution has no slack we can impose an equality constraint. The first-order conditions are
$$\frac{\partial L}{\partial x} = \frac{\partial\Upsilon}{\partial x} - \lambda p = 0, \tag{1.16a}$$
$$\frac{\partial L}{\partial\lambda} = W - p'x = 0. \tag{1.16b}$$
The second-order conditions on the bordered Hessian are satisfied since $\Upsilon$ is concave. The result in (16a) can be reexpressed as
$$\frac{\partial\Upsilon/\partial x_i}{\partial\Upsilon/\partial x_j} = \frac{p_i}{p_j}, \tag{1.17}$$
or, from (4), the marginal rates of substitution are equal to the price ratios, $-dx_i/dx_j = p_j/p_i$; that is, the slope of the indifference surface equals the negative of the price ratio. This optimum is illustrated in Figure 1.3. The consumer holds that combination of goods that places him on his highest indifference curve and is affordable (on his budget line).

[Figure 1.3 Consumer’s Maximization Problem. Axes: Quantity of Good One ($x_1$), Quantity of Good Two ($x_2$); labels: Budget Line, Optimum.]

1.5 Analyzing Consumer Demand

The solution to the consumer’s allocation problem can be expressed as a series of demand functions, which together make up the demand correspondence:
$$x_i^* = \phi_i(p, W). \tag{1.18}$$
Under the assumption that the utility function is twice differentiable, the demand functions can be analyzed as follows.

Take the total differential of the system in (16) with respect to $x$, $p$, $\lambda$, and $W$:
$$0 = \frac{\partial^2\Upsilon}{\partial x\,\partial x'}\,dx - p\,d\lambda - \lambda\,dp, \tag{1.19a}$$
$$0 = dW - p'dx - dp'x. \tag{1.19b}$$
This can be written as
$$\begin{pmatrix} H(\Upsilon) & -p \\ -p' & 0 \end{pmatrix} \begin{pmatrix} dx^* \\ d\lambda \end{pmatrix} = \begin{pmatrix} \lambda\,dp \\ -dW + dp'x^* \end{pmatrix}. \tag{1.20}$$
The matrix on the left-hand side of (20) will be recognized as the bordered Hessian of the maximization problem.
Since $\Upsilon$ is concave, we know that we have a maximum and the second-order condition is met, so the determinant of the bordered Hessian is negative. Therefore, it has an inverse which is
$$\begin{pmatrix} H^{-1}(I - \eta pp'H^{-1}) & -\eta H^{-1}p \\ -\eta p'H^{-1} & -\eta \end{pmatrix}, \tag{1.21}$$
where $\eta \equiv (p'H^{-1}p)^{-1}$. The solution to (20) is
$$dx^* = \lambda H^{-1}(I - \eta pp'H^{-1})\,dp + \eta H^{-1}p\,(dW - dp'x^*), \tag{1.22}$$
so the partial derivatives can be expressed as
$$\left.\frac{\partial x^*}{\partial p_i}\right|_{W} = \left[\lambda H^{-1}(I - \eta pp'H^{-1}) - \eta H^{-1}px^{*\prime}\right]\iota_i, \tag{1.23a}$$
$$\left.\frac{\partial x^*}{\partial W}\right|_{p} = \eta H^{-1}p, \tag{1.23b}$$
where $\iota_i$ is the $i$th column of the identity matrix. Substituting (23b) into (23a) gives
$$\left.\frac{\partial x^*}{\partial p_i}\right|_{W} = \lambda H^{-1}(I - \eta pp'H^{-1})\iota_i - x_i^*\left.\frac{\partial x^*}{\partial W}\right|_{p}. \tag{1.24}$$

Equation (24) is the Slutsky equation. The first term measures the substitution effect, and the second term is the income effect. The former can be given the following interpretation. Consider a simultaneous change in $p_i$ and $W$ to leave utility unchanged. Then
$$0 = d\Upsilon = \frac{\partial\Upsilon}{\partial x'}\,dx = \lambda p'dx, \tag{1.25}$$
where the last equality follows from (16a). Substituting (25) into (19b) gives $dW = dp'x$, which, when substituted into (22), gives
$$\left.\frac{\partial x^*}{\partial p_i}\right|_{\Upsilon} = \lambda H^{-1}(I - \eta pp'H^{-1})\iota_i, \tag{1.26}$$
the first term in (24). The Slutsky equation can now be written in its standard form
$$\left.\frac{\partial x^*}{\partial p_i}\right|_{W} = \left.\frac{\partial x^*}{\partial p_i}\right|_{\Upsilon} - x_i^*\left.\frac{\partial x^*}{\partial W}\right|_{p}. \tag{1.27}$$

The direct substitution effect $\partial x_i^*/\partial p_i|_{\Upsilon}$ must be negative because the bordered Hessian and, hence, its inverse are negative definite. The income effect can be of either sign. If $\partial x_i^*/\partial W < 0$, then good $i$ is an inferior good and the income effect is positive. If the good is sufficiently inferior so that the income effect dominates the substitution effect and $\partial x_i^*/\partial p_i > 0$, holding $W$ fixed, then $i$ is a Giffen good. If the cross substitution effect, $\partial x_i^*/\partial p_j$ holding $\Upsilon$ constant, is positive, the goods are substitutes. Otherwise they are complements.

The income and substitution effects are illustrated in Figure 1.4.
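The matrix expressions (1.23b), (1.24), and (1.26) can be evaluated directly for a concrete case. The sketch below assumes a two-good utility $\alpha\log x + (1-\alpha)\log z$ with illustrative parameter values; the function name `slutsky_pieces` and the numbers are assumptions made here for the check, not part of the derivation.

```python
def slutsky_pieces(alpha=0.3, px=2.0, pz=5.0, W=100.0):
    """Evaluate (1.23b), (1.26), and (1.24) for the assumed utility
    alpha*log(x) + (1 - alpha)*log(z); returns effects of a change in px."""
    # Demands and multiplier implied by the first-order conditions (1.16).
    x = alpha * W / px
    z = (1.0 - alpha) * W / pz
    lam = alpha / (x * px)

    # The Hessian is diagonal here, so H^{-1} is stored as its diagonal.
    h_inv = (-x * x / alpha, -z * z / (1.0 - alpha))
    p = (px, pz)

    # eta = (p' H^{-1} p)^{-1}; income effects are eta * H^{-1} p, as in (1.23b).
    eta = 1.0 / sum(p[k] * h_inv[k] * p[k] for k in range(2))
    income = [eta * h_inv[j] * p[j] for j in range(2)]

    # First column of lam * H^{-1} (I - eta p p' H^{-1}), as in (1.26):
    # the response of (x, z) to px with utility held constant.
    subst = [
        lam * h_inv[j] * ((1.0 if j == 0 else 0.0) - eta * p[j] * p[0] * h_inv[0])
        for j in range(2)
    ]

    # Slutsky equation (1.24): total effect = substitution - x * income effect.
    total = [subst[j] - x * income[j] for j in range(2)]
    return {"x": x, "z": z, "income": income, "subst": subst, "total": total}


r = slutsky_pieces()
print(r["subst"], r["income"], r["total"])
```

For this assumed utility the demand for $x$ is $\alpha W/p_x$, so the total own-price effect should match $-\alpha W/p_x^2$, and the demand for $z$ should not respond to $p_x$ at all.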
An increase in the price of good 1 from $p_1$ to $p_1'$ will rotate the budget line clockwise. The substitution effect, a movement from points A to B, involves increased consumption of good 2 and decreased consumption of good 1 at the same utility. The income effect is the movement from points B to C. In the illustrated case, good 1 is a normal good, and its consumption decreases.

[Figure 1.4 Income and Substitution Effects. Axes: $x_1$ and $x_2$; labeled points A, B, and C.]

1.6 Solving a Specific Problem

Consider an investor with the log-linear utility function $\Upsilon(x, z) = \alpha\log(x) + (1 - \alpha)\log(z)$. The first-order conditions from (16a) are
$$\frac{\alpha}{x} = \lambda p_x, \qquad \frac{1 - \alpha}{z} = \lambda p_z \tag{1.28}$$
with solutions
$$x^* = \frac{\alpha W}{p_x}, \qquad z^* = \frac{(1 - \alpha)W}{p_z}. \tag{1.29}$$
The income effect is
$$\frac{\partial x^*}{\partial W} = \frac{\alpha}{p_x} > 0. \tag{1.30}$$
The substitution effect is
$$\left.\frac{\partial x^*}{\partial p_x}\right|_{W} + x^*\left.\frac{\partial x^*}{\partial W}\right|_{p} = \frac{x^*}{p_x}(\alpha - 1) < 0. \tag{1.31}$$
The cross substitution effect is
$$\left.\frac{\partial x^*}{\partial p_z}\right|_{W} + z^*\left.\frac{\partial x^*}{\partial W}\right|_{p} = \frac{\alpha z^*}{p_x} > 0. \tag{1.32}$$
Similar results hold for $z^*$.

1.7 Expected Utility Maximization

We now extend the concept of utility maximization to cover situations involving risk. We assume throughout the discussion that the economic agents making the decisions know the true objective probabilities of the relevant events. This is not in the tradition of using subjective probabilities but will suffice for our purposes. The consumers will now be choosing among “lotteries” described generically by their payoffs $(x_1, \ldots, x_m)$ and respective probabilities $\pi = (\pi_1, \ldots, \pi_m)'$.

Axioms 1-4 are still to be considered as governing choices among the various payoffs. We also now assume that there is a preordering on the set of lotteries that satisfies the following axioms:

Axiom 1a (completeness) For every pair of lotteries either $L_1 \succsim L_2$ or $L_2 \succsim L_1$. (The strict preference and indifference relations are defined as before.)

Axiom 2a (reflexivity) For every lottery $L \succsim L$.

Axiom 3a (transitivity) If $L_1 \succsim L_2$ and $L_2 \succsim L_3$, then $L_1 \succsim L_3$.
These axioms are equivalent to those used before and have the same intuition. With them it can be demonstrated that each agent’s choices are consistent with an ordinal utility function defined over lotteries or an ordinal utility functional defined over probability distributions of payoffs.

The next three axioms are used to develop the concept of choice through the maximization of the expectation of a cardinal utility function over payoff complexes.

Axiom 5 (independence) Let $L_1 = \{(x_1, \ldots, x_v, \ldots, x_m), \pi\}$ and $L_2 = \{(x_1, \ldots, z, \ldots, x_m), \pi\}$. If $x_v \sim z$, then $L_1 \sim L_2$. $z$ may be either a complex or another lottery. If $z$ is a lottery $z = \{(x_1^v, x_2^v, \ldots, x_n^v), \pi^v\}$, then
$$L_1 \sim L_2 \sim \{(x_1, \ldots, x_{v-1}, x_1^v, \ldots, x_n^v, x_{v+1}, \ldots, x_m), (\pi_1, \ldots, \pi_{v-1}, \pi_v\pi_1^v, \ldots, \pi_v\pi_n^v, \pi_{v+1}, \ldots, \pi_m)'\}.$$

In Axiom 5 it is important that the probabilities be interpreted correctly. $\pi_i^v$ is the probability of getting $x_i^v$ conditional on outcome $v$ having been selected by the first lottery. $\pi_v\pi_i^v$ is the unconditional probability of getting $x_i^v$. This axiom asserts that only the utility of the final payoff matters. The exact mechanism for its award is irrelevant. If two complexes (or subsequent lotteries) are equally satisfying, then they are also considered equivalent as lottery prizes. In addition, there is no thrill or aversion towards suspense or gambling per se. The importance of this axiom is discussed later.

Axiom 6 (continuity) If $x_1 \succsim x_2 \succsim x_3$, then there exists a probability $\pi$, $0 \le \pi \le 1$, such that $x_2 \sim \{(x_1, x_3), (\pi, 1 - \pi)'\}$. The probability is unique unless $x_1 \sim x_3$.

Axiom 7 (dominance) Let $L_1 = \{(x_1, x_2), (\pi_1, 1 - \pi_1)'\}$ and $L_2 = \{(x_1, x_2), (\pi_2, 1 - \pi_2)'\}$. If $x_1 \succ x_2$, then $L_1 \succ L_2$ if and only if $\pi_1 > \pi_2$.

Theorem 3 Under Axioms 1-7 the choice made by a decision maker faced with selecting between two (or more) lotteries will be the one with the higher (highest) expected utility.
That is, the choice maximizes $\sum \pi_i\Psi(x_i)$, where $\Psi$ is a particular cardinal utility function.

Proof. (We show only the proof for two alternatives.) Let the two lotteries be $L_1 = \{(x_1^1, \ldots, x_m^1), \pi^1\}$ and $L_2 = \{(x_1^2, \ldots, x_v^2), \pi^2\}$ and suppose the prizes are ordered so that $x_1^i \succsim \cdots \succsim x_v^i$. Choose $x^h$ so that it is preferred to both $x_1^1$ and $x_1^2$ and choose $x^l$ so that both $x_m^1$ and $x_v^2$ are preferred to it. For each $x_j^i$ compute $q_j^i$ such that $x_j^i \sim \{(x^h, x^l), (q_j^i, 1 - q_j^i)'\}$. This can be done by Axiom 6. By Axiom 5, $L_i \sim \{(x^h, x^l), (Q_i, 1 - Q_i)'\}$, where $Q_i \equiv \sum q_j^i\pi_j^i$. By Axiom 7, $L_1 \succ L_2$ if and only if $Q_1 > Q_2$. The only thing remaining is to show that $q_j^i$ is a valid utility measure for $x_j^i$. This is so because, by Axiom 7, $x_j^i \succ x_k^i$ if and only if $q_j^i > q_k^i$, so $q$ is increasing, and, by Axiom 6, it is continuous. Q.E.D.

The utility function just introduced is usually called a von Neumann-Morgenstern utility function, after its originators. It has the properties of any ordinal utility function, but, in addition, it is a “cardinal” measure. That is, unlike ordinal utility the numerical value of utility has a precise meaning (up to a scaling) beyond the simple rank of the numbers.

This can be easily demonstrated as follows. Suppose there is a single good. Now compare a lottery paying 0 or 9 units with equal probability to one giving 4 units for sure. Under the utility function $\Upsilon(x) = x$, the former, with an expected utility of 4.5, would be preferred. But if we apply the increasing transformation $\theta(s) = \sqrt{s}$ the lottery has an expected utility of 1.5, whereas the certain payoff’s utility is 2. These two rankings are contradictory, so arbitrary monotone transformations of cardinal utility functions do not preserve ordering over lotteries.

Cardinality has been introduced primarily by Axioms 6 and 7. Suppose $x^h$ and $x^l$ are assigned utilities of 1 and 0.
By Axiom 6 any other intermediate payoff $x$ is equivalent in utility to some simple lottery paying $x^h$ or $x^l$. This lottery has expected utility of $q \cdot 1 + 0 = q$. But, by Axiom 7, $q$ ranks outcomes; that is, it is at least an ordinal utility function. Finally, by construction, $\Psi(x) = q$; no other value is possible. Thus, a von Neumann-Morgenstern utility function has only two degrees of freedom: the numbers assigned to $\Psi(x^h)$ and $\Psi(x^l)$. All other numerical values are determined. Alternatively, we can say that utility is determined up to a positive linear transformation. $\Psi(x)$ and $a + b\Psi(x)$, with $b > 0$, are equivalent utility functions.

Von Neumann-Morgenstern utility functions are often called measurable rather than cardinal. Neither use is precise from a mathematician’s viewpoint.

1.8 Cardinal and Ordinal Utility

Each cardinal utility function embodies a specific ordinal utility function. Since the latter are distinct only up to a monotone transformation, two very different cardinal utility functions may have the same ordinal properties. Thus, two consumers who always make the same choice under certainty may choose among lotteries differently.

For example, the two Cobb-Douglas utility functions $\Psi_1(x, z) = \sqrt{xz}$ and $\Psi_2 = -1/(xz)$ are equivalent for ordinal purposes since $\Psi_2 = -\Psi_1^{-2}$. Faced by choosing between $(2, 2)$ for sure or a 50-50 chance at $(4, 4)$ or $(1, 1)$, the first consumer will select the lottery with an expected utility of $5/2$ over the sure thing with utility of 2. In the same situation the second consumer will select the safe alternative with utility $-1/4$ over the lottery with expected utility $-17/32$.

As mentioned previously, preferences can still be expressed by using ordinal utility; however, the domain of the ordinal utility functional is the set of lotteries. Let $\pi(x)$ denote the probability density for a particular lottery, and let $\Psi(x)$ be the cardinal utility function of a consumer.
Then his ordinal utility functional over lotteries is
$$Q[\pi(x)] = E[\Psi(x)] \equiv \int \Psi(x)\pi(x)\,dx. \tag{1.33}$$

When all lottery payoffs come from some common family of distributions, it is often possible to express this ordinal functional for lotteries as an ordinal utility function defined over parameters of the distributions of the lotteries. For example, if there is a single good, the consumer’s cardinal utility function is $\Psi(x) = x^{\gamma}/\gamma$, and the payoff from each lottery is lognormally distributed [i.e., $\ln(x)$ is $N(\mu, \sigma^2)$], then
$$Q[\pi_i(x)] = \frac{1}{\gamma}\exp\left[\gamma\mu_i + \frac{\gamma^2\sigma_i^2}{2}\right] \equiv \Upsilon(\mu_i, \sigma_i), \tag{1.34}$$
where $\mu_i$ and $\sigma_i$ now are the “goods.” Since this is an ordinal utility function, choices can be expressed equivalently by
$$\Phi(\mu, \sigma) \equiv \frac{\ln[\gamma\Upsilon(\mu, \sigma)]}{\gamma} = \mu + \frac{\gamma\sigma^2}{2}. \tag{1.35}$$
Utility functions like this are often called derived utility functions. As discussed in Chapter 4, derived mean-variance utility functions have an important role in finance.

1.9 The Independence Axiom

Axiom 5 is called the independence axiom because it asserts that the utility of a lottery is independent of the mechanism of its award. The independence asserted here is a form of preferential independence, a concept discussed previously. It should not be confused with utility independence (for cardinal utility functions), introduced later.

To see the relation to preferential independence, consider a simple lottery with two outcomes $x_1$ and $x_2$ with probabilities $\pi_1$ and $\pi_2$. Now consider a set ($Y$) of $2n$ “goods.” Goods $y_1$ through $y_n$ denote quantities of goods $x_1$ through $x_n$ under outcome 1, and goods $y_{n+1}$ through $y_{2n}$ denote quantities of goods $x_1$ through $x_n$ under outcome 2. The lottery payoffs are $y_1' = (x_1', 0')$ and $y_2' = (0', x_2')$. The expected utility of this lottery is
$$E[\Psi(\tilde{x})] = \pi_1\Psi(x_1) + \pi_2\Psi(x_2) = v_1(y_1) + v_2(y_2) \equiv \Upsilon(y). \tag{1.36}$$
In this latter form it is clear that the ordinal representation displays additivity, which guarantees preferential independence [see Equation (9)].
In the formulation just described, the ordinal utility functions $v_i$ depend on the lottery (through its probabilities). It is possible to express choices in a general fashion by using a utility functional as outlined in the previous section. Only the additive (integral) representation (or monotone transformations) displays preferential independence. It is possible to construct other functionals which satisfy the other six axioms. These, of course, are not representable as expectations of cardinal utility functions.

Consider a set of lotteries, with payoffs in a single good, described by the density functions $\pi_i(x)$. The functional
$$Q[\pi(x)] \equiv \frac{1}{2}\int x\pi(x)\,dx + \frac{1}{2}\left[\int \sqrt{x}\,\pi(x)\,dx\right]^2 = \frac{1}{2}E[x] + \frac{1}{2}E^2[\sqrt{x}] \tag{1.37}$$
defines a unique number for any lottery (provided only that its payoffs are nonnegative and its mean payoff exists), and therefore Axioms 1a, 2a, and 3a are clearly satisfied. Axioms 6 and 7 are also satisfied, since for any lottery with only two distinct payoffs (at fixed levels) the functional $Q$ is continuously increasing in the probability of the higher payoff.

Axiom 5 is not valid. Note first that the utility value for a sure thing is numerically equal to its payoff, $Q[\delta(x - x_0)] = x_0$, where $\delta$ denotes a unit probability mass at $x_0$. Now consider three lotteries: $L_1$, equal chances at 4 or 0; $L_2$, equal chances at $32/3$ and 0; and $L_3$, a one-fourth chance at $32/3$ and a three-fourths chance at 0. Using (37) gives
$$Q[L_1] = \frac{1}{2}\cdot 2 + \frac{1}{2}\left[\frac{1}{2}\sqrt{4}\right]^2 = \frac{3}{2}, \qquad Q[L_2] = \frac{1}{2}\cdot\frac{32}{6} + \frac{1}{2}\left[\frac{1}{2}\sqrt{\frac{32}{3}}\right]^2 = 4, \qquad Q[L_3] = \frac{1}{2}\cdot\frac{8}{3} + \frac{1}{2}\left[\frac{1}{4}\sqrt{\frac{32}{3}}\right]^2 = \frac{5}{3}. \tag{1.38}$$
Thus, $L_3$ is preferred to $L_1$. But if Axiom 5 were valid, the first lottery would be equivalent to $\{L_2, 0; \frac{1}{2}, \frac{1}{2}\}$, because $L_2$ has the same utility value as a sure payoff of 4. Furthermore, this last lottery would be equivalent to $\{\frac{32}{3}, 0; \frac{1}{4}, \frac{3}{4}\}$. But this is just lottery 3, which we know to be preferred. Therefore Axiom 5 cannot be valid, and a payoff cannot be assigned a cardinal level of utility which is independent of the mechanism of its determination.
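The arithmetic in (1.38) is easy to reproduce. The following sketch evaluates the functional (1.37) for discrete lotteries; the function name `Q` mirrors the text, while the representation of a lottery as a list of (payoff, probability) pairs is an assumption made for this illustration.

```python
import math


def Q(lottery):
    """Functional (1.37) for a discrete lottery given as (payoff, probability)
    pairs: half the mean payoff plus half the squared mean of sqrt(payoff)."""
    mean = sum(prob * x for x, prob in lottery)
    mean_sqrt = sum(prob * math.sqrt(x) for x, prob in lottery)
    return 0.5 * mean + 0.5 * mean_sqrt ** 2


L1 = [(4.0, 0.5), (0.0, 0.5)]           # equal chances at 4 or 0
L2 = [(32.0 / 3.0, 0.5), (0.0, 0.5)]    # equal chances at 32/3 or 0
L3 = [(32.0 / 3.0, 0.25), (0.0, 0.75)]  # L1 with the prize 4 replaced by L2
sure_4 = [(4.0, 1.0)]

print(Q(L1), Q(L2), Q(L3), Q(sure_4))
```

Up to floating-point rounding, `Q(L2)` equals `Q(sure_4)`, yet swapping the sure 4 in `L1` for `L2` changes the value from `Q(L1)` to `Q(L3)`, which is the violation of Axiom 5 described above.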
Machina has demonstrated that utility functionals of this type have most of the properties of von Neumann-Morgenstern utility functions. Many of the properties of the single-period investment problems examined in this book hold with “Machina” preferences that do not satisfy the independence axiom. Multiperiod problems will have different properties.

1.10 Utility Independence

As with ordinal utility, independence of some choices is an important simplifying property for expected utility maximization as well. A subset of goods is utility independent of its complement subset when the conditional preference ordering over all lotteries with fixed payoffs of the complement goods does not depend on this fixed payoff.

If the subset $y$ of goods is utility independent of its complement $z$, then the cardinal utility function has the form
$$\Psi(y, z) = a(z) + b(z)c(y). \tag{1.39}$$
Note that this form is identical to that given in (7) for preferential independence; however, in (7) ordinal utility was described so any monotone transformation would be permitted. Thus, the utility function $\psi(y, z) = \theta[a(z) + b(z)c(y)]$ displays preferential independence but not utility independence for any nonlinear function $\theta$. As with preferential independence, utility independence is not symmetric.

If symmetry is imposed and it is assumed that the goods are mutually utility independent, then it can be shown that the utility function can be represented as
$$\Psi(x) = k^{-1}\left(\exp\left[k\sum k_i\psi_i(x_i)\right] - 1\right) \tag{1.40}$$
with the restriction $k_i > 0$. Each $\psi_i$ is a valid (univariate) cardinal utility function for marginal decisions involving good $i$ alone. An alternative representation is
$$\Psi(x) = \sum k_i\psi_i(x_i), \tag{1.41a}$$
$$\Psi(x) = \prod \phi_i(x_i), \tag{1.41b}$$
$$\Psi(x) = -\prod[-\phi_i(x_i)], \tag{1.41c}$$
corresponding to zero, positive, or negative $k$. Again the $\phi_i$ are univariate utility functions related to the functions $\psi_i$. In (41b), the $\phi_i$ are uniformly positive, and in (41c), the $\phi_i$ are uniformly negative.
If the utility function has the additive form, then the preference ordering among lotteries depends only on the marginal probability distributions of the goods. The converse is also true. For multiplicative utility this simple result does not hold. As an illustration consider the lotteries $L_1 = \{(x_h, z_h)', (x_l, z_l)', (.5, .5)'\}$ and $L_2 = \{(x_h, z_l)', (x_l, z_h)', (.5, .5)'\}$. For additive utility functions both lotteries have the same expected utility:
$$E\Psi = .5\left(\psi_1(x_h) + \psi_1(x_l) + \psi_2(z_h) + \psi_2(z_l)\right). \tag{1.42}$$
For the multiplicative form, expected utilities are not equal:
$$.5\left[\phi_1(x_h)\phi_2(z_h) + \phi_1(x_l)\phi_2(z_l)\right] \ne .5\left[\phi_1(x_h)\phi_2(z_l) + \phi_1(x_l)\phi_2(z_h)\right]. \tag{1.43}$$

1.11 Utility of Wealth

Thus far we have measured outcomes in terms of a bundle of consumption goods. In financial problems it is more common to express outcomes in monetary values with utility measuring the satisfaction associated with a particular level of wealth.

[Figure 1.5 Derived Utility of Wealth Function. Labels: Wealth (Good 1 as Numeraire), Utility of Wealth, $x_1$, $x_2$, $\Psi_1$, $\Psi_2$, $\Psi_3$.]

If there is a single consumption good, then this can be the numeraire for wealth, and the preceding analysis is valid. If there are a number of consumption goods, then utility can be expressed as a function of wealth and the vector of consumption good prices as
$$U(W; p) = \max\{\Psi(x);\ \text{subject to } p'x = W\}. \tag{1.44}$$
This construction is illustrated in Figure 1.5, where good 1 is used as numeraire. If the utility function depends only on wealth (and not on relative prices), then utility of wealth is said to be state independent. Otherwise it is state dependent.

If the utility function $\Psi(x)$ is increasing and $p > 0$, then the utility of wealth is strictly increasing. And, if differentiable, $U'(W) > 0$.

1.12 Risk Aversion

The aversion to risk on the part of economic decision makers is a common assumption and one that we shall adopt throughout this book unless explicitly stated otherwise. What exactly do we mean by risk aversion?
The answer to this question depends on its context. A decision maker with a von Neumann-Morgenstern utility function is said to be risk averse (at a particular wealth level) if he is unwilling to accept any actuarially fair and immediately resolved gamble with only wealth consequences, that is, one that leaves consumption good prices unchanged. If the decision maker is risk averse at all (relevant) wealth levels, he is globally risk averse. For state-independent utility of wealth, the utility function is risk averse at $W$ if $U(W) > E[U(W + \tilde{\varepsilon})]$ for all gambles with $E(\tilde{\varepsilon}) = 0$ and positive dispersion. If this relation holds at all levels of wealth, the utility function is globally risk averse.

Theorem 4 A decision maker is (globally) risk averse, as just defined, if and only if his von Neumann-Morgenstern utility function of wealth is strictly concave at the relevant (all) wealth levels.

Proof. Let $\tilde{\varepsilon}$ denote the outcome of a generic gamble. If it is actuarially fair, then $E(\tilde{\varepsilon}) = 0$. By Jensen’s inequality, if $U(\cdot)$ is strictly concave at $W$, then

$$E[U(W + \tilde{\varepsilon})] < U(E[W + \tilde{\varepsilon}]) = U(W). \tag{1.45}$$

Thus, higher expected utility results from avoiding every gamble. To prove necessity, consider the simple gamble $\varepsilon = \lambda a$ with probability $1 - \lambda$ and $\varepsilon = -(1 - \lambda)a$ with probability $\lambda$. This gamble is fair, so by assumption it is disliked:

$$U(W) > \lambda U[W - (1 - \lambda)a] + (1 - \lambda)U(W + \lambda a). \tag{1.46}$$

Equation (46) must hold for all pairs $a$ and $\lambda$ (provided $0 \le \lambda \le 1$ and both outcomes are in the domain of $U$), but since $W = \lambda[W - (1 - \lambda)a] + (1 - \lambda)(W + \lambda a)$ for any $a$ and $\lambda$, $U$ is concave. The proof for global risk aversion is the same, point by point. Q.E.D.

If a utility function is twice differentiable, then it is concave, representing risk-averse choices, if and only if $U''(W) < 0$. To induce a risk-averse individual to undertake a fair gamble, a compensatory risk premium would have to be offered, making the package actuarially favorable.
Similarly, to avoid a present gamble a risk-averse individual would be willing to pay an insurance risk premium. These two premiums are closely related but not identical. They are the solutions to

$$E[U(W + \Pi_c + \tilde{\varepsilon})] = U(W), \tag{1.47a}$$

$$E[U(W + \tilde{\varepsilon})] = U(W - \Pi_i). \tag{1.47b}$$

The latter risk premium is the one more commonly used in economic analysis. It corresponds in an obvious way to a casualty or liability insurance premium. In financial problems, however, the compensatory risk premium is more useful. It corresponds to the “extra” return expected on riskier assets. The quantity $W - \Pi_i$ is also known as the certainty equivalent of the gamble $W + \tilde{\varepsilon}$, since it is that certain amount which provides the same expected utility.

If the risk is small and the utility function is sufficiently smooth, the two risk premiums are nearly equal. When these assumptions are met, the risk premium can be determined approximately as follows. Using Taylor expansions with Lagrange remainders for (47b) gives

$$E\left[U(W) + \tilde{\varepsilon}U'(W) + \tfrac{1}{2}\tilde{\varepsilon}^2 U''(W) + \tfrac{1}{6}\tilde{\varepsilon}^3 U'''(W + \alpha\tilde{\varepsilon})\right] = U(W) - \Pi_i U'(W) + \tfrac{1}{2}\Pi_i^2 U''(W - \beta\Pi_i),$$

$$\tfrac{1}{2}\mathrm{Var}(\tilde{\varepsilon})U''(W) \approx -\Pi_i U'(W),$$

$$\Pi_i \approx \tfrac{1}{2}\left[-\frac{U''(W)}{U'(W)}\right]\mathrm{Var}(\tilde{\varepsilon}). \tag{1.48}$$

Sufficient conditions for this approximation to be accurate are that $U''$ and the support of $\tilde{\varepsilon}$ are bounded.

From (48) it is clear that the term in brackets is an appropriate measure of infinitesimal or local risk aversion. It is known as the Arrow-Pratt absolute risk-aversion function, $A(W) \equiv -U''(W)/U'(W)$. This measure incorporates everything important in a utility function but is free from the arbitrary scaling factors. To recover the utility function, we note that $A(W) = -d[\log U'(W)]/dW$, so integrating twice gives

$$\int^W \exp\left[-\int^z A(x)\,dx\right]dz = a + bU(W) \tag{1.49}$$

with $b > 0$.
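The quality of the approximation (48) can be illustrated with a short numerical sketch in Python. Everything specific here is an assumption made for illustration: log utility, initial wealth of 100, a fair gamble of plus or minus 10 with equal probability, and the helper names. Bisection is simply one convenient way to solve (47a) and (47b); it is not a method prescribed by the text.

```python
import math


def expected_utility(u, wealth, outcomes, probs):
    """E[U(W + eps)] for a discrete gamble."""
    return sum(p * u(wealth + e) for e, p in zip(outcomes, probs))


def bisect(f, lo, hi, iterations=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def insurance_premium(u, wealth, outcomes, probs):
    """Pi_i solving (47b): E[U(W + eps)] = U(W - Pi_i)."""
    target = expected_utility(u, wealth, outcomes, probs)
    # A risk-averse agent pays between zero and the largest possible loss.
    return bisect(lambda pi: u(wealth - pi) - target, 0.0, -min(outcomes))


def compensatory_premium(u, wealth, outcomes, probs):
    """Pi_c solving (47a): E[U(W + Pi_c + eps)] = U(W)."""
    target = u(wealth)
    return bisect(
        lambda pi: expected_utility(u, wealth + pi, outcomes, probs) - target,
        0.0,
        -min(outcomes),
    )


# Assumed example: log utility, W = 100, fair gamble of +/- 10.
W, x = 100.0, 10.0
outcomes, probs = (x, -x), (0.5, 0.5)
variance = sum(p * e**2 for e, p in zip(outcomes, probs))  # the mean is zero
arrow_pratt = 1.0 / W  # A(W) = -U''(W)/U'(W) = 1/W for log utility

pi_i = insurance_premium(math.log, W, outcomes, probs)
pi_c = compensatory_premium(math.log, W, outcomes, probs)
approx = 0.5 * arrow_pratt * variance
print(pi_i, pi_c, approx)
```

For this particular example the premiums also have closed forms, $\Pi_i = W - \sqrt{W^2 - x^2} \approx 0.5013$ and $\Pi_c = \sqrt{W^2 + x^2} - W \approx 0.4988$, which bracket the approximate value 0.5 from (48). This is consistent with the two premiums being nearly equal but not identical.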
Two related measures are the risk-tolerance function and the relative (or proportional) risk-aversion function

$$T(W) \equiv \frac{1}{A(W)}, \qquad R(W) \equiv WA(W). \tag{1.50}$$

The latter is useful in analyzing risks expressed as a proportion of the gamble, for example, investment rates of return.

1.13 Some Useful Utility Functions

The HARA (hyperbolic absolute risk aversion) or LRT (linear risk tolerance) class of utility functions is most commonly used. Utility functions in this class are

$$U(W) = \frac{1 - \gamma}{\gamma}\left(\frac{aW}{1 - \gamma} + b\right)^{\gamma}, \qquad b > 0. \tag{1.51}$$

This utility function is defined over the domain $b + aW/(1 - \gamma) > 0$. That is, there is a lower bound on wealth for $\gamma < 1$ and an upper bound for $\gamma > 1$. (Note that for integer $\gamma$’s greater than 1 the function would be defined for wealth levels above the upper bound, but marginal utility would be negative.) The absolute risk-tolerance function is

$$T(W) = \frac{1}{A(W)} = \frac{W}{1 - \gamma} + \frac{b}{a}, \tag{1.52}$$

which is linear, as the name suggests. Risk tolerance is obviously increasing or decreasing as $\gamma < 1$ or $\gamma > 1$. Risk aversion is therefore decreasing for $\gamma < 1$ and increasing for $\gamma > 1$.

Special cases of LRT utility are linear (risk neutral) with $\gamma = 1$, quadratic with $\gamma = 2$, negative exponential utility with $\gamma = -\infty$ and $b = 1$, isoelastic or power utility with $b = 0$ and $\gamma < 1$, and logarithmic with $b = \gamma = 0$.

The negative exponential utility functions

$$U(W) = -e^{-aW} \tag{1.53}$$

have constant absolute risk aversion $A(W) = a$. The power utility functions

$$U(W) = \frac{W^{\gamma}}{\gamma} \tag{1.54}$$

have constant relative risk aversion $R(W) = 1 - \gamma$ and, therefore, decreasing absolute risk aversion.

Log utility also displays constant relative risk aversion; specifically, $R(W) = 1$. It obviously corresponds to $\gamma = 0$ in (54) in this respect. This can be verified for the equivalent utility function $(W^{\gamma} - 1)/\gamma$. For $\gamma = 0$ this function is not defined, but using L’Hospital’s rule gives

$$\lim_{\gamma \to 0} \frac{W^{\gamma} - 1}{\gamma} = \lim_{\gamma \to 0} \frac{W^{\gamma}\log W}{1} = \log W. \tag{1.55}$$

1.14 Comparing Risk Aversion

If one individual is more hesitant to take on risk than another, then it is natural to say that he is more risk averse. To be more precise, one investor is clearly more risk averse than another if he always chooses a safe investment whenever the other does. Theorem 5 gives four equivalent definitions of greater risk aversion.

Theorem 5 The following four conditions are equivalent:

$$A_k(W) > A_j(W) \quad \text{all } W, \tag{1.56a}$$

$$\exists\, G \text{ with } G' > 0,\ G'' < 0 \text{ such that } U_k(W) = G[U_j(W)], \tag{1.56b}$$

$$\Pi_k > \Pi_j \quad \text{all } W \text{ and all gambles}, \tag{1.56c}$$

$$U_k[U_j^{-1}(t)] \text{ is concave}. \tag{1.56d}$$

Proof. (a)⇒(b) Since both $U_k$ and $U_j$ are strictly increasing and twice differentiable, there exists a function $G$ with positive first derivative such that $U_k = G(U_j)$. Differentiating twice gives

$$U_k' = G'U_j', \qquad U_k'' = G'U_j'' + G''(U_j')^2. \tag{1.57}$$

Combining and solving for $G''$ gives

$$G'' = -\frac{G'}{U_j'}(A_k - A_j) < 0. \tag{1.58}$$

(b)⇒(c) Using Jensen’s inequality, we get

$$U_k(W - \Pi_k) \equiv E[U_k(W + \tilde{\varepsilon})] = E(G[U_j(W + \tilde{\varepsilon})]) < G(E[U_j(W + \tilde{\varepsilon})]) \equiv G[U_j(W - \Pi_j)] \equiv U_k(W - \Pi_j). \tag{1.59}$$

Now $\Pi_k > \Pi_j$ since $U_k$ is strictly increasing.

(c)⇒(a) Consider the simple gamble $\tilde{\varepsilon} = \pm x$ with equal probability. In the limit for small $x$,

$$\Pi_i \approx \frac{A_i(W)x^2}{2}. \tag{1.60}$$

Therefore if $\Pi_k > \Pi_j$, then $A_k > A_j$.

(a)⇒(d) Define $f(t) \equiv U_k[U_j^{-1}(t)]$. Differentiating twice gives (recall that for $x = g^{-1}(z)$, $dg^{-1}(z)/dz = 1/g'(x)$)

$$f''(t) = (U_j')^{-2}U_k'\left(\frac{U_k''}{U_k'} - \frac{U_j''}{U_j'}\right) = (U_j')^{-2}U_k'(A_j - A_k). \tag{1.61}$$

Therefore, $f$ is strictly concave if and only if $A_k > A_j$. Q.E.D.

A similar theorem is true if the relative risk-aversion function is used in (56a). A proof is immediate since $A_k(W) > A_j(W)$ if and only if $R_k(W) > R_j(W)$.

If an individual is more (less) absolutely or relatively risk averse at higher wealth levels, then he or she displays increasing (decreasing) absolute or relative risk aversion. For any utility function $U(W)$ define a related utility function $\hat{U}(W; x) \equiv U(W + x)$.
Then for $x > 0$, $U$ is increasingly risk averse if $\hat{A}(W) > A(W)$.

1.15 Higher-Order Derivatives of the Utility Function

If an individual is decreasingly risk averse, then $U''' > 0$. Since

$$A' = -(U')^{-2}\left[U'U''' - (U'')^2\right], \tag{1.62}$$

$A'$ can only be negative if $U''' > 0$.

Usually, no further assumptions about derivatives of the utility function are made, and even this last assumption is not common. However, if investors are consistent in their first $m$ preferences (each of the first $m$ derivatives of $U$ is uniformly positive, negative, or zero) over an unbounded positive domain of $W$, then the derivatives must alternate in sign; that is,

$$(-1)^i U^{(i)}(W) < 0, \qquad i = 1, \ldots, m. \tag{1.63}$$

We prove this result by induction. Define $f_n(W) \equiv (-1)^n U^{(n)}(W)$ and assume that $f_i(W) < 0$ for $i = 1, \ldots, n$. Using the mean value theorem gives

$$f_{n-1}(W_2) = f_{n-1}(W_1) + f_{n-1}'(W^*)(W_2 - W_1) = f_{n-1}(W_1) - f_n(W^*)(W_2 - W_1) \tag{1.64}$$

for some $W^*$ in $[W_1, W_2]$. Now assume (63) is false for $n + 1$; that is, $f_{n+1}(\cdot) = -f_n'(\cdot) \ge 0$. Then $f_n(W^*) \le f_n(W_1)$, and, in (64),

$$f_{n-1}(W_2) \ge f_{n-1}(W_1) - f_n(W_1)(W_2 - W_1). \tag{1.65}$$

Now choose $W_2 > W_1 + f_{n-1}(W_1)/f_n(W_1)$. This choice is possible since the ratio is positive and the domain of interest is unbounded above. Substituting into (65) gives

$$f_{n-1}(W_2) > f_{n-1}(W_1) - f_n(W_1)\frac{f_{n-1}(W_1)}{f_n(W_1)} = 0, \tag{1.66}$$

which contradicts our previous assumption.

1.16 The Boundedness Debate: Some History of Economic Thought

It is often assumed that utility is bounded or at least bounded above. The reason for this assumption is that in its absence it is possible to construct lotteries which cannot be ordered by their expected utility even in cases when one of the lotteries strictly dominates another. This point was made originally by Karl Menger, who constructed a “super St. Petersburg paradox.” The (ordinary) St.
Petersburg paradox involves the evaluation of a gamble designed to pay $2^n$ dollars if the first occurrence of a “head” is on the $n$th toss of a fair coin. The expected payoff is infinite:

$$\sum_{n=1}^{\infty} 2^{-n}2^n = \sum 1 = \infty. \tag{1.67}$$

For any risk-averse utility function, however, expected utility is finite. For example, for log utility expected utility is

$$\sum_{n=1}^{\infty} 2^{-n}\log(2^n) = 2\log(2) \approx 1.39. \tag{1.68}$$

For any unbounded (from above) utility function, there exists for every $n$ a solution $W_n$ to $U(W) = 2^n$. If we construct a super St. Petersburg gamble with payoffs $W_n = U^{-1}(2^n)$ when the first occurrence of a head is on the $n$th toss, the expected utility is infinite:

$$\sum_{n=1}^{\infty} 2^{-n}U[U^{-1}(2^n)] = \infty. \tag{1.69}$$

For our example of log utility the payoffs would be $W_n = \exp(2^n)$.

The problem with unbounded utility is not per se that expected utility can be infinite, but that obviously preference-orderable gambles cannot be ranked by expected utility. For example, a gamble offering payoffs of $U^{-1}(3^n)$ is clearly preferable but also has infinite utility. In addition, a gamble offering $U^{-1}(3^n)$ for $n$ even and 0 for $n$ odd also has infinite expected utility. This last gamble is clearly worse than the second but might be better or worse than the first. Expected utility offers no clue.

With a bounded utility function this problem can never arise. For example, for $U(W) = a - 1/W$, utility can never rise above $a$, so we cannot find the solutions $W_n$ for any $n > \log_2 a$.

There is, of course, a clear bankruptcy problem embedded in these paradoxes. The large prizes beyond some bound could never be awarded in a finite economy. Since the game could never be feasible, we should be safe in ignoring it. We are also safe on a purely theoretical basis. If we circumvent the bankruptcy problem (by defining negative wealth in a meaningful fashion, for example), then the two participants will be taking exactly opposite positions in the gamble.
But only one side can be better than actuarially fair, so whoever is offered the other side would definitely refuse to participate. Therefore, such lotteries will never need to be evaluated.

1.17 Multiperiod Utility Functions

In finance we are often concerned with a consumer’s intertemporal allocation of his wealth to consumption. All of the mechanics required to handle this problem have already been developed. Good $i$ at each point of time is considered as distinct for the purpose of measuring utility. So with $n$ goods and $T$ time periods, the ordinal or cardinal utility functions have $n$ times $T$ arguments. All of the previous analysis remains valid.

For simplicity we shall often work with a multiperiod utility of consumption function. This is justified if there is but a single good. More generally, we can consider it a derived utility function as in (44):

$$\hat{U}(C_1, C_2, \ldots, W_T; p_1, \ldots, p_n) = \max\{\Psi(x_1, \ldots, x_T);\ \text{subject to } p_t'x_t = C_t\}. \tag{1.70}$$

Here $W_T$ represents remaining wealth left as a legacy. If there is no motive for bequest, $\partial\hat{U}/\partial W_T = 0$.

Throughout this book we shall typically assume that intertemporal choices are made without regard to (are utility independent of) past consumption; that is, $(C_t, C_{t+1}, \ldots, W_T)$ is utility independent of $(C_0, C_1, \ldots, C_{t-1})$ for all $t$. Also, decisions affecting lifetime consumption are utility independent of bequest. These two conditions are sufficient for mutual utility independence, so the utility of lifetime consumption function can be written as in (40):

$$\hat{U}(C) = k^{-1}\left(\exp\left[k\sum k_t U_t(C_t)\right] - 1\right). \tag{1.71}$$

If we further assume that lotteries to be resolved immediately and paid off contemporaneously in consumption are evaluated the same at all points of time, then the univariate utility functions are all the same, $U_t \equiv U$. The static absolute risk-aversion function $A(C_t) \equiv -U''(C_t)/U'(C_t)$ is sufficient information for choosing among single-period gambles.
The parameters $k_t$ measure the impatience to consume or time preference. The marginal rate of substitution of period $t$ goods for period $\tau$ goods is

$$-\frac{dC_t}{dC_\tau} = \frac{\partial\hat{U}/\partial C_\tau}{\partial\hat{U}/\partial C_t} = \frac{k_\tau U'(C_\tau)}{k_t U'(C_t)}. \tag{1.72}$$

If consumption is currently at the same level, substitution is measured completely by $k_t$ and $k_\tau$. If the consumer displays impatience, as is commonly assumed, then $k_\tau < k_t$ for $\tau > t$, and one unit now is valued more than one unit later. Often a constant rate of time preference is assumed, with $k_t = \delta^t$. In this case substitution between periods depends only on the interval of time and not specifically on the date.

The final parameter, $k$, measures temporal risk aversion, the dislike of gambles with lengthy consequences. The smaller or more negative $k$ is, the more temporally risk averse is the consumer. For $k = 0$, corresponding to additive utility, the consumer is temporally risk neutral. For positive or negative $k$, the consumer is temporally risk preferring or risk averse, respectively.

Temporal risk aversion can be defined as follows. Consider the single-period lottery with payoffs $C_h$ or $C_l$, and certainty equivalent $C_1^*$. Now consider the lottery which is resolved now but whose level payoffs $C_h$ or $C_l$ continue for $T$ periods. If the certainty equivalent level stream $C^*$ is less (greater) than $C_1^*$, the consumer is temporally risk averse (preferring). For two periods another explanation of temporal risk aversion is that the lottery with equal chances of paying $(C_h, C_l)$ or $(C_l, C_h)$ is preferred to the lottery with equal chances of paying $(C_h, C_h)$ or $(C_l, C_l)$.

To explore temporal risk aversion, let us define a utility function for level streams:

$$\lambda(C) \equiv \hat{U}(C\mathbf{1}) = k^{-1}\left(\exp[k\alpha U(C)] - 1\right), \tag{1.73}$$

where $\mathbf{1}$ is a vector of ones and $\alpha \equiv \sum k_t > 0$. The absolute risk-aversion function for this utility function is

$$a(C) \equiv -\frac{\lambda''(C)}{\lambda'(C)} = -\frac{U''(C)}{U'(C)} - k\alpha U'(C). \tag{1.74}$$

Since the lead term in (74) is the static risk-aversion function, $a(C)$ is greater (less) than $A(C)$ and the consumer is temporally risk averse (preferring) if $k$ is negative (positive).
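A brief numerical sketch in Python can check (74) and the two-period lottery comparison described above. The specific choices are illustrative assumptions rather than anything given in the text: log utility, two periods with $k_t = \delta^t$ and $\delta = 0.9$, the consumption levels, and all function names. The derivatives of $\lambda$ are approximated by central finite differences.

```python
import math

DELTA = 0.9                     # assumed constant rate of time preference
K_T = (DELTA**0, DELTA**1)      # k_t = delta^t for two periods (assumption)
ALPHA = sum(K_T)                # alpha = sum_t k_t


def lifetime_utility(k, consumption, u=math.log):
    """Equation (71) with U_t = U: (exp(k * sum_t k_t U(C_t)) - 1) / k."""
    s = sum(k_t * u(c) for k_t, c in zip(K_T, consumption))
    return math.expm1(k * s) / k


def level_stream_utility(k, c):
    """Equation (73): utility of consuming c in every period."""
    return lifetime_utility(k, (c,) * len(K_T))


def level_stream_risk_aversion(k, c, h=1e-4):
    """a(C) = -lambda''(C) / lambda'(C), by central finite differences."""
    up, mid, down = (level_stream_utility(k, c + d) for d in (h, 0.0, -h))
    first = (up - down) / (2 * h)
    second = (up - 2 * mid + down) / h**2
    return -second / first


def expected_utility(k, lottery):
    """Expected utility of equally likely consumption paths."""
    return sum(lifetime_utility(k, path) for path in lottery) / len(lottery)


C, C_HIGH, C_LOW = 2.0, 3.0, 1.0
for k in (-1.0, 1.0):
    # Right-hand side of (74) for log utility, where A(C) = U'(C) = 1/C.
    formula = 1.0 / C - k * ALPHA * (1.0 / C)
    numeric = level_stream_risk_aversion(k, C)
    mixed = expected_utility(k, [(C_HIGH, C_LOW), (C_LOW, C_HIGH)])
    matched = expected_utility(k, [(C_HIGH, C_HIGH), (C_LOW, C_LOW)])
    print(k, numeric, formula, mixed > matched)
```

Under these assumptions, negative $k$ gives $a(C) > A(C)$ and a higher expected utility for the lottery over $(C_h, C_l)$ or $(C_l, C_h)$, while positive $k$ reverses both comparisons.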