
The RAND Corporation

Ability, Moral Hazard, Firm Size, and Diversification


Author(s): Debra J. Aron
Source: The RAND Journal of Economics, Vol. 19, No. 1 (Spring, 1988), pp. 72-87
Published by: Blackwell Publishing on behalf of The RAND Corporation
Stable URL: http://www.jstor.org/stable/2555398
Accessed: 23/08/2009 22:00

RAND Journal of Economics


Vol. 19, No. 1, Spring 1988

Ability, moral hazard, firm size, and diversification


Debra J. Aron*

I develop a model of firm diversification into multiple product lines that is based on the
agency problem between the firm's managers and owners. The agency relationship, together
with a span-of-control managerial technology, determines an optimal firm size and degree
of diversification that are increasing in the manager's ability and therefore positively correlated
cross sectionally. I compare the benefits of merger with those achieved by using compensation
contracts based on relative performance and show that, for a particular parameterization,
the relative value of merger is a nonmonotonic function of the correlation between the productivity signals of the two firms.
1. Introduction
* Neoclassical microeconomic theory has generally treated the firm as being identical to
a technologically determined production function. Nevertheless, it is widely recognized that
this cost-curve approach is more appropriately applied to plants within a firm than to the
determination of firm size or structure itself. One aspect of firm structure that has received
especially little attention in the economic literature is diversification.1 In particular, the
literature has not succeeded in distinguishing the benefits of efficiently using capital in
production from the benefits to common ownership of the capital. If joint use of capital
creates efficiencies in the production of two or more goods, the joint use could in principle
be achieved by contracting over the use of the separately owned factors. This ownership
structure does not preclude the efficient use of factors. Further, even if technological scope
economies create incentives to diversify, they cannot explain all diversification, because
much that we observe is between products that are (apparently) unrelated in production
technology or demand.
The purpose of this article is to analyze the incentives of firms to diversify in an economy
comprising managers and capital owners whose interests do not necessarily coincide and
in which information is generally not perfect. I derive the implications of the principal-agent relationship between owners and managers of firms for the optimal structure of the
firm in a competitive environment. Diversification is shown to be an optimal response to
* Northwestern University.
This article is based on my Ph.D. thesis at the University of Chicago. I wish to thank my committee members,
Edward P. Lazear, Sherwin Rosen, and especially my chairman, Sanford Grossman, for many helpful discussions.
I am also grateful to Alvin Klevorick, an Associate Editor, and an anonymous referee of this Journal, whose
comments greatly improved the manuscript. Financial support from the Center for the Study of the Economy and
the State at the University of Chicago is gratefully acknowledged.
1 See, for example, Scherer and Ravenscraft (1984), Salter and Weinhold (1979), and Gort (1962) for empirical
evidence on the importance of diversification.


the moral-hazard problem facing firms' owners. The model yields implications for firm size,
degree of diversification, and choice of merger partners, as well as for the relationship between
managerial compensation and firm size and the tradeoffs between merger and relative-performance evaluations.
An alternative approach to explaining the characteristics of acquisition targets focuses
entirely on ownership of capital within a firm. For example, it has been suggested that firms
attempt to diversify into products whose profit streams are negatively correlated over time
with their primary product. In this way firms can avoid the secular variability associated
with business cycles and demand or cost shocks, thereby lowering the risk facing investors
in the firms. Merger would appear to be a costly means of achieving this sort of risk reduction,
however, since portfolio diversification can perfectly replicate these benefits of firm diversification for an investor.
A related approach views diversification as a means of reducing the risk of bankruptcy,
where bankruptcy creates a deadweight cost that is not incurred if a product line within a
firm goes out of business. The bankruptcy argument suggests that firms with profitability
streams that are negatively correlated with the acquirer's profits are the most desirable
targets. This contrasts with the empirical implications of my model, which are derived in
Section 6.
The capital-ownership and bankruptcy explanations of characteristics of diversifying
firms do not provide a model of the relationships among firm size, product line size, and
diversification. But empirical studies of these issues by Salter and Weinhold (1979) and
especially a major study by Gort (1962)2 have yielded two important regularities. First, there
is a strong positive association between firm size and the number of industries in which the
firm operates. Second, the size of each product line (measured in these studies by employment) increases with total firm size. These regularities do emerge as implications of the
model of firm diversification developed here.
In this article the size and structure of firms are determined not only by the production
technology, but also by the characteristics of the firms' managers and the incentive problems
inherent in the separation of ownership and control. As in Lucas (1978), optimal firm size
is determined by the (exogenous) talent or managerial expertise of the manager. I assume
that this manager supplies the amount of effort to the firm that maximizes his utility, given
the rewards and constraints he faces. To analyze this problem I adopt the methodology of
the principal-agent literature (Holmström, 1979; Grossman and Hart, 1983). Diversification
is valuable because it mitigates the principal-agent problem by allowing the principal more
accurately to infer the manager's behavior. For any given firm size, this leads to a tradeoff
between increasing the diversification of the firm, thereby reducing agency costs, and increasing the size of each product line, thereby reducing production costs. In equilibrium,
optimal firm size, product line size, and diversification are positively related.
I then compare diversification with relative-performance contracts as means of decreasing agency costs, under the assumption that the technology of the manager's job constrains him from applying different effort levels to different product lines. Although relative-performance evaluations are valuable, they are not perfect substitutes for diversification
with respect to the agency benefits that each can provide. Which method dominates depends
on measurable characteristics of the observed signals on managerial input. Evaluating a
manager's performance by comparing it with the performance of another firm can be valuable
when the exogenous shocks affecting the two firms are correlated. On the other hand, evaluations based on the productivity of different product lines that are under the common
2 Cross sectional tests on the relation between firm size and diversification were performed on 721 enterprises
with more than 2,500 employees. Diversification was measured in several ways, including the ratio of primary
industry output to total output for the firm, a simple count of the number of industries in which the firm is engaged,
and a composite of the two. The results were qualitatively the same, regardless of the measure used.


control of the firm's manager sharpen the measure of the manager's performance when the
exogenous components of productivity across product lines are relatively uncorrelated. I
show that the benefit to diversification vis-a-vis relative-performance contracts is not monotonic in the correlation between the product-specific signals, is positive over a broad range
of correlations, and is maximized in the region of negative correlations.3
Other authors have considered the effects of managerial moral hazard on the firm
(Marcus, 1982; Diamond and Verrecchia, 1982; Ireland, 1983). Work by Ramakrishnan
and Thakor (1986) is closely related to the analysis in Section 6. They also analyze the value
of diversification relative to performance-comparison contracting from an agency perspective,
but in a somewhat different setting. In their model, when firms merge, both incumbent
managers maintain their positions, and after the merger, each partakes in the management
of both divisions. Consequently, they focus on issues of competition and collusion between
the managers that do not arise here.
In Section 2 I describe the technology of a firm, and in Section 3 I discuss the principal-agent problem and derive relevant properties of the solution. In Section 4 I analyze the
firm's optimization problem, and in Section 5 I derive comparative statics on firm size and
diversification. The implications of the model are then evaluated in light of the cross sectional
stylized facts. Section 6 analyzes the implications of the model for the firm's optimal choice
of merger partner when relative-performance comparisons are a valuable and important
component of managerial incentive contracts. Section 7 contains a summary of the results
and conclusions. The proofs of results in the text are available from the author.

2. The technology of a firm

* I consider a highly stylized model of the firm. A firm is defined to be a collection of
assets, K, organized within a technological production function and overseen by a manager.
We can think of K as the total capital owned by the firm, or as a vector of inputs, including
labor; here I treat K as capital, but this is without loss of generality. The manager I have in
mind is the top decisionmaker, typically the chief executive officer, whose job is to organize
the factors under his control to increase their efficiency or total output.
Each manager in the economy has ability or managerial talent $\theta$, which is exogenously
endowed, and he exerts some effort, a, which he chooses. Different managers may have
different abilities, and ability has some distribution in the population $F(\theta)$. I further assume
that this managerial ability is general: a manager is equally productive in any industry.
Those qualities that make a manager successful involve organizational skills, business intuition, knowledge of economywide trends, and so forth.
A firm may produce several products. Let f(k) be the "technological" production function associated with producing a particular output. We may think of this as the plant's
production function and of k as the capital per plant. The function f(k) is defined by
$\max\{0,\, l(k) - c\}$, where $l(\cdot)$ is an increasing, strictly concave function, and c is a positive
constant (interpreted as a fixed cost). Thus, plants are characterized by U-shaped average
and increasing marginal cost curves. By assumption, any particular plant can produce only
one product. Firms may choose to have many plants, and these plants may produce the
same product or different products.
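
The plant technology can be illustrated numerically. The sketch below assumes the functional form l(k) = sqrt(k), a fixed cost c = 1, and a rental rate R = 1; none of these values comes from the article, and the function names are hypothetical. It locates the capital level that minimizes the capital cost per unit of output, Rk/f(k), which is the bottom of the U-shaped average cost curve.

```python
import math

def plant_output(k, c=1.0):
    """f(k) = max{0, l(k) - c}, with the ASSUMED form l(k) = sqrt(k)."""
    return max(0.0, math.sqrt(k) - c)

def average_cost(k, c=1.0, R=1.0):
    """Capital cost per unit of output, R*k / f(k); infinite if nothing is produced."""
    q = plant_output(k, c)
    return math.inf if q == 0.0 else R * k / q

def min_average_cost_scale(c=1.0, lo=1.0, hi=11.0, steps=10_000):
    """Grid search for the capital level that minimizes average cost."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda k: average_cost(k, c))

# For l(k) = sqrt(k) the minimum solves f(k) = k f'(k), i.e. sqrt(k) = 2c.
k_star = min_average_cost_scale(c=1.0)
print(k_star)
```

Under these assumed forms the analytic minimum is k = 4c^2, so the grid search should return a value close to 4 when c = 1. Average cost falls for smaller k and rises for larger k, which is the U shape described in the text.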
Let the number of plants producing product i be ni and the number of products the
firm produces be m. Then m describes the degree to which the firm is diversified. I assume

3 Radner and Rothschild (1975) also take a managerial approach to the determination of firm structure. In
their model managers must allocate their time among several products or projects in which the firm is engaged.
That article does not explicitly treat the incentives for diversification, because the number of projects requiring
attention is exogenous. Nevertheless, the idea is appropriate to a theory of diversification and is complementary to
the one developed here.


for simplicity that the technological production function $f(\cdot)$ is the same for all products.
The firm is free to choose the amount of capital $k_{ij}$ it wishes to invest in each plant j
producing product i.
The actual, observed output of the firm depends on both the invested capital and the
manager's inputs, a and $\theta$. I abstract from any explicit consideration of complex or hierarchical firm structures by imposing the constraint that there is only one manager at the
top of the managerial pyramid. Nevertheless, the notion that firms are complicated systems
and that organizational complexity increases with firm size is captured in the model by
imposing what Lucas (1978) calls a "managerial technology," i.e., the effective ability of
the manager decreases at the margin as the amount of assets he oversees and organizes
increases.
We let
$$a\theta\, g\Big(\sum_{i=1}^{m}\sum_{j=1}^{n_i} f(k_{ij})\Big)$$
be the effective input of the manager, where, by virtue of the
function $g(\cdot)$, effective managerial productivity depends on the total amount of capital
invested in the firm and the technology $f(\cdot)$. The properties of the function $g(\cdot)$ are defined
implicitly as follows. Let $h(x) \equiv x g(x)$. Then $h(\cdot)$ is a strictly increasing, strictly concave
function with
$$\lim_{x \to 0} h'(x) = \infty, \qquad \lim_{x \to \infty} h'(x) = 0. \tag{1}$$
The concavity of h(x) plus conditions (1) will generate a well-defined optimal total firm size
as a function of managerial ability.
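
As an illustration only, a power function satisfies every property required of h. This functional form and the parameter $\beta$ are assumptions introduced here; the article does not specify h beyond the conditions above.

```latex
\begin{aligned}
h(x) &= x^{\beta}, \quad 0 < \beta < 1
  \;\Longrightarrow\; g(x) = \frac{h(x)}{x} = x^{\beta - 1}, \\
h'(x) &= \beta x^{\beta - 1} > 0, \qquad
h''(x) = \beta(\beta - 1)\, x^{\beta - 2} < 0, \\
\lim_{x \to 0} h'(x) &= \infty, \qquad \lim_{x \to \infty} h'(x) = 0 .
\end{aligned}
```

In this example g is decreasing in x, so the manager's effective input per unit of plant output falls as the firm grows, while total effective output h(x) still rises with x.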
Output of product i is given by
$$q_i = a\theta\, g\Big(\sum_{j=1}^{m}\sum_{s=1}^{n_j} f(k_{js})\Big) \sum_{v=1}^{n_i} f(k_{iv})\,\varepsilon_i, \tag{2}$$

where $\varepsilon_i$ is a random component in the production of i. Structuring h(x) as xg(x) allows us
to specify output per product line as linear in the error terms and total output as the sum
of the output per product line, while preserving the feature that managerial productivity
declines at the margin with total firm size. The random variations are specific to a production
process for a particular product. That is, I assume that the $\varepsilon_i$ are common across plants
producing the same product since the production process is the same. But I assume that
random variations in productivity differ for different products; indeed, I assume that they
are independent across products.4 I relax this assumption in Section 6.
Without loss of generality, let the qi be in common units for all products i, so that the
total output of the firm is the sum of the qi:
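$$Q = \sum_{i=1}^{m} q_i = \sum_{i=1}^{m} a\theta\, g\Big(\sum_{j=1}^{m}\sum_{s=1}^{n_j} f(k_{js})\Big) \sum_{v=1}^{n_i} f(k_{iv})\,\varepsilon_i = a\theta\, g\Big(\sum_{j=1}^{m} y_j\Big) \sum_{i=1}^{m} y_i \varepsilon_i, \tag{3}$$
where $y_i \equiv \sum_{v=1}^{n_i} f(k_{iv})$.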

Note that the manager's effort a (as well as his ability $\theta$) is a common input across all
product lines. This assumption means that the effect of the manager's effort is spread over
the whole firm; he cannot expend different levels of effort for different product lines. This
is meant to be a property of the job, and reflects the general nature of the manager's input,
rather than a constraint on the choices of the manager.
I also make the following assumptions.
4 Although this is an extreme definition, note that it neither requires nor even implies that demand for the
products is independent or that the profit streams of the products are independent. On the other hand, products
that share common inputs (i.e., products that exhibit economies of scope) may well not have independent shocks.
Thus, the kind of diversification considered here is properly interpreted as "unrelated" in the sense discussed in
the Introduction.


Assumption 1. Firms' owners are risk neutral, so that they maximize expected profits.5
Assumption 2. All firms are price takers, and the price of all products is the same and equal
to P.6

Assumption 3. The relevant sector of the economy is, as a whole, a price taker in the capital
market. Call R the exogenously determined market rate of interest or opportunity cost of
capital.
Assumption 4. The functions $g(\cdot)$ and $f(\cdot)$, the amount of invested capital $k_{ij}$ for all i and
j, the ability level of each manager, and the form of the production function are common
knowledge.
Assumption 5. The level of output of each product qi is observable to both the principal
and the agent (and to any third party enforcing the contract).
Finally, define p to be the expected payment to the manager of the firm, and let w be
the expected value of $\varepsilon_i$ for all i. Then the firm's expected profit is
$$P a\theta\, h\Big(\sum_{j=1}^{m} y_j\Big) w - R \sum_{i=1}^{m}\sum_{j=1}^{n_i} k_{ij} - p. \tag{4}$$

Before analyzing the firm's maximization problem, we must investigate the nature of this
term p, the manager's compensation.
3. The moral-hazard problem

* I adopt the standard principal-agent methodology. For simplicity, I shall assume that
the manager's (unobserved) effort can take only two values, high (a*) and low (a'), and that
firms will always find it optimal to implement the high level of effort.7
The manager or agent is assumed to be risk averse. He has a von Neumann-Morgenstern
utility function U(I, a) = B(a)W(I) - G(a), where W is a real-valued, continuous, strictly
increasing, and strictly concave function on the interval $[d, \infty)$ and $W = -\infty$ for all I < d;
$G(\cdot)$ and $B(\cdot)$ are real-valued functions with B(a') > B(a*) > 0 and G(a*) > G(a') > 0.
Finally, I assume that B(a')W(I) - G(a') > B(a*)W(I) - G(a*) for all I.
The principal's problem is to choose capital inputs $K = (k_{11}, \ldots, k_{m n_m})$ and the manager's incentive contract $I(q_1, \ldots, q_m \mid K, \theta)$ subject to incentive-compatibility and individual-rationality constraints. This is a standard principal-agent problem with two wrinkles. First,
the principal must simultaneously choose the contract and K. Second, the observed signals
on effort, on which the contract is based, are functions of the chosen K as well as the
manager's ability $\theta$. If the expected cost or form of the optimal contract implementing a*,
given K, depended in some complicated way on K, this problem could be quite difficult. In
fact, the expected cost of the optimal contract is independent of K, and the form of the
contract depends trivially on K and $\theta$. Given the optimal contract for any arbitrary level of
K or $\theta$, the optimal contract for any other level of K or $\theta$ is a simple transformation of that
first contract.
This result is intuitively immediate. Observed output is a monotonic function of $a\varepsilon$,
and this monotonic function is common knowledge by Assumption 4. Thus, for any level
5 The model treats shareholders or owners of the firm's capital as one "principal." Any strategic interaction
or delegation problem among the owners is ignored.
6 One can show that under appropriate assumptions on consumers' preferences, competitive equilibrium for
this economy is characterized by the same price for all products. See Aron (1985) for a discussion of the competitive
equilibrium.
7 The crucial assumption for what follows is that the optimal effort level is independent of managerial ability,
invested capital, etc. Given this assumption, restricting the set of a's to two is merely a convenience.


of capital or ability, observed output can be "unravelled" from equation (2) to arrive at $a\varepsilon$,
which we can treat as the underlying signal on effort. One need only solve for the optimal
contract I(z), where $z \equiv a\varepsilon$. For any levels of capital and ability, the optimal contract as a
function of output q is simply I(z) with the domain rescaled.
This establishes that the principal's contracting problem when capital is a choice variable
is fundamentally identical to the standard principal-agent problem. In particular, under our
restrictions on the utility function,8 an optimal contract exists.
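
The unravelling step can be sketched for a single-product firm with one plant, where equation (2) reduces to q = a·θ·h(y)·ε with y = f(k), θ the manager's ability, and ε the random component. The form h(x) = x**beta and all numerical values below are assumptions chosen for illustration, and the function names are hypothetical.

```python
def h(x, beta=0.5):
    """ASSUMED managerial technology h(x) = x * g(x) = x**beta (illustrative form)."""
    return x ** beta

def unravel_signal(q, theta, y, beta=0.5):
    """Recover z = a * eps from observed output q = a * theta * h(y) * eps.

    theta (ability) and y (plant output implied by invested capital) are
    common knowledge, so dividing them out leaves the signal on effort.
    """
    return q / (theta * h(y, beta))

# Two firms that differ in ability and capital but share the same a * eps.
a, eps = 2.0, 1.5
signals = []
for theta, y in [(1.0, 4.0), (3.0, 9.0)]:
    q = a * theta * h(y) * eps          # observed output differs across firms
    signals.append(unravel_signal(q, theta, y))
print(signals)
```

Both firms yield the same recovered signal even though their observed outputs differ, which is why a single contract I(z) can be rescaled to any level of capital or ability.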

The moral-hazard problem between the capital market and the manager results because
the owners receive only an imperfect signal of the effort of managers. Intuitively, one would
expect that the severity (i.e., costliness) of the problem would increase with the noisiness of
the signal. Holmström (1979) and Grossman and Hart (1983) have shown this to be the
case for particular kinds of "noisiness." This same intuition underlies the benefits to diversification in this model, as I now describe.
We defined p to be the expected payment made to the firm's manager. Since the manager
has disutility for effort, p must be a function of the level of effort the manager is induced
to expend. Then, if effort level a* is implemented, p = p(a*). Consider a firm that produces
only one product, and let $p_1(a^*)$ be the expected payment to the manager of the firm if a*
is implemented. Then $p_1(a^*)$ is the solution to (5):
$$p_1(a^*) = \min_{I(z)} \int I(z)\,\pi(z \mid a^*)\,dz \tag{5}$$
subject to
$$\int U(I(z), a^*)\,\pi(z \mid a^*)\,dz \;\geq\; \int U(I(z), a')\,\pi(z \mid a')\,dz$$
and
$$\int U(I(z), a^*)\,\pi(z \mid a^*)\,dz \;\geq\; U_0,$$
where $U_0$ is the agent's reservation utility and $\pi(z \mid a)$ is the conditional probability density
function of z.
Now suppose that two projects can be undertaken simultaneously and that they have
independent and identically distributed $\varepsilon_i$, i.e., they are both described by the distribution
function $\pi(z \mid a)$. The payment scheme will be some function $I(z_1, z_2)$, where $z_i = a\varepsilon_i$ is the
outcome of project i, given the capital input and ability. The problem facing the principal
in this case is (6):
$$\min_{I(z_1, z_2)} \iint I(z_1, z_2)\,\pi(z_1 \mid a^*)\,\pi(z_2 \mid a^*)\,dz_1\,dz_2 \tag{6}$$
subject to
$$\iint U(I(z_1, z_2), a^*)\,\pi(z_1 \mid a^*)\,\pi(z_2 \mid a^*)\,dz_1\,dz_2 \;\geq\; \iint U(I(z_1, z_2), a')\,\pi(z_1 \mid a')\,\pi(z_2 \mid a')\,dz_1\,dz_2$$
and
$$\iint U(I(z_1, z_2), a^*)\,\pi(z_1 \mid a^*)\,\pi(z_2 \mid a^*)\,dz_1\,dz_2 \;\geq\; U_0.$$
Let the solution to this problem be $p_2(a^*)$.


Lemma 1. $p_1(a^*) > p_2(a^*)$.9
8 Some technical integrability restrictions on the admissible set of contracts are also necessary. For a proof
of existence, see Clarke and Darrough (1980). The assumption on W that bounds it below is crucial because it
avoids the kind of nonexistence discussed by Mirrlees (1975).
9 Diamond and Verrecchia (1982) prove this result for a particular utility function (in the hyperbolic absolute-risk-aversion class).


The intuition of the proof is as follows. When the principal gets two independent
observations on the agent's effort, he can reduce the risk facing the agent and still induce
him to choose a*. Since the principal must pay the agent to incur risk (because the agent
requires some minimum utility), the cost to the principal decreases when the risk facing
the agent falls. One feasible way to reduce the risk facing the agent while preserving his
incentives is to pay an amount for the observed pair of outcomes that would yield some
average of the optimal utility payments for the two separate outcomes. That is the scheme
constructed in the proof.
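
The averaging construction can be checked numerically in a small example. The sketch below is not the article's simulation: it assumes a two-outcome signal, additively separable utility with W(I) = log(I) (so B is constant), and an arbitrary single-signal contract stated in utility units. All probabilities, payoffs, and names are hypothetical. The two-signal contract pays the income that delivers the average of the two single-signal utility levels.

```python
import math
from itertools import product

# ASSUMED signal distributions: probability of a low or high z under each effort.
PI_HIGH = {"lo": 0.3, "hi": 0.7}   # pi(z | a*)
PI_LOW = {"lo": 0.6, "hi": 0.4}    # pi(z | a')

# ASSUMED single-signal contract in utility units: v(z) = W(I(z)) with W = log.
V = {"lo": 0.0, "hi": 1.0}

def expected_pay_one(pi):
    """Expected payment of the one-signal contract; I(z) = exp(v(z)) inverts W = log."""
    return sum(pi[z] * math.exp(V[z]) for z in pi)

def expected_utility_one(pi):
    """Expected utility from income under the one-signal contract."""
    return sum(pi[z] * V[z] for z in pi)

def v_two(z1, z2):
    """Two-signal contract: average of the single-signal utility payments."""
    return 0.5 * (V[z1] + V[z2])

def expected_pay_two(pi):
    """Expected payment when the two signals are independent draws from pi."""
    return sum(pi[z1] * pi[z2] * math.exp(v_two(z1, z2))
               for z1, z2 in product(pi, repeat=2))

def expected_utility_two(pi):
    """Expected utility from income under the averaged two-signal contract."""
    return sum(pi[z1] * pi[z2] * v_two(z1, z2)
               for z1, z2 in product(pi, repeat=2))

saving = expected_pay_one(PI_HIGH) - expected_pay_two(PI_HIGH)
print(saving > 0)
```

Expected utility from income is unchanged under both effort levels, so the incentive-compatibility and individual-rationality constraints hold exactly as before, while the expected payment is strictly lower because the inverse of a concave W is convex (Jensen's inequality). Since the averaged contract is feasible but need not be optimal, the saving it produces is only a lower bound on the gain from the second signal.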
Although the proposition is written for two projects, it clearly can be applied to any
number of independent projects. Let $I^*(z^m)$ be the optimal contract for $z^m = \{z_1, \ldots, z_m\}$,
and let $I(z^m, z_{m+1})$ be the contract for the (m + 1) projects. The proof proceeds identically.
Hence, we have established the following corollary.
Corollary 1. Let the cost of the optimal contract when the number of projects is m be
$p(m, a^*, U_0)$. Then $\Delta p(m, a^*, U_0)/\Delta m < 0$. That is, the cost of inducing a given level of
effort will decline as the number of product lines increases.
This result is related to results in Grossman and Hart (1985) and Holmström (1979)
on the value of signals. It can be interpreted as a proof that the (m + 1)th signal is indeed
"informative," as defined by Holmström (1979).
To get an idea of the magnitude of the effect, I ran some simple simulations.10 Assuming
an additively separable log utility function for the manager, I calculated a lower bound on
the efficiency gains to diversification by using several values for the risk aversion parameter.
For a benchmark case, I found that when the manager has constant relative risk aversion
equal to unity, a firm could acquire a second product line and pay the manager 29% less
(in expected value) without disturbing his incentives and leave him indifferent. Under the
assumptions of Section 2, this efficiency gain will be captured by the manager. This means
that by acquiring a second product line, the manager will raise his expected utility by an
amount corresponding to 29% of his expected income each year. Results for a broad range
of risk-aversion parameters were similar.
The results in this section describe the efficiency benefits from diversification that arise
because of the incentive problem. I have established that at any given level of scale, the cost
of providing incentives for the manager is a decreasing function of the degree of diversification. Naturally, the optimal degree of diversification in a firm will also depend on the
production technology and the production-efficiency consequences of increasing scale. I
pursue these issues in the next section.
The results in Lemma 1 and its corollary are robust to many of the assumptions made
here. First, the assumption of general rather than product-specific ability is merely a convenience. I could have written ability as a vector $\theta = \{\theta_1, \ldots, \theta_m\}$, where $\theta_i$ denotes the
manager's ability in the production of product i, and all of the proofs to this point would
proceed identically. As in any model of diversification, product-specific ability would mitigate
the benefits of diversification, but it would not affect the incentive benefits, which are the
focus of this article. In addition, it is not necessary that ability be common knowledge. Aron
(1985) analyzed the model in the context of incomplete but symmetric information about
$\theta$. The results in this article hold with $\theta$ replaced by its estimated value.
More important, it is not crucial that effort be a general input into the firm's production
function. Suppose that the technology of the firm enabled the manager to choose different
effort levels for different products (in the spirit of, say, Radner and Rothschild (1975)).
Then the manager's effort would be described by a vector $a = \{a_1, \ldots, a_m\}$, where $a_i$ is
the effort invested in product i, and the incentive problem would be to design a contract

10 Details of the simulation procedure and results appeared in an earlier version of this article and are available
from the author.


that implements effort $a_1^*$ in product 1, $a_2^*$ in product 2, etc., where in general these optimal
effort levels may not be the same. We could think of this as m contracts with contract i
implementing $a_i^*$ as a function of $a_i\varepsilon_i$ for i = 1, ..., m, while taking into account the
existence of the other contracts. When a manager holds m contracts, he could get a bad
outcome (i.e., low realization of $\varepsilon$) in any one of them, but a bad random outcome in one
project is relatively likely to be offset by a good outcome in another project. Thus, loosely
speaking, the manager is effectively less risk averse with respect to the randomness of any
one project. This decreases the total compensation for risk required by the manager. For a
formal proof that diversification improves risk sharing when effort is chosen separately
across products, see Ramakrishnan and Thakor (1986).
Although the assumption of general managerial input is not necessary to generate an
agency incentive for diversification, the assumption of general managerial input is crucial
to my comparison of merger and relative-performance contracts in Section 6. The results
in that section do not hold if the manager may choose to apply different levels of effort
across product lines.
Third, I have assumed that the output of each product line is observable within a firm.
Suppose that only the total firm output were observable. Under the assumption once again
that effort is a common input in the firm, it is evident that, when firm size is held constant,

increasing diversification will improve the power of inference on a. This reduces the risk
that must be borne by the manager, and therefore decreases the compensation required to
provide any given level of expected utility.
Finally, it is not necessary that the random variables $\varepsilon_i$ be independent across products.
As long as the variables are not perfectly correlated, there is a benefit to diversification. I
explore this possibility in Section 6.

4. The firm's problem


* Using the results of Section 3 and the model presented in Section 2, I can rewrite the
firm's maximization problem as:
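$$\max_{n_i,\, m,\, k_{ij}} \; P \sum_{i=1}^{m} \Big[ a^*\theta\, g\Big(\sum_{j=1}^{m}\sum_{s=1}^{n_j} f(k_{js})\Big) \sum_{v=1}^{n_i} f(k_{iv})\, w \Big] - R \sum_{i=1}^{m}\sum_{j=1}^{n_i} k_{ij} - p(m, U_0), \tag{7}$$
$$n_i,\, m \geq 1, \qquad i = 1, \ldots, m,$$
where I suppress the a* in $p(m, a^*, U_0)$.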


Because the plant's production function l(k) is concave, it is evident that for any product
it is optimal to invest the same amount of capital in each operating plant. Further, by the
assumed symmetry of prices and technologies, it is optimal to have the same total number
of plants producing each product. Then I can simplify expression (7) to
$$\max \; P a^*\theta\, h(nm f(k))\, w - Rnmk - p(m, U_0). \tag{8}$$
Notice that n and m enter expected profits perfectly symmetrically, except for the manager's
compensation function $p(m, U_0)$. Since p is decreasing in m, the firm will always prefer to
increase output by increasing m (the number of products) rather than n (the number of
plants producing each product). For any given total output, the firm would prefer to set n
as small as possible and m as large as possible. Thus, the lower bound on n will be binding
at the optimum n and m, and I can set n = 1 to solve for the optimal m and k. For simplicity
I shall henceforth treat m as a continuous variable, even though, by its nature, it can only
assume integer values.
The first-order conditions for the firm's problem are:
$$P a^*\theta\, f(k)\, h'(m f(k))\, w - Rk - p_m(m, U_0) = 0 \tag{9}$$
$$P a^*\theta\, m f'(k)\, h'(m f(k))\, w - Rm = 0, \tag{10}$$
where $p_m$ is the partial derivative of $p(m, U_0)$ with respect to m.


We can immediately make the following observation. Taking the ratio of (9) and (10)
gives
$$f/(kf') = 1 + p_m/(kR), \tag{11}$$
which implies that $f/(kf') < 1$ by Corollary 1. Thus, we have the following proposition.
Proposition 1. The optimal amount of capital per plant (now interpreted as per product
line) is at a point to the left of the minimum of the average cost curve.
This result reflects the fact that there is a tradeoff at the margin between increasing the
efficiency of a plant by decreasing average production costs and decreasing managerial costs
by increasing diversification.
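
The step from the first-order conditions to (11) can be spelled out as follows. This is a sketch in the article's notation, writing $\theta$ for managerial ability and $p_m$ for the partial derivative of the compensation cost with respect to m, and suppressing the arguments of f, f', and h'.

```latex
\begin{aligned}
\text{(9):}\quad & P a^{*}\theta\, f\, h'\, w = Rk + p_m, \\
\text{(10):}\quad & P a^{*}\theta\, f'\, h'\, w = R, \\
\text{ratio:}\quad & \frac{f}{f'} = k + \frac{p_m}{R}
  \;\Longrightarrow\;
  \frac{f}{k f'} = 1 + \frac{p_m}{kR} < 1
  \quad \text{since } p_m < 0 .
\end{aligned}
```

Average cost Rk/f(k) has derivative proportional to f - kf', so it is minimized where f = kf' and is still falling wherever f/(kf') < 1. The optimum therefore lies to the left of the minimum of the average cost curve, as the proposition states.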
Returning to expression (8), consider the case in which there is no moral-hazard problem.
In that case the manager's payment will not depend on m, and the firm is indifferent between
using its plants in the production of different products or having them all produce the same
product. Let there be any small positive cost of diversification and firms will always choose
to structure their optimal firm size by increasing n and choosing m = 1. This means that
in the model it is the moral hazard inherent to the corporate form that induces diversification.
The efficiency gains achieved by diversification cannot be replicated by manipulating
shareholders' portfolios or by complicated contracting over the use of capital. In Section 6
I show that relative-performancecontracting is also not a perfect substitute for diversification
in alleviating the agency problem in firms. I first turn to the comparative statics of the
diversification model.

5. The comparative statics

One would intuitively expect that optimal firm size would be increasing in managerial
ability. To perform the comparative statics, however, one must be careful to account for
the effect of ability on U0, which is determined endogenously. I assume that the market for
managers is perfectly competitive and that the supply of managers of any ability level is
fixed, i.e., the supply of managers is perfectly inelastic at any $\theta$. Together with Assumption
3, this implies that managers earn all the rents they generate. In equilibrium, then, $U_0(\theta)$ is
determined by the equation:

$$P a^{*}\theta\, h(m^{*}f(k^{*}))\, w - R m^{*}k^{*} - p(m^{*}, U_0(\theta)) = 0 \tag{12}$$

for all $\theta$ such that managers of ability $\theta$ are active in the market, and where m*, k* are the
optimal values of m and k, given $\theta$. Differentiating (12) with respect to $\theta$ and applying the
envelope theorem give

$$\frac{P a^{*} h(m^{*}f(k^{*}))\, w}{p_{U_0}} = \frac{dU_0}{d\theta}. \tag{13}$$

One can show that for all U(I, a) that are either multiplicatively or additively separable,
$p_{U_0}(a, m, U_0) > 0$. Hence, $dU_0/d\theta$ is positive.
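The envelope-theorem step can be written out explicitly. The sketch below assumes that m* and k* are interior optima satisfying the first-order conditions (9) and (10) and that the implemented effort a* is held fixed; the label $\pi$ for the zero-profit expression is introduced here only for convenience.

```latex
\begin{align*}
\pi(m, k, \theta) &\equiv P a^{*}\theta\, h(mf(k))\, w - Rmk - p(m, U_0(\theta)), \\
0 = \frac{d}{d\theta}\,\pi(m^{*}, k^{*}, \theta)
  &= P a^{*} h(m^{*}f(k^{*}))\, w
     + \underbrace{\frac{\partial \pi}{\partial m}}_{=\,0 \text{ by (9)}} \frac{dm^{*}}{d\theta}
     + \underbrace{\frac{\partial \pi}{\partial k}}_{=\,0 \text{ by (10)}} \frac{dk^{*}}{d\theta}
     - p_{U_0}\,\frac{dU_0}{d\theta}, \\
\Longrightarrow\quad \frac{dU_0}{d\theta}
  &= \frac{P a^{*} h(m^{*}f(k^{*}))\, w}{p_{U_0}} .
\end{align*}
```

The numerator is expected revenue per unit of ability, so the sign of $dU_0/d\theta$ is the sign of $p_{U_0}$.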
Now consider the effect of ability $\theta$ on firm size and diversification. Examination of
the first-order conditions indicates that the sign of the effect of $\theta$ will depend on the signs
of the partials $p_{mU_0}$ and $p_{mm}$, that is, the effects of increased diversification and increased
ability on the (moral-hazard-reducing) marginal benefit of diversification. Consider $p_{mm}$
first. One expects the marginal benefit of diversification to be decreasing in diversification,
that is, $p_{mm} > 0$ (since $p_m < 0$). The precise effect depends on the form of the optimal
contract, but using the proof of Lemma 1, one can show that $p_{mm} > 0$ "on average." For
simplicity, in this section I shall assume that utility is exponential, in which case $p_{mU_0} = 0$
(see Aron (1985) for the proof), but see footnote 11 for a discussion of the greater generality
of the results that follow.

Under these conditions we can unambiguously sign the following:¹¹

$$dm/d\theta > 0 \tag{14}$$

$$d(mf(k))/d\theta > 0 \tag{15}$$

$$df(k)/d\theta > 0 \tag{16}$$

$$dm/dP > 0 \tag{17}$$

$$dk/dP > 0. \tag{18}$$

Thus, we have the following proposition.


Proposition 2. Let $U(I, a) = B(a)[-\exp(-I/\delta)]$, $\delta > 0$. Then expected firm size (defined as
expected revenue or value of capital), product line size (defined as $P a^{*}\theta\, g(mf(k)) f(k)\, w$),
and diversification are monotonically increasing in ability.
Corollary 2. Expected firm size, product line size, and diversification are positively correlated
with each other, cross sectionally.¹²
If we specify some reservation utility outside the managerial labor force for potential
managers, equation (12) determines a lower bound on the ability of managers who will
actually run firms. For example, if the reservation utility in nonmanagerial alternatives is
0, the lower bound on $\theta$ is given by

$$P a^{*}\theta\, h(m(\theta) f(k(\theta)))\, w - m(\theta) k(\theta) R = p(m(\theta), 0). \tag{19}$$

This lower bound always exists by Proposition 2 and Corollary 1.


The results in Proposition 2 and Corollary 2 correspond precisely to the stylized facts
presented in the Introduction. I am aware of no other model of diversification that generates
implications for the cross sectional distribution of firm size and degree of diversification.
The alternative models of diversification I discussed have no explicit implications for the
optimal degree of diversification short of expansion into every product in the market.
Proposition 3. If $U(I, a) = B(a)[-\exp(-I/\delta)]$, $\delta > 0$, as in Proposition 2, then the slope of
the optimal incentive contract with respect to output is independent of ability and therefore
is independent of firm size.
This result relies on exponential utility. More generally, one can show that for the class
of hyperbolic absolute-risk-aversion utility functions described in footnote 11, the slope is
nondecreasing in firm size.

11 Inequalities (17) and (18) do not rely on the form of the utility function. Inequalities (14) and (15) hold
unambiguously when the W(I) are hyperbolic absolute-risk-aversion utility functions, where

$$U(I, a) = \ln(I + \delta) - G(a)$$

$$U(I, a) = B(a)[-\exp(-I/\delta)]$$

$$U(I, a) = B(a)\,\frac{1}{\gamma - 1}\,(\gamma I + \delta)^{(\gamma - 1)/\gamma}, \qquad \gamma > 0.$$

In each case W(I) is a hyperbolic absolute-risk-aversion utility function with risk tolerance $\gamma I + \delta$, and the utility
function has been constrained to be either additively or multiplicatively separable. Recall that a utility function
W(x) belongs to such a class if it exhibits linear absolute risk tolerance (the reciprocal of absolute risk aversion) in
x. That is, for some numbers $\alpha$ and $\beta$, $-W'(x)/W''(x) = \alpha + \beta x$.
12 It should be noted that exponential (or even hyperbolic absolute-risk-aversion) utility is not necessary for
(14)-(16) to hold. We only require that $p_{mU_0}$ not be excessively large. The intuition is as follows. The total amount
of capital devoted to a firm increases with the manager's ability. Some of the increased size would normally take
the form of increasing each plant size, and some would take the form of increasing the number of plants. At higher
levels of ability, however, the manager would earn a higher expected salary. If risk aversion decreases very quickly
as expected income rises (i.e., $p_{mU_0}$ is sufficiently large), the incentive to diversify (add plants) falls. In this case the
decreased incentive to diversify may dominate the size effect of increased ability, with larger firms being less
diversified.

A reinterpretation of the problem. We have, to this point, treated the contracting problem
as one in which firms maximize profits subject to the constraints that managers receive a
given level of expected utility and that the incentive-compatibility constraints are satisfied.
This is the way the problem is generally formulated in the literature. The problem could
be described equivalently, however, as the dual one in which managers maximize utility
subject to the incentive-compatibility constraints and a zero-expected-profit constraint on
the firm. Indeed, in the model presented here it is the managers who benefit when a firm
diversifies, because they receive all of the rents they generate, and shareholders always earn
zero profit.
It is perhaps more intuitive and realistic to think of the manager, rather than the
owners, as actually making the decisions whether to diversify, how much to diversify, and
how large the firm should be. But this is perfectly consistent with the model, and by solving
the dual rather than the primary problem for an optimal contract, it is explicitly the manager
who maximizes his utility by choosing among different firm structures. Thus, one can interpret the model as one in which managers decide to diversify the firm because of the gains
that will accrue to them personally. This yields a social gain because it optimally shifts to
the shareholders the risk that the managers face.

6. Relative performance comparisons and the merger/contracting tradeoff

We have thus far analyzed a firm's optimal contracting problem when the only informative signals on a manager's effort are generated by the firm itself. In general, a number
of signals in the economy may be informative with respect to the manager's input. For
example, in industries in which competition is not perfect, it might be difficult by looking
at the firm alone to distinguish between a poor performance by a firm's manager and an
exogenous decline in demand. In such a case observing the performance of other firms in
the industry would improve the ability to evaluate the manager. If all firms did poorly, it
is likely that demand decreased, and it would be inefficient to punish the firm's manager
for the full decline. An efficient contract would adjust for the amount of decline (or improvement) attributable to market performance, to the extent this could be ascertained. We
term such contracts relative-performance contracts.
Holmström (1979, 1982) has analyzed the value of relative-performance comparisons.
Related work in which incentives are created via competition among agents is the work on
tournaments (Lazear and Rosen, 1981; Green and Stokey, 1983). I have explicitly eliminated
the benefit to such contracting by assuming that the random component of productivity is
independent across firms. In this case there is no benefit to writing the compensation contract
for firm i's manager as a function of the productivity of firm j, since firm j's output has no
informative value for firm i (Holmström, 1982). I now consider the case in which the
correlation between $\epsilon_i$ and $\epsilon_j$ can vary between -1 and 1, and examine the role of relative-performance contracts vis-a-vis diversification.
The questions I address are: Would the desirability of a firm as an acquisition target
vary with the correlation between the target's product and the acquirer's own product? How
can the correlation be exploited in optimal incentive contracts using relative-performance
comparisons? And finally, can one describe a tradeoff between relative-performance comparisons and acquisition?
The information structure. For simplicity, assume that firm i produces only one product,
$q_i$. Recall that the owners of firm i can effectively observe $a\epsilon_i$ and that the problem is to
implement a. Similarly, owners of firm i can observe $a_j\epsilon_j$ of any other firm j in the economy,
where $a_j$ is the effort level of firm j's manager. Indeed, they can observe $\epsilon_j$ itself since the
owners of firm i know $a_j$. This follows because, ex post, all parties know what effort levels
are chosen by all managers, since managers are responding to publicly observed incentives.¹³
Mergers are differentiated from relative-performance contracting by the signal that can
be used in each case to create incentives for managers. If any firm i producing $q_i$ were to
acquire product j, output $q_j$ would be a function of manager i's effort, a. Thus, his compensation would be a function of $a\epsilon_j$ as well as $a\epsilon_i$. If product j were not acquired, then
manager i's compensation would be a function of $\epsilon_j$ and $a\epsilon_i$. Which pair of signals is more
informative about a depends on the correlation between $\epsilon_i$ and $\epsilon_j$.
In what follows I wish to isolate the informational issues in the tradeoff between contracting and acquisition. Thus, I shall treat product j as a pure signal-producing activity.¹⁴
This allows me to disregard the issues of optimal firm size treated in previous sections.
Further, I analyze the tradeoff from a partial equilibrium perspective and ignore the optimal
contracting problem of the second firm.¹⁵ Finally, to derive definite results I shall consider
a parameterization of the random variables.
Parameterizing the signals. Assume that $\epsilon$ has a log-normal distribution. For purposes
of information extraction, observing $a\epsilon$ is equivalent to observing $\log(a\epsilon)$ since one can
transform one to the other without loss of information. It will be more convenient to consider
observing the transformed signal $\log(a\epsilon) = \log(a) + \log(\epsilon)$ since $\log(\epsilon)$ has a normal distribution.

Consider the problem faced by firm 1. Let the signal firm 1 observes on its own output
be $X_1$, where $X_1 = \alpha + \eta_1$, $\alpha \equiv \log(a)$, $\eta_1 \equiv \log(\epsilon_1)$, $\eta_1 \sim N(0, \sigma_1^2)$. Consider a second firm,
with "output" $X_2 = \alpha_2 + \eta_2$, where $\alpha_2$ is the log of $a_2$, and $a_2$ equals the effort of firm 2's
manager. Under the assumption that firm 1 knows that firm 2's contract with its manager
implements $a_2$, firm 1 can observe $\eta_2$. Thus, without merger firm 1's signal on the effort of
the manager of firm 1 is $(X_1, \eta_2)$. We assume that $\eta_1$ and $\eta_2$ are bivariate normal with means
equal to zero, correlation $\lambda$, and variances $\sigma_1^2$ and $\sigma_2^2$, respectively.
This simple structure generates immediate implications about acquisition choices. Suppose that, for some reason, the product of firm 2 had a perfect negative correlation with
the product of firm 1. With perfect negative correlation between its products, a diversified
firm could achieve the first best. This is the case that, at first blush, one might expect to be
the most attractive for merger. Our analysis indicates, however, that the value of acquisition
relative to relative-performance contracting would be zero. The firm would be indifferent
between the two: In either case the firm could write a riskless contract implementing a. The
pair of signals in the merger case, $\{X_1, X_2\}$, allows perfect inference of a, as does the pair
of signals $\{X_1, \eta_2\}$ in the relative-performance contracting case. This gives us the following
proposition.
Proposition 4. When the correlation of $\eta_1$, $\eta_2$ is -1, the firm is indifferent between acquisition
and relative-performance contracting with respect to solving the agency problem.
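The perfect-inference claim behind Proposition 4 can be checked directly. The sketch below assumes the parameterization just described (signals $X_1 = \alpha + \eta_1$ and, under merger, $X_2 = \alpha + \eta_2$, with noise standard deviations $\sigma_1$, $\sigma_2$) and writes $b = \sigma_2/\sigma_1$; with correlation -1 the two noise terms are exact linear functions of one another.

```latex
\begin{align*}
\text{correlation } -1:\quad & \eta_2 = -b\,\eta_1 \quad \text{(almost surely)}, \qquad b = \sigma_2/\sigma_1, \\
\text{contracting } \{X_1, \eta_2\}:\quad & \alpha = X_1 - \eta_1 = X_1 + \frac{\eta_2}{b}, \\
\text{merger } \{X_1, X_2\}:\quad & bX_1 + X_2 = (1 + b)\,\alpha + b\,\eta_1 - b\,\eta_1
   \;\Longrightarrow\; \alpha = \frac{bX_1 + X_2}{1 + b}.
\end{align*}
```

In both cases $\alpha$ is recovered without error, so neither arrangement requires the manager to bear any risk.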
Thus, with zero correlation merger is valuable but relative-performance contracting
has no value, while at perfect negative correlation the two are equally valuable. One would
expect that if $\eta_1$ and $\eta_2$ were identical, the value of merger would be zero, since the additional
signal would carry no additional information. On the other hand, relative-performance
contracting would achieve first-best since one could perfectly infer $\alpha$ (and therefore a) from
the pair $\{X_1, \eta_2\}$. This is indeed the case. To analyze the tradeoff for arbitrary levels of
correlation and variance of the $\epsilon_i$, we further investigate our parameterized structure.

13 It is not necessary that the contracts themselves be observed. What a will optimally be implemented for
any firm is common knowledge.
14 I am grateful to an Associate Editor for suggesting this simplification.
15 Aron (1985) analyzes the problem within a general equilibrium framework as a matching problem. The
qualitative results are virtually the same.
When $(X_1, \eta_2)$ is observed, as would be the case if firms 1 and 2 were not merged, the
maximum likelihood estimator of $\alpha$ is

$$\hat{\alpha} = X_1 - \lambda(\sigma_1/\sigma_2)\,\eta_2 \tag{20}$$

for $\lambda \in (-1, 1)$. This estimator is unbiased and is a sufficient statistic for $\alpha$. Further, it is a
complete¹⁶ sufficient statistic, and since it is normally distributed, this means that it is a
minimum variance sufficient statistic for $\alpha$. The variance (mean-square error) of $\hat{\alpha}$ is

$$\sigma_1^2(1 - \lambda^2). \tag{21}$$
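As a numerical illustration of the non-merger case, the sketch below treats the estimator as $X_1$ minus a weight c times firm 2's observed noise and checks that the weight $\lambda\sigma_1/\sigma_2$ reproduces the variance $\sigma_1^2(1 - \lambda^2)$. The function name and the parameter values are hypothetical and chosen only for illustration; `lam` stands for the correlation between the two noise terms.

```python
import math

def contracting_signal_variance(c, sigma1, sigma2, lam):
    """Variance of the estimator X1 - c*eta2 around alpha, i.e. Var(eta1 - c*eta2)."""
    return sigma1**2 - 2.0 * c * lam * sigma1 * sigma2 + (c * sigma2)**2

# Hypothetical parameter values, chosen only for illustration.
sigma1, sigma2, lam = 1.5, 0.8, 0.6

c_star = lam * sigma1 / sigma2              # weight on eta2 in estimator (20)
v_star = contracting_signal_variance(c_star, sigma1, sigma2, lam)
v_formula = sigma1**2 * (1.0 - lam**2)      # expression (21)

# Moving the weight away from c_star in either direction raises the variance.
v_up = contracting_signal_variance(c_star + 0.1, sigma1, sigma2, lam)
v_down = contracting_signal_variance(c_star - 0.1, sigma1, sigma2, lam)

print(math.isclose(v_star, v_formula), v_up > v_star, v_down > v_star)
```

The variance is quadratic in c, so the check at two nearby weights is enough to confirm that `c_star` is the minimizer within this linear family.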

Now suppose that firm 1 acquires firm 2 so that the manager of firm 1 becomes the
manager of the merged firm. Then firm 1 observes $(X_1, X_2)$, where $X_i = \alpha + \eta_i$, and $\alpha$ is
the (log of) effort of the manager of firm 1. When $(X_1, X_2)$ is observed, the maximum
likelihood estimator of $\alpha$ is

$$\hat{\alpha}' = \frac{b^2 X_1 + X_2 - \lambda b\,(X_1 + X_2)}{1 + b^2 - 2\lambda b}, \tag{22}$$

where $b = \sigma_2/\sigma_1$. Again, $\hat{\alpha}'$ is unbiased and is a normally distributed, complete sufficient
statistic for $\alpha$. The variance of $\hat{\alpha}'$ is

$$\frac{\sigma_1^2 b^2 (1 - \lambda^2)}{1 + b^2 - 2\lambda b}. \tag{23}$$
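The merged-firm estimator can be checked the same way. The sketch below assumes that the estimator is the weighted average of $X_1$ and $X_2$ with weights $(b^2 - \lambda b)/(1 + b^2 - 2\lambda b)$ and $(1 - \lambda b)/(1 + b^2 - 2\lambda b)$, where $b = \sigma_2/\sigma_1$, and verifies that the weights sum to one (unbiasedness) and that the implied variance equals $\sigma_1^2 b^2(1 - \lambda^2)/(1 + b^2 - 2\lambda b)$. Function names and parameter values are hypothetical.

```python
import math

def merger_weights(sigma1, sigma2, lam):
    """Weights on (X1, X2) in the merged-firm estimator, with b = sigma2 / sigma1."""
    b = sigma2 / sigma1
    denom = 1.0 + b**2 - 2.0 * lam * b
    return (b**2 - lam * b) / denom, (1.0 - lam * b) / denom

def linear_combination_variance(w1, w2, sigma1, sigma2, lam):
    """Variance of w1*eta1 + w2*eta2 for noise terms with correlation lam."""
    return ((w1 * sigma1)**2 + (w2 * sigma2)**2
            + 2.0 * w1 * w2 * lam * sigma1 * sigma2)

# Hypothetical parameter values, chosen only for illustration.
sigma1, sigma2, lam = 1.5, 0.8, 0.6

w1, w2 = merger_weights(sigma1, sigma2, lam)
v_direct = linear_combination_variance(w1, w2, sigma1, sigma2, lam)

b = sigma2 / sigma1
v_formula = sigma1**2 * b**2 * (1.0 - lam**2) / (1.0 + b**2 - 2.0 * lam * b)

print(math.isclose(w1 + w2, 1.0), math.isclose(v_direct, v_formula))
```

Because the weights sum to one, the estimator equals $\alpha$ plus a weighted sum of the two noise terms, which is why its mean-square error is just the variance computed in `linear_combination_variance`.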
We want to derive a condition that indicates when merger is preferred to relative-performance comparison contracts and vice versa.
We shall now need the following results.
Lemma 3. Consider a firm facing a set of signals $Y = (Y_1, \ldots, Y_n)$, and let I(Y) be the
cost-minimizing contract implementing some effort level a based on the vector Y. Let p be
the cost of this contract, and let W(Y) be a sufficient statistic for Y with respect to a. Then
the firm can write a contract that implements a as a function of W(Y) only, rather than
of $Y_1, \ldots, Y_n$, whose expected cost is no greater than p.
For a proof, see Holmström (1982).
Lemma 4. Let X be a normally distributed random variable and let $\epsilon$ be normal with mean
0 and variance $\sigma^2$. Then X is sufficient for $Y = X + \epsilon$ in the Blackwell sense.
By Proposition 13 in Grossman and Hart (1983), then, a contract based on X has a
lower expected cost than one based on Y. This is true for any two normal variates (X, Y)
with the same mean where var(X) < var(Y).
By Lemmas 3 and 4 and our normality assumptions, the compensation for risk in an
incentive contract is monotonically increasing in the variance of the signal. Thus, minimizing
the expected cost of the manager's incentive contract is equivalent to minimizing signal
variance over the two alternatives of relative-performance contracting and merger.
The variance of the signal facing any firm i if firm i does not acquire firm j is given by
(21). The variance facing firm i if firm j is acquired is given by (23). Merger is preferred to
contracting when (21) exceeds (23), or, writing $\sigma_i$ and $\sigma_j$ in place of $\sigma_1$ and $\sigma_2$ so that $b = \sigma_j/\sigma_i$,

$$V_{ij} \equiv \left[\sigma_i^2(1 - \lambda^2)\right] - \frac{\sigma_i^2 b^2 (1 - \lambda^2)}{1 + b^2 - 2\lambda b} = \frac{\sigma_i^2 (1 - \lambda^2)(1 - 2\lambda b)}{1 + b^2 - 2\lambda b} > 0. \tag{24}$$

This is quite intuitive, because for $\lambda \in (-1, 1)$ this is equivalent to

$$2\,\mathrm{cov}(\eta_i, \eta_j) < \mathrm{var}(\eta_i). \tag{25}$$

16 For a proof, see Lehmann (1983).
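The equivalence between a positive merger gain and the covariance condition can be checked numerically. The sketch below assumes that $V_{ij}$ is the variance under relative-performance contracting, $\sigma_i^2(1 - \lambda^2)$, minus the variance under merger, $\sigma_i^2 b^2 (1 - \lambda^2)/(1 + b^2 - 2\lambda b)$, with $b = \sigma_j/\sigma_i$; function names and the grid of parameter values are hypothetical, and the grid is chosen to avoid the boundary $2\lambda b = 1$.

```python
def merger_gain(sigma_i, sigma_j, lam):
    """V_ij: signal variance under contracting minus signal variance under merger."""
    b = sigma_j / sigma_i
    v_contract = sigma_i**2 * (1.0 - lam**2)
    v_merger = sigma_i**2 * b**2 * (1.0 - lam**2) / (1.0 + b**2 - 2.0 * lam * b)
    return v_contract - v_merger

def covariance_condition(sigma_i, sigma_j, lam):
    """Condition (25): 2 cov(eta_i, eta_j) < var(eta_i)."""
    return 2.0 * lam * sigma_i * sigma_j < sigma_i**2

# Hypothetical grid: sigma_i fixed at 1, three values of sigma_j,
# correlations from -0.9 to 0.9 in steps of 0.1.
grid = [(1.0, s, l / 10.0) for s in (0.3, 1.1, 2.7) for l in range(-9, 10)]

agree = all(
    (merger_gain(si, sj, lam) > 0.0) == covariance_condition(si, sj, lam)
    for si, sj, lam in grid
)
print(agree)
```

On this grid merger is preferred exactly when twice the covariance of the two noise terms is below the variance of the acquiring firm's own noise; for a low-variance target (`sigma_j = 0.3`) that holds at every correlation in the grid.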

Further, when agency costs are approximately linear in the variance of the signal, the value
of merger over relative-performance contracting is monotonic in $V_{ij}$.¹⁷ The function $V_{ij}$ is
plotted in Figure 1 for various values of b.
Two results emerge for our parameterization. First, there is a tradeoff between contracting based on relative performance and merger, and the tradeoff is not monotonic in
the correlation between the firms' signals. In particular, the relative value of merger is
increasing as correlation increases from -1 and decreasing in the neighborhood of $\lambda = 0$.
Second, in the region of zero correlation, the relative benefits of merger are a decreasing
function of the variance of the acquired firm's signal (when holding the variance of the
acquiring firm's signal and the covariance between the two signals constant).
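Both results can be read off the closed form of the merger gain. The sketch below assumes $V(\lambda) = \sigma_i^2 (1 - \lambda^2)(1 - 2\lambda b)/(1 + b^2 - 2\lambda b)$ with $b = \sigma_j/\sigma_i > 0$, where $\lambda$ is the correlation between the two signals, and differentiates it at the two correlations of interest.

```latex
\begin{align*}
V(-1) &= 0, \qquad
V'(-1) = \frac{2\sigma_i^2 (1 + 2b)}{(1 + b)^2} > 0
   && \text{(gain rises as correlation increases from } -1\text{)}, \\
V(0) &= \frac{\sigma_i^2}{1 + b^2}, \qquad
V'(0) = -\frac{2 b^3 \sigma_i^2}{(1 + b^2)^2} < 0
   && \text{(gain falls near zero correlation)}, \\
V(0) &= \frac{\sigma_i^4}{\sigma_i^2 + \sigma_j^2}
   \;\Longrightarrow\;
   \frac{\partial V(0)}{\partial \sigma_j^2} = -\frac{\sigma_i^4}{(\sigma_i^2 + \sigma_j^2)^2} < 0
   && \text{(gain falls with the target's variance)}.
\end{align*}
```

Since V starts at zero, rises, and is already falling at zero correlation, it must peak at some negative correlation, which is the non-monotonicity described in the first result.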
FIGURE 1
THE VALUE OF A MERGER AS A FUNCTION OF SIGNAL CORRELATION
[Plot not reproduced: $V_{ij}$ against the signal correlation for various values of b.]

By assuming product j to be merely a signal-producing activity, I have avoided the
issue of collusion between managers. When the compensation of manager i is contingent
on the productivity of firm j and vice versa, the managers of the two firms have an incentive
to collude. On the other hand, collusion is not an issue in the merger case since, after merger,
there is only one manager. Thus, if the ability to collude increases the cost of relative-performance contracting, the relative benefits of merger will rise.
Nevertheless, for managers to collude effectively, they must be able to monitor each
other's efforts, despite being managers of different firms that may well be competitors, despite
the incentive to cheat, and despite the inability of the firms' principals themselves to monitor
the managers effectively. For these reasons I do not fully explore the implications of collusion here.¹⁸

17 $V_{ij}$ was derived under the hypothetical condition that only two firms exist. Nevertheless, the derivation is
perfectly general. Suppose we want to determine whether firms i and j, in a population of N + 2 firms, should
merge. In either case both firms will condition their contracts on the signals generated by the N other firms in the
economy. Let $\eta_N = \{\eta_k,\ k = 1, \ldots, N,\ k \neq i, j\}$. Without merger the two firms, i and j, will have contracts based
on the signals $\{X_i, \eta_j, \eta_N\}$ and $\{X_j, \eta_i, \eta_N\}$, respectively. If the firms merge, the contract will be written on the signal
$\{X_i, X_j, \eta_N\}$. Let $\gamma_i$ be a sufficient statistic for $\{X_i, \eta_N\}$. Then we can write the nonmerger signals as $\{\gamma_i, \eta_j\}$,
$\{\gamma_j, \eta_i\}$, and the merger signal as $\{\gamma_i, \gamma_j\}$. This is equivalent to the case of a two-firm economy in which firm i's
signal is $\gamma_i$ rather than $X_i$ and similarly for j.

7. Conclusions
In this article I have developed a model of the firm that explicitly treats the role of the
manager in the optimization of firm size and level of diversification. Both the manager's
attributes and the incentives he faces are determinants of firm structure.
I showed that diversification alleviates the moral-hazard problem firms face with respect
to their managers because it reduces the amount of risk the manager must bear for incentive
purposes. Assuming that the compensation of managers adjusts so that better managers
receive higher expected compensation, I derived comparative statics for the effect of ability
on diversification and firm size. The model yields a positive monotonic cross sectional
relationship between firm size and diversification and between product line size and total
firm size. These results are consistent with the empirical regularities discussed in the Introduction. Finally, I developed specific implications of the model for optimal diversification
through merger. The value of merger as opposed to relative-performance contracting increases with signal correlation in regions of very low correlation, but decreases with correlation for correlations near zero. If we take the signal on a manager's effort to be the return
to the firm's stock (purged of market effects), this strongly contrasts with the implications
of models, of the sort discussed in Section 1, that explain firm product diversification as a
means of reducing shareholders' risk.
I have assumed that no technological or other complementarities between firms affect
the diversification decision. This assumption enables us to isolate the agency incentive as
one reason that firms diversify. To the extent that firms and managers enjoy a gain from
the agency effects of merger, it is surely only one of many benefits perceived by the merging
parties. But the benefits to diversification that result from alleviating the agency problem
in firms will be realized by diversifying firms regardless of their prime motivation. Thus,
the analysis is entirely complementary to models that focus on benefits to mergers between
firms that are technologically or contractually related. On the other hand, the model's strength
lies in its generation of an incentive for diversification into unrelated products and mergers
of unrelated firms.
18 This view of collusion between managers of different firms is consistent with that of Ramakrishnan and
Thakor (1986), who assume that managers within firms can monitor one another but that managers across firms
cannot.

References
ARON, D.J. "Managerial Ability, Moral Hazard, and the Structure of Business Firms." Ph.D. Dissertation, University of Chicago, 1985.
CLARKE, F.H. AND DARROUGH, M.N. "Optimal Incentive Schemes." Economic Letters, Vol. 5 (1980), pp. 305-310.
DIAMOND, D.W. AND VERRECCHIA, R.E. "Optimal Managerial Contracts and Equilibrium Security Prices." Journal of Finance, Vol. 37 (1982), pp. 221-235.
GORT, M. Diversification and Integration in American Industry. Princeton: Princeton University Press, 1962.
GREEN, J. AND STOKEY, N. "A Comparison of Tournaments and Contracts." Journal of Political Economy, Vol. 91 (1983), pp. 349-364.
GROSSMAN, S.J. AND HART, O.D. "An Analysis of the Principal-Agent Problem." Econometrica, Vol. 51 (1983), pp. 7-45.
HOLMSTRÖM, B. "Moral Hazard and Observability." Bell Journal of Economics, Vol. 10 (1979), pp. 74-91.
HOLMSTRÖM, B. "Moral Hazard in Teams." Bell Journal of Economics, Vol. 13 (1982), pp. 324-340.
IRELAND, N.J. "A Note on Conglomerate Merger and Behavioral Response to Risk." Journal of Industrial Economics, Vol. 31 (1983), pp. 283-289.
LAZEAR, E.P. AND ROSEN, S. "Rank Order Tournaments as Optimum Labor Contracts." Journal of Political Economy, Vol. 89 (1981), pp. 841-864.
LEHMANN, E.L. The Theory of Point Estimation. New York: John Wiley and Sons, 1983.
LUCAS, R.E., JR. "On the Size Distribution of Business Firms." Bell Journal of Economics, Vol. 9 (1978), pp. 508-523.
MARCUS, A.J. "Risk Sharing and the Theory of the Firm." Bell Journal of Economics, Vol. 13 (1982), pp. 369-378.
MIRRLEES, J.A. "The Theory of Moral Hazard and Unobservable Behavior: Part I." Mimeo, Nuffield College, Oxford, October 1975.
RADNER, R. AND ROTHSCHILD, M. "On the Allocation of Effort." Journal of Economic Theory, Vol. 10 (1975), pp. 358-376.
RAMAKRISHNAN, R.T.S. AND THAKOR, A.V. "Incentive Problems, Diversification, and Corporate Mergers." Discussion Paper No. 309, Graduate School of Business, Indiana University, December 1986.
SALTER, M.S. AND WEINHOLD, W.A. Diversification through Acquisition. New York: Free Press, 1979.
SCHERER, F.M. AND RAVENSCRAFT, D. "Growth by Diversification: Entrepreneurial Behavior in Large-Scale United States Enterprises." National Bureau of Economic Research Working Paper No. 113, September 1984.