
Public Finance & Political Economy Reading

Summaries

Michael Best

May 12, 2010


Contents

1 Optimal Income Support
Saez (2002) Optimal Income Transfer Programs

2 Providing Public Goods: Non-profits and Private Provision
Andreoni & Payne (2003) Do Government Grants to Private Charities Crowd Out Giving or Fund-raising?
Besley & Ghatak (2001) Government Versus Private Ownership of Public Goods
Besley & Ghatak (2007) Retailing Public Goods
Payne (1998) Does the Government Crowd-out Private Donations?

3 Public Goods and Public Provision of Private Goods
Atkinson & Stern (1974) Pigou, Taxation and Public Goods
Besley et al. (1999) The Demand for Private Health Insurance
Gahvari & Mattos (2007) Conditional Cash Transfers, Public Provision of Private Goods, and Income Redistribution
Shelton (2007) The Size and Composition of Government Expenditure

4 Public Organisation
Besley & Ghatak (2005) Competition and Incentives with Motivated Agents
Glaeser & Shleifer (2001) Not-for-profit Entrepreneurs
Hart, Shleifer & Vishny (1997) The Proper Scope of Government

5 Reported Income, Tax Evasion and Tax Avoidance
Feldstein (1995) The Effect of Marginal Tax Rates on Taxable Income
Gruber & Saez (2002) The Elasticity of Taxable Income

6 Tax and Labour Supply
Chetty et al. (2010) Adjustment Costs, Firm Responses, and Labor Supply Elasticities
Eissa (1995) Taxation and Labor Supply of Married Women
Eissa & Liebman (1996) Labor Supply Response to the Earned Income Tax Credit
Saez (2009) Do Taxpayers Bunch at Kink Points?

These summaries are meant to be a public good. That said, they are
written as a complement to actually reading the articles, and to aid
re-reading by pointing out the main points. The emphasis is on
methodological contributions, since that’s where I believe the greatest
value-added of these summaries lies. As a result, if these notes are useful at
all, they are likely to be most useful to (potential) applied researchers
rather than policy-minded readers, though policy implications and external
validity of results are also addressed.
Health Warning: these notes are bound to be littered with mistakes, all
of which are my fault, so use at your own peril. If you find mistakes or think
I should add/change anything please let me know <m.c.best@lse.ac.uk>

Chapter 1

Optimal Income Support

1. Optimal Income Support

Emmanuel Saez (2002)


Optimal Income Transfer Programs: Intensive Ver-
sus Extensive Labor Supply Responses
QJE 117 (3) 1039–1073
Lowdown: Optimal income transfer programs have to take both intensive
and extensive labour supply margins into account, especially as the empirical
literature suggests that the extensive elasticity is bigger than the intensive
elasticity for low income households. Saez builds a model of this and cali-
brates it. Surprisingly, it’s not that far from the observed system in the US
for single-parent families.

Why Do We Care?
• Most developed countries have income support programs and they have
generated considerable controversy due to their possible effects on in-
centives to work.

• The empirical literature suggests that it is important to think about


both the extensive and intensive margins in labour supply as estimated
participation elasticities seem to be larger for low income people than
intensive margin elasticities.

• Theory suggests very different responses to Negative Income Tax (NIT)


models giving a guaranteed income level that is taxed away, and Earned
Income Tax Credit (EITC) models which have no guaranteed income
but give essentially negative marginal tax rates at low income levels.

Theory
• There are I + 1 occupations. The unemployed earn w0 and the salaries
for the I other jobs are increasing in i; 0 < w1 < . . . < wI . Each class
of worker pays taxes Ti (which may be negative) so after-tax income
in occupation i is ci = wi − Ti .

• Normalise the population to 1 so the proportion of individuals in
occupation \(i\) is \(h_i\), and the government has to finance expenditure \(H\),
giving a government budget constraint \(\sum_{i=0}^{I} h_i T_i = H\).

• The government’s welfare function gives social weights gi to the various


occupations where the weights gi are decreasing in i if the government
wants to redistribute.

• Extensive Responses only:

– Individuals have skill level \(i \in \{0, 1, \ldots, I\}\) and can only choose
between their job and unemployment, so if we assume away income
effects \(h_i\) depends only on \(c_i - c_0\).
– We are interested in the elasticity
\[ \eta_i = \frac{c_i - c_0}{h_i} \frac{\partial h_i}{\partial (c_i - c_0)} \]
– Proposition 1: at the optimum we have that
\[ \frac{T_i - T_0}{c_i - c_0} = \frac{1}{\eta_i} (1 - g_i) \]
which together with the government budget constraint determines
the tax rates \(T_i\).

• Intensive Responses only:

– Without income effects, if the rewards to occupation \(i\) go down
relative to occupation \(i - 1\) then some people will switch. In this
case we can show that \(h_i\) is a function only of \(c_{i+1} - c_i\) and \(c_i - c_{i-1}\).
– Now we are interested in
\[ \zeta_i = \frac{c_i - c_{i-1}}{h_i} \frac{\partial h_i}{\partial (c_i - c_{i-1})} \]
– Proposition 2: at the optimum we have that
\[ \frac{T_i - T_{i-1}}{c_i - c_{i-1}} = \frac{1}{\zeta_i} \left[ \frac{(1 - g_i) h_i + (1 - g_{i+1}) h_{i+1} + \ldots + (1 - g_I) h_I}{h_i} \right] \]
which together with the government budget constraint determines
the tax rates \(T_i\).

• Both intensive and extensive responses: At the optimum
\[ \frac{T_i - T_{i-1}}{c_i - c_{i-1}} = \frac{1}{\zeta_i h_i} \sum_{j=i}^{I} h_j \left[ 1 - g_j - \eta_j \frac{T_j - T_0}{c_j - c_0} \right] \]
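
To make the first two propositions concrete, here is a minimal Python sketch that evaluates the extensive-only and intensive-only formulas for a toy economy. The function names and all numbers (the weights g, shares h and elasticities) are hypothetical and chosen only for illustration; they are not Saez's calibration. The weights are picked so that they average to 1 across the population.

```python
def participation_tax_rate(g_i, eta_i):
    """Extensive-only optimum (Proposition 1): (T_i - T_0)/(c_i - c_0) = (1 - g_i)/eta_i."""
    return (1.0 - g_i) / eta_i


def intensive_marginal_rate(i, g, h, zeta_i):
    """Intensive-only optimum (Proposition 2) for occupation i >= 1.

    g[j] and h[j] are the welfare weight and population share of occupation j.
    Returns (T_i - T_{i-1}) / (c_i - c_{i-1}).
    """
    upper_tail = sum((1.0 - g[j]) * h[j] for j in range(i, len(h)))
    return upper_tail / (zeta_i * h[i])


# Hypothetical numbers, for illustration only.
g = [1.6, 1.2, 0.9, 0.6]   # welfare weights, decreasing in i, population average of 1
h = [0.1, 0.3, 0.4, 0.2]   # population shares, summing to 1

# Extensive margin only: a weight above 1 on the lowest earners gives a negative rate.
low_wage_rate = participation_tax_rate(g[1], eta_i=0.5)

# Intensive margin only: the rate on the first step of earnings is positive.
bottom_marginal_rate = intensive_marginal_rate(1, g, h, zeta_i=0.25)
```

With these made-up inputs the extensive-only formula subsidises low-paid work (an EITC-like schedule), while the intensive-only formula taxes the first step of earnings (an NIT-like schedule), which is the contrast the section draws.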

Empirical Calibration
• Look at what the optimal tax schedule looks like for various values of
η and intensive elasticity.

• Based on the empirical literature plausible benchmark values are η =


0.5 and intensive elasticity = 0.25 which give a sizeable guaranteed
income, then around 10% tax on the first $4000 earned, then around
60% tax from 4 to 15 thousand and then about 50% for middle and
high income earners.

• This isn’t that far off from the US system for single parent families. It
is not the case for poor families or childless families though.


Limitations and Moving Forward


• Would be cool to try and back out the parameters for the actually
observed welfare systems in the US and other countries.

• Also, would be interesting to see what the implied welfare weights (gi )
are.

• In the model all individuals live by themselves, but in many coun-


tries taxes and labour supply decisions are made by households. What
would happen?

Chapter 2

Providing Public Goods:


Non-profits and Private
Provision

2. Providing Public Goods: Non-profits and Private Provision

James Andreoni & A. Abigail Payne (2003)


Do Government Grants to Private Charities Crowd
Out Giving or Fund-raising?
AER 93(3) 792–812
Lowdown: Crowding out of private giving by government support of char-
ities is a long-standing question of interest. Andreoni & Payne identify an-
other interesting channel: charities will try less hard to get private donations
if they have government support. They find strong evidence for this among
arts organisations but not so strong for social service organisations.

Why Do We Care?
• Whether or not government financing of charities crowds out private
giving is an interesting empirical question. We think it does, but only
partially.

• Andreoni & Payne look at another interesting channel. Will charities


try less hard to raise private funds if they get government support?

A Model
• Let \(x_i\) be an individual’s consumption of the private good and \(y_{ij}\) be \(i\)’s
contribution to charity \(j\). Let \(\theta_j\) be the probability an individual is
solicited by charity \(j\). The costs of fund-raising are \(F_j(\theta_j)\), which is
increasing and convex. Finally, \(G_j\) is the government grants to charity
\(j\). Thus a charity provides services
\[ C_j = \sum_{i=1}^{n} y_{ij} + G_j - F_j(\theta_j) \]

• The quality of the charity is \(L_j \in [0, 1]\) and individuals have a most
preferred quality \(L_i^* \in [0, 1]\), so that for the distance measure \(d(L_i^*, L_j)\)
individuals have preferences \(U_i = u_i(x_i, C_j; l_{ij})\) that are increasing in
\(l_{ij} = 1 - d(L_i^*, L_j)\), with the single crossing property
\[ \frac{\partial u_i(x, C; l_{ij})/\partial C}{\partial u_i(x, C; l_{ij})/\partial x} \ge \frac{\partial u_i(x, C; l_{ik})/\partial C}{\partial u_i(x, C; l_{ik})/\partial x} \quad \text{if } l_{ij} \ge l_{ik} \]
which will mean individuals prefer to give to
a charity that is closer to them and give more to closer charities.

• Solicitations increase the number of givers and also mean that givers
are better matched to charities, which also raises giving. Individuals
maximise their utility given the solicitations, giving equilibrium
contributions of the form \(y_{ij}^* = f_{ij}(\theta_j, \theta_{-j}; G_j)\).

• We can solve the model by backward induction. Assume charities
maximise \(V_j = C_j(\theta_j, \theta_{-j}; G_j) - s_j \theta_j\) where \(s_j\) is the disutility of having
to do fund-raising. This will give best-reply functions \(\theta_j^* = \theta_j^*(\theta_k; G_j)\)
with \(\partial\theta_j^*/\partial\theta_k < 0\) and \(\partial\theta_j^*/\partial G_j \le 0\).

• These yield Proposition 1: As government grants to a charity increase,


fund-raising efforts by that charity will decrease and Proposition 2:
If the government increases its grant to a charity, the total level of
charitable services will always rise, although not by the full amount of
the grant. This is due to a combination of reduced fund-raising and
classic crowding-out.

Data & Methodology


• Data on nonprofit revenues and expenses from federal tax returns filed
by IRS section 501(c)(3) organisations from 1982 to 1998. Constructed
2 unbalanced panels of 1: arts organisations (art museums, performing
arts groups etc.) and 2: social service organisations (family or children,
poor or homeless etc.)

• Good because it’s very detailed and has categories of expenditure


within fund-raising so can test hypotheses about mechanisms.

• To test proposition 1 they estimate
\[ F_{ist} = \alpha_i + \gamma_t + \beta G_{ist} + O_{ist}\eta + Z_{st}\lambda + \varepsilon_{ist} \]
where \(F_{ist}\) is fundraising expenditure by charity \(i\) in year \(t\) and state
\(s\), \(O\) is a vector of revenue/expenditure variables and \(Z\) is a vector of
economic, demographic etc. variables at the state-year level.

• Estimate this by OLS and then by 2SLS.

– Measurement issues will bias OLS:


1. Timing: government funding, private donations and efforts
for fund-raising may not fall within the same year. Should
government funding be lagged?
2. Properly should exclude fund-raising efforts directed at the
government. Will lead to positive bias. Address this by look-
ing at the components of fund-raising expenditure.
3. Matching grants: sometimes government grants are condi-
tional on the charity raising a specific amount from private
giving. Would give a positive bias.
4. Fund-raising expenditures are skewed towards zero so OLS
might not be ideal. Use Tobit to fix this.


– Use Instruments
1. state-level total transfers to nonprofits by state and federal
governments.
2. Whether the area the charity is in has a senator or congressman
on an appropriations committee (rather weak F-tests).
3. Total research funding to the universities in the state from
the National Institutes of Health lagged by one year.
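
The logic of moving from OLS to 2SLS can be sketched on simulated data. The Python below is a stylised illustration only: it uses one regressor and one instrument with no fixed effects or controls, the data are simulated rather than the IRS panel, and every name and parameter value is a hypothetical choice. With a single instrument, 2SLS reduces to the ratio of covariances used here.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, beta=-0.2, seed=0):
    """Simulated data in which an unobserved factor u raises both grants and fund-raising."""
    rng = random.Random(seed)
    z, grants, fundraising = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)            # unobserved, e.g. effort aimed at winning grants
        z_i = rng.gauss(0, 1)          # instrument: shifts grants, unrelated to u
        g_i = z_i + u + rng.gauss(0, 1)
        f_i = beta * g_i + 2 * u + rng.gauss(0, 1)
        z.append(z_i)
        grants.append(g_i)
        fundraising.append(f_i)
    return z, grants, fundraising


z, grants, fundraising = simulate()

# OLS slope: picks up the positive correlation induced by u.
beta_ols = cov(grants, fundraising) / cov(grants, grants)

# IV (one-instrument 2SLS) slope: uses only the variation in grants driven by z.
beta_iv = cov(z, fundraising) / cov(z, grants)
```

In this simulation the true effect of grants on fund-raising is negative, yet the OLS slope comes out positive because of the omitted factor, while the IV slope lands close to the true value. That is the same sign reversal the paper reports, though the mechanism in the real data may of course differ.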

Results
• OLS & Tobit:

– Government funding generally has a positive coefficient! This suggests
that an extra $1000 from the government increases fund-raising
expenditure by $13 for arts organisations and by $7 for social services.

• 2SLS

– Now government funding has large and negative coefficients:
-$265 for arts and -$54 for social services if NIH grants to
universities are used as the instrument, and -$143 for arts and -$19 for
social services if federal transfers to nonprofits are used.

• Separating out the expenditure categories into 1: professional fund-


raising 2: Officer Salaries & 3: Other salaries. Clearly the first category
is the one intended by the theory.

– Professional fund-raising is indeed negative even in the OLS esti-


mates now for both arts and social service organisations.

Limitations & Moving Forward


• Why are the estimates so different when you use the different instru-
ments? The authors suggest that it’s because the NIH instrument is
strong but the other isn’t. An alternative explanation that may be
more plausible is LATE effects. The compliers may be very different
charities for the two instruments. This suggests that there’s important
heterogeneity in the sample that hasn’t been properly addressed.

• Fund-raising activities directed at the government weren’t satisfactorily


addressed, there could be quite a lot of reverse causality in the OLS
estimates, but it’s not clear that the 2SLS estimates will get rid of this.


Timothy Besley & Maitreesh Ghatak (2001)


Government Versus Private Ownership of Public Goods
QJE 116(4) 1343–1372
Lowdown: Grossman-Hart (1986) & Hart-Moore (1990) have shown how
important ownership can be in the private sector. This paper looks at public
goods delivery and shows that it is no longer the case that the most
important investor should be the owner. Now it is the party that values
the project most. This is because the public good element of the project
means that whoever values the project most has high bargaining power in
renegotiations.

Why Do We Care?
• Grossman & Hart (1986) and Hart & Moore (1990) started a whole
literature on the importance of ownership in private firms. Variants of
these insights are relevant in the provision of public goods also.

• The private sector is ever more involved in the provision of public


services.

• Besley & Ghatak provide a model to think about these issues and
discuss applications

Model
• 2 players: \(g\) and \(n\) invest in a project (e.g. improve the quality) but
the investments are specific to the project and lose value if employed
elsewhere. Their investment decisions are the vector \(Y = (y_g, y_n)\).
The benefits from the project are \(b(Y)\) where \(b(y_g, y_n)\) is smooth and
concave, satisfies the Inada conditions and has \(\partial^2 b(y_g, y_n)/\partial y_g \partial y_n \ge 0\)
so that the investments are complements.

• the players i ∈ {g, n} value the project to which they contribute Ci at


θi b(Y ) − Ci .

• However, investment is sunk once made and the owner of the project
has the right to exclude anybody from working on the project. Thus
the timing of the game is

1. g and n decide who should own the project. The owner designs
the project.
2. If there is a partnership, then g chooses yg and n chooses yn which
are sunk.


3. g and n bargain over whether to continue with the project. Trans-


fers are allowed at this stage.

• Ownership is going to matter because it determines the bargaining
status quo payoffs. Let \(B^i(y_g, y_n)\) be the benefit from the project if
bargaining breaks down and \(i\) is the owner, with all the same properties as \(b(Y)\).
The authors’ key assumption is that
\[ b_1(y_g, y_n) \ge B_1^g(y_g, y_n) > B_1^n(y_g, y_n) \quad \forall\, y_n \]
\[ b_2(y_g, y_n) \ge B_2^n(y_g, y_n) > B_2^g(y_g, y_n) \quad \forall\, y_g \]

• Nash bargaining results in the transfer from \(n\) to \(g\) of
\[ t = \arg\max_z \{\theta_n b(Y) - z - \bar{u}_n^i(Y)\}\{\theta_g b(Y) + z - \bar{u}_g^i(Y)\} \]

• Combining this with the fact that \(\bar{u}_g^i(Y) = \theta_g B^i(y_g, y_n)\) and \(\bar{u}_n^i(Y) = \theta_n B^i(y_g, y_n)\),
the players make their investment decisions when \(i\) is the
owner by maximising
\[ v_g^i(y_g, y_n) = \frac{(\theta_n + \theta_g) b(y_g, y_n) + (\theta_g - \theta_n) B^i(y_g, y_n)}{2} - y_g \]
for \(g\) and
\[ v_n^i(y_g, y_n) = \frac{(\theta_n + \theta_g) b(y_g, y_n) + (\theta_n - \theta_g) B^i(y_g, y_n)}{2} - y_n \]
for \(n\). This leads to proposition 1: Under assumption 1, in any Nash
equilibrium, investment levels are below the joint surplus maximising
levels. Giving ownership to the party with the highest valuation
improves investment incentives for both parties and gives the highest
possible level of joint surplus. This is in contrast to the standard GHM
case where ownership should be given to the most needed investor.
This comes from the fact that if they disagree the person who cares
most about the project has a stronger bargaining position, even if they
haven’t done any investing. This strengthens the incentive to invest
for both parties.
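
Proposition 1 can be checked numerically under specific functional forms. The Python sketch below borrows the separable benefit function from the Extensions section that follows and adds one assumption of my own, \(\mu(y) = 2\sqrt{y}\), so that each first-order condition has a closed form. The function names and parameter values are hypothetical.

```python
import math


def investments(owner, theta_g, theta_n, a_g, a_n, lam_g, lam_n):
    """Nash-equilibrium investments (y_g, y_n) when `owner` ('g' or 'n') owns the project.

    Assumes b = a_g*mu(y_g) + a_n*mu(y_n) with mu(y) = 2*sqrt(y). In the
    disagreement payoff B^owner the non-owner's term is scaled by lam < 1.
    """
    keep_g = 1.0 if owner == "g" else lam_g
    keep_n = 1.0 if owner == "n" else lam_n
    k_g = 0.5 * a_g * ((theta_n + theta_g) + (theta_g - theta_n) * keep_g)
    k_n = 0.5 * a_n * ((theta_n + theta_g) + (theta_n - theta_g) * keep_n)
    # First-order condition k * mu'(y) = 1 with mu'(y) = 1/sqrt(y), so y = k**2.
    return k_g ** 2, k_n ** 2


def joint_surplus(y_g, y_n, theta_g, theta_n, a_g, a_n):
    """(theta_g + theta_n) * b(Y) minus the two investment costs."""
    benefit = a_g * 2 * math.sqrt(y_g) + a_n * 2 * math.sqrt(y_n)
    return (theta_g + theta_n) * benefit - y_g - y_n


# Hypothetical parameters: g values the project more than n does.
params = dict(theta_g=1.0, theta_n=0.5, a_g=1.0, a_n=1.0)
y_under_g = investments("g", lam_g=0.5, lam_n=0.5, **params)
y_under_n = investments("n", lam_g=0.5, lam_n=0.5, **params)
surplus_under_g = joint_surplus(*y_under_g, **params)
surplus_under_n = joint_surplus(*y_under_n, **params)
```

With these values both parties invest more when the high-valuation party g is the owner, and joint surplus is higher, even though the two investments are equally productive. Both investments remain below the surplus-maximising level in either case.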

Extensions
Simplify the model a little to have \(b(Y) = a_g \mu(y_g) + a_n \mu(y_n)\), \(B^n(y_g, y_n) =
\lambda_g a_g \mu(y_g) + a_n \mu(y_n)\) and \(B^g(y_g, y_n) = a_g \mu(y_g) + \lambda_n a_n \mu(y_n)\).

1. Joint Ownership


• The project cannot go ahead if the parties fail to agree: \(B^g(y_g, y_n) =
B^n(y_g, y_n) = 0\). This leads to Proposition 2: under assumption 1
the investment incentive of the more caring party will be lower and
that of the less caring party will be higher under joint ownership
compared with the two pure forms of ownership. If the invest-
ment of the less caring party is relatively more important, joint
ownership may yield a higher level of joint surplus.

2. Contracting with a for-Profit Firm

• Let θn = 0. Now, if there is a small cost c > 0 of completing


the project, the for-Profit firm won’t want to complete it if it
is the owner and it disagrees with the government. This gives
Proposition 3: under assumption 1 the for-Profit firm should be
the owner if its investment is sufficiently important. Otherwise
the higher valuation party should optimally be the owner.

3. Outcomes with a Private Good Component

• For simplicity have a single investor \(n\), and now \(y_n\) generates a
public good component \(a_n \mu(y_n)\) but also a private good component
\(\beta(y_n)\). This leads to Proposition 4: under assumption 1,
optimal ownership will depend on the relative importance of the
private good and the public good component. If the public good
is sufficiently important, the high valuation party should be the
owner. Otherwise, the investor should be the owner.

4. Perfect Substitutes

• This allows us to think about crowd-out of public provision by


private provision. Let b(yg , yn ) = η(yg + yn ), B g (yg , yn ) = η(yg +
λyn ) and B n (yg , yn ) = η(λyg + yn ). Then we have Proposition 5:
under assumption 1 there is only 1 investor in equilibrium, and
this party should also be the owner. The owner-investor should be
the party that values the project most highly.

5. Ideology

• The owner can choose the ideological component of the design


after investments are sunk (e.g. the religious content of educa-
tion). Let r ∈ 0, 1 represent ideology. Assume g prefers r = 0
and n prefers r = 1. Now the valuation of the project by g is
{qr + (1 − r)}θg instead of θg and n’s valuation is {q(1 − r) + r}θn
instead of θn where q captures the congruence of preferences. This
gives Proposition 6: under assumption 1 if the owner chooses the


project design, her investment incentive is higher while that of the


other party is lower. If preference differences are large enough,
joint surplus need no longer be higher when the high valuation
party is the owner.

Limitations and Moving Forward


• Would be great to have a strategic government vying for investment
from mission-driven NGOs, e.g. a corrupt government trying to offload
its public service delivery onto the private sector or foreign NGOs. In
a long-run setting, which actually delivers higher surplus?

• Endogenise project choice in a world with multiple public goods and a


government budget constraint.

• Interesting to look at repeated interactions in which contracts can be


renewed and NGOs compete for government partnerships. Link with
the theory of regulation.


Timothy Besley & Maitreesh Ghatak (2007)


Retailing Public Goods: The Economics of Corpo-
rate Social Responsibility
JPubE 91(9) 1645–1663
Lowdown: CSR is essentially bundling a public good with a private good.
The equilibrium level of donations is the same as in the private donations
equilibrium (Bergstrom et al. 1986). Provision by an opportunistic
government with good monitoring of politicians and a majority of the population
caring leads to overprovision but dominates CSR; if only a minority cares
or monitoring of politicians is bad, then CSR dominates.

Why Do We Care?
• CSR is essentially providing a public good (or reducing a public bad)
alongside the production of a private good.

• This parallel means that the theory of the private provision of public
goods is relevant. Indeed it turns out that the level of public goods
provided is exactly the same as the standard voluntary contribution
equilibrium for public goods.

• The authors also compare private provision through CSR with


non-profits and public provision.

Theory
• 1 Public good and 2 private goods, one of which is not produced (en-
dowed) and is the numeraire. The level of the produced private good
is x and the level of the public good is g. The N potential consumers
of x each get utility b > 0 from consuming it.

• A “caring” subset of size \(n\) of the consumers value the public good
according to the concave function \(f(g)\). Preferences are quasilinear:
\(V^i(p, g) = b - p + \gamma^i f(g)\) where \(\gamma^i \in \{0, 1\}\), with \(\gamma^i = 1\) if the consumer
is caring.

• There is free entry and there are \(S > 3\) potential producers who can
produce the private good at cost \(c + \alpha\theta\) where \(\theta \ge 0\) is the amount of
public good they produce alongside the good.

• Timing: Firms announce their (pj , θj ) pairs. Consumers then decide


which firm to purchase from or not to purchase at all.

• Consumer Equilibrium: Given the firms’ declarations \(\{(p_j, \theta_j)\}_{j=1}^{S}\),
consumers set \(\delta_{ij} = 1\) if consumer \(i\) shops at firm \(j \in S \cup \{0\}\) and 0
otherwise, where firm 0 denotes not buying at all, to build up the
vector \(\delta_i \equiv (\delta_{i0}, \delta_{i1}, \ldots, \delta_{iS})\). Then the consumer equilibrium \(\{\delta_i^*\}_{i=1}^{N}\) is
characterised by:
\[ \delta_{ij}\left(\{(p_j, \theta_j)\}_{j=1}^{S}\right) = 1 \text{ iff } j = \arg\max_{j \in S \cup 0} V^i\left(p_j,\; \theta_j + \sum_{k \ne i} \sum_{s \in S} \theta_s \delta_{ks}^*\right) \quad \forall i = 1, \ldots, N \]

• Producer Equilibrium: Let \(s_j(\{(p_s, \theta_s)\}_{s=1}^{S}) = \sum_{i=1}^{N} \delta_{ij}^*(\{(p_s, \theta_s)\}_{s=1}^{S})\)
be the number of consumers who shop at firm \(j\) in equilibrium. Now
a producer equilibrium is \(\{(p_j^*, \theta_j^*)\}_{j=1}^{S}\) which satisfies
\[ (p_j^*, \theta_j^*) = \arg\max_{(p_j, \theta_j)} (p_j - c - \alpha\theta_j)\, s_j\left(\{(p_s^*, \theta_s^*)_{s \ne j}, (p_j, \theta_j)\}\right) \quad \forall j \in S \]

• This all gives Proposition 1: The unique equilibrium is characterised by
two pairs of price and public goods contributions \((p_n^*, \theta_n^*)\) and \((p_c^*, \theta_c^*)\),
the first for neutral consumers and the second for caring consumers,
such that \(p_n^* = c\) and \(\theta_n^* = 0\), and \(p_c^* = c + \alpha\theta_c^*\) and \(f'(n\theta_c^*) = \alpha\).

• Which has some interesting features

1. the level of public good under CSR is the same as if the caring
consumers make private voluntary contributions (as in Bergstrom
et al. 1986)
2. In the CSR equilibrium caring consumers strictly prefer buying
the ethical version to switching to the neutral version (because
by switching they reduce the amount of the public good) so the
equilibrium is robust to valuations of the public good being private
information.
3. Raising CSR standards among ethical firms can create a Pareto
improvement but not the first best outcome.
The highest level of CSR that makes the caring consumer indifferent
between the ethical and the neutral version is \(\bar\theta_c\) satisfying
\(f(n\bar\theta_c) - f((n - 1)\bar\theta_c) - \alpha\bar\theta_c = 0\), so \(\bar\theta_c > \theta_c^*\), which is a Pareto
improvement, but \(\bar\theta_c\) is below the first best outcome \(\theta_c^{**}\) satisfying
\(n f'(n\theta_c^{**}) = \alpha\). Also, \(\bar\theta_c\) isn’t sustainable as an equilibrium as
firms will undercut this level and offer the caring consumers a
lower price and a lower contribution, which attracts them.
4. An increase in the provision of the public good (e.g. by the
government) will crowd out competitive CSR provision.
5. Adding warm glow utility \(v(\theta)\) to the caring consumers’
preferences will leave the results unchanged.

• This all assumes the firms can credibly promise to provide public goods.
Assume that the firms have an infinite horizon and use their
reputation to give their commitments credibility. In each period the
firm can either provide the \(\theta\) they promised or cheat and provide none.
If caught cheating they’re punished forever. If they’re honest they get
\[ \Pi = (p_c - c - \alpha\theta_c) + \beta\Pi = \frac{p_c - c - \alpha\theta_c}{1 - \beta} \]
If instead the firm cheats it is caught with probability \(q\), so it gets
\[ \hat\Pi = p_c - c + (1 - q)\beta\Pi \]
so they will be honest if \(\Pi \ge \hat\Pi\), or \(p_c \ge c + \phi(q, \beta)\alpha\theta_c\) where
\(\phi(q, \beta) = \frac{1 - \beta + q\beta}{q\beta} > 1\). The CSR firms earn a rent to keep them honest. This
gives Proposition 2: The optimal sustainable level of CSR when
consumers cannot perfectly monitor whether the firm delivers the promised
level of the public good is given by \(f'(n\hat\theta_c) = \phi(q, \beta)\alpha\) which is lower
than with perfect monitoring.
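
The ranking of the four CSR levels discussed above, and the indifference condition behind \(\phi(q, \beta)\), can be verified for one concrete case. The Python sketch below assumes \(f(g) = 2\sqrt{g}\), which is my choice and not the paper's, so that every level has a closed form. Function names and parameter values are hypothetical.

```python
import math


def phi(q, beta):
    """Honesty premium phi(q, beta) = (1 - beta + q*beta) / (q*beta)."""
    return (1 - beta + q * beta) / (q * beta)


def csr_levels(n, alpha, q, beta):
    """Per-consumer CSR levels under the assumed f(g) = 2*sqrt(g), f'(g) = 1/sqrt(g)."""
    reputation = 1 / (n * (phi(q, beta) * alpha) ** 2)    # f'(n*theta) = phi*alpha
    competitive = 1 / (n * alpha ** 2)                    # f'(n*theta) = alpha
    # f(n*theta) - f((n-1)*theta) - alpha*theta = 0 solved for theta
    indifference = 4 * (math.sqrt(n) - math.sqrt(n - 1)) ** 2 / alpha ** 2
    first_best = n / alpha ** 2                           # n*f'(n*theta) = alpha
    return reputation, competitive, indifference, first_best


def honest_value(p, c, alpha, theta, beta):
    """Present value of always delivering the promised theta."""
    return (p - c - alpha * theta) / (1 - beta)


def cheating_value(p, c, alpha, theta, q, beta):
    """Value of cheating once: save alpha*theta now, survive with probability 1 - q."""
    return (p - c) + (1 - q) * beta * honest_value(p, c, alpha, theta, beta)


# Hypothetical parameter values.
n, alpha, q, beta, c = 10, 0.5, 0.5, 0.9, 1.0
levels = csr_levels(n, alpha, q, beta)
theta_hat = levels[0]

# Lowest price at which the firm prefers honesty; at this price it is indifferent.
p_min = c + phi(q, beta) * alpha * theta_hat
gap = honest_value(p_min, c, alpha, theta_hat, beta) - cheating_value(
    p_min, c, alpha, theta_hat, q, beta
)
```

For these values the levels are ordered as the text states: reputation-constrained CSR is below the competitive level, which is below the indifference level, which is below the first best. The gap between honest and cheating payoffs at `p_min` is zero up to rounding.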

Comparing Institutions
1. Government vs. CSR

• Uniform regulations for all firms:
Proposition 3: Suppose that \(b > c\). Then a small uniform regulation
on the level of \(\theta\) has two effects: (i) it leaves the total contribution
unchanged; (ii) it leads to redistribution of contributions from
caring to neutral consumers.
Proposition 4: Let \(\hat\theta\) solve \(n f'(N\hat\theta) = \alpha\). Then if \(b \ge c + \alpha\hat\theta\) the
first-best level of public goods can be achieved by a uniform regulation
making firms contribute \(\hat\theta\).
• Government failure: Assume that the government can provide
the public good and does so to please the majority. Define
\(\pi = n/N\). Then if \(\pi < 1/2\), \(G_g^* = 0\), and if \(\pi \ge 1/2\), \(N f'(G_g^*) = \alpha\),
which exceeds the surplus-maximising level.
Proposition 5: if \(\pi < 1/2\) CSR generates a Pareto improvement.
If \(\pi \ge 1/2\), government provision leads to overprovision with
respect to the first best but higher surplus than CSR so long as \(N\) is
higher than some critical value greater than 1.
Suppose politicians try to divert the budget. Politicians with an
infinite horizon earn a wage \(w\) and pick the level of public good
to please the majority. A politician could cheat and consume the whole tax
revenue as perks. If he is caught he gets 0 forever. The same
logic as before says that the politician will be honest as long as
\(w \ge \frac{1 - \beta}{\beta q_g} \alpha G\) where \(q_g\) is the probability of getting caught.
Proposition 6: Suppose that government and corporations are
opportunistic. If \(\pi < 1/2\) then reputation-enforced CSR generates a
Pareto improvement. However, if \(\pi \ge 1/2\) then: (i) if monitoring
of government is sufficiently good (\(q_g > \bar{q}_g\)) then government
provision will lead to overprovision but higher surplus unless \(N\) is
small; (ii) if monitoring of government is at some intermediate level
(\(\underline{q}_g \le q_g \le \bar{q}_g\)) then government provision dominates CSR; (iii) if
monitoring of government is poor (\(q_g < \underline{q}_g\)) then CSR dominates
government provision.

2. Non-profits vs. CSR

• Note that the equilibrium with private donations to nonprofits
is the same as the CSR equilibrium as long as the unit cost of
provision is the same. However, there might be complementarities
between the private good production and the public good (e.g.
it’s easy for Nike to reduce child labour by just not employing
children. Much harder for a nonprofit).
• If we add in a non-distribution constraint (Hansmann, 1980; Glaeser
& Shleifer, 2001) then an opportunistic nonprofit manager will
have an identical outcome to opportunistic CSR if he can appro-
priate all the profits (γ = 1) and α = αnp . However, if qnp = q and
αnp = α but nonprofit managers can only get part of the profits
(γ < 1), nonprofits deliver higher levels of the public good.

Limitations & Moving Forward


• There are many other interesting dimensions of CSR, such as its use as a means
to attract and retain talent. Will Goldman Sachs & BP have more difficulty now
that they’re pariahs? Compare this with Google’s “Don’t be evil”.

• How much do we care about public goods like less child labour on the
other side of the world?

• Can firms offer both products at the same time? Nestle has both kinds
of coffee. More generally, would be nice to try and model the producer
side’s competition a bit more. For example, I suspect that CSR is
more prevalent in markets with a smallish number of producers where
reputation effects are big.


A. Abigail Payne (1998)


Does the Government Crowd-out Private Donations?
New Evidence From a Sample of Non-Profit Firms
JPubE 69(3) 323–345
Lowdown: Theory gives a variety of plausible stories about how much
crowd-out of private donations we should expect from government giving.
These range from none at all to total crowd out. Payne uses a panel of
charities to look at this question. Her favourite estimates suggest crowd-out
is about -0.5

Why Do We Care?
• There are various theories about how much crowd-out of private do-
nations we should expect. Purely altruistic theories (givers only care
about the public good provided) suggest complete crowd-out whereas
purely egotistic models (giving purely driven by warm glow from giv-
ing) suggest no crowd-out at all.

• The empirical evidence on this isn’t quite satisfactory. Payne’s contri-


bution is her empirical methodology. She uses a panel of charities (so
giving can be directly linked to the recipients) so she can use fixed ef-
fects and she also instruments for government grants to overcome some
endogeneity issues with OLS.

Theory
• Payne uses a model with 3 stages.

1. The government sets its grants in a political economy process like


Besley & Coate (1997)
2. Individuals take government grants and each others’ donations as
given and choose their levels of donations
3. Firms receive donations and supply the charitable good.

• This suggests estimating a 2-equation recursive model (which she only
sort-of does) with government grants determined in the first equation
and individual decisions in the second equation.

Data & Methodology


• The data is from the federal tax returns filed by IRS 501(c)(3) organi-
sations from 1982 to 1992. Payne looks at organisations that are crime


or disaster related, employment or youth related, food or shelters and


human services. She constructs a panel from the IRS data.

• She estimates
\[ D_{ijt} = \alpha + \beta\, \mathit{Gov}_{ijt} + \gamma Z_{jt} + \epsilon_{it} \]
where \(D_{ijt}\) is the real private donations received by non-profit \(i\) in
state \(j\) at time \(t\), \(\mathit{Gov}_{ijt}\) is the government grants received and \(Z_{jt}\) is
the vector of political and economic measures for the state-year the
non-profit is in.

• Using OLS raises omitted variable and simultaneity concerns

– Omitted variables include local fluctuations in demand that aren’t


absorbed in the state-level indicators but are correlated with gov-
ernment grants. Trying to address this by interacting the first 3
digits of the zip code with the year dummies doesn’t change the
estimates much.
– Simultaneity seems to be a bigger issue as government grants and
private donations may be jointly determined rather than private
donations responding to government grants as the model and OLS
assume. To address this Payne instruments for government grants
with instruments from 2 groups
1. government transfer payments to individuals
2. government transfer payments to non-profits

Results
• OLS produces a very very small, but significantly negative coefficient
suggesting very little crowd out

• 2SLS produces a range of estimates but almost all are around -0.5

Limitations & Moving Forward


• The power behind the 2SLS estimates seems to be driven by the individual
transfers, not the total non-profit transfers. I worry a bit about the
interpretation. Who are the compliers here? Individual transfers are
things like retirement & disability benefits, unemployment insurance
etc. It seems to me that these are more likely to be closely related to
employment, food & shelters and that kind of organisation rather than
say crime or immigration type charities. Does this mean that there is
actually crowd in for the other kinds of charities?

Chapter 3

Public Goods and Public Provision of Private Goods

3. Public Goods and Public Provision of Private Goods

Anthony B. Atkinson & Nicholas H. Stern (1974)


Pigou, Taxation and Public Goods
REStud 41(1) 119–128
Lowdown: The famous Samuelson condition for the efficient supply of
public goods assumes lump-sum taxes are possible. If this isn’t true and
public goods have to be financed using distortionary taxation, we need to
know whether the private marginal utility of income α ≶ λ, the marginal
cost of public funds (MCPF). Pigou intuitively claims that the distortion
from taxation means that α < λ. He could be wrong if the public good is
complementary with taxed private goods and/or if an exogenous rise in
income would reduce tax revenue (e.g. if taxed goods are inferior or
subsidised factors are normal).

Why do we Care?
• Samuelson proved that the efficient allocation of public goods satisfies
$\sum MRS = MRT$ when we can use lump sum transfers to finance
public good production.

• When we can’t and we have to use distortionary taxation instead,
Pigou’s intuition was that the extra distortionary cost of taxation
would mean that using the Samuelson condition would lead to
overproduction.

• Dasgupta & Stiglitz said that this may not be true and actually it may
be the other way round: we could get underprovision.

• This paper cleans up the debate and shows the conditions under which
each is true.

Theory
• $h$ identical households maximise $U(x, e)$ s.t. $q \cdot x = 0$, where $x$
is the vector of consumption of $n$ private goods, $q$ is the vector of
consumer prices and $e$ denotes the supply of public goods. The households
get indirect utility $V(q, e)$

• Production technology is given by $G(X, e) = 0$ and is CRS. Perfect
competition means that $G_k$ is proportional to $p_k$, the producer price of
good $k$. The tax on good $k$ is $t_k = q_k - p_k$

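• The government maximises social welfare $hV(q, e)$ giving the FOC for $e$
$$h\frac{\partial V}{\partial e} - \lambda\left[\sum_{i=1}^{n} G_i \frac{\partial X_i}{\partial e} + G_e\right] = 0$$
which, taking good 1 to be the numeraire, can be rewritten as
$$\frac{G_e}{G_1} = \frac{\alpha}{\lambda}\,\frac{h\,\partial V/\partial e}{\alpha} - \sum_{i=1}^{n}(q_i - t_i)\frac{\partial X_i}{\partial e}$$
$$MRT = \frac{\alpha}{\lambda}\sum MRS + \frac{\partial}{\partial e}\left[\sum_{i=1}^{n} t_i X_i\right]$$
where $\alpha$ is the private MU of income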

• The second term on the rhs is the effect on tax revenue from substi-
tutability or complementarity of private goods with the public good.
Ignore this since it’s been done by Diamond & Mirrlees

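• Now the question boils down to α ≶ λ? Pigou claims distortion means
that $\lambda > \alpha$. To see why he could be wrong, look at the FOCs for
the taxes:
$$h\frac{\partial V}{\partial q_k} = \lambda\left(\sum_{i=1}^{n} G_i\frac{\partial X_i}{\partial q_k}\right) = \lambda\,\frac{\partial}{\partial t_k}\left(\sum_{i=1}^{n} p_i X_i\right)$$
which, using $V_k = -\alpha x_k$, $\partial(\sum p_i X_i)/\partial t_k + \partial(\sum t_i X_i)/\partial t_k = 0$ and the
Slutsky decomposition gives
$$\frac{\alpha}{\lambda} = \frac{\frac{\partial}{\partial t_k}\left(\sum_{i=1}^{n} t_i X_i\right)}{X_k} = 1 - \sum_{i=1}^{n} t_i\frac{\partial X_i}{\partial I} + \sum_{i=1}^{n} t_i (S_{ik}/X_k)$$
where $S_{ik}$ is the Slutsky term (the first part of the compensated
elasticity) and $I$ is income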

• This means that α ≶ λ depends on 2 things

1. the distortionary effect (the last term): which is the bit Pigou
was talking about
2. the revenue effect (the middle term): If $\sum t_i\,\partial X_i/\partial I$ is
positive (extra income raises tax revenue) then Pigou is right, but
if it’s negative he could be wrong. He could be wrong if the taxed
goods are inferior or if factors that are subsidised are normal.
This makes the substitution away from taxed goods weaker and
could lead to α > λ
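
A quick numerical sketch of the decomposition α/λ = 1 − Σ t_i ∂X_i/∂I + Σ t_i S_ik/X_k may help fix ideas. The function name and all input values are hypothetical illustrations (a single taxed good with a negative own Slutsky term), not numbers from the paper.

```python
# Numerical sketch of alpha/lambda = 1 - sum_i t_i dX_i/dI + sum_i t_i S_ik / X_k.
# All inputs are hypothetical illustrations, not estimates from the paper.


def alpha_over_lambda(taxes, income_effects, slutsky_k, x_k):
    """Return (ratio, revenue_term, distortion_term) for taxed good k.

    taxes          -- t_i, tax on each good
    income_effects -- dX_i/dI, response of demand for good i to income
    slutsky_k      -- S_ik, compensated response of good i to the price of k
    x_k            -- X_k, aggregate demand for good k
    """
    revenue_term = -sum(t * dx for t, dx in zip(taxes, income_effects))
    distortion_term = sum(t * s for t, s in zip(taxes, slutsky_k)) / x_k
    return 1.0 + revenue_term + distortion_term, revenue_term, distortion_term


# Case 1: the taxed good is normal, so extra income raises tax revenue.
normal_ratio, _, _ = alpha_over_lambda([0.2], [0.5], [-1.0], 2.0)

# Case 2: the taxed good is strongly inferior, so extra income lowers revenue.
inferior_ratio, _, _ = alpha_over_lambda([0.2], [-2.0], [-1.0], 2.0)

print(f"normal taxed good:   alpha/lambda = {normal_ratio:.2f}")
print(f"inferior taxed good: alpha/lambda = {inferior_ratio:.2f}")
```

With a normal taxed good both terms push α/λ below one (Pigou's case); with a sufficiently inferior taxed good the revenue effect outweighs the distortion and α/λ exceeds one.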

Conclusion
The Samuelson rule may not always be the appropriate comparison in the
provision of public goods. The Samuelson condition $\sum MRS$ could
understate the marginal benefit if

1. the public good is complementary with taxed private goods

2. an exogenous rise in income would reduce the tax take
($\partial(\sum t_i X_i)/\partial I < 0$)


Timothy Besley, John Hall & Ian Preston (1999)


The Demand for Private Health Insurance: Do Waiting Lists Matter?
JPubE 72(2) 155–181
Lowdown: Health care is free in the UK but a significant number of people
still get private health insurance. The waiting times associated with NHS
care may be a reason why as those with higher incomes may prefer to pay to
receive higher quality (shorter waiting times) health care. This is consistent
with a Besley & Coate (1991) style model and data on waiting times and
demand for private health insurance.

Why Do We Care?
• Despite health care being free in the UK through the NHS, there are
still a significant number of people who purchase private insurance.
Why?

• Besley et al show that it is associated with longer waiting times. This is
consistent with a Besley & Coate (1991) style model of public provision
of an indivisible good.

Theory
• Individuals face probability θ of becoming sick. Treatment comes in
varying quality levels $q \in [\underline{q}, \bar{q}]$. The government provides
health care of quality $Q$ financed by taxation.

• A well individual gets utility from income $U(y)$ where income has
continuous support on $y \in [\underline{y}, \bar{y}]$. Sick individuals have
utility $u(q, y)$ with the feature that $u_{qy}(\cdot) \ge 0$, ensuring that
treatment quality is a normal good.

• Individuals with private insurance pay a premium π. They will clearly
pick the best possible treatment $q = \bar{q}$. Allowing for some “loading”
in the insurance premium, $\pi = \beta\theta p\bar{q}$, where β < 1 is a subsidy
and β > 1 gives some profits to the insurer (or covers admin costs). This
means an ill, insured individual gets utility $u(\bar{q}, y - \beta\theta p\bar{q})$

• An individual will choose to become privately insured if his expected
utility from being insured is higher than from remaining uninsured:
$$V^I(\theta, p, \bar{q}, y, \beta) = \theta u(\bar{q}, y - \beta\theta p\bar{q}) + (1-\theta)U(y - \beta\theta p\bar{q}) \ge V^P(\theta, Q, y) = \theta u(Q, y) + (1-\theta)U(y)$$


• This means that if there is an interior income level ŷ such that someone
with that income is indifferent, all those with higher income will choose
to privately insure and that ŷ is non-decreasing in β and Q.
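
The threshold result can be sketched numerically. The functional forms below are assumptions made purely for illustration (they are not the authors'): U(y) = ln(y) when well and u(q, y) = ln(y) − (q̄ − q) when sick, so that u_qy = 0 satisfies the weak condition u_qy ≥ 0. Under these forms the gain from insuring is ln(1 − π/y) + θ(q̄ − Q), which is increasing in income, so a single crossing ŷ exists and can be found by bisection. All parameter values and function names are hypothetical.

```python
import math

# Sketch of the income threshold y_hat above which people insure privately.
# Functional forms are assumptions made for illustration only:
#   U(y) = ln(y) when well, u(q, y) = ln(y) - (q_bar - q) when sick.


def gain_from_insuring(y, theta, p, q_bar, big_q, beta):
    """V^I - V^P for an individual with income y."""
    premium = beta * theta * p * q_bar
    v_insured = theta * math.log(y - premium) + (1 - theta) * math.log(y - premium)
    v_public = theta * (math.log(y) - (q_bar - big_q)) + (1 - theta) * math.log(y)
    return v_insured - v_public


def threshold_income(theta, p, q_bar, big_q, beta, y_max=1e6, tol=1e-10):
    """Find y_hat with V^I = V^P by bisection (the gain is increasing in y)."""
    premium = beta * theta * p * q_bar
    lo, hi = premium * (1 + 1e-9), y_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gain_from_insuring(mid, theta, p, q_bar, big_q, beta) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


baseline = threshold_income(theta=0.1, p=1.0, q_bar=2.0, big_q=1.0, beta=1.2)
higher_loading = threshold_income(theta=0.1, p=1.0, q_bar=2.0, big_q=1.0, beta=1.5)
better_public = threshold_income(theta=0.1, p=1.0, q_bar=2.0, big_q=1.5, beta=1.2)

print(f"baseline y_hat:            {baseline:.3f}")
print(f"higher loading (beta up):  {higher_loading:.3f}")
print(f"better public care (Q up): {better_public:.3f}")
```

In this sketch a higher loading factor or a better public service both raise ŷ, so fewer people opt out of public provision, in line with the comparative statics stated above.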

Empirics
• Data from the British Social Attitudes survey that asks about health,
and from the NHS regional trends data.

• Approach 1: Probit

– Let $m_{ijt}$ be 1 if individual $i$ in regional health authority $j$ in
year $t$ buys private insurance and 0 otherwise. The probit model is
$m_{ijt} = I(m^*_{ijt} > 0)$ and
$m^*_{ijt} = \alpha X_{ijt} + \beta Q_{jt} + \gamma Y_{ijt} + \delta_t + \phi_j + \varepsilon_{ijt}$
with $\varepsilon_{ijt}$ normally distributed, $X_{ijt}$ representing individual
characteristics, $Q_{jt}$ regional health authority public health provision
indicators, $Y_{ijt}$ occupational dummy variables and $\delta_t$ and $\phi_j$
fixed effects.
– This gives a positive and significant coefficient on waiting lists
but not on any of the other public health characteristics.

• Approach 2: “more sophisticated”

– Let $h_{ijt}$ be a dummy for having individually purchased private
insurance, $H_{ijt}$ a dummy for employer provided insurance and $E_{ijt}$ a
dummy for being offered employer insurance. Then we model this
as $h_{ijt} = I(h^*_{ijt} > 0, H_{ijt} = 0)$, $H_{ijt} = 1(H^*_{ijt} > 0, E_{ijt} = 0)$ and
$E_{ijt} = 1(E^*_{ijt} > 0)$. Unfortunately $E_{ijt}$ isn’t observed, only
actually having employer health insurance is observed, so a more
reduced form version is
– $h_{ijt} = 1(h^*_{ijt} > 0, H^{**}_{ijt} \le 0)$ and $H_{ijt} = 1(H^{**}_{ijt} > 0)$ where
$$h^*_{ijt} = \alpha X_{ijt} + \beta Q_{jt} + \delta_t + \phi_j + u_{ijt}$$
$$H^{**}_{ijt} = A X_{ijt} + B Q_{jt} + C Y_{ijt} + D_t + F_j + U_{ijt}$$

where crucially, the $Y_{ijt}$ are occupation dummies that only enter
into the second equation and identify the system. This assumes
that occupations determine the employer’s demand for insurance
for his employees but don’t affect the employee’s demand for
health insurance. This is a little sketchy as occupational choice
may be endogenous to some individual characteristics like risk
aversion that affect individual demand for insurance. The authors
argue that the broadness of the occupational categories makes this
less of a concern. I’m not sure.

– This produces similar results for individually purchased insurance.
It’s not easy to interpret the results for the second equation as it
is an amalgam of the demands of the individual and the employer
for health insurance.
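
The latent-index logic of Approach 1 can be sketched in a few lines. The coefficient values and the helper names below are hypothetical and chosen only to show how a positive waiting-list coefficient translates into a higher probability of buying private insurance; they are not the paper's estimates.

```python
import math

# Latent-index probit sketch for Approach 1. Coefficients are hypothetical.


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_private_insurance(index_other, beta_wait, waiting_list):
    """P(m = 1) = P(m* > 0) = Phi(index), with m* = index + eps, eps ~ N(0, 1).

    index_other  -- everything in the index except the waiting-list term
    beta_wait    -- assumed coefficient on the waiting-list measure
    waiting_list -- regional waiting-list measure (an element of Q_jt)
    """
    return normal_cdf(index_other + beta_wait * waiting_list)


short_wait = prob_private_insurance(-1.5, 0.05, 5.0)
long_wait = prob_private_insurance(-1.5, 0.05, 15.0)
print(f"P(insured | short list) = {short_wait:.3f}")
print(f"P(insured | long list)  = {long_wait:.3f}")
```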

Limitations & Moving Forward


• The probit specification is quite restrictive. Why don’t any of the other
characteristics of public health care seem to matter?


Firouz Gahvari & Enlinson Mattos (2007)


Conditional Cash Transfers, Public Provision of Pri-
vate Goods, and Income Redistribution
AER 97(1) 491–502
Lowdown: Because of information problems, targeted redistribution is dif-
ficult. Many contributions came up with mechanisms that screen out the
undeserving (adverse selection). Besley & Coate (1991) do this in the con-
text of public provision of an indivisible good that comes in different qualities
and each individual can only consume one kind. This is a second-best solu-
tion because to satisfy the rich’s IC constraint you have to give the poor an
inefficient variant of the good. Gahvari & Mattos show that you can reach
a first-best solution by linking acceptance of the publicly provided good to
a cash transfer, essentially allowing targeted lump-sum transfers.

Summary
• Information asymmetries often place constraints on government redis-
tribution to needy groups as others face an incentive to imitate needy
groups in order to benefit from government transfers.

• Self targeting mechanisms (essentially adverse selection mechanisms)
get the “undeserving” to voluntarily opt out of the government
transfer somehow. Besley & Coate (1991) achieve this through the public
provision of an indivisible good that can be provided at various quality
levels. Each individual can only consume one quality level and they
can’t be “topped up” with other variants. If the good is a normal good
the rich will want a higher quality level and if the government picks the
low quality level right, the rich will voluntarily opt out of the publicly
provided good and purchase their own preferred variant.

• Gahvari & Mattos note that this is a second-best solution since there
is a deadweight loss from the fact that the publicly provided good may
not be provided at the poor’s preferred quality level (since it has to
satisfy the downward IC constraint for the rich). Their contribution is
to show that the first best can be reached by combining the Besley-Coate
mechanism with a cash transfer that is linked to the acceptance of the
publicly provided good. Essentially, this allows targeted lump-sum
transfers which can compensate the poor for the inefficient quality of
government provided goods without violating the rich’s IC constraint.

• They also show that under some circumstances, this enhanced mecha-
nism can induce greater redistribution to the poor (but in others, it is
the opposite).


Cameron A. Shelton (2007)


The Size and Composition of Government Expendi-
ture
JPubE 91(11-12) 2230–2260
Lowdown: There are loads of theories of government expenditure, but
they’re usually tested one at a time. Shelton does a kind of meta analysis
throwing everything on the rhs to see what’s robust. Openness and bigger
government go together, but it’s not because of social insurance against
increased exposure to external risk. Richer countries have bigger
governments because they have more old people.

Why do we Care?
• There are many theories about the determination of government spend-
ing. Some are based on demand for government goods or transfers
driven by demographics, ethnic fragmentation or trade-openness for
example. Supply side theories focus on the political economy of pol-
icy formation and look at electoral rules, the type of government or
political participation for example.

• Tests of these theories usually test one variable at a time, but the
explanatory variables are very correlated so it’s difficult to establish
any kind of general results. Shelton uses everything at the same time
to look for consistent patterns.

Data & Methodology


• Uses data from the IMF’s Government Financial Statistics so he has
central and local govt expenditure in a variety of different spending
categories.

• Uses a random-effects panel specification. Tradeoff between unobserved
heterogeneity (argues for using fixed effects) and measurement
error (argues for using more between variation). Compromise is to use
RE and 5-year averages to minimise unobserved heterogeneity problems
(fine as long as measurement error isn’t auto-correlated).

Theory & Results


• Openness

– Cameron (1978) showed that open countries had bigger governments.
Rodrik (1998) showed this in a bigger sample of countries and
suggests that the explanation is insurance against increased
external risk.
– Openness & government size are correlated. However, especially in
LDCs the increase is in categories we wouldn’t think of as social
insurance.
– This could be because trade increases income volatility and it’s
hard to cut government spending in bad times so you get a com-
mon pool problem and thus higher spending.

• Country Size & Fragmentation

– Alesina & Wacziarg (1998) note that larger countries have smaller
governments. They explain this as sharing public goods lowers
per-capita costs and larger countries have more heterogeneous
preferences over public goods provision and so provide less. East-
erly & Levine (1997) find a negative correlation between fragmen-
tation and public goods spending. They think diverse preferences
lead to disagreement and so low provision.
– Central government expenditure does decrease, but this is par-
tially offset by an increase in local government spending. The
same pattern is observed for fragmentation.

• Income

– Wagner’s law is that richer countries have bigger governments.
Things like education and culture are luxury goods. Richer economies
are more complex and so need better regulation.
– Richer countries do spend more, but it’s because they have more
old people.

• Income Inequality

– Meltzer & Richard (1981) use the median voter theorem to show
that more unequal countries should redistribute more.
– More income inequality does increase direct transfers but doesn’t
seem to have any effect on other public goods with progressive
benefits.

• Political Rights

– Benabou (1996) shows that if wealthier citizens are better
represented in politics, then the gap between the mean and the median
income will exaggerate the extent of redistribution we should
expect. However, Mulligan, Gil & Sala-i-Martin (2002, 2004) find
that controlling for income, inequality & demographics, government
type has no effect on social security expenditures.
– More political rights do increase transfers.
– This seems to contradict Mulligan, Gil & Sala-i-Martin. This is
likely because of the different samples. Shelton’s sample contains
more within variation and is more heavily skewed towards rich
countries so he’s mostly picking up within variation in rich
countries while Mulligan, Gil & Sala-i-Martin use a cross section
of 65 countries so they only have between variation.

• Institutions

– There are contradictory hypotheses about the effect of majoritarian
vs. proportional voting rules. Persson & Tabellini (1999) look
at presidential vs. parliamentary systems. Their hypothesis is
that presidential systems will have less redistribution and public
goods.
– Majoritarian governments spend less across the board than pro-
portional ones. This is true in both parliamentary and presiden-
tial systems.

Chapter 4

Public Organisation

4. Public Organisation

Timothy Besley & Maitreesh Ghatak (2005)


Competition and Incentives with Motivated Agents
AER 95(3) 616–636
Lowdown: Intrinsic motivation of principals and agents can affect the op-
timal design of organisations and contracts. Intrinsic motivation of the prin-
cipal and the agent can reduce the necessary bonus payments if they are
able to match. Matched, motivated agents work harder and receive smaller
bonuses. When the mission-oriented and profit-oriented sectors compete for
workers this matching effect is joined by an outside option effect increasing
the bonus payments if there is full employment in the for-profit sector.

Why do we Care?
• Profit isn’t the only thing that motivates people in organisations. Many
organisations try to cohere around a mission.
• This is especially relevant in the provision of collective goods where
people will get utility from working in that industry. When there are
such motivated agents, matching of agents and principals will become
important in organisational design (mission choice) and contract design
(strength of monetary incentives).

The Basic Model


• An organisation is a risk-neutral agent and a risk-neutral principal who
carry out a project which can have a high outcome $Y^H$ or a low outcome
$Y^L$. The probability of the high outcome is the effort $e$ chosen by the
worker at cost $c(e) = e^2/2$. Effort is unobservable so not contractible
and the agent can’t post an effort bond so there’s limited liability
meaning that the agent needs a minimum consumption of $\underline{w} \ge 0$ in
each period.
• 3 types of principals $i \in \{0, 1, 2\}$ and agents $j \in \{0, 1, 2\}$.
Successful projects give the principal of type $i$ a payoff of $\pi_i > 0$.
Type 0 principals only care about profit so $\pi_0$ is just profits whereas
$\pi_1 = \pi_2 = \hat{\pi}$ contains some non-pecuniary benefits. Similarly,
agents get payoffs from project success of
$$\theta_{ij} = \begin{cases} 0 & i = 0 \text{ and/or } j = 0 \\ \underline{\theta} & i \in \{1, 2\},\ j \in \{1, 2\},\ i \ne j \\ \bar{\theta} & i \in \{1, 2\},\ j \in \{1, 2\},\ i = j \end{cases}$$
where the authors assume that 1: $\max\{\pi_0, \hat{\pi} + \bar{\theta}\} < 1$ so that
effort is interior in all possible matches and that 2:
$\frac{1}{4}[\min\{\pi_0, \hat{\pi}\}]^2 - \underline{w} > 0$ so that the principal’s and
the agents’ payoffs are non-negative in any possible match.

• Clearly the first-best contract with contractible effort will have effort
$e = \pi_i + \theta_{ij}$ and hence expected joint surplus $\frac{1}{2}(\pi_i + \theta_{ij})^2$.

• The optimal second-best contract of base pay $w$ and a bonus for
success $b$ is given by Proposition 1: Suppose assumptions 1 & 2 hold. An
optimal contract $(b^*_{ij}, w^*_{ij})$ between a principal of type $i$ and an
agent of type $j$ given a reservation payoff of $\bar{u}_j \in [0, \bar{v}_{ij}]$
exists and has the features that (a) the fixed wage is set at the
subsistence level $w^*_{ij} = \underline{w}$; (b) the bonus payment is
$$b^*_{ij} = \begin{cases} \max\{0, (\pi_i - \theta_{ij})/2\} & \text{if } \bar{u}_j \in [0, \underline{v}_{ij}] \\ \sqrt{2(\bar{u}_j - \underline{w})} - \theta_{ij} & \text{if } \bar{u}_j \in [\underline{v}_{ij}, \bar{v}_{ij}] \end{cases};$$
and (c) the optimal effort solves $e^*_{ij} = b^*_{ij} + \theta_{ij}$. So we see that
if the agent is more motivated than the principal and the outside option is
low, the incentive payment is zero. Also, even when the outside option
is low, the incentive payments are smaller if the agent is motivated and
matched with a motivated principal.

• This means that if the outside option is the same for workers in all
sectors, in the mission oriented sector (i = 1, 2), effort is higher and
bonus payments are lower if the agent’s type is the same as that of the
principal so that within the mission-oriented sector this matching will
mean that incentive payments and effort are negatively correlated.
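
The contract in Proposition 1 can be sketched numerically. The parameter values are hypothetical, and the lower threshold on the reservation payoff is reconstructed here (as an assumption of this sketch) as the agent's payoff under the unconstrained bonus, w + (b + θ)²/2, which follows from the agent choosing e = b + θ with effort cost e²/2.

```python
import math

# Sketch of the second-best contract in Proposition 1 (quadratic effort cost).
# Parameter values are hypothetical. The threshold v_low is reconstructed here
# as the agent's payoff under the unconstrained bonus, w_min + (b + theta)^2 / 2.


def optimal_contract(pi_i, theta_ij, u_bar, w_min=0.0):
    """Return (bonus, effort) for principal payoff pi_i and agent payoff theta_ij."""
    unconstrained_bonus = max(0.0, (pi_i - theta_ij) / 2.0)
    v_low = w_min + (unconstrained_bonus + theta_ij) ** 2 / 2.0
    if u_bar <= v_low:
        bonus = unconstrained_bonus                          # participation slack
    else:
        bonus = math.sqrt(2.0 * (u_bar - w_min)) - theta_ij  # participation binds
    return bonus, bonus + theta_ij                           # effort e* = b* + theta


PI_HAT, THETA_HIGH, THETA_LOW = 0.4, 0.3, 0.1

matched = optimal_contract(PI_HAT, THETA_HIGH, u_bar=0.0)
mismatched = optimal_contract(PI_HAT, THETA_LOW, u_bar=0.0)
very_motivated = optimal_contract(0.2, THETA_HIGH, u_bar=0.0)

print(f"matched:        bonus {matched[0]:.2f}, effort {matched[1]:.2f}")
print(f"mismatched:     bonus {mismatched[0]:.2f}, effort {mismatched[1]:.2f}")
print(f"very motivated: bonus {very_motivated[0]:.2f}, effort {very_motivated[1]:.2f}")
```

With these numbers the matched pair has the lower bonus and the higher effort, which is the negative correlation between incentive pay and effort described above.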

Competition & Matching


• Let $A_p = \{p_0, p_1, p_2\}$ be the set of types of principals and
$A_a = \{a_0, a_1, a_2\}$ be the set of types of the agents. Then a matching
process is summarised by the matching function
$\mu : A_p \cup A_a \to A_p \cup A_a$ such that (a) $\mu(p_i) \in A_a \cup \{p_i\}\ \forall p_i \in A_p$;
(b) $\mu(a_j) \in A_p \cup \{a_j\}\ \forall a_j \in A_a$; and (c) $\mu(p_i) = a_j$ if and only if
$\mu(a_j) = p_i$ for all $(p_i, a_j) \in A_p \times A_a$. Also let $n^p_i$ and $n^a_j$ be
the number of principals of type $i$ and agents of type $j$. Assume that
$n^a_1 = n^p_1$ and $n^a_2 = n^p_2$ for simplicity.

• Proposition 2: Consider a matching $\mu$ and associated optimal contracts
$(w^*_{ij}, b^*_{ij})$ for $i = 0, 1, 2$ and $j = 0, 1, 2$. Then this matching is
stable only if $\mu(p_i) = a_i$ for $i = 0, 1, 2$.

• Let $\xi \equiv \max\{2\bar{\theta}, \hat{\pi} + \bar{\theta}\}$ and assume that
$\bar{\theta} + \hat{\pi} \ge \pi_0$ so that mission-oriented production is viable.
We have Proposition 3: Suppose that $n^a_0 < n^p_0$ (full employment in the
profit-oriented sector). Then the following match is stable: $\mu(a_j) = p_j$
for $j = 0, 1, 2$ and the associated optimal contracts have the features that
(a) the fixed wage is set at subsistence $w^*_{jj} = \underline{w}$ for $j = 0, 1, 2$;
(b) the bonus payment in the mission-oriented sector is
$b^*_{11} = b^*_{22} = \frac{1}{2}\max\left\{\xi, \pi_0 + \sqrt{\pi_0^2 - 4\underline{w}}\right\} - \bar{\theta}$ and
the bonus payment in the profit-oriented sector is
$b^*_{00} = \frac{\pi_0 + \sqrt{\pi_0^2 - 4\underline{w}}}{2}$;
and (c) the optimal effort level solves $e^*_{jj} = b^*_{jj} + \bar{\theta}$ for $j = 1, 2$ and
$e^*_{00} = b^*_{00}$. Competition and incentives interact. There’s a matching
effect that raises productivity in the mission-oriented sector by allowing
lower incentive payments when types are matched. There’s also an
outside option effect that comes from the full employment. A motivated
worker’s outside option is to go work in the for-profit sector and
if this sector is profitable enough ($\pi_0 + \sqrt{\pi_0^2 - 4\underline{w}} > \xi$) the
participation constraint will bind so that the mission-oriented sector will
have to use more incentive pay.

• Proposition 4: Suppose that $n^a_0 > n^p_0$ (unemployment in the
profit-oriented sector). Then the following match is stable: $\mu(a_j) = p_j$ for
$j = 0, 1, 2$ and the associated optimal contracts have the features that
(a) the fixed wage is set at subsistence $w^*_{jj} = \underline{w}$ for $j = 0, 1, 2$;
(b) the bonus payment in the mission-oriented sector is
$b^*_{11} = b^*_{22} = \frac{\xi}{2} - \bar{\theta}$ and the bonus payment in the
profit-oriented sector is $b^*_{00} = \frac{\pi_0}{2}$; and (c) the
optimal effort level solves $e^*_{jj} = b^*_{jj} + \bar{\theta}$ for $j = 1, 2$ and
$e^*_{00} = b^*_{00}$. Now there’s only the matching effect. Since there’s
unemployment in the for-profit sector the outside option of workers in the
mission-oriented sector is unemployment and so the principals get all of
the rents.
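
The difference between Propositions 3 and 4 can be seen in a small numerical sketch. All parameter values are hypothetical (chosen to satisfy assumptions 1 and 2 and the viability condition), and the sketch takes ξ = max{2θ̄, π̂ + θ̄}, the definition under which the mission-sector bonus ξ/2 − θ̄ is never negative; that reading is an assumption of this example.

```python
import math

# Sketch of the matched-sector bonuses in Propositions 3 and 4.
# Parameter values are hypothetical; xi = max{2*theta_bar, pi_hat + theta_bar}
# is an assumption of this sketch.


def sector_bonuses(pi_0, pi_hat, theta_bar, w_min, full_employment):
    """Return (mission_bonus, profit_bonus) under assortative matching."""
    xi = max(2.0 * theta_bar, pi_hat + theta_bar)
    if full_employment:
        # Profit-oriented principals compete for scarce agents.
        outside = pi_0 + math.sqrt(pi_0 ** 2 - 4.0 * w_min)
        return 0.5 * max(xi, outside) - theta_bar, 0.5 * outside
    # Unemployed agents have no outside option, so principals keep the rents.
    return 0.5 * xi - theta_bar, 0.5 * pi_0


PARAMS = dict(pi_0=0.6, pi_hat=0.4, theta_bar=0.3, w_min=0.02)
b_mission_full, b_profit_full = sector_bonuses(full_employment=True, **PARAMS)
b_mission_unemp, b_profit_unemp = sector_bonuses(full_employment=False, **PARAMS)

print(f"full employment: mission {b_mission_full:.3f}, profit {b_profit_full:.3f}")
print(f"unemployment:    mission {b_mission_unemp:.3f}, profit {b_profit_unemp:.3f}")
```

Here the for-profit sector is profitable enough that the participation constraint binds under full employment, so the mission-oriented bonus rises relative to the unemployment case (the outside option effect), while matched motivated agents still need a smaller bonus than for-profit workers (the matching effect).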

Limitations and Moving Forward


• Would be cool to investigate the dynamics more. For example, just
how do organisations with motivated agents respond to change? Look
at political appointees versus bureaucrats for example?

• What about expanding the domain of permissible contracts? (would
require more than two project outcomes)

• Adverse Selection. What happens when types aren’t observable? How
do mission-oriented principals attract motivated workers?


Edward L. Glaeser & Andrei Shleifer (2001)


Not-for-profit Entrepreneurs
JPubE 81(1) 99–115
Lowdown: Many non-profit organisations were founded by entrepreneurs.
Why do they choose non-profit status when it limits the monetary returns
to them? The non-distribution constraint softens incentives to cut quality
and so non-profit status is a way of committing to higher quality. Donors
who care about quality will also be attracted to donate to non-profits but
not to for-profits.

Why do we Care?
• Many nonprofits are started by entrepreneurs. Why would they ever
choose nonprofit status since this limits their ability to claim profits?
• By credibly committing not to pursue profits too narrowly, they soften
incentives and can commit to higher quality which they may prefer
when quality is an important dimension of the product.

The Model
• At time 0 the entrepreneur decides on non-profit or for-profit status
for the firm. At time 1 the entrepreneur sells 1 unit of the good to a
competitive market of consumers. He collects the price P and agrees
to deliver a product of non-verifiable quality q at time 2. At time 2
the firm produces the good and delivers it to consumers. Crucially,
consumers can’t go to court to complain about shoddy quality because
quality is non-verifiable
• Consumers are willing to pay $P = z - m(q^* - \hat{q})$ for the good where
$z$, $m$ and $q^*$ are constants and $\hat{q}$ is the consumers’ expectation of
the quality. The firm’s cash profits are $P - c(q)$. If the firm is
for-profit, this is the entrepreneur’s income. If the firm is not-for-profit,
the entrepreneur is forced to spend these revenues on perquisites $Z$. Also,
entrepreneurs of both types face a cost of shirking on quality of $b(q^* - q)$.
• When the entrepreneur chooses $q$ he has already collected $P$ so a
for-profit entrepreneur maximises $P - c(q) - b(q^* - q)$ so the for-profit
quality satisfies $c'(q_f) = b$. The non-profit entrepreneur can’t keep the
profits, he must spend them on perquisites so $Z = P - c(q)$ which he
only values at $d \cdot Z$ where $d < 1$ so he maximises
$d \cdot [P - c(q)] - b(q^* - q)$ with FOC $d \cdot c'(q_n) = b$ which combined with
convexity of $c(\cdot)$ yields Proposition 1: Non-verifiable quality of the
non-profit firm exceeds that of the for-profit firm.

• The entrepreneur will choose the not-for-profit status if it gives him
higher returns:
$$(b + m)(q_n - q_f) - (c(q_n) - c(q_f)) > (1 - d)(z - m(q^* - q_n) - c(q_n))$$
Proposition 2: There is a unique value of $m$, $m^*$, above which
entrepreneurs choose non-profit status and below which all entrepreneurs
choose for-profit status. If $c(q) = c + \tilde{c}(q)$ then as profits rise (from
movements in the constants of the demand or cost functions) $m^*$ increases
and non-profits are less attractive. If for-profit firms make positive income
for the entrepreneur, then for small $d$ for-profit firms will continue to
dominate not-for-profits. If $c''(q)^2 > m c'''$ then $m^*$ is decreasing in $b$.

• Not-for-profits often rely on donations. Adjust the model so that in
period 0 the entrepreneur chooses the profit status. In period 1 the
donor decides on general donations level. In period 2 the entrepreneur
sells the good to the consumer at price P . In period 3 the entrepreneur
chooses the quality level q and delivers the good. Also, now have V (Z)
be concave for the entrepreneur.

• Assume the donor cares about quality only so he maximises
$(1 - t)(Y - D) + F(q)$ so the FOC is that $\frac{dq}{dD} \cdot F'(q) = 1 - t$.
In for-profit firms quality satisfies $c'(q) = b$ so the donations have no
effect on quality. In the non-profit, the first order condition is
$c'(q) \cdot V'(Z) = b$ where $Z = P + D - c(q)$. Playing about with this yields
Proposition 3: Quality rises with the level of donations in non-profits and
Proposition 4: Donations rise with the tax rate and decline 1-for-1 as the
firm gets other sources of income.

• Take-homes: We will expect to see non-profits in activities where

1. There are abundant opportunities for ex-post quality reductions
after the good has been purchased
2. The activity is not too profitable or relies on charitable donations
3. Altruism and public-spiritedness are important motivators of en-
trepreneurs
4. It is costly for consumers or employees to change the firms they
deal with.
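
The quality FOCs and the status-choice condition from the model above can be sketched with an assumed quadratic cost c(q) = q²/2, so that c'(q) = q. This functional form and all parameter values are hypothetical illustrations rather than anything taken from the paper.

```python
# Sketch of the for-profit vs non-profit choice with an assumed quadratic cost
# c(q) = q^2 / 2, so c'(q) = q. All parameter values are hypothetical.


def cost(q):
    return q ** 2 / 2.0


def qualities(b, d):
    """Quality from the FOCs c'(q_f) = b and d * c'(q_n) = b."""
    return b, b / d          # (q_f, q_n); q_n > q_f because d < 1


def prefers_nonprofit(m, b, d, z, q_star):
    """True if non-profit status gives the entrepreneur the higher return."""
    q_f, q_n = qualities(b, d)
    gain = (b + m) * (q_n - q_f) - (cost(q_n) - cost(q_f))
    forgone = (1.0 - d) * (z - m * (q_star - q_n) - cost(q_n))
    return gain > forgone


q_f, q_n = qualities(b=1.0, d=0.5)
print(f"for-profit quality {q_f}, non-profit quality {q_n}")
for m in (2.0, 4.0):
    chosen = "non-profit" if prefers_nonprofit(m, 1.0, 0.5, 10.0, 3.0) else "for-profit"
    print(f"m = {m}: entrepreneur chooses {chosen}")
```

With these numbers the switch happens at m = 3: when consumers value expected quality strongly enough, the commitment value of non-profit status outweighs the entrepreneur's discount on perquisites.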

Limitations and Moving Forward


• Ghatak & Mueller: not-for-profit status also motivates workers who
might shirk.

• Interesting to think more about donations. What is the relationship
between our willingness to pay as donors and the profitability of the
activity? For example, the ability of recipients in LDCs to pay for
vaccines is very low (and there are externalities). Donors’ willingness to
pay will depend on more than just the quality, they’ll care about the
quantity too.


Oliver Hart, Andrei Shleifer & Robert W. Vishny (1997)

The Proper Scope of Government: Theory and an Application to Prisons
QJE 112(4) 1127–1161
Lowdown: Which services should be provided publicly and which privately?
Competition isn’t the only thing to consider. Incompleteness of contracts,
especially with regard to quality, is important too. The model suggests that
costs will always be lower under private provision but quality may be either
higher or lower. Private provision dominates when the quality deterioration
from cost-cutting is not too severe, but public provision dominates when it
is severe and incentives to invest in quality are strong enough in the
public sector.

Why do we Care?
• Public provision versus privatisation is a big debate.

• The arguments are often made on the basis of competition. It’s not
clear that’s the right approach: quality is very important too.
Contractual incompleteness offers insights into when and where we should
expect costs and quality to be higher or lower under public or private
provision

Theory
• The assets or facility is F (e.g. the prison buildings) which is run by
a manager M . There is also a bureaucrat/politician G who contracts
with M to provide the good at price P0 . They cannot contract ex-ante
on the quality of the good.

• M may find ways to keep down costs and/or to improve quality. Crucially
there are spillovers: Cost reducing initiatives also hurt quality.
The good (as modified by the manager) delivers a benefit to society of
$B = B_0 - b(e) + \beta(i)$ and costs the manager $C = C_0 - c(e)$ to produce,
where $e$ is his investment in cost reduction and $i$ his investment in
quality improvement. However, investments are costly so his overall cost is
$C_0 - c(e) + e + i$.

• Timing: At time 0 M & G contract and choose the ownership of F by
one of them. At time 1/2 M chooses e and i. At time 1 there is
renegotiation and then the modified good is supplied if there is agreement
or the basic good is supplied if there is disagreement.

• Assume Nash bargaining means that the gains from renegotiation are
split equally. If $M$ walks away, he takes with him the fraction $\lambda$ of
the cost and quality improvements that is embodied in his human capital,
so that $G$ can only realise a fraction $(1 - \lambda)$ of the net social gains
$-b(e) + c(e) + \beta(i)$ when $F$ is publicly owned and the manager leaves.

– If $F$ is privately owned and negotiations break down, the cost
innovation is implemented but the quality innovation isn’t since
it requires $G$’s approval. $G$’s default payoff is $B_0 - P_0 - b(e)$ and
$M$’s default payoff is $P_0 - C_0 + c(e) - e - i$
– If $F$ is publicly owned both the cost and quality innovations are
implemented but $G$ has to replace $M$ so he only gets $(1 - \lambda)$ of the
gains from the innovations. $G$’s default payoff is
$B_0 - P_0 + (1 - \lambda)[-b(e) + c(e) + \beta(i)]$ and $M$’s default payoff is
$P_0 - C_0 - e - i$.

• The first best maximises the welfare gains
$\max_{e,i}\ -b(e) + c(e) + \beta(i) - e - i$
which has FOCs $-b'(e^*) + c'(e^*) = 1$ and $\beta'(i^*) = 1$.

• Private ownership equilibrium. The parties split the gains from
renegotiation using the default payoffs above so that the payoffs are
$$U_G = B_0 - P_0 + \frac{\beta(i)}{2} - b(e)$$
$$U_M = P_0 - C_0 + \frac{\beta(i)}{2} + c(e) - e - i$$
$M$ will maximise his payoff with FOCs $c'(e_M) = 1$ and $\frac{1}{2}\beta'(i_M) = 1$.
Since he ignores the quality deterioration from cost reductions, he cuts
costs too much and because he only gets half the benefits from quality
improvements he under-invests in quality.

• Public ownership equilibrium. Same renegotiation but with different
default payoffs yields
$$U_G = B_0 - P_0 + \left(1 - \frac{\lambda}{2}\right)[-b(e) + c(e) + \beta(i)]$$
$$U_M = P_0 - C_0 + \frac{\lambda}{2}[-b(e) + c(e) + \beta(i)] - e - i$$
Again $M$ will maximise his payoff but now the FOCs are that
$\frac{\lambda}{2}(-b'(e_G) + c'(e_G)) = 1$ and $\frac{\lambda}{2}\beta'(i_G) = 1$. Now the
manager does take into consideration the quality deterioration but he
doesn’t get the full benefit, so this blunts his incentives. He also has
even less incentive to invest in quality.

• Proposition 1: $e_M > e^*$, $i_M < i^*$ and Proposition 2: $e_G < e^*$,
$i_G \le i_M < i^*$


• Proposition 3: (1) Suppose b(e) is replaced with θb(e) then for
sufficiently small θ private ownership dominates public ownership. (2)
Suppose b(e) is replaced with θb(e) and c(e) is replaced with φc(e) then for
sufficiently small θ and φ and λ < 1 private ownership dominates again.
i.e. private ownership is unambiguously better if quality deteriorations
from cost cutting are small and/or if cost cutting opportunities are
few.
• Proposition 4: (1) Suppose b(e) = c(e) − σd(e) then for small σ and
λ close to 1, public ownership dominates. (2) Suppose b(e) = c(e) −
σd(e) and β(i) is replaced by τβ(i) then for small enough τ, σ public
ownership is better. i.e. public ownership is better when the adverse
effect of cost reduction on quality is large and either government employees
have strong incentives to invest in quality (λ large) or quality
improvements are not that important.
• Proposition 5: Costs are always lower under private ownership. Qual-
ity may be higher or lower.
• Government Failures

– Corruption: If the politician is able to sell off the service and


extract a bribe, he will always want to, even when privatisation
isn’t desirable.
– Patronage: If the politician uses the public provision to provide
jobs or high wages to favoured constituents then there will be too
little privatisation.
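
To make Propositions 1 and 2 concrete, the sketch below solves the three sets of first-order conditions from the bullets above by bisection. The functional forms (c(e) = 4√e, b(e) = e/2, β(i) = 4√i) and the value λ = 0.8 are illustrative assumptions chosen for this example, not taken from the paper.

```python
def solve_decreasing(f, target, lo=1e-9, hi=1e6, iters=200):
    """Bisection: find x with f(x) = target, for f strictly decreasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid  # f is still above the target, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed marginal functions (hypothetical forms, not from the paper)
def c_prime(e):
    return 2.0 / e ** 0.5  # marginal cost saving, from c(e) = 4*sqrt(e)


def b_prime(e):
    return 0.5  # marginal quality damage, from b(e) = e/2


def beta_prime(i):
    return 2.0 / i ** 0.5  # marginal quality gain, from beta(i) = 4*sqrt(i)


def equilibria(lam):
    """Return (e, i) under the first best, private and public ownership."""
    def net(e):
        return c_prime(e) - b_prime(e)

    return {
        # -b'(e) + c'(e) = 1 and beta'(i) = 1
        "first_best": (solve_decreasing(net, 1.0),
                       solve_decreasing(beta_prime, 1.0)),
        # c'(e) = 1 and (1/2) beta'(i) = 1
        "private": (solve_decreasing(c_prime, 1.0),
                    solve_decreasing(lambda i: 0.5 * beta_prime(i), 1.0)),
        # (lam/2)(-b'(e) + c'(e)) = 1 and (lam/2) beta'(i) = 1
        "public": (solve_decreasing(lambda e: 0.5 * lam * net(e), 1.0),
                   solve_decreasing(lambda i: 0.5 * lam * beta_prime(i), 1.0)),
    }


results = equilibria(lam=0.8)
for regime, (e, i) in results.items():
    print(f"{regime:>10}: e = {e:.3f}, i = {i:.3f}")
```

Under these assumed forms the ranking is e_M > e* > e_G and i_G ≤ i_M < i*, which is the pattern the two propositions describe.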

Prisons and Other Applications


• Prisons: It does seem that costs are lower in private prisons. Also, it
seems that quality may be lower. Probably best to keep prisons public, especially
high-security ones where the quality of guards is important.
• Garbage collection and weapons procurement should be private: garbage
collection quality improvements are trivial. Easy to contract on quality
for weapons but quality innovation is important and private employees
have stronger incentives to improve quality.
• Foreign policy should be public: quality is so vague here and private
buyers would have to pay up front for the service an amount that
equalled the cost of future potential holdups. These holdups are nuclear
attacks so the cost is pretty big!
• Schools are less obvious. Damage to quality is probably quite large, but
innovation is important. Competition induced by vouchers is probably a
good thing since quality is easily observable.


• Health care also less obvious. A lot like education but less easy to
observe quality and change suppliers so maybe good to have public
provision.

• Police and armed forces. Huge holdup problem so don’t privatise.

Limitations and Moving Forward


• Besley & Ghatak (2001) include public good aspect of the good being
provided and get slightly different results that depend on who cares
more about the public good, not who does the investing so much.

• What about competition among providers for contracts/renewal of contracts? There may also be reputation effects from providing good quality (career-concern type things).

Chapter 5

Reported Income, Tax Evasion and Tax Avoidance


Martin Feldstein (1995)


The Effect of Marginal Tax Rates on Taxable Income: A Panel Study of the 1986 Tax Reform Act
JPE 103(3) 551–572
Lowdown: First paper to think seriously about the elasticity of taxable
income (ETI) with respect to taxes. Using tax returns data, tries to identify
this elasticity, gets ε ∈ (1, 3). Identification is pretty terrible but it was the
first attempt.

Why Do We Care?
• The Elasticity of Taxable Income (ETI) is of central normative impor-
tance in public finance as it is the critical parameter for thinking about
the revenue generated by income tax systems.

• This is the first paper to really think about this.

• Identification is pretty bad but hey, it’s the first paper.

Data
• look at the Tax Reform Act (TRA) of 1986 that changed marginal tax
rates by different amounts at different income levels.

• uses a panel of data straight from the tax authorities so we can look
at the entire effect on reported income not just simple labour supply
responses.

• Do simple diff in diff on 3 groups, defined by the marginal tax rate they paid in 1985:

– medium: 22-38%, experienced a 12.2% change in the net-of-tax rate (1 − marginal tax rate)
– high: 42-45%, experienced a 25.6% change in the net-of-tax rate
– highest: 49-50%, experienced a 42.2% change in the net-of-tax rate

Results
• high vs. medium → elasticity ∈ (1.04, 1.10)

• highest vs. high → elasticity ∈ (1.48, 3.05)

• highest vs. medium → elasticity ∈ (1.25, 2.14)
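
Each elasticity above is a ratio of two differences-in-differences: the between-group difference in the percentage change of taxable income, divided by the between-group difference in the percentage change of the tax price. A minimal sketch follows. It reads the 12.2%, 25.6% and 42.2% figures listed under Data as changes in the net-of-tax rate (an assumption of this sketch), and the income changes are hypothetical numbers chosen only to show the arithmetic.

```python
def did_elasticity(dy_treat, dy_control, dntr_treat, dntr_control):
    """Difference-in-differences elasticity.

    dy_*   : percentage change in taxable income for each group
    dntr_* : percentage change in the net-of-tax rate for each group
    """
    return (dy_treat - dy_control) / (dntr_treat - dntr_control)


# Percentage changes in the net-of-tax rate, as listed in the notes
NET_OF_TAX_CHANGE = {"medium": 12.2, "high": 25.6, "highest": 42.2}

# Hypothetical percentage changes in taxable income (NOT Feldstein's data)
INCOME_CHANGE = {"medium": 6.0, "high": 20.0}

elasticity = did_elasticity(
    INCOME_CHANGE["high"], INCOME_CHANGE["medium"],
    NET_OF_TAX_CHANGE["high"], NET_OF_TAX_CHANGE["medium"],
)
print(f"high vs. medium elasticity (hypothetical incomes): {elasticity:.2f}")
```

The control group's change stands in for what would have happened to the treated group's income without the larger tax cut, which is why the biases listed under Limitations all work through that comparison.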


Limitations and Moving Forward


• Inequality increasing for non-tax reasons → upward bias

• Mean reversion: some rich people in 1985 were only rich in 1985 by
chance → downward bias if estimated from tax cut at the top

• Diff-in-diff assumes elasticities same for all income levels. If ETI is


increasing in income → upward bias.

• Tiny sample. He doesn’t even report standard errors on his estimates.


Nuff said

• TRA changed both the tax rate and the tax base. He fixes the defini-
tions at the 1985 ones, but this changes how we interpret his estimates.

• Income shifting from corporate to personal income will overstate the


ETI

• Short vs. Long-term effects. Especially rich people can choose the
timing of their income.

• the more recent literature (Saez and coauthors) fixes a lot of these
shortcomings


Jon Gruber & Emmanuel Saez (2002)


The Elasticity of Taxable Income: Evidence and Implications
JPubE 84(1) 1–32
Lowdown: The Elasticity of Taxable Income is hugely important to esti-
mate. Improving on many previous attempts, the authors get a preferred
estimate of 0.4, much lower than Feldstein (1995). Elasticity is higher for
richer people and for state taxes. Income effects are negligible.

Why Do We Care?
• The Elasticity of Taxable Income (ETI) is of central normative impor-
tance in public finance as it is the critical parameter for thinking about
the revenue generated by income tax systems.
• Good identification coming from using multiple tax changes and con-
trolling for initial income flexibly.

Data
• NBER panel of tax returns from 1979 to 1990
• Several tax reforms

– 1981-1984 Economic Recovery Tax Act


– 1986 Tax Reform Act
– 1987 Earned Income Tax Credit expansion
– State level tax changes

• Model: A consumer’s budget constraint is given by $c = z(1 - \tau) + R$
where $z$ is before-tax income, $\tau$ is the marginal rate and $R$ is virtual
income. This lets us decompose the response to a tax change $(d\tau, dR)$
into an income and substitution effect.

$$\frac{dz}{z} = -\zeta^c \frac{d\tau}{1-\tau} + \eta\,\frac{dR - z\,d\tau}{z(1-\tau)}$$

where $\zeta^c \equiv \frac{1-\tau}{z}\left.\frac{\partial z}{\partial(1-\tau)}\right|_u$ is the compensated elasticity of
income and $\eta \equiv (1-\tau)\,\partial z/\partial R$ is the income effect parameter.
• We’d like to estimate something like

$$\log\left(\frac{z_2}{z_1}\right) = \zeta \log\left[\frac{1 - T_2'}{1 - T_1'}\right] + \eta \log\left[\frac{z_2 - T_2(z_2)}{z_1 - T_1(z_1)}\right] + \varepsilon$$

however, there are a number of problems.


– The change in the tax rate is correlated with $\varepsilon$ since large shocks
to income will change the tax bracket the individual is in. So, we
use $\log[(1 - T_p')/(1 - T_1')]$ as an instrument, where $T_p'$ is the marginal
tax rate the individual would have faced in period 2 if his real income was
unchanged.
– Mean reversion will induce a negative correlation between $\varepsilon$ and
first-period income.
– A changing income distribution. Increasing inequality, for example,
will induce a positive correlation between $\varepsilon$ and first-period income.

so the authors control for first-period income flexibly using a ten-piece
spline giving an estimation equation:

$$\log\left(\frac{z_2}{z_1}\right) = \alpha_0 + \zeta \log\left[\frac{1 - T_2'}{1 - T_1'}\right] + \eta \log\left[\frac{z_2 - T_2(z_2)}{z_1 - T_1(z_1)}\right] + \alpha_1 \log(z_1) + \sum_k \alpha_{2k}\, mars_k + \sum_j \alpha_{3j}\, YEAR_j + \sum_{i=1}^{10} \alpha_{4i}\, SPLINE_i(z_1) + \varepsilon$$
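
The instrument described above can be sketched as follows: look up the marginal rate that first-period income faces under each period's schedule and take the log ratio of the net-of-tax rates. The bracket schedules, the `inflation` argument and the function names are hypothetical illustrations, not the authors' code or the actual US tax code.

```python
import bisect
import math


def marginal_rate(schedule, income):
    """Marginal rate for a non-negative income.

    schedule: list of (lower_threshold, rate) pairs sorted by threshold,
    with the first threshold equal to 0.
    """
    thresholds = [threshold for threshold, _ in schedule]
    index = bisect.bisect_right(thresholds, income) - 1
    return schedule[index][1]


def predicted_instrument(z1, schedule_1, schedule_2, inflation=0.0):
    """log[(1 - T_p') / (1 - T_1')].

    T_1' is the period-1 marginal rate at actual period-1 income; T_p' is the
    period-2 marginal rate at period-1 income held constant in real terms.
    """
    t1 = marginal_rate(schedule_1, z1)
    tp = marginal_rate(schedule_2, z1 * (1.0 + inflation))
    return math.log((1.0 - tp) / (1.0 - t1))


# Hypothetical two-bracket schedules: the reform cuts only the top rate
SCHEDULE_BEFORE = [(0, 0.15), (50_000, 0.50)]
SCHEDULE_AFTER = [(0, 0.15), (50_000, 0.28)]

high_earner = predicted_instrument(100_000, SCHEDULE_BEFORE, SCHEDULE_AFTER)
low_earner = predicted_instrument(20_000, SCHEDULE_BEFORE, SCHEDULE_AFTER)
print(high_earner, low_earner)
```

Because the instrument depends only on first-period income and the statutory schedules, it is not moved by the period-2 income shock that pushes people across brackets.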

Results
• The elasticity of broad income with respect to tax rates is 0.12

• The elasticity of taxable income with respect to tax rates is 0.40

• The income effects are tiny (around 0.135 for taxable income)

• Elasticities are higher for state taxes (0.292 and 0.632 for broad income
and taxable income respectively) probably because people can move
states more easily

• Weird that there are very small income effects. Especially when people
have a lot of non-labour income you might expect a bigger effect coming
through the effect on lifetime wealth. Would have been nice to see the
income effect estimates for the different income groups to see if the
same pattern is seen there.

• Elasticities are bigger (0.567 for taxable income) for people with high
(>100K) incomes probably reflecting the fact that a larger part of in-
come for those with higher income is non-labour income which is easier
to move about. However the estimates are not significantly different.


Limitations and Moving Forward


• Flexible control for period-one income is cool, but can’t rule out non-
linear relationships between lagged-income and income changes that
change over time.

• Heterogeneous estimates are not significantly different. A pity.

• Would be great to look at a really long-run dataset and incorporate


corporate taxes also to look at very long-run responses and also the
movement of income between corporate and personal tax bases.

Chapter 6

Tax and Labour Supply


Raj Chetty, John N. Friedman, Tore Olsen & Luigi Pistaferri (2010)

Adjustment Costs, Firm Responses, and Labor Supply Elasticities: Evidence From Danish Tax Records
NBER Working Paper 15617
Lowdown: Really pushes the envelope on what we can learn about responses to income taxes. Incorporates search costs and firm responses to
think more “general equilibrium”-ly about responses to tax changes. Offers
an explanation for why macro/structural elasticities are bigger than observed
elasticities. Uses evidence from Danish tax records and looks at bunching
around kink points to back out elasticities. Observed elasticities are tiny
(smaller than 0.02) but calibration reveals that the macro elasticity ≥ 0.34.

Why Do We Care?
• Bringing in adjustment costs and firms’ behaviour helps us explain
some of the puzzles of the literature on labour supply responses such
as the small elasticities observed around kinks in the tax schedule.

• Reminds us of how important it is to think about general equilibrium


when thinking about policy changes by showing the discrepancy be-
tween the micro elasticities and the macro elasticities.

Theory
• A firm $j$ employs a single worker to produce output worth $p$ and pays
an hourly wage $w(h_j)$. Firms post their hours packages and can’t
change them after matching with a worker. Thus their profits are
$\pi_j = p h_j - w(h_j) h_j$. Free entry implies $w(h_j) = w = p$ for all $h_j$ in
equilibrium.
• Worker $i$ has utility $u_i(c, h) = c - \alpha_i^{-1/\varepsilon}\,\frac{h^{1+1/\varepsilon}}{1+1/\varepsilon}$. The taste parameter
$\alpha_i > 0$ is distributed with cdf $F(\alpha_i)$. Workers also get stochastic non-wage
income $y_i \sim F_Y$ whose realisation is unknown when they choose
their hours.

• There are 2 types of tax system $s \in \{NL, L\}$. $s_i = NL$ is a piecewise
linear tax with an MTR of $\tau_1$ on income up to $K$ and $\tau_2 > \tau_1$ once income $y_i + w_i h_i$ exceeds $K$.
$s_i = L$ is a linear tax at rate $\tau$ on all income. A fraction $\zeta$ of workers
have $s_i = NL$.

• Workers draw a job offer $h_i^0$ from the aggregate offer distribution $G(h)$
which they can either accept or turn down. If they turn it down they
continue searching and draw a new offer $h_i'$ from a new distribution
$G_e(h' \mid h_i^*)$ centered on the individual’s optimal hours choice $h_i^*$. They
can get a more precise draw by exerting search effort $e \in [0, 1]$ which
has monetary cost $\phi_i(e)$. We can think of this search process as a
functional that maps the hours distribution and the wages into a new
distribution $\mathcal{F}(G(h), w(h))$.

• Equilibrium requires that the labour market clear (w = p) and that


the distribution of posted jobs coincide with the distribution of jobs
selected after search: G(h) = F(G(h), w)

• Benchmark case: No frictions: $\phi_i(e) = 0$. Now all workers choose
their optimal hours. For those with $s_i = L$ this is $h_i^* = \alpha_i((1-\tau)w)^\varepsilon$.
For those with $s_i = NL$, if we assume that there is no uncertainty about
non-wage income ($y_i = 0$), then

$$h_i^* = \begin{cases} \alpha_i((1-\tau_1)w)^\varepsilon & \text{if } \alpha_i < \underline{\alpha} \\ h_K = K/w & \text{if } \alpha_i \in [\underline{\alpha}, \bar{\alpha}] \\ \alpha_i((1-\tau_2)w)^\varepsilon & \text{if } \alpha_i > \bar{\alpha} \end{cases}$$

so we get some bunching around the kink. In this case we can estimate
the “structural” elasticity $\varepsilon$ like Saez (2009) does as

$$\varepsilon \simeq \frac{B(\tau_1,\tau_2)/g(h_K)}{K \ln\left(\frac{1-\tau_1}{1-\tau_2}\right)}$$

where $b = B(\tau_1,\tau_2)/g(h_K)$ is the fraction of workers who bunch at the
kink normalised by the density of the hours distribution at the kink.

• Search Costs and Worker Responses: Make three specialising
assumptions. 1: the set of workers affected by the tax change has
measure zero; for kinks this means $\zeta = 0$, for tax reforms this means
$\zeta = 1$ (this means that the tax change has no effect on the equilibrium
hours distribution). 2: make search costs generate a binary search
decision: workers either keep their offer or pay a fixed cost $\phi$ which
allows them to choose their preferred hours deterministically. 3: no
uncertainty in non-wage income ($y_i = 0$). This means that a worker
will search for a job if his initial offer $h_i^0 \notin [\underline{h}_i, \bar{h}_i]$, where the thresholds
are defined by $u(c_i(h_i^*), h_i^*) - u(c_i(\underline{h}_i), \underline{h}_i) = \phi$ and analogously for $\bar{h}_i$.
This means that the “observed” elasticity in the data $\hat{\varepsilon}$ is smaller than
the structural elasticity because with positive search costs workers sit
on the kink if $\alpha_i \in [\underline{\alpha}, \bar{\alpha}]$ and $h_i^0 \notin [\underline{h}_i, \bar{h}_i]$. This leads to Prediction 1:
When workers face search costs, the observed elasticity from bunching
rises with the size of the tax change and converges to $\varepsilon$ as the size of the
tax change grows: $\partial\hat{\varepsilon}/\partial\tau_2 > 0$, $\partial\hat{\varepsilon}/\partial\tau_1 < 0$, and $\lim_{\tau_2 - \tau_1 \to \infty} \hat{\varepsilon} = \varepsilon$.
Also, the macro elasticity will equal the structural elasticity. Consider
2 different linear tax schedules $\tau$ and $\tau'$; then the macro elasticity is

$$\hat{\varepsilon}_{MAC} = \frac{E \log h_i(\tau') - E \log h_i(\tau)}{\log(1-\tau') - \log(1-\tau)}$$

For the workers with optimal hours, the
difference in hours is simply $\log h_i^*(\tau') - \log h_i^*(\tau) = \varepsilon \cdot (\log(1-\tau') - \log(1-\tau))$. Using a quadratic approximation to utility the inaction
region also has $\frac{\partial \log \underline{h}_i}{\partial \log(1-\tau)} = \frac{\partial \log \bar{h}_i}{\partial \log(1-\tau)} \simeq \varepsilon$. Approximating the offer
distribution to be uniform in the inaction region we can put these
together to get that $E \log h_i(\tau') - E \log h_i(\tau) \simeq \varepsilon \cdot (\log(1-\tau') - \log(1-\tau))$
and therefore that $\hat{\varepsilon}_{MAC} \simeq \varepsilon$ regardless of the cost $\phi$.

• Hours Constraints and Firm Responses: Again make 3 specialising
assumptions. 1: $\zeta \in (0, 1)$. 2: at each level of $\alpha_i$, a fraction $\delta$ of
workers face no search costs ($\phi_i(e) = \phi_i = 0$) and the rest can’t search
($\phi_i(e) = \phi_i = \infty$). 3: no uncertainty about non-wage income. Now,
those who can search choose their optimal hours and those who can’t
just stick with their initial draw. This means that the search process
$\mathcal{F}$ maps the distribution of offers to $\mathcal{F}(G) = \delta G^* + (1-\delta)G$ and hence
the only fixed point of $\mathcal{F}$ is $G^*$.
Now let $B^*(\tau_1, \tau_2)$ be the bunching you would observe in the frictionless
world. With search costs, the observed bunching is $B = \delta B^* + (1-\delta)\zeta B^*$.
The first part is individual bunching ($B_I = \delta B^*$) from individuals
choosing to be on the kink. The second part is firm bunching ($B_F = (1-\delta)\zeta B^*$) arising from workers drawing initial hours that place them
on the kink. Now the observed elasticity will be

$$\hat{\varepsilon} = \frac{B(\tau_1, \tau_2)/g^*(h_K)}{K \ln\left(\frac{1-\tau_1}{1-\tau_2}\right)} = \delta\varepsilon + (1-\delta)\zeta\varepsilon < \varepsilon$$

from which we can see that the observed elasticity will depend on
how many workers face the kink. Prediction 2: Search costs interact
with hours constraints to generate firm bunching. The amount of firm
bunching and the observed elasticity rise with the fraction of workers
who face the kink: $B_F = B_L > 0$ iff $\zeta > 0$, $\partial B_F/\partial\zeta > 0$ and
$\partial\hat{\varepsilon}/\partial\zeta > 0$. Also, Prediction 3: Firms cater to workers’ preferences:
the amount of firm bunching and individual bunching are positively
correlated across occupations: $\mathrm{cov}(B_I^q, B_F^q) > 0$.
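
The two elasticity formulas in the Theory bullets above can be put side by side in a few lines. This is a minimal sketch: the numerical inputs (b, K, the tax rates, δ and ζ) are hypothetical values chosen for illustration, and the function names are not from the paper.

```python
import math


def frictionless_elasticity(b, kink, tau1, tau2):
    """Structural elasticity from normalised bunching b = B / g(h_K):
    eps ~= b / (K * ln((1 - tau1) / (1 - tau2)))."""
    return b / (kink * math.log((1.0 - tau1) / (1.0 - tau2)))


def observed_elasticity(eps, delta, zeta):
    """Observed elasticity when a share delta of workers search freely, the
    rest cannot search, and a share zeta of workers face the kink:
    eps_hat = delta * eps + (1 - delta) * zeta * eps."""
    individual_part = delta * eps          # from individual bunching B_I
    firm_part = (1.0 - delta) * zeta * eps  # from firm bunching B_F
    return individual_part + firm_part


# Hypothetical inputs, in arbitrary units
eps = frictionless_elasticity(b=2.0, kink=10.0, tau1=0.3, tau2=0.5)

# Prediction 2: the observed elasticity rises with the share facing the kink
for zeta in (0.0, 0.5, 1.0):
    print(zeta, round(observed_elasticity(eps, delta=0.2, zeta=zeta), 4))
```

With ζ = 0 only the individual-bunching part survives; as ζ approaches 1 the observed elasticity approaches the structural one.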

Empirics
• Data is on 99.9% of income tax payers in Denmark. Dropping those
not aged between 15 and 70 and those without wage income leaves
17.9 million observations in a panel from 1994-2001. The tax system
is piecewise linear with 3 progressive tax bands.

• Bunching at the top tax kink: There is bunching around the kink but
it’s noisy so need to estimate the counterfactual density by excluding
data near the kink. Run a regression like

$$C_j = \sum_{i=0}^{q} \beta_i^0 \cdot (Z_j)^i + \sum_{i=-R}^{R} \gamma_i^0 \cdot 1[Z_j = i] + \varepsilon_j^0$$

where $C_j$ is the number of individuals in income bin $j$, $Z_j$ is income
relative to the kink in 1000s of DKr, $q$ is the order of the polynomial
and $R$ denotes the width of the excluded region around the kink, which
gives an initial estimate of the bunching as $\hat{b}_n = \sum_{j=-R}^{R} (C_j - \hat{C}_j^0) = \sum_{i=-R}^{R} \hat{\gamma}_i^0$. However, this will overestimate bunching since it won’t
satisfy the constraint that the counterfactual density must integrate to
the same number of people as the actual density. To fix this they use
an iterative procedure estimating

$$C_j \cdot \left(1 + 1[j > R]\,\frac{\hat{b}_n}{\sum_{j=R+1}^{\infty} C_j}\right) = \sum_{i=0}^{q} \beta_i^0 \cdot (Z_j)^i + \sum_{i=-R}^{R} \gamma_i^0 \cdot 1[Z_j = i] + \varepsilon_j^0$$

until it converges. With this the final estimate of bunching is the excess
mass relative to the average density of the counterfactual earnings
distribution between $-R$ and $R$: $\hat{b} = \frac{\hat{b}_n}{\sum_{j=-R}^{R} \hat{C}_j/(2R+1)}$. This gives an
estimate of $\hat{b} = 0.81$, meaning that all the excess mass around the kink
is 81% of the average height of the counterfactual distribution around
the kink. The bunching is much larger for married women ($b = 1.79$)
and there is heterogeneity among occupations also, with teachers displaying
lots of bunching ($b = 3.54$) and the military none at all.
• The authors find practically no bunching around the middle tax kink
which is supportive of prediction 1. Similarly, doing diff-in-diff using
all the little tax changes in the sample period and plotting them reveals
that bigger tax changes are associated with bigger estimated elastici-
ties. This is unlikely to be because of heterogeneous elasticities since
small changes at the top don’t have any effect at all.
• To test the second prediction about scope the authors look at variation
in the fraction of workers in the economy facing a given kink in the
tax system coming from heterogeneity in deductions and pension contributions.
For instance, 60% of wage earners have net deductions less
than DKr 7500 and hence the relevant kink is the one in the income
tax schedule. The remaining 40% show a spike at DKr 33,000 (2.7%
of workers), the cap on deductible pension contributions. For these individuals
the relevant kink is higher due to the deductions. Therefore
prediction 2 says that 1) there should be significant firm bunching at
the statutory kink; 2) there should be little firm bunching at the pension
kink that only applies to 2.7% of workers; and 3) there should be more bunching
for individuals with small deductions.
Firm bunching is examined at the occupation level since wages are
set through collective bargaining at the occupation level. This reveals
significant firm bunching around the statutory kink. There is also
lots of bunching of wage earnings around the pension kink, but only
for workers with deductions above DKr 20,000, so this is individual
bunching not firm bunching.

• Prediction 3 is also borne out in the data. To calculate firm bunching


the authors look at bunching at the statutory kink by individuals with
deductions above DKr 20,000 (who shouldn’t be there). Similarly,
they calculate individual bunching by looking at bunching by the same
individuals around the pension kink.

• Calibration to bound the macro elasticity: putting a bit more structure
on the model and calibrating some of the other parameters can
set identify the macro (structural) elasticity. This works because $\varepsilon$ determines
how convex the disutility of work is and therefore the utility
loss from deviating from the optimal number of hours. Therefore if $\varepsilon$ is
small there can’t be very different observed elasticities at the middle
and the top kinks unless the search costs are very large. By guessing
at how much people are willing to sacrifice in consumption to save the
search costs, we can bound the elasticity. Doing this gives the authors
an estimate that $\hat{\varepsilon} \ge 0.34$.
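
The first pass of the counterfactual-density regression described in the Empirics bullets above can be sketched in plain Python. Fitting the polynomial only to bins outside the excluded window is equivalent to including a dummy for each bin inside it. This sketch deliberately omits the iterative adjustment that makes the counterfactual integrate to the observed population, the bin counts are synthetic, and the helper names are hypothetical.

```python
def polyfit_ls(xs, ys, order):
    """Least-squares polynomial fit via the normal equations.
    Returns [b0, b1, ..., b_order] for b0 + b1*x + ... + b_order*x**order."""
    n = order + 1
    a = [[float(sum(x ** (r + c) for x in xs)) for c in range(n)] for r in range(n)]
    v = [float(sum(y * x ** r for x, y in zip(xs, ys))) for r in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        v[col], v[pivot] = v[pivot], v[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            v[r] -= factor * v[col]
    # Back substitution
    coefs = [0.0] * n
    for r in reversed(range(n)):
        rest = sum(a[r][c] * coefs[c] for c in range(r + 1, n))
        coefs[r] = (v[r] - rest) / a[r][r]
    return coefs


def estimate_bunching(z_bins, counts, window, order):
    """Excess mass in bins with |Z| <= window, relative to the average
    counterfactual bin count inside that window (first pass only)."""
    outside = [(z, c) for z, c in zip(z_bins, counts) if abs(z) > window]
    coefs = polyfit_ls([z for z, _ in outside], [c for _, c in outside], order)

    def counterfactual(z):
        return sum(b * z ** k for k, b in enumerate(coefs))

    inside = [(z, c) for z, c in zip(z_bins, counts) if abs(z) <= window]
    excess = sum(c - counterfactual(z) for z, c in inside)
    average_height = sum(counterfactual(z) for z, _ in inside) / len(inside)
    return excess / average_height


# Synthetic data: a linear density plus 500 extra people in the kink bin
z_bins = list(range(-50, 51))
counts = [1000 - 2 * z + (500 if z == 0 else 0) for z in z_bins]
b_hat = estimate_bunching(z_bins, counts, window=3, order=2)
print(round(b_hat, 3))
```

In the synthetic data the excess mass is 500 people against an average counterfactual bin height of 1000, so the estimator should return a value close to 0.5.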

Limitations and Moving Forward


• Can we use this methodology to re-derive some of the optimal taxation
results?

• Interesting to think about dynamics here

• Competition among countries?

• Why are there people earning above the kink who aren’t making pen-
sion contributions if these are tax-deductible?


Nada Eissa (1995)


Taxation and Labor Supply of Married Women: The
Tax Reform Act of 1986 as a Natural Experiment
NBER Working Paper 5023
Lowdown: The Tax Reform Act of 1986 changed the marginal tax rate
faced by high income workers by much more than for lower income workers.
Use diff-in-diff to compare response of married women at the 99th percentile
of the income distribution with women lower down. Estimate elasticities
of participation of around 0.5 and of hours worked of around 1. Serious
identification problems though.

Why Do We Care?
• An early attempt to deal seriously with the identification issues that
plagued previous work on the response of labour supply to taxation.

• Use diff-in-diff methodology to analyse the response to the 1986 Tax


Reform Act (TRA)

Data
• TRA of 86 lowered marginal tax rates at high income levels more than
at lower income levels. This naturally creates a treatment (high in-
come) and a control (lower income) group.

• Data on labour force participation and hours worked from the Current
Population Survey.

• Use married women at the 99th percentile of the income distribution


as the treatment group. 2 control groups: 1) married women at the
75th percentile and 2) married women at the 90th percentile.

• Do basic diff-in-diff and also use regression in order to be able to control
for more stuff.

– Participation modeled as a probit: $P(lfp_{it} = 1) = \Phi(z_{it}'\alpha)$, so
regress

$$P(lfp_{it} = 1) = \Phi(\alpha_0 + \alpha_1 Q_{it} + \alpha_2 High_i + \alpha_3 Post86_t + \alpha_4 High_i \times Post86_t)$$

– Hours Worked conditional on participation modeled as a truncated
normal: $f(h_{it}) = \frac{\frac{1}{\sigma}\varphi((h_{it} - x_{it}\beta)/\sigma)}{\Phi(x_{it}\beta/\sigma)}$, so regress

$$h_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 High_i + \beta_3 Post86_t + \beta_4 High_i \times Post86_t$$


• The identifying (parallel trends) assumption is that the increasing


trend in labour force participation and hours worked is the same for
treatment and control groups.
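
The basic diff-in-diff estimate is just a double difference of four group means. A minimal sketch follows; the participation rates and standard errors are hypothetical numbers, not Eissa's data, and the standard-error formula assumes the four cell means are independent, which is an assumption of this sketch.

```python
import math


def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Change in the treatment group minus change in the control group."""
    return (treat_post - treat_pre) - (control_post - control_pre)


def did_standard_error(*cell_standard_errors):
    """Standard error of the double difference, assuming the four cell
    means are estimated from independent samples."""
    return math.sqrt(sum(se ** 2 for se in cell_standard_errors))


# Hypothetical labour force participation rates before and after 1986
estimate = diff_in_diff(
    treat_pre=0.46, treat_post=0.55,      # high-income (treated) women
    control_pre=0.69, control_post=0.74,  # lower-income (control) women
)
standard_error = did_standard_error(0.014, 0.014, 0.014, 0.014)
print(f"DiD = {estimate:.3f} (s.e. {standard_error:.3f})")
```

Under parallel trends, the control group's change measures what would have happened to the treated group without the tax cut, so the double difference isolates the response to the reform.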

Results
• Simple Diff-in-Diff:

– Labour Force Participation: Using the 75th percentile as the con-


trol group the diff-in-diff estimate is 0.037 (0.028) which is just
significant at 10%. Using the 90th percentile it is 0.045 (0.028).
– Hours Conditional on Employment: 75th percentile control →
108.6 (65.1) again just 10% significant. 90th percentile control →
67.3 (64.8).
– Hours Worked: 75th percentile control → 84.8 (51.5) again just
10% significant. 90th percentile control → 77.4 (52.5).

• Regression:

– Labour Force Participation: 75th percentile control → $\alpha_4 = 0.1$ (0.075).
90th percentile control → $\alpha_4 = 0.12$ (0.075). This gives elasticities
of participation of 0.4 and 0.6 using the 75th and 90th percentile
control groups, respectively.
– Hours Conditional on Employment: 75th percentile control →
$\beta_4 = 115.32$ (63.21). 90th percentile control → $\beta_4 = 65.8$ (71.89).
This gives elasticities of total hours worked of 0.8 and 1.0 using the
75th and 90th percentile control groups, respectively.
– Hours Worked (Tobit): 75th percentile control → 189.4 (72.2).
90th percentile control → 167.24 (80.0). This gives elasticities of
total hours worked of 1.0 and 1.1 using the 75th and 90th percentile
control groups, respectively.

Limitations and Moving Forward


• Very simple, intuitive diff-in-diff, but with all the possible problems
with diff-in-diff.

• Any number of reasons not to believe parallel trends assumption.

• Very different occupation composition between treatment and control


groups suggests possibility of significant heterogeneity in elasticities.

• Compositional problems as this is not a panel.


Nada Eissa & Jeffrey Liebman (1996)


Labor Supply Response to the Earned Income Tax
Credit
QJE 111 (2) 605–637
Lowdown: The EITC gave single women with children greater incentives
to work. They do participate in the labour force more, but the theoretical
prediction that hours worked should decrease is not borne out in the data.

Why Do We Care?
• Tax Reform Act (TRA) in 1986 expanded the earned income tax credit
(EITC). EITC gives a tax credit to poor households with children who
work. This shifts the whole budget constraint for eligible households.

• Theory predicts:

– Increase in participation among the eligible.


– ambiguous effects for those in the phase-in region.
– decrease in labour supply for individuals in the phase-out region.

Data
• 1985-1987 and 1989-1991 Current Population Surveys.

• Taxes in 1984-1986 and 1988-1990

• Include Unmarried females between 16 and 44 years old → 67,097 obs.

• 3 Treatment groups: Unmarried women with children, with children


and less than high school education, and with children and high school
education.

• 4 Control groups: Unmarried women without children, without chil-


dren and less than high school education, with children and beyond
high school education, and without children and with high school ed-
ucation.

• Do basic diff in diff and also use regressions for

– Participation: $P(lfp_{it} = 1) = \Phi(\alpha + \beta Z_{it} + \gamma_0\, treatment_i + \gamma_1\, post86_t + \gamma_2\, (treatment \times post86)_{it})$
– Hours worked: $Hours_{it} = \alpha + \beta Z_{it} + \gamma_0\, kids_i + \gamma_1\, post86_t + \gamma_2\, (kids \times post86)_{it} + \varepsilon_{it}$


Results
• Participation:

– Simple Diff-in-Diff using With vs. without children → 0.024


(0.006).
– Regression: Predicted increase in participation is 0.019 (0.008),
increases to 0.028 (0.009) if control for unemployment and state
dummies.

• Hours Worked:

– Conditional on Hours > 0, γ2 = 25.22(15.18) so no significant


impact on hours worked.
– For total hours worked, γ2 = 37.37(15.31) so positive impact on
labour supply.

Limitations and Moving Forward


• Women who do and don’t have children are likely to be very different
in any number of ways that will affect their responses to tax incentives.

• The authors attempt to explain the lack of an hours response as be-


ing because people don’t understand the EITC, and if they do they
perceive it as a lump sum benefit. Interesting to explore this more.


Emmanuel Saez (2009)


Do Taxpayers Bunch at Kink Points?
Mimeo
Lowdown: Piecewise linear budget constraints mean that we should see
bunching around the kink points. Saez develops a methodology for looking
for this bunching in tax return data. He finds bunching around the first kink
in the EITC (and a bit in the federal income tax) but not at higher kinks.
The implied elasticity of reported income is 0.25 on average, but 1 for the
self-employed and 0 for wage earners.

Why Do We Care?
• Theory predicts that piecewise linear budget constraints and a smooth
distribution of preferences/ability should lead to bunching at the kinks
in the budget constraint, so it’s a cool test of the theory to look for
bunching.

• Saez develops a cool methodology for looking for bunching at the kinks

• Saez does find bunching but not at all the kinks so we need to rethink
some of the theory

Theory and Methodology


• Derive the empirical methodology from a nice simple model of labour
supply.
• $u(c, z) = c - \frac{n}{1+1/e} \cdot \left(\frac{z}{n}\right)^{1+1/e}$ where $c$ is consumption, $z$ is labour supply,
$n$ is ability distributed according to $f(n)$ and we note that the compensated
(equal to the uncompensated since quasi-linear utility has no
income effects) elasticity of income wrt wages is constant and equal to
$e$

• Maximising utility subject to $c = (1-t) \cdot z + R$ yields the labour supply
$z = n \cdot (1-t)^e$

• With a constant marginal tax rate $t_0$ the distribution of earnings is given
by

$$H_0(z) = \Pr\left(n \cdot (1-t_0)^e \le z\right) = F\left(\frac{z}{(1-t_0)^e}\right)$$

where $H_0$ is defined as the distribution of earnings under a constant
MTR $t_0$


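• If we introduce a kink at $z = z^*$, with marginal tax rate $t_0$ below $z^*$ and $t_1 > t_0$ above it, then above $z^*$ the
distribution of earnings is $H(z) = F(z/(1-t_1)^e)$ and so the density will
be $h(z) = h_0\left(z \cdot \left(\frac{1-t_0}{1-t_1}\right)^e\right) \cdot \left(\frac{1-t_0}{1-t_1}\right)^e$, so that the left and right limits
of the density at $z^*$ are, respectively, $h(z^*)_- = h_0(z^*)$ and
$h(z^*)_+ = h_0\left(z^* \cdot \left(\frac{1-t_0}{1-t_1}\right)^e\right) \cdot \left(\frac{1-t_0}{1-t_1}\right)^e$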

• People with abilities $n \in [z^*/(1-t_0)^e,\; z^*/(1-t_1)^e]$ will bunch. The
highest ability person who bunches has $n = z^*/(1-t_1)^e$ so he would
have had earnings of $z^* \cdot \left(\frac{1-t_0}{1-t_1}\right)^e$ under the old tax, so we can say that
anyone who had earnings between $z^*$ and $z^* + \Delta z^*$ under the old tax
will bunch, where $\Delta z^* = z^*\left[\left(\frac{1-t_0}{1-t_1}\right)^e - 1\right]$, so that the fraction of the
population that bunches is

$$B = \int_{z^*}^{z^* + \Delta z^*} h_0(z)\,dz \simeq \Delta z^* \cdot \frac{h_0(z^*) + h_0(z^* + \Delta z^*)}{2} = z^*\left[\left(\frac{1-t_0}{1-t_1}\right)^e - 1\right] \cdot \frac{h(z^*)_- + h(z^*)_+ \Big/ \left(\frac{1-t_0}{1-t_1}\right)^e}{2}$$

which allows the elasticity $e$ to be solved for in terms of observables.

• The simplest way to empirically estimate the bunching is to take the
difference between the number of people in a band around the kink
and the number of people in the two surrounding bands:

$$B = \int_{z^*-\delta}^{z^*+\delta} h(z)\,dz - \int_{z^*-2\delta}^{z^*-\delta} h(z)\,dz - \int_{z^*+\delta}^{z^*+2\delta} h(z)\,dz$$

• The methodological issue here will be to choose the right bandwidth δ.


Too small and stochastic income components and measurement error
will cause us to underestimate bunching. Too big and the second order
effects from the curvature of the income distribution start to bias the
estimate.

• Estimates of the numbers of individuals in the bands come from regressing
a dummy for being in the relevant band on a constant in
the subsample of individuals who fall into one of the three bands.
These estimates (denoted by $\hat{H}^*_-$, $\hat{H}^*$ and $\hat{H}^*_+$) can provide estimates of
$\hat{h}(z^*)_+ = \hat{H}^*_+/\delta$, $\hat{h}(z^*)_- = \hat{H}^*_-/\delta$ and $\hat{B} = \hat{H}^* - (\hat{H}^*_+ + \hat{H}^*_-)$. Standard
errors come from the delta method or from bootstrapping (the large
sample means the two methods yield similar s.e.s).
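
The three-band estimator and the inversion of the bunching formula for e can be sketched together. This is a simplified illustration under stated assumptions: band shares are computed over the whole sample passed in, the bands are half-open intervals, the formula is inverted by bisection with an arbitrary upper bound on e, and all inputs are synthetic. None of the function names come from the paper.

```python
def band_estimates(incomes, kink, delta):
    """Population shares in three bands of width delta, 2*delta, delta around
    the kink, and the implied B, h(z*)_- and h(z*)_+."""
    n = len(incomes)

    def share(lo, hi):
        return sum(lo <= z < hi for z in incomes) / n

    lower = share(kink - 2 * delta, kink - delta)
    middle = share(kink - delta, kink + delta)
    upper = share(kink + delta, kink + 2 * delta)
    return {"B": middle - (lower + upper),
            "h_minus": lower / delta,
            "h_plus": upper / delta}


def implied_bunching(e, kink, t0, t1, h_minus, h_plus):
    """B = z* [r^e - 1] (h_- + h_+ / r^e) / 2, with r = (1 - t0) / (1 - t1)."""
    r_e = ((1.0 - t0) / (1.0 - t1)) ** e
    return kink * (r_e - 1.0) * (h_minus + h_plus / r_e) / 2.0


def solve_elasticity(B, kink, t0, t1, h_minus, h_plus, upper=10.0, iters=200):
    """Invert implied_bunching for e by bisection on [0, upper]; requires
    t1 > t0 so that implied bunching is increasing in e. If B exceeds the
    bunching implied at `upper`, the result is capped at `upper`."""
    lo, hi = 0.0, upper
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if implied_bunching(mid, kink, t0, t1, h_minus, h_plus) < B:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic sample: one person at each income 0..199, plus 20 extra at the kink
incomes = list(range(200)) + [100] * 20
bands = band_estimates(incomes, kink=100, delta=10)

# Round trip: bunching implied by e = 0.25 should be inverted back to 0.25
B_true = implied_bunching(0.25, 100.0, 0.0, 0.2, 0.004, 0.004)
e_hat = solve_elasticity(B_true, 100.0, 0.0, 0.2, 0.004, 0.004)
print(bands, e_hat)
```

The choice of `delta` carries the trade-off described above: a narrow band misses bunchers displaced by noise, while a wide band lets curvature of the underlying density contaminate the estimate.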


Data & Results


• Data are tax returns from the IRS Individual Public Use Tax Files,
1960 to 2004. Good because there is virtually no measurement error, as opposed
to using survey data.

• There is significant bunching around the first kink in the EITC but
none around the other two kinks

– This is driven by people with self-employment income (ê = 1.101);
wage earners show practically no action (ê = 0.025)
– This can be explained in a model with some income that is self-reported
and can be hidden, but the EITC gives an incentive to
report it if the subsidy rate is higher than the income tax rate. In
such a model you’d get this bunching, but only ever at the first
kink.

• There is a bit of bunching around the first kink in the federal income
tax, but again none at other kinks

– This can’t be as readily explained, but the evidence is much


weaker so maybe there’s not that much to explain...

• Bunching has grown over time as people learn the system and the kinks
stop moving around

Limitations & Moving Forward


• What might the consequences of this bunching be for tax revenue? If
what is happening is that people are underreporting their income it
would be nice to do a back of the envelope type calculation of the lost
tax revenue (and error in GDP figures)

• Use this methodology to look at other kinked budget constraints
