A Simple Way to Estimate Bid-Ask Spreads from Daily High and Low Prices

Shane A. Corwin and Paul Schultz*
ABSTRACT
We develop a bid-ask spread estimator from daily high and low prices. Daily high (low) prices are almost
always buy (sell) trades. Hence the high-low ratio reflects both the stock's variance and its bid-ask spread.
While the variance component of the high-low ratio is proportional to the return interval, the spread
component is not. This allows us to derive a spread estimator as a function of high-low ratios over one-day
and two-day intervals. The estimator is easy to calculate and can be applied in a variety of research areas.
Through comparisons to TAQ data, we show that the estimator generally outperforms other low-frequency
spread estimators.
* Both authors are from the Mendoza College of Business at the University of Notre Dame. We are grateful to the Editor,
Cam Harvey, the Associate Editor, and two anonymous referees for their valuable suggestions. We also thank seminar
participants at the University of Notre Dame and the National Bureau of Economic Research Market Microstructure
meeting, and Shmuel Baruch, Robert Battalio, Hank Bessembinder, Ryan Davies, Larry Harris, Joel Hasbrouck and
Asani Sarkar for helpful comments.
In this paper, we derive a simple way to estimate bid-ask spreads from daily high and low prices.
The estimator is based on two uncontroversial ideas. First, daily high prices are almost always buyer-
initiated trades and daily low prices are almost always seller-initiated trades. Hence the ratio of high-to-
low prices for a day reflects both the fundamental volatility of the stock and its bid-ask spread. Second,
the component of the high-to-low price ratio that is due to volatility increases proportionately with the
length of the trading interval, while the component due to bid-ask spreads does not. This implies that the
sum of the price ranges over two consecutive single days reflects two days' volatility and twice the
spread, while the price range over one two-day period reflects two days' volatility and one spread. This
allows us to derive an estimate of a stock's bid-ask spread as a function of the high-to-low price ratio for
a single two-day period and the high-to-low ratios for two consecutive single days. Simulations reveal
that under realistic conditions, the correlation between high-low spread estimates and true spreads is
about 0.9 and the standard deviation of high-low spread estimates is one-fourth to one-half as large as the
standard deviation of estimates from the popular Roll (1984) covariance spread estimator.
Our spread estimator should prove useful to researchers in a wide variety of applications. Even
with intraday data now widely available, researchers make frequent use of the covariance estimator of
Roll (1984) or its extensions in applications ranging from asset pricing, to corporate finance, to tests of
efficient markets. In some cases, this is because the researcher is studying a time period that predates
intraday data or international markets without intraday data.¹ In other cases, these measures are used
when intraday quotes and trades cannot be reliably matched.² Other low-frequency spread measures based
on the frequency of zero returns are developed in Lesmond, Ogden, and Trzcinka (1999) and have been
applied in a number of recent papers.³

¹ See Asparouhova, Bessembinder, and Kalcheva (2010), Bharath, Pasquariello, and Wu (2008), Gehrig and Fohlin (2006), Kim,
Lin, Singh, and Yu (2007), Lesmond, Schill, and Zhou (2004), or Lipson and Mortal (2007) for estimates of spreads for periods
before the availability of intraday data. See Amihud, Lauterbach, and Mendelson (2002), Chakrabarti, Huang, Jayaraman, and
Lee (2005), or Griffin, Kelly, and Nardari (2007) for examples of the application of Roll spread estimators to international data.
² See Antunovich and Sarkar (2006), Fink, Fink, and Weston (2006), or Schultz (2000a).

The high-low spread estimator derived here has a number of advantages over the daily estimators
used in prior research. First, using TAQ data from 1993 - 2006, we show that it outperforms the still
popular Roll (1984) covariance estimator or the LOT estimator of Lesmond, Ogden, and Trzcinka (1999).
Another advantage is that it is easy to use. We provide a closed-form solution for the spread which can be
easily programmed, unlike measures that require an iterative process (Hasbrouck (2006)) or maximum
likelihood estimation (LOT). Third, unlike Hasbrouck's (2006) Gibbs estimator or the Holden (2009)
measure, the high-low spread estimator is not computer-time intensive, making it ideal for large samples.
Finally, the high-low spread estimator is derived under very general conditions. It is not ad hoc and can
be applied to a variety of markets with different market structures.
We test the performance of the high-low spread estimator by comparing it to monthly TAQ
effective spreads from 1993 through 2006. For comparison purposes, we also estimate spreads from daily
data using the covariance spread estimator of Roll (1984), the effective tick estimator of Holden (2009),
and the LOT measure of Lesmond et al. (1999). Because researchers tackling different problems may care
about different characteristics of the spread estimator, we provide several different performance tests. We
first calculate cross-sectional correlations between spread estimates and TAQ effective spreads on a
month-by-month basis from 1993 through 2006. Examining cross-sectional correlations serves two
purposes. First, in many cases, researchers care about the ability of the spread estimator to capture the
cross-sectional distribution of spreads. Second, looking at cross-sectional correlations on a month-by-
month basis allows us to examine the performance of the estimators in different trading environments.
The three subperiods that we examine, 1993-1996, 1997-2000, and 2001-2006, correspond closely to
periods when the minimum tick size in U.S. markets was one-eighth, one-sixteenth, and one penny,
respectively. In all subperiods, cross-sectional correlations between high-low spreads and TAQ effective
spreads are higher than cross-sectional correlations between TAQ effective spreads and any of the other
estimators. For the entire period, the average cross-sectional correlation of high-low spread estimates with
TAQ effective spreads is 0.829, compared to 0.637 for the Roll spread, 0.683 for the effective tick spread,
and 0.635 for the LOT measure. As additional evidence, we examine cross-sectional correlations between
monthly changes in high-low spreads and monthly changes in TAQ effective spreads. For the entire
period, the high-low spread estimator dominates with an average cross-sectional correlation of 0.472,
compared to 0.249 for the Roll spread, 0.183 for the effective tick spread, and 0.186 for the LOT
measure. Thus, the high-low estimator outperforms the alternative measures in capturing the
cross-sections of both spread levels and month-to-month changes in spreads. Notably, the high-low spread
estimator performs particularly well during the 1993-1996 subperiod when the tick size was one-eighth,
suggesting that it should perform well during earlier time periods when intraday data were not available.

³ See Bekaert, Harvey, and Lundblad (2007), Lesmond, Schill, and Zhou (2004), and Mei, Scheinkman, and Xiong (2005) for
applications of the LOT measure. Amihud (2002) and Pástor and Stambaugh (2003) provide low-frequency measures that
attempt to capture liquidity more generally. These measures tend to be highly correlated with low-frequency spread estimates but
incorporate both spreads and the price impact of trades.

Next, we calculate stock-by-stock time-series correlations between each of the spread estimators
and TAQ effective spreads. This analysis serves two purposes. First, for some applications, researchers
may be particularly interested in the ability of the estimator to capture the time-series of spreads. Second,
this allows us to see how well the estimators perform for different types of stocks. For all size quintiles
and all exchange listings, we find that high-low spreads have much higher average time-series
correlations with TAQ effective spreads than do Roll spreads or spreads estimated from the LOT
measure. For the great majority of stocks, high-low spreads also have higher time-series correlations with
TAQ effective spreads than do effective tick spreads. However, effective tick spread estimates have
higher correlations for the largest stocks, especially in the most recent time period.
The high-low spread estimator is derived under very general conditions. It is simple and easy to
use. Both simulations and comparisons to TAQ data suggest that, for most applications, the high-low
spread estimator outperforms alternative low-frequency spread estimators. It clearly dominates other
low-frequency measures in capturing the cross-sections of bid-ask spreads and month-to-month changes in
spreads. For small stocks, such as those listed on Nasdaq and Amex, the high-low spread also dominates
other measures in capturing the time-series variation in individual stock spreads, especially within
subperiods corresponding to different tick sizes. It is important to note that the high-low spread estimator
captures liquidity more broadly than just the bid-ask spread. Price pressure from large orders will often
lead to execution at daily high or low prices. Likewise, a succession of buy or sell orders in a shallow
market may result in executions at daily high or low prices. The high-low spread estimator captures these
transitory price effects in addition to the bid-ask spread.
To further demonstrate the potential uses of the high-low spread estimator, we provide several
example applications. First, we use the high-low estimator to examine patterns in spreads for NYSE and
Amex stocks for the period from 1926 through 2006. Using these data, we then demonstrate that, despite its
simplicity, the high-low estimator has similar power to the Amihud (2002) measure for capturing the
relation between liquidity and stock returns. We also demonstrate the application of the high-low
estimator to non-U.S. stocks by analyzing patterns in high-low spreads estimated from Datastream data
for stocks listed in Hong Kong and India. These, and several additional applications provided in the internet
appendix, demonstrate the potential uses of the high-low estimator in a wide variety of research areas
across many types of markets.⁴

The remainder of the paper is organized as follows. The high-low spread estimator is derived in
Section I. Section II discusses practical issues in estimating spreads using high and low prices. Section III
discusses existing spread estimators that use daily data and reviews empirical tests of these estimators.
We present simulation results for the high-low spread estimator in Section IV. In Section V, spread
estimates from the high-low spread estimator are compared with TAQ effective spreads and with
estimates based on the Roll spread, the effective tick spread, and the LOT measure. In Section VI, we
provide examples of applications of the high-low spread estimator. Section VII concludes.

⁴ In a recent paper, Deuskar, Gupta, and Subrahmanyam (2011) use the high-low estimator to measure transaction costs in OTC
options markets.

I. The High-Low Spread Estimator
The high-low spread estimator is based on a simple insight. The high-low price ratio reflects both
the true variance of the stock price and the bid-ask spread. While the variance component grows
proportionately with the time period, the spread component does not. This allows us to solve for both the
spread and variance by deriving two equations: the first a function of the high-low ratios on two
consecutive single days and the second a function of the high-low ratio from a single two-day period.
We assume that the true or actual value of the stock price follows a diffusion process. We also
assume that there is a spread of S%, which is constant over the two-day estimation period. Because of the
spread, observed prices for buys are higher than the actual values by (S/2)%, while observed prices for
sells are lower than the actual value by (S/2)%. We assume further that the daily high price is a buyer-
initiated trade and is therefore grossed up by half of the spread, while the daily low price is a seller-
initiated trade and is discounted by one half of the spread. Hence the observed high-low price range
contains both the range of the actual prices and the bid-ask spread. With $H_t^A$ ($L_t^A$) as the actual high (low)
stock price on day $t$ and $H_t^O$ ($L_t^O$) as the observed high (low) stock price for day $t$, we can write
(1)  $$\left[\ln\left(\frac{H_t^O}{L_t^O}\right)\right]^2 = \left[\ln\left(\frac{H_t^A(1+S/2)}{L_t^A(1-S/2)}\right)\right]^2.$$
Rearranging (1) gives
(2)  $$\left[\ln\left(\frac{H_t^O}{L_t^O}\right)\right]^2 = \left[\ln\left(\frac{H_t^A}{L_t^A}\right)\right]^2 + 2\ln\left(\frac{H_t^A}{L_t^A}\right)\ln\left(\frac{2+S}{2-S}\right) + \left[\ln\left(\frac{2+S}{2-S}\right)\right]^2.$$
This equation can be simplified by noting that the natural log of the ratio of high to low prices
that appears as the first term in (2) is proportional to the stock's variance. Specifically, under the
assumptions that stock prices follow the usual geometric Brownian motion and the price is observed
continuously, Parkinson (1980) and Garman and Klass (1980) show that
(3)  $$E\left[\frac{1}{T}\sum_{t=1}^{T}\left(\ln\frac{H_t}{L_t}\right)^2\right] = k_1\sigma_{HL}^2,$$
where $H_t$ ($L_t$) is the high (low) on day $t$ and $k_1 = 4\ln(2)$.⁵
Similarly, Parkinson (1980) shows that
(4)  $$E\left[\frac{1}{T}\sum_{t=1}^{T}\ln\frac{H_t}{L_t}\right] = k_2\sigma_{HL}, \quad \text{where } k_2 = \sqrt{8/\pi}.$$
Taking expectations of (2) and substituting from (3) and (4) yields
(5)  $$E\left[\left(\ln\frac{H_t^O}{L_t^O}\right)^2\right] = k_1\sigma_{HL}^2 + 2k_2\sigma_{HL}\ln\left(\frac{2+S}{2-S}\right) + \left[\ln\left(\frac{2+S}{2-S}\right)\right]^2.$$
The expectation of the sum of (5) over two single days is
(6)  $$E\left[\sum_{j=0}^{1}\left(\ln\frac{H_{t+j}^O}{L_{t+j}^O}\right)^2\right] = 2k_1\sigma_{HL}^2 + 4k_2\sigma_{HL}\ln\left(\frac{2+S}{2-S}\right) + 2\left[\ln\left(\frac{2+S}{2-S}\right)\right]^2.$$
To simplify the notation going forward, we set
(7)  $$\alpha = \ln\left(\frac{2+S}{2-S}\right), \qquad \beta = E\left[\sum_{j=0}^{1}\left(\ln\frac{H_{t+j}^O}{L_{t+j}^O}\right)^2\right].$$
This allows us to rewrite (6) as
(8)  $$2k_1\sigma_{HL}^2 + 4k_2\sigma_{HL}\alpha + 2\alpha^2 - \beta = 0.$$
Equation (8) links the high-low price ratios on two consecutive single days with two unknowns: $\alpha$ and $\sigma_{HL}$.
To solve for these unknowns, we define a second equation that links the high-low ratio from the two-day
period and the same two unknowns. Squaring the log price range over a two-day period yields
(9)  $$\left[\ln\left(\frac{H_{t,t+1}^O}{L_{t,t+1}^O}\right)\right]^2 = \left[\ln\left(\frac{H_{t,t+1}^A}{L_{t,t+1}^A}\right)\right]^2 + 2\ln\left(\frac{H_{t,t+1}^A}{L_{t,t+1}^A}\right)\ln\left(\frac{2+S}{2-S}\right) + \left[\ln\left(\frac{2+S}{2-S}\right)\right]^2,$$
where $H_{t,t+1}$ is the high price over the two days $t$ and $t+1$ and $L_{t,t+1}$ is the low price over days $t$ and $t+1$. To
further simplify notation, we set
(10)  $$\gamma = \left[\ln\left(\frac{H_{t,t+1}^O}{L_{t,t+1}^O}\right)\right]^2.$$
Using this notation and taking expectations in (9) yields
(11)  $$2k_1\sigma_{HL}^2 + 2\sqrt{2}\,k_2\sigma_{HL}\alpha + \alpha^2 - \gamma = 0.$$

⁵ Using a sample of 208 stocks over 29 quarters from January 1973 through March 1980, Beckers (1983) demonstrates that the
high-low variance estimator is more accurate than the traditional variance estimator based on closing prices. See Gallant, Hsu,
and Tauchen (1999) for an application of high-low volatility estimators to the estimation of stochastic volatility.

This leaves two equations, (8) and (11), and two unknowns, $\alpha$ and $\sigma_{HL}$. Because the spread is positive, $\alpha$ must
also be positive. Hence, from (8), we choose the positive root for $\alpha$.
(12)  $$\alpha = -k_2\sigma_{HL} + \sqrt{\sigma_{HL}^2\left(k_2^2 - k_1\right) + \frac{\beta}{2}}.$$
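Substituting from (12) into (11) and rearranging yields
(13)  $$\sigma_{HL}^2\left(k_1 + \left(2-2\sqrt{2}\right)k_2^2\right) + \left(2\sqrt{2}-2\right)k_2\sigma_{HL}\sqrt{\sigma_{HL}^2\left(k_2^2-k_1\right)+\frac{\beta}{2}} + \frac{\beta}{2} - \gamma = 0.$$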
Equation (13) can be easily solved numerically for $\sigma_{HL}$ and the result inserted into (12) to obtain a value for
$\alpha$. A simple transformation of $\alpha$ in (7) then provides the high-low spread estimate:
(14)  $$S = \frac{2\left(e^{\alpha}-1\right)}{1+e^{\alpha}}.$$
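As a brief check on (14), the steps that invert the definition of $\alpha$ in (7) can be written out explicitly. The series expansion in the last line is not part of the original derivation; it is added here only to indicate how close $\alpha$ and $S$ are when the spread is small.

```latex
e^{\alpha} = \frac{2+S}{2-S}
\;\Longrightarrow\; (2-S)\,e^{\alpha} = 2+S
\;\Longrightarrow\; S = \frac{2\left(e^{\alpha}-1\right)}{1+e^{\alpha}},
\qquad
\alpha = \ln\frac{1+S/2}{1-S/2} = S + \frac{S^{3}}{12} + O\!\left(S^{5}\right).
```

For a spread of 5%, for example, the cubic term is roughly 0.00001, so $\alpha$ and $S$ differ only in the fifth decimal place.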
Inspection of either (7) or (14) reveals that, for small spreads, $\alpha$ and the spread are almost equal.
This can simplify estimation in practice. A further simplifying assumption allows us to obtain closed-form
solutions for $\alpha$ and $\sigma_{HL}$. If we ignore Jensen's inequality in (4) and assume that
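(15)  $$\left(E\left[\frac{1}{T}\sum_{t=1}^{T}\ln\frac{H_t}{L_t}\right]\right)^2 = E\left[\frac{1}{T}\sum_{t=1}^{T}\left(\ln\frac{H_t}{L_t}\right)^2\right], \quad \text{so that} \quad k_2^2\sigma_{HL}^2 = k_1\sigma_{HL}^2,$$
then $k_2^2 = k_1$ and (12) and (13) become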
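(12')  $$\alpha = -k_2\sigma_{HL} + \sqrt{\frac{\beta}{2}}$$
(13')  $$\left(3-2\sqrt{2}\right)k_2^2\sigma_{HL}^2 + \left(2\sqrt{2}-2\right)k_2\sigma_{HL}\sqrt{\frac{\beta}{2}} + \frac{\beta}{2} - \gamma = 0.$$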
Rearranging yields
(16)  $$\sigma_{HL}^2 + \frac{2\sqrt{\beta}-\sqrt{2\beta}}{k_2\left(3-2\sqrt{2}\right)}\,\sigma_{HL} + \frac{\beta/2-\gamma}{k_2^2\left(3-2\sqrt{2}\right)} = 0.$$
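Solving for $\sigma_{HL}$ and using the positive root to ensure a positive estimate yields
(17)  $$\sigma_{HL} = \frac{\sqrt{\beta/2}-\sqrt{\beta}}{k_2\left(3-2\sqrt{2}\right)} + \sqrt{\frac{\gamma}{k_2^2\left(3-2\sqrt{2}\right)}}.$$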
Inserting the standard deviation from (17) into (12') provides an estimate of $\alpha$:
(18)  $$\alpha = \frac{\sqrt{2\beta}-\sqrt{\beta}}{3-2\sqrt{2}} - \sqrt{\frac{\gamma}{3-2\sqrt{2}}}.$$
This closed form solution can be inserted into (14) as before to yield our simple high-low estimator.
The spread estimator given in (14) is easy to compute and does not require the researcher to iterate
through successive estimates of the spread to get the correct value. Instead, the procedure we outline above
produces an estimate of the spread and an estimate of the daily standard deviation using only the high and
low prices from two consecutive days. To get spreads for longer periods like a month, we average the
spread estimates from all overlapping two-day periods within the month.
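As an illustration, the closed-form solution in (18) and (14) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' code: the function names `two_day_spread` and `monthly_spread` are hypothetical, the observed squared log ranges stand in for the expectations that define $\beta$ and $\gamma$, and no data adjustments of any kind are applied.

```python
import math

# The constant 3 - 2*sqrt(2) that appears in (17) and (18).
DENOM = 3.0 - 2.0 * math.sqrt(2.0)


def two_day_spread(h0, l0, h1, l1):
    """High-low spread estimate from one pair of consecutive days.

    h0, l0 are the observed high and low on day t; h1, l1 on day t+1.
    The result may be negative.
    """
    # Sample analog of beta in (7): sum of squared one-day log ranges.
    beta = math.log(h0 / l0) ** 2 + math.log(h1 / l1) ** 2
    # gamma in (10): squared log range over the two-day period.
    gamma = math.log(max(h0, h1) / min(l0, l1)) ** 2
    # Closed-form alpha from (18).
    alpha = (math.sqrt(2.0 * beta) - math.sqrt(beta)) / DENOM - math.sqrt(gamma / DENOM)
    # Transform alpha into a spread using (14).
    return 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))


def monthly_spread(highs, lows):
    """Average the estimates from all overlapping two-day periods."""
    if len(highs) != len(lows) or len(highs) < 2:
        raise ValueError("need at least two days of matching highs and lows")
    estimates = [
        two_day_spread(highs[i], lows[i], highs[i + 1], lows[i + 1])
        for i in range(len(highs) - 1)
    ]
    return sum(estimates) / len(estimates)
```

A simple check on the algebra: if the true value stays at 100 and the spread is 2%, every day has a high of 101 and a low of 99, and `two_day_spread(101, 99, 101, 99)` recovers a spread of 0.02 up to rounding error.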
One note of caution is needed here. In estimating spreads and variances, we use the observed ratio
of high to low prices, while the estimator is derived using the expected ratio. Because the variance and the
spread are non-linear functions of the high-low price ratio, an average of spread estimates is not an
unbiased estimate of the spread. However, both our simulation results and empirical analysis suggest that
this is not a problem in practice.⁶
II. Using the High-Low Spread Estimator in Practice
There are a number of implicit assumptions underlying the high-low spread estimator. One is that
the stock trades continuously while the market is open. Another is that stock values do not change while
the market is closed. These assumptions are not true, of course, raising some issues for the estimation of
high-low spreads in practice.
A. Adjustment for Overnight Price Changes
Because markets are closed overnight, the ratio of high to low prices for the two-day period
reflects both the range of prices during each day and the overnight return. On the other hand, the two
single-day high-low ratios reflect only the range of prices during trading hours. Though stock prices are
more volatile during the trading day than at other times, stock prices often move significantly over
non-trading periods (see French and Roll (1986) and Harris (1986)). This causes the high-low price ratio (and
hence variance) estimated using one two-day period to be inflated relative to the variance estimated using
two one-day periods. Without an adjustment for overnight returns, the spread portion of the high-low price
ratio will therefore be underestimated.

⁶ To address the importance of this problem, we reestimate monthly spreads using an average of the high-low ratio parameters
rather than an average of daily spread estimates over the month. We find in both our simulations and empirical tests that this
method does not produce more accurate monthly spread estimates. We are grateful to an anonymous referee for this suggestion.

To correct for overnight returns, we determine whether the close on day t is outside the range of
prices for day t+1 for every pair of consecutive trading days. If the day t+1 low is above the day t close, we
assume the price rose overnight from the close to the day t+1 low and decrease both the high and low for
day t+1 by the amount of the overnight change when calculating spreads. Similarly, if the day t+1 high is
below the day t close, we assume the price fell overnight from the close to the day t+1 high and increase
the day t+1 high and low prices by the amount of this overnight decrease.
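The overnight adjustment just described can be sketched as follows. This is an illustrative sketch only: the function name `adjust_overnight` and its argument names are hypothetical, and the shift is taken in price units, following the description above.

```python
def adjust_overnight(prev_close, high, low):
    """Return the (high, low) pair for day t+1 after removing an overnight move.

    prev_close is the day t close; high and low are the observed day t+1 values.
    """
    if low > prev_close:
        # Price rose overnight: shift the day t+1 range down by the gap.
        gap = low - prev_close
        return high - gap, low - gap
    if high < prev_close:
        # Price fell overnight: shift the day t+1 range up by the gap.
        gap = prev_close - high
        return high + gap, low + gap
    # The day t close lies inside the day t+1 range: no adjustment.
    return high, low
```

For example, with a day t close of 100 and a day t+1 range of 102 to 105, the adjusted range is 100 to 103, so the two-day range no longer includes the overnight gap.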
As an alternative, we could adjust for overnight returns using the difference between the day t
close price and the day t+1 open price. There are three reasons why we do not use this adjustment. First,
we want to adjust only those cases where the true value changes overnight. For many stocks, the change
from close to open is more likely to occur as a result of bid-ask bounce than from an overnight change in
the true value. Second, a primary use of this estimator is to estimate historic trading costs during periods
when data on open prices may not be available. For example, open prices are missing on CRSP from July,
1962 through June, 1992. Finally, we found a small number of cases where the open price was outside the
high-low price range reported by CRSP, suggesting that open price data may be unreliable.⁷
B. True High and Low Prices are not Observed for Infrequently Traded Stocks
High and low prices are observed trade prices. Garman and Klass (1980) note that if a stock trades
infrequently, the observed high price will be lower than the true high price for the day and the observed
low price will be greater than the true low price for the day. In practice, it seems likely that the probability
of a trade will be especially high when prices are near their high and low values for the day. Infrequent
trading is clearly a problem if a stock trades only once during a day or, more generally, if all trades occur
at the same price. In such cases, if the trade price is within the previous day's price range, we assume the
same high and low prices as the previous day. In those less common cases where the high and low are
equal, but at a price outside the previous day's range, we use the same dollar range as the previous day
assuming the high and low are increased or decreased by the amount the price lies outside the previous
day's high-low price range. When a stock does not trade at all during a day, CRSP lists closing bid and ask
prices in place of low and high prices. In practice, a researcher may benefit from using this information
when estimating spreads. However, to provide a fair comparison with other estimators, we eliminate the
bid and ask prices provided by CRSP in these cases and replace them with the most recent high and low
trade prices available from a prior trading day.⁸

⁷ For completeness, we reestimated high-low spreads using no overnight adjustment and using an alternative adjustment based
on the price change from the previous day's close to the current day's open. The results show that the overnight adjustment used
in the paper clearly dominates either of the alternatives. These results are provided in the internet appendix.
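
The treatment of days on which all trades occur at a single price can be sketched as below. The function name `fill_single_price_day` is hypothetical, and the sketch covers only the two single-price cases described in the text, not the replacement of CRSP bid and ask prices on no-trade days.

```python
def fill_single_price_day(price, prev_high, prev_low):
    """Return an assumed (high, low) for a day whose trades share one price."""
    if prev_low <= price <= prev_high:
        # Price inside the previous day's range: reuse that range unchanged.
        return prev_high, prev_low
    if price > prev_high:
        # Price above the previous range: shift the range up by the excess.
        shift = price - prev_high
    else:
        # Price below the previous range: shift the range down (negative shift).
        shift = price - prev_low
    # The dollar width of the range is the same as on the previous day.
    return prev_high + shift, prev_low + shift
```

With a previous-day range of 9 to 11, a single trade at 12 gives an assumed range of 10 to 12, and a single trade at 8 gives an assumed range of 8 to 10.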

C. High-Low Spread Estimates May Be Negative
The high-low estimator assumes that the expectation of a stock's true variance over a two-day
period is twice as large as the expectation of the variance over a single day. Even if this is true in
expectation, the observed two-day variance may be more than twice as large as the single day variance
during volatile periods, in cases with a large overnight price change, or when the total return over the two-
day period is large relative to the intraday volatility. If the observed two-day variance is large enough, the
high-low spread estimate will be negative. For most of the analysis to follow, we set all negative two-day
spreads to zero before calculating monthly averages. As described in more detail below, this produces
more accurate monthly estimates than either including or deleting negative two-day spread observations.⁹
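
The three ways of handling negative two-day estimates mentioned above can be compared with a small helper. This is a sketch under assumed names: `monthly_average` and the option labels "zero", "drop", and "keep" are hypothetical, with "zero" corresponding to the treatment used for most of the analysis.

```python
def monthly_average(two_day_spreads, negatives="zero"):
    """Average two-day spread estimates within a month.

    negatives: "zero" sets negative estimates to zero, "drop" deletes them,
    and "keep" includes them unchanged.
    """
    if negatives == "zero":
        values = [max(s, 0.0) for s in two_day_spreads]
    elif negatives == "drop":
        values = [s for s in two_day_spreads if s >= 0.0]
    elif negatives == "keep":
        values = list(two_day_spreads)
    else:
        raise ValueError("negatives must be 'zero', 'drop', or 'keep'")
    # Return NaN when no observations remain.
    return sum(values) / len(values) if values else float("nan")
```

For two-day estimates of 0.02, -0.01, and 0.04, the three options give monthly averages of about 0.020, 0.030, and 0.017, respectively.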
III. Other Classes of Spread Estimators that Use Daily Data
To our knowledge, this is the first use of high and low prices to estimate trading costs. Researchers
have derived several other classes of spread estimators based on daily data. We describe several of these
alternative estimators below.

⁸ We note that in cases where the current day's high and low are equal to the previous day's high and low, the variance
component of the high-low estimator will be zero and the spread component is set to the high-low range. This may overstate the
spread component in cases where the variance component is small but is not observed.
⁹ Table A1 in the internet appendix describes the frequency of each of the data adjustments described in Section II.

A. Spread Estimators Derived from Return Covariances
Roll (1984) assumes that the true value of a stock follows a random walk and that $P_t$, the observed
closing price on day $t$, is equal to the stock's true value plus or minus half of the effective spread. Under
these conditions, the expected autocorrelation of returns from observed prices will be negative and Roll
derives the following simple estimator for the spread:
(19)  $$S = 2\sqrt{-\operatorname{Cov}\left(\Delta P_t, \Delta P_{t-1}\right)}.$$
Roll's measure is intuitive and easy to compute. It provides accurate spread estimates with
intraday data if a researcher has trade prices but not quotes (Schultz (2000a)). Even with a long time-series
of daily data though, the covariance of price changes is frequently positive, forcing the researcher to
arbitrarily convert an imaginary number into a spread estimate. In fact, Roll (1984) finds that cross-
sectional average covariances are positive for some entire years. In these cases, researchers usually do one
of three things: 1) treat the observation as missing, 2) set the Roll spread to zero, or 3) multiply the
covariance by negative one, estimate the spread, and multiply the spread by negative one to produce a
negative spread estimate.
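The Roll estimator in (19), together with the three conventions for positive covariances listed above, can be sketched as follows. This is an illustrative sketch: the function name `roll_spread` and the option labels are hypothetical, the sample covariance with an n-1 divisor is one possible choice, and the spread is returned in price units because (19) is written in terms of price changes.

```python
import math


def roll_spread(prices, positive_cov="zero"):
    """Roll (1984) spread from a series of closing prices.

    positive_cov controls the case of a positive autocovariance:
    "missing" returns None, "zero" returns 0.0, and "negative" returns
    minus the spread implied by the sign-flipped covariance.
    """
    changes = [b - a for a, b in zip(prices, prices[1:])]
    current, lagged = changes[1:], changes[:-1]
    n = len(current)
    if n < 2:
        raise ValueError("need at least four prices")
    mean_c = sum(current) / n
    mean_l = sum(lagged) / n
    # Sample covariance between consecutive price changes.
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged)) / (n - 1)
    if cov <= 0.0:
        return 2.0 * math.sqrt(-cov)
    if positive_cov == "missing":
        return None
    if positive_cov == "zero":
        return 0.0
    if positive_cov == "negative":
        return -2.0 * math.sqrt(cov)
    raise ValueError("positive_cov must be 'missing', 'zero', or 'negative'")
```

A steadily trending series such as 1, 2, 4, 7, 11 has positively autocorrelated price changes, so the three options return `None`, 0.0, and -2.0, respectively.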
Harris (1990) examines the small-sample properties of the Roll estimator. He demonstrates that the
estimator is noisy even in relatively large samples and shows that the large number of positive
autocovariance estimates is not surprising given the level of noise. He also shows that as a result of
Jensen's inequality, spread estimates are significantly downward biased.
Researchers have proposed and tested a number of refinements to the Roll estimator. George,
Kaul, and Nimalendran (1991) note that the Roll estimator is downward biased if expected returns are
time-varying and hence positively autocorrelated. They propose using a covariance estimator that is based
on the residual of the regression of a stock's return on a measure of its expected return. Holden (2009)
observes that when a stock does not trade for a day, CRSP records the midpoint of its bid-ask range as its
closing price. He proposes a revised version of the Roll estimator in which the covariance of price changes
is divided by the percentage of days with trading. Hasbrouck (2004, 2006) uses a Gibbs sampler and
Bayesian estimation to improve the simple Roll estimator. As in Roll (1984), price changes are assumed to
occur as a result of new, serially uncorrelated information, and as a result of shifts between bid and ask
prices. The Gibbs estimator then uses information in the series of prices to assign a posterior probability
that each specific trade is a buy or sell. Hasbrouck (2006) finds that Gibbs estimates of annual effective
spreads are more accurate than spreads estimated with the basic Roll estimator, but that the procedure is
computationally intensive.
B. Spread Estimators Derived from Transaction Price Tick Size
The effective tick estimator, developed in Holden (2009) and Goyenko et al. (2009), is based on
the idea that wider spreads are associated with larger effective tick sizes. For example, their model assumes
that when both the tick size and the bid-ask spread are one eighth, all possible prices are used, but when
the tick size is one eighth and the spread is one quarter, only prices ending on even-eighths are used.
Christie and Schultz (1994) document a very strong relation between effective tick size and bid-ask
spreads for Nasdaq stocks in the early 1990's, but the relation is much weaker for NYSE stocks.
Goyenko et al. (2009) show that their assumed relation between spreads and the effective tick size
allows researchers to use price clustering to infer spreads. Suppose that there are four possible bid-ask
spreads for a stock: $1/8, $1/4, $1/2 and $1. The number of quotes with odd-eighth price fractions,
associated only with $1/8 spreads, is given by N_1. The number of quotes with odd-quarter fractions, which
occur with spreads of either $1/8 or $1/4, is N_2. The number of quotes with odd-half fractions, which can
be due to spreads of $1/8, $1/4, or $1/2, is N_3. Finally, the number of whole-dollar quotes, which can occur
with any spread width, is given by N_4.
To calculate an effective spread, the proportion of prices observed at each price fraction is
calculated as
(20)  $$F_j = \frac{N_j}{\sum_{j=1}^{J} N_j} \quad \text{for } j = 1, \ldots, J.$$
The unconstrained probability of the $j$th spread (which corresponds to the $j$th price fraction), $U_j$,
occurring is given by
(21)  $$U_j = \begin{cases} 2F_j, & j = 1, \\ 2F_j - F_{j-1}, & j = 2, \ldots, J-1, \\ F_j - F_{j-1}, & j = J. \end{cases}$$
The effective tick measure is a probability-weighted average of all possible spreads. However,
using unconstrained probabilities can be problematic. When the number of observed prices on finer
increments is high, the effective tick estimator's unconstrained probability of a narrow spread can exceed
one and the unconstrained probability of a wider spread may be negative. In the example above, if ten
prices were observed and six had odd-eighth price fractions, the unconstrained probability of a one-eighth
spread would be 1.2. If one of the ten prices had an odd-quarter fraction, the probability of a one-quarter
spread would be 0.2 - 0.6 = -0.4. Holden (2009) and Goyenko et al. (2009) constrain the probabilities of
spreads estimated by the effective tick method to be non-negative and constrain the probability of an
effective spread to be no more than one minus the probability of a finer spread, a practice we also adopt in
our examination of the effective tick estimator.¹⁰,¹¹
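
The calculations in (20) and (21), and the worked example above, can be reproduced with the sketch below. The function names are hypothetical. The `constrained` function reflects one reading of the constraint described in the text, in which each probability is capped at one minus the sum of the constrained probabilities of all finer spreads; this is an assumption rather than a statement of the exact procedure in Holden (2009) or Goyenko et al. (2009). The split of the remaining three prices in the example (two odd-half, one whole-dollar) is also assumed for illustration.

```python
def fractions(counts):
    """Proportion of prices at each price fraction, as in (20)."""
    total = sum(counts)
    return [n / total for n in counts]


def unconstrained(F):
    """Unconstrained spread probabilities U_j, as in (21)."""
    J = len(F)
    U = []
    for j in range(J):
        if j == 0:
            U.append(2.0 * F[0])            # j = 1 in the paper's indexing
        elif j < J - 1:
            U.append(2.0 * F[j] - F[j - 1])  # j = 2, ..., J-1
        else:
            U.append(F[j] - F[j - 1])        # j = J
    return U


def constrained(U):
    """Clamp each probability to [0, 1 - sum of finer constrained probabilities]."""
    g = []
    for u in U:
        g.append(min(max(u, 0.0), 1.0 - sum(g)))
    return g


# Ten prices: six odd-eighth, one odd-quarter, and (assumed) two odd-half
# and one whole-dollar price.
F = fractions([6, 1, 2, 1])
U = unconstrained(F)   # first two entries are approximately 1.2 and -0.4
G = constrained(U)     # all probability mass is assigned to the 1/8 spread
```

In this example the unconstrained probabilities of the one-eighth and one-quarter spreads match the values of 1.2 and -0.4 given in the text, and the constrained version assigns a probability of one to the one-eighth spread.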
C. Spread Estimators Derived from the Frequency of Zero Returns
Lesmond et al. (1999) develop an effective spread estimator (the LOT estimator) based on the
idea that a stock's true return is given by the market model, but the observed return is only different from
zero if the true return exceeds the costs of trading. With $\alpha_1 < 0$ as the cost of selling and $\alpha_2 > 0$ as the cost
of buying, the observed day $t$ stock return $R_t^O$ is:
(22)  $$\begin{aligned} R_t^O &= \beta R_{mt} + \varepsilon_t - \alpha_1 && \text{if } R_t^A < \alpha_1, \\ R_t^O &= 0 && \text{if } \alpha_1 \le R_t^A \le \alpha_2, \\ R_t^O &= \beta R_{mt} + \varepsilon_t - \alpha_2 && \text{if } R_t^A > \alpha_2. \end{aligned}$$

¹⁰ During decimal pricing, we assume the effective tick can be 1¢, 5¢, 10¢, 25¢, 50¢, or $1.00.
¹¹ Holden (2009) derives a spread estimator that nests both the Roll covariance spread estimator and the effective tick estimator
as special cases. The estimator performs well, but is computationally intensive.

Using this relation between trading costs and observed returns, Lesmond et al. (1999) estimate
trading costs by maximizing the likelihood function for a year of daily stock returns with respect to $\alpha_1$, $\alpha_2$, $\beta$,
and $\sigma$. The LOT estimate of the effective spread is then defined as $\alpha_2 - \alpha_1$.¹²,¹³
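
The censoring of returns in (22) can be illustrated with a short function. This sketch covers only the mapping from a true return to an observed return; it does not implement the likelihood estimation. The function name `observed_return` and the numerical thresholds in the example are hypothetical.

```python
def observed_return(true_return, alpha1, alpha2):
    """Observed return implied by (22) for a given true return.

    alpha1 < 0 is the cost of selling and alpha2 > 0 is the cost of buying.
    """
    if true_return < alpha1:
        # Large negative true return: observed net of the selling cost.
        return true_return - alpha1
    if true_return > alpha2:
        # Large positive true return: observed net of the buying cost.
        return true_return - alpha2
    # True return inside the trading-cost band: a zero return is observed.
    return 0.0


def lot_spread(alpha1, alpha2):
    """LOT effective spread, defined as alpha2 minus alpha1."""
    return alpha2 - alpha1
```

With assumed thresholds of -1% and 2%, a true return of 0.5% is observed as zero, a true return of 5% is observed as 3%, and the implied LOT spread is 3%.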
IV. Simulation Results for the High-Low Spread Estimator
To see how well the high-low spread estimator works under different conditions, we simulate
10,000 months of stock data. Each month contains 21 days and each day has 390 minutes. At the
beginning of each month, the stock price is arbitrarily set to $100. Then for each minute of each day, m,
the true value of the stock price, $P_m$, is simulated as
(23)  $$P_m = P_{m-1}e^{\sigma x},$$
where $\sigma$ is the stock standard deviation per minute and $x$ is a random drawing from a unit normal
distribution. The bid price for each minute is obtained by multiplying $P_m$ by one minus half the bid-ask
spread and the ask price is obtained by multiplying $P_m$ by one plus half the bid-ask spread. We assume a
50% chance that the observed price at minute m is a bid, and a 50% chance it is an ask. The high and low
for the day are the highest and lowest observed prices, respectively, whether a bid or ask. The closing
price for the day is the observed price for minute 390.
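The simulation of a single trading day described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the fixed seed, and the conversion of a daily standard deviation to a per-minute value by dividing by the square root of 390 are assumptions made here.

```python
import math
import random

MINUTES_PER_DAY = 390


def simulate_day(start_value, daily_sigma, spread, rng, minutes=MINUTES_PER_DAY):
    """Simulate one day of minute-by-minute prices.

    Returns (high, low, close, last_true_value).
    """
    # Assumption: per-minute sigma is the daily sigma scaled by sqrt(time).
    sigma_minute = daily_sigma / math.sqrt(minutes)
    true_value = start_value
    observed = []
    for _ in range(minutes):
        # Equation (23): P_m = P_{m-1} * exp(sigma * x), with x ~ N(0, 1).
        true_value *= math.exp(sigma_minute * rng.gauss(0.0, 1.0))
        # Each observed price is a bid or an ask with equal probability.
        side = -1.0 if rng.random() < 0.5 else 1.0
        observed.append(true_value * (1.0 + side * spread / 2.0))
    return max(observed), min(observed), observed[-1], true_value


rng = random.Random(0)  # fixed seed so the run is reproducible
high, low, close, last_value = simulate_day(100.0, 0.03, 0.01, rng)
```

The returned last true value can be passed as the starting value of the next day, which matches the no-overnight-return case in which the first minute's true value equals the previous day's last.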
12
Goyenko et al. (2009) propose two alternative methods to define the three regions of the likelihood function used to estimate
the LOT measure. In the first method, referred to as LOT Mixed (as proposed in the original Lesmond et al. (1999) paper), the
regions are defined based on both the stock return and the market return. In the second method, referred to as LOT Y-Split, the
three regions are defined based on the stock return only. Goyenko et al. find that LOT Y-Split generally dominates LOT Mixed,
and it is the LOT Y-Split that we use in this paper. As a robustness check, we reestimated the empirical results from Section V
using the alternative LOT Mixed methodology. These results are provided in the internet appendix. Although the two measures
can produce somewhat different estimates, our conclusions regarding the relative performance of the high-low spread and the
LOT measure are robust to this estimation choice. The conclusions are also robust to the exclusion of outliers.
13
LOT estimation requires the choice of a market return proxy. The results reported in this paper are based on the CRSP
value-weighted index, whereas Lesmond et al. (1999) and Goyenko et al. (2009) use the CRSP equal-weighted index. We reestimated
the LOT Y-Split using the CRSP equal-weighted index as the market return proxy and found that the results are unchanged.
A. The Distribution of Simulated Spread Estimates
We first examine the performance of the high-low spread estimator under the near ideal
conditions of no overnight return and prices observed every minute. For comparison, we also present
simulated results for the Roll spread estimator. For these simulations, we assume that the true value for
the first minute of the day is equal to the true value for the last minute of the previous day. We assume the
daily standard deviation of returns is 3% and repeat the simulations for spreads of 0.5%, 1%, 3%, 5%, and
8%. The simulated monthly spread estimates are described in Panel A of Table I, which reports the mean
and standard deviation of spread estimates across the 10,000 simulations, as well as the percentage of
simulated monthly spreads that are non-positive.
Column 1 reports simulation results for the simple closed-form high-low spread estimator
defined in equations (18) and (14). To estimate monthly spreads, we estimate spreads separately for each
two-day period and calculate the average across all overlapping two-day periods in the month. For this
spread estimator, the mean estimate across 10,000 simulations is very close to the assumed spread
regardless of the spread width. For example, the mean estimate from the simple high-low spread estimator
is 7.84% when the true spread is 8% and is 2.92% when the true spread is 3%. Regardless of the size of
the true spread, the standard deviation of the high-low spread estimates is around 0.62%. For spreads of
3% or more, none of the monthly high-low spread estimates are negative. However, for spreads of 0.5%,
almost 20% of monthly high-low spread estimates are negative.
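The monthly averaging over overlapping two-day periods can be sketched as below. Equations (18) and (14) are not reproduced in this section, so the sketch assumes the closed form usually cited for this estimator; the function names and the constant `K` are illustrative only.

```python
import math

K = 3.0 - 2.0 * math.sqrt(2.0)  # constant in the assumed closed form


def two_day_spread(high1, low1, high2, low2):
    """Closed-form high-low spread estimate for one pair of consecutive days."""
    # Sum of squared one-day log ranges.
    beta = math.log(high1 / low1) ** 2 + math.log(high2 / low2) ** 2
    # Squared log range over the combined two-day window.
    gamma = math.log(max(high1, high2) / min(low1, low2)) ** 2
    alpha = (math.sqrt(2.0 * beta) - math.sqrt(beta)) / K - math.sqrt(gamma / K)
    return 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))


def monthly_spread(highs, lows):
    """Average the estimate over all overlapping two-day periods in the month."""
    estimates = [
        two_day_spread(highs[t], lows[t], highs[t + 1], lows[t + 1])
        for t in range(len(highs) - 1)
    ]
    return sum(estimates) / len(estimates)
```

As a sanity check on the assumed formula, a stock with no volatility and a 2% spread around $100 has a high of 101 and a low of 99 every day, and the estimate recovers 2%. When the two-day range is wide relative to the one-day ranges, the estimate turns negative, which is the case the negative-spread adjustments address.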
The next two columns report simulation results when ad-hoc adjustments are made for negative
spreads. The first adjustment, shown in column two, sets all negative two-day estimates to zero before
taking the monthly average. Under these near ideal conditions, with no overnight returns and almost
continuous observation of prices, this adjustment produces spread estimates that are too large for spreads
of 0.5% to 3%. For wider spreads, setting negative two-day spreads to zero before taking monthly
averages leads to a slight improvement in the average spread estimates. The second adjustment, shown in
the following column, includes negative two-day spread estimates in the monthly average, but sets
negative monthly spread estimates to zero. Using this alternative adjustment produces spread estimates
that are comparable to those from the simple unadjusted high-low spread estimator. Thus, under these
near ideal conditions, it is not worthwhile to adjust for negative estimates when examining means.
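The two ad-hoc adjustments differ only in where the floor at zero is applied. A minimal sketch, taking a list of two-day spread estimates for one month as input (the function names are hypothetical):

```python
def monthly_spread_floor_two_day(two_day_estimates):
    """First adjustment: set negative two-day estimates to zero, then average."""
    floored = [max(s, 0.0) for s in two_day_estimates]
    return sum(floored) / len(floored)


def monthly_spread_floor_month(two_day_estimates):
    """Second adjustment: average all two-day estimates, then floor the month at zero."""
    mean = sum(two_day_estimates) / len(two_day_estimates)
    return max(mean, 0.0)
```

For the estimates 0.02, -0.01, 0.03, and -0.02, the first adjustment gives 0.0125 and the second gives 0.005. The first can never be smaller than the second, which is consistent with it overstating narrow spreads under near-ideal conditions.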
The fourth column of Panel A provides the performance of the high-low spread estimator with the
adjustment for Jensen's inequality in equations (12), (13), and (14). The performance of this alternative
version of the estimator is very similar to, but slightly worse than, the performance of the simple high-low
spread estimator reported in column one. Under these near-ideal conditions, there is little benefit from
incorporating the adjustment for Jensen's inequality.
As noted above, the spread estimate is a non-linear function of the $\beta$ and $\gamma$ parameters. As a
result, averaging two-day spread estimates can produce a biased estimate of the spread. To address the
importance of this Jensen's inequality problem, the estimators in the fifth and sixth columns take a
slightly different approach to aggregating daily high and low prices into a monthly estimate. Rather than
averaging spread estimates across two-day periods within the month, we average the $\beta$ and $\gamma$ parameters
across two-day periods within the month. The fifth column reports results when these monthly parameter
averages are plugged into the simple high-low spread estimator, while column six reports results from the
more complicated estimator that incorporates an adjustment for the other Jensen's inequality complication:
that the square root of the high-low variance estimate is not an unbiased estimate of the standard
deviation. As the table shows, averaging the $\beta$ and $\gamma$ parameters over the month produces less accurate
spread estimates and a larger proportion of non-positive spreads than when two-day spread estimates are
averaged over the month.
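The two aggregation schemes compared in this paragraph can be written side by side. As a sketch, let $S(\beta, \gamma)$ denote the two-day spread estimate as a function of its two inputs and let $N$ be the number of two-day periods in the month; the symbols $\beta$, $\gamma$, $N$, and the subscripted labels are notation assumed here for illustration.

```latex
\bar{S}_{\text{spreads}} = \frac{1}{N}\sum_{i=1}^{N} S(\beta_i, \gamma_i),
\qquad
S_{\text{params}} = S\!\left(\frac{1}{N}\sum_{i=1}^{N}\beta_i,\; \frac{1}{N}\sum_{i=1}^{N}\gamma_i\right).
```

Because $S$ is non-linear in its inputs, the two quantities generally differ. The simulations favor the first, the average of two-day spread estimates.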
The last column provides simulation results for the Roll spread estimator. As described in Section
III, we set the Roll spread to zero in cases where the serial covariance is positive. The results show that
the Roll estimator performs far worse than the high-low spread estimator, especially when spreads are
narrow. When the true spread is 8%, the mean high-low spread estimate is 7.84% and the mean Roll
estimate is 7.54%. More importantly, the standard deviation of spread estimates is 0.0063 for the simple
high-low spread estimator, compared to 0.0279 for the Roll estimator. When the true spread is 0.5%, the
mean high-low spread estimate is 0.59% and the mean Roll estimate is 1.18%. Here again, the Roll
estimates have a much larger standard deviation (0.0137) than the high-low estimates (0.0062).
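For reference, the Roll comparison can be sketched in a few lines. Section III is not reproduced here, so this assumes the usual form of the Roll measure, twice the square root of the negative serial covariance of price changes. The use of log returns, the population covariance, and the function name are choices made for this illustration.

```python
import math


def roll_spread(prices, negative_if_positive_cov=False):
    """Roll-style spread from the serial covariance of log price changes."""
    returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    lead, lag = returns[1:], returns[:-1]
    n = len(lead)
    mean_lead = sum(lead) / n
    mean_lag = sum(lag) / n
    cov = sum((x - mean_lead) * (y - mean_lag) for x, y in zip(lead, lag)) / n
    if cov < 0.0:
        return 2.0 * math.sqrt(-cov)
    if negative_if_positive_cov:
        # Alternative treatment: report a positive covariance as a negative spread.
        return -2.0 * math.sqrt(cov)
    # Common ad-hoc adjustment: set the spread to zero.
    return 0.0
```

The optional flag switches between the two treatments of positive serial covariance: setting the spread to zero or reporting it as a negative spread.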
Under the near-ideal conditions of these simulations, the simple version of the high-low spread
estimator appears to work best, without adjustments for negative spreads or Jensen's inequality. Monthly
spread estimates are also more accurate when two-day spread estimates are averaged within the month, as
opposed to using average parameter estimates within the month. The results also suggest that the high-low
spread estimator performs significantly better than the Roll covariance estimator.
There are several ways in which the above simulation assumptions depart from market realities.
We next examine two important complications that may affect the performance of the spread estimator.
First, overnight returns affect the high-low ratio for a two-day period, but do not affect the high-low ratios
for either of the single days. This leads to an underestimate of spreads. Second, with infrequently
observed prices, the observed high and low price may not reflect the true high and low, leading to a
misestimation of the true spread.
We simulate infrequent observation of prices by assuming there is a 10% chance of observing a
price at any given minute. This corresponds to an average of 39 trades per day - a realistic assumption for
most Nasdaq and smaller NYSE stocks. We simulate overnight returns that are normally distributed with
zero mean and standard deviation equal to 0.5 times the open-to-close standard deviation of returns.
14
We then adjust for the effects of overnight returns based on the method described in Section II.
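The adjustment itself is defined in Section II, which is not reproduced in this section. The sketch below assumes the rule usually associated with this estimator: if a day's entire range lies above (below) the previous close, the gap is treated as an overnight move and the day's high and low are shifted down (up) by that amount. The function name is hypothetical.

```python
def adjust_for_overnight(prev_close, high, low):
    """Shift a day's high and low to remove an apparent overnight price move."""
    if low > prev_close:
        # Price appears to have risen overnight: shift the range down.
        gap = low - prev_close
        return high - gap, low - gap
    if high < prev_close:
        # Price appears to have fallen overnight: shift the range up.
        gap = prev_close - high
        return high + gap, low + gap
    # Previous close lies inside today's range: no adjustment.
    return high, low
```

The adjusted values would then replace the raw high and low when the two-day range is computed. Overnight moves that leave the previous close inside the next day's range go uncorrected, which is one reason an adjusted estimator can still understate spreads.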
Simulation results incorporating both overnight returns and infrequently observed prices are
described in Panel B of Table I. Even after incorporating an adjustment for overnight returns, mean
spread estimates decline for all versions of the high-low spread estimator. The simple high-low spread
estimator provides mean spread estimates of 6.65% when the true spread is 8%, 3.69% when the true
14
Lockwood and Linn (1990) (Table I) estimate the average ratio to be 0.569 for the Dow Jones Industrials over 1964-1989.
Oldfield and Rogalski (1980) (Table I) provide data that allow estimation of the ratio for five large individual stocks for
October 1974 through December 1977. The average ratio across the five stocks is 0.502.
spread is 5%, 0.05% when the true spread is 1%, and -0.24% when the true spread is 0.50%. In contrast to
Panel A, when prices are observed infrequently and there are overnight returns, ad-hoc adjustments for
negative spread estimates make the estimates more accurate. For spreads of 1% or greater, setting
negative two-day spreads to zero before taking monthly averages produces mean estimates that are much
closer to the assumed spreads than the simple high-low spread estimator with no adjustment for negative
spreads. Setting negative two-day spreads to zero before taking monthly averages also results in a much
smaller standard deviation of spread estimates than does the unadjusted estimator. As in Panel A,
adjusting for negative spreads at the daily level appears to work better than adjusting for negative
monthly spreads, and there appears to be little benefit to either incorporating an adjustment for Jensen's
inequality or taking average parameter estimates rather than average two-day spreads within the month.
The last column of the table shows that overnight returns and infrequently observed prices have
little impact on the Roll spread estimator. In Panel B, mean spread estimates from the Roll estimator are
slightly closer to true spreads than are high-low spread estimates for spreads of 3% or greater. However,
the high-low spread estimator with negative two-day spreads set to zero provides better mean estimates
than the Roll estimator for spreads of 0.5% or 1%. More importantly, the standard deviation of spread
estimates from the high-low spread estimator with negative two-day spreads set to zero is only one-half to
one-fourth as large as the standard deviation of Roll spread estimates. Even under these unfavorable
conditions, the high-low spread estimator appears to outperform the Roll estimator.
B. The Cross-Sectional Correlation of Simulated Spread Estimates
The simulations described in Table I illustrate the accuracy of the high-low spread estimator.
Next, we examine how well the different versions of the estimator capture the cross-section of spreads
under alternative assumptions about prices and spreads. Our simulations again consist of 10,000 stock-months
with 21 days in each month and 390 minutes in each day, with stock prices simulated as in
equation (23). However, for each of the 10,000 stock-months, we randomly assign a spread from a
uniform distribution with a range from 0% to 6%. As in Table I, there is a 50% chance that an observed
price is a bid and a 50% chance that it is an ask.
Table II reports the correlations of spread estimates with simulated true spreads across the 10,000
stock months, where the standard deviation of daily returns is set to either 3% or 5%. The first two rows
report simulations under the near-ideal conditions of no overnight return and prices observed each minute.
Under these conditions, correlations between high-low spread estimates and simulated spreads are high
for all versions of the high-low spread estimator, ranging from 0.926 to 0.940 when the daily return
standard deviation is 3%. When the standard deviation of daily returns is 5%, the correlation between
high-low spread estimates and simulated spreads ranges from 0.822 when the $\beta$ and $\gamma$ parameters are
averaged over the month to 0.865 when spreads are calculated for two-day periods and averaged over the
month with negative two-day spreads set to zero.
The correlations between Roll spread estimates and simulated spreads are much lower. When
Roll spreads are set to zero for positive serial correlations, the correlation is 0.573 for a daily return
standard deviation of 3% and 0.338 for a standard deviation of 5%. However, while setting Roll spreads
to zero is the common ad hoc adjustment when the autocovariance is positive, it is not clear a priori
whether this adjustment increases or decreases the correlation between Roll spreads and simulated
spreads. As an alternative, the last column reports correlations between Roll spreads and simulated
spreads when positive serial correlations are treated as negative spreads. Here, the correlation between
Roll spreads and simulated spreads drops to 0.524 when the standard deviation is 3% and to 0.297 when
the standard deviation is 5%.
The next two rows of the table report correlations from simulations that incorporate overnight
returns and the overnight return adjustment described in Table I. As in Table I, the standard deviation of
close-to-open returns is assumed to be 0.5 times the open-to-close return standard deviation. With
overnight returns, correlations decline slightly for all versions of the high-low spread estimator. However,
correlations remain highest for the simple version of the estimator in which negative two-day spreads are
set to zero before taking the monthly average. For this version of the estimator, the correlation falls from
0.940 to 0.925 for a standard deviation of 3% and from 0.865 to 0.835 for a standard deviation of 5%.
The middle two rows of Table II provide correlations between spread estimates and simulated
spreads incorporating both overnight returns and infrequently observed prices. As in Table I, we assume
there is a 10% chance of observing a price at any specific minute. Incorporating infrequent observation of
prices reduces correlations slightly for all versions of the high-low spread estimator. As in Table I,
however, infrequent observation of prices has little impact on the Roll spread estimator. Again, the
correlations suggest that all versions of the high-low spread estimator dominate the Roll spread estimator,
and the simple version of the high-low estimator in which negative two-day estimates are set to zero
outperforms other versions of the high-low spread estimator. This version of the high-low spread
estimator produces a correlation of 0.922 for a daily return standard deviation of 3% and 0.828 for a
standard deviation of 5%.
The next two rows report correlations when daily returns are positively autocorrelated.
Specifically, we assume that the innovation to the expected return is normally distributed with a standard
deviation of 1% per day. The daily expected return is then defined as the sum of the innovation plus 0.5
times the previous day's expected return, and the expected return for each one-minute time interval is the
daily expected return divided by 390. When returns are positively autocorrelated, the correlation with
simulated spreads declines for every version of the high-low spread estimator, but remains high. Again,
the version of the high-low spread estimator in which negative two-day spread estimates are set to zero
produces the highest correlation and significantly outperforms the Roll estimator. For this version of the
high-low spread estimator, the correlation between high-low spread estimates and simulated spreads is
0.914 for a daily standard deviation of 3% and 0.821 for a standard deviation of 5%.
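The autocorrelated expected-return process described in this paragraph is a first-order autoregression and can be sketched as below. The starting value of zero and the function names are assumptions made here; the text does not state how the process is initialized.

```python
import random


def simulate_daily_expected_returns(n_days, rng, innovation_sd=0.01,
                                    persistence=0.5, start=0.0):
    """Daily expected return = innovation + persistence * previous day's value."""
    expected = start  # assumption: the process starts at zero
    path = []
    for _ in range(n_days):
        expected = rng.gauss(0.0, innovation_sd) + persistence * expected
        path.append(expected)
    return path


def per_minute_expected_return(daily_expected_return, minutes=390):
    """Spread the daily expected return evenly over the trading minutes."""
    return daily_expected_return / minutes
```

The per-minute value would enter the price simulation as a drift term added to the random shock in each minute.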
Finally, the last two rows of Table II report correlations under the assumption that spreads
themselves change randomly. In these simulations, we continue to assume that stock prices change
overnight, that there is only a 10% chance that a price is observed at a particular minute, and that daily
returns are positively autocorrelated. For each stock, an initial spread is drawn randomly from a uniform
distribution over 0% to 6%. Then, each day's spread is obtained by multiplying the previous day's spread
by the exponential of a random shock that is normally distributed with zero mean and a standard deviation of 0.1. The simulated
monthly spread is then defined as the average simulated spread across the 21 days within the month.
Interestingly, the correlations between spread estimates and mean simulated spreads are higher for all
estimators than the correlations when spreads are assumed to be constant within the month. This may
reflect that the range of mean spreads in these simulations is wider than the original 0% to 6%. More
important though is that the relative performance of the spread estimators is unaffected by allowing
spreads to vary over time.
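The time-varying spread process can be sketched as a multiplicative random walk. The function name and parameter names are hypothetical, and treating the initial draw as the first day's spread is an assumption made here.

```python
import math
import random


def simulate_spread_path(rng, n_days=21, low=0.0, high=0.06, shock_sd=0.1):
    """Return (daily spreads, monthly mean spread) for one stock-month."""
    spread = rng.uniform(low, high)  # initial spread, uniform on [0%, 6%]
    path = [spread]                  # assumption: the draw is day one's spread
    for _ in range(n_days - 1):
        # Multiply by the exponential of a normal shock with mean zero.
        spread *= math.exp(rng.gauss(0.0, shock_sd))
        path.append(spread)
    return path, sum(path) / len(path)
```

Because the shocks compound, a stock that starts near 6% can drift above it during the month, which is consistent with the remark that the range of mean spreads is wider than the original 0% to 6%.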
The simulation results presented in Tables I and II provide several key findings. First, the simple
high-low spread estimator in which two-day spread estimates are averaged over the month works well and
is far more accurate than the Roll spread. Second, adjusting for Jensen's inequality complicates the estimation
but does not improve the accuracy of the estimates. Third, estimating spreads over two-day periods and
averaging the spreads over a month works better than averaging the parameters and calculating a single
spread for the month. Finally, setting negative two-day spread estimates to zero before calculating
monthly spreads improves estimates, particularly when returns are generated overnight and stocks do not
trade continuously. Throughout the remaining empirical analysis, we present results based on the simple
high-low spread estimator, where monthly spreads are based on an average of two-day spread estimates
after setting negative two-day estimates to zero.
V. A Comparison of Spread Estimates from Daily Data with TAQ Spreads
In this section, we compare the performance of monthly high-low spread estimates to estimates
generated by three common alternative spread estimators: the Roll spread estimator, the effective tick
estimator, and the LOT estimator. These alternatives provide estimates based on the autocovariance of
returns, the price fraction of trade prices, and the frequency of zero returns, respectively. We focus on
these specific estimators because they have been used as building blocks for other, more complex
methods of estimating spreads (see, for example, Holden (2009)). Goyenko et al. (2009) provide a
comprehensive study of the properties of these estimators and estimators derived from them. Given its
simplicity and accuracy, we believe that the high-low spread estimator may also serve as a foundation for
more complex estimation techniques.
Monthly Roll, effective tick, LOT, and high-low spread estimates are calculated using daily data
from CRSP for the period from January 1993 through December 2006. For each spread estimator, we
require at least 12 daily observations to calculate a monthly spread estimate. The CRSP sample includes
all NYSE, Amex, and Nasdaq stocks with CRSP share codes of 10 or 11 (i.e., U.S. common shares).
To assess the performance of these monthly spread measures, we compare them to trade-weighted
effective spreads estimated for each security each month using the NYSE's TAQ data.
15
For each security and each trading day, we first determine the highest bid and lowest ask across all quoting venues at every
point during the day.
16
At any time $t$, let $\mathrm{Bid}_t$ equal the inside bid, $\mathrm{Ask}_t$ equal the inside ask, and $\mathrm{Midpoint}_t$
equal $(\mathrm{Bid}_t + \mathrm{Ask}_t)/2$. To estimate effective spreads, we compare each trade price during the day to the
inside bid and ask posted one second prior to the trade. For each trade $i$, let $\mathrm{Price}_i$ equal the trade price and
$\mathrm{Midpoint}_i$ equal the bid-ask midpoint outstanding at the time of trade $i$. The percentage effective spread
for trade $i$ is then defined as $2\,|\mathrm{Price}_i - \mathrm{Midpoint}_i| / \mathrm{Midpoint}_i$. The average effective spread for each day is a
15
To match securities in the CRSP data to securities in the TAQ data, we first identify all unique cusip-ticker combinations in
both the TAQ and CRSP datasets from 1993 through 2006. We use eight-digit cusip numbers, where cusip numbers for TAQ
securities are taken from the monthly TAQ Master Files. We then merge the TAQ and CRSP samples by cusip and ticker,
assigning a CRSP perm number to each TAQ security. For those securities that cannot be matched in the first step, we then
match based solely on the eight-digit cusip number. Finally, we attempt to match any remaining securities by either ticker
symbol or six-digit cusip number. All securities matched solely by ticker or cusip are then hand verified for accuracy and
corrections are made, where needed. Finally, we hand verify any CRSP-TAQ matches where the number of daily observations in
the two datasets differs by more than 10 days.
16
For Nasdaq securities, we first establish the best bid and ask across all Nasdaq market makers. These inside quotes are then
compared to the quotes on other venues. We apply several standard filters to the trade and quote data. We include only regular
NBBO-eligible quotes with positive prices and positive depth. We also exclude quotes if the ask is less than or equal to the bid or
if either the bid or ask differs by more than 25% from the previous quote. We use only trades that occur during regular trading
hours, have a positive price and quantity traded, have normal condition codes, and have trade correction codes less than two. We
also exclude the first trade each day and trades for which the price differs by more than 25% from the preceding price. Finally,
we exclude observations for which either the effective or quoted spread exceeds $1 with a midpoint of $5 or less, $5 with a
midpoint of $100 or less, or $10 with a midpoint greater than $100.
trade-weighted average across all trades during the day. The monthly Effective Spread for each security is
then defined as the average across all trading days within the month.
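The benchmark calculation can be sketched as follows. The input layout (a list of price, bid, ask tuples per day, with quotes already matched to one second before each trade) and the function names are assumptions made here. "Trade-weighted" is read as giving each trade equal weight.

```python
def effective_spread(price, bid, ask):
    """Percentage effective spread for one trade: 2 * |price - mid| / mid."""
    midpoint = (bid + ask) / 2.0
    return 2.0 * abs(price - midpoint) / midpoint


def daily_effective_spread(trades):
    """Average over all trades in a day; trades is a list of (price, bid, ask)."""
    values = [effective_spread(p, b, a) for p, b, a in trades]
    return sum(values) / len(values)


def monthly_effective_spread(daily_values):
    """Average of the daily effective spreads across trading days in the month."""
    return sum(daily_values) / len(daily_values)
```

A trade at the ask of 101 against a 99 bid gives an effective spread of 2%, and a trade at the midpoint gives zero, so a day with one of each averages 1%.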
Any comparison of alternative spread estimators must be qualified, because the estimators may
capture different components of liquidity or transitory volatility. As noted earlier, the high-low estimator
captures transitory volatility over two trading days, which may include temporary price pressure from
large orders in addition to bid-ask spreads. At the daily frequency, we expect the estimator to closely
approximate effective spreads. We therefore test the performance of the high-low estimator at capturing
effective bid-ask spreads as measured by intraday TAQ data and compare this performance to that of
several alternative low-frequency spread estimators. We note, however, that one inherent benefit of the
high-low estimator is that it may capture other forms of transitory volatility, and therefore liquidity costs,
that are not reflected in the effective spread.
17
A. Summary Statistics
Table III provides summary statistics for spread estimates using the pooled sample of all stocks
and all months from 1993 through 2006 for which all four spread estimators could be calculated. For
comparison purposes, data on effective spreads from TAQ are presented first. The simple average
effective spread from TAQ across all stock months is 2.38%.
Roll spread estimates are reported next. For the full sample of stocks over 1993-2006, positive
monthly serial correlation estimates occur for 38.0% of the stock months. We adopt the common ad-hoc
adjustment of setting Roll spreads to zero in these cases. This yields a mean Roll spread of 2.42%, which
is very close to the mean TAQ effective spread. If the positive correlations are instead omitted, more than
a third of the observations are lost and the mean Roll spread is 3.90%, much greater than the mean
17
To examine the relation between high-low spreads and more general measures of liquidity, we provide results based on the
Amihud illiquidity measure, defined as the average ratio of absolute return to dollar volume across all trading days during the
month. As expected, the high-low spread measure has a higher correlation with the Amihud measure than any of the other
spread estimators. The correlation between the high-low spread and the Amihud measure is 0.359, compared to correlations of
0.297, 0.278, and 0.268, for the Roll spread, effective tick spread, and LOT measure, respectively. These results are provided in
the internet appendix.
effective spread from TAQ. In the analysis to follow, we use the version of the Roll spread estimator in
which positive autocorrelations are defined as zero spreads.
Spread estimates obtained from the effective tick estimator and the LOT estimator are presented
next. By construction, effective tick estimates are always positive. The mean effective tick spread is
1.67% and the median is 0.72%, both of which are less than the comparable effective spread estimate
from TAQ. The mean and median LOT estimates are 2.15% and 0.86%, with 24.4% of monthly LOT
estimates being non-positive.
18
Results for three versions of the high-low spread estimator are reported next. The first high-low
spread estimator sets all negative two-day spread estimates to zero before calculating the monthly
average. This high-low spread estimator produces a mean spread of 2.10%, compared to the mean TAQ
effective spread of 2.60%. The median spread estimate from this version of the high-low spread estimator
is 1.32%, which is very close to the median TAQ effective spread of 1.29%. When negative spreads are
included, the mean high-low spread estimate equals 1.26%, with 24.0% of monthly spread estimates
being non-positive. When negative two-day spreads are omitted, the mean high-low spread increases to
2.76%. These findings are consistent with the simulation results and suggest that the high-low spread
estimator performs best when negative two-day estimates are set to zero before taking the monthly
average. The results throughout the remainder of the paper are therefore based on the simple version of
the high-low spread estimator in which negative two-day spreads are set to zero.
B. Cross-Sectional Comparisons of Spread Estimates with TAQ Effective Spreads
To analyze the performance of the high-low spread estimator, we first calculate the cross-
sectional correlation between spread estimates and the TAQ effective spread each month from 1993
through 2006. This cross-sectional analysis serves two purposes. First, in many applications, researchers
18
The frequency of zero estimates is sensitive to the LOT estimation method. As discussed above, we report results based on the
LOT Y-Split. In robustness tests reported in the internet appendix, we find that the LOT Mixed produces a significantly lower
frequency of zero estimates. In our full sample, the frequency of zero estimates based on the LOT Mixed estimator is 4.4%. As
noted earlier, the conclusions regarding the relative performance of the high-low spread and the LOT measure are unaffected by
the choice of LOT estimation method. See Goyenko et al. (2009) for a description of these alternative estimation methods.
may be particularly concerned with how well the estimator captures the cross-section of execution costs.
Second, examining cross-sectional correlations on a month-by-month basis allows us to examine the
performance of the estimator during different time periods. We calculate time-series averages of the
cross-sectional correlations using the entire period and three subperiods: 1993-1996, 1997-2000, and
2001-2006. These subperiods correspond roughly to the periods when the regulatory minimum tick size
and quoted spread were an eighth of a dollar, a sixteenth of a dollar, and one cent.
19
An important use for all of the low frequency spread estimators is to estimate trading costs for
periods before intraday data were available. Thus, we are particularly concerned with how the spread
estimators perform during the 1993-1996 subperiod. During this period, the tick size in U.S. markets was
$0.125, just as it was during earlier periods. Hence the performance of spread estimators over 1993-1996
is likely to be a better predictor of their performance for earlier periods than is their performance during
either 1997-2000 or 2001-2006.
Panel A of Table IV reports time-series means of the monthly cross-sectional correlations. For the
entire period and for each subperiod, the high-low spread estimator produces higher cross-sectional
correlations with TAQ effective spreads than any of the alternative spread estimators. For the entire
period, the mean cross-sectional correlation of TAQ effective spreads with high-low spreads is 0.829,
compared to a correlation of 0.637 for the Roll spread, 0.683 for the effective tick spread, and 0.635 for
the LOT measure. Based on these results, the high-low spread estimator appears to dominate the
alternative spread estimators at capturing the cross-section of TAQ effective spreads. At 0.930, the cross-
sectional correlation of high-low spread estimates with the TAQ effective spread is particularly high
during the 1993-1996 subperiod. This suggests that the estimator should work well for earlier periods.
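The statistic reported in Panel A, a time-series mean of monthly cross-sectional correlations, can be sketched as below. The nested-dictionary input layout, the function names, and the rule of skipping months with fewer than three matched stocks are assumptions made for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_cross_sectional_correlation(estimates, benchmark, min_stocks=3):
    """Both arguments map month -> {stock: spread}."""
    monthly = []
    for month in sorted(estimates):
        # Use only stocks present in both data sets for this month.
        stocks = sorted(set(estimates[month]) & set(benchmark.get(month, {})))
        if len(stocks) < min_stocks:
            continue  # assumption: skip months with too few matched stocks
        monthly.append(pearson([estimates[month][s] for s in stocks],
                               [benchmark[month][s] for s in stocks]))
    return sum(monthly) / len(monthly)
```

Restricting the months passed in to a given subperiod yields the subperiod averages, and applying the same function to month-to-month changes in spreads gives the change-based version of the statistic.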
19
The minimum tick size on Amex changed from one-eighth to one-sixteenth during May 1997. The change occurred on both
Nasdaq and the NYSE during June 1997. Both the NYSE and Amex began to phase in decimal pricing in August 2000, with
full implementation by January 2001. Nasdaq switched to decimal pricing during March and April of 2001. Hence we define the
1993-1996 period as all months from January 1993 through May 1997 and the 1997-2000 period as all months from June 1997
through December 2000. While not precise cutoffs, these breakpoints should capture the broad differences across the three
tick-size regimes.
The month-by-month cross-sectional correlations between the various spread estimators and TAQ
effective spreads are plotted in Panel A of Figure 1. As the figure shows, the cross-sectional correlations
between high-low spread estimates and TAQ effective spreads are consistently higher than the
correlations based on the Roll spread, the effective tick spread, or the LOT measure. Again, the
correlations based on the high-low spread estimates are particularly high during the 1993-1996 period.
The Roll spread slightly outperforms the effective tick spread and the LOT measure based on cross-
sectional correlations during 1993-1996, but generally underperforms other measures in later periods.
For some applications, researchers may be interested in how well spread estimators capture
changes in spreads. To address this issue, we estimate cross-sectional correlations between month-to-
month changes in spread estimates and changes in TAQ effective spreads. Results are shown in Panel B
of Table IV. Not surprisingly, correlations based on changes in spreads are lower than those based on
spread levels. Still, the high-low spread estimator does a far better job of explaining the cross-section of
changes in TAQ effective spreads than any of the alternative measures. For the entire sample period, the
mean cross-sectional correlation between changes in high-low spreads and changes in TAQ effective
spreads is 0.472. The comparable correlations for the Roll spread, the effective tick spread, and the LOT
measure are 0.249, 0.183, and 0.186, respectively. As in Panel A, the high-low spread estimator performs
best during the 1993-1996 subperiod, with an average cross-sectional correlation of 0.570.
The month-by-month correlations between changes in TAQ spreads and changes in estimated
spreads are plotted in Panel B of Figure 1. The correlations are consistently higher for the high-low
spread estimator than for any of the alternative spread estimators. The effective tick estimator and LOT
measure perform particularly poorly in capturing the cross-section of month-to-month changes in spreads,
especially during the earlier part of the sample period.
Taken together, the results in Table IV and Figure 1 provide clear evidence that the high-low
spread estimator dominates the alternative estimators at capturing the cross-section of both spread levels
and changes in spreads. As noted above, the high-low spread estimator performs particularly well during
the 1993-1996 subperiod, suggesting that it should work well for earlier time periods.
C. The Time-Series of Spread Estimates for Individual Stocks
We next calculate stock-by-stock time-series correlations between the different spread estimates
and the TAQ effective spread. These tests serve two purposes. First, they tell us how well the spread
estimators work for different kinds of stocks. Second, for some applications, researchers may be concerned
with how well the spread estimator captures the time series of spreads. We summarize the time-series
correlations across all stocks, by exchange, and by market capitalization quintile. Quintile breakpoints are
based on NYSE stock capitalizations, so the smaller size quintiles have a disproportionate number of
Nasdaq and Amex stocks. The results provided in the table are based on the exchange listing and size
quintiles of each stock as of its last listing date on CRSP.
The time-series correlations are summarized in Table V. Panels A, B, and C report results for the
three tick-size subperiods, while Panel D reports results for the full sample period. Time-series
correlations between each of the spread estimates and effective spreads are lower than pooled estimates.
This is not surprising as there is a lot of variation in spreads across securities. One clear result that
emerges from Table V is that the high-low spread estimator and the effective tick estimator dominate both
the Roll spread and the LOT measure in explaining time-series variation in the spreads of individual
stocks. Estimates from the Roll spread and the LOT measure produce the lowest time-series correlations
for the full sample of stocks, for stocks on each of the exchanges, and for stocks in all size quintiles.
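The stock-by-stock design described above can be sketched as follows. This is an illustrative Python fragment under assumed inputs, not the procedure used to build Table V: the dictionaries, the grouping labels, and the `min_months` threshold are all hypothetical. For each stock it computes the time-series correlation between the spread estimate and the benchmark spread, then averages those correlations within each group (for example, exchange or size quintile).

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_time_series_corr(estimate, benchmark, group, min_months=12):
    """Average stock-level time-series correlation within each group.

    estimate, benchmark: {stock_id: {month: spread}} (hypothetical layout).
    group: {stock_id: label}, e.g. exchange or size quintile.
    min_months: assumed minimum number of overlapping months per stock.
    """
    by_group = defaultdict(list)
    for stock, est in estimate.items():
        bmk = benchmark.get(stock, {})
        months = sorted(set(est) & set(bmk))
        if len(months) < min_months:
            continue
        rho = pearson([est[m] for m in months], [bmk[m] for m in months])
        by_group[group[stock]].append(rho)
    return {g: sum(v) / len(v) for g, v in by_group.items()}
```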
A second clear result is that the high-low spread estimator outperforms the effective tick
estimator on average because it does a better job with the spreads of smaller stocks. For each of the three
subperiods, the high-low spread estimator has significantly higher time-series correlations with TAQ
effective spreads for Nasdaq and Amex stocks than does the effective tick estimator. However, for
NYSE stocks, the correlation for the high-low spread estimator is slightly lower than that for the effective
tick estimator in the first two subperiods, and substantially lower in the 2001-2006 subperiod. Turning to
size quintiles, we see that the high-low spread estimator exhibits higher time-series correlations than the
effective tick estimator for the first three quintiles in the 1993-1996 subperiod, for the first two quintiles
in the 1997-2000 subperiod, and only for the first quintile in the 2001-2006 subperiod. Notably, these
quintiles contain the majority of the sample stocks. In the 1993-1996 subperiod, for example, the first
quintile includes approximately two-thirds of the sample stocks and the first three quintiles include
approximately 87% of the sample stocks. For the largest firms, however, the effective tick estimator
produces higher time-series correlations with TAQ effective spreads than does the high-low spread
estimator. The high-low spread estimator has trouble with the largest stocks in part because their trading
costs are low, resulting in a small signal-to-noise ratio. In contrast, the effective tick estimator works well
for the largest stocks because this estimator is similar to simply dividing the minimum tick size by the
stock price. This works well when the tick size places a binding lower bound on the spread width.
The subperiod results in Panels A, B, and C suggest that the high-low estimator dominates the
Roll and LOT estimators in capturing time-series variation in spreads. It does better than the effective tick
estimator for most stocks, but not for the largest stocks. In addition, the estimator performs particularly
well in the 1993-1996 subperiod. In Panel D, we present results for the full sample period. On average,
the high-low estimator continues to dominate the alternative estimators. However, the variation in
performance between small and large stocks is magnified in the full sample period results. The average
time-series correlation between high-low spreads and TAQ effective spreads drops from 0.702 for the
smallest size quintile to 0.182 for the largest quintile. In contrast, the average correlation based on the
effective tick estimator increases from 0.550 for the smallest quintile to 0.789 for the largest quintile. In
addition, the correlation based on the LOT measure averages between 0.40 and 0.45 for all
categories of stocks. When combined with the subperiod results, these findings suggest that the improved
performance of the effective tick estimator and the LOT measure for large stocks in the full sample period
may be driven by the ability of these estimators to capture changes in the minimum tick size and the
overall downward trend in spreads over this time period. During historical periods with a constant tick
size, however, we expect the high-low estimator to dominate for all but the largest stocks.
D. Summary of the Estimator Comparisons
In pooled and cross-sectional analyses, the high-low spread estimator dominates the Roll spread
estimator, the effective tick estimator, and the LOT measure. It has significantly higher cross-sectional
correlations with TAQ effective spreads and with month-to-month changes in TAQ effective spreads than
the alternative estimators. The high-low spread estimator does particularly well during the period from
1993 through 1996, when the minimum tick size was $0.125. These results suggest that the high-low
spread estimator may be superior to other estimators for historical analyses.
20
In stock-by-stock time-series analyses, we again find that the high-low spread estimator
dominates the Roll spread estimator and the LOT measure. The effective tick estimator appears to work
well for the very largest stocks, where the tick size provides a binding lower bound on the spread.
However, for the vast majority of stocks, the high-low spread estimator is superior to the effective tick
estimator, especially during the 1993-1996 subperiod when the tick size was one-eighth.
VI. Example Applications of the High-Low Spread Estimator
To demonstrate the potential uses of the high-low spread estimator, we provide several example
applications. The first is a description of historical spreads for NYSE and Amex stocks from 1926
through 2006. The second is an illustration of the potential use of the estimator in asset pricing tests. The
third is an application to non-U.S. markets using data from Datastream.
A. Estimating Historical Spreads for U.S. Stocks using Daily CRSP Data
Using high and low price data from CRSP, we calculate bid-ask spreads based on the high-low
estimator for each NYSE/Amex stock each month from 1926 to 2006. As before, monthly spreads are
defined as the average of all two-day spreads within the calendar month, negative two-day spread
estimates are set to zero, and we require a minimum of 12 daily price ranges to calculate a monthly
spread.
21
The results are illustrated in Figure 2.
20
We also calculate mean absolute errors based on the difference between monthly spread estimates for each of the
low-frequency estimators and monthly effective spreads from TAQ. Across all sample months, the cross-sectional mean absolute
error averages 0.0090 for the high-low spread estimator, compared to 0.0166 for the Roll spread, 0.0113 for the effective tick
spread, and 0.0132 for the LOT measure. Results based on month-to-month changes in spreads provide similar conclusions.
These results suggest that the high-low spread estimator is more accurate than the three alternative estimators in capturing both
the level of spreads and changes in spreads. These results are provided in the internet appendix.
Panel A of Figure 2 plots the cross-sectional average of high-low spread estimates for
NYSE/Amex stocks each month from 1926 through 2006. Results are shown for the full sample of
NYSE/Amex stocks and for the smallest and largest market capitalization deciles. Examining the market-
wide average, we see that spreads display considerable variation over time. They were very high in the
early years of the depression, with mean spreads exceeding 10% for several months in 1932 and 1933.
Spreads declined in 1935 and 1936 but increased sharply as the market performed poorly in 1937 and
1938. Spreads declined steadily until the early 1950's and remained relatively low through the early
1970's. The recession of 1974-1975 is clearly visible in the figure as a period of increased spreads.
Spreads are also relatively high in the early 1990's and during the tech bubble of the late 1990's.
As expected, the results show that small stocks tend to have higher execution costs than large
stocks. However, the graph also illustrates that the difference between these groups is highly variable. For
most months, spreads are 1 to 2 percentage points higher for small stocks than large stocks. During the depression, on
the other hand, small stock spreads sometimes exceeded large stock spreads by 50 percentage points: at times when
spreads were 8% or 9% for large stocks, they were around 60% for small stocks. This shows that trading
strategies involving small stocks were extremely expensive during the depression. It also indicates that if
the returns to small stocks contain a premium to compensate for trading costs, that premium would have
been especially high in the 1930's.
Panel B of Figure 2 provides a similar graph for the 1950-2006 subperiod. By omitting the
depression and altering the scale of the graph, we get a clearer picture of the intertemporal variation in
spreads over the last 50 years. Here, the impact of recessions and stock market declines in 1974-1975 and
1991-1992, the 1987 crash, and the technology bubble are clearly visible. The difference between
spreads of small and large stocks was relatively large in the mid-1970's and also in the early 1990's.
However, in recent years, the difference in spreads between small and large stocks has shrunk to almost
nothing. Thus, while trading strategies involving small stocks may have been prohibitively expensive
during the mid-1970s and early-1990s, these trading strategies may be more profitable today.
21
As noted in Section II, CRSP provides closing bid and ask prices in place of low and high prices on those days when a stock
does not trade. In this section, we make use of this information and include these prices when calculating the high-low spread
estimator.
The point of this exercise is to illustrate how the high-low estimator can be used in practice.
However, there is also a lesson in the analysis: trading costs prior to the early 1940s are too large to be
ignored. The high-low estimator allows researchers studying this period to incorporate bid-ask spreads.
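As an illustration of the monthly calculation used in this section, the following Python sketch applies the closed-form high-low estimator derived earlier in the paper to each pair of consecutive days, sets negative two-day estimates to zero, and averages within the month. It is a simplified sketch rather than the authors' program: the function and parameter names are hypothetical, the overnight-return adjustment and the treatment of non-trading days are omitted, and the 12-range minimum is interpreted here as requiring at least 12 days with high and low prices.

```python
import math

# Constant 3 - 2*sqrt(2) from the closed-form solution.
K = 3.0 - 2.0 * math.sqrt(2.0)


def two_day_spread(h1, l1, h2, l2):
    """High-low spread estimate from two consecutive days' high and low prices."""
    # beta: sum of squared single-day log high-low ratios.
    beta = math.log(h1 / l1) ** 2 + math.log(h2 / l2) ** 2
    # gamma: squared log high-low ratio over the two-day window.
    gamma = math.log(max(h1, h2) / min(l1, l2)) ** 2
    alpha = (math.sqrt(2.0 * beta) - math.sqrt(beta)) / K - math.sqrt(gamma / K)
    return 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))


def monthly_spread(highs, lows, min_ranges=12):
    """Average two-day spread within a month, with negative estimates set to zero.

    highs, lows: daily high and low prices for one stock-month, in date order.
    Returns None when fewer than min_ranges daily ranges are available.
    """
    if len(highs) != len(lows):
        raise ValueError("highs and lows must have the same length")
    if len(highs) < min_ranges:
        return None
    estimates = [
        max(0.0, two_day_spread(highs[i], lows[i], highs[i + 1], lows[i + 1]))
        for i in range(len(highs) - 1)
    ]
    return sum(estimates) / len(estimates)
```

In the limiting case of no volatility, where every daily high is an ask and every daily low is a bid around a constant price, the two-day estimate recovers the assumed spread exactly. A two-day window with a large price move and no intraday range produces a negative estimate, which is why the zero floor matters in practice.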
B. Using High-Low Spread Estimates in Asset Pricing Tests
In recent years, there has been a great deal of interest in the effects of liquidity and liquidity risk
on asset pricing. We leave it to another paper to thoroughly study the impact of liquidity, as measured by
our estimator, on asset prices. However, to demonstrate the potential usefulness of the estimator in asset
pricing studies, we provide an analysis of abnormal returns on portfolios sorted by liquidity, where
liquidity is defined by either the Amihud (2002) measure or the high-low spread estimator.
The Amihud measure, like the high-low spread estimator, is well suited to asset pricing studies
because it can be estimated for a long time-series using daily data. However, it is important to note that
the two estimators may capture very different things. The Amihud illiquidity measure captures how much
a given trading volume moves prices. The high-low spread estimator captures transitory volatility at the
daily level and will closely approximate the effective spread, or the cost of immediacy. The high-low
estimator also has the advantage that it does not require data on trading volume. It can therefore be
applied in settings such as emerging markets, where the quality or availability of volume data may be
suspect (see, for example, Bekaert, Harvey, and Lundblad (2007)), and for comparisons across markets
such as the NYSE and Nasdaq, where volume measures may not be comparable.
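To make the contrast in data requirements concrete, the sketch below computes the Amihud (2002) illiquidity ratio as the average over days of the absolute daily return divided by that day's dollar volume. The function name and input layout are hypothetical, and skipping zero-volume days is an assumption of this sketch. Unlike the high-low estimator, the calculation cannot proceed without usable volume data.

```python
def amihud_illiquidity(returns, dollar_volumes):
    """Average of |daily return| / daily dollar volume.

    returns: daily returns as decimals (0.01 means 1%).
    dollar_volumes: daily dollar trading volume, same order as returns.
    Days with zero or missing-as-zero volume are skipped (an assumption).
    Returns None when no day has positive volume.
    """
    ratios = [abs(r) / v for r, v in zip(returns, dollar_volumes) if v > 0]
    if not ratios:
        return None
    return sum(ratios) / len(ratios)
```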
For each month from June 1926 through December 2006, we calculate Amihud illiquidity
measures and high-low spread estimates for each NYSE/Amex stock with a price of $5 or more. We sort
stocks each month into ten portfolios based on their average Amihud measure and, separately, into ten
portfolios based on their average high-low spread, using data from the prior six months. One-month-ahead
and six-month-ahead returns are then calculated for each portfolio. To avoid the biases in return estimates
that come from using equal-weighted portfolios, we weight each stock by its prior month return as
suggested by Asparouhova, Bessembinder, and Kalcheva (2010). Abnormal returns are calculated for each
portfolio by regressing the time series of monthly returns on the Fama-French factors, as obtained from
Ken French's website.
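The sorting and weighting steps can be sketched as follows. This Python fragment is illustrative only and rests on several assumptions that should be checked against the original procedure: stocks are split into equal-count groups, the liquidity input is taken to be already averaged over the prior six months, and "prior month return" is read as the prior-month gross return (one plus the return). All names are hypothetical, and the factor regression itself is not shown.

```python
def assign_portfolios(liquidity, n_portfolios=10):
    """Rank stocks by an illiquidity measure and split into equal-count groups.

    liquidity: {stock_id: value}, where larger values mean less liquid.
    Returns {stock_id: portfolio index}, with 0 the most liquid group.
    """
    ranked = sorted(liquidity, key=liquidity.get)
    n = len(ranked)
    return {s: i * n_portfolios // n for i, s in enumerate(ranked)}


def return_weighted_return(members, prior_return, next_return):
    """Portfolio return with each stock weighted by its prior-month gross return.

    members: iterable of stock ids in the portfolio.
    prior_return, next_return: {stock_id: return as a decimal}.
    """
    # Assumed weighting: 1 + prior-month return for each stock.
    weights = {s: 1.0 + prior_return[s] for s in members}
    total = sum(weights.values())
    return sum(weights[s] * next_return[s] for s in members) / total
```

The resulting monthly portfolio returns would then be regressed on the factor returns to obtain the intercepts discussed below.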
Panel A of Table VI provides results for the month after portfolio formation, where portfolios are
formed based on the Amihud measure. Looking across each row, we see that as we go from liquid to
illiquid stocks, coefficients on SMB and HML increase sharply. The intercept coefficient is small and
insignificant in general, but is 0.0094 and highly significant for the portfolio of least liquid stocks. The
last column shows the coefficients from a regression of the difference in returns between the least and
most liquid stocks on the Fama-French factors. This long-short portfolio loads strongly on SMB and
HML. The intercept indicates that after adjusting for the Fama-French factors, this portfolio would earn a
statistically significant abnormal return of 1.03% per month. These findings are generally consistent with
the results from prior studies that show a significant relation between liquidity, as measured by the
Amihud measure, and stock returns.
Results for one-month returns on portfolios sorted based on the high-low spread estimator are
provided in Panel B of Table VI. Despite the estimator's simplicity and other advantages, the results are very similar
to those based on the Amihud measure. Again, coefficients on SMB and HML increase as we go from
portfolios formed of more liquid stocks to portfolios formed of less liquid stocks. More importantly, the
intercept for the illiquid portfolio is 0.0120 and the intercept from the regression based on the difference
between the least liquid and most liquid portfolios is a statistically significant 1.05% per month.
Panel C reports regression results for six-month returns, where portfolios are formed based on the
Amihud measure. Because the six-month periods overlap, all t-statistics are calculated based on Newey-West
standard errors with five lags. As in Panels A and B, the less liquid portfolios have larger
coefficients on SMB and HML than do the more liquid portfolios. The regression intercept for the least
liquid portfolio suggests that these stocks earn a statistically significant abnormal return of 5.50% in the
six months after portfolio formation. Again, the last column shows regression results when the difference
in returns between the least liquid and most liquid portfolios is regressed on the Fama-French factors.
The intercept is a highly significant 6.53% per six months.
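The role of the Newey-West correction for overlapping six-month returns can be illustrated in the simplest case, a regression on a constant only. The sketch below computes the Newey-West standard error of a sample mean using Bartlett weights. It is a simplified stand-in, not the regressions reported in Table VI: those include the Fama-French factors as regressors, and the function name here is hypothetical.

```python
import math


def newey_west_se_of_mean(x, lags=5):
    """Newey-West (Bartlett kernel) standard error of the sample mean of x.

    Equivalent to the HAC standard error of the intercept in a regression
    of x on a constant only.
    """
    n = len(x)
    mean = sum(x) / n
    e = [v - mean for v in x]
    # Lag-0 term: the usual variance of the residuals.
    long_run_var = sum(v * v for v in e) / n
    for k in range(1, lags + 1):
        # Sample autocovariance at lag k.
        gamma_k = sum(e[t] * e[t - k] for t in range(k, n)) / n
        # Bartlett weight declines linearly in the lag.
        weight = 1.0 - k / (lags + 1.0)
        long_run_var += 2.0 * weight * gamma_k
    return math.sqrt(long_run_var / n)
```

With positively autocorrelated observations, as overlapping multi-month returns typically are, the corrected standard error exceeds the unadjusted one, so t-statistics shrink accordingly.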
Results for six-month returns on portfolios formed based on the high-low estimator are provided
in Panel D of Table VI. Again, the results are very similar to those based on the Amihud measure. The
intercept for the least liquid portfolio is now 6.64%. Further, when the difference in returns between the
least liquid portfolio and the most liquid portfolio is regressed on the Fama-French factors, the intercept is
0.0626 and is highly significant. This difference in abnormal returns is slightly smaller than, but very
similar to that for portfolios sorted based on the Amihud measure.
Overall, the results in Table VI suggest that the power of the high-low spread estimator to predict
cross-sectional differences in returns is very similar to that of the Amihud illiquidity measure. Given its
simplicity and potential application in settings where the Amihud measure cannot be used, this suggests
that the high-low spread estimator may have many potential applications in asset pricing and other related
research areas.
C. Estimating Spreads for Non-U.S. Markets using Datastream Data
To demonstrate the applicability of the high-low estimator to non-U.S. markets, we estimate high-
low spreads for individual stocks in Hong Kong and India using daily high and low prices from
Datastream.
22
As discussed below, each of these markets provides a specific event around which we
expect execution costs to change. Results for additional countries covered by Datastream are provided in
the internet appendix. Again, we include only those stock-months with at least 12 daily spread
observations and we set all negative estimates to zero before taking the monthly average.
22
Datastream data definitions suggest that closing bid and ask data are available for a small number of countries beginning in
the mid-1990s, while data on daily high and low prices are available for more than 50 countries and for longer time periods.
While data coverage in Datastream differs by country and improves over time, this suggests that the high-low spread estimator
will be a useful tool for researchers who require a measure of liquidity or transaction costs for these markets. Intraday quote data
are available for many international markets through Thomson Reuters and Bloomberg starting in the mid-1990s, but earlier
intraday data is limited. Lesmond (2005) studies the ability of the Roll (1984), Amihud (2002), and Lesmond et al. (1999)
measures to explain differences in bid-ask spreads within and across emerging markets.
Hong Kong was significantly affected by the Asian Currency Crisis beginning in October 1997,
when its currency came under pressure. During this period, the equity market in Hong Kong became more
volatile, with the Hang Seng Index falling 23% between October 20 and 23, 1997. We expect a
significant increase in execution costs in the Hong Kong market during this period. The cross-sectional
average of high-low spread estimates for stocks in Hong Kong is plotted by month in Panel A of Figure 3.
Because Datastream coverage increases over time, the graph also plots the number of firms used to
compute the market-wide average in each month. As expected, average bid-ask spreads in Hong Kong
increased sharply starting in October 1997. Average spreads increased from approximately 0.75% prior
to 1997 to over 1.5% in late 1997, peaking at 2.3% in February 2000. This shift in spreads coincides with
the Asian Currency Crisis and related turmoil in Hong Kong's equity markets in 1997-1998.
As of 1994, the Bombay Stock Exchange (BSE) was India's dominant market, accounting for
75% of equity volume. In November 1994, the National Stock Exchange (NSE) opened, providing Indian
investors with an order-driven electronic limit order book, reduced tick sizes, satellite technology with
links to sites all over India, and improved settlement and clearing standards (see Shaw and Thomas
(2000)). By October 1995, NSE had surpassed the BSE, becoming the dominant equities market in India.
We expect execution costs to decrease with the introduction of this new market structure. Monthly high-
low spread estimates for India are plotted in Panel B of Figure 3. Again, the graph shows the cross-
sectional average across all stocks with available data in a given month, along with the number of firms
used to compute the market-wide average each month. As expected, the average bid-ask spread across
stocks in India decreased sharply in early 1995. Bid-ask spreads dropped from an average of
approximately 4.5% in early 1994 to approximately 1.5% in early 1995. Spreads remain low after the
introduction of the NSE, ranging from one to two percent from 1995 through 2006. This shift in spreads
is consistent with the hypothesis that the change in market structure brought about by the introduction of
the NSE led to a significant and permanent decrease in execution costs in India.
VII. Summary and Conclusions
In this paper, we derive a new technique for estimating bid-ask spreads from high and low prices.
The estimator is intuitive and easy to calculate. It is derived under very general conditions and does not
rely on the characteristics of any particular market. We provide a closed-form solution for the spread, so it
is easy to program and requires little computation time. The high-low spread estimator can be used with
daily high and low prices when intraday trade and quote data are unavailable. It can also be used to
estimate spreads from intraday trades when quotes are unavailable or are difficult to match with trades. It
is useful for researchers who need a simple but accurate measure of trading costs for work in corporate
finance, asset pricing, or as part of a study of market efficiency.
Simulations reveal that the high-low spread estimator is very accurate under ideal conditions.
When there are significant overnight returns and prices are observed sporadically, the high-low spread
estimator tends to underestimate spreads. Even under these more general conditions, however, high-low
spread estimates produce a correlation with simulated spreads of approximately 0.9. The simulations also
suggest that the high-low spread estimator is far more accurate than the Roll estimator.
The simulation results are borne out in the data. We examine the performance of the high-low
estimator by comparing effective spreads from TAQ with spread estimates from the high-low spread
estimator, the Roll (1984) covariance estimator, the effective tick estimator of Goyenko et al. (2009) and
Holden (2009), and the LOT measure of Lesmond et al. (1999). In cross-sectional tests, the high-low
spread estimator clearly dominates, providing higher correlations with TAQ effective spreads and with
month-to-month changes in spreads than any of the alternative spread estimators. The high-low spread
estimator works particularly well in the 1993-1996 subperiod when the minimum tick was one-eighth.
This suggests that the estimator should perform well in applications involving data from earlier time
periods. In time-series tests, the high-low spread estimator dominates other measures for smaller stocks,
such as those listed on Nasdaq and Amex, and performs particularly well during subperiods defined by
different tick sizes.
To illustrate the potential applications of the high-low spread estimator, we apply the estimator to
several other settings. First, we use the estimator to calculate bid-ask spreads for all NYSE/Amex stocks
from 1926 through 2006. Among other things, we show that effective spreads were extremely high during
the depression, and increased sharply in the 1974-1975 bear market and following the 1987 crash. Using
this same data, we then examine the performance of the high-low spread estimator in simple asset pricing
tests. Despite its simplicity and other advantages, we find that the power of the high-low spread estimator
to predict cross-sectional differences in returns is very similar to that of the Amihud illiquidity measure.
We also document the potential application of the estimator to non-U.S. markets by calculating high-low
spreads for securities in India and Hong Kong using data from Datastream. Several additional
applications are provided in the internet appendix. Together, these examples demonstrate the wide range
of applications for which the high-low spread estimator can be used. The high-low spread estimator can
also be used to calculate trading costs for assets other than common stock or in settings where quote data
are either unavailable or difficult to use. For example, Deuskar, Gupta, and Subrahmanyam (2011) apply
the estimator to OTC options markets and the estimator could also be applied to trade and sales data from
the futures markets.
The most important direction for further research may not be with spread estimation at all. In
deriving our spread estimator, we jointly derive an estimate of the spread and an estimate of the variance
of a stock's true value - that is, the variance without microstructure noise. Bid-ask spreads can induce a
significant upward bias in variance estimates for small stocks or even large stocks during periods with
high trading costs. Hence a variance measure that is free from bid-ask bounce may prove very useful.
23
We leave a more detailed analysis of this high-low variance estimator to future work.
23
Bandi and Russell (2006) use high-frequency data to separate the true variance from microstructure noise for S&P 100
stocks.
References
Amihud, Yakov, 2002, Illiquidity and stock returns: Cross-section and time-series effects, Journal of
Financial Markets 5, 31-56.
Amihud, Yakov, Beni Lauterbach, and Haim Mendelson, 2003, The value of trading consolidation:
evidence from the exercise of warrants, Journal of Financial and Quantitative Analysis 38, 829-
846.
Angel, James, 1997, Tick size, share prices, and stock splits, Journal of Finance 52, 655-681.
Antunovich, Peter, and Asani Sarkar, 2006, Fifteen minutes of fame? The market impact of internet stock
picks, Journal of Business 79, 3209-3251.
Asparouhova, Elena, Hendrik Bessembinder, and Ivalina Kalcheva, 2010, Liquidity biases in asset pricing
tests, Journal of Financial Economics 96, 215-237.
Bandi, Federico, and Jeffrey Russell, 2006, Separating microstructure noise from volatility, Journal of
Financial Economics 79, 655-692.
Beckers, Stan, 1983, Variances of security price returns based on high, low, and closing prices, Journal of
Business 56, 97-112.

Bekaert, Geert, Campbell Harvey, and Christian Lundblad, 2007, Liquidity and expected returns: Lessons
from emerging markets, Review of Financial Studies 20, 1783-1831.
Bharath, Sreedhar, Paolo Pasquariello, and Guojun Wu, 2008, Does asymmetric information drive capital
structure decisions?, Forthcoming, Review of Financial Studies.
Chakrabarti, Rajesh, Wei Huang, Narayanan Jayaraman, and Jinsoo Lee, 2005, Price and volume effects of
changes in MSCI Indices - nature and causes, Journal of Banking and Finance 29, 1237-1264.
Chan, K. C., W. G. Christie, and P. H. Schultz, 1994, Market structure and the intraday evolution of bid-
ask spreads for Nasdaq securities, Journal of Business 68, 35-60.
Christie, William, and Paul Schultz, 1994, Why do Nasdaq market makers avoid odd-eighth quotes?,
Journal of Finance 49, 1813-1840.
Chordia, Tarun, Sahn-Wook Huh, and Avanidhar Subrahmanyam, 2007, The cross-section of expected
trading activity, Forthcoming, Review of Financial Studies.
Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2001, Market liquidity and trading activity,
Journal of Finance 56, 501-530.
Conroy, Robert, Robert Harris, and Bruce Benet, 1990, The effects of stock splits on bid-ask spreads,
Journal of Finance 45, 1285-1295.
Cremers, K. J. Martijn, and Jianping Mei, 2007, Turning over turnover, Review of Financial Studies 20,
1749-1782.
Deuskar, Prachi, Anurag Gupta, and Marti G. Subrahmanyam, 2011, Liquidity effect in OTC options
markets: Premium or discount? Journal of Financial Markets 14, 127-160.
Fink, Jason, Kristin Fink, and James Weston, 2006, Competition on the Nasdaq and the growth of
electronic communication networks, Journal of Banking and Finance 30, 2537-2559.
French, Kenneth, and Richard Roll, 1986, Stock return variances: The arrival of information and the
reaction of traders, Journal of Financial Economics 17, 5-26.
Gallant, A. Ronald, Chien-Te Hsu, and George Tauchen, 1999, Using daily data to calibrate volatility
diffusions and extract the forward integrated variance, Review of Economics and Statistics 81, 617-631.

Garman, M. B., and M. J. Klass, 1980, On the estimation of security price volatilities from historical data,
Journal of Business 53, 67-78.
Gehrig, Thomas, and Caroline Fohlin, 2006, Trading costs in early securities markets: The case of the
Berlin stock exchange 1880-1910, Review of Finance 10, 587-612.
George, Thomas, Gautam Kaul, and M. Nimalendran, 1991, Estimation of the bid-ask spread and its
components: A new approach, Review of Financial Studies 4, 623-656.
Goyenko, Ruslan, Craig Holden, and Charles Trzcinka, 2009, Do liquidity measures measure liquidity?,
Forthcoming, Journal of Financial Economics.
Griffin, John, Patrick Kelly, and Federico Nardari, 2007, Measuring short-term international stock market
efficiency, Working paper, University of Texas.
Griffin, John, Federico Nardari, and René Stulz, 2007, Do investors trade more when stocks have
performed well? Evidence from 46 countries, Forthcoming, Review of Financial Studies.
Harris, Lawrence, 1986, A transaction data study of weekly and intradaily patterns in stock returns,
Journal of Financial Economics 16, 99-117.
Harris, Lawrence, 1989, A day-end transaction price anomaly, Journal of Financial and Quantitative
Analysis 24, 29-45.
Harris, Lawrence, 1990, Statistical properties of the Roll serial covariance bid/ask spread estimator,
Journal of Finance 45, 579-590.
Hasbrouck, Joel, 2004, Liquidity in the futures pits: Inferring market dynamics from incomplete data,
Journal of Financial and Quantitative Analysis 39, 305-326.
Hasbrouck, Joel, 2006, Trading costs and returns for U.S. equities: Estimating effective costs from daily
data. Working paper, New York University, New York, NY.
Holden, Craig, 2009, New low-frequency spread measures, Journal of Financial Markets 12, 778-813.
Kim, Joonghyuk, Ji-Chai Lin, Ajai Singh, and Wen Yu, 2007, Dual-class splits and stock liquidity,
Working paper, Case Western University.
Kyle, Albert, 1985, Continuous auctions and insider trading, Econometrica 53, 1315-1335.
Lesmond, David, 2005, Liquidity of emerging markets, Journal of Financial Economics 77, 411-452.
Lesmond, David, Joseph Ogden, and Charles Trzcinka, 1999, A new estimate of transactions costs, Review
of Financial Studies 12, 1113-1141.
Lesmond, David, Michael Schill, and Chunsheng Zhou, 2004, The illusory nature of momentum profits,
Journal of Financial Economics 71, 349-380.
Lipson, Marc, and Sandra Mortal, 2007, Capital structure decisions and equity market liquidity, Working
paper, University of Virginia.
Lockwood, Larry J., and Scott C. Linn, 1990, An examination of stock market return volatility during
overnight and intraday periods, 1964 - 1989, Journal of Finance 45, 591-601.
McInish, Thomas H., and Robert A. Wood, 1992, An analysis of intraday patterns in bid/ask spreads for
NYSE stocks, Journal of Finance 47, 753-764.
Mei, Jianping, José Scheinkman, and Wei Xiong, 2005, Speculative trading and stock prices: Evidence
from Chinese A-B share premia, Working paper, Princeton University.
Oldfield, George, and Richard Rogalski, 1980, A theory of stock returns over trading and non-trading
periods, Journal of Finance 35, 729-751.
Parkinson, M., 1980, The extreme value method for estimating the variance of the rate of return, Journal of
Business 53, 61-65.
Pástor, Ľuboš, and Robert Stambaugh, 2003, Liquidity risk and expected stock returns, Journal of Political
Economy 111, 642-685.
Porter, David, 1992, The probability of a trade at the ask: An examination of interday and intraday
behavior, Journal of Financial and Quantitative Analysis 27, 209-227.
Roll, Richard, 1984, A simple implicit measure of the effective bid-ask spread in an efficient market,
Journal of Finance 39, 1127-1139.
Schultz, Paul, 2000a, Regulatory and legal pressures and the costs of NASDAQ trading, Review of
Financial Studies 13, 917-957.
Schultz, Paul, 2000b, Stock splits, tick size, and sponsorship, Journal of Finance 55, 429-450.
Shaw, A., and S. Thomas, 2000, David and Goliath: Displacing a primary market, Global Financial
Markets, Spring 2000, 14-23.
Table I - The Distribution of Estimated Spreads for Alternative Spread Estimators
The table describes the distributions of estimated spreads based on several alternative forms of the high-low spread
estimator and the Roll spread estimator. Each simulation consists of 10,000 stock-months. Each month consists of 21 days
and each day consists of 390 minutes. For each minute of the day, the true value of the stock price, $P_m$, is simulated as
$P_m = P_{m-1} e^{\sigma x}$, where $\sigma$ is the standard deviation per minute and $x$ is a random draw from a unit normal
distribution. The daily standard deviation equals 3% and the standard deviation per minute equals 3% divided by $\sqrt{390}$.
Stock prices are assumed to be observed each minute, with a 50% chance that a bid (ask) is observed. The bid (ask) for each
minute is defined as $P_m$ multiplied by one minus (plus) half the assumed bid-ask spread. Daily high and low prices equal the
highest and lowest observed prices during the day. Monthly high-low spreads are estimated either by taking an average of
daily High-Low Spread estimates within the month, or by using the average $\beta$ and $\gamma$ parameters within the month.
Results are shown both with and without an adjustment for Jensen's inequality. Negative High-Low Spread estimates are either
left unadjusted or adjusted using one of two methods: (1) setting negative two-day spread estimates to zero before taking the
monthly average, or (2) setting negative monthly spread estimates to zero. Roll spreads are calculated as $2\sqrt{-Cov}$,
where Cov is the autocovariance of daily returns obtained from simulated closing prices. Roll Spread estimates in months with
positive autocorrelations are set to zero. For each assumed spread level, Panel A reports the mean spread estimate, the
standard deviation of spread estimates ($\sigma$), and the proportion of spread estimates that are non-positive (% ≤ 0) across
the 10,000 simulations. Panel B reports results from simulations incorporating overnight returns and infrequent observation
of prices. In these simulations, we assume a 10% chance of observing a trade at any given minute. To simulate overnight
returns, we assume that the standard deviation of close-to-open returns equals 0.5 times the standard deviation of
open-to-close returns. We then adjust for overnight returns as follows: If the high (low) for day t is less (greater) than
the close for day t-1, the stock price is assumed to have fallen (risen) overnight and both the bid and ask on day t are
reduced (increased) by the difference between the previous close and the current high (low).
Panel A. Simulated Spread Estimates Under Near Ideal Conditions
Aggregation:            Average Two-Day Spread Estimates        Average Parameters  Roll Spreads
Jensen's Inequality:    No Adj.   No Adj.   No Adj.   Adj.      No Adj.   Adj.
Negative Set to Zero:   No        Daily     Monthly   No        No        No        Monthly
0.5% Spread   Mean      0.52%     1.43%     0.59%     0.43%     0.23%     0.24%     1.18%
              σ         0.62%     0.33%     0.50%     0.67%     0.71%     0.73%     1.37%
              % ≤ 0     19.62%    0.00%     19.62%    25.10%    36.41%    35.93%    49.01%
1.0% Spread   Mean      0.99%     1.74%     1.01%     0.93%     0.71%     0.74%     1.31%
              σ         0.62%     0.37%     0.58%     0.66%     0.70%     0.72%     1.44%
              % ≤ 0     6.01%     0.00%     6.01%     8.25%     15.92%    15.65%    45.83%
3.0% Spread   Mean      2.92%     3.21%     2.92%     2.93%     2.68%     2.74%     2.61%
              σ         0.62%     0.50%     0.62%     0.64%     0.70%     0.70%     1.90%
              % ≤ 0     0.00%     0.00%     0.00%     0.00%     0.02%     0.02%     23.72%
5.0% Spread   Mean      4.88%     4.96%     4.88%     4.91%     4.67%     4.73%     4.54%
              σ         0.63%     0.58%     0.63%     0.63%     0.69%     0.68%     2.24%
              % ≤ 0     0.00%     0.00%     0.00%     0.00%     0.00%     0.00%     9.02%
8.0% Spread   Mean      7.84%     7.85%     7.84%     7.89%     7.67%     7.73%     7.54%
              σ         0.63%     0.63%     0.63%     0.63%     0.68%     0.67%     2.79%
              % ≤ 0     0.00%     0.00%     0.00%     0.00%     0.00%     0.00%     3.14%
Panel B. Simulated Spread Estimates with an Overnight Return and Only 10% of Prices Observed
Aggregation:            Average Two-Day Spread Estimates        Average Parameters  Roll Spreads
Jensen's Inequality:    No Adj.   No Adj.   No Adj.   Adj.      No Adj.   Adj.
Negative Set to Zero:   No        Daily     Monthly   No        No        No        Monthly
0.5% Spread   Mean      -0.24%    1.03%     0.15%     -0.39%    -0.52%    -0.55%    1.32%
              σ         0.65%     0.29%     0.28%     0.72%     0.74%     0.79%     1.53%
              % ≤ 0     64.08%    0.00%     64.08%    70.01%    75.74%    75.37%    49.22%
1.0% Spread   Mean      0.05%     1.23%     0.30%     -0.07%    -0.23%    -0.24%    1.43%
              σ         0.67%     0.32%     0.40%     0.74%     0.76%     0.80%     1.60%
              % ≤ 0     45.74%    0.00%     45.74%    52.62%    60.96%    60.45%    46.94%
3.0% Spread   Mean      1.76%     2.45%     1.76%     1.70%     1.48%     1.53%     2.01%
              σ         0.74%     0.47%     0.73%     0.78%     0.82%     0.84%     2.05%
              % ≤ 0     1.21%     0.00%     1.21%     2.12%     4.21%     4.15%     27.42%
5.0% Spread   Mean      3.69%     4.02%     3.69%     3.69%     3.45%     3.51%     4.48%
              σ         0.75%     0.59%     0.75%     0.77%     0.82%     0.82%     2.43%
              % ≤ 0     0.01%     0.00%     0.01%     0.01%     0.03%     0.03%     11.78%
8.0% Spread   Mean      6.65%     6.74%     6.65%     6.68%     6.45%     6.52%     7.49%
              σ         0.75%     0.69%     0.75%     0.76%     0.81%     0.80%     2.95%
              % ≤ 0     0.00%     0.00%     0.00%     0.00%     0.00%     0.00%     4.12%
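The price and quote process described in the Table I caption can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used to produce the table: the function name simulate_month, the starting price of 1.0, and the p_observe parameter are hypothetical, and the overnight-return step of Panel B is not included. Setting p_observe to 0.1 mimics the infrequent observation of prices.

```python
import math
import random

MINUTES_PER_DAY = 390
DAYS_PER_MONTH = 21

def simulate_month(spread, daily_sigma=0.03, p_observe=1.0, rng=None):
    """Return a list of (high, low, close) tuples for one simulated month.

    spread is the assumed proportional bid-ask spread (0.01 for 1%).
    """
    rng = rng or random.Random()
    # Variance scales with time, so the per-minute sigma is sigma_day / sqrt(390).
    sigma_minute = daily_sigma / math.sqrt(MINUTES_PER_DAY)
    price = 1.0  # hypothetical starting value; the caption does not state one
    days = []
    for _ in range(DAYS_PER_MONTH):
        high, low, close = -math.inf, math.inf, None
        for _ in range(MINUTES_PER_DAY):
            # True value follows P_m = P_{m-1} * exp(sigma * x), x ~ N(0, 1).
            price *= math.exp(sigma_minute * rng.gauss(0.0, 1.0))
            if rng.random() >= p_observe:
                continue  # no price observed in this minute
            # 50% chance of observing the ask, otherwise the bid.
            side = 1.0 if rng.random() < 0.5 else -1.0
            observed = price * (1.0 + side * spread / 2.0)
            high = max(high, observed)
            low = min(low, observed)
            close = observed
        # A day with no observation at all would keep close=None; that is
        # vanishingly unlikely for p_observe >= 0.1 and is not handled here.
        days.append((high, low, close))
    return days
```

With daily_sigma set to zero the true price never moves, so each day's high and low are simply the ask and the bid, which makes the spread component of the high-low range easy to see.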
Table II - Correlations Between Estimated and Simulated Spreads
The table reports correlations between spread estimates and simulated spreads. Each simulation consists of 10,000 stock-months, where each month consists of 21 days
and each day consists of 390 minutes. For each minute of the day, the true value of the stock price, $P_m$, is simulated as $P_m = P_{m-1} e^{\sigma x}$, where $\sigma$ is the
standard deviation per minute and $x$ is a random draw from a unit normal distribution. When a stock price is observed, we assume a 50% chance that a bid (ask) is observed.
The bid (ask) for each minute is defined as $P_m$ multiplied by one minus (plus) half the assumed bid-ask spread. Daily high and low prices equal the highest and lowest
observed prices during the day. Monthly high-low spreads are estimated either by taking an average of daily High-Low Spread estimates within the month, or by using the
average $\beta$ and $\gamma$ parameters within the month. Results are shown both with and without an adjustment for Jensen's inequality. Negative High-Low Spread estimates
are either left unadjusted or adjusted using one of two methods: (1) setting negative two-day spread estimates to zero before taking the monthly average, or (2) setting
negative monthly spread estimates to zero. Roll spreads are calculated as $2\sqrt{-Cov}$, where Cov is the covariance of daily returns obtained from simulated closing prices.
Roll Spread estimates in months with positive autocorrelations are either defined as negative spreads or set to zero. In Panel A, the true spread for each stock-month is drawn
randomly from a uniform distribution with a range from 0% to 6%. We then perform simulations under various alternative assumptions about prices and spreads. When
an overnight return is incorporated, we assume that the standard deviation of close-to-open returns equals 0.5 times the standard deviation of open-to-close returns.
We then adjust for overnight returns as follows: If the high (low) for day t is less (greater) than the close for day t-1, the stock price is assumed to have fallen (risen)
overnight and both the bid and ask on day t are reduced (increased) by the difference between the previous close and the current high (low). When prices are observed
infrequently, we assume a 10% chance of observing a price in any given minute. In simulations with autocorrelated returns, we assume the innovation to the daily
expected return is normally distributed with a standard deviation of 1% per day. The daily expected return is then defined as the sum of the innovation plus 0.5 times
the previous day's expected return, and the expected return for each one-minute interval is defined as the daily expected return divided by 390. In simulations with
random spreads, the initial spread is drawn from a uniform distribution ranging from 0% to 6%. Each day's spread is then obtained by multiplying the previous day's
spread by $e^{\varepsilon}$, where $\varepsilon$ is normally distributed with zero mean and standard deviation equal to 0.1. In this case, the simulated monthly spread is
defined as the average spread across the 21 days within the month. In Panel B, the true spread for each stock-month is drawn randomly from a uniform distribution with a
specified range. These simulations incorporate both overnight returns and infrequent observation of prices.
Aggregation:            Average Two-Day Spread Estimates        Average Parameters  Roll Spreads
Jensen's Inequality:    No Adj.   No Adj.   No Adj.   Adjusted  No Adj.   Adjusted
Negative Set to 0:      No        Daily     Monthly   No        No        No        Monthly   No
Daily Standard Deviation
No Overnight Returns, Prices Observed Each Minute, Spreads Constant Over 21 Days, Returns not Autocorrelated:
3%                      0.937     0.940     0.928     0.937     0.926     0.926     0.573     0.524
5%                      0.848     0.865     0.825     0.847     0.822     0.822     0.338     0.297
With Overnight Return:
3%                      0.912     0.925     0.899     0.911     0.898     0.898     0.523     0.470
5%                      0.788     0.835     0.753     0.784     0.755     0.755     0.290     0.248
With Overnight Return and Prices Observed for Only 10% of Minutes:
3%                      0.902     0.922     0.880     0.899     0.886     0.886     0.523     0.470
5%                      0.757     0.828     0.697     0.749     0.720     0.719     0.289     0.248
With Overnight Return, Prices Observed for Only 10% of Minutes, and Autocorrelated Daily Returns:
3%                      0.890     0.914     0.859     0.886     0.870     0.869     0.504     0.449
5%                      0.747     0.821     0.682     0.738     0.708     0.707     0.281     0.240
With Overnight Return, Prices Observed for Only 10% of Minutes, Autocorrelated Daily Returns, and Spreads that Follow a Random Walk:
3%                      0.923     0.941     0.908     0.919     0.910     0.910     0.602     0.531
5%                      0.815     0.875     0.789     0.806     0.789     0.787     0.381     0.319
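Two of the extensions described in the Table II caption, autocorrelated expected returns and spreads that follow a random walk, are simple recursions. The sketch below illustrates them under stated assumptions: the function names are hypothetical, and starting the expected return at zero is an assumption, since the caption does not give an initial value.

```python
import math
import random

MINUTES_PER_DAY = 390

def daily_expected_returns(n_days, innovation_sd=0.01, persistence=0.5, rng=None):
    """Expected return each day: innovation plus 0.5 times the previous day's value."""
    rng = rng or random.Random()
    mu = 0.0  # assumed starting value (not stated in the caption)
    path = []
    for _ in range(n_days):
        mu = rng.gauss(0.0, innovation_sd) + persistence * mu
        path.append(mu)
    return path

def per_minute_expected_return(daily_mu):
    """The caption divides the daily expected return evenly across 390 minutes."""
    return daily_mu / MINUTES_PER_DAY

def random_walk_spreads(n_days, low=0.0, high=0.06, shock_sd=0.1, rng=None):
    """Initial spread ~ U(low, high); each later day multiplies by exp(eps)."""
    rng = rng or random.Random()
    spread = rng.uniform(low, high)
    path = [spread]
    for _ in range(n_days - 1):
        spread *= math.exp(rng.gauss(0.0, shock_sd))
        path.append(spread)
    return path

def monthly_true_spread(spreads):
    """Simulated monthly spread: the average of the daily spreads in the month."""
    return sum(spreads) / len(spreads)
```

Because the shock is multiplicative, the simulated spread can never turn negative, which is presumably why the caption specifies the process in exponential form.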
Table III - Summary Statistics for Spreads based on Alternative Estimation Methods
The table provides summary statistics for spread estimates based on the pooled sample of monthly time-series and cross
sectional observations from 1993 through 2006. The sample includes all NYSE, Amex, and Nasdaq listed securities with
at least six months of data and for which TAQ and CRSP data could be matched. Monthly observations are dropped if
based on fewer than 12 daily observations or if spread estimates are missing for the Roll Spread, Tick Spread, LOT
Measure, or HL Spread. Effective Spread is the trade-weighted percentage effective spread estimated from TAQ and
averaged across days within the month. The Roll Spread is two times the square root of -1 times the autocovariance of
daily returns. The Effective Tick Spread assumes that the spread is equal to the tick increment used in trade prices. The
effective tick spread for the month is based on the average of the two-day spreads implied by daily trade prices. HL Spread
is the equally-weighted average of the high-low spread estimator across all overlapping two-day periods within the month.
The table lists results for the HL Spread using three alternative methods to account for negative spread estimates within
the month: (1) set negative two-day spreads to zero, (2) leave negative two-day spreads unchanged, and (3) exclude
negative two-day spreads. Similarly, results for the Roll Spread are provided using two alternative methods to handle
positive autocovariances, for which the Roll Spread is undefined: (1) set spreads to zero when the autocovariance is
positive and (2) exclude spreads when the autocovariance is positive.
                            N        Mean(%)   Median(%)   Std. Dev.(%)   % ≤ 0
Effective Spread            973229   2.38      1.29        3.37           0.00
Roll Spread (Neg=0)         973229   2.42      1.21        3.88           38.04
Roll Spread (Neg Dropped)   603032   3.90      2.66        4.30           0.00
Eff. Tick Spread            973229   1.67      0.72        3.39           0.00
LOT Measure                 973229   2.15      0.86        4.94           24.43
HL Spread (Neg=0)           973229   2.10      1.32        2.65           0.00
HL Spread (Neg Included)    973229   1.26      0.56        2.62           24.00
HL Spread (Neg Dropped)     973227   2.76      1.91        2.97           0.00
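The Roll Spread definition and the alternative treatments of negative estimates listed in the Table III caption can be illustrated as follows. This is a minimal sketch, not the authors' code: the function names and option strings are hypothetical, the autocovariance divisor (number of return pairs) is an assumption, and the two-day high-low spread estimates are taken as given inputs rather than computed here.

```python
import math

def autocovariance(returns):
    """First-order sample autocovariance of a daily return series.

    Dividing by the number of adjacent pairs is an assumption of this sketch.
    """
    n = len(returns)
    mean = sum(returns) / n
    pairs = zip(returns[1:], returns[:-1])
    return sum((a - mean) * (b - mean) for a, b in pairs) / (n - 1)

def roll_spread(returns, on_positive_cov="zero"):
    """Two times the square root of -1 times the autocovariance.

    When the autocovariance is not negative the estimate is undefined;
    "zero" sets it to 0.0 and "drop" returns None so it can be excluded.
    """
    cov = autocovariance(returns)
    if cov < 0:
        return 2.0 * math.sqrt(-cov)
    return 0.0 if on_positive_cov == "zero" else None

def monthly_hl_spread(two_day_spreads, negatives="zero"):
    """Equally weighted average of two-day high-low spread estimates.

    negatives: "zero" (set negatives to zero), "keep" (leave unchanged),
    or "drop" (exclude negative two-day estimates).
    """
    if negatives == "zero":
        values = [max(s, 0.0) for s in two_day_spreads]
    elif negatives == "drop":
        values = [s for s in two_day_spreads if s >= 0.0]
    else:
        values = list(two_day_spreads)
    return sum(values) / len(values) if values else None
```

For example, monthly_hl_spread([0.02, -0.01, 0.01], "zero") averages 0.02, 0.0, and 0.01, whereas the "keep" option averages the three raw values and so yields a smaller estimate, in line with the ordering of the HL Spread means in the table.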
Table IV - Average Cross-Sectional Correlations
For each spread measure and each month from 1993 through 2006, we estimate the cross-sectional correlation between
the spread measure and the effective spread from TAQ. The table lists the average cross-sectional correlation across all
months. The sample includes all NYSE, Amex, and Nasdaq listed securities with at least six months of data and for which
TAQ and CRSP data could be matched. Monthly observations are dropped if based on fewer than 12 daily observations
or if spread estimates are missing for the Roll Spread, Tick Spread, LOT Measure, or HL Spread. Panel A lists results
based on monthly spreads and Panel B lists results based on first differences in monthly spreads.
N   Roll Spread   Effective Tick Spread   LOT Measure   High-Low Spread
Panel A - Correlations with Effective Spread, Monthly Estimates
Full Period 168 0.637 0.683 0.635 0.829
1993-1996 53 0.789 0.741 0.718 0.930
1997-2000 43 0.632 0.718 0.680 0.862
2001-2006 72 0.528 0.619 0.546 0.732
Panel B - Correlations with Changes in Effective Spreads, Monthly Estimates
Full Period 167 0.249 0.183 0.186 0.472
1993-1996 52 0.320 0.177 0.221 0.570
1997-2000 43 0.249 0.162 0.191 0.446
2001-2006 72 0.199 0.147 0.158 0.391
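The averaging procedure behind Table IV (one cross-sectional correlation per month, then a mean across months) can be sketched as below. The data layout, a dictionary mapping each month to per-stock (estimate, effective spread) pairs, and all function names are assumptions made for illustration. The first-difference helper also shows why Panel B has one fewer month than Panel A.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_cross_sectional_correlation(panel):
    """panel: {month: {stock_id: (estimated_spread, effective_spread)}}.

    Computes one correlation across stocks per month, then averages
    those monthly correlations.
    """
    corrs = []
    for month in sorted(panel):
        estimates, effective = zip(*panel[month].values())
        corrs.append(pearson(estimates, effective))
    return sum(corrs) / len(corrs)

def first_differences(panel):
    """Month-over-month changes for stocks present in consecutive months."""
    months = sorted(panel)
    diffs = {}
    for prev, cur in zip(months, months[1:]):
        common = panel[prev].keys() & panel[cur].keys()
        diffs[cur] = {
            s: (panel[cur][s][0] - panel[prev][s][0],
                panel[cur][s][1] - panel[prev][s][1])
            for s in common
        }
    return diffs
```

Passing the output of first_differences back into average_cross_sectional_correlation gives the Panel B style statistic for changes in spreads.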
Table V - Summary Statistics for Stock-by-Stock Time Series Correlations
For each spread measure and each stock, we estimate the time-series correlation between the estimated spread measure and
the effective spread from TAQ. The table lists the average time-series correlation across all stocks. The sample includes
all NYSE, Amex, and Nasdaq listed securities with at least six months of data and for which TAQ and CRSP data could
be matched. Monthly observations are dropped if based on fewer than 12 daily observations or if spread estimates are
missing for the Roll Spread, Tick Spread, LOT Measure, or HL Spread. Panels A, B, and C provide results for subperiods
corresponding to different tick size regimes. Panel D provides results for the full sample period. Stocks are also separated
by exchange and market capitalization quintile based on the CRSP exchange code and market capitalization on the last
date the firm is listed on CRSP. Market capitalization quintiles are based on NYSE breakpoints.
N   Roll Spread   Effective Tick Spread   LOT Measure   High-Low Spread
Panel A - Correlations with Effective Spread (1993-1996)
Full Sample 9056 0.348 0.476 0.338 0.649
NYSE 2360 0.199 0.506 0.268 0.495
Amex 757 0.313 0.436 0.313 0.602
Nasdaq 5939 0.412 0.469 0.368 0.716
MV Quintile 1 5968 0.408 0.456 0.372 0.700
MV Quintile 2 1107 0.307 0.495 0.275 0.652
MV Quintile 3 749 0.240 0.514 0.280 0.586
MV Quintile 4 644 0.172 0.523 0.255 0.497
MV Quintile 5 497 0.106 0.567 0.253 0.338
Panel B - Correlations with Effective Spread (1997-2000)
Full Sample 9349 0.279 0.507 0.312 0.574
NYSE 2317 0.163 0.516 0.246 0.496
Amex 771 0.321 0.507 0.392 0.628
Nasdaq 6261 0.317 0.503 0.327 0.596
MV Quintile 1 6311 0.344 0.509 0.361 0.639
MV Quintile 2 1058 0.186 0.492 0.220 0.525
MV Quintile 3 760 0.139 0.510 0.211 0.441
MV Quintile 4 658 0.106 0.507 0.193 0.382
MV Quintile 5 488 0.081 0.512 0.191 0.305
Panel C - Correlations with Effective Spread (2001-2006)
Full Sample 7427 0.273 0.515 0.288 0.564
NYSE 1898 0.160 0.592 0.221 0.470
Amex 718 0.306 0.506 0.387 0.579
Nasdaq 4811 0.313 0.487 0.299 0.599
MV Quintile 1 5056 0.312 0.466 0.322 0.582
MV Quintile 2 819 0.181 0.599 0.257 0.474
MV Quintile 3 598 0.183 0.631 0.229 0.490
MV Quintile 4 508 0.188 0.648 0.176 0.567
MV Quintile 5 408 0.208 0.626 0.133 0.616
Panel D - Correlations with Effective Spread (1993-2006)
Full Sample 12507 0.343 0.599 0.443 0.626
NYSE 2894 0.159 0.704 0.441 0.412
Amex 1118 0.381 0.547 0.441 0.667
Nasdaq 8495 0.400 0.571 0.444 0.693
MV Quintile 1 8559 0.418 0.550 0.451 0.702
MV Quintile 2 1481 0.247 0.662 0.405 0.581
MV Quintile 3 977 0.173 0.719 0.437 0.480
MV Quintile 4 804 0.122 0.746 0.431 0.385
MV Quintile 5 573 0.050 0.789 0.444 0.182
Table VI - Abnormal Returns on Liquidity-Sorted Portfolios
For each month over 1927 - 2006, we calculate the high-low spread estimate and Amihud illiquidity measure for all NYSE/Amex stocks with month-end prices of
$5 or more. We discard all stocks without at least 12 days of positive volume (needed for the Amihud measure) or fewer than 12 CRSP returns. Stocks are then
sorted into 10 portfolios by high-low spreads, and separately, into 10 portfolios by Amihud measure. To minimize biases in calculated returns, stocks are weighted
by formation month gross returns. Portfolio returns are calculated for one month and six months ahead. Abnormal returns are calculated by regressing the time series
of portfolio returns on the Fama-French factors. The table reports intercepts and slope coefficients from these regressions, with t-statistics reported in parentheses
below the coefficients. Panels A and B report results for one-month ahead returns for portfolios sorted based on the Amihud measure and the high-low spread
measure, respectively. Panels C and D report results for six-month ahead returns for portfolios sorted based on the Amihud measure and the high-low spread
measure, respectively. In panels C and D, where overlapping six-month ahead returns are examined, t-statistics are based on Newey-West standard errors with five
lags.
                      Liquid   2        3        4        5        6        7        8        9        Illiquid 10 - 1
Panel A: Amihud Measure, One Month Ahead Returns
$R_t^{Mkt} - R_t^F$   1.0510   1.0359   1.0496   1.0354   1.0215   1.0035   1.0450   1.0237   1.0587   1.0843   0.0333
                      (143.2)  (109.5)  (112.2)  (105.0)  (90.20)  (95.64)  (86.82)  (84.21)  (78.94)  (60.02)  (1.75)
SMB                   -0.1142  0.0111   0.1478   0.2740   0.3378   0.4628   0.5500   0.6917   0.8317   1.1393   1.2535
                      (-9.82)  (0.74)   (9.97)   (17.53)  (18.82)  (27.83)  (28.83)  (35.90)  (39.13)  (39.79)  (41.47)
HML                   0.0609   0.1031   0.2065   0.1981   0.2005   0.2909   0.4217   0.4739   0.6127   0.8924   0.8315
                      (5.78)   (7.59)   (15.38)  (14.00)  (12.33)  (19.31)  (24.40)  (27.15)  (31.83)  (34.41)  (30.37)
Intercept             -0.0009  0.0001   -0.0003  0.0005   0.0001   0.0002   0.0007   0.0016   0.0024   0.0094   0.0103
                      (-2.39)  (0.22)   (-0.73)  (0.97)   (0.20)   (0.30)   (1.22)   (2.58)   (3.48)   (10.26)  (10.64)
Panel B: High-Low Spreads, One Month Ahead Returns
$R_t^{Mkt} - R_t^F$   0.8277   0.9052   0.9683   1.0446   1.0918   1.1233   1.1328   1.1095   1.1167   1.0959   0.2682
                      (81.50)  (98.06)  (103.5)  (99.68)  (103.5)  (98.50)  (98.20)  (85.42)  (81.18)  (63.73)  (12.78)
SMB                   0.0091   0.1027   0.1703   0.2384   0.3225   0.3898   0.4856   0.6004   0.8194   1.2229   1.2138
                      (0.57)   (7.02)   (11.48)  (14.35)  (19.28)  (21.57)  (26.56)  (29.17)  (37.58)  (44.87)  (36.50)
HML                   0.0069   0.1083   0.1724   0.2585   0.3910   0.4078   0.4572   0.4468   0.5452   0.6802   0.6732
                      (0.48)   (8.17)   (12.83)  (17.18)  (25.81)  (24.91)  (27.61)  (23.96)  (27.61)  (27.55)  (22.35)
Intercept             0.0014   0.0008   0.0005   0.0001   -0.0004  -0.0007  -0.0012  -0.0002  0.0014   0.0120   0.0105
                      (2.80)   (1.75)   (1.04)   (0.27)   (-0.82)  (-1.27)  (-2.05)  (-0.36)  (1.95)   (13.69)  (9.87)
Panel C: Amihud Measure, Six Month Ahead Returns
$R_t^{Mkt} - R_t^F$   0.9825   0.9269   0.9260   0.9330   0.9263   0.9337   0.9704   0.9949   0.9980   1.0664   0.0839
                      (39.08)  (30.95)  (33.27)  (34.52)  (32.28)  (33.11)  (30.15)  (25.21)  (25.82)  (24.57)  (2.05)
SMB                   -0.0762  0.1033   0.2552   0.3797   0.4788   0.5441   0.7126   0.8444   1.0071   1.5051   1.5812
                      (-1.69)  (1.64)   (4.94)   (8.06)   (8.52)   (8.44)   (14.20)  (13.46)  (16.70)  (16.49)  (18.51)
HML                   0.0560   0.1387   0.1493   0.1737   0.2012   0.2432   0.3239   0.4044   0.4859   0.5748   0.5188
                      (1.19)   (2.77)   (2.72)   (2.93)   (3.44)   (3.90)   (5.03)   (5.06)   (8.42)   (7.22)   (7.58)
Intercept             -0.0103  -0.0055  -0.0068  -0.0045  -0.0047  -0.0033  -0.0023  0.0021   0.0082   0.0550   0.0653
                      (-3.74)  (-2.02)  (-2.44)  (-1.56)  (-1.60)  (-1.15)  (-0.74)  (0.57)   (2.19)   (10.22)  (12.34)
Panel D: High-Low Spreads, Six Month Ahead Returns
$R_t^{Mkt} - R_t^F$   0.7969   0.8715   0.9229   0.9474   0.9733   0.9984   1.0162   1.0081   1.0116   1.1174   0.3206
                      (24.88)  (32.37)  (32.40)  (33.30)  (33.93)  (34.22)  (33.95)  (35.81)  (28.56)  (27.02)  (6.51)
SMB                   0.0440   0.1652   0.2342   0.3214   0.4510   0.5360   0.6346   0.8299   0.9838   1.5598   1.5158
                      (0.82)   (3.92)   (4.05)   (6.63)   (9.54)   (11.19)  (11.48)  (13.93)  (15.69)  (17.52)  (16.38)
HML                   -0.0021  0.1053   0.1529   0.2244   0.3116   0.3186   0.3514   0.4072   0.4303   0.4557   0.4579
                      (-0.04)  (1.88)   (2.85)   (3.93)   (5.59)   (5.78)   (5.68)   (7.22)   (7.12)   (4.95)   (5.13)
Intercept             0.0038   -0.0016  -0.0044  -0.0042  -0.0078  -0.0088  -0.0097  -0.0072  0.0011   0.0664   0.0626
                      (1.15)   (-0.61)  (-1.55)  (-1.56)  (-2.91)  (-3.15)  (-3.05)  (-2.05)  (0.30)   (12.51)  (10.08)
Panel A - Correlations based on Percentage Spreads
Panel B - Correlations based on First Differences in Percentage Spreads
Figure 1 - Cross Sectional Correlations of Spread Estimates with TAQ Effective Spreads by Month
The figure plots monthly cross-sectional correlations between three estimated spread measures and the effective spread from
TAQ. The correlations shown in Panel A are estimated from monthly spread estimates. The correlations shown in Panel B are
estimated from first differences in monthly spread estimates. The sample includes all NYSE, Amex, and Nasdaq listed securities
with at least six months of data and for which TAQ and CRSP data could be matched. Monthly observations are dropped if
based on fewer than 12 daily observations or if spread estimates are missing for the Roll Spread, Tick Spread, LOT Measure,
or HL Spread.
Panel A - Mean Monthly Spread Estimates from 1926 to 2006
Panel B - Mean Monthly Spread Estimates from 1950 to 2006
Figure 2 - Historical High-Low Spread Estimates based on CRSP Data
High-low spreads are estimated for each stock each month by averaging two-day spread estimates within the month. The graph
plots the equally weighted average spread by month across all stocks with at least 13 daily spread observations within the
month. Results are shown for the full sample of NYSE stocks, and for the smallest and largest deciles by market capitalization.
The graph also shows the number of firms included in the average each month. Panel A shows results from 1926-2006,
while Panel B shows results from 1950-2006. All data are from CRSP.
Panel A - Average High-Low Spreads for Stocks in Hong Kong by Month, 1988-2007
Panel B - Average High-Low Spreads for Stocks in India by Month, 1990-2007
Figure 3 - Historical High-Low Spread Estimates based on Datastream Data
High-low spreads are estimated for each stock each month by averaging two-day spread estimates within the month. The graph
plots the equally weighted average spread by month across all stocks with at least 12 daily spread observations within the
month. The graph also shows the number of firms included in the average each month. Panel A shows results for stocks in Hong
Kong and Panel B shows results for stocks in India. All data are from Datastream.






Internet Appendix for
A Simple Way to Estimate Bid-Ask Spreads from Daily High and Low Prices*

Shane A. Corwin
and
Paul Schultz

*Citation Format: Corwin, Shane A., and Paul Schultz, 2011, Internet Appendix to "A Simple Way to Estimate Bid-
Ask Spreads from Daily High and Low Prices," Journal of Finance [vol #], [pages],
http://www.afajof.org/IA/2011.asp. Please note: Wiley-Blackwell is not responsible for the content or functionality
of any supporting information supplied by the authors. Any queries (other than missing material) should be directed
to the authors of the article.
Contents

Additional Results Related to Monthly High-Low Spread Estimates:
Figure A1: Cross-Sectional Mean and Median Absolute Errors by Month
Table A1: Frequency of Daily Data Adjustments and Spread Characteristics
Table A2: High-Low Spread Estimates based on Alternative Overnight Return Adjustments
Table A3: Pooled Correlations between TAQ Spreads and Estimated Liquidity Measures
Table A4: Cross-Sectional Mean Absolute Errors
Table A5: Stock-by-Stock Time-Series Mean Absolute Errors
Table A6: Summary Statistics for Spread Estimates (Alternative LOT Estimators)
Table A7: Average Cross-Sectional Correlations (Alternative LOT Estimators)
Table A8: Average Time-Series Correlations (Alternative LOT Estimators)
Additional Applications of the High-Low Spread Estimator:
An Application to Non-U.S. Markets using Datastream Data
An Application to Daily Event Studies: Stock Splits, 1926-1982
An Application to Intraday Trade Data
Table A9: Summary Statistics for High-Low Spreads by Country
Table A10: Correlations in High-Low Spreads Across Countries
Figure A2: Average High-Low Spreads by Month using Datastream Data
Figure A3: Average High-Low Spreads around Stock Splits
Table A11: Summary Statistics for Daily High-Low Spreads and TAQ Effective Spreads
Table A12: Correlations of Daily High-Low Spreads with TAQ Effective Spreads
Figure A4: Daily Patterns in High-Low Spreads
Figure A5: Intraday Patterns in High-Low Spreads
Panel A - Mean Absolute Errors by Month
Panel B - Median Absolute Errors by Month
(Plotted series in both panels: SPRD_ROLL, SPRD_TCK, SPRD_LOT, SPRD_HL; horizontal axis: month, 199301 to 200607.)
Figure A1: Cross Sectional Mean and Median Absolute Errors of Spread Estimates by Month
The figure plots the mean and median absolute error across all securities by month from 1993 through 2006. The
error and absolute error are defined for each stock-month based on the difference between the estimated spread
measure and the effective spread from TAQ. The mean and median errors are then calculated across all stocks in a
given month. The full sample includes all NYSE, Amex, and Nasdaq listed securities for which TAQ and CRSP data
could be matched. Observations are then dropped if there are fewer than six monthly observations for the firm or if
spread estimates are missing for the Roll Spread, Tick Spread, LOT measure, or HL Spread.

Table A1: Frequency of Daily Data Adjustments and Spread Characteristics
The table reports the fraction of daily observations for which specific data adjustments are made or for which specific data characteristics apply. The first three
columns report the percentage of daily observations for which missing high and low prices are replaced based on the prior day's high and low prices. This can
occur because there is zero volume on the day or because all trades take place at the same price (i.e., high=low). Column four reports the percentage of daily
observations for which the high and low are adjusted due to overnight returns, based on the method described in Section IIA. Column five reports the percentage
of daily observations for which the high and low are identical to the prior day's high and low. Finally, the last column lists the percentage of daily observations
for which the high-low spread estimate is negative. The monthly high-low spread estimates in the paper are estimated after setting these negative daily values to
zero. Frequencies are calculated based on daily observations within each month. The first row of the table then reports the average across all stock months.
Subperiod frequencies are estimated by taking a cross-sectional average for each month and then averaging across all months within the subperiod. Frequencies
for market capitalization quintiles are estimated by taking a time-series average for each stock and then averaging across all stocks within the specific market
capitalization quintile. Stocks are assigned to market capitalization quintile based on NYSE breakpoints and firm market capitalization on the last date the firm is
listed on CRSP.

                                          Reset Missing High and Low        Overnight    High and Low      Negative
                                                                            Return       Constant Across   Daily Spread
                                          Zero Volume  High=Low  All Cases  Adjustment   Days              Estimate
Full Sample                               4.11         7.35      11.46      20.19        16.47             29.26

1993-1996                                 5.31         10.88     16.19      15.16        28.95             24.37
1997-2000                                 3.99         7.76      11.75      21.11        15.06             31.00
2001-2006                                 3.07         3.67      6.74       24.04        5.99              32.34

MV Quintile 1                             7.30         12.05     19.35      19.92        25.05             24.44
MV Quintile 2                             1.55         3.96      5.51       18.20        12.73             31.18
MV Quintile 3                             0.48         1.60      2.08       18.93        7.96              35.35
MV Quintile 4                             0.28         0.87      1.15       19.93        5.51              37.87
MV Quintile 5                             0.06         0.12      0.18       22.67        2.07              40.18

Across All Daily Observations             5.99         8.10      14.08      22.55        16.34             29.97

Across All Monthly Observations (N>=12)   6.01         6.84      12.85      22.88        16.53             29.99
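The first three columns of Table A1 count days whose high and low are replaced with the prior day's values, either because volume is zero or because high equals low. A simple carry-forward reading of that rule is sketched below. The record layout and function names are hypothetical, and the exact replacement rule is the one described in the main text of the paper, which this sketch does not claim to reproduce.

```python
def reset_missing_ranges(days):
    """Carry forward the prior day's high and low when the range is unusable.

    days: list of dicts with keys "high", "low", and "volume".
    Returns a list of (high, low, was_reset) tuples.
    """
    out = []
    prev = None  # last usable (high, low) pair seen so far
    for d in days:
        unusable = (
            d["volume"] == 0
            or d["high"] is None
            or d["low"] is None
            or d["high"] == d["low"]
        )
        if unusable and prev is not None:
            out.append((prev[0], prev[1], True))
        else:
            # Either a usable day, or an unusable first day with nothing
            # to carry forward; the latter is left unchanged (assumption).
            out.append((d["high"], d["low"], False))
            if not unusable:
                prev = (d["high"], d["low"])
    return out

def reset_frequency(adjusted):
    """Percentage of daily observations whose high and low were reset."""
    return 100.0 * sum(1 for _, _, flag in adjusted if flag) / len(adjusted)
```

Averaging reset_frequency within each stock-month and then across stock-months would correspond to the kind of percentage reported in the first row of the table.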



Table A2: High-Low Spread Estimates based on Alternative Overnight Return Adjustments
The table provides characteristics for monthly high-low spread estimates without any adjustment for overnight
returns and with two alternative overnight return adjustments. The first overnight return adjustment is the one used
throughout the paper, based on the day t close and the day t+1 high-low range. Specifically, if the day t+1 low
(high) is above (below) the day t close, we decrease (increase) the day t+1 high and low by the amount of the
difference. The alternative overnight return adjustment is based on the day t close and the day t+1 open. In this case,
if the day t+1 open is above (below) the day t close, we decrease (increase) the day t+1 high and low by the amount
of the difference. The mean, median, standard deviation, and correlation are estimated using the full sample of
pooled stock-months. The frequency of overnight return adjustments and negative daily estimates are calculated
within each stock-month and then averaged across all stock-months.


                                                  No Overnight        Current Overnight   Alternative Overnight
                                                  Return Adjustment   Return Adjustment   Return Adjustment

Pooled Correlation with TAQ Eff Spread            0.877               0.899               0.745
Mean (%)                                          1.95                2.10                0.88
Median (%)                                        1.22                1.32                0.50
Std. Dev. (%)                                     2.53                2.65                1.98
Daily Obs with Overnight Return Adjustments (%)   0.00                18.09               87.81
Negative Daily Estimates (%)                      34.92               29.26               49.22
Negative Monthly Estimates when Negative
Daily Estimates are Included (%)                  43.15               24.00               83.70
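The two overnight return adjustments compared in Table A2 differ only in what triggers the shift of the day t+1 range. The sketch below follows the wording of the caption; the function names are hypothetical, and whether the previous close should itself be taken before or after adjustment is not addressed here.

```python
def adjust_for_overnight(prev_close, high, low):
    """Adjustment used throughout the paper, as described in the caption.

    If the day t+1 low is above the day t close, shift the range down by
    the gap; if the day t+1 high is below the day t close, shift it up.
    """
    if low > prev_close:
        shift = low - prev_close
        return high - shift, low - shift
    if high < prev_close:
        shift = prev_close - high
        return high + shift, low + shift
    return high, low  # the close lies inside the range: no adjustment

def adjust_using_open(prev_close, open_price, high, low):
    """Alternative adjustment based on the day t close and the day t+1 open.

    A positive gap (open above the prior close) lowers the range by the
    gap; a negative gap raises it.
    """
    shift = open_price - prev_close
    return high - shift, low - shift
```

The open-based rule shifts the range whenever the open differs from the prior close at all, which is consistent with the much higher adjustment frequency reported for the alternative method (87.81% versus 18.09% of daily observations).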



Table A3: Pooled Correlations between TAQ Spreads and Estimated Liquidity Measures
The table lists correlations among the spread estimates based on the pooled sample of monthly time-series and cross-sectional observations from 1993 through 2006.
The sample includes all NYSE, Amex, and Nasdaq listed securities with at least six months of data and for which TAQ and CRSP data could be matched. Monthly
observations are dropped if based on fewer than 12 daily observations or if spread estimates are missing for the Roll Spread, Tick Spread, LOT measure, or HL Spread.

                   TAQ Effective   TAQ Quoted   Roll     Effective Tick   LOT       Amihud    High-Low
                   Spread          Spread       Spread   Spread           Measure   Measure   Spread
Effective Spread   1.000
Quoted Spread      0.979           1.000
Roll Spread        0.707           0.695        1.000
Tick Spread        0.734           0.724        0.526    1.000
LOT Measure        0.699           0.676        0.506    0.677            1.000
Amihud Measure     0.362           0.360        0.297    0.278            0.268     1.000
HL Spread          0.899           0.883        0.702    0.719            0.692     0.359     1.000




Table A4: Cross-Sectional Mean Errors and Absolute Errors
For each stock-month, errors and absolute errors are defined for each spread measure based on the difference between the spread measure and the TAQ effective
spread. For each month from 1993 through 2006, we estimate the mean error and mean absolute error across all cross-sectional observations. The table then lists the
mean across months. For reporting purposes, mean errors are multiplied by 100. The full sample includes all NYSE, Amex, and Nasdaq listed securities for which
TAQ and CRSP data could be matched. Observations are then dropped if there are fewer than six monthly observations for the firm or if spread estimates are missing
for the Roll Spread, Tick Spread, LOT measure, or HL Spread. Panel A lists results based on monthly spreads and Panel B lists results based on first differences in
monthly spreads.



                       Mean Error                                  Mean Absolute Error
               N       Roll       Eff. Tick  LOT        High-Low   Roll       Eff. Tick  LOT        High-Low
                       Spread     Spread     Measure    Spread     Spread     Spread     Measure    Spread
Panel A: Errors Based on Monthly Spread Levels
Full Period    168     0.0007    -0.0072    -0.0026    -0.0024     0.0166     0.0113     0.0132     0.0090

1993-1996      53     -0.0057    -0.0086    -0.0007    -0.0077     0.0173     0.0152     0.0180     0.0094
1997-2000      43      0.0020    -0.0043    -0.0004    -0.0027     0.0195     0.0104     0.0142     0.0092
2001-2006      72      0.0047    -0.0079    -0.0053     0.0016     0.0144     0.0088     0.0090     0.0086
Panel B: Errors Based on Changes in Monthly Spreads
Full Period    167     0.0071     0.0021     0.0084     0.0012     0.0214     0.0086     0.0146     0.0067

1993-1996      52      0.0045     0.0008     0.0038    -0.0012     0.0219     0.0127     0.0214     0.0078
1997-2000      43      0.0307     0.0144     0.0300     0.0071     0.0255     0.0108     0.0175     0.0075
2001-2006      72     -0.0051    -0.0044    -0.0010    -0.0006     0.0187     0.0044     0.0081     0.0054
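
The two-step averaging behind these error statistics can be sketched as below. This is a hedged illustration only: the row layout `(stock, month, estimate, taq_effective_spread)` and the function name are assumptions, not the authors' data format. Grouping by month gives the cross-sectional version used in Table A4, and grouping by stock gives the stock-by-stock version used in Table A5.

```python
from collections import defaultdict
from statistics import mean


def mean_errors(rows, group_index):
    """rows: (stock, month, estimate, taq_effective_spread) tuples.

    group_index=1 groups by month (cross-sectional errors);
    group_index=0 groups by stock (time-series errors).
    Returns (mean error, mean absolute error), each averaged first
    within groups and then across groups.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_index]].append(row[2] - row[3])
    group_me = [mean(errs) for errs in groups.values()]
    group_mae = [mean(abs(e) for e in errs) for errs in groups.values()]
    return mean(group_me), mean(group_mae)
```

Averaging within groups first means that each month (or each stock) receives equal weight, regardless of how many observations it contributes.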



Table A5: Stock-by-Stock Time-Series Mean Errors and Absolute Errors
For each stock-month, errors and absolute errors are defined for each spread measure based on the difference between the spread measure and the TAQ effective
spread. For each stock, we estimate the mean error and mean absolute error across all monthly time series observations from 1993 through 2006. The table then lists
the mean across all stocks. The full sample includes all NYSE, Amex, and Nasdaq listed securities for which TAQ and CRSP data could be matched. Observations are
then dropped if there are fewer than six monthly observations for the firm or if spread estimates are missing for the Roll Spread, Tick Spread, LOT measure, or HL
Spread. Stocks are also separated by exchange and market capitalization quintile based on the CRSP exchange code and market capitalization on the last date the firm
is listed on CRSP. Market capitalization quintiles are based on NYSE breakpoints.

                         Mean Error                                  Mean Absolute Error
               N         Roll       Eff. Tick  LOT        High-Low   Roll       Eff. Tick  LOT        High-Low
                         Spread     Spread     Measure    Spread     Spread     Spread     Measure    Spread
Full Sample    12,507   -0.0014    -0.0086    -0.0016    -0.0048     0.0197     0.0145     0.0174     0.0108

NYSE           2,894     0.0021     0.0003     0.0043     0.0008     0.0100     0.0036     0.0080     0.0046
Amex           1,118    -0.0068    -0.0050     0.0051    -0.0100     0.0227     0.0168     0.0232     0.0129
Nasdaq         8,495    -0.0019    -0.0121    -0.0045    -0.0060     0.0225     0.0180     0.0199     0.0126

MV Quintile 1  8,555    -0.0032    -0.0112    -0.0022    -0.0076     0.0236     0.0192     0.0223     0.0131
MV Quintile 2  1,487     0.0016    -0.0043    -0.0009     0.0004     0.0123     0.0058     0.0082     0.0060
MV Quintile 3  975       0.0032    -0.0021     0.0001     0.0020     0.0106     0.0033     0.0056     0.0049
MV Quintile 4  803       0.0036    -0.0010     0.0004     0.0028     0.0091     0.0021     0.0042     0.0044
MV Quintile 5  574       0.0049    -0.0001     0.0004     0.0042     0.0079     0.0010     0.0025     0.0046


Table A6: Summary Statistics for Spread Estimates (Alternative LOT Estimators)
The table provides summary statistics for spread estimates based on the pooled sample of monthly time-series and
cross-sectional observations from 1993 through 2006. The sample includes all NYSE, Amex, and Nasdaq listed
securities with at least six months of data and for which TAQ and CRSP data could be matched. Monthly
observations are dropped if based on fewer than 12 daily observations or if spread estimates are missing for the Roll
Spread, Tick Spread, LOT Measure, or HL Spread. Effective Spread is the trade-weighted percentage effective
spread estimated from TAQ and averaged across days within the month. HL Spread is the equally-weighted average
of the high-low spread estimator across all overlapping two-day periods within the month, where negative two-day
spreads are set to zero. The LOT Measure is estimated based on two alternative methods for separating the positive
and negative return regions in the likelihood function, as described in Goyenko, Holden, and Trzcinka (2009). The
first method, LOT Y-Split, defines these regions based on stock returns only. The second method, LOT Mixed,
defines these regions based on both stock returns and market returns. Following the estimation in the paper, we
define the market return for the LOT Y-Split measure based on the CRSP value weighted index. For the LOT Mixed
measure, we follow Goyenko et al. (2009) in defining the market return based on the CRSP equal weighted index.
The LOT measure is undefined if either the number of returns during the month is less than five or the percentage of
zero returns during the month is greater than 80%. A firm-month is deleted if any of the spread estimates are missing
or are greater than 50%.


                  N         Mean (%)  Median (%)  Std. Dev. (%)  %<=0    Pooled Correlation with Eff. Spread
Effective Spread  966,333   2.32      1.28        3.05           0.00    1.000
HL Spread         966,333   2.10      1.32        2.65           0.00    0.888
Roll Spread       966,333   2.36      1.21        3.54           38.08   0.685
Eff. Tick Spread  966,333   1.60      0.71        2.83           0.00    0.732
LOT Y-Split       966,333   2.02      0.86        3.50           24.54   0.733
LOT Mixed         966,333   4.02      2.65        4.57           4.39    0.740


Table A7: Average Cross-Sectional Correlations (Alternative LOT Estimators)
For each spread measure and each month from 1993 through 2006, we estimate the cross-sectional correlation
between the spread measure and the effective spread from TAQ. The table lists the average cross-sectional
correlation across all months, with results shown for the full sample period and for subperiods corresponding to
different tick size regimes. Spread measures and sample restrictions are as described in Table A6.


             N     High-Low Spread  Roll Spread  Effective Tick Spread  LOT Y-Split  LOT Mixed
Full Period  168   0.823            0.626        0.684                  0.658        0.701
1993-1996    53    0.924            0.765        0.732                  0.730        0.832
1997-2000    43    0.856            0.621        0.727                  0.707        0.722
2001-2006    72    0.729            0.527        0.624                  0.576        0.593




Table A8: Average Stock-by-Stock Time-Series Correlations (Alternative LOT Estimators)
For each spread measure and each stock, we estimate the time-series correlation between the estimated spread
measure and the effective spread from TAQ. The table lists the average time-series correlation across all stocks.
Panels A, B, and C provide results for subperiods corresponding to different tick size regimes. Panel D provides
results for the full sample period from 1993-2006. Stocks are also separated by exchange and market capitalization
quintile based on the CRSP exchange code and market capitalization on the last date the firm is listed on CRSP.
Market capitalization quintiles are based on NYSE breakpoints. Spread measures and sample restrictions are as
described in Table A6.




N   High-Low Spread   Roll Spread   Effective Tick Spread   LOT Y-Split   LOT Mixed
Panel A Average Correlation (1993-1996)
Full Sample 9,031 0.650 0.345 0.477 0.338 0.404

NYSE 2,358 0.498 0.200 0.506 0.272 0.275
Amex 755 0.606 0.310 0.439 0.321 0.395
Nasdaq 5,918 0.716 0.408 0.471 0.367 0.456

MV Quintile 1 5,942 0.700 0.404 0.458 0.372 0.463
MV Quintile 2 1,111 0.655 0.312 0.496 0.279 0.324
MV Quintile 3 746 0.589 0.238 0.517 0.284 0.290
MV Quintile 4 642 0.499 0.171 0.525 0.254 0.260
MV Quintile 5 498 0.338 0.107 0.567 0.252 0.213
Panel B Average Correlation (1997-2000)
Full Sample 9,331 0.574 0.277 0.506 0.313 0.316

NYSE 2,316 0.496 0.163 0.517 0.247 0.222
Amex 770 0.630 0.325 0.507 0.396 0.397
Nasdaq 6,245 0.595 0.313 0.502 0.328 0.342

MV Quintile 1 6,292 0.637 0.341 0.509 0.362 0.379
MV Quintile 2 1,061 0.530 0.188 0.492 0.224 0.213
MV Quintile 3 758 0.441 0.139 0.509 0.210 0.168
MV Quintile 4 657 0.384 0.107 0.505 0.193 0.120
MV Quintile 5 489 0.307 0.080 0.513 0.192 0.168
Panel C Average Correlation (2001-2006)
Full Sample 7,419 0.563 0.272 0.514 0.287 0.347

NYSE 1,898 0.470 0.159 0.592 0.219 0.288
Amex 718 0.576 0.305 0.505 0.391 0.423
Nasdaq 4,803 0.598 0.312 0.485 0.298 0.359

MV Quintile 1 5,046 0.582 0.311 0.464 0.322 0.378
MV Quintile 2 825 0.473 0.180 0.597 0.252 0.283
MV Quintile 3 594 0.487 0.183 0.633 0.229 0.281
MV Quintile 4 508 0.564 0.183 0.647 0.179 0.269
MV Quintile 5 408 0.617 0.205 0.626 0.133 0.273

Table A8 (continued)




N   High-Low Spread   Roll Spread   Effective Tick Spread   LOT Y-Split   LOT Mixed
Panel D Average Correlation (1993-2006)
Full Sample 12,505 0.624 0.339 0.599 0.443 0.438

NYSE 2,894 0.412 0.158 0.705 0.443 0.356
Amex 1,118 0.665 0.381 0.547 0.447 0.474
Nasdaq 8,493 0.691 0.395 0.569 0.443 0.461

MV Quintile 1 8,553 0.700 0.413 0.549 0.451 0.482
MV Quintile 2 1,487 0.582 0.248 0.662 0.405 0.361
MV Quintile 3 975 0.477 0.172 0.719 0.435 0.341
MV Quintile 4 803 0.382 0.120 0.747 0.435 0.320
MV Quintile 5 574 0.184 0.047 0.789 0.445 0.267





Additional Applications of the High-Low Spread Estimator
The main text provides an analysis of the performance of the high-low spread estimator relative
to monthly effective spreads from TAQ. The text also provides applications of the estimator to historical
CRSP data and to Datastream data for stocks in Hong Kong and India. In this section, we describe
several additional applications to demonstrate the potential usefulness of the high-low spread estimator to
a variety of markets and research questions. The first application is an extension of the Datastream
application in the text, incorporating data from nine additional countries. The second application
describes the estimation of daily spreads in an event study around stock splits from 1926 through 1982.
The third application describes the use of the estimator for intraday, rather than daily, periods.
A. An Application to Non-U.S. Markets Using Datastream Data
In this section, we demonstrate the application of the high-low spread estimator to non-U.S.
markets using Datastream data for individual stocks in eleven countries. Results for Hong Kong and India
were described in Section VI of the text. The nine additional markets include Korea, Japan, Italy, France,
Belgium, Sweden, the U.K., Brazil, and New Zealand. As in the previous analyses, we calculate high-low
spreads for each two-day interval following the derivation in Section I. We then calculate monthly
spreads for each stock by averaging across all overlapping two-day intervals within the month. We
include only those stock-months with at least 12 daily spread observations and we set all negative daily
estimates to zero before taking the monthly average.¹ Finally, for each country, we calculate the cross-sectional
average of high-low spreads by month using all stocks with sufficient data.
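
The monthly aggregation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the closed-form expressions for beta, gamma, alpha, and S are taken from the derivation in Section I of the main text and should be checked against it, and the inputs are assumed to be overnight-adjusted daily highs and lows for one stock-month.

```python
import math

K = 3.0 - 2.0 * math.sqrt(2.0)


def hl_spread(h1, l1, h2, l2):
    """High-low spread estimate for one two-day interval."""
    # beta: sum of squared one-day log high-low ratios
    beta = math.log(h1 / l1) ** 2 + math.log(h2 / l2) ** 2
    # gamma: squared log high-low ratio over the combined two-day period
    gamma = math.log(max(h1, h2) / min(l1, l2)) ** 2
    alpha = (math.sqrt(2.0 * beta) - math.sqrt(beta)) / K - math.sqrt(gamma / K)
    return 2.0 * (math.exp(alpha) - 1.0) / (1.0 + math.exp(alpha))


def monthly_spread(highs, lows, min_obs=12):
    """Average of all overlapping two-day estimates within a month,
    with negative estimates set to zero; None if too few estimates."""
    daily = []
    for t in range(len(highs) - 1):
        # No estimate when the daily high equals the daily low.
        if highs[t] == lows[t] or highs[t + 1] == lows[t + 1]:
            continue
        s = hl_spread(highs[t], lows[t], highs[t + 1], lows[t + 1])
        daily.append(max(0.0, s))
    if len(daily) < min_obs:
        return None
    return sum(daily) / len(daily)
```

As a sanity check, if a stock with no fundamental volatility trades only at a bid of 99 and an ask of 101 on both days, `hl_spread(101, 99, 101, 99)` returns approximately 0.02, which matches the true spread of 2% of the midpoint.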
Table A9 provides summary statistics for the monthly high-low spreads in each of the eleven
countries. The period of Datastream data coverage differs by country. We therefore provide results based
on the period from January 1994 through December 2007, when data are available for all eleven countries.
As the results show, the market-wide average spread varies substantially across countries, ranging from a
low of 0.58% for New Zealand to a high of 1.47% for Korea. Among countries with at least 100 sample
firms, the minimum monthly spread is lowest in the UK, at 0.90%. The monthly market-wide average
spreads exhibit even more variation across time within individual countries. In India, for example,
monthly average spreads range from a low of 0.96% to a high of 5.23%. Similarly, monthly average
spreads in Italy range from a low of 0.58% to a high of 3.91%. The number of firms with sufficient data
to allow spread estimation also varies widely across countries, with the average number of firms ranging
from 54 for New Zealand to 2,481 for Japan.

¹ We adjust for overnight returns as described in the text, based on a comparison of daily high and low prices to the prior day's
close. Spreads are not estimated in cases where the daily high and low are either equal or missing.

To examine the relation between spreads across countries, we compute time-series correlations
between the monthly market-wide spreads for each pair of countries. The results are provided in Table
A10. The paired correlations range from a low of -0.441 for Hong Kong and India to a high of 0.736 for
Sweden and France. The average time-series correlation across all country pairs is 0.214.
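
The average paired correlations quoted in this section can be reproduced directly from the entries of Table A10. The sketch below is illustrative only (the variable and function names are hypothetical); the numbers are copied from the lower triangle of the table.

```python
from itertools import combinations

COUNTRIES = ["Korea", "Japan", "Hong Kong", "India", "Italy", "France",
             "Belgium", "Sweden", "UK", "Brazil", "New Zealand"]

# Lower triangle of Table A10, one row per country starting with Japan.
LOWER = [
    [0.563],
    [0.720, 0.637],
    [-0.424, -0.220, -0.441],
    [-0.284, 0.198, -0.258, 0.146],
    [-0.067, 0.494, 0.097, -0.066, 0.574],
    [-0.050, 0.305, 0.051, 0.007, 0.542, 0.628],
    [0.061, 0.496, 0.213, 0.004, 0.589, 0.736, 0.583],
    [0.274, 0.471, 0.425, -0.220, -0.005, 0.542, 0.266, 0.323],
    [0.168, 0.295, 0.230, 0.362, -0.031, 0.045, -0.105, 0.096, 0.176],
    [0.277, 0.451, 0.350, 0.081, 0.058, 0.332, 0.044, 0.293, 0.362, 0.389],
]

# Map each unordered country pair to its correlation.
CORR = {}
for i, row in enumerate(LOWER, start=1):
    for j, rho in enumerate(row):
        CORR[frozenset((COUNTRIES[i], COUNTRIES[j]))] = rho


def average_pairwise(names):
    """Mean correlation across all pairs drawn from `names`."""
    pairs = [CORR[frozenset(p)] for p in combinations(names, 2)]
    return sum(pairs) / len(pairs)
```

Rounded to three decimals, `average_pairwise(COUNTRIES)` gives 0.214 across all 55 country pairs, while the subsets Korea, Japan, and Hong Kong and Italy, France, Belgium, and Sweden give 0.640 and 0.609, respectively.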
To illustrate the time-series patterns in high-low spreads, we plot the monthly cross-sectional
average by country in Panels A through I of Figure A2. Because data coverage in Datastream increases
over time, the graphs also plot the number of firms used to compute the market-wide average in each
month. Similar graphs for Hong Kong and India are provided in the main text.
Results for Korea and Japan are shown in Panels A and B of Figure A2. As is the case for Hong
Kong, as discussed in the main text, average spreads in Korea and Japan exhibit a sharp increase at the
time of the Asian currency crisis in October 1997. In Korea, mean high-low spreads range from 0.5% to
1.0% prior to the currency crisis, but jump to over 2% during much of 1998, 1999, and 2000. Spreads
come down in early 2001, stabilizing at approximately 1.5%. While spreads appear to be high in 2002-
2007 relative to the pre-crisis period, this may simply reflect the substantial increase in Datastream
coverage in the later period. The results are similar for Japan, where high-low spreads exhibit a sharp
increase in October 1997 and a subsequent decrease from late 2003 through early 2005. For both
countries, the patterns in high-low spreads are consistent with a substantial increase in execution costs
during the Asian currency crisis. For the period starting in January 1994, the average paired correlation
between market-wide average spreads of Korea, Japan, and Hong Kong is 0.640.
Results for Italy, France, Belgium, and Sweden are provided in Panels C through F of Figure A2.
For all four countries, the most striking feature is a significant increase in spreads in December 1994. In
Italy, for example, spreads increased from an average of less than 1.0% to a peak of nearly 4.0%. Similar
increases are evident in the other three countries and all four countries experience a significant decrease in
spreads in December of 1995. These effects correspond to the Mexican peso crisis in 1994-1995 and
suggest that contagion effects or exposure to Mexican debt led to large increases in execution costs for
these countries. For Sweden, the graph also points to an increase in spreads from late 1990 through 1993.
This increase in execution costs may reflect the banking problems and subsequent krona depreciation
experienced in Sweden during this period. For the period starting in January 1994, the average paired
correlation between market-wide average spreads in these four countries is 0.609.
Panel G of Figure A2 presents results for the UK. Several notable patterns are evident from the
graph. First, there is a sharp increase in spreads from late 1989 through early 1991. This is followed by a
significant drop in spreads during 1993. From late 1999 through early 2003, spreads in the UK roughly
double, from less than 1.0% to over 1.8%. Finally, the graph shows a significant decrease in spreads in
August 2003, coupled with a significant drop in the number of firms. While we cannot place economic
meaning on all of these patterns, the decrease in spreads during 1993 coincides with the UK's exit from
the European exchange rate mechanism.
Results for Brazil and New Zealand are provided in Panels H and I of Figure A2. There do not
appear to be any systematic patterns in spreads for either of these countries. While spreads in Brazil are
significantly higher in 1993 than in later years, this appears to reflect the small sample size during this
period. For New Zealand, there are several spikes in spreads, including a general increase in spreads from
mid-1998 through 2002. However, the number of stocks included in the sample for New Zealand is
relatively small, making it difficult to draw firm conclusions.
Overall, the results presented in Figure A2 point to several economically important patterns in
execution costs across multiple countries. These examples illustrate the potential use of the high-low
spread estimator for analyzing execution costs in non-U.S. markets using Datastream data.
B. An Application to Daily Event Studies: Stock Splits from 1926-1982
One of the advantages of the high-low spread estimator is that it can be used to produce spread
estimates over relatively short time intervals, such as days or weeks. This makes it ideal for analyzing
changes in spreads around specific events. Spreads, of course, provide information about the profitability
of trading around events. In addition, spreads can be used as a measure of asymmetric information around
events.
To illustrate this potential application of the estimator, we examine daily high-low spread
estimates around stock splits over the period from 1926 to 1982. Notably, this period precedes the
availability of intraday data through TAQ or ISSM. A number of previous studies, including Angel
(1997), Conroy, Harris, and Benet (1990), and Schultz (2000b), examine bid-ask spreads around stock
splits and conclude that quoted and effective spreads rise following stock splits. All of these studies
examine recent splits, because intraday trade and quote data are unavailable before 1983. Because the
high-low spread estimator requires only daily high and low prices, it can be used to evaluate spread
patterns around stock splits during earlier time periods.
To begin, we collect the full sample of stock splits for CRSP stocks from 1926-1982 (CRSP
distribution code 5523). We include only those splits that increase shares outstanding by at least 20%.
Using the high-low spread estimator, we calculate the cross-sectional average spread on each day from
-10 to +10 around the split date. The high-low estimator involves comparing high and low price ranges
over a single two-day period with the high-low price range over two one-day periods. To obtain daily
spread estimates on particular days around the stock split, we take an average of the high-low spread
estimates for the two overlapping intervals that include that day. For example, for day -5, we take the
average of the spreads estimated over the two-day intervals from -6 to -5 and from -5 to -4. For the day
before the split (day -1), we use just the spread estimated from days -2 to -1. For the first post-split day
(day +1), we use just the spread estimated from days 0 to 1.²
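
The day-level averaging rule can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name and the dictionary layout are hypothetical, negative two-day estimates are set to zero before averaging, and the treatment of day 0 (handled here like any other day) is an assumption, since the text does not describe it.

```python
def event_day_spread(day, two_day):
    """Daily spread for one event day.

    two_day maps the first day of each two-day interval to its high-low
    spread estimate, e.g. two_day[-6] covers days -6 to -5. Intervals
    with no estimate are absent from the dictionary or mapped to None.
    """
    if day == -1:
        starts = [-2]            # only the interval from -2 to -1
    elif day == 1:
        starts = [0]             # only the interval from 0 to +1
    else:
        starts = [day - 1, day]  # both overlapping intervals (assumed for day 0)
    vals = [max(0.0, two_day[s]) for s in starts if two_day.get(s) is not None]
    return sum(vals) / len(vals) if vals else None
```

For example, with `two_day = {-6: 0.02, -5: 0.04}`, the estimate for day -5 is the average of the two overlapping intervals, or about 0.03.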

If a stock trades at just one price over a two-day period, the high-low estimator for that period is
not defined. We could restrict our sample to stocks with spread estimates every day around the split, but
that would eliminate the less active stocks from our sample and we are interested in how their trading
costs are affected by splits. Instead, we include stocks in our sample even if spreads cannot be estimated
every day. Hence the number of cross-sectional observations varies from 2,924 to 3,150 over the 21-day
event period, with only one day having fewer than 3,000 observations. As in the main text, negative
two-day estimates are set to zero. However, the conclusions are similar if negative spreads are included.

² In the main text, we compute monthly spread estimates by taking an average across all two-day periods within the month. These
monthly estimates are considerably less noisy than the daily estimates produced here. If intraday data are available, an alternative
would be to calculate high-low spread estimates for intraday periods and then average across intraday periods to get a daily
estimate. An application of the estimator to intraday trade data is provided below.

The results of the event study analysis are depicted in Figure A3. Mean spreads show little
variation day-to-day but increase sharply the day following a split. Prior to splits, the mean high-low
spread ranges from 2.92% to 3.09%. In the ten days following the split, the mean spread ranges from
3.72% to 4.01%, with the maximum at day +1. Differences between post-split and pre-split spreads
(measured from days -15 through -11) are highly statistically significant. These findings are consistent
with studies of stock splits from more recent time periods.
The stock split analysis illustrated here demonstrates the potential usefulness of the high-low
spread estimator in measuring changes in trading costs or uncertainty around events. In particular, this
analysis demonstrates how the high-low estimator can be used to analyze spreads at the daily level.
C. An Application to Intraday Trade Data
The derivation of the high-low estimator described in the main text is based on a comparison of
one two-day period to two single days. However, the estimator is not limited to use with daily data. One
potentially useful application of the estimator is with intraday trade data in cases where quote data are
unavailable or when trades cannot be reliably matched with quotes. The use of the estimator with intraday
data is also useful in cases where quote data are unwieldy. For example, recent TAQ quote files have
become increasingly challenging to use, having grown to more than 10 times the size of the trade files.
To illustrate this potential application of the estimator, we calculate high-low spreads for 15-
minute intraday periods in 1993 and 2006 for all NYSE, Amex, and Nasdaq stocks. The estimation
follows the derivation in the main text, except that it uses consecutive intraday rather than daily trading
intervals. For example, the high-low spread estimate for the 15-minute interval starting at 9:45 a.m. is
computed using the high and low prices from the 9:30-9:45 interval and the 9:45-10:00 interval, where
the high and low are identified using intraday trade data from TAQ.³ We then calculate daily spreads for
each stock by averaging across all 15-minute intervals within the day. We include only those stock-days
with at least 10 intraday spread observations. In addition, we provide results using three alternative
methods to account for negative spread estimates within the day: (1) setting negative intraday spreads to
zero, (2) leaving negative intraday spreads unchanged, and (3) excluding negative intraday spreads. To
test the performance of the high-low spread estimator as applied to intraday data, we compare the
resulting high-low spread estimates to trade-weighted effective spreads calculated directly from TAQ.
The TAQ data include 1,544,655 stock-day observations for 1993 and 1,183,472 stock-day observations
for 2006.⁴ The analysis below is based on the subset of observations for which both a TAQ effective
spread and a high-low spread could be calculated.
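
The three treatments of negative intraday estimates can be sketched as below. This is an illustrative sketch only: the function name and the `negatives` argument are hypothetical, and applying the 10-observation screen before any negative estimates are dropped is an assumption.

```python
def daily_from_intraday(estimates, negatives="exclude", min_obs=10):
    """Daily spread from a list of 15-minute high-low spread estimates.

    negatives: "zero" sets negative estimates to zero, "keep" leaves
    them unchanged, and "exclude" drops them before averaging.
    """
    if len(estimates) < min_obs:
        return None
    if negatives == "zero":
        vals = [max(0.0, s) for s in estimates]
    elif negatives == "keep":
        vals = list(estimates)
    elif negatives == "exclude":
        vals = [s for s in estimates if s >= 0.0]
    else:
        raise ValueError("negatives must be 'zero', 'keep', or 'exclude'")
    return sum(vals) / len(vals) if vals else None
```

For any given stock-day with at least one non-negative estimate, the "keep" average is no larger than the "zero" average, which in turn is no larger than the "exclude" average.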
Summary statistics for daily estimates of TAQ effective spreads and high-low spreads are
provided in Table A11. The TAQ effective spread averages 1.320% in 1993 and 0.272% in 2006. In
comparison, the high-low spread for 1993 averages 0.837% if negative intraday values are included,
1.048% if negative values are set to zero, and 1.214% if negative values are excluded. For 2006, the three
versions of the high-low spread average 0.056%, 0.181%, and 0.262%, respectively. While all three
versions of the high-low estimator appear to underestimate spreads relative to the TAQ effective spread,
the results suggest that eliminating negative intraday spread estimates prior to calculating the daily
average produces results most comparable to the TAQ effective spread. This conclusion is confirmed by
examining the mean absolute errors based on the difference between TAQ effective spreads and high-low
spreads. The mean absolute error is lowest when negative intraday spreads are excluded, averaging
0.0026 for 1993 and 0.0013 for 2006. In the correlation analysis to follow, we focus on the version of the
high-low spread estimator that excludes negative intraday spread estimates.

³ Because spread estimates require two consecutive 15-minute intervals, no spread estimate is calculated for the 9:30-9:45 period
on each day. In contrast to the daily analysis in the main text, analysis based on intraday periods does not require an adjustment
for overnight returns.
⁴ The calculation of effective spreads and related TAQ data screens are discussed in more detail in Section V. The high-low
estimator requires at least two consecutive 15-minute intervals with trades at multiple prices (i.e., high ≠ low). As a result,
requiring at least 10 intraday spread estimates eliminates a large fraction of stock-day observations. This data loss could be
substantially reduced by requiring fewer intraday spread estimates or by using longer intraday intervals. For 1993, 24.9% of
available stock-day observations have at least 10 intraday spread estimates, while 64.8% of stock-day observations have at least
one intraday spread estimate. In 2006, 74.3% of available stock-day observations have at least 10 intraday spread estimates, while
89.8% of stock-day observations have at least one intraday estimate.

We note that the underestimation of high-low spreads relative to TAQ effective spreads may
simply reflect intraday patterns in spreads and trading activity. While the high-low estimator weights each
15-minute period within the day equally, TAQ effective spreads are calculated as a trade-weighted
average. As a result, the TAQ effective spread places greater weight on the relatively high spreads at the
beginning and end of the trading day.
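
A small numerical illustration of this weighting effect is given below. The spreads and trade counts are hypothetical values chosen only to mimic a U-shaped day; they are not taken from the data.

```python
def equal_weighted(spreads):
    """Simple average across intraday periods."""
    return sum(spreads) / len(spreads)


def trade_weighted(spreads, trades):
    """Average across intraday periods weighted by number of trades."""
    return sum(s * n for s, n in zip(spreads, trades)) / sum(trades)


# Hypothetical day: wide spreads and heavy trading at the open and close.
spreads = [0.020, 0.010, 0.010, 0.020]
trades = [300, 100, 100, 300]
```

With these inputs the equal-weighted average is about 1.50%, while the trade-weighted average is about 1.75%, so an equal-weighted estimator sits below a trade-weighted benchmark even when every period's spread is measured without error.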
To provide additional tests of the performance of the high-low spread estimator as applied to
intraday data, we examine correlations between the high-low spread estimates and TAQ effective spreads.
We provide three alternative types of tests. First, we examine the correlation between high-low spreads
and TAQ effective spreads in the pooled sample of time-series and cross-sectional observations. Second,
we estimate the cross-sectional correlations by day and calculate the mean cross-sectional correlation
across all days in the year. Finally, we estimate stock-by-stock time-series correlations and calculate the
mean time-series correlation across all stocks and across stocks in each market capitalization decile. For
each test, we provide separate results for 1993 and 2006. The results are presented in Table A12.
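
The three correlation tests can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the row layout `(stock, day, hl_spread, taq_spread)` and the function names are assumptions, and groups with fewer than two observations or no variation are simply skipped.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None if either series has no variation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def mean_group_corr(rows, key_index):
    """Average of within-group correlations, grouping on rows[key_index]."""
    groups = defaultdict(lambda: ([], []))
    for row in rows:
        groups[row[key_index]][0].append(row[2])
        groups[row[key_index]][1].append(row[3])
    corrs = [pearson(x, y) for x, y in groups.values() if len(x) >= 2]
    corrs = [c for c in corrs if c is not None]
    return sum(corrs) / len(corrs) if corrs else None


def three_correlations(rows):
    """rows: (stock, day, hl_spread, taq_spread) tuples."""
    pooled = pearson([r[2] for r in rows], [r[3] for r in rows])
    cross_sectional = mean_group_corr(rows, 1)  # one correlation per day
    time_series = mean_group_corr(rows, 0)      # one correlation per stock
    return pooled, cross_sectional, time_series
```

The pooled correlation mixes variation across stocks and across days, whereas the other two measures isolate each source of variation separately.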
The pooled correlation between high-low spreads and TAQ effective spreads is 0.912 in 1993 and
0.803 in 2006. Similarly, the average cross-sectional correlation between high-low spreads and TAQ
effective spreads is 0.912 in 1993 and 0.815 in 2006. These results are generally consistent with the
monthly analysis provided in Section V and suggest that the high-low estimator produces very accurate
daily spread estimates.
Across all sample stocks, the average time-series correlation is 0.525 in 1993 and 0.458 in 2006.
During 1993, high-low spreads have a higher correlation with TAQ effective spreads for small stocks than
for large stocks. The average time-series correlation is 0.568 for decile one and 0.427 for decile ten. In
2006, the performance of the high-low spread estimator is more consistent across size deciles, with an
average time-series correlation of 0.454 for decile one and 0.438 for decile ten.
To further illustrate the accuracy of the high-low spread estimator, Figure A4 plots the cross-
sectional average TAQ effective spread and the cross-sectional averages of three alternative high-low
spread estimates by day. Results for 1993 are shown in Panel A and results for 2006 are shown in Panel
B. As in Table A11, the results show that the high-low spread is slightly lower than the TAQ effective
spread, on average, with the best results being evident when negative intraday spread estimates are
excluded. The effects of negative spread estimates are most pronounced in the results for 2006. What is
most striking in Figure A4 is the ability of the high-low estimator to capture the time-series patterns in
average daily spreads, especially in 1993. When negative intraday spread estimates are excluded, the
time-series correlation between market-wide high-low spreads and market-wide effective spreads is 0.98
in 1993 and 0.67 in 2006.
Previous research shows that bid-ask spreads exhibit pronounced intraday patterns. Using data
from the late 1980s, McInish and Wood (1992) find that bid-ask spreads for NYSE stocks follow a U-
shaped pattern, with high spreads at the beginning and end of the trading day. Using data from the early
1990s, Chan, Christie, and Schultz (1995) find that Nasdaq stocks exhibit high spreads at the beginning of
the day, but a marked decrease in spreads at the end of the day. As an additional test of the high-low
estimator, we examine whether estimated spreads follow the expected intraday patterns.
Figure A5 plots the average high-low spread by intraday period for NYSE and Nasdaq stocks.
Results for 1993 are shown in Panels A and B and results for 2006 are shown in Panels C and D. As
expected, bid-ask spreads for NYSE stocks exhibit a U-shaped pattern with the highest spreads at the
beginning of the trading day.⁵ This pattern is evident in both 1993 and 2006, though the time-series
variation is strongest in 2006. The patterns are similar for Nasdaq stocks, with spreads being highest at
the beginning of the trading day in both 1993 and 2006. While Nasdaq spreads remain relatively flat at
the end of the trading day in 1993, they appear to increase at the end of the trading day in 2006. This
suggests that intraday patterns in the spreads of Nasdaq stocks may have changed over time, becoming
more like NYSE patterns in recent years.
Taken together, the results provided here suggest that the high-low estimator performs very well
when applied to intraday data. Not only do the spread estimates follow the expected intraday patterns, but
the aggregation of intraday estimates produces daily spreads that are very accurate in comparison to TAQ
effective spreads. Thus, we expect the application of the high-low spread estimator to intraday data to
prove very useful in cases where intraday quote data are unavailable or cumbersome, and in cases where
trades cannot be reliably matched with quotes.

⁵ Recall that the high-low spread is not estimated for the 9:30-9:45 period. The beginning-of-day patterns illustrated in Figure A5
would likely be more pronounced if this period were included.



Table A9: Summary Statistics for High-Low Spreads by Country
The table provides summary statistics for the cross-section of high-low spread estimates for each of eleven countries
covered by Datastream. Monthly high-low spreads are calculated for each stock-month with at least 12 daily spread estimates
within the month. For each country, we then calculate the cross-sectional average high-low spread each month. The
table reports the time series properties of these country-specific cross-sectional averages, along with information on
the number of firms included in each average each month. Data are from January 1994 through December 2007.

              Monthly High-Low Spread   Number of Firms in Cross-Sectional Avg
Country       Mean    Min     Max       Mean    Minimum   Maximum
Korea         1.47%   0.73%   2.59%     1265    350       1865
Japan         0.76%   0.48%   1.12%     2481    1582      3623
Hong Kong     1.16%   0.50%   2.29%     475     198       957
India         1.65%   0.90%   5.23%     1083    406       1482
Italy         1.02%   0.55%   3.91%     264     147       315
France        0.95%   0.57%   2.01%     451     249       569
Belgium       0.83%   0.32%   2.26%     83      37        107
Sweden        1.10%   0.54%   2.95%     295     142       435
UK            0.90%   0.50%   1.85%     1189    673       1581
Brazil        1.31%   0.68%   3.73%     144     62        326
New Zealand   0.58%   0.34%   1.17%     54      15        79





Table A10: Correlations in High-Low Spreads Across Countries
The table provides paired time-series correlations between market-wide high-low spreads for each of eleven countries covered by Datastream. Monthly high-low
spreads are calculated for each stock-month with at least 12 daily spread estimates within the month. For each country, we then calculate the cross-sectional average high-low
spread each month. The table reports the paired time-series correlations between these country-specific cross-sectional averages. Data are from January 1994
through December 2007.

                 Korea     Japan Hong Kong     India     Italy    France   Belgium    Sweden        UK    Brazil
Japan            0.563
Hong Kong        0.720     0.637
India           -0.424    -0.220    -0.441
Italy           -0.284     0.198    -0.258     0.146
France          -0.067     0.494     0.097    -0.066     0.574
Belgium         -0.050     0.305     0.051     0.007     0.542     0.628
Sweden           0.061     0.496     0.213     0.004     0.589     0.736     0.583
UK               0.274     0.471     0.425    -0.220    -0.005     0.542     0.266     0.323
Brazil           0.168     0.295     0.230     0.362    -0.031     0.045    -0.105     0.096     0.176
New Zealand      0.277     0.451     0.350     0.081     0.058     0.332     0.044     0.293     0.362     0.389
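
As an illustration of how each entry in the table could be obtained, the sketch below computes Pearson correlations between every pair of country-level monthly series. The dictionary layout and function names are assumptions for the example, and the series are aligned on the months they have in common, which is a choice made here rather than something stated in the table description.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(series_by_country):
    """series_by_country: {country: {month: average spread}} (assumed layout).

    Returns {(country_1, country_2): correlation} for every pair.
    """
    result = {}
    for c1, c2 in combinations(series_by_country, 2):
        s1, s2 = series_by_country[c1], series_by_country[c2]
        months = sorted(set(s1) & set(s2))  # months present in both series
        result[(c1, c2)] = pearson(
            [s1[m] for m in months], [s2[m] for m in months]
        )
    return result
```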



[Figure A2, Panels A and B. Each panel plots, by month, the Mean Bid-Ask Spread and the Number of Firms with Available Data.]
Panel A - Mean High-Low Spreads in Korea
Panel B - Mean High-Low Spreads in Japan

Figure A2 - Average High-Low Spreads by Month based on Datastream Data
High-low spreads are estimated for each stock each month by averaging all overlapping two-day spread estimates within the month. The graph plots the equally
weighted average spread by month across all stocks with at least 12 daily spread observations within the month. The graph also shows the number of firms
included in the average each month. Panels A through I show results for stocks in Korea, Japan, Italy, France, Belgium, Sweden, the U.K., Brazil, and New
Zealand, respectively. All data are from Datastream.
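
For readers who want to reproduce monthly series of this kind, the sketch below implements the closed-form two-day high-low estimator from the main text of the published paper and averages the overlapping two-day estimates. It is a simplified illustration under stated assumptions: the adjustment for overnight price changes and any treatment of negative estimates are omitted, the function names are hypothetical, and the 12-observation threshold is applied to the number of two-day estimates.

```python
from math import exp, log, sqrt

K = 3 - 2 * sqrt(2)  # constant appearing in the closed-form solution


def two_day_spread(h1, l1, h2, l2):
    """High-low spread estimate from the highs and lows of two adjacent days."""
    # beta: sum of squared one-day log high-low ratios.
    beta = log(h1 / l1) ** 2 + log(h2 / l2) ** 2
    # gamma: squared log high-low ratio over the combined two-day interval.
    gamma = log(max(h1, h2) / min(l1, l2)) ** 2
    alpha = (sqrt(2 * beta) - sqrt(beta)) / K - sqrt(gamma / K)
    return 2 * (exp(alpha) - 1) / (1 + exp(alpha))


def monthly_spread(highs, lows, min_obs=12):
    """Average of all overlapping two-day estimates within one month.

    highs, lows: daily series for one stock-month, in trading-day order.
    Returns None when fewer than min_obs estimates are available.
    """
    estimates = [
        two_day_spread(highs[i], lows[i], highs[i + 1], lows[i + 1])
        for i in range(len(highs) - 1)
    ]
    if len(estimates) < min_obs:
        return None
    return sum(estimates) / len(estimates)
```

When the daily range reflects the spread alone, for example a high of 10.1 and a low of 10 on both days, the estimate equals 2(1.01 - 1)/(1.01 + 1), the proportional spread implied by a high-low ratio of 1.01.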




[Figure A2, Panels C through F. Each panel plots, by month, the Mean Bid-Ask Spread and the Number of Firms with Available Data.]
Panel C - Mean High-Low Spreads in Italy
Panel D - Mean High-Low Spreads in France
Panel E - Mean High-Low Spreads in Belgium
Panel F - Mean High-Low Spreads in Sweden
Figure A2 - continued

[Figure A2, Panels G through I. Each panel plots, by month, the Mean Bid-Ask Spread and the Number of Firms with Available Data.]
Panel G - Mean High-Low Spreads in the U.K.
Panel H - Mean High-Low Spreads in Brazil
Panel I - Mean High-Low Spreads in New Zealand
Figure A2 - continued

[Figure A3 plot: Mean Spread (vertical axis, 0.00% to 4.50%) against Event Date (horizontal axis, days -10 to 10, with no day 0).]
Figure A3 - Average High-Low Spreads Around Stock Splits: 1926-1982
The figure plots the cross-sectional average of daily high-low spreads for the 10 days prior to and 10 days after stock
splits. The sample includes all stock splits for NYSE, Amex, and Nasdaq firms from 1926 through 1982 that
increase shares outstanding by at least 20%, where stock splits are identified by CRSP distribution code 5523.
High-low spreads are estimated for each two-day period, as described in Section V. In general, daily estimates are then
defined by averaging the high-low spread estimates for the two overlapping intervals that include that day. For the
day before the split (day -1), we use just the spread estimated from days -2 to -1. For the first post-split day (day +1),
we use just the spread estimated from days 0 to 1.
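
The conversion from two-day estimates to daily estimates can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: positions index the trading days of the event window, the hypothetical parameter `split_after` marks the last pre-split position, and the single two-day interval that straddles the split is dropped so that each adjacent day relies on one interval only. Days at the edge of the supplied window also use one interval here, simply because no further data are passed in.

```python
def daily_from_two_day(two_day, split_after=None):
    """Convert two-day spread estimates into daily estimates.

    two_day[i] is the estimate from trading days i and i + 1.
    split_after: position of the last pre-split day; the interval covering
    days split_after and split_after + 1 is ignored (assumed convention).
    Returns a list with one entry per day; None when no interval is usable.
    """
    n_days = len(two_day) + 1
    daily = []
    for day in range(n_days):
        usable = []
        # A day belongs to the interval starting the day before and to the
        # interval starting on the day itself.
        for i in (day - 1, day):
            if 0 <= i < len(two_day) and i != split_after:
                usable.append(two_day[i])
        daily.append(sum(usable) / len(usable) if usable else None)
    return daily
```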



















Table A11 - Summary Statistics for Daily High-Low Spreads and TAQ Effective Spreads
The High-Low Spread for each trading day is defined as the equally-weighted average of the High-Low Spread
across all overlapping 15-minute intervals within the day. Results are shown based on three alternative methods to
account for negative spread estimates within the day: (1) leave negative two-period spreads unchanged, (2) set negative
two-period spreads to zero, or (3) exclude negative two-period spreads. We include only those trading days with at
least 10 intraday High-Low Spread estimates. The TAQ effective spread for each trading day is defined as a
trade-weighted average across all trades within the day. For each day and each stock, mean absolute errors are defined
based on the difference between the High-Low Spread and the TAQ effective spread. The sample includes
384,557 stock-days in 1993 and 878,782 stock-days in 2006 for which both the TAQ effective spread and the
High-Low Spread could be estimated.

                                  1993      2006
Mean Spreads:
  TAQ Eff. Spread               1.320%    0.272%
  High-Low Spread (NegIncl)     0.837%    0.056%
  High-Low Spread (Neg=0)       1.048%    0.181%
  High-Low Spread (NegExcl)     1.214%    0.262%

Mean Absolute Errors:
  High-Low Spread (NegIncl)     0.0054    0.0022
  High-Low Spread (Neg=0)       0.0035    0.0013
  High-Low Spread (NegExcl)     0.0026    0.0013
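
The three treatments of negative intraday estimates, and the mean absolute error against the TAQ benchmark, can be sketched as below. The function names and method labels are hypothetical, and the sketch assumes that the 10-estimate minimum is checked before negative values are dropped, which the table description does not specify.

```python
def daily_high_low_spread(intraday, method, min_obs=10):
    """Daily spread as the equally weighted average of intraday estimates.

    method: "incl" leaves negative estimates unchanged, "zero" sets them
    to zero, "excl" drops them before averaging.
    Returns None when the day has too few estimates.
    """
    if len(intraday) < min_obs:
        return None
    if method == "incl":
        values = list(intraday)
    elif method == "zero":
        values = [max(s, 0.0) for s in intraday]
    elif method == "excl":
        values = [s for s in intraday if s >= 0]
    else:
        raise ValueError(f"unknown method: {method}")
    return sum(values) / len(values) if values else None


def mean_absolute_error(pairs):
    """pairs: (high-low spread, TAQ effective spread) per stock-day."""
    errors = [abs(hl - taq) for hl, taq in pairs if hl is not None]
    return sum(errors) / len(errors)
```

By construction, for any day that contains at least one negative estimate the three averages are ordered NegIncl < Neg=0 <= NegExcl, which matches the ordering of the mean spreads reported in the table.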





Table A12 - Correlations of Daily High-Low Spreads with TAQ Effective Spreads
The table lists correlations between the High-Low Spread estimates and the TAQ effective spread. The High-Low
Spread for each trading day is defined as the equally-weighted average of the High-Low Spread across all
overlapping 15-minute intervals within the day, where negative spread estimates within the day are excluded prior to
calculating the average. We include only those trading days with at least 10 intraday High-Low Spread estimates.
The TAQ effective spread for each trading day is defined as a trade-weighted average across all trades within the
day. The sample includes 384,557 stock-days in 1993 and 878,782 stock-days in 2006 for which both the TAQ
effective spread and the High-Low Spread could be estimated.

                                      1993     2006
Pooled Correlation:
  Full Sample                        0.912    0.803

Mean Cross-Sectional Correlation:
  Full Sample                        0.912    0.815

Mean Time-Series Correlation:
  Full Sample                        0.525    0.458
  Size Decile 1                      0.568    0.454
  Size Decile 2                      0.543    0.450
  Size Decile 3                      0.492    0.459
  Size Decile 4                      0.461    0.466
  Size Decile 5                      0.452    0.463
  Size Decile 6                      0.452    0.467
  Size Decile 7                      0.485    0.483
  Size Decile 8                      0.389    0.470
  Size Decile 9                      0.419    0.481
  Size Decile 10                     0.427    0.438
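
The three correlation measures in the table can be distinguished with a short sketch. The interpretation used here is an assumption based on the row labels: the pooled correlation uses all stock-days at once, the mean cross-sectional correlation averages day-by-day correlations across stocks, and the mean time-series correlation averages stock-by-stock correlations across days. The record layout and function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None if it is undefined for the inputs."""
    n = len(x)
    if n < 2:
        return None
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def correlation_summary(obs):
    """obs: list of (stock, day, high_low_spread, taq_spread) records.

    Returns (pooled, mean cross-sectional, mean time-series) correlations.
    """
    pooled = pearson([o[2] for o in obs], [o[3] for o in obs])

    def mean_group_corr(key_index):
        # Group records by stock (index 0) or by day (index 1).
        groups = defaultdict(list)
        for o in obs:
            groups[o[key_index]].append((o[2], o[3]))
        corrs = []
        for pairs in groups.values():
            c = pearson([p[0] for p in pairs], [p[1] for p in pairs])
            if c is not None:
                corrs.append(c)
        return sum(corrs) / len(corrs)

    return pooled, mean_group_corr(1), mean_group_corr(0)
```

Grouping by day isolates variation across stocks, while grouping by stock isolates variation over time, so the two averages can differ substantially even when the pooled correlation is high.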




[Figure A4 plots. Series shown in each panel: TAQ Eff Sprd, High-Low Sprd (Neg Excl), High-Low Sprd (Neg=0), High-Low Sprd (Neg Incl). Vertical axis: spread in percent.]
Panel A - Average High-Low Spreads by Day in 1993
Panel B - Average High-Low Spreads by Day in 2006

Figure A4 - Daily Patterns in High-Low Spreads
The figure plots cross-sectional average spreads by day. Results are shown for TAQ effective spreads as well as
three alternative versions of the High-Low estimator, where negative spread estimates within the day are (1)
excluded, (2) set to zero, or (3) included, prior to calculating the daily average. Results for 1993 are shown in Panel
A and results for 2006 are shown in Panel B.

[Figure A5 plots. Legend entries in each panel: High-Low Sprd (Neg Excl), High-Low Sprd (Neg=0), High-Low Sprd (Neg Incl). Vertical axis: spread in percent.]
Panel A - NYSE Stocks in 1993
Panel B - Nasdaq Stocks in 1993
Panel C - NYSE Stocks in 2006
Panel D - Nasdaq Stocks in 2006

Figure A5 - Intraday Patterns in High-Low Spreads
The figure plots average High-Low Spreads across all days and stocks by 15-minute period. Results for NYSE and Nasdaq stocks in 1993 are shown in Panels A
and B, respectively. Results for NYSE and Nasdaq stocks in 2006 are shown in Panels C and D, respectively. Results are shown for two alternative versions of
the High-Low estimator, where negative spread estimates within the day are either (1) excluded, or (2) set to zero.
