
NESUG 2010 Applications Development

Portfolio Backtesting: Using SAS® to Generate Randomly Populated Portfolios for Investment Strategy Testing

Xuan Liu, Mark Keintz


Wharton Research Data Services

Abstract
One of the most regularly used SAS programs at our business school assesses the investment
returns from a randomly populated set of portfolios covering a student-specified historic period,
rebalancing frequency, portfolio count, size, and type. The SAS program demonstrates how to
deal with dynamically changing conditions, including periodic rebalancing, replacement of
delisted stocks, and shifting of stocks from one type of portfolio to another.

The application is a good example of the effective use of hash tables, especially for tracking
holdings and investment returns.

1. What is Backtesting and how does it work?


Backtesting is the process of applying an investment strategy to historical financial information
to assess the results (i.e. change in value). That is, it answers the question “what if I had applied
investment strategy X during the period Y?”

The backtest application developed at the Wharton School, used for instructional rather than
research purposes, is currently applied only to publicly traded stocks. Later, in the “Creating a
Backtest” section, we go over the more important considerations in creating a backtest program.

As an example, a user might request a backtest of 4 portfolios, each with 20 stocks, for
the period 2000 through 2008. The four portfolios might come from a cross-classification of (a) the
top 20% and bottom 20% of market capitalization with (b) the top 20% and bottom 20%
of book-to-market ratios. The user might rebalance the stocks every 3 months (i.e. redivide the
investment equally among the stocks) and refill (i.e. replace no-longer-eligible stocks) every 6
months.

2. Source file for Backtesting


The source file used for backtesting is prepared by merging monthly stock data (for monthly
prices and returns), “event” data (to track when stocks stopped or restarted trading), and annual
accounting data filed with the SEC (for book equity data), as shown in Figure 2.1.


[Figure: the monthly stocks and event files and the annual accounting data file are merged into the source file for backtesting.]

Fig. 2.1 Data file used for Backtesting

This yielded a monthly file with changes in price, monthly cumulative returns, and yearly
changes in book value. Because the data is sorted by stock identifier (STOCK_ID) and DATE,
a monthly cumulative return (CUMRET0) can be calculated for each stock in the
dataset from the single-month returns (RETURN), as below. CUMRET0 will be used later to
determine the actual performance of each portfolio.
/* Calculation of monthly cumulative returns */
data monthly_cumreturns;
  set monthly_file;
  by stock_id date;
  retain cumret0;
  /* Start each stock at 1 so the running product resets per stock */
  if first.stock_id then cumret0 = 1;
  /* A missing return leaves the cumulative return unchanged */
  if not missing(return) then cumret0 = cumret0 * (1 + return);
run;
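
In formula terms, the step above computes a running product of gross monthly returns. As a sketch (the symbols $r_{i,s}$ for the RETURN of stock $i$ in month $s$ and $C_{i,t}$ for its CUMRET0 in month $t$ are notation introduced here, not part of the original program), with a missing return treated as zero:

```latex
C_{i,t} = \prod_{s \le t} \left(1 + r_{i,s}\right),
\qquad
\frac{C_{i,t}}{C_{i,u}} = \prod_{u < s \le t} \left(1 + r_{i,s}\right) \quad (u < t)
```

The second identity is why a ratio of two CUMRET0 values gives the gross return over any holding period. For stock 1 in Table 6.1, (1 + 0.2000)(1 + 0.6667) gives the CUMRET0 of 2.0000 shown for 19900331.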

As mentioned earlier, users may restrict portfolios to specific percentile ranges of variables
like market capitalization (MARKETCAP). These percentiles (deciles in this example) are
generated via PROC RANK for each refill date, as below:

/* Portfolio deciles using the market cap criterion */
proc sort data=monthly_cumreturns out=source;
  by date stock_id;
run;

/* GROUPS=10 assigns decile ranks 0 (lowest) through 9 (highest) within each date */
proc rank data=source out=temp groups=10;
  by date;
  var marketcap;
  ranks rMarketcap;
run;

The resulting dataset looks like this:


Example of the Data File Used for Backtesting

stock_id  date      Return     Cumret0   marketcap  rMarketcap            …  …
                                                    (rank for marketcap)
50091     19711031  0.0110000  0.667726  33270.00   5                     …  …
50104     19711031  0.0119802  2.334271  49723.25   6                     …  …

Table 2.1 Data file used for Backtesting
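
The decile ranks can then drive a screen such as the "top 20% and bottom 20% of market capitalization" example from Section 1. The following is a minimal sketch, not the authors' program: the datasets source_demo, temp_demo and cap_extremes, the variable cap_group, and the sample values are all hypothetical.

```sas
/* Hypothetical sample: ten stocks on one date (values are made up) */
data source_demo;
  input stock_id date marketcap;
  datalines;
1 19711031 100
2 19711031 200
3 19711031 300
4 19711031 400
5 19711031 500
6 19711031 600
7 19711031 700
8 19711031 800
9 19711031 900
10 19711031 1000
;
run;

/* Same ranking step as in the text, applied to the sample */
proc rank data=source_demo out=temp_demo groups=10;
  by date;
  var marketcap;
  ranks rMarketcap;
run;

/* Deciles are numbered 0 (smallest) through 9 (largest) */
data cap_extremes;
  set temp_demo;
  length cap_group $6;
  if rMarketcap in (0, 1) then cap_group = 'bottom';
  else if rMarketcap in (8, 9) then cap_group = 'top';
  else delete;   /* also drops stocks whose rank is missing */
run;
```

Because the ranks are computed within each date, the same absolute market cap can fall in different deciles on different refill dates, which is what "date-specific percentiles" implies.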

3. Creating a Backtest
Once the primary file has been created, the backtest can be defined through these parameters:

“Structural” Parameters:
- Date range of the investment.
- Number of portfolios and amount invested in each portfolio.
- Number of stocks in each portfolio.
- Rebalancing Frequency: For portfolios designated as “equally weighted” the stocks in each
portfolio are periodically reallocated so they have equal value. (Portfolios that are “value
weighted” are not rebalanced.)
- Refilling Frequency: The frequency of determining whether a stock still qualifies for a
portfolio (see “Portfolio Criteria” below) and replacing the stock if it doesn’t.

“Portfolio Criteria” (these are specified in date-specific percentiles, not absolute values):
- Market capitalization: the total value of all publicly traded shares for a firm.
- Book-to-Market Ratio: the accounting value of a firm vs. its market capitalization.
- Lagged Returns: the return for the previous fiscal period.
- P/E Ratio: Ratio of the price of each share to the company earnings per share.
- Price: Price of a share.

The process of taking these parameters and generating a backtest is displayed in the following
figure:

[Figure: stages in creating a backtest.
Source file setup: initial cash; start and end date; stocks per portfolio; type of portfolio weighting (market cap or equal weighted); rebalance and refill period.
Screening (optional): screen by deciles; screen metrics are market cap, book-to-market ratio, earnings-to-price ratio, lagged returns, price, etc.
Partition (optional): keep everything in one portfolio, or use different metrics to divide securities into multiple portfolios.
Analysis.]

Fig. 3.1 Creating a Backtest


This introduces a number of programming tasks. The primary tasks are:

1. For the start date and each refill date, generate percentiles for the portfolio criteria.
2. At the start date, randomly draw stocks for each portfolio from the qualifying stocks.
3. Track monthly cumulative return (i.e. cumulative increase or decrease) in the value of each
stock in each portfolio. Each stock is tracked so that rebalancing can be done, if needed.
4. If a stock stops trading at any point, reallocate its residual value to the rest of the portfolio.
5. At every “refill” point, keep all stocks in the portfolio that are still eligible (buy and hold)
and randomly select replacements for all stocks no longer eligible.

By default, all available securities are considered for inclusion in the backtest. The universe can
be filtered by adding one or more screens based on the portfolio criteria (expressed in deciles in
this paper). Multiple portfolios can be created by dividing securities into distinct partitions based
on the value of one or two metrics. For example, using two metrics, book-to-market and price,
with 2 partitions for book-to-market and 3 partitions for price results in 6 portfolios. Once the
portfolios are constructed, the performance of each portfolio is analyzed.
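
The 2 x 3 partition example can be sketched with two PROC RANK passes. This is an illustrative sketch only: the variable names book_to_market, price, r_bm, r_price and portfolio_id, the dataset names, and the sample values are assumptions, and the numbering of the six portfolios is arbitrary.

```sas
/* Hypothetical sample: six stocks on one date (values are made up) */
data partition_demo;
  input stock_id date book_to_market price;
  datalines;
1 20000131 0.2 5
2 20000131 0.3 20
3 20000131 0.4 50
4 20000131 0.9 8
5 20000131 1.1 30
6 20000131 1.5 80
;
run;

/* Two partitions on book-to-market: r_bm is 0 or 1 */
proc rank data=partition_demo out=bm_ranked groups=2;
  by date;
  var book_to_market;
  ranks r_bm;
run;

/* Three partitions on price: r_price is 0, 1 or 2 */
proc rank data=bm_ranked out=partitioned groups=3;
  by date;
  var price;
  ranks r_price;
run;

/* Combine the two ranks into a single portfolio number from 1 to 6 */
data portfolios;
  set partitioned;
  if missing(r_bm) or missing(r_price) then delete;
  portfolio_id = 3 * r_bm + r_price + 1;
run;
```

Here the two metrics are ranked independently of each other. Ranking price within each book-to-market group would be a different design and would need r_bm added to the BY statement after sorting.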

4. Portfolios are populated by randomly selected securities


During the creation of a backtest, securities within a portfolio are randomly selected, which is
made possible by generating a random number for each stock_id:
/* Randomization of the stocks */
proc sort data=inds out=outds;
  by stock_id date;
run;

%let seed = 10;

data randomized_stocks / view=randomized_stocks;
  set outds;
  by stock_id;
  retain ru;
  if first.stock_id then ru = ranuni(&seed);
  output;
run;

inds is the input dataset with one record per stock_id and date. outds is the same data sorted by
stock_id and date, and the view randomized_stocks adds the random variable ru, which is generated
from the seed. A single random value is generated for each stock_id and held constant across its
dates. Each call with a different seed generates a new set of random numbers for the stock_ids
(see Table 4.1).
stock_id  date      ru (seed=10)  ru (seed=30)
10042     20050831  0.70089       0.10266
10042     20060831  0.70089       0.10266
10042     20070831  0.70089       0.10266
10078     20050831  0.99824       0.99473
10078     20060831  0.99824       0.99473
10078     20070831  0.99824       0.99473

Table 4.1 Sample outputs with different seed
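
For readers on current SAS releases, the same one-value-per-stock idea can be sketched with the RAND function and CALL STREAMINIT, which supersede the older RANUNI function. This is an alternative shown under stated assumptions, not the paper's code: the datasets stocks_demo and randomized_demo are hypothetical, and the values produced will differ from those in Table 4.1 because a different generator is used.

```sas
/* Hypothetical sample: one record per stock_id and date, sorted by stock_id date */
data stocks_demo;
  input stock_id date;
  datalines;
10042 20050831
10042 20060831
10042 20070831
10078 20050831
10078 20060831
10078 20070831
;
run;

data randomized_demo;
  set stocks_demo;
  by stock_id;
  retain ru;
  /* Assumption: seed 10 reused from the text; set once before the first RAND call */
  if _n_ = 1 then call streaminit(10);
  /* Draw once per stock; RETAIN carries the value to the stock's later dates */
  if first.stock_id then ru = rand('uniform');
run;
```

As in the original step, changing the seed changes every stock's value, while rerunning with the same seed reproduces the same draw.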


5. The refill process


Investors buy and hold securities for a certain period of time. During the holding period, some
stocks may disappear due to delisting or may no longer qualify under the initial portfolio setup
criteria. In either case, the size of the portfolio shrinks. To bring the portfolio back to its original
size, a refill process is performed on each user-specified date.

One possible problem that can distort the refill process is that a stock can cease
trading (become “delisted”) and later reappear on the market. If the stock retained the same
randomly assigned priority used in the initial sampling, then it would be included in the refill
event after its re-entry on the market. To avoid this problem we used the following
approach: generate a random number that is tied to the date variable, and assign a stage
variable that counts the stock's on-off appearances, if any. Whenever the stock reappears, generate a new
random number for that stock. Sort the stock pool by date and random number. When it is
time for a refill, the first n stocks (n is the number of stocks requested by the user) are selected
to form the desired portfolio.
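/* Randomization procedure used for portfolio Buy & Hold and Refill process */
data stocks_held(drop=lagdate);
  set stocks;
  by stock_id;
  retain ru stage;
  lagdate = lag(date);   /* LAG is called on every row so its queue stays in step */
  if first.stock_id then do;
    stage = 1;
    ru = date + ranuni(&seed);
  end;
  /* A gap of more than one month means the stock left the market and came back */
  else if intck('month', lagdate, date) > 1 then do;
    stage = stage + 1;
    ru = date + ranuni(&seed);   /* new priority on re-entry */
  end;
run;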

proc sort data=stocks_held;
  by date ru;
run;
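
Once the pool is sorted by date and ru, taking the first n stocks on each date is a short DATA step. The sketch below is one way to do it and is not taken from the original program: the macro variable n_stocks, the datasets held_demo and selected, the counter pick_order, and all sample values are hypothetical.

```sas
%let n_stocks = 2;   /* hypothetical portfolio size */

/* Made-up pool: stock 102 disappears after the first date, stock 105 enters later */
data held_demo;
  input stock_id date ru;
  datalines;
101 19900131 10988.42
102 19900131 10988.07
103 19900131 10988.88
104 19900131 10988.55
101 19900228 10988.42
103 19900228 10988.88
104 19900228 10988.55
105 19900228 11016.31
;
run;

proc sort data=held_demo;
  by date ru;
run;

data selected;
  set held_demo;
  by date;
  if first.date then pick_order = 0;
  pick_order + 1;               /* sum statement: value is retained across rows */
  if pick_order <= &n_stocks;   /* keep the first n stocks on each date */
run;
```

With these sample values, stocks 102 and 101 are selected on the first date. On the second date stock 102 is gone, so stock 101 is kept and stock 104 fills the open slot, while the late entrant 105 sorts last because its ru carries the later date.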

6. Rebalance
Rebalancing brings a portfolio back to its original asset allocation mix. This is necessary
because over time some of the investments may drift out of alignment. Table 6.1 illustrates
a simple example for an equal-weighted portfolio with two stocks.










On Jan 31, 1990, initial cash: $120. Bought 12 shares of stock 1 and 15 shares of stock 2.

          stock_id=1                               stock_id=2                               total amount
Date      price  return  cumret0  money invested   price  return  cumret0  money invested   in portfolio
19900131  $5     .       1        $60              $4     .       1        $60              $120
19900228  $6     0.2000  1.2000   $72              $5     0.2500  1.2500   $75              $147
19900331  $10    0.6667  2.0000   $120             $6     0.2000  1.5000   $90              $210

On April 1, 1990, the portfolio is rebalanced. Initial cash: $210. Sold 1.5 shares of stock 1 and purchased 2.5 shares of stock 2.

19900430  $12    0.2000  2.4000   $126             $10    0.6667  2.5000   $175             $301
19900531  $15    0.2500  3.0000   $157.5           $10    0.0000  2.5000   $175             $332.5
19900630  $18    0.2000  3.6000   $189             $12    0.2000  3.0000   $210             $399

On July 1, 1990, the portfolio is rebalanced. Initial cash: $399. Bought 0.583 shares of stock 1 and sold 0.875 shares of stock 2.

Note: $399 = $210 * [(1+0.2000)*(1+0.2500)*(1+0.2000) + (1+0.6667)*(1+0.0000)*(1+0.2000)] / 2
           = $210 * (3.6000/2.0000 + 3.0000/1.5000) / 2

Table 6.1 Equal-weighted portfolio with two stocks
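
The note under Table 6.1 extends directly to a portfolio of N equally weighted stocks. As a sketch, with $V$ the portfolio value and $C_i$ the cumret0 of stock $i$, each taken at the start and end of one rebalance period (symbols introduced here for illustration):

```latex
V_{\text{end}} = V_{\text{start}} \cdot \frac{1}{N} \sum_{i=1}^{N} \frac{C_{i,\text{end}}}{C_{i,\text{start}}},
\qquad\text{e.g.}\qquad
399 = 210 \cdot \frac{1}{2} \left( \frac{3.6000}{2.0000} + \frac{3.0000}{1.5000} \right)
```

Each ratio inside the sum is the growth of one stock over the period, and the average of those ratios is the growth of the equally weighted portfolio.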

The task is to divide each stock's cumret0 by its cumret0 at the beginning of the rebalance period
(the quotient is denoted by eq_rebal_wgt in the following SAS code). The code uses a SAS hash
object, which can quickly retrieve cumret_rebal_start (cumret0 at the beginning of the rebalance period).
The hash object is uniquely suited to this step in the process. Not only does it provide a quick
lookup of the starting values for each stock, it also easily accommodates the changing composition of
a portfolio and the updating of those values “in place”. The result is listed in Table 6.2.

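/* Equal-weighted portfolio: rebalance weight calculation for a single stock */
data bal_source;
  if _n_ = 1 then do;
    /* Hash table keyed by stock, holding cumret0 at the start of the rebalance period */
    declare hash ht();
    ht.defineKey("stock_id");
    ht.defineData("cumret_rebal_start");
    ht.defineDone();
  end;
  set source_sample end=done;
  if rebal_flag = 1 then do;
    /* First month of a rebalance period: back out this month's return to get the starting value */
    cumret_rebal_start = cumret0 / (1 + return);
    rc = ht.replace();
  end;
  else do;
    /* Later months: look up the stored starting value for this stock */
    rc = ht.find();
  end;
  drop rc;
  eq_rebal_wgt = cumret0 / cumret_rebal_start;
run;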

stock_id=1
date      return  cumret0  rebal_flag  cumret_rebal_start  eq_rebal_wgt
19900131  .       1        1
19900228  0.2000  1.2000   0
19900331  0.6667  2.0000   0
19900430  0.2000  2.4000   1           2.0000
19900531  0.2500  3.0000   0           2.0000
19900630  0.2000  3.6000   0           2.0000              1.8

Table 6.2 Calculation of eq_rebal_wgt
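
To try the hash step on the stock 1 figures from Table 6.2, the input can be typed in directly. The sketch below does that and runs a variant of the step. One deliberate difference is an assumption of this sketch, not the authors' code: a missing return is treated as 0 through COALESCE, so the first month also receives a starting value. The dataset name bal_demo is hypothetical.

```sas
/* Stock 1 rows from Table 6.2 */
data source_sample;
  input date stock_id return cumret0 rebal_flag;
  datalines;
19900131 1 .      1.0 1
19900228 1 0.2000 1.2 0
19900331 1 0.6667 2.0 0
19900430 1 0.2000 2.4 1
19900531 1 0.2500 3.0 0
19900630 1 0.2000 3.6 0
;
run;

data bal_demo;
  if _n_ = 1 then do;
    declare hash ht();
    ht.defineKey("stock_id");
    ht.defineData("cumret_rebal_start");
    ht.defineDone();
  end;
  set source_sample;
  if rebal_flag = 1 then do;
    /* Assumption: a missing return counts as 0 when backing out the start value */
    cumret_rebal_start = cumret0 / (1 + coalesce(return, 0));
    rc = ht.replace();   /* store or overwrite the start value for this stock */
  end;
  else rc = ht.find();   /* retrieve the stored start value */
  eq_rebal_wgt = cumret0 / cumret_rebal_start;
  drop rc;
run;
```

On the last row this gives 3.6 / 2.0 = 1.8, the eq_rebal_wgt shown in Table 6.2. The replace call on 19900430 overwrites the earlier entry for stock 1, which is the "in place" update the text describes.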


Once eq_rebal_wgt is calculated for all the stocks, the rebalance weight for the portfolio
(p_rebal_wt) can easily be calculated by running proc means on eq_rebal_wgt, as follows:

/* Equal-weighted portfolio rebalance weight calculation */
proc means data=bal_source;
  class portfolio_id date;   /* Date here corresponds to rebalance date. */
  var eq_rebal_wgt;
  output out=outds mean(eq_rebal_wgt)=p_rebal_wt;
run;
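
With a CLASS statement, the output dataset also contains overall and one-way summary rows. A common variation, sketched here with hypothetical dataset names (weights_demo, p_weights, portfolio_value) and the end-of-June figures from Table 6.1, adds NWAY so that only the portfolio-by-date rows are kept, then applies the weight to the starting value.

```sas
/* eq_rebal_wgt on 19900630 for the two stocks of Table 6.1: 3.6/2.0 and 3.0/1.5 */
data weights_demo;
  input portfolio_id date stock_id eq_rebal_wgt;
  datalines;
1 19900630 1 1.8
1 19900630 2 2.0
;
run;

/* NWAY keeps only the full portfolio_id x date combinations; NOPRINT suppresses the listing */
proc means data=weights_demo nway noprint;
  class portfolio_id date;
  var eq_rebal_wgt;
  output out=p_weights(drop=_type_ _freq_) mean(eq_rebal_wgt)=p_rebal_wt;
run;

data portfolio_value;
  set p_weights;
  start_value = 210;                      /* value after the April 1 rebalance in Table 6.1 */
  end_value = start_value * p_rebal_wt;   /* 210 * 1.9 = 399 */
run;
```

The result matches the $399 total in Table 6.1 for 19900630.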

Conclusion
This paper focuses on the randomization procedure used for portfolio construction for
backtesting, as well as how the portfolio is refilled and rebalanced during its evolution. The
randomization procedure is designed to accommodate the buy-and-hold strategy of portfolio
management. We also illustrate how a SAS hash object is used for fast and simple retrieval of
stock cumulative returns, making the calculation of multi-stock portfolio returns a simple use of
proc means.

CONTACT INFORMATION
Author: Xuan Liu
Address: Wharton Research Data Services
216 Vance Hall
3733 Spruce St
Philadelphia, PA 19104-6301
Email xuanliu@wharton.upenn.edu

Author: Mark Keintz


Address: Wharton Research Data Services
216 Vance Hall
3733 Spruce St
Philadelphia, PA 19104-6301
Email mkeintz@wharton.upenn.edu

TRADEMARKS
SAS® and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS
Institute Inc. in the USA and other countries. ® Indicates USA registration.
