
Introduction to R for Finance

Shubham Mehta

16/05/2020

Financial returns (1)

Time for some application! Earlier, Lore taught you about financial returns. Now, it's time for you to put
that knowledge to work! But first, a quick review.
Assume you have $100. During January, you make a 5% return on that money. How much do you have at
the end of January? Well, you have 100% of your starting money, plus another 5%: 100% + 5% = 105%.
In decimals, this is 1 + .05 = 1.05. This 1.05 is the return multiplier for January, and you multiply your
original $100 by it to get the amount you have at the end of January.
105 = 100 * 1.05
Or in terms of variables: post_jan_cash <- starting_cash * jan_mult
A quick way to get the multiplier is: multiplier = 1 + (return / 100)
Instructions

• Your new starting cash, January’s return, and January’s return multiplier have been defined for you.
• Use them to calculate post_jan_cash.
• Print post_jan_cash.
• What if the return for January was 10%? Calculate the new jan_mult_10.
• Calculate post_jan_cash_10 using the new multiplier!
• Print post_jan_cash_10 to see the impact of different interest rates!

# Variables for starting_cash and 5% return during January


starting_cash <- 200
jan_ret <- 5
jan_mult <- 1 + (jan_ret / 100)

# How much money do you have at the end of January?


post_jan_cash <- starting_cash * jan_mult

# Print post_jan_cash
post_jan_cash

## [1] 210

# January 10% return multiplier


jan_ret_10 <- 10
jan_mult_10 <- 1 + (jan_ret_10 / 100)

# How much money do you have at the end of January now?
post_jan_cash_10 <- starting_cash * jan_mult_10

# Print post_jan_cash_10
post_jan_cash_10

## [1] 220

Financial returns (2)

Let’s make you some more money. If, in February, you earn another 2% on your cash, how would you
calculate the total amount at the end of February? You already know that the amount at the end of January
is $100 * 1.05 = $105. To get from the end of January to the end of February, just use another multiplier!
$105 * 1.02 = $107.1
Which is equivalent to: $100 * 1.05 * 1.02 = $107.1
In this last form, you see the effect of both multipliers on your original $100. In fact, this form can help
you find the total return over both months. The correct way to do this is by multiplying the two multipliers
together: 1.05 * 1.02 = 1.071. This means you earned 7.1% in total over the 2 month period.
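The two-month arithmetic above can be checked directly in R. This is a minimal sketch using the $100 example from the text; the names total_mult and total_ret are illustrative and do not appear in the exercise.

```r
# Worked example from the text: $100, 5% in January, 2% in February
starting_cash <- 100
jan_mult <- 1 + (5 / 100)
feb_mult <- 1 + (2 / 100)

# Combined multiplier over both months (hypothetical name)
total_mult <- jan_mult * feb_mult

# Total percentage return over the two months (hypothetical name)
total_ret <- (total_mult - 1) * 100

# Cash at the end of February
total_cash <- starting_cash * total_mult

print(total_mult)
print(total_ret)
print(total_cash)
```

The total return of about 7.1% is slightly more than 5 + 2 = 7 because February's 2% is also earned on January's gain.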
Instructions

• Your starting cash, and the returns for January and February have been given.
• Use them to calculate the January and February return multipliers: jan_mult and feb_mult.
• Use those multipliers and starting_cash to find your total_cash at the end of the two months.
• Print total_cash to see how your money has grown!

# Starting cash and returns


starting_cash <- 200
jan_ret <- 4
feb_ret <- 5

# Multipliers
jan_mult <- 1 + (jan_ret / 100)
feb_mult <- 1 + (feb_ret / 100)

# Total cash at the end of the two months


total_cash <- starting_cash * jan_mult * feb_mult

# Print total_cash
total_cash

## [1] 218.4

Data type exploration

To get started, here are some of R’s most basic data types:

• Numerics are decimal numbers like 4.5. A special type of numeric is an integer, which is a numeric
without a decimal piece.

• Integers must be specified like 4L.
• Logicals are the boolean values TRUE and FALSE. Capital letters are important here; true and false
are not valid.
• Characters are text values like “hello world”.
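The types listed above can be inspected with base R's class() and the is.*() helpers. The sketch below uses only literal values, so nothing in it depends on the exercise workspace.

```r
# class() reports the data type of a value
class(4.5)            # "numeric"
class(4L)             # "integer"
class(TRUE)           # "logical"
class("hello world")  # "character"

# Without the L suffix, a whole number is still stored as a plain numeric
class(4)              # "numeric"
is.integer(4)         # FALSE
is.integer(4L)        # TRUE

# An integer still counts as numeric
is.numeric(4L)        # TRUE
```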

Instructions

• Assign the numeric 150.45 to apple_stock.


• Assign the character “AAA” to credit_rating.
• Answer the final question with either TRUE or FALSE; we won’t judge!
• Print my_answer!

# Apple’s stock price is a numeric


apple_stock <- 150.45

# Bond credit ratings are characters


credit_rating <- "AAA"

# You like the stock market. TRUE or FALSE?


my_answer <- TRUE

# Print my_answer
my_answer

## [1] TRUE

What’s that data type?

Up until now, you have been determining what data type a variable is just by looks. There is actually a
better way to check this.
class(my_var)
This will return the data type (or class) of whatever variable you pass in.
The variables a, b, and c have already been defined for you. You can type ls() in the console at any time to
“list” the variables currently available to you. Use the console, and class() to decide which statement below
is correct.
a is a logical, b is a numeric, c is a character
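As a minimal sketch of that check: the exercise does not show how a, b, and c were defined, so the values below are hypothetical stand-ins chosen only to match the stated answer.

```r
# Hypothetical values; the exercise does not show the real definitions
a <- TRUE
b <- 6.5
c <- "C"

# class() returns the data type as a character string
class(a)  # "logical"
class(b)  # "numeric"
class(c)  # "character"

# ls() lists the names of the variables in the current workspace
ls()
```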

c()ombine

Now is where things get fun! It is time to create your first vector. Since this is a finance oriented course, it
is only appropriate that your first vector be a numeric vector of stock prices. Remember, you create a vector
using the combine function, c(), and each element you add is separated by a comma.
For example, this is a vector of Apple’s stock prices from December, 2016:
apple_stock <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12)
And this is a character vector of bond credit ratings:
credit_rating <- c("AAA", "AA", "BBB", "BB", "B")
Instructions

• Another example of a numeric vector for IBM stock prices is shown for you.
• Create a character vector of the finance related words “stocks”, “bonds”, and “investments”, in that
order.
• Create a logical vector of TRUE, FALSE, TRUE in that order.

# Another numeric vector


ibm_stock <- c(159.82, 160.02, 159.84)

# Another character vector


finance <- c("stocks", "bonds", "investments")

# A logical vector
logic <- c(TRUE, FALSE, TRUE)

Coerce it

It is important to remember that a vector can only be composed of one data type. This means that you
cannot have both a numeric and a character in the same vector. If you attempt to do this, the lower ranking
type will be coerced into the higher ranking type.
For example: c(1.5, "hello") results in c("1.5", "hello") where the numeric 1.5 has been coerced into the
character data type.
The hierarchy for coercion is: logical < integer < numeric < character
Logicals are coerced a bit differently depending on what the highest data type is. c(TRUE, 1.5) will return
c(1, 1.5) where TRUE is coerced to the numeric 1 (FALSE would be converted to a 0). On the other hand,
c(TRUE, "this_char") is converted to c("TRUE", "this_char").
The vectors a, b, and c have been defined for you from the following commands:
a <- c(1L, "I am a character")
b <- c(TRUE, "Hello")
c <- c(FALSE, 2)
Which statement is correct about type conversion?
a is a character vector, b is a character vector, c is a numeric vector
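The answer can be confirmed by running the three commands and checking each result with class(). This sketch uses exactly the definitions given in the exercise.

```r
# The three vectors from the exercise
a <- c(1L, "I am a character")
b <- c(TRUE, "Hello")
c <- c(FALSE, 2)

# Each vector ends up with the highest-ranking type among its elements
class(a)  # "character"
class(b)  # "character"
class(c)  # "numeric"

# The coerced values themselves
print(a)  # the integer 1L has become the string "1"
print(b)  # TRUE has become the string "TRUE"
print(c)  # FALSE has become the number 0
```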

Vector names()

Let’s return to the example about January and February’s returns. As a refresher, in January you earned
a 5% return, and in February, an extra 2% return. Being the savvy data scientist you are, you realize that
you can put these returns into a vector! That would look something like this: ret <- c(5, 2)
This is great! Now all of the returns are in one place. However, you could go one step further by adding
names to each return in your vector. You do this using names(). Check this out:
names(ret) <- c("Jan", "Feb")
Printing ret now returns:
Jan Feb
  5   2
Pretty cool, right?
Instructions

• Defined for you are a vector of 12 monthly returns, and a vector of month names.
• Add months as names to ret to create a more descriptive vector.
• Print out ret to see the newly named vector!

# Vectors of 12 months of returns, and month names
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Add names to ret


names(ret) <- months

# Print out ret to see the new names!


print(ret)

## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 5 2 3 7 8 3 5 9 1 4 6 3

Weighted average (1)


As a finance professional, there are a number of important calculations that you will have to know. One
of these is the weighted average. The weighted average allows you to calculate your portfolio return over a
time period. Consider the following example:
Assume you have 40% of your cash in Apple stock, and 60% of your cash in IBM stock. If, in January, Apple
earned 5% and IBM earned 7%, what was your total portfolio return?
To calculate this, take the return of each stock in your portfolio, and multiply it by the weight of that stock.
Then sum up all of the results. For this example, you would do: 6.2 = 5 * .4 + 7 * .6
Or, in variable terms: portf_ret <- apple_ret * apple_weight + ibm_ret * ibm_weight
Instructions

• Weights and returns for Microsoft and Sony have been defined for you.
• Calculate the portf_ret for this portfolio.

# Weights and returns


micr_ret <- 7
sony_ret <- 9
micr_weight <- .2
sony_weight <- .8

# Portfolio return
portf_ret <- micr_ret * micr_weight + sony_ret * sony_weight

Weighted average (2)


Wait a minute, Lore taught us a much better way to do this! Remember, R does arithmetic with vectors!
Can you take advantage of this fact to calculate the portfolio return more efficiently? Think carefully about
the following code:
ret <- c(5, 7)
weight <- c(.4, .6)
ret_X_weight <- ret * weight
sum(ret_X_weight)
[1] 6.2
First, calculate ret * weight, which multiplies each element in the vectors together to create a new vector
ret_X_weight. All you need to do then is add up the pieces, so you use sum() to sum up each element in
the vector.
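As a cross-check on the multiply-then-sum approach, base R also provides weighted.mean(), which computes sum(x * w) / sum(w). The sketch below uses the Apple and IBM numbers from the example; the name portf_ret_wm is illustrative and not part of the exercise.

```r
ret <- c(5, 7)        # Apple and IBM returns from the example, in percent
weight <- c(.4, .6)   # portfolio weights, summing to 1

# Vectorized multiplication followed by sum(), as in the text
portf_ret <- sum(ret * weight)

# weighted.mean() divides by sum(weight), which is 1 here
portf_ret_wm <- weighted.mean(ret, weight)

print(portf_ret)
print(portf_ret_wm)

# all.equal() compares with a small tolerance, which suits decimal arithmetic
all.equal(portf_ret, portf_ret_wm)
```

The two results agree only because the weights sum to 1; with weights that do not, weighted.mean() rescales them while sum(ret * weight) does not.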

Now it's your turn!
Instructions

• ret and weight for Microsoft and Sony are defined for you again, but this time, in vector form!
• Add company names to your ret and weight vectors.
• Use vectorized arithmetic to multiply ret and weight together.
• Print ret_X_weight to see the results.
• Use sum() to get the total portf_ret.
• Print portf_ret and compare to the last exercise!

# Weights, returns, and company names


ret <- c(7, 9)
weight <- c(.2, .8)
companies <- c("Microsoft", "Sony")

# Assign company names to your vectors


names(ret) <- companies
names(weight) <- companies

# Multiply the returns and weights together


ret_X_weight <- ret * weight

# Print ret_X_weight
ret_X_weight

## Microsoft Sony
## 1.4 7.2

# Sum to get the total portfolio return


portf_ret <- sum(ret_X_weight)

# Print portf_ret
portf_ret

## [1] 8.6

Weighted average (3)

Let’s look at an example of recycling. What if you wanted to give equal weight to your Microsoft and Sony
stock returns? That is, you want to be invested 50% in Microsoft and 50% in Sony.
ret <- c(7, 9)
weight <- .5
ret_X_weight <- ret * weight
ret_X_weight
[1] 3.5 4.5
ret is a vector of length 2, and weight is a vector of length 1. R reuses the .5 in weight twice to make it the
same length as ret, then performs the element-wise arithmetic.
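Recycling also applies when the shorter vector does not divide evenly into the longer one, in which case R still returns a result but raises a warning. The sketch below illustrates this with three hypothetical returns (ret3 is not taken from the exercise) and captures the warning text with tryCatch().

```r
# Hypothetical returns for three stocks (not taken from the exercise)
ret3 <- c(7, 9, 11)

# Length 1: the single weight is reused for every element
ret3 * (1/3)

# Length 2 against length 3: R reuses .2 for the third element and warns.
# tryCatch() captures the warning text instead of printing it.
warning_text <- tryCatch(
  ret3 * c(.2, .6),
  warning = function(w) conditionMessage(w)
)
print(warning_text)

# The result itself, with the warning suppressed: 7 * .2, 9 * .6, 11 * .2
partial <- suppressWarnings(ret3 * c(.2, .6))
print(partial)
```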
Instructions

• A named vector, ret, containing the returns of 3 stocks is in your workspace.

• Print ret to see the returns of your 3 stocks.
• Assign the value of 1/3 to weight. This will be the weight that each stock receives.
• Create ret_X_weight by multiplying ret and weight. See how R recycles weight?
• sum() the ret_X_weight variable to create your equally weighted portf_ret.
• Run the last line of code multiplying a vector of length 3 by a vector of length 2. R reuses the 1st
value of the vector of length 2, but notice the warning!

# Print ret
print(ret)

## Microsoft Sony
## 7 9

# Assign 1/3 to weight


weight <- 1/3

# Create ret_X_weight
ret_X_weight <- ret * weight

# Calculate your portfolio return


portf_ret <- sum(ret_X_weight)

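# Vector of length 3 * Vector of length 2?
# Note: in this run, ret holds only the two returns printed above, so the
# lengths match and R gives no recycling warning.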


ret * c(.2, .6)

## Microsoft Sony
## 1.4 5.4

Vector subsetting

Sometimes, you will only want to use specific pieces of your vectors, and you’ll need some way to access just
those parts. For example, what if you only wanted the first month of returns from the vector of 12 months
of returns? To solve this, you can subset the vector using [ ].
Here is the 12 month return vector: ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
Select the first month: ret[1]. Select the first month by name: ret["Jan"]. Select the first three months:
ret[1:3] or ret[c(1, 2, 3)].
Instructions

• The named vector ret is defined in your workspace.


• Subset the first 6 months of returns.
• Subset only March and May’s returns using c() and “Mar”, “May”.
• Run the last line of code to perform a subset that omits the first month of returns.

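# Assumption: ret is the named 12-month return vector from the "Vector names()" exercise
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
names(ret) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# First 6 months of returns
ret[1:6]

## Jan Feb Mar Apr May Jun
##   5   2   3   7   8   3

# Just March and May
ret[c("Mar", "May")]

## Mar May
##   3   8

# Omit the first month of returns
ret[-1]

## Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
##   2   3   7   8   3   5   9   1   4   6   3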

Matrix <- bind vectors

Often, you won’t create a matrix by hand. Instead, you will build one from multiple vectors that you
want to combine. For this, it is easiest to use the functions cbind() and rbind() (column bind and
row bind, respectively). To see these in action, let’s combine two vectors of Apple and IBM stock
prices:
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)
cbind(apple, ibm)
      apple    ibm
[1,] 109.49 159.82
[2,] 109.90 160.02
[3,] 109.11 159.84
[4,] 109.95 160.35
[5,] 111.03 164.79
rbind(apple, ibm)
        [,1]   [,2]   [,3]   [,4]   [,5]
apple 109.49 109.90 109.11 109.95 111.03
ibm   159.82 160.02 159.84 160.35 164.79
Now it's your turn!
Instructions

• The apple, ibm, and micr stock price vectors from December, 2016 are in your workspace.
• Use cbind() to column bind apple, ibm, and micr together, in that order, as cbind_stocks.
• Print cbind_stocks.
• Use rbind() to row bind the three vectors together, in the same order, as rbind_stocks.
• Print rbind_stocks.

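apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82,
           115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)

micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58, 62.30, 63.62,
          63.54, 63.54, 63.55, 63.24, 63.28, 62.99, 62.90, 62.14)

ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79, 165.36, 166.52, 165.50, 168.29, 168.51, 168.02,
         166.73, 166.68, 167.60, 167.33, 167.06, 166.71, 167.14, 166.19, 166.60, 165.99)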

# cbind the vectors together


cbind_stocks <- cbind(apple, ibm, micr)

# Print cbind_stocks
print(cbind_stocks)

## apple ibm micr
## [1,] 109.49 159.82 59.20
## [2,] 109.90 160.02 59.25
## [3,] 109.11 159.84 60.22
## [4,] 109.95 160.35 59.95
## [5,] 111.03 164.79 61.37
## [6,] 112.12 165.36 61.01
## [7,] 113.95 166.52 61.97
## [8,] 113.30 165.50 62.17
## [9,] 115.19 168.29 62.98
## [10,] 115.19 168.51 62.68
## [11,] 115.82 168.02 62.58
## [12,] 115.97 166.73 62.30
## [13,] 116.64 166.68 63.62
## [14,] 116.95 167.60 63.54
## [15,] 117.06 167.33 63.54
## [16,] 116.29 167.06 63.55
## [17,] 116.52 166.71 63.24
## [18,] 117.26 167.14 63.28
## [19,] 116.76 166.19 62.99
## [20,] 116.73 166.60 62.90
## [21,] 115.82 165.99 62.14

# rbind the vectors together


rbind_stocks <- rbind(apple, ibm, micr)

# Print rbind_stocks
print(rbind_stocks)

## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## apple 109.49 109.90 109.11 109.95 111.03 112.12 113.95 113.30 115.19 115.19
## ibm 159.82 160.02 159.84 160.35 164.79 165.36 166.52 165.50 168.29 168.51
## micr 59.20 59.25 60.22 59.95 61.37 61.01 61.97 62.17 62.98 62.68
## [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
## apple 115.82 115.97 116.64 116.95 117.06 116.29 116.52 117.26 116.76 116.73
## ibm 168.02 166.73 166.68 167.60 167.33 167.06 166.71 167.14 166.19 166.60
## micr 62.58 62.30 63.62 63.54 63.54 63.55 63.24 63.28 62.99 62.90
## [,21]
## apple 115.82
## ibm 165.99
## micr 62.14

Correlation using vectors and binds
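The section title points at correlation. As a minimal sketch, assuming base R's cor() is the intended tool (the text does not confirm this), the code below correlates the five-day Apple and IBM prices from the cbind() example earlier. The names apple5, ibm5, stocks5, apple_ibm_cor, and cor_matrix are illustrative.

```r
# First five prices of each stock, copied from the earlier cbind() example
apple5 <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm5 <- c(159.82, 160.02, 159.84, 160.35, 164.79)

# Correlation between two vectors: a single number between -1 and 1
apple_ibm_cor <- cor(apple5, ibm5)
print(apple_ibm_cor)

# Correlation matrix of a column-bound matrix: one row and column per stock
stocks5 <- cbind(apple = apple5, ibm = ibm5)
cor_matrix <- cor(stocks5)
print(cor_matrix)
```

Passing a matrix to cor() returns all pairwise correlations at once, which is why binding the price vectors into columns first is convenient.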


Create a factor

Bond credit ratings are common in the fixed income side of the finance world as a simple measure of how
“risky” a certain bond might be. Here, riskiness can be defined as the probability of default, which means
an inability to pay back your debts. The Standard and Poor’s and Fitch credit rating agencies have defined
the following ratings, from least likely to default to most likely: AAA, AA, A, BBB, BB, B, CCC, CC, C, D.
This is a perfect example of a factor! It is a categorical variable that takes on a limited number of levels.
To create a factor in R, use the factor() function, and pass in a vector that you want to be converted into a
factor.
Suppose you have a portfolio of 7 bonds with these credit ratings:
credit_rating <- c("AAA", "AA", "A", "BBB", "AA", "BBB", "A")
To create a factor from this: factor(credit_rating)
[1] AAA AA A BBB AA BBB A
Levels: A AA AAA BBB
A new character vector, credit_rating has been created for you in the code for this exercise.
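Note that the levels printed above are sorted alphabetically (A, AA, AAA, BBB), not by credit quality. If rating order matters, factor() accepts a levels argument. The sketch below is one way to do that, assuming you want the levels listed from least to most likely to default; rating_order and credit_factor_by_rating are illustrative names.

```r
credit_rating <- c("AAA", "AA", "A", "BBB", "AA", "BBB", "A")

# Default: levels are sorted alphabetically
default_levels <- levels(factor(credit_rating))
print(default_levels)

# Assumption: levels should follow rating order, least to most likely to default
rating_order <- c("AAA", "AA", "A", "BBB")
credit_factor_by_rating <- factor(credit_rating, levels = rating_order)
print(levels(credit_factor_by_rating))

# The values are unchanged; only the level order differs
print(as.character(credit_factor_by_rating))
```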
Instructions

• Turn credit_rating into a factor using factor(). Assign it to credit_factor.


• Print out credit_factor.
• Call str() on credit_rating to note the structure.
• Call str() on credit_factor and compare the structure to credit_rating.
• Use levels() on credit_factor to identify the unique levels.
• Using the same “1A”, “2A” notation as in the example, rename the levels of credit_factor. Pay close
attention to the level order!
• Print the renamed credit_factor.
• First call summary() on credit_rating. Does this seem useful?
• Now try summary() again, but this time on credit_factor.

# credit_rating character vector


credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")

# Create a factor from credit_rating


credit_factor <- factor(credit_rating)

# Print out your new factor


print(credit_factor)

## [1] BB AAA AA CCC AA AAA B BB


## Levels: AA AAA B BB CCC

# Call str() on credit_rating


str(credit_rating)

## chr [1:8] "BB" "AAA" "AA" "CCC" "AA" "AAA" "B" "BB"

# Call str() on credit_factor


str(credit_factor)

## Factor w/ 5 levels "AA","AAA","B",..: 4 2 1 5 1 2 3 4

# Identify unique levels


levels(credit_factor)

## [1] "AA" "AAA" "B" "BB" "CCC"

# Rename the levels of credit_factor


levels(credit_factor) <- c("2A", "3A", "1B", "2B", "3C")

# Print credit_factor
print(credit_factor)

## [1] 2B 3A 2A 3C 2A 3A 1B 2B
## Levels: 2A 3A 1B 2B 3C

# Summarize the character vector, credit_rating


summary(credit_rating)

## Length Class Mode


## 8 character character

# Summarize the factor, credit_factor


summary(credit_factor)

## 2A 3A 1B 2B 3C
## 2 2 1 2 1

# Visualize your factor!


plot(credit_factor)
[Figure: bar plot of credit_factor. x-axis: levels 2A, 3A, 1B, 2B, 3C; y-axis ticks from 0.0 to 2.0.]

Bucketing a numeric variable into a factor

Your old friend Dan sent you a list of 50 AAA rated bonds called AAA_rank, with each bond having an
additional number from 1-100 describing how profitable he thinks that bond will be (100 being the most
profitable). You are interested in doing further analysis on his suggestions, but first it would be nice if the

bonds were bucketed by their ranking somehow. This would help you create groups of bonds, from least
profitable to most profitable, to more easily analyze them.
This is a great example of creating a factor from a numeric vector. The easiest way to do this is to use
cut(). Below, Dan’s 1-100 ranking is bucketed into 5 evenly spaced groups. Note that the ( in the factor
levels means we do not include the number beside it in that group, and the ] means that we do include that
number in the group.
head(AAA_rank)
[1] 31 48 100 53 85 73
AAA_factor <- cut(x = AAA_rank, breaks = c(0, 20, 40, 60, 80, 100))
head(AAA_factor)
[1] (20,40] (40,60] (80,100] (40,60] (80,100] (60,80]
Levels: (0,20] (20,40] (40,60] (60,80] (80,100]
In the cut() function, using breaks = allows you to specify the groups that you want R to bucket your data
by!
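cut() can also name the buckets in the same call through its labels argument, and its include.lowest argument controls whether the lowest break itself is included. The sketch below applies both to the six rankings shown by head(AAA_rank); the breaks of 25 and the names rank_head and rank_factor are choices made for this illustration.

```r
# First six rankings shown by head(AAA_rank) in the text
rank_head <- c(31, 48, 100, 53, 85, 73)

# labels = names the buckets directly, instead of renaming with levels() later
rank_factor <- cut(
  x = rank_head,
  breaks = c(0, 25, 50, 75, 100),
  labels = c("low", "medium", "high", "very_high")
)
print(rank_factor)

# Intervals are open on the left, so a value of exactly 0 falls outside (0,25]
outside <- cut(0, breaks = c(0, 25, 50, 75, 100))
print(outside)

# include.lowest = TRUE closes the first interval on the left
inside <- cut(0, breaks = c(0, 25, 50, 75, 100), include.lowest = TRUE)
print(inside)
```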
Instructions

• Instead of 5 buckets, can you create just 4? In breaks = use a vector from 0 to 100 where each element
is 25 numbers apart.

• Assign it to AAA_factor.
• The 4 buckets do not have very descriptive names. Use levels() to rename the levels to “low”, “medium”,
“high”, and “very_high”, in that order.
• Print the newly named AAA_factor.
• Plot the AAA_factor to visualize your work!

AAA_rank <- c(31, 48, 100, 53, 85, 73, 62, 74, 42, 38, 97, 61, 48, 86, 44, 9, 43, 18, 62, 38, 23, 37,

# Create 4 buckets for AAA_rank using cut()


AAA_factor <- cut(x = AAA_rank, breaks = c(0, 25, 50, 75, 100))

# Rename the levels


levels(AAA_factor) <- c("low", "medium", "high", "very_high")

# Print AAA_factor
print(AAA_factor)

## [1] medium medium very_high high very_high high high


## [8] high medium medium very_high high medium very_high
## [15] medium low medium low high medium low
## [22] medium high very_high very_high very_high medium very_high
## [29] low low low medium very_high low very_high
## [36] low very_high low low high medium medium
## [43] medium low low low low medium medium
## [50] medium
## Levels: low medium high very_high

# Plot AAA_factor
plot(AAA_factor)

[Figure: bar plot of AAA_factor. x-axis: levels low, medium, high, very_high; y-axis ticks from 0 to 15.]
