Univariate Random Variables

Alright. So, we're going to talk about random variables and probability review.
And so, again, this is a course about modeling financial data, and the primary thing that we're going to be looking at is asset returns. We went over return calculations before. Say we're investing in Microsoft stock: we buy it today, we hold it for a month, and one month from now we sell it. We can calculate the percentage change in price, and that's the rate of return. But from the point of view of today, the rate of return on this one-month investment is not known, because we need to know what the price next month is in order to calculate that return. So, we can think of the rate of return as a random variable, because the return depends upon the future price, which is not known, and so the outcome is uncertain.

A random variable is a variable that can take on a set of possible values called the sample space. The future price could go up, it could go down. So, there is a list of potential values of the future price, and we attach a probability to each of those values, and that gives us the probability distribution over those potential values. What we are going to review today is various mathematical ways of describing random variables and distributions, with several examples geared towards thinking about asset returns and the properties of probability distributions for asset returns that look reasonable.

So, some examples. Typically, an uppercase letter denotes a random variable. So, uppercase X is, in this case, the price of Microsoft stock next month. It's a random variable because we don't know what the future price is going to be. What's the sample space of the future prices? Well, the sample space is the set of possible values that we think the prices can take on. Now, prices can't become negative, so we know that the sample space is going to be, say, the positive values.
And if the stock is trading, if the company doesn't go out of business, then the price should be positive. Now, prices can't go up to infinity, so there's some realistic upper bound on prices, and that upper bound might be $1,000 or something like that. So, we can think of future prices as lying anywhere, any real number, say, between zero and some big number M like 1,000. So, that would be a characterization of a random variable, the future price, and its sample space.

Another random variable could be the rate of return on the investment, which is the percentage change in price. Now, what is the appropriate sample space for a rate of return? What's the smallest value that a rate of return can take on? The worst you can do is lose all your money, right? So, if you buy Microsoft today for 30, and Microsoft goes bankrupt and its price goes to zero in the future, then the percentage change in price is minus 100%. So, returns are bounded from below by -1; you can't lose any more than that. And the upper bound, again, is some big positive number. Say Microsoft invents some product that revolutionizes the world and its price shoots up. But it's not going to go off to infinity. So, there's some reasonable upper bound associated with that.

Another random variable we can think of, and one that's used a lot in probability modeling in finance, is a discrete random variable that just takes on two values. So, we're going to set X equal to one if the stock price goes up, and X equal to zero if the stock price goes down. This is sort of like a coin-flipping example. You flip the coin and it lands heads, that's like the stock price going up. You flip the coin and it lands tails, that's like the stock price going down, and then we're just coding the random variable to be 1 or 0 based upon those events. And here, the sample space is very simple, with just two values, zero and one. So, those are examples of random variables that we'll be looking at. Once we have these random variables, we need to characterize their probability distributions.
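To make the two return-based random variables above concrete, here is a small Python sketch. The lecture itself works in Excel and R; Python is used here only for illustration. The function names `simple_return` and `up_indicator` are hypothetical, and the future prices are made-up values. The lecture does not say how an unchanged price should be coded, so treating it as 0 is an assumption.

```python
def simple_return(price_now, price_future):
    """Simple rate of return: percentage change in price, as a decimal."""
    if price_now <= 0:
        raise ValueError("price_now must be positive")
    return (price_future - price_now) / price_now


def up_indicator(price_now, price_future):
    """Code the up/down event as 1 (price went up) or 0 (otherwise).

    Assumption: an unchanged price is coded as 0.
    """
    return 1 if price_future > price_now else 0


# Buy at 30; the future prices below are illustrative assumptions.
for future_price in (0.0, 27.0, 33.0):
    r = simple_return(30.0, future_price)
    print(future_price, round(r, 4), up_indicator(30.0, future_price))
```

A future price of zero gives a return of exactly -1, which is the lower bound of the sample space for returns discussed above.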
Now, in probability models, we usually distinguish between two types of random variables. The first are what are called discrete random variables. A discrete random variable is a random variable that can only take on a finite set of values (or, more generally, a countable set). So, in the last example, the up-down indicator took on two values, zero and one. That's a discrete random variable, and its sample space is two discrete points. The probability distribution of a discrete random variable is a function, say, P(x), such that P(x) is the probability that the random variable X is equal to little x. So typically, in notation, a capital letter denotes the random variable and a lowercase letter denotes a value in the sample space that the random variable can take on.

Now, this probability function must satisfy certain conditions in order to be a valid probability function. Probabilities are greater than or equal to zero for all values in the sample space. Probabilities are equal to zero for values outside of the sample space. We assume that all the values in the sample space are discrete, distinct points. The sum of the probabilities of the values in the sample space is equal to 100%. And probabilities have to be less than or equal to one, as well.

So, as an example of a simple discrete random variable and a probability distribution, here's a case where X is going to denote the annual rate of return on Microsoft stock. This is an example that's kind of like the case where you might have a stock analyst who's working for an investment bank, and they're doing some fundamental analysis, and they need to make some forecast of what the rate of return might be over the next year. And the analyst might have a very simplified way of viewing the world: over the next year, the price of Microsoft stock is primarily contingent upon what's going to happen to the state of the economy. And so, the analyst says, well, I think there are really five potential states of the economy. Depression is a very bad state, and if the economy is in a depression, then Microsoft is going to lose 30%. And the analyst puts a probability of 5% on that event occurring. And then, another state of the world could be, say, a recession.
And if a recession happens, then Microsoft has an annual return of 0%, and the analyst puts a probability of 20% on that. And then similarly, normal, mild boom, and major boom are the other states of the world, and we see that as the economy gets better, the rate of return goes up, and then we have these probabilities. Notice that the normal state of the world is the state that gets the highest probability. Now, this is a valid probability distribution: all the probabilities are between zero and one, and the probabilities add up to one.

Now, in this case, where do these probabilities come from? Here, the probabilities are the subjective beliefs of the analyst. They may have nothing to do with the "real world," in quotation marks. They are just views or opinions held by the analyst. In probability theory, there are generally two ways that we view probability. One way, the subjective approach, is that probabilities are opinions or degrees of belief. This is associated with what is often referred to as Bayesian statistics. The other approach views probabilities as real, physical, objective things. So, for the objective view of probability, think of the coin-flipping example. Say you have a fair coin, that is, it's weighted such that the probability that the coin lands heads is 50% and the probability that it lands tails is 50%. So, where does that probability of 50% come from? Well, it's an objective feature of the coin and of the experiment of flipping the coin. The idea is that when you flip a fair coin in an experimental setting, say, a million times, the probability that the coin lands heads is, in the long run, the fraction of the times in your experiment that you actually observe the coin landing heads. And the probability that it lands tails is the fraction of times you actually observe the coin landing tails. So there, you think of probability as being a property of the coin and the experimental setting, something that you can repeat over and over again and reproduce. Whereas in the opinion approach, there's no experiment going on. You can't repeat the analyst's views over and over again. It's just a degree of belief.
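The objective, repeated-experiment view of probability can be illustrated with a short simulation. This is only a sketch in Python (the lecture uses Excel and R). The function name `heads_frequency`, the seed, and the numbers of flips are assumptions chosen for the example.

```python
import random


def heads_frequency(n_flips, p_heads=0.5, seed=0):
    """Fraction of simulated coin flips that land heads."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(1 for _ in range(n_flips) if rng.random() < p_heads)
    return heads / n_flips


# The observed fraction of heads tends to settle near 0.5 as flips increase.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

With only a few flips the observed fraction can be far from 50%. It is the long-run fraction that the objective view identifies with the probability.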
Both views of probability are equally valid, but within the statistics literature there are often very strong opinions about what is the right way or the superior way to view probability. In this class, we won't make a judgment on that; we'll essentially just take probabilities as given and do computations.

So, one of the things I want to show you is an Excel spreadsheet that I have. Now, Excel is not necessarily a good tool for doing probability calculations. The point here is just to illustrate how to do certain computations in Excel and show some graphics and things like that. R is a much better environment for doing probability calculations. But for my example, I just laid it out in an Excel spreadsheet: I have my returns, I have my probabilities, and then I can do a nice simple bar chart to represent the probability distribution. In the graphical representation of the distribution, I put the values in the sample space on the horizontal axis, and then the heights of the bars just represent the probabilities. From this graphical representation, we can clearly see that the most likely value is around 1%. The shape of the distribution is symmetric, and this is the middle of the distribution: the shape to the left of the middle and the shape to the right of the middle are the same.

We can do the same computation in R. Again, this is just based on the examples in my lecture notes. If I wanted to plot the probability distribution in R, I would just create a vector that represents the values in the sample space, create a vector of probabilities, and then do a bar plot. And that gives me the same picture here.

A more mathematical model for a discrete random variable is based on what's called the Bernoulli distribution. The Bernoulli distribution is a probability model that essentially describes the coin-flipping experiment. We have two mutually exclusive events, generically called success and failure. So, in modeling stock prices, a success event is that the stock price goes up, and a failure event is that the stock price goes down. That's assuming you have a long position in the asset. If you have a short position, then the stock price going up is the failure, and the stock price going down is the success. So, in investments, a long position means you buy something today.
You hold it, and you sell it in the future. A short position means you sell it today, and then you buy it back in the future. So, when you're long something, you're hoping the price will go up. When you're short something, you're hoping the price will go down.

Just as an aside, because this is going to come up later on, here is how shorting typically works. When you short a stock, you typically open a brokerage account at, say, Fidelity or E-trade or something like that. Say you want to short Microsoft. Well, you are going to sell something you don't own. So, how do you do that? If you have a brokerage account, you can borrow the stock from somebody who owns it, sell it, and hold on to the proceeds. But because you borrowed it, you have to give it back at some point. So, when you close out the short position, you go back into the market, you buy the stock back, and then you return it to whoever you borrowed it from. So, you want to think of that transaction as taking place when you do a short sale.

Alright. So, let's go back to the Bernoulli example. Let's say X = 1 if a success occurs and X = 0 if a failure occurs. So, coin lands heads, success; coin lands tails, failure. Now, the probability that we have a success, that X = 1, is equal to pi, where pi is some number between zero and one. And then the probability of a failure, that X = 0, is 1 - pi. We have two events, and the probabilities always have to add to one, so pi + (1 - pi) = 1. Now, a simple mathematical model for this probability distribution is P(x) = pi^x (1 - pi)^(1 - x), where x only takes two values, zero and one. So, this P(x) function gives us our probabilities. Notice that when x is zero, P(0) is pi^0 (1 - pi)^(1 - 0). Now, pi^0 is 1, and (1 - pi)^1 is 1 - pi, so P(0) = 1 - pi. And when x = 1, P(1) is pi^1 (1 - pi)^(1 - 1), so that's equal to pi. So, this very simple mathematical representation gives us our probability function for the Bernoulli distribution.

Now, the other type of random variables that we look at are called continuous random variables. A continuous random variable is a random variable that can take on any real value, alright?
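Returning to the Bernoulli probability function P(x) = pi^x (1 - pi)^(1 - x) described above, a minimal Python sketch can confirm that it reproduces P(1) = pi and P(0) = 1 - pi. The function name `bernoulli_pmf` and the value pi = 0.6 are assumptions for illustration only; the lecture does not give a specific value of pi.

```python
def bernoulli_pmf(x, pi):
    """P(x) = pi**x * (1 - pi)**(1 - x) for x in {0, 1}; zero elsewhere."""
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie between 0 and 1")
    if x not in (0, 1):
        return 0.0  # values outside the sample space get probability zero
    return pi**x * (1.0 - pi) ** (1 - x)


pi = 0.6  # illustrative success probability, not from the lecture
print(bernoulli_pmf(1, pi))                         # probability of success
print(bernoulli_pmf(0, pi))                         # probability of failure
print(bernoulli_pmf(0, pi) + bernoulli_pmf(1, pi))  # the two should sum to one
```

Writing the two cases as one formula is convenient because the exponent simply switches each factor on or off depending on whether x is 1 or 0.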
And so now, we talk about the probability density function of a continuous random variable. We're going to have a probability function that we denote f(x), to distinguish it from P(x) for the discrete random variable. And this f(x) represents what's often referred to as the probability curve. Now, the probability curve is such that, if A is any interval on the real line, the probability that the continuous random variable is in this interval is equal to the integral of the probability curve over that interval. In other words, the probability that X is in this interval is the area under the probability curve over this particular interval. So, we have a continuous random variable, we have a probability curve, and probabilities are associated with areas under the curve. Now, this probability curve must satisfy two conditions. It is never negative, because we want to compute areas under the curve. And the total area under the probability curve is equal to 100%. That's the idea that all the probabilities add to one.

So, if we think of a probability curve here, let's say X is a continuous random variable, and here is its probability curve, and I've drawn it, suggestively, like a bell-shaped curve, like a normal distribution, which we will talk about later. And we want to say, what is the probability that this random variable is between -2 and 1? So, our interval is between -2 and 1, and the probability of this event is equal to the area under the probability curve over this interval. So, we see one of the reasons why calculus is useful in probability theory: in order to calculate probabilities, we have to find the area under a curve, and in order to find the area under a curve, we have to integrate the probability function. So, when you take a more mathematically oriented probability theory course, you do a lot of calculus to do these types of calculations. In this class, we are not going to do the calculus calculations. We need to know the concept, and then, even if we have to integrate something, we can do it in R numerically. So, if we want to find the area under the curve, we can write a function in R to represent the probability curve, and then we can use the function called integrate to numerically calculate the area under the curve for us.
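The lecture does this calculation with R's integrate function. As a rough stand-in, here is a Python sketch that uses a hand-written Simpson's rule, which is a different numerical method from the one R uses. It assumes the bell-shaped curve is a standard normal (mean 0, standard deviation 1), since the lecture only says the curve looks "like a normal distribution." The names `normal_pdf`, `integrate_simpson`, and `normal_cdf` are hypothetical.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Bell-shaped probability curve f(x) of a normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def integrate_simpson(f, lower, upper, n_intervals=1000):
    """Approximate the area under f between lower and upper (Simpson's rule)."""
    if n_intervals % 2 == 1:
        n_intervals += 1  # Simpson's rule needs an even number of intervals
    h = (upper - lower) / n_intervals
    total = f(lower) + f(upper)
    for i in range(1, n_intervals):
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(lower + i * h)
    return total * h / 3.0


def normal_cdf(x):
    """Closed-form standard normal CDF via the error function (cross-check)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# P(-2 <= X <= 1) as the area under the probability curve over [-2, 1]
area = integrate_simpson(normal_pdf, -2.0, 1.0)
print(area, normal_cdf(1.0) - normal_cdf(-2.0))
```

The two printed numbers should agree closely, which is the point made in the lecture: the probability is an area, and a numerical integrator can compute it without doing the calculus by hand.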
So, we're going to be using tools that will do the calculus for us, but we need to know the concept of what's going on. A very simple example of a continuous random variable and its distribution is the so-called uniform distribution over the interval [a, b]. So we say X is distributed uniform. This is a bit of notation in probability theory. X represents a random variable. The little squiggle character, ~, is to be read as "is distributed as." So, X is distributed as uniform over the interval [a, b].

The probability curve of the uniform distribution is a rectangle. If we want to think about the uniform distribution, we have some interval a to b, and we know that the area under the probability curve has to equal 100%. And the idea of a uniform distribution is that the probability over any interval of the same length is the same. So, it's a way of thinking of equal probability for events of the same size, so to speak. If this total area has to be 100%, then we know that height times width is equal to one, and the width is b - a, so the height of the probability curve is 1 / (b - a). So, that represents the probability curve for a uniform random variable. And we know that this probability curve is greater than or equal to zero provided the right end point is bigger than the left end point. And we know that the total area under this curve, height times width, or the integral if we do the integration, is equal to one.
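The rectangle argument above can be checked with a few lines of Python. This is a sketch only: the function names `uniform_pdf` and `uniform_prob` are hypothetical, and the end points a = 0 and b = 4 are illustrative values that do not come from the lecture.

```python
def uniform_pdf(x, a, b):
    """Height of the uniform probability curve: 1 / (b - a) on [a, b]."""
    if b <= a:
        raise ValueError("need b > a")
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(lower, upper, a, b):
    """P(lower <= X <= upper) as a rectangle area: height times width."""
    if b <= a:
        raise ValueError("need b > a")
    lo, hi = max(lower, a), min(upper, b)  # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)


a, b = 0.0, 4.0  # illustrative end points, not from the lecture
print(uniform_pdf(1.0, a, b))    # height of the rectangle
print(uniform_prob(a, b, a, b))  # total area under the curve
# Two intervals of the same length get the same probability.
print(uniform_prob(0.5, 1.5, a, b), uniform_prob(2.5, 3.5, a, b))
```

Because the curve is flat, no numerical integration is needed here: every probability is just the width of the interval divided by b - a.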
