
# The Memoryless Property

Theorem 1. Let X be an exponential random variable with parameter λ > 0. Then X has the memoryless property, which means that for any two real numbers a, b > 0, P(X > a + b | X > b) = P(X > a).

WARNING: This is not saying that P(X > a + b | X > b) = P(X > a + b). That would mean - literally - that the future values of X are independent of the past, which is not correct. It is simply saying - intuitively - that the probability that X is greater than some positive value does not remember the past; it may still depend on the past, however.

Proof. First, we'll derive an expression for P(X > t) for any t > 0.
$$P(X > t) = 1 - P(X \le t) = 1 - \int_0^t \lambda e^{-\lambda x}\, dx = 1 + e^{-\lambda t} - 1 = e^{-\lambda t}$$
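As a quick numerical sanity check of this survival function, here is a minimal Monte Carlo sketch. It is my own addition, not part of the original notes: it assumes NumPy is available, and the values λ = 0.5 and t = 2.0 are arbitrary illustrative choices.

```python
import numpy as np

# Monte Carlo check of the survival function P(X > t) = e^(-lambda * t).
rng = np.random.default_rng(0)
lam = 0.5   # rate parameter lambda (arbitrary choice for illustration)
t = 2.0     # threshold (arbitrary choice)

# NumPy parametrizes the exponential by its scale, which is 1/lambda.
samples = rng.exponential(scale=1.0 / lam, size=1_000_000)

empirical = np.mean(samples > t)   # fraction of samples exceeding t
closed_form = np.exp(-lam * t)     # e^(-lambda * t)

print(f"empirical  P(X > t) = {empirical:.4f}")
print(f"closed form e^(-lt) = {closed_form:.4f}")
```

With a million samples, the two printed values should agree to roughly three decimal places.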

Now, compute P(X > a + b | X > b) using the definition of conditional probability.

$$P(X > a + b \mid X > b) = \frac{P(\{X > a + b\} \cap \{X > b\})}{P(X > b)} = \frac{P(X > a + b)}{P(X > b)} = \frac{e^{-\lambda(a + b)}}{e^{-\lambda b}} = e^{-\lambda a}$$

But we just showed that $e^{-\lambda a}$ is exactly P(X > a). Thus, we see that P(X > a + b | X > b) = P(X > a).

# Some Remarks on the Consequences and Interpretations of the Memoryless Property

First of all, keep in mind that this property is not a universal property. It does not hold for all continuous random variables. Moreover, it would obviously not apply in all physical situations. For instance, suppose X is the lifetime of a car engine given in terms of number of miles driven. If the engine has lasted 200,000 miles, we might not expect - based on actual, physical experience - that the probability that the engine lasts another 100,000 miles is the same as the probability that the engine lasts 100,000 miles from the time it was first built. That is, we would probably not expect to have P(X > 300,000 | X > 200,000) = P(X > 300,000 − 200,000) = P(X > 100,000). But if empirical data showed that the lifetime of a car engine was, in fact, exponentially distributed, then this property would, indeed, hold, whether it matches your intuition or not.

In fact, this property and the questions that were raised in class regarding it bring up an important philosophical point in probability theory. Probabilities do not exist vacuously, and they are not universal from one situation to another. They carry with them some particular distribution or form. When you estimate or compute probabilities in real-life, everyday experiences, you are - whether you realize it or not - assuming some particular distribution for the quantity in question.
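Before moving on, a short simulation makes Theorem 1 concrete. This sketch is my addition, not part of the original notes; it assumes NumPy, and the values λ = 0.5, a = 1, b = 3 are arbitrary choices for illustration. It estimates both sides of the identity P(X > a + b | X > b) = P(X > a) from the same batch of samples.

```python
import numpy as np

# Monte Carlo check of Theorem 1: P(X > a + b | X > b) = P(X > a).
rng = np.random.default_rng(1)
lam, a, b = 0.5, 1.0, 3.0   # rate and offsets (arbitrary choices)

x = rng.exponential(scale=1.0 / lam, size=1_000_000)

# Estimate the conditional probability straight from its definition:
# P(X > a + b | X > b) = #{samples > a + b} / #{samples > b}.
cond = np.sum(x > a + b) / np.sum(x > b)
uncond = np.mean(x > a)   # P(X > a)

print(f"P(X > a + b | X > b) ~ {cond:.4f}")
print(f"P(X > a)             ~ {uncond:.4f}")  # should agree up to noise
```

No matter how a and b are chosen, the two estimates should differ only by Monte Carlo error, which is exactly the memoryless property.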

Theorem 2. Let X be a geometric random variable with parameter p, 0 < p < 1. Then X has the memoryless property: for any positive integers m and n, P(X > m + n | X > n) = P(X > m).

Recall that if X is a geometric random variable arising from a sequence of independent trials, each of which results in one of two outcomes with probabilities p and 1 − p (e.g. either a 1 or a 0, or a success or a failure, etc.), then X describes when the first occurrence of the success outcome happens. Now, suppose, for instance, you run through 10 trials of this experiment, all of which have been failures or 0s. Just because you've gotten 10 failures in a row to begin the experiment doesn't mean, probabilistically, that you should necessarily expect a success to be more likely (or a failure to be less likely) on the 11th trial. If we go purely on intuition - which, as I've tried to point out above, isn't always the best route in probability - you might expect that we should eventually have to get a success, so the longer we go without one, the more likely we are to get one next. But that's not true! Because X is described by a geometric random variable, which has the memoryless property, it actually doesn't matter how many consecutive failures we get. The probability that the first success occurs at any particular trial is the same as it is at the beginning of the sequence! If the intuition worked in this case, then X wouldn't be a geometric random variable, because it wouldn't be memoryless. But it is geometric, so we have an example where the everyday intuition we try to apply to probabilities fails us.

Proof. The probability mass function of X is $p(i) = P(X = i) = p(1-p)^{i-1}$. As we did before, we will find an expression for P(X > n) to make things easier.
$$P(X > n) = 1 - P(X \le n) = 1 - \sum_{k=1}^{n} p(1-p)^{k-1} = 1 - p\sum_{k=0}^{n-1} (1-p)^k = 1 - p \cdot \frac{1 - (1-p)^n}{1 - (1-p)} = (1-p)^n$$

Proceeding as in Theorem 1, we have the following.

$$P(X > m + n \mid X > n) = \frac{P(\{X > m + n\} \cap \{X > n\})}{P(X > n)} = \frac{P(X > m + n)}{P(X > n)} = \frac{(1-p)^{m+n}}{(1-p)^n} = (1-p)^m = P(X > m)$$
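As with the exponential case, Theorem 2 can be checked numerically. The sketch below is my addition, not part of the original notes: it assumes NumPy, whose geometric sampler happens to use the same convention as the p.m.f. in the proof above, and the values p = 0.3, m = 2, n = 5 are arbitrary.

```python
import numpy as np

# Monte Carlo check of Theorem 2: P(X > m + n | X > n) = P(X > m).
rng = np.random.default_rng(2)
p, m, n = 0.3, 2, 5   # success probability and offsets (arbitrary choices)

# NumPy's geometric counts the trial on which the first success occurs,
# matching the p.m.f. p(i) = p(1-p)^(i-1) used in the proof above.
x = rng.geometric(p, size=1_000_000)

cond = np.sum(x > m + n) / np.sum(x > n)   # P(X > m + n | X > n)
uncond = np.mean(x > m)                    # P(X > m)

print(f"P(X > m + n | X > n) ~ {cond:.4f}")
print(f"P(X > m)             ~ {uncond:.4f}")
print(f"(1 - p)^m            = {(1 - p) ** m:.4f}")
```

All three printed values should be close, since the proof shows both probabilities equal $(1-p)^m$.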

# Some Basic Examples

Memoryless random variables like the exponential random variable may seem strange, but they actually describe many different real-world phenomena. For instance, the time between arrivals of customers to a store, drive-thru, or any comparable service center is an exponential random variable. You might expect more customers to arrive at certain times of the day than at others, but just because a customer hasn't arrived in, say, the last 20 minutes doesn't mean that you should expect one in the next 5 minutes with any more likelihood than you would have 20 minutes ago. By the same reasoning, the time between telephone calls received by a particular phone is an exponential random variable. Ignoring circumstances where you're expecting a planned phone call, and regardless of your average frequency of calls, just because you haven't received a call in, say, the last hour doesn't mean that you're any more likely to receive a call in the next 10 minutes than you were at the beginning of the hour.

On the other hand - to admittedly contradict one of my own examples from class - the lifetime of a cell phone battery would not be well modeled by an exponential random variable. It works fine as a mathematical example, but experience tells us that - in physical terms - the longer a battery is used and recharged, the more likely it is to ultimately fail. You could model the battery's lifetime with an exponential random variable if you simply chose to, and it would even be appropriate for short times. But in the long run, modeling the overall lifetime exponentially wouldn't be a good idea.

Also, you should read through Example 5d in Section 5.5, and I encourage you to read the portion of Section 4.7 that discusses Poisson processes and see how the exponential random variable is used to model the time between occurrences of an event. In particular, notice that if X is a Poisson random variable with parameter λ (which means λ is the average/expected number of events that occur in a particular time interval), it follows from the Poisson p.m.f. that $P(X = 1) = \lambda e^{-\lambda}$. This is just the p.d.f. of an exponential random variable with parameter λ evaluated at x = 1. If we let Y be this exponential random variable, then, for small Δx, the probability that Y is close to 1 is approximately

$$P(1 - \Delta x < Y < 1 + \Delta x) \approx f_Y(1)\,\Delta x = P(X = 1)\,\Delta x.$$

This shows the connection between the Poisson and exponential random variables. In particular, it shows why we interpret the expected values the way we do for the Poisson and exponential random variables. If we expect λ events to occur in a given time interval T (from a Poisson(λ) random variable), then we expect them to occur at a rate of λ/T events per unit time. This means that we would expect T/λ units of time between events. If we set T = 1, we see that the expected value of a Poisson random variable is λ, the number of events per unit time, and the reciprocal of that, 1/λ, is the expected amount of time between events (units of time per event). This is, intuitively, why the expected values of Poisson(λ) random variables and Exponential(λ) random variables, λ and 1/λ respectively, are reciprocals of each other. Each variable measures an inverse aspect of the same problem from the other variable.
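To see this connection concretely, the following sketch builds a Poisson process out of Exponential(λ) inter-arrival times and checks that the average gap is 1/λ while the average number of events per unit interval is λ. It is my addition, not part of the original notes; it assumes NumPy, and λ = 4 events per unit time is an arbitrary choice.

```python
import numpy as np

# Build a Poisson process from exponential inter-arrival times, then
# verify that counts per unit interval behave like Poisson(lambda).
rng = np.random.default_rng(3)
lam = 4.0   # expected events per unit time (arbitrary choice)

gaps = rng.exponential(scale=1.0 / lam, size=1_000_000)
arrivals = np.cumsum(gaps)   # event times of the simulated process

# Count how many events land in each unit-length interval [k, k + 1).
horizon = int(arrivals[-1])
counts = np.bincount(arrivals[arrivals < horizon].astype(int),
                     minlength=horizon)

print(f"mean gap   (should be 1/lambda = {1 / lam:.3f}): {gaps.mean():.3f}")
print(f"mean count (should be lambda   = {lam:.3f}): {counts.mean():.3f}")
```

The two printed means should come out near 1/λ and λ, mirroring the reciprocal relationship between the exponential and Poisson expected values described above.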