15/04/2012

R Tutorial: Basic Probability

R Tutorial
Cyclismo.org

Basic Probability Distributions


We look at some of the basic operations associated with probability distributions. R provides a large number of probability distributions, but we look at only a few. To see which distributions are available, search the help system with the command help.search("distribution"). Here we give details about the commands associated with the normal distribution and briefly mention the commands for other distributions. The functions for the different distributions are very similar, and the differences are noted below.

For this chapter it is assumed that you know how to enter data, which is covered in the first chapter.

1. The Normal Distribution
2. The t Distribution
3. The Binomial Distribution
4. The Chi-Squared Distribution

The Normal Distribution


There are four functions that can be used to generate the values associated with the normal distribution. You can get a full list of them and their options using the help command:
> help(Normal)

The first function we look at is dnorm. Given a set of values, it returns the height of the probability density at each point. If you only give the points, it assumes you want to use a mean of zero and a standard deviation of one. There are options to use different values for the mean and standard deviation, though:
> dnorm(0)
[1] 0.3989423
> dnorm(0)*sqrt(2*pi)
[1] 1
> dnorm(0,mean=4)
[1] 0.0001338302
> dnorm(0,mean=4,sd=10)
[1] 0.03682701
> v <- c(0,1,2)
> dnorm(v)
[1] 0.39894228 0.24197072 0.05399097
> x <- seq(-20,20,by=.1)
> y <- dnorm(x)
> plot(x,y)
> y <- dnorm(x,mean=2.5,sd=0.1)
> plot(x,y)
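
As a cross-check on what dnorm returns, the sketch below evaluates the normal density formula by hand and compares it with the built-in function. The helper name normal_density is hypothetical and not part of R; the mean and standard deviation are the ones used in the examples above.

```r
# normal_density is a hypothetical helper written for this illustration;
# it evaluates the normal density formula directly.
normal_density <- function(x, mean = 0, sd = 1) {
  exp(-(x - mean)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi))
}

v <- c(0, 1, 2)
by_hand  <- normal_density(v, mean = 4, sd = 10)
built_in <- dnorm(v, mean = 4, sd = 10)

# all.equal() compares the two vectors up to a small numerical tolerance.
print(isTRUE(all.equal(by_hand, built_in)))
```

If the two vectors agree, dnorm is doing nothing more than evaluating this formula at each point.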

The second function we examine is pnorm. Given a number or a list it computes the probability that a normally distributed random number will be less than that number. This function also goes by the rather ominous title of the "Cumulative Distribution Function." It accepts the same options as dnorm:
> pnorm(0)
[1] 0.5
> pnorm(1)
[1] 0.8413447
> pnorm(0,mean=2)
[1] 0.02275013
> pnorm(0,mean=2,sd=3)
[1] 0.2524925
> v <- c(0,1,2)
> pnorm(v)
[1] 0.5000000 0.8413447 0.9772499
> x <- seq(-20,20,by=.1)
> y <- pnorm(x)
> plot(x,y)
> y <- pnorm(x,mean=3,sd=4)
> plot(x,y)
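
Because pnorm gives the probability of falling below a value, the probability of landing in an interval is a difference of two pnorm calls. The sketch below illustrates this; the endpoints -1.96 and 1.96 are chosen here for illustration and do not come from the tutorial.

```r
# Interval endpoints are an assumption made for this illustration.
a <- -1.96
b <- 1.96

# P(a < Z < b) for a standard normal variable.
p_between <- pnorm(b) - pnorm(a)

# Upper-tail probability computed two ways: by subtraction, and by asking
# pnorm for the upper tail directly with lower.tail = FALSE.
p_upper_subtract <- 1 - pnorm(b)
p_upper_direct   <- pnorm(b, lower.tail = FALSE)

print(p_between)
print(c(p_upper_subtract, p_upper_direct))
```

The same pattern works with the mean and sd arguments for a normal distribution that is not standardized.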

The next function we look at is qnorm which is the inverse of pnorm. The idea behind qnorm is that you give it a probability, and it returns the number whose cumulative distribution matches the probability. For example, if you have a normally distributed random variable with mean zero and standard deviation one, then if you give the function a probability it returns the associated Z-score:
> qnorm(0.5)
[1] 0
> qnorm(0.5,mean=1)
[1] 1
> qnorm(0.5,mean=1,sd=2)
[1] 1
> qnorm(0.5,mean=2,sd=2)
[1] 2
> qnorm(0.5,mean=2,sd=4)
[1] 2
> qnorm(0.25,mean=2,sd=2)
[1] 0.6510205
> qnorm(0.333)
[1] -0.4316442
> qnorm(0.333,sd=3)
[1] -1.294933
> qnorm(0.75,mean=5,sd=2)
[1] 6.34898
> v = c(0.1,0.3,0.75)
> qnorm(v)
[1] -1.2815516 -0.5244005  0.6744898
> x <- seq(0,1,by=.05)
> y <- qnorm(x)
> plot(x,y)
> y <- qnorm(x,mean=3,sd=2)
> plot(x,y)
> y <- qnorm(x,mean=3,sd=0.1)
> plot(x,y)
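
The inverse relationship between qnorm and pnorm can be checked with a round trip, sketched below. The mean of 3 and standard deviation of 2 are taken from the plotting example; the last step, which rescales a standard normal quantile, is an added illustration.

```r
p <- c(0.1, 0.3, 0.75)

# Probability -> quantile -> probability should return the starting values.
z      <- qnorm(p, mean = 3, sd = 2)
p_back <- pnorm(z, mean = 3, sd = 2)
print(isTRUE(all.equal(p, p_back)))

# A quantile for mean 3 and sd 2 is the standard normal quantile (Z-score)
# stretched by the sd and shifted by the mean.
z_scaled <- 3 + 2 * qnorm(p)
print(isTRUE(all.equal(z, z_scaled)))
```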

The last function we examine is the rnorm function which can generate random numbers whose distribution is normal. The argument that you give it is the number of random numbers that you want, and it has optional arguments to specify the mean and standard deviation:
> rnorm(4)
[1]  1.2387271 -0.2323259 -1.2003081 -1.6718483
> rnorm(4,mean=3)
[1] 2.633080 3.617486 2.038861 2.601933
> rnorm(4,mean=3,sd=3)
[1] 4.580556 2.974903 4.756097 6.395894
> rnorm(4,mean=3,sd=3)
[1]  3.000852  3.714180 10.032021  3.295667
> y <- rnorm(200)
> hist(y)
> y <- rnorm(200,mean=-2)
> hist(y)
> y <- rnorm(200,mean=-2,sd=4)
> hist(y)
> qqnorm(y)
> qqline(y)
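
Random draws differ from run to run, so the numbers above will not be reproduced exactly. A minimal sketch of how to make draws repeatable with set.seed follows; the seed value 42 and the sample size of 10000 are arbitrary choices for illustration.

```r
# Fixing the seed makes the "random" draws repeatable.
set.seed(42)   # arbitrary seed, assumed for this illustration
first <- rnorm(4, mean = 3, sd = 3)
set.seed(42)
second <- rnorm(4, mean = 3, sd = 3)
print(identical(first, second))

# With a large sample, the sample mean and standard deviation should land
# close to the requested parameters.
y <- rnorm(10000, mean = -2, sd = 4)
print(c(mean(y), sd(y)))
```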

The t Distribution
There are four functions that can be used to generate the values associated with the t distribution. You can get a full list of them and their options using the help command:
> help(TDist)

These commands work just like the commands for the normal distribution. One difference is that the commands assume that the values are normalized to mean zero and standard deviation one, so you have to use a little algebra to use these functions in practice. The other difference is that you have to specify the number of degrees of freedom. The commands follow the same kind of naming convention, and the names of the commands are dt, pt, qt, and rt. A few examples are given below to show how to use the different commands. First we have the distribution function, dt:
> x <- seq(-20,20,by=.5)
> y <- dt(x,df=10)
> plot(x,y)
> y <- dt(x,df=50)
> plot(x,y)
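
The two plots look similar because the t density moves toward the standard normal density as the degrees of freedom grow. The sketch below measures that gap numerically; the grid of x values and the extra case of 1000 degrees of freedom are assumptions added for illustration.

```r
x <- seq(-4, 4, by = 0.5)

# Largest pointwise difference between the t density and the standard
# normal density on the grid, for increasing degrees of freedom.
gap_10   <- max(abs(dt(x, df = 10)   - dnorm(x)))
gap_50   <- max(abs(dt(x, df = 50)   - dnorm(x)))
gap_1000 <- max(abs(dt(x, df = 1000) - dnorm(x)))

print(c(gap_10, gap_50, gap_1000))
```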

Next we have the cumulative probability distribution function:


> pt(-3,df=10)
[1] 0.006671828
> pt(3,df=10)
[1] 0.9933282
> 1-pt(3,df=10)
[1] 0.006671828
> pt(3,df=20)
[1] 0.996462
> x = c(-3,-4,-2,-1)
> pt((mean(x)-2)/sd(x),df=20)
[1] 0.001165548
> pt((mean(x)-2)/sd(x),df=40)
[1] 0.000603064
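
One common use of this kind of algebra is the one-sample t statistic, where the sample mean is standardized by its standard error sd(x)/sqrt(n) and the degrees of freedom are n - 1. The sketch below is an added illustration under that assumption and is not the calculation shown in the tutorial's example; the hypothesized mean of 2 is borrowed from it. The result is compared with R's built-in t.test.

```r
x   <- c(-3, -4, -2, -1)
mu0 <- 2              # hypothesized mean, assumed for this illustration
n   <- length(x)

# Standardize the sample mean by its standard error.
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)

# Cross-check against the built-in one-sample t test.
check <- t.test(x, mu = mu0)
print(c(t_stat, unname(check$statistic)))
print(c(p_two_sided, check$p.value))
```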

Next we have the inverse cumulative probability distribution function:


> qt(0.05,df=10)
[1] -1.812461
> qt(0.95,df=10)
[1] 1.812461
> qt(0.05,df=20)
[1] -1.724718
> qt(0.95,df=20)
[1] 1.724718
> v <- c(0.005,.025,.05)
> qt(v,df=253)
[1] -2.595401 -1.969385 -1.650899
> qt(v,df=25)
[1] -2.787436 -2.059539 -1.708141
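
Quantiles from qt are what you need to build a confidence interval for a mean. The sketch below is an added illustration, assuming a 95% two-sided interval and a small made-up sample; it compares the hand-built interval with the one reported by t.test.

```r
x <- c(-3, -4, -2, -1)   # small sample, assumed for this illustration
n <- length(x)

# Critical value that leaves 2.5% in the upper tail, n - 1 degrees of freedom.
t_crit <- qt(0.975, df = n - 1)

# Interval: sample mean plus or minus critical value times standard error.
half_width <- t_crit * sd(x) / sqrt(n)
ci <- c(mean(x) - half_width, mean(x) + half_width)

print(ci)
print(as.numeric(t.test(x)$conf.int))
```

The symmetry seen in the examples above (qt(0.05, df) is the negative of qt(0.95, df)) is why a single critical value is enough for both ends of the interval.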

Finally random numbers can be generated according to the t distribution:


> rt(3,df=10)
[1] 0.9440930 2.1734365 0.6785262
> rt(3,df=20)
[1]  0.1043300 -1.4682198  0.0715013
> rt(3,df=20)
[1]  0.8023832 -0.4759780 -1.0546125

The Binomial Distribution


There are four functions that can be used to generate the values associated with the binomial distribution. You can get a full list of them and their options using the help command:
> help(Binomial)

These commands work just like the commands for the normal distribution. The binomial distribution requires two extra parameters, the number of trials and the probability of success for a single trial. The commands follow the same kind of naming convention, and the names of the commands are dbinom, pbinom, qbinom, and rbinom. A few examples are given below to show how to use the different commands. First we have the distribution function, dbinom:
> x <- seq(0,50,by=1)
> y <- dbinom(x,50,0.2)
> plot(x,y)
> y <- dbinom(x,50,0.6)
> plot(x,y)
> x <- seq(0,100,by=1)
> y <- dbinom(x,100,0.6)
> plot(x,y)
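
To see what dbinom computes, the sketch below evaluates the binomial probability formula, choose(n, k) * p^k * (1 - p)^(n - k), by hand and compares it with the built-in function. The variable names are chosen for this illustration; the 50 trials and success probability 0.2 match the first plot.

```r
n <- 50     # number of trials
p <- 0.2    # probability of success on a single trial
k <- 0:n    # every possible number of successes

# Binomial probability formula evaluated directly.
by_hand  <- choose(n, k) * p^k * (1 - p)^(n - k)
built_in <- dbinom(k, n, p)

print(isTRUE(all.equal(by_hand, built_in)))

# The probabilities over all possible outcomes add up to one.
print(sum(built_in))
```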

Next we have the cumulative probability distribution function:


> pbinom(24,50,0.5)
[1] 0.4438624
> pbinom(25,50,0.5)
[1] 0.5561376
> pbinom(25,51,0.5)
[1] 0.5
> pbinom(26,51,0.5)
[1] 0.610116
> pbinom(25,50,0.5)
[1] 0.5561376
> pbinom(25,50,0.25)
[1] 0.999962
> pbinom(25,500,0.25)
[1] 4.955658e-33
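
For a discrete distribution the cumulative probability is a running sum of the individual probabilities. The sketch below checks this for the first example above, and also shows the upper tail; the variable names are chosen for this illustration.

```r
# P(X <= 24) for 50 trials with success probability 0.5, computed as a
# sum of individual probabilities and with pbinom.
cum_by_hand  <- sum(dbinom(0:24, 50, 0.5))
cum_built_in <- pbinom(24, 50, 0.5)
print(c(cum_by_hand, cum_built_in))

# P(X >= 25) is the complement of P(X <= 24); lower.tail = FALSE asks
# pbinom for P(X > 24) directly.
upper <- pbinom(24, 50, 0.5, lower.tail = FALSE)
print(upper)
```

Note the off-by-one detail: because the distribution is discrete, P(X >= 25) uses 24, not 25, as the cutoff.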

Next we have the inverse cumulative probability distribution function:


> qbinom(0.5,51,1/2)
[1] 25
> qbinom(0.25,51,1/2)
[1] 23
> pbinom(23,51,1/2)
[1] 0.2879247
> pbinom(22,51,1/2)
[1] 0.200531
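
Since the binomial distribution is discrete, qbinom cannot always hit the requested probability exactly. It returns the smallest number of successes whose cumulative probability reaches the target, which is what the pbinom calls above are probing. A short sketch of that check, with variable names chosen for this illustration:

```r
p_target <- 0.25
k <- qbinom(p_target, 51, 1/2)

# One step below k the cumulative probability is still under the target;
# at k it has reached or passed it.
below <- pbinom(k - 1, 51, 1/2)
at    <- pbinom(k, 51, 1/2)

print(c(k, below, at))
```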

Finally random numbers can be generated according to the binomial distribution:


> rbinom(5,100,.2)
[1] 30 23 21 19 18
> rbinom(5,100,.7)
[1] 66 66 58 68 63
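
Each value returned by rbinom is a count of successes out of the given number of trials. As a rough sanity check, the sketch below draws a large sample and compares its mean and variance with the theoretical values size * prob and size * prob * (1 - prob). The seed and the sample size are arbitrary choices for illustration.

```r
set.seed(7)   # arbitrary seed, assumed for this illustration
size <- 100   # number of trials per draw
prob <- 0.2   # probability of success on a single trial

draws <- rbinom(10000, size, prob)

# Sample moments next to their theoretical counterparts.
print(c(mean(draws), size * prob))
print(c(var(draws), size * prob * (1 - prob)))
```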

The Chi-Squared Distribution


There are four functions that can be used to generate the values associated with the Chi-Squared distribution. You can get a full list of them and their options using the help command:
> help(Chisquare)

These commands work just like the commands for the normal distribution. The first difference is that it is assumed that you have normalized the value so no mean can be specified. The other difference is that you have to specify the number of degrees of freedom. The commands follow the same kind of naming convention, and the names of the commands are dchisq, pchisq, qchisq, and rchisq. A few examples are given below to show how to use the different commands. First we have the distribution function, dchisq:
> x <- seq(-20,20,by=.5)
> y <- dchisq(x,df=10)
> plot(x,y)
> y <- dchisq(x,df=12)
> plot(x,y)

Next we have the cumulative probability distribution function:


> pchisq(2,df=10)
[1] 0.003659847
> pchisq(3,df=10)
[1] 0.01857594
> 1-pchisq(3,df=10)
[1] 0.981424
> pchisq(3,df=20)
[1] 4.097501e-06
> x = c(2,4,5,6)
> pchisq(x,df=20)
[1] 1.114255e-07 4.649808e-05 2.773521e-04 1.102488e-03
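
Chi-squared tests usually need the upper tail, which the example above gets by subtracting from one. The sketch below shows the equivalent lower.tail = FALSE form, and why it can be preferable far out in the tail. The value 200 is an extreme case chosen for illustration.

```r
stat <- 3
df   <- 10

# Upper-tail probability by subtraction and by asking for it directly.
upper_subtract <- 1 - pchisq(stat, df = df)
upper_direct   <- pchisq(stat, df = df, lower.tail = FALSE)
print(c(upper_subtract, upper_direct))

# Far in the tail the lower-tail probability is so close to 1 that the
# subtraction loses its precision, while the direct form keeps a small
# positive value.
far_subtract <- 1 - pchisq(200, df = df)
far_direct   <- pchisq(200, df = df, lower.tail = FALSE)
print(c(far_subtract, far_direct))
```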

Next we have the inverse cumulative probability distribution function:


> qchisq(0.05,df=10)
[1] 3.940299
> qchisq(0.95,df=10)
[1] 18.30704
> qchisq(0.05,df=20)
[1] 10.85081
> qchisq(0.95,df=20)
[1] 31.41043
> v <- c(0.005,.025,.05)
> qchisq(v,df=253)
[1] 198.8161 210.8355 217.1713
> qchisq(v,df=25)
[1] 10.51965 13.11972 14.61141
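
A typical use of qchisq is finding the critical value for a test at a given significance level. The sketch below is an added illustration that assumes a 5% level and 10 degrees of freedom; it computes the critical value two equivalent ways and confirms that the upper tail beyond it has the intended probability.

```r
alpha <- 0.05   # significance level, assumed for this illustration
df    <- 10

# Critical value: the point with probability alpha above it.
crit_lower <- qchisq(1 - alpha, df = df)
crit_upper <- qchisq(alpha, df = df, lower.tail = FALSE)
print(c(crit_lower, crit_upper))

# Feeding the critical value back into pchisq recovers alpha.
tail_prob <- pchisq(crit_lower, df = df, lower.tail = FALSE)
print(tail_prob)
```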

Finally random numbers can be generated according to the Chi-Squared distribution:


> rchisq(3,df=10)
[1] 16.80075 20.28412 12.39099
> rchisq(3,df=20)
[1] 17.838878  8.591936 17.486372
> rchisq(3,df=20)
[1] 11.19279 23.86907 24.81251
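
A chi-squared variable with df degrees of freedom has the same distribution as a sum of df squared standard normal variables, so rchisq and rnorm can be compared directly. The sketch below is an added illustration; the seed and the sample size are arbitrary. Both samples should have a mean near df and a variance near 2 * df.

```r
set.seed(1)   # arbitrary seed, assumed for this illustration
df <- 10
n  <- 10000

# Draws taken directly from the chi-squared distribution.
direct <- rchisq(n, df = df)

# Draws built by summing df squared standard normal values per row.
z <- matrix(rnorm(n * df), nrow = n, ncol = df)
from_normals <- rowSums(z^2)

print(c(mean(direct), mean(from_normals)))   # both should be near df
print(c(var(direct), var(from_normals)))     # both should be near 2 * df
```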

The previous tutorial is a description of the basic operations. The next tutorial is a description of basic plots.

This tutorial was written by Kelly Black. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

www.cyclismo.org/tutorial/R/probability.html
