
Starting Commands:

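require(package): loads the package for use in that session; it returns FALSE with a warning if the package is not installed
library(package): also loads the package, but stops with an error if the package is not installed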
installed.packages(lib.loc=NULL,priority=NULL,noCache=FALSE,fields=NULL,
subarch=.Platform$r_arch): lists the packages that are installed
source(file): runs a script that has been saved to a file
Commands are separated by a ; or by a new line.
# signifies a comment which is not executed
To get help regarding a command type ? before it.
Things are assigned to and stored in objects using <- or =. A list of all objects in the current session can
be obtained by typing ls()
To remove objects type: remove(object,list=character(),inherits=FALSE)
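The basics above can be tried in a few lines. This is only a small sketch; the object names a, b, objs, and a.exists are made up for illustration.

```r
a <- 5; b = 10        # two commands on one line, separated by ;
# this line is a comment and is not executed
objs <- ls()          # names of the objects in the current session
print(objs)
remove(a)             # rm(a) does the same thing
a.exists <- exists("a", inherits = FALSE)   # is a still defined here?
print(a.exists)
# ?remove             # uncomment to open the help page for remove
```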

Entering Data Sets:


R works well with data that is stored in text files where the data is separated either by commas (csv)
or spaces/tabs (tables).
Data is loaded into R with the function read.csv or read.table. The result needs to be assigned to an
object.
e.g.
dat.tab<-read.table("addressofdata",header=TRUE,sep="\t")
dat.csv<-read.csv("addressofdata")
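To try both functions without a real data file, one option is to write a tiny file to a temporary location first. The file contents and the column names id and weight below are invented for this sketch.

```r
# Write a small tab-separated file to a temporary location (made-up data)
path.tab <- tempfile(fileext = ".txt")
writeLines(c("id\tweight", "1\t60.5", "2\t72.0"), path.tab)
dat.tab <- read.table(path.tab, header = TRUE, sep = "\t")

# Same data as a comma separated file
path.csv <- tempfile(fileext = ".csv")
writeLines(c("id,weight", "1,60.5", "2,72.0"), path.csv)
dat.csv <- read.csv(path.csv)   # header = TRUE is the default for read.csv

print(dat.tab)
print(dat.csv)
```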
You can also use R as a fancy calculator and store strings of numbers (called a vector) in objects:
e.g. x<-c(1,2,3,4,5)
c combines the numbers/variables given to it and binds them together into a vector.
Other functions:
is.na(object) tells you, element by element, which elements in your object are missing (NA).
cumsum(object) gives you a cumulative sum of the elements in the object.
mean(object) mean of elements
sd(object) standard deviation
sum(object) sum of all elements
You can then perform simple calculator functions such as x^2, and R will apply the function to
all the numbers in the object.
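A short sketch of the vector functions listed above. The vector y, with one missing value, is added here only to show is.na.

```r
x <- c(1, 2, 3, 4, 5)
squares <- x^2        # applied to every element: 1 4 9 16 25
total   <- sum(x)     # 15
average <- mean(x)    # 3
spread  <- sd(x)      # sample standard deviation
running <- cumsum(x)  # 1 3 6 10 15

y <- c(1, NA, 3)      # a vector with one missing value
missing <- is.na(y)   # FALSE TRUE FALSE
print(squares); print(running); print(missing)
```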
The foreign package also allows us to read other formats of data such as SPSS (read.spss) or Stata
(read.dta), but first you must type in require(foreign).
To load data from Excel we use the xlsx and rJava packages.
e.g.
dat.xls<-read.xlsx("addressofdata.xlsx",sheetIndex=1)
If you have trouble getting it working, just click "save as" in Excel and export the data to a comma
separated values file (.csv).
The XML package is for importing XML and HTML files. (use function: readHTMLTable)
To view your data as a whole: View(nameofobject)
To view the first couple of rows: head(object)
To view the last couple of rows: tail(object)
To view the variable names: colnames(object)
Data sets in R are typically stored as data frames, which have a matrix-like structure of rows and columns.
To access individual data points type in object[row,column]
Row: object[row,]
Column: object[,column]
Range of rows and columns: object[row:row,column:column]
Or you can use the direct variable name: object[,"variable"] or nameofobject$variable
The c function is commonly used to access nonsequential rows and columns from a data frame,
e.g.
object[c(row,row,row),column]
or
object[row,c("variable","variable")]
We can also use the c function to add variable names:
colnames(object)<-c("variable","variable","variable")
To change single variables use indexing:
colnames(object)[column]<-"variable"
class(object) tells you the type of object, e.g. data.frame
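The indexing forms above can be practiced on a small data frame. The data frame dat and its columns are made up for this sketch.

```r
# A small made-up data frame for practice
dat <- data.frame(id = 1:4,
                  height = c(150, 160, 170, 180),
                  weight = c(50, 60, 70, 80))

one.cell  <- dat[2, 3]                      # row 2, column 3
one.row   <- dat[2, ]                       # whole second row
one.col   <- dat[, 2]                       # whole second column
block     <- dat[1:2, 2:3]                  # rows 1 to 2, columns 2 to 3
by.name   <- dat[, "height"]                # column by variable name
by.dollar <- dat$height                     # same column using $
some.rows <- dat[c(1, 3, 4), 2]             # nonsequential rows
two.vars  <- dat[1, c("height", "weight")]  # several columns by name

colnames(dat)[3] <- "mass"                  # rename a single column
print(colnames(dat))
print(class(dat))
```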
We can also use R to write the data out to files in multiple formats:
e.g.
write.csv(object,file="filepath/filename.csv")
or
write.table(object,file="filepath/filename.txt",sep="\t",na=".")
To save to R binary format: save(object,file="filepath/filename.RData")
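A sketch of writing a data frame out and reading it back. The data, the file names, and the use of tempdir() are assumptions made so the example can run anywhere; row.names=FALSE is an extra argument not mentioned in the notes, used here to keep row numbers out of the files.

```r
dat <- data.frame(id = 1:3, score = c(2.5, NA, 4))   # made-up data

csv.path <- file.path(tempdir(), "example.csv")
txt.path <- file.path(tempdir(), "example.txt")
rda.path <- file.path(tempdir(), "example.RData")

write.csv(dat, file = csv.path, row.names = FALSE)
write.table(dat, file = txt.path, sep = "\t", na = ".", row.names = FALSE)
save(dat, file = rda.path)

back <- read.csv(csv.path)   # read the csv file back in
rm(dat)                      # remove the object ...
load(rda.path)               # ... and restore it from the .RData file
print(back)
print(dat)
```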

To build a matrix (example): matrix(1:10,ncol=2)


Lists can be used to store many things including matrices, vectors, and anything else.
To see a part of a list type it in this manner: nameoflist$variable.
The $ is an indexing symbol that is used to specify something inside an object.
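A small sketch of a list holding different kinds of objects. The list stuff and its element names are invented for illustration.

```r
# A list holding a vector, a matrix, and a character string (made-up contents)
stuff <- list(numbers = c(1, 2, 3),
              grid    = matrix(1:10, ncol = 2),
              label   = "example")

nums     <- stuff$numbers      # pick an element with $
grid.dim <- dim(stuff$grid)    # the matrix has 5 rows and 2 columns
lab      <- stuff[["label"]]   # double brackets also pick one element
print(names(stuff))
print(grid.dim)
```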
Exploring Data:
To see the number of rows (aka observations) and columns (aka variables) in a data set: dim(object)
To see the structure and class types of all variables in a data set: str(object)
summary(object) gives an overview of the data in the data set such as mean and quartiles.
If we want to look at a summary for only a portion of the data we use a subset.
type: summary(subset(object,variable>number)), where > can be replaced by <, >=, <= or ==
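The exploring functions above, tried on a made-up data frame. The column names age and score and the cutoff of 40 are only for illustration.

```r
dat <- data.frame(age   = c(21, 35, 47, 52, 63),
                  score = c(10, 20, 30, 40, 50))   # made-up data

size <- dim(dat)                 # number of rows, then number of columns
str(dat)                         # structure and type of each variable
print(summary(dat))              # overview of every variable

older <- subset(dat, age > 40)   # keep only the rows where age is above 40
print(summary(older))
print(nrow(older))
```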
CURRENTLY:
http://www.youtube.com/watch?v=HKjSKtVV6GU 1:07:01
http://www.ats.ucla.edu/stat/r/seminars/intro.htm
https://gist.github.com/bobthecat/3361094
http://www.cyclismo.org/tutorial/R/input.html
Making Maps with R:
http://www.zoology.ubc.ca/~kgilbert/mysite/Miscellaneous_files/R_MakingMaps.pdf
template: map(database,Country,xlim=c(x,y),ylim=c(x,y),col=color,fill=T/F)
(xlim is the longitude range, ylim is the latitude range)
sample: map("worldHires","Canada",xlim=c(-141,-53),ylim=c(40,85),col="gray90",
fill=TRUE)

pointsplots
Important packages:
maps; mapdata; mapproj
Other mapping packages:
rworldmap: Maps of countries with attributes
mapplots: data visualization on maps
geo: N. Atlantic used by HAFRO for fisheries
sp: Spatial data
Imap: Point and click interface
ggmap: plot on top of Google maps
gmt: interface with GMT mapping software
plotGoogleMaps: data on Google maps
RgoogleMaps: Google map backgrounds
rworldxtra: enhanced country borders and boundaries
http://cran.r-project.org/web/views/Spatial.html
XXX.shp holds the actual vertices.
XXX.shx holds index data pointing to the structures in the .shp file.
XXX.dbf holds the attributes in xBase (dBase) format.
require(maps)
require(maptools)
require(ggmap)
require(RgoogleMaps)
require(plotGoogleMaps)

BIOS205

helpful details:
spaces don't matter
PUT COMMENTS (using #) in the scripts file
R applies precedence rules to operators
In R, there are two assignment operators. They have subtly different meanings
<- requires that you type two characters
= is easier to type
CASE MATTERS
tab completion i.e. put part of a command and then hit tab and it will help you find the command
you are looking for.
brackets operator is used for indexing. R starts counting vector indices from 1.
mathematical commands:
atan is the one-argument arctangent
log is natural log
* is for multiplication
use a colon to easily create a new vector from one number to another with a step size of 1.
Using c to create a vector. Keep in mind that input (lines starting with >) is different from output (lines starting with [1]).
> x <- c(9,12,6,10,10,16,8,4)
> x
[1]  9 12  6 10 10 16  8  4
> length(x)
[1] 8
> sum(x)
[1] 75
> sum(x)/length(x)
[1] 9.375
> mean(x)
[1] 9.375

logical values are either TRUE or FALSE. >= asks if the values in the vector are greater than or equal to the
indicated value.
> x >= 1
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
seq is the function equivalent of the colon operator
arguments can be specified by position, with one supplied argument for each name in the function
parameter list, and in the same order.
> 1:5
[1] 1 2 3 4 5
> seq(1,5)
[1] 1 2 3 4 5
> seq(from=1,to=5)
[1] 1 2 3 4 5

arguments can be supplied by name using the syntax variable=value.
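A quick sketch of supplying arguments by position, by name, and in a mix of the two, using seq. The by argument is part of seq but does not appear in the notes above.

```r
by.position <- seq(1, 5)              # arguments matched in order: from, to
by.name     <- seq(from = 1, to = 5)  # arguments matched by name
any.order   <- seq(to = 5, from = 1)  # named arguments can go in any order
mixed       <- seq(1, 9, by = 2)      # positions first, then a named argument
print(by.position)
print(mixed)
```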
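element-wise vs. aggregate functions: element-wise functions return one result for each element of a vector; aggregate functions return a single result for the whole vector.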
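One way to see the difference is to compare the length of what comes back. The vector x and the choice of functions below are just an illustration.

```r
x <- c(9, 12, 6, 10)

# Element-wise: the result has one value for each element of x
roots   <- sqrt(x)
doubled <- x * 2
big     <- x >= 10        # FALSE TRUE FALSE TRUE

# Aggregate: the result is a single value for the whole vector
total   <- sum(x)         # 37
largest <- max(x)         # 12
count   <- length(x)      # 4

print(length(roots))      # same length as x
print(length(total))      # length 1
```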
USE THE SCRIPT PANE
can save a script to a file.
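Floating Point Numbers
e.g. 1==1+1e-23
or 2==sqrt(2)^2
the computer gives answers that seem illogical: the first is TRUE and the second is FALSE, because floating point numbers are stored with limited precision.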
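These comparisons can be checked directly. The sketch assumes the first example is meant to add the tiny value 1e-23 to 1, and it uses all.equal, which is not mentioned in the notes, as one common way to compare numbers within a small tolerance.

```r
tiny <- 1 == 1 + 1e-23                    # TRUE: 1e-23 is too small to change 1
root <- 2 == sqrt(2)^2                    # FALSE: sqrt(2)^2 is not exactly 2
near <- isTRUE(all.equal(2, sqrt(2)^2))   # TRUE: equal within a small tolerance
print(c(tiny, root, near))
print(sqrt(2)^2 - 2)                      # the very small leftover difference
```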
homework: review the first 70 slides.
play around with R: https://stanford.app.box.com/bios205
To control NA processing:
put na.rm=TRUE at the end and the function will ignore the NA values.
e.g.
> mean(c(1,2,NA,4),na.rm=TRUE)
[1] 2.333333

NULL: empty object


NA: missing
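A short sketch of how NA and NULL behave. The vector v is made up for illustration.

```r
v <- c(1, 2, NA, 4)

with.na    <- mean(v)                 # NA: one missing value makes the mean missing
without.na <- mean(v, na.rm = TRUE)   # (1 + 2 + 4) / 3
total      <- sum(v, na.rm = TRUE)    # 7
where.na   <- is.na(v)                # FALSE FALSE TRUE FALSE

empty <- NULL                         # an empty object
print(length(empty))                  # 0: NULL holds nothing
print(length(NA))                     # 1: NA is a value that happens to be missing
```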
in a data frame, rows are things or objects (instances) and columns are variables; each entry is the value of a variable for the corresponding
instance.
**data frame indexing is an important part of R

str shows the internal structure of a data frame, including the type of each column.

GO OVER MERGE FUNCTION & NAMING OF ROWS & VECTORS
PRACTICE DATA FRAME INDEXING
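As a starting point for that review, here is a small sketch of merge and of naming rows and vector elements. The data frames patients and visits, their columns, and the names used are all made up.

```r
patients <- data.frame(id = c(1, 2, 3), age = c(34, 51, 29))      # made-up data
visits   <- data.frame(id = c(1, 2, 4), visits = c(2, 5, 1))

both     <- merge(patients, visits, by = "id")              # only ids found in both
everyone <- merge(patients, visits, by = "id", all = TRUE)  # all ids, NA where missing
print(both)
print(everyone)

rownames(patients) <- c("p1", "p2", "p3")   # name the rows
age.p2 <- patients["p2", "age"]             # index a row by its name

v <- c(a = 1, b = 2)                        # a vector with named elements
b.value <- v[["b"]]                         # index an element by its name
print(age.p2); print(b.value)
```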
factors and data frames are the two most important concepts of R
factors are a powerful way to work with discrete-valued data
Note that many measured values are better represented as floating-point (real-valued) numbers, not
as factors. Example: weight, height.
lists allow you to collect a bunch of things that are different types. One-dimensional like a vector. You
can name the elements of a list and can index them by name instead of position.
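A brief sketch of a factor for discrete-valued data next to a plain numeric vector, and of indexing a list by name or by position. All values and names here are invented for illustration.

```r
# Discrete-valued data stored as a factor (made-up blood types)
blood <- factor(c("A", "B", "O", "A", "AB", "O", "O"))
print(levels(blood))    # the distinct categories
print(table(blood))     # how many times each category occurs

# Measured values stay as ordinary numbers
weight <- c(60.5, 72.1, 80.3)
print(class(weight))    # "numeric"

# A list with named elements of different types
person <- list(name = "Ana", age = 30, scores = c(90, 85))
by.name     <- person$age     # index by name
by.position <- person[[2]]    # index by position
```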
