dwivedishashwat@gmail.com
Background
R is a free software environment for statistical computing and graphics
Very active and vibrant user community
Graphical capabilities
Objects are held in physical memory
Base R and around 4000 packages
1/21/2014
Introduction
memory.limit(): Reports the maximum amount of memory available to R (Windows only)
memory.size(): Reports how much memory is in use (Windows only)
getwd(): Shows the path of your current working directory
setwd(path): Sets a new path for your current working directory
dir(): Lists all the files in your working directory
Program Editor (open, load, run, save)
ls(): Lists all objects in your workspace
rm(): Removes objects from your workspace
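The workspace functions above can be tried together in a short session. This is only a sketch: the object names demo_a and demo_b are made up for illustration, and the memory functions are left out because they are Windows-specific.

```r
# Inspect the working directory and the workspace.
wd <- getwd()          # path of the current working directory (character string)
files <- dir()         # names of the files in that directory

# Create two throwaway objects (hypothetical names), then remove one of them.
demo_a <- 1:3
demo_b <- "text"
objs_before <- ls()    # contains "demo_a" and "demo_b"
rm(demo_a)             # remove a single object by name
objs_after <- ls()     # "demo_a" is gone, "demo_b" is still there
```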
Introduction
Commands to R are expressions (4/3) or assignments (x <- 4/3)
R is case sensitive
Everything in R is an object
R objects are normally accessed by their names, which are made up of letters, digits (0 to 9) and periods (.), and cannot start with a digit
Every object has a class
R has 5 basic classes of objects:
character
numeric
integer
complex
logical
Ex.
x <- c(0.5, 0.6) # numeric
x <- c(TRUE, FALSE) # logical
x <- c(T, F) # logical
x <- c("a", "b", "c", "d") # character
x <- 1:20 # integer
x <- c(1+0i, 2+4i) # complex
seq(from=1, to=10, by=1)
rep(c(1,2,3,4,5), times=2, each=2)
x <- 1 # assignment
print(x) # explicit printing
x # auto printing
When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class.
Q. What is the class of each of these vectors?
x <- c(1.7, "a")
x <- c(TRUE, 2)
x <- c("a", TRUE)
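The coercion rule can be checked directly with class(). A minimal sketch for the three mixed vectors (the names x1, x2 and x3 are only for illustration):

```r
# Mixing classes in c() coerces every element to the most general class present.
x1 <- c(1.7, "a")    # number + character -> character
x2 <- c(TRUE, 2)     # logical + number   -> numeric (TRUE becomes 1)
x3 <- c("a", TRUE)   # character + logical -> character (TRUE becomes "TRUE")

class(x1)
class(x2)
class(x3)
```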
Session 1 Remaining
rm(list=setdiff(ls(), "test")) # remove every object except the one named test
rm(list=ls()) # remove all objects
Introduction
Ex.
x <- 0:6
x <- c("a", "b", "c")
x <- c(1, 2, 3)
Data Types
R objects can have attributes (attributes()):
class (class())
length (length())
names, dimnames (rownames, colnames for a matrix)
dimensions (dim())
other user-defined attributes
Lists: Special type of vectors which can contain objects of different classes.
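A small sketch of a list holding objects of different classes; unlike c(), list() does not coerce its elements:

```r
# Each element keeps its own class inside a list.
x <- list(1, "a", TRUE, 1 + 4i)
element_classes <- sapply(x, class)
element_classes
length(x)   # number of elements in the list
```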
Data Types
Matrix: vectors with dimension attribute. Dimension itself is an integer vector of length 2 (nrow, ncol). Matrices are constructed column wise.
m <- matrix(nrow=2, ncol=3)
m <- matrix(1:6, nrow=2, ncol=3)
x <- 1:3
y <- 10:12
cbind(x, y)
rbind(x, y)
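To make the column-wise construction and the dimension attribute concrete, here is a short sketch. The variable names v, cb and rb are only for illustration.

```r
# matrix() fills column by column: the first column is 1, 2, the second 3, 4, ...
m <- matrix(1:6, nrow = 2, ncol = 3)
first_col <- m[, 1]
dim(m)                      # integer vector of length 2: rows, columns

# A plain vector becomes a matrix once it is given a dim attribute.
v <- 1:6
dim(v) <- c(2, 3)

# cbind() stacks vectors as columns, rbind() stacks them as rows.
x <- 1:3
y <- 10:12
cb <- cbind(x, y)           # 3 rows, 2 columns
rb <- rbind(x, y)           # 2 rows, 3 columns
```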
Data frames (data.frame())
https://stat.ethz.ch/pipermail/rhelp/attachments/20101027/05a229bb/attachment.pl
Factors: Used for categorical data, e.g. Male and Female, or analyst, senior analyst, manager.
x <- factor(c("a", "b", "b", "c", "c", "c", "d"))
levels(x)
unclass(x)
levels(x[4:6])
levels(x[4:6, drop=TRUE])
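As an illustration of a factor inside a data frame, the sketch below uses the job titles mentioned above. The data frame staff and its contents are invented for this example.

```r
# A small data frame with one character column and one factor column.
staff <- data.frame(
  name = c("A", "B", "C", "D"),
  role = factor(c("analyst", "manager", "analyst", "senior analyst"),
                levels = c("analyst", "senior analyst", "manager")),
  stringsAsFactors = FALSE
)

levels(staff$role)        # the categories, in the order given above
as.integer(staff$role)    # the underlying integer codes
table(staff$role)         # counts per category
```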
Sub-setting
[ always returns an object of the same class as the original; it can be used to select more than one element
[[ is used to extract elements of a list or data frame; it can only extract a single element, and the class of the returned object will not necessarily be a list or data frame
$ is used to extract elements of a list or data frame by name; its semantics are similar to [[
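The difference between the three operators shows up most clearly on a list. A minimal sketch, with a made-up list lst:

```r
lst <- list(id = 1:3, label = "a")

a <- lst["id"]      # single bracket: still a list, of length 1
b <- lst[["id"]]    # double bracket: the element itself, an integer vector
d <- lst$label      # dollar: the element called "label"

class(a)
class(b)
```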
Operators
Some Examples
x <- c("a", "b", "c", "c", "d", "a")
x[1], x[1:4], x[x > "a"], u <- x > "a"
x <- matrix(1:6,2,3)
x[1,2], x[1,], x[,1], x[1,2, drop=FALSE]
dput()
save()
serialize()
write.table()
x: the object to be written, preferably a matrix or a data frame
file: path and name of the file to be created
sep: a string indicating how the columns are separated
row.names, col.names: logicals indicating whether the row names or column names are written along with x
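A short sketch of write.table() with the arguments listed above. The data frame df is invented, the file goes to a temporary path, and read.table() is used only to confirm what was written.

```r
# Hypothetical data to write out.
df <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

# Write a comma-separated file with a header row and no row names.
out <- tempfile(fileext = ".csv")
write.table(df, file = out, sep = ",", row.names = FALSE, col.names = TRUE)

# Read the file back to check the round trip, then delete it.
back <- read.table(out, header = TRUE, sep = ",")
unlink(out)
```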
pretty(x, n): Computes a sequence of about n+1 equally spaced round values which cover the range of the values in x, e.g. pretty(x, 100).
substr(x, start, stop) <- value: Extracts or replaces substrings in a character vector.
strsplit(): Splits the elements of a character vector x into substrings according to the matches to the substring split within them.
rank(): Returns the sample ranks of the values in a vector. Ties (i.e., equal values) and missing values can be handled in several ways.
aggregate(): Splits the data into subsets, computes summary statistics for each, and returns the result in a convenient form.
ddply() (plyr package): For each subset of a data frame, applies a function, then combines the results into a data frame.
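The base R helpers above can be tried on small inputs. This is a sketch with made-up strings and numbers; ddply() is left out because it needs the plyr package to be installed.

```r
# Round, equally spaced break points covering the range 1 to 97.
breaks <- pretty(c(1, 97), n = 5)

# Extract and replace substrings.
s <- "statistics"
first4 <- substr(s, 1, 4)      # "stat"
substr(s, 1, 1) <- "S"         # s is now "Statistics"

# Split a string on commas; the result is a list with one element per input string.
parts <- strsplit("a,b,c", split = ",")[[1]]

# Ranks with ties averaged (the default): the two 20s share ranks 2 and 3.
r <- rank(c(10, 30, 20, 20))

# Mean Ozone per Month; rows with a missing Ozone value are dropped by default.
agg <- aggregate(Ozone ~ Month, data = airquality, FUN = mean)
```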
Control Structures
Allows you to control the flow of execution of the program
if, else (testing a condition)
if (condition) {do something} else if (another condition) {do something different} else {do something different}
Create a vector with all integers from 1 to 1000 and replace all even numbers by their inverse
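One possible way to approach the exercise, assuming "inverse" means the reciprocal 1/x. The loop shows if/else explicitly; the vectorised ifelse() line is an alternative that gives the same result.

```r
v <- 1:1000
result <- as.numeric(v)          # numeric copy, so that fractions can be stored

for (i in seq_along(v)) {
  if (v[i] %% 2 == 0) {
    result[i] <- 1 / v[i]        # even number: replace by its reciprocal
  } else {
    result[i] <- v[i]            # odd number: keep as is
  }
}

# Vectorised alternative without an explicit loop.
result2 <- ifelse(v %% 2 == 0, 1 / v, v)
```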
Loop Functions
lapply: Returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X
lapply(airquality, mean)
Calculate sum of all the variables of the airquality dataset excluding NAs
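A sketch of the lapply call and of one way to handle the exercise. Extra arguments such as na.rm = TRUE are passed through lapply to the function being applied.

```r
# Columns with missing values (Ozone, Solar.R) give NA for the plain mean.
means <- lapply(airquality, mean)

# Sum of every variable, ignoring NAs; the result is a list with one entry per column.
sums <- lapply(airquality, sum, na.rm = TRUE)
```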
sapply: sapply is a user-friendly version of lapply, by default returning a vector or matrix if appropriate
sapply(airquality, mean)
Repeat the lapply exercise using sapply and compare the results
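The same column sums computed with sapply, as a sketch. Instead of a list, the result is simplified to a named numeric vector.

```r
# One sum per column, NAs ignored, returned as a named vector.
s <- sapply(airquality, sum, na.rm = TRUE)
s
names(s)
```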
apply: Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix
apply(airquality, 1, sum)
Calculate deciles, including min and max, of all the variables of the dataset airquality, excluding NAs
Calculate the square of each element of a matrix with dimensions 10 & 2 and entries 1 to 20
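A possible sketch for the two apply exercises. The second argument selects the margin: 1 for rows, 2 for columns, and c(1, 2) for every cell.

```r
# Deciles (0%, 10%, ..., 100%) of every column, ignoring NAs.
# The 0% and 100% rows are the minimum and the maximum.
deciles <- apply(airquality, 2, quantile,
                 probs = seq(0, 1, by = 0.1), na.rm = TRUE)

# Square every element of a 10 x 2 matrix holding 1 to 20.
m <- matrix(1:20, nrow = 10, ncol = 2)
sq <- apply(m, c(1, 2), function(v) v^2)
```

For simple element-wise arithmetic like this, m^2 gives the same result without apply.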
Loop Functions
tapply: Apply a function to each cell of a ragged array, that is to each (non-empty) group of values given by a unique combination of the levels of certain factors
tapply(airquality$Ozone, airquality$Month, sum)
Calculate the sum of the Ozone variable for observations with month equal to 5
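A sketch of the tapply call with NAs excluded, together with one way to get the May total. Without na.rm = TRUE, any month containing a missing Ozone value would give NA.

```r
# Sum of Ozone within each month; the result is named by month ("5" to "9").
by_month <- tapply(airquality$Ozone, airquality$Month, sum, na.rm = TRUE)

# The value for month 5, picked out of the grouped result ...
may_from_tapply <- as.numeric(by_month["5"])

# ... and the same total computed directly by subsetting.
may_direct <- sum(airquality$Ozone[airquality$Month == 5], na.rm = TRUE)
```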
mapply: mapply is a multivariate version of sapply. mapply applies FUN to the first elements of each argument, the second elements, the third elements, and so on
mapply(rep, 1:4, 4:1)
Calculate the sum of two lists with dimensions 10 & 2 and having entries 1 to 20, 101 to 120, 201 to 220 & 301 to 320
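A sketch of mapply at work. The exercise wording leaves the exact shape of the lists open, so the second part is only one possible reading: two lists of two vectors each, added element by element. The names a, b and pair_sums are made up.

```r
# rep(1, 4), rep(2, 3), rep(3, 2), rep(4, 1), returned as a list.
reps <- mapply(rep, 1:4, 4:1)

# One reading of the exercise: add corresponding elements of two lists.
a <- list(1:20, 101:120)
b <- list(201:220, 301:320)
pair_sums <- mapply(function(u, v) u + v, a, b, SIMPLIFY = FALSE)
```

SIMPLIFY = FALSE keeps the result as a list; with the default, mapply would bind the two sums into a 20 x 2 matrix.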
Plotting Functions
plot(x,y)
hist(x)
par()
pch: plotting symbol
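A small sketch combining the three plotting functions, using two columns of the built-in airquality data. The axis labels and titles are chosen for this example only.

```r
x <- airquality$Wind
y <- airquality$Temp

# par() changes graphical settings; here, two plots side by side.
old_par <- par(mfrow = c(1, 2))

# Scatter plot with filled circles as the plotting symbol (pch = 19).
plot(x, y, pch = 19, xlab = "Wind", ylab = "Temp", main = "Temp vs Wind")

# Histogram; hist() also returns the bin counts invisibly.
h <- hist(x, xlab = "Wind", main = "Distribution of Wind")

# Restore the previous graphical settings.
par(old_par)
```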
Functions
function ()
Argument matching order: exact match > partial match > positional match
The return value of a function is the last expression in the function body to be evaluated
Functions can be nested, so that a function can be defined inside another function
Functions can be passed as arguments to other functions
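The points above can be illustrated with a few small functions. The names power, apply_twice and make_adder are invented for this sketch.

```r
# The last evaluated expression is the return value; exponent has a default.
power <- function(base, exponent = 2) {
  base^exponent
}

p1 <- power(3)                       # positional match, default exponent: 9
p2 <- power(exponent = 3, base = 2)  # exact match by name, any order: 8
p3 <- power(2, exp = 3)              # partial match of "exponent": 8

# A function passed as an argument to another function.
apply_twice <- function(f, x) f(f(x))
p4 <- apply_twice(power, 2)          # power(power(2)) = 16

# A function defined inside another function and returned from it.
make_adder <- function(n) {
  function(x) x + n
}
add5 <- make_adder(5)
p5 <- add5(10)                       # 15
```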
Debugging
debug: flags a function for debug mode which allows you to step through execution of a function one line at a time
browser: suspends the execution of a function whenever it is called and puts the function in debug mode
trace: allows you to insert debugging code into a function at specific places
recover: allows you to modify the error behavior so that you can browse the function call stack
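The debugging tools are interactive, so the sketch below only shows how a function is flagged and unflagged. The function add_one is made up; calling it while the flag is set would open the line-by-line browser.

```r
add_one <- function(x) {
  x + 1
}

debug(add_one)                  # flag the function for debug mode
flagged <- isdebugged(add_one)  # TRUE while the flag is set
undebug(add_one)                # remove the flag again

result <- add_one(1)            # runs normally now that the flag is removed
```

In an interactive session, options(error = recover) is the usual way to switch on the recover behaviour described above.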