
R U Ready?

Divide and win


Group Red

Sravika

Radhika Reddy

Mamatha
Group Green

Sudharsan
Reddy Yettapu

Swapna C
Group Blue

Vinita

Madhusudhan

Naveen
So why R?
It's free: yes, we don't have to pay for a licence
It has a robust community and is used by academics and researchers
around the world
Graphics and data visualization: lattice, ggplot2 (we will learn
these later)
Basics of data structures in R
Vector - The fundamental data structure in R. All the elements in
a vector must be of the same type:
Integer (numbers without decimals)
Numeric (numbers with decimals)
Character (text data)
Logical (TRUE or FALSE values)
Factor - A factor is a special case of a vector that is used solely for
representing nominal variables
List - A special type of vector. Unlike a vector, which requires all
elements to be the same type, a list allows different types of
values to be collected
Data frame - A structure analogous to a spreadsheet or database,
since it has both rows and columns of data
Matrix - A matrix is a data structure that represents a two-dimensional
table, with rows and columns of data. R matrices can
contain any single type of data
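The structures listed above can be sketched in a few lines. All values below are made up for illustration, and `class()` is used only to show which type R assigns to each object.

```r
# Illustrative values only; none of these come from the course data.
v_int <- c(1L, 2L, 3L)          # integer vector (the L suffix forces integer)
v_num <- c(98.1, 98.6, 101.4)   # numeric vector
v_chr <- c("a", "b", "c")       # character vector
v_lgl <- c(TRUE, FALSE, TRUE)   # logical vector

f <- factor(c("low", "high", "low"))           # factor for a nominal variable
l <- list(name = "a", value = 1.5, ok = TRUE)  # list mixing several types
m <- matrix(1:6, nrow = 2)                     # 2 x 3 matrix of a single type
d <- data.frame(id = v_int, score = v_num)     # data frame: rows and columns

types <- c(class(v_int), class(v_num), class(v_chr), class(v_lgl),
           class(f), class(l), class(d))
print(types)
print(is.matrix(m))
```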
Vector
paitent_name <- c("Abhishek", "Venu", "Naveen")
Character vector
temperature <- c(98.1, 98.6, 101.4)
What type of vector is it? Numeric vector
flu_status <- c(FALSE, FALSE, TRUE)
What type of vector is it? Logical vector

temperature[2]
[1] 98.6
temperature[-2]
[1] 98.1 101.4
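Beyond a single position or a negative index, a vector can also be indexed by a range, by a logical vector, or by a condition. A minimal sketch, reusing the temperature values from above (the names `first_two`, `by_flag` and `fever` are only illustrative):

```r
temperature <- c(98.1, 98.6, 101.4)

first_two <- temperature[1:2]                   # a range of positions
by_flag   <- temperature[c(TRUE, FALSE, TRUE)]  # TRUE keeps, FALSE drops
fever     <- temperature[temperature > 100]     # keep values meeting a condition

print(first_two)
print(by_flag)
print(fever)
```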
Factors
An advantage of using factors is that they are generally more
efficient than character vectors, because the category labels
are stored only once. Rather than storing MALE, MALE,
FEMALE, the computer may store 1, 1, 2, which can save
memory.
gender <- factor(c("MALE", "FEMALE", "MALE"))
gender
[1] MALE FEMALE MALE
Levels: FEMALE MALE
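To see the integer codes behind a factor, `levels()` and `as.integer()` can be used. This sketch reuses the gender example; the second factor, with an explicit `levels` argument, is an added illustration of how the level order controls which code each label gets.

```r
gender <- factor(c("MALE", "FEMALE", "MALE"))

lv    <- levels(gender)      # labels are stored once, sorted alphabetically
codes <- as.integer(gender)  # each value is stored as an index into the levels

# Setting the level order explicitly changes the codes, not the data.
gender2 <- factor(c("MALE", "FEMALE", "MALE"), levels = c("MALE", "FEMALE"))
codes2  <- as.integer(gender2)

print(lv)
print(codes)
print(codes2)
```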
List
Unlike a vector, which requires all elements to be the same type, a list allows different types of
values to be collected. Due to this flexibility, lists are often used to store various types of input
and output data.
subject1 <- list(fullname = paitent_name[1], temperature = temperature[1], flu_status =
flu_status[1], gender = gender[1])
subject1
$fullname
[1] "Abhishek"
$temperature
[1] 98.1
$flu_status
[1] FALSE
$gender
[1] MALE
Levels: FEMALE MALE
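Once a list exists, its elements can be pulled out by name. A small sketch with values typed in directly so it runs on its own (the name `part` is illustrative):

```r
subject1 <- list(fullname = "Abhishek", temperature = 98.1, flu_status = FALSE)

temp <- subject1$temperature                      # $ returns the element itself
name <- subject1[["fullname"]]                    # [[ ]] does the same, by name
part <- subject1[c("temperature", "flu_status")]  # single [ ] returns a smaller list

print(temp)
print(name)
str(part)
```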
Matrix
R matrices can contain any single type of data, although they are
most often used for mathematical operations and therefore
typically store only numeric data.
To create a matrix, simply supply a vector of data to the matrix()
function, along with a parameter specifying the number of rows
(nrow) or number of columns (ncol)
m <- matrix(c('a', 'b', 'c', 'd'), nrow = 2)
m
     [,1] [,2]
[1,] "a"  "c"
[2,] "b"  "d"
Data is filled in column by column (column-major order)
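The fill order can be changed with the `byrow` argument of `matrix()`. The sketch below builds the same four values both ways and picks out single cells and a column; `m_col` and `m_row` are illustrative names.

```r
vals <- c("a", "b", "c", "d")

m_col <- matrix(vals, nrow = 2)                # default: filled column by column
m_row <- matrix(vals, nrow = 2, byrow = TRUE)  # filled row by row instead

cell_col  <- m_col[1, 2]  # row 1, column 2 of the column-filled matrix
cell_row  <- m_row[1, 2]  # row 1, column 2 of the row-filled matrix
first_col <- m_col[, 1]   # leaving the row index empty selects a whole column

print(m_col)
print(m_row)
```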
Data frame
In R terms, a data frame can be understood as a list of vectors
or factors, each having exactly the same number of values.
Because the data frame is literally a list of vectors, it
combines aspects of both vectors and lists
pt_data <- data.frame(paitent_name, temperature, flu_status,
gender, stringsAsFactors = FALSE)
pt_data
  paitent_name temperature flu_status gender
1     Abhishek        98.1      FALSE   MALE
2         Venu        98.6      FALSE FEMALE
3       Naveen       101.4       TRUE   MALE
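Because a data frame is a list of equal-length columns, columns can be extracted like list elements and cells like matrix entries. A minimal sketch that rebuilds a similar data frame from literal values so it runs on its own:

```r
pt_data <- data.frame(
  paitent_name = c("Abhishek", "Venu", "Naveen"),
  temperature  = c(98.1, 98.6, 101.4),
  flu_status   = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

temps    <- pt_data$temperature                       # one column, as a vector
one_cell <- pt_data[1, 2]                             # row 1, column 2
with_flu <- pt_data[pt_data$flu_status, "paitent_name"]  # rows where flu_status is TRUE
two_cols <- pt_data[, c("paitent_name", "flu_status")]   # all rows, two columns

print(with_flu)
print(two_cols)
```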
Some useful functions for data frames
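The slide's own list of functions is not reproduced in this text, so the following is an assumption about what is typically meant: a handful of base R helpers for inspecting a data frame, run on made-up data.

```r
# Hypothetical data, for illustration only.
pt <- data.frame(temperature = c(98.1, 98.6, 101.4),
                 flu_status  = c(FALSE, FALSE, TRUE))

n_rows    <- nrow(pt)     # number of rows
n_cols    <- ncol(pt)     # number of columns
col_names <- names(pt)    # column names
top       <- head(pt, 2)  # first two rows

str(pt)             # compact overview of column types
print(summary(pt))  # per-column summary statistics
```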
How to read/load data in R
read.csv("file name.csv", header = TRUE, stringsAsFactors = FALSE)
read.table("file name", header = TRUE, sep = ",", stringsAsFactors =
FALSE)
Use sep = "," for comma-separated files and sep = "\t" for tab-separated files
read.csv(file.choose(), header = TRUE)   # opens a dialog to select the file
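To try the two readers without an existing file, a small CSV can be written to a temporary location first. The file contents and column names below are invented for the example.

```r
# Write a tiny, made-up CSV to a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("model,year,price",
             "SE,2011,21992",
             "SEL,2010,20995"), tmp)

cars_csv <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
cars_tab <- read.table(tmp, header = TRUE, sep = ",", stringsAsFactors = FALSE)

str(cars_csv)
str(cars_tab)
```

Both calls read the same file; `read.csv()` is essentially `read.table()` with defaults suited to comma-separated data.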
How to save data in R
write.csv(pt_data, file = "pt_data.csv")
save(pt_data, file = "pt_data.RData")
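A round trip shows the difference between the two formats: `write.csv()` produces a text file that is read back with `read.csv()`, while `save()` stores the R object itself and `load()` restores it under its original name. The sketch writes to a temporary directory and uses made-up values; `row.names = FALSE` is added here so the CSV does not gain an extra column of row numbers.

```r
pt_data <- data.frame(temperature = c(98.1, 98.6))  # made-up values

csv_path <- file.path(tempdir(), "pt_data.csv")
rda_path <- file.path(tempdir(), "pt_data.RData")

write.csv(pt_data, file = csv_path, row.names = FALSE)
save(pt_data, file = rda_path)

rm(pt_data)                      # remove the object from memory
load(rda_path)                   # brings pt_data back under the same name
from_csv <- read.csv(csv_path)   # the CSV must be assigned to a name

print(pt_data)
print(from_csv)
```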

Let's start running
rm(list = ls())   # remove objects already in memory (RAM)
getwd()   # show the directory where R is working
setwd("D:/R learning/first class")   # change the working directory
install.packages("ca", dependencies = TRUE)
library(ca)
library(help = ca)   # tells about the package
Help for any function
?read.csv
??read.csv
Google/YouTube



Exploring and understanding data
df<- read.csv (file.choose(),header=T)
str (df)
dim (df)
summary (df)
head (df,10)
summary (df$year)
df1 <- df[which(df$year == 2011), ]
df1 <- subset(df, year == 2011)
quantile(df$price)
quantile(df$price, probs = c(0.01, 0.99))
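The same exploration steps can be tried on a small data frame built by hand. The `year` and `price` values below are invented and only mimic the column names used above.

```r
# Hypothetical stand-in for the course data set.
df <- data.frame(year  = c(2010, 2011, 2011, 2012),
                 price = c(12000, 13500, 14000, 15500))

str(df)
size <- dim(df)  # rows, then columns

# Two equivalent ways of keeping only the 2011 rows.
df1 <- df[which(df$year == 2011), ]
df2 <- subset(df, year == 2011)

med <- unname(quantile(df$price, probs = 0.5))  # the 50% quantile is the median
print(quantile(df$price))
```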








table(df$year)
table(df$model)
table(df$color)
color_percentage <- prop.table(table(df$color)) * 100
install.packages("gmodels")   # for cross-tabulation
library (gmodels)
CrossTable(x = df$model, y = df$transmission)
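The counting steps can be sketched in base R alone. The data are invented, and a two-way `table()` is used here as a plain stand-in for `CrossTable()`, which prints a richer layout but needs the gmodels package.

```r
# Hypothetical values, for illustration only.
color        <- c("Red", "Blue", "Red", "Black", "Red")
model        <- c("SE", "SE", "SEL", "SEL", "SE")
transmission <- c("AUTO", "MANUAL", "AUTO", "AUTO", "AUTO")

counts <- table(color)                # frequency of each colour
pct    <- prop.table(counts) * 100    # the same counts as percentages
xt     <- table(model, transmission)  # two-way cross-tabulation

print(round(pct, 1))
print(xt)
```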
Case Study 1: Brand perception
How is my brand different from the competition?
What do consumers think of my Nokia vis-à-vis Samsung,
Micromax, HTC?
Is my brand differentiated compared to the competition?
So how can you, as an analyst, help the brand manager
solve the above problem?
Data for Correspondence Analysis
rm (list =ls())
df<- read.csv (file.choose(),header=T, row.names="RW")
str (df)
dim (df)
summary (df)
head (df,10)
install.packages("FactoMineR")
library (FactoMineR)
ca2 = CA(df, graph = FALSE)
ca2$eig
head(ca2$col$coord)
head(ca2$row$coord)
plot(ca2)
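To see what the eigenvalues and coordinates mean, correspondence analysis can be worked by hand with a singular value decomposition in base R. This is a sketch of the standard computation, not the FactoMineR implementation, and the brand-by-attribute counts are entirely made up. The resulting values should correspond, up to sign, to the eigenvalues and coordinates a CA routine reports.

```r
# Hypothetical brand-by-attribute counts; not the course data.
N <- matrix(c(20, 10,  5,
              10, 25, 10,
               5, 10, 30,
              15, 15, 10),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("Nokia", "Samsung", "Micromax", "HTC"),
                            c("Reliable", "Stylish", "Affordable")))

P  <- N / sum(N)   # correspondence matrix (proportions)
r  <- rowSums(P)   # row masses
cm <- colSums(P)   # column masses

# Standardised residuals from independence, then their SVD.
S   <- diag(1 / sqrt(r)) %*% (P - r %o% cm) %*% diag(1 / sqrt(cm))
dec <- svd(S)

eig       <- dec$d^2  # principal inertias (eigenvalues)
row_coord <- diag(1 / sqrt(r))  %*% dec$u %*% diag(dec$d)  # row principal coordinates
col_coord <- diag(1 / sqrt(cm)) %*% dec$v %*% diag(dec$d)  # column principal coordinates
rownames(row_coord) <- rownames(N)
rownames(col_coord) <- colnames(N)

total_inertia <- sum(eig)  # equals the chi-square statistic divided by the total count
print(round(eig / total_inertia * 100, 1))  # percentage of inertia per dimension
print(round(row_coord[, 1:2], 3))
```

Brands that end up close together in the first two dimensions have similar attribute profiles, which is the differentiation question the brand manager is asking.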


What next?
Each team needs to work on one case study using these 2
techniques

The assignment should be submitted before Thursday

You need to take the data from Canon Wave 2 / Dentsu / IOL

Next session: 23/05/2014
