R has a robust community and is used by academics and researchers around the world. The vector is the fundamental data structure in R. A factor is a special case of a vector that is used solely for representing nominal variables. A list is a special type of vector that allows values of different types to be collected.
Naveen

So why R?
  It's free: we don't have to pay for a licence.
  It has a robust community and is used by academics and researchers around the world.
  Graphics and data visualization: lattice, ggplot2 (we will learn these later).

Basics of data structures in R
  Vector - the fundamental data structure in R. All the elements in a vector must be of the same type:
    Integer (numbers without decimals)
    Numeric (numbers with decimals)
    Character (text data)
    Logical (TRUE or FALSE values)
  Factor - a special case of a vector that is used solely for representing nominal variables.
  List - a special type of vector. Unlike a vector, which requires all elements to be the same type, a list allows different types of values to be collected.
  Data frame - a structure analogous to a spreadsheet or database, since it has both rows and columns of data.
  Matrix - a data structure that represents a two-dimensional table, with rows and columns of data. R matrices can contain any single type of data.

Vector
    paitent_name <- c("Abhishek", "Venu", "Naveen")
  This is a character vector.
    temperature <- c(98.1, 98.6, 101.4)
  What type of vector is it? A numeric vector.
    flu_status <- c(FALSE, FALSE, TRUE)
  What type of vector is it? A logical vector.
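To check the answers above, class() reports the type of each vector. This is a minimal sketch; the `mixed` vector is a hypothetical addition, not from the slides, showing that R coerces all elements to one common type when types are combined.

```r
# Vectors from the slide above
paitent_name <- c("Abhishek", "Venu", "Naveen")
temperature  <- c(98.1, 98.6, 101.4)
flu_status   <- c(FALSE, FALSE, TRUE)

class(paitent_name)  # "character"
class(temperature)   # "numeric"
class(flu_status)    # "logical"

# Hypothetical example: mixing types forces coercion to a single type
mixed <- c(98.1, "Venu", TRUE)
class(mixed)         # "character"
```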
  Elements are selected by position. A negative index drops that element.
    temperature[2]
    [1] 98.6
    temperature[-2]
    [1]  98.1 101.4

Factors
  An advantage of using factors is that they are generally more efficient than character vectors, because the category labels are stored only once. Rather than storing MALE, MALE, FEMALE, the computer may store 1, 1, 2. This can save memory.
    gender <- factor(c("MALE", "FEMALE", "MALE"))
    gender
    [1] MALE   FEMALE MALE
    Levels: FEMALE MALE

List
  Unlike a vector, which requires all elements to be the same type, a list allows different types of values to be collected. Due to this flexibility, lists are often used to store various types of input and output data.
    subject1 <- list(fullname = paitent_name[1],
                     temperature = temperature[1],
                     flu_status = flu_status[1],
                     gender = gender[1])
    subject1
    $fullname
    [1] "Abhishek"

    $temperature
    [1] 98.1

    $flu_status
    [1] FALSE

    $gender
    [1] MALE
    Levels: FEMALE MALE

Matrix
  R matrices can contain any single type of data, although they are most often used for mathematical operations and therefore typically store only numeric data.
  To create a matrix, simply supply a vector of data to the matrix() function, along with a parameter specifying the number of rows (nrow) or the number of columns (ncol).
    m <- matrix(c('a', 'b', 'c', 'd'), nrow = 2)
    m
         [,1] [,2]
    [1,] "a"  "c"
    [2,] "b"  "d"
  The data is filled in column by column.

Data frame
  In R terms, a data frame can be understood as a list of vectors or factors, each having exactly the same number of values.
  Because the data frame is literally a list of vectors, it combines aspects of both vectors and lists.
    pt_data <- data.frame(paitent_name, temperature, flu_status, gender, stringsAsFactors = FALSE)
    pt_data
      paitent_name temperature flu_status gender
    1     Abhishek        98.1      FALSE   MALE
    2         Venu        98.6      FALSE FEMALE
    3       Naveen       101.4       TRUE   MALE

Some useful functions for data frames

How to read/load data in R
    read.csv("file name.csv", header = TRUE, stringsAsFactors = FALSE)
    read.table("file name", header = TRUE, sep = ",", stringsAsFactors = FALSE)
  Use sep = "," for comma-separated files and sep = "\t" for tab-separated files.
    read.csv(file.choose(), header = TRUE)    # select the file in a dialog

How to save data in R
    write.csv(pt_data, file = "pt_data.csv")
    save(pt_data, file = "pt_data.RData")
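The save and load steps can be tried together as a round trip. This is a minimal sketch under two assumptions that are not in the slides: a temporary file from tempfile() stands in for "pt_data.csv", and row.names = FALSE is passed so the row labels are not written as an extra column.

```r
# Rebuild the example data frame with the same values as the vectors defined earlier
pt_data <- data.frame(
  paitent_name = c("Abhishek", "Venu", "Naveen"),
  temperature  = c(98.1, 98.6, 101.4),
  flu_status   = c(FALSE, FALSE, TRUE),
  gender       = factor(c("MALE", "FEMALE", "MALE")),
  stringsAsFactors = FALSE
)

# Assumption: a temporary file, so nothing is written to the working directory
csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE keeps the 1, 2, 3 row labels out of the file
write.csv(pt_data, file = csv_path, row.names = FALSE)

# Read the file back and inspect its structure
pt_back <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
str(pt_back)

# CSV is plain text, so the factor comes back as a character column
class(pt_data$gender)
class(pt_back$gender)
```

If the factor levels matter, save() and load() with an .RData file keep the column types intact, which a CSV file cannot do.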
Let's start
    rm(list = ls())                        # remove the objects already in memory (RAM)
    getwd()                                # show the path where R operates
    setwd("D:/R learning/first class")     # change the path
    install.packages("ca", dependencies = TRUE)
    library(ca)
    library(help = ca)                     # tells you about the package

Help for any function
    ?read.csv
    ??read.csv
    Google/YouTube
Exploring and understanding data
    df <- read.csv(file.choose(), header = TRUE)
    str(df)
    dim(df)
    summary(df)
    head(df, 10)
    summary(df$year)
    df1 <- df[which(df$year == 2011), ]
    df1 <- subset(df, year == 2011)
    quantile(df$price)
    quantile(df$price, probs = c(0.01, 0.99))
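The filtering and quantile calls above can be tried without a CSV file. This is a minimal sketch on a made-up data frame: only the column names year and price come from the slides, and the values are hypothetical.

```r
# Hypothetical stand-in for the CSV file; the values are made up
df <- data.frame(
  year  = c(2010, 2011, 2011, 2012, 2011, 2009),
  price = c(12000, 15500, 14200, 17900, 13800, 9500)
)

# Two equivalent ways to keep only the 2011 rows
df1 <- df[which(df$year == 2011), ]
df2 <- subset(df, year == 2011)
nrow(df1)   # 3

# Default quantiles: minimum, quartiles and maximum
quantile(df$price)

# 1st and 99th percentiles, often used to spot extreme values
quantile(df$price, probs = c(0.01, 0.99))
```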
    table(df$year)
    table(df$model)
    table(df$color)
    color_percentage <- prop.table(table(df$color)) * 100
    install.packages("gmodels")            # for cross tabulation
    library(gmodels)
    CrossTable(x = df$model, y = df$transmission)

Case Study 1: Brand perception
  How is my brand different from the competition?
  What do consumers think of my Nokia vis-à-vis Samsung, Micromax, HTC?
  Is my brand differentiated compared to the competition?
  So how can you, as an analyst, help the brand manager solve the above problem?

Data for Correspondence Analysis
    rm(list = ls())
    df <- read.csv(file.choose(), header = TRUE, row.names = "RW")
    str(df)
    dim(df)
    summary(df)
    head(df, 10)
    install.packages("FactoMineR")
    library(FactoMineR)
    ca2 <- CA(df, graph = FALSE)
    ca2$eig
    head(ca2$col$coord)
    head(ca2$row$coord)
    plot(ca2)
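What a correspondence analysis computes can be sketched in base R, without FactoMineR. The brand names below come from the case study, but the attributes and counts are made up for illustration. Assuming the standard definition of correspondence analysis, the sketch takes the singular value decomposition of the standardized residuals of the table. The squared singular values are the eigenvalues (principal inertias), which should correspond to the eigenvalue column of ca2$eig, and they sum to the chi-square statistic divided by the total count.

```r
# Hypothetical brand-by-attribute counts (made-up numbers)
N <- matrix(
  c(30, 10, 20,
    15, 25, 10,
    10, 30,  5,
    20, 15, 25),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Nokia", "Samsung", "Micromax", "HTC"),
    c("Durable", "Affordable", "Stylish")
  )
)

P      <- N / sum(N)    # correspondence matrix (relative frequencies)
r_mass <- rowSums(P)    # row masses
c_mass <- colSums(P)    # column masses

# Standardized residuals: (observed - expected) / sqrt(expected)
expected <- outer(r_mass, c_mass)
S <- (P - expected) / sqrt(expected)

dec <- svd(S)
eig <- dec$d^2          # eigenvalues; the last one is numerically zero

# Principal coordinates on the first two dimensions
row_coord <- (dec$u[, 1:2] / sqrt(r_mass)) %*% diag(dec$d[1:2])
col_coord <- (dec$v[, 1:2] / sqrt(c_mass)) %*% diag(dec$d[1:2])
rownames(row_coord) <- rownames(N)
rownames(col_coord) <- colnames(N)

round(eig, 4)
round(row_coord, 3)
round(col_coord, 3)
```

Brands that lie close together in row_coord have similar attribute profiles, which is what the plot(ca2) map shows. The sign of each dimension is arbitrary, so a map may appear mirrored between implementations.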
What next?
Each team needs to work on one case study using these two techniques.
The assignment should be submitted before Thursday.
You need to take the data from Canon Wave 2 / Dentsu / IOL.