Day 4
0900-1000
1000-1040
1040-1100
1100-1300
1300-1400
1400-1500
1500-1540
1540-1600
1600-1700
Details
Plotting Systems
Today focuses mainly on plotting graphs using 3 different systems in R:
{base}
{lattice}
{ggplot2}
Learning Objectives
By the end of today's training, the following should have been covered:
Principles of analytic graphics and making exploratory graphs
Plotting systems in R
The {base} and {lattice} plotting systems in R
Graphics devices in R
Exploratory Graphics
The goal of exploratory graphics is to help you understand the data that is available.
The graphics are usually informal and very plain (not fancy).
Axes and labelling may be optional (but are still highly recommended).
Plotting systems in R
There are 3 common plotting systems available in R:
{base}, which is part of the default R installation. We will use this for the exercise today.
{lattice}, which is loaded using the library(lattice) command.
{ggplot2}, which is loaded using the library(ggplot2) command.
We will explore this in more detail next week.
Plots
Basic Histogram
Basic Boxplot
Basic Scatterplot
Basic Histogram
Arguments highlighted on the plot: breaks, xlab, ylab, xlim, ylim, main
library(datasets)
hist(warpbreaks$breaks,
     breaks = 20,
     xlab = "Breaks",
     main = "Number of Breaks in Yarn during Weaving",
     ylim = c(0, 20))
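Arguments and functions highlighted on the plot: type, lwd, col, abline(fit1)
# Read the population data (expects sg_long_density.txt in the working directory)
sglong <- read.table("sg_long_density.txt", header = TRUE)
plot(sglong$Year,
     sglong$PopSize,
     type = "l",
     col = "red",
     lwd = 3,
     xlab = "Year",
     ylab = "Population Size",
     main = "Population Size around Sg Long 1971-2001")
# Fit a linear trend and overlay it as a dashed line
fit1 <- lm(PopSize ~ Year, data = sglong)
abline(fit1, lty = "dashed")
text(x = 1976, y = 750, labels = "R2=0.896\n P=2615e-15")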
Basic Scatterplot
Arguments and functions highlighted on the plot: legend(), col, pch, cex
library(datasets)
plot(iris$Sepal.Length,
     iris$Petal.Length,
     col = iris$Species,
     pch = 16,
     cex = 0.5,
     xlab = "Sepal Length",
     ylab = "Petal Length",
     main = "Flower Characteristics in Iris")
legend(x = 4.2, y = 7,
       legend = levels(iris$Species), col = c(1:3), pch = 16)
Basic Boxplot
library(datasets)
boxplot(iris$Sepal.Length ~ iris$Species,
        ylab = "Sepal Length",
        xlab = "Species",
        main = "Sepal Length by Species in Iris")
par()
mfrow (a parameter set through par(), not a function of its own)
Exercise
Line types
par() margins
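As a rough illustration of the three topics on this slide, the sketch below sets a 1 x 2 panel layout with mfrow, adjusts the margins with mar, and draws three line types with lty. The margin values and panel contents are assumptions chosen only for demonstration.

```r
# Save the current settings so they can be restored afterwards
old_par <- par(mfrow = c(1, 2),        # 1 row, 2 columns of plots
               mar = c(4, 4, 2, 1))    # margins: bottom, left, top, right (in lines)

# Left panel: a histogram
hist(warpbreaks$breaks, main = "Histogram", xlab = "Breaks")

# Right panel: an empty plot with three line types drawn on it
plot(1:10, 1:10, type = "n", main = "Line types", xlab = "x", ylab = "y")
abline(h = 3, lty = "solid")
abline(h = 5, lty = "dashed")
abline(h = 7, lty = "dotted")

layout_used <- par("mfrow")   # the layout that was active while plotting
par(old_par)                  # restore the previous settings
```

Restoring the saved settings at the end keeps later plots from inheriting the two-panel layout.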
Graphics Device
Computer Screen
NOT: Input (such as Mouse & Keyboard), File System, Network Connection
Bitmap file formats: BMP, JPG / JPEG, TIFF, GIF, PNG
Vector file formats: PS, EPS, SVG
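Sending a plot to a file device usually follows an open, plot, close pattern. The sketch below uses pdf(), a vector device that ships with base R; the file name is a hypothetical example. Other devices in grDevices such as png(), jpeg(), postscript() and svg() follow the same pattern, although some of them depend on how R was built.

```r
# Hypothetical output file in the session's temporary directory
out_file <- file.path(tempdir(), "iris_scatter.pdf")

pdf(out_file, width = 6, height = 4)   # open the file device (size in inches)
plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal Length", ylab = "Petal Length")
dev.off()                              # close the device so the file is written

file.exists(out_file)
```

The file is only complete once dev.off() has been called.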
Exercise - Subsetting
Use the Sandakan data.
For the Rain, Wind, Humidity and Radiation data, we need rows 12 to 2688 and columns 2 to 6.
To clean the data for Rain, convert "trace" values to 0 and also set negative numbers to 0.
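One possible way to approach the subsetting and cleaning steps is sketched below on a small made-up data frame. The column names and values are assumptions, and the real exercise would use rows 12:2688 and columns 2:6 of the Sandakan file instead of the toy indices shown here.

```r
# Hypothetical stand-in for the raw file (the real data has many more rows)
raw <- data.frame(id   = 1:6,
                  Rain = c("0.5", "trace", "-1", "12", "trace", "3.2"),
                  Wind = c(2.1, 3.4, 1.8, 2.9, 3.0, 2.2),
                  stringsAsFactors = FALSE)

# Keep only the rows and columns of interest
# (in the exercise this would be raw[12:2688, 2:6])
sub <- raw[2:5, 2:3]

# "trace" is text, so replace it before converting the column to numeric
sub$Rain[sub$Rain == "trace"] <- "0"
sub$Rain <- as.numeric(sub$Rain)

# Negative rainfall is not meaningful, so set it to 0
sub$Rain[sub$Rain < 0] <- 0
```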
Exercise aggregate()
Note that the Air Pollutant Index (API) file captures data by the hour, while the Rain, Wind, Humidity and Radiation data is by day.
Use aggregate() to compute the mean API per day and save the API dataset by day instead of by hour.
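A minimal sketch of the hourly-to-daily step, using a small invented data frame; the column names Date, Hour and API are assumptions and may differ from those in the actual file.

```r
# Hypothetical hourly readings: three hours on each of two days
api <- data.frame(
  Date = as.Date(rep(c("2014-09-01", "2014-09-02"), each = 3)),
  Hour = rep(1:3, times = 2),
  API  = c(50, 60, 70, 40, 40, 70)
)

# Mean API for each date; the formula reads "API grouped by Date"
api_daily <- aggregate(API ~ Date, data = api, FUN = mean)
api_daily
```

The result has one row per day, which matches the granularity of the weather data.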
Exercise paste()
Note that the date in the Rain, Wind, Humidity and Radiation dataset is stored in three columns. These need to be combined to form a date.
Use paste() with "-" as the separator so that the result is in the format %Y-%m-%d.
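The combination step could look roughly like the sketch below. The column names Year, Month and Day are assumptions; the real dataset may label them differently.

```r
# Hypothetical date parts stored in three separate columns
wx <- data.frame(Year = c(2014, 2014), Month = c(9, 10), Day = c(1, 15))

# Glue the parts together with "-" and convert the text to a Date
date_text <- paste(wx$Year, wx$Month, wx$Day, sep = "-")
wx$Date <- as.Date(date_text, format = "%Y-%m-%d")

format(wx$Date)
```

Converting to the Date class, rather than keeping plain text, lets the column be used directly on a plot axis or as a merge key.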
Exercise hist()
Use the hist() function.
The graph shows that most days (the dataset should have about 549 observations, so approximately 450 out of 549 days) have between 0 and 20 mm of rainfall.
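To check a reading such as "most days fall in the 0-20 mm bin", the object returned by hist() can be inspected. The sketch below uses simulated rainfall, not the Sandakan data, so the counts it produces are only illustrative.

```r
set.seed(42)
# Simulated daily rainfall in mm (an assumption standing in for the real column)
rain <- round(rexp(549, rate = 1 / 10), 1)

# Bins of width 20 mm that cover the whole range of the data
upper <- ceiling(max(rain) / 20) * 20
h <- hist(rain, breaks = seq(0, upper, by = 20),
          xlab = "Rainfall (mm)", main = "Daily Rainfall")

h$counts[1]   # number of days in the first (0-20 mm) bin
```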
Exercise plot()
Use the plot() function.
There is a slight issue with the dataset: the wind measurements appear to have been entered on a different scale around September 2014.
Exercise merge()
Merge both datasets into one using merge().
Use the date as the join key.
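A small sketch of the join, assuming both datasets carry a column called Date; the data frames and values here are invented for illustration.

```r
# Hypothetical daily weather and daily API tables
weather <- data.frame(
  Date = as.Date(c("2014-09-01", "2014-09-02", "2014-09-03")),
  Rain = c(0, 12.5, 3)
)
api_daily <- data.frame(
  Date = as.Date(c("2014-09-02", "2014-09-03", "2014-09-04")),
  API  = c(55, 61, 48)
)

# By default merge() keeps only the dates present in both tables
combined <- merge(weather, api_daily, by = "Date")
combined
```

Passing all = TRUE would instead keep every date from either table and fill the gaps with NA.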
Lattice Function
Lattice functions generally take a formula as their first argument, usually of the form
xyplot(y ~ x | f * g, data)
We use the formula notation here, hence the ~.
On the left of the ~ is the y-axis variable; on the right is the x-axis variable.
f and g are conditioning variables; they are optional.
The * indicates an interaction between two variables.
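The formula above can be tried on a built-in dataset. In this sketch the choice of mtcars and of cyl and am as the two conditioning variables is an assumption made for illustration; each combination of the two gets its own panel.

```r
library(lattice)

# mpg on the y-axis, wt on the x-axis,
# one panel per combination of cylinders and transmission type
p <- xyplot(mpg ~ wt | factor(cyl) * factor(am),
            data = mtcars,
            xlab = "Weight (1000 lbs)",
            ylab = "Miles per gallon")
print(p)
```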
Type of plots
Function          Default Display
histogram()       Histogram
densityplot()     Kernel Density Plot
qqmath()          Theoretical Quantile Plot
qq()              Two-sample Quantile Plot
stripplot()       Stripchart (Comparative 1-D Scatter Plots)
bwplot()          Comparative Box-and-Whisker Plots
dotplot()         Cleveland Dot Plot
barchart()        Bar Plot
xyplot()          Scatter Plot
splom()           Scatter-Plot Matrix
contourplot()     Contour Plot of Surfaces
levelplot()       False Color Level Plot of Surfaces
wireframe()       Three-dimensional Perspective Plot of Surfaces
cloud()           Three-dimensional Scatter Plot
parallelplot()    Parallel Coordinates Plot (named parallel() in older versions of lattice)
library(lattice)
library(datasets)
# mtcars displacement by factor cylinder
histogram(~ disp | factor(cyl), data = mtcars,
          main = "Displacement by Cylinders",
          xlab = "Displacement (cu in)",
          col = "blue")
{lattice} Scatterplot example
library(nlme)
library(lattice)
xyplot(weight ~ Time | Diet, BodyWeight)
# boxplot Sepal.Length by Species
bwplot(Sepal.Length ~ factor(Species), data = iris,
       xlab = "Species",
       col = "red",
       pch = 16,
       main = "Sepal.Length by Species")
{lattice} example
library(nlme)
library(lattice)
p <- xyplot(weight ~ Time | Diet, BodyWeight)
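Unlike {base} plotting functions, lattice functions return an object of class "trellis" instead of drawing immediately, so assigning the result to p shows nothing until the object is printed. The sketch below illustrates this; the title text and the layout passed to update() are assumptions added for demonstration.

```r
library(nlme)      # provides the BodyWeight dataset
library(lattice)

# Nothing is drawn yet: the plot is stored as an object
p <- xyplot(weight ~ Time | Diet, data = BodyWeight)
class(p)

# Printing the object draws it on the current graphics device
print(p)

# A stored plot can be modified later without rebuilding it
p2 <- update(p, main = "Rat body weight by diet", layout = c(3, 1))
print(p2)
```

At the console, typing p alone has the same effect as print(p), but inside functions, loops, or sourced scripts the explicit print() is needed.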
{ggplot2} introduction
install.packages("ggplot2")
library(ggplot2)
head(mtcars)
{ggplot2} Scatterplots
ggplot(data = mtcars, aes(x = hp, y = mpg)) + geom_point()
Modify the content of the aes() to include colours and produce the following graph.
Include the appropriate parameters here (hint: vectors):
+ scale_color_discrete(labels = )
Exercise 3: Re-labelling
labs(color = "Transmission", ? ? )
{ggplot2} Scatterplots
Add different themes to your plots:
theme_bw()
theme_light()
theme_minimal()
theme_classic()
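Themes are added to a plot with +, just like geoms. The sketch below stores the basic scatterplot once and then applies two of the themes listed above; the object names are hypothetical.

```r
library(ggplot2)

# Store the basic scatterplot so that themes can be swapped easily
p <- ggplot(data = mtcars, aes(x = hp, y = mpg)) + geom_point()

p_bw      <- p + theme_bw()        # white background with grey grid lines
p_classic <- p + theme_classic()   # axis lines only, no grid

print(p_bw)
print(p_classic)
```

Because each theme is added to the stored object, the underlying data and mappings stay the same and only the appearance changes.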
{ggplot2} geom_smooth()
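set.seed(1)
time <- rep(1:5, each = 2)
income <- runif(10, 1000, 5000)
# 'type' is not defined on the slide; a two-level group label is assumed here
type <- rep(c("A", "B"), times = 5)
dt <- data.frame(time, income, type)
ggplot(dt, aes(x = time, y = income, colour = type)) +
  geom_smooth()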
GGPLOT2 EXERCISE
data("msleep")