- No class on Thursday (9/18) or Friday (9/19).
- Start thinking about your project for the first half of the semester
- Emphasis on data analysis
- Due 10/24/2014 (Friday)
- Feel free to come to office hours to discuss project ideas
Patrick D. Schloss, PhD (microbialinformatics.github.io)
Department of Microbiology & Immunology
m <- matrix(1:96, nrow=8, ncol=12) # create an 8 x 12 matrix
colnames(m)<-1:12
rownames(m)<-c("A", "B", "C", "D", "E", "F", "G", "H")
dim(m)
nrow(m)
ncol(m)
m[1:5,1:5]
m[1:5,]
m[,1:5]
t(m) # transpose the matrix
1/m # take each value of m and find its reciprocal
m * m # calculate the square of each value in m
m %*% t(m) # performs matrix multiplication
crossprod(m,m) # matrix cross product - same as t(m) %*% m
rowSums(m) # calculate the sum for each row
colSums(m) # calculate the sum for each column
lower.tri(m) # logical matrix that is TRUE for entries below the diagonal
m[lower.tri(m)] # give the lower triangle of m
diag(m) # the values on the diagonal of m
det(m[1:8,1:8]) # the determinant of the square 8 x 8 submatrix of m
apply(m, 1, sum) # get the sum for each row - same as rowSums(m)
apply(m, 2, sum) # get the sum for each column - same as colSums(m)
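The equivalences mentioned in the comments above can be checked directly. This is a minimal sketch that rebuilds the same matrix; the variable names `xp`, `same.crossprod`, `same.rows`, `same.cols`, and `below` are only for illustration.

```r
# Rebuild the 8 x 12 matrix from above (values 1..96, filled column by column)
m <- matrix(1:96, nrow=8, ncol=12)
colnames(m) <- 1:12
rownames(m) <- LETTERS[1:8]

# crossprod(m, m) is shorthand for t(m) %*% m and gives a 12 x 12 matrix
xp <- crossprod(m, m)
same.crossprod <- isTRUE(all.equal(xp, t(m) %*% m))

# apply() over rows (1) or columns (2) matches rowSums() / colSums()
same.rows <- isTRUE(all.equal(apply(m, 1, sum), rowSums(m)))
same.cols <- isTRUE(all.equal(apply(m, 2, sum), colSums(m)))

# lower.tri() returns a logical matrix, which can be used to index m
below <- m[lower.tri(m)]

dim(xp)
c(same.crossprod, same.rows, same.cols)
length(below)
```

For large matrices `rowSums()` and `colSums()` are generally the faster choice; `apply()` is useful when the function is something other than a sum.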
| gene | start | end | strand | length | annotation |
|---|---|---|---|---|---|
| rbcA | num | num | logic | num | character |
| rbcB | | | | | |
| rbcC | | | | | |
| rbcD | | | | | |
| etc. | | | | | |
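The table above describes a data frame: one row per gene, with each column holding a single type. A minimal sketch of how such a table might be built by hand follows. All values are made up for illustration, and the meaning of `TRUE` in `strand` is an assumption.

```r
# Hypothetical values, only to illustrate that each column has its own type
genes <- data.frame(
  start = c(100, 950, 1800),                              # numeric
  end = c(900, 1700, 2500),                               # numeric
  strand = c(TRUE, TRUE, FALSE),                          # logical (assumed: TRUE = forward strand)
  annotation = c("subunit A", "subunit B", "subunit C"),  # character
  row.names = c("rbcA", "rbcB", "rbcC"),
  stringsAsFactors = FALSE
)

# A new numeric column computed from two existing columns
genes$length <- genes$end - genes$start + 1

str(genes)            # compact overview of the columns and their types
sapply(genes, class)  # the class of each column
```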
Be sure to set the correct working directory in RStudio
metadata <- read.table(file="wild.metadata.txt", header=TRUE)
head(metadata) # look at the first lines of table
dim(metadata)
nrow(metadata)
ncol(metadata)
colnames(metadata)
rownames(metadata) # notice a problem here?
summary(metadata) # output a summary of each column in table
Check out the Data section of the Environment tab of RStudio
What problems can you see with this output?
metadata$Age # output column named "Age"
metadata[,"Age"] # output column named "Age"
metadata[,7] # output the 7th column
metadata[,-7] # output everything but the 7th column
metadata["23", ] # output the row named "23" (Group 6_16m33)
metadata[23, ] # output 23rd row (aka Group 6_16m33)
metadata[-23,] # output everything but the 23rd row
rownames(metadata) <- metadata$Group
metadata <- metadata[,-1]
head(metadata)
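The row-name fix above can be tried on a small stand-in table. This sketch writes a made-up three-column file to a temporary location so it runs without `wild.metadata.txt`; the names `toy`, `toy2`, and `toy.file` and all values are hypothetical.

```r
# Write a tiny made-up table to a temporary file so the example is self-contained
toy.file <- tempfile(fileext=".txt")
writeLines(c("Group\tWeight\tSex",
             "g1\t15\tF",
             "g2\t18\tM"), toy.file)

toy <- read.table(file=toy.file, header=TRUE, stringsAsFactors=FALSE)
default.names <- rownames(toy)  # default row names are "1" and "2", not the group names

rownames(toy) <- toy$Group      # use the Group column as row names
toy <- toy[,-1]                 # then drop the now-redundant first column
toy["g2",]                      # rows can now be selected by group name

# Alternative: let read.table() take the row names from the first column
toy2 <- read.table(file=toy.file, header=TRUE, row.names=1, stringsAsFactors=FALSE)
rownames(toy2)
```

Setting `row.names=1` at read time avoids the two-step fix, at the cost of being less explicit about what happened to the `Group` column.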
What do these commands do?
metadata$Weight[1:5]
metadata[1:5,"Weight"]
metadata["5_31m2",]
What's the difference between these commands?
metadata[-23,]
metadata <- metadata[-23,]
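The difference comes down to whether the result is assigned back. A minimal sketch on a made-up data frame (the name `toy` and its values are hypothetical):

```r
# Toy stand-in for metadata; the values are made up
toy <- data.frame(Weight = c(15, 18, 21),
                  Sex = c("F", "M", "F"),
                  row.names = c("a", "b", "c"))

toy[-2,]                 # prints a copy without row 2; toy itself is unchanged
n.before <- nrow(toy)    # still 3

toy <- toy[-2,]          # overwrites toy, so row 2 is gone for good
n.after <- nrow(toy)     # now 2

c(n.before, n.after)
```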
You can make new columns
metadata[,"sequences"] <- rep(NA, nrow(metadata))
metadata[metadata$SP=="PL",]
metadata[metadata$SP=="PL" & metadata$Sex=="M",]
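Row selection like this works because each comparison produces a logical vector with one value per row. A minimal sketch on made-up data follows; the `"XX"` species code and all values are hypothetical.

```r
# Toy stand-in for metadata; "XX" is a placeholder species code
toy <- data.frame(SP = c("PL", "XX", "PL", "PL"),
                  Sex = c("M", "M", "F", "M"),
                  Weight = c(15, 18, 21, 17),
                  stringsAsFactors = FALSE)

is.pl <- toy$SP == "PL"               # one TRUE/FALSE per row
is.pl.male <- is.pl & toy$Sex == "M"  # & combines the two tests row by row

toy[is.pl.male, ]                     # keep only rows where both tests are TRUE
n.pl.male <- sum(is.pl.male)          # TRUE counts as 1, so this counts matching rows
n.pl.male
```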
factor(metadata$ET)
metadata$ET <-factor(metadata$ET)
summary(metadata)
levels(metadata$ET)
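Converting a column to a factor changes how R summarizes it. A minimal sketch, assuming a column of numeric codes that are really labels; the vector `et` and its values are made up.

```r
et <- c(101, 205, 101, 317)   # hypothetical numeric codes used as labels

summary(et)                   # treated as numbers: min, quartiles, mean, max

et.f <- factor(et)            # treat each distinct code as a category
counts <- summary(et.f)       # now a count of observations per level
counts

levels(et.f)                  # the distinct categories, as character strings
nlevels(et.f)                 # how many categories there are
```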
aggregate(metadata$Weight, by=metadata$Sex, mean) # error: 'by' must be a list
aggregate(metadata$Weight, by=list(metadata$Sex), mean)
sex.weight <- aggregate(metadata$Weight, by=list(metadata$Sex), mean)
sex.weight$x
aggregate(metadata$Weight, by=list(metadata$Sex, metadata$SP), mean)
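The `aggregate()` calls above can be traced on a small made-up data frame. In this sketch the name `toy`, the `"XX"` species code, and all values are hypothetical. Naming the list element (`Sex=`) gives the grouping column a readable name instead of `Group.1`, and the formula interface is shown as an alternative way to group by two columns.

```r
# Toy stand-in for metadata; values are made up
toy <- data.frame(Weight = c(10, 20, 30, 40),
                  Sex = c("F", "M", "F", "M"),
                  SP = c("PL", "PL", "XX", "XX"),
                  stringsAsFactors = FALSE)

# Mean weight per sex; 'by' must be a list
sex.weight <- aggregate(toy$Weight, by=list(Sex=toy$Sex), mean)
sex.weight
#   Sex  x
# 1   F 20
# 2   M 30

sex.weight$x   # the means on their own, as a numeric vector

# Formula interface: mean weight for each Sex x SP combination
by.both <- aggregate(Weight ~ Sex + SP, data=toy, FUN=mean)
by.both
```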