- A new homework has been posted and is due on November 22nd
- work with a partner
- no more than one explicit loop
- Will have lab period on Friday
Patrick D. Schloss, PhD (microbialinformatics.github.io)
Department of Microbiology & Immunology
relabund <- matrix(rep(runif(20000)), ncol=20, nrow=100)
relabund[5,] <- c(runif(10,0,0.4), runif(10, 0.3,0.7))
colnames(relabund) <- c(paste0("Lean", 1:10), paste0("Obese", 1:10))
rownames(relabund) <- paste0("Species", 1:100)
treatments <- c(rep("lean", 10), rep("obese", 10))
head(relabund)
## Lean1 Lean2 Lean3 Lean4 Lean5 Lean6
## Species1 0.7320719 0.61367991 0.44710369 0.93737694 0.1545328 0.7581348
## Species2 0.1587366 0.29431957 0.18656555 0.40873243 0.1751768 0.9519493
## Species3 0.7459371 0.14643946 0.37502507 0.83650445 0.2839269 0.3636480
## Species4 0.2116673 0.03895666 0.66441798 0.09118599 0.2525140 0.8759450
## Species5 0.2533841 0.09087869 0.04734376 0.01431301 0.2551368 0.2670180
## Species6 0.7963445 0.57078249 0.90997548 0.38780057 0.8780423 0.2803217
## Lean7 Lean8 Lean9 Lean10 Obese1 Obese2
## Species1 0.48390810 0.09841650 0.37743814 0.84738340 0.68320363 0.3305247
## Species2 0.10043277 0.05339326 0.64165944 0.39090620 0.61785717 0.2844650
## Species3 0.02560588 0.63525609 0.02907226 0.08486640 0.03923239 0.8935477
## Species4 0.69394501 0.74943690 0.42591778 0.53587751 0.87731287 0.1407579
## Species5 0.12074160 0.12838108 0.08421323 0.08451048 0.67639246 0.3258745
## Species6 0.72716501 0.82467974 0.69858696 0.77484355 0.47065886 0.5433217
## Obese3 Obese4 Obese5 Obese6 Obese7 Obese8
## Species1 0.7718421 0.02277831 0.89896404 0.2127805 0.1474924 0.8126880
## Species2 0.6259197 0.43031531 0.88595164 0.4238480 0.3491638 0.7818655
## Species3 0.7366177 0.16466986 0.38247831 0.3801840 0.9368757 0.5069368
## Species4 0.9165949 0.99351859 0.25953107 0.4532728 0.9596510 0.4002479
## Species5 0.6271704 0.34291623 0.57269919 0.5209073 0.6038838 0.6461251
## Species6 0.8557418 0.04982476 0.02552185 0.8431182 0.5089379 0.3121548
## Obese9 Obese10
## Species1 0.5373861 0.02844547
## Species2 0.3988145 0.45445103
## Species3 0.4414108 0.49137888
## Species4 0.1843113 0.76326338
## Species5 0.3180390 0.52756064
## Species6 0.8503138 0.22655002
Perform a wilcoxon test on each Species differentiating between lean and obese individuals
Write R code, without the use of for
loops that produces the following output:
test <- function(x, design){
p <- wilcox.test(x~design)$p.value
return(p)
}
p.values <- apply(relabund, 1, test, design=treatments)
padj.values <- p.adjust(p.values)
for
loops that produces the following output:
Species5
was significantly different between the two groupsrownames(relabund)[padj.values<0.05]
was significantly different between the two groupsSpecies5
" should be produced by r coderownames(relabund)[padj.values<0.05]
" should be produced by r codeProblem: We typically have large datasets and need a way to get that information into the program. Similarly, we generally have large amounts of output that we can't just send to the screen.
Solution: We have seen a number of functions already for reading in data and in the very first homework you had a function to write out a table. R has a number of solutions to do both
readline
: read a single line from the screen > readline()
ab de fg
[1] "ab de fg"
> readline("enter your name: ")
Pat Schloss
> name <- readline("enter your name: ")
Pat Schloss
> name
[1] "Pat Schloss"
scan
: reads in a vector from a file or keyboard 123
2 4 5
53
scan("file1.txt")
## [1] 123 2 4 5 53
scan
: reads in a vector from a file or keyboard 123
abcd
2 4 5
54
scan("file2.txt")
## Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got 'abcd'
scan("file2.txt", what="")
## [1] "123" "abcd" "2" "4" "5" "54"
scan("file2.txt", what="", sep="\n")
## [1] "123" "abcd" "2 4 5" "54"
> scan("") #will read from the keyboard
>scan("")
1: 12 5 2 3
5: 23 24
6:
Read 6 items
[1] 12 5 2 3 23 24
read.table
name age sex
pat 36 M
john 22 M
susan 55 F
read.table("table.txt", header=TRUE)
## name age sex
## 1 pat 36 M
## 2 john 22 M
## 3 susan 55 F
read.table
read.table("table.txt", header=TRUE, row.names=1)
## age sex
## pat 36 M
## john 22 M
## susan 55 F
data <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/echocardiogram/echocardiogram.data",
header = F)
head(data)
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13
## 1 11 0 71 0 0.260 9 4.600 14 1 1 name 1 0
## 2 19 0 72 0 0.380 6 4.100 14 1.700 0.588 name 1 0
## 3 16 0 55 0 0.260 4 3.420 14 1 1 name 1 0
## 4 57 0 60 0 0.253 12.062 4.603 16 1.450 0.788 name 1 0
## 5 19 1 57 0 0.160 22 5.750 18 2.250 0.571 name 1 0
## 6 26 0 68 0 0.260 5 4.310 12 1 0.857 name 1 0
read.csv
is a special version of read.table
for reading data talbe sthat are separated by commas c <- file("file2.txt", open="r")
readLines(c, n=1)
readLines(c, n=1)
readLines(c, n=1)
readLines(c, n=1)
readLines(c, n=1)
## [1] "123"
## [1] "abcd"
## [1] "2 4 5"
## [1] "54"
## character(0)
close
the file when you're done!close(c)
print
x <- 1:3
print(x^2)
## [1] 1 4 9
cat
cat(x, "abc", x^2)
## 1 2 3 abc 1 4 9
cat(x, "abc", x^2, "\n")
## 1 2 3 abc 1 4 9
\n
is useful to just to be safewrite.table
: structured writing to a filehead(data)
## V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13
## 1 11 0 71 0 0.260 9 4.600 14 1 1 name 1 0
## 2 19 0 72 0 0.380 6 4.100 14 1.700 0.588 name 1 0
## 3 16 0 55 0 0.260 4 3.420 14 1 1 name 1 0
## 4 57 0 60 0 0.253 12.062 4.603 16 1.450 0.788 name 1 0
## 5 19 1 57 0 0.160 22 5.750 18 2.250 0.571 name 1 0
## 6 26 0 68 0 0.260 5 4.310 12 1 0.857 name 1 0
colnames(data) <- LETTERS[1:ncol(data)]
rownames(data) <- 1:nrow(data)
write.table(file="ecg.data", data)
head ecg.data
## "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M"
## "1" "11" "0" "71" 0 "0.260" "9" "4.600" "14" "1" "1" "name" "1" "0"
## "2" "19" "0" "72" 0 "0.380" "6" "4.100" "14" "1.700" "0.588" "name" "1" "0"
## "3" "16" "0" "55" 0 "0.260" "4" "3.420" "14" "1" "1" "name" "1" "0"
## "4" "57" "0" "60" 0 "0.253" "12.062" "4.603" "16" "1.450" "0.788" "name" "1" "0"
## "5" "19" "1" "57" 0 "0.160" "22" "5.750" "18" "2.250" "0.571" "name" "1" "0"
## "6" "26" "0" "68" 0 "0.260" "5" "4.310" "12" "1" "0.857" "name" "1" "0"
## "7" "13" "0" "62" 0 "0.230" "31" "5.430" "22.5" "1.875" "0.857" "name" "1" "0"
## "8" "50" "0" "60" 0 "0.330" "8" "5.250" "14" "1" "1" "name" "1" "0"
## "9" "19" "0" "46" 0 "0.340" "0" "5.090" "16" "1.140" "1.003" "name" "1" "0"
cat
: cat("abc\n", file="u")
cat("def\n", file="u", append=T)
write
: write(x, file="filename", sep= " ")
write(x, file="filename", append=T, sep= "\t")
write(x, "")
## 1 2 3
append=T
prevents the file from being written overfile.info("data")
: tells file size, creation time, etcdir()
: gives the directory contents for the specified dirlist.files()
: same as dir()file.exists()
: tells whether a file exists, duh.getwd()
: tells you where you are in the file treesetwd()
: allows you to change the current directory