- Homework 3 is out
- For the next two Fridays, the first hour of class will be a lecture
- Read Introductory Statistics with R (Chapter 8: Tabular data)
Patrick D. Schloss, PhD (microbialinformatics.github.io)
Department of Microbiology & Immunology
Use the sample command to randomly draw numbers from a range of integers
r<-sample(1:10, 10, replace=T)
r
What should these look like?
hist(r)
stripchart(r, method="jitter")
plot(r)
summary(r)
What happens if we run the following repeatedly?
r<-sample(1:10, 1000, replace=T)
hist(r, xlim=c(0,10), breaks=seq(0,10, 1))
We can "fix" the distribution:
set.seed(1)
r<-sample(1:10, 1000, replace=T)
hist(r, xlim=c(0,10), breaks=seq(0,10, 1))
Being able to set the random seed is an important feature for reproducible research
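A minimal sketch of why the seed matters: resetting it to the same value replays the same pseudo-random draws. The variable names (first, second, third) are only for illustration.

```r
# Same seed, same call: the draws are identical
set.seed(1)
first <- sample(1:10, 1000, replace = TRUE)

set.seed(1)
second <- sample(1:10, 1000, replace = TRUE)

identical(first, second)   # TRUE

# Without resetting the seed, the generator simply continues its stream,
# so a third call gives a different set of draws
third <- sample(1:10, 1000, replace = TRUE)
identical(first, third)
```

Putting one set.seed call at the top of an analysis script is usually enough for someone else to reproduce the same numbers.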
Discrete: Hits in baseball, number of infected mice, number of people
r<-sample(1:10, 1000, replace=T)
plot(r, ylim=c(0,10))
Continuous: Weight, temperature, concentrations
r<-runif(1000, min=1, max=10)
plot(r, ylim=c(0,10))
Notice a difference?
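One way to check the difference without a plot is to count distinct values. This is a rough sketch; the seed and the variable names (discrete, continuous) are arbitrary choices for the illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

discrete <- sample(1:10, 1000, replace = TRUE)
continuous <- runif(1000, min = 1, max = 10)

# Discrete draws land only on whole numbers, so values repeat often
all(discrete == round(discrete))   # TRUE
length(unique(discrete))           # at most 10 distinct values

# Continuous draws can fall anywhere between min and max
all(continuous == round(continuous))
length(unique(continuous))
```

In the plots, this shows up as horizontal bands for the discrete data and a filled-in cloud for the continuous data.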
Flipping a fair coin...
sample(c("H", "T"), 100, replace=T)
rbinom(10, size=1, prob=0.5)
Flipping a cooked coin...
heads <- rbinom(10, size=1, prob=0.8)
sum(heads)
Hall of fame hitter...
hits <- rbinom(5, size=1, prob=0.3)
sum(hits)
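Summing five size=1 draws is one way to count hits. Letting rbinom do the counting with size=5 should give the same distribution. The sketch below compares the two over many simulated games; n.games is a hypothetical number chosen only for the illustration.

```r
set.seed(1)        # arbitrary seed for repeatability
n.games <- 10000   # hypothetical number of simulated 5 at-bat games

# Approach 1: one Bernoulli trial per at-bat, summed within each game
hits.by.atbat <- replicate(n.games, sum(rbinom(5, size = 1, prob = 0.3)))

# Approach 2: one binomial draw per game, with size = number of at-bats
hits.by.game <- rbinom(n.games, size = 5, prob = 0.3)

# Both means should be close to 5 * 0.3 = 1.5
mean(hits.by.atbat)
mean(hits.by.game)
```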
rbinom: random samples
dbinom: probability mass function (the "density" of a discrete distribution)
pbinom: cumulative distribution function
qbinom: inverse cumulative distribution function (quantile function)
breedings <- 1000
npups <- 10
p.males <- 0.5
obs.males <- 2
r <- rbinom(breedings, npups, p.males)
r.hist <- hist(r, plot = FALSE, breaks = seq(-0.5, 10.5, 1))
par(mar = c(5, 5, 0.5, 0.5))
plot(r.hist$density ~ r.hist$mids, type = "h", lwd = 2, xlab = "Number of male mice",
ylab = "Density", xlim = c(0, 10))
arrows(x0 = 2, x1 = 2, y0 = 0.15, y1 = 0.08, lwd = 2, col = "red")
n.two <- sum(r == obs.males)
p.two.empirical <- n.two / breedings
p.two.empirical
## [1] 0.046
dbinom
p.two.R <- dbinom(2, 10, 0.5)
p.two.R
## [1] 0.04395
p.two.empirical - p.two.R
## [1] 0.002055
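The value returned by dbinom can be checked by hand with the binomial formula, choose(n, k) * p^k * (1 - p)^(n - k). A short sketch, where n, k, p and by.hand are names introduced just for this check:

```r
n <- 10    # pups per litter
k <- 2     # observed males
p <- 0.5   # probability that a pup is male

# 45 ways to pick which 2 of the 10 pups are male, each with probability 0.5^10
by.hand <- choose(n, k) * p^k * (1 - p)^(n - k)
by.hand    # 45/1024 = 0.04394531

all.equal(by.hand, dbinom(k, n, p))   # TRUE
```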
plot(r.hist$density ~ r.hist$mids, type = "h", lwd = 2, xlab = "Number of male mice",
ylab = "Density", xlim = c(0, 10))
points(x = 0:10, dbinom(0:10, 10, 0.5), col = "red", lwd = 3, type = "l", lty = 1)
pbinom
n.two.or.fewer <- sum(r <= obs.males)
p.two.or.fewer.empirical <- n.two.or.fewer / breedings
p.two.or.fewer.empirical
## [1] 0.054
p.two.or.fewer.R <- pbinom(2, 10, 0.5)
p.two.or.fewer.R
## [1] 0.05469
p.two.or.fewer.empirical-p.two.or.fewer.R
## [1] -0.0006875
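The cumulative probability is the sum of the individual probabilities for 0, 1 and 2 males. A quick sketch of that relationship (cdf.by.sum and upper.tail are illustrative names):

```r
# P(X <= 2) as a sum of point probabilities: (1 + 10 + 45) / 1024
cdf.by.sum <- sum(dbinom(0:2, 10, 0.5))
cdf.by.sum                                   # 0.0546875

all.equal(cdf.by.sum, pbinom(2, 10, 0.5))    # TRUE

# The complementary probability, P(X > 2), uses lower.tail = FALSE
upper.tail <- pbinom(2, 10, 0.5, lower.tail = FALSE)
upper.tail                                   # 0.9453125
```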
inv.cdf <- qbinom(0.9, 10, 0.5)
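For a discrete distribution, qbinom returns the smallest count whose cumulative probability reaches the requested level. The sketch below recovers the same answer from pbinom; cdf and smallest.k are names introduced for the illustration.

```r
inv.cdf <- qbinom(0.9, 10, 0.5)
inv.cdf                        # 7

# Cumulative probabilities for 0 through 10 males
cdf <- pbinom(0:10, 10, 0.5)

# First position where the CDF reaches 0.9; subtract 1 because position 1 is k = 0
smallest.k <- min(which(cdf >= 0.9)) - 1
smallest.k == inv.cdf          # TRUE

# P(X <= 6) is about 0.83 and P(X <= 7) is about 0.95, so 7 is the first to reach 0.9
pbinom(c(6, 7), 10, 0.5)
```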
dunif, punif, qunif, runif
dnorm, pnorm, qnorm, rnorm
dbinom, pbinom, qbinom, rbinom
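The d/p/q/r prefixes mean the same thing for every distribution. A brief sketch with the normal and uniform families, using their default parameters (mean 0 and sd 1; min 0 and max 1); the variable names are only illustrative.

```r
# q and p are inverses of each other
z <- qnorm(0.975)   # about 1.96
pnorm(z)            # 0.975

# For a uniform on [0, 1], the CDF at x is x and the density is 1 inside the range
punif(0.25)         # 0.25
dunif(0.25)         # 1

# r draws random values; the sample mean should be near the true mean of 0
set.seed(1)         # arbitrary seed for repeatability
x <- rnorm(1000)
mean(x)
```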