The t-test and t-distribution are widely considered to be at the very foundation of statistics. Who would believe they were invented to make great beer better?
-The t-test was invented by William Gosset (‘Student’) at the Guinness Brewery
last updated: 2021-11-02
The t-test is a foundational tool for scientists
Compare mean differences (2 sample)
1-sample difference
Paired sample differences…
“2 sample test”
The main question: did these 2 samples come from populations with different means?
“1 sample test”
The main question: did this 1 sample come from a population with a known mean?
“paired sample test”
Is there a consistent difference between paired sample observations?
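The three questions above map onto three call shapes of R's `t.test()`. The following is a minimal sketch, assuming the built-in `sleep` data (Student's own sleep data, which ships with R); the object names `drug1`, `drug2`, `two_sample`, `one_sample` and `paired`, and the reference value `mu = 0`, are illustrative choices and not part of the original notes.

```r
# 'sleep' ships with base R: extra hours of sleep for 10 patients,
# each measured under two drugs (group 1 and group 2)
data(sleep)
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]

# 2 sample test: do the two samples come from populations with different means?
# (shown for the call shape only; these observations are really paired)
two_sample <- t.test(drug1, drug2)

# 1 sample test: does drug1 come from a population with a known mean?
# mu = 0 (no change in sleep) is an assumed reference value
one_sample <- t.test(drug1, mu = 0)

# paired sample test: is there a consistent within-patient difference?
paired <- t.test(drug1, drug2, paired = TRUE)

# Collect the three p-values side by side
round(c(two_sample = two_sample$p.value,
        one_sample = one_sample$p.value,
        paired     = paired$p.value), 4)
```

On these data the paired test should give a much smaller p-value than the unpaired 2 sample test run on the same numbers, which is why matching the test to the study design matters.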
Formal assumptions
Gaussian residuals (for EACH SAMPLE)
Homoscedasticity (equal variance in each sample)
Independence of observations
Informal assumptions (the ones we are responsible for evaluating)
Gaussian distribution (for EACH SAMPLE)
Homoscedasticity (equal variance; in practice, when the variances differ we account for the difference with math, using Welch's correction instead of the pooled SD)
Independence of observations (if this is not true, perhaps a paired sample test is appropriate)
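One way to evaluate these assumptions in R is sketched below, assuming the built-in `iris` data and two of its species as the two samples. The choice of `shapiro.test()` and `var.test()` as checks is an assumption made for illustration (plots are an equally valid route), and all object names are hypothetical.

```r
data(iris)
# Two samples: sepal length for two of the iris species
setosa     <- iris$Sepal.Length[iris$Species == "setosa"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]

# Gaussian distribution, checked for EACH sample separately
sw_setosa     <- shapiro.test(setosa)
sw_versicolor <- shapiro.test(versicolor)

# Equal variance: F test comparing the two sample variances
f_var <- var.test(setosa, versicolor)

# Pooled-SD test (assumes equal variance) vs Welch test (does not)
pooled <- t.test(setosa, versicolor, var.equal = TRUE)
welch  <- t.test(setosa, versicolor)  # var.equal = FALSE is the default

# p-values of the normality and variance checks
c(setosa = sw_setosa$p.value, versicolor = sw_versicolor$p.value)
f_var$p.value
# Welch's correction lowers the degrees of freedom when variances differ
c(pooled_df = unname(pooled$parameter), welch_df = unname(welch$parameter))
```

Independence cannot be tested this way; it has to be judged from how the data were collected.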
Example of mean human height by sex
Iris data
data(iris)
names(iris)
## [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
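# Keep the first 100 rows (setosa and versicolor), columns Sepal.Length and Species
iris2 <- iris[1:100, c(1, 5)]
# Drop the unused 'virginica' factor level
iris2$Species <- droplevels(iris2$Species)
boxplot(Sepal.Length ~ Species, data = iris2)
# Overlay the raw observations on the boxplot
stripchart(Sepal.Length ~ Species, data = iris2, pch = 16, col = 'red', vertical = TRUE, add = TRUE, method = 'jitter')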
hist(iris2$Sepal.Length, main = 'wrong way to examine distribution')
# One histogram per species, stacked, on a shared x-axis
par(mfrow = c(2, 1))
hist(iris2$Sepal.Length[1:50], xlim = c(4, 7), main = 'setosa')
hist(iris2$Sepal.Length[51:100], xlim = c(4, 7), main = 'versicolor')
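With each sample examined separately, the 2 sample test itself can be run on the same subset. This is a sketch and not necessarily the analysis the notes go on to use; it rebuilds `iris2` so that it runs on its own, and the name `iris_test` is hypothetical.

```r
data(iris)
# Rebuild the two-species subset so this block is self-contained
iris2 <- iris[1:100, c(1, 5)]
iris2$Species <- droplevels(iris2$Species)

# 2 sample t-test of sepal length by species
# (Welch's unequal-variance version is the default in t.test)
iris_test <- t.test(Sepal.Length ~ Species, data = iris2)

iris_test$estimate   # sample mean for each species
iris_test$conf.int   # 95% confidence interval for the difference in means
iris_test$p.value
```

The formula interface keeps the grouping variable explicit, which avoids relying on row positions such as `1:50` and `51:100`.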
Slice out perch