Test for Equal Means

Slide 3: In hypothesis testing for equal means, we test the null hypothesis that the means are all equal (mu1 = mu2 = ... = muk) against the alternative hypothesis that at least one pair of means differs.

Slide 4: Under the null hypothesis, all samples follow a normal distribution with a common mean and variance, so we can combine them into one large sample. We estimate the common mean with the sample mean of the pooled responses and the common variance with the sample variance of the combined sample. Under the null hypothesis, the scaled variance estimator (N-1)s^2/sigma^2 follows a chi-square distribution with N-1 degrees of freedom.

Slide 5: The total sum of squares (SST) decomposes into the error sum of squares (SSE) plus the treatment sum of squares (SSTr): SST = SSE + SSTr. SSE measures the variability within each group, while SSTr measures the variability between the sample means of the groups. This decomposition lets us compare within-group variability to between-group variability.

Slides 6-7: The F statistic is the ratio of the two mean squares: SSTr divided by the treatment degrees of freedom (k-1), over SSE divided by the residual degrees of freedom (N-k). It compares between-group variability to within-group variability. Under the null hypothesis, this ratio follows an F distribution with (k-1) and (N-k) degrees of freedom. If the F value is large, indicating relatively more between-group variability, we reject the null hypothesis. Equivalently, the decision can be based on the p-value, the probability of observing a test statistic at least as extreme as the one calculated.

Slide 8: In the first example, comparing suicide rates by country region, we want to determine whether the mean rates are statistically different.

Slide 9: Using the aov() command in R, we perform an ANOVA test.
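The decomposition and the F statistic described above can be sketched numerically. The following Python snippet uses made-up data (three hypothetical groups standing in for the R examples) to verify SST = SSE + SSTr and compute F = MSTr/MSE; it is an illustration of the formulas, not the lecture's actual R analysis.

```python
import numpy as np

# Hypothetical data: three groups with a numeric response.
groups = [
    np.array([4.1, 5.0, 4.8, 5.3]),
    np.array([6.2, 5.9, 6.8, 6.5]),
    np.array([5.1, 4.9, 5.5, 5.0]),
]

all_obs = np.concatenate(groups)
N, k = all_obs.size, len(groups)
grand_mean = all_obs.mean()

# SST: total variability of all observations around the grand mean.
sst = np.sum((all_obs - grand_mean) ** 2)
# SSE: within-group variability around each group's own mean.
sse = sum(np.sum((g - g.mean()) ** 2) for g in groups)
# SSTr: between-group variability of the group means around the grand mean.
sstr = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

# The decomposition SST = SSE + SSTr holds exactly.
assert np.isclose(sst, sse + sstr)

# F statistic: MSTr / MSE, with (k-1, N-k) degrees of freedom.
f_stat = (sstr / (k - 1)) / (sse / (N - k))
print(f"F = {f_stat:.2f} on ({k - 1}, {N - k}) df")
```

A large F value here would lead us to reject the null hypothesis of equal means, just as described above.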
The ANOVA table reports the degrees of freedom, sums of squares, mean squares, F value, and p-value. In this example, the p-value is close to zero, so we reject the null hypothesis of equal mean suicide rates across regions.

Slide 10: In the second example, comparing typing speeds for different keyboard layouts, we want to determine whether the mean typing times are statistically different.
Slide 11: As in the previous example, we perform an ANOVA test in R, and the ANOVA table provides the necessary information. Here too the p-value is close to zero, so we reject the null hypothesis of equal mean typing speeds. In both examples, the statistical tests provide evidence that the mean values differ significantly, supporting the alternative hypothesis.
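As a rough Python analogue of the R workflow with aov(), the one-way ANOVA F-test is available as scipy.stats.f_oneway. The data below are hypothetical typing times invented for illustration; they are not the lecture's dataset.

```python
from scipy.stats import f_oneway

# Hypothetical typing times (seconds) for three keyboard layouts.
layout1 = [25.1, 26.3, 24.8, 25.9, 26.0]
layout2 = [21.0, 20.5, 21.8, 20.9, 21.3]
layout3 = [25.4, 26.1, 25.0, 25.7, 26.2]

# One-way ANOVA: F statistic and p-value for H0: all means equal.
result = f_oneway(layout1, layout2, layout3)
print(f"F = {result.statistic:.2f}, p = {result.pvalue:.4g}")
```

With layout 2 clearly faster than the other two, the p-value is far below 0.05 and we would reject the null hypothesis of equal means.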
Comparing Pairs of Means

Slide 3: In ANOVA, one goal is to determine which treatment means are larger or smaller. To compare pairs of means, we estimate each difference and use confidence intervals to make statements about it. Each interval is centered at the estimated mean difference, with a width determined by the studentized range distribution, which corrects for simultaneous inference.

Slide 4: We use the critical point of the studentized range distribution (the q critical point) instead of the t-distribution critical point to correct for simultaneous inference. Without the correction, the confidence intervals would be narrower; accounting for multiplicity makes them wider, reflecting the fact that we are making many comparisons at once.

Slide 5: In the first example, comparing suicide rates across country regions, we want to determine which means are statistically different.

Slide 6: Using the TukeyHSD() command in R on the fitted ANOVA model, we obtain pairwise confidence intervals. The output lists each pair of means, the difference in estimated means, the lower and upper bounds of the confidence interval, and the adjusted p-value. An interval that contains zero indicates the two means could plausibly be equal; an interval that lies entirely above or entirely below zero indicates a statistically significant difference.

Slide 7: Based on the output, several pairs of means could be equal, but three pairs have adjusted p-values smaller than 0.05, suggesting statistically significant differences. Correcting for multiplicity is important when the number of comparisons is this large.

Slide 8: In the second example, comparing typing speeds for different keyboard layouts, we want to determine which mean typing speeds are statistically different.
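A Python counterpart to R's TukeyHSD() is scipy.stats.tukey_hsd (available in SciPy 1.8+). The sketch below applies it to hypothetical typing-time data (invented for illustration, not the lecture's dataset) and prints each pairwise interval and adjusted p-value.

```python
from scipy.stats import tukey_hsd

# Hypothetical typing times (seconds) for three keyboard layouts.
layout1 = [25.1, 26.3, 24.8, 25.9, 26.0]
layout2 = [21.0, 20.5, 21.8, 20.9, 21.3]
layout3 = [25.4, 26.1, 25.0, 25.7, 26.2]

# Tukey's HSD: simultaneous pairwise confidence intervals
# based on the studentized range distribution.
res = tukey_hsd(layout1, layout2, layout3)
ci = res.confidence_interval(confidence_level=0.95)

for i in range(3):
    for j in range(i + 1, 3):
        lo, hi = ci.low[i, j], ci.high[i, j]
        print(f"layout {i + 1} vs {j + 1}: "
              f"[{lo:.2f}, {hi:.2f}], adj. p = {res.pvalue[i, j]:.4g}")
```

An interval containing zero (here, layouts 1 vs 3) suggests those means are plausibly equal, while intervals excluding zero (the pairs involving layout 2) indicate significant differences.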
Slide 9: Using the Tukey method for pairwise comparisons, we find that the confidence interval for layouts 3 and 1 includes zero, indicating their mean typing speeds are plausibly equal. However, the intervals for layouts 1 vs. 2 and 3 vs. 2 exclude zero, showing statistically significant differences. This suggests that layout 2 has a significantly higher typing speed than layouts 1 and 3, while layouts 1 and 3 have similar typing speeds on average.