# STA 9708 LN8.A Two-Sample t-Test 7-22-21-1

Baruch College, STA 9708, Draft 7-22-21

Lecture Notes 8: Two-Sample t-Test

## Section 1. The Two-Sample t-Test using Excel Data Analysis

In the previous lesson we saw that a t-test is used to assess a claimed value of μ for a given population. That t-test was applied to only one population, so it is termed a one-sample t-test. When a hypothesis test is used to compare the population averages of two different populations, we term it a two-sample t-test.

As an example, consider the weight of a U.S. quarter. It is natural to think that all quarters have the same weight, but that is not true. There is variation everywhere. Controlling the variability of the weights of quarters is important because vending machines and change-counters operate by measuring weight. A question that interests me is the extent to which quarters lose weight with age. If they do, this would show up as a change in the population average. For example, it might be that the population of quarters minted in the 1970s has a population average, μ, which is lower than that of quarters minted in the 2010s.

Consider the population consisting of the weights of all quarters made in the decade 1970 to 1979, and a second population consisting of the weights of all quarters made in the decade 2010 to 2019. A reasonable question is whether or not the population average weight of the first population, μ₁, equals that of the second, μ₂. We will apply a two-sample t-test to that problem.

I have a sample of 7 coins from the first population, years 1970-79, and 4 coins from the second population, years 2010-19. That gives us two samples, the first with sample size n₁ = 7 and the second with n₂ = 4. The data is given on the right. The photograph above is of those 11 coins. Originally, the date of a U.S. quarter was on the "heads" side with George Washington; in 1999, the mint started making "State" quarters, and those show the date on the "tails" side.
Therefore, in the photo, I have shown the tails side of the State coins.

In a two-sample t-test, the null hypothesis asserts that the two population averages are equal: H₀: μ₁ = μ₂. The alternate hypothesis asserts that the two population averages are not equal: Hₐ: μ₁ ≠ μ₂.

To perform the test, we start by taking a random sample from population #1 and a random sample from population #2. We then compute the two sample averages, x̄₁ and x̄₂. If the null hypothesis is true, then the sample averages will likely be close in value; that is, if in fact μ₁ equals μ₂, then the difference between x̄₁ and x̄₂ will not be large. Therefore, we will reject the claim that μ₁ = μ₂ only if the difference between x̄₁ and x̄₂ is large.

Here is a summary of the samples. We see that the sample average of the newer coins is higher, but is 201.25 far from 197.43? That, I hope you will see, is a difficult question to answer. It will depend on what we call far! The t-test will define what is "far" for us. It employs sophisticated probabilistic arguments that can only be outlined in an introductory course.

We turn now to the "t-Test: Two-Sample Assuming Equal Variances" function in Excel's Data Analysis ToolPak. Using that function for the coin data is shown below. Excel's raw output is shown below.

t-Test: Two-Sample Assuming Equal Variances

| | Wt1970 | Wt2010 |
|---|---|---|
| Mean | 197.4286 | 201.25 |
| Variance | 4.952381 | 2.25 |
| Observations | 7 | 4 |
| Pooled Variance | 4.051587 | |
| Hypothesized Mean Difference | 0 | |
| df | 9 | |
| t Stat | -3.02898 | |
| P(T<=t) one-tail | 0.007135 | |
| t Critical one-tail | 1.833113 | |
| P(T<=t) two-tail | 0.01427 | |
| t Critical two-tail | 2.262157 | |

The arithmetic is correct, but the terminology employed is confused and confusing, so I have cleaned it up. The cleaned-up output below is somewhat easier to read, and the terminology is correct.

Two-sided, Two-Sample t-test, Equal Var

| | Wt1970 | Wt2010 |
|---|---|---|
| Sample Avg | 197.43 | 201.25 |
| Sample Var | 4.952 | 2.250 |
| Sample Size | 7 | 4 |
| Pooled Sample Var | 4.052 | |
| Null Hyp: mu1-mu2 = | 0 | |
| df | 9 | |
| t-statistic | -3.029 | |
| p-value, one-tail | 0.00714 | |
| t-critical value, 1-tail | 1.833 | |
| p-value, two-tail | 0.01427 | |
| t-critical value, 2-tail | 2.262 | |

Take a minute and check that the sample averages and variances computed earlier agree with those shown here. That is a good way to check against errors.

Textbooks usually show a "formal" presentation of a t-test as below, but this is not how the results are reported in practice or in scientific journals. If I ask that you explain the test as I did at the start of this section, do not offer a cryptic outline like the following.

(1) state the null and alternate hypotheses,
(2) state the alpha level of the test,
(3) state the degrees of freedom and t-critical value,
(4) state the rejection region,
(5) state the t-statistic,
(6) determine if the t-statistic lies in the rejection region,
(7) announce if the null hypothesis is rejected or not rejected.

(1) H₀: μ₁ = μ₂. Hₐ: μ₁ ≠ μ₂.
(2) alpha = 5%;
(3) df = 9, t-critical value is 2.262;
(4) reject the null hypothesis if (a) the t-statistic is less than -2.262, or (b) the t-statistic is greater than 2.262.
(5) the t-statistic is -3.029;
(6) the t-statistic of -3.029 is less than -2.262, so the t-statistic fell in the rejection region.
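As a cross-check on the Excel output, here is a minimal Python sketch that recomputes the pooled sample variance, the t-statistic, and the two-tail p-value from the summary statistics alone. It assumes the standard equal-variance formulas: pooled variance = ((n₁ − 1)·s₁² + (n₂ − 1)·s₂²) / (n₁ + n₂ − 2), and t = (x̄₁ − x̄₂) / sqrt(pooled variance · (1/n₁ + 1/n₂)). The function names are mine and purely illustrative, the t-critical value is copied from Excel rather than computed, and the p-value is approximated by numerical integration of the t density. It is not meant as a replacement for Excel or a statistics library.

```python
import math


def pooled_t_statistic(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, t-statistic, df) for the equal-variance test."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weighted by n - 1.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two sample averages.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    return pooled_var, t_stat, df


def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tail_p_value(t_stat, df, steps=2000):
    """Approximate the two-tail p-value with Simpson's rule (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_0_to_t = total * h / 3
    # By symmetry, the area outside [-|t|, |t|] is 1 minus twice the area on [0, |t|].
    return 1 - 2 * area_0_to_t


# Summary statistics copied from the Excel output for the coin data.
pooled_var, t_stat, df = pooled_t_statistic(197.4286, 4.952381, 7, 201.25, 2.25, 4)
p_two_tail = two_tail_p_value(t_stat, df)

T_CRITICAL = 2.262157  # two-tail critical value for alpha = 5%, df = 9, from Excel
reject_null = abs(t_stat) > T_CRITICAL

print(f"pooled variance = {pooled_var:.3f}")
print(f"t-statistic     = {t_stat:.3f}, df = {df}")
print(f"two-tail p      = {p_two_tail:.5f}")
print("reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```

The printed values should agree with the Excel output to the digits shown there. If they do not, the first thing to check is that the sample averages and variances were copied correctly.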