# Lecture 5B

W5 L2: non-parametric tests
Parametric vs Nonparametric

Parametric distributions: normal distribution (z), t-distribution, chi-square distribution, F-distribution. These distributions are well described by a small set of parameters.

Nonparametric distributions: data that are not well described by one of these distributions, for example:
1. The taste characteristics of five brands of wine, rated on a scale of 1 to 5
2. House prices, student marks
3. Income

For such data the median is often used instead of the mean to describe the data.
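To see why the median is preferred for skewed data such as income, here is a minimal sketch using Python's standard `statistics` module. The income figures are hypothetical and chosen only to include one very large value.

```python
from statistics import mean, median

# Hypothetical household incomes (in thousands); one very large value skews the data.
incomes = [32, 35, 38, 41, 45, 48, 52, 400]

# The mean is pulled upward by the single large income.
print(mean(incomes))    # 86.375

# The median stays close to the typical household.
print(median(incomes))  # 43.0
```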
Household income distribution
Non-parametric tests
1. Mann-Whitney U-test (two independent samples)
2. Wilcoxon signed-rank test (paired samples, or one sample)
Mann-Whitney U-test

The key idea in this test is to rank the measurements in the two samples together and compare the rank numbers.

1. H0: The two samples are from the same population.
2. H1: The two populations have different distributions (two-tailed).
   H1: The distribution of population 1 lies to the left of that for population 2 (left-tailed).
   H1: The distribution of population 1 lies to the right of that for population 2 (right-tailed).

How the ranks behave:
- H0 is true: the ranks from the two samples tend to be similar.
- H0 is not true and population 2 has higher ranks in general: the ranks of the sample from population 1 tend to be smaller than those of the sample from population 2.
- H0 is not true and population 2 has lower ranks in general: the ranks of the sample from population 1 tend to be larger than those of the sample from population 2.
Mann-Whitney U-test procedure
1. The size of sample 1 is $n_1$ and the size of sample 2 is $n_2$.
2. Rank all data ($n_1 + n_2$ values) from small to large.
3. Calculate the sum of ranks for sample 1 as $W_1$, and that for sample 2 as $W_2$.
4. For identical values, the mean of their ranks should be assigned to them. For example, if the 2nd and the 3rd values are identical, each of them is assigned a rank of (2+3)/2 = 2.5.
5. Calculate the test statistic $U = \min(U_1, U_2)$, where $U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - W_1$ and $U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - W_2$.
6. Use the Mann-Whitney U-test table to find the critical value $U_{crit}$. If $U \le U_{crit}$, we can reject $H_0$.
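The procedure above can be sketched in plain Python. This is an illustration under stated assumptions, not a reference implementation: the function names `average_ranks` and `mann_whitney_u` are hypothetical, the toy samples are made up, and the formulas $U_1 = n_1 n_2 + n_1(n_1+1)/2 - W_1$ and $U_2 = n_1 n_2 + n_2(n_2+1)/2 - W_2$ are assumed because they reproduce the values in the worked example of this lecture.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is identical (a tie).
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def mann_whitney_u(sample1, sample2):
    """Return (W1, W2, U1, U2, U) following steps 1 to 5 of the procedure."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank the pooled data
    w1 = sum(ranks[:n1])   # rank sum of sample 1
    w2 = sum(ranks[n1:])   # rank sum of sample 2
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - w2
    return w1, w2, u1, u2, min(u1, u2)


# Toy data (hypothetical); the value 15 appears three times and shares rank 3.
print(mann_whitney_u([12, 15, 15, 20], [15, 22, 25, 30, 31]))
# (12.0, 33.0, 18.0, 2.0, 2.0)
```

The resulting U is then compared with the critical value from the table for the chosen significance level and sample sizes.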
Critical U values (lookup tables: two-tailed and one-tailed)
Example
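Solution

| Sample 1 | Sample 2 | Rank (sample 1) | Rank (sample 2) |
|---|---|---|---|
| 80 | 3 | 12 | 1 |
| 485 | 90 | 30 | 14 |
| 176 | 272 | 21 | 26 |
| 224 | 80 | 23 | 12 |
| 141 | 8 | 18 | 2 |
| 259 | 10 | 25 | 3 |
| 120 | 72 | 17 | 10 |
| 80 | 294 | 12 | 28 |
| 287 | 22 | 27 | 4 |
| 240 | 144 | 24 | 19 |
| 192 | 160 | 22 | 20 |
| 35 | 50 | 5 | 7 |
| 45 | 64 | 6 | 9 |
| | 480 | | 29 |
| | 56 | | 8 |
| | 96 | | 15 |
| | 104 | | 16 |

| Quantity | Value |
|---|---|
| W1 | 242 |
| W2 | 223 |
| n1 | 13 |
| n2 | 17 |
| U1 | 70 |
| U2 | 151 |
| U | 70 |
| alpha | 0.01 |
| U_crit | 49 |

U > U_crit, so we fail to reject H0.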
Mann-Whitney U-test in Python
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html
scipy.stats.mannwhitneyu
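As a hedged sketch, `scipy.stats.mannwhitneyu` can be applied to the two samples from the worked example above. One point to watch: the statistic SciPy returns is tied to the first sample and is not necessarily the smaller of the two U values, so the snippet takes the minimum explicitly before comparing with the table. The variable name `u_small` is introduced here for illustration.

```python
from scipy.stats import mannwhitneyu

# Samples from the worked example (n1 = 13, n2 = 17).
sample1 = [80, 485, 176, 224, 141, 259, 120, 80, 287, 240, 192, 35, 45]
sample2 = [3, 90, 272, 80, 8, 10, 72, 294, 22, 144, 160, 50, 64, 480, 56, 96, 104]

result = mannwhitneyu(sample1, sample2, alternative="two-sided")

# The two U values always sum to n1 * n2, so the smaller one can be recovered
# from whichever value SciPy reports.
n1, n2 = len(sample1), len(sample2)
u_small = min(result.statistic, n1 * n2 - result.statistic)

print(u_small)        # compare with the critical U value from the table
print(result.pvalue)  # alternatively, compare the p-value with alpha
```

Here `u_small` matches the hand-calculated U = 70, and the p-value is above alpha = 0.01, in agreement with the decision not to reject H0.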
Wilcoxon Signed-Rank Test

Used to compare two paired samples, or to compare a single sample with a known value.

1. Null hypothesis: the two samples are from the same population.
2. Alternative hypothesis:
   Two-tailed: the two populations have different distributions.
   One-tailed: the relative frequency distribution for population 1 lies to the right of that for population 2.
3. Calculate the rank sums $T^-$ (negative differences) and $T^+$ (positive differences).
4. Calculate the test statistic $T$:
   Two-tailed: $T = \min(T^+, T^-)$
   One-tailed: $T = T^-$
5. Rejection region: reject $H_0$ if $T \le T_{crit}$.
Calculate the sum of ranks

Procedure
1. Calculate the difference for each pair, $d_i = x_{1i} - x_{2i}$. If $d_i = 0$, the observation is eliminated and the number of pairs $n$ is reduced accordingly.
2. Rank the absolute values of the differences from small to large (1 to $n$). Average the rank numbers for identical absolute differences.
3. Calculate the rank sum for the negative differences and label this as $T^-$.
4. Calculate the rank sum for the positive differences and label this as $T^+$.
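The four steps above can be sketched in plain Python. The function name `signed_rank_sums` and the paired data are hypothetical, and the labels for the two rank sums (negative and positive) follow the usual textbook convention, which is assumed here.

```python
def signed_rank_sums(sample1, sample2):
    """Return (n, T_minus, T_plus) for paired samples."""
    # Step 1: paired differences; pairs with zero difference are dropped.
    diffs = [a - b for a, b in zip(sample1, sample2) if a != b]

    # Step 2: rank the absolute differences, averaging ranks within ties.
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    for value in set(abs_sorted):
        first = abs_sorted.index(value) + 1         # lowest rank in the tied run
        last = first + abs_sorted.count(value) - 1  # highest rank in the tied run
        rank_of[value] = (first + last) / 2

    # Steps 3 and 4: rank sums for negative and positive differences.
    t_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    t_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    return len(diffs), t_minus, t_plus


# Hypothetical paired measurements; the second pair has zero difference.
before = [10, 12, 9, 14, 11, 13, 8]
after = [12, 12, 8, 17, 10, 15, 12]

n, t_minus, t_plus = signed_rank_sums(before, after)
print(n, t_minus, t_plus)    # 6 18.0 3.0
print(min(t_minus, t_plus))  # 3.0, the two-tailed test statistic T
```

As a quick check, the two rank sums always add up to n(n+1)/2, which is 21 for the six remaining pairs.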
Critical T value
Example question
Wilcoxon test in Python
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html?highlight=wilcox#scipy.stats.wilcoxon
scipy.stats.wilcoxon
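A minimal sketch of calling `scipy.stats.wilcoxon` on paired data follows. The measurements are hypothetical and chosen so that there are no zero differences and no tied absolute differences. With `alternative="two-sided"`, the returned statistic is the smaller of the two rank sums; for one-sided alternatives SciPy documents a different convention, so the documentation linked above is worth checking before comparing against a table.

```python
from scipy.stats import wilcoxon

# Hypothetical paired measurements (same subjects measured twice).
before = [20, 25, 30, 35, 40, 45, 50, 55]
after = [21, 23, 33, 31, 45, 39, 57, 47]

# zero_method="wilcox" discards zero differences, as in the hand procedure.
result = wilcoxon(before, after, zero_method="wilcox", alternative="two-sided")

print(result.statistic)  # smaller of the two rank sums (T in the lecture notation)
print(result.pvalue)     # compare with the chosen significance level
```

For this data the differences are -1, 2, -3, 4, -5, 6, -7, 8, so the negative rank sum is 16 and the positive rank sum is 20, and the reported statistic should equal 16.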