# Lecture 33 Worksheet

Chrysafis Vogiatzis

Every worksheet will work as follows.

1. You will be asked to form a group with other students in the class: you can make this as big or as small as you'd like, but groups of 4-5 work best.
2. Read through the worksheet, discussing any questions with the other members of your group. You can call me at any time for help! I will also be interrupting you for general guidance and announcements at random points during the class time.
3. Answer each question (preferably in the order provided) to the best of your knowledge.
4. While collaboration between students is highly encouraged and expected, each student has to submit their own version.
5. You will have 24 hours (see gradescope) to submit your work.

## Activity 1: Respiratory function

In this first set of problems, we tackle linear regression before finally seeing how a quadratic version works.

The following table contains information about respiratory function (as measured by forced expiratory volume) and smoking. The dataset contains information about age ($x_1$; note that all subjects are between 13 and 19 years old), height ($x_2$), and whether they smoke (1) or not (0) ($x_3$). The last column, called FEV, measures forced expiratory volume and is our dependent variable ($y$).

| ID | Age ($x_1$) | Height ($x_2$) | Smoking ($x_3$) | FEV ($y$) |
|----|-------------|----------------|-----------------|-----------|
| 1  | 13 | 67   | 1 | 3.994 |
| 2  | 13 | 61   | 0 | 3.208 |
| 3  | 14 | 64.5 | 0 | 2.997 |
| 4  | 14 | 72.5 | 1 | 4.271 |
| 5  | 16 | 72   | 1 | 4.872 |
| 6  | 16 | 63   | 0 | 2.795 |
| 7  | 19 | 72   | 1 | 5.102 |
| 8  | 19 | 66   | 0 | 3.519 |
| 9  | 18 | 60   | 0 | 2.853 |
| 10 | 17 | 70.5 | 1 | 4.724 |
| 11 | 16 | 69.5 | 1 | 4.070 |
### Problem 1: Simple linear regression

First perform a linear regression between height ($x_2$) and FEV ($y$). What is the adjusted $R^2$ score?

Answer to Problem 1.

$$\hat{y} = -7.43 + 0.1682\,x_2$$

$$SS_E = 1.124, \quad SS_R = 5.854, \quad SS_T = 6.978$$

$$R^2_{\text{adj}} = 1 - \frac{SS_E/9}{SS_T/10} = 0.821$$

### Problem 2: Simple quadratic regression

Does the adjusted $R^2$ score improve if we perform a regression on height squared ($x_2^2$) and FEV ($y$)?

Note: recall what we saw in the notes: create a new column with only $x_2^2$ values and use that one!

Answer to Problem 2.

$$x_2^2 = \begin{bmatrix} 4489 & 3721 & \cdots & 4830.25 \end{bmatrix}^T$$

$$\hat{y} = -1.8741 + 0.0013\,x_2^2$$

$$SS_E = 1.085, \quad SS_R = 5.894, \quad SS_T = 6.979$$

$$R^2_{\text{adj}} = 0.827$$

The adjusted $R^2$ improves slightly, from 0.821 to 0.827.
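The two simple regressions in Problems 1 and 2 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `simple_regression` and the dictionary keys are illustrative choices, not part of the worksheet. It fits $y$ on $x_2$ and then on the new column $x_2^2$, using $n - 2 = 9$ error degrees of freedom and $n - 1 = 10$ total degrees of freedom for the adjusted $R^2$.

```python
# Data from the worksheet table (11 subjects).
height = [67, 61, 64.5, 72.5, 72, 63, 72, 66, 60, 70.5, 69.5]
fev = [3.994, 3.208, 2.997, 4.271, 4.872, 2.795, 5.102, 3.519, 2.853, 4.724, 4.070]


def simple_regression(x, y):
    """Least-squares fit y ~ b0 + b1*x; returns coefficients and sums of squares."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    ss_t = sum((yi - y_mean) ** 2 for yi in y)
    ss_e = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # One predictor: error df = n - 2, total df = n - 1.
    r2_adj = 1 - (ss_e / (n - 2)) / (ss_t / (n - 1))
    return {"b0": b0, "b1": b1, "SSe": ss_e, "SSr": ss_t - ss_e,
            "SSt": ss_t, "R2_adj": r2_adj}


# Problem 1: regress FEV on height.
linear = simple_regression(height, fev)
# Problem 2: regress FEV on a new column holding height squared.
quadratic = simple_regression([h ** 2 for h in height], fev)

for name, fit in (("height", linear), ("height^2", quadratic)):
    print(name, {key: round(value, 4) for key, value in fit.items()})
```

Reusing the same function for both problems reflects the hint in Problem 2: once the squared column exists, the quadratic fit is just another simple linear regression.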
### Problem 3: A full quadratic regression

Finally, do a regression between age, height squared, and smoking ($x_1$, $x_2^2$, $x_3$) and FEV ($y$). What is $R^2_{\text{adj}}$ now?

Answer to Problem 3.

$$X = \begin{bmatrix} 1 & 13 & 4489 & 1 \\ 1 & 13 & 3721 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 16 & 4830.25 & 1 \end{bmatrix}$$

$$(X^T X)^{-1} = \begin{bmatrix} 20.49 & -0.149 & -0.0045 & 4.382 \\ -0.149 & 0.0265 & -0.00007 & 0.7987 \\ -0.0045 & -0.00007 & 0.0000014 & -0.0015 \\ 4.382 & 0.7987 & -0.0015 & 1.909 \end{bmatrix}$$

$$\hat{\beta} = (X^T X)^{-1} X^T y = \begin{bmatrix} -0.475 \\ 0.0622 \\ 0.000645 \\ 0.7913 \end{bmatrix}$$

$$\hat{y} = -0.475 + 0.0622\,x_1 + 0.000645\,x_2^2 + 0.7913\,x_3$$

$$SS_E = 0.7199, \quad SS_R = 6.25, \quad SS_T = 6.98$$
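The matrix computation for Problem 3 can be reproduced with a short script. This is a sketch under stated assumptions: it solves the normal equations $(X^T X)\hat{\beta} = X^T y$ with a small hand-written Gaussian elimination helper (`solve`, a hypothetical name) rather than forming the inverse explicitly, and it uses $n - p - 1 = 7$ error degrees of freedom for $p = 3$ predictors. Because the entries of $X^T X$ span many orders of magnitude, results computed in full precision may differ in later digits from values obtained with rounded intermediate steps.

```python
# Data from the worksheet table (11 subjects).
age = [13, 13, 14, 14, 16, 16, 19, 19, 18, 17, 16]
height = [67, 61, 64.5, 72.5, 72, 63, 72, 66, 60, 70.5, 69.5]
smoke = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1]
fev = [3.994, 3.208, 2.997, 4.271, 4.872, 2.795, 5.102, 3.519, 2.853, 4.724, 4.070]

# Design matrix columns: intercept, age, height squared, smoking indicator.
X = [[1.0, a, h ** 2, s] for a, h, s in zip(age, height, smoke)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


k = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
Xty = [sum(row[i] * yi for row, yi in zip(X, fev)) for i in range(k)]
beta = solve(XtX, Xty)  # least-squares coefficients

n = len(fev)
p = k - 1  # number of predictors, excluding the intercept
y_mean = sum(fev) / n
y_hat = [sum(b * xij for b, xij in zip(beta, row)) for row in X]
ss_e = sum((yi - fi) ** 2 for yi, fi in zip(fev, y_hat))
ss_t = sum((yi - y_mean) ** 2 for yi in fev)
r2_adj = 1 - (ss_e / (n - p - 1)) / (ss_t / (n - 1))

print("beta:", [round(b, 6) for b in beta])
print("SSe:", round(ss_e, 4), "SSt:", round(ss_t, 4), "R2_adj:", round(r2_adj, 4))
```

Solving the linear system mirrors the formula $\hat{\beta} = (X^T X)^{-1} X^T y$ without computing the inverse; in practice a library least-squares routine would usually be the more numerically robust choice.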