Worksheet 3.3B regression-residuals

Graphs from Peck, Olsen, Devore Page 1 of 4 AP Statistics Name: ______________________ WS 3.3B Analyzing Associations Per:______ In examining any graph of data, we look first for an overall pattern or shape and then for deviations in the pattern. With regression, the fitted line is the pattern and residuals are the deviations. Residual = _________________________ - _________________________ = y ˆ y (predicted from the equation) We use residuals to check the _____________ of our regression by making a residual plot: 1. Calculate the residual values (calculators automatically find and store residuals). 2. Make a scatterplot of the _____________ against the explanatory variable __ . 3. Draw a horizontal line at zero makes it easier to see any pattern (turn Axes on). 4. Look for patterns and outliers should appear as a _____________________ ____________________. Any sort of ___________________ may indicate that linear regression is not a good tool to use because the relationship may not be linear. Outliers may signify an influential observation that is having an effect the whole regression line. 5. A residual plot against x ________________________________________ seen in the original scatterplot. May indicate the presence of ____________ _________________________(variables that have an important effect on the response but are not included among the explanatory variables studied.) 6. Can also plot the residuals against ˆ y ( ˆ y is simply a linear transformation of x ).
Graphs from Peck, Olsen, Devore Page 2 of 4 _____________________________________________________________! Just because two variables have a strong correlation does not mean that one causes the other there may be an alternative explanation including chance. Do storks cause babies? Stork population vs. babies born - in a small town in Germany, both populations increased over time Does smoking cause cancer? difficult to prove although correlation exists _________________________ general term, often used for categorical variables as well as numeric; _________________________ only applies to _____________________ variables and is a statistical ____________________________________________________ ___________________________________________________________________ ___________________________________ a variable other than the two we are studying (x and y) that influences both variables of interest, explaining the relationship between the two (often a _____________________________ ice cream sales and number of people who drown both respond to hot weather) ___________________________________ two variables are confounded when their ___________________________________________________________________ __________________________________________________________; difficult to determine whether either of them is causal
Graphs from Peck, Olsen, Devore Page 3 of 4 large residual Also need to look at individual points that lie outside the pattern: _________________ - a point that lies far from the fitted line and so produces a large residual. Outliers are points that do not follow the pattern of the other points in the dataset. These points typically have y -values that are either much above or below the least squares line. That is, they have _________________ in absolute value. Find the equation of the least-squares regression line for the data set with and without the "outlier". Sketch the lines on the graph. What changed? Check the residuals. What do you notice?
Uploaded by autumnpage157 on