Mathematics 215: Introduction to Statistics
Assignment 6 (Revision 10)

Overview

Total marks: / 62

This assignment covers content from Unit 6. It assesses your knowledge of correlational analysis and regression analysis used to examine the relationship between two quantitative variables.

Instructions

Show all your work and justify all of your answers and conclusions, except for the True/False questions. Keep your work to 4 decimals, unless otherwise stated.

Note: Finishing a test of hypotheses with a statement like "reject H0" or "do not reject H0" will be insufficient for full marks. You must also provide a written concluding statement in the context of the problem itself. For example, if you are testing hypotheses about the effectiveness of a medical treatment, you must conclude with a statement like, "we can conclude that the treatment is effective" or "we cannot conclude that the treatment is effective."

(43 total marks)
1. A large warehouse superstore is interested in optimizing its customers' shopping experiences and, as such, wants to ensure that it is able to staff the store properly during peak hours. The store management is interested in studying the relationship between the number of tills or checkouts that are open in the store and the amount of time it takes for a customer to check out (that is, the time it takes from when they get in line to when they complete their purchase). The data in the following table were collected from a random sample of 7 customers:

Tills Open (x)    Time to Checkout (minutes) (y)
 2                17
 9                10
12                 5
 5                12
 3                15
10                 8
 6                12
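As a numerical check on the table above, the least squares line can be computed directly from the seven (x, y) pairs. The following is a minimal sketch using the usual sum-of-squares formulas; the function name `least_squares` and the variable names are illustrative assumptions, not part of the assignment.

```python
# Data from the table: tills open (x) and time to checkout in minutes (y).
x = [2, 9, 12, 5, 3, 10, 6]
y = [17, 10, 5, 12, 15, 8, 12]


def least_squares(x, y):
    """Return (a, b) for the fitted line y-hat = a + b*x.

    Hypothetical helper for illustration only.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sums of cross-products and squares.
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * sum_x / n      # intercept: y-bar - b * x-bar
    return a, b


a, b = least_squares(x, y)
print(f"y-hat = {a:.4f} + ({b:.4f})x")
```

Rounded to 4 decimals, this should agree with the intercept 18.4829 and slope -1.0719 used in the worked answers.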
(4 marks)
a. Construct a scatter diagram for these data with "Tills Open" on the horizontal (x) axis, and "Time to Checkout" on the vertical (y) axis. Note: Try to make relatively full use of the graph paper provided.

(2 marks)
b. Describe the general pattern of relationship between the two variables within the context of this question.

= As the number of tills open increases, the time to checkout decreases, so the two variables are negatively correlated. Equivalently, longer checkout times are associated with fewer tills open. This pattern is a negative linear relationship.

(11 marks)
c. Calculate the least squares regression line using "Time to Checkout" as the dependent variable and "Tills Open" as the independent variable.

ŷ = 18.4829 - 1.0719x

(3 marks)
d. Calculate predicted values for x = 3 and x = 10. Use these values to help plot the regression line on the scatter diagram you constructed in part a. above.

x = 3
ŷ = 18.4829 - (1.0719)(3) = 15.2672

x = 10
ŷ = 18.4829 - (1.0719)(10) = 7.7639

(10 marks)
e. Can it be concluded that the slope of the regression line is negative? Formulate and test the appropriate hypotheses at the 5% significance level. Use the critical value approach. Clearly state and explain your conclusion within the context of the problem.

Null Hypothesis, H0: β1 = 0
Alternative Hypothesis, Ha: β1 < 0
where β1 = true slope.

t = -11.59
cv = -t(0.05, 5) = -2.015

Since t < cv (-11.59 < -2.015), we reject H0 at the 5% level of significance. Hence we conclude that the slope is significantly less than zero: opening more tills is associated with shorter checkout times.

(4 marks)
f. Construct a 95% confidence interval for β.

95% CI = (-1.3097, -0.8342)

(2 marks)
g. Interpret the value of b in the sample regression line. What does it mean in the context of this question?

= The slope coefficient is b = -1.0719. Because it is negative, each additional till open is associated with a decrease of about 1.0719 minutes in the predicted time to checkout, on average. The two variables have a negative relationship.

(2 marks)
h. One of the store managers regularly likes to keep 8 tills open on Saturdays. Use the equation of the regression line to provide the manager with the predicted time to check out if 8 tills are open.

ŷ = 18.48288 + (-1.07192)(x)
  = 18.48288 + (-1.07192)(8)
  = 18.48288 - 8.57536
  = 9.90752

With 8 tills open, the predicted time to check out is about 9.91 minutes.

(1 mark)
i. Which of the following cannot be answered from the regression equation? Clearly circle only one response.

A. A prediction of the value of y at a particular value of x.
B. An estimate of the slope between y and x.
C. An estimate of whether the linear association between variables is positive or negative.
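The slope test in part e and the confidence interval in part f can be reproduced from the raw data. The sketch below is one way to do it, assuming the standard formulas for the standard deviation of errors and the standard error of the slope. The critical values are hard-coded from a t table for 5 degrees of freedom (2.0150 for a 5% one-tailed test, 2.5706 for a 95% two-sided interval), and all variable names are illustrative.

```python
import math

# Data from question 1: tills open (x) and time to checkout in minutes (y).
x = [2, 9, 12, 5, 3, 10, 6]
y = [17, 10, 5, 12, 15, 8, 12]
n = len(x)
df = n - 2  # degrees of freedom for simple linear regression

# Corrected sums of squares and cross-products.
ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b = ss_xy / ss_xx                 # sample slope
sse = ss_yy - b * ss_xy           # error sum of squares
s_e = math.sqrt(sse / df)         # standard deviation of errors
s_b = s_e / math.sqrt(ss_xx)      # standard error of the slope

# Test statistic for H0: slope = 0 against Ha: slope < 0.
t_stat = b / s_b

# Critical values taken from a t table (assumed, not computed here).
T_ONE_TAIL_05_DF5 = 2.0150
T_TWO_TAIL_95_DF5 = 2.5706

reject_h0 = t_stat < -T_ONE_TAIL_05_DF5
ci_low = b - T_TWO_TAIL_95_DF5 * s_b
ci_high = b + T_TWO_TAIL_95_DF5 * s_b

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
print(f"95% CI for the slope: ({ci_low:.4f}, {ci_high:.4f})")
```

The last decimal of the interval depends on how many digits of the critical value are carried, so small differences in the fourth decimal place are expected.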