ECON2300 lecture 1 intro notes

LECTURE 1: INTRODUCTION & REVIEW

USING DATA TO MEASURE CAUSAL EFFECTS:
Ideally, we would like an experiment
- What would be an experiment to estimate the effect of class size on standardized test scores?
Almost always, we only have observational (nonexperimental) data
- returns to education
- cigarette prices
- monetary policy
Most of the course deals with difficulties arising from using observational data to estimate causal effects
- confounding effects (omitted factors)
  - not all relevant variables are observed
  - omitted factors cause bias to arise
- simultaneous causality
- correlation does not imply causation
  - although correlation is closely related to causation

1.2 EXAMPLE: THE CALIFORNIA TEST SCORE DATA SET
Empirical problem: class size & educational output (the 2 main variables of interest)
Policy question: what is the effect on test scores of reducing class size by one student per class?
- important question for decision makers, as they have to allocate resources & money to improve education
We use data from California school districts (n = 420)
We observe 5th grade test scores (district average) & student-teacher ratio (STR)

1. INITIAL LOOK AT THE DATA
- this table doesn't tell us anything about the relationship b/w test scores & STR

2. DRAW SCATTER PLOT OF 2 VARIABLES
Do districts with smaller classes have higher test scores?
What does this figure show?
- some data distribution, but the relationship b/w test score & STR is unclear

3. GETTING NUMERICAL EVIDENCE ON WHETHER DISTRICTS W/ LOW STRS HAVE HIGHER TEST SCORES:
1. Estimation:
- compare average test scores in districts w/ low STRs to those w/ high STRs
2. Hypothesis testing:
- test the "null" hypothesis that the mean test scores in the 2 types of districts are the same, against the "alternative" hypothesis that they differ (statistical inference)
3. Confidence interval:
- estimate an interval for the difference in the mean test scores, high vs low STR districts

INITIAL DATA ANALYSIS:
To do estimation, hypothesis testing & confidence intervals, classify districts by class size
Compare districts with "small" (STR < 20) and "large" (STR ≥ 20) class sizes
We will do the following:
1. Estimation of ∆ = difference between group means
2. Test the hypothesis that ∆ = 0
3. Construct a confidence interval for ∆

1.3. 1. ESTIMATION:
- "small" class size: STR < 20
- "large" class size: STR ≥ 20
Is this difference large in a real-world sense?
- Standard deviation of all test scores across districts = 19.1
  - the difference is not that large: not even 1 SD (statistical observation)
- Difference b/w the 60th & 75th percentiles of the test score distribution is 666.7 - 659.4 = 7.3
  - similar to the mean difference (statistical observation)
- It's hard to know if this difference is large in a real-world sense
  - more info. is needed on whether the difference is large or not
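
The three steps listed above (estimate ∆, test ∆ = 0, construct a confidence interval for ∆) can be sketched in Python. This is only a minimal sketch, not the course's own code. The function name `difference_in_means` and the toy numbers are made up for illustration (they are NOT the California data). It assumes the usual large-sample standard error for a difference in means and the 1.96 normal critical value for a 95% interval.

```python
import math
from statistics import mean, stdev


def difference_in_means(scores, strs, cutoff=20.0):
    """Compare mean test scores of "small" (STR < cutoff) and "large" districts.

    Hypothetical helper for illustration only. Assumes the large-sample
    standard error sqrt(s_small^2/n_small + s_large^2/n_large).
    """
    # Classify districts by class size, as in the notes: < 20 vs >= 20.
    small = [y for y, s in zip(scores, strs) if s < cutoff]
    large = [y for y, s in zip(scores, strs) if s >= cutoff]

    # Step 1: estimate the difference between the group means.
    delta = mean(small) - mean(large)

    # Standard error of the difference (sample variances, unequal allowed).
    se = math.sqrt(stdev(small) ** 2 / len(small) + stdev(large) ** 2 / len(large))

    # Step 2: t-statistic for the null hypothesis that the difference is 0.
    t_stat = delta / se

    # Step 3: 95% confidence interval using the normal critical value 1.96.
    ci = (delta - 1.96 * se, delta + 1.96 * se)

    return {"delta": delta, "se": se, "t": t_stat, "ci": ci}


# Made-up toy data: six districts, NOT the real n = 420 sample.
toy_scores = [660, 662, 664, 650, 654, 658]
toy_strs = [18, 19, 17, 20, 22, 21]

result = difference_in_means(toy_scores, toy_strs)
print(result)
```

With only six toy districts the 1.96 cutoff is not really justified. It is used here because the notes work with a large sample (n = 420), where the normal approximation is the standard choice.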

2. HYPOTHESIS TESTING:
Difference-in-means test: compute the t-statistic
- in its standard form: t = ∆ / SE(∆), where ∆ is the estimated difference between the group means
- SE(∆) = sqrt(s_small^2/n_small + s_large^2/n_large), with s = sample standard deviation and n = number of districts in each group

3. CONFIDENCE INTERVAL:
A 95% confidence interval for the difference between the means is: ∆ ± 1.96 × SE(∆)
- confidence intervals are closely related to hypothesis testing

CONCEPTS REVIEW:
RANDOM VARIABLE:
Informally, a random variable gives you a number that's determined by the outcomes of an underlying experiment
- the concept of a random variable is closely related to experiments & experimental outcomes
- in stats, the term "experiment" is used more broadly
  - e.g. the experiment here may include some complicated underlying process that determines the gender, weight, size etc. of a newborn baby
  - e.g. some unknown process that determines a graduate's first salary & its difference from the salary of a high school graduate is also referred to as an experiment
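
The idea of a random variable as "a number determined by the outcome of an experiment" can be made concrete with a small sketch. The two-coin-flip experiment and the name `number_of_heads` are assumptions chosen for illustration and do not come from the lecture.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical experiment: flip a fair coin twice.
# Each outcome is a tuple such as ("H", "T"); all four are equally likely.
outcomes = list(product("HT", repeat=2))


def number_of_heads(outcome):
    """Random variable X: maps an experimental outcome to a number."""
    return outcome.count("H")


# Count how many outcomes lead to each value of X, then convert the
# counts to probabilities (every outcome has probability 1/4).
counts = Counter(number_of_heads(o) for o in outcomes)
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

The random variable itself is just the function. The randomness comes from which outcome the experiment produces.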