# Psyu2248-stata-commands

PSYU2248 - STATA Commands, Design and Statistics II (Macquarie University)
* View rows of data, overall or within each level of a categorical IV
list DV
by IV, sort: list DV

* Tabulate all variables
tab1 _all
* Tabulate a specific range of variables
tab1 firstvar-lastvar

* Descriptive stats
summarize GAD_SUM
* Same but more detail
summarize GAD_SUM, detail
* Mean (with standard error, confidence interval etc.)
mean GAD_SUM
* Tabulate any stats for any variable (IQR etc.)
tabstat GAD_SUM, stat(n sd median iqr mean min max skewness kurtosis)
* Or by the levels of a grouping IV
tabstat DV, by(IV) stat(n mean sd skewness kurtosis)

* Create scatterplot [DV (first) on the Y axis / IV (second) on the X axis]
graph twoway (scatter y1 x1)

* Numerical summaries to compare stats (means, medians, variability, IQR etc.) across groups
by IV, sort: summarize DV, detail

* z-test: numerical variable == known mean, with known SD
ztest Ages == 26, sd(4)
* One-sample t-test: numerical variable == known mean
ttest general_disgust == 1.67
* Two-sample t-test (between-subjects, independent samples)
ttest DV, by(IV)
* Or, when each group's scores are stored in a separate variable
ttest scores_group1 == scores_group2, unpaired

* Check VARIANCE between groups (homoscedasticity: we want equal variance among groups) [numeric DV by categorical IV]
robvar DV, by(IV)

* If any expected value, even just one, is less than 5, don't proceed with the chi-squared test of independence, because the assumption check has failed
tabulate Gender biganx, chi2 expected row
* Frequency table for each variable: tab V1 etc.
* Is the pattern of the bars the same across the groups?
graph bar, over(V1) over(V2) asyvars
** Assumptions for the chi-square test of independence: independent observations; categorical variables (nominal or ordinal); no expected frequency smaller than 5

* Install the user-written tabchi command once (the tab_chi package is distributed through SSC)
ssc install tab_chi
* tabchi v1 v2, with options a = adjusted residuals and r = raw residuals, e.g.:
tabchi v1 v2, r a

* Chi-square test of independence
tabulate var1 var2, chi2 expected row
* Or, adding Cramér's V
tab V1 V2, chi2 V row
* Raw residuals: difference between observed and expected frequencies
* Standardised residuals: raw residual scaled by the size of the expected frequency
** Adjusted standardised residuals: also factor in the standard error [THIS ONE] (mean = 0, SD = 1). Anything greater than 2 in absolute value is noteworthy (the bigger the value, the bigger the effect); negative = fewer observations in the cell than expected, positive = more observations than expected

* p-value for chi-square from the df (here 1) and the test statistic (here 11.91)
display chi2tail(1, 11.91)

* Histogram by group
histogram DV, by(IV) freq
* Box plot
graph box varlist
* swilk for groups (NORMALITY)
by IV, sort: swilk DV

** Normality is assessed using descriptives, swilk and a histogram:
* Central tendency (mean ≈ median ≈ mode)
* Modality (unimodal and symmetrical: check the histogram)
* Variability (SD > 0)
* Skewness (we want approximately 0 [symmetrical]; < 0 = negative skew, > 0 = positive skew)
* Kurtosis (we want approximately 3 [mesokurtic]; < 3 = platykurtic, > 3 = leptokurtic)
* A large sample can throw off swilk: it turns significant even for trivial departures from normality

*** Simple Linear Regression
* First do a scatterplot with the line of best fit
scatter DV IV || lfit DV IV
* Regress the DV (y) on IV1 (x), IV2 (x), ...
regress DV IV1 IV2
* Same, with standardised (beta) coefficients
regress DV IV, beta
* Multiply R-squared by 100 to get the % of variance in the DV explained by the model
* MS (mean square) = SS / df
* Coefficients table shows the individual effects of the x's (IVs) on y (the DV)
* SE (standard error) shows the variability of the slope estimate
* If 0 lies within the CI there is likely no effect, so H0 is not rejected

ASSUMPTIONS:
* Independence
* Save the residuals in a NEW variable: predict newvarname, resid, e.g.
predict resid1, resid
* Check for normal distribution of RESIDUALS (pnorm = normal probability plot)
pnorm resid1
histogram resid1, freq bin(8)
swilk resid1
* Check for homoscedasticity of RESIDUALS (constant variance) and linearity: we don't want a trend, just equal spread from left to right of the plot
* Residuals vs fitted scatterplot:
rvfplot
* If you want a reference line drawn through the plot at 0
rvfplot, yline(0)

* Check COLLINEARITY between IVs after the regression (a correlation between IVs > .70 is usually a sign) [VIF > 10 is not good] [tolerance = 1/VIF; < .1 is not good. Tolerance is the proportion of one IV that is NOT explained by the other IVs; the left-over proportion is how much they overlap]
estat vif
* Also:
ciplot DV, by(IV)

*** Regression df
** df model: 1 in simple regression (= no. of IVs in multiple regression)
** df residual (within): n - df(model) - 1
** df total: n - 1

*** ONE-WAY ANOVA (independent t-test extended to an IV with more than two levels)
* Make and assign value labels to the categories (the label name must match in both commands)
label define Gender1 1 "Male" 2 "Female" 3 "Other"
label values Gender Gender1
* Test normality and equal variance:
tabstat DV, by(IV) stat(n mean sd skewness kurtosis)
* (the SDs can already tell you whether the equal-variance assumption has been violated; same with skewness and kurtosis for normality)
* histogram
* Shapiro-Wilk (swilk)
* Levene's test (robvar, to test equal variance / homogeneity)
anova DV IV
* Or
oneway DV IV, tabulate

*** ANOVA df
** df between (IV): k - 1 (k = levels of the IV)
** df within (resid): k(n - 1) (n = sample size in each group, assuming equal groups; N - k in general)
** df total: N - 1 (N = total sample size)
** Effect size, only after anova (omega-squared instead of R-squared)
estat esize, omega
** To also get the mean group differences and adjusted p-values (method = bonferroni, scheffe or sidak)
oneway DV IV, method
** Pairwise comparisons between the levels of the IV
* Tukey's for post hoc tests (no hypothesis; make sure variances are equal)
* Bonferroni's for a priori (planned) comparisons (hypothesis)
* method is one of bon, noadj, tukey, sch
pwmean DV, over(IV) mcompare(method) effects
** Also
oneway DV IV, bonferroni

*** FACTORIAL ANOVA
* Descriptives
by IV1 IV2, sort: summarize DV, detail
by pints sex, sort: summarize attract, detail
histogram __, by(___)
* Check cell and marginal means
tab IV1 IV2, summarize(DV)
tab pints sex, summarize(attract)
* Then run anova first
anova DV IV1 IV2...
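The descriptive steps listed so far can be tried on data that ships with Stata. This is only a minimal sketch under stated assumptions: the built-in auto dataset stands in for the course data, mpg plays the DV, foreign plays IV1, and highrep is a hypothetical second factor created purely for illustration.

```stata
* Assumption: built-in auto data replaces the course dataset
sysuse auto, clear

* Hypothetical IV2: repair record of 4 or 5 (1) versus 3 or lower (0)
generate highrep = rep78 >= 4 if !missing(rep78)

* Descriptives within each combination of the two factors
by foreign highrep, sort: summarize mpg, detail

* Cell and marginal means of the DV
tabulate foreign highrep, summarize(mpg)

* Main-effects model (no interaction term yet)
anova mpg foreign highrep
```

Cars with a missing rep78 also have a missing highrep, so anova drops them from the model; the by-group summary still lists them as a separate group.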
* Cell and marginal means: margins IV1#IV2 (ignore the t's and p's)
margins pints#sex
marginsplot
* Now rerun it to draw the plot with factor "sex" on the x-axis (line graph)
margins sex#pints
marginsplot
margins

* Bar graph of predicted cell means: save the predicted values in a new variable first
predict yhat
graph bar yhat, over(IV1) over(IV2) asyvars
graph bar yhat, over(IV2) over(IV1) asyvars
graph bar yhat, over(sex) over(pints) asyvars ytitle("Predicted Cell Means for Attract")
* Now rerun it one more time to generate the bar graph with factor "sex" on the x-axis
graph bar yhat, over(pints) over(sex) asyvars ytitle("Predicted Cell Means for Attract")

Assumptions:
* Independence (random allocation fixes this)
* Test normality within each cell
by IV1 IV2, sort: swilk DV
by sex pints, sort: swilk attract
* Homogeneity of variance (homoscedasticity); "cell" can be any name you want for the new condition variable
egen cell = group(IV1 IV2)
egen cell = group(sex pints)
robvar DV, by(cell)
robvar attract, by(cell)
* (the robvar output also shows how many conditions the design has)

* Calculate the anova (testing your hypotheses)
anova DV IV1 IV2 IV1#IV2
anova attract sex pints sex#pints
* Or, using the full-factorial shorthand (## = main effects plus interaction)
anova DV IV1##IV2
anova attract sex##pints
* You can rerun marginsplot here, swapping the IV on the x-axis, to see the individual effects etc.
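The assumption checks and the full factorial model can be run end to end in the same way. Again this is a hedged sketch, not the course's own analysis: it assumes Stata's built-in auto dataset, with mpg as the DV, foreign as IV1, and a hypothetical factor highrep as IV2.

```stata
* Assumption: built-in auto data replaces the course dataset
sysuse auto, clear
* Hypothetical IV2: repair record of 4 or 5 (1) versus 3 or lower (0)
generate highrep = rep78 >= 4 if !missing(rep78)

* One identifier per cell of the 2 x 2 design
egen cell = group(foreign highrep)

* Levene-type tests of equal variance across the cells
robvar mpg, by(cell)

* Normality within each cell
by foreign highrep, sort: swilk mpg

* Factorial model: both main effects plus the interaction
anova mpg foreign##highrep

* Predicted cell means, then the line plot
margins foreign#highrep
marginsplot
```

One cell in this illustration is very small, so its swilk output may be empty; real data should have enough observations in every cell for the checks to be meaningful.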