School: Washtenaw Community College
Course: MTH 160
Subject: Statistics
Date: Sep 4, 2023

Uploaded by ShadowPrince77 on coursehero.com

Basic Statistics - Chapter 1 - Types of Samples and Types of Data
A population is the entire collection of individuals about which information is sought.
Parameters are numbers that describe the population.
A sample is a subset of a population, containing the individuals that are actually observed.
Statistics are numbers that describe a sample.
A simple random sample of size n is chosen by a method in which each collection of n population items is equally likely to make up the sample.
For cluster sampling, the population is divided into groups, and a random sample of groups is drawn.
For stratified sampling, the population is divided into groups, and a random sample of individuals is drawn from each group.
A sample of convenience is a sample that is not drawn by a well-defined random method.
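The three random sampling methods above can be compared side by side in a short Python sketch. The population of 12 students, the section labels A, B, C, and the sample sizes are all made up for illustration; they are not from the course notes.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 12 students, each tagged with a class section.
population = [(f"student{i}", "ABC"[i % 3]) for i in range(12)]

# Divide the population into groups by section.
groups = {}
for person in population:
    groups.setdefault(person[1], []).append(person)

# Simple random sample: every collection of 4 individuals is equally likely.
srs = random.sample(population, 4)

# Cluster sampling: draw a random sample of groups and keep everyone in them.
chosen_sections = random.sample(sorted(groups), 2)
cluster = [p for s in chosen_sections for p in groups[s]]

# Stratified sampling: draw a random sample of individuals from each group.
stratified = [p for s in sorted(groups) for p in random.sample(groups[s], 2)]
```

Note that the cluster sample only ever contains two of the three sections, while the stratified sample always contains all three.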
Qualitative data refers to categories or features (labels).
Quantitative data refers to counts or measures (numbers).
Nominal data refers to items that have NO natural order.
Ordinal data refers to items that can be ordered.
Continuous data can take on any value in an interval (measures).
Discrete data can be listed (counts).
Basic Statistics - Chapter 3 - Numerical Summaries of Data
STAT EDIT lets you enter data, and STAT CALC lets you calculate two screens of 1-Var Stats, where x̄ = Sample Mean.
The symbol µ (mu) represents the Population Mean in many formulas.
Sx = Sample Std. Deviation
σx = Population Std. Deviation
Variance = (Std. Deviation)²
Coefficient of Variation = σ/µ
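As a rough cross-check of the calculator output, the same quantities can be computed with Python's standard `statistics` module. The data set below is hypothetical, and the variable names are chosen only to mirror the symbols above.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

mean = statistics.mean(data)                 # x-bar (sample mean)
sample_sd = statistics.stdev(data)           # Sx, divides by n - 1
population_sd = statistics.pstdev(data)      # σx, divides by n
sample_var = statistics.variance(data)       # Sx squared
population_var = statistics.pvariance(data)  # σx squared
cv = population_sd / mean                    # coefficient of variation, σ/µ

print(mean, population_sd, cv)
```

For this data the mean is 5, the population standard deviation is 2, and the coefficient of variation is 0.4.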
Five Number Summary: min, Q1, median, Q3, max (the median is the same as Q2)
Empirical Rule: For data sets that are approximately bell-shaped (symmetric, with one peak):
about 68% of the data values are between µ − σ and µ + σ,
about 95% of the data values are between µ − 2σ and µ + 2σ,
and almost all of the data values are between µ − 3σ and µ + 3σ.
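One way to see the Empirical Rule in action is to simulate bell-shaped data and count how many values land within 1, 2, and 3 standard deviations of the mean. This is only a sketch: the mean of 100, the standard deviation of 15, and the sample size are arbitrary choices, and `fraction_within` is a helper name invented here.

```python
import random
import statistics

def fraction_within(data, k):
    """Fraction of values within k standard deviations of the mean."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)
    inside = [x for x in data if mu - k * sigma <= x <= mu + k * sigma]
    return len(inside) / len(data)

random.seed(1)  # fixed seed so the sketch is repeatable
bell_shaped = [random.gauss(100, 15) for _ in range(10_000)]

# Fractions for k = 1, 2, 3; expect roughly 0.68, 0.95, and close to 1.
fractions = {k: fraction_within(bell_shaped, k) for k in (1, 2, 3)}
print(fractions)
```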
Chebyshev's Inequality: For any data set (even very skewed, with one tail):
75% or more of the data values are between µ − 2σ and µ + 2σ,
and 89% or more of the data values are between µ − 3σ and µ + 3σ.
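The 75% and 89% figures are the k = 2 and k = 3 cases of the general Chebyshev bound 1 − 1/k². A small sketch, using a deliberately skewed made-up data set, shows the bound holding; the function name is an assumption for illustration.

```python
import statistics

def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

# Hypothetical, strongly right-skewed data: nine 0s and one 10.
skewed = [0] * 9 + [10]
mu = statistics.mean(skewed)       # 1
sigma = statistics.pstdev(skewed)  # 3.0

# Actual fraction of values between µ − 2σ and µ + 2σ.
within_2 = sum(mu - 2 * sigma <= x <= mu + 2 * sigma for x in skewed) / len(skewed)

print(chebyshev_bound(2), chebyshev_bound(3), within_2)
```

Here 90% of the values fall within two standard deviations, which is consistent with the guaranteed minimum of 75%.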
z = (x − µ) / σ = how many standard deviations that value is from its population mean.
x = µ + (z * σ) = value based on a given z-score
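The two z-score formulas are inverses of each other, which a short sketch makes easy to check. The mean of 100 and standard deviation of 15 are example values only, and the function names are hypothetical.

```python
def z_score(x, mu, sigma):
    """How many standard deviations x lies from the mean."""
    return (x - mu) / sigma

def value_from_z(z, mu, sigma):
    """The data value that corresponds to a given z-score."""
    return mu + z * sigma

mu, sigma = 100, 15  # example population mean and standard deviation

z = z_score(130, mu, sigma)      # 130 is two standard deviations above 100
x = value_from_z(-1, mu, sigma)  # one standard deviation below the mean
print(z, x)
```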
Interquartile Range (IQR) = Q3 − Q1
Lower Outlier Boundary = Q1 − 1.5 * IQR
Upper Outlier Boundary = Q3 + 1.5 * IQR
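The five-number summary and the outlier boundaries can be sketched in Python as follows. Textbooks and software use several slightly different quartile conventions; this sketch assumes the median-of-halves method (the median itself is left out of both halves when the count is odd), which may not match every tool. The data set and function names are made up for illustration.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max); needs at least 2 values."""
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]   # values below the median position
    upper = values[-half:]  # values above the median position
    return (values[0], statistics.median(lower), statistics.median(values),
            statistics.median(upper), values[-1])

def outlier_boundaries(q1, q3):
    """Lower and upper outlier boundaries from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data set
summary = five_number_summary(data)
low, high = outlier_boundaries(summary[1], summary[3])
outliers = [x for x in data if x < low or x > high]
print(summary, low, high, outliers)
```

For this data Q1 = 3.5 and Q3 = 8.5, so the IQR is 5, the boundaries are −4 and 16, and only the value 30 is flagged as an outlier.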

Basic Statistics - Chapter 2 - Graphical Summaries of Data
Here are some things to remember about HISTOGRAMS:
Approximately symmetric means that the right side and the left side are almost identical
Skewed to the right, also called positively skewed, means the long tail is on the right side
Skewed to the left, also called negatively skewed, means the long tail is on the left side
Frequency histograms are based on counts, and Relative Frequency histograms are based on proportions or percents
Finally, classes must not overlap, must be of equal width, and there should be NO missing classes
Here are some things to remember about STEM-&-LEAF PLOTS:
The STEM is the first part of the number, and NO values are skipped when setting up your stems
The LEAVES are the rightmost part of the number, which is only the last digit
Finally, the leaves are ordered from smallest to biggest values, as you move away from the stems
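The stem-and-leaf rules above can be turned into a small Python sketch. It assumes non-negative whole numbers, takes the last digit as the leaf and everything before it as the stem, and keeps empty stems so that no values are skipped. The function name and the data are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot for non-negative whole numbers."""
    leaves = defaultdict(list)
    for v in sorted(values):             # sorting puts leaves in increasing order
        leaves[v // 10].append(v % 10)   # stem = leading digits, leaf = last digit
    stems = range(min(leaves), max(leaves) + 1)  # every stem, even empty ones
    rows = []
    for stem in stems:
        row = f"{stem} | " + " ".join(str(d) for d in leaves[stem])
        rows.append(row.rstrip())
    return rows

plot = stem_and_leaf([12, 15, 21, 27, 27, 43, 40])
print("\n".join(plot))
```

The stem 3 appears with no leaves, which shows the gap in the data between 27 and 40.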
Here are some things to remember about FREQUENCIES and RELATIVE FREQUENCIES:
The frequency of a category is the number of times it occurs in the data set
A frequency distribution is a table that presents the frequency for each category
The relative frequency of a category is the frequency of the category divided by the sum of all the frequencies (decimal or percent)
Pie Charts are based on relative frequency
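A frequency distribution and its relative frequencies can be built in a few lines of Python with `collections.Counter`. The survey responses below are invented for illustration.

```python
from collections import Counter

# Hypothetical survey responses (qualitative data).
responses = ["red", "blue", "red", "green", "blue", "red", "red", "blue"]

frequency = Counter(responses)    # count of each category
total = sum(frequency.values())   # sum of all the frequencies
relative_frequency = {category: count / total
                      for category, count in frequency.items()}

for category, count in frequency.items():
    print(category, count, relative_frequency[category])
```

The relative frequencies always add up to 1 (or 100%), which is why they are the natural basis for a pie chart.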
Basic Statistics - Chapter 4 - Summarizing Bivariate Data
Scatterplots
Press 2nd, Y= (STAT PLOT) and press 1 for the first plot, then select On and the scatterplot icon (first icon).
Press ZOOM and 9: ZoomStat.
Press STAT and CALC.
Select 4: LinReg (ax+b) -or- 8: LinReg (a+bx) and press ENTER.
Linear Equations
Two uses for the equation for the regression line include:
Determine how much y differs, when given the difference in two values of x. The slope is b in the form y = a + bx from 8: LinReg (it is a in the form y = ax + b from 4: LinReg), and ∆x is the difference in x, so the difference in y is ∆y = b * ∆x.
Predict the value of y, when given a value for x. Replace x with the given value and solve for y. Use either 4: LinReg for y = ax + b or 8: LinReg for y = a + bx, on your calculator.
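Both uses of the regression line can be sketched in Python with the standard least-squares formulas, written in the y = a + bx form so that b is the slope. The data points are made up and lie exactly on a line to keep the arithmetic easy to follow; `least_squares_line` is a hypothetical helper name, not a calculator function.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # intercept
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical x values
ys = [3, 5, 7, 9, 11]  # hypothetical y values (exactly y = 1 + 2x)

a, b = least_squares_line(xs, ys)

delta_y = b * 3        # use 1: difference in y when x differs by 3
predicted = a + b * 6  # use 2: predicted y when x = 6
print(a, b, delta_y, predicted)
```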
Two values that are used with the regression line include:
r² = Coefficient of Determination, which is the % of variation in y explained by the regression line
r = Correlation Coefficient, which describes the strength of the linear relationship (−1 ≤ r ≤ +1), and the direction of the line (negative is downward and positive is upward).
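For reference, r and r² can also be computed by hand from the usual sums of squared deviations. The sketch below uses a small made-up data set, and the function name is an assumption for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # hypothetical x values
ys = [2, 4, 5, 4, 5]  # hypothetical y values

r = correlation(xs, ys)  # positive, so the line slopes upward
r_squared = r ** 2       # share of the variation in y explained by the line
print(r, r_squared)
```

For this data r is about 0.775 and r² is 0.6, so the regression line explains about 60% of the variation in y.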
Remember, correlation DOES NOT equal causation, as in the example of ice cream sales and shark attacks!
Another Reminder: If r and r² DO NOT appear when you use LinReg, then press 2nd and 0 to get the Catalog list.
Scroll down to DiagnosticOn, press ENTER twice, and Done should appear on your screen.
