# Notes chapter 2

## 2.1 Overview of Using Data: Definitions and Goals

Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation. A characteristic or quantity of interest that can take on different values is known as a variable. An observation is a set of values corresponding to a set of variables. Variation is the difference in a variable measured over observations (time, customers, items, etc.).

The values of some variables are under the direct control of the decision maker; these are often called decision variables. A quantity whose values are not known with certainty is called a random variable, or uncertain variable.

## 2.2 Types of Data

Data can be categorized in several ways based on how they are collected and the type of data collected. In many cases, it is not feasible to collect data from the population of all elements of interest. In such instances, we collect data from a subset of the population known as a sample. A representative sample can be gathered by random sampling from the population.

### Quantitative and Categorical Data

Data are considered quantitative data if they are numeric and arithmetic operations, such as addition, subtraction, multiplication, and division, can be performed on them. For instance, we can sum the values for Volume in the Dow data in Table 2.1 to calculate the total volume of all shares traded by companies included in the Dow. If arithmetic operations cannot be performed on the data, they are considered categorical data. We can summarize categorical data by counting the number of observations or computing the proportions of observations in each category.
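As a rough illustration of sampling and of arithmetic on a quantitative variable, the sketch below draws a simple random sample from a small population and sums a numeric column. The records, field names, and volumes are invented for the example (they are not the Dow data from Table 2.1), and `simple_random_sample` and `total` are hypothetical helper names.

```python
import random

# Hypothetical population of observations; every value here is made up.
# "volume" is quantitative (arithmetic makes sense); "industry" is categorical.
population = [
    {"company": "A", "industry": "Tech", "volume": 1200},
    {"company": "B", "industry": "Retail", "volume": 800},
    {"company": "C", "industry": "Tech", "volume": 1500},
    {"company": "D", "industry": "Energy", "volume": 500},
    {"company": "E", "industry": "Retail", "volume": 1000},
]


def simple_random_sample(records, n, seed=0):
    """Draw n records without replacement; each record is equally likely."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.sample(records, n)


def total(records, variable):
    """Sum a quantitative variable over a set of observations."""
    return sum(record[variable] for record in records)


sample = simple_random_sample(population, 3)
total_volume = total(population, "volume")   # total over the whole population
sample_volume = total(sample, "volume")      # total over the sample only
```

Summing `industry` in the same way would not be meaningful, which is the practical test for treating it as categorical rather than quantitative.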
### Cross-Sectional and Time Series Data

Cross-sectional data are collected from several entities at the same, or approximately the same, point in time. Time series data are collected over several time periods.

### Sources of Data

In an experimental study, a variable of interest is first identified. Then one or more other variables are identified and controlled or manipulated to obtain data about how these variables influence the variable of interest. Nonexperimental, or observational, studies make no attempt to control the variables of interest. A survey is perhaps the most common type of observational study.

In some cases, the data needed for a particular application already exist from an experimental or observational study that has been conducted.

Anyone who wants to use data and statistical analysis to aid in decision making must be aware of the time and cost required to obtain the data. The cost of data acquisition and the subsequent statistical analysis should not exceed the savings generated by using the information to make a better decision.

### Frequency Distributions for Categorical Data

A frequency distribution is a summary of data that shows the number (frequency) of observations in each of several nonoverlapping classes, typically referred to as bins.

### Relative Frequency and Percent Frequency Distributions
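A minimal sketch of a frequency distribution for categorical data follows. It assumes the standard definitions: relative frequency is a bin's frequency divided by the number of observations, and percent frequency is the relative frequency multiplied by 100. The `purchases` list and the function name `frequency_distribution` are hypothetical and used only for illustration.

```python
from collections import Counter


def frequency_distribution(values):
    """Return frequency, relative frequency, and percent frequency per category."""
    counts = Counter(values)  # number of observations in each category (bin)
    n = len(values)           # total number of observations
    return {
        category: {
            "frequency": count,
            "relative": count / n,
            "percent": 100 * count / n,
        }
        for category, count in counts.items()
    }


# Hypothetical sample of 10 soft drink purchases (invented data).
purchases = [
    "Coke", "Pepsi", "Coke", "Sprite", "Coke",
    "Pepsi", "Coke", "Sprite", "Pepsi", "Coke",
]

distribution = frequency_distribution(purchases)
for category, row in distribution.items():
    print(category, row["frequency"], row["relative"], row["percent"])
```

Because the bins do not overlap and every observation falls in exactly one bin, the frequencies sum to the number of observations and the relative frequencies sum to 1.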
### Frequency Distributions for Quantitative Data

The three steps necessary to define the classes for a frequency distribution with quantitative data are as follows:

1. Determine the number of nonoverlapping bins.
2. Determine the width of each bin.
3. Determine the bin limits.

### Number of Bins

Bins are formed by specifying the ranges used to group the data. As a general guideline, we recommend using from 5 to 20 bins. For a small