CHAPTER 2

VARIABLE
A characteristic or quantity of interest that can take on different values.

DATA
The facts and figures collected, analyzed, and summarized for presentation and interpretation.

OBSERVATION
A set of values corresponding to a set of variables.

VARIATION
Differences in values of a variable over observations.

RANDOM VARIABLE, OR UNCERTAIN VARIABLE
A quantity whose values are not known with certainty.

POPULATION
The set of all elements of interest in a particular study.

SAMPLE
A subset of the population.

QUANTITATIVE DATA
Data for which numerical values are used to indicate magnitude, such as how many or how much.
Arithmetic operations such as addition, subtraction, and multiplication can be performed on
quantitative data.

RANDOM SAMPLING
Collecting a sample that ensures that (1) each element selected comes from the same population and
(2) each element is selected independently.

CATEGORICAL DATA
Data for which categories of like items are identified by labels or names. Arithmetic operations
cannot be performed on categorical data.

TIME SERIES DATA
Data that are collected over a period of time (minutes, hours, days, months, years, etc.).

CROSS-SECTIONAL DATA
Data collected at the same or approximately the same point in time.

BINS
The nonoverlapping groupings of data used to create a frequency distribution. Bins for categorical
data are also known as classes.

FREQUENCY DISTRIBUTION
A tabular summary of data showing the number (frequency) of data values in each of several
nonoverlapping bins.

PERCENT FREQUENCY DISTRIBUTION
A tabular summary of data showing the percentage of data values in each of several nonoverlapping
bins.

RELATIVE FREQUENCY DISTRIBUTION
A tabular summary of data showing the fraction or proportion of data values in each of several
nonoverlapping bins.
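
The three tabular summaries above differ only in how each bin's count is scaled. A minimal Python sketch, assuming made-up data and bin boundaries (both are hypothetical and not from the chapter):

```python
from collections import Counter

# Hypothetical data: number of items purchased by 10 customers.
data = [2, 7, 12, 5, 9, 14, 3, 8, 11, 6]

# Hypothetical nonoverlapping bins, each covering lower <= x <= upper.
bins = [(0, 4), (5, 9), (10, 14)]


def bin_label(x):
    """Return the label of the bin that contains x."""
    for lower, upper in bins:
        if lower <= x <= upper:
            return f"{lower}-{upper}"
    raise ValueError(f"{x} falls outside every bin")


counts = Counter(bin_label(x) for x in data)
n = len(data)

# Frequency: number of data values in each bin.
frequency = {f"{lo}-{hi}": counts[f"{lo}-{hi}"] for lo, hi in bins}
# Relative frequency: fraction of data values in each bin.
relative = {label: count / n for label, count in frequency.items()}
# Percent frequency: percentage of data values in each bin.
percent = {label: 100 * count / n for label, count in frequency.items()}

for label in frequency:
    print(label, frequency[label], relative[label], percent[label])
```

The relative frequencies sum to 1 and the percent frequencies sum to 100, which is a quick check that every data value landed in exactly one bin.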

HISTOGRAM
A graphical presentation of a frequency distribution, relative frequency distribution, or percent
frequency distribution of quantitative data constructed by placing the bin intervals on the horizontal
axis and the frequencies, relative frequencies, or percent frequencies on the vertical axis.

SKEWNESS
A measure of the lack of symmetry in a distribution.

CUMULATIVE FREQUENCY DISTRIBUTION
A tabular summary of quantitative data showing the number of data values that are less than or equal
to the upper class limit of each bin.
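
A cumulative frequency distribution can be built as a running total over the bin frequencies. The sketch below assumes a hypothetical frequency distribution given as (upper class limit, frequency) pairs:

```python
from itertools import accumulate

# Hypothetical frequency distribution: (upper class limit, frequency).
frequency = [(4, 2), (9, 5), (14, 3)]

# Running total of the frequencies, bin by bin.
running_totals = accumulate(count for _, count in frequency)

# Pair each upper class limit with the number of values at or below it.
cumulative = [
    (upper, total) for (upper, _), total in zip(frequency, running_totals)
]

for upper, total in cumulative:
    print(f"values <= {upper}: {total}")
```

The last cumulative frequency always equals the total number of observations.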

MEAN (ARITHMETIC MEAN)
A measure of central location computed by summing the data values and dividing by the number of
observations.

MEDIAN
A measure of central location provided by the value in the middle when the data are arranged in
ascending order. With an even number of observations, the median is the average of the two middle values.

MODE
A measure of central location defined as the value that occurs with greatest frequency.
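
The three measures of central location can be compared on one small data set. This sketch uses Python's standard `statistics` module; the data values are hypothetical:

```python
import statistics

# Hypothetical data set of five observations.
data = [46, 54, 42, 46, 32]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

# With an even number of observations, the median averages the two middle values.
even_median = statistics.median([1, 2, 3, 4])

print(mean, median, mode, even_median)
```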

GEOMETRIC MEAN
A measure of central location that is calculated by finding the nth root of the product of n values.

GROWTH FACTOR
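The factor by which a value is multiplied over a period of time (ending value divided by beginning
value). The growth rate over the period is (growth factor − 1); multiplied by 100, it gives the percentage
increase. A growth factor less than 1 indicates negative growth, whereas a growth factor greater than 1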
indicates positive growth. The growth factor cannot be less than zero.
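
Growth factors and the geometric mean are typically used together: the geometric mean of the per-period growth factors gives the average growth factor per period. A minimal sketch, assuming a hypothetical series of year-end values, and taking the growth rate for a period as (growth factor − 1):

```python
import math

# Hypothetical year-end values of an investment.
values = [100.0, 110.0, 99.0, 118.8]

# Growth factor for each period: ending value divided by beginning value.
growth_factors = [end / start for start, end in zip(values, values[1:])]

# Percentage change for each period: (growth factor - 1) * 100.
percent_changes = [(g - 1) * 100 for g in growth_factors]

# Geometric mean: the nth root of the product of the n growth factors.
n = len(growth_factors)
geo_mean = math.prod(growth_factors) ** (1 / n)

# Average growth rate per period implied by the geometric mean.
mean_growth_rate = geo_mean - 1

print(growth_factors, percent_changes, geo_mean, mean_growth_rate)
```

Compounding the first value by `geo_mean` once per period reproduces the final value, which is why the geometric mean, and not the arithmetic mean, is used for growth factors.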

RANGE
A measure of variability defined to be the largest value minus the smallest value.

VARIANCE
A measure of variability based on the squared deviations of the data values about the mean.

STANDARD DEVIATION
A measure of variability computed by taking the positive square root of the variance.

COEFFICIENT OF VARIATION
A measure of relative variability computed by dividing the standard deviation by the mean and
multiplying by 100.
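
The four measures of variability above can be computed in a few lines. This sketch assumes the sample versions of the variance and standard deviation (divisor n − 1), which is what `statistics.variance` and `statistics.stdev` compute; the data values are hypothetical:

```python
import statistics

# Hypothetical data set of five observations.
data = [46, 54, 42, 46, 32]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Sample variance: sum of squared deviations about the mean, divided by n - 1.
variance = statistics.variance(data)

# Sample standard deviation: positive square root of the variance.
std_dev = statistics.stdev(data)

# Coefficient of variation: standard deviation relative to the mean, in percent.
coeff_of_variation = std_dev / statistics.mean(data) * 100

print(data_range, variance, std_dev, coeff_of_variation)
```

For a full population, `statistics.pvariance` and `statistics.pstdev` divide by n instead.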

PERCENTILE
A value such that approximately p% of the observations have values less than the pth percentile;
hence, approximately (100 − p)% of the observations have values greater than the pth percentile. The
50th percentile is the median.

QUARTILES
The 25th, 50th, and 75th percentiles, referred to as the first quartile, second quartile (median), and
third quartile, respectively. The quartiles can be used to divide a data set into four parts, with each
part containing approximately 25% of the data.

INTERQUARTILE RANGE
The difference between the third and first quartiles.
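
Quartiles and the interquartile range can be sketched with `statistics.quantiles`. Software packages use slightly different interpolation conventions for percentiles; the `method="exclusive"` setting below is one such convention and is an assumption here, as are the data values:

```python
import statistics

# Hypothetical data set of 11 observations, already sorted.
data = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21]

# n=4 asks for the three cut points that split the data into four parts:
# the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

# Interquartile range: third quartile minus first quartile.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

The second quartile matches `statistics.median(data)`, consistent with the 50th percentile being the median.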

Z-SCORE
A value computed by dividing the deviation about the mean (x_i − x̄) by the standard deviation s. A
z-score is referred to as a standardized value and denotes the number of standard deviations that x_i is
from the mean.
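
As a short illustration, each z-score is the observation's deviation from the mean divided by the standard deviation. The sketch assumes the sample standard deviation and a hypothetical data set:

```python
import statistics

# Hypothetical data set of five observations.
data = [46, 54, 42, 46, 32]

x_bar = statistics.mean(data)  # sample mean
s = statistics.stdev(data)     # sample standard deviation

# z-score of each observation: (x_i - x_bar) / s.
z_scores = [(x - x_bar) / s for x in data]

print(z_scores)
```

A z-score of −1.5 for the value 32, for example, says that 32 lies one and a half standard deviations below the mean.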

EMPIRICAL RULE
A rule stating that approximately 68%, 95%, and 99.7% of data values lie within 1, 2, and 3
standard deviations of the mean for data that exhibit a bell-shaped distribution.
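
The empirical-rule percentages come from the normal distribution, the standard model of a bell-shaped distribution. A small sketch, assuming a normal distribution and using `statistics.NormalDist` to compute the share of values within k standard deviations of the mean:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of values within k standard deviations of the mean, for k = 1, 2, 3.
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)
}

for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

Real data that are only roughly bell-shaped will match these percentages only approximately.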

OUTLIER
An unusually large or unusually small data value.

BOX PLOT
A graphical summary of data based on the quartiles of a distribution.

SCATTER CHART
A graphical presentation of the relationship between two quantitative variables. One variable is
shown on the horizontal axis and the other on the vertical axis.

COVARIANCE
A measure of linear association between two variables. Positive values indicate a positive
relationship; negative values indicate a negative relationship.

CORRELATION COEFFICIENT
A standardized measure of linear association between two variables that takes on values between −1
and +1. Values near −1 indicate a strong negative linear relationship, values near +1 indicate a strong
positive linear relationship, and values near zero indicate the lack of a linear relationship.
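
Covariance and the correlation coefficient can be computed directly from their definitions. The sketch below assumes the sample versions (divisor n − 1) and two hypothetical variables `x` and `y`:

```python
import math

# Hypothetical paired observations of two quantitative variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: average product of paired deviations, divisor n - 1.
covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations of each variable.
s_x = math.sqrt(sum((xi - x_bar) ** 2 for xi in x) / (n - 1))
s_y = math.sqrt(sum((yi - y_bar) ** 2 for yi in y) / (n - 1))

# Correlation coefficient: covariance standardized by both standard deviations.
correlation = covariance / (s_x * s_y)

print(covariance, correlation)
```

Unlike the covariance, the correlation coefficient does not depend on the units of `x` and `y`, which is what makes values from different data sets comparable.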

LEGITIMATELY MISSING DATA
Missing data that occur naturally, for example when a survey question does not apply to a respondent.

ILLEGITIMATELY MISSING DATA
Missing data that do not occur naturally, for example because of a data collection or data entry error.

MISSING COMPLETELY AT RANDOM (MCAR)
The tendency for an observation to be missing a value of some variable is entirely random.

MISSING AT RANDOM
The tendency for an observation to be missing a value of some variable is related to the value of
some other variable(s) in the data.

MISSING NOT AT RANDOM
The tendency for an observation to be missing a value of some variable is related to the missing
value.

IMPUTATION
Systematic replacement of missing values with values that seem reasonable.
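
One simple form of imputation replaces each missing value with the mean of the observed values. This is only one of several possible strategies and is an assumption here, since the definition above does not name a method; the `ages` column is hypothetical, with missing entries recorded as `None`:

```python
import statistics

# Hypothetical column with missing entries recorded as None.
ages = [34, None, 29, 41, None, 36]

# Mean of the values that were actually observed.
observed = [a for a in ages if a is not None]
fill_value = statistics.mean(observed)

# Replace each missing entry with the observed mean; keep the rest unchanged.
imputed = [fill_value if a is None else a for a in ages]

print(imputed)
```

Mean imputation leaves the column mean unchanged but understates its variability, so whether it "seems reasonable" depends on why the values are missing.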

DIMENSION REDUCTION
The process of removing variables from the analysis without losing crucial information.
