
BZAN 6310: Quantitative Analysis for Business

Lecture 1. Introduction and Describing Data

Dr. Jinghui (Jove) Hou


What are we trying to accomplish?
1. What influences my firm’s return on investment?
2. Why are some salespeople more productive than others?
3. How should I distribute my advertising budget effectively?
4. How can I predict next year’s revenue?
5. Which employee(s) should I hire?
6. What prompts my customers to stick around?
Textbook
Installing StatTools
• Palisade DecisionTools Suite: StatTools, a statistics add-in for Excel
• Install Excel 2016 (or later)
• Go to Blackboard “Course Syllabus and Information etc” for the link
• Check email -> Download -> Save File -> Extract
• Close all Excel windows
• Run .exe -> install (accept the default)
Installing StatTools cont’d
• Run Excel (or open an Excel file)
• Menu -> StatTools -> Click “OK” -> Click “Close”
• New tab in the Excel menu: “StatTools”

Resources on Blackboard
• Excel tutorials
• StatTools tutorials
Describing Data
• What kind of Data do you encounter in your daily business?
2-2 Basic Concepts

 Several important concepts


 Populations and samples
 Data sets
 Variables and observations

 Types of data

© 2017 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
2-2a Populations and Samples

 A population includes all of the entities of interest in a study (people, households, machines, etc.).
 Examples
 All potential voters in a presidential election
 All subscribers to cable television
 All invoices submitted for Medicare reimbursement by nursing homes

 A sample is a subset of the population, often randomly chosen and preferably representative of the population as a whole.

2-2b Data Sets, Variables, and Observations

 A data set is usually a rectangular array of data, with variables in columns and observations in rows.
 A variable (or field or attribute) is a characteristic of members of a
population, such as height, gender, or salary.
 An observation (or value, or case or record) is a list of all variable
values for a single member of a population.

Example 2.1: Data from an Environmental Survey

 Objective: To illustrate variables and observations in a typical data set.
 Solution: Data set includes observations on 30 people who
responded to a questionnaire on the president’s environmental
policies.
 Variables include age, gender, state, children, salary, and opinion.
 Include a row that lists variable names.
 Include a column that shows an index of the observation.

2-2c Types of Data

 A variable is numerical if meaningful arithmetic can be performed on it.
 Otherwise, the variable is categorical.
 There is also a third data type, a date variable.
 Excel® stores dates as numbers, but dates are treated differently from
typical numbers.
 A categorical variable is ordinal if there is a natural ordering of its
possible values.
 If there is no natural ordering, it is nominal.

Types of Data

 A numerical variable is discrete if it results from a count, such as the number of children.
 A continuous variable is the result of an essentially continuous
measurement, such as weight or height.

Types of Data
Categorical: arithmetic operations do NOT make sense.
• Nominal: measurements are categories, and any numerical codes have no
mathematical significance. Example: color.
• Ordinal: categories have a rank order. Example: finishing position in a horse race.
Numerical: arithmetic operations make sense.
• Discrete: measurements come from counts. Example: size of household.
• Continuous: measurements are on a continuous scale. Example: bank savings, volume.
Test Yourself
Determine Types of Data.

Challenge Level 1:
• political party affiliation
• grades (A, A-, B+… F)
• test scores (SAT, ACT)
• size of household
Test Yourself
Determine Types of Data.

Challenge Level 2:
• Are you a smartphone owner? (1 = Y; 0 = N)
• President Trump
• Q1: Dependable _ _ _ _ _ _ _ _ _ Undependable
• Q2: Trustworthy _ _ _ _ _ _ _ _ _ Untrustworthy
• BZAN6310 is known to be extremely difficult.
1=Strongly Disagree 2=Disagree 3=Neutral 4=Agree 5=Strongly Agree
Types of Data

 Categorical variables can be coded numerically.


 A dummy variable is a 0–1 coded variable for a specific category.
 It is coded as 1 for all observations in that category and 0 for all
observations not in that category.
 A binned (or discretized) variable corresponds to a numerical
variable that has been categorized into discrete categories.
 These categories are usually called bins.
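
A minimal Python sketch of the two codings described above. The slides work in Excel; the data values, the bin edges, and the helper name `age_bin` here are hypothetical and chosen only for illustration.

```python
# Hypothetical survey answers (not from any course data set).
ages = [16, 22, 34, 61, 25, 18]
smartphone = ["Y", "N", "Y", "Y", "N", "Y"]

# Dummy variable: 1 for observations in the category of interest, 0 otherwise.
owner_dummy = [1 if answer == "Y" else 0 for answer in smartphone]


def age_bin(age):
    """Map a numerical age to a bin label (bin edges are an assumption)."""
    if age < 18:
        return "under 18"
    if age <= 25:
        return "18-25"
    if age <= 55:
        return "26-55"
    return "above 55"


# Binned variable: the numerical age replaced by its category.
age_bins = [age_bin(a) for a in ages]

print(owner_dummy)
print(age_bins)
```

Note that the binned version of age is categorical (ordinal), even though the original variable is numerical.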

Test Yourself
Determine Types of Data.

Challenge Level 3:
• Age
• What is your age? ___________
• What is your age? ___________
• 1 = under 18
• 2 = 18 – 25
• 3 = 26 – 55
• 4 = above 55
• What is your age? ___________
• Young
• Old
Example 2.1: Data from an Environmental Survey

• Objective: To illustrate variables and observations in a typical data set.
• Solution: Data set includes observations on 30 people who
responded to a questionnaire on the president’s environmental
policies.
• Variables include age, gender, state, children, salary, and opinion.
 Include a row that lists variable names.
 Include a column that shows an index of the observation.

Types of Data

 Cross-sectional data are data on a cross-section of a population at a distinct point in time.
 Time series data are data collected over time.

2-3 Descriptive Measures for
Categorical Variables

 There are only a few possibilities for describing a categorical variable, all based on counting:
 Count the number of categories.
 Give the categories names.

 Count the number of observations in each category. (The resulting counts can
be reported as “raw counts” or as percentages of totals.)
 Once you have the counts, you can display them graphically, usually in a column
chart or a pie chart.

Example 2.2: Supermarket Sales

 Objective: To summarize categorical variables in a large data set.
 Solution: Data set contains transactions made by
supermarket customers over a two-year period.
 Children, Units Sold, and Revenue are numerical.
 Purchase Date is a date variable.
 Transaction and Customer ID are used only to identify.
 All of the other variables are categorical.

Example 2.2: Supermarket Sales

 To get the counts in column S, use the Excel® function COUNTIF.
 To get the percentages in column T, divide each count by the total
number of observations.
 Keep charts simple so that the information they contain emerges
as clearly as possible.
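
The count-then-divide approach can also be sketched outside Excel. The Python snippet below is an illustration only: the `gender` values are made up and are not the supermarket data. It counts each category (the role COUNTIF plays in the worksheet) and then divides by the total number of observations.

```python
from collections import Counter

# Hypothetical values of one categorical column.
gender = ["F", "M", "F", "F", "M", "F", "M", "F"]

# One count per category, similar to one COUNTIF formula per category.
counts = Counter(gender)

# Percentage of total = count / number of observations.
total = len(gender)
percentages = {category: n / total for category, n in counts.items()}

for category in sorted(counts):
    print(category, counts[category], f"{percentages[category]:.1%}")
```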

Example 2.2: Supermarket Sales

 Another efficient way to find counts for a categorical variable is to use dummy (0–1) variables.
 Recode each variable so that
one category is replaced by 1
and all others by 0.
 This can be done using a simple IF
formula.
 Find the count of that category
by summing the 0s and 1s.
 Find the percentage of that
category by averaging the 0s
and 1s.
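
The dummy-variable steps above can be sketched in Python as follows. The data are hypothetical; in Excel the recoding would be an IF formula and the summaries would be SUM and AVERAGE.

```python
# Hypothetical categorical column.
gender = ["F", "M", "F", "F", "M", "F", "M", "F"]

# Recode: the category of interest becomes 1, all others become 0.
female_dummy = [1 if g == "F" else 0 for g in gender]

# Summing the 0s and 1s gives the count of the category.
female_count = sum(female_dummy)

# Averaging the 0s and 1s gives the proportion of the category.
female_share = sum(female_dummy) / len(female_dummy)

print(female_count, female_share)
```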
In-class Lab 1
• Descriptive Statistics for Categorical Variables exercise
• Excel: COUNTIF Function (also see Excel Tutorial)
• Charts
• Data set: Supermarket Transaction.xlsx
2-4 Descriptive Measures for
Numerical Variables

 There are many ways to summarize numerical variables, both with numerical summary measures and with charts.
 We begin with a numerical variable such as Salary, where there is one
observation for each person. Our basic goal is to learn how these
salaries are distributed across people by asking:
1. What are the most “typical” salaries?
2. How spread out are the salaries?
3. What are the “extreme” salaries on either end?
4. Is a chart of the salaries symmetric about some middle value, or is it
skewed in one direction?
Example 2.3: Baseball Salaries

 Objective: To learn how salaries are distributed across all 2015 MLB players.
 Solution: Data set contains data on 868 Major League
Baseball players in the 2015 season.
 Variables are player’s name, team, position, and salary.

2-4a Numerical Summary Measures

 Throughout this section, we focus on a Salary variable.


 Measures of Central Tendency
 Minimum, Maximum, Percentiles, and Quartiles

 Measures of Variability

 Empirical Rules for Interpreting Standard Deviation

 Measures of Shape

Measures of Central Tendency
(slide 1 of 3)

 The mean is the average of all values.


 If the data set represents a sample from some larger population, this
measure is called the sample mean and is denoted by X̄ (“X-bar”).
 If the data set represents the entire population, it is called the population
mean and is denoted by μ.

 In Excel®, the mean can be calculated with the AVERAGE function.

Measures of Central Tendency
(slide 2 of 3)

 The median is the middle observation when the data are sorted from
smallest to largest.
 If the number of observations is odd, the median is literally the middle
observation.
 If the number of observations is even, the median is usually defined as the
average of the two middle observations.
 In Excel®, the median can be calculated with the MEDIAN function.

Measures of Central Tendency
(slide 3 of 3)

 The mode is the value that appears most often.


 In Excel®, the mode can be calculated with the MODE function.
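
For comparison with the Excel functions AVERAGE, MEDIAN, and MODE, here is a small Python sketch using the standard `statistics` module. The salary figures are hypothetical (in thousands of dollars) and include one large value to show how the mean and median can differ.

```python
import statistics

# Hypothetical salaries in $1000s; 120 is a deliberately large value.
salaries = [40, 45, 45, 50, 60, 120]

mean_salary = statistics.mean(salaries)      # sum of values / number of values
median_salary = statistics.median(salaries)  # average of the two middle values (even count)
mode_salary = statistics.mode(salaries)      # most frequent value

print(mean_salary, median_salary, mode_salary)
```

With these numbers the mean is pulled above the median by the single large salary, which is the usual pattern for right-skewed data.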

Example 2.3: Baseball Salaries
(slide 2 of 10)

Minimum, Maximum, Percentiles, and Quartiles

 For any percentage p, the pth percentile is the value such that a
percentage p of all values are less than it.
 The quartiles divide the data into four groups, each with
(approximately) a quarter of all observations.
 The first, second, and third quartiles are the percentiles corresponding to p
= 25%, p = 50%, and p = 75%.
 By definition, the second quartile (p = 50%) is equal to the median.

 The minimum and maximum values can be calculated with MIN and
MAX functions, and the percentiles and quartiles with PERCENTILE and
QUARTILE functions in Excel®.
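
A short Python sketch of these measures on hypothetical data. The `method="inclusive"` option of `statistics.quantiles` interpolates between sorted values; it is used here on the assumption that it mirrors the convention behind Excel's QUARTILE function, so small differences between tools are possible on other data.

```python
import statistics

# Hypothetical observations.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90]

minimum = min(data)
maximum = max(data)

# n=4 splits the data into four groups and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(minimum, q1, q2, q3, maximum)
```

The second quartile equals the median, as stated above.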
Measures of Variability
(slide 1 of 4)

 The range is the maximum value minus the minimum value.
 The interquartile range (IQR) is the third quartile
minus the first quartile.
 Thus, it is the range of the middle 50% of the data.
 It is less sensitive to extreme values than the range.

 The variance is essentially the average of the squared deviations from the mean.
 If Xi is a typical observation, its squared deviation from
the mean is (Xi – mean)².
Measures of Variability
(slide 2 of 4)

 The sample variance is denoted by s², and the population variance by σ².

 If all observations are close to the mean, their squared deviations from the
mean—and the variance—will be relatively small.
 If at least a few of the observations are far from the mean, their squared
deviations from the mean—and the variance—will be large.
 In Excel®, use the VAR.S function to obtain the sample variance and the
VAR.P function to obtain the population variance.
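
A minimal Python sketch of the calculation, using hypothetical data. The only difference between the two versions is the divisor: the sample variance divides the sum of squared deviations by n − 1, the population variance by n.

```python
import statistics

# Hypothetical observations.
values = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(values)
mean = sum(values) / n
squared_deviations = [(v - mean) ** 2 for v in values]

population_variance = sum(squared_deviations) / n    # divisor n
sample_variance = sum(squared_deviations) / (n - 1)  # divisor n - 1

# The standard library offers the same two calculations directly.
print(population_variance, statistics.pvariance(values))
print(sample_variance, statistics.variance(values))
```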
Measures of Variability
(slide 3 of 4)

 A fundamental problem with variance is that it is in squared units (e.g., $ → $²).
 A more natural measure is the standard deviation, which is the
square root of the variance.
 The sample standard deviation, denoted by s, is the square root of
the sample variance.
 The population standard deviation, denoted by σ, is the square root
of the population variance.
 In Excel®, use the STDEV.S function to find the sample standard
deviation or the STDEV.P function to find the population standard
deviation.
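
As a quick check of the square-root relationship, the following Python sketch (hypothetical data) computes both standard deviations and compares them with the square roots of the corresponding variances.

```python
import math
import statistics

# Hypothetical observations, e.g. in dollars.
values = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(values)       # square root of the sample variance
population_sd = statistics.pstdev(values)  # square root of the population variance

print(sample_sd, math.sqrt(statistics.variance(values)))
print(population_sd, math.sqrt(statistics.pvariance(values)))
```

Both results are in the original units (dollars), not squared dollars.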
Measures of Variability
(slide 4 of 4)

Empirical Rules for Interpreting Standard Deviation

 The interpretation of the standard deviation can be stated as three empirical rules.
 “Empirical” means that they are based on commonly
observed data, as opposed to theoretical mathematical
arguments.
 If the values of a variable are approximately normally
distributed (symmetric and bell-shaped), then the following
rules hold:
 Approximately 68% of the observations are within one standard
deviation of the mean.
 Approximately 95% of the observations are within two standard
deviations of the mean.
 Approximately 99.7% of the observations are within three
standard deviations of the mean.
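
The three rules can be checked on simulated data. The Python sketch below is illustrative only: it generates approximately normal values with an assumed mean of 100 and standard deviation of 15, then computes the share of observations within one, two, and three standard deviations of the mean. The helper name `share_within` is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulated data are reproducible

# Simulated, approximately normal data (assumed mean 100, sd 15).
values = [random.gauss(100, 15) for _ in range(10_000)]

mean = statistics.mean(values)
sd = statistics.stdev(values)


def share_within(k):
    """Fraction of observations within k standard deviations of the mean."""
    return sum(abs(v - mean) <= k * sd for v in values) / len(values)


for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```

The printed shares should be close to 68%, 95%, and 99.7%; for clearly skewed data they can differ noticeably.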
Example 2.3: Baseball Salaries

skewed to the right (or positively skewed)


Example 2.3: Baseball Salaries

Empirical Rules for Interpreting Standard Deviation

 The empirical rules should be applied with caution, especially when the data are clearly skewed, as illustrated by the calculations for baseball salaries below.

Empirical Rules for Interpreting Standard Deviation

 The mean absolute deviation (MAD) is the average of the absolute deviations from the mean.

 In Excel®, use the AVEDEV function to calculate MAD.


 There is another empirical rule for MAD: For many variables, the
standard deviation is approximately 25% larger than MAD.
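
A small Python sketch of MAD on hypothetical data. It also prints the ratio of the sample standard deviation to MAD; with only eight made-up values the ratio will not match the 25% rule closely, which is expected because the rule is approximate.

```python
import statistics

# Hypothetical observations.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)

# MAD: average of the absolute deviations from the mean.
mad = sum(abs(v - mean) for v in values) / len(values)

# Ratio of standard deviation to MAD.
ratio = statistics.stdev(values) / mad

print(mad, round(ratio, 3))
```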

Measures of Shape
(slide 1 of 2)

 Skewness occurs when there is a lack of symmetry.


 A variable can be skewed to the right (or positively skewed) because of
some really large values (e.g., really large baseball salaries).
 Or it can be skewed to the left (or negatively skewed) because of some
really small values (e.g., temperature lows in Antarctica).
 In Excel®, a measure of skewness can be calculated with the SKEW
function.
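
One common sample skewness formula, which Excel's documentation gives for SKEW, is n / ((n − 1)(n − 2)) times the sum of cubed standardized deviations. The Python sketch below implements that formula under this assumption; the function name `sample_skewness` and the data are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]  # one really large value

print(sample_skewness(symmetric))     # essentially zero
print(sample_skewness(right_skewed))  # positive
```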

Example 2.3: Baseball Salaries

skewed to the right (or positively skewed)


Measures of Shape
(slide 2 of 2)

 Kurtosis has to do with the “fatness” of the tails of the distribution relative to the tails of a normal distribution.
 A distribution with high kurtosis has many more extreme observations.
 In Excel®, kurtosis can be calculated with the KURT function.
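
A hedged Python sketch of sample excess kurtosis, using the adjusted formula that Excel's documentation gives for KURT (treated here as an assumption about what the function computes). Values near zero indicate tails similar to a normal distribution; positive values indicate fatter tails. The function name and data are hypothetical.

```python
import statistics


def sample_excess_kurtosis(values):
    """Adjusted sample excess kurtosis (requires at least 4 observations)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    fourth_powers = sum(((v - mean) / s) ** 4 for v in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_powers - correction


flat = [1, 2, 3, 4, 5]                  # evenly spread, thin tails
heavy_tail = [1, 2, 2, 3, 3, 3, 4, 20]  # one extreme observation

print(sample_excess_kurtosis(flat))
print(sample_excess_kurtosis(heavy_tail))
```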

Histogram Example: Late or Lost Baggage

2-4b Numerical Summary Measures with StatTools

 Although built-in functions in Excel® can be used to calculate a number of summary measures, a much quicker way is to use the StatTools add-in.
 StatTools is part of the Palisade DecisionTools Suite®. Once the suite is installed, load StatTools by double-clicking the StatTools item in the list of programs on the Windows Start menu.
 You will know that StatTools is loaded when you see the
StatTools tab and ribbon.

Basic StatTools Features
(slide 3 of 5)

 Before you can perform any statistical analysis, you must define a StatTools data set.
 You do this by clicking the Data
Set Manager button.
 Make sure any cell in the data set
is selected, and click the Data Set
Manager button.
 StatTools makes several guesses
about your data set. You can
always override them.
Basic StatTools Features
(slide 4 of 5)

 To generate the summary measures for the Salary variable, select One-Variable Summary from the Summary Statistics dropdown list.
 This is a typical StatTools
dialog box. In the top section,
you can select a StatTools
data set and one or more
variables. In the bottom
section, you can select the
measures you want.
 You might want to choose
only your favorite summary
measures as the defaults.
Basic StatTools Features
(slide 5 of 5)

 There are several other things to note about the StatTools output.
 First, it formats the results
according to its own rules.
 Second, the fact that there are
formulas in these result cells
indicates that they are “live.”
If you go back to the data
and change any of the
salaries, the summary
measures will update
automatically.
In-class Lab 2
• Descriptive Statistics for Numerical Variables exercise
• Excel: StatTools (also see StatTools Tutorial)
• Data set: Baseball Salaries.xlsx
2-4d Charts for Numerical Variables

 There are many graphical ways to indicate the distribution of a numerical variable.
 For cross-sectional variables:
 Histograms
 Box plots (also called box-whisker plots)
 For time series variables:
 Time series graphs

Histograms

 A histogram is the most common type of chart for showing the distribution of a numerical variable.
 It is based on binning the variable—that is, dividing it up into discrete
categories.
 It is a column chart of the counts in the various categories (with no gaps
between the vertical bars).
 A histogram is great for showing the shape of a distribution—whether
the distribution is symmetric or skewed in one direction.
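
The binning step behind a histogram can be sketched in a few lines of Python. The values and the bin width of 10 are hypothetical; the text output is only a stand-in for the column chart.

```python
from collections import Counter

# Hypothetical numerical observations.
values = [3, 7, 12, 14, 18, 19, 21, 25, 33, 47]
bin_width = 10  # assumed bin width

# Assign each value to the lower edge of its bin, then count values per bin.
bin_counts = Counter((v // bin_width) * bin_width for v in values)

# Text histogram: one row per bin, one '#' per observation.
for start in sorted(bin_counts):
    print(f"[{start:>2}, {start + bin_width:>2})", "#" * bin_counts[start])
```

Changing `bin_width` changes the apparent shape, which is why Excel and StatTools let you adjust the bins.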

Example 2.3: Baseball Salaries

 Objective: To see the shape of the salary distribution through a histogram.
 Solution: Using Excel 2016®, select the Salary variable and choose
the Histogram chart type from the Statistics group on the Insert ribbon.
 If you are not yet using Excel 2016®, it is possible, but tedious, to create a
histogram with Excel® tools only.

Example 2.3: Baseball Salaries

 Excel® automatically chooses bins for the histogram, but you can right-click the horizontal axis and select Format Axis, where you can choose the Bin width or the Number of bins (but not both).

Example 2.3: Baseball Salaries

 The resulting histogram has one added enhancement, where data labels “outside” the bars
indicate the counts of the various bins.
 You can use the Chart Elements dropdown list on the
Chart Tools Design ribbon to add the data labels.

Example 2.3: Baseball Salaries

 It is much easier to create a histogram with StatTools.
 First, designate a StatTools
data set.
 Next, select Histogram from
the Summary Graphs
dropdown list.
 In the dialog box, select the
Salary variable and click OK
to get the histogram with the
StatTools “auto” bins.
 Change the options at the
bottom of the dialog box to
fine-tune the bins.
Example 2.3: Baseball Salaries

Histogram Example: Late or Lost Baggage

 This example file contains the count of bags late or lost for 456 flights.
 The “natural” bins are the integer values 0 to 8, which match the
possible values of the counts.

Histogram Example: Late or Lost Baggage

Box Plots

 A box plot (or box-whisker plot) is an alternative type of chart for showing the distribution of a variable.
 Side-by-side box plots are very useful for comparing distributions.
 Box plots and histograms are complementary ways of displaying the
distribution of a numerical variable.
 As with histograms, box plots are “big picture” charts.
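
The numbers behind a box plot can be sketched as follows. The data are hypothetical, and the 1.5 × IQR rule for flagging outliers is a common convention assumed here; the slides do not state which rule Excel or StatTools applies.

```python
import statistics

# Hypothetical salaries in $1000s, with one very large value.
salaries = [10, 20, 30, 40, 50, 60, 70, 80, 200]

# The box spans the first to third quartile, with a line at the median.
q1, median, q3 = statistics.quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in salaries if v < lower_fence or v > upper_fence]

print(q1, median, q3, iqr, outliers)
```

A single outlier far above the box, with none below, is one way a box plot signals right skewness.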

Example 2.3: Baseball Salaries

 Objective: To illustrate the features of a box plot, particularly how it indicates skewness.
 Solution in Excel®: Create a box plot of a single variable
like Salary almost exactly like you create a histogram.
 Select the variable and then choose the Box and Whisker
chart type from the Statistics group on the Insert ribbon.

Example 2.3: Baseball Salaries

 In StatTools, select Box-Whisker Plot from the Summary Graphs dropdown list and fill in the dialog box.
 To get a generic box plot, you can check the bottom
option.

Example 2.3: Baseball Salaries

Example 2.3: Baseball Salaries

skewed to the right (or positively skewed)


2-5 Time Series Data

 Our main interest in time series variables is how they change over time,
and this information is lost in traditional summary measures and in
histograms or box plots.
 For time series data, a time series graph is used. This is a graph of
the values of one or more time series, using time on the horizontal axis.
 This is always the place to start a time series analysis.

Example 2.4: Crime in United States
(slide 1 of 6)

Example 2.4: Crime in United States
(slide 2 of 6)

 Objective: To see how time series graphs help to detect trends in crime data.
 Solution: Data set contains annual
data on violent and property
crimes for the years 1960 to 2010.
 In StatTools, designate a StatTools
data set.
 Then select Time Series Graph
from the Time Series and
Forecasting dropdown list and fill in
the resulting dialog box.

Example 2.4: Crime in United States
(slide 3 of 6)

Example 2.4: Crime in United States
(slide 4 of 6)

Violent and Property Crime Rates

Example 2.4: Crime in United States
(slide 5 of 6)

Rates of Violent Crime Types

Example 2.5: The DJIA Index
(slide 1 of 2)

 Objective: To find useful ways to summarize the monthly Dow data.
 Solution: Data set contains monthly values of the Dow from
1950 through mid-2015.
 Create summary measures and time series graphs for
monthly values and percentage changes of the Dow.
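
The percentage-change series can be sketched in Python as below. The index values are hypothetical and are not actual Dow data; each change is computed as (current − previous) / previous, so the resulting series has one fewer value than the original.

```python
# Hypothetical month-end index values (not actual Dow data).
index_values = [100.0, 110.0, 99.0, 99.0]

# Percentage change from each month to the next.
pct_changes = [
    (current - previous) / previous
    for previous, current in zip(index_values, index_values[1:])
]

for change in pct_changes:
    print(f"{change:+.1%}")
```

Summary measures such as the mean and standard deviation are often more meaningful for the percentage changes than for the raw index values, which trend over time.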

Example 2.5: The DJIA Index
(slide 2 of 2)

In-class Lab 3 & 4
Lab 3
• Charts for Numerical Variables exercise
• Excel: StatTools (also see StatTools Tutorial)
• Histogram & box plots
• Data set: Baseball Salaries.xlsx

Lab 4
• Time series data exercise
• Excel: StatTools
• Time series graph
• Data set: Crime in US.xlsx
Class Summary
To do
• Quiz 0
• Reading Assignments (on Syllabus)
• Homework (#38, 41, 48; pp. 73-76)
• Say hello to Jove (optional)
2-7 Excel® Tables for Filtering, Sorting, and Summarizing

 Tables are a tool introduced in Excel® 2007.


 You now have the ability to designate a rectangular data set as a
table and then employ a number of powerful tools for analyzing
tables.
 These tools include:
 Filtering

 Sorting

 Summarizing

