
Data Analysis and Statistics Study Guide for the Math Basics

How to Prepare for the Data Analysis and Statistics Questions on a Math Test

General Information
With the ability to gather seemingly endless information due to the internet, we
need to know how to process the data. At the basic level, that’s all Data Analysis
and Statistics is: the collecting, processing, organizing, and modeling of
information. It’s becoming increasingly important to know these skills not only in the
workplace, but also as a savvy consumer of advertised products and information.
So let’s get started!

Basic Vocabulary
If you’re trying to understand how many people, animals, or objects behave as a
group, it’s best to use a measure of central tendency to describe the behavior. For
all three measures of central tendency below, let’s analyze the set of grades you’ve
had on 10 different math tests:

{85,73,99,80,82,90,93,77,84,90}

Mean
Usually, your math teacher will use this measure of central tendency to calculate
your final grade in the class. The mean is just the average. To find the mean, take
the sum of all the values and divide it by the total number of tests.

$$\text{mean} = \frac{\text{sum of values}}{\text{number of values}}$$


So, let’s find the average (or mean) of the test scores.

$$\text{mean} = \frac{85+73+99+80+82+90+93+77+84+90}{10\ \text{tests}}$$

$$\text{mean} = \frac{853}{10}$$

$$\text{mean} = 85.3$$
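The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and `statistics.mean` from the standard library is used purely as a cross-check.

```python
from statistics import mean

# Test scores from the example above
scores = [85, 73, 99, 80, 82, 90, 93, 77, 84, 90]

total = sum(scores)        # sum of values
count = len(scores)        # number of values
average = total / count    # mean = sum of values / number of values

print(total, count, average)   # 853 10 85.3
print(mean(scores))            # 85.3 (library cross-check)
```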

Median
The median is just the middle value of a data set. To find the median, the first step
is to order the data from least to greatest:
{73,77,80,82,84,85,90,90,93,99}
Now, find the middle number by crossing out the highest and lowest values and
working your way in.

Cross out 73 and 99: {77,80,82,84,85,90,90,93}

Cross out 77 and 93: {80,82,84,85,90,90}

Cross out 80 and 90: {82,84,85,90}

Cross out 82 and 90: {84,85}
Notice, there are two middle numbers: 84 and 85.
If this happens (which it will every time there is an even number of values), find
the average of the two middle numbers:

$$\text{median} = \frac{84+85}{2} = \frac{169}{2} = 84.5$$
NOTE: You don’t have to do this if there is an odd number of values. Let’s look at
this set:

{1,4,5,8,10}

Now start crossing out (first 1 and 10, then 4 and 8) and you’ll arrive at just one
middle number: 5.
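The crossing-out procedure amounts to sorting the values and picking the middle position. Here is a minimal sketch; the function name `middle_value` is made up for illustration, and `statistics.median` from the standard library is shown only as a cross-check.

```python
from statistics import median

def middle_value(values):
    """Return the median: the middle value, or the average of the two middle values."""
    ordered = sorted(values)          # step 1: order from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: exactly one middle number
        return ordered[mid]
    # even count: average the two middle numbers
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [85, 73, 99, 80, 82, 90, 93, 77, 84, 90]
print(middle_value(scores))            # 84.5
print(middle_value([1, 4, 5, 8, 10]))  # 5
print(median(scores))                  # 84.5 (library cross-check)
```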

Mode
The mode is the easiest of the three measures of central tendency to find.
The mode is just the most frequently occurring value. In the test scores above, 90
occurs twice and every other score occurs once, so 90 is the mode.
If all values occur the same number of times, we say there is no mode.
Occasionally, two or more values might occur more than the others, like in the
following set:

{4,4,5,8,8,9,10}

In this case, both 4 and 8 are modes.
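Counting how often each value occurs makes all three cases easy to check. The sketch below follows this guide's convention that a set has no mode when every value occurs equally often; the function name `modes` is an illustrative choice, not a standard API.

```python
from collections import Counter

def modes(values):
    """Return a sorted list of modes; an empty list means 'no mode'."""
    if not values:
        return []
    counts = Counter(values)              # value -> number of occurrences
    highest = max(counts.values())
    # Convention used in this guide: if all (distinct) values tie, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

scores = [85, 73, 99, 80, 82, 90, 93, 77, 84, 90]
print(modes(scores))                   # [90]
print(modes([4, 4, 5, 8, 8, 9, 10]))   # [4, 8]
print(modes([1, 2, 3]))                # []
```

Note that some libraries handle the all-tied case differently (for example, by returning every value), so the "no mode" rule here is written out explicitly.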

Types of Graphs
In order to make data meaningful, we’ve found ways to express the data as a type
of graph. They can be used to show trends, comparisons, or the bigger picture and
are often easier to use than lists of data.

Circle Graph
Circle graphs (or pie charts) are useful when showing percentages. Pie charts help
to put things in perspective. Bigger pieces of the pie represent higher percentages.
In this example, the whole circle represents United States federal spending in
the 2017 fiscal year. You can see at a glance that the most money was spent
on health care, pensions, and defense. Look a little closer to find the actual
percentages.

Retrieved from:
https://www.usgovernmentspending.com/year_spending_2017USbf_XXbs2n
When making an accurate circle graph, you’ll need to first calculate percentages.
Remember,

$$\text{Percent} = \frac{\text{part}}{\text{whole}} \cdot 100$$
Then, you’ll need to draw a piece of the pie to represent it. For a quick reference,
know that 25% takes up 1/4 of the circle, or a 90° angle.
If you need to find the exact angle, use this formula:

$$\text{Angle} = \frac{\text{percent}}{100} \cdot 360^\circ$$
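The two formulas can be combined to turn raw amounts into slice angles. This is a rough sketch: the function names are illustrative, and the amounts are made-up numbers, not values from the spending chart.

```python
def percent(part, whole):
    """Percent = part / whole * 100 (multiplying first keeps round numbers exact)."""
    return 100 * part / whole

def slice_angle(pct):
    """Angle in degrees = percent / 100 * 360."""
    return pct * 360 / 100

# Hypothetical amounts, made up for illustration only
amounts = {"A": 50, "B": 25, "C": 25}
whole = sum(amounts.values())

for label, part in amounts.items():
    pct = percent(part, whole)
    print(label, pct, slice_angle(pct))

print(slice_angle(25))   # 90.0, the quick-reference case: 25% is a quarter circle
```

The angles for all slices should add up to 360°, which is a handy check before drawing.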
Bar Graph
Bar graphs or charts can be used almost any time you would use a circle chart.
They can be arranged vertically or horizontally. Each bar corresponds to an actual
amount instead of a percentage.
Horizontal Bar Chart

Vertical Bar Chart


Occasionally, the actual value is placed at the tip of the bar to make it easier for the
reader.
Line Graph
Line graphs have two axes, and the bottom axis is usually time (years,
months, days, etc.). They are useful for showing trends in data. For example,
this graph seems to show that US spending increases as time goes on.

Retrieved from:
https://www.usgovernmentspending.com/spending_chart_2003_2023USb_19s1li001mcn_F0t
Line graphs are made by collecting data at regular intervals. In this case, it was every
year. A dot is placed at each data point. In 2010, for instance, the government
spent 6000 billion dollars. Once all the dots are drawn, connect them from left to
right. Sometimes graph makers keep the dots visible; otherwise, they
blend them into the line.
How much money was spent in 2016? Move over to 2016 on the bottom axis. The
line above it has a height of just under 7000 billion dollars.

Scatterplot
A scatterplot is composed of two axes. Each piece of data is represented as a dot
on the plane.
Here’s an example comparing heights and weights of various people.
Read the data the same way you’d read points on the x-y plane. For example, you
can see that there is a 4 ft person who weighs roughly 125 pounds (see the lower
left dot). In total, there are 6 people who submitted their weights and heights.

Histogram
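A basic histogram looks like a line chart mixed with a bar graph. It again has two
axes, and in the examples here time is on the bottom. A dot is made for each date,
but instead of drawing lines connecting the dots, you draw a bar going up to each dot.
(Strictly speaking, a histogram groups numeric values into intervals and uses the
height of each bar to show how many values fall in that interval; the bars are read
the same way.)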

Retrieved from: https://www.usgovernmentspending.com/


Sometimes, you can make more interesting histograms. You can break up the bars
similar to the way you’d break up the pieces of a pie chart. In the case below, the
author wants to show that the combined total money spent on housing, food, and
clothing has been declining over time. In addition, each bar has been split up to
show how each particular amount has changed over time. At a glance, it looks like
the clothing and food sections have decreased over time while housing has
fluctuated.
Retrieved from: http://visualizingeconomics.com/blog/2013/11/18/100-years-of-family-spending-in-the-us
Box Plot
To create a box plot or box and whisker plot you need to find 5 things about your
data set: minimum, first quartile, median, third quartile, and maximum.
Start by ordering your data. In this example let’s look at a few test scores in a
class:

{45,75,76,78,80,82,82,85,90,92,95,99}

The minimum is the smallest number: 45.
The maximum is the largest number: 99.
The median is the middle number. In this case, both 82s are in the middle, so
their average is 82.
Now, split the set right in the middle into two sets.

{45,75,76,78,80,82,‖82,85,90,92,95,99}

The lower half: {45,75,76,78,80,82}
The upper half: {82,85,90,92,95,99}
The first quartile, or Q1, is the median of the lower half: 77 (the average
of 76 and 78).
The third quartile, or Q3, is the median of the upper half: 91 (the average
of 90 and 92).
Now that you have all five numbers, draw a box that encloses everything between
Q1 and Q3. Then extend the whiskers out to the minimum and maximum.

You can calculate the range of data by subtracting the minimum from the
maximum.
Range=Maximum−Minimum=99−45=54
You can also find the interquartile range or IQR by subtracting Q1 from Q3.
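For the test scores above, that gives IQR = Q3 − Q1 = 91 − 77 = 14. The whole five-number summary can be sketched in Python as follows. The function name is illustrative, and the odd-count branch (leaving the median out of both halves) is an assumption, since this guide only works through an even-sized set; other tools may compute quartiles slightly differently.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the split-in-half method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for an odd count, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

scores = [45, 75, 76, 78, 80, 82, 82, 85, 90, 92, 95, 99]
minimum, q1, med, q3, maximum = five_number_summary(scores)

print(minimum, q1, med, q3, maximum)   # 45 77.0 82.0 91.0 99
print(maximum - minimum)               # 54   (range)
print(q3 - q1)                         # 14.0 (interquartile range)
```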
