General Information
The internet lets us gather seemingly endless information, so we need to know
how to process that data. At the basic level, that’s all Data Analysis and
Statistics is: the collecting, processing, organizing, and modeling of
information. These skills are increasingly important not only in the workplace,
but also for savvy consumers of advertised products and information.
So let’s get started!
Basic Vocabulary
If you’re trying to understand how a large number of people, animals, or objects
behave as a group, it’s best to use a measure of central tendency to describe the
behavior. For all three measures of central tendency below, let’s analyze the set
of grades you’ve received on 10 different math tests:
{85,73,99,80,82,90,93,77,84,90}
Mean
Usually, your math teacher will use this measure of central tendency to calculate
your final grade in the class. The mean is just the average. To find the mean, take
the sum of all the values and divide it by the total number of tests.
$$\text{mean} = \frac{85+73+99+80+82+90+93+77+84+90}{10\ \text{tests}}$$
$$\text{mean} = \frac{853}{10}$$
$$\text{mean} = 85.3$$
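The same arithmetic can be checked with a short Python sketch. The variable names (`grades`, `total`, `mean`) are illustrative choices, not part of the original lesson.

```python
# The 10 test grades from the example above.
grades = [85, 73, 99, 80, 82, 90, 93, 77, 84, 90]

# Step 1: add up all the values.
total = sum(grades)

# Step 2: divide the sum by the number of values.
mean = total / len(grades)

print(total)  # 853
print(mean)   # 85.3
```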
Median
The median is just the middle value of a data set. To find the median, the first step
is to order the data from least to greatest:
{73,77,80,82,84,85,90,90,93,99}
Now, find the middle number by crossing out the highest and lowest values and
working your way in.
{77,80,82,84,85,90,90,93}
{80,82,84,85,90,90}
{82,84,85,90}
{84,85}
Notice that there are two middle numbers, 84 and 85:
{73,77,80,82,[84,85],90,90,93,99}
If this happens (which it will every time there is an even number of values), find
the average of the two middle numbers:
$$\text{median} = \frac{84+85}{2} = \frac{169}{2} = 84.5$$
NOTE: You don’t have to do this if there is an odd number of values. Let’s look at
this set:
{1,4,5,8,10}
Now start crossing out and you’ll arrive at just one middle number: 5.
{1,4,5,8,10}
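As a rough sketch, the steps for finding the median can be written in Python. The function name `median` is an illustrative choice. Instead of crossing out values pair by pair, the sketch jumps straight to the middle position of the sorted list, which gives the same result.

```python
def median(values):
    """Return the middle value of a data set (hypothetical helper)."""
    ordered = sorted(values)      # order the data from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: exactly one middle number.
        return ordered[mid]
    # Even number of values: average the two middle numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2


grades = [85, 73, 99, 80, 82, 90, 93, 77, 84, 90]
print(median(grades))            # 84.5
print(median([1, 4, 5, 8, 10]))  # 5
```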
Mode
The mode is the easiest of the 3 measures of central tendency to find.
The mode is just the most often occurring value. In the set of test grades above,
90 occurs twice, so 90 is the mode.
If all values occur the same number of times, we say there is no mode.
Occasionally, two or more values might tie for occurring most often, like in the
following set:
{4,4,5,8,8,9,10}
Here, 4 and 8 each occur twice, so both are modes.
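One possible way to count occurrences in Python is with `collections.Counter` from the standard library. The function name `modes` is an illustrative choice. It returns a list so that it can report one mode, several modes, or no mode at all, following the rule above that a set has no mode when every value occurs the same number of times.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most often occurring values (hypothetical helper)."""
    counts = Counter(values)          # value -> number of occurrences
    if len(set(counts.values())) == 1:
        # Every value occurs the same number of times: no mode.
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([85, 73, 99, 80, 82, 90, 93, 77, 84, 90]))  # [90]
print(modes([4, 4, 5, 8, 8, 9, 10]))                    # [4, 8]
print(modes([1, 4, 5, 8, 10]))                          # []
```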
Types of Graphs
In order to make data meaningful, we’ve found ways to express the data as a type
of graph. They can be used to show trends, comparisons, or the bigger picture and
are often easier to use than lists of data.
Circle Graph
Circle graphs (or pie charts) are useful when showing percentages. Pie charts help
to put things in perspective. Bigger pieces of the pie represent higher percentages.
In this example, the whole circle represents United States federal spending in
the 2017 fiscal year. You can see at a glance that the most money was spent
on health care, pensions, and defense. Look a little closer to find the actual
percentages.
Retrieved from:
https://www.usgovernmentspending.com/year_spending_2017USbf_XXbs2n
When making an accurate circle graph, you’ll need to first calculate percentages.
Remember,
$$\text{Percent} = \frac{\text{part}}{\text{whole}} \cdot 100$$
Then, you’ll need to draw a piece of the pie to represent it. For a quick reference,
know that 25% takes up 1/4 of the circle, or a 90° angle.
If you need to find the exact angle, use this formula:
$$\text{Angle} = \frac{\text{percent}}{100} \cdot 360^\circ$$
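The two formulas can be combined in a small Python sketch. The function names and the three-category budget are hypothetical, made up only to show the calculation. They are not taken from the federal spending chart.

```python
def percent(part, whole):
    """Percent = part / whole * 100 (multiplying first keeps the arithmetic exact here)."""
    return part * 100 / whole


def angle(pct):
    """Angle in degrees = percent / 100 * 360."""
    return pct * 360 / 100


# Hypothetical monthly budget, for illustration only.
budget = {"rent": 500, "food": 300, "other": 200}
whole = sum(budget.values())

for category, amount in budget.items():
    pct = percent(amount, whole)
    print(category, pct, angle(pct))

# Quick reference from the text: 25% is a quarter of the circle.
print(angle(25))  # 90.0
```

The angles for all of the pieces should add up to 360 degrees, which is a handy check before drawing the graph.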
Bar Graph
Bar graphs or charts can be used almost any time you would use a circle chart.
They can be arranged vertically or horizontally. Each bar corresponds to an actual
amount instead of a percentage.
Horizontal Bar Chart
Retrieved from:
https://www.usgovernmentspending.com/spending_chart_2003_2023USb_19s1li001mcn_F0t
Line Graph
Line graphs are made by collecting data at regular intervals. In this case, it was
every year. A dot is placed at each data point. In 2010, for instance, the
government spent 6000 billion dollars. Once all the dots are drawn, connect them
from left to right. Sometimes graph makers choose to keep the dots visible;
otherwise, they blend them into the line.
How much money was spent in 2016? Move over to 2016 on the bottom axis. The
line above it has a height of just under 7000 billion dollars.
Scatterplot
A scatterplot is composed of two axes. Each piece of data is represented as a dot
on the plane.
Here’s an example comparing heights and weights of various people.
Read the data the same way you’d read points on the x-y plane. For example, you
can see that there is a 4 ft person who weighs roughly 125 pounds (see the lower
left dot). In total, there are 6 people who submitted their weights and heights.
Histogram
A basic histogram is a line chart mixed with a bar graph. It again has two axes,
usually with time on the bottom. A dot is made for each date. Instead of drawing
lines connecting the dots, draw a bar going up to each dot.
{45,75,76,78,80,82,82,85,90,92,95,99}
{45,75,76,78,80,82,‖82,85,90,92,95,99}
You can calculate the range of data by subtracting the minimum from the
maximum.
$$\text{Range} = \text{Maximum} - \text{Minimum} = 99 - 45 = 54$$
You can also find the interquartile range or IQR by subtracting Q1 from Q3.
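A minimal Python sketch of both calculations follows. The text does not say how Q1 and Q3 are found, so the sketch assumes the common classroom convention of splitting the ordered data at the median and taking the median of each half, which matches the split marked by ‖ in the set above. Other conventions can give slightly different quartiles. The function names are illustrative.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd number of values, leave the median out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)


data = [45, 75, 76, 78, 80, 82, 82, 85, 90, 92, 95, 99]

data_range = max(data) - min(data)
q1, q3 = quartiles(data)
iqr = q3 - q1

print(data_range)  # 54
print(q1, q3)      # 77.0 91.0
print(iqr)         # 14.0
```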