# GV207 Political Analysis, Week 02

## Descriptive statistics 1: Levels of measurement and central tendency

### Variable types

We can distinguish three main types of variables (amongst other, less common types).

| Variable type | Definition |
| --- | --- |
| Categorical variable | A variable with a finite number of categories, e.g. gender, education level, responses to a five-point rating scale. |
| Continuous variable | A variable that can vary by infinitesimally small degrees, e.g. age, income. |
| Dummy variable | A variable that is 1 if a certain property is met and 0 otherwise, e.g. being female, voting for Obama, occurrence of civil war. |
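
The definition of a dummy variable can be illustrated with a short Stata sketch. The data below are typed in by hand purely for illustration, and the variable name `weurope` is hypothetical; nothing here comes from the course data set.

```stata
* Illustrative data only: four made-up cases, one with a missing region code
clear
input region
7
8
3
.
end

* weurope (hypothetical name) is 1 if region equals 7 and 0 otherwise;
* the if-qualifier keeps missing values missing instead of coding them as 0
generate byte weurope = (region == 7) if !missing(region)

* Show the distribution of the new dummy, including the missing case
tabulate weurope, missing
```

The `if !missing(region)` part is a design choice worth keeping: without it, a case with no region code would silently be counted as 0.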

### Levels of measurement

One of the most important characteristics of a variable is its level of measurement. Variables
with higher levels of measurement provide us with more information. The level of measurement
also determines which statistical techniques we can use to analyse a variable.

| Measurement type | Definition |
| --- | --- |
| Nominal level | Values of a variable classify cases into categories that have no logical order, e.g. party, country, gender. |
| Ordinal level | Values of a variable classify cases into categories that indicate a rank order, e.g. age group, education level, responses to a five-point rating scale. The intervals between the values are not meaningfully interpretable. You can only make "greater than" or "less than" statements. |
| Interval level | Values of a variable indicate an actual quantity with equal distances between the values, e.g. income, age, temperature. The intervals between values are meaningfully interpretable. |

## Measures of central tendency

Depending on a variable's level of measurement, we can calculate different measures of central
tendency. All these measures are univariate descriptive statistics and try to capture information
about the typical value of a variable.

| Measure | Definition | Measurement level |
| --- | --- | --- |
| Mode | The category that has the highest frequency. (Note that there can be more than one mode, e.g. if the distribution is bimodal.) | Can be calculated for nominal, ordinal and interval level variables. |
| Median | The value that divides all observations exactly at the midpoint of the distribution when all values are sorted. (Note that if the number of observations is even, the median is the mean of the two middle values.) | Can be calculated for ordinal and interval level variables. |
| Mean | The sum of all values divided by the number of observations: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ | Can be calculated only for interval level variables. |
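
As a small worked example of the three measures, the following Stata sketch uses five made-up values (1, 2, 2, 3, 7) typed in by hand. The variable name `x` and the values are assumptions chosen only for illustration.

```stata
* Illustrative data only: five made-up values
clear
input x
1
2
2
3
7
end

* Mode: the value with the highest frequency in the table
tabulate x

* Mean and median: the detail option reports percentiles,
* and the 50th percentile is the median
summarize x, detail
```

By hand, the mode is 2 (it occurs twice), the median is 2 (the third of the five sorted values), and the mean is (1 + 2 + 2 + 3 + 7) / 5 = 3. The mean lies above the median here because the single large value 7 pulls it upwards. With an even number of observations, say 1, 2, 3, 4, the median would be the mean of the two middle values, (2 + 3) / 2 = 2.5.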

## Department of Government, University of Essex

## A Stata exercise

Let's calculate some modes, medians and means in Stata:

Open the data set Democracy small.dta in Stata.[1] Write all commands into a do-file and don't
forget to save this file.

Use the list-command to display a list of all Western European and Scandinavian countries and
their numbers of parties in parliament. (The number of parties can be found in the enpp2000
variable.) To do this, you can use the following command:

    list country enpp2000 if region == 7 | region == 8
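
In the list-command above, `==` tests for equality and `|` means "or", so the condition selects every case whose region code is 7 or 8. As a sketch, the same selection can also be written with Stata's `inlist()` function. The lines below assume that the data set is already loaded and that the variables are named as in the command above.

```stata
* Same listing as above, written with inlist() instead of two == tests
list country enpp2000 if inlist(region, 7, 8)

* Count how many of these countries have a non-missing value on enpp2000
count if inlist(region, 7, 8) & !missing(enpp2000)
```

The count is a quick way to check how many countries actually enter the calculations that follow.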

Calculate the mode, median and mean of the enpp2000 variable for the 20 countries with
available data. First, do the calculation by hand. Then, use Stata's tab-command and
summarize-command (including its detail-option) to check your results.
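
One possible way to run this check is sketched below. It assumes that the data set is loaded and reuses the region condition from the list-command; the `display` line is an optional extra that is not part of the exercise text.

```stata
* Frequency table: the mode is the value with the highest frequency
tabulate enpp2000 if region == 7 | region == 8

* Detailed summary: reports the mean, and the median as the 50% percentile
summarize enpp2000 if region == 7 | region == 8, detail

* Optional: print the stored results of the summarize command just run
display "mean = " r(mean) "   median = " r(p50)
```

If every value in the frequency table occurs only once, no single value stands out as the mode, which is itself worth noting when you compare the output with your hand calculation.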
Using the tab-command and the summarize-command, look at the variables listed in the table.
What is the type of each variable and what is its level of measurement?

| Variable name | Variable type | Measurement level | Measure of central tendency |
| --- | --- | --- | --- |
| region | | | |
| gdppc2000 | | | |
| fhcat2000 | | | |
| aclp_democ2000 | | | |
| women2000 | | | |
| internal2000 | | | |
| boycot2000 | | | |
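
A few commands that may help when inspecting these variables are sketched below. This assumes the data set is loaded; which commands are most informative depends on each variable, so treat the selection as a suggestion and not as the expected solution.

```stata
* Storage type and labels of the variables in the table
describe region gdppc2000 fhcat2000 aclp_democ2000 women2000 internal2000 boycot2000

* Compact overview: number of observations, unique values, mean, min and max
codebook region fhcat2000, compact

* Frequency table with value labels, then with the underlying numeric codes
tabulate fhcat2000
tabulate fhcat2000, nolabel

* For a variable with many distinct values, summarize is easier to read than tab
summarize gdppc2000, detail
```

Comparing the labelled and unlabelled tables helps to decide whether the numeric codes carry a rank order or are merely names for categories.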

Now determine the most adequate measure of central tendency for each of the variables.
Calculate this measure using Stata's tab- and summarize-commands.

Compare the means of GDP per capita across the different regions in the world. You can do this
by using the bysort-prefix. Which is the richest region?

    bysort region: sum gdppc2000
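
The bysort-prefix prints one block of output per region. As an alternative sketch, assuming the data set is loaded, Stata's `tabstat` command puts the same comparison into a single table, which makes the regions easier to compare at a glance. Adding the median and the number of cases is an extra that goes beyond the exercise text.

```stata
* One row per region: mean, median and number of non-missing cases
tabstat gdppc2000, by(region) statistics(mean median n)

* A similar table of means, standard deviations and frequencies
tabulate region, summarize(gdppc2000)
```

Looking at the median next to the mean can show whether a region's mean is driven by a few very rich countries.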

[1] The data set is based on the Democracy Crossnational Data Set 2009 provided by Pippa Norris. See
www.pippanorris.com (accessed October 08, 2014).
