Running Head: SAS ANALYSIS PROJECT

SAS Analysis Project

Kelly Erazo
COH 602 Biostatistics
Professor Alan Smith

Death at any age is saddening, especially when it means the loss of a loved one. Death at a
young age from disease makes that loss harder still. High blood pressure, for instance, is fairly
common and is associated with death in some cases. The research question addressed in this
analysis is therefore: Is there a relationship between high blood pressure and death at a young
age versus death at an old age? Answering it requires defining "young age" and "old age" before
analyzing the data. It is hypothesized that people with high blood pressure die at a young age
rather than an old age.
The dataset used to answer this research question is the Framingham Heart Study data,
SASHELP.HEART, which comes from the ongoing Framingham Heart Study. From the variables
available in SASHELP.HEART, the following were selected for the analysis:
Variable Name    Label
AGEATDEATH       Age at Death
BP_STATUS        Blood Pressure Status
STATUS           Status of life (dead or alive)

The Framingham Heart Study Data, SASHELP.HEART, provided the following information:

Age at Death ranged from 36 to 93, and a total of 1991 people died.
Blood Pressure Status was rated as 1 = High, 2 = Normal, and 3 = Optimal.
Status was categorized as 1 = alive and 2 = dead.
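
The ranges and counts listed above can be checked directly against the data. The following is a minimal sketch, assuming the variable names AgeAtDeath, BP_Status, and Status as they are used elsewhere in this paper. It is not part of the code that produced the reported results.

```sas
/* Count, number missing, minimum, and maximum of age at death.
   PROC MEANS excludes missing values from N, MIN, and MAX. */
PROC MEANS DATA=sashelp.heart N NMISS MIN MAX;
  VAR ageatdeath;
RUN;

/* Levels and frequencies of the two categorical variables */
PROC FREQ DATA=sashelp.heart;
  TABLES bp_status status;
RUN;
```

If every participant recorded as dead has an age at death, the N reported by PROC MEANS should match the number of deaths.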

First, the data were sorted by blood pressure status using PROC SORT so that later procedures
could be run separately for each blood pressure group. After sorting, PROC FREQ was used to
produce frequency tables for blood pressure status, age at death, and status. However, the table
for blood pressure status reports frequencies for all participants, not only those who have died
(that is, those with a recorded age at death).
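
One way to restrict the blood pressure frequencies to participants who have died is to filter on age at death. This is a sketch only, assuming that a missing AgeAtDeath identifies participants who are still alive:

```sas
/* Frequency of blood pressure status among participants
   with a recorded (non-missing) age at death */
PROC FREQ DATA=sashelp.heart;
  WHERE ageatdeath IS NOT MISSING;
  TABLES bp_status;
RUN;
```

The WHERE statement filters rows before the table is built, so the percentages are taken over deaths rather than over all participants.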
Then, univariate statistics for age at death were produced using PROC UNIVARIATE, grouped by
blood pressure status. This produced the mean, median, mode, and standard deviation for those
with high, normal, and optimal blood pressure. For this analysis, however, we use only the
results for those with high blood pressure.
Lastly, PROC SGPLOT was used to create histograms of age at death for those with high, normal,
and optimal blood pressure. Again, we focus only on the histogram for those with high blood
pressure.
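
Because only the high blood pressure group is interpreted, the statistics and histogram could also be requested for that group alone. The sketch below assumes that BP_Status stores the character value 'High', as the BY-group title in the output suggests. If the variable is coded numerically, the WHERE condition would need to change.

```sas
/* Univariate statistics and histogram for the high blood pressure group only.
   Assumption: bp_status holds the character value 'High'. */
PROC UNIVARIATE DATA=sashelp.heart;
  WHERE bp_status = 'High';
  VAR ageatdeath;
  HISTOGRAM ageatdeath;
RUN;
```

With a WHERE statement in place of a BY statement, no prior PROC SORT is needed.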
Results of Descriptive Analysis:
Age at death ranged from 36 to 93 years. To define "young age" and "old age," we divide this
range in half at its midpoint, (36 + 93)/2 = 64.5, rounded up to 65. Therefore, anyone who died
below the age of 65 is considered of young age, and anyone who died at or above 65 is considered
of old age. The table below displays the univariate statistics for age at death among those with
high blood pressure. Of the 1991 people who died, 1172 had high blood pressure; in other words,
about 59% (1172/1991) of those who died had high blood pressure. The mean is 70.95, or about 71
years, the median is 72, and the mode is 80. The mode is the most frequent age at death among the
participants, and it falls in the old age category under our definition. This is surprising, since
the mode was expected to be under 65. The histogram below displays age at death for high blood
pressure; it is negatively skewed.
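
The young/old definition can also be applied to the data so that deaths in each category are counted directly. The following is a sketch under stated assumptions: the data set name deaths and the variable age_group are hypothetical names introduced here, and the cutoff of 65 is the one defined above.

```sas
/* Keep participants with a recorded age at death and label each
   as Young (< 65) or Old (>= 65). Names deaths and age_group are
   hypothetical, introduced for this illustration. */
DATA deaths;
  SET sashelp.heart;
  WHERE ageatdeath IS NOT MISSING;
  LENGTH age_group $5;
  IF ageatdeath < 65 THEN age_group = 'Young';
  ELSE age_group = 'Old';
RUN;

/* Cross-tabulate blood pressure status by age group.
   NOCOL and NOPERCENT leave counts and row percentages only. */
PROC FREQ DATA=deaths;
  TABLES bp_status*age_group / NOCOL NOPERCENT;
RUN;
```

The row percentages for the high blood pressure row would show the share of those deaths that fall under 65.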

The UNIVARIATE Procedure
Variable: AgeAtDeath (Age at Death)
Blood Pressure Status=High

Moments
N                   1172          Sum Weights         1172
Mean                70.9505119    Sum Observations    83154
Std Deviation       10.2849716    Variance            105.78064
Skewness            -0.3994379    Kurtosis            -0.1113993
Uncorrected SS      6023688       Corrected SS        123869.13
Coeff Variation     14.4959794    Std Error Mean      0.30042723

Basic Statistical Measures
Location                  Variability
Mean      70.95051        Std Deviation          10.28497
Median    72.00000        Variance               105.78064
Mode      80.00000        Range                  57.00000
                          Interquartile Range    15.00000
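
Several of the moments in the output above follow from N, the mean, and the standard deviation. As a check using the rounded values, with s the standard deviation and x-bar the mean:

```latex
\begin{aligned}
\text{Std Error Mean} &= \frac{s}{\sqrt{N}} = \frac{10.285}{\sqrt{1172}} \approx 0.3004 \\
\text{Coeff Variation} &= 100 \cdot \frac{s}{\bar{x}} = 100 \cdot \frac{10.285}{70.951} \approx 14.50 \\
\text{Corrected SS} &= (N - 1)\, s^{2} = 1171 \times 105.78064 \approx 123869.13
\end{aligned}
```

These agree with the reported values of 0.30042723, about 14.496, and 123869.13.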

After completing the descriptive analysis, we carried out inferential analyses as well. The
first was an ANOVA using blood pressure status, age at death, and status of life. Then, linear
regression and logistic regression analyses were run on the dataset, also using blood pressure
status, age at death, and status of life.
Results of Inferential Analyses:
The ANOVA procedure produced an interaction plot for age at death, with age at death as the
dependent variable on the vertical axis and blood pressure status as the independent variable on
the horizontal axis. The high blood pressure category visibly contains more plotted points than
the other two categories, suggesting more participants in that category. This is confirmed by the
univariate analysis above, which showed that 1172 of the 1991 participants who died had high
blood pressure. After the ANOVA procedure, a linear regression analysis was also computed. The
table on page 18 of the results shows the distribution of age at death for each category of blood
pressure. A closer look at the high blood pressure distribution shows that the mean (70.95) is
slightly below the median (72). Lastly, a logistic regression analysis was computed. Notice how
the p-value remains the same across ages until the ages approach the mean: at age 69 the p-value
is 0.0512, at 70 it is 0.8522, and at 71 it is 0.0156.
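
The logistic step as run uses age at death itself as the response. One alternative framing is offered here only as a sketch and is not the analysis reported in this paper: model a binary indicator of death before 65 against blood pressure status. The names deaths and young_death are hypothetical, and the reference level 'Optimal' assumes that BP_Status is stored as character values.

```sas
/* Hypothetical indicator young_death: 1 if death before age 65, else 0.
   Only participants with a recorded age at death are kept. */
DATA deaths;
  SET sashelp.heart;
  WHERE ageatdeath IS NOT MISSING;
  young_death = (ageatdeath < 65);
RUN;

/* Binary logistic regression of young death on blood pressure status.
   Assumption: 'Optimal' is a level of bp_status, used as the reference. */
PROC LOGISTIC DATA=deaths;
  CLASS bp_status (REF='Optimal') / PARAM=REF;
  MODEL young_death (EVENT='1') = bp_status;
RUN;
```

Under this framing, the odds ratio for High versus Optimal would compare the odds of dying before 65 between the two groups.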
Conclusion:
The dataset from the Framingham Heart Study provided a large sample for analyzing high blood
pressure among those who have died. We hypothesized that, among people with high blood
pressure, more died at a young age than at an old age. For analytical purposes, we defined young
age as death under the age of 65 and old age as death at or above the age of 65. The results
showed that the average age at death was 70.95, or about 71 years, which under this definition is
an old age. The median was 72, which means about half of the participants died at or before age
72 and about half at or after it; this, too, is an old age. Furthermore, the mode was 80, meaning
it was the most frequent age at death among the participants, and it is also an old age. The
histogram displayed above is negatively skewed, which means that the majority of participants
lie toward the right of the histogram, in the old age category. Surprisingly, the hypothesis was
not supported: despite having high blood pressure, more people died at an old age than at a
young age, meaning at or above 65 years old.

The following code was used to produce the results for this analysis:
/* List the variables in the data set */
PROC CONTENTS DATA=sashelp.heart; RUN;
/* Sort by blood pressure status so later steps can use BY bp_status */
PROC SORT DATA=sashelp.heart OUT=temp; BY bp_status; RUN;
/* Frequency tables, univariate statistics, and histograms */
PROC FREQ DATA=temp; TABLES bp_status ageatdeath status; RUN;
PROC UNIVARIATE DATA=temp; VAR ageatdeath; BY bp_status; RUN;
PROC SGPLOT DATA=temp; HISTOGRAM ageatdeath; BY bp_status; RUN;
/* ANOVA with interaction term and Tukey comparisons */
PROC GLM DATA=temp; CLASS bp_status status;
  MODEL ageatdeath = bp_status status bp_status*status;
  MEANS bp_status status / TUKEY; RUN;
/* Model with blood pressure status only */
PROC GLM DATA=temp; CLASS bp_status status; MODEL ageatdeath = bp_status; RUN;
/* Logistic regression */
PROC LOGISTIC DATA=temp; CLASS bp_status status;
  MODEL ageatdeath = bp_status status; RUN;

The SAS results can be found in the PDF file named SAS Heart Data-Death and Blood Pressure 2.