
PH6420 Fall 2015: Assignment 1

Due: September 28, 2015

This assignment will give you practice reading data into SAS from several file formats. There are 3
parts to the assignment. You may create one program by appending the code for each part or create
separate programs for each part. You may find it helpful to write the program on a piece of paper before
you work on the computer. It is very important to check the log each time you run your program to
make sure there are no errors.
What to hand in?
Include the SAS program for each part along with your answers to any questions. The file can be a text
file or you can insert the code into Word. If you put the code into Word you should choose a fixed-width
font such as Courier. The answers to the questions can be put in the same document. You may include
output if you like but do not include long listings. Remember to answer any questions that are in italics.
All data files are in the data sets section of the class website.
1. The data file students.csv contains made-up data on 12 students. The content of the file is as
follows:
students.csv
F,23,S,15,MN
F,21,S,15,WI
F,22,S,09,MN
F,35,M,02,MN
F,22,M,13,MN
F,25,S,13,WI
M,20,S,13,MN
M,26,M,15,WI
M,27,,05,MN
M,,S,14,IA
M,21,S,14,MN
M,29,M,15,MN

The variables are gender (F or M), age, marital status (S or M), number of credits taken, and state
of residence (MN, WI, or IA). Note there is missing data in rows 9 and 10.
a. If you have not already done so, download the file to your PC, noting the folder where it is saved.
Write a SAS program that reads the data from students.csv and creates a SAS dataset called
class. Name the variables gender, age, marstat, credits, and state. The variables gender,
marstat, and state are character variables; age and credits are numeric variables. The first two
statements will include the following:
DATA class;
INFILE "C:\folderpath\students.csv" more options ;
Display the list of variables on the dataset using PROC CONTENTS. Also display the values
of the variables using PROC PRINT. Run the program, making sure the data is read in
correctly.
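As a minimal sketch of the kind of step part 1a describes, the program below reads a few of the rows shown above from inline DATALINES rather than from the file, so it runs without any path. The dataset name class_demo is hypothetical and chosen only for illustration. To read the real file, the INFILE statement would name the quoted path instead of DATALINES. The DSD option is what lets two consecutive commas be read as a missing value, which matters for rows 9 and 10.

```sas
/* Sketch only: a subset of the students.csv rows, read inline.      */
/* class_demo is a hypothetical name used for this illustration.     */
DATA class_demo;
  /* DSD: comma-delimited, and ",," is treated as a missing value.   */
  /* TRUNCOVER: do not jump to the next line if a record is short.   */
  INFILE DATALINES DSD TRUNCOVER;
  /* A $ after a name marks it as a character variable.              */
  INPUT gender $ age marstat $ credits state $;
  DATALINES;
F,23,S,15,MN
F,21,S,15,WI
M,27,,05,MN
M,,S,14,IA
;
RUN;

/* List the variables with their types and lengths. */
PROC CONTENTS DATA=class_demo;
RUN;

/* Show the values; missing marstat prints as blank, missing age as a period. */
PROC PRINT DATA=class_demo;
RUN;
```

After running, the log should report 4 observations and 5 variables for this sketch; the full file would give 12 observations.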

2. The data file bp.txt is a tab delimited file containing blood pressure and other data on 100
patients. The variables in order are the patient ID, clinical center, age, sex, diastolic BP at
baseline, 6-months, and 12-months (3 variables), and systolic BP at baseline, 6-months, and 12months (3 variables).
a. Write a SAS program that reads the data from bp.txt and creates a SAS dataset called bp1.
Name the variables ptid, clinic, age, sex, dbpbl, dbp6, dbp12, sbpbl, sbp6, and sbp12.
Variables ptid and clinic are character variables; all other variables are numeric. Display
the list of variables on the dataset using PROC CONTENTS. Also, display the values of
the variables using PROC PRINT to verify the variables are read-in correctly.
*** Note: Since the first row of the file contains column headings you will need to use
the FIRSTOBS=2 option on the INFILE statement to skip over this row.
b. Instead of using a DATA step, use PROC IMPORT to read the file, creating a dataset called
bp2. Run PROC CONTENTS and PROC PRINT as before on this dataset.
Compare the PROC CONTENTS from part a with the PROC CONTENTS from part b. What is
the length of the variable clinic in dataset bp1 compared to dataset bp2?
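The two approaches in part 2 might be sketched as follows. This is an illustration under assumptions, not a definitive solution: the folder in the path is the same placeholder used in part 1 and must be replaced with the actual location of bp.txt, and the variable names PROC IMPORT creates depend on the column headings in the file, which are not shown here.

```sas
/* Part a style: DATA step with list input.                          */
/* The path is a placeholder; substitute the folder holding bp.txt.  */
DATA bp1;
  /* DLM='09'x sets the delimiter to the tab character.              */
  /* DSD treats two adjacent tabs as a missing value.                */
  /* FIRSTOBS=2 skips the row of column headings.                    */
  INFILE "C:\folderpath\bp.txt" DLM='09'x DSD TRUNCOVER FIRSTOBS=2;
  /* With list input and no LENGTH statement, character variables    */
  /* default to a length of 8.                                       */
  INPUT ptid $ clinic $ age sex dbpbl dbp6 dbp12 sbpbl sbp6 sbp12;
RUN;

/* Part b style: PROC IMPORT chooses names, types, and lengths by    */
/* scanning the file, so they can differ from the DATA step result.  */
PROC IMPORT DATAFILE="C:\folderpath\bp.txt" OUT=bp2 DBMS=TAB REPLACE;
  GETNAMES=YES;   /* take variable names from the first row */
RUN;

/* Compare the variable attributes of the two datasets. */
PROC CONTENTS DATA=bp1;
RUN;

PROC CONTENTS DATA=bp2;
RUN;
```

Because PROC IMPORT infers types from the data, an ID-like column that contains only digits may come in as numeric rather than character, which is worth checking in the PROC CONTENTS output.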
3. Download the file called tomhs.dat from the class webpage. The TOMHS data dictionary on the
class website gives the list of variables on the file, the locations (column positions) of the
variables on the file, and whether the variable is character or numeric.
a. Write a program to read in the variables ptid, clinic, group, age, and sex and create a SAS
dataset containing these variables (name the dataset whatever you like). Use the
POINTER/INFORMAT method of reading in the data.
b. Display all variables using PROC PRINT. Using the OBS= dataset option on the procedure, limit
the number of observations displayed to 10.
c. Using PROC MEANS, display the average age of all patients. What is the average age?

d. Using PROC FREQ display the number of men and women in the study. How many men and
how many women are in the study? Note: men are coded as 1 and women are coded as 2.
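The steps in part 3 can be sketched on a small made-up example. Everything about the layout below is an assumption for illustration: the column positions, field widths, variable types, and the three records are invented, and the dataset name tomhs_demo is hypothetical. The real positions and types for tomhs.dat come from the TOMHS data dictionary.

```sas
/* Sketch only: invented fixed-width records, NOT the TOMHS layout.  */
DATA tomhs_demo;
  INFILE DATALINES TRUNCOVER;
  /* Pointer/informat style: @n moves to column n, then the informat */
  /* says how many columns to read ($w. character, w. numeric).      */
  INPUT @1  ptid   $6.
        @8  clinic $1.
        @10 group  1.
        @12 age    2.
        @15 sex    1.;
  DATALINES;
P00001 A 1 45 1
P00002 B 2 52 2
P00003 A 3 61 1
;
RUN;

/* OBS=10 is a dataset option, so it goes in parentheses after the name. */
PROC PRINT DATA=tomhs_demo (OBS=10);
RUN;

/* MEAN restricts the output to the average; VAR picks the variable. */
PROC MEANS DATA=tomhs_demo MEAN;
  VAR age;
RUN;

/* One-way frequency table of sex (1 = men, 2 = women). */
PROC FREQ DATA=tomhs_demo;
  TABLES sex;
RUN;
```

For the real file, the INFILE statement would name the quoted path to tomhs.dat instead of DATALINES, and each @ position and informat width would be taken from the data dictionary.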

Optional things to try:


1. Take a spreadsheet file from your PC and read it into SAS using PROC IMPORT or using a
DATA step. If you use a DATA step you will need to first save the file as a csv or txt file.