SAS Basics

Windows

Program Editor: Write/edit all your statements here.

Log: Watch this for any errors in the program as it runs.

Output: Will automatically pop in front when there is output. Does not need to occupy screen space during program editing.

File Organization

Create subfolders in your Project folder for:

Data: Contains SAS datasets, with .sd2 extension.

Formats: Compiled version of formats, a file with .sc2 extension. Used for building classes of variables for looking at frequencies.

Output: Save output files here. These are text files with a .lst extension.

Programs: All programs are text files with a .sas extension.

Creating a dataset

Internal Data

DATA datasetname;
INPUT name $ sex $ age;
CARDS;
John M 23
Betty F 33
Joe M 50
;
RUN;

Creating a dataset

External Data

DATA datasetname;
INFILE 'c:\folder\subfolder\file.txt';
INPUT name $ sex $ age;
RUN;

Creating from an existing one

DATA save.data2 (keep = age income);
SET save.data1;
RUN;

DATA save.data2;
SET save.data1;
DROP age;
TAX = income*0.28;
RUN;

Permanent Data Sets


LIBNAME save 'c:\project\data';
DATA save.data1;
X=25;
Y=X*2;
RUN;
Note that save is merely a name you
make up to point to a location where
you wish to save the dataset called
data1. (It will be saved as data1.sd2)

What's in my SAS dataset?

PROC CONTENTS data=save.data1;
RUN;

PROC CONTENTS data=save.data1 POSITION;
RUN;

The POSITION option prints the variable list sorted alphabetically, plus a second list sorted by position (the sequence in which the variables actually exist in the file).

Viewing file contents

PROC PRINT data=save.data1;
RUN;

PROC PRINT data=save.data1 (obs=5);
VAR name age;
RUN;

PROC PRINT data=save.data1 (obs=12);
VAR age -- income;
RUN;

Frequencies/Crosstabs

PROC FREQ data=save.data1;
TABLES age income trades;
RUN;

PROC FREQ data=save.data1;
TABLES age*sex;
RUN;

Scatter Plot

PROC PLOT data=save.data1;
PLOT Y*X;
RUN;

Creating a Format Library

PROC FORMAT LIBRARY=LIBRARY;
VALUE BG
0 = 'BAD'
1 = 'GOOD'
-1 = 'MISSING'
;
VALUE TWO
-1 = 'MISSING'
-2 = 'NO RECORD'
-3 = 'INQS. ONLY'
-4 = 'PR ONLY'
0 = '0'
1 = '1'
1<-HIGH = '2+'
;
RUN;

Applying a format to a variable

PROC DATASETS library=save;
MODIFY data1;
FORMAT trades ten.;
RUN;
QUIT;

This applies the format called ten to the variable trades. A subsequent PROC FREQ step for trades will show the format applied. Note that ten must already exist in the format library for this to work.
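
As a minimal sketch of that prerequisite, the step below defines a format named ten so that it exists before it is applied. Only the name ten comes from the slide: the value ranges and labels are invented for illustration, and the folder path is an assumption based on the Formats subfolder described under File Organization.

```sas
/* Hypothetical path: point the LIBRARY libref at your own Formats subfolder. */
LIBNAME LIBRARY 'c:\project\formats';

PROC FORMAT LIBRARY=LIBRARY;
  /* Illustrative ranges only; the real definition of ten is not shown in the source. */
  VALUE ten
    LOW-<0  = 'SPECIAL CODE'
    0-<10   = '0-9'
    10-HIGH = '10+'
  ;
RUN;
```

SAS searches the format catalog under the libref LIBRARY by default, so once this step has run, a FORMAT statement that refers to ten. should be able to find it.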

Applying a format: Method 2

Data save.data2;
SET save.data1;
FORMAT
trades bktrds ten.
totbal mileage. ;
RUN;
This is another way to apply formats when
creating a new dataset (data2) from a previous
one (data1) that has unformatted variables.

Random Selection of Obs.

DATA save.new;
SET save.old;
Random1 = RANUNI(254987)*100;
IF Random1 > 50 THEN OUTPUT;
RUN;

The function RANUNI requires a seed number, and then produces random values between 0 and 1. Here the value is multiplied by 100 and stored under the variable name Random1 (you can choose any name), so Random1 ranges from 0 to 100. The above program will create new.sd2, with about half the observations of old.sd2, randomly chosen.
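
A slightly shorter variant, offered as a sketch rather than the slide's own method, compares the RANUNI value with 0.5 directly and uses a subsetting IF, so no helper variable is stored in the new dataset. The dataset names and seed are the ones used above; 0.5 is the same 50% cutoff expressed on the 0 to 1 scale.

```sas
DATA save.new;
  SET save.old;
  /* Subsetting IF: keep the observation only when the random draw exceeds 0.5. */
  IF RANUNI(254987) > 0.5;
RUN;
```

Changing the cutoff changes the approximate sampling fraction; for example, comparing against 0.9 would keep roughly 10% of the observations.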

Sorting and Merging Datasets

PROC SORT data=save.junk;
BY Age Income;
RUN;

PROC SORT data=save.junk OUT=save.neat;
BY acctnum;
RUN;

PROC SORT data=save.junk NODUPKEY;
BY something;
RUN;

Sorting and Merging Datasets

PROC SORT data=save.one;
BY Acctnum;
RUN;

PROC SORT data=save.two;
BY Acctnum;
RUN;

DATA save.three;
MERGE save.one save.two;
BY Acctnum;
RUN;

Sorting and Merging Datasets

DATA save.three;
MERGE save.one (IN = a) save.two;
BY Acctnum;
IF a;
RUN;

Using Arrays

DATA save.new;
SET save.old;
ARRAY vitamin(6) a b c d e k;
DO i = 1 to 6;
IF vitamin(i) = -5 THEN vitamin(i) = .;
END;
RUN;

This assumes you have 6 variables called a, b, c, d, e, and k in save.old. This program will modify all 6 such that any instance of a -5 value is converted to a missing value.
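
A small variation on the same idea, given here as a sketch: DIM returns the number of elements in the array, so the loop bound does not have to be typed by hand, and DROP keeps the loop counter i out of the new dataset. The variable and dataset names are the ones used above.

```sas
DATA save.new;
  SET save.old;
  ARRAY vitamin(*) a b c d e k;
  /* Loop over every element, however many variables the array lists. */
  DO i = 1 TO DIM(vitamin);
    IF vitamin(i) = -5 THEN vitamin(i) = .;
  END;
  DROP i;   /* the counter is only needed inside this step */
RUN;
```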

Simple Correlations

PROC CORR data=save.relative;
VAR tvhours study;
RUN;

PROC CORR data=save.relative;
VAR tvhours study;
WITH Score;
RUN;
