Вы находитесь на странице: 1из 3

SAS Technical Screening Problem asked at CDC, Atlanta...

The following 3 datasets exist in a hypothetical system:

COUNTY
COUNTY_ID STATE_NAME COUNTY_NAME
1 Texas Collin
2 Texas Dallas
3 Georgia DeKalb

AGE_DISTRIBUTION
COUNTY_ID AGE_0_10 AGE_10_20 AGE_20_40 AGE_40_PLUS
1 100 20 40 60
2 10 10 40 50
3 45 100 56 67

AGE_DISTRIBUTION_DESC
CATEGORY_NAME CATEGORY_DESCRIPTION
AGE_0_10 < 10 years
AGE_10_20 Between 10 and 20 years
AGE_20_40 Between 20 and 40 years
AGE_40_PLUS > 40 years

Requirement 1: Write a SAS program (with ODS) to use the datasets given above to
generate HTML and MS EXCEL file that will look like the following (the files must
have the 3 tables given below):

County Age Distribution


Name Category Total Number
< 10 years 100
Between 10 and 20 years 20
Between 20 and 40 years 40
Collin > 40 years 60
< 10 years 10
Between 10 and 20 years 10
Between 20 and 40 years 40
Dallas > 40 years 50
< 10 years 45
Between 10 and 20 years 100
Between 20 and 40 years 56
DeKalb > 40 years 67
Category County
Collin Dallas DeKalb
< 10 years 100 10 45
Between 10 and 20 years 20 10 100
Between 20 and 40 years 40 40 56
> 40 years 60 50 67
Total 220 110 268

Category State
Texas Georgia
< 10 years 110 45
Between 10 and 20 years 30 100
Between 20 and 40 years 80 56
> 40 years 110 67
Total 330 268

Note: The color scheme and other style elements (bold, italics etc) MUST be
incorporated in the output.

Requirement 2: With a different dataset structure as given below, produce the same
HTML and MS EXCEL file result as given above:

COUNTY
COUNTY_ID STATE_NAME COUNTY_NAME
1 Texas Collin
2 Texas Dallas
3 Georgia DeKalb

AGE_DISTRIBUTION
COUNTY_ID AGE_DISTR_CAT_ID DISTRIBUTION_COUNT
1 1.1 100
1 2.2 20
1 3.3 40
1 4.4 50
2 1.1 10
2 2.2 10
2 3.3 40
2 4.4 50
3 1.1 45
3 2.2 100
3 3.3 56
3 4.4 67

AGE_DISTRIBUTION_DESC
AGE_DISTR_CAT_ID CATEGORY_DESCRIPTION
1.1 < 10 years
2.2 Between 10 and 20 years
3.3 Between 20 and 40 years
4.4 > 40 years

What to submit: Please provide two separate self-contained SAS Programs (datasets
and data in these datasets must be declared in the ‘DATA’ step of the programs, an
external data source should not be used) which can be executed in the SAS console. The
programs must generate a HTML file called ‘results.html’ and an MS Excel file called
‘results.xls’, the look and feel of which has been depicted earlier.

The following aspects must be kept in mind when developing the programs:
♦ Presentation: The generated HTML/MS Excel must resemble the output table
structure given above.
♦ Program efficiency: The program must be as short and efficient as possible.
♦ In each program a macro variable must be used to determine whether to produce
HTML output or MS-Excel output. We will alter the variable value (the possible
values must be documented in the program) during execution to generate the desired
output.

Check your solution with my Solution at


http://sastechies.com/members/SMiniProject4.php

..under Projects | Mini Projects 2 Try...

For more please visit http://sastechies.blogspot.com/

Вам также может понравиться