Fundamentals of Programming in SAS. James Blum
Чтение книги онлайн.
Читать онлайн книгу Fundamentals of Programming in SAS - James Blum страница 25
table Metro*MortgagePayment / nocol nopercent format=comma10.;
format Metro Metro. MortgagePayment Mort.;
run;
NOCOL and NOPERCENT suppress the column and overall percentages, respectively, with NOPERCENT also applying to the marginal totals. NOROW and NOFREQ are also available, with NOFREQ also applying to the marginal totals.
A format can be applied to the frequency statistic; however, this only applies to cross-tabular frequency tables and has no effect in one-way tables.
Output 2.7.5: Using Options in the TABLE Statement
Table of METRO by MortgagePayment | ||||||
METRO(Metropolitan status) | MortgagePayment(First mortgage monthly payment) | |||||
FrequencyRow Pct | None | $350 and Below | $351 to $1000 | $1001 to $1600 | Over $1600 | Total |
Not Identifiable | 49,37953.66 | 6,9797.58 | 25,48827.70 | 7,3077.94 | 2,8753.12 | 92,028 |
Not in Metro Area | 134,31458.20 | 21,6989.40 | 60,94826.41 | 10,4644.53 | 3,3511.45 | 230,775 |
Metro, Inside City | 96,48762.50 | 4,4102.86 | 28,86618.70 | 14,0499.10 | 10,5566.84 | 154,368 |
Metro, Outside City | 149,96143.98 | 12,1483.56 | 79,38823.28 | 56,33016.52 | 43,15512.66 | 340,982 |
Metro, City Status Unknown | 173,55050.91 | 14,6214.29 | 88,42125.94 | 40,65111.92 | 23,6666.94 | 340,909 |
Total | 603,691 | 59,856 | 283,111 | 128,801 | 83,603 | 1,159,062 |
Higher dimensional requests can be made; however, they are constructed as a series of two-dimensional tables. Therefore, a request of A*B*C in the TABLE statement creates the B*C table for each level of A, while a request of A*B*C*D makes the C*D table for each combination of A and B, and so forth. Program 2.7.6 generates a three-way table, where a cross-tabulation of Metro and HomeValue is built for each level of Mortgage Status as shown in Output 2.7.6. The VALUE statement that defines the character format $MortStatus takes advantage of the fact that value ranges are legal for character variables. Be sure to understand the difference between uppercase and lowercase letters when ordering the values of a character variable.
Program 2.7.6: A Three-Way Table in PROC FREQ
proc format;
value MetroB
0 = “Not Identifiable”
1 = “Not in Metro Area”
other = “In a Metro Area”
;
value $MortStatus
‘No’-’Nz’=’No’
‘Yes’-’Yz’=’Yes’
;
value Hvalue
0-65000=’$65,000 and Below’
65000<-110000=’$65,001 to $110,000’
110000<-225000=’$110,001 to $225,000’
225000<-500000=’$225,001 to $500,000’
500000-high=’Above $500,000’
;
run;
proc freq data=BookData.IPUMS2005Basic;
table MortgageStatus*Metro*HomeValue/nocol nopercent format=comma10.;
format MortgageStatus $MortStatus. Metro MetroB. HomeValue Hvalue.;
where MortgageStatus ne ‘N/A’;
run;
Output 2.7.6: A Three-Way Table in PROC FREQ
Table 1 of METRO by HomeValue | ||||||
Controlling for MortgageStatus=No | ||||||
METRO(Metropolitan status) | HomeValue(House value) | |||||
FrequencyRow Pct | $65,000 and Below | $65,001 to $110,000 | $110,001 to $225,000 | $225,001 to $500,000 | Above $500,000 | Total |
Not Identifiable | 10,77735.49 | 5,46017.98 | 10,41534.29 | 2,5848.51 | 1,1343.73 | 30,370 |
Not in Metro Area | 34,76640.57 | 16,26118.98 | 26,88931.38 | 5,5536.48 | 2,2272.60 | 85,696 |
In a Metro Area | 34,17618.55 | 23,70612.86 | 71,13338.60 | 33,59018.23 | 21,67811.76 | 184,283 |
Total | 79,719 | 45,427 | 108,437 | 41,727 | 25,039 | 300,349 |
Table 2 of METRO by HomeValue | ||||||
Controlling for MortgageStatus=Yes | ||||||
METRO(Metropolitan status) | HomeValue(House value) | |||||
FrequencyRow Pct | $65,000 and Below | $65,001 to $110,000 | $110,001 to $225,000 | $225,001 to $500,000 | Above $500,000 | Total |
Not Identifiable | 7,48617.55 | 7,14216.75 | 19,45345.61 | 6,46815.17 | 2,1004.92 | 42,649 |
Not in Metro Area | 24,44325.34 | 19,39620.11 | 40,66842.16 | 9,1649.50 | 2,7902.89 | 96,461 |
In a Metro Area | 26,3516.33 | 37,3458.97 | 175,48242.16 | 110,41226.52 | 66,67116.02 | 416,261 |
Total | 58,280 | 63,883 | 235,603 | 126,044 | 71,561 | 555,371 |
It is also possible for the FREQ procedure to count based on a quantitative variable using the WEIGHT statement, effectively tabulating the sum of the weights. Program 2.7.7 uses the weight statement to summarize total HomeValue for combinations of Metro and MortgagePayment.
Program 2.7.7: Using the WEIGHT Statement to Summarize a Quantitative Value.
proc freq data=BookData.IPUMS2005Basic;
table Metro*MortgagePayment /nocol nopercent format=dollar14.;
weight HomeValue;
format Metro MetroB. MortgagePayment Mort.;
run;
Output 2.7.7: Using the WEIGHT Statement to Summarize a Quantitative Value
Table of HomeValue by METRO | ||||
HomeValue(House value) | METRO(Metropolitan status) | |||
FrequencyRow Pct | Not Identifiable | Not in Metro Area | In a Metro Area | Total |
$65,000 and Below | $2,737,53012.20 | $8,736,98638.93 | $10,969,60048.88 | $22,444,116 |
$65,001 to $110,000 | $3,770,84010.74 | $9,887,45428.16 | $21,448,05261.09 | $35,106,346 |
$110,001 to $225,000 | $15,896,8547.82 | $30,632,55615.07 | $156,700,07477.10 | $203,229,484 |
$225,001 to $500,000 | $8,192,9084.80 | $10,741,2586.30 | $151,601,29488.90 | $170,535,460 |
Above $500,000 | $3,854,2802.60 | $4,735,2883.19 | $139,862,78094.21 | $148,452,348 |
Total | $34,452,412 | $64,733,542 | $480,581,800 | $579,767,754 |
2.8 Reading Raw Data
Often data is not available as a SAS data set; in practice, data often comes from external sources including raw files such as text files, spreadsheets such as Microsoft Excel ®, or relational databases such