Fundamentals of Programming in SAS. James Blum


of trust or similar debt    545615    47.07    1159062    100.00

      The TABLE statement is not required; however, in that case, the default behavior produces a one-way frequency table for every variable in the data set. Therefore, both types of SAS variables, character or numeric, are legal in the TABLE statement. Given that variables listed in the TABLE statement are treated as categorical (in the same manner as variables listed in the CLASS statement in PROC MEANS), it is best to have the summary variables be categorical or be formatted into a set of categories.
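
      As a minimal sketch of that default behavior, the step below omits the TABLE statement entirely. It uses Sashelp.Class, a small sample data set shipped with SAS, as an assumed stand-in (it is not one of the book's data sets) so that the result stays readable; run against a data set like IPUMS2005Basic, the same step would build a table row for every distinct value of every variable.

```sas
/* No TABLE statement: PROC FREQ produces a one-way frequency table */
/* for each variable in the data set, character and numeric alike.  */
/* Sashelp.Class is an assumed stand-in, not a data set from the    */
/* book.                                                            */
proc freq data=Sashelp.Class;
run;
```

      Numeric variables with many distinct values produce long tables here, which is why listing variables explicitly in the TABLE statement is usually the better choice.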

      The default summaries in a one-way frequency table are: frequency (count), percent, cumulative frequency, and cumulative percent. Of course, the cumulative statistics only make sense if the categories are ordinal, which these are not. Many options are available in the TABLE statement to control what is displayed; Program 2.7.2 uses one of them, NOCUM, to remove the cumulative statistics.

      Program 2.7.2: PROC FREQ Option for Removing Cumulative Statistics

      proc freq data=BookData.IPUMS2005Basic;
        table metro mortgageStatus / nocum;
      run;

      As with the CLASS statement in the MEANS procedure, variables listed in the TABLE statement in PROC FREQ use the format provided with the variable to construct the categories. Program 2.7.3 uses a format defined in Program 2.5.6 to bin the MortgagePayment variable into categories and, as this is an ordinal set, the cumulative statistics are appropriate.

      Program 2.7.3: Using a Format to Control Categories for a Variable in the TABLE Statement

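      /* Bin the mortgage payment amounts into ordered categories */
      proc format;
        value Mort
          0='None'
          1-350="$350 and Below"
          351-1000="$351 to $1000"
          1001-1600="$1001 to $1600"
          1601-high="Over $1600"
        ;
      run;

      /* The FORMAT statement makes PROC FREQ count by formatted value */
      proc freq data=BookData.IPUMS2005Basic;
        table MortgagePayment;
        format MortgagePayment Mort.;
      run;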

      Output 2.7.3: Using a Format to Control Categories for a Variable in the TABLE Statement

First mortgage monthly payment

MortgagePayment   Frequency    Percent    Cumulative Frequency    Cumulative Percent
None                 603691      52.08                  603691                 52.08
$350 and Below        59856       5.16                  663547                 57.25
$351 to $1000        283111      24.43                  946658                 81.67
$1001 to $1600       128801      11.11                 1075459                 92.79
Over $1600            83603       7.21                 1159062                100.00
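
      To make the relationship between the four default statistics concrete, the following sketch recomputes them from the frequencies reported in Output 2.7.3. The data set name CheckCumulative and its variable names are hypothetical, chosen only for this illustration; the frequencies and the total of 1159062 are copied from the output above.

```sas
/* Hypothetical check: rebuild percent, cumulative frequency, and */
/* cumulative percent from the frequencies in Output 2.7.3.       */
data CheckCumulative;
  infile datalines dlm=',';
  length Category $ 16;
  input Category $ Frequency;
  retain Total 1159062;                    /* total from Output 2.7.3      */
  CumFrequency + Frequency;                /* sum statement: running total */
  Percent = 100 * Frequency / Total;
  CumPercent = 100 * CumFrequency / Total;
  format Percent CumPercent 6.2;
  datalines;
None,603691
$350 and Below,59856
$351 to $1000,283111
$1001 to $1600,128801
Over $1600,83603
;
run;

proc print data=CheckCumulative noobs;
  var Category Frequency Percent CumFrequency CumPercent;
run;
```

      The printed values should agree with Output 2.7.3 up to rounding, and the running totals are meaningful here only because the payment categories are ordered.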

      The FREQ procedure is not limited to one-way frequencies—special operators between variables in the TABLE statement allow for construction of multi-way tables.

      The * operator constructs a cross-tabular summary for two categorical variables, which includes the following statistics:

       cross-tabular and marginal frequencies

       cross-tabular and marginal percentages

       conditional percentages within each row and column

      Program 2.7.4 summarizes all combinations of Metro and MortgagePayment, with Metro formatted to add detail and MortgagePayment formatted into the bins used in the previous example.

      Program 2.7.4: Using the * Operator to Create a Cross-Tabular Summary with PROC FREQ

      /* Descriptive labels for Metro and payment bins for MortgagePayment */
      proc format;
        value METRO
          0 = "Not Identifiable"
          1 = "Not in Metro Area"
          2 = "Metro, Inside City"
          3 = "Metro, Outside City"
          4 = "Metro, City Status Unknown"
        ;
        value Mort
          0='None'
          1-350="$350 and Below"
          351-1000="$351 to $1000"
          1001-1600="$1001 to $1600"
          1601-high="Over $1600"
        ;
      run;

      /* Metro forms the rows and MortgagePayment the columns */
      proc freq data=BookData.IPUMS2005Basic;
        table Metro*MortgagePayment;
        format Metro Metro. MortgagePayment Mort.;
      run;

       The first variable listed in any request of the form A*B is placed on the rows in the table. Requesting MortgagePayment*Metro transposes the table and the included summary statistics.

       The format applied to the Metro variable is merely a change in display and has no effect on the structure of the table—it is five rows with or without the format. The format on MortgagePayment is essential to the column structure—allowing each unique value of MortgagePayment to form a column does not produce a useful summary table.
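
      Both notes can be explored with a short sketch. It assumes that the Metro. and Mort. formats from Program 2.7.4 are still defined in the session, that the BookData library is assigned, and that no format is permanently attached to either variable. The first step reverses the request so that MortgagePayment forms the rows; the second uses the NLEVELS option to count distinct values without printing the tables, which is one way to see how many columns the unformatted variable would generate.

```sas
/* Assumes the Metro. and Mort. formats from Program 2.7.4 are */
/* defined and the BookData library is assigned.               */

/* Reversed request: MortgagePayment on the rows, Metro on the columns. */
proc freq data=BookData.IPUMS2005Basic;
  table MortgagePayment*Metro;
  format Metro Metro. MortgagePayment Mort.;
run;

/* NLEVELS reports the number of levels for each listed variable; */
/* NOPRINT suppresses the frequency tables themselves. With no    */
/* FORMAT statement, the counts reflect the unformatted values.   */
proc freq data=BookData.IPUMS2005Basic nlevels;
  table Metro MortgagePayment / noprint;
run;
```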

      Output 2.7.4: Using the * Operator to Create a Cross-Tabular Summary with PROC FREQ

Table of METRO by MortgagePayment
Rows: METRO (Metropolitan status). Columns: MortgagePayment (First mortgage monthly payment).
Each cell lists Frequency, Percent, Row Pct, and Col Pct.

METRO                       Statistic              None  $350 and Below   $351 to $1000  $1001 to $1600      Over $1600           Total
Not Identifiable            Frequency             49379            6979           25488            7307            2875           92028
                            Percent                4.26            0.60            2.20            0.63            0.25            7.94
                            Row Pct               53.66            7.58           27.70            7.94            3.12
                            Col Pct                8.18           11.66            9.00            5.67            3.44
Not in Metro Area           Frequency            134314           21698           60948           10464            3351          230775
                            Percent               11.59            1.87            5.26            0.90            0.29           19.91
                            Row Pct               58.20            9.40           26.41            4.53            1.45
                            Col Pct               22.25           36.25           21.53            8.12            4.01
Metro, Inside City          Frequency             96487            4410           28866           14049           10556          154368
                            Percent                8.32            0.38            2.49            1.21            0.91           13.32
                            Row Pct               62.50            2.86           18.70            9.10            6.84
                            Col Pct               15.98            7.37           10.20           10.91           12.63
Metro, Outside City         Frequency            149961           12148           79388           56330           43155          340982
                            Percent               12.94            1.05            6.85            4.86            3.72           29.42
                            Row Pct               43.98            3.56           23.28           16.52           12.66
                            Col Pct               24.84           20.30           28.04           43.73           51.62
Metro, City Status Unknown  Frequency            173550           14621           88421           40651           23666          340909
                            Percent               14.97            1.26            7.63            3.51            2.04           29.41
                            Row Pct               50.91            4.29           25.94           11.92            6.94
                            Col Pct               28.75           24.43           31.23           31.56           28.31
Total                       Frequency            603691           59856          283111          128801           83603         1159062
                            Percent               52.08            5.16           24.43           11.11            7.21          100.00

      Various options are available to control the displayed statistics. Program 2.7.5 illustrates some of these with the result shown in Output 2.7.5.

      Program 2.7.5: Using Options in the TABLE Statement.
