Fundamentals of Programming in SAS. James Blum


      MortgageStatus Metro;
        var HHIncome;
      run;

      proc means data=BookData.IPUMS2005Basic nonobs n mean std;
        class Metro MortgageStatus;
        var HHIncome;
      run;

      Output 2.4.8A: Using Multiple Class Variables (Partial Listing)

Analysis Variable : HHIncome

MortgageStatus              METRO        N       Mean    Std Dev
N/A                             0    19009   31672.81   32122.89
                                1    48618   29122.73   29160.23
                                2    69201   38749.69   46226.50
                                3    73234   43325.25   42072.78
                                4    93280   36514.56   36974.63
No, owned free and clear        0    30370   46533.14   50232.50
                                1    85696   42541.06   44664.64
                                2    27286   60011.10   76580.75
                                3    76727   63925.99   75404.62
                                4    80270   55915.02   66293.39

      Output 2.4.8B: Effects of Order (Partial Listing)

Analysis Variable : HHIncome

METRO   MortgageStatus                                       N       Mean    Std Dev
0       N/A                                              19009   31672.81   32122.89
        No, owned free and clear                         30370   46533.14   50232.50
        Yes, contract to purchase                         1030   46069.26   36225.80
        Yes, mortgaged/ deed of trust or similar debt    41619   71611.01   55966.31
1       N/A                                              48618   29122.73   29160.23
        No, owned free and clear                         85696   42541.06   44664.64
        Yes, contract to purchase                         3034   42394.12   35590.14
        Yes, mortgaged/ deed of trust or similar debt    93427   62656.54   48808.66

      The same statistics are present in both tables, but the primary ordering is on MortgageStatus in Output 2.4.8A as opposed to metropolitan status (Metro) in Output 2.4.8B. Two additional items of note from this example: first, note the use of NONOBS in each. By default, using a CLASS statement always produces a column for the number of observations in each class level (NOBS), and this may be different from the statistic N due to missing data, but that is not an issue for this example. Second, the numeric values of Metro really have no clear meaning. Titles and footnotes, as shown in Chapter 1, are available to add information about the meaning of these numeric values. However, a better solution is to build a format and apply it to that variable, a concept covered in the next section.
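      To see the column that NONOBS suppresses, the option can simply be left off. The following is a minimal sketch rather than one of the book's numbered programs; it reuses the data set and variables from the example above, and its output is not shown here.

```sas
/* Sketch: the same summary without NONOBS, so the default N Obs column appears. */
proc means data=BookData.IPUMS2005Basic n mean std;
  class Metro;      /* one row per level of Metro */
  var HHIncome;     /* N counts only nonmissing HHIncome values */
run;
```

      Within a class level, N Obs and N agree whenever HHIncome has no missing values; a smaller N indicates missing data in that level.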

      As seen in Section 2.3, SAS provides a variety of formats for altering the display of data values. It is also possible to define formats using the FORMAT procedure. These formats assign replacements for individual data values or for groups or ranges of values, and they may be permanently stored in a library for subsequent use. Formats, both native SAS formats and user-defined formats, are invaluable tools used in a variety of contexts throughout this book.

      The FORMAT procedure provides the ability to create custom formats, both for character and numeric variables. The principal tool used in writing formats is the VALUE statement, which defines the name of the format and its rules for converting data values to formatted values. Program 2.5.1 gives an example of a format written to improve the display of the Metro variable from the BookData.IPUMS2005Basic data set.

      Program 2.5.1: Defining a Format for the Metro Variable

      proc format;
        value Metro
          0 = "Not Identifiable"
          1 = "Not in Metro Area"
          2 = "Metro, Inside City"
          3 = "Metro, Outside City"
          4 = "Metro, City Status Unknown"
        ;
      run;

       The VALUE statement tends to be rather long given the number of items it defines. Remember, SAS code is generally free-form outside of required spaces and delimiters, along with the semicolon that ends every statement. Adopt a sound strategy for using indentation and line breaks to make code readable.

       The VALUE statement requires the format name, which follows the SAS naming conventions of up to 32 characters, but with some special restrictions. Format names must meet an additional restriction of being distinct from the names of any formats supplied by SAS. Also, given that numbers are used to define format widths, a number at the end of a format name would create an ambiguity in setting lengths; therefore, format names cannot end with a number. If the format is for character values, the name must begin with $, and that character counts toward the 32-character limit.

       In this format, individual values are set equal to their replacements (as literals) for all values intended to be formatted. Values other than 0, 1, 2, 3, and 4 may not appear as intended. For a discussion of displaying values other than those that appear in the VALUE statement, see Chapter Note 4 in Section 2.12.

       The semicolon that ends the VALUE statement is set on its own line here for readability, which makes it easy to verify that it is present.
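      The naming rule for character formats and the handling of unlisted values can be illustrated with a short sketch. It is not taken from the book: the format names MetroAll and $YesNo, the label "Unexpected Code", and the Y/N values are hypothetical, and the OTHER keyword is shown as one common way to label values that are not listed explicitly (the book's own treatment is in Chapter Note 4 in Section 2.12).

```sas
proc format;
  /* Numeric format (hypothetical name): OTHER labels any value not listed. */
  value MetroAll
    0 = "Not Identifiable"
    1 = "Not in Metro Area"
    2 = "Metro, Inside City"
    3 = "Metro, Outside City"
    4 = "Metro, City Status Unknown"
    other = "Unexpected Code"
  ;
  /* Character format (hypothetical): the name must begin with $. */
  value $YesNo
    "Y" = "Yes"
    "N" = "No"
  ;
run;
```

      Both VALUE statements sit inside a single PROC FORMAT step, and neither name ends in a number. Note that OTHER is expected to pick up missing values as well unless they are given their own label.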

      Submitting Program 2.5.1 creates a format named Metro in the format catalog in the Work library. The format takes effect only when it is used, and it is used in effectively the same manner as a format supplied by SAS. Program 2.5.2 applies the Metro format to the class variable Metro to alter the appearance of its values in Output 2.5.2. Because the variable Metro and the format Metro have the same name, and no width is required, the only syntax element that distinguishes them to the SAS compiler is the required dot (.) in the format name.

      Program 2.5.2: Using the Metro Format

      proc means data=BookData.IPUMS2005Basic nonobs maxdec=0;
        class Metro;
        var HHIncome;
        format Metro Metro.;
      run;

      Output 2.5.2: Using the Metro Format

Analysis Variable : HHIncome

METRO                              N    Mean   Std Dev   Minimum   Maximum
Not Identifiable               92028   54800     52333    -19998   1076000
Not in Metro Area             230775   47856     45547    -29997   1050000
Metro, Inside City            154368   60328     70874    -19998   1391000
Metro, Outside City           340982   77648     75907    -29997   1739770
Metro, City Status Unknown    340909   64335     66110    -22298   1536000
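      Because Program 2.5.1 writes the Metro format to the Work library, the format is gone when the SAS session ends. A minimal sketch of storing it permanently follows; the libref MyFmts and its path are hypothetical placeholders, and this is one possible setup rather than the book's own program.

```sas
/* Hypothetical location; replace with a folder you can write to. */
libname MyFmts "/path/to/format/library";

/* LIBRARY= writes the format to the MyFmts.Formats catalog instead of Work. */
proc format library=MyFmts;
  value Metro
    0 = "Not Identifiable"
    1 = "Not in Metro Area"
    2 = "Metro, Inside City"
    3 = "Metro, Outside City"
    4 = "Metro, City Status Unknown"
  ;
run;

/* Add the library to the format search path so later steps can find Metro. */
options fmtsearch=(MyFmts);
```

      In a later session, only the LIBNAME and OPTIONS statements should need to be resubmitted before the format is used in a FORMAT statement.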

      For the Metro variable, a simplified format that distinguishes metro, non-metro, and non-identifiable observations may be preferable. Program 2.5.3 contains two approaches to this, the first being clearly the more efficient.

      Program 2.5.3: Assigning Multiple Values to the Same Formatted Value

      proc format;

      value MetroB

      0
