Fundamentals of Programming in SAS. James Blum


Program 2.3.1 uses the CONTENTS and PRINT procedures to make an initial exploration of the Ipums2005Mini data set. To begin, make sure the BookData library is assigned as done in Chapter 1.

      Program 2.3.1: Using the CONTENTS and PRINT Procedures to View Data and Attributes

      proc contents data=bookdata.ipums2005mini;
        ods select variables;
      run;

      proc print data=bookdata.ipums2005mini(obs=5);
        var state MortgageStatus MortgagePayment HomeValue Metro;
      run;

       The BookData.Ipums2005Mini data set is a modification of a data set used later in this chapter, BookData.Ipums2005Basic. It subsets the original data set down to a few records and is used for illustration of these initial concepts.

       The ODS SELECT statement limits the output of a given procedure to the chosen tables, with the Variables table from PROC CONTENTS containing the names and attributes of the variables in the chosen data set. Look back to Program 1.4.4, paying attention to the ODS TRACE statement and its results, to review how this choice is made.

       The OBS= data set option limits the number of observations processed by the procedure. It is in place here simply to limit the size of the table shown in Output 2.3.1B. At various times in this text, the output shown may be limited in scope; however, the code given may not include this option for all such cases.

       The VAR statement is used in the PRINT procedure to select the variables to be shown and the column order in which they appear.

      Output 2.3.1A: Using the CONTENTS Procedure to View Attributes

Alphabetic List of Variables and Attributes

 #  Variable         Type  Len  Format
 4  CITYPOP          Num     8
 2  COUNTYFIPS       Num     8
10  City             Char   43
 6  HHINCOME         Num     8
 7  HomeValue        Num     8
 3  METRO            Num     8  BEST12.
 5  MortgagePayment  Num     8
 9  MortgageStatus   Char   45
11  Ownership        Char    6
 1  SERIAL           Num     8
 8  state            Char   57

      Output 2.3.1B: Using the PRINT Procedure to View Data

Obs  state           MortgageStatus                                 MortgagePayment  HomeValue  METRO
  1  South Carolina  Yes, mortgaged/ deed of trust or similar debt              200      32500      4
  2  North Carolina  No, owned free and clear                                     0       5000      1
  3  South Carolina  Yes, mortgaged/ deed of trust or similar debt              360      75000      4
  4  South Carolina  Yes, contract to purchase                                  430      22500      3
  5  North Carolina  Yes, mortgaged/ deed of trust or similar debt              450      65000      4
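
      The ODS SELECT and OBS= notes for Program 2.3.1 can be tried out on any data set. The following is a minimal sketch, not part of the original program: it assumes the Sashelp.Class sample table that ships with SAS as a stand-in for the book's data, and the variables Name, Age, and Height belong to that table. It also pairs OBS= with the FIRSTOBS= data set option, which Program 2.3.1 does not use. OBS= names the last observation to process, so it equals the number of observations processed only when reading starts at the first one.

```sas
/* Sketch only: Sashelp.Class is a sample table shipped with SAS,   */
/* used here as a stand-in so the BookData library is not required. */

/* Step 1: write the names of the output tables PROC CONTENTS       */
/* produces to the log (Attributes, EngineHost, Variables).         */
ods trace on;
proc contents data=sashelp.class;
run;
ods trace off;

/* Step 2: keep only the Variables table, as in Program 2.3.1. */
proc contents data=sashelp.class;
  ods select variables;
run;

/* Step 3: FIRSTOBS=3 with OBS=5 prints observations 3 through 5, */
/* because OBS= gives the last observation number to process.     */
proc print data=sashelp.class(firstobs=3 obs=5);
  var Name Age Height;
run;
```

      Running Step 1 once and reading the log is usually enough to pick the table name to place in ODS SELECT for later runs.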

      As seen in Chapter 1, SAS variable names must satisfy certain restrictions, including containing no special characters other than an underscore. This can limit the quality of the display for items such as the column headers in PROC PRINT. SAS does permit the assignment of labels to variables, substituting more descriptive text into the output in place of the variable name, as demonstrated in Program 2.3.2.

      Program 2.3.2: Assigning Labels

      proc print data=bookdata.ipums2005mini(obs=5) noobs label;

      var state MortgageStatus MortgagePayment HomeValue Metro;

      label HomeValue=’Value of Home ($)’ state=’State’;

      run;

       By default, the output from PROC PRINT includes an Obs column, which is simply the row number for the record—the NOOBS option in the PROC PRINT statement suppresses this column.

       Most SAS procedures use labels when they are provided or assigned; however, PROC PRINT defaults to using variable names. To use labels, the LABEL option is provided in the PROC PRINT statement. See Chapter Note 1 in Section 2.12 for more details.

       The LABEL statement assigns labels to selected variables. The general syntax is: LABEL variable1='label1' variable2='label2' …; where the labels are given as literal values in either single or double quotation marks, as long as the opening and closing quotation marks match.

      Output 2.3.2: Assigning Labels

State           MortgageStatus                                 MortgagePayment  Value of Home ($)  METRO
South Carolina  Yes, mortgaged/ deed of trust or similar debt              200              32500      4
North Carolina  No, owned free and clear                                     0               5000      1
South Carolina  Yes, mortgaged/ deed of trust or similar debt              360              75000      4
South Carolina  Yes, contract to purchase                                  430              22500      3
North Carolina  Yes, mortgaged/ deed of trust or similar debt              450              65000      4
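
      The quotation-mark rule for the LABEL statement and the role of the LABEL option can be seen side by side in a small sketch. This is an illustration under stated assumptions, not the book's code: it uses the Sashelp.Class sample table that ships with SAS, and the label text is invented for the example.

```sas
/* Sketch only: Sashelp.Class stands in for the book's data set; */
/* the label text below is made up for illustration.             */

/* Double quotation marks let the label contain an apostrophe;   */
/* single quotation marks work when the text has none.           */
proc print data=sashelp.class(obs=3) noobs label;
  var Name Age Height;
  label Name="Student's Name"
        Height='Height of Student';
run;

/* Same LABEL statement, but without the LABEL option in the     */
/* PROC PRINT statement: the headers fall back to variable names. */
proc print data=sashelp.class(obs=3) noobs;
  var Name Age Height;
  label Name="Student's Name"
        Height='Height of Student';
run;
```

      Comparing the two listings shows that the LABEL statement alone is not enough for PROC PRINT; the LABEL option is what switches the headers.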

      In addition to using labels to alter the display of variable names, altering the display of data values is possible with formats. The general form of a format reference is:

      <$>format<w>.<d>

      The <> symbols denote a portion of the syntax that is sometimes used or required; the <> characters themselves are not part of the syntax. The dollar sign is required for any format that applies to a character variable (character formats) and is not permitted in formats used for numeric variables (numeric formats). The w value is the total number of characters (width) available for the formatted value, while d controls the number of digits displayed after the decimal point for numeric formats. The dot is required in all format assignments, and in many cases is the means by which the SAS compiler can distinguish between a variable name and a format name. The value of format is called the format name; however, the standard numeric and character formats have a null name. For example, the 5.2 format assigns the standard numeric format with a total width of 5 and up to 2 digits displayed past the decimal point. Program 2.3.3 uses the FORMAT statement to apply formats to the HomeValue, MortgagePayment, and MortgageStatus variables.
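
      Before turning to Program 2.3.3, the pieces of a format reference can be tried in isolation. The sketch below is an illustration rather than part of the book's programs: it uses a DATA _NULL_ step with PUT statements to write formatted values to the log, and the variable names amount and region and their values are made up for the example.

```sas
/* Sketch only: amount and region are hypothetical variables      */
/* created here to show how w, d, and the $ prefix behave.        */
data _null_;
  amount = 1234.567;
  region = 'Carolina';

  put amount 8.2;         /* null format name, w=8, d=2: 1234.57   */
  put amount comma9.1;    /* commas count toward w:      1,234.6   */
  put amount dollar10.2;  /* $ and comma count toward w: $1,234.57 */
  put region $4.;         /* character format, w=4:      Caro      */
run;
```

      In each case only the displayed text changes; the values stored in amount and region are untouched.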

      Program 2.3.3: Assigning Formats

      proc print data=bookdata.ipums2005mini(obs=5) noobs label;
        var state MortgageStatus MortgagePayment HomeValue Metro;
        label HomeValue='Value of Home' state='State';
        format HomeValue MortgagePayment dollar9. MortgageStatus $1.;
      run;

       In the FORMAT statement, a list of one or more variables is followed by a format specification. Both HomeValue and MortgagePayment are assigned a dollar format with a total width of nine—any commas and dollar signs inserted by this format count toward the total width.

       The MortgageStatus variable is character and can only be assigned a character format. The $1. format is the standard character format with width one, which truncates the display of MortgageStatus to one letter, but does not alter the actual value. In general, formats assigned in procedures are
