Fundamentals of Programming in SAS. James Blum


       2.10 Validation

       2.11 Wrap-Up Activity

       2.12 Chapter Notes

       2.13 Exercises

      At the conclusion of this chapter, mastery of the concepts covered in the narrative includes the ability to:

       Apply the MEANS procedure to produce a variety of quantitative summaries, potentially grouped across several categories

       Apply the FREQ procedure to produce frequency and relative frequency tables, including cross-tabulations

       Categorize data for analyses in either the MEANS or FREQ procedures using internal SAS formats or user-defined formats

       Formulate a strategy for selecting only the necessary rows when processing a SAS data set

       Apply the DATA step to read data from delimited or fixed-position raw text files

       Describe the operations carried out during the compilation and execution phases of the DATA step

       Compare and contrast the input buffer and program data vector

       Apply DATA step statements to assist in debugging

       Apply the COMPARE procedure to compare and validate a data set against a standard

      Use the concepts of this chapter to solve the problems in the wrap-up activity. Additional exercises and case studies are also available to test these concepts.

      This section introduces a case study that is used as a basis for most of the concepts and associated activities in this book. The data comes from the Current Population Survey by the Integrated Public Use Microdata Series (IPUMS CPS). IPUMS CPS contains a wide variety of information; only a subset of the data collected from 2001-2015 is included in the examples here. Further, the data is introduced in segments, starting with simple sets of variables and eventually adding more information that must be assembled to achieve the objectives of each section.

      This chapter works with data that includes household-level information from the 2005 and 2010 IPUMS CPS data sets of over one million observations each. Included are variables on state, county, metropolitan area/city, household income, home value, mortgage status, ownership status, and mortgage payment. Outputs 2.2.1 through 2.2.4 show tabular summaries from the 2010 data, including quantitative statistics, frequencies, and/or percentages. Reproducing these tables in the wrap-up activity in Section 2.11 is the primary objective for this chapter.

      The first sample output, Output 2.2.1, shows a set of six statistics on mortgage payments across metropolitan status for mortgages of $100 per month or more. In order to make this table, and the slightly more complicated Output 2.2.2, several components of the MEANS procedure must be understood.

      Output 2.2.1: Basic Statistics on Mortgage Payments Grouped on Metropolitan Status

Analysis Variable : MortgagePayment Mortgage Payment

Metro                              N     Mean   Median  Std Dev  Minimum  Maximum
Not Identifiable               42927    970.2    800.0    668.5    100.0   7400.0
Not in Metro Area              97603    815.0    670.0    576.0    100.0   6800.0
Metro, Inside City             56039   1363.5   1100.0    974.8    100.0   7400.0
Metro, Outside City           185967   1480.8   1300.0    974.7    100.0   7400.0
Metro, City Status Unknown    163204   1233.2   1000.0    846.4    100.0   7400.0
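
      As a rough illustration of the kind of MEANS procedure call that could produce a table shaped like Output 2.2.1, consider the sketch below. It is not the book's solution: the data set name work.ipums2010 is hypothetical, the choice of the NONOBS and MAXDEC= options is inferred from the layout of the output, and any format that turns Metro codes into the text categories shown is omitted.

```sas
/* Sketch only: work.ipums2010 is a hypothetical data set name. */
proc means data=work.ipums2010 n mean median std min max maxdec=1 nonobs;
  class Metro;                   /* one row of statistics per Metro category */
  var MortgagePayment;           /* analysis variable */
  where MortgagePayment ge 100;  /* mortgages of $100 per month or more */
run;
```

      The six statistic keywords in the PROC MEANS statement correspond to the six columns of statistics in the output, and the WHERE statement restricts the rows that are summarized.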

      Output 2.2.2: Minimum, Median, and Maximum on Mortgage Payments Across Multiple Categories

Metro                       Household Income  Variable         Label             Minimum  Median  Maximum
Metro, Inside City          Negative          MortgagePayment  Mortgage Payment      440    1200     4500
                                              HomeValue        Home Value          70000  250000   675000
                            $0 to $45K        MortgagePayment  Mortgage Payment      100     740     6800
                                              HomeValue        Home Value              0  130000  5303000
                            $45K to $90K      MortgagePayment  Mortgage Payment      100    1000     7400
                                              HomeValue        Home Value              0  180000  4915000
                            Above $90K        MortgagePayment  Mortgage Payment      100    1600     7400
                                              HomeValue        Home Value              0  340000  5303000
Metro, Outside City         Negative          MortgagePayment  Mortgage Payment      100    1450     5400
                                              HomeValue        Home Value          10000  250000  4152000
                            $0 to $45K        MortgagePayment  Mortgage Payment      100     850     7400
                                              HomeValue        Home Value              0  150000  4304000
                            $45K to $90K      MortgagePayment  Mortgage Payment      100    1100     6800
                                              HomeValue        Home Value              0  199000  4915000
                            Above $90K        MortgagePayment  Mortgage Payment      100    1600     7400
                                              HomeValue        Home Value              0  330000  4915000
Metro, City Status Unknown  Negative          MortgagePayment  Mortgage Payment      180    1200     5300
                                              HomeValue        Home Value          17000  245000  2948000
                            $0 to $45K        MortgagePayment  Mortgage Payment      100     720     7400
                                              HomeValue        Home Value              0  125000  4915000
                            $45K to $90K      MortgagePayment  Mortgage Payment      100     960     7400
                                              HomeValue        Home Value              0  160000  4915000
                            Above $90K        MortgagePayment  Mortgage Payment      100    1400     7400
                                              HomeValue        Home Value              0  270000  4915000

      In Outputs 2.2.3 and 2.2.4, frequencies and percentages are summarized across combinations of various categories, which requires mastery of the fundamentals of the FREQ procedure.

      Output 2.2.3: Income Status Versus Mortgage Payment

Table of HHIncome by MortgagePayment
Rows: HHIncome (Household Income); columns: MortgagePayment (Mortgage Payment)
Each cell shows Frequency (first line) and Row Pct (second line)

HHIncome        $350 and Below  $351 to $1000  $1001 to $1600  Over $1600   Total
Negative                    30             97              92          83     302
                          9.93          32.12           30.46       27.48
$0 to $45K               22929          83125           22617       11436  140107
                         16.37          59.33           16.14        8.16
$45K to $90K             13877         103660           54778       27052  199367
                          6.96          51.99           27.48       13.57
Above $90K                5944          52679           62474       84867  205964
                          2.89          25.58           30.33       41.20
Total                    42780         239561          139961      123438  545740

      Output 2.2.4: Income Status Versus Mortgage Payment for Metropolitan Households (Table 1 of 3)

Table 1 of HHIncome by MortgagePayment
Controlling for Metro=Metro, Inside City
Rows: HHIncome (Household Income); columns: MortgagePayment (Mortgage Payment)
Each cell shows Frequency (first line) and Row Pct (second line)

HHIncome        $350 and Below  $351 to $1000  $1001 to $1600  Over $1600   Total
Negative                     0              7               9           7      23
                          0.00          30.43           39.13       30.43
$0 to $45K                1596           8949            2597        1700   14842
                         10.75          60.30           17.50       11.45
$45K to $90K               910           9215            5571        3450   19146
                          4.75          48.13           29.10       18.02
Above $90K                 504           4947            6321       10256   22028
                          2.29          22.46           28.70       46.56
Total                     3010          23118           14498       15413   56039
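
      Tables with the general shape of Outputs 2.2.3 and 2.2.4 can be requested from the FREQ procedure roughly as sketched below. This is an assumption-laden outline rather than the book's solution: work.ipums2010 is a hypothetical data set name, the WHERE condition is inferred from the totals matching Output 2.2.1, and the formats that group income and mortgage payment into the ranges shown are omitted, so the code as written would tabulate every distinct value.

```sas
/* Sketch only: work.ipums2010 is a hypothetical data set name.        */
/* Formats that bin HHIncome and MortgagePayment into the ranges shown  */
/* in the outputs are not included here.                                */
proc freq data=work.ipums2010;
  where MortgagePayment ge 100;
  /* Two-way table: rows are HHIncome, columns are MortgagePayment.     */
  /* NOCOL and NOPERCENT leave only Frequency and Row Pct in each cell. */
  tables HHIncome*MortgagePayment / nocol nopercent;
  /* Three-way request: one HHIncome by MortgagePayment table for each  */
  /* level of Metro ("Controlling for Metro=...").                      */
  tables Metro*HHIncome*MortgagePayment / nocol nopercent;
run;
```

      Limiting the second request to the three metropolitan categories, as in Output 2.2.4, would need an additional condition on Metro that depends on how that variable is coded.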

      This section reviews and extends some fundamental SAS concepts demonstrated in code supplied for Chapter 1, with these examples built
