Fundamentals of Programming in SAS. James Blum
2.1 Learning Objectives
At the conclusion of this chapter, mastery of the concepts covered in the narrative includes the ability to:
Apply the MEANS procedure to produce a variety of quantitative summaries, potentially grouped across several categories
Apply the FREQ procedure to produce frequency and relative frequency tables, including cross-tabulations
Categorize data for analyses in either the MEANS or FREQ procedures using internal SAS formats or user-defined formats
Formulate a strategy for selecting only the necessary rows when processing a SAS data set
Apply the DATA step to read data from delimited or fixed-position raw text files
Describe the operations carried out during the compilation and execution phases of the DATA step
Compare and contrast the input buffer and program data vector
Apply DATA step statements to assist in debugging
Apply the COMPARE procedure to compare and validate a data set against a standard
Use the concepts of this chapter to solve the problems in the wrap-up activity. Additional exercises and case studies are also available to test these concepts.
2.2 Case Study Activity
This section introduces a case study that is used as a basis for most of the concepts and associated activities in this book. The data comes from the Current Population Survey by the Integrated Public Use Microdata Series (IPUMS CPS). IPUMS CPS contains a wide variety of information; only a subset of the data collected from 2001 to 2015 is included in the examples here. Further, the data is introduced in segments, starting with simple sets of variables and eventually adding more information that must be assembled to achieve the objectives of each section.
This chapter works with data that includes household-level information from the 2005 and 2010 IPUMS CPS data sets of over one million observations each. Included are variables on state, county, metropolitan area/city, household income, home value, mortgage status, ownership status, and mortgage payment. Outputs 2.2.1 through 2.2.4 show tabular summaries from the 2010 data, including quantitative statistics, frequencies, and/or percentages. Reproducing these tables in the wrap-up activity in Section 2.11 is the primary objective for this chapter.
The first sample output, Output 2.2.1, shows six statistics on mortgage payments across metropolitan status for mortgages of $100 per month or more. Producing this table, and the slightly more complicated Output 2.2.2, requires an understanding of several components of the MEANS procedure.
Output 2.2.1: Basic Statistics on Mortgage Payments Grouped on Metropolitan Status
Analysis Variable : MortgagePayment Mortgage Payment
Metro | N | Mean | Median | Std Dev | Minimum | Maximum |
Not Identifiable | 42927 | 970.2 | 800.0 | 668.5 | 100.0 | 7400.0 |
Not in Metro Area | 97603 | 815.0 | 670.0 | 576.0 | 100.0 | 6800.0 |
Metro, Inside City | 56039 | 1363.5 | 1100.0 | 974.8 | 100.0 | 7400.0 |
Metro, Outside City | 185967 | 1480.8 | 1300.0 | 974.7 | 100.0 | 7400.0 |
Metro, City Status Unknown | 163204 | 1233.2 | 1000.0 | 846.4 | 100.0 | 7400.0 |
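A table laid out like Output 2.2.1 could be requested along the following lines. This is a minimal sketch rather than the book's own program: the data set name work.ipums2010 is a placeholder, and the sketch assumes Metro already carries a format that displays the category labels. MortgagePayment and Metro are the variable names shown in the output.

```sas
/* Sketch only: work.ipums2010 is a placeholder data set name */
proc means data=work.ipums2010 n mean median std min max maxdec=1;
  where MortgagePayment ge 100;  /* keep mortgages of $100 per month or more */
  class Metro;                   /* one row of statistics per metropolitan status */
  var MortgagePayment;           /* analysis variable */
run;
```

The six statistic keywords on the PROC MEANS statement correspond to the six columns of the table, and MAXDEC=1 limits the display to one decimal place.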
Output 2.2.2: Minimum, Median, and Maximum on Mortgage Payments Across Multiple Categories
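Metro | Household Income | Variable | Label | Minimum | Median | Maximum |
Metro, Inside City | Negative | MortgagePayment | Mortgage Payment | 440 | 1200 | 4500 |
Metro, Inside City | Negative | HomeValue | Home Value | 70000 | 250000 | 675000 |
Metro, Inside City | $0 to $45K | MortgagePayment | Mortgage Payment | 100 | 740 | 6800 |
Metro, Inside City | $0 to $45K | HomeValue | Home Value | 0 | 130000 | 5303000 |
Metro, Inside City | $45K to $90K | MortgagePayment | Mortgage Payment | 100 | 1000 | 7400 |
Metro, Inside City | $45K to $90K | HomeValue | Home Value | 0 | 180000 | 4915000 |
Metro, Inside City | Above $90K | MortgagePayment | Mortgage Payment | 100 | 1600 | 7400 |
Metro, Inside City | Above $90K | HomeValue | Home Value | 0 | 340000 | 5303000 |
Metro, Outside City | Negative | MortgagePayment | Mortgage Payment | 100 | 1450 | 5400 |
Metro, Outside City | Negative | HomeValue | Home Value | 10000 | 250000 | 4152000 |
Metro, Outside City | $0 to $45K | MortgagePayment | Mortgage Payment | 100 | 850 | 7400 |
Metro, Outside City | $0 to $45K | HomeValue | Home Value | 0 | 150000 | 4304000 |
Metro, Outside City | $45K to $90K | MortgagePayment | Mortgage Payment | 100 | 1100 | 6800 |
Metro, Outside City | $45K to $90K | HomeValue | Home Value | 0 | 199000 | 4915000 |
Metro, Outside City | Above $90K | MortgagePayment | Mortgage Payment | 100 | 1600 | 7400 |
Metro, Outside City | Above $90K | HomeValue | Home Value | 0 | 330000 | 4915000 |
Metro, City Status Unknown | Negative | MortgagePayment | Mortgage Payment | 180 | 1200 | 5300 |
Metro, City Status Unknown | Negative | HomeValue | Home Value | 17000 | 245000 | 2948000 |
Metro, City Status Unknown | $0 to $45K | MortgagePayment | Mortgage Payment | 100 | 720 | 7400 |
Metro, City Status Unknown | $0 to $45K | HomeValue | Home Value | 0 | 125000 | 4915000 |
Metro, City Status Unknown | $45K to $90K | MortgagePayment | Mortgage Payment | 100 | 960 | 7400 |
Metro, City Status Unknown | $45K to $90K | HomeValue | Home Value | 0 | 160000 | 4915000 |
Metro, City Status Unknown | Above $90K | MortgagePayment | Mortgage Payment | 100 | 1400 | 7400 |
Metro, City Status Unknown | Above $90K | HomeValue | Home Value | 0 | 270000 | 4915000 |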
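The grouping in Output 2.2.2 combines two class variables, a user-defined format for household income, and two analysis variables. One way this might be set up is sketched below. Several details are assumptions and not taken from the text: the data set name work.ipums2010, the format name HHInc, the exact bin endpoints, the numeric codes 2, 3, and 4 for the three metropolitan categories, and the restriction to mortgage payments of $100 or more.

```sas
/* Hypothetical format: the name and exact endpoints are assumptions */
proc format;
  value HHInc
    low -< 0       = 'Negative'
    0 -< 45000     = '$0 to $45K'
    45000 -< 90000 = '$45K to $90K'
    90000 - high   = 'Above $90K';
run;

/* Sketch only: work.ipums2010 and the Metro codes are placeholders */
proc means data=work.ipums2010 min median max;
  where Metro in (2, 3, 4) and MortgagePayment ge 100;
  class Metro HHIncome;           /* groups are formed from the formatted values */
  var MortgagePayment HomeValue;  /* two analysis variables per group */
  format HHIncome HHInc.;
run;
```

Because the CLASS statement groups on formatted values, applying the format is enough to collapse income into four categories without creating a new variable.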
In Outputs 2.2.3 and 2.2.4, frequencies and percentages are summarized across combinations of various categories, which requires mastery of the fundamentals of the FREQ procedure.
Output 2.2.3: Income Status Versus Mortgage Payment
Table of HHIncome by MortgagePayment
HHIncome(Household Income) | MortgagePayment(Mortgage Payment) |
Frequency / Row Pct | $350 and Below | $351 to $1000 | $1001 to $1600 | Over $1600 | Total |
Negative | 30 / 9.93 | 97 / 32.12 | 92 / 30.46 | 83 / 27.48 | 302 |
$0 to $45K | 22929 / 16.37 | 83125 / 59.33 | 22617 / 16.14 | 11436 / 8.16 | 140107 |
$45K to $90K | 13877 / 6.96 | 103660 / 51.99 | 54778 / 27.48 | 27052 / 13.57 | 199367 |
Above $90K | 5944 / 2.89 | 52679 / 25.58 | 62474 / 30.33 | 84867 / 41.20 | 205964 |
Total | 42780 | 239561 | 139961 | 123438 | 545740 |
Output 2.2.4: Income Status Versus Mortgage Payment for Metropolitan Households (Table 1 of 3)
Table 1 of HHIncome by MortgagePayment
Controlling for Metro=Metro, Inside City
HHIncome(Household Income) | MortgagePayment(Mortgage Payment) |
Frequency / Row Pct | $350 and Below | $351 to $1000 | $1001 to $1600 | Over $1600 | Total |
Negative | 0 / 0.00 | 7 / 30.43 | 9 / 39.13 | 7 / 30.43 | 23 |
$0 to $45K | 1596 / 10.75 | 8949 / 60.30 | 2597 / 17.50 | 1700 / 11.45 | 14842 |
$45K to $90K | 910 / 4.75 | 9215 / 48.13 | 5571 / 29.10 | 3450 / 18.02 | 19146 |
Above $90K | 504 / 2.29 | 4947 / 22.46 | 6321 / 28.70 | 10256 / 46.56 | 22028 |
Total | 3010 | 23118 | 14498 | 15413 | 56039 |
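Cross-tabulations like Outputs 2.2.3 and 2.2.4 could be requested with the FREQ procedure roughly as follows. This is a hedged sketch under stated assumptions: the data set name work.ipums2010, the format names HHInc and Mort, the bin endpoints, and the Metro codes 2, 3, and 4 are all placeholders. The filter on MortgagePayment is also an inference: the grand total in Output 2.2.3 (545740) equals the sum of the N column in Output 2.2.1, which suggests the same restriction to payments of $100 or more.

```sas
/* Hypothetical formats: names and exact endpoints are assumptions */
proc format;
  value HHInc
    low -< 0       = 'Negative'
    0 -< 45000     = '$0 to $45K'
    45000 -< 90000 = '$45K to $90K'
    90000 - high   = 'Above $90K';
  value Mort
    low - 350      = '$350 and Below'
    350 <- 1000    = '$351 to $1000'
    1000 <- 1600   = '$1001 to $1600'
    1600 <- high   = 'Over $1600';
run;

/* Two-way table in the style of Output 2.2.3: frequency and row percent only */
proc freq data=work.ipums2010;
  where MortgagePayment ge 100;
  tables HHIncome*MortgagePayment / nocol nopercent;
  format HHIncome HHInc. MortgagePayment Mort.;
run;

/* Three-way request in the style of Output 2.2.4: one table per Metro level */
proc freq data=work.ipums2010;
  where MortgagePayment ge 100 and Metro in (2, 3, 4);
  tables Metro*HHIncome*MortgagePayment / nocol nopercent;
  format HHIncome HHInc. MortgagePayment Mort.;
run;
```

In a three-variable TABLES request, the first variable acts as the controlling (stratification) variable, so the second step yields a separate HHIncome by MortgagePayment table for each retained Metro value. The NOCOL and NOPERCENT options suppress the column and overall percentages, leaving the frequency and row percent shown in each cell.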
2.3 Getting Started with Data Exploration in SAS
This section reviews and extends some fundamental SAS concepts demonstrated in code supplied for Chapter 1, with these examples built