SAS Statistics by Example. Ron Cody, EdD
Чтение книги онлайн.
Читать онлайн книгу SAS Statistics by Example - Ron Cody, EdD страница 6
SAS runs on many different computing platforms: PCs, UNIX and Linux operating systems, and mainframe computers. You can run the examples in this book on any of these platforms, although there are some small differences in how the data are accessed on the various systems. The examples in this book were all run on a PC, because this is by far the most popular platform on which SAS is run.
Statistical Tasks Performed by SAS
SAS, like many other statistical packages, has built-in procedures for analyzing data. These procedures (called PROCs for short) enable you to perform statistical tests and analyses. For example, if you want to perform a Student’s t-test, you use PROC TTEST, to run a regression model, you use PROC REG. The list of statistical PROCs is quite extensive, and not all of them are covered in this book. Even so, this book is a good place to get started, and many of the most popular statistical tasks are covered. The complete guide to all SAS/STAT procedures is available from SAS Institute. The documentation is available in more than five volumes (taking up about two feet of shelf space) or for free in HTML or PDF form on the SAS Web site: http://support.sas.com/documentation. Select SAS Press books are also available on iPad, Kindle, Google eBooks, Mobipocket, Books24x7, netLibrary, and Safari Books online.
The Structure of SAS Programs
SAS programs are divided into DATA and PROC steps. With DATA steps you can read text data from files; create new variables from existing variables; perform logical operations on your data; and merge, concatenate, and subset your data files. PROC steps give you the ability to perform pre-defined tasks such as creating frequency distributions or performing a t-test. SAS stores data in data sets that are unique to SAS. Your data might already be in a SAS data set, in which case you might not need to write a DATA step at all. However, even if you are starting with data already in a SAS data set, you might want to write a DATA step to perform some manipulation of the data, such as performing a transformation, grouping values, subsetting your data, or combining data from several data sets.
SAS Data Sets
SAS data sets consist of two parts: a descriptor portion, or metadata (information about your data set, such as the variable names and data types), and the data values themselves. SAS can create SAS data sets from almost any source. If you have raw data in a text file, you can use a DATA step to read the file and create your SAS data set. If you have an Excel workbook, or any one of several popular database formats, you can either use a SAS procedure to convert the data into a SAS data set or use the Import Wizard (part of SAS/ACCESS and available from SAS Institute) to point-and-click your way through the conversion.
SAS Display Manager
This section shows you how to run SAS programs on a Windows platform. If you are using UNIX or a mainframe to run your programs, your screens will not look like the images shown in this section. If you choose to use SAS Enterprise Guide to run your SAS programs on a Windows platform, you will also be using a different editor.
When you open SAS on a PC platform, you enter SAS Display Manager. This facility contains three major windows: the Program Editor, where you write, edit, and submit your SAS programs; a Log window, where you see error messages and information about the program you have submitted; and the Output window, where SAS displays your output. If you need the output for a Web page or you want to use the output in a Word or Word Perfect program, the Output Delivery System (ODS) can send your output to these destinations as HTML, RTF (rich text format), or PDF.
The following display shows SAS Display Manager with all three windows open:
The top section is the Output window, the middle section is the Log window, and the bottom section is the Program Editor window. You can resize, move, or expand each of these windows.
Excel Workbooks
Because Excel workbooks (or comma-separated values [CSV] files) are so popular on PCs, let’s examine how to use the SAS Import Wizard to convert these files into SAS data sets.
The following display shows an Excel workbook called SAMPLE.XLS:
Each of the columns in this workbook contains information about each of our subjects (ID, Age, and Gender). The first row of the workbook contains variable names. In Excel, these names can be anything. In SAS, variable names must conform to a stricter naming convention. The maximum length of a SAS variable name is 32 characters; the first character of a variable name must be a letter or an underscore, and the remaining characters of the variable name can contain letters (uppercase or lowercase), numbers, and the underscore character. For example, Age, Income2010, and Home_Runs are valid SAS variable names; 1year, Year Income, and Cost% are not valid SAS variable names. Later you will see what happens if the first row of your workbook contains variable names that are not valid SAS variable names.
By the way, SAS variable names are not case sensitive. However, if you use uppercase, lowercase, or mixed case, SAS remembers the case of the variable name from the first time you used it and displays the names in SAS reports based on the previous value.
Each of the rows of the workbook, with the exception of the first row, contains data about an individual. SAS calls these rows observations. So, whereas an Excel workbook has columns and rows, SAS data sets have variables and observations.
To convert this Excel file into a SAS data set, you can use the Import Wizard:
1. Click File.
2. Select Import Data.
3. Choose Microsoft Excel.
4. Select the Excel workbook that you want to convert.
5. Name the SAS data set (SampleData in this example).
6. Click Finish (at the bottom right of the window) to complete the conversion.
Naming conventions for SAS data sets are the same as for SAS variable names. The names must be 32 characters or less in length, they must start with a letter or an underscore, and the remaining characters must be letters, numbers, or underscores.
Now that you have converted your Excel workbook into a temporary SAS data set, you can list the observations in the data set and inspect the descriptor portion of the data set. SAS provides you with several ways to do this.
One way to see a listing of the data in a SAS data set is to use a SAS procedure called PROC PRINT. The following program demonstrates how to use PROC PRINT to list the observations in the SampleData data set:
Program 1.1: Using PROC PRINT to List the Observations in a SAS Data Set
proc print data=SampleData;
run;
|