A Gentle Introduction to Statistics Using SAS Studio. Ron Cody
When all is ready, click the Run icon (Figure 5.10).
Figure 5.10: Click the Run Icon
You are done! Here is a section of the results.
Figure 5.11: Variable List for the Work.Grades SAS Data Set
Here you see a list of the variable names (you may have to scroll down through several pages to see it), whether each variable is stored as numeric or character, and some other information that we don’t need at this time. Notice that the import utility correctly reads Name as character and the other variables as numeric.
Listing the SAS Data Set
A quick way to see a listing of the Grades data set is to select the Libraries tab in the navigation pane, open the WORK library, and double-click Grades. It looks like this:
Figure 5.12: Data Set Grades in the Work Library
You can use your mouse to scroll to the right to see the rest of the table. To create a better-looking report, click the Tasks and Utilities tab of the navigation pane and select Tasks, then Data, followed by List Data. (See Figure 5.13.)
Figure 5.13: The List Data Task
Double-click List Data and select the Grades data set in the WORK library in the Data selection box. Then click the Run icon. You will be presented with a nice-looking list of the Grades data set. (See Figure 5.14 below.)
Figure 5.14: List of the Grades Data Set
Importing an Excel Workbook with Invalid SAS Variable Names
What if your Excel worksheet has column headings that are not valid SAS variable names?
Valid SAS variable names are up to 32 characters long. The first character must be a letter or an underscore; the remaining characters can be letters, digits, or underscores. You are free to use upper- or lowercase letters.
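The naming rules above can be written as a short check. What follows is a minimal sketch in Python, used here purely for illustration (it is not part of SAS Studio). The function name is_valid_sas_name is hypothetical, and the sketch encodes only the rules stated in this section.

```python
import re

# Rules from the text: 1 to 32 characters; the first is a letter or an
# underscore; the rest are letters, digits, or underscores.
_SAS_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,31}")


def is_valid_sas_name(name: str) -> bool:
    """Return True if name satisfies the variable-name rules stated above."""
    return _SAS_NAME.fullmatch(name) is not None


# "Quiz 1" is a made-up heading; "2015Final" is discussed later in the chapter.
for candidate in ["Name", "_2015Final", "2015Final", "Quiz 1"]:
    print(candidate, is_valid_sas_name(candidate))
```

A name with an embedded blank or a leading digit fails the check, which is the situation described next.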
As an example of a worksheet with invalid variable names, look at the worksheet Grades2 shown in Figure 5.15.
Figure 5.15: List of Excel Workbook Grades2
Most of the column headings in this spreadsheet are not valid SAS variable names. Six of them contain a blank in the middle of the name, and the last column (2015Final) starts with a digit. What happens when you import this worksheet? Because you now know how to use the Import Data task, it is not necessary to describe the import task again. All you really need to see is the final list of variables in the data set. Here they are:
Figure 5.16: Variable Names in the Grades2 SAS Data Set
SAS replaced all the blanks with underscores and added an underscore as the first character in the 2015Final name to create valid SAS variable names. Note: the option to use column labels as column headings was deselected so that you could see the actual variable names.
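The two repairs just described can be mimicked in a few lines. This is only a sketch of the behavior reported above, written in Python for illustration. It is not the code that SAS runs, and the function name to_valid_sas_name is hypothetical.

```python
def to_valid_sas_name(heading: str) -> str:
    """Apply the two repairs described in the text (illustrative only)."""
    # Repair 1: blanks become underscores.
    name = heading.replace(" ", "_")
    # Repair 2: a leading digit gets an underscore placed in front of it.
    if name and name[0].isdigit():
        name = "_" + name
    return name


# "2015Final" comes from the text; "Quiz 1" is a made-up heading.
for heading in ["Quiz 1", "2015Final"]:
    print(heading, "->", to_valid_sas_name(heading))
```

Headings that are already valid, such as Name, pass through unchanged.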
Importing an Excel Workbook That Does Not Have Column Headings
What if the first row of your worksheet does not contain column headings (variable names)? You have two choices: First, you can edit the worksheet and insert a row with column headings (probably the best option). The other option is to deselect “Create Variable Names” in the OPTIONS section in the Import Window (see Figure 5.17) and let SAS create variable names for you.
Figure 5.17: Uncheck the Create Variable Names Option
Here is the result:
Figure 5.18: Variable Names Generated by SAS
SAS used the Excel column identifiers (A through F) as variable names. You can leave these variable names as they are or change them using DATA step programming. Another option is to use PROC DATASETS, a SAS procedure that enables you to alter various attributes of a SAS data set without having to create a new copy of the data set.
When you import a CSV file without variable names, you will see variable names VAR1, VAR2, and so on, that are generated by SAS.
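To make the two naming patterns concrete, here is a small sketch in Python (for illustration only; it is not SAS code). Both function names are hypothetical. The Excel-style helper stops at 26 columns because the text shows only A through F and does not say what happens past Z.

```python
import string


def excel_style_names(n_columns: int) -> list[str]:
    """Column letters A, B, C, ... as seen for a worksheet without headings."""
    if not 0 <= n_columns <= 26:
        # Columns past Z are not covered by the text, so this sketch refuses them.
        raise ValueError("this sketch handles at most 26 columns")
    return list(string.ascii_uppercase[:n_columns])


def csv_style_names(n_columns: int) -> list[str]:
    """Names VAR1, VAR2, ... as described for a CSV file without headings."""
    return [f"VAR{i}" for i in range(1, n_columns + 1)]


print(excel_style_names(6))
print(csv_style_names(6))
```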
Importing Data from a CSV File
Comma-separated values (CSV) files are a popular format for external data files. As the name implies, CSV files use commas as data delimiters. Many websites enable you to download data as CSV files. As with Excel workbooks, your CSV file may or may not contain variable names at the beginning of the file. If the file contains variable names, make sure that the “Generate SAS Variable Names” option box is checked; if it does not, deselect this option.
For example, look at the CSV file called Grades.csv in Figure 5.19 below:
Figure 5.19: CSV File Grades.csv
This CSV file contains the same data as the Excel Workbook Grades.xlsx. Notice that variable names are included in the file. You can import this file and create a SAS data set using the same steps that you used to import the Excel workbook. The import facility will automatically use the correct code to import this data file because of the CSV file extension. The resulting SAS data set is identical to the one shown in Figure 5.14.
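The difference between a CSV file with and without a first row of variable names can be sketched as follows. This is an illustration in Python, not what the Import Data task runs. The sample rows are made up (the real Grades.csv appears only as a figure), and split_header is a hypothetical helper.

```python
import csv
import io


def split_header(rows, has_names):
    """Separate variable names from data rows.

    has_names plays the role of the variable-names option: when it is
    False, names VAR1, VAR2, ... are generated instead.
    """
    if has_names:
        return rows[0], rows[1:]
    names = [f"VAR{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


# Made-up contents standing in for a small CSV file.
sample = "Name,Quiz1,Final\nAlice,9,88\nBob,7,92\n"
rows = list(csv.reader(io.StringIO(sample)))

names, data = split_header(rows, has_names=True)
print(names)
print(data)
```

Note that csv.reader returns every value as text, so this sketch does not reproduce the numeric versus character typing that the import utility performs.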
Shared Folders (Accessing Data from Anywhere on Your Hard Drive)
If you followed the instructions for setting up SAS Studio, a default folder, referred to in SAS Studio as /folders/myfolders, lets you read data from the folder called \SASUniversityEdition\myfolders on your hard drive. If this is the only place where you plan to read data, you do not need to create any other shared folders in your virtual computer.
If you need to read data from other locations on your hard drive, please see the relevant sections in SAS documentation.
Conclusion
In this chapter, you saw how to import data from Excel workbooks and CSV files. Importing data from any of the other choices displayed in Figure 5.3 follows the same basic procedure. If you need to read data from text files, you will need to learn some basic SAS programming, especially how to use the INPUT statement, one of the most powerful and versatile components of the SAS system.