A Gentle Introduction to Statistics Using SAS Studio. Ron Cody
When you click this tab, you see three separate tabs, one labeled My Tasks, another labeled Tasks, the last labeled Utilities. Expanding the Utilities tab displays three sub-tabs: Import Data, Query, and SAS Program. (See Figure 5.2 below.)
Figure 5.2: Expanding the Utilities Tab
The Import Data task is used to import data in a variety of formats and to create SAS data sets. A complete list of supported file types is shown in Figure 5.3.
Figure 5.3: List of Supported Files
As you can see in Figure 5.3, this import data utility can import data from many of the most common PC data formats. Because Excel workbooks and CSV files are so popular, let’s use them to demonstrate how SAS converts various file formats into SAS data sets.
Importing Data from an Excel Workbook
Your virtual machine runs a Linux operating system, whose file-naming conventions differ from those used on Microsoft or Apple computers. Filenames in Linux are case sensitive, and folders and subfolders are separated by forward slashes. Filenames on Microsoft platforms are not case sensitive, and folders and subfolders are separated by backslashes. To reconcile these differing conventions, you set up shared folders in your virtual machine that allow your SAS programs to read and write files on your computer's hard drive.
There are slight differences in how you create shared folders, depending on whether you are running VirtualBox, VMware Workstation Player, or VMware Fusion. The easiest way to read and write data between your SAS Studio session and your hard drive is to place your data files in a specific location on your Windows hard drive—\SASUniversityEdition\myfolders. If you followed the installation directions for your choice of virtualization software, this location on your hard drive is mapped to a shared folder called /folders/myfolders in SAS Studio.
For most of the examples in this book, the location c:\SASUniversityEdition\myfolders is the folder where your data files and SAS data sets are located. All the programs and data files that you place in \SASUniversityEdition\myfolders will show up when you click the My Folders tab in the Navigation pane. Remember that this folder (or an equivalent folder on other operating systems) was created when you installed and configured SAS University Edition.
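As a quick check of the folder mapping described above, the short DATA step below is a minimal sketch. It assumes the default SAS University Edition setup, in which c:\SASUniversityEdition\myfolders appears inside SAS Studio as /folders/myfolders. The variable names Right_Case and Wrong_Case are made up for this illustration.

```sas
/* Sketch only: assumes the shared folder is mapped to /folders/myfolders */
data _null_;
   /* FILEEXIST returns 1 if the path exists and 0 if it does not */
   Right_Case = fileexist('/folders/myfolders');
   /* Same path with different capitalization: on a case-sensitive */
   /* Linux file system this is a different name                   */
   Wrong_Case = fileexist('/folders/MyFolders');
   /* Write both results to the SAS log */
   put Right_Case= Wrong_Case=;
run;
```

After running the program, look in the log for the two values. If the first value is 0, the shared folder is probably not set up as described in the installation directions.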
Let’s use the workbook Grades.xlsx (located in the folder c:\SASUniversityEdition\myfolders) for this demonstration.
If you go to the SAS author site (support.sas.com/cody) and scroll down to this book, you will see some choices listed, including one that reads, “Example Code and Data.” If you click on this link, you can download a ZIP file that contains some programs and data sets. Find the program Create_Datasets.sas and extract it. If you are using SAS Studio with SAS University Edition, a good place to put the files that you downloaded is in a folder called:
c:\SASUniversityEdition\myfolders
If you do that, you can access the programs and data in the Server Files and Folders tab in the Navigation pane on the left side of the screen.
Next, open SAS Studio and edit the Autoexec.sas file: click the icon to the left of the question mark (?) on the top line of SAS Studio, select “Edit Autoexec File,” and add a line similar to the one below:
libname Stats '/folders/myfolders';
If you are using SAS Studio in another environment (such as the SAS Windowing Environment or SAS OnDemand), place your files in the appropriate location for that environment and modify the LIBNAME statement shown above to match.
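Once the LIBNAME statement has run, one way to confirm that the library points where you expect is to list its contents. The sketch below assumes that the Stats libref from the statement above has already been assigned in your session.

```sas
/* Sketch only: assumes the Stats libref is already assigned */
/* _ALL_ lists every SAS data set in the library;            */
/* NODS suppresses the detailed report for each data set     */
proc contents data=Stats._all_ nods;
run;
```

If the log reports that the libref is not assigned, check the path in the LIBNAME statement, including its capitalization.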
If you open this workbook in Excel, it looks like this.
Figure 5.4: Excel Workbook Grades.xlsx
The first row of the worksheet contains column names (also known as variable names). The remaining rows contain data on three students (yes, it was a very small class). The worksheet name was not changed, so it has the default name Sheet1.
The first step to import this data into a SAS data set is to double-click the Import Data task.
Figure 5.5: Double-Clicking the Import Data Task
You have two ways to select the file that you want to import. One is to click the Select File button on the right side of the screen. The other is to click the Server Files and Folders tab in the Navigation pane (on the left), find the file, and drag it to the Drag and Drop area. Depending on how you set up your SAS Studio session, you might find your files under Folder shortcuts then myfolder.
Clicking Select File (the first method) brings up a window where you can select a file to import. Here it is:
Figure 5.6: Clicking on the Select File Button
Select the file that you want to import and click Open. This brings up the Import Window:
Figure 5.7: The Import Window
Use the mouse to enlarge the top half of the Import window or use the scroll bar on the right to reveal the entire window. The figure below shows the expanded view of the Import window:
Figure 5.8: Expanded View of the Import Window
The top part of the window shows information about the file that you want to import. If the workbook contains multiple worksheets, you can enter a Worksheet Name. Because this workbook has only one worksheet, you do not have to enter one.
The OPTIONS pulldown menu enables you to select a file type. However, if your file has the appropriate extension (for example, XLSX, XLS, or CSV), you can leave the default actions (based on the file extension) to decide how to import the data.
Because the first row of the spreadsheet contains column names, leave the check on the “Generate SAS variable names” option. This tells the import utility to use the first row of the worksheet to generate variable names.
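The choices described above correspond to options of PROC IMPORT, which is the procedure behind the Import Data task. A rough hand-written equivalent might look like the sketch below. It assumes that the workbook is at /folders/myfolders/Grades.xlsx; the output data set name WORK.Grades is a hypothetical name chosen for this illustration.

```sas
/* Sketch only: the path and output data set name are assumptions */
proc import datafile='/folders/myfolders/Grades.xlsx'
            dbms=xlsx          /* file type, normally inferred from the extension */
            out=work.Grades    /* hypothetical output data set name */
            replace;           /* overwrite the data set if it already exists */
   sheet='Sheet1';             /* default worksheet name */
   getnames=yes;               /* use the first row to generate variable names */
run;

/* List the imported data to check the result */
proc print data=work.Grades;
run;
```

Setting GETNAMES=NO instead would treat the first row as data, which is the effect of clearing the “Generate SAS variable names” check box.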
You probably want to change the name of the output SAS data set. Clicking the Change button in the Output Data area of the screen brings up a list of SAS libraries (below).
Figure 5.9: Changing the Name of the SAS Data Set
The WORK library is used to create a temporary SAS data set (that disappears when you close your SAS Session). For now, let’s