SAS Statistics by Example. Ron Cody, EdD
The SET statement is an instruction to read each of the observations from data set example.Blood_Pressure. In parentheses following the data set name is a KEEP= data set option. This option tells the program that you want only two of the variables (Subj and SBP) to be read from the input data set. Finally, the condition in the IF-THEN statement is true when the value of Subj is equal to 5. In that case, the assignment statement following the keyword THEN is executed and the SBP value is set to 180. In a similar manner, the ELSE-IF statement sets the value of SBP to 180 for subject 55.
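The DATA step described in this paragraph might look roughly like the following. This is a minimal sketch, not the book's listing: it assumes the libref example is already assigned, it takes the output data set name Blood_Pressure_Out from the PROC SGPLOT step in Program 2.9, and it uses the SBP values exactly as stated in the paragraph.

```sas
/* Sketch only: assumes the libref EXAMPLE has been assigned and that */
/* the output data set is called Blood_Pressure_Out (see Program 2.9). */
data Blood_Pressure_Out;
   /* KEEP= data set option: read only Subj and SBP from the input */
   set example.Blood_Pressure(keep=Subj SBP);

   /* Overwrite SBP for two subjects to create outliers;  */
   /* the values are those given in the text above.       */
   if Subj = 5 then SBP = 180;
   else if Subj = 55 then SBP = 180;
run;
```

Because KEEP= is applied on the SET statement, the other variables never enter the DATA step, so the new data set holds only Subj and SBP.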
The box plot of the modified data set shows the two outliers as small circles:
You can even get a bit fancier and let SAS label the outliers:
Program 2.9: Labeling Outliers on a Box Plot
title "Demonstrating How Outliers are Displayed with a Box Plot";
proc sgplot data=Blood_Pressure_Out;
   hbox SBP / datalabel=Subj;
run;
The option DATALABEL= lets you select a variable to identify specific outliers. If you use the DATALABEL option without naming a label variable, SGPLOT uses the numerical value of the response variable (SBP in this example) to label the outliers. Here is the output:
Notice that the outliers for subjects 5 and 55 are labeled.
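For comparison, here is a hedged sketch of the other form mentioned above, where DATALABEL is given without a label variable. The title text is illustrative only, and the data set is assumed to be the same Blood_Pressure_Out used in Program 2.9.

```sas
/* Sketch only: the title is illustrative, not from the book. */
title "Labeling Outliers with Their SBP Values";
proc sgplot data=Blood_Pressure_Out;
   /* DATALABEL with no variable named: each outlier is labeled */
   /* with its value of the response variable (SBP).            */
   hbox SBP / datalabel;
run;
```

This form is handy when you care about how extreme a value is rather than which subject it belongs to.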
Displaying Multiple Box Plots for Each Value of a Categorical Variable
If you want to see a box plot for each value of a categorical variable, you can include the option CATEGORY= on the HBOX or VBOX statement. The example that follows uses the original Blood_Pressure data set (without the outliers) and displays a box plot for each value of Drug.
Program 2.10: Displaying Multiple Box Plots for Each Value of a Categorical Variable
title "Box Plots of SBP for Each Value of Drug";
proc sgplot data=example.Blood_Pressure;
   hbox SBP / category=Drug;
run;
The HBOX option CATEGORY= generates a separate box plot for each of the three Drug values:
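Since CATEGORY= is also available on the VBOX statement, the same display can be drawn with vertical boxes. The following is a minimal sketch under the same assumptions as Program 2.10 (the libref example is assigned); the title text is illustrative.

```sas
/* Sketch only: vertical counterpart of Program 2.10. */
title "Vertical Box Plots of SBP for Each Value of Drug";
proc sgplot data=example.Blood_Pressure;
   /* One vertical box per Drug value, side by side along the X axis */
   vbox SBP / category=Drug;
run;
```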
Conclusions
Descriptive statistics should be your first step in data analysis so that you can see a summary of the data and better understand their distribution. This chapter showed you how to produce both numerical and graphical output for continuous variables, using a number of SAS procedures.
The next two chapters will show you how to display descriptive statistics for categorical variables and how to investigate bivariate relationships.
Chapter 3 Descriptive Statistics – Categorical Variables
Introduction
Computing Frequency Counts and Percentages
Computing Frequencies on a Continuous Variable
Using Formats to Group Observations
Histograms and Bar charts
Creating a Bar Chart Using PROC SGPLOT