Probability with R. Jane M. Horgan
head(results, n = 4)
gives the first four rows of the data set.
  gender arch1 prog1 arch2 prog2
1      m    99    98    83    94
2      m    NA    NA    86    77
3      m    97    97    92    93
4      m    99    97    95    96
and
tail(results, n = 4)
gives the last four rows of the data set.
    gender arch1 prog1 arch2 prog2
116      m    16    27    25     7
117      m    73    51    48    23
118      m    56    54    49    25
119      m    46    64    13    19
The convention for accessing the column variables is to use the name of the data frame, followed by a dollar sign ($) and the name of the relevant column. For example,
results$arch1[5]
returns
[1] 89
which is the fifth observation in the column labeled arch1.
Usually, when a new data frame is created, the following two commands are issued.
attach(results)
names(results)
which give
[1] "gender" "arch1" "prog1" "arch2" "prog2"
indicating that the column variables can now be accessed without the prefix results. For example,
arch1[5]
gives
[1] 89
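The access conventions above can be sketched on a small made-up data frame. The name toy and all of its values are invented for illustration and are not the book's data. The function with() is shown as an alternative to attach() that leaves the search path unchanged.

```r
# Toy data frame standing in for `results`; the marks are invented for illustration
toy <- data.frame(
  gender = c("m", "m", "f", "f", "m"),
  arch1  = c(99, NA, 97, 99, 89),
  prog1  = c(98, NA, 97, 97, 70)
)

print(names(toy))         # the column names
print(toy$arch1[5])       # fifth observation of arch1, using the data-frame prefix
print(toy[["arch1"]][5])  # the same entry, naming the column as a string

# with() evaluates an expression inside the data frame, so no prefix is needed
fifth <- with(toy, arch1[5])
print(fifth)
```

Unlike attach(), with() has no lasting effect on the session, which can be convenient when several data frames share column names.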
The command read.table assumes that the data in the text file are separated by white space (spaces or tabs). Other forms include:
read.csv, used when the data points are separated by commas;
read.csv2, used when the data are separated by semicolons (with the comma taken as the decimal mark).
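As a rough illustration of the three readers, the sketch below passes short inline strings through the text argument instead of reading files. The strings and their values are invented; with a real file, the file name would be given instead.

```r
# Inline text stands in for files; the values are invented for illustration
space_txt <- "gender arch1 prog1\nm 99 98\nf 97 97"
comma_txt <- "gender,arch1,prog1\nm,99,98\nf,97,97"
semi_txt  <- "gender;arch1;prog1\nm;99,5;98\nf;97;97"

d1 <- read.table(text = space_txt, header = TRUE)  # white-space separated
d2 <- read.csv(text = comma_txt)                   # comma separated, header assumed
d3 <- read.csv2(text = semi_txt)                   # semicolon separated, comma as decimal mark

print(d1)
print(d2)
print(d3)
```

Note that read.table needs header = TRUE to treat the first line as column names, whereas read.csv and read.csv2 assume a header by default.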
It is also possible to enter data into a spreadsheet and store it in a data frame, by writing
newdata <- data.frame()
fix(newdata)
which brings up a blank spreadsheet called newdata, and the user may then enter the variable labels and the variable values.
Right click and close creates a data frame newdata in which the new information is stored.
If you subsequently need to amend or add to this data frame, write
fix(newdata)
which retrieves the spreadsheet with the data. You can then edit the data as required. Right click and close saves the amended data frame.
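Where the spreadsheet editor is not available, for example in a non-interactive script, a data frame can also be built and amended directly in code. The following is a minimal sketch; the variable labels student and mark and all the values are assumptions made for illustration.

```r
# Hypothetical variable labels and values, for illustration only
newdata <- data.frame(
  student = c("A", "B", "C"),
  mark    = c(72, 65, 58)
)

# Amend an existing entry, then append a further row
newdata$mark[2] <- 66
newdata <- rbind(newdata, data.frame(student = "D", mark = 81))

print(newdata)
```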
1.7 Missing Values
R allows vectors to contain a special NA value to indicate that the data point is not available. In the second record in results, notice that NA appears for arch1 and prog1. This means that the marks for this student are not available in Architecture and Programming in the first semester; the student may not have sat these examinations. The absent marks are referred to as missing values, and are not included at the analysis stage.
1.8 Editing
It is possible to edit data that have been already entered and to edit and invoke commands that have been previously used.
1.8.1 Data Editing
The data you have read and stored may be edited and changed interactively during your R session. Simply click on Edit on the toolbar to get access to the Data Editor, which allows you to bring up any data frame as a spreadsheet. You can edit its entries as you wish.
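It is also possible to change particular entries of a data frame. For example,
results$arch1[7] <- 10
changes the mark for the seventh student in arch1 in the data frame results from 100 to 10. It may have been entered as 100 in error. Note that the prefix is needed here: after attach(results), writing arch1[7] <- 10 would change only a copy of the column in the workspace, not the data frame itself.
1.8.2 Command Editing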
The command
history()
brings up the previous 25 commands on a separate screen. These can be edited and/or used again as you wish.
history(max.show = Inf)
retrieves all previous commands that you have used.
1.9 Tidying Up
As your R session continues, you may find that the set of objects you have used has become unwieldy, and you may want to remove some. To see what the workspace contains, write
ls()
or equivalently
objects()
which causes all objects in the workspace to appear on the screen. If you have run the preceding examples in this chapter, the following should appear.
[1] "downtime" "newdata"  "prod1"    "results"  "x"
[6] "X"        "x2"
The content of the workspace can also be examined from the toolbar; go to Misc and choose List Objects.
To tidy up, you might want to remove some of these objects. For example,
rm(x2)
removes the object x2.
To remove the complete workspace, write
rm(list = ls())
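These commands can be tried without disturbing the real workspace by working in a separate environment. In the sketch below, the environment scratch and its contents are assumptions made for illustration; ls() and rm() act on the workspace in the same way when the envir argument is omitted.

```r
# A scratch environment plays the role of the workspace (illustrative only)
scratch <- new.env()
assign("x",  1:3,            envir = scratch)
assign("X",  matrix(1:4, 2), envir = scratch)
assign("x2", (1:3)^2,        envir = scratch)

before <- ls(scratch)                    # list the objects present
rm("x2", envir = scratch)                # remove a single object
after_one <- ls(scratch)
rm(list = ls(scratch), envir = scratch)  # remove everything that is left
after_all <- ls(scratch)

print(before)
print(after_one)
print(after_all)
```

Because rm(list = ls()) cannot be undone, trying it on a scratch environment first is a cautious way to see its effect.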
1.10 Saving and Retrieving
To save the entire workspace, click File, then Save Workspace, on the toolbar. You will then be given the opportunity to specify the location where you want to save the workspace. The workspace is saved to a file with the extension .RData.
A saved workspace may be retrieved at File on the toolbar by clicking on Load Workspace, and specifying its location.
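Saving and retrieving can also be done from the command line. The sketch below is one possible way, not the book's own procedure: the object marks and the file name session.RData are invented, and the file is written to a temporary directory so that nothing on disk is overwritten.

```r
# Hypothetical object standing in for the contents of a session
marks <- c(99, NA, 97, 99)

# Save every object currently visible to a file in a temporary directory;
# save.image(file = ws_file) is the usual shortcut for saving the whole workspace
ws_file <- file.path(tempdir(), "session.RData")
save(list = ls(), file = ws_file)

rm(marks)        # the object is gone from the session ...
load(ws_file)    # ... and is restored from the saved file
print(marks)
```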