The Big R-Book. Philippe J. S. De Brouwer
Adding Rows to a Data-frame
Adding rows corresponds to adding observations. This is done via the function rbind().
    # To add a row, we need the rbind() function:
    data_test.to.add <- data.frame(
      Name     = c("Ricardo", "Anna"),
      Gender   = c("Male", "Female"),
      Score    = c(66, 80),
      Age      = c(70, 36),
      End_date = as.Date(c("2016-05-05", "2016-07-07"))
      )
    data_test.new <- rbind(data_test, data_test.to.add)
    print(data_test.new)
    ##      Name Gender Score Age   End_date
    ## 1   Piotr   Male    80  42 2014-03-01
    ## 2   Pawel   Male    88  38 2017-02-13
    ## 3    <NA> Female    92  26 2014-10-10
    ## 4    Lisa Female    89  30 2015-05-10
    ## 5   Laura Female    84  35 2010-08-25
    ## 6 Ricardo   Male    66  70 2016-05-05
    ## 7    Anna Female    80  36 2016-07-07
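As a small self-contained sketch of how rbind() lines up the columns: for data-frames, columns are matched by name rather than by position, so the new rows may list their columns in a different order. The data-frames df_a, df_b and df_c below are made up for illustration and are not part of the data_test example.

```r
# Hypothetical data, only for illustration.
df_a <- data.frame(Name = c("Piotr", "Pawel"), Score = c(80, 88))
# Same columns, but listed in a different order.
df_b <- data.frame(Score = 66, Name = "Ricardo")

# rbind() matches the columns of data-frames by name.
df_all <- rbind(df_a, df_b)
print(df_all)

# Column names that do not match make rbind() fail.
df_c <- data.frame(Name = "Anna", Points = 80)
res <- try(rbind(df_a, df_c), silent = TRUE)
inherits(res, "try-error")
```

Wrapping the failing call in try() is only done here so that the example runs to the end; in interactive use the error message itself is usually the quickest hint that a column name is misspelled.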
Merging Data-frames
Merging extracts the subset of two data-frames in which a given set of columns match.
    data_test.1 <- data.frame(
      Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
      Gender = c("Male", "Male", "Female", "Female", "Female"),
      Score  = c(78, 88, 92, 89, 84),
      Age    = c(42, 38, 26, 30, 35)
      )
    data_test.2 <- data.frame(
      Name   = c("Piotr", "Pawel", "notPaula", "notLisa", "Laura"),
      Gender = c("Male", "Male", "Female", "Female", "Female"),
      Score  = c(78, 88, 92, 89, 84),
      Age    = c(42, 38, 26, 30, 135)
      )
    data_test.merged <- merge(x = data_test.1, y = data_test.2,
                              by.x = c("Name", "Age"), by.y = c("Name", "Age"))
    # Only records that match in name and age are in the merged table:
    print(data_test.merged)
    ##    Name Age Gender.x Score.x Gender.y Score.y
    ## 1 Pawel  38     Male      88     Male      88
    ## 2 Piotr  42     Male      78     Male      78
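By default, merge() keeps only the matching rows (an inner join). The sketch below uses two small made-up data-frames to show the by argument as a shorthand when the key column has the same name on both sides, and all = TRUE to keep the unmatched rows as well, filled with NA.

```r
# Hypothetical data, only for illustration.
left  <- data.frame(Name  = c("Piotr", "Pawel", "Paula"),
                    Score = c(78, 88, 92))
right <- data.frame(Name = c("Piotr", "Pawel", "Laura"),
                    Age  = c(42, 38, 35))

# Inner join: only the names present in both data-frames.
m_inner <- merge(left, right, by = "Name")
print(m_inner)

# Full outer join: all names, with NA where a value is missing.
m_full <- merge(left, right, by = "Name", all = TRUE)
print(m_full)
```

The arguments all.x = TRUE and all.y = TRUE work in the same way, but keep the unmatched rows of only the first or only the second data-frame.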
Short-cuts
R allows the use of short-cuts, provided that they are unique. For example, in the data-frame data_test there is a column Name. There are no other columns whose names start with the letter “N”; hence, this one letter is enough to address this column.
    data_test$N
    ## [1] Piotr Pawel Paula Lisa  Laura
    ## Levels: Laura Lisa Paula Pawel Piotr
Use “short-cuts” sparingly and only when working interactively (not in functions or code that will be saved and re-run later). When another column is added later, the short-cut may no longer be unique; the behaviour then becomes hard to predict, and it is even harder to spot the programming error in a part of your code that previously worked fine.
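To make this warning concrete, here is a minimal sketch with a made-up data-frame. Once a second column starting with “N” exists, the short-cut no longer matches a single column and $ returns NULL without raising an error. Writing the full name, or using [[ ]] (which matches names exactly by default), avoids the problem.

```r
# Hypothetical data, only for illustration.
df <- data.frame(Name = c("Piotr", "Pawel"), Score = c(80, 88))
before <- df$N        # partial match: returns the column Name
print(before)

# Add a second column whose name also starts with "N".
df[["Nationality"]] <- c("PL", "PL")
after <- df$N         # ambiguous: returns NULL, without an error
print(after)

# Safer alternatives: the full name, or [[ ]] with exact matching.
print(df$Name)
print(df[["Name"]])
print(df[["N"]])      # NULL: [[ ]] does not match partially by default
```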
Naming Rows and Columns
In the preceding code, we named the columns when we created the data-frame. It is also possible to do that later or to change column names, and it is even possible to name each row individually.
    # Get the column names and the row names.
    colnames(data_test)
    ## [1] "Name"     "Gender"   "Score"    "Age"      "End_date"
    rownames(data_test)
    ## [1] "1" "2" "3" "4" "5"
    colnames(data_test)[2]
    ## [1] "Gender"
    rownames(data_test)[3]
    ## [1] "3"

    # assign new names
    colnames(data_test)[1] <- "first_name"
    rownames(data_test) <- LETTERS[1:nrow(data_test)]
    print(data_test)
    ##   first_name Gender Score Age   End_date
    ## A      Piotr   Male    80  42 2014-03-01
    ## B      Pawel   Male    88  38 2017-02-13
    ## C       <NA> Female    92  26 2014-10-10
    ## D       Lisa Female    89  30 2015-05-10
    ## E      Laura Female    84  35 2010-08-25
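Renaming by position, as in colnames(data_test)[1], renames the wrong column if the column order ever changes. A slightly more defensive sketch, on a made-up data-frame, looks the column up by its current name first and then uses the named rows for indexing:

```r
# Hypothetical data, only for illustration.
df <- data.frame(Name = c("Piotr", "Pawel"), Score = c(80, 88))

# Rename the column called "Name", wherever it is located.
colnames(df)[colnames(df) == "Name"] <- "first_name"

# Row names must be unique; here we reuse the (unique) first names.
rownames(df) <- as.character(df$first_name)
print(df)

# Named rows can then be used for indexing.
pawel_score <- df["Pawel", "Score"]
print(pawel_score)
```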
1 Create a 3 by 3 matrix with the numbers 1 to 9,
2 Convert it to a data-frame,
3 Add names for the columns and rows,
4 Add a column with the column-totals,
5 Drop the second column.
4.3.9 Strings or the Character-type
Strings are called the “character-type” in R. They follow some simple rules:
strings must start and end with single or double quotes,
a string ends when the same quotes are encountered the next time,
until then it can contain the other type of quotes.
Example: Using strings
    a <- "Hello"
    b <- "world"
    paste(a, b, sep = ", ")
    ## [1] "Hello, world"
    c <- "A 'valid' string"
In many cases, we do not need anything between the strings that are concatenated. We can of course supply an empty string as separator (sep = ""), but it is also possible to use the function paste0():
    paste0(12, '%')
    ## [1] "12%"
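Both paste() and paste0() are vectorised, and their collapse argument joins the resulting vector into one single string. A short sketch with made-up values:

```r
# Hypothetical values, only for illustration.
pct <- c(12, 34, 56)

# paste0() works element by element on vectors.
labels <- paste0(pct, "%")
print(labels)

# paste0(x, y) gives the same result as paste(x, y, sep = "").
same <- identical(paste0("a", "b"), paste("a", "b", sep = ""))
print(same)

# collapse joins all elements into one single string.
one_line <- paste(labels, collapse = ", ")
print(one_line)
```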
Formatting with format()
In many cases, it will be useful to format a date or number consistently and neatly in