The Big R-Book - Philippe J. S. De Brouwer

… Chapter 17 “Data Wrangling in the tidyverse” on page 265.

ggplot2 is a system for creating graphics with a philosophy: it adheres to a “Grammar of Graphics” and can produce really stunning results at a reasonable price (it is a notch more abstract to use than the core-R functionality). For both reasons, we will talk more about it in the sections about reporting: see Chapter 31 “A Grammar of Graphics with ggplot2” on page 687.

readr expands R's standard⁵ functionality to read in rectangular⁶ data. It is more robust, knows more data types, and is faster than the core-R functionality. For more information, see Chapter 17.1.2 “Importing Flat Files in the Tidyverse” on page 267 and its subsections.

purrr is casually mentioned in the section about the OO model in R (see Chapter 6 on page 87), and extensively used in Chapter 25.1 “Model Quality Measures” on page 476. It is a rather complete and consistent set of tools for working with functions and vectors. Using purrr, it should be possible to replace most loops with calls to purrr functions that will work faster.

tibble is a new take on the data frame of core-R. It provides a new base type: tibbles. Tibbles are in essence data frames that do a little less (so there is less clutter on the screen and fewer unexpected things happen) but give more feedback (they show what went wrong instead of assuming that you have read all the manuals and remember everything). Tibbles are introduced in the next section.

stringr expands the standard functions to work with strings and provides a nice, coherent set of functions that all start with str_. The package is built on top of stringi, which uses the ICU library that is written in C, so it is fast too. For more information, see Chapter 17.5 “String Manipulation in the tidyverse” on page 299.

forcats provides tools to address common problems when working with categorical variables⁷.
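To make three of the package descriptions above more concrete, here is a minimal sketch. It assumes that purrr, stringr, and forcats are installed; the data (values, fruit, sizes) are hypothetical and chosen only for illustration.

```r
library(purrr)
library(stringr)
library(forcats)

# purrr: an explicit loop and its map_dbl() equivalent
values <- list(a = 1:3, b = 4:6, c = 7:9)        # hypothetical data
loop_means <- numeric(length(values))
for (i in seq_along(values)) {
  loop_means[i] <- mean(values[[i]])
}
map_means <- map_dbl(values, mean)               # named numeric vector

# stringr: the functions share the str_ prefix and take the string first
fruit <- c("apple", "banana", "cherry")          # hypothetical data
upper   <- str_to_upper(fruit)
has_an  <- str_detect(fruit, "an")
lengths <- str_length(fruit)

# forcats: reorder the levels of a factor by how often they occur
sizes <- factor(c("small", "large", "small", "medium", "small", "large"))
levels(sizes)                                    # alphabetical by default
sizes_by_freq <- fct_infreq(sizes)
levels(sizes_by_freq)                            # most frequent level first
```

The map_dbl() call returns the same numbers as the loop and keeps the names of the list, which is one reason such calls tend to read more compactly than the loop they replace.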

7.2.2 The Non-core Tidyverse

Besides the core tidyverse packages, which are loaded with the command library(tidyverse), there are many other packages that are part of the tidyverse. In this section, we briefly describe the most important ones.

Importing data: readxl (for .xls and .xlsx files) and haven (for SPSS, Stata, and SAS data).⁸

Wrangling data: lubridate for dates and date-times, hms for time-of-day values, blob for storing binary data. lubridate, for example, is discussed in Chapter 17.6 “Dates with lubridate” on page 314.

Programming: purrr for iterating within R objects, magrittr for the famous pipe operator %>% plus some more specialised piping operators (like %$% and %<>%), and glue for an enhancement to the paste() function.

Modelling: this is not really ready, though recipes and rsample are already operational and show the direction this is taking. The aim is to replace modelr⁹. Note that there is also the package broom, which turns models into tidy data.

Warning: Work in progress

While the core tidyverse is stable, the non-core packages still tend to change and improve. Check their online documentation when using them.
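As a rough illustration of the programming and date helpers listed in this section, the sketch below shows the three magrittr pipes, glue, and a lubridate date calculation. It assumes that magrittr, glue, and lubridate are installed; all variable names and values are hypothetical. In line with the warning above, check the current package documentation before relying on the details.

```r
library(magrittr)
library(glue)
library(lubridate)

x <- c(1, 4, 9, 16)                      # hypothetical data

# %>% passes the left-hand side as the first argument of the next call
total <- x %>% sqrt() %>% sum()          # same as sum(sqrt(x))

# %$% exposes the columns of a data frame to the expression on the right
df <- data.frame(a = 1:5, b = (1:5)^2)
r  <- df %$% cor(a, b)                   # same as cor(df$a, df$b)

# %<>% pipes and assigns the result back to the left-hand side
x %<>% sqrt()                            # same as x <- sqrt(x)

# glue: R expressions between braces are evaluated inside the string
pkg <- "glue"
msg <- glue("{pkg} evaluates R code inside braces: 1 + 1 = {1 + 1}")

# lubridate: parse a date and do calendar arithmetic (2020 is a leap year)
d      <- ymd("2020-02-28")
d_plus <- d + days(2)
```

The glue() call plays the role that paste() or paste0() would play in core R, with the variable references kept inside the string itself.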

7.3 Working with the Tidyverse

7.3.1 Tibbles

Tibbles are in many respects a special type of data frame. They do the same as data frames (i.e. store rectangular data), but they have some advantages.

Let us dive in and create a tibble. Imagine for example that we want to show the sum of the sine and cosine functions. The output of the code below is in Figure 7.1 on this page.

    x <- seq(from = 0, to = 2 * pi, length.out = 100)
    s <- sin(x)
    c <- cos(x)
    z <- s + c
    plot(x, z, type = "l", col = "red", lwd = 7)
    lines(x, c, col = "blue", lwd = 1.5)
    lines(x, s, col = "darkolivegreen", lwd = 1.5)

Figure 7.1: The sum of sine and cosine illustrated.

Imagine further that our purpose is not only to plot these functions, but to use them in other applications. Then it would make sense to put them in a data frame. The following code does exactly the same using a data frame.

    x <- seq(from = 0, to = 2 * pi, length.out = 100)
    # cbind() adds each vector as a column (rbind() would stack rows instead)
    df <- cbind(as.data.frame(x), cos(x), sin(x), cos(x) + sin(x))
    # plot etc.
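The “plot etc.” step is left open in the text. One way to fill it in is sketched below. The column names (sine, cosine, total) are hypothetical and not from the original, and the colours and line widths are simply copied from the first example.

```r
x  <- seq(from = 0, to = 2 * pi, length.out = 100)

# Hypothetical column names; the original leaves the columns unnamed.
df <- data.frame(x = x, sine = sin(x), cosine = cos(x))
df$total <- df$sine + df$cosine

# Same plot as before, now reading every series from the data frame
plot(df$x, df$total, type = "l", col = "red", lwd = 7,
     xlab = "x", ylab = "z")
lines(df$x, df$cosine, col = "blue", lwd = 1.5)
lines(df$x, df$sine, col = "darkolivegreen", lwd = 1.5)
```

Naming the columns up front keeps the plotting calls readable and makes the data frame easier to reuse in other applications.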

This is already more concise. With the tidyverse, it would look as follows (still without using the piping):

    library(tidyverse)
    x <- seq(from = 0, to = 2 * pi, length.out = 100)
    tb <- tibble(x, sin(x), cos(x), cos(x) + sin(x))
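One difference is already visible at this point: how the columns are named. The sketch below builds the same four columns with tibble() and with data.frame() and compares the names. It loads only the tibble package rather than the whole tidyverse, and tb2 with its column names is a hypothetical variant added for illustration.

```r
library(tibble)

x  <- seq(from = 0, to = 2 * pi, length.out = 100)
tb <- tibble(x, sin(x), cos(x), cos(x) + sin(x))
df <- data.frame(x, sin(x), cos(x), cos(x) + sin(x))

# tibble() keeps the expressions as column names;
# data.frame() rewrites them into syntactic names (e.g. "sin.x.")
names(tb)
names(df)

# Non-syntactic names must be quoted with backticks
first_sines <- head(tb$`sin(x)`, 3)

# Hypothetical variant with explicit names; tibble() lets later columns
# refer to earlier ones
tb2 <- tibble(x = x, sine = sin(x), cosine = cos(x), total = sine + cosine)
```

Explicit names, as in tb2, avoid the backticks altogether and are usually the more practical choice when the tibble is used beyond a quick plot.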

The code below first prints the tibble in the console and then plots the results in Figure 7.2 on this page.

Figure 7.2: A tibble plots itself like a data-frame.

The code with a tibble is just a notch shorter, but that is not the point here. The main advantage of using a tibble is that it will usually do things that make more sense for the modern R user. For example, consider how a tibble prints itself (compared to what a data frame does).

    # Note how concise and relevant the output is:
    print(tb)
    ## # A tibble: 100 x 4
    ##         x `sin(x)` `cos(x)` `cos(x) + sin(x)`
    ##     <dbl>    <dbl>    <dbl>             <dbl>
    ##  1 0        0         1                  1
    ##  2 0.0635   0.0634    0.998              1.06
    ##  3 0.127    0.127     0.992              1.12
    ##  4 0.190    0.189     0.982              1.17
    ##  5 0.254    0.251     0.968              1.22
    ##  6 0.317    0.312     0.950              1.26
    ##  7 0.381    0.372     0.928              1.30
    ##  8 0.444    0.430     0.903              1.33
    ##  9 0.508    0.486     0.874              1.36
    ## 10 0.571    0.541     0.841              1.38
    ## # … with 90 more rows

    # This does the same as for a data-frame:
