The Big R-Book. Philippe J. S. De Brouwer


<- 2 * x + 4 + rnorm(10, mean = 0, sd = 0.5)) %>%
  lm(y ~ x)
## Error in as.data.frame.default(data): cannot coerce class '"formula"' to a data.frame

      The code above fails because R does not automatically add something like data = t, with t standing for the data as defined up to the previous line. The function lm() expects the formula as its first argument, whereas the pipe operator puts the data in the first argument. Therefore, magrittr provides a special pipe operator that exposes the variables of the data frame from the previous line, so that they can be addressed directly: the exposition pipe %$%.
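The difference can be seen by fitting the same model twice: once with the exposition pipe, and once with the ordinary pipe combined with magrittr's dot placeholder, which places the data explicitly in the data argument. This is a minimal sketch, assuming magrittr is installed; the data frame `d` and the seed are chosen here for illustration and do not come from the text.

```r
library(magrittr)

set.seed(1)  # illustrative seed, only to make the sketch reproducible
d <- data.frame(x = runif(10))
d$y <- 2 * d$x + 4 + rnorm(10, mean = 0, sd = 0.5)

# Ordinary pipe: the dot tells magrittr where the data should go,
# so it is no longer inserted as the first argument of lm().
fit_dot <- d %>% lm(y ~ x, data = .)

# Exposition pipe: the columns of d are exposed to the expression,
# so lm() can find x and y without a data argument.
fit_exp <- d %$% lm(y ~ x)

# Both approaches fit the same model.
all.equal(coef(fit_dot), coef(fit_exp))
```

Both forms avoid the coercion error; which one reads better is largely a matter of taste.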

      # The Tidyverse only makes the %>% pipe available. So, to use the
      # special pipes, we need to load magrittr
      library(magrittr)
      ##
      ## Attaching package: ‘magrittr’
      ## The following object is masked from ‘package:purrr’:
      ##
      ##     set_names
      ## The following object is masked from ‘package:tidyr’:
      ##
      ##     extract

      lm2 <- tibble("x" = runif(10)) %>%
        within(y <- 2 * x + 4 + rnorm(10, mean = 0, sd = 0.5)) %$%
        lm(y ~ x)
      summary(lm2)
      ##
      ## Call:
      ## lm(formula = y ~ x)
      ##
      ## Residuals:
      ##     Min      1Q  Median      3Q     Max
      ## -0.6101 -0.3534 -0.1390  0.2685  0.8798
      ##
      ## Coefficients:
      ##             Estimate Std. Error t value Pr(>|t|)
      ## (Intercept)   4.0770     0.3109  13.115 1.09e-06 ***
      ## x             2.2068     0.5308   4.158  0.00317 **
      ## ---
      ## Signif. codes:
      ## 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
      ##
      ## Residual standard error: 0.5171 on 8 degrees of freedom
      ## Multiple R-squared: 0.6836, Adjusted R-squared: 0.6441
      ## F-statistic: 17.29 on 1 and 8 DF, p-value: 0.003174

      coeff <- tibble("x" = runif(10)) %>%
        within(y <- 2 * x + 4 + rnorm(10, mean = 0, sd = 0.5)) %$%
        lm(y ~ x) %>%
        summary %>%
        coefficients
      coeff
      ##             Estimate Std. Error   t value     Pr(>|t|)
      ## (Intercept) 4.131934  0.2077024 19.893534 4.248422e-08
      ## x           1.743997  0.3390430  5.143882 8.809194e-04

      Note – Using functions without brackets

      Note how we can omit the brackets for functions that take no argument other than the piped value (here summary and coefficients).
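As a small illustration of this note, the sketch below (assuming magrittr is installed; the vector `v` is made up for the example) shows that the bracketed and unbracketed forms give the same result, and that brackets remain necessary as soon as a further argument is passed.

```r
library(magrittr)

v <- c(1, 4, 9)

# With and without brackets: both call sqrt() and then sum() on the piped value.
with_brackets    <- v %>% sqrt() %>% sum()
without_brackets <- v %>% sqrt %>% sum

identical(with_brackets, without_brackets)

# A function that needs an additional argument must keep its brackets.
rounded <- c(1.234, 5.678) %>% round(1)
rounded
```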

      7.3.4.2 The T-Pipe

      [Figure: a linear model fit on generated data, illustrating the piping command.]
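A minimal sketch of the idea behind the T-pipe, assuming magrittr's `%T>%` operator, which returns its left-hand side rather than the result of the function on its right. This makes it suitable for steps that are called only for their side effect, such as plotting or printing a structure, in the middle of a chain. The vector `x` and the use of `str()` are illustrative choices, not taken from the text.

```r
library(magrittr)

x <- c(1, 2, 3)

# str() is called for its side effect and returns NULL.
# With the ordinary pipe, that NULL is what sum() receives.
total_plain <- x %>% str %>% sum

# With the T-pipe, str() still runs, but x itself is passed on to sum().
total_tee <- x %T>% str %>% sum

total_plain
total_tee
```

The same pattern is commonly used with plot(), which also returns NULL, so that a chain can draw a figure and still continue with the data.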

      7.3.4.3 The Assignment Pipe

      This last variation of the pipe operator combines the pipe with an assignment: the result of the chain is written back to the variable on the left-hand side, so the first line does not need to repeat the variable name.

      x <- c(1, 2, 3)

      # The following line:
      x <- x %>% mean

      # is equivalent to the following:
      x %<>% mean

      # Show x:
      x
      ## [1] 2

      Note that the original value of x is overwritten.

      Warning – Assignment pipe

      We recommend using this pipe operator only when no confusion is possible. We also argue that this pipe operator makes code less readable, while not really making the code shorter.
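To make the trade-off concrete, the following sketch (assuming magrittr is installed; the vectors `v` and `w` are invented for the example) contrasts the assignment pipe, which overwrites its left-hand side, with an explicit assignment to a new name, which keeps the original available.

```r
library(magrittr)

# Assignment pipe: v is sorted and overwritten in one step.
v <- c(3, 1, 2)
v %<>% sort      # same effect as: v <- v %>% sort
v

# Explicit alternative: the original w survives next to the sorted copy.
w <- c(3, 1, 2)
w_sorted <- w %>% sort
w
w_sorted
```

The explicit form is only a few characters longer and makes it obvious that a variable is being assigned.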

      7.3.5 Conclusion

      The piping operator provides neither a speed increase nor a memory advantage over code that creates a new variable at every line. R has fairly good memory management and only copies columns when they are actually modified. For example, look carefully at the following:

      library(pryr)
      x <- runif(100)
      object_size(x)
      ## 840 B
      y <- x

      # x and y together do not take more memory than only x.
      object_size(x, y)
      ## 840 B
      y <- y * 2

      # Now, they are different and are stored separately in memory.
      object_size(x, y)
      ## 1.68 kB
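The same check can be applied to the columns of a data frame. The sketch below assumes the pryr package is installed, as in the example above; the data frames `d1` and `d2` and their sizes are illustrative. After a single column of the copy is modified, the two data frames together should need less memory than the sum of their individual sizes, because the untouched column is still shared.

```r
library(pryr)

d1 <- data.frame(a = runif(1e4), b = runif(1e4))
d2 <- d1                    # no data is copied at this point
d2[, 1] <- d2[, 1] * 2      # only column a of d2 is modified

# Memory needed if d1 and d2 shared nothing at all:
separate <- as.numeric(object_size(d1)) + as.numeric(object_size(d2))

# Memory actually needed for both objects together:
together <- as.numeric(object_size(d1, d2))

# TRUE when the unmodified column b is still shared between d1 and d2.
together < separate
```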

      The piping operator can be confusing at first and is not strictly necessary, except for reading code that uses it. However, once one is used to it, it makes code more readable and also shorter. Finally, it allows the reader of the code to focus on what is going on: the actions rather than the data, since the data is passed along invisibly.
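The readability argument can be illustrated by writing one small computation in three styles. This is a sketch under the assumption that magrittr is installed; the vector `x` and the chosen functions are arbitrary.

```r
library(magrittr)

x <- c(1, 4, 9, 16)

# 1. Nested calls: must be read from the inside out.
nested <- round(mean(sqrt(x)), 1)

# 2. Intermediate variables: readable, but each step needs a name.
roots    <- sqrt(x)
avg_root <- mean(roots)
stepwise <- round(avg_root, 1)

# 3. Piped: the actions appear in the order in which they are applied.
piped <- x %>% sqrt %>% mean %>% round(1)

identical(nested, stepwise) && identical(stepwise, piped)
```

All three styles compute the same value; the piped version differs only in how the sequence of actions is presented to the reader.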

      Hint – Use pipes sparingly

      Pipes are like spices in the kitchen. Use them, but do so in moderation. A good rule of thumb is that a chain of about five lines is enough, and simple one-line commands do not need to be broken into several lines just to use a pipe.

      1 According to the TIOBE index (see https://www.tiobe.com/tiobe-index), R is the 14th most popular programming language and still on the rise.

      2 More information can be found in this article by Hadley Wickham: https://tidyverse.tidyverse.org/articles/manifesto.html.
