The Big R-Book. Philippe J. S. De Brouwer


      "pi_value" = pi)
      unlist(L)
      ##
      ##   1.000000   2.000000 -10.000000  -9.000000  -8.000000
      ##   pi_value
      ##   3.141593

      Apart from performance considerations, it might also be necessary to convert parts of a list to a vector, because some functions will expect vectors and will not work on lists.
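      As a minimal sketch of the second point, the base function sum() rejects a list but accepts the vector returned by unlist(). The list num_list is a made-up example and is not the list used earlier in this section.

```r
# Hypothetical list of numbers (an assumption for illustration)
num_list <- list(1, 2, 3)

# sum() does not accept a list, so we catch the error it raises
res <- tryCatch(sum(num_list), error = function(e) "error")
print(res)

# After flattening the list to a numeric vector, sum() works
v <- unlist(num_list)
total <- sum(v)
print(total)
```

      The same pattern applies to other functions that expect atomic vectors: flatten first with unlist(), then call the function.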

      4.3.7 Factors


      4.3.7.1 Creating Factors

      Factors are created using the function factor().


      # Create a vector containing all your observations:
      feedback <- c('Good', 'Good', 'Bad', 'Average', 'Bad', 'Good')

      # Create a factor object:
      factor_feedback <- factor(feedback)

      # Print the factor object:
      print(factor_feedback)
      ## [1] Good    Good    Bad     Average Bad     Good
      ## Levels: Average Bad Good

      # Plot the bar chart -- note the default order is alphabetic
      plot(factor_feedback)

      Figure: The plot function produces a bar chart for a factor object.


      # The nlevels function returns the number of levels:
      print(nlevels(factor_feedback))
      ## [1] 3
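      To complement nlevels(), the following sketch inspects the same factor with the base functions levels(), as.integer() and table(). It only illustrates how a factor stores its data; the variable names lvls, codes and counts are chosen here for illustration.

```r
feedback <- c("Good", "Good", "Bad", "Average", "Bad", "Good")
factor_feedback <- factor(feedback)

# The distinct levels, in their default (alphabetical) order
lvls <- levels(factor_feedback)
print(lvls)

# Each observation is stored as the integer position of its level
codes <- as.integer(factor_feedback)
print(codes)

# Number of observations per level
counts <- table(factor_feedback)
print(counts)
```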

      Digression – The reduced importance of factors

      When R was in its infancy, neither computing power nor memory was at today's level, and in most cases it made sense to coerce strings to factors. For example, before R 4.0.0 the base-R functions that load data into a data frame (i.e. two-dimensional data) silently converted strings to factors by default. Today, that is most probably not what you need. Therefore, we recommend making it a habit to use the functions from the tidyverse (see Chapter 7 "Tidy R with the Tidyverse" on page 161).
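      In base R this conversion is controlled by the stringsAsFactors argument of data.frame(). The sketch below sets the argument explicitly, so the result does not depend on the default of the installed R version. The column names and values are made up for illustration.

```r
# Hypothetical two-column data (an assumption for illustration)
df_fct <- data.frame(name  = c("a", "b", "a"),
                     score = c(1, 2, 3),
                     stringsAsFactors = TRUE)
df_chr <- data.frame(name  = c("a", "b", "a"),
                     score = c(1, 2, 3),
                     stringsAsFactors = FALSE)

# With stringsAsFactors = TRUE the strings are coerced to a factor
print(class(df_fct$name))

# With stringsAsFactors = FALSE the strings stay character
print(class(df_chr$name))
```

      Being explicit about this argument is one way to avoid surprises when the same script runs on different R versions.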

      4.3.7.2 Ordering Factors

      In the example about creating a factor object for feedback, one will have noticed that the plot function shows the labels in alphabetical order and not in an order that would be logical for us humans. It is possible to impose a certain order on the labels by providing the levels, in the correct order, while creating the factor object.

      feedback <- c('Good', 'Good', 'Bad', 'Average', 'Bad', 'Good')
      factor_feedback <- factor(feedback, levels = c("Bad", "Average", "Good"))
      plot(factor_feedback)

      In Figure 4.2 on page 63 we notice that the order is now as desired (it is the order that we provided via the argument levels in the function factor()).
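      Providing levels only fixes the display order. If the categories should also be comparable, factor() accepts ordered = TRUE. The following is a small sketch of that option, not part of the example above; the names ordered_feedback, is_better and sorted are chosen here for illustration.

```r
feedback <- c("Good", "Good", "Bad", "Average", "Bad", "Good")
ordered_feedback <- factor(feedback,
                           levels  = c("Bad", "Average", "Good"),
                           ordered = TRUE)

# The levels are kept in the order we supplied
print(levels(ordered_feedback))

# Comparisons are defined for ordered factors: "Good" versus "Bad"
is_better <- ordered_feedback[1] > ordered_feedback[3]
print(is_better)

# Sorting follows the level order, not the alphabet
sorted <- as.character(sort(ordered_feedback))
print(sorted)
```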

      Generate Factors with the Function gl()

      Function use for gl()

      gl(n, k, length = n*k, labels = seq_len(n), ordered = FALSE) with

       n: The number of levels

       k: The number of replications (for each level)

       length (optional): An integer giving the length of the result

       labels (optional): A vector with the labels

       ordered: A boolean variable indicating whether the resulting factor should be ordered.
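      The arguments listed above can be tried out with a few short calls. This is a minimal sketch; the labels are made up for illustration.

```r
# Two levels, each replicated three times in a block
g1 <- gl(2, 3, labels = c("Low", "High"))
print(g1)

# With k = 1 and a longer length, the pattern of levels is repeated
g2 <- gl(2, 1, length = 6, labels = c("Low", "High"))
print(g2)

# ordered = TRUE returns an ordered factor
g3 <- gl(3, 2, labels = c("Bad", "Average", "Good"), ordered = TRUE)
print(g3)
```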


      Question #4

      Use the dataset mtcars (from the package datasets, which ships with R) and explore the distribution of the number of gears. Then explore the correlation between gears and transmission.

      Question #5

      Then focus on the transmission and create a factor-object with the words “automatic” and “manual” instead of the numbers 0 and 1.

      Use ?mtcars to find out the exact definition of the data.


      Question #6

      Use the dataset mtcars (from the package datasets, which ships with R) and explore the distribution of the horsepower (hp). How would you proceed to create a factor (e.g. Low, Medium, High) for this attribute? Hint: Use the function cut().
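      As a starting point for the hint, here is a minimal sketch of how cut() bins a numeric vector into a factor. The vector x and the break points are made up and deliberately not taken from mtcars, so the question itself remains open.

```r
# Hypothetical measurements (an assumption, not taken from mtcars)
x <- c(3, 12, 25, 7, 18, 30)

# The break points define the intervals (0,10], (10,20] and (20,30]
x_class <- cut(x,
               breaks = c(0, 10, 20, 30),
               labels = c("Low", "Medium", "High"))
print(x_class)

# Number of observations per class
print(table(x_class))
```

      By default the intervals are closed on the right, so the value 30 falls into the last class.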


      4.3.8 Data Frames

      4.3.8.1
