The Big R-Book. Philippe J. S. De Brouwer

Чтение книги онлайн.

Читать онлайн книгу The Big R-Book - Philippe J. S. De Brouwer страница 41

The Big R-Book - Philippe J. S. De Brouwer

Скачать книгу

1.7374 ## 7 1999-01-27 1.7526 ## 8 1999-01-26 1.7609 ## 9 1999-01-25 1.7620 ## 10 1999-01-22 1.7515 ## 11 1999-01-21 1.7529 ## 12 1999-01-20 1.7626 ## 13 1999-01-19 1.7739 ## 14 1999-01-18 1.7717 ## 15 1999-01-15 1.7797 ## 16 1999-01-14 1.7707 ## 17 1999-01-13 1.8123 ## 18 1999-01-12 1.7392 ## 19 1999-01-11 1.7463 ## 20 1999-01-08 1.7643 ## 21 1999-01-07 1.7602 ## 22 1999-01-06 1.7711 ## 23 1999-01-05 1.7965 ## 24 1999-01-04 1.8004Bar chart depicts the histogram of the most recent values of the CAD only.

      image Warning – Silently added rows

      Without the row.names = FALSE statement, the function write.csv() would add a row that will get the name “X.”

      4.8.2 Excel Files

       Excel

       .xlx

      Importing and xlsx-file is very similar to importing a CSV-file.

      # install the package xlsx if not yet done if (!any(grepl(“xlsx”,installed.packages()))){ install.packages(“xlsx”)} library(xlsx) data <- read.xlsx(“input.xlsx”, sheetIndex = 1)

      4.8.3 Databases

      Spreadsheets and CSV-file are good carriers for reasonably small datasets. Any company that holds a lot of data will use a database system – see for example Chapter 13RDBMS” on page 285. Importing data from a database system is somewhat different. The data is usually structured in “tables” (logical units of rectangular data); however, they will seldom contain all information that we need and usually have too much rows.We need also some protocol to communicate with the database: that is the role of the structured query language (SQL) – see Chapter 14SQL on page 291.

       database – MySQL

      R can connect to many popular database systems. For example, MySQL: as usual there is a package that will provide this functionality.

       MySQL

      if(!any(grepl(“xls”, installed.packages()))){ install.packages(“RMySQL”)} library(RMySQL)

       RMySQL

      Connecting to the Database

      This is explained in more detail in Chapter 15Connecting R to an SQL Database” on page 327, but the code segment below will already get you the essential ingredients.

      # The connection is stored in an R object myConnection and # it needs the database name (db_name), username and password myConnection = dbConnect(MySQL(), user = ‘root’, password = ‘xxx’, dbname = ‘db_name’, host = ‘localhost’) # e.g. list the tables available in this database. dbListTables(myConnection)

       dbConnect()

       dbSendQuery()

      Fetching Data Drom a Database

      Once connected to the database, we can select and download the data that we will use for our analysis in R.

      Update Queries

      It is also possible to manipulate data in the database directly from R. This allows use to prepare data in the database first and then download it and do our analysis, while we keep all the code in one file.

       fetch()

      The dbSendQuery() function can be used to send any query, including UPDATE, INSERT, CREATE TABLE and DROP TABLE queries so we can push results back to the database.

      sSQL = “” sSQL[1] <- “UPDATE tbl_students SET score = ‘A’ WHERE raw_score > 90;” sSQL[2] <- “INSERT INTO tbl_students (name, class, score, raw_score) VALUES (‘Robert’, ‘Grade 0’, 88,NULL);” sSQL[3] <- “DROP TABLE IF EXISTS tbl_students;” for (k in c(1:3)){ dbSendQuery(myConnection, sSQL[k]) }

       dbSendQuery()

      Create Tables from R Data-frames

      Finally, it is possible to write back data from a dataset in R to a table in a database.

      dbWriteTable(myConnection, “tbl_name”, data_frame_name[, ], overwrite = TRUE)

       dbWriteTable()

      image Warning – Closing the database connection

      Even while connections will be automatically closed when the scope of the database object is lost, it is a good idea to close a connection explicitly. Closing a connection explicitly, makes sure that it is closed and does not remain open if something went wrong, and hence the resources of our RDBMS are freed. Closing a database connection can be done with dbDisconnect(myConnection, …).

       dbDisconnect()

      1 1 There are different package management systems for different flavours of Linux and discussing them all is not only beyond the scope of this book, but

Скачать книгу