The Big R-Book. Philippe J. S. De Brouwer
4 The Basics of R
In this book we will approach data and analytics from a practitioner's point of view, and our tool of choice is R. R is in some sense a re-implementation of S – a programming language written in 1976 by John Chambers at Bell Labs – with added lexical scoping semantics. Usually, code written in S will also run in R.
R is a modern language with a rather short history. In 1992, the R-project was started by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. The first version was available in 1995 and the first stable version was available in 2000.
Now, the R Development Core Team (of which Chambers is a member) develops R further and maintains the code. For several years, Microsoft has embraced the project and has provided MRAN (Microsoft R Application Network). This distribution is also free and open source software (FOSS) and has some advantages over standard R, such as enhanced performance (e.g. multi-thread support) and the checkpoint package, which makes results more reproducible.
Essentially, R is …
a programming language built for statistical analysis, graphics representation and reporting;
an interpreted computer language that allows branching, looping, and modular programming, and that offers both object-oriented and functional programming features.
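As a small illustration of the features just listed, the following sketch combines branching, looping, a user-defined function, and a functional-style call in base R. The function name `classify` and the sample values are invented for this example only.

```r
# Hypothetical helper: label a number by its sign (branching).
classify <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

values <- c(-2, 0, 3)

# Looping: visit each element in turn and print its label.
for (v in values) {
  cat(v, "is", classify(v), "\n")
}

# Functional style: apply the function to every element in one call.
labels <- sapply(values, classify)
```

The `for` loop and the `sapply()` call do the same work. In practice the functional form is often preferred in R because it returns its result as a vector that can be used directly.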
R offers its users …
integration with the procedures written in the C, C++, .Net, Python, or FORTRAN languages for efficiency;
zero purchase cost (available under the GNU General Public License), and pre-compiled binary versions are provided for various operating systems like Linux, Windows, and Mac;
simplicity and effectiveness;
a free and open environment;
an effective data handling and storage facility;
a suite of operators for calculations on arrays, lists, vectors, and matrices;
a large, coherent, and integrated collection of tools for data analysis;
graphical facilities for data analysis and display, either directly on screen or in print;
a supportive on-line community;
the ability for you to stand on the shoulders of giants (e.g. by using libraries).
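To make the "suite of operators" in the list above a little more concrete, here is a minimal sketch using only base R. The variable names and numbers are arbitrary and chosen purely for illustration.

```r
# Arithmetic operators work element by element on vectors.
v <- c(1, 2, 3)
w <- v * 2 + 1

# Matrices: %*% is the matrix product and t() the transpose.
m <- matrix(1:4, nrow = 2)   # filled column by column
p <- m %*% t(m)

# Lists can hold elements of different types under one name.
info <- list(name = "example", size = length(v), data = v)
total <- sum(info$data)
```

No explicit loop is needed for any of these calculations, which is one reason R code for data analysis tends to stay short.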
R is arguably the most widely used statistics programming language. It is used everywhere from universities to business applications, and it is still rapidly gaining popularity.
If at any point you are trying to solve a particular issue and you are stuck, the online community will be very helpful. To get unstuck, do the following:
First, search for your problem online, adding the keyword “R” to the search string. Most probably, someone else encountered the very same problem before you, and the answer is already posted. Avoid posting a question that has been answered before.
If you need to ask your question in a forum such as www.stackexchange.com, then you will need to add a minimal reproducible example. The package reprex can help you to do just that.
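What such a minimal reproducible example might look like is sketched below: a few lines that anyone can paste into a fresh R session, with no external files. The data and the "problem" (a missing value) are invented for illustration, and the commented-out `reprex::reprex()` call assumes that the reprex package has been installed.

```r
# Small, self-contained data: no files, no extra packages.
x <- c(1, 2, NA, 4)

# The behaviour one might ask about: the mean of x is NA.
m_raw <- mean(x)

# The usual remedy: drop missing values before averaging.
m_clean <- mean(x, na.rm = TRUE)

# To render the snippet together with its output, ready for posting
# (assumes reprex is installed; best run in an interactive session):
# reprex::reprex({
#   x <- c(1, 2, NA, 4)
#   mean(x)
# })
```

Keeping the example this small makes it easy for others to run it and see exactly the same result.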
4.1 Getting Started with R
Before we can start, we need a working installation of R on our computer. On Linux, this can be done via the command line. On Debian and its many derivatives such as Ubuntu or Mint, this looks as follows:
sudo apt-get install r-base
On Windows or Mac, refer to https://cran.r-project.org and download the right package for your system.
To start R, open the command line and type R (followed by [enter]). This starts the R interpreter (or R console). You can do all your data crunching here. To leave the environment, type q() followed by [enter].
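A first session in the console could look like the sketch below. The values are arbitrary. Each line is typed at the prompt and evaluated as soon as [enter] is pressed.

```r
# Typed at the R prompt, each line is evaluated immediately.
1 + 1                      # simple arithmetic
x <- c(10, 20, 30)         # assign a vector to the name x
mean(x)                    # average of the three values
R.version.string           # shows which R version is running

# q() ends the session; it is commented out here so that the
# snippet can also be run as a script without closing R.
# q()
```

When leaving with q(), R will normally ask whether the workspace should be saved.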
It is also possible to use R online:
https://www.tutorialspoint.com/execute_r_online.php
RStudio
For users who are not familiar with the command line, it is highly recommended to use an IDE such as RStudio (see https://www.rstudio.com). Later on – for example in Chapter 32 “R Markdown” on page 879 – we will see that RStudio has some unique advantages over the R console that will convince even the most traditional command-line users.
Whether you use standard R or MRAN, using RStudio will enhance your performance and help you to be more productive.
RStudio is an integrated development environment (IDE) for R that provides a console, an editor with syntax highlighting, a window to show plots, and some workspace management.