Sports Analytics in Practice with R. Ted Kwartler


basic functionality of R is underpinned by functions and objects. Each package that specializes R comes with a set of functions, usually coordinated for a particular task such as data manipulation or obtaining sports data. Functions accept inputs, including objects, and manipulate them, most often to create new objects or to overwrite existing ones. For example, the following code creates a new object `newObj` using the assignment operator `<-` and, on the right-hand side, employs a base-R function. Base-R functions do not require any libraries to be loaded, so there is no need to specialize the R environment for a particular task. The `newObj` variable is declared as the result of the function `round` with two input parameters. The first parameter accepts the number to be rounded, `1.23`. The second parameter, `digits = 0`, is a tuning parameter that changes the behavior of `round` by declaring the number of decimal places to round to. Thus, when you add the following code to the script and execute it in the console, the resulting `newObj` variable has a value of 1. As before, the `newObj` object will be stored actively and shown in the “Environment” tab. Keep in mind that the inputs themselves can be objects, not just declared values. As a result, scripts manipulate objects and often pass them to another function later in the script.

      # Create a new object with a function
      newObj <- round(1.23, digits = 0)
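The point that inputs can themselves be objects, and that results are often handed to a later function, can be sketched as follows. The names `rawScore`, `roundedScore`, and `scoreLabel` and the value `27.456` are hypothetical, chosen only for illustration; they do not come from the book.

```r
# Hypothetical input value, stored as an object rather than typed into the call
rawScore <- 27.456

# Pass the object to round(); digits = 1 keeps one decimal place
roundedScore <- round(rawScore, digits = 1)

# Pass the new object on to another base-R function
scoreLabel <- paste("Rounded score:", roundedScore)
print(scoreLabel)
```

Only base-R functions are used here, so no library needs to be loaded before running it.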

This book illustrates many functions, both in base-R and in specialized packages, applied in a sports context. R has tens of thousands of packages with corresponding functions. The rest of this book often defers to base-R functions for the sake of standardization, stability, and ease of understanding rather than relying on an esoteric package. This is a deliberate choice to improve conceptual understanding, but it does leave room for code optimization and improvement.

| Name | Code | Description |
| --- | --- | --- |
| FOR loops | `for (i in 1:4){ print(i + 2) }` | The FOR loop has a dynamic variable `i` that updates on each pass. Here, `i` takes the values 1, 2, 3, and 4 in turn. The code within the curly brackets executes with the updated `i` value. The first time through the loop `i` equals `1`, and with `+ 2` the value `3` is printed to the console. The second time through, `i` updates to `2` and `+ 2` is applied again, so the value `4` is printed. The loop runs 4 times because of the `1:4` parameter. |
| IF statement | `if(xVal == 1){ print('xVal is equal to one.') }` | The IF statement is a control operator. After `if`, a logical statement in parentheses is checked. If the statement evaluates to TRUE, the code within the curly brackets is executed. In this example, the statement checks whether a variable `xVal` is equal to `1`. Since it is, the code in the curly brackets executes and a message is printed to the console stating “xVal is equal to one.” If the statement does not evaluate to TRUE, the code inside the curly brackets is ignored. For example, if `xVal` is `2`, the code block is not run. |
| IF ELSE statement | `if(xVal == 1){ print('xVal is equal to one.') } else { print('xVal is not equal to one.') }` | The IF-ELSE control flow adds another layer to the previous IF statement. A second set of curly brackets is added after the `else` keyword. The statement executes one of the two code chunks based on the TRUE or FALSE result of the logical statement. Here, if `xVal == 1`, the first message is printed, same as before. For any other value of `xVal`, the second chunk of code is run. For example, if `xVal` is `2`, the IF statement evaluates to FALSE and the second message “xVal is not equal to one.” is printed to the console. |
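The three constructs in the table can also be combined, with an IF-ELSE check running inside a FOR loop. This is a minimal sketch, not code from the book. The names `j`, `yVal`, and `messages` are hypothetical and were chosen so that the `i` and `xVal` objects used elsewhere are not overwritten.

```r
# Hypothetical helper vector that collects each message
messages <- character(0)

# Loop over the same 1:4 sequence used in the table
for (j in 1:4) {
  yVal <- j
  if (yVal == 1) {
    msg <- paste(yVal, "is equal to one.")
  } else {
    msg <- paste(yVal, "is not equal to one.")
  }
  print(msg)                     # show the result for this pass
  messages <- c(messages, msg)   # keep it for later use
}
```

On the first pass the IF branch runs. On the remaining three passes the ELSE branch runs, so `messages` ends up holding four entries.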

      # Check the data type of each object
      class(i)
      class(xVal)
      class(i + 0.01)

In addition to integers and numeric values, common R data types include “Boolean” values, known in R as the “logical” object type. Boolean values are simply TRUE or FALSE. R can interpret them as a condition being met or not met, as shown in the IF statements. Additionally, for some operations, R treats TRUE as 1 and FALSE as 0. For example, `TRUE + TRUE` returns `2`, while `TRUE - FALSE` returns `1` because R evaluates it as 1 - 0. Let’s create a Boolean object called `TFobj` in the code below for use later.

      TFobj <- TRUE
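The arithmetic behavior of logical values can be checked directly in the console. The sketch below also applies it to a small win/loss vector. The names `gameWon`, `totalWins`, and `winRate` and the values in the vector are hypothetical, used only for illustration.

```r
# Logical values behave as 1 (TRUE) and 0 (FALSE) in arithmetic
TRUE + TRUE
TRUE - FALSE

# Hypothetical win/loss record for four games (illustrative values only)
gameWon <- c(TRUE, FALSE, TRUE, TRUE)

totalWins <- sum(gameWon)   # counts the TRUE values
winRate <- mean(gameWon)    # share of values that are TRUE

class(gameWon)              # reports the object type of the vector
```

Because TRUE counts as 1, `sum` gives the number of wins and `mean` gives the winning proportion without any explicit conversion.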

Another data type R often utilizes is a “factor.” A factor describes information with values that are not unique. For example, a sports team may be assigned to a conference. Another team may be assigned to that same conference, so the conference is frequently a repeating value within a data set. The factor has a level, meaning the conference name, and in effect the factor level alone represents specific “meta” information such as the other teams in the conference, and perhaps even some of the team’s schedule. This meta-information is inherited as a pattern within the larger data set, not explicitly defined within the object type. While this may be confusing, it will make sense eventually as the object
