Sports Analytics in Practice with R. Ted Kwartler


are valuable, and people in your community are here for you. Your passing was a motivating reminder of the short time we have to make contributions, along with the need for more kindness toward those who may be suffering silently.

      Next, Anup B, one of the most brilliant, supportive leaders I have worked for. Your passion for cricket also helped open my eyes to a noteworthy and enjoyable sport. Losing you to the pandemic was a disturbing blow felt by many people who were touched by your intelligence, humor, and positivity.

      This entire book would not have been possible without the fine professors at the University of Notre Dame who put me on my own professional journey. I fondly remember building my first logistic regression predicting March Madness after learning these techniques from Dr. Keating, the late Dr. Gilbride, and Dr. Devaraj.

      Further, I would like to acknowledge my parents, Anatol and Trish, and my endearing wife, Meghan. Your support and patience have been significant. Writing a book is no small undertaking, with much of the logistical burden falling to each of you. Completing this book is a shared victory.

      Lastly, my sincerest gratitude to the wonderful team at Wiley, particularly Kimberly Monroe-Hill. Your patience and flexibility with late submissions and delayed seasons stemming from the unusual 2020 year in sports (among other more important hardships) have been greatly appreciated. I was ready to give up on the project, yet your e-mails demonstrated a commitment from Wiley that I cherish.

      Objectives

       Learn about R as a programming language

       Define Integrated Development Environment

       Define objects

       Learn the assignment operator

       Define functions

       Execute a loop

       Learn logical operators

       Learn about R data types

       Learn about object classes

       Index data objects

       Extend R functionality with packages

       Write a custom function

       Create a scatter plot with sports data

       Create a heatmap with sports data

      R Libraries

      ggplot2, ggthemes, RCurl, tidyr

      R Functions

      +, plot, <-, round, class, as.factor, as.character, c, cbind, rbind, data.frame, as.matrix, as.data.frame, install.packages, library, getURL, read.csv, dim, names, head, tail, summary, table, qplot, pivot_longer, geom_tile, scale_fill_gradient, xlab, ggtitle, theme, theme_hc

      The R Programming Language

      In this textbook, the R language is applied specifically to sports contexts. Of course, the code in this book can be used to extend your understanding of sports analytics. It may give you insights into a particular sport or an analytical aspect of the sport itself, such as which statistics to focus on to win a basketball game. However, learning the code in this book can also help open up a world of analytical capabilities beyond sports. One of the benefits of learning statistics, programming, and various analysis methods with sports data is that the data is widely available and outcomes are known. This means that your analysis, models, and visualizations can be applied, and you can review the outcomes as you expand upon what is covered in this book. This differs from other programming and statistical examples, which may resort to boring, synthetic data to illustrate an analytical result. Using sports data is realistic and can be future oriented, making the learning more challenging yet engaging. Modeling the survivors of the Titanic pales in comparison, since you cannot change the historical outcome or save future cruise ship passengers. Thus, modeling which team will win a match or which player is a good draft pick is a superior learning experience.

      If you are new to programming, don’t be intimidated. R is a forgiving language in that things like spacing and indentation are ignored. Further, R is well supported by its community, and a simple online search of any error message usually finds an answer quickly on any number of sites.
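To make the point about spacing and indentation concrete, here is a minimal sketch. The object names and numbers are invented for illustration and do not come from the book's data.

```r
# Hypothetical values, used only to illustrate spacing and indentation.
# The three lines below differ only in whitespace, and R treats them
# as the same instruction.
points_a <- 10 + 5
points_b<-10+5
    points_c   <-   10   +   5

# Each comparison checks that two objects hold exactly the same value.
print(identical(points_a, points_b))
print(identical(points_b, points_c))
```

One caveat worth knowing early: whitespace is ignored between symbols, not inside them. For instance, `x <- 5` assigns 5 to `x`, while `x < - 5` asks whether `x` is less than negative five.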

      To begin your R and sports analytics journey, please download the “base-R” distribution for your operating system. The “Comprehensive R Archive Network,” CRAN, is the home of the official R distribution as well as officially supported packages (more on that in a bit). The site to download base-R is https://cran.r-project.org.
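Once base-R is installed, a quick check like the following can confirm that it runs and show which package repository the session points to. This is a sketch rather than a required step, and the object names `r_version` and `cran_repo` are chosen here for illustration.

```r
# Report the version of base-R running in this session.
r_version <- R.version.string
print(r_version)

# Report the repository setting that install.packages() will use
# when downloading packages; the value depends on your local setup.
cran_repo <- getOption("repos")
print(cran_repo)
```

The exact text printed will vary with the R version and the repository configured on your machine.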

      Unfortunately, base-R, having started in the nineties, looks abysmal and lacks some modern-day functionality. Thus, you will next need to download the R-Studio Integrated Development Environment, or IDE. An IDE is software that consolidates many of the tools needed to code into one place. For example, you will need a place to write code, which could be done in a simple notepad-like program, a place to execute the code you have written, a place to view the plots the code produces, and so on. These individual components are assembled into the IDE for ease of use and fast development. R and many other languages have IDEs. In fact, R has multiple IDEs optimized for the type of analysis you are performing, such as biostatistics, or for working with another language like Java. The most popular and easily supported IDE for base-R is the R-Studio software. There are server and desktop versions available. The code in this book should work with either version, but installing base-R and R-Studio on a server is not covered. Therefore, please download the R-Studio desktop IDE by navigating to https://www.rstudio.com/products/rstudio.
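As a rough illustration of how those components work together, the short script below can be typed into the editor, executed, and observed in two places: text results in the console and the chart in the plot viewer. The numbers are made up for this sketch and are not real game data.

```r
# Made-up values for illustration only; not real game data.
game   <- 1:5
points <- c(98, 105, 101, 110, 95)

# Text output, such as this average, appears in the console.
print(mean(points))

# Graphical output, such as this scatter plot, appears in the plot viewer.
plot(game, points, xlab = "Game", ylab = "Points scored")
```

When the same script is run outside an IDE, for example from a terminal, the plot is typically written to a file instead of being shown in a viewer pane.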

      Essentially, R-Studio sits on top of base-R. The IDE provides the modern GUI expected by today’s computer users while also adding functionality, including version control, terminal access, and, perhaps most importantly, an easy way to create and view visualizations
