Probability with R. Jane M. Horgan


      lm(prog2 ~ prog1)

      calculates what is referred to as the linear model (lm) of prog2 on prog1, or simply the line

$$\text{prog2} = a + b \cdot \text{prog1}$$

      that best fits the data, where $a$ is the intercept and $b$ is the slope.

      The output is

      Call:
      lm(formula = prog2 ~ prog1)

      Coefficients:
      (Intercept)        prog1
           -5.455        0.960

      Therefore, the line that best fits these data is

$$\text{prog2} = -5.455 + 0.960 \cdot \text{prog1}$$
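      As a minimal sketch of how such a line is used, the coefficients reported in the output can be turned into a small prediction function. The name predict_prog2 is hypothetical, and we assume (as the later discussion of semesters suggests) that prog1 and prog2 are marks in Semester 1 and Semester 2, respectively.

```r
# Coefficients copied from the lm() output reported in the text
intercept <- -5.455
slope     <- 0.960

# Hypothetical helper (not from the text): estimated prog2 for a given prog1
predict_prog2 <- function(prog1) {
  intercept + slope * prog1
}

# Estimated prog2 values for a few illustrative prog1 marks
predict_prog2(c(40, 60, 80))
```

      For instance, a prog1 mark of 60 gives an estimate of -5.455 + 0.960 × 60 = 52.145. When the fitted object itself is available, coef() and predict() applied to it do the same job without retyping the rounded coefficients.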

      To draw this line on the scatter diagram, write

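      plot(prog1, prog2)           # prog1 on the horizontal axis, prog2 on the vertical axis
      abline(lm(prog2 ~ prog1))    # add the fitted line of prog2 on prog1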

[Figure 3.16]

      A word of warning is appropriate here. The estimated values are based on the assumption that the past trend continues, which may not always be the case. For example, students who do badly in Semester 1 may get such a shock that they work harder in Semester 2 and change the pattern. Similarly, students getting high marks in Semester 1 may be lulled into a false sense of security and take it easy in Semester 2, so that they do not do as well as expected. Hence, the Semester 1 trends may not continue, and the model may no longer be valid.

      Machine learning is the science of getting computer systems to use algorithms and statistical models to study patterns and learn from data. Supervised learning is the machine learning task of using past data to learn a function in order to predict a future output.

      Often, the data for supervised learning are randomly divided into two parts, one for training and the other for testing. In machine learning, we derive the line of best fit from the training set

$$y = a + bx$$

      The testing set is used to see how well the line actually fits. Usually, an 80:20 breakdown of the data is made: 80% is used for “training,” that is, to obtain the line, and the remaining 20% is used to decide whether the line really fits the data and to ascertain whether the model is appropriate for future predictions. The model is updated as new data become available.
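      The 80:20 procedure just described might be sketched in R as follows. This is an illustration under stated assumptions, not the text's own code: the data are simulated, the names dat, train, test, and rmse are hypothetical, and the root mean squared error is just one possible way to judge how well the line fits the testing set.

```r
set.seed(1)  # make the random split reproducible

# Simulated data for illustration only (not the data of any example in the text)
n <- 50
x <- runif(n, min = 5, max = 20)
y <- 10 + 3 * x + rnorm(n, sd = 5)
dat <- data.frame(x = x, y = y)

# Randomly assign 80% of the rows to the training set; the rest form the testing set
train_idx <- sample(seq_len(n), size = round(0.8 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Derive the line of best fit from the training set only
fit <- lm(y ~ x, data = train)
coef(fit)

# Check the line against the testing set, which played no part in the fit
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$y - pred)^2))  # assumed measure of fit: root mean squared error
rmse
```

      Keeping the testing rows out of the call to lm() is the point of the split: the error is then measured on data the line has not seen, which is closer to how the model will be used for future predictions.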

      Example 3.1

Observation Numbers    x       y       Observation Numbers    x       y
1                      11.8    31.3    21                     15.1    80.1
2                      10.8    59.9    22                     14.7    66.9
3                       8.6    27.6    23                     10.5    42.0
