Machine Learning for Time Series Forecasting with Python. Francesca Lazzeri
Чтение книги онлайн.
Читать онлайн книгу Machine Learning for Time Series Forecasting with Python - Francesca Lazzeri страница 9
Regarding time series specific methods, there are a few options:Linear interpolation: This method works well for a time series with some trend but is not suitable for seasonal data.Seasonal adjustment and linear interpolation: This method works well for data with both trend and seasonality.Mean, median, and mode: Computing the overall mean, median, or mode is a very basic imputation method; it is the only tested function that takes no advantage of the time series characteristics or relationship between the variables. It is very fast but has clear disadvantages. One disadvantage is that mean imputation reduces variance in the data set.
In the next section of this chapter, we will discuss how to shape time series as a supervised learning problem and, as a consequence, get access to a large portfolio of linear and nonlinear machine learning algorithms.
Supervised Learning for Time Series Forecasting
Machine learning is a subset of artificial intelligence that uses techniques (such as deep learning) that enable machines to use experience to improve at tasks (aka.ms/deeplearningVSmachinelearning
). The learning process is based on the following steps:
1 Feed data into an algorithm. (In this step you can provide additional information to the model, for example, by performing feature extraction.)
2 Use this data to train a model.
3 Test and deploy the model.
4 Consume the deployed model to do an automated predictive task. In other words, call and use the deployed model to receive the predictions returned by the model (aka.ms/deeplearningVSmachinelearning).
Machine learning is a way to achieve artificial intelligence. By using machine learning and deep learning techniques, data scientists can build computer systems and applications that do tasks that are commonly associated with human intelligence. These tasks include time series forecasting, image recognition, speech recognition, and language translation (aka.ms/deeplearningVSmachinelearning
).
There are three main classes of machine learning: supervised learning, unsupervised learning, and reinforcement learning. In the following few paragraphs, we will have a closer look at each of these machine learning classes:
Supervised learning is a type of machine learning system in which both input (which is the collection of values for the variables in your data set) and desired output (the predicted values for the target variable) are provided. Data is identified and labeled a priori to provide the algorithm a learning memory for future data handling. An example of a numerical label is the sale price associated with a used car (aka.ms/MLAlgorithmCS). The goal of supervised learning is to study many labeled examples like these and then to be able to make predictions about future data points, like, for example, assigning accurate sale prices to other used cars that have similar characteristics to the one used during the labeling process. It is called supervised learning because data scientists supervise the process of an algorithm learning from the training data set (www.aka.ms/MLAlgorithmCS): they know the correct answers and they iteratively share them with the algorithm during the learning process. There are several specific types of supervised learning. Two of the most common are classification and regression:Classification: Classification is a type of supervised learning used to identify what category new information belongs in. It can answer simple two-choice questions, like yes or no, true or false, for example:Is this tweet positive?Will this customer renew their service?Which of two coupons draws more customers?Classification can be used also to predict between several categories and in this case is called multi-class classification. It answers complex questions with multiple possible answers, for example:What is the mood of this tweet?Which service will this customer choose?Which of several promotions draws more customers?Regression: Regression is a type of supervised learning used to forecast the future by estimating the relationship between variables. Data scientists use it to achieve the following goals:Estimate product demandPredict sales figuresAnalyze marketing returns
Unsupervised learning is a type of machine learning system in which data points for the input have no labels associated with them. In this case, data is not labeled a priori so that the unsupervised learning algorithm itself can organize the data in and describe its structure. This can mean grouping it into clusters or finding different ways of looking at complex data structures (aka.ms/MLAlgorithmCS).There are several types of unsupervised learning, such as cluster analysis, anomaly detection, and principal component analysis:Cluster analysis: Cluster analysis is a type of unsupervised learning used to separate similar data points into intuitive groups. Data scientists use it when they have to discover structures among their data, such as in the following examples:Perform customer segmentationPredict customer tastesDetermine market priceAnomaly detection: Anomaly detection is a type of supervised learning used to identify and predict rare or unusual data points. Data scientists use it when they have to discover unusual occurrences, such as with these examples:Catch abnormal equipment readingsDetect fraudPredict riskThe approach that anomaly detection takes is to simply learn what normal activity looks like