Machine Learning for Time Series Forecasting with Python. Francesca Lazzeri

high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Python has long been great for data munging and preparation, but less so for data analysis and modeling. Pandas helps fill this gap, enabling you to carry out your entire data analysis workflow in Python without having to switch to a more domain-specific language like R. The most up-to-date pandas documentation can be found in the pandas user's guide (pandas.pydata.org/pandas-docs/stable/). Pandas is a NumFOCUS-sponsored project, which helps ensure the success of pandas development as a world-class open-source project. Pandas does not implement significant modeling functionality outside of linear and panel regression; for this, look to statsmodels and scikit-learn below.

       Statsmodels: Statsmodels is a Python module that provides classes and functions for estimating many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open-source Modified BSD (3-clause) license. The most up-to-date statsmodels documentation can be found in the statsmodels user's guide (statsmodels.org/stable/index.html).

       Scikit-learn: Scikit-learn is a simple and efficient tool for data mining and data analysis. In particular, this library implements a range of machine learning, pre-processing, cross-validation, and visualization algorithms using a unified interface. It is built on NumPy, SciPy, and Matplotlib and is released under the open-source Modified BSD (3-clause) license. Scikit-learn is focused on machine learning data modeling. It is not concerned with the loading, handling, manipulating, and visualizing of data. For this reason, data scientists usually combine scikit-learn with other libraries, such as NumPy, pandas, and Matplotlib, for data handling, pre-processing, and visualization. The most up-to-date scikit-learn documentation can be found in the scikit-learn user's guide (scikit-learn.org/stable/index.html).
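
      As a rough illustration of the division of labor described above, the following sketch uses pandas to prepare a small time series and scikit-learn to fit a model on it. The data is synthetic and the column names (y, lag_1) are made up for this example; it is one possible arrangement, not the workflow used later in the book.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic daily series (made-up data): a noise-free linear trend.
index = pd.date_range("2020-01-01", periods=30, freq="D")
series = pd.Series(np.arange(30, dtype=float), index=index, name="y")

# pandas handles data preparation: build a one-step lag feature.
frame = series.to_frame()
frame["lag_1"] = frame["y"].shift(1)
frame = frame.dropna()  # the first row has no previous value

# scikit-learn handles modeling through its fit/predict interface.
model = LinearRegression()
model.fit(frame[["lag_1"]], frame["y"])

# Predict the value that follows the last observation.
next_input = pd.DataFrame({"lag_1": [series.iloc[-1]]})
forecast = float(model.predict(next_input)[0])
print("Next-step forecast:", round(forecast, 3))
```

      Because each value in this toy series is exactly one more than the previous value, the fitted model should predict a value very close to 30 for the next step.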

      In this book, we will also use Keras for time series forecasting:

       Keras: Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, and Theano. Data scientists usually use Keras if they need a deep learning library that does the following:

           Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility)

           Supports both convolutional networks and recurrent networks, as well as combinations of the two

           Runs seamlessly on central processing unit (CPU) and graphics processing unit (GPU)

       The most up-to-date Keras documentation can be found in the Keras user's guide (keras.io).

      Now that you have a better understanding of the different Python packages that we will use in this book to build our end-to-end forecasting solution, we can move to the next and last section of this chapter, which will provide you with general advice for setting up your Python environment for time series forecasting.
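
      Before moving on, it can be useful to see which of these packages are already present in your environment. The sketch below relies only on the standard library (importlib.metadata, available from Python 3.8). The package list and the helper name installed_versions are assumptions made for this example; adjust the names to match your own setup.

```python
from importlib import metadata

# Distribution names as published on PyPI (an assumption for this sketch).
PACKAGES = ["numpy", "pandas", "statsmodels", "scikit-learn", "matplotlib", "keras"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```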

      In this section, you will learn how to get started with Python in Visual Studio Code and how to set up your Python development environment. Specifically, this tutorial requires the following:

       Visual Studio Code: Visual Studio Code (VS Code) is a lightweight but powerful source code editor that runs on your desktop and is available for Windows, macOS, and Linux. It comes with built-in support for JavaScript, TypeScript, and Node.js and has a rich ecosystem of extensions for other languages (such as C++, C#, Java, Python, PHP, Go) and runtimes (such as .NET and Unity).

       Visual Studio Code Python extension: Visual Studio Code Python extension is a Visual Studio Code extension with rich support for the Python language (for all actively supported versions of the language: 2.7, ≥ 3.5), including features such as IntelliSense, linting, debugging, code navigation, code formatting, Jupyter notebook support, refactoring, variable explorer, and test explorer.

       Python 3: Python 3.0 was originally released in 2008, and Python 3 is the latest major version of the language. Its latest release, Python 3.8, came out in October 2019. In most of the examples in this book, we will use Python 3.8.

       It is important to note that Python 3.x is incompatible with the 2.x line of releases. The language is mostly the same, but many details, especially how built-in objects like dictionaries and strings work, have changed considerably, and a lot of deprecated features were finally removed. Here are some Python 3.0 resources:

           Python documentation (python.org/doc/)

           Latest Python updates (aka.ms/PythonMS)
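
      A few of the Python 3 behaviors mentioned above can be observed directly. The following short sketch, with made-up values, shows how strings, dictionaries, and division behave in Python 3; it is not a complete list of the differences from Python 2.

```python
# Strings: text (str) and binary data (bytes) are distinct types in Python 3.
text = "caf\u00e9"               # four characters, the last one is non-ASCII
encoded = text.encode("utf-8")   # explicit conversion from str to bytes
text_len, byte_len = len(text), len(encoded)
print(type(text).__name__, text_len, type(encoded).__name__, byte_len)

# Dictionaries: keys() returns a live view rather than a list copy.
prices = {"a": 1}
keys = prices.keys()
prices["b"] = 2
keys_after_update = list(keys)   # the view reflects the key added afterwards
print(keys_after_update)

# Division: "/" is true division, "//" is floor division.
true_div, floor_div = 7 / 2, 7 // 2
print(true_div, floor_div)
```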

      If you have not already done so, install VS Code. Next, install the Python extension for VS Code from the Visual Studio Marketplace. For additional details on installing extensions, see Extension Marketplace. The Python extension is named Python and published by Microsoft.

      Along with the Python extension, you need to install a Python interpreter, following the instructions below:

       If you are using Windows: Install Python from python.org. You can typically use the Download Python button that appears first on the page to download the latest version.

           Note: If you don't have admin access, an additional option for installing Python on Windows is to use the Microsoft Store. The Microsoft Store provides installs of Python 3.7 and Python 3.8. Be aware that you might have compatibility issues with some packages using this method.

           For additional information about Python on Windows, see Using Python on Windows at python.org.

       If you are using macOS: The system installation of Python on macOS is not supported. Instead, an installation through Homebrew is recommended. To install Python using Homebrew on macOS, use brew install python3 at the Terminal prompt.

           Note: On macOS, make sure the location of your VS Code installation is included in your PATH environment variable. See these setup instructions for more information.

       If you are using Linux: The built-in Python 3 installation on Linux works well, but to install other Python packages you must install pip with get-pip.py.
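
      Whichever operating system you use, you can check from within Python whether pip is available to the interpreter you just installed. This is a minimal sketch using only the standard library; the helper name pip_available is made up for this example.

```python
import importlib.util
import sys


def pip_available():
    """Return True if the pip module can be found by the current interpreter."""
    return importlib.util.find_spec("pip") is not None


if pip_available():
    # Running pip through the interpreter ties it to this Python installation.
    print("pip found; install packages with:", sys.executable, "-m pip install <package>")
else:
    print("pip was not found for", sys.executable)
```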

      To verify that you have installed Python successfully on your machine, run one of the following commands (depending on your operating system):

       Linux/macOS: Open a Terminal window and type the following command:

           python3 --version

       Windows: Open a command prompt and run the following command:

           py -3 --version

      If the installation was successful, the output window should show the version of Python that you installed.
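
      The same check can also be made from inside Python, which is handy at the top of a script or notebook. The sketch below compares the running interpreter against a minimum version. The threshold of 3.8 is an assumption based on the version this book says it uses for most examples, and meets_minimum is a hypothetical helper name.

```python
import sys

# Assumed minimum: the version the book uses in most of its examples.
MIN_VERSION = (3, 8)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", sys.version.split()[0])
print("Meets minimum version:", meets_minimum())
```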

      In this chapter, I walked you through the core concepts and steps to prepare your time series data for forecasting models. Through some practical examples of time series, we discussed some essential aspects of time series representations, modeling, and forecasting.

      Specifically, we discussed the following topics:

       Flavors of Machine Learning for Time Series Forecasting: In this section you learned a few standard definitions of important concepts, such as time series, time series analysis, and time series forecasting. You also discovered why time series forecasting is a fundamental cross-industry research area.

       Supervised Learning for Time Series Forecasting: In this section you learned how to reshape your forecasting scenario as a supervised learning problem and, as
