Applied Data Mining for Forecasting Using SAS. Tim Rey


data mining case for transaction data, there are some specific methods used to guard against overfitting, which helps provide a robust final model. One such method is dividing the data into three parts: model, hold out, and out of sample. This is analogous to training, validating, and testing data sets in the transaction data mining space. Various statistical measures are then used to choose the final model. Once the model is chosen, it is deployed using various technologies.
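      The three-way division described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: it assumes the observations are sorted from oldest to newest and that the three parts are contiguous blocks in time, which is a common convention for time series. The function name `chronological_split`, the segment sizes, and the `monthly_sales` values are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    series: Sequence[T], holdout_size: int, out_of_sample_size: int
) -> Tuple[List[T], List[T], List[T]]:
    """Split a time-ordered series into model, hold-out, and out-of-sample parts.

    Assumption: `series` is sorted from oldest to newest. The data is never
    shuffled, so each later segment lies strictly after the earlier ones.
    """
    if holdout_size < 0 or out_of_sample_size < 0:
        raise ValueError("segment sizes must be non-negative")

    model_size = len(series) - holdout_size - out_of_sample_size
    if model_size <= 0:
        raise ValueError("series too short for the requested segment sizes")

    holdout_end = model_size + holdout_size
    model = list(series[:model_size])               # used to fit candidate models
    holdout = list(series[model_size:holdout_end])  # used to compare candidates
    out_of_sample = list(series[holdout_end:])      # final check on the chosen model
    return model, holdout, out_of_sample


# Hypothetical example: 36 monthly observations, numbered 1..36.
monthly_sales = list(range(1, 37))
model, holdout, out_of_sample = chronological_split(
    monthly_sales, holdout_size=6, out_of_sample_size=6
)
```

      Keeping the segments in time order means the hold-out and out-of-sample periods always follow the data used to fit the model, which mirrors how a forecast is actually used.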

      This discussion shows why it is important to capture subject matter experts' knowledge of a company's market dynamics in a form that institutionalizes it. That institutionalization takes place through mathematics, specifically statistics, machine learning, and econometrics. The resulting equations become intellectual property (IP) that can be leveraged across the company. This holds even when the data sources are public, because the way the data is used to capture the IP in mathematical models is proprietary.

      The core content of the book is designed to help the reader understand in detail the process described in the previous paragraphs. This will be done in the context of various SAS technologies, including SAS® Enterprise Guide®, SAS Forecast Server and various SAS/ETS® time series procedures like PROC EXPAND, PROC TIMESERIES, PROC ARIMA, PROC SIMILARITY, PROC X11/12, as well as the SAS® Enterprise Miner time series data mining nodes, and others.

      The reason for integrating data mining and forecasting is simply to provide the highest-quality forecasts possible. Business leaders now have a unique advantage: easy access to thousands of candidate explanatory variables (Xs), together with knowledge of a process and technology that enable data mining on time series data. With the tools now available through various SAS technologies, the business leader can create the best explanatory (cause and effect) forecasting model possible, and can do so in an expedient and cost-efficient manner.

      Now that models of this type are easier to build, they can be used in other applications, including scenario analysis, optimization problems, and simulation problems (linear systems of equations as well as non-linear system dynamics). All in all, the business decision maker is now prepared to make better decisions with these advanced analytics forecasting processes, methods, and technologies.

      The next chapter defines and discusses in detail the process of data mining for forecasting. Chapter 3 gives details about how to set up an infrastructure for data mining for forecasting. Chapter 4 covers issues with data mining for forecasting applications. This leads to data collection in Chapter 5 and data preparation in Chapter 6; data preparation gets a chapter of its own because 60–80% of the work lies in this step. Chapter 7 lays the foundation for actually doing data mining by providing a practitioner's guide to data mining methods for forecasting. Chapters 8 through 11 present a practitioner's guide to time series forecasting methods. Chapter 12 finishes the book by walking through an example of data mining for forecasting from start to finish.

      Chapter 2: Data Mining for Forecasting Work Process

       2.1 Introduction

       2.2 Work Process Description

       2.2.1 Generic Flowchart

       2.2.2 Key Steps

       2.3 Work Process with SAS Tools

       2.3.1 Data Preparation Steps with SAS Tools

       2.3.2 Variable Reduction and Selection Steps with SAS Tools

       2.3.3 Forecasting Steps with SAS Tools

       2.3.4 Model Deployment Steps with SAS Tools

       2.3.5 Model Maintenance Steps with SAS Tools

       2.3.6 Guidance for SAS Tool Selection Related to Data Mining in Forecasting

       2.4 Work Process Integration in Six Sigma

       2.4.1 Six Sigma in Industry

       2.4.2 The DMAIC Process

       2.4.3 Integration with the DMAIC Process

       Appendix: Project Charter

      This chapter describes a generic work process for implementing data mining in forecasting real-world applications. By work process the authors mean a sequence of steps that lead to effective project management. Defining and optimizing work processes is a must in industrial applications. Adopting such a systematic approach is critical in order to solve complex problems and introduce new methods. The result of using work processes is that productivity is increased and experience is leveraged in a consistent and effective way. One common mistake some practitioners make is jumping to real-world forecasting applications while focusing only on technical knowledge and ignoring the organizational and people-related issues. It is the authors' opinion that applying forecasting in a business setting without a properly defined work process is a clear recipe for failure.

      The work process presented here includes a broader set of steps than the specific steps related to data mining and forecasting. It includes all necessary action items to define, develop, deploy, and support forecasting models. First, a generic flowchart and a description of the key steps are given in the next section, followed by a specific illustration of the work process sequence when using different SAS tools. The last section is devoted to integrating the proposed work process into one of the most popular business processes widely accepted in industry: Six Sigma.

      The objective
