Machine Learning Approach for Cloud Data Analytics in IoT. Group of authors.


the model, the authors attempt to thoroughly understand the requirements of retailers. Retailers have various queries in mind that an efficient model needs to address. Some of these queries are as follows:

What is the probability that a person who is predicted to purchase online actually purchases online?

Which segment of customers should the retailer focus on?

Which are the geographical regions for online and offline channels?
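In classification terms, the first query asks for the precision of the "purchases online" class: among the customers the model flags as online buyers, the share who actually buy online. The following is a minimal sketch, assuming binary labels (1 = purchases online); the function name and the sample labels are illustrative and are not taken from the chapter.

```python
def precision_of_positive(actual, predicted):
    """Estimate P(actually buys online | predicted to buy online)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Observed outcomes for the customers the model flagged as online buyers.
    flagged = [a for a, p in zip(actual, predicted) if p == 1]
    if not flagged:
        return None  # the model flagged nobody, so the ratio is undefined
    return sum(flagged) / len(flagged)


# Hypothetical labels for eight customers (1 = buys online, 0 = does not).
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 1, 1, 0]
print(precision_of_positive(actual, predicted))  # 3 of the 5 flagged customers bought: 0.6
```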

A detailed understanding of retailers’ queries makes it possible to devise an efficient predictive model. Here, the authors aim to devise a model that provides several functions, some of which are illustrated in Figure 3.2.

For instance, the proposed model can be used to estimate and forecast the sales of a particular product in a particular region. This can be done at various levels of abstraction, according to the retailer’s choice and requirements. The proposed model aims to find prospective buyers for a product even when their probability of purchase is very low. A model that targets more customers may incur some additional cost, but it does not miss any probable buyer. The authors prefer not to miss any probable customer, since every missed customer is a potential sale lost. The proposed model also attempts to predict the likelihood of a customer purchasing a particular product. This helps in targeting prospective customers, thereby increasing revenue.
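The trade-off between targeting cost and missed buyers can be made concrete by lowering the probability threshold above which a customer is targeted. The sketch below assumes that purchase probabilities have already been predicted; the scores, outcomes, and function name are hypothetical.

```python
def targeting_summary(probabilities, purchased, threshold):
    """Return (customers targeted, buyers reached, buyers missed) at a threshold."""
    targeted = [buy for prob, buy in zip(probabilities, purchased) if prob >= threshold]
    reached = sum(targeted)
    missed = sum(purchased) - reached
    return len(targeted), reached, missed


# Hypothetical purchase probabilities and observed outcomes (1 = bought).
probabilities = [0.92, 0.75, 0.40, 0.15, 0.08, 0.05]
purchased = [1, 1, 0, 1, 0, 1]

for threshold in (0.5, 0.1, 0.01):
    n_targeted, reached, missed = targeting_summary(probabilities, purchased, threshold)
    print(f"threshold={threshold}: targeted={n_targeted}, reached={reached}, missed={missed}")
```

In this toy data, dropping the threshold from 0.5 to 0.01 triples the number of customers contacted but reduces the missed buyers from two to none, which is the behavior the proposed model favors.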

Figure 3.2 Illustration of major functions of predictive data analytics.

The proposed model collects data from various sources such as social media, historical data, and transaction details. Data from such diverse sources arrives in disparate forms and thus needs to be cleaned during preprocessing. The cleaned data from the various sources is then integrated and used to train the predictive model. The accuracy of the model depends largely on the size of the training dataset. The basic structure of the proposed model is represented in Figure 3.3.
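The cleaning and integration step can be sketched as follows. This is only an illustration under stated assumptions: the two sources, their field names, and the cleaning rules (trim and normalize customer names, drop records with a missing or non-numeric amount) are hypothetical and are not described in the chapter.

```python
def clean_record(raw, source):
    """Normalise one raw record to a common schema, or return None if unusable."""
    customer = str(raw.get("customer") or raw.get("Customer_Name") or "").strip().title()
    amount = raw.get("amount", raw.get("Sales"))
    try:
        amount = float(amount)
    except (TypeError, ValueError):
        return None  # missing or non-numeric amount
    if not customer:
        return None  # a record without a customer cannot be linked to anyone
    return {"customer": customer, "amount": amount, "source": source}


def integrate(sources):
    """Clean every record from every source and combine the usable ones."""
    combined = []
    for source, records in sources.items():
        for raw in records:
            cleaned = clean_record(raw, source)
            if cleaned is not None:
                combined.append(cleaned)
    return combined


# Hypothetical raw data; the field names differ between sources on purpose.
sources = {
    "transactions": [
        {"Customer_Name": "  alice SMITH ", "Sales": "120.50"},
        {"Customer_Name": "Bob Jones", "Sales": None},
    ],
    "history": [
        {"customer": "carol white", "amount": 75},
        {"customer": "", "amount": 10},
    ],
}
training_rows = integrate(sources)
print(training_rows)
```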

As represented in Figure 3.3, data integration is followed by the selection of an algorithm for the predictive model. There are several candidate algorithms, such as regression, boosting, and bagging, to name a few. Regression algorithms are the basic building blocks of many predictive models. Boosting algorithms train a model in a sequential and gradual manner, and they can perform both classification and regression. Boosting starts from weak learners and improves on them step by step until their combination becomes a strong learner. Gradient boosting and AdaBoost are the two most widely used boosting algorithms. They differ mainly in how the shortcomings of the current weak learners are identified: AdaBoost increases the weights of poorly predicted observations, whereas gradient boosting fits the next learner to the remaining errors. In both cases, the error depends on the quantity being optimized. For instance, if a model tries to predict sales, then the error is the difference between the predicted sales and the actual sales.
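The sequential idea behind boosting can be sketched in a few lines. The example below is a simplified gradient boosting loop for regression with squared error, where each weak learner is a one-split "stump" fitted to the current residuals (actual minus predicted sales). It is not the authors' implementation and it is not AdaBoost; the data, the learning rate, and the number of rounds are assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Find the one-split regression stump that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the largest x so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, rounds=20, learning_rate=0.5):
    """Add weak learners one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: the average target
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


def mean_squared_error(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: x is a discount tier, y is the number of units sold.
xs = [1, 2, 3, 4, 5, 6]
ys = [10, 12, 11, 30, 32, 31]
average = sum(ys) / len(ys)
baseline_error = mean_squared_error(lambda x: average, xs, ys)
model = gradient_boost(xs, ys)
boosted_error = mean_squared_error(model, xs, ys)
print(baseline_error, boosted_error)
```

Each stump on its own is a weak predictor; the training error falls because every round corrects part of what the previous rounds left unexplained.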

Figure 3.3 General framework of proposed model for predictive data analytics.

Random forest regression may also be employed for prediction problems, since random forests handle both classification and regression. Each tree in the forest splits the data according to some splitting criterion. The quality of each split is then measured using the mean squared error or the mean absolute error. The forest averages the predictions of its trees to improve the accuracy of prediction.
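How a candidate split is scored can be illustrated with a small sketch. It computes the size-weighted mean squared error and mean absolute error of the two groups produced by a split, with lower values indicating a better split. The sales figures and the function name are hypothetical; library implementations may differ in detail.

```python
from statistics import mean, median


def split_quality(values_left, values_right):
    """Return (weighted MSE, weighted MAE) of a candidate split; lower is better."""
    total = len(values_left) + len(values_right)
    mse = 0.0
    mae = 0.0
    for group in (values_left, values_right):
        centre_mse = mean(group)    # the mean minimises squared error
        centre_mae = median(group)  # the median minimises absolute error
        mse += sum((v - centre_mse) ** 2 for v in group) / total
        mae += sum(abs(v - centre_mae) for v in group) / total
    return mse, mae


# Hypothetical sales figures, split two different ways.
good_split = split_quality([10, 12, 11], [30, 32, 31])
poor_split = split_quality([10, 30, 11], [12, 32, 31])
print(good_split, poor_split)
```

The first split separates low and high sales cleanly and therefore scores much lower on both measures than the second, which mixes them.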

The authors propose the use of the bootstrap aggregating ML algorithm, also referred to as bagging. Bagging aims to improve the efficiency and accuracy of ML algorithms by reducing variance. Its use is expected to yield an efficient and accurate predictive model. The accuracy of the proposed model increases rapidly over time.
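The mechanics of bagging can be sketched as follows: several copies of a base learner are trained on bootstrap samples (rows drawn with replacement), and their predictions are averaged. The base learner here is a deliberately high-variance nearest-neighbor predictor chosen for illustration; the chapter does not specify a base learner, and the data, the number of models, and the seed are assumptions.

```python
import random


def fit_nearest_neighbour(xs, ys):
    """High-variance base learner: predict the target of the closest training point."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda point: abs(point[0] - x))[1]


def bagging_fit(xs, ys, n_models=50, seed=0):
    """Train one base learner per bootstrap sample and average their predictions."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw n row indices with replacement.
        indices = [rng.randrange(n) for _ in range(n)]
        models.append(fit_nearest_neighbour([xs[i] for i in indices], [ys[i] for i in indices]))
    return lambda x: sum(model(x) for model in models) / len(models)


# Hypothetical noisy data: monthly sales fluctuating around a level of 100.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [96, 105, 99, 108, 93, 102, 97, 104]

single = fit_nearest_neighbour(xs, ys)
bagged = bagging_fit(xs, ys)
print(single(4), bagged(4))
```

The single learner reproduces the noisy observation at month 4 exactly, whereas the bagged prediction is an average over models that saw different resamples, which is how bagging dampens variance.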

3.4.1 Case Study

To illustrate the implementation of AI in the retail industry, the authors consider a case study based on a dataset from a retail store. The dataset comprises observations over a period of 4 years, from 2011 to 2015, and has been taken from Kaggle (https://www.kaggle.com/jr2ngb/superstore-data, username: jr2ngb). The dataset has 16 variables. Of these 16 features, 10 are categorical, 5 are numerical, and 1 is a date feature, as follows.

#	Feature Name	Non-Null	Dtype
---	---------------	-----------	-------
0	Order Date	51290	datetime64[ns]
1	Customer_Name	51290	object
2	Segment	51290	object
3	City	51290	object
4	State	51290	object
5	Country	51290	object
6	Category	51290	object
7	Sub-Category	51290	object
8	Product Name	51290	object
9	Sales	51290	float64
10	Quantity	51290	int64
11	Discount	51290	float64
12	Profit	51290	float64
13	year	51290	int64
14
