have similarities to each other, but which have dissimilarities with objects in other clusters. The input of a clustering algorithm consists of elements, while the output consists of clusters into which the elements are divided according to a similarity measure. Clustering algorithms also provide a description of the characteristics of each cluster, which is essential for decision-making processes.

Concerning the continuous class of unsupervised algorithms, the K-means algorithm is the classical approach, estimating the centroids of the groups of data, named clusters. Mean shift clustering (MSC) is a sliding-window-based algorithm that searches for dense areas of data points in order to define the centroids. Density-based spatial clustering of applications with noise (DBSCAN) is characterized by its ability to find clusters of arbitrary size and shape. Expectation maximization (EM) clustering uses the Gaussian mixture model (GMM) approach, assuming that the data points are distributed as Gaussian functions, each characterized by a mean and a standard deviation. Agglomerative hierarchical clustering (AHC) groups clusters following a hierarchy represented by a tree, or dendrogram: the root of the tree is the main cluster grouping all the samples, and the leaves are clusters with only one sample. Principal component analysis (PCA) is able to reduce the dimensionality of a dataset made up of many more or less correlated variables. The singular value decomposition (SVD) technique is a particular factorization of a matrix based on the use of eigenvalues and eigenvectors.

Concerning the categorical class of unsupervised algorithms, the Apriori algorithm is adopted for association rule learning and frequent item set mining. The FP-growth algorithm mines the complete set of frequent patterns by pattern fragment growth. The hidden Markov model (HMM) is a statistical Markov model with hidden states.

Concerning the continuous class of supervised algorithms, regression is a statistical process that tries to establish a relationship between two or more variables: given an input value x, a regression model returns the corresponding output value y generated by processing x. Linear regression differs from classification, since the latter is limited to discriminating the elements into a given number of classes (labels), while linear regression receives data as input and returns a real value as output. Polynomial regression uses the same method as linear regression, but assumes that the function that best describes the data trend is not a straight line but a polynomial. Artificial neural networks (ANNs) are able to classify and predict data. Specifically, ANNs are made up of three types of layers: the input layer, the hidden layers, and the output layer. In the input layer, the neural network receives the data as inputs, activates and processes them according to the classification capacity for which it is trained, and passes the information obtained to the next layer, as in neuron propagation. At each step, the starting information takes on an increasingly refined meaning due to the interpretations of the different nodes. Finally, the processed data arrive at the output layer, which collects the results. The network is structured to learn automatically, in a self-learning modality.
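As an illustration, the following minimal sketch exercises several of the unsupervised methods above, together with a polynomial regression, on synthetic data. The use of scikit-learn and all parameter values are assumptions of this sketch, not prescriptions from the text.

```python
# Minimal sketch of the unsupervised algorithms and polynomial regression
# described above, using scikit-learn on synthetic data (library choice
# and all parameters are illustrative assumptions).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, MeanShift, DBSCAN, AgglomerativeClustering
from sklearn.mixture import GaussianMixture
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Synthetic data: 300 points drawn around 3 centroids.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("K-means centroids:\n", kmeans.cluster_centers_)

meanshift = MeanShift().fit(X)             # sliding-window density seeking
dbscan = DBSCAN(eps=0.8).fit(X)            # arbitrary-shape, noise-aware clusters
agglo = AgglomerativeClustering(n_clusters=3).fit(X)  # dendrogram-based merging

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)  # EM over a GMM
print("GMM means:\n", gmm.means_)

# PCA: project onto the single direction of maximum variance; internally
# scikit-learn's PCA relies on an SVD of the centered data matrix.
X_reduced = PCA(n_components=1).fit_transform(X)
print("Reduced shape:", X_reduced.shape)

# Polynomial regression: linear regression applied to polynomial features.
x = np.linspace(-3, 3, 50).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 + np.random.default_rng(0).normal(0, 0.2, 50)
poly_model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(x, y)
print("Predicted y at x = 2:", poly_model.predict([[2.0]]))
```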
Each perceptron has the task of categorizing objects by referring to common characteristics, following a scoring system calculated on each analyzed element. In AI, the perceptron represents a binary classifier that selects input data and provides a feature vector. The classifier makes its predictions based on a linear predictor function combining a set of weights with the feature vector. In AI machine learning algorithms, perceptrons are important for their self-learning capacity, which makes these tools suitable for auto-adaptive production solutions. A multilayer perceptron (MLP) is a particular class of feedforward ANN characterized by multiple layers of perceptrons with threshold activation. The MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron implementing an activation function. The MLP utilizes a supervised learning technique called backpropagation for model training. In the class of categorical supervised algorithms, the random forest (RFo) represents a type of ensemble model that uses bagging (bagging aims to create a set of classifiers of equal importance) as the ensemble method and the decision tree (DT) as the individual model algorithm. DTs are also a supervised learning tool, mainly solving classification or regression issues, capable of learning nonlinear associations and very easy to interpret and apply. DT algorithms work on both numeric and categorical data, and are categorized with respect to the output variable as categorical DTs and continuous DTs. In the categorical class of supervised algorithms, the k-nearest neighbors (KNN) algorithm is used for pattern recognition and for classification based on the characteristics of the objects close to the one considered. The logistic regression algorithm generates a result representing the probability that a given input value belongs to a certain class. In binomial logistic regression problems, the probability that the output belongs to one class is P, while the probability that it belongs to the other class is 1 − P (where P is a number between 0 and 1, since it expresses a probability). Naïve Bayes is a supervised learning algorithm suitable for solving binary and multi-class classification problems, and it is based on Bayes' theorem defining the conditional probability: let A and B be two events, with B a possible event having a probability of occurrence P(B) ≠ 0; if A∩B indicates the intersection of the two events (both events occurred), the conditional probability P(A|B) (probability of A conditioned by B) is defined as:
P(A|B) = P(A∩B)/P(B) (1.1)
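As a numerical illustration (with values chosen here for clarity, not taken from the text), if P(A∩B) = 0.12 and P(B) = 0.3, Eq. (1.1) gives P(A|B) = 0.12/0.3 = 0.4.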
The support vector machine (SVM) is a supervised machine learning algorithm used for both classification and regression purposes. For a given training dataset, the SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
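The supervised classifiers above can be compared on a common task. The following sketch assumes scikit-learn as the implementation library, with a synthetic dataset and hyperparameters chosen only for illustration.

```python
# Minimal comparison of the supervised classifiers named above, using
# scikit-learn (library choice and hyperparameters are assumptions).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic binary classification problem with labeled samples.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "MLP (backpropagation)": MLPClassifier(hidden_layer_sizes=(20,),
                                           max_iter=1000, random_state=0),
    "Random forest (bagged DTs)": RandomForestClassifier(random_state=0),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "SVM (linear kernel)": SVC(kernel="linear"),
}

for name, model in models.items():
    model.fit(X_train, y_train)         # supervised training on labeled data
    acc = model.score(X_test, y_test)   # classification accuracy on held-out data
    print(f"{name}: accuracy = {acc:.2f}")
```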
1.2.3 AI Image Processing
Image segmentation techniques are also useful to detect product anomalies, enabling predictive maintenance procedures of production lines [43]: quality monitoring of defect clusters on the product surface can be performed by image segmentation and image filtering techniques, predicting defectiveness. Defect prediction in manufacturing processes is achieved by ANNs [44]. Image vision techniques [45, 46] also verify the correct position of each product component, enabling the pick-and-place removal of defective pieces in accordance with in-line machine timing [45]. Image vision and mechatronic technologies improve the automation of quality check processes according to ISO 9001:2015. A typical mechatronic scheme for real-time in-line defect control and actuation is illustrated in Figure 1.6, where:
A product is moved on a conveyor belt.
The product passes through the image vision-controlled area in a time between t1 and t2 (the observation window for quasi-real-time processing).
The fixed smart camera locally processes the detected image, estimating the defect with respect to the tolerance (output O).
If the measured defect exceeds the tolerance, the input command (input i) of the pick-and-place robotic system is actuated, removing the piece (a code sketch of this decision loop follows Figure 1.6).
The AI engine also predicts defects and machine failures, thus improving the quality control of the whole production line.
Figure 1.6 Scheme of a pick and place automated system for defect removal, based on image processing.
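The decision loop of Figure 1.6 can be sketched as follows. The tolerance value, the camera index, and the actuate_pick_and_place() interface are hypothetical placeholders for the real specification and robot interface; the image operations use standard OpenCV calls.

```python
# Sketch of the in-line defect check of Figure 1.6 (illustrative only).
# DEFECT_TOLERANCE and actuate_pick_and_place() are hypothetical placeholders
# for the real tolerance specification and pick-and-place robot interface.
import cv2

DEFECT_TOLERANCE = 500.0  # maximum admissible defect area, in pixels (assumed)

def defect_area(frame) -> float:
    """Estimate the total defect area on the product surface via thresholding."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)          # pre-image filtering
    _, mask = cv2.threshold(blurred, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return sum(cv2.contourArea(c) for c in contours)

def actuate_pick_and_place():
    """Placeholder: send the input command i to the pick-and-place robot."""
    print("Defective piece removed from the conveyor belt.")

capture = cv2.VideoCapture(0)  # fixed smart camera (device index assumed)
ok, frame = capture.read()     # one frame inside the t1..t2 observation window
if ok:
    area = defect_area(frame)  # output O: estimated defect measure
    if area > DEFECT_TOLERANCE:
        actuate_pick_and_place()
capture.release()
```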
Real-time image processing in industrial applications permits:
Execution of pre-image filtering, thus optimizing lighting and, in general, noise conditions (camera setting