Bioinformatics and Medical Applications. Group of authors. Page 13.
Figure 1.6 Random forest algorithm.
Some of the disadvantages are as follows:
• Complexity.
• Longer training period.
1.3.5 Naive Bayes Algorithm
Naive Bayes is a widely used machine learning algorithm for prediction, based on Bayes' theorem. Bayes' theorem states that, given a hypothesis H and evidence E, the relationship between the probability of the hypothesis before seeing the evidence, P(H), and the probability of the hypothesis after seeing the evidence, P(H|E), is

\[ P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)} \]
The assumption behind Naive Bayes classifiers is that the value of a particular feature is independent of the value of any other feature, given the class variable. For instance, a fruit may be regarded as an apple if it is red in color, round in shape, and around 10 cm wide.
A Naive Bayes classifier considers each of these features to contribute independently to the probability that the fruit is an apple, even though there may be correlations between the color, roundness, and size features. Naive Bayes classifiers are probabilistic: they compute the probability of every class using Bayes' theorem, and the class with the highest probability is the output.
Let D be the training dataset, y the class variable, and X the vector of attributes. Then, according to Bayes' theorem,

\[ P(y \mid X) = \frac{P(X \mid y)\, P(y)}{P(X)} \]

where

\[ X = (x_1, x_2, \ldots, x_n) \]

So, substituting for X and applying the chain rule together with the independence assumption, we get

\[ P(y \mid x_1, \ldots, x_n) = \frac{P(x_1 \mid y)\, P(x_2 \mid y) \cdots P(x_n \mid y)\, P(y)}{P(x_1)\, P(x_2) \cdots P(x_n)} \]

Since the denominator remains the same for every class, it can be removed from the dependency:

\[ P(y \mid x_1, \ldots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y) \]

Therefore, to find the class y with the highest probability, we use the following function:

\[ \hat{y} = \arg\max_{y}\; P(y) \prod_{i=1}^{n} P(x_i \mid y) \]
Some of the advantages of Naive Bayes algorithm are as follows:
• Easy to implement.
• Requires a limited amount of training data to measure parameters.
• High computational efficiency.
However, there are some disadvantages too, as follows:
• It assumes that all features are independent and equally important, which is virtually impossible in real applications.
• A tendency toward bias as the number of training sets increases.
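The classification rule above can be sketched in a few lines of Python. This is an illustrative toy, not the book's code: the fruit data is hypothetical, invented to echo the apple example, and add-one (Laplace) smoothing is an assumption not mentioned in the text.

```python
from collections import Counter, defaultdict

# Hypothetical toy data echoing the apple example: (color, shape, size) -> label.
data = [
    (("red", "round", "medium"), "apple"),
    (("red", "round", "medium"), "apple"),
    (("green", "round", "medium"), "apple"),
    (("yellow", "long", "medium"), "banana"),
    (("yellow", "long", "large"), "banana"),
]

labels = Counter(y for _, y in data)
# counts[label][feature_index][value] = occurrences of that value under the label
counts = defaultdict(lambda: defaultdict(Counter))
for x, y in data:
    for i, v in enumerate(x):
        counts[y][i][v] += 1

def predict(x):
    best, best_p = None, -1.0
    for y, ny in labels.items():
        p = ny / len(data)                                  # prior P(y)
        for i, v in enumerate(x):
            # Likelihood P(x_i | y), with add-one smoothing (an assumption).
            p *= (counts[y][i][v] + 1) / (ny + len(counts[y][i]) + 1)
        if p > best_p:                                      # argmax over classes
            best, best_p = y, p
    return best
```

Each feature multiplies into the score independently, exactly as in the product formula, and the class with the highest posterior wins.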
1.3.6 K Means Algorithm
K-means, an unsupervised algorithm, iteratively partitions the dataset into K pre-defined, non-overlapping clusters such that each data point belongs to exactly one cluster. It tries to make the data points within a cluster as similar as possible while keeping the clusters as different (far apart) as possible. It assigns data points to clusters so that the sum of the squared distances between the data points and each cluster's centroid is minimized. The less variation we have within a cluster, the more homogeneous its data points are.
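The assign-then-update loop can be sketched as follows; this is an illustrative pure-Python version with made-up points, not code from the book.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    random.seed(seed)
    centroids = random.sample(points, k)          # pick k distinct starting points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        for j, cl in enumerate(clusters):
            if cl:
                centroids[j] = tuple(sum(d) / len(cl) for d in zip(*cl))
    return centroids, clusters

pts = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, clusters = kmeans(pts, k=2)
```

Minimizing squared distance to the centroid is exactly the intra-cluster homogeneity criterion described above.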
1.3.7 Ensemble Method
The ensemble method is the process by which various models are created and combined in order to solve a specific computational intelligence problem. This yields better predictive performance than could be obtained from any of the constituent models alone. Fundamentally, an ensemble is a supervised learning technique for combining various weak learners/models to produce a strong learner. Ensemble models work best when the combined models have low correlation. Figure 1.7 shows the various ensemble methods in use. The following are some of the techniques used for ensembling.
Figure 1.7 Ensemble methods.
1.3.7.1 Bagging
Bagging, or bootstrap aggregation, assigns equal weight to each model in the ensemble. It trains each model of the ensemble separately on a random subset of the training data in order to promote diversity among the models. Random Forest is a classic example of the bagging technique, in which multiple random decision trees are combined to achieve high accuracy. Samples are drawn with replacement, so the subsets differ from one another.
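A rough sketch of bagging, with a trivial one-dimensional threshold "stump" standing in as the weak learner (Random Forest would use decision trees); the data and the stump are hypothetical, for illustration only.

```python
import random

def stump(sample):
    # Weak learner: threshold halfway between the two class means.
    xs0 = [x for x, y in sample if y == 0]
    xs1 = [x for x, y in sample if y == 1]
    t = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2
    return lambda x: 1 if x > t else 0

def bagging(data, n_models=25, seed=0):
    random.seed(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) samples *with replacement*.
        boot = [random.choice(data) for _ in data]
        if len({y for _, y in boot}) == 2:    # a stump needs both classes present
            models.append(stump(boot))
    # Aggregate: majority vote over all bootstrap-trained stumps.
    return lambda x: 1 if sum(m(x) for m in models) * 2 > len(models) else 0

data = [(0.5, 0), (1.0, 0), (1.5, 0), (4.0, 1), (4.5, 1), (5.0, 1)]
predict = bagging(data)
```

Because each bootstrap sample differs, the stumps land on slightly different thresholds, and voting averages out their individual errors.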
1.3.7.2 Boosting
The term “Boosting” refers to a family of algorithms that convert a weak learner into a strong learner. It is an ensemble technique for improving the model predictions of a given learning algorithm. It trains weak learners sequentially, each attempting to correct its predecessor. There are three main kinds of boosting, namely, AdaBoost, which assigns more weight to the incorrectly classified data passed on to the next model; Gradient Boosting, which fits each new predictor to the residual errors made by the previous predictor; and Extreme Gradient Boosting, which overcomes drawbacks of Gradient Boosting by using parallelization, distributed computing, out-of-core computing, and cache optimization.
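The AdaBoost variant can be sketched as follows; this is a simplified one-dimensional version with hypothetical data and labels in {-1, +1}, not the book's implementation.

```python
import math

def adaboost(data, rounds=10):
    n = len(data)
    w = [1.0 / n] * n                         # uniform initial sample weights
    ensemble = []                             # (alpha, threshold, polarity) triples
    for _ in range(rounds):
        best = None
        # Weak learner: pick the threshold stump with lowest weighted error.
        for t in sorted({x for x, _ in data}):
            for pol in (1, -1):
                preds = [pol if x > t else -pol for x, _ in data]
                err = sum(wi for wi, p, (_, y) in zip(w, preds, data) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = max(err, 1e-10)                 # avoid log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)   # this stump's vote weight
        ensemble.append((alpha, t, pol))
        # Reweight: misclassified points get heavier, then renormalize.
        preds = [pol if x > t else -pol for x, _ in data]
        w = [wi * math.exp(-alpha * p * y) for wi, p, (_, y) in zip(w, preds, data)]
        s = sum(w)
        w = [wi / s for wi in w]
    def predict(x):
        score = sum(a * (pl if x > th else -pl) for a, th, pl in ensemble)
        return 1 if score > 0 else -1
    return predict

data = [(1, -1), (2, -1), (3, -1), (6, 1), (7, 1), (8, 1)]
predict = adaboost(data)
```

The reweighting line is where misclassified points gain influence, which is exactly the "more weight to the incorrectly classified data" behavior described above.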
1.3.7.3 Stacking
Stacking uses a meta-learning algorithm to learn how best to combine the predictions of two or more base algorithms. A stacked model has a two-level architecture: Level 0 models, referred to as base models, and a Level 1 model, referred to as the meta-model. The meta-model is trained on the predictions made by the base models on out-of-sample data. The outputs of the base models used as input to the meta-model may be real values in the case of regression and probability values in the case of classification. A standard method for preparing the meta-model's training dataset is k-fold cross-validation of the base models.
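A minimal sketch of stacking with k-fold out-of-fold predictions. Everything here is hypothetical: the base models are per-feature threshold stumps, and a simple accuracy-weighted vote stands in for the Level 1 meta-model (a real stack would usually train, e.g., a logistic regression on the out-of-fold predictions).

```python
def fit_stump(train, feat):
    # Base model: threshold on one feature, halfway between the class means.
    xs0 = [x[feat] for x, y in train if y == 0]
    xs1 = [x[feat] for x, y in train if y == 1]
    t = (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2
    return lambda x: 1 if x[feat] > t else 0

def stack(data, k=3):
    feats = range(len(data[0][0]))
    folds = [data[i::k] for i in range(k)]     # simple k-fold split
    oof = {f: [] for f in feats}               # out-of-fold predictions per model
    truth = []
    for i in range(k):
        train = [d for j in range(k) if j != i for d in folds[j]]
        for f in feats:
            m = fit_stump(train, f)
            oof[f].extend(m(x) for x, _ in folds[i])
        truth.extend(y for _, y in folds[i])
    # Meta-model (stand-in): weight each base model by its out-of-fold accuracy.
    weights = {f: sum(p == y for p, y in zip(oof[f], truth)) / len(truth)
               for f in feats}
    base = {f: fit_stump(data, f) for f in feats}   # refit base models on all data
    def predict(x):
        score = sum(weights[f] * (1 if base[f](x) == 1 else -1) for f in feats)
        return 1 if score > 0 else 0
    return predict

# Hypothetical data: feature 0 is informative, feature 1 is noise.
data = [((1.0, 5.0), 0), ((2.0, 1.0), 0), ((1.5, 4.0), 0),
        ((6.0, 2.0), 1), ((7.0, 5.0), 1), ((6.5, 1.0), 1)]
predict = stack(data)
```

The key design point is that the meta-level weights are fit on predictions the base models made for data they never trained on, which is what the k-fold step guarantees.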
1.3.7.4 Majority Vote
Each model makes a prediction (votes) for each test instance, and the final output prediction is the one that receives the majority of the votes.
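Hard majority voting can be sketched in a couple of lines; the three "models" below are hypothetical threshold functions, for illustration only.

```python
from collections import Counter

def majority_vote(models, x):
    # Each model votes for a class; the most common class wins.
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical stand-in models: simple threshold classifiers.
models = [lambda x: 1 if x > 3 else 0,
          lambda x: 1 if x > 4 else 0,
          lambda x: 1 if x > 10 else 0]
```

For x = 5, two of the three models vote for class 1, so the ensemble outputs 1 even though the third model disagrees.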