Biomedical Data Mining for Information Retrieval. Group of authors

using z-score method.

      2 Calculate the autocorrelation matrix (R). (1.1)

      3 Calculate the eigenvectors (U) and eigenvalues (λ). (1.2)

      4 Rearrange the eigenvectors and eigenvalues in descending order of eigenvalue.

      5 Calculate the factor loading matrix (A). (1.3)

      6 Calculate the score matrix (B). (1.4)

      7 Calculate the factor scores (F). (1.5)
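
      The steps listed above can be sketched in NumPy as follows. This is a minimal illustration, not the book's implementation: Equations (1.1) to (1.5) are not reproduced in this excerpt, so the formulas used here (loadings as eigenvectors scaled by the square roots of the eigenvalues, and the regression method for the score matrix) are assumptions taken from common principal-component factor analysis. The function name `factor_analysis`, the parameter `n_factors` and the random test data are hypothetical.

```python
import numpy as np

def factor_analysis(X, n_factors):
    """Principal-component style factor analysis (illustrative sketch)."""
    # Step 1: z-score normalization (zero mean, unit variance per column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    n = Z.shape[0]
    # Step 2: correlation matrix R of the normalized data.
    R = (Z.T @ Z) / (n - 1)
    # Step 3: eigenvalues and eigenvectors of the symmetric matrix R.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Step 4: reorder so that the largest eigenvalue comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 5 (assumed formula): loadings A = U * sqrt(lambda), leading factors only.
    A = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    # Step 6 (assumed formula): score matrix B = R^{-1} A (regression method).
    B = np.linalg.solve(R, A)
    # Step 7: factor scores F = Z B.
    F = Z @ B
    return eigvals, A, B, F

# Hypothetical data: 50 samples, 4 variables, two of them correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 1] += 0.8 * X[:, 0]
eigvals, A, B, F = factor_analysis(X, n_factors=2)
print(eigvals)
print(F.shape)
```

      Under these assumptions the eigenvalues sum to the number of variables, and the resulting factor scores are uncorrelated with unit variance, which is a convenient sanity check.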

      Discriminant analysis [34] is a statistical tool used to classify individuals into a number of groups. Discriminant Function Analysis (DFA) is used to separate two groups, and Canonical Variates Analysis (CVA) is used to separate more than two groups. A discriminant analysis has two potential goals: finding a predictive equation for classifying new individuals, or interpreting the predictive equation to better understand the relationships that may exist among the variables.
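
      As an illustration of the first goal, the following sketch fits a two-group linear discriminant function and uses it to classify new individuals. It assumes Fisher's linear rule with a pooled covariance matrix and a cut-off midway between the group means; the book does not state which variant it uses. The names `fit_dfa` and `predict` and the data are hypothetical.

```python
import numpy as np

def fit_dfa(X0, X1):
    """Fit a two-group linear discriminant function (Fisher's rule, assumed)."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    # Pooled within-group covariance matrix.
    S = ((n0 - 1) * np.cov(X0, rowvar=False)
         + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    # Discriminant coefficients w = S^{-1} (m1 - m0).
    w = np.linalg.solve(S, m1 - m0)
    # Cut-off score halfway between the projected group means.
    c = w @ (m0 + m1) / 2
    return w, c

def predict(x, w, c):
    """Assign group 1 if the discriminant score exceeds the cut-off, else group 0."""
    return 1 if w @ np.asarray(x) > c else 0

# Hypothetical measurements for two groups of individuals.
X0 = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [2.0, 2.5]])
X1 = np.array([[6.0, 5.0], [7.0, 6.5], [6.5, 6.0], [7.5, 5.5]])
w, c = fit_dfa(X0, X1)
print(predict([1.0, 1.0], w, c), predict([7.0, 6.0], w, c))
```

      The coefficient vector `w` also serves the second goal: the relative size of each coefficient indicates how strongly the corresponding variable contributes to separating the groups.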

      A Decision Tree (DT) [35] is a tree-like structure used for classification and regression. It is a supervised machine learning algorithm used in decision making. The objective of using a DT is to create a training model that can be used to predict the class or value of the target variable by learning simple decision rules inferred from prior data (training data). In a DT, to predict a class label for a record, one starts from the root of the tree. The value of the root attribute is compared with the record's attribute. Based on the comparison, one follows the corresponding branch and jumps to the next node.
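
      The traversal described above can be sketched in a few lines of Python. The tree here is written by hand rather than learned from training data, and the attribute names (`glucose`, `bmi`), thresholds and class labels are hypothetical values chosen only for illustration.

```python
def predict(node, record):
    """Walk from the root to a leaf, following one branch at each node."""
    while "label" not in node:
        # Compare the record's attribute value with the node's threshold.
        value = record[node["attribute"]]
        branch = "left" if value <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

# Hand-built tree (hypothetical attributes and thresholds).
tree = {
    "attribute": "glucose", "threshold": 125,
    "left": {"label": "negative"},
    "right": {
        "attribute": "bmi", "threshold": 30,
        "left": {"label": "negative"},
        "right": {"label": "positive"},
    },
}

print(predict(tree, {"glucose": 150, "bmi": 33}))
```

      A learning algorithm would choose the attributes and thresholds automatically from the training data; only the prediction step is shown here.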

      A Naïve Bayes classifier [35] is a probabilistic machine learning model that is used for classification tasks. Bayes' theorem is given as

      $$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{1.6}$$

      Using Bayes' theorem, we can find the probability of A occurring, given that B has occurred. Here, B represents the evidence and A represents the hypothesis. The assumption made here is that the predictors (features) are independent; that is, the presence of one particular feature does not affect the others. Hence the classifier is called naïve.
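
      A minimal sketch of this idea for categorical features is given below. It multiplies the prior P(A) by one conditional probability per feature, which is exactly the independence assumption, and then normalizes by the evidence P(B). The function names, the two features (fever, cough) and the eight training records are hypothetical, and no smoothing is applied, so a feature value never seen with a class gives that class a probability of zero.

```python
from collections import Counter, defaultdict

def train(records, labels):
    """Count class frequencies and per-class feature value frequencies."""
    priors = Counter(labels)
    counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for rec, y in zip(records, labels):
        for i, v in enumerate(rec):
            counts[(y, i)][v] += 1
    return priors, counts

def posterior(rec, priors, counts):
    """Return P(class | record) for every class, assuming independent features."""
    total = sum(priors.values())
    scores = {}
    for y, n_y in priors.items():
        p = n_y / total                    # prior P(A)
        for i, v in enumerate(rec):
            p *= counts[(y, i)][v] / n_y   # likelihood P(B_i | A)
        scores[y] = p
    evidence = sum(scores.values())        # P(B), the normalizing constant
    return {y: s / evidence for y, s in scores.items()}

# Hypothetical records: (fever, cough).
records = [("yes", "yes"), ("yes", "no"), ("no", "yes"), ("yes", "yes"),
           ("no", "no"), ("yes", "no"), ("no", "yes"), ("no", "no")]
labels = ["flu", "flu", "flu", "flu",
          "healthy", "healthy", "healthy", "healthy"]

priors, counts = train(records, labels)
post = posterior(("yes", "yes"), priors, counts)
print(post)
```

      For the query record, the unnormalized scores are 0.5 × 0.75 × 0.75 for "flu" and 0.5 × 0.25 × 0.25 for "healthy", giving posterior probabilities of 0.9 and 0.1.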

      Support Vector Machine (SVM) [35] is a supervised machine learning algorithm that aims to find a hyperplane in N-dimensional space that separates the classes. The hyperplane with the maximum margin is chosen. Support vectors are the data points that lie closest to the hyperplane and influence its position and orientation. Using these support vectors, the margin of the classifier is maximized. Deleting the support vectors will change the position of the hyperplane. These are the points that help in building the SVM.
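
      The roles of the margin and the support vectors can be illustrated on a toy example. In the sketch below the hyperplane (`w`, `b`) is written down by hand for four hypothetical points rather than found by an optimizer; it is scaled so that the closest points satisfy y(w·x + b) = 1, which is the usual convention and an assumption here. The code then computes the margin width 2/‖w‖ and picks out the support vectors.

```python
import math

def decision(w, b, x):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Hypothetical 2-D points with class labels +1 / -1.
points = [((2.0, 2.0), 1), ((3.0, 3.0), 1),
          ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Hand-chosen separating hyperplane in canonical form (not learned).
w, b = (0.5, 0.5), -1.0

# Margin width between the two classes: 2 / ||w||.
margin_width = 2 / math.hypot(*w)

# Support vectors are the points lying exactly on the margin boundary.
support_vectors = [x for x, y in points
                   if math.isclose(y * decision(w, b, x), 1.0)]

print(margin_width)
print(support_vectors)
```

      Here the margin width equals the distance between the two closest points of opposite classes, so no separating hyperplane could do better. Removing either support vector would allow a wider margin and therefore a different hyperplane, while removing the other two points would change nothing.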

      As the table shows, DT outperforms the other five models with an accuracy of 97.95%. The FA-FLANN model ranks second with an accuracy of 87.6%. The DA, KNN and SVM models give nearly the same results, with accuracies of 86.05%, 86.6% and 86.15%, respectively. The worst result is reported for the Naïve Bayes model, with an accuracy of 54.80%.

S. no.  Model name  Error during testing   Accuracy  Rank
                    Value      (%)
1.      FA-FLANN    0.1240     12.40%      87.60%    2
2.      DA          0.1395     13.95%      86.05%    5
3.      DT          0.0205     2.05%       97.95%    1
4.      KNN         0.1340     13.4%
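
      The accuracy column follows directly from the testing error: accuracy (%) = (1 − error) × 100. The short check below applies this to the four error values that appear in this excerpt of the table; the variable names are hypothetical.

```python
# Testing errors as listed in the table rows available in this excerpt.
errors = {"FA-FLANN": 0.1240, "DA": 0.1395, "DT": 0.0205, "KNN": 0.1340}

# Accuracy in percent, rounded to two decimals.
accuracy = {model: round((1 - err) * 100, 2) for model, err in errors.items()}

# Model with the highest accuracy among the listed rows.
best = max(accuracy, key=accuracy.get)

for model, acc in accuracy.items():
    print(f"{model}: {acc:.2f}%")
print("best:", best)
```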
