Biomedical Data Mining for Information Retrieval. Group of authors


The model performs better than the existing SAPS model. Ref. [11] develops an algorithm that predicts in-hospital death of ICU patients for event 1 and estimates the probability of death for event 2. Missing values are imputed with zero and the data are normalized. Six support vector machine (SVM) classifiers are trained; the training set of each SVM contains the positive examples and one sixth of the negative examples. The scores obtained for events 1 and 2 are 0.5345 and 17.88, respectively. An artificial neural network model is developed in Ref. [12] to predict in-hospital death of ICU patients from observations made during the first 48 h after admission. Missing values are replaced by an artificial value based on assumption. Of all the features, 26 are selected for further processing. Classification uses a two-layer neural network with 15 hidden neurons. The model uses 100 voting classifiers, and its output is the average of the 100 outputs. The model is trained and tested with 5-fold cross-validation. A fuzzy threshold determines the output of the neural network. The model scores 0.5088 for event 1 and 82.211 for event 2 on the test data set. Ref. [13] presents an approach that identifies time series motifs to predict in-hospital death of ICU patients, segmenting the variables into low, medium and high measurements. The method outperforms the existing scoring systems SAPS-II, APACHE-II and SOFA, scoring 0.46 for event 1 and 56.45 for event 2. Ref. [14] develops an improved in-hospital mortality predictor that combines logistic regression with a hidden Markov model. The model is trained on the 4,000 patient records of set A and validated on other, unseen sets of 4,000 records. Two events are scored: event 1 uses the minimum of sensitivity and positive predictive value, and event 2 uses the Hosmer–Lemeshow H statistic.
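The event 1 score described above (the minimum of sensitivity and positive predictive value) can be sketched in a few lines of Python. This is only an illustration: the function name `event1_score` and the toy labels are assumptions made here, not code or data from any of the cited works. Labels follow the 0 = survivor, 1 = died in hospital convention.

```python
def event1_score(y_true, y_pred):
    """Return min(sensitivity, positive predictive value) for binary labels.

    Hypothetical helper for illustration; 0 = survivor, 1 = died in hospital.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # deaths correctly predicted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deaths missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # survivors flagged as deaths

    # Guard against empty denominators (no deaths, or no positive predictions).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    return min(sensitivity, ppv)


# Toy data (made up): three deaths, five survivors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
score = event1_score(y_true, y_pred)
```

Taking the minimum means a classifier cannot score well by trading one quantity for the other: predicting death for every patient gives perfect sensitivity but a low positive predictive value, so the score stays low.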
The model of Ref. [14] scores 0.50 and 0.50 for event 1 and 15.18 and 78.9 for event 2, compared with SAPS-I, whose scores are 0.3170 and 0.312 for event 1 and 66.03 and 68.58 for event 2. An effective framework for predicting in-hospital mortality during the ICU stay is suggested in Ref. [15]. Features are extracted by data interpolation and histogram analysis. To reduce the complexity of feature extraction, the feature vector is reduced by evaluating the measurement values of each variable. Finally, a cascaded AdaBoost learning model is applied as the mortality classifier, scoring 0.806 for event 1 and 24.00 for event 2 on dataset A. On dataset B the model scores 0.379 for event 1 and 5331.15 for event 2. A decision support application for mortality risk prediction is reported in Ref. [16]. The authors use fuzzy rule-based systems for the clinical rules, and a genetic-algorithm optimizer generates the coefficients of the final solution. The FIS model scores 0.39 for event 1 and 94 for event 2. A new method for predicting ICU mortality is proposed in Ref. [17]. The method, Simple Correspondence Analysis (SCA), is based on both clinical and laboratory data together with the two earlier models APACHE-II and SAPS-II. The data come from the PhysioNet Challenge 2012: 12,000 records in total across sets A, B and C, with 37 recorded time series variables. SCA is applied to select variables and combines them using the traditional APACHE and SAPS methods. The method predicts whether or not the patient will survive. The model obtains a score 1 of 43.50% on set A, 42.25% on set B and 42.73% on set C. A naive Bayesian classifier is used in Ref. [18] to predict ICU mortality, with the aim of a high S1 and a low S2. S1 is defined from sensitivity and positive predictive value, and S2 from the Hosmer–Lemeshow H statistic.
In this classifier, missing values are replaced by NaN (Not-a-Number) when a variable is not measured. The model achieves 0.475 for S1, the eighth best solution, and 12.820 for S2, the best solution, on set B. On set C, the model scores 0.4928 for event 1 (fourth best solution) and 0.247 for event 2 (third best solution). Di Marco et al. [19] propose a new algorithm that predicts mortality with better accuracy from data collected during the first 48 h after ICU admission. A binary classifier is applied to obtain the result for event 1. Set A, which contains 41 variables for 4,000 patients, is used. Features are chosen by forward sequential selection with a logistic cost function. Classification uses a logistic regression model, which scores 54.9% on set A and 44.0% on test set B. Ref. [20] develops a mortality prediction model based on the support vector machine, a machine learning algorithm that tries to minimize error by finding the separating hyperplane with the maximum margin. The two classes are 0 for survivors and 1 for patients who died in hospital. The authors use 3,000 records for training and 1,000 for testing. They observe over-fitting of the SVM on set A, obtaining scores of 0.8158 for event 1 and 0.3045 for event 2. In phase 2 they improve the SVM training strategy to reduce the over-fitting. The final score is 0.530 for event 1, 0.350 for set B and 0.333 for set C. An algorithm based on an artificial neural network is employed in Ref. [21] to predict in-hospital patient mortality. Features are extracted from the PhysioNet data with a method originally used to detect solar ‘nanoflares’, chosen because of the similarity between solar data and these time series. The data are preprocessed to remove outliers, and missing values are replaced by the mean value of each patient. The trained model scores 22.83 for event 2 on set B and 38.23 on set C.
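Replacing missing values with the mean value of each patient, as described for Ref. [21], can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes the mean is taken per variable over that patient's own observed values and that missing entries are marked with `None`. The function name `impute_patient_mean` and the sample heart-rate values are hypothetical.

```python
from statistics import mean


def impute_patient_mean(series):
    """Fill None entries with the mean of the observed values in the same series.

    ASSUMPTION: `series` holds one variable for one patient, so the fill
    value is that patient's own mean for the variable.
    """
    observed = [v for v in series if v is not None]
    if not observed:
        # Nothing was measured, so there is no mean to fill with; return a copy.
        return list(series)
    fill = mean(observed)
    return [fill if v is None else v for v in series]


# Made-up heart-rate readings for one patient, with two gaps.
patient_hr = [80.0, None, 90.0, None, 100.0]
imputed = impute_patient_mean(patient_hr)
```

A variable that was never measured for a patient is left untouched here; a real pipeline would need a separate rule for that case, such as a population-level default.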
A logistic regression model is suggested in Ref. [22] for the same purpose. It follows three phases. Phase 1 selects derived variables on set A and computes, for each variable, the first value, average, minimum, maximum, total time, first difference and last value. Phase 2 applies a logistic regression model to predict in-hospital death (0 for survivor, 1 for died) on set A. Phase 3 applies the logistic regression model to obtain the scores for events 1 and 2. The results are 0.4116 for score 1 and 8.843 for score 2. Ref. [23] also reports a logistic regression model for mortality prediction. The experiment uses 4,000 ICU patients in set A for training and 4,000 patients in set B for testing. The filtering process identifies 30 variables for building the model. The scores obtained are 0.451 for event 1 and 45.010 for event 2. A novel cluster analysis technique is used in Ref. [24] to test the similarity between time series for mortality prediction. For preprocessing, a segmentation-based approach divides each variable into several segments, and the maximal and minimal values are kept to preserve its statistical features. Weighted Euclidean distance-based clustering and rule-based classification are used. On average, 22.77 to 33.08% is obtained for death prediction and 75 to 86% for survival prediction.
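The per-variable summaries listed for phase 1 of Ref. [22] (first value, average, minimum, maximum, total time, first difference, last value) can be sketched as follows. The text does not define "total time" or "first difference", so this sketch assumes the span between the first and last measurement and the second value minus the first. The function name `summarize_series` and the sample readings are hypothetical.

```python
from statistics import mean


def summarize_series(times, values):
    """Summarize one variable's time series for one patient.

    times:  measurement times in hours since admission, ascending
    values: the measurements taken at those times
    """
    if not values or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equally long")
    return {
        "first": values[0],
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        # ASSUMPTION: total time = span from first to last measurement.
        "total_time": times[-1] - times[0],
        # ASSUMPTION: first difference = second value minus first value;
        # defined as 0.0 when only one measurement exists.
        "first_diff": values[1] - values[0] if len(values) > 1 else 0.0,
        "last": values[-1],
    }


# Made-up heart-rate readings within the first 48 h.
hr_times = [0.5, 6.0, 12.0, 47.5]
hr_values = [88.0, 92.0, 110.0, 97.0]
features = summarize_series(hr_times, hr_values)
```

Each patient's irregularly sampled series is reduced to a fixed-length feature vector, which is the form a logistic regression model needs as input.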

      Sepsis is one of the reasons for high mortality rate and it should
