Machine Learning for Healthcare Applications. Group of authors


few days and their health status. The health status of one set of users is already defined; these are the labeled users UL. The health status of the other set of users is not defined; these are the unlabeled users UV. The aim of the proposed model is to learn a function that uses the information of the labeled users UL to find the health status of the unlabeled users UV.

      Given a series of activities from the last t days, the objective is to learn a function F,

      2.4.1 Pre-Processing

      The daily life activities of an individual that are mainly considered are screen time, sleep time, physical activity, number of cigarettes smoked, and units of alcohol consumed. The measures that are mainly considered are age, gender, height, weight, and calorie intake. Thus, ten features are collected from an individual. In the pre-processing step, the number of features is then reduced by removing the activities and measures that do not have a direct effect on health status. This is achieved by using the Harris–Benedict Equation [4].

      The Harris–Benedict Equation [4] is a method used to estimate an individual’s basal metabolic rate (BMR). It states:

For men:   BMR = (10 × Weight in kg) + (6.25 × Height in cm) − (5 × Age in years) + 5
For women: BMR = (10 × Weight in kg) + (6.25 × Height in cm) − (5 × Age in years) − 161
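      The two formulas above can be sketched as a small function. This is a minimal illustration that uses the coefficients exactly as quoted in the text; the function name `bmr` and the string labels `"male"` and `"female"` are assumptions made for this example.

```python
def bmr(sex: str, weight_kg: float, height_cm: float, age_years: float) -> float:
    """Estimate basal metabolic rate with the coefficients quoted in the text."""
    # Part of the formula shared by both sexes.
    base = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    if sex == "male":
        return base + 5
    if sex == "female":
        return base - 161
    raise ValueError(f"unsupported sex label: {sex!r}")


# Example: a 30-year-old who weighs 70 kg and is 175 cm tall.
print(bmr("male", 70, 175, 30))    # 1648.75
print(bmr("female", 70, 175, 30))  # 1482.75
```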

      As per the Harris–Benedict Equation [4], the calories to be consumed depend on the BMR value and the physical activity.

[Figure: Schematic illustration of the architecture of the model.]

       Calories to be consumed = BMR × Physical Activity

       Calories Difference = (Calories Consumed) − (Calories to be consumed).

      In the proposed method, the number of features is reduced to seven: age, gender, sleep time, screen time, number of cigarettes, units of alcohol consumed, and calories difference.

      2.4.2 Phase-I

      Phase-I of the model processes the data received from both the data sources and the user. Initially, the model is trained with the dataset received from the data sources. In this phase, a decision tree classifier takes the activities of an individual as input and estimates the status of the health parameters for one day. However, an individual’s health status cannot be judged accurately from a single day’s output. Thus, the output of Phase-I is collected over a week and fed to Phase-II.

      2.4.3 Phase-II

      Phase-II of the model processes the data received from the data sources and the output of Phase-I. Initially, the model is trained with the dataset received from the data sources. In this phase, a decision tree classifier takes the daily status of the health parameters over a week (i.e., the output of Phase-I) as input and estimates the health status of the individual for that week. Based on this estimate, Phase-II outputs the alerts, predictions, and suggestions that are to be sent to the individual.
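      The data flow between the two phases can be sketched as follows. This is only an illustration of how seven daily Phase-I outputs feed one Phase-II decision. The names `run_two_phase`, `toy_phase1`, and `toy_phase2` are hypothetical, and the two toy functions are simple stand-ins for the trained decision tree classifiers, not the classifiers themselves.

```python
from typing import Callable, Sequence


def run_two_phase(
    daily_records: Sequence[dict],
    phase1: Callable[[dict], str],
    phase2: Callable[[Sequence[str]], str],
    days: int = 7,
) -> str:
    """Classify each day with Phase-I, then the whole week with Phase-II."""
    if len(daily_records) != days:
        raise ValueError(f"expected {days} daily records, got {len(daily_records)}")
    # Phase-I: one health status per day.
    daily_status = [phase1(record) for record in daily_records]
    # Phase-II: one decision from the week of daily statuses.
    return phase2(daily_status)


# Hypothetical stand-ins for the two trained decision tree classifiers.
def toy_phase1(record: dict) -> str:
    return "poor" if record["sleep_time"] < 7 else "good"


def toy_phase2(week: Sequence[str]) -> str:
    return "alert" if list(week).count("poor") >= 4 else "ok"


week = [{"sleep_time": h} for h in (5, 6, 8, 6, 5, 9, 8)]
print(run_two_phase(week, toy_phase1, toy_phase2))  # alert (4 of 7 days are "poor")
```

      Passing the classifiers in as callables keeps the weekly aggregation independent of how each phase is trained.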

      2.4.4 Dataset Generation

      The sub-sections below provide the details of the rules collection and the dataset generation. The generated dataset is used for training the model proposed in the previous section.

      To prepare the datasets, a proper set of rules is required that describes how the daily life activities of an individual affect his or her health status. The rules are collected from different trusted sources [1, 5]. Based on the activities and measures of an individual, these rules give the overall health status of that individual. For example, the recommended sleep time for a person aged between 6 and 13 years is 9 to 11 h. If the sleep time is between 7 and 8 h, it is a little less than normal. If the sleep time is between 11 and 12 h, it is a little more than normal. If the sleep time is more than 12 h or less than 7 h, then it affects health.
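      The sleep rule in the example above can be written as a small lookup function. This is a sketch only: the function name and the status labels are hypothetical, and the boundary handling (treating anything from 7 h up to 9 h as slightly low, and anything below 7 h or above 12 h as affecting health) is an assumption made so that every value falls into exactly one band.

```python
def sleep_status_age_6_to_13(sleep_hours: float) -> str:
    """Map daily sleep time (hours) to a status for ages 6 to 13."""
    if 9 <= sleep_hours <= 11:
        return "normal"
    if 7 <= sleep_hours < 9:
        return "slightly_low"
    if 11 < sleep_hours <= 12:
        return "slightly_high"
    # Below 7 h or above 12 h.
    return "affects_health"


for hours in (10, 7.5, 11.5, 6, 13):
    print(hours, sleep_status_age_6_to_13(hours))
```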

      The features are selected from the collected rules, since these rules depend on certain activities and measures of an individual. For example, the alcohol consumption rules for females are different from those for males. Similarly, the calorie value recommended for a person of 100 kg is different from that for a person of 50 kg [1]. In these examples, gender and weight are the features that are selected. In a similar fashion, all the features (age, gender, height, weight, calorie intake, units smoked, units drunk, physical activity, screen time, and sleep time) were collected.

       2.4.4.3 Feature Reduction

      Although the features were collected, some of them might not affect the health status of a person directly. Thus, the collected features need to be transformed into the actual features that affect the health status. Here, the Harris–Benedict equation [4], a method used to estimate an individual’s basal metabolic rate (BMR), is used to reduce the features. According to it, the calories to be consumed depend on the BMR value and the physical activity.

      For example, if the physical activity is sedentary or a little active, then the calories to be consumed are 1.2 × BMR. If the physical activity is lightly active, then the calories to be consumed are 1.375 × BMR. If the physical activity is moderate, then the calories to be consumed are 1.55 × BMR. If the physical activity is intense exercise, then the calories to be consumed are 1.725 × BMR. If the physical activity is extra hard exercise, then the calories to be consumed are 1.9 × BMR.
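      The multipliers listed above map naturally onto a lookup table. The following is a minimal sketch; the dictionary keys and the function name `calories_to_be_consumed` are hypothetical labels chosen for this example, while the numeric factors are the ones given in the text.

```python
# Activity multipliers as listed in the text; the key names are illustrative.
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "lightly_active": 1.375,
    "moderate": 1.55,
    "intense": 1.725,
    "extra_hard": 1.9,
}


def calories_to_be_consumed(bmr: float, activity_level: str) -> float:
    """Scale a BMR value by the multiplier for the given activity level."""
    try:
        factor = ACTIVITY_FACTORS[activity_level]
    except KeyError:
        raise ValueError(f"unknown activity level: {activity_level!r}") from None
    return bmr * factor


# Example: a BMR of 1600 kcal at each activity level.
for level in ACTIVITY_FACTORS:
    print(level, round(calories_to_be_consumed(1600, level), 1))
```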

       Calories Difference = (Calories Consumed) − (Calories to be consumed)     (2.1)

      Thus, the total number of inputs is reduced to seven: Age, Gender, Number of Units Smoked, Units of Alcohol Consumed, Screen Time, Sleep Time, and Calories Difference.
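      The reduction from ten collected features to seven inputs can be sketched as a single transformation. In this illustration the field names and the function `reduce_features` are hypothetical, and `calories_needed` is assumed to have been computed beforehand as the BMR multiplied by the physical activity factor. Height, weight, physical activity, and calorie intake are folded into the single calories difference value.

```python
def reduce_features(raw: dict, calories_needed: float) -> dict:
    """Collapse the ten collected features into the seven model inputs."""
    return {
        "age": raw["age"],
        "gender": raw["gender"],
        "units_smoked": raw["units_smoked"],
        "units_alcohol": raw["units_alcohol"],
        "screen_time": raw["screen_time"],
        "sleep_time": raw["sleep_time"],
        # Calories consumed minus calories to be consumed.
        "calories_difference": raw["calorie_intake"] - calories_needed,
    }


# Illustrative record only; the values are not taken from any dataset.
raw_record = {
    "age": 30, "gender": "male", "height": 175, "weight": 70,
    "calorie_intake": 2500, "units_smoked": 0, "units_alcohol": 1,
    "physical_activity": "moderate", "screen_time": 4, "sleep_time": 7.5,
}
reduced = reduce_features(raw_record, calories_needed=2000)
print(len(reduced), reduced["calories_difference"])  # 7 500
```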

       2.4.4.4 Dataset Generation From Rules
