Machine Learning Algorithms and Applications. Group of authors

80.2 × 10^3    30 × 10^3    0.0077    99.8115%    99.7981%

Figure: Schematic illustration of the result of egg classification generated by the proposed method.

      Conventional methods that extract egg count information from digital images need hardly any training data. By contrast, the proposed CNN-based method requires large datasets so that it can learn the features automatically, and the amount of training, test, and validation data it needs grows as the number of hidden layers increases.

      Many free datasets can be downloaded to train CNN models for tasks such as classifying handwritten digits or identifying objects. However, no public dataset exists for the sericulture field, and in particular for silkworm egg counting or classification. In our work, therefore, the training datasets were generated by cropping class images from the silkworm egg sheets and attaching class labels and other features needed for CNN training, such as the egg center location. A set of over 400K images was generated for egg location and the FB class, and a set of over 100K images for the individual classes (HC and UHC). Data augmentation was also applied to enlarge the datasets.
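      The cropping and augmentation steps described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the patch size, the annotation format, the helper names `crop_patch` and `augment`, and the choice of flips and 90-degree rotations as augmentations are all assumptions, and a random array stands in for a scanned egg sheet.

```python
import numpy as np


def crop_patch(sheet, center, size):
    """Crop a size x size patch centered on (row, col); return None near the border."""
    half = size // 2
    r0, c0 = center[0] - half, center[1] - half
    r1, c1 = r0 + size, c0 + size
    if r0 < 0 or c0 < 0 or r1 > sheet.shape[0] or c1 > sheet.shape[1]:
        return None
    return sheet[r0:r1, c0:c1].copy()


def augment(patch):
    """Return the 8 flip/rotation variants of a square patch (assumed augmentations)."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))
    return variants


# Stand-in for a grayscale scan of an egg sheet (hypothetical data).
rng = np.random.default_rng(0)
sheet = rng.integers(0, 256, size=(200, 300), dtype=np.uint8)

# Hypothetical annotations: (egg center as (row, col), class label).
annotations = [((50, 60), "HC"), ((120, 200), "UHC"), ((5, 5), "HC")]

patches, labels = [], []
for center, label in annotations:
    patch = crop_patch(sheet, center, size=32)
    if patch is None:
        continue  # egg too close to the sheet edge for a full patch
    for variant in augment(patch):
        patches.append(variant)
        labels.append(label)

dataset = np.stack(patches)  # shape: (number of patches, 32, 32)
```

      Each annotated egg yields several labeled patches, which is one way a few hundred eggs per sheet could grow into image sets of the size reported here.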

      Model performance drops on new egg data whose color and texture differ completely from anything available in the training dataset. This happens because eggs of different breeds differ spatially from the eggs on which the model was trained. Collecting data for other silkworm breeds and training the deep learning model on them should resolve this issue; that work is under way.

Test sample    True count  Count prediction  Time (sec)  Class score HC  Class score UHC  Accuracy (%)
MSR1_001.jpg   588         586               11.83       437             149              99.65
MSR1_002.jpg   534         526               8.99        473             53               98.68
MSR1_003.jpg   554         556               10.42       491             65               99.28
MSR1_004.jpg   539         528               9.81        501             27               97.95
MSR1_005.jpg   597         588               11.14       562             26               98.32
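      A short check of the table above shows how its columns relate: for every test sample, the HC and UHC class scores add up to the predicted count. The sketch below also derives the per-sheet counting error. The definition of the Accuracy (%) column is not stated in this excerpt, so it is not recomputed here; the variable names are illustrative only.

```python
rows = [
    # (test sample, true count, predicted count, HC score, UHC score)
    ("MSR1_001.jpg", 588, 586, 437, 149),
    ("MSR1_002.jpg", 534, 526, 473, 53),
    ("MSR1_003.jpg", 554, 556, 491, 65),
    ("MSR1_004.jpg", 539, 528, 501, 27),
    ("MSR1_005.jpg", 597, 588, 562, 26),
]

count_errors = {}
for name, true_count, predicted, hc, uhc in rows:
    # The two class scores partition the predicted egg count.
    assert hc + uhc == predicted
    count_errors[name] = predicted - true_count  # negative means undercounting
    print(f"{name}: {count_errors[name]:+d} eggs")

total_true = sum(r[1] for r in rows)
total_predicted = sum(r[2] for r in rows)
mean_abs_error = sum(abs(e) for e in count_errors.values()) / len(rows)
print(f"total: {total_predicted} predicted vs {total_true} true, "
      f"mean absolute error {mean_abs_error:.1f} eggs per sheet")
```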

      This paper explains a CNN-based silkworm egg counting and classification model that overcomes many issues found with conventional image processing techniques. The contribution of this paper is fourfold. First, it generalizes the capture of silkworm egg sheet data in digital format by using ordinary paper scanners rather than purpose-built hardware. This eliminates the need for additional light sources to provide uniform illumination during recording and maintains high repeatability.

      Second, the scanned digital data can be transformed to a standard size by using key markers stamped onto the egg sheets before scanning. This allows the user to resize the digital data and later use it in an image processing algorithm or a CNN without introducing dimensionality errors.
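      One way such a marker-based normalization could work is to fit an affine map that sends the detected marker positions to fixed reference positions, and then resample the scan with that map. The sketch below covers only the fitting step. It assumes exactly three non-collinear markers, and the marker coordinates, the reference frame, and the function names are hypothetical; the source does not state how many markers are used or which transform is applied.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("markers are collinear; the affine map is undefined")
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]  # replace one column with the right-hand side
        solution.append(det(m) / d)
    return solution


def affine_from_markers(scanned, reference):
    """Fit x' = ax*x + bx*y + cx and y' = ay*x + by*y + cy from three point pairs."""
    a = [[x, y, 1.0] for x, y in scanned]
    row_x = solve3(a, [p[0] for p in reference])
    row_y = solve3(a, [p[1] for p in reference])
    return row_x, row_y


def apply_affine(transform, point):
    """Map one (x, y) point from scan coordinates to the standard frame."""
    row_x, row_y = transform
    x, y = point
    return (row_x[0] * x + row_x[1] * y + row_x[2],
            row_y[0] * x + row_y[1] * y + row_y[2])


# Hypothetical marker positions (pixels) detected in a slightly skewed scan,
# and the positions they should occupy in the standard-size frame.
scanned_markers = [(120.0, 80.0), (2120.0, 95.0), (105.0, 2880.0)]
reference_markers = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1400.0)]

transform = affine_from_markers(scanned_markers, reference_markers)
mapped = [apply_affine(transform, p) for p in scanned_markers]
```

      With the map in hand, every scan can be brought to the same dimensions regardless of scanner resolution or small placement shifts, so egg coordinates and patch sizes stay comparable across sheets.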

      Third, a dataset of over 400K images representing different features of silkworm eggs has been put together. CNNs and other models that need large amounts of training, testing, and validation data can use this dataset directly and skip the data generation phase.

      Fourth, a CNN model has been trained on this dataset to predict the egg class and count the number of eggs per egg sheet. With only four hidden layers and a fully connected layer, the model achieves over 97% accuracy and outperforms many conventional approaches.

      The model accurately counts eggs of different silkworm breeds, but new datasets are needed to predict class labels for new breeds on which the model has not been trained. This is because HC class eggs have high pixel intensity throughout the egg surface while UHC has dark pixels at the center surrounded
