Application of Region-Based Convolutional Neural Network (R-CNN) for Prompt Segmentation between Infected Cucumber Leaves and Healthy Cucumber Leaves

ABSTRACT: Segmentation of diseased plant leaf images plays an important role in detecting plant diseases through leaf symptoms, and early separation of infected from healthy leaves can prevent horticultural loss. To this end, a Region-Based Convolutional Neural Network (R-CNN) was proposed and applied for prompt segmentation between infected and healthy cucumber leaves. A whole color cucumber leaf image is input into the convolutional neural network of the Mask R-CNN model; the extracted feature map is then passed to the region proposal network for region proposals, and the proposed, unaligned regions of interest are aligned and passed to the fully connected layers as a fixed-size feature map for (1) bounding-box prediction, (2) classification, and (3) masking. The experimental results obtained in this work are on a par with results reported in the literature, which demonstrates the effectiveness of the proposed method and its high practical value for plant growth monitoring.

In horticulture, various plant diseases negatively affect plants and reduce both the quantity and quality of their yields. Automatic detection of plant diseases through the symptoms shown on plant leaves is a comparatively effective and affordable solution, as it can reduce the burden of monitoring plants in a typical horticultural environment. Leaf image segmentation is essential to plant disease detection; the reliability and accuracy of image feature extraction and disease recognition depend on this segmentation process. Some applications of Region-Based Convolutional Neural Network (R-CNN) to image segmentation are available in the literature (He et al., 2020; Bello et al., 2021a; Bello et al., 2021b; Bello et al., 2021c). Al-Hiary et al. (2011) presented an algorithm for segmenting leaf disease images in which mostly green pixels are first identified and masked using Otsu's method based on specific computed threshold values. Pixels with zero RGB values and pixels on the boundaries of the infected clusters are then completely removed. Kaur et al. (2015) proposed a segmentation method based on the K-means clustering algorithm for segmenting images of diseased grape leaves. Chaudhary et al. (2012) proposed an algorithmic method and compared the effects of the HSI, CIELAB, and YCbCr color spaces in detecting and segmenting disease spots on plant leaves using image processing techniques. Hanping (2008) proposed a method based on Fuzzy C-means clustering for segmenting diseased cotton leaves. This is similar to the work of Meunkaewjinda et al. (2008), where diseased grape leaf images were segmented by an unsupervised optimal Fuzzy C-means clustering algorithm. Also, Qin et al. (2016) proposed the integration of twelve lesion segmentation methods and clustering algorithms for the recognition of alfalfa leaf diseases.
BELLO, R. W; ADIPERE, G. F.
The threshold, edge detection, watershed, K-means clustering, and Fuzzy C-means clustering methods were compared in Mohammad et al. (2016) for the detection of plant disease. They posited that the fuzzy clustering algorithm is appropriate for tasks involving overlapping clusters. Based on the above literature, image segmentation of diseased leaves is a herculean task because of the ambiguity in image feature representation and the complexity of diseased leaf images, including the occlusion of some regions. Most diseased leaf segmentation algorithms use a criterion or fixed threshold to differentiate the spots in a leaf image by gray-level differences among spot pixels, normal pixels, and background pixels. In practice, however, the pixel regions in an image of a diseased leaf are difficult to detect because they are changeable; in addition, the normal and spot regions are uneven and blurred in color, and the gray histogram of the diseased leaf image is occluded. Therefore, fixed threshold, histogram clustering, Otsu, and level-set algorithms cannot effectively segment a lesion image (Li et al., 2010). Superpixel clustering has been applied in computer vision and image processing (Zhang et al., 2020). Superpixel clustering groups neighboring pixels with similar image features, such as color and texture, into the same regions, so that the complexity of an image is reduced from a huge number of pixels to a much smaller number of superpixels (Gui et al., 2015). Each approach to generating superpixels, namely graph-based or gradient-ascent, has its own advantages and disadvantages (Ibrahim and El-kenawy, 2020; Niu et al., 2018; He et al., 2022). In diseased leaf segmentation, a superpixel clustering technique can provide a compact representation of the original colors of the diseased leaf image.
In addition, the superpixels of a diseased leaf image contain more information than individual pixels, with overall benefits in computational speed, efficiency, and memory cost (Zhang et al., 2020). The Expectation Maximization (EM) algorithm, on the other hand, is often employed for color image segmentation because of its effectiveness (Channoufi et al., 2018). However, its initial parameters are not easy to determine. Running EM on each superpixel rather than on the whole image makes the initial parameters easier to estimate, because the pixels in a superpixel are similar and compact. Moreover, EM is slow to converge when its initial parameters are randomly set for the whole image, which greatly affects its speed and practical applications. In this work, a Region-Based Convolutional Neural Network (R-CNN) was applied for prompt segmentation between infected and healthy cucumber leaves.

MATERIALS AND METHODS
Cucumber Dataset Used: A dataset of diseased cucumber leaf images is required for this work. The dataset consists of cucumber leaf images with complicated background objects, and is used to train the neural network for the image segmentation and classification experiments and for their evaluation. One thousand images of diseased cucumber leaves were employed for the experiments, of which eight hundred were used as the training dataset and the remaining two hundred as the testing dataset. We trained our proposed model for the segmentation and classification of powdery mildew disease on the diseased cucumber leaf dataset of Talasila et al. (2022), which we augmented.
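The 800/200 division described above can be illustrated with a minimal sketch. The text does not state how the images were assigned to each subset, so the seeded random shuffle, the function name `split_dataset`, and the file names below are assumptions made purely for illustration.

```python
import random

def split_dataset(image_paths, n_train=800, seed=0):
    """Shuffle a list of image paths and split it into training and testing subsets."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)  # assumption: a seeded random split
    return paths[:n_train], paths[n_train:]

# Hypothetical file names standing in for the 1000 cucumber leaf images.
images = [f"leaf_{i:04d}.jpg" for i in range(1000)]
train_set, test_set = split_dataset(images)
print(len(train_set), len(test_set))
```

Fixing the seed keeps the split reproducible between runs, which matters when results are compared against benchmark methods.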
Software and Application: The software and hardware specifications used in conducting this experiment are as follows: (1) Software: 64-bit Windows 10 operating system, Jupyter IDE, and the OpenCV Python library; (2) Hardware: Intel Core i5 processor at 2.4 GHz, 16 GB RAM, GeForce GTX 1080 Ti graphics card, 2 TB hard disk, and a 10.1-inch IPS HD portable LCD gaming monitor with VGA and HDMI interfaces for PS3/PS4/Xbox 360/CCTV/camera. The model hyper-parameters used in conducting this experiment are as follows: learning rate of 0.001, weight decay of 0.0001, learning momentum of 0.90, batch size of 200, detection confidence of ≥0.50, 5 batches, 5 epochs, 5 iterations per epoch, 1000 steps per epoch, 5 validation steps, mask shape of 28×28, and 2 anchor classes (instance and background).
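One way to keep such hyper-parameters in a single place is a small configuration object. The sketch below records a subset of the values listed above; the class name `TrainingConfig` and all field names are hypothetical and do not correspond to any particular Mask R-CNN implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    # Field names are hypothetical; the values are those listed in the text.
    learning_rate: float = 0.001
    weight_decay: float = 0.0001
    learning_momentum: float = 0.90
    batch_size: int = 200
    detection_min_confidence: float = 0.50
    epochs: int = 5
    steps_per_epoch: int = 1000
    validation_steps: int = 5
    mask_shape: tuple = (28, 28)
    num_classes: int = 2  # instance and background

config = TrainingConfig()
print(config)
```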

Framework of Mask R-CNN for Color Diseased Leaf Image Segmentation: Mask R-CNN (He et al., 2020) adopts a two-stage procedure for image segmentation, with the Region Proposal Network (RPN) as the first stage. In the second stage, in parallel to predicting the class and box offset of the diseased leaf image, Mask R-CNN outputs a binary mask for each RoI. This is different from other systems, where classification depends on mask predictions. The approach of Mask R-CNN in segmenting diseased leaf images follows Fast R-CNN (Girshick, 2015), which applies bounding-box classification and regression in parallel and thereby simplifies the multi-stage pipeline of R-CNN (Girshick et al., 2014). Formally, during training of Mask R-CNN for diseased leaf image segmentation, a multi-task loss is defined on each sampled RoI as follows:
$$L = L_{cls} + L_{box} + L_{mask} \tag{1}$$
The classification loss $L_{cls}$ and bounding-box loss $L_{box}$ are identical to those defined in Girshick (2015). The mask branch has a $Km^2$-dimensional output for each RoI, which encodes K binary masks of resolution m × m, one for each of the K classes. To this output, we applied a per-pixel sigmoid, and defined $L_{mask}$ as the average binary cross-entropy loss. For an RoI associated with ground-truth class k, $L_{mask}$ is defined only on the k-th mask; the other mask outputs do not contribute to the loss. This definition of $L_{mask}$ allows the network to generate masks for every class without competition among classes; the dedicated classification branch predicts the class label used to select the output mask, thereby decoupling mask and class prediction. This is in contrast to the common practice of applying FCNs (Long et al., 2015) to semantic segmentation, which typically uses a per-pixel softmax and a multinomial cross-entropy loss and therefore causes the masks to compete across classes.
Mask R-CNN does not have this drawback because it uses a per-pixel sigmoid and a binary loss. The experiment conducted in this work shows that this formulation is important for accurate instance segmentation results. In color diseased leaf image segmentation, Mask R-CNN is used to provide a compact representation of the diseased leaf image; it then generates region proposals using the region proposal network (RPN) and applies region of interest alignment (RoIAlign) for feature map matching before the fully connected layers (FCL) operate on the fixed-size feature map. The fully connected layers belong to the head of the model and are divided into bounding-box, class, and mask branches, as illustrated in Fig. 1. In the figure, the diseased cucumber leaf image (Fig. 1a) is input into the model, its features are extracted by the CNN layers and processed across the networks, and at the last stage of the fully connected layers a masked diseased cucumber leaf image is generated as output (Fig. 1b).
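The mask loss described in this section can be illustrated with a minimal sketch in plain Python: a per-pixel sigmoid followed by average binary cross-entropy, evaluated only on the mask of the ground-truth class. The function names and the toy 2 × 2 logits are assumptions for illustration; the cross-entropy is written in its numerically stable "with logits" form, which is mathematically equivalent to applying the sigmoid first.

```python
import math

def bce_with_logits(z, y):
    """Binary cross-entropy of sigmoid(z) against label y, in a stable form."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

def mask_loss(mask_logits, gt_mask, gt_class):
    """Average per-pixel binary cross-entropy on the ground-truth class mask.

    mask_logits: K masks, each an m x m nested list of raw scores.
    gt_mask:     m x m nested list of 0/1 ground-truth values.
    gt_class:    index k of the ground-truth class.
    """
    logits_k = mask_logits[gt_class]  # the other K-1 masks do not contribute
    total, count = 0.0, 0
    for row_logits, row_gt in zip(logits_k, gt_mask):
        for z, y in zip(row_logits, row_gt):
            total += bce_with_logits(z, y)
            count += 1
    return total / count

def multi_task_loss(l_cls, l_box, l_mask):
    """Sum of the three loss terms, as in Eq. (1)."""
    return l_cls + l_box + l_mask

# Toy example with K = 2 classes and m = 2 (values are illustrative only).
logits = [
    [[0.0, 0.0], [0.0, 0.0]],    # mask for class 0
    [[5.0, -5.0], [-5.0, 5.0]],  # mask for class 1
]
truth = [[1, 0], [0, 1]]
l_mask = mask_loss(logits, truth, gt_class=1)
print(l_mask, multi_task_loss(0.2, 0.1, l_mask))  # 0.2 and 0.1 are placeholder losses
```

Because only the k-th mask enters the loss, changing the logits of any other class leaves the result unchanged, which is the decoupling of mask and class prediction described above.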

Plant Diseased Leaf Image Region Proposal and Region of Interest Alignment:
Several methods can extract a small feature map (e.g., 7×7) from each RoI. RoIPool (Girshick, 2015) is one such method. In RoIAlign, bilinear interpolation (Jaderberg et al., 2015) is used to compute the feature values at four regularly sampled locations in each RoI bin, and the result is aggregated using max or average. RoIAlign greatly improves on the RoIWarp operation proposed in Dai et al. (2016). Unlike RoIAlign used in Mask R-CNN, RoIWarp did not address the alignment issue and was implemented in Dai et al. (2016) by quantizing the RoI, just like RoIPool. Although RoIWarp also adopts bilinear resampling inspired by Jaderberg et al. (2015), its performance is on a par with RoIPool, which shows the crucial importance of alignment.
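The sampling scheme mentioned above, four regularly sampled locations per RoI bin aggregated by max or average, can be sketched in plain Python. This is a simplified single-channel illustration that assumes bilinear interpolation and RoI coordinates already expressed in feature-map units; the function names and the coordinate convention (pixel values located at integer coordinates) are assumptions, not the implementation used in the experiments.

```python
def bilinear(feature, y, x):
    """Bilinearly interpolate a 2-D feature map (nested lists) at a real-valued (y, x)."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the feature map
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def roi_align(feature, roi, out_size=2, reduce=max):
    """Pool an RoI (y1, x1, y2, x2) into out_size x out_size bins without quantization."""
    y1, x1, y2, x2 = roi
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            # Four regularly spaced sample points inside bin (i, j).
            samples = [
                bilinear(feature, y1 + (i + sy) * bin_h, x1 + (j + sx) * bin_w)
                for sy in (0.25, 0.75)
                for sx in (0.25, 0.75)
            ]
            row.append(reduce(samples))
        output.append(row)
    return output

# Toy 4 x 4 feature map whose value equals the column index.
feature = [[float(x) for x in range(4)] for _ in range(4)]
mean = lambda values: sum(values) / len(values)
print(roi_align(feature, (0.0, 0.5, 2.0, 2.5), out_size=2, reduce=mean))
```

The RoI boundaries (0.5 and 2.5 here) are used as real numbers throughout; rounding them to integers before pooling is the quantization step that RoIAlign avoids.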

Plant Diseased Leaf Classification by Masking:
A mask encodes the spatial layout of the input diseased leaf image. Thus, unlike class labels or box offsets, which are inevitably collapsed into short output vectors by fully connected layers (FCL), the spatial structure of masks can be extracted naturally through the pixel-to-pixel correspondence provided by convolutions. Using a fully convolutional network (FCN), an m × m mask is predicted from each RoI.
This allows each layer in the mask branch to maintain the explicit m × m object spatial layout without collapsing it into a vector representation that lacks spatial dimensions. In contrast to previous methods that rely on FCL for mask prediction, the fully convolutional representation requires fewer parameters, and experiments in this study demonstrate its superior accuracy. This pixel-to-pixel behavior requires the RoI features to be well aligned in order to preserve the explicit per-pixel spatial correspondence. For this reason, the RoIAlign layer was applied as a key part of the Mask R-CNN model for predicting the mask of the diseased leaf image, as shown in the head region of Fig. 1. Fig. 2 shows the workflow of the proposed model.
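The claim that a fully convolutional mask predictor needs fewer parameters than a fully connected one can be checked with simple arithmetic. The sketch below compares a single output layer of each kind; the feature-map size (28 × 28 with 256 channels) and the 1 × 1 convolution are assumed values chosen for illustration and are not taken from the experiments.

```python
def fc_mask_params(m, channels, num_classes):
    """Weights and biases of one fully connected layer mapping an m x m x channels
    feature map to num_classes masks of size m x m."""
    inputs = m * m * channels
    outputs = num_classes * m * m
    return inputs * outputs + outputs

def conv_mask_params(channels, num_classes, kernel=1):
    """Weights and biases of one kernel x kernel convolution producing num_classes
    mask channels; the count does not depend on the spatial size m."""
    return kernel * kernel * channels * num_classes + num_classes

# Assumed sizes: m = 28, 256 channels, K = 2 classes.
print(fc_mask_params(28, 256, 2), conv_mask_params(256, 2))
```

The convolution shares its weights across all m × m positions, so its parameter count stays fixed as the mask resolution grows, whereas the fully connected layer grows with the fourth power of m.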

RESULTS AND DISCUSSION
To evaluate the performance of the proposed model, the performance evaluation metric and benchmarking are presented. Mean Average Precision (mAP) (Lin et al., 2014) is employed as the metric for evaluating the performance of the instance segmentation model. Intersection over Union (IoU) is used in the instance segmentation problem to measure the overlap between the predicted value and the ground-truth value. The IoU equation is as follows:
$$IoU = \frac{\text{Area of Overlap}}{\text{Area of Union}} \tag{2}$$
If a produced instance matches several ground-truth values, the instance with the highest IoU score is chosen. The IoU values used in this work range from 0.50 to 0.95, using the mAP@X notation, where X is the threshold value used to compute the metric. The precision-recall formula is as follows:
$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}$$
where TP, FP, and FN are the numbers of true positives, false positives, and false negatives, respectively. Average Precision (AP) is calculated from the precision-recall points produced using the various values of the IoU threshold. AP is calculated as follows:
$$AP = \sum_{n=1}^{N} \left[ R(n) - R(n-1) \right] P(n)$$
where N is the number of precision-recall points produced, P(n) and R(n) are the precision and recall at the n-th lowest recall, respectively, and R(0) = 0. mAP is calculated as follows:
$$mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i$$
where $AP_i$ = the AP of class i, and N = the number of classes.
Performance Evaluation of the Proposed Method: Fig. 3(b) shows the qualitative (visual) result of the experiment conducted on the original image of diseased cucumber leaves (Fig. 3a). In the image, only the masks are generated for the individual leaves after the segmentation experiment is completed. The bounding boxes, the confidence scores, and the class of cucumber disease are shown in Fig. 4.
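The evaluation metrics used in this section (IoU, precision and recall, AP, and mAP) can be sketched in plain Python as follows. The function names and toy inputs are assumptions for illustration, and AP is computed here as a simple rectangle sum over precision-recall points ordered by increasing recall, which is one common convention rather than necessarily the exact procedure used in the experiments.

```python
def mask_iou(pred, truth):
    """IoU between two binary masks given as nested lists of 0/1."""
    overlap = union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            overlap += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return overlap / union if union else 0.0

def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(points):
    """AP from (recall, precision) pairs, summed in order of increasing recall."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in sorted(points):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)

# Toy values for illustration only.
iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]])
ap = average_precision([(0.5, 1.0), (1.0, 0.5)])
print(iou, ap, mean_average_precision([ap, 0.25]))
```

A predicted instance would count as a true positive only when its IoU with a ground-truth mask reaches the chosen threshold X, so repeating the calculation for X from 0.50 to 0.95 gives the mAP@X values.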

Results Comparison of the Proposed Method and the Benchmark Methods:
The results obtained by the proposed method in this work are on a par with those reported in the literature. Table 1 shows the quantitative result of the experiment and its comparison with the benchmark results. Fig. 5 shows the segmentation of lesions from a cucumber leaf image by K-means clustering, and Fig. 6 shows the segmentation of lesions from a cucumber leaf image by Fuzzy C-means clustering. As shown in Fig. 5 and Fig. 6, the existing methods detect only the lesions in the image and cannot identify the class (name) of the cucumber disease. In contrast, the proposed method accurately segments and classifies each diseased cucumber leaf, as shown in Fig. 3(b) and Fig. 4, which enables the separation of infected and healthy cucumber leaves.
The detection of cucumber disease through leaf image segmentation is important for cucumber growth monitoring. The proposed method achieved better qualitative and quantitative results than those presented in Zhang et al. (2019), where the EM algorithm, simple linear iterative clustering, superpixel clustering, K-means clustering, and Fuzzy C-means clustering are presented. From the figures presented in this section, it can be concluded that the results in Fig. 3(b) and Fig. 4 are better than those in Fig. 5 and Fig. 6: Fig. 3(b) shows the generated masks that segment individual diseased cucumber leaves, and Fig. 4 shows the class of the cucumber leaf disease.
Conclusion: An instance segmentation method for separating infected and healthy cucumber leaves has been proposed in this work. The detection and classification of cucumber leaf disease based on the segmentation of diseased cucumber leaf images, for more accurate monitoring of cucumber growth, have also been presented. The segmentation was carried out using the two-stage image segmentation procedure of Mask R-CNN. Evaluation results show that Mask R-CNN performs better in segmenting and classifying cucumber leaf disease than the benchmark methods compared in this work. Therefore, the proposed Mask R-CNN method improves on the state-of-the-art methods for leaf image segmentation and for separating infected leaves from healthy leaves.