New Publication by J. Yu et al. in Computers & Geosciences
Copyright: © Computers & Geosciences
Superpixel segmentations for thin sections: Evaluation of methods to enable the generation of machine learning training data sets
Training data is the backbone of developing both Machine Learning (ML) models and specific Deep Learning (DL) algorithms. The lack of well-labeled training image data has significantly impeded the development of novel DL methods, such as Convolutional Neural Networks (CNNs), for mineral identification in thin section images. However, image annotation, especially pixel-wise annotation, is always a costly process. Manually creating dense semantic labels for rock thin section images has long been considered a formidable challenge in view of the wide variety and complexity of minerals in thin sections. To speed up annotation, we propose a human–computer collaborative pipeline in which superpixel segmentation is used as a boundary extractor to avoid hand delineation of instance boundaries. The pipeline consists of two steps: superpixel segmentation using MultiSLIC, and superpixel labeling with a specially designed tool. We use Virtual Petroscopy (ViP), a cutting-edge methodology, for automatic image acquisition. A Bentheimer sandstone sample is used to test the performance of the pipeline. Three standard error metrics are used to evaluate the performance of MultiSLIC. The results indicate that MultiSLIC is able to extract compact superpixels with satisfactory boundary adherence given multiple input images. According to our test results, large and complex thin section images can be annotated with pixel-wise accurate labels more efficiently with the labeling tool than by conventional, purely manual work, yielding data of high quality.
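To illustrate why labeling superpixels can replace hand-drawn boundaries, the second step of the pipeline can be sketched as a simple lookup: once each superpixel has been assigned a class, every pixel inherits the class of the superpixel it belongs to. This is a minimal, dependency-free sketch and not the authors' tool. The function name `superpixels_to_dense_labels`, the `UNLABELED` id, the toy superpixel map (a hand-written stand-in for segmentation output), and the class ids are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical class id for superpixels the annotator has not labeled yet.
UNLABELED = 0


def superpixels_to_dense_labels(
    superpixel_map: List[List[int]],
    superpixel_classes: Dict[int, int],
) -> List[List[int]]:
    """Give every pixel the class assigned to the superpixel it belongs to."""
    return [
        [superpixel_classes.get(sp_id, UNLABELED) for sp_id in row]
        for row in superpixel_map
    ]


# Toy 4x6 superpixel map: each entry is the id of the superpixel covering
# that pixel. In the pipeline this map would come from the segmentation step.
superpixel_map = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [2, 2, 2, 3, 3, 3],
    [2, 2, 2, 2, 3, 3],
]

# One annotation per superpixel instead of one per pixel.
# Class ids are hypothetical: 1 = quartz grain, 2 = pore space.
superpixel_classes = {0: 1, 1: 2, 2: 1}  # superpixel 3 is not labeled yet

dense = superpixels_to_dense_labels(superpixel_map, superpixel_classes)
for row in dense:
    print(row)
```

In this toy case three annotations produce labels for 19 of the 24 pixels, and the label boundaries follow the superpixel boundaries exactly, which is why the boundary adherence of the segmentation step matters for label quality.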