LEAKAGE-FREE HYBRID DEEP FEATURE FUSION AND CATBOOST FOR GASTRIC CANCER DETECTION IN HISTOPATHOLOGICAL IMAGES
Keywords:
Gastric cancer; histopathology image classification; GasHisSDB; deep feature fusion; CatBoost; explainable artificial intelligence; LIME
Abstract
Gastric cancer remains a significant global health problem, and histopathological examination plays a crucial role in its definitive diagnosis. However, manual microscopic examination is time-consuming and can be inconsistent between observers, which motivates reliable computer-aided diagnostic systems. This work proposes a leakage-free hybrid scheme for the binary classification of gastric histopathology images into “abnormal” and “normal” classes on the GasHisSDB160 dataset. To ensure no overlap between data partitions, a group-aware split was used to divide 33,284 image patches into training, validation, and independent test sets. The framework combines a custom convolutional neural network (CNN) with ResNet50V2 and MobileNetV2. Two strategies were evaluated: an end-to-end deep learning classifier, and a CatBoost classifier trained on PCA-reduced fused features, for which no leakage between partitions was observed. The deep learning model obtained a test accuracy of 97.94%, a macro F1-score of 97.85%, and a ROC-AUC of 99.78%. The proposed CatBoost model achieved the highest test accuracy, 98.28%, with a macro F1-score of 98.21% and a Matthews correlation coefficient (MCC) of 96.41%, misclassifying 86 of the 4,993 test images. Visual explanations generated with LIME helped interpret the model's decisions. The findings show that leakage-aware deep feature fusion with CatBoost is a competitive and explainable method for supporting gastric cancer screening in digital pathology.
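
The group-aware split mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the grouping key (a hypothetical source-image identifier per patch), the 70/15/15 proportions, the function names, and the synthetic data are all assumptions made for illustration. The only figure taken from the text is that the 4,993 test images are roughly 15% of the 33,284 patches. The idea is to shuffle whole groups rather than individual patches, so that every patch from one group lands in exactly one partition.

```python
import random
from collections import defaultdict


def group_aware_split(group_ids, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split sample indices into train/val/test without splitting any group.

    group_ids[i] is the (assumed) group label of sample i, for example the
    identifier of the source image a patch was cropped from.
    """
    # Collect the sample indices that belong to each group.
    members = defaultdict(list)
    for index, group in enumerate(group_ids):
        members[group].append(index)

    # Shuffle the groups, not the samples, with a fixed seed.
    groups = sorted(members)
    random.Random(seed).shuffle(groups)

    # Proportions apply to the number of groups (assumed 70/15/15).
    n_train = round(fractions[0] * len(groups))
    n_val = round(fractions[1] * len(groups))
    partitions = {
        "train": groups[:n_train],
        "val": groups[n_train:n_train + n_val],
        "test": groups[n_train + n_val:],
    }
    return {
        name: sorted(i for g in chosen for i in members[g])
        for name, chosen in partitions.items()
    }


def check_no_group_overlap(split, group_ids):
    """Raise ValueError if any group contributes samples to two partitions."""
    owner = {}
    for name, indices in split.items():
        for i in indices:
            if owner.setdefault(group_ids[i], name) != name:
                raise ValueError(f"group {group_ids[i]!r} appears in two partitions")


# Synthetic example: 100 hypothetical source images with 8 patches each.
group_ids = [f"source_{i // 8:03d}" for i in range(800)]
split = group_aware_split(group_ids)
check_no_group_overlap(split, group_ids)
print({name: len(indices) for name, indices in split.items()})
```

In a pipeline of this kind, any fitted preprocessing step such as PCA would typically be fitted on the training indices only and then applied to the validation and test indices, which is a second place where leakage can otherwise enter.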













