Radiologists Versus Artificial Intelligence in Distinguishing Between Thyroid Nodules on Ultrasound Images
Karima Bahmane*, AKSASSE HAMID, Brahim Alkhalil Chaouki and Soukaina Wakrim
Abstract
Introduction: To distinguish between benign and malignant thyroid nodules on ultrasound images, we developed three convolutional neural network (CNN) models and an ensemble model. We then compared the diagnostic performance of the CNN models with that of two radiologists.
Material and Methods: Between 2020 and 2022, we analyzed ultrasound images of 100 patients with 120 thyroid nodules confirmed by surgical pathology. Two radiologists retrospectively classified the thyroid nodules in a test set as benign or malignant from the ultrasound images. Three CNNs (ResNet50, DenseNet12, and VGGNet) were trained and validated on 80 thyroid nodule ultrasound images and tested on 40. For the ensemble, we chose the two models with the best diagnostic performance on the test set. We then compared the diagnostic performance of the three CNN models and the ensemble model with that of the two radiologists.
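The abstract does not state how the two selected models were combined. As a minimal sketch, assuming the ensemble simply averages the predicted malignancy probabilities of the two models with the highest test-set AUC (soft voting), the selection and combination steps could look like the following. The model names, scores, and labels are hypothetical toy values, not study data.

```python
def auc(labels, scores):
    """AUC as the probability that a malignant case (label 1) scores
    higher than a benign case (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ensemble_top_two(model_scores, labels):
    """Pick the two models with the highest AUC and average their scores.

    Averaging is an assumption made for illustration; the source only
    says that the two best-performing models were ensembled.
    """
    ranked = sorted(model_scores,
                    key=lambda name: auc(labels, model_scores[name]),
                    reverse=True)
    best, second = ranked[:2]
    averaged = [(a + b) / 2
                for a, b in zip(model_scores[best], model_scores[second])]
    return (best, second), averaged


# Hypothetical toy data: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "model_b": [0.7, 0.3, 0.9, 0.2, 0.6, 0.4],
    "model_c": [0.6, 0.2, 0.5, 0.7, 0.3, 0.4],
}

chosen, averaged = ensemble_top_two(model_scores, labels)
ensemble_auc = auc(labels, averaged)
print(chosen, ensemble_auc)
```

In this toy example the averaged scores separate the two classes better than either model alone, which is the usual motivation for combining models whose errors differ.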
Results: Of the 120 thyroid nodules, 50 were benign and 70 were malignant. For the diagnosis of thyroid cancer, the areas under the receiver operating characteristic curve (AUCs) of the two radiologists ranged from 0.659 to 0.754, whereas those of the three CNN models and the ensemble model ranged from 0.801 to 0.907. The differences in AUC between the CNN models and the radiologists were statistically significant (p < 0.05). The ensemble model had the highest AUC.
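The abstract does not name the statistical test used to compare AUCs. As one possible illustration, the sketch below uses a paired percentile bootstrap, an assumption chosen for simplicity rather than the authors' stated method: the same cases are resampled for both readers, and a 95% interval for the AUC difference that excludes zero corresponds roughly to p < 0.05. All scores and ratings are hypothetical toy values.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a malignant case (label 1) scores
    higher than a benign case (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_auc_diff(labels, scores_a, scores_b,
                              n_boot=2000, seed=0):
    """Observed AUC difference (a minus b) and a 95% percentile interval."""
    rng = random.Random(seed)
    n = len(labels)
    observed = auc(labels, scores_a) - auc(labels, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # skip resamples that contain only one class
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        diffs.append(auc(y, a) - auc(y, b))
    diffs.sort()
    lo = diffs[n_boot * 25 // 1000]
    hi = diffs[n_boot * 975 // 1000 - 1]
    return observed, (lo, hi)


# Hypothetical toy data: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
cnn_scores = [0.95, 0.85, 0.70, 0.40, 0.60, 0.30, 0.20, 0.10]
reader_ratings = [5, 4, 3, 2, 4, 3, 2, 1]  # ordinal suspicion scores

observed, (lo, hi) = paired_bootstrap_auc_diff(labels, cnn_scores,
                                               reader_ratings)
print(observed, lo, hi)
```

With only eight toy cases the interval is wide; a real comparison would use the full test set and, ideally, a test designed for correlated ROC curves.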
Conclusions: In differentiating benign from malignant thyroid nodules on ultrasound, the three CNN models and the ensemble model outperformed the radiologists. The ensemble model further improved diagnostic performance and showed good promise.