What the study evaluated
This retrospective multi-reader study assessed a deep learning-based automatic detection algorithm (Carebot AI CXR v2.00) for identifying suspicious pulmonary lesions on chest X-rays at a specialised oncology centre. Five radiologists of varying experience reviewed a curated set of 300 CXRs and their performance was compared with the AI algorithm. The aim was to determine whether AI improves lesion detection compared to conventional radiologist interpretation in a high-clinical-risk specialised setting.
Study results in clinical practice
The deep learning model achieved significantly higher sensitivity than every participating radiologist, meaning it flagged more of the true lesion cases and would miss fewer lesions than standard review. Its specificity was lower than that of the radiologists, indicating more false positives. In practice, this suggests the AI can function as a decision-support tool that raises sensitivity for pulmonary lesions, which is particularly valuable in oncology settings where early detection affects outcomes, while final interpretation remains with the physician.
Key numbers
CXRs assessed: 300 from a specialised oncology centre
AI (DLAD) sensitivity: 91.0% (CI: 85.4–96.6%)
Radiologists’ sensitivity range: ~29.0%–81.0%
AI specificity: 77.5%
Radiologists’ specificity: ~97.0%–100%
Significance: AI sensitivity was significantly higher than that of every radiologist (p < 0.001 for four readers; p = 0.025 for RAD 4)
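As a rough check on how intervals like these arise, the sketch below computes a normal-approximation (Wald) interval for a proportion. The split into 100 lesion-positive and 200 lesion-negative images is an assumption that is not stated in the text; it was chosen because it reproduces the DLAD intervals reported in the study (0.854-0.966 for sensitivity, 0.717-0.833 for specificity). The function name `wald_interval` and the z value of 1.96 are likewise illustrative, and the authors may have used a different method, particularly for readers with specificity at or near 1.000.

```python
import math

def wald_interval(p, n, z=1.96):
    """Normal-approximation interval for a proportion p observed on n cases."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to [0, 1] because the approximation can overshoot near the ends.
    return max(0.0, p - half_width), min(1.0, p + half_width)

# ASSUMPTION: class counts are not given in the text; 100 + 200 = 300 images.
N_POSITIVE = 100
N_NEGATIVE = 200

sens_lo, sens_hi = wald_interval(0.910, N_POSITIVE)
spec_lo, spec_hi = wald_interval(0.775, N_NEGATIVE)

print(f"sensitivity 0.910 ({sens_lo:.3f}-{sens_hi:.3f})")
print(f"specificity 0.775 ({spec_lo:.3f}-{spec_hi:.3f})")
```

Under these assumptions the printed bounds match the DLAD values quoted in the study, which suggests, but does not confirm, that the intervals are 95% Wald intervals.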
Chest X-ray (CXR) is considered the most widely used modality for detecting and monitoring various thoracic findings, including lung carcinomas and other pulmonary lesions. However, X-ray imaging has particular limitations in detecting primary and secondary tumors, and is prone to reading errors due to limited resolution and disagreement between radiologists. To address these issues, we developed a deep learning-based automatic detection algorithm (DLAD) to automatically detect and localize suspicious lesions on CXRs. Five radiologists were invited to retrospectively evaluate 300 CXR images from a specialized oncology center, and the performance of each radiologist was then compared with that of the DLAD. The proposed DLAD achieved significantly higher sensitivity (0.910 (0.854-0.966)) than all assessed radiologists (RAD 1 0.290 (0.201-0.379), p<0.001, RAD 2 0.450 (0.352-0.548), p<0.001, RAD 3 0.670 (0.578-0.762), p<0.001, RAD 4 0.810 (0.733-0.887), p=0.025, RAD 5 0.700 (0.610-0.790), p<0.001). The DLAD specificity (0.775 (0.717-0.833)) was significantly lower than that of all assessed radiologists (RAD 1 1.000 (0.984-1.000), p<0.001, RAD 2 0.970 (0.946-1.000), p<0.001, RAD 3 0.980 (0.961-1.000), p<0.001, RAD 4 0.975 (0.953-0.977), p<0.001, RAD 5 0.995 (0.985-1.000), p<0.001). The study results demonstrated that the proposed DLAD could be used as a decision-support system to reduce radiologists' false negative rate.
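To make the trade-off concrete, the following minimal sketch converts the sensitivity and specificity point estimates reported above into false negative and false positive rates (1 - sensitivity and 1 - specificity). The dictionary layout and the helper name `error_rates` are illustrative choices, not part of the study.

```python
# Point estimates (sensitivity, specificity) as reported in the abstract.
readers = {
    "DLAD":  (0.910, 0.775),
    "RAD 1": (0.290, 1.000),
    "RAD 2": (0.450, 0.970),
    "RAD 3": (0.670, 0.980),
    "RAD 4": (0.810, 0.975),
    "RAD 5": (0.700, 0.995),
}

def error_rates(sensitivity, specificity):
    """Return (false negative rate, false positive rate), rounded to 3 places."""
    return round(1 - sensitivity, 3), round(1 - specificity, 3)

rates = {name: error_rates(sens, spec) for name, (sens, spec) in readers.items()}

for name, (fnr, fpr) in rates.items():
    print(f"{name}: false negative rate {fnr:.3f}, false positive rate {fpr:.3f}")

# Reader with the fewest missed lesions, and reader with the most false alarms.
lowest_fnr = min(rates, key=lambda name: rates[name][0])
highest_fpr = max(rates, key=lambda name: rates[name][1])
```

On these figures the DLAD has both the lowest false negative rate and the highest false positive rate, which is the pattern behind its proposed role as a decision-support tool rather than a replacement for the reading radiologist.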





