What the study evaluated
The study evaluated a deep learning–based multi-class classifier using the CoAtNet architecture for melanoma detection from dermatoscopic images. In addition to classification performance, the study focused on model explainability, using Grad-CAM visualisation to assess whether the network’s predictions are driven by clinically relevant image regions.
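As a rough illustration of what a Grad-CAM map is: the method weights each feature map of a chosen convolutional layer by the spatially averaged gradient of the class score with respect to that map, sums the weighted maps, and keeps only the positive part. The sketch below assumes the activations and gradients have already been extracted from a network. The function name `grad_cam`, the array shapes, and the toy values are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap from one layer's feature maps.

    activations, gradients: arrays of shape (K, H, W), where gradients holds
    d(class score)/d(activations). Both are assumed to be precomputed.
    """
    # Channel weights: average each gradient map over its spatial positions.
    weights = gradients.mean(axis=(1, 2))                      # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input (made-up values): two 2x2 feature maps with opposite gradient signs.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 1.0   # channel 0 fires top-left
acts[1, 1, 1] = 1.0   # channel 1 fires bottom-right
grads = np.stack([np.full((2, 2), 2.0), np.full((2, 2), -1.0)])

cam = grad_cam(acts, grads)
print(cam)
```

In the toy case only the top-left position survives, because the channel that fires bottom-right has a negative averaged gradient and is removed by the ReLU. In a real pipeline the resulting map would be upsampled to the image size and overlaid on the dermatoscopic image.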
Study results in clinical practice
The model achieved high classification performance across malignant and benign skin lesion categories (overall precision 90.1%, recall 89.5%), and its explainability maps consistently highlighted clinically meaningful regions. In practice, this supports the use of deep learning as a decision-support and educational tool that helps clinicians understand why a lesion is flagged as suspicious. However, the study is preclinical: it does not evaluate integration into real-world diagnostic workflows or effects on patient outcomes.
Key numbers
Images analysed: 6,826 dermatoscopic images
Test set: 300 images (melanoma, non-melanoma cancer, benign lesions)
Overall precision: 90.1%
Overall recall: 89.5%
Average precision (AP): 92.3%
Melanoma class recall: 87.5%
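To make the figures above concrete, the sketch below shows how per-class and averaged precision and recall can be derived from a confusion matrix. It assumes a macro average (an unweighted mean over classes), which the summary does not confirm as the study's averaging method. The confusion matrix is made-up toy data, not the study's results, and `per_class_precision_recall` is a hypothetical helper name. Average precision is not covered here because it needs ranked prediction scores rather than counts.

```python
import numpy as np

def per_class_precision_recall(conf):
    """conf[i, j] = number of images of true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                # correctly classified images per class
    predicted = conf.sum(axis=0)      # column sums: images predicted as each class
    actual = conf.sum(axis=1)         # row sums: images truly in each class
    # Divide safely: a class that is never predicted (or never present) scores 0.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return precision, recall

# Toy 3-class matrix (illustrative numbers only).
# Row/column order: melanoma, non-melanoma cancer, benign.
toy = [[9, 1, 0],
       [2, 6, 2],
       [1, 1, 8]]

precision, recall = per_class_precision_recall(toy)
macro_precision = precision.mean()
macro_recall = recall.mean()
print(precision, recall, macro_precision, macro_recall)
```

For the toy matrix, per-class recall is 0.9, 0.6 and 0.8, and per-class precision is 0.75, 0.75 and 0.8. A class-specific figure such as melanoma recall corresponds to a single entry of `recall`, while an overall figure corresponds to an average over classes.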
Melanoma is considered the most aggressive form of skin cancer. At present, malignancy is evaluated primarily by invasive histological examination of the suspicious lesion. An accurate classifier for early and efficient detection can help monitor and limit the harm caused by skin cancer and increase patient survival rates. Because malignant and benign lesions can look similar, clinicians spend considerably more time diagnosing them. Using deep learning as a computer vision tool can overcome some of these challenges. This paper proposes a multi-class classifier based on the CoAtNet architecture, a hybrid model that combines the depthwise convolution of traditional convolutional neural networks with the self-attention mechanisms of Transformer models to achieve better generalization and capacity. The model was evaluated on precision, recall, and average precision (AP). The proposed classifier achieves an overall precision of 0.901, a recall of 0.895, and an AP of 0.923, indicating high performance compared to other state-of-the-art networks. The proposed approach should provide a less complex framework for automating melanoma diagnosis and speeding up a potentially life-saving process.
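The two building blocks named in the abstract can be sketched in a few lines of NumPy. This is a simplified illustration of depthwise convolution and single-head scaled dot-product self-attention in general, not the CoAtNet implementation used in the paper. All function names, shapes, and weights below are assumptions made for the example.

```python
import numpy as np

def depthwise_conv2d(x, kernels):
    """x: (C, H, W); kernels: (C, k, k). Each channel is filtered by its own
    kernel, with no mixing across channels and no padding."""
    C, H, W = x.shape
    k = kernels.shape[1]
    out = np.zeros((C, H - k + 1, W - k + 1))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            patch = x[:, i:i + k, j:j + k]                  # (C, k, k)
            out[:, i, j] = (patch * kernels).sum(axis=(1, 2))
    return out

def self_attention(tokens, wq, wk, wv):
    """tokens: (N, D). Every token attends to every other token."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(k.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ v

# Convolution first (local patterns), then attention over the flattened
# spatial positions (global context). Values are arbitrary toy data.
x = np.ones((2, 3, 3))
kernels = np.stack([np.ones((2, 2)), 2.0 * np.ones((2, 2))])
features = depthwise_conv2d(x, kernels)                     # shape (2, 2, 2)
tokens = features.reshape(2, -1).T                          # (4 positions, 2 channels)
attended = self_attention(tokens, np.eye(2), np.eye(2), np.eye(2))
print(features.shape, attended.shape)
```

The convolution sees only a small neighbourhood around each pixel, while the attention step lets every spatial position draw on every other one, which is the intuition behind combining the two in a hybrid model.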





