
Vision Transformer vs. ResNet-101: An Explainable Deep Learning Approach for Breast Cancer Detection in Ultrasound Images

Speakers: Lipismita Panigrahi

Track: Track 7: Pattern Recognition, Computer Vision and Image Processing


Abstract

Breast cancer remains a significant global health concern, and early, accurate diagnosis is paramount for improving patient survival rates. This paper presents a comparative analysis of two deep learning architectures, the Convolutional Neural Network (CNN) based ResNet-101 and the Vision Transformer (ViT), for classifying breast ultrasound images into benign, malignant, and normal categories. To address the common challenge of limited data, we employed a data augmentation strategy that expanded a benchmark dataset of 780 images to over 10,000 images, creating a robust training set. Both models were trained on this augmented dataset, achieving test accuracies of 98.64% for the ViT model and 97.57% for the ResNet-101 model; the ViT thus outperformed ResNet-101. Furthermore, deep learning models such as these are typically black boxes. To enhance model transparency and build clinical trust, Gradient-weighted Class Activation Mapping (Grad-CAM), an Explainable AI (XAI) technique, is used to generate visual heatmaps highlighting the regions of the ultrasound images that most influenced the models’ diagnostic decisions. The proposed approach harnesses GPU-based parallel infrastructure for training.
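The core Grad-CAM step the abstract refers to can be summarized as: global-average-pool the gradients of the class score with respect to the last convolutional layer's feature maps to get per-channel weights, take the weighted sum of the feature maps, and apply a ReLU. The sketch below is a minimal, framework-agnostic illustration of that combination step using synthetic NumPy arrays; it is not the authors' pipeline, and the array shapes and function name are illustrative assumptions.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Combine conv feature maps into a Grad-CAM heatmap.

    activations: (K, H, W) feature maps A^k from the last conv layer
    gradients:   (K, H, W) gradients of the class score w.r.t. A^k
    (Both would come from a forward/backward pass in practice.)
    """
    # alpha_k: global-average-pool each gradient map over its spatial dims
    alphas = gradients.mean(axis=(1, 2))             # shape (K,)
    # Weighted sum of the feature maps, one weight per channel
    cam = np.tensordot(alphas, activations, axes=1)  # shape (H, W)
    # ReLU: keep only features with a positive influence on the class
    cam = np.maximum(cam, 0.0)
    # Normalize to [0, 1] so it can be overlaid as a heatmap
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam

# Toy example with random activations and gradients
rng = np.random.default_rng(0)
A = rng.random((8, 7, 7))    # 8 channels of 7x7 feature maps
dA = rng.random((8, 7, 7))   # matching gradient maps
heatmap = grad_cam(A, dA)
print(heatmap.shape)
```

In a real setting the heatmap would be upsampled to the input ultrasound image's resolution and blended over it, which is how the highlighted diagnostic regions described in the abstract are produced.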

Speakers

Lipismita Panigrahi
Assistant Professor
SRM University-Amaravati

Details

Type
Online
Model
OFFLINE
Language
EN
Timezone
UTC+8