An entropy-based attribute selection technique for interpretable breast cancer detection

dc.contributor.author: Subawickrama, H.D.A.W.
dc.contributor.author: Udagedara, U.G.I.G.K.
dc.contributor.author: Nishantha, S.A.A.
dc.contributor.author: Abeysundara, S.P.
dc.date.accessioned: 2025-11-06T03:54:46Z
dc.date.available: 2025-11-06T03:54:46Z
dc.date.issued: 2025-11-07
dc.description.abstract: Breast cancer is one of the most common and life-threatening cancers among women worldwide, with approximately 2.3 million new cases and over 600,000 deaths reported annually, according to the World Health Organization. In Sri Lanka alone, nearly 3,000 new cases are diagnosed each year, underscoring the urgent need for reliable and efficient diagnostic techniques. Data-driven methods have recently shown great potential for improving diagnostic accuracy and reducing subjectivity. This study introduces a Shannon entropy-based attribute selection approach for breast cancer detection using the Wisconsin Diagnostic Breast Cancer (WDBC) dataset of 569 instances. Unlike Principal Component Analysis (PCA), which transforms the data into abstract components, the proposed method identifies informative attributes directly, without altering the original data. Using the 13 selected attributes, a Support Vector Machine (SVM) classifier achieved 93.00% accuracy, with precision of 0.95 (benign) and 0.86 (malignant) and recall of 0.96 (benign) and 0.83 (malignant). For comparison, PCA with 13 principal components followed by SVM yielded a slightly higher accuracy of 95.83%, but at the cost of interpretability, as PCA-derived features lack direct clinical meaning. When the number of components was instead chosen via a scree plot (k = 2), the accuracy of the PCA pipeline dropped to 91.67%, below that of the entropy-based approach. Unlike PCA, the proposed entropy-based approach retains clinically meaningful attributes, such as mean radius and texture, making the results more interpretable for medical professionals. Additionally, it demonstrated greater computational efficiency by avoiding the matrix decomposition required in PCA. These findings suggest that entropy-based feature selection provides a more interpretable, efficient, and clinically relevant alternative to transformation-based methods for breast cancer detection. Future work will involve testing the method on large-scale, real-time datasets to assess scalability and practical applicability in healthcare.
dc.identifier.citation: Proceedings of the Postgraduate Institute of Science Research Congress (RESCON)-2025, University of Peradeniya, p. 80
dc.identifier.issn: 3051-4622
dc.identifier.uri: https://ir.lib.pdn.ac.lk/handle/20.500.14444/6011
dc.language.iso: en
dc.publisher: Postgraduate Institute of Science (PGIS), University of Peradeniya, Sri Lanka
dc.relation.ispartofseries: Volume 12
dc.subject: Attribute selection
dc.subject: Breast cancer detection
dc.subject: Classification
dc.subject: Principal component analysis
dc.subject: Shannon entropy
dc.subject: Support vector machine
dc.title: An entropy-based attribute selection technique for interpretable breast cancer detection
dc.type: Article
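
The abstract does not state how Shannon entropy is turned into an attribute score, so the following is only a minimal sketch of one common possibility: ranking attributes by information gain, i.e. the drop in class-label entropy after binning each attribute. The equal-width binning, the bin count, the function names, and the toy values are all assumptions made for illustration; they are not taken from the paper or from the WDBC data.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def equal_width_bins(values, n_bins):
    """Map each continuous value to a bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def information_gain(values, labels, n_bins=4):
    """Label entropy minus the entropy left after splitting on the binned attribute."""
    groups = {}
    for b, y in zip(equal_width_bins(values, n_bins), labels):
        groups.setdefault(b, []).append(y)
    conditional = sum(
        len(g) / len(labels) * shannon_entropy(g) for g in groups.values()
    )
    return shannon_entropy(labels) - conditional


def rank_attributes(columns, labels, k, n_bins=4):
    """Return the names of the k attributes with the highest information gain."""
    scores = {
        name: information_gain(vals, labels, n_bins)
        for name, vals in columns.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data, made up for illustration (not WDBC measurements).
columns = {
    "mean_radius": [10.0, 11.0, 12.0, 18.0, 19.0, 20.0],  # separates the classes
    "noise": [1.0, 5.0, 3.0, 1.0, 5.0, 3.0],              # unrelated to the class
}
labels = ["B", "B", "B", "M", "M", "M"]

selected = rank_attributes(columns, labels, k=1)
print(selected)  # → ['mean_radius']
```

Because the selected columns are original attributes rather than linear combinations, they keep their names and units. In the study, the 13 selected attributes are then passed to an SVM classifier.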

Files

Original bundle

Name: 18 RESCON 2025 CMS-32.pdf
Size: 288.72 KB
Format: Adobe Portable Document Format
