Computer Science
INVESTIGATING THE EFFECT OF SOME SELECTED DISTANCE MEASURES ON THE PERFORMANCE OF QUERY-BY-IMAGE-CONTENT OVER MAMMOGRAM IMAGES
Authors: Ronke Babatunde1, Ayodele Oloyede2, Temitayo Fagbola3, Nwaocha Vivian4, Tola Ajagbe5, Rilwan Shanu6.
Affiliations:
1. Department of Computer Science, Kwara State University, Malete, Kwara State, Nigeria
2. Department of Computer Science, Lagos State University, Ojo, Lagos, Nigeria
3. Department of Computer Science, Federal University, Oye-Ekiti, Nigeria
4. Department of Computer Sciences, Faculty of Sciences, National Open University of Nigeria (NOUN), Nigeria
Abstract
Materials and Methods: The image data are acquired through computerized tomography (CT) scans, magnetic resonance imaging (MRI) and mammograms. In this paper, a query-by-image-content (QBIC) system was evaluated using selected distance measures to detect abnormalities in mammogram images. The system was benchmarked on the mini Mammographic Image Analysis Society (mini-MIAS) and Breast Cancer Digital Repository (BCDR) datasets. The experimental process includes thresholding and extraction of the region of interest (ROI) from the mammogram using the gray-level co-occurrence matrix (GLCM). The extracted features were tested with the Euclidean, Minkowski, Hamming, Mahalanobis and Manhattan distance measures and with cosine similarity. The performance of the system under each measure was compared on both datasets to determine which distance metric best identifies abnormality in the samples.
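As an illustration of the comparison described above, the six measures can be sketched in a few lines of NumPy. This is a minimal sketch and not the authors' implementation. The function names, the Minkowski order `p=3`, the conversion of cosine similarity into a distance, and the use of a pseudo-inverse of the database covariance for the Mahalanobis distance are all assumptions made for the example. The feature values are invented placeholders, not data from mini-MIAS or BCDR.

```python
import numpy as np

def euclidean(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

def manhattan(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.abs(a - b)))

def minkowski(a, b, p=3):
    # p = 1 gives Manhattan, p = 2 gives Euclidean; p = 3 is an assumed default.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def cosine_distance(a, b):
    # 1 - cosine similarity, so that smaller values mean "more similar".
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def hamming(a, b):
    # Fraction of positions that differ; real-valued features would
    # have to be quantized or binarized before this is meaningful.
    return float(np.mean(np.asarray(a) != np.asarray(b)))

def mahalanobis(a, b, inv_cov):
    # inv_cov is the (pseudo-)inverse of the feature covariance matrix.
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(max(d @ inv_cov @ d, 0.0)))

def rank_by_distance(query, database, dist, **kwargs):
    # Return database row indices ordered from most to least similar.
    scores = [dist(query, row, **kwargs) for row in database]
    return [int(i) for i in np.argsort(scores, kind="stable")]

# Hypothetical feature vectors (e.g. three texture descriptors per image).
database = np.array([
    [0.42, 0.11, 0.80],
    [0.35, 0.15, 0.76],
    [0.90, 0.02, 0.40],
    [0.88, 0.03, 0.45],
])
query = np.array([0.89, 0.02, 0.42])

# Covariance estimated from the database; pinv guards against singularity.
inv_cov = np.linalg.pinv(np.cov(database, rowvar=False))

ranking_euclid = rank_by_distance(query, database, euclidean)
ranking_mahal = rank_by_distance(query, database, mahalanobis, inv_cov=inv_cov)
print("Euclidean ranking:  ", ranking_euclid)
print("Mahalanobis ranking:", ranking_mahal)
```

In a retrieval setting of this kind, the query image's feature vector is compared against every stored vector and the closest entries are returned, so no train/test split is needed. Only the choice of `dist` changes between experiments.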
Results: The empirical results reveal that the Mahalanobis distance outperforms the other measures, with the lowest retrieval time (1.26 s on mini-MIAS and 1.14 s on BCDR) and the lowest error (0.004 and 0.002, respectively), based on the similarity of the retrieved images to the query images.
Conclusion: The implication of this research is that, for a QBIC system, relying on a suitable distance measure is an advantage over using classification algorithms, which require train/test splits and validation.