$\textbf{Purpose:}$ Breast density is an important independent risk factor for breast cancer. Breast density reading is not part of the standard screening examination because of time constraints. The purpose of this thesis was to assess the predictive power of computer algorithms for breast density reading based on quantitative analysis of mammographic images. Our initial hypothesis was that raw images have higher predictive power than processed images because they preserve information about x-ray attenuation in breast tissue.
$\textbf{Data and methods:}$ We used 9252 pairs of raw and processed mammograms recorded by Siemens scanners and 4787 processed images recorded by Hologic scanners. The images were part of the DORA database. Breast segmentation and feature generation were performed with the LIBRA software. Features were selected with two statistical methods, one-way ANOVA and mRMR. The classifier was based on multinomial logistic regression. Predictive power was assessed by calculating the coefficient $\kappa$. Our results were also compared with results from the literature.
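As an illustration of the agreement measure, the following minimal sketch computes an unweighted Cohen's $\kappa$ between reference labels and model predictions. It assumes the unweighted variant, which the abstract does not specify. The function names, the toy labels, and the rule that classes 3 and 4 count as dense are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(reference)
    # Observed agreement: fraction of cases where both labels match.
    p_o = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement, estimated from the marginal label frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    p_e = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


def to_dense(labels):
    """Collapse four density classes to dense/non-dense.

    Assumption for illustration only: classes 3 and 4 are treated as dense.
    """
    return [c >= 3 for c in labels]


# Hypothetical toy labels, not data from the study.
radiologist = [1, 1, 2, 2, 3, 3, 4, 4]
model = [1, 2, 2, 2, 3, 4, 4, 4]

kappa_four = cohen_kappa(radiologist, model)
kappa_binary = cohen_kappa(to_dense(radiologist), to_dense(model))
print(f"four classes: {kappa_four:.2f}, dense/non-dense: {kappa_binary:.2f}")
```

In this toy example the two-class agreement is higher than the four-class agreement because both disagreements fall within the same dense or non-dense group. This is a property of the toy data only and says nothing about the study results.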
$\textbf{Results:}$ When classifying into four density classes, our algorithm achieved $\kappa = 0.65$ (95\% CI, 0.58-0.71) on processed images and $\kappa = 0.61$ (95\% CI, 0.58-0.71) on raw images. When classifying into dense/non-dense breasts, our algorithm achieved $\kappa = 0.56$ (95\% CI, 0.52-0.60) on processed images and $\kappa = 0.55$ (95\% CI, 0.51-0.60) on raw images.
$\textbf{Conclusion:}$ We found no significant difference in predictive power between raw and processed mammograms in our study. Our models achieved $\kappa$ values comparable to those reported in the literature for agreement between radiologists. Misclassifications by our models were associated with dense lesions and with faulty segmentation of breast tissue and the pectoral muscle.