In this thesis, we address the problem of automatic analysis and segmentation of the ankle-brachial index (ABI) to support the diagnosis of peripheral arterial disease (PAD). The ABI is a non-invasive measurement computed as the ratio of systolic blood pressure at the ankles to that in the arms, and it is used to assess the vascular condition of the lower limbs. Conventional ABI measurements are prone to noise and subjective interpretation, which makes diagnosis challenging. We therefore aim to develop a model capable of recognizing patterns in time-series signals associated with PAD. The model is intended to distinguish between normal blood flow, arterial calcifications, and peripheral arterial disease, thereby helping clinicians analyze pulse-wave signals faster.
The input data were provided as time series representing different states (normal, calcifications, PAD, noise). We extracted features from them using several methods to capture the key information needed for further analysis and clustering. Basic features were generated with the SummaryTransformer from the sktime library. Additional features were obtained with Catch22, a feature set designed for time-series analysis, and with tsfresh.
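To make the idea of turning each time series into a fixed-length feature vector concrete, the following is a minimal sketch that computes a few descriptive statistics per signal with plain NumPy. It is only an illustration in the spirit of summary-style features, not the sktime, Catch22, or tsfresh pipeline used in the thesis; the function name `summary_features`, the chosen statistics, and the synthetic signals are all assumptions made for this example.

```python
import numpy as np


def summary_features(series):
    """Return a small vector of descriptive statistics for one 1-D signal.

    Hypothetical helper: mean, standard deviation, min, max, and five quantiles.
    """
    x = np.asarray(series, dtype=float)
    quantiles = np.quantile(x, [0.1, 0.25, 0.5, 0.75, 0.9])
    basic = [x.mean(), x.std(ddof=1), x.min(), x.max()]
    return np.concatenate((basic, quantiles))


# Synthetic stand-in data (an assumption): 20 noisy sine waves, 200 samples each.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
signals = np.sin(t) + 0.1 * rng.standard_normal((20, 200))

# One row per signal, one column per feature.
feature_matrix = np.vstack([summary_features(s) for s in signals])
print(feature_matrix.shape)
```

Each row of `feature_matrix` can then be passed to a clustering or classification algorithm in place of the raw signal.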
We then applied unsupervised learning, clustering the data with several algorithms, including K-means, spectral clustering, Ward hierarchical clustering, and Gaussian mixture models. To evaluate clustering performance, we used metrics such as the silhouette coefficient, the Calinski–Harabasz index, and the Davies–Bouldin index.
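As an illustration of this kind of comparison, the sketch below runs the four named algorithm families with scikit-learn and scores each result with the three internal metrics. The synthetic blob data, the choice of four clusters, the standardization step, and all hyperparameters are assumptions made for the example and are not taken from the thesis.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature matrix (assumption): 300 samples, 6 features.
X, _ = make_blobs(n_samples=300, centers=4, n_features=6, random_state=0)
X = StandardScaler().fit_transform(X)

n_clusters = 4  # assumed to match the four expected states
models = {
    "kmeans": KMeans(n_clusters=n_clusters, n_init=10, random_state=0),
    "spectral": SpectralClustering(n_clusters=n_clusters, random_state=0),
    "ward": AgglomerativeClustering(n_clusters=n_clusters, linkage="ward"),
    "gmm": GaussianMixture(n_components=n_clusters, random_state=0),
}

scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    scores[name] = {
        "silhouette": silhouette_score(X, labels),  # higher is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),  # lower is better
    }

for name, result in scores.items():
    print(name, {metric: round(value, 3) for metric, value in result.items()})
```

None of the three metrics needs ground-truth labels, which is why they suit the purely unsupervised stage.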
Next, we manually annotated 400 data points and used them in a supervised setting, performing classification to identify the subset of features that contributes most to clustering success. Using these features improved the clustering quality and produced more balanced clusters. Here, we applied label-based performance metrics such as the Rand index, the adjusted Rand index, adjusted mutual information, and the homogeneity, completeness, and V-measure metrics.
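One way such a workflow could look is sketched below: a classifier ranks features on labeled data, K-means is run on all features and on the top-ranked subset, and both clusterings are compared with the labels. The random forest, the top-4 cutoff, and the synthetic 400-sample dataset are assumptions for illustration only; the thesis does not state which classifier or subset size was used.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    adjusted_mutual_info_score,
    adjusted_rand_score,
    homogeneity_completeness_v_measure,
    rand_score,
)

# Synthetic stand-in for 400 annotated samples with 4 classes (assumption).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Rank features by importance with a classifier (random forest is an assumption).
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top_features = np.argsort(forest.feature_importances_)[::-1][:4]


def external_scores(y_true, y_pred):
    """Compare cluster assignments with reference labels."""
    homogeneity, completeness, v_measure = homogeneity_completeness_v_measure(
        y_true, y_pred
    )
    return {
        "rand": rand_score(y_true, y_pred),
        "adjusted_rand": adjusted_rand_score(y_true, y_pred),
        "adjusted_mutual_info": adjusted_mutual_info_score(y_true, y_pred),
        "homogeneity": homogeneity,
        "completeness": completeness,
        "v_measure": v_measure,
    }


def cluster(data):
    """Cluster the given feature matrix into four groups with K-means."""
    return KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(data)


results = {
    "all_features": external_scores(y, cluster(X)),
    "top_features": external_scores(y, cluster(X[:, top_features])),
}
for name, result in results.items():
    print(name, {metric: round(value, 3) for metric, value in result.items()})
```

Because the same labels are used here both to select features and to score the clusters, the scores in this sketch are optimistic; a held-out labeled subset would give a fairer estimate.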
The results show that the subset of most relevant features improved clustering performance. For the K-means algorithm, this subset yielded, compared to the reference set, a higher silhouette coefficient (0.213 compared to 0.182), a higher Calinski–Harabasz index (2075.410 compared to 1265.620), and a lower, and therefore more favorable, Davies–Bouldin index (1.478 compared to 1.614). The final results indicate that clustering allows for the differentiation of the expected states (normal, calcifications, PAD, noise).