In this work we focus on the automatic transcription of polyphonic singing, and in particular on multiple fundamental frequency (F0) estimation. From field recordings we extract a test set of Slovenian folk songs with polyphonic singing and transcribe it manually. We then apply a general algorithm for multiple F0 detection to this test set. An interactive visualization of the main parts of the algorithm is built to analyse how it works and to detect possible issues. Because the data set is new, there are no earlier results to compare against. We take steps towards improving the algorithm. Replacing the magnitude spectrum weighting function with a simple linear function degrades performance. Applying double spectral whitening to the magnitude spectrum turns out to be more promising, but the results are still not satisfactory. A softer evaluation criterion shows that the errors may be due to the problematic test set, which contains many intonation errors.