This thesis presents a tool for the visualization and analysis of music recordings based on a compositional hierarchical model (CHM). The model learns the concept of musical tones from monophonic recordings, offers transparent insight into the learned structures, and processes sound recordings robustly and quickly. We extend the model with a discriminative non-negative matrix factorization (DNMF) method, which improves the fit for polyphonic recordings. We also introduce several techniques for cleaning pitch hypotheses that improve the final results. The model is evaluated on a database of polyphonic piano recordings, a vocal collection of folk music, and a synthesized collection of various instruments. We achieve significant results using CHM and DNMF, and we use CHM as the basis for a web application. The application can be used to upload new sound recordings, learn and test new models, and observe both a graphical representation of the learned structures and a piano roll view. The piano roll view supports analysis of the generated transcriptions, interactive editing, the addition of rhythmic annotations, and the export of data for further processing in other software.