Music transcription is the complex process of converting an audio recording into symbolic notation. The goal of this thesis was to examine the transcription of piano music with deep learning, for which three deep neural network models were implemented: a multilayer perceptron, a convolutional neural network, and a deep belief network. With the deep belief network, unsupervised pretraining for the automatic extraction of musical features from audio signals was also tested. Training of the models and evaluation of the transcription were performed on the MAPS database for piano music transcription. A comparison between the Fast Fourier Transform and the Constant Q Transform for data pre-processing was also carried out. The final results show that deep learning with an appropriate learning schedule is potentially a powerful tool for automatic music transcription.