In this thesis we tackle the problem of automatic transcription of piano music. Our goal is to transcribe the piano notes played in an audio recording using machine learning techniques. Following the latest developments in the field, we implement a solution based on convolutional neural networks. In addition to training on annotated piano music datasets, we introduce a synthetic data generator that runs in real time during training and uses MIDI files to produce training spectrograms and the corresponding ground-truth data; a sketch of such a generator is given below. To train our models, we collected a large set of MIDI files covering various genres of music. We also prepared a test set comprising 60 piano recordings from 6 different genres in addition to 10 recordings of classical music. We evaluate the results obtained with different training methods. Frame-wise evaluation yields slightly better results on real piano test data than on synthetic data. Note-wise evaluation without offsets gives better results with synthetic data; however, note-wise evaluation with offsets yields superior results with real training data.
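The following is a minimal illustrative sketch, not the thesis implementation, of how a MIDI file could be turned into a training spectrogram and a frame-wise ground-truth piano roll. It assumes the pretty_midi and librosa packages, a FluidSynth installation, and a hypothetical SoundFont path; the sample rate, hop length, and mel-bin count are assumptions, and the actual generator described above may differ.

    # Illustrative sketch only: MIDI file -> (spectrogram, ground-truth piano roll).
    # Assumes pretty_midi, librosa, FluidSynth and a hypothetical SoundFont file.
    import numpy as np
    import pretty_midi
    import librosa

    SR = 16000                       # audio sample rate (assumed)
    HOP = 512                        # spectrogram hop length (assumed)
    N_MELS = 229                     # number of mel bins (assumed)
    MIN_PITCH, MAX_PITCH = 21, 108   # piano key range A0..C8

    def midi_to_example(midi_path, soundfont="piano.sf2"):
        """Render a MIDI file to audio and return (spectrogram, piano_roll)."""
        pm = pretty_midi.PrettyMIDI(midi_path)
        # Synthesize audio from the MIDI file (requires FluidSynth and a SoundFont).
        audio = pm.fluidsynth(fs=SR, sf2_path=soundfont)
        # Log-scaled mel spectrogram used as the network input.
        mel = librosa.feature.melspectrogram(y=audio, sr=SR, hop_length=HOP, n_mels=N_MELS)
        spec = librosa.power_to_db(mel).T                    # shape: (frames, N_MELS)
        # Frame-wise ground truth: one row per frame, one column per piano key.
        frames = spec.shape[0]
        roll = np.zeros((frames, MAX_PITCH - MIN_PITCH + 1), dtype=np.float32)
        for instrument in pm.instruments:
            for note in instrument.notes:
                if MIN_PITCH <= note.pitch <= MAX_PITCH:
                    start = int(note.start * SR / HOP)
                    end = int(note.end * SR / HOP)
                    roll[start:min(end, frames), note.pitch - MIN_PITCH] = 1.0
        return spec, roll

Because the audio is rendered and the piano roll is derived from the same MIDI events, spectrogram frames and ground-truth labels stay aligned by construction, which is what allows such a generator to produce training examples on the fly.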