Unsupervised Transcription of Piano Music
MS Technical Paper
Audio signal processing is a very active research area. Of all the tasks in this area, automatic piano music transcription is an especially interesting and challenging one, and it has many practical uses. For instance, in today’s music lessons and tests, we often rely on human hearing to judge whether a pianist performed well, based on whether the notes played were accurate. This process requires manpower and is not always fair or accurate, because human judgement is subjective. If a good automatic transcription system can be designed and implemented with high… To tackle this problem, source-separation techniques must be utilized.
2. Existing Approaches
In this section, we discuss what has been done in the area of unsupervised music transcription. The task has several aspects, and a range of techniques has been used in attempts to solve it efficiently and accurately. To give a clear picture of prior work, we categorize the different approaches by the technique they use.
The classic starting point for unsupervised piano transcription, where the test instrument is not seen during training, is a non-negative factorization of the acoustic signal’s spectrogram. Most research has improved on this baseline in one of two ways: by better modeling the discrete musical structure of the piece being transcribed [2,3], or by better adapting to the timbral properties of the source instrument [4,5].
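To make the baseline concrete, the following is a minimal sketch of a non-negative factorization of a toy spectrogram. It assumes the Euclidean-distance multiplicative updates commonly used for non-negative matrix factorization, which is one standard choice and not necessarily the variant used in the cited work. The helper names (`matmul`, `transpose`, `nmf`, `squared_error`) and the toy data are illustrative assumptions, not part of any library.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]

def transpose(A):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative V (frequency x time) as W times H.

    W (frequency x rank) holds one spectral template per component and
    H (rank x time) holds each component's activation over time.
    Uses multiplicative updates for the squared-error objective
    (an assumed, standard choice).
    """
    rng = random.Random(seed)
    n_freq, n_time = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    H = [[rng.random() + eps for _ in range(n_time)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

def squared_error(V, W, H):
    """Sum of squared differences between V and the product W times H."""
    R = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, R) for v, r in zip(vr, rr))

# Toy "spectrogram": 4 frequency bins, 6 time frames, built from 2 notes.
templates = [[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.5]]
activations = [[1, 1, 0, 0, 1, 0], [0, 0, 1, 1, 1, 0]]
V = matmul(templates, activations)

W, H = nmf(V, rank=2)
print("reconstruction error:", squared_error(V, W, H))
```

In a transcription setting, the columns of `W` would play the role of per-note spectral templates and the rows of `H` would be thresholded to decide when each note sounds. The factorization itself knows nothing about musical structure or instrument timbre, which is the gap the two lines of work above try to close.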
Combining these two approaches is difficult. Hidden Markov or semi-Markov models are the standard way to model discrete musical structure, and they need fast dynamic programming for inference. Combining discrete models with timbral adaptation and source separation would break the conditional independence assumptions that dynamic programming relies on. To avoid this inference problem, previous work typically postpones detailed modeling of the discrete structure of timbre