Oral in Affinity Workshop: Tiny Papers Oral Session 2
Utilizing Cross-Version Consistency for Domain Adaptation: A Case Study on Music Audio
Lele Liu · Christof Weiß
Deep learning models are commonly trained on large annotated corpora, often in a specific domain. Generalizing to another domain without annotated data is usually challenging. In this paper, we address such unsupervised domain adaptation based on the teacher–student learning paradigm. For improved efficacy in the target domain, we propose to exploit cross-version scenarios, i.e., corresponding data pairs assumed to share the same yet unknown labels. More specifically, our idea is to compare teacher annotations across versions and use only the consistent annotations as labels to train the student model. Examples of cross-version data include the same text read by different speakers (in speech recognition) or the same character written by different writers (in handwritten text recognition). In our case study on music audio, versions are different recorded performances of the same composition, aligned with music synchronization techniques. Taking pitch estimation (a multi-label classification task) as an example task, we show that enforcing consistency across versions in student training helps to improve the transfer from a source domain (piano) to unseen and more complex target domains (singing/orchestra).
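The core filtering step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the teacher outputs per-frame, per-pitch sigmoid probabilities for two time-aligned versions of the same piece, and keeps only the entries on which the binarized teacher predictions agree across versions. All names (`consistent_pseudo_labels`, `masked_bce`, the 0.5 threshold) are hypothetical.

```python
import numpy as np

def consistent_pseudo_labels(probs_a, probs_b, threshold=0.5):
    """Derive student training labels from teacher predictions on two
    aligned versions of the same composition.

    probs_a, probs_b: (frames, pitches) arrays of teacher sigmoid outputs.
    Returns binary pseudo-labels and a mask that is 1.0 where the
    binarized predictions agree across versions, 0.0 elsewhere.
    """
    labels_a = probs_a >= threshold
    labels_b = probs_b >= threshold
    mask = labels_a == labels_b          # per-entry cross-version agreement
    # Where the mask is 1, labels_a and labels_b are identical by construction.
    return labels_a.astype(np.float32), mask.astype(np.float32)

def masked_bce(student_probs, labels, mask, eps=1e-7):
    """Binary cross-entropy restricted to the consistent entries,
    so inconsistent teacher annotations contribute no gradient."""
    bce = -(labels * np.log(student_probs + eps)
            + (1.0 - labels) * np.log(1.0 - student_probs + eps))
    return float((bce * mask).sum() / max(mask.sum(), 1.0))
```

In a full pipeline, the two probability arrays would first be brought into frame-wise correspondence via music synchronization (alignment), which the sketch assumes has already happened.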