Dropped Scheduled Task: Mitigating Negative Transfer in Multi-task Learning using Dynamic Task Dropping

Aakarsh Malhotra · Mayank Vatsa · Richa Singh


Paper PDF

Thumbnail of paper pages


In Multi-Task Learning (MTL), K distinct tasks are jointly optimized. With the varying nature and complexities of tasks, few tasks might dominate learning. For other tasks, their respective performances may get compromised due to a negative transfer from dominant tasks. We propose a Dropped-Scheduled Task (DST) algorithm, which probabilistically “drops” specific tasks during joint optimization while scheduling others to reduce negative transfer. For each task, a scheduling probability is decided based on four different metrics: (i) task depth, (ii) number of ground-truth samples per task, (iii) amount of training completed, and (iv) task stagnancy. Based on the scheduling probability, specific tasks get joint computation cycles while others are “dropped”. To demonstrate the effectiveness of the proposed DST algorithm, we perform multi-task learning on three applications and two architectures. Across unilateral (single input) and bilateral (multiple input) multi-task net- works, the chosen applications are (a) face (AFLW), (b) fingerprint (IIITD MOLF, MUST, and NIST SD27), and (c) character recognition (Omniglot) applications. Experimental results show that the proposed DST algorithm has the minimum negative transfer and overall least errors across different state-of-the-art algorithms and tasks.