Poster
in
Workshop: 5th Workshop on practical ML for limited/low resource settings (PML4LRS) @ ICLR 2024
Conditional Transformer Fine-Tuning by Adaptive Layer Skipping
Xingjian Zhang · Jiaxi Tang · Yang Liu · Xinyang Yi · Li Wei · Lichan Hong · Qiaozhu Mei · Ed H. Chi
In recent years, deep learning has achieved significant success across various domains, such as natural language processing and computer vision. Despite this advancement, most deep neural networks assign a uniform computation cost to all inputs regardless of their complexity. Focusing on the Transformer architecture, our study addresses this challenge by introducing a sequence-level conditional fine-tuning framework based on adaptive layer skipping. The proposed framework dynamically adjusts computation according to the complexity of the input sequence and is tailored for modern accelerators such as TPUs and GPUs. We examined several measures of input complexity and found one to be highly effective in guiding the conditional computation. Experimental results on synthetic and real-world datasets demonstrate the effectiveness of our methodology, achieving a substantial reduction in training time while maintaining the same predictive performance.
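The abstract does not describe the mechanism in detail; the sketch below is only a rough illustration of the general idea of sequence-level adaptive layer skipping, not the authors' implementation. The complexity measure (mean predictive entropy from a cheap proxy head), the batch-level skipping rule, and all module and parameter names (`SkippableTransformer`, `proxy_head`, `budget`, `threshold`) are assumptions made for illustration.

```python
# Minimal sketch (assumed, not the paper's method): sequences judged "easy"
# by a cheap complexity probe are routed through fewer Transformer layers.
import torch
import torch.nn as nn

class SkippableTransformer(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=6, vocab=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.proxy_head = nn.Linear(d_model, vocab)  # cheap complexity probe

    def complexity(self, h):
        # Per-sequence score: mean predictive entropy of the proxy head.
        probs = self.proxy_head(h).softmax(-1)
        ent = -(probs * probs.clamp_min(1e-9).log()).sum(-1)  # (B, T)
        return ent.mean(-1)                                   # (B,)

    def forward(self, tokens, budget=(2, 6), threshold=3.0):
        h = self.embed(tokens)
        score = self.complexity(h)
        # Batch-level decision keeps tensor shapes static, which is friendlier
        # to TPU/GPU compilers than per-token dynamic control flow.
        n_layers = budget[1] if score.mean() > threshold else budget[0]
        for layer in self.layers[:n_layers]:
            h = layer(h)
        return h

# Example usage: a batch of 8 sequences of length 32.
model = SkippableTransformer()
out = model(torch.randint(0, 1000, (8, 32)))
```

Deciding the layer budget per batch (rather than per token) is one simple way to keep computation graphs static for accelerators; the paper may use a different complexity measure or skipping granularity.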