Skip to yearly menu bar Skip to main content


An interpretable error correction method for enhancing code-to-code translation

Min Xue · Artur Andrzejak · Marla Leuther

Halle B #111
[ ]
Wed 8 May 1:45 a.m. PDT — 3:45 a.m. PDT

Abstract: Transformer-based machine translation models currently dominate the field of model-based program translation. However, these models fail to provide interpretative support for the generated program translations. Moreover, researchers frequently invest substantial time and computational resources in retraining models, yet the improvement in translation accuracy is quite limited. To address these issues, we introduce a novel approach, $k\text{NN-ECD}$, which combines $k$-nearest-neighbor search with a key-value error correction datastore to overwrite the wrong translations of TransCoder-ST. This provides a decision-making basis for interpreting the corrected translations. Building upon this, we further propose $k\text{NN-ECS}_{m}$, a methodology that employs a distributed structure with $m$ sub-datastores connected in series, utilizing $m$ diverse experts for multi-round error correction. Additionally, we put forward a unified name rule, encouraging the datastore to focus more on code logic and structure rather than diverse rare identifiers. Our experimental results show that our approach improves the translation accuracy from 68.9\% to 89.9\% of TransCoder-ST (for translation from Java to Python). This error correction method augments program translation, overcoming the inherent limitations of Transformer-based code translation models, such as resource-intensive retraining requirements and uninterpretable outcomes.

Chat is not available.