A Framework for Prompt Optimization and Translation Across Foundation Models
Abstract
Foundation-model upgrades frequently break deployed prompt-based systems: target models differ in chat-template conventions, multimodal interfaces, context limits, and structured-output reliability. We study cross-model prompt adaptation: given a prompt program validated on a source model, produce a target-model prompt that preserves a semantic contract and an interface contract under bounded regression risk. We propose a governed, hierarchical adaptation framework that decomposes prompts into transferable semantic components and model-dependent structure and interface components, and optimizes only the non-transferable parts via budgeted search over system-level (L0) and template-level (L1) factors. Our optimization objective combines task utility with hard feasibility constraints (schema validity, parseability, policy compliance) and a risk penalty capturing output instability under stochastic decoding. On a large-scale structured prediction workload (128K labeled instances across text and multimodal settings), automated prompt translation matches expert human prompts while reducing manual iteration by 97%. Across varied model families, we observe consistent transfer patterns: semantic directives transfer reliably, whereas schema enforcement and provider-specific formatting require targeted adaptation; multimodal grounding improves recall but shifts the cost–performance frontier. These results frame prompts as portable programs and provide an auditable recipe for reliable pre-deployment prompt adaptation before upgrading foundation models in real-world deployments.
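As an illustration of the objective described above, the following is a minimal sketch, not the paper's implementation. It assumes the risk penalty is the standard deviation of utility across sampled decodings and that the budget caps the number of (L0, L1) combinations evaluated. All names (`score_candidate`, `budgeted_search`, `run_model`, `is_feasible`, `utility`, `risk_weight`) are hypothetical.

```python
from itertools import product
from statistics import mean, pstdev


def score_candidate(outputs, is_feasible, utility, risk_weight=0.5):
    """Score one prompt candidate from several sampled outputs.

    Returns None when any output breaks a hard constraint (assumed here
    to be a single boolean check); otherwise mean utility minus a
    penalty on the spread of utility across samples.
    """
    if not all(is_feasible(o) for o in outputs):
        return None  # hard constraint violated: candidate is infeasible
    utils = [utility(o) for o in outputs]
    return mean(utils) - risk_weight * pstdev(utils)


def budgeted_search(l0_options, l1_options, run_model, is_feasible,
                    utility, budget, risk_weight=0.5):
    """Exhaustively try (L0, L1) pairs until the budget is spent.

    run_model(l0, l1) is a caller-supplied function (hypothetical) that
    returns a list of outputs sampled under stochastic decoding.
    """
    best, best_score = None, float("-inf")
    for i, (l0, l1) in enumerate(product(l0_options, l1_options)):
        if i >= budget:
            break  # evaluation budget exhausted
        score = score_candidate(run_model(l0, l1), is_feasible,
                                utility, risk_weight)
        if score is not None and score > best_score:
            best, best_score = (l0, l1), score
    return best, best_score
```

Treating feasibility as a filter rather than a weighted term mirrors the abstract's distinction between hard constraints and the soft risk penalty: an infeasible candidate is never selected, however high its utility.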