Optimal Robust Subsidy Policies for an Irrational Agent in Principal-Agent MDPs
Bowen Hu ⋅ Yixin Tao
Abstract
We study a principal-agent problem in a Markov Decision Process in which the principal provides subsidies to influence the agent's policy, which in turn determines the accrued rewards. Our focus is on designing a robust subsidy scheme that maximizes the principal's expected cumulative return even when the agent exhibits bounded rationality and may deviate from the optimal action policy after receiving subsidies. As a baseline, we first analyze the case of a perfectly rational agent and show that the principal's optimal subsidy induces the policy that maximizes social welfare, defined as the sum of the utilities of the principal and the agent. We then introduce a bounded-rationality model, the globally $\epsilon$-incentive-compatible agent, who may adopt any policy whose expected cumulative utility lies within $\epsilon$ of its personal optimum. In this setting, we prove that the optimal robust subsidy problem reduces to a one-dimensional concave optimization, revealing that optimal subsidies concentrate along social-welfare-maximizing trajectories, and we bound the associated loss in social welfare. Finally, we investigate a finer-grained, state-wise $\epsilon$-incentive-compatible model and show that under two natural definitions of state-wise incentive compatibility the problem becomes intractable: one definition results in a non-Markovian agent action policy, while the other renders the search for an optimal subsidy scheme NP-hard.