Poster

Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients

Brenden Petersen · Mikel Landajuela Larma · Terrell N Mundhenk · Claudio Santiago · Soo Kim · Joanne Kim

Keywords: [ reinforcement learning ] [ symbolic regression ] [ automated machine learning ]


Abstract: Discovering the underlying mathematical expressions describing a dataset is a core challenge for artificial intelligence. This is the problem of symbolic regression. Despite recent advances in training neural networks to solve complex tasks, deep learning approaches to symbolic regression are underexplored. We propose a framework that leverages deep learning for symbolic regression via a simple idea: use a large model to search the space of small models. Specifically, we use a recurrent neural network to emit a distribution over tractable mathematical expressions and employ a novel risk-seeking policy gradient to train the network to generate better-fitting expressions. Our algorithm outperforms several baseline methods (including Eureqa, the gold standard for symbolic regression) in its ability to exactly recover symbolic expressions on a series of benchmark problems, both with and without added noise. More broadly, our contributions include a framework that can be applied to optimize hierarchical, variable-length objects under a black-box performance metric, with the ability to incorporate constraints in situ, and a risk-seeking policy gradient formulation that optimizes for best-case performance instead of expected performance.
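To make the risk-seeking idea concrete, below is a minimal sketch of the gradient estimator the abstract alludes to: sample a batch from the policy, keep only rewards above the empirical (1 − ε) quantile, and apply REINFORCE with that quantile as the baseline. This is an illustration under stated assumptions, not the authors' implementation: a toy softmax policy over five discrete actions stands in for the paper's RNN over expression tokens, and `true_rewards`, the Gaussian reward noise, and all hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the paper's setup: a categorical policy over K actions
# replaces the RNN's distribution over expression tokens, and a noisy
# per-action reward replaces the black-box fitness of a sampled expression.
# All of these names and values are illustrative assumptions.
K = 5
theta = np.zeros(K)                                   # policy logits
EPSILON = 0.05                                        # risk-seeking quantile parameter
true_rewards = np.array([0.1, 0.2, 0.5, 0.3, 0.9])    # hypothetical mean reward per action

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

for step in range(2000):
    probs = softmax(theta)
    actions = rng.choice(K, size=256, p=probs)        # sample a batch of "expressions"
    rewards = true_rewards[actions] + 0.1 * rng.normal(size=actions.size)

    # Risk-seeking baseline: the empirical (1 - epsilon) quantile of batch rewards.
    r_eps = np.quantile(rewards, 1.0 - EPSILON)

    # Keep only the top-epsilon fraction; everything below contributes zero gradient.
    keep = rewards >= r_eps

    # REINFORCE on the kept samples with r_eps as the baseline.
    # For a softmax policy, grad_theta log pi(a) = one_hot(a) - probs.
    grad = np.zeros(K)
    for a, r in zip(actions[keep], rewards[keep]):
        g = -probs
        g[a] += 1.0
        grad += (r - r_eps) * g
    grad /= max(keep.sum(), 1)

    theta += 0.1 * grad                               # gradient ascent on the risk-seeking objective

print("final policy:", softmax(theta).round(3))       # mass concentrates on the best action
```

Because samples at or below the quantile contribute nothing to the update, the estimator deliberately ignores average-case performance, which is the point of the risk-seeking formulation: the policy is pushed toward the upper tail of the reward distribution, i.e. best-case fits, rather than the mean.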
