FlowGen: Synthesizing Diverse Flowcharts to Enhance and Benchmark MLLM Reasoning
Abstract
Flowcharts are widely used to represent processes and relationships in an intuitive visual form. However, interpreting these diagrams accurately remains challenging because of their structural complexity and high visual diversity. Existing flowchart datasets often lack fine-grained control over key properties such as graph complexity and rendering style, which limits their utility for training and testing multimodal large language models (MLLMs) on visual reasoning tasks. To address these limitations, we introduce FlowGen, a controllable synthesizer that generates flowcharts with customizable structural features and supports multiple rendering backends. FlowGen enables fine-grained control over graph properties such as graph order (number of nodes) and size (number of edges), branched arrows, and nested subgraphs, facilitating systematic evaluation of MLLMs’ capabilities. Extensive experiments on open-source and proprietary MLLMs show that training on FlowGen substantially improves flowchart parsing and question answering (QA), while also enhancing generalization to other public datasets. Furthermore, FlowGen provides challenging test sets that expose consistent weaknesses in current MLLMs, particularly on flowcharts with high structural complexity and varied rendering styles. Our code and data are publicly available at https://anonymous.4open.science/r/FlowGen-.
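To make the idea of structural control concrete, the following is a minimal illustrative sketch and not FlowGen's actual implementation. It assumes that a flowchart can be modeled as a directed acyclic graph whose order and size are set explicitly, and that a renderer backend is simply a function turning the same edge list into a diagram description (here Mermaid and Graphviz DOT text). The names `synthesize_flowchart`, `to_mermaid`, and `to_dot` are hypothetical.

```python
import random


def synthesize_flowchart(order, size, seed=0):
    """Return a sorted list of directed edges (u, v) over nodes 0..order-1.

    order is the number of nodes and size the number of edges.
    Hypothetical helper for illustration; not part of FlowGen.
    """
    max_size = order * (order - 1) // 2
    if order < 2 or not (order - 1 <= size <= max_size):
        raise ValueError("need order >= 2 and order-1 <= size <= order*(order-1)/2")
    rng = random.Random(seed)
    # Build a random spanning tree first: each node v > 0 gets one parent
    # with a smaller index, so every node is reachable from node 0.
    edges = {(rng.randrange(v), v) for v in range(1, order)}
    # Add extra forward edges (u < v) until the requested size is reached.
    # Forward-only edges keep the graph acyclic; nodes with several
    # outgoing edges act as branch points.
    candidates = [(u, v) for v in range(order) for u in range(v)
                  if (u, v) not in edges]
    rng.shuffle(candidates)
    edges.update(candidates[: size - len(edges)])
    return sorted(edges)


def to_mermaid(edges):
    """Render the edge list as Mermaid flowchart text."""
    lines = ["flowchart TD"]
    lines += [f"    N{u} --> N{v}" for u, v in edges]
    return "\n".join(lines)


def to_dot(edges):
    """Render the same edge list as Graphviz DOT text."""
    lines = ["digraph G {"]
    lines += [f"    N{u} -> N{v};" for u, v in edges]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    chart = synthesize_flowchart(order=6, size=8, seed=1)
    print(to_mermaid(chart))
    print(to_dot(chart))
```

Because both renderers consume the same edge list, structure and visual style vary independently in this sketch, which is the kind of separation a controlled evaluation would need.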