MatSeek: An Automated Knowledge-Driven Framework for Materials Research
Jianguo Huang ⋅ Yuhao Lu ⋅ Yanchen Deng ⋅ Chendong Zhao ⋅ Penghui Yang ⋅ Zhonghan Zhang ⋅ Yixuan Li ⋅ Yushan Xiao ⋅ Cuntai Guan ⋅ Bijun Tang ⋅ Xinrun Wang ⋅ Zheng Liu ⋅ Bo An
Abstract
The discovery of advanced alloy materials increasingly depends on reliable and interpretable knowledge extracted from the scientific literature to guide data-driven composition–property optimization. While large language models (LLMs) have enabled automated database construction, existing approaches typically separate data extraction from relational scientific knowledge mining, limiting interpretability and physical grounding in materials design. Here we present $\textbf{MatSeek}$, an LLM-based framework that unifies structured alloy data and literature-derived scientific knowledge. MatSeek combines an automated pipeline for building high-quality alloy databases with a knowledge extraction module capturing empirical trends, mechanistic insights, and composition design principles. This knowledge can effectively accelerate machine-learning–driven alloy discovery by constraining exploration of composition space, while providing mechanistic explanations for model predictions. Applying MatSeek to 10,240 high-entropy alloy publications, we construct a database of 27,438 records and demonstrate efficient, interpretable identification of promising alloy compositions. MatSeek establishes a unified, literature-grounded paradigm for knowledge-driven materials discovery.