LLM-Driven Correlation-Aware Tournaments
Abstract
Classical portfolio construction frameworks such as mean–variance optimization, factor models, and convex risk–return formulations are intrinsically mathematical. They require explicit objective functions, parametric assumptions, and numerical solvers. In contrast, many institutional and high-net-worth portfolios are constructed through a primarily qualitative process, where analysts synthesize business fundamentals, competitive positioning, secular themes, and risk narratives rather than solving closed-form optimization problems. This paper introduces Hierarchical Cluster Tournaments (HCT), a qualitative, correlation-aware stock selection framework powered by large language models. Starting from a large asset universe such as the S&P 500, HCT constructs a hierarchical correlation dendrogram using nonlinear eigenvalue shrinkage and Ward-linkage clustering. Each internal node of the tree is repurposed as a tournament match between two clusters. A large language model, supplied exclusively with structured fundamental evidence and long-horizon investment criteria, allocates a fixed number of selection slots between competing clusters. At the leaf nodes, the model conducts local intra-cluster tournaments to identify the strongest representatives within each correlation regime. This process yields a fundamentals-grounded, correlation-aware survivor set with uniform weights, without defining expected returns, estimating covariances for optimization, or invoking numerical solvers. By embedding qualitative judgment inside a structurally disciplined correlation framework, HCT formalizes discretionary stock selection into a transparent, auditable, and systematically diversified pipeline. The proposed method bridges the gap between institutional qualitative practice and systematic portfolio design, offering a text-driven alternative to traditional quantitative selection mechanisms.
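The slot-allocation recursion described above can be illustrated with a minimal sketch. It makes several assumptions that are not part of the method as stated: the dendrogram is written by hand as nested tuples rather than derived from shrinkage-cleaned correlations and Ward linkage, and the language model is replaced by a deterministic stand-in that compares hypothetical per-ticker scores. The names `run_tournament`, `split_slots`, `pick_best`, and the tickers and scores are all illustrative.

```python
from typing import Callable, Dict, List, Union

# A leaf is a list of tickers in one correlation cluster;
# an internal node is a (left, right) tuple of subtrees.
Tree = Union[List[str], tuple]

def members(node: Tree) -> List[str]:
    """Return all tickers under a node."""
    if isinstance(node, list):
        return list(node)
    left, right = node
    return members(left) + members(right)

def run_tournament(
    node: Tree,
    slots: int,
    split_slots: Callable[[List[str], List[str], int], int],
    pick_best: Callable[[List[str], int], List[str]],
) -> List[str]:
    """Push a slot budget down the tree; each internal node is one match."""
    slots = min(slots, len(members(node)))
    if slots <= 0:
        return []
    if isinstance(node, list):
        # Leaf: local intra-cluster tournament.
        return pick_best(node, slots)
    left, right = node
    left_names, right_names = members(left), members(right)
    k_left = split_slots(left_names, right_names, slots)
    # Clamp so that neither side gets more slots than it has members.
    low = max(0, slots - len(right_names))
    high = min(slots, len(left_names))
    k_left = max(low, min(k_left, high))
    return (run_tournament(left, k_left, split_slots, pick_best)
            + run_tournament(right, slots - k_left, split_slots, pick_best))

# Hypothetical scores standing in for the model's qualitative judgment.
scores: Dict[str, float] = {
    "AAA": 0.9, "BBB": 0.4, "CCC": 0.7,
    "DDD": 0.8, "EEE": 0.6, "FFF": 0.2,
}

def mean_score(names: List[str]) -> float:
    return sum(scores[n] for n in names) / len(names)

def split_by_mean(left: List[str], right: List[str], slots: int) -> int:
    """Stand-in judge: share slots in proportion to mean cluster score."""
    l, r = mean_score(left), mean_score(right)
    return round(slots * l / (l + r))

def top_k(names: List[str], k: int) -> List[str]:
    """Stand-in leaf tournament: keep the k highest-scoring tickers."""
    return sorted(names, key=lambda n: scores[n], reverse=True)[:k]

# Hand-written tree; in practice it would come from the clustering step.
tree: Tree = ((["AAA", "BBB"], ["CCC"]), ["DDD", "EEE", "FFF"])
survivors = run_tournament(tree, 3, split_by_mean, top_k)
weights = {t: 1.0 / len(survivors) for t in survivors}  # uniform weights
```

In a fuller implementation, `split_by_mean` and `top_k` would be replaced by calls to a language model supplied with structured fundamental evidence. The recursion and the clamping of slot counts would stay the same.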