Affiliation(s): 1Nanyang Technological University, Singapore; 2E Fund Management Co., Ltd., Guangzhou 510000, China; 3National University of Singapore, Singapore
Junjie ZHANG1, Liyuan CHEN2, Shuoling LIU2, Tongzhe ZHANG2, Yuchen SHI2,3. A survey on large language model-based alpha mining[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2500386
@article{title="A survey on large language model-based alpha mining", author="Junjie ZHANG1, Liyuan CHEN2, Shuoling LIU2, Tongzhe ZHANG2, Yuchen SHI2,3", journal="Frontiers of Information Technology & Electronic Engineering", year="in press", publisher="Zhejiang University Press & Springer", doi="https://doi.org/10.1631/FITEE.2500386" }
%0 Journal Article %T A survey on large language model-based alpha mining %A Junjie ZHANG1 %A Liyuan CHEN2 %A Shuoling LIU2 %A Tongzhe ZHANG2 %A Yuchen SHI2,3 %J Frontiers of Information Technology & Electronic Engineering %P %@ 2095-9184 %D in press %I Zhejiang University Press & Springer doi="https://doi.org/10.1631/FITEE.2500386"
TY - JOUR T1 - A survey on large language model-based alpha mining A1 - Junjie ZHANG1 A1 - Liyuan CHEN2 A1 - Shuoling LIU2 A1 - Tongzhe ZHANG2 A1 - Yuchen SHI2,3 J0 - Frontiers of Information Technology & Electronic Engineering SP - EP - %@ 2095-9184 Y1 - in press PB - Zhejiang University Press & Springer ER - doi="https://doi.org/10.1631/FITEE.2500386"
Abstract: Alpha mining, the systematic discovery of data-driven signals predictive of future cross-sectional returns, is a central task in quantitative research. Recent progress in large language models (LLMs) has sparked interest in LLM-based alpha mining frameworks, which offer a promising middle ground between human-guided and fully automated alpha mining approaches and deliver both speed and semantic depth. This study presents a structured review of emerging LLM-based alpha mining systems from an agentic perspective and analyzes the functional roles of LLMs, ranging from miners and evaluators to interactive assistants. Despite early progress, key challenges remain, including limited numerical reasoning, weak exploitation mechanisms, low factor diversity, and risks of information leakage. Accordingly, we outline future research directions, including improving reasoning alignment, expanding to new data modalities, rethinking evaluation protocols, and integrating LLMs into more general-purpose quantitative systems. Our analysis suggests that LLMs can serve as a scalable interface for amplifying both domain expertise and algorithmic rigor: they amplify domain expertise by transforming qualitative hypotheses into testable factors, and enhance algorithmic rigor by supporting rapid backtesting and semantic reasoning. The result is a complementary paradigm where intuition, automation, and language-based reasoning converge to redefine the future of quantitative research.