Poster in Workshop: Machine Learning for Genomics Explorations (MLGenX)
Gene Set Function Discovery with LLM-Based Agents and Knowledge Retrieval
Daniela Pinto Veizaga · Aécio Santos · Juliana Freire · Wenke Liu · Sarah Keegan · David Fenyo
Advances in high-throughput technologies have produced complex biomedical datasets that make knowledge discovery challenging for researchers. Traditional tools such as gene set enrichment analysis (GSEA) and over-representation analysis (ORA) map gene sets to known pathways but struggle to reveal novel mechanisms, leaving researchers to synthesize insights manually. Although LLMs help with summarization, they lack transparency, adaptability to new knowledge, and integration with computational tools. We introduce Discovera, an agentic system that combines LLMs, bioinformatics tools, and knowledge retrieval to support mechanistic discovery. We also show how Discovera can be applied to endometrial carcinoma research, where it assists with functional enrichment analysis and summarizes potential mechanisms of action for gene sets correlated with an observed phenotype.