[2602.00663] SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent
Summary
The paper presents SEISMO, a trajectory-aware LLM agent for sample-efficient molecular optimization. On the 23-task Practical Molecular Optimization benchmark, it reports a 2-3 times higher area under the optimisation curve than prior methods.
Why It Matters
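Molecular optimization is central to drug discovery, where each property evaluation may depend on a costly, rate-limited oracle such as an experimental assay. Methods that need fewer evaluations can therefore reduce the cost and time of developing new pharmaceuticals. SEISMO addresses this by updating its proposals after every oracle call, conditioning each one on the full optimization trajectory.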
Key Takeaways
- SEISMO achieves a 2-3 times higher area under the optimisation curve than prior methods across the 23 tasks of the Practical Molecular Optimization benchmark.
- The agent operates strictly online at inference time: it updates after every oracle call, without population-based or batched learning.
- When available, structured explanatory feedback is supplied alongside scalar scores, and incorporating it further boosts sample efficiency in molecular tasks.
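The strictly online loop described in the takeaways can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `optimize_online`, `toy_propose`, and `toy_oracle` are hypothetical names, the proposer stands in for the LLM agent, and the toy oracle stands in for a costly assay. The point it shows is that every proposal sees the full trajectory of earlier (molecule, score) pairs and that the trajectory is updated after each single oracle call.

```python
from typing import Callable, List, Tuple

# One entry per oracle call, in call order: (molecule string, scalar score).
Trajectory = List[Tuple[str, float]]


def optimize_online(
    propose: Callable[[str, Trajectory], str],
    oracle: Callable[[str], float],
    task_description: str,
    budget: int,
) -> Trajectory:
    """Run a strictly online loop: one proposal, one oracle call, one update."""
    trajectory: Trajectory = []
    for _ in range(budget):
        # The proposer is conditioned on the task text and the full history.
        candidate = propose(task_description, trajectory)
        score = oracle(candidate)
        # Update immediately, so the next proposal already sees this result.
        trajectory.append((candidate, score))
    return trajectory


def toy_propose(task_description: str, trajectory: Trajectory) -> str:
    """Hypothetical stand-in for the LLM: extend the best molecule seen so far."""
    if not trajectory:
        return "C"
    best_molecule, _ = max(trajectory, key=lambda pair: pair[1])
    return best_molecule + "C"


def toy_oracle(molecule: str) -> float:
    """Hypothetical stand-in for an assay: longer chains score higher, capped at 1."""
    return min(len(molecule), 5) / 5.0
```

Calling `optimize_online(toy_propose, toy_oracle, "maximize chain length", budget=4)` yields four proposals, each built from the best result of the preceding calls. A real agent would replace `toy_propose` with a prompt that serializes the task description, the trajectory, and any structured feedback.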
Computer Science > Artificial Intelligence
arXiv:2602.00663 (cs)
[Submitted on 31 Jan 2026 (v1), last revised 18 Feb 2026 (this version, v2)]
Title: SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent
Authors: Fabian P. Krüger, Andrea Hunklinger, Adrian Wolny, Tim J. Adler, Igor Tetko, Santiago David Villalba
Abstract: Optimizing the structure of molecules to achieve desired properties is a central bottleneck across the chemical sciences, particularly in the pharmaceutical industry where it underlies the discovery of new drugs. Since molecular property evaluation often relies on costly and rate-limited oracles, such as experimental assays, molecular optimization must be highly sample-efficient. To address this, we introduce SEISMO, an LLM agent that performs strictly online, inference-time molecular optimization, updating after every oracle call without the need for population-based or batched learning. SEISMO conditions each proposal on the full optimization trajectory, combining natural-language task descriptions with scalar scores and, when available, structured explanatory feedback. Across the Practical Molecular Optimization benchmark of 23 tasks, SEISMO achieves a 2-3 times higher area under the optimisation curve than prior meth...
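To make the "area under the optimisation curve" metric concrete, here is a small sketch of one common way to compute such a quantity. It rests on assumptions: the exact definition used in the paper is not given in this excerpt, so the hypothetical `best_so_far_auc` below tracks the single best score found so far after each oracle call and averages that curve over a fixed budget, assuming scores lie in [0, 1].

```python
from typing import Sequence


def best_so_far_auc(scores: Sequence[float], budget: int) -> float:
    """Normalized area under the best-score-so-far curve over `budget` oracle calls.

    Assumption: scores are in [0, 1] and listed in oracle-call order. If fewer
    than `budget` scores are given, the curve is held flat at the last best value.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    best = 0.0
    total = 0.0
    for call_index in range(budget):
        if call_index < len(scores):
            best = max(best, scores[call_index])
        total += best  # height of the curve after this oracle call
    return total / budget
```

Under this definition, two runs that reach the same final score are not equal: `best_so_far_auc([0.9, 0.1, 0.1], 3)` exceeds `best_so_far_auc([0.1, 0.1, 0.9], 3)` because the first run finds the good molecule earlier. That is why an area-based metric rewards sample efficiency and not only the final result.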