[2602.13808] An end-to-end agentic pipeline for smart contract translation and quality evaluation
Summary
This paper presents an end-to-end framework for evaluating Solidity smart contracts that LLMs generate from natural-language specifications. It scores the generated code for quality and identifies systematic error modes.
Why It Matters
As smart contracts become integral to blockchain applications, their correctness and security become critical. The framework gives a systematic way to evaluate LLM-generated contracts through compilation checks, security checks, and comparison against ground-truth implementations. It also provides a reproducible benchmark for future research on smart contract synthesis.
Key Takeaways
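- Introduces an end-to-end pipeline that parses contractual text into structured schemas, generates Solidity code, and evaluates the result.
- Measures quality across five dimensions (functional completeness, variable fidelity, state-machine correctness, business-logic fidelity, and code quality) and aggregates them into composite scores.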
- Supports empirical research by providing reproducible benchmarks.
- Identifies systematic error modes in smart contract generation, such as logic omissions and state transition inconsistencies.
- Facilitates extensions to formal verification and compliance checking.
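The aggregation of per-dimension scores into a composite can be sketched as follows. The five dimension names come from the paper's abstract; the [0, 1] score range, the equal default weights, and the weighted-mean formula are assumptions made for illustration, since the summary does not state how the paper combines the dimensions. `QualityScores` and `composite_score` are hypothetical names.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class QualityScores:
    """Per-dimension scores, assumed here to lie in [0, 1]."""
    functional_completeness: float
    variable_fidelity: float
    state_machine_correctness: float
    business_logic_fidelity: float
    code_quality: float


def composite_score(scores: QualityScores,
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the five dimensions (assumed aggregation rule)."""
    names = [f.name for f in fields(scores)]
    if weights is None:
        # Assumption: every dimension counts equally.
        weights = {name: 1.0 for name in names}
    if set(weights) != set(names):
        raise ValueError("weights must cover exactly the five dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name in names:
        if not 0.0 <= getattr(scores, name) <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    weighted_sum = sum(weights[n] * getattr(scores, n) for n in names)
    return weighted_sum / total_weight


# Hypothetical scores for one generated contract.
example = QualityScores(
    functional_completeness=1.0,
    variable_fidelity=0.5,
    state_machine_correctness=0.5,
    business_logic_fidelity=1.0,
    code_quality=0.0,
)
print(composite_score(example))
```

Passing an explicit `weights` dictionary lets an evaluator emphasize, for example, business-logic fidelity over code quality without changing the per-dimension scoring.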
Computer Science > Artificial Intelligence
arXiv:2602.13808 (cs)
[Submitted on 14 Feb 2026]
Authors: Abhinav Goel, Chaitya Shah, Agostino Capponi, Alfio Gliozzo
Abstract: We present an end-to-end framework for systematic evaluation of LLM-generated smart contracts from natural-language specifications. The system parses contractual text into structured schemas, generates Solidity code, and performs automated quality assessment through compilation and security checks. Using CrewAI-style agent teams with iterative refinement, the pipeline produces structured artifacts with full provenance metadata. Quality is measured across five dimensions, including functional completeness, variable fidelity, state-machine correctness, business-logic fidelity, and code quality aggregated into composite scores. The framework supports paired evaluation against ground-truth implementations, quantifying alignment and identifying systematic error modes such as logic omissions and state transition inconsistencies. This provides a reproducible benchmark for empirical research on smart contract synthesis quality and supports extensions to formal verification and compliance checking.
Comments: Subj...
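The paired evaluation against a ground-truth implementation, as described in the abstract, might be sketched for the state-machine dimension as below. This is an illustration under stated assumptions, not the paper's method: each contract is reduced to a set of (source, trigger, target) transitions, alignment is computed as Jaccard similarity, and the escrow-style state names are invented for the example. `Transition` and `compare_state_machines` are hypothetical names.

```python
from typing import Dict, Iterable, NamedTuple


class Transition(NamedTuple):
    """One state-machine edge: source state, triggering call, target state."""
    source: str
    trigger: str
    target: str


def compare_state_machines(reference: Iterable[Transition],
                           generated: Iterable[Transition]) -> Dict[str, object]:
    """Compare a generated contract's transitions with a ground-truth set."""
    ref, gen = set(reference), set(generated)
    union = ref | gen
    # Assumption: Jaccard similarity stands in for the alignment score.
    alignment = len(ref & gen) / len(union) if union else 1.0
    return {
        "alignment": alignment,
        # In the reference but absent from the generated code (omissions).
        "missing": sorted(ref - gen),
        # In the generated code but not in the reference (inconsistencies).
        "unexpected": sorted(gen - ref),
    }


# Hypothetical escrow contract: the generated version drops the refund path
# and sends "deliver" to the wrong target state.
reference = [
    Transition("Created", "fund", "Funded"),
    Transition("Funded", "deliver", "Delivered"),
    Transition("Funded", "refund", "Refunded"),
]
generated = [
    Transition("Created", "fund", "Funded"),
    Transition("Funded", "deliver", "Refunded"),
]

report = compare_state_machines(reference, generated)
print(report["alignment"])
for t in report["missing"]:
    print("missing:", t)
for t in report["unexpected"]:
    print("unexpected:", t)
```

Reporting the missing and unexpected transitions separately keeps the two error modes named in the abstract apart: omitted logic shows up as missing edges, while state transition inconsistencies show up as unexpected ones.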