[2502.16730] RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents

Summary

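RapidPen is a fully automated penetration testing framework that uses large language models to discover and exploit vulnerabilities starting from a single IP address, obtaining shell access without human intervention. In the paper's evaluation against one vulnerable Hack The Box target, it reached a shell within 200-400 seconds at roughly $0.3-$0.6 per run.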

Why It Matters

Gaining an initial foothold is the part of a penetration test that earlier LLM-based approaches either skipped, by focusing on post-exploitation, or handled with a human in the loop. By automating that step, RapidPen could make basic testing accessible to organizations without dedicated security teams and reduce repetitive work for professionals.

Key Takeaways

  • RapidPen automates the IP-to-shell penetration testing process without human intervention.
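  • It uses LLMs for vulnerability discovery and exploitation; in the paper's evaluation against one Hack The Box target, it achieved shell access within 200-400 seconds.
  • Reported per-run costs are approximately $0.3 to $0.6, with a 60% success rate when prior "success-case" data is reused.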
  • RapidPen can benefit both novices and experienced pentesters by simplifying repetitive tasks.
  • The tool aims to improve the accessibility and efficiency of penetration testing for various organizations.
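
To put the cost figure in context, it can be combined with the 60% success rate that the abstract reports. The sketch below is a back-of-the-envelope estimate, not a result from the paper: it assumes that runs are independent, that each run succeeds with the same probability, and that a failed run costs as much as a successful one. The function name `expected_cost_per_shell` is hypothetical.

```python
# Figures quoted in the summary and abstract.
COST_PER_RUN_USD = (0.3, 0.6)  # approximate per-run cost range
SUCCESS_RATE = 0.6             # reported when reusing prior success-case data


def expected_cost_per_shell(cost_per_run: float, success_rate: float) -> float:
    """Expected spend until the first successful run.

    Assumes (our assumption, not the paper's) that runs are independent, that
    each succeeds with the same probability, and that failed runs cost as much
    as successful ones. The number of attempts is then geometric with mean
    1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_run / success_rate


low, high = (expected_cost_per_shell(c, SUCCESS_RATE) for c in COST_PER_RUN_USD)
print(f"Expected cost per shell: ${low:.2f} to ${high:.2f}")
```

Under these assumptions the expected cost per obtained shell is roughly $0.50 to $1.00, since on average 1 / 0.6 ≈ 1.67 runs are needed per success.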

Computer Science > Cryptography and Security

arXiv:2502.16730 (cs)
[Submitted on 23 Feb 2025 (v1), last revised 14 Feb 2026 (this version, v2)]

Title: RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents

Authors: Sho Nakatani (SecDevLab Inc.)

Abstract: We present RapidPen, a fully automated penetration testing (pentesting) framework that addresses the challenge of achieving an initial foothold (IP-to-Shell) without human intervention. Unlike prior approaches that focus primarily on post-exploitation or require a human-in-the-loop, RapidPen leverages large language models (LLMs) to autonomously discover and exploit vulnerabilities, starting from a single IP address. By integrating advanced ReAct-style task planning (Re) with retrieval-augmented knowledge bases of successful exploits, along with a command-generation and direct execution feedback loop (Act), RapidPen systematically scans services, identifies viable attack vectors, and executes targeted exploits in a fully automated manner. In our evaluation against a vulnerable target from the Hack The Box platform, RapidPen achieved shell access within 200-400 seconds at a per-run cost of approximately $0.3-$0.6, demonstrating a 60% success rate when reusing prior "success-case" data. These results underscore the potential of t...
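
The control flow the abstract describes (plan, retrieve prior success cases, act, observe, repeat) can be illustrated with a toy loop. This is a minimal sketch of a generic ReAct-style loop, not RapidPen's implementation: every name here (`Step`, `SUCCESS_CASES`, `retrieve`, `plan`, `execute`, `run`) is hypothetical, the planner is a hard-coded stub rather than an LLM, and the executor returns canned strings instead of running any command.

```python
from dataclasses import dataclass


@dataclass
class Step:
    thought: str
    action: str
    observation: str


# Hypothetical stand-in for a retrieval-augmented store of past successes.
SUCCESS_CASES = {
    "http": "a stored case covers this web service",
    "ssh": "a stored case covers this ssh service",
}


def retrieve(observation: str) -> list[str]:
    """Return stored hints whose keyword appears in the latest observation."""
    return [hint for key, hint in SUCCESS_CASES.items() if key in observation]


def plan(history: list[Step], hints: list[str]) -> tuple[str, str]:
    """Stub for the planning step (Re); a real system would query an LLM."""
    if not history:
        return "Nothing is known yet; enumerate services.", "scan"
    if hints:
        return f"Retrieved hint: {hints[0]}", "follow_hint"
    return "No applicable hint; stop.", "stop"


def execute(action: str) -> str:
    """Stub for the execution step (Act): canned output, nothing is run."""
    canned = {"scan": "open services: http", "follow_hint": "goal reached"}
    return canned.get(action, "")


def run(max_steps: int = 5) -> list[Step]:
    """Alternate planning and acting until the goal, a stop, or the step cap."""
    history: list[Step] = []
    observation = ""
    for _ in range(max_steps):
        hints = retrieve(observation)
        thought, action = plan(history, hints)
        if action == "stop":
            break
        observation = execute(action)  # feedback for the next planning step
        history.append(Step(thought, action, observation))
        if observation == "goal reached":
            break
    return history


for step in run():
    print(f"{step.action}: {step.observation}")
```

The point of the sketch is the feedback loop: each observation is fed into retrieval and planning for the next step, and the step cap bounds how long, and therefore how expensive, a single run can be.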
