Lessons from building an autonomous cloud offensive multi-agent system

While Unit 42 did not use frontier AI models in testing, this research is a crucial look at how powerful AI models may ultimately be weaponized in cyberspace.
The following is a transcript of an internal Q&A with Yahav Festinger, Senior Staff Researcher at Palo Alto Networks, provided by the company.
Q: Going into this test, what were your expectations? Were there any major surprises?
A: We weren’t necessarily surprised by Zealot’s core capabilities—we fully expected it to identify the attack path and pinpoint the specific misconfigurations needed to achieve its goal. However, the speed of the compromise was genuinely astonishing.
It took Zealot merely two to three minutes to go from gaining initial access in the cloud environment to successfully reaching sensitive data.
Q: Looking ahead, do you see AI agents remaining primarily as ‘force multipliers’ for human attackers, or will they evolve to execute full, multi-stage attacks completely autonomously?
A: I can certainly see agents performing multi-stage attacks completely autonomously in the near future. The primary hurdle right now lies in the complexity of cloud execution. While frontier AI models are excellent at finding vulnerabilities through static code analysis, cloud environments require an agent to gather and track significantly more context to succeed.
In our testing, we encountered challenges like agents going down “rabbit holes,” but we believe these issues will be naturally resolved as more advanced models are built to handle these complex scenarios.
Q: What are the implications for defenders? What should security teams be taking away from this test?
A: The critical takeaway, which strongly aligns with what we are already seeing in real-world cloud attacks, is that the window to mitigate issues is rapidly shrinking. Since AI can move from initial access to sensitive data in minutes, defenders must be able to remediate identified threats much faster.
Human reaction time is no longer sufficient on its own. Organizations must utilize automation and security playbooks to ensure a rapid, effective response.
“Historically, a sophisticated cloud breach required years of specialized training. AI has eliminated that requirement overnight. Attackers can now leverage models that possess that knowledge natively to move from intent to execution in minutes,” stated Chen Doytshman, Staff Researcher, Palo Alto Networks.
