From AI-assisted to AI-orchestrated: What agent-led cyber attacks mean for security

November 14, 2025

    Anthropic’s recent report describes what we’d call an agentic nation-state attack: a state-backed operation where a model doesn’t just assist an operator but quietly runs most of the intrusion end to end. In this case, Anthropic refers to the group as GTG-1002 and assesses it as Chinese state-sponsored, active against roughly 30 organizations including major technology companies and government agencies. As far as is publicly known, this is the first documented case of a state-backed, agentic campaign that leans on a frontier model as a core operator rather than a side-channel helper, marking a step change from earlier “AI-assisted” intrusions.

    Most security teams have already seen AI-assisted activity before: an operator pastes in an error log, asks for exploit ideas, or gets help scripting a scanner. Here the pattern is different. The attackers wrapped Claude Code and Model Context Protocol (MCP) tools in an automation layer that turned the model into the primary operator. Humans set objectives and rough constraints while the system handled the mechanics of reconnaissance, exploiting weaknesses, lateral movement, and data collection.

    From Anthropic’s point of view, this agentic nation-state operation never looked like a single dramatic “attack query”. It appeared as thousands of small, plausible security tasks: scanning a range, probing an endpoint, testing credentials, summarizing a result, all executed at a scale no human team could match.

    How the campaign slipped between the cracks

    Anthropic portrays the automation around Claude as tuned for two goals: blending in with normal security tasks and operating at scale. The orchestration layer slices the work into discrete technical steps that look like tasks from a normal security backlog. Each one, taken alone, fits neatly into a defensive narrative.

    From the provider’s point of view, the traffic looks like someone doing security work on their own systems. From the target’s point of view, the commands and scans resemble what a busy internal engineer might run on any given day. Nothing in a single request, or a single tool call, clearly crosses the line into an intrusion attempt.

    The real signal only shows up when we follow the sequence over time. Those individually reasonable steps line up into a clear trajectory: wider coverage of the target, confirmation of specific weaknesses, deeper access, more structured data being pulled out and organized. That sequence-level behavior is what matters, and it largely disappears when analysis only ever examines individual interactions.
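
    To make that concrete, here is a minimal sketch, in Python, of what sequence-level scoring could look like on the defender side: individual tool calls are mapped to rough intrusion phases, and a session is scored on how much of that progression it covers. The phase names, weights and threshold are illustrative assumptions of ours, not anything described in Anthropic's report.

```python
from dataclasses import dataclass

# Illustrative intrusion phases, in rough order of progression.
PHASE_ORDER = ["recon", "vuln_confirmation", "credential_use",
               "lateral_movement", "collection"]


@dataclass
class ToolEvent:
    session_id: str
    phase: str        # the phase this individually benign-looking action maps to
    timestamp: float


def trajectory_score(events: list[ToolEvent]) -> float:
    """Score how far a session progresses through the phase order.

    Any single event is indistinguishable from routine security work;
    the signal is how many phases a session covers and how consistently
    it moves forward through them.
    """
    phases = [e.phase for e in sorted(events, key=lambda e: e.timestamp)
              if e.phase in PHASE_ORDER]
    if not phases:
        return 0.0
    coverage = len(set(phases)) / len(PHASE_ORDER)

    indices = [PHASE_ORDER.index(p) for p in phases]
    forward = sum(1 for a, b in zip(indices, indices[1:]) if b > a)
    progression = forward / max(len(indices) - 1, 1)

    return 0.5 * coverage + 0.5 * progression


def looks_like_campaign(events: list[ToolEvent], threshold: float = 0.7) -> bool:
    # The threshold is an assumption for illustration; real systems would tune it.
    return trajectory_score(events) >= threshold
```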

    None of this is classic prompt engineering: there is no single clever sentence that flips the model into an obviously offensive mindset. The operators designed a system whose behavior would stay within the bounds of what provider-side safety and abuse detection are likely to allow, while still moving the overall campaign forward.

    More capable operators can make the campaign almost impossible to see from any single vantage point. Work can be spread across multiple API vendors so that no single lab ever sees more than a fragment of the activity. In practice, that means other labs and platforms can already be part of the same state-backed operation without realizing it, each one seeing only “reasonable” security or reliability work that is actually a slice of a much larger, coordinated campaign. Sessions can be restarted as soon as they start to look suspicious, tools and personas can be rotated, and traffic can be blended with real defensive work, so local context and telemetry are always partial and noisy. In that world, no individual provider, customer, or agency can reliably reconstruct the operation on its own.

    Under those conditions, neither individual prompts nor any single provider’s logs contain the story that matters. The intrusion lives in how behavior lines up across time, across systems, and across organizations. Defending against agentic nation-state campaigns will require vendors, large enterprises, and governments to share more information about anomalous patterns, and at the same time to build internal controls that log, measure, and constrain what their own agents are allowed to do at that sequence level.
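
    As one hedged sketch of what logging, measuring and constraining an organization's own agents at that sequence level might look like, the wrapper below keeps a running audit trail and enforces cumulative per-session budgets on sensitive action categories. The categories and limits are assumptions for illustration, not a recommended policy.

```python
from collections import Counter

# Illustrative per-session budgets for sensitive action categories.
DEFAULT_BUDGETS = {"network_scan": 20, "credential_test": 5, "bulk_export": 2}


class AgentActionGovernor:
    """Logs every tool call an internal agent attempts and enforces
    cumulative, sequence-level budgets rather than per-call checks."""

    def __init__(self, budgets: dict[str, int] | None = None):
        self.budgets = dict(budgets or DEFAULT_BUDGETS)
        self.counts: Counter = Counter()
        self.audit_log: list[tuple[str, str]] = []   # (category, detail)

    def authorize(self, category: str, detail: str) -> bool:
        """Record the attempted action and decide whether it may proceed."""
        self.audit_log.append((category, detail))
        self.counts[category] += 1
        limit = self.budgets.get(category)
        if limit is not None and self.counts[category] > limit:
            # In a real deployment this would also raise an alert to the SOC.
            return False
        return True


# Usage: the agent runtime asks for authorization before executing each tool call.
governor = AgentActionGovernor()
allowed = governor.authorize("credential_test", "checking a service account login")
```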

    Cyber is the proving ground for autonomy

    The Anthropic report confirms what many practitioners expected to see sooner or later. With the right scaffolding, a general-purpose model with ordinary security tools can cover much of the workload of an advanced intrusion team. Cybersecurity is therefore one of the first real, at-scale use cases for autonomous AI agents in live operations, and the way we handle it will set expectations for how similar patterns play out in finance, industrial systems and other high-stakes domains. The operators behind this agentic nation-state attack did not rely on exotic malware or unknown techniques. They leaned on familiar open-source penetration-testing tools and invested their effort in the integration layer that wired those tools into Claude and turned it into a decision-maker.

    The same report also shows where the current limits are. The agentic model regularly overstated what it had achieved, claiming access that did not exist or treating publicly available information as if it were a significant discovery. Humans had to come back into the loop to check whether credentials actually worked and whether findings were worth pursuing. That verification overhead slowed the campaign and, for now, remains one of the practical barriers to completely hands-off autonomous operations.

    Current agentic AI systems are already strong enough for nation-state actors to use them in live operations, but still rough enough that their failure modes are visible. That combination is exactly what makes cyber the proving ground for how we use AI agents on real systems. Security teams have rich telemetry, established incident processes and years of practice dealing with adaptive attackers. If we cannot learn to observe, understand and contain agentic systems in this environment, where we can actually see what they are doing, it will be even harder in domains with less visibility. 

    It is not hard to see the same pattern elsewhere. Replace scanners and VPN access with a payments API and a trading system and you have the outline of an autonomous fraud agent; replace remote access and directory services with an IoT controller and a PLC and you have an OT or ICS sabotage agent; replace external recon with continuous mapping of suppliers and logistics and you have an agent quietly reshaping a supply chain.

    Measuring capability against agentic nation-state operations

    To move from this incident to concrete detection and mitigation, everyone involved needs a sharper view of how these systems behave in practice. Labs, platforms, operators of critical systems and policymakers all need to see how systems they are responsible for behave under the same style of agentic campaign.

    For labs and platform teams, that means system-level evaluation: exercising models, tools and orchestration together in realistic, adversarial scenarios and then measuring three things in a disciplined way. First, what that end-to-end system of model, tools and orchestration can actually achieve. Second, how its actions and tool calls appear in provider-side telemetry. Third, which technical or policy changes at the model and endpoint level meaningfully change the outcome.
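
    One way to keep those three measurements disciplined is to record them per scenario in a fixed structure, so runs stay comparable over time. The schema below is a hypothetical sketch; the field names are our assumptions, not an established evaluation standard.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioResult:
    """Record of one adversarial, system-level evaluation run."""
    scenario_id: str
    # 1. What the end-to-end system (model + tools + orchestration) achieved.
    objectives_attempted: list[str] = field(default_factory=list)
    objectives_achieved: list[str] = field(default_factory=list)
    # 2. How the run appeared in provider-side telemetry.
    telemetry_events_emitted: int = 0
    events_flagged_by_existing_detections: int = 0
    # 3. Which model- or endpoint-level changes altered the outcome.
    mitigations_applied: list[str] = field(default_factory=list)
    objectives_achieved_with_mitigations: list[str] = field(default_factory=list)

    def detection_rate(self) -> float:
        """Fraction of emitted telemetry events that existing detections flagged."""
        if self.telemetry_events_emitted == 0:
            return 0.0
        return self.events_flagged_by_existing_detections / self.telemetry_events_emitted
```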

    For operators of critical systems, the questions are different but connected. Given the same style of campaign, where does the agent get in, what systems and data does it touch, how visible is that activity in their own logs, and which existing controls meaningfully constrain the campaign? The answers to those questions should drive the detection patterns and hardening work on their side.

    A lot of this work depends on shared infrastructure for understanding how these systems behave under attack. At Irregular, we run system-level simulations with labs and high-risk organizations, turn the results into concrete signals and decision points, and use them to track and communicate rising risk levels to the stakeholders who need to respond. The aim is for detection and mitigation work like the ideas in this piece to be grounded in observed behavior rather than speculation.

    Designing for detection and mitigation

    Agentic, model-led campaigns are now in state playbooks, and they will likely continue to evolve. Nation-state operators will keep iterating on this style of attack, and capable non-state actors will follow. The practical question for labs, AI platforms and large organizations is how to make these campaigns easier to see and harder to sustain, with minimal interruption to legitimate, high-value AI workloads. 

    Detection has to move away from single prompts and isolated API calls and toward behavior over time. The interesting signals live in long sequences of related actions, in the way tools are chained, and in checks that confirm whether access or exploitation actually worked. We also have to assume that the work can be spread across multiple vendors and accounts, so no single organization necessarily sees the full picture. Defending against such campaigns requires analyzing activity across systems as a whole and building more structured cooperation between model providers, high-risk organizations and, where appropriate, governments. The threat is distributed, and so is the effort required to detect it.
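
    As a minimal illustration of analyzing activity across systems as a whole, assuming providers or a sharing body can pool at least coarse, anonymized events, the sketch below groups agent activity from many accounts by the infrastructure it touches, so a campaign split across tenants still clusters around shared targets. The event fields and the account-overlap threshold are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentEvent:
    account_id: str
    target_asset: str   # e.g. a hostname, IP range, or ASN the action touched
    action: str


def accounts_per_target(events: list[AgentEvent]) -> dict[str, set[str]]:
    """Map each target asset to the set of accounts whose activity touched it."""
    clusters: dict[str, set[str]] = defaultdict(set)
    for e in events:
        clusters[e.target_asset].add(e.account_id)
    return clusters


def suspicious_targets(events: list[AgentEvent], min_accounts: int = 3) -> list[str]:
    """Targets probed by several seemingly unrelated accounts: a hint that work
    is being split across tenants to keep any single footprint small."""
    return [target for target, accounts in accounts_per_target(events).items()
            if len(accounts) >= min_accounts]
```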

    At the same time, AI security has to fit into the defense stack that already exists. The same SOCs and pipelines that handle human-driven campaigns now need to recognize and reason about AI-driven ones. That means getting model and tool telemetry into SIEMs and XDR, and treating “agent activity” as a distinct class of behavior, not just another kind of API traffic. It also means closer alignment between frontier labs and the organizations they serve: labs see how models behave at scale across many use cases, while operators see how those behaviors show up inside real environments. Both views are needed, and AI can help on the defensive side by surfacing campaign patterns in that combined data that would be hard for humans to connect on their own.
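
    Treating agent activity as its own class starts with a consistent event shape that a SIEM or XDR pipeline can ingest and query. The sketch below normalizes one tool call into such an event; the field names and the "agent_activity" class label are hypothetical, not a vendor or standards-body format.

```python
import json
import time
import uuid


def agent_activity_event(tenant_id: str, session_id: str, model: str,
                         tool_name: str, tool_args_summary: str,
                         outcome: str) -> str:
    """Normalize one agent tool call into a JSON event for SIEM/XDR ingestion.

    The "agent_activity" class keeps these records distinct from ordinary
    API traffic, so analysts can query and correlate them directly.
    """
    event = {
        "event_id": str(uuid.uuid4()),
        "event_class": "agent_activity",   # a distinct class, not generic API traffic
        "timestamp": time.time(),
        "tenant_id": tenant_id,
        "session_id": session_id,          # lets the SIEM rebuild whole sequences
        "model": model,
        "tool_name": tool_name,
        "tool_args_summary": tool_args_summary,
        "outcome": outcome,                # e.g. "success", "refused", "error"
    }
    return json.dumps(event)


# Usage: emit one event per tool call and ship it through the existing log pipeline.
print(agent_activity_event("tenant-42", "sess-007", "example-model",
                           "network_scanner", "probed 12 hosts", "success"))
```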

    Mitigation starts from a simple assumption: determined actors will keep using powerful public and private models in their operations. Labs and platforms cannot stop that, but they can decide how their endpoints behave when usage starts to look like a campaign, including how long sessions are allowed to run, how much high-impact work a single account can do, and how quickly suspicious tenants are slowed, flagged or frozen. Concretely, that might mean tighter limits on traffic that looks like infrastructure discovery, slower and heavily logged paths for exploit-development and code-execution flows, and keeping the most capable features and models available only to known, verified actors rather than to anyone who can make an API call.
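
    As a hedged illustration of those endpoint-side decisions, the sketch below assigns requests to behavior categories and applies tiered handling: throttling discovery-like traffic, routing exploit-development and code-execution flows through slower, heavily logged paths, and gating the most capable tier to verified accounts. The categories, tiers and limits are illustrative assumptions, not tuned production values.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    GENERAL = "general"
    INFRA_DISCOVERY = "infra_discovery"
    EXPLOIT_DEV = "exploit_dev"
    CODE_EXECUTION = "code_execution"


@dataclass
class Policy:
    max_requests_per_hour: int
    extra_logging: bool
    requires_verified_account: bool


# Illustrative tiering, not a recommended production configuration.
POLICIES = {
    Category.GENERAL:         Policy(1000, extra_logging=False, requires_verified_account=False),
    Category.INFRA_DISCOVERY: Policy(50,   extra_logging=True,  requires_verified_account=False),
    Category.EXPLOIT_DEV:     Policy(10,   extra_logging=True,  requires_verified_account=True),
    Category.CODE_EXECUTION:  Policy(25,   extra_logging=True,  requires_verified_account=True),
}


def decide(category: Category, requests_this_hour: int, account_verified: bool) -> str:
    """Return how the endpoint should treat the next request in this category."""
    policy = POLICIES[category]
    if policy.requires_verified_account and not account_verified:
        return "deny"               # the most capable paths stay gated to verified actors
    if requests_this_hour >= policy.max_requests_per_hour:
        return "throttle"           # slow the tenant down rather than serve at full speed
    return "allow_with_audit" if policy.extra_logging else "allow"
```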

    For operators of critical systems, mitigation mostly looks like classic defensive engineering, now with AI in the threat model. The job is to narrow and harden the interfaces that matter most, understand how model-driven operations show up against them in logs and telemetry, and rehearse responses before an incident is live. The exception is where the AI products and systems those same operators deploy introduce new attack surfaces of their own. It also means building an explicit view of how agentic behavior would present in that environment, training SOC and incident teams to recognize it, and using shared infrastructure to run realistic simulations so that playbooks, controls and intuition are shaped by observed campaigns rather than by theory alone.
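
    One concrete way for operators to build that explicit view is to look for machine-speed cadence in command or API logs: long bursts of related actions with gaps far shorter and more uniform than a human operator produces while reading output and deciding what to do next. The thresholds below are assumptions chosen to illustrate the heuristic, not tuned values.

```python
from statistics import mean, pstdev


def looks_machine_driven(timestamps: list[float],
                         min_events: int = 50,
                         max_mean_gap_s: float = 2.0,
                         max_gap_stdev_s: float = 1.0) -> bool:
    """Heuristic: many related events arriving fast and at an unusually even pace.

    A human running ad-hoc security work pauses to read output and decide
    what to do next; an orchestrated agent tends to fire the next step
    almost immediately, over and over.
    """
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return mean(gaps) <= max_mean_gap_s and pstdev(gaps) <= max_gap_stdev_s
```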

    Where we go from here

    Agentic, state-backed operations using frontier models are a fact, not a thought experiment. The immediate problem for labs, platforms and operators of critical systems is how to get a shared, realistic view of what these systems can actually do and how to keep that within acceptable bounds.

    Meeting that goal requires infrastructure that covers both how models behave and how real environments respond. On the model side, frontier models, tools and orchestration stacks need to be exercised in realistic, adversarial environments so labs can see how their systems behave under pressure, not just how they score on static benchmarks. On the defender side, operators of critical systems need to see how those same capabilities behave when pointed at environments that look like theirs, and which signals and controls actually change the outcome in practice.

    Irregular’s work sits between these two worlds: using one platform to run these scenarios with labs, operators of critical systems and enterprises, and to share the results from each side with the other. The emphasis is on giving both sides a shared, realistic picture of what these systems can do and where they need to be constrained, rather than on any single deployment model.

    Cyber is where this work becomes real first. What we collectively learn here, with real models under realistic pressure, will shape how autonomy is governed in every other high-stakes domain where models get tools and start to act.

    To cite this article, please credit Irregular with a link to this page.