State of AI Agent Security: A Surface in Migration

Authored by ARIAEdited by Abdel FanePublished June 15, 202615 min readMethodology

The headline number this month is the least interesting one. ARIAscout's June 14 sweep counted 320,506 exposed AI services, up modestly from 297,723 in May. Underneath that flat total the surface is moving: OpenClaw gateways, last month's dominant exposure, fell by nearly a quarter, while unauthenticated model-serving infrastructure surged. Exposed Ollama instances rose 225% and MLflow tracking servers 173%. Inside the honeypot fleet the concentration ran the other way: the Model Context Protocol now draws 97.9% of observed attacker events, up from 75%, and 41% of unique fingerprints came back for more. The exposed surface is broadening. Attacker attention is narrowing. This report is about that divergence.

300,000+

events since launch

97.9%

of events targeted MCP

320,506

exposed AI services

41%

attacker return rate

What this report is

This is Issue 2 of the Behavioral Threat Report, a monthly synthesis published by ARIA, OpenA2A's autonomous research system, with editorial review by Abdel Fane. The numbers in this report are anchored in instrumentation that has run continuously through the reporting window. Methodology is documented at research.opena2a.org/methodology and the findings are reproducible by anyone who deploys equivalent instrumentation.

Four data streams contribute. ARIAscout ran a fresh Shodan sweep on June 14, 2026 to anchor the exposure picture as of publication, not a stale prior-month baseline. AgentPwn instruments honeypot pages with injection payloads across sector verticals. TrapMyAgent runs honey agents that observe attacker behavior when attackers believe they have control. HoneyMap samples the public web for adversarial injection patterns planted by third parties. The synthesis is the point: exposure measures latent surface, the honeypot ecosystem measures what attackers do with it. Every classification in this report resolves to an Agent Threat Matrix technique identifier in T-NNNN format.

A note on the honey-agent count, for readers comparing to Issue 1. TrapMyAgent computes every behavioral distribution over a rolling 30-day window. Issue 1 reported 206,571honey-agent events. That figure was the fleet's cumulative total at the time, which, with a fleet then about 30 days old, equaled the 30-day window. The fleet is now about 60 days old, so its cumulative total (an approximate count that exceeds 300,000 events) and its 30-day window have diverged. To keep month-over-month comparisons honest, this edition reports the honey-agent figure on the 30-day window basis (106,943 events), which is the same basis every distribution below is computed on, and the same effective basis as Issue 1.

1. The volume picture

Combined activity across the four streams. Behavioral telemetry is the 30-day window ending June 15. ARIAscout's exposure sweep was run June 14 specifically for this report.

ARIAscout exposed AI services (June 14 sweep)Shodan census320,506

AgentPwn interactionshoneypot pages244,145

TrapMyAgent events (30d window)honey-agent observations106,943

TrapMyAgent sessions28,774

AgentPwn callbacks3,163

HoneyMap wild bait surfacespublic web446

A surface in migration. The +8% on the headline (297,723 to 320,506) hides the real story. OpenClaw gateways, last month's single largest exposure, retreated by nearly a quarter, from 228,652 to 175,861. In their place, unauthenticated model-serving infrastructure swelled: exposed Ollama instances more than tripled (25,705 to 83,465) and MLflow tracking servers nearly tripled (11,620 to 31,773). This is not noise on top of a stable base. It is a composition change. The exposure that defines AI infrastructure is shifting from agent gateways toward inference and experiment-tracking endpoints that ship open by default. MCP servers rose from 1,322 to 1,658; the broken-out A2A endpoint count from 22 to 30.

The streams disagree this month, and that is the finding. Exposure broadened, but behavioral contact did the opposite. Honey-agent activity in the window fell across the board, while AgentPwn recorded 244,145 page interactions, more than the month before. Fewer distinct agents reached the honey-agent fleet, yet more of the traffic that did arrive went straight for planted bait. Latent surface is widening; active contact is contracting onto a smaller, more deliberate population. A reader who tracks only the exposure total would miss it entirely.

2. The success picture

One of every 77 agents that encountered a payload followed it to a measurable endpoint. The others did not.

On the baseline AgentPwn fleet, 3,163 of 244,145 interactions resolved to a payload callback, an aggregate rate of 1.30%. That rate is the share of agents that followed an injection to a measurable endpoint. It does not measure downstream impact. Small enough that defenders can hope to catch the followers. Large enough that the injection class is not theoretical.

Sector breakdown surfaces a sharper signal. On the AgentPwn security-vertical honeypot population (785 agent fingerprints, 1,317 attempts), 1,207 attempts (91.6%) resulted in a payload callback. This is a real finding, not a sampling artifact. Security-tooling agents that ingest external content trust security-flavored content more than general content, and that trust is exploitable. Any defender shipping an agent that reads security-research or vulnerability-database text into an LLM context window should assume an attacker has already considered the attack surface that 91.6% measures.

One cohort is intentionally quiet, and that quiet is a measurement. Alongside the public-web honeypots, ARIA runs a second cohort of authenticated surfaces, the kind an agent reaches only after acquiring a credential (one archetype is live so far, cloudops-agent-io). Crawler-class agents, which account for almost all observed traffic, never get there. The authenticated cohort is still being seeded and is not yet drawing the autonomous agents it is built to catch. That separation, between what a crawler can find and what only a credentialed agent can reach, is the Common Crawl gap this report measures, and Section 8 returns to it.

3. The technique picture

AgentPwn payload category leaderboard for the window. Each category resolves to one or more Agent Threat Matrix techniques.

Finance vertical (pwnagent-finance)

376

Direct prompt injection

334

Documentation (pwnagent-docs)

315

Continuous integration (pwnagent-ci)

315

API surface (pwnagent-api)

249

Medical vertical (pwnagent-medical)

209

Data exfiltration

123

Jailbreak

117

Context manipulation

115

Context-window exploitation

108

MITRE ATT&CK techniques observed in TrapMyAgent telemetry over the same 30-day window. These are window counts, not cumulative totals.

MITRE ATT&CK	Technique	Events
MITRE T1497.003	Time-Based Evasion	124
MITRE T1550	Use Alternate Authentication Material	99

Cross-source corroboration is a stronger signal than either source alone. AgentPwn's direct prompt-injection category (334 hits) and TrapMyAgent's Use-Alternate-Authentication-Material observation (99 window events) target overlapping defender controls. Both resolve to T-2001 and T-3001 on the Agent Threat Matrix. A defender that hardens against either independently still leaves the other open. The T1550 count fell sharply from the prior edition because it tracks Agent-to-Agent trust escalation, and A2A handshake volume receded this window (Section 5).

4. The adversary picture

Attack-origin geography spans 99 countries. Top ten by event volume.

US · United States62.1%66,449

GB · United Kingdom5.1%5,501

NL · Netherlands3.6%3,835

CA · Canada3.5%3,774

JP · Japan3.2%3,474

BR · Brazil2%2,132

RW · Rwanda2%2,113

SE · Sweden1.7%1,780

UA · Ukraine1.6%1,755

SG · Singapore1.4%1,474

Top cloud providers in the attack-infrastructure backplane. Provider attribution reflects which network fronted the traffic, not which network originated it.

Microsoft Azure43.9%46,950

Google Cloud12%12,786

Cloudflare5.4%5,786

AWS3.8%4,052

Tencent Cloud3.4%3,668

DigitalOcean1.3%1,362

OVH0.9%918

Oracle Cloud0.7%704

automated_scanner

Classifier verdicts

Automated scanner28,62999.5%
Unknown1380.5%
APT reconnaissance70.02%

How to read this. Most observed traffic is mass automated probing. The APT-reconnaissance count is small because the classifier's heuristic rules deliberately lag the technique catalog. Seven APT-reconnaissance events are not a noise floor. They are flags. Any single observation in the non-automated categories deserves investigation, not dismissal. We do not refine the heuristics week-to-week because a misclassification rate that drifts is harder to interpret across editions than a heuristic that conservatively under-reports.

Persistence. 41% of unique attackers return.

2,575 of 6,232 fingerprints

The top fingerprint by session count returned across 793 sessions between May 21 and June 14, a 24-day span. The returning population is a structural feature of the traffic, not a tail of one-off scans.

That return rate is among the most consequential structural findings in this edition. Defenders that treat attacker activity as one-shot reconnaissance underestimate the population. Fingerprint-stable telemetry that survives short-lived IP rotation is the minimum condition for measuring it at all. The recurring visitors are not just observing. They are sampling response variability, watching for changed defenses, and accumulating per-property knowledge.

Attribution limits. Geography reflects origin IP geolocation, which can be obscured by VPN, proxy, and cloud-fronting infrastructure. Microsoft Azure's 43.9% share is best read as "Azure-fronted traffic" not "traffic originating in Azure data centers." ASN-level data in our methodology block separates origin from terminus.

5. The protocol picture

Event types observed across the TrapMyAgent fleet in the 30-day window.

MCP

Model Context Protocol (MCP)104,71297.9%
Agent-to-Agent (A2A) handshake1,2631.2%
Context-read (other)9120.9%
Agent-to-Agent (A2A) task560.1%

A month ago the Model Context Protocol drew three of every four attacker events. This month it draws 97.9%, effectively all of them. That is the consolidation the lede pointed at: even as the exposed surface fans out across new infrastructure, observed attacker behavior is collapsing onto a single protocol. For a defender it is a clarifying result. The highest-leverage place to spend a hardening hour is no longer in doubt. Agent-to-Agent handshakes, 15.2% of events in May, fell to 1.2%, but a quiet channel is not a closed one: ARIAscout still counts exposed A2A endpoints, and that count rose. Section 8 carries hardening guidance for both.

6. The wild picture

HoneyMap samples the public web for injection bait planted by third parties. The window captured 446 surfaces across 353 unique domains. The result is a sample, not a coverage count.

Attack classes:

SOUL-INJECTIndirect prompt injection (T-2002)242

UNICODE-STEGOIndirect prompt injection (T-2002)204

Top AIIS signatures observed:

AIIS-UNICODE-TAG-BLOCK-01204

AIIS-HIDDEN-JAILBREAK-DAN-01178

AIIS-HIDDEN-ROLE-INJECT-0157

AIIS-ATTR-IGNORE-INST-017

Where the bait lives on the page:

Hidden text (display:none, visibility:hidden)152

Script literal (embedded in JS)148

HTML comment94

Alt and ARIA attributes37

Meta tags9

Data attributes6

Sector distribution (where the bait sits):

Unknown425

News11

Ecommerce7

Academic3

425 of 446 surfaces (95%) carry no sector classification. This is the honest answer, not a classification gap. Most injection bait lives on pages with no clear sector identity, and the catalog deliberately does not guess. The named sectors (news, ecommerce, academic) are the surfaces where the page itself made the classification trivial.

The wild is being seeded, not yet weaponized at scale, but the bait is real. SOUL-INJECT, the class that targets agent personality and soul-authority directives, is now the most-observed attack class at 242 surfaces, and a new role-authority injection signature appeared this window. Every signature in this section is a signature defenders should already be checking for. The bait arrives months before the campaign that uses it.

7. The model attribution picture

Model attribution remains a research question. This month we report aggregate user-agent categories with explicit limits. We do not name a vendor or model without evidence.

UA category	Approximate share	Notes
Standard browser user-agents	majority of identifiable UA strings	Chrome, Safari, and mobile WebKit strings. Could be human, scripted, or headless automation. Behavioral fingerprinting is required to disambiguate.
Scripted HTTP clients	small absolute count, clear identity	Explicit non-browser automation. Go-http-client/1.1 and /2.0 together posted 1,283 events in the window.
Other / unattributable	remainder	Strings that could be human, scripted, or automated. No reliable attribution from the user-agent alone.

User-agent strings are attacker-controllable and cannot ground an identity claim on their own. We are building behavioral fingerprinting (request-timing distributions, header-order canonicalization, tool-call structure) to close the gap. Progress will appear in subsequent editions.

8. What this means for defenders

Six recommendations follow from the data above. Each cites the Agent Threat Matrix technique it resolves to and the OASB control that implements the defense.

Treat the Model Context Protocol as the dominant hardening surface. It is now 98% of observed honey-agent events.

Within the 30-day window the Model Context Protocol drew 104,712 of 106,943 honey-agent events (97.9%), up from 75.4% a month ago. The concentration is sharper, not softer. MCP servers should require authentication, rate limit tool discovery, and reject tool definitions that contain unicode tag-block sequences.

Maps to: T-1002, T-2005 · OASB 2.1 (Explicit Capability Grants), OASB 2.3 (Capability Boundaries)

Exposed model-serving infrastructure is the fastest-growing surface. Lock down Ollama and MLflow first.

ARIAscout's June 14 sweep counted 83,465 exposed Ollama instances, up from 25,705 a month ago, and 31,773 exposed MLflow tracking servers, up from 11,620. Both surfaces commonly ship without authentication. Total exposed AI services rose to 320,506 from 297,723 even as OpenClaw gateway exposure fell. Put unauthenticated inference and experiment-tracking endpoints behind authentication and network policy before they are enumerated.

Maps to: T-1002 · OASB 2.1 (Explicit Capability Grants), OASB 5.3 (Credential Scope Limitation)

Filter rendered HTML for the AIIS-UNICODE-TAG-BLOCK-01 signature in agent retrieval pipelines.

204 of 446 wild bait surfaces (46%) carry this single signature, the most of any. Any agent that reads web content into an LLM context window should strip unicode tag-block characters before tokenization. A new signature this window, AIIS-HIDDEN-ROLE-INJECT-01, appeared on 57 surfaces and targets role-authority injection.

Maps to: T-2002 · OASB 3.1 (Prompt Injection Protection), OASB 3.3 (Input Validation)

Plan for persistence. Forty-one percent of unique attackers returned across multiple sessions.

Of 6,232 distinct fingerprints in the window, 2,575 came back. The top fingerprint returned across 793 sessions between May 21 and June 14, a 24-day span. Defenders that treat attacker activity as one-shot reconnaissance underestimate the population. Adopt fingerprint-stable telemetry that survives short-lived IP rotation.

Maps to: T-9001 · OASB 10.1 (Security Event Logging)

Agent-to-Agent traffic receded in the window, but exposed A2A endpoints persist. Keep authenticating handshakes.

A2A handshakes fell to 1.2% of observed events (1,263) from 15.2% a month ago, and the MITRE T1550 Use-Alternate-Authentication-Material count fell with them to 99, because that signal is tied to A2A trust escalation. A receding observation is not a closed surface. ARIAscout still counts exposed A2A endpoints. Authenticate every A2A handshake, verify the calling agent's identity before honoring a capability request, and reject agent-card discovery from unknown peers by default.

Maps to: T-1006, T-2008, T-3001 · OASB 7.1 (Mutual Authentication), OASB 7.2 (Message Integrity), OASB 5.2 (Context Window Isolation)

Close the Common Crawl gap by instrumenting authenticated and dynamic surfaces.

Public-web crawls including Google's January 2026 indirect-prompt-injection study are systematically blind to login-walled, per-fingerprint-dynamic, and federated-social content. Honeypot fleets that observe interactive behavior post-credential are the only way to measure that gap. This report's methodology details the partition.

The Common Crawl gap

Public-web measurement studies, including Google's January 2026 indirect-prompt-injection sweep, sample crawls that surface only anonymous, statically rendered, search-engine-discoverable content. Three classes of attacker-reachable surface are systematically invisible to that sample.

Authenticated surfaces. Login walls, gated dashboards, post-auth tool calls. An agent that completes a credential-acquisition step reaches a different content surface and exhibits behaviors that crawler-only research cannot observe.
Dynamic surfaces. Per-fingerprint content variation. The same URL serves different HTML to different agents. Static crawls collapse this to a single template.
Social surfaces. Federated and platform-mediated content where indexing is partial, latent, or restricted.

OpenA2A's honeypot fleet is built across this seam. Twenty SEO-engineered baseline sites are deliberately Common-Crawl-visible, and they carry essentially all of the traffic in this report. A second set of authenticated archetypes (one live so far, cloudops-agent-io) sits behind a credential, where no crawler reaches. The wild bait in Section 6 is what a crawler can find. The post-credential behavior the authenticated cohort is built to capture is what no crawler can. The distance between those two is the measurement, and it is why a public-web sweep alone, however large, cannot describe how autonomous agents actually behave once they are inside. As the authenticated archetypes mature, future editions report what arrives there.

9. Methodology, limits, and how to cite

techniques

Threat Matrix evidence tiers

Observed in production16
Validated in lab42
Theoretical, flagged3

Source: Agent Threat Matrix evidence audit, locked at publication.

Methodology. Full per-stream methodology is published at /methodology, with sub-pages for behavioral-sweep (ARIAtrap) and first-observed-in-the-wild (FOITW). All four streams run continuously. Numbers in this report are the snapshot at publication.

Window and basis. May 16 to June 15, 2026 (30 days). TrapMyAgent behavioral distributions are computed over the 30-day window, and the honey-agent event figure is the exact count in that window. The cumulative honey-agent total cited in "What this report is" is an approximate planner-estimated count of all events since fleet launch, presented as a floor and never used as a denominator.

What this report cannot answer this month.

HoneyMap is a sample of the public web, not a coverage count. The 446 surfaces are what CommonCrawl, Shodan, and CT-log sampling reached during the window.
TrapMyAgent geography reflects origin IP geolocation, which VPN and proxy infrastructure can obscure. ASN-level analysis in our methodology page separates origin from terminus.
AgentPwn attack-success rate measures injection-following, not downstream impact. A callback proves an agent followed a payload. It does not measure what the agent then did.
Sophistication scoring is published as distribution-only. We do not publish a sophistication mean, grade, or month-over-month delta. The supervised classifier that would ground a mean does not yet exist. We will not put a number on a slide that the data cannot defend.
Model attribution is aggregate UA categories with explicit limits. We will not name a vendor without evidence.
The authenticated cohort is still being seeded, so this report's behavioral numbers describe the public-web cohort. Post-credential agent behavior is the measurement future editions add as those archetypes mature.

How to cite

# BibTeX

@techreport{opena2a-btr-2026-06,
  author = {{ARIA, OpenA2A autonomous research system} and Abdel Fane (editor)},
  title  = {State of AI Agent Security: A Surface in Migration},
  institution = {OpenA2A Research},
  year   = {2026},
  month  = {6},
  type   = {Behavioral Threat Report},
  number = {Issue 2},
  url    = {https://research.opena2a.org/reports/state-of-ai-agent-security-2026-06}
}

# APA

ARIA, & Fane, A. (Ed.). (2026, June 15). State of AI Agent Security: A Surface in Migration (Behavioral Threat Report Issue 2). OpenA2A Research. https://research.opena2a.org/reports/state-of-ai-agent-security-2026-06

How to challenge a finding

Email info@opena2a.org with the specific number you dispute and the methodology you would prefer we used. We aim to respond within five business days. Substantive challenges that hold up under review are published as methodology updates in subsequent editions with attribution.

Appendix. First Observed In The Wild log

FOITW is the mechanism by which OpenA2A pre-registers signatures for sophisticated AI agent attack techniques and publishes within seven days of first wild observation. The pre-registration catalog lives at aria/data/foitw-catalog.json with full methodology at /methodology/foitw.

Catalog state at publication: 3 signatures registered, 0 firings during the reporting window. Transparency-log anchoring is in progress. Log-anchored claims will publish once the integration lands.

Catalog ID	Name	Technique	Status
FOITW-CAT-0001	Greshake-class indirect prompt injection via web content Greshake et al. (AISec 2023). Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection.	T-2002	Pre-registered 2026-04-27. Not yet fired.
FOITW-CAT-0002	Actor-Critic adaptive multi-turn manipulation Shi, Lin, Song, et al. Lessons from Defending Gemini Against Indirect Prompt Injections (Google DeepMind, 2025).	T-2007	Pre-registered 2026-04-27. Not yet fired.
FOITW-CAT-0003	Reputation-poisoning prompt injection Brunner, Liu, Pande. AI threats in the wild: The current state of prompt injections on the web (Google Threat Intelligence Group, April 2026).	T-9005	Pre-registered 2026-04-27. Not yet fired.

Live indices at publication

The numbers in this report are the time-capsule snapshot. The indices below are continuously updated. Click through for the live methodology and CSV exports.

Exposure Prevalence Index

ARIAscout

Shodan + Censys sweeps

Attack Prevalence Index

ARIAtrap

Honeypot fleet behavioral data

Sophistication Index

Distribution-only

Mean withheld pending classifier validation

Authorship. ARIA is OpenA2A's autonomous research system. Editorial review by Abdel Fane. Disclosure of authorship is a credibility move, not a limitation. We document how our content is made.

Data integrity. Every value in this report traces to a query against live instrumentation. The behavioral distributions and the 30-day honey-agent event count are exact counts. The single cumulative figure (events since fleet launch) is a Postgres planner estimate and is labeled as such, presented as a floor, and never used as a denominator. No number is modeled or projected. Pre-publication audit per Black Hat reproducibility standards.

License. Apache 2.0. Cite using the BibTeX or APA blocks in Section 9.

Coordinated disclosure. Any finding in this report that maps to a previously undisclosed vulnerability is held under the 90-day ARIAdesk disclosure protocol. None of the surface-level findings in this report require disclosure coordination.

The exposed surface is broadening as attacker attention narrows.