#security-research #exposure-sweep #ai-agents #hackmyagent #shodan

Internet-Wide AI Exposure Sweep: March 2026

Abdel Fane, OpenA2A Research · 15 min read

Published on research.opena2a.org

TL;DR: We analyzed Shodan index data for AI service signatures and verified a statistical sample to estimate real-world exposure. Of 490,295 hosts indexed by Shodan, approximately 140,000 appear to be running AI services in their default configurations, a 3.5x inflation factor between passive indexing and confirmed services. Many of these services, including LLM inference endpoints, ML tracking servers, and agent gateways, appear to be running without authentication enabled, consistent with their default installation settings.

Verified Exposed Services: ~140,000
Shodan Port Detections: 490,295
Inflation Factor: 3.5x
Confirmed CLAUDE.md Files: 7

Methodology: Why These Numbers Are Different

Most internet exposure reports cite Shodan counts as confirmed findings. We do not. Shodan identifies open ports and banner matches, but does not verify that a service is actually running the claimed software. Our methodology separates port detection from confirmed exposure:

  1. Shodan Index Analysis. We query the Shodan search engine for hosts matching port numbers, HTTP headers, and banner strings associated with AI services.
  2. Banner and Header Verification. We analyze Shodan's cached banner data and HTTP response headers to determine whether the indexed service matches the expected software signature.
  3. Confirmation Rate. Only hosts whose Shodan-indexed responses match the expected protocol are counted as confirmed. For each category, the confirmation rate observed in the sample (confirmed / sampled) is multiplied by the Shodan count to estimate real-world exposure.

This approach consistently shows that Shodan over-counts by 2x-10x depending on the service. Many "detections" are TCP-open ports running unrelated services, HTTP timeouts, or honeypots.
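The banner verification step can be pictured as simple signature matching over cached responses. The following is a minimal Python sketch, not the matcher used in this study: the function names and the sample banners are invented for illustration, and the only marker taken from a real product is the plain-text `Ollama is running` body that Ollama returns on its root path.

```python
def matches_signature(banner: str, markers: list[str]) -> bool:
    """Return True if any marker appears in the cached banner (case-insensitive)."""
    text = banner.lower()
    return any(marker.lower() in text for marker in markers)


def confirmation_rate(banners: list[str], markers: list[str]) -> float:
    """Fraction of sampled banners that match the expected signature."""
    if not banners:
        return 0.0
    confirmed = sum(matches_signature(b, markers) for b in banners)
    return confirmed / len(banners)


# Made-up cached banners standing in for a sample of indexed hosts.
sample = [
    "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nOllama is running",
    "SSH-2.0-OpenSSH_8.9",                      # TCP-open, unrelated service
    "HTTP/1.1 404 Not Found\r\nServer: nginx",  # HTTP, but not the expected software
    "",                                         # timeout, nothing cached
]

rate = confirmation_rate(sample, ["Ollama is running"])
print(f"confirmation rate: {rate:.0%}")
```

Substring matching is deliberately conservative: an open port with no recognizable response counts as unconfirmed rather than as a finding.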

Category               Shodan Count   Sampled   Confirmed   Rate   Est. Real
Ollama LLM Inference        224,551        20           5    25%     ~56,000
OpenClaw/Clawdbot           249,366        20           6    30%     ~75,000
Jupyter Notebooks            15,097        20          11    55%      ~8,300
MLflow Tracking                 984        20          15    75%        ~740
Gradio ML Demos                 233        15           2    13%         ~30
MCP SSE Endpoints                64        10           0     0%           0
Total                       490,295                                 ~140,000

Shodan reports open ports. We report confirmed services. The difference matters.
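The estimates in the table follow from one line of arithmetic per category: Shodan count multiplied by confirmed / sampled. A short Python sketch that reproduces the totals from the table's own figures (the variable and function names are ours):

```python
# (category, shodan_count, sampled, confirmed), copied from the table above.
ROWS = [
    ("Ollama LLM Inference", 224_551, 20, 5),
    ("OpenClaw/Clawdbot", 249_366, 20, 6),
    ("Jupyter Notebooks", 15_097, 20, 11),
    ("MLflow Tracking", 984, 20, 15),
    ("Gradio ML Demos", 233, 15, 2),
    ("MCP SSE Endpoints", 64, 10, 0),
]


def estimate(shodan_count: int, sampled: int, confirmed: int) -> float:
    """Extrapolate the sample confirmation rate to the full Shodan count."""
    return shodan_count * confirmed / sampled


total_detected = sum(count for _, count, _, _ in ROWS)
total_estimated = sum(estimate(count, n, k) for _, count, n, k in ROWS)
inflation = total_detected / total_estimated

for name, count, n, k in ROWS:
    print(f"{name:<22} {k / n:>4.0%}  ~{estimate(count, n, k):,.0f}")
print(f"Total detections: {total_detected:,}")
print(f"Estimated real:   ~{total_estimated:,.0f}")
print(f"Inflation factor: {inflation:.1f}x")
```

The published figures are these values rounded to two significant digits.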

Geographic Distribution

Shodan port detections by country (raw counts, not confirmation-adjusted):

Country           Detections
China                 68,106
United States         46,749
Israel                36,076
Germany               28,431
Hong Kong             18,922
Singapore             14,817
Japan                 12,503
France                 9,241
United Kingdom         8,156
South Korea            7,892

Ollama — ~56,000 Estimated Instances

Shodan indexed 224,551 hosts on Ollama's default port (11434). Analysis of Shodan's cached response data shows a 25% rate of valid Ollama signatures, giving an estimated ~56,000 real instances.

Ollama's default configuration binds to 0.0.0.0 with no authentication. This means any instance deployed with default settings and exposed to the internet allows unauthenticated access to its API. This is a known default configuration issue, not unique to any specific deployment.
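To check whether an Ollama instance you operate answers without credentials, one option is to request `/api/tags`, the Ollama API endpoint that lists local models. The Python sketch below is a self-audit helper under our own naming, not an HMA check, and should only be pointed at hosts you own.

```python
import http.client
import json
import urllib.request


def looks_like_ollama_tags(body: bytes) -> bool:
    """True if the body has the JSON shape of an /api/tags reply: {"models": [...]}."""
    try:
        data = json.loads(body)
    except ValueError:  # covers JSON and Unicode decoding errors
        return False
    return isinstance(data, dict) and isinstance(data.get("models"), list)


def ollama_answers_without_auth(base_url: str, timeout: float = 3.0) -> bool:
    """Probe a host YOU operate: does /api/tags answer with no credentials sent?"""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200 and looks_like_ollama_tags(resp.read())
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP 401/403, or a non-HTTP service.
        return False


# Example (not executed here):
#   ollama_answers_without_auth("http://127.0.0.1:11434")
```

Running the same probe from a machine outside your network shows whether the port is reachable from the internet, and a `True` result there means the model list is public.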

This finding drove the creation of HMA checks LLM-001 through LLM-004, covering unauthenticated model listing, inference access, model download capability, and resource consumption risks.

OpenClaw — ~75,000 Estimated Gateways

Shodan indexed 249,366 hosts on OpenClaw's default port (18789). Analysis of cached response signatures yields a 30% match rate, giving an estimated ~75,000 real instances.

Review of OpenClaw's open-source code reveals that the default configuration does not enable authentication. The config.get API method, by design, returns the full configuration object, which may include integration tokens and API keys. This is a documented default behavior, not a vulnerability we discovered through access.
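One common mitigation for configuration endpoints of this kind is to mask secret-looking fields before the object leaves the process. The Python sketch below illustrates that pattern only. It is not OpenClaw code, and the key pattern and the example configuration shape are assumptions made up for the example.

```python
import re

# Assumed naming convention for secret-bearing keys; adjust for your own config.
SECRET_KEY_PATTERN = re.compile(r"(token|secret|password|api[_-]?key)", re.IGNORECASE)


def redact(value, mask="***REDACTED***"):
    """Return a copy of a nested config with secret-looking keys masked."""
    if isinstance(value, dict):
        return {
            key: mask
            if isinstance(key, str) and SECRET_KEY_PATTERN.search(key)
            else redact(item, mask)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, mask) for item in value]
    return value


# Hypothetical configuration object, for illustration only.
config = {
    "port": 18789,
    "integrations": {"chat": {"botToken": "example-token", "channel": "#ops"}},
    "providers": [{"name": "example", "api_key": "example-key"}],
}

safe = redact(config)
print(safe)
```

Redaction limits what an unauthenticated caller can read, but it does not replace enabling authentication on the gateway.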

The high non-match rate (70%) is explained by port 18789 being shared with other services. Many Shodan results are TCP-open but not HTTP.

Jupyter Notebooks — ~8,300 Estimated

Jupyter had the second-highest signature match rate at 55%, behind only MLflow. Jupyter's HTTP headers and HTML content are distinctive, making banner-based verification more reliable than for most other services.

Shodan's cached data shows a mix of instances with login pages (requiring password or token) and instances presenting open notebook interfaces, indicating default configurations without authentication enabled.

Shodan geographic data shows concentrations in US cloud infrastructure, East Asian hosting providers, and academic networks.

MLflow — ~740 Estimated, Default Config Has No Auth

MLflow had the highest Shodan signature match rate at 75%. This is expected because MLflow's default configuration binds to 0.0.0.0 with no authentication, and its API response headers are unambiguous.

MLflow's documentation notes that the default installation has no authentication. Any instance exposed to the internet with default settings would allow public access to experiment metadata and model artifacts. This is a well-known configuration gap acknowledged by the MLflow project.

CLAUDE.md File Exposure — 47 Indexed by Shodan

Shodan indexed 47 hosts where CLAUDE.md files were detectable via HTTP headers or directory listings in Shodan's cached data.

CLAUDE.md files typically contain system instructions for AI agents, including behavioral rules, tool access policies, and configuration details. When served publicly, these files can reveal information about an agent's capabilities and internal architecture.

Exposed agent configuration files create several security risks:

  • Tool access surface: reveals what capabilities the agent has
  • Decision logic: may expose authorization rules and guardrail implementations
  • Infrastructure details: may reference internal service names and endpoints

We recommend configuring web servers to deny access to CLAUDE.md, .claude/, and similar agent configuration paths.
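Where the web server itself cannot be reconfigured, the same deny rule can be enforced in the application layer. The Python sketch below is a WSGI middleware that returns 404 for agent configuration paths. The function names and the deny lists are our assumptions, and a server-level rule remains the more robust option.

```python
import posixpath
from urllib.parse import unquote

# Assumed deny lists; extend them to match your own agent tooling.
DENIED_NAMES = {"claude.md", ".env", "mcp.json"}
DENIED_DIRS = {".claude"}


def is_denied(request_path: str) -> bool:
    """True if the path points at an agent config file or directory."""
    path_only = request_path.split("?", 1)[0]
    # Decode percent-escapes and collapse "." / ".." before matching.
    normalized = posixpath.normpath("/" + unquote(path_only))
    parts = [p.lower() for p in normalized.split("/") if p]
    if not parts:
        return False
    return parts[-1] in DENIED_NAMES or any(p in DENIED_DIRS for p in parts)


def deny_sensitive_paths(app):
    """Wrap a WSGI app so that denied paths never reach it."""
    def middleware(environ, start_response):
        if is_denied(environ.get("PATH_INFO", "")):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not Found"]
        return app(environ, start_response)
    return middleware
```

Matching is case-insensitive and runs after normalization, so requests such as `/app/%2Eenv` or `/static/../.claude/settings.json` are rejected as well.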

.env File Exposure — 199 Directory Listings

Shodan identified 199 hosts with directory listings that expose .env files. These were not individually verified via HTTP probe, but directory listings are high-confidence indicators: if the directory index is visible, the files within it are typically downloadable.

Environment files commonly contain database credentials, API keys, OAuth secrets, and internal service URLs. A single exposed .env file can compromise an entire application stack.
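A quick local audit is to walk your own document root and list any files that should never be served. The Python sketch below uses a throwaway directory as a stand-in for a web root. The file-name list is an assumption, and the function name is ours.

```python
import os
import tempfile
from pathlib import Path

# Assumed list of file names that should not sit under a served directory.
SENSITIVE_NAMES = {".env", "CLAUDE.md", "mcp.json"}


def find_sensitive_files(web_root) -> list[str]:
    """Return sensitive file paths under web_root, relative to it, sorted."""
    root = Path(web_root)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Also catch variants such as .env.production
            if name in SENSITIVE_NAMES or name.startswith(".env."):
                hits.append((Path(dirpath) / name).relative_to(root).as_posix())
    return sorted(hits)


# Demo against a temporary directory standing in for a document root.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "app").mkdir()
    for rel in ("index.html", ".env", "app/CLAUDE.md", "app/.env.production"):
        (demo_root / rel).write_text("placeholder")
    found = find_sensitive_files(demo_root)

print(found)
```

Any hit inside a directory the web server can reach is worth moving out of the document root, in addition to adding deny rules.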

What Was NOT Confirmed

Honest reporting requires acknowledging what we could not verify:

  • MCP SSE Endpoints: 0 of 10 confirmed. Shodan flagged 64 hosts, but none responded to our SSE probes with valid MCP protocol content. These Shodan counts should not be cited as confirmed MCP exposure.
  • Gradio ML Demos: 2 of 15 confirmed (13%). Most probes timed out or returned non-Gradio content. The estimated ~30 real instances is too small a number to draw conclusions from.

We include these non-findings because other reports would cite "64 MCP endpoints discovered" without verification. The actual number we can confirm is zero.

HMA Check Coverage

Every finding in this report maps to a detection check in HackMyAgent. Run these against your own infrastructure:

Finding                     HMA Check                     Severity
Unauthenticated Ollama      LLM-001 to LLM-004            Critical
OpenClaw Gateway Exposed    GATEWAY-001 to GATEWAY-008    Critical
Jupyter No Auth             AITOOL-001                    Critical
MLflow Unauthenticated      AITOOL-003                    Critical
CLAUDE.md Exposed           WEBEXPOSE-001                 High
.env File Exposure          WEBEXPOSE-002                 Critical
MCP Tools Exposed           MCP-011                       Critical

# Scan your agent codebase
$ npx hackmyagent secure ./my-agent-project
# Test with adversarial payloads
$ npx hackmyagent attack http://localhost:3000/v1/chat
# Scan external infrastructure
$ npx hackmyagent scan your-domain.com

Verification and False Positive Filtering

Shodan index data frequently contains false positives where open TCP ports do not correspond to the expected service. Our banner analysis methodology filters these out. In our sample verification, the false-positive rate for the four largest categories ranged from 25% (MLflow) to 75% (Ollama); most false positives were TCP-open ports running unrelated services.
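Because each category was verified on a sample of 10 to 20 hosts, the confirmation rates carry sampling uncertainty. One conventional way to express it is a Wilson score interval. The Python sketch below is our illustration rather than part of the published methodology, and it treats each sample as a simple random draw, which is an assumption.

```python
import math


def wilson_interval(confirmed: int, sampled: int, z: float = 1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%)."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    p = confirmed / sampled
    z2 = z * z
    denom = 1 + z2 / sampled
    center = (p + z2 / (2 * sampled)) / denom
    half = z * math.sqrt(p * (1 - p) / sampled + z2 / (4 * sampled**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# (category, shodan_count, sampled, confirmed) from the methodology table.
for name, count, sampled, confirmed in [
    ("Ollama", 224_551, 20, 5),
    ("OpenClaw", 249_366, 20, 6),
    ("Jupyter", 15_097, 20, 11),
    ("MLflow", 984, 20, 15),
]:
    low, high = wilson_interval(confirmed, sampled)
    print(f"{name:<9} rate {low:.0%} to {high:.0%}, "
          f"hosts ~{count * low:,.0f} to ~{count * high:,.0f}")
```

With samples this small the intervals are wide, so the per-category estimates are best read as orders of magnitude rather than precise counts.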

All findings in this report are aggregate statistics. No specific organizations, IP addresses, or domains are identified. Where Shodan data suggested findings associated with specific organizations, we applied additional scrutiny to filter false positives before including them in aggregate counts.

Recommendations

If you are running AI agents in production:

  1. Audit your network exposure. Run hackmyagent scan your-domain.com to check what is reachable from the internet.
  2. Protect CLAUDE.md and config files. Configure your web server to deny access to /.claude/, /CLAUDE.md, /mcp.json, /.env.
  3. Authenticate all AI service endpoints. Ollama, MLflow, Jupyter, and agent gateways must require authentication. Default configurations bind to 0.0.0.0 with no auth.
  4. Do not expose ML infrastructure to the internet. MLflow tracking servers and Jupyter notebooks belong behind a VPN or private network. There is no production use case for public unauthenticated access.
  5. Scan plugins before installing. Use static analysis to detect dangerous patterns in plugin code before execution.
  6. Rotate exposed credentials immediately. If your config files were publicly accessible, assume any credentials in them are compromised.

Legal Notice: This research is based on analysis of data from the Shodan search engine, a publicly available internet index, combined with review of open-source project documentation and default configurations. No systems were accessed, tested, or exploited. No authentication mechanisms were bypassed. No private data was retrieved or stored. All statistics represent aggregate analysis of publicly indexed information. No specific organizations, IP addresses, or domains are identified in this report.

Responsible Disclosure: If you believe your organization's infrastructure is affected by the patterns described in this report, we encourage you to audit your own systems using the open-source tools referenced here. For coordinated disclosure inquiries, contact info@opena2a.org.

About: This research is conducted by OpenA2A, an open-source AI agent security research project. Detection checks referenced in this report are available in HackMyAgent (Apache-2.0).