Research Methodology

How we collect and verify the vulnerability statistics shown in our research reports.

Specialized methodologies

The rest of this page documents the Shodan-driven exposure-sweep methodology used for monthly infrastructure census reports. ARIA v2 adds two methodologies for behavioral and pre-registered attack research.

Current Research Summary

Shodan Queries: 207
Hosts Discovered: 97,013
Hosts Scanned: 11,100
Vulnerability Rate: 14.4%

Last updated: January 29, 2026

Overview

The statistics in our reports are derived from analysis of the Shodan search engine's publicly available index data, combined with review of open-source project documentation and default configurations. We do not access, test, or exploit any third-party systems.

Key principle: All findings are based on publicly indexed data and open-source code review. We analyze what Shodan has already indexed, not what we access directly.

Our Process

1. Target Discovery via Shodan

We use the Shodan API with 207 different search queries to identify candidate IP addresses running AI agent infrastructure. We search for signatures across Python frameworks (Uvicorn, FastAPI, Django, Flask, Gunicorn), Node.js servers (Express, Koa, Next.js), Go/Java/Ruby/Rust frameworks, cloud platforms, and API patterns.
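
As a rough illustration of this step, the sketch below composes query strings in the style of the sample queries on this page and builds a request URL for Shodan's REST search endpoint (`/shodan/host/search`). The helper names, the short term and port lists, and the `YOUR_API_KEY` placeholder are assumptions made for illustration; this is not our production tooling, and no request is sent.

```javascript
// Hypothetical helper: combine a quoted banner term with a port filter,
// e.g. buildQuery("uvicorn", 443) gives '"uvicorn" port:443'.
function buildQuery(term, port) {
  return `"${term}" port:${port}`;
}

// Hypothetical helper: build the request URL for one page of results.
// Shodan's search endpoint takes `key`, `query`, and `page` parameters.
function buildSearchUrl(apiKey, query, page = 1) {
  const url = new URL("https://api.shodan.io/shodan/host/search");
  url.searchParams.set("key", apiKey);
  url.searchParams.set("query", query);
  url.searchParams.set("page", String(page));
  return url.toString();
}

// Expand a small, illustrative set of terms and ports into queries.
const terms = ["uvicorn", "gunicorn"];
const ports = [443, 8000];
const queries = terms.flatMap((t) => ports.map((p) => buildQuery(t, p)));
const firstUrl = buildSearchUrl("YOUR_API_KEY", queries[0]);

console.log(queries.length, firstUrl);
```

Building the URL through `URL.searchParams` handles the quoting and escaping of characters such as `"` and `:` in the query string.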

2. Banner and Response Analysis

Shodan's cached banner data and HTTP response headers are analyzed for security-relevant patterns: exposed configuration files, agent instruction files, API key patterns, MCP tool definitions, gateway signatures, and debug mode indicators. Only findings with clear signature matches in the indexed data are counted.
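
A minimal sketch of signature matching over cached banner text might look like the following. The pattern IDs are taken from this page, but the regular expressions, the `SIGNATURES` table, and the sample banner are hypothetical stand-ins and do not reproduce our actual signatures.

```javascript
// Hypothetical signature table: pattern ID -> regular expression.
// The IDs come from this page; the regexes are illustrative assumptions.
const SIGNATURES = {
  "mcp-sse-exposed": /content-type:\s*text\/event-stream/i,
  "claude-md-exposed": /CLAUDE\.md/,
  "debug-mode-enabled": /x-debug(-mode)?:\s*(1|true|on)/i,
  "dir-listing-enabled": /<title>\s*Index of \//i,
};

// Return the IDs of every signature that matches the cached banner text.
function matchPatterns(bannerText) {
  return Object.entries(SIGNATURES)
    .filter(([, regex]) => regex.test(bannerText))
    .map(([id]) => id);
}

// Made-up banner of the kind an index record might contain.
const sampleBanner = [
  "HTTP/1.1 200 OK",
  "Server: uvicorn",
  "Content-Type: text/event-stream",
  "X-Debug: true",
].join("\r\n");

console.log(matchPatterns(sampleBanner));
// prints: [ 'mcp-sse-exposed', 'debug-mode-enabled' ]
```

A banner that matches none of the signatures yields an empty array and contributes nothing to the counts, which mirrors the rule that only clear signature matches are counted.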

3. Aggregation and Reporting

Findings are aggregated by pattern type. We calculate rates based on the proportion of indexed hosts showing security-relevant patterns and publish aggregate statistics only. No individual hosts, IPs, or organizations are identified.
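
The aggregation step can be sketched as below, under one reading of the rate: the number of hosts with at least one matched pattern divided by the number of hosts analyzed. The `aggregate` function, the finding shape, and the input data are hypothetical.

```javascript
// Each finding pairs a host key with one matched pattern ID. The host key
// is used only to count distinct hosts and never appears in the output.
function aggregate(findings, totalHosts) {
  const countsByPattern = {};
  const hostsWithFindings = new Set();
  for (const { host, pattern } of findings) {
    countsByPattern[pattern] = (countsByPattern[pattern] || 0) + 1;
    hostsWithFindings.add(host);
  }
  const rate = totalHosts === 0 ? 0 : hostsWithFindings.size / totalHosts;
  return { countsByPattern, affectedHosts: hostsWithFindings.size, rate };
}

// Made-up input: 3 findings spread across 2 of 10 analyzed hosts.
const report = aggregate(
  [
    { host: "h1", pattern: "debug-mode-enabled" },
    { host: "h1", pattern: "outdated-version" },
    { host: "h2", pattern: "outdated-version" },
  ],
  10
);

console.log(report.countsByPattern, `${(report.rate * 100).toFixed(1)}%`);
```

Returning only per-pattern counts and a rate keeps the published output aggregate: nothing in the result identifies an individual host.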

Shodan Query Categories

We use 207 queries across these categories to maximize coverage:

SSE Endpoints

5 queries

text/event-stream on ports 80, 443, 3000, 8000, 8080

Python Frameworks

35 queries

Uvicorn, FastAPI, Django, Flask, Gunicorn, Tornado, aiohttp

Node.js Servers

30 queries

Express, Koa, Hapi, Fastify, NestJS, Next.js, Nuxt

WebSocket/Real-time

15 queries

WebSocket upgrades, Socket.io, WS connections

API Patterns

25 queries

/api/v1, /api/v2, REST endpoints, GraphQL, OpenAPI

AI/ML Infrastructure

20 queries

LangChain, LlamaIndex, Hugging Face, model endpoints

Cloud Platforms

15 queries

AWS Lambda, GCP Run, Azure Functions, Vercel, Heroku

Debug/Admin Endpoints

20 queries

/debug, /admin, /health, /metrics, /status

Go/Java/Ruby/Rust

25 queries

Gin, Echo, Spring, Rails, Actix, Rocket

Database/Container UIs

17 queries

MongoDB Express, Redis Commander, Portainer, phpMyAdmin
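
As a quick consistency check, the per-category counts listed above can be summed to confirm the stated total. The snippet below copies the counts from this page; the `QUERY_CATEGORIES` name is only illustrative.

```javascript
// Query counts per category, copied from the list above.
const QUERY_CATEGORIES = {
  "SSE Endpoints": 5,
  "Python Frameworks": 35,
  "Node.js Servers": 30,
  "WebSocket/Real-time": 15,
  "API Patterns": 25,
  "AI/ML Infrastructure": 20,
  "Cloud Platforms": 15,
  "Debug/Admin Endpoints": 20,
  "Go/Java/Ruby/Rust": 25,
  "Database/Container UIs": 17,
};

const categoryCount = Object.keys(QUERY_CATEGORIES).length;
const totalQueries = Object.values(QUERY_CATEGORIES).reduce((a, b) => a + b, 0);

console.log(categoryCount, totalQueries); // prints: 10 207
```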

Sample Shodan queries
# SSE Endpoints
"text/event-stream" port:443
"text/event-stream" port:80
# Python Frameworks
"uvicorn" port:443
"fastapi" port:8000
"gunicorn" port:443
"django" port:8000
# Node.js
"X-Powered-By: Express" port:443
"next.js" port:3000
# API Patterns
http.html:"/api/v1" port:443
"graphql" port:443
# Debug Endpoints
"/debug" http.status:200
"/admin" http.status:200

Patterns Analyzed

We analyze Shodan index data for 12 security-relevant patterns:

mcp-sse-exposed

MCP SSE Endpoints

SSE endpoint signatures in banner data

mcp-tools-exposed

MCP Tools Listing

Tool definition patterns in HTTP responses

api-key-exposed

API Key Exposure

API key patterns in cached response headers

config-file-exposed

Config Files

Configuration file paths in directory listings

claude-md-exposed

System Instructions

CLAUDE.md references in index data

no-auth-mcp

Unauthenticated MCP

MCP endpoints without auth indicators

outdated-api-endpoint

Debug Endpoints

/debug, /admin, /shell paths in index data

clawdbot-gateway-exposed

Agent Gateway

Gateway signatures on port 18789

clawdbot-websocket-exposed

WebSocket Control

WebSocket control signatures on port 18790

outdated-version

Outdated Versions

Outdated version strings in banners

debug-mode-enabled

Debug Mode

Debug mode indicators in response headers

dir-listing-enabled

Directory Listing

Directory listing indicators in HTML content

Current Findings Breakdown

From 11,100 analyzed hosts, 8,449 security-relevant patterns were identified:

Outdated API Endpoints: 5,042
Claude.md Exposed: 1,190
Outdated Versions: 829
MCP Tools Exposed: 645
Clawdbot Gateway: 289
Debug Mode Enabled: 272
Unauthenticated MCP: 58
Config Files Exposed: 54
API Keys Exposed: 32
Clawdbot WebSocket: 22
MCP SSE Exposed: 14
Directory Listing: 2
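
The breakdown can be checked against the stated total, and each pattern's share of all findings derived, with a few lines of arithmetic. The counts below are copied from this section; the `FINDINGS` and `share` names are illustrative.

```javascript
// Pattern counts copied from the breakdown above.
const FINDINGS = {
  "Outdated API Endpoints": 5042,
  "Claude.md Exposed": 1190,
  "Outdated Versions": 829,
  "MCP Tools Exposed": 645,
  "Clawdbot Gateway": 289,
  "Debug Mode Enabled": 272,
  "Unauthenticated MCP": 58,
  "Config Files Exposed": 54,
  "API Keys Exposed": 32,
  "Clawdbot WebSocket": 22,
  "MCP SSE Exposed": 14,
  "Directory Listing": 2,
};

const totalPatterns = Object.values(FINDINGS).reduce((a, b) => a + b, 0);

// Fraction of all identified patterns that fall under one label.
function share(label) {
  return FINDINGS[label] / totalPatterns;
}

console.log(totalPatterns); // prints: 8449
console.log(`${(share("Outdated API Endpoints") * 100).toFixed(1)}%`); // prints: 59.7%
```

The single largest category accounts for roughly three fifths of all identified patterns, which is worth keeping in mind when reading the aggregate figures.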

Reproducibility

Our research methodology is documented here for transparency. The analysis approach uses the Shodan search API, a publicly available service, combined with open-source code review.

# Requirements
- Shodan API key (Freelancer plan for 1,000 results/query)
- Node.js 18+
# Analysis parameters
- 207 Shodan queries across 10 categories
- 6-second delay between Shodan API calls
- Banner and header pattern matching for service identification
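
The pacing parameter could be implemented along the following lines. This is a sketch under assumptions: `runPaced`, `sleep`, and `minimumSweepSeconds` are hypothetical names, `task` stands in for whatever function issues one API call, and the time estimate assumes a single call per query (fetching several result pages per query would add further calls).

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task` once per query, waiting `delayMs` between consecutive calls.
// `task` is a caller-supplied async function (hypothetical).
async function runPaced(queryList, delayMs, task) {
  const results = [];
  for (let i = 0; i < queryList.length; i++) {
    if (i > 0) await sleep(delayMs); // no wait before the first call
    results.push(await task(queryList[i]));
  }
  return results;
}

// Lower bound on sweep time from pacing alone: one delay between each
// pair of consecutive calls, assuming one call per query.
function minimumSweepSeconds(queryCount, delaySeconds) {
  return Math.max(0, queryCount - 1) * delaySeconds;
}

console.log(minimumSweepSeconds(207, 6)); // prints: 1236
```

With 207 queries and a 6-second delay, pacing alone accounts for 1,236 seconds, or about 20.6 minutes, before any request latency is counted.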

Ethics and Legal Framework

  • All research is based on analysis of Shodan's publicly available index data
  • We do not access, test, or exploit any third-party systems
  • No authentication mechanisms are bypassed or tested
  • No private data is retrieved, stored, or disclosed
  • All reports use aggregate statistics only; no organizations, IPs, or domains are identified
  • Security patterns are identified through open-source code review and default configuration analysis
  • Our goal is to help organizations identify and fix security issues in their own infrastructure

Full Research Report

For the full analysis including detailed vulnerability breakdowns and recommendations, read our published report.

Questions or Concerns?

If you have questions about our methodology or want to report an issue with our data, please contact us.