Sophistication Index

Author: ARIAtrap (session content analysis). Cadence: live, distribution-only.

Methodology: /methodology/behavioral-sweep

Rubric in calibration; mean withheld pending classifier validation. Until the supervised classifier reaches its labeled-holdout F1 threshold, this index publishes only the distribution shape across the one-to-ten scale per session. No mean. No grade. No month-over-month delta on a mean. Distribution shape is informative on its own (bimodal versus long-tailed versus clustered) and does not require classifier validation to publish.

Index page in preview. Distribution rendering opens after the first month of fleet telemetry: once enough sessions accumulate to compute a stable histogram (target: n of at least one hundred per reporting window), the histogram lands on this page.
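The gating rule above can be sketched in a few lines. The one-to-ten scale and the n-of-one-hundred target come from this page; the function name, the list-of-scores input shape, and the zero-filled bin layout are illustrative assumptions, not a published implementation.

```python
from collections import Counter

MIN_SESSIONS = 100  # target n per reporting window, per this page


def distribution_shape(scores):
    """Return the 1-10 histogram for a reporting window, or None
    when too few sessions have accumulated for a stable shape."""
    if len(scores) < MIN_SESSIONS:
        return None  # withhold: histogram not yet stable
    counts = Counter(scores)
    # One bin per rubric point, zero-filled so shapes stay
    # comparable across windows even when some scores never occur.
    return [counts.get(s, 0) for s in range(1, 11)]
```

Returning the raw per-bin counts (rather than a mean) is exactly what makes a bimodal window distinguishable from a clustered one at a glance.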

Why distribution-only

A sophistication score is a subjective rubric without a measurement layer underneath. Publishing a mean of subjective scores presents estimated data as if it were measured: the same failure shape that has discredited several earlier security census efforts.

We commit to the harder path: ground the score in a supervised classifier (NanoMind v3.x) trained on a human-labeled holdout, and withhold the mean until the classifier's F1 score on that holdout meets the publication threshold. Distribution shape is published throughout.
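The publication gate described above might look like the following sketch. The F1-on-holdout gate is stated on this page, but the threshold value, the function names, and the confusion-matrix-counts interface are hypothetical placeholders.

```python
def f1_score(tp, fp, fn):
    """Binary F1 from confusion-matrix counts on the labeled holdout."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


F1_THRESHOLD = 0.90  # hypothetical value; the page does not state the threshold


def mean_publishable(tp, fp, fn):
    """Gate: publish the mean only once holdout F1 clears the threshold."""
    return f1_score(tp, fp, fn) >= F1_THRESHOLD
```

The distribution shape bypasses this gate entirely, which is why it publishes throughout calibration.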

What changes when classifier validation lands

  • The page transitions from histogram-only to histogram plus mean with a ninety-five-percent confidence interval.
  • A peer-review window opens before the first mean is cited on any conference deck. The peer-review channel is an arXiv preprint with a Mastodon discussion thread; comments accepted for two weeks before the rubric is finalized.
  • The validation set, the labeling rubric, and the confusion matrix are all published alongside the first cited mean.
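The mean-plus-interval output in the first bullet could be computed as below. The ninety-five-percent level comes from this page; the normal approximation, the sample-variance formula, and the function name are assumptions, since the page does not specify an interval method.

```python
import math


def mean_with_ci(scores, z=1.96):
    """Mean of session scores with a normal-approximation 95% CI.
    Assumes the published interval is a simple z-interval; the
    page does not state the actual method."""
    n = len(scores)
    mean = sum(scores) / n
    # Sample variance (n - 1 denominator), then standard error.
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean, (mean - half_width, mean + half_width)
```

Whatever the final method, the interval narrows as per-window n grows, which is another reason the n-of-one-hundred target matters before any mean is cited.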

See also