Operator Discovery
How session-level HASSH fingerprints surface coordinated SSH operators that subnet- and ASN-based clustering can't see — and what that's worth to your threat model.
We compute the HASSH MD5 of every SSH client that connects to our cowrie sensors. When ≥50 distinct non-benign actors share a fingerprint and span ≥3 distinct /16 subnets, we promote that cluster to a campaign. The /16 dispersion check ensures we surface distributed operators, not single-provider hosting farms (those are caught by subnet/ASN clustering already). The detector runs every aggregation cycle and excludes 259 benign-tagged scanners.
// What we see right now
| HASSH (truncated) | Client banner | IPs | Countries |
|---|---|---|---|
| 03a80b21afa8… | SSH-2.0-libssh_0.11.1 | 666 | 74 |
| dd9bcf093c35… | SSH-2.0-ZGrab ZGrab SSH Survey | 52 | 1 |
// Why HASSH
HASSH is an MD5 hash of the SSH client's KEX (key exchange), encryption, MAC, and compression algorithm sets, in the order the client offers them. Two clients running the same SSH library, version, and configuration produce the same HASSH — even if they're connecting from different IPs, ASNs, or countries.
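As a rough illustration of that definition, the sketch below computes a HASSH from four algorithm lists. It assumes the lists have already been parsed out of the client's key-exchange init message; the function name and the sample algorithm lists are illustrative, not taken from our capture pipeline.

```python
import hashlib


def hassh(kex, encryption, mac, compression):
    """Return the HASSH (MD5 hex digest) for a client's offered algorithms.

    Each argument is a list of algorithm names in the order the client
    offered them. The encryption, MAC, and compression lists are the
    client-to-server ones.
    """
    # Algorithms inside one list are joined with commas, preserving order.
    fields = [",".join(algs) for algs in (kex, encryption, mac, compression)]
    # The four lists are joined with semicolons, then hashed with MD5.
    return hashlib.md5(";".join(fields).encode("ascii")).hexdigest()


# Hypothetical client offer, for illustration only.
fingerprint = hassh(
    ["curve25519-sha256", "diffie-hellman-group14-sha256"],
    ["aes128-ctr", "aes256-ctr"],
    ["hmac-sha2-256"],
    ["none"],
)
```

Because the order is part of the hashed string, two clients that offer the same algorithms in a different order produce different fingerprints, which is what makes the value specific to a library build and configuration.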
That's the whole insight: an attacker can rent new IPs, but they can't easily change their tooling. A botnet operator controlling 4,000 compromised hosts is going to use the same SSH client on all of them, because that's what the malware ships with. A scanner deployed across 500 cloud VMs is going to identify identically on every connection, because that's what ZGrab does. HASSH collapses the cosmetic disguise of distributed infrastructure to expose the operator behind it.
Volume-based reputation services (Censys, Shodan, AbuseIPDB, CrowdSec) cannot cluster on HASSH because they don't run cowrie-class capture that produces HASSH. GreyNoise classifies SSH traffic but doesn't expose HASSH-based pivots publicly. We do both: every fingerprint is pivotable at /tools/hassh/<fp>/ and /api/v1/fingerprints/hassh/<fp>, and the campaign detector promotes qualifying clusters automatically.
// How the detector decides
A HASSH fingerprint qualifies as a hassh_cluster campaign when all of these are true:
- ≥50 distinct non-benign actors share the fingerprint. The threshold filters out small operators, accidental tool collisions, and minor scanner fleets.
- Those actors span ≥3 distinct /16 subnets. A single /16 usually means a hosting farm, which our subnet detector already catches. We require network dispersion to surface distributed operators specifically.
- At least one actor was active in the last 7 days. Stale fingerprints stop being campaigns; the lifecycle manager closes them after 30 days of inactivity.
- Benign-tagged actors are excluded from the count. Legitimate research scanners (Censys, Shodan, Shadowserver, etc.) don't pad our campaign counts — see classification methodology.
The thresholds are intentionally conservative. Loosening them would catch smaller operators but contaminate the cluster set with false positives and make the "this is a real distributed operator" claim less defensible. Tightening them would raise confidence further but hide operators that sit just above today's cutoffs. The current values (50 actors, 3 /16s) are tuned from production data and published in the source: apps/threats/campaigns.py, constants HASSH_MIN_ACTORS and HASSH_MIN_SUBNETS_16.
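The four conditions above can be sketched as a single predicate. This is a minimal illustration, not the code in apps/threats/campaigns.py: only the two constant names and their values come from the text, while the `Actor` record, the `slash16` helper, and the `qualifies` function are hypothetical, and IPv4 addresses are assumed.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

HASSH_MIN_ACTORS = 50        # constant name and value as stated in the text
HASSH_MIN_SUBNETS_16 = 3     # constant name and value as stated in the text
ACTIVITY_WINDOW = timedelta(days=7)  # hypothetical name for the 7-day rule


@dataclass(frozen=True)
class Actor:
    """Hypothetical actor record: one source IP seen with a given HASSH."""
    ip: str
    last_seen: datetime
    benign: bool = False


def slash16(ip):
    """Return the /16 network containing an IPv4 address, e.g. '203.0.0.0/16'."""
    return str(ipaddress.ip_network(f"{ip}/16", strict=False))


def qualifies(actors, now):
    """Apply the four published conditions to the actors sharing one HASSH."""
    # Benign-tagged actors are dropped first; keying by IP keeps actors distinct.
    suspects = {a.ip: a for a in actors if not a.benign}
    if len(suspects) < HASSH_MIN_ACTORS:
        return False
    # Dispersion check: count the distinct /16 networks the actors fall into.
    if len({slash16(ip) for ip in suspects}) < HASSH_MIN_SUBNETS_16:
        return False
    # Recency check: at least one actor must have been seen inside the window.
    return any(now - a.last_seen <= ACTIVITY_WINDOW for a in suspects.values())


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
# 50 synthetic actors spread over 10.0.0.0/16, 10.1.0.0/16, and 10.2.0.0/16.
fleet = [Actor(f"10.{i % 3}.0.{i}", now - timedelta(days=1)) for i in range(50)]
is_campaign = qualifies(fleet, now)
```

Running the cheap count check before the subnet and recency checks keeps the common case (small clusters) fast; the order does not change the result.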
// Top fingerprints in the last 7 days
Distinct non-benign actors per HASSH, last 7 days. Click any fingerprint to drill into its actor list, geographic spread, and ASN distribution.
| HASSH (truncated) | Actors (7d) |
|---|---|
| 03a80b21afa81068… | 666 |
| dd9bcf093c355da7… | 52 |
| 16443846184eafde… | 41 |
| 084386fa7ae5039b… | 33 |
| 98f63c4d9c87edbd… | 29 |
// What the detector won't catch
Important to be honest about the limits:
- Operators using diverse tooling. A sophisticated actor running multiple SSH clients across their fleet will produce multiple HASSH fingerprints. Each one might fall below the 50-actor threshold individually. Credential-fingerprint and JA4 pivots (planned, GH #186) will help triangulate these.
- Single-provider hosting farms. Intentionally excluded by the /16 dispersion check — those are caught by our subnet and ASN detectors instead.
- Non-SSH attacks. HASSH is SSH-specific. The same architectural pattern (capture → pivot → detector) will extend to JA4 for TLS, but that's a separate feature in the queue.
- Brand-new operators. The 7-day window means a fresh operator needs at least one sensor touch from each of 50 IPs before it surfaces. Low-and-slow operators that probe gradually will take longer to cluster; that's the cost of a high-confidence threshold.
// How to use it
GET /api/v1/fingerprints/hassh/<fp>
7-day window default; ?window=all for history. Returns up to 500 actors per call.
GET /api/v1/clusters/hassh/top
Top fingerprints by recent non-benign actor count. Cached 5 min. Use as an anomaly signal.
/tools/hassh/<fp>/
HTML pivot view: actor table, top countries, top ASNs sidebar. Linkable from any actor detail page.
Active hassh_cluster campaigns appear in the standard threat feed with a HASSH badge.
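A minimal client sketch for the fingerprint endpoint, using only the Python standard library. The base URL is a placeholder, the response is assumed to be JSON, and any authentication the API may require is not shown; see /api/docs for the actual schema. The helper names are illustrative.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://sensor.example"  # placeholder host, replace with the real one


def hassh_url(fingerprint, window=None, base_url=BASE_URL):
    """Build the pivot URL for one HASSH fingerprint.

    With window=None the server's 7-day default applies; window="all"
    asks for full history.
    """
    url = f"{base_url}/api/v1/fingerprints/hassh/{quote(fingerprint, safe='')}"
    if window is not None:
        url += "?" + urlencode({"window": window})
    return url


def fetch_json(url, timeout=10):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Usage, with <fp> standing in for a full 32-character HASSH:
#   data = fetch_json(hassh_url("<fp>", window="all"))
```

Since each call returns at most 500 actors, a large fingerprint may be truncated in a single response; how to page past that limit is not described here, so check the API reference before relying on counts derived from one call.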
// See also
- /intelligence/ — full methodology: collection, classification, scoring, sources
- /intelligence/methodology — published 6-signal confidence formula with weights
- /intelligence/scanners/ — benign scanner registry (organizations we suppress)
- /api/docs — OpenAPI/Swagger reference for all endpoints