
How we populate all four intent axes

April 21, 2026 · By IntrusionLabs · classification, intent, methodology, greynoise, reconciler

A month ago our threat classifier looked four-valued on paper and two-valued in practice.

The schema had four intent axes — malicious, suspicious, benign, unknown — modeled on GreyNoise's two-axis classification but applied to our own first-party honeypot data. Four values. Four meaningful labels. Good.

The actual distribution on 14,465 tracked actors:

  • benign: 259 (Censys, Shodan, Shadowserver, etc. — populated)
  • suspicious: 9 (Tor exits — populated)
  • malicious: 0 (empty)
  • unknown: 14,197 (everybody else)

So a classifier with four possible verdicts in the database was, in production, telling you one of two things: "this is a known research scanner" or "we don't know." That's still honest — we're not confabulating — but it's not the four-axis classifier the public /features/ page was describing. It was the model the field was designed for, but not the model the data was expressing.

This post is about how we finished it, what each axis now actually means in our data, and — the part I care about most — why every verdict carries a machine-readable reason string that a client can pull down and audit.

Why the gap existed

When I first shipped the intent + category fields, I deferred the harder writer. The easy writer was the hostname classifier: if the reverse DNS of an actor ends in censys-scanner.com, tag them benign. If the ASN is on the Tor Project's exit-node list, tag them suspicious. Both are cheap, both are accurate, both are stable. They shipped.

The harder writer was everything else. "What's behaviorally malicious?" is not a question the hostname tells you. It's a question the session data tells you — how far the attacker got, what they typed, what they tried to download. That writer I punted on, because getting the thresholds wrong would be worse than leaving actors at unknown.

For a month, the pragmatic shape of the product was "we tell you when we're confident something is benign, and otherwise we leave you to interpret the session pattern and the confidence score yourself." That's a defensible posture. It's also not what the hero card on our features page was promising. So in week one of building out the public-facing feature pages, the gap became visible. A few weeks later I came back and fixed it.

The reconciler

The thing that landed is 237 lines in one file, apps/threats/intent.py. It's a second writer that runs as the last step of every aggregation cycle. It reads signals already present on the actor — the session pattern the classifier had tagged, the confidence score, the count of independent external feeds corroborating the actor, whether the actor's ASN is on Spamhaus ASN-DROP — and promotes the actor's intent under a declared precedence.

The precedence, declared literally at the top of the module docstring:

benign  >  malicious  >  suspicious  >  unknown

Benign is sticky. Once the hostname classifier has flagged an actor as a known research scanner, nothing the reconciler sees overrides that. Tor-suspicious is sticky too: the hostname classifier owns the tor_exit reason, and "Tor exit" is a stronger signal than any behavioral pattern you could derive from the same actor's sessions. For everyone else, the reconciler reads the signals in order of strength and picks the first rule that fires.

The rules are module constants, which I find clearer than imperative logic. They read as a classification spec, not code:

MALICIOUS_BEHAVIORAL_PATTERNS = frozenset({
    "malware_dropper",
    "data_exfiltrator",
    "interactive_operator",
})

SUSPICIOUS_BEHAVIORAL_PATTERNS = frozenset({
    "credential_harvester",
    "opportunistic_bruter",
    "proxy_abuser",
    "mysql_bruter",
    "ftp_bruter",
    "telnet_bruter",
})

MALICIOUS_CONFIDENCE_FLOOR = 0.35
SUSPICIOUS_CONFIDENCE_FLOOR = 0.30
SUSPICIOUS_CORROBORATION_MIN = 2
SUSPICIOUS_ASN_DROP_EVENT_MIN = 10

An actor qualifies as malicious if their session-level behavior matches one of three high-depth patterns — they dropped a malware payload, exfiltrated data, or ran an interactive shell — and their overall confidence score is at least 0.35. The confidence floor is there so a single low-confidence observation doesn't flip the verdict; we want the behavioral pattern and the confidence score to agree before we commit.

An actor qualifies as suspicious for any of three reasons: their behavior matches a credential-harvester / bruter pattern at confidence ≥ 0.30, or two or more independent external feeds have corroborated them, or they're operating from an ASN on Spamhaus' ASN-DROP list with at least 10 observable events hitting our sensors. These are mid-depth signals — enough to label but not enough to condemn.

Everyone else stays unknown. That's the honest answer for actors whose observable behavior doesn't meet any promotion floor. We'd rather leave them unlabeled than guess.

What the axes actually look like now

As of this morning, across 14,591 tracked actors, the four axes are:

  • malicious: 884 (6.1%)
  • suspicious: 5,123 (35.1%)
  • benign: 259 (1.8%)
  • unknown: 8,325 (57.0%)

That's what a four-axis classifier is supposed to look like: each axis carries a meaningfully sized population that earned its label from a specific signal, and the unknown bucket — still the plurality — is where actors sit when we genuinely don't have enough evidence to commit.

Three real actors, three real reasons

The part that makes this useful to a CTI analyst isn't the counts. It's that every verdict carries a reason string that traces back to the rule that fired. That reason is stored on the actor, in enrichment_metadata.intent.reason, and it's exposed on the public API.

Pulled live just now from three different actors:

115.191.66.84 (malicious)
  intent_reason: behavioral:malware_dropper conf=0.41
  intent_source: algorithm:intent-reconciler-v1

The reconciler tagged this one malicious because their session history contains a malware_dropper pattern and their confidence score is 0.41 — above the 0.35 floor. Concretely: the session classifier saw this actor successfully authenticate to a Cowrie honeypot, then run a sequence of commands culminating in a wget or curl for a payload.

20.203.42.204 (suspicious)
  intent_reason: behavioral:credential_harvester conf=0.59
  intent_source: algorithm:intent-reconciler-v1

Suspicious because their behavior matches the credential_harvester pattern at confidence 0.59 — well above the 0.30 floor. This actor is running password attacks, persistently enough and across enough of our sensors to pass the confidence floor, but hasn't crossed into the higher-depth behaviors that would promote them to malicious.

3.131.220.121 (benign)
  intent_reason: hostname:known_scanner
  intent_source: hostname-classifier

Benign because their reverse DNS matches our curated scanner-organization registry — most likely Shodan, Censys, or one of the other research scanners we explicitly don't treat as threats. Note the source is different: this one was written by the hostname classifier (which owns benign and Tor-suspicious), not the reconciler.

A client consuming our API can distinguish algorithm-written verdicts from hostname-written ones by checking whether intent_source starts with algorithm:. Different sources carry different confidences, and an analyst may want to weight them differently when rescoring in their own pipeline.
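That source check, plus a split of the reason string, is only a few lines of client code. The reason grammar below is inferred from the three examples above, so treat the parser as a sketch rather than a guaranteed contract:

```python
import re

def is_algorithmic(actor: dict) -> bool:
    """True when the verdict came from the reconciler, not the hostname classifier."""
    return actor.get("intent_source", "").startswith("algorithm:")

def parse_reason(reason: str) -> dict:
    """Split 'behavioral:malware_dropper conf=0.41' into its parts."""
    m = re.fullmatch(r"(?P<signal>\w+):(?P<detail>\w+)(?: conf=(?P<conf>[\d.]+))?", reason)
    if not m:
        return {"raw": reason}
    out = {"signal": m["signal"], "detail": m["detail"]}
    if m["conf"]:
        out["conf"] = float(m["conf"])
    return out
```

With the reason split into fields, re-weighting in your own pipeline becomes a join on `signal` and `detail` rather than string matching on a blob.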

The endpoint

The reason string is exposed at GET /api/v1/actor/<ip>:

$ curl https://intrusionlabs.com/api/v1/actor/115.191.66.84 | jq '{intent, intent_reason, intent_source, intent_reconciled_at}'
{
  "intent": "malicious",
  "intent_reason": "behavioral:malware_dropper conf=0.41",
  "intent_source": "algorithm:intent-reconciler-v1",
  "intent_reconciled_at": "2026-04-20T17:52:05.613907+00:00"
}

If you want to pull a whole cohort at once, the list endpoint supports ?intent=malicious|suspicious|benign|unknown, which combines with the usual min_confidence, max_age_hours, and category filters:

$ curl 'https://intrusionlabs.com/api/v1/threats/ips?intent=malicious&limit=10'

And the bulk endpoint (POST /api/v1/threats/bulk with a list of up to 100 IPs) returns intent and intent_reason for every IP it found. All three endpoints are rate-limited at 60/h anonymous — no API key, no signup.
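If you have more than 100 IPs, you'll need to batch. A small helper for that — the 100-IP cap is the only constraint taken from the endpoint description above; everything else is ordinary client code:

```python
import json

BULK_MAX = 100  # documented per-request cap on the bulk endpoint

def bulk_payloads(ips: list[str]) -> list[str]:
    """Split an IP list into JSON bodies the bulk endpoint will accept."""
    return [
        json.dumps({"ips": ips[i:i + BULK_MAX]})
        for i in range(0, len(ips), BULK_MAX)
    ]
```

POST each body to /api/v1/threats/bulk, and mind the 60/h anonymous rate limit when batching large lists.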

The bit I find most useful: the reason string is persistent. It was written when the actor was last reconciled, and it's still there now. An analyst investigating an incident can pull the actor's detail six hours later and see the exact rule that fired when the verdict was assigned, along with the timestamp. That's a different shape of trust than "our ML model says malicious with 0.7 confidence." It's a specific claim about a specific signal, visible end-to-end.

The precedence in action

Because the reconciler runs every cycle, verdicts can change as signals change. If a previously-unknown actor crosses the malware_dropper threshold tomorrow, they become malicious. If their behavior tapers off and drops below the confidence floor, they fall back to unknown — no verdict is locked except benign. The reconciled_at timestamp tells you how fresh the call is.

Exactly two verdicts are sticky:

  1. Benign. The hostname classifier owns this. Once an actor's rDNS matches Censys or Shodan or Shadowserver, we don't un-benign them because they scanned aggressively one day. Research scanners scan aggressively. That's their job.
  2. Tor-suspicious. The hostname classifier wrote intent=suspicious with a Tor-specific reason; the reconciler sees that combination and leaves it alone, because "operating from a Tor exit" is a structurally stronger signal than any behavioral pattern we could write from session data for the same actor.

That's it. Everything else is fluid, per aggregation cycle.

What the classifier won't catch

Four honest limits, because anyone who tells you their classifier is complete is selling you something:

Low-signal actors. An actor that only shows up once, on one sensor, with a scanner-shaped session won't meet any promotion floor. They stay unknown. That's the plurality of our population and it's the honest answer for them — we'd rather leave them unlabeled than guess.

New benign scanners not in the registry. Our scanner-organization registry is curated. A new legitimate research operator won't be flagged benign until their domain is added. Until then, if their behavior resembles scanning, they may earn a behavioral-suspicious promotion that will flip to benign once the registry catches up.

Mixed-intent actors. An actor whose behavior genuinely straddles suspicious and malicious (the same IP running a credential harvester and eventually dropping malware when it finds a weak password) collapses to the first rule that fires. The primary_threat_category field captures the strongest session-level signal; the rest is in the per-session detail on the actor's page.

Heuristic, not learned. No ML model in the loop. The floors and the rule list are hand-tuned and published — literally, the constants block above is the whole classifier spec. If you disagree with where we drew a line, the constants are on the deep-dive page and you have everything you need to rescore in your own pipeline. A classify_intent --dry-run management command prints the projected distribution shift before a threshold change goes live, which is how I tune them myself.
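The dry-run idea is worth sketching, since it's also how you'd tune your own copy of the floors: recompute the would-be verdict count at a candidate threshold and diff it against the current one. The code below is a standalone illustration of that projection, not the management command itself:

```python
MALICIOUS_PATTERNS = {"malware_dropper", "data_exfiltrator", "interactive_operator"}

def project_malicious(actors: list[dict], floor: float) -> int:
    """Count actors that would qualify as malicious at a given confidence floor."""
    return sum(
        1 for a in actors
        if a["pattern"] in MALICIOUS_PATTERNS and a["confidence"] >= floor
    )

# Comparing project_malicious(actors, 0.35) against project_malicious(actors, 0.45)
# shows how many verdicts a floor change would retract before it goes live.
```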

Why this matters to defenders

A classifier that collapses to a single score — "this IP is 87% bad" — gives you very little to act on. You either trust the number or you don't, and if you don't, you have nowhere to look.

A classifier that exposes the four axes plus the reason string gives you a different thing: a decision surface. You can filter to intent=malicious if your tolerance is low and you only want the highest-signal actors. You can filter to intent=suspicious and drill into the specific reason strings to decide which ones match your threat model (credential harvesters are a bigger concern for a high-value SSH estate; proxy abusers are more relevant for a web property). You can rescore our suspicious cohort in your own pipeline if you think the corroboration threshold of 2 is too low or the confidence floor of 0.30 is too conservative.
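As a concrete rescoring example, here's how a consumer might keep only the suspicious actors whose reason matches their own threat model. Field names follow the API examples earlier in this post; the pattern tuple is entirely the consumer's choice:

```python
def rescore_suspicious(actors: list[dict], patterns: tuple[str, ...]) -> list[str]:
    """Return IPs of suspicious actors whose behavioral reason matches your threat model."""
    keep = []
    for a in actors:
        reason = a.get("intent_reason", "")
        if a.get("intent") == "suspicious" and any(
            reason.startswith(f"behavioral:{p}") for p in patterns
        ):
            keep.append(a["ip"])
    return keep
```

An SSH-heavy estate might pass ("credential_harvester", "opportunistic_bruter"); a web property might care more about ("proxy_abuser",).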

All of that is possible because the reason is on the actor, not buried in a model. The precedence is in a docstring. The thresholds are module constants. If a future version of the classifier picks different numbers, the reason strings will still trace back to the rule that fired for each specific verdict, because the provenance is stored per-actor.

This is the thing I find missing from most reputation services: the ability, as a practitioner, to read the rule that produced a specific verdict and decide whether I agree with it. Published aggregate formulas are good. Published per-verdict reasons are better. The deep-dive page at /features/scanner-classification/ has the live counts, the full rule table, and examples pulled in real time from the API.

Try it

  • The deep-dive (rules, thresholds, live examples): https://intrusionlabs.com/features/scanner-classification/
  • Filter by intent: curl 'https://intrusionlabs.com/api/v1/threats/ips?intent=malicious&limit=10'
  • Per-actor verdict with reason: curl https://intrusionlabs.com/api/v1/actor/<ip>
  • Bulk lookup (up to 100 IPs): curl -X POST -H 'Content-Type: application/json' -d '{"ips":["1.2.3.4","5.6.7.8"]}' https://intrusionlabs.com/api/v1/threats/bulk

Everything is free, public, rate-limited at 60/h anonymous. No API key, no signup.

If you're a CTI engineer consuming this and you have opinions about the thresholds — especially places where you'd rescore differently — send me a note. The floors are not immutable; they're the best tuning I have for the data I can see. Your data might suggest different ones.