What is an IOK rule?
IOK ("Indicator of Kit") is a small, open source language designed for detecting and classifying phishing sites. It's based on Sigma, but adapted for analysis of websites, rather than security logs.
The core IOK rule engine (along with over 200 rules) is open source. Phish Report builds on top of this open source core and provides:
- An online IDE allowing you to test and validate rules before deploying them
- A set of private rules maintained by the Phish Report team
The structure of an IOK rule
An IOK rule looks like this:
title: Fake Chrome error page
description: |
  The Chrome error page HTML is built into the browser: you should never see it in the response from a website.
  This is a clear sign that the site is employing cloaking/anti-analysis techniques.
level: likely_malicious
references:
  - https://twitter.com/phish_report/status/1537825544343011328
detection:
  chromeHTMLFragments:
    html|contains|all:
      - '<body id="t" class="neterror" style="font-family: '
      - '<div id="main-frame-error" class="interstitial-wrapper" jstcache="0">'
  condition: chromeHTMLFragments
A rule can be broken down into two main sections.
Title and Description
A rule starts with some commentary to help you and your team understand what a rule detects. This commentary includes:
- A short title (which will appear in places like the analysis page of a site that matches this rule)
- A longer description where you can document, for example, what you know about the phishing kit this rule detects
- A level denoting the severity of this rule. This value determines what actions Phish Report takes automatically when a rule matches a website.
- A list of references where you can link to relevant scans of this phishing kit, your own internal documentation, or public analysis of this threat actor.
You can also include arbitrary additional data in this section. For example, if you wanted to document when you first observed a particular phishing kit, you could add your own field to do so:
first_seen: 2023-01-05
These custom fields will not be interpreted by Phish Report.
Detection logic
The detection section of an IOK rule consists of a set of named properties and a condition, which is a boolean expression over those properties. How to write this logic is described below.
Writing detection logic
Single property rules
The simplest IOK rules are based on a single property of a phishing site. For example, the threat actor "Cazanova" names their session cookie after themselves and so any site with this cookie name is likely malicious.
To detect a Cazanova site, an IOK rule would match on the cookies field and look for values like cazanova=COOKIEVALUE.
This can be done using a property like this:
cazanovaCookie:
  cookies|startswith: "cazanova="
Here we've chosen to name the property "cazanovaCookie" but this name is entirely arbitrary.
To complete the detection logic, we need to add a condition which tells Phish Report which combination of properties means this rule matches a given site.
As we only have a single property (cazanovaCookie), this is trivial:
detection:
  cazanovaCookie:
    cookies|startswith: "cazanova="
  condition: cazanovaCookie
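Putting the pieces together, a complete rule for this case might look like the sketch below. The title and description wording are illustrative, not taken from a published rule. The level value is borrowed from the earlier example on the assumption that it suits this kit.

```yaml
# Illustrative sketch of a complete single-property rule.
title: Cazanova session cookie          # hypothetical title
description: |
  The threat actor "Cazanova" names their session cookie after themselves,
  so a site which sets this cookie is likely to be one of their kits.
level: likely_malicious                 # assumed; reused from the example above
detection:
  # The property name is arbitrary; it only needs to match the condition.
  cazanovaCookie:
    cookies|startswith: "cazanova="
  condition: cazanovaCookie
```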
Filtering false positives: combining multiple properties
Many IOK rules can't be expressed with a single property. Often, you'll need to combine multiple properties to get a well-performing rule.
For example, suppose you want to detect sites which load assets (JavaScript, CSS, images, etc.) from your website, which is typical of lazily created phishing kits. You can express that as an IOK rule like this:
detection:
  hotlinkedAsset:
    requests|startswith: "https://mydomain.com/assets/"
  condition: hotlinkedAsset
However, this rule will likely also alert on other websites you host yourself, such as those on subdomains of your real domain.
To address this, we want to filter out any subdomains of your real domain. This can be done by adding another property and extending our condition to exclude those results:
detection:
  hotlinkedAsset:
    requests|startswith: "https://mydomain.com/assets/"
  mySubDomain:
    hostname|endswith: ".mydomain.com"
  condition: hotlinkedAsset and not mySubDomain
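Note that the suffix ".mydomain.com" does not match the bare hostname mydomain.com, so the real site itself could still trigger the rule. One possible way to exclude it as well is sketched below. It assumes that a field written without a modifier is compared exactly, as in Sigma. The property name myApexDomain is made up for this example.

```yaml
detection:
  hotlinkedAsset:
    requests|startswith: "https://mydomain.com/assets/"
  mySubDomain:
    hostname|endswith: ".mydomain.com"
  # Assumption: no modifier means an exact match on the whole field.
  myApexDomain:
    hostname: "mydomain.com"
  # Brackets group the two exclusions so that "not" applies to both.
  condition: hotlinkedAsset and not (mySubDomain or myApexDomain)
```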
Fields, Modifiers, and Conditions
Fields
Field name | Description |
---|---|
title | The title of the site as shown in a browser. If multiple titles are set (e.g. by JavaScript), this contains each one. |
hostname | The hostname of the site |
html | The contents of the page HTML (as returned by the server) |
dom | The contents of the page HTML after loading (e.g. after JavaScript has executed) |
js | Contents of JavaScript from the page (includes inline scripts as well as scripts loaded externally) |
css | Contents of CSS from the page (includes inline stylesheets as well as externally loaded stylesheets) |
cookies | Cookies from the page. Each is in the form cookieName=value |
headers | Headers sent by the server. Each is in the form Header-Name: value |
requests | URLs of requests made by the page (and assets loaded by the page) |
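Because cookies and headers are stored as single strings in the forms shown above, a prefix match needs to include the separator. The following sketch illustrates this. The header value is a hypothetical example, and whether header names are matched case-sensitively is not specified here.

```yaml
detection:
  # Cookie entries look like "cookieName=value": include the "=" so that
  # a cookie named "cazanova2" is not matched by accident.
  sessionCookie:
    cookies|startswith: "cazanova="
  # Header entries look like "Header-Name: value" (hypothetical value).
  phpBackend:
    headers|startswith: "X-Powered-By: PHP"
  condition: sessionCookie and phpBackend
```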
Modifiers
It's rare you'll want to match the exact contents of a field (as these often contain details specific to the individual site like the hostname). Instead, you can use modifiers to match on different parts of the field:
fieldname|startswith: value
: the field starts with the given value
fieldname|endswith: value
: the field ends with the given value
fieldname|contains: value
: the field contains the given value
fieldname|re: regexp
: the field matches the given regular expression
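As a syntax illustration only, the sketch below uses each modifier once, reusing values from the examples earlier on this page. The property names and the regular expression are hypothetical. It is not stated here whether a regular expression must match the whole field or only part of it, so the pattern is anchored at both ends to behave the same either way.

```yaml
detection:
  cookiePrefix:
    cookies|startswith: "cazanova="
  ownSubDomain:
    hostname|endswith: ".mydomain.com"
  errorPageClass:
    html|contains: 'class="neterror"'
  # Hypothetical pattern: "cazanova", optional digits, then "=" and any value.
  cookiePattern:
    cookies|re: '^cazanova[0-9]*=.*$'
  # A real rule would combine only properties that belong together.
  condition: 1 of them
```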
Additionally, if multiple values are given, the special |all modifier can be used to specify that all values must match the field (rather than the default behaviour, where a match on any one value is enough).
For example:
fieldname|contains|all:
  - foo
  - bar
This is true for foobar, but not for foobaz.
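To make the difference concrete, the sketch below puts the two behaviours side by side, using shortened fragments from the Chrome error page example. The property names are illustrative. The eitherFragment property is included only for comparison and is not used by the condition.

```yaml
detection:
  # Default: true if the HTML contains at least one of the values.
  eitherFragment:
    html|contains:
      - 'class="neterror"'
      - 'id="main-frame-error"'
  # With |all: true only if the HTML contains every value.
  bothFragments:
    html|contains|all:
      - 'class="neterror"'
      - 'id="main-frame-error"'
  condition: bothFragments
```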
Conditions
Within a condition you can use the following keywords to combine your properties into a boolean expression:
- A and B: true if both A and B are true
- A or B: true if either A or B is true
- not A: true if A is false
- 1 of them: true if any property is true
- 1 of {glob}: true if any property matching the glob pattern is true (e.g. prop2 matches the pattern prop*)
- all of them: true if all properties are true
- all of {glob}: true if all properties matching the glob pattern are true (e.g. prop2 matches the pattern prop*)
Brackets can be used to nest expressions arbitrarily, for example (A or B) and (C or D).
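As a rough illustration of how these keywords combine, the sketch below shares a prefix between related properties so that a glob can select them. All property names and the title text are hypothetical. It also assumes that a "1 of {glob}" expression can be wrapped in brackets like any other expression.

```yaml
detection:
  # Two independent signs of the same kit, sharing the "kit" prefix.
  kitCookie:
    cookies|startswith: "cazanova="
  kitTitle:
    title|contains: "Sign in"      # hypothetical title text
  # A filter for your own sites.
  ownSubDomain:
    hostname|endswith: ".mydomain.com"
  # Match if either kit property is true, unless the site is your own.
  condition: (1 of kit*) and not ownSubDomain
```

Replacing 1 of kit* with all of kit* would require both kit properties to be true, which trades some coverage for fewer false positives.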