How to write an IOK rule

What is an IOK rule?

IOK ("Indicator of Kit") is a small, open-source language designed for detecting and classifying phishing sites. It is based on Sigma but adapted to analyse websites rather than security logs.

The core IOK rule engine (along with over 200 rules) is open source. Phish Report builds on top of this open source core and provides:

  • An online IDE allowing you to test and validate rules before deploying them
  • A set of private rules maintained by the Phish Report team

The structure of an IOK rule

An IOK rule looks like this:

title: Fake Chrome error page
description: |
    The Chrome error page HTML is built into the browser: you should never see it in the response from a
    website.
    This is a clear sign that the site is employing cloaking/anti-analysis techniques.
level: likely_malicious
references:
    - https://twitter.com/phish_report/status/1537825544343011328

detection:
    chromeHTMLFragments:
        html|contains|all:
            - '<body id="t" class="neterror" style="font-family: '
            - '<div id="main-frame-error" class="interstitial-wrapper" jstcache="0">'
    condition: chromeHTMLFragments

This rule can be broken down into two main sections: the title and description, and the detection logic.

Title and Description

A rule starts with some commentary to help you and your team understand what a rule detects. This commentary includes:

  • A short title (which will appear in places like the analysis page of a site that matches this rule)
  • A longer description where you can document, for example, what you know about the phishing kit this rule detects
  • A level denoting the severity of this rule. This value determines what actions Phish Report takes automatically when a rule matches a website.
  • A list of references where you can link to relevant scans of this phishing kit, your own internal documentation, or public analysis of this threat actor.

You can also include arbitrary additional data in this section. For example, if you wanted to document when you first observed a particular phishing kit, you could add your own field to do so:

first_seen: 2023-01-05

These custom fields will not be interpreted by Phish Report.

Detection logic

The detection section of an IOK rule consists of a set of named properties and a condition, which is a boolean expression over those named properties. How to write this logic is described below.

Writing detection logic

Single property rules

The simplest IOK rules are based on a single property of a phishing site. For example, the threat actor "Cazanova" names their session cookie after themselves, so any site with this cookie name is likely malicious.

To detect a Cazanova site, an IOK rule would match on the cookies field and look for values like cazanova=COOKIEVALUE. This can be done using a property like this:

cazanovaCookie:
    cookies|startswith: "cazanova="

Here we've chosen to name the property "cazanovaCookie" but this name is entirely arbitrary.

To complete the detection logic, we need to add a condition which tells Phish Report which combination of properties means this rule matches a given site. As we only have a single property (cazanovaCookie), this is trivial:

detection:
    cazanovaCookie:
        cookies|startswith: "cazanova="
    
    condition: cazanovaCookie

Filtering false positives: combining multiple properties

Many IOK rules can't be expressed as a single property. Often, you'll need to combine multiple properties to get a well-performing rule.

For example, suppose you want to detect sites that load assets (JavaScript, CSS, images, etc.) from your website, which is typical of lazily created phishing kits. We can express that as an IOK rule like this:

detection:
    hotlinkedAsset:
        requests|startswith: "https://mydomain.com/assets/"
        
    condition: hotlinkedAsset

However, this rule will likely also alert on other websites you host yourself, such as those on subdomains of your domain, because they load the same assets.

To address this, we want to filter out any subdomains of your real domain. This can be done by adding another property and extending our condition to exclude those results:

detection:
    hotlinkedAsset:
        requests|startswith: "https://mydomain.com/assets/"
    mySubDomain:
        hostname|endswith: ".mydomain.com"
        
    condition: hotlinkedAsset and not mySubDomain
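
Note that a hostname of exactly mydomain.com does not end with ".mydomain.com", so the rule above does not exclude the bare domain itself. One possible extension is sketched below. It assumes that a field written without a modifier must equal the given value exactly (as in Sigma); the property name myDomain is arbitrary and chosen only for this example.

```yaml
detection:
    hotlinkedAsset:
        requests|startswith: "https://mydomain.com/assets/"
    mySubDomain:
        hostname|endswith: ".mydomain.com"
    # Assumption: with no modifier, the field must equal the value exactly
    myDomain:
        hostname: "mydomain.com"

    # Match hotlinking sites, but not the real domain or any of its subdomains
    condition: hotlinkedAsset and not (mySubDomain or myDomain)
```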

Fields, Modifiers, and Conditions

Fields

Field name  Description
title       The title of the site as shown in a browser. If multiple titles are set (e.g. by JavaScript), this contains each one.
hostname    The hostname of the site
html        The contents of the page HTML (as returned by the server)
dom         The contents of the page HTML after loading (e.g. after JavaScript has executed)
js          Contents of JavaScript from the page (includes inline scripts as well as scripts loaded externally)
css         Contents of CSS from the page (includes inline stylesheets as well as externally loaded stylesheets)
cookies     Cookies from the page. Each is in the form cookieName=value
headers     Headers sent by the server. Each is in the form Header-Name: value
requests    URLs of requests made by the page (and assets loaded by the page)
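
To illustrate the cookieName=value and Header-Name: value forms, here is a minimal sketch of a detection section that uses both fields. The property names, the PHPSESSID cookie, and the nginx Server header are made up for the example and are not taken from a real rule. Whether header names are compared case-sensitively is not stated here, so treat the exact spelling as an assumption. Modifiers such as startswith are described in the next section.

```yaml
detection:
    # cookies entries look like "cookieName=value", so match on the name plus "="
    sessionCookie:
        cookies|startswith: "PHPSESSID="
    # headers entries look like "Header-Name: value"
    nginxServer:
        headers|startswith: "Server: nginx"

    condition: sessionCookie and nginxServer
```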

Modifiers

It's rare you'll want to match the exact contents of a field (as these often contain details specific to the individual site like the hostname). Instead, you can use modifiers to match on different parts of the field:

  • fieldname|startswith: value: the field starts with the given value
  • fieldname|endswith: value: the field ends with the given value
  • fieldname|contains: value: the field contains the given value
  • fieldname|re: regexp: the field matches the given regular expression
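
The four modifiers above could be used as in the following sketch. The property names and the matched values are hypothetical, and the properties are combined with 1 of them purely to keep the fragment complete. Since the exact regular expression flavour and anchoring behaviour are not specified here, the pattern is written to cover the whole hostname so that it behaves the same either way.

```yaml
detection:
    # startswith: the page title begins with "Sign in"
    titlePrefix:
        title|startswith: "Sign in"
    # endswith: the hostname ends with ".example.com"
    hostSuffix:
        hostname|endswith: ".example.com"
    # contains: the served HTML includes a password input somewhere
    passwordInput:
        html|contains: 'type="password"'
    # re: the hostname is "login-" followed by digits, a dot, and anything else
    numberedLoginHost:
        hostname|re: '^login-[0-9]+\..+$'

    condition: 1 of them
```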

Additionally, if multiple values are given, the special |all modifier can be used to specify that all values must match the field (rather than the default behaviour where any value is expected to match). For example:

fieldname|contains|all:
    - foo
    - bar

This is true for foobar, but not for foobaz (which contains foo but not bar).
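
For comparison, the sketch below places the default "any" behaviour next to the |all behaviour, using the html field and arbitrary property names chosen for this example.

```yaml
detection:
    # Default: true if the HTML contains ANY of the values
    # (a page containing only "foobaz" matches, because it contains "foo")
    anyValue:
        html|contains:
            - foo
            - bar
    # With |all: true only if the HTML contains EVERY value
    # (a page containing only "foobaz" does not match, because "bar" is missing)
    allValues:
        html|contains|all:
            - foo
            - bar

    condition: allValues
```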

Conditions

Within a condition you can use the following keywords to combine your properties into a boolean expression:

  • A and B: true if both A and B are true
  • A or B: true if either A or B is true
  • not A: true if A is false
  • 1 of them: true if any property is true
  • 1 of {glob}: true if any property matching the glob pattern is true (e.g. prop2 matches the pattern prop*)
  • all of them: true if all properties are true
  • all of {glob}: true if all properties matching the glob pattern are true (e.g. prop2 matches the pattern prop*)

Brackets can be used to nest expressions arbitrarily, for example (A or B) and (C or D).
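
As a rough illustration of how these keywords combine, the following sketch uses hypothetical property names and values. The glob brand* selects the properties brandLogo and brandTitle, and the brackets make the grouping explicit rather than relying on operator precedence.

```yaml
detection:
    brandLogo:
        html|contains: "brand-logo.png"
    brandTitle:
        title|contains: "My Brand"
    loginForm:
        html|contains: 'type="password"'
    ownSite:
        hostname|endswith: ".mydomain.com"

    # True if at least one brand* property is true, the page has a login form,
    # and the site is not one of our own subdomains
    condition: (1 of brand*) and loginForm and not ownSite
```

Replacing 1 of brand* with all of brand* would require both brandLogo and brandTitle to be true.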