Cover image "Don’t Be an Easy Target" by Abraham Pena is licensed under CC BY 4.0

How to harden your login page against cloning

By Bradley Kemp

Cloning a login page to build a phishing site takes only a few seconds. There's a good chance the browser you're using right now has this functionality built in:

File -> Save page as -> Web page, single file

That's all it takes to save a perfect replica of your login form as a single HTML file. From there, the phisher just needs to add a way to save the entered credentials, and they've got a simple phishing kit ready to deploy.

Figure: A flow chart showing first a real bank website, then a clone of the bank's website, and finally a tweaked clone of the bank's website with a form asking for personal details.

Can you prevent your website being cloned?

Unfortunately, no. The HTML, CSS, and JavaScript that make up your website inherently need to be available in order for browsers to display your website to users. But this means your content can also be copied and abused by someone else. There's no way to prevent your website being cloned, so instead you need to focus on detecting when this has happened.

Anti-cloning techniques

While you can't prevent your website being cloned, there are a few techniques that can help you detect when a phisher sets up a clone of your website.

In essence, these techniques work because the cloning process phishers use is too good: it replicates not just the look of the page they want to copy, but also any hidden traps you've added.

Install beacon assets

What if a phishing site directly sent you an alert when it was set up? That's what beacon assets (sometimes called honeytokens or "trust badges") give you.

The idea is to include assets in your pages that you know should only ever be loaded by pages on your domain name. Then, when a clone of your website is hosted on a different domain, if these assets are still loaded you'll be alerted to this by the request's Referer header: if your beacon is installed on login.brand.com and you start seeing requests from brand-login.com you've found a clone!

If the phisher doesn't remove the beacon when cloning your site, it'll ping back to your alerting endpoint whenever the phishing site is visited
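On the receiving end, the check comes down to comparing the hostname in the Referer header against the hostnames you expect. Here's a minimal sketch of that comparison, not a finished implementation: the function name classifyReferer and the LEGITIMATE_HOSTS allowlist are made up for illustration, and login.brand.com is just the example domain used above. Note that browsers can omit the Referer header entirely, so a missing value is treated as inconclusive rather than as a clone.

```javascript
// Hostnames allowed to load the beacon. "login.brand.com" is the example
// domain from this article; replace it with your own.
const LEGITIMATE_HOSTS = new Set(["login.brand.com"]);

// Classify a beacon request by its Referer header value.
// Returns "legitimate", "suspected-clone", or "unknown".
function classifyReferer(refererHeader) {
  if (!refererHeader) {
    // The header can be absent (for example due to a referrer policy),
    // so a missing value proves nothing either way.
    return "unknown";
  }
  let hostname;
  try {
    hostname = new URL(refererHeader).hostname;
  } catch {
    return "unknown"; // Malformed header value.
  }
  return LEGITIMATE_HOSTS.has(hostname) ? "legitimate" : "suspected-clone";
}

console.log(classifyReferer("https://login.brand.com/signin")); // legitimate
console.log(classifyReferer("https://brand-login.com/signin")); // suspected-clone
```

Matching on the exact hostname, rather than checking whether the Referer merely contains "brand.com", avoids being fooled by lookalikes such as brand.com.evil.example.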

JavaScript beacons

JavaScript beacons are the simplest to set up as you can filter out expected domain names within the beacon itself.

All you need is an endpoint which will raise an alert, and a small snippet to make a request to that endpoint:

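// Only report when the page is served from an unexpected hostname.
if (location.hostname !== "login.brand.com") {
    var l = location.href;       // URL of the suspected clone
    var r = document.referrer;   // how the visitor arrived there
    var m = new Image();         // loading an image triggers a GET request
    // encodeURIComponent escapes "&", "?" and "=" so each value survives
    // intact as a query parameter (encodeURI leaves them unescaped).
    m.src = "https://brand.com/your-alerting-endpoint?l=" +
            encodeURIComponent(l) + "&r=" + encodeURIComponent(r);
}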

This snippet can easily be hidden among your website's normal JavaScript, where it'll be hard for a phisher to find and remove. Any time it runs on a domain which isn't your real login page, it'll send you an alert containing:

  • The current page (i.e. the URL of the phishing site)
  • The referrer (i.e. how the victim got to that page)
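To turn those two values into an alert, your endpoint needs to read them back out of the query string. The sketch below shows one way to do that, assuming the l and r parameter names from the snippet above; the function name parseBeaconReport and the example URLs are hypothetical, and how you actually raise the alert (email, chat message, ticket) is up to you.

```javascript
// Extract the two values the beacon snippet sends: `l` (the page the
// snippet ran on) and `r` (that page's referrer).
function parseBeaconReport(requestUrl) {
  // The base is only needed because servers usually see a relative URL.
  const url = new URL(requestUrl, "https://brand.com");
  const pageUrl = url.searchParams.get("l") || "";
  const referrer = url.searchParams.get("r") || "";
  let phishingHost = null;
  try {
    phishingHost = new URL(pageUrl).hostname;
  } catch {
    // Leave as null when `l` is missing or not a valid URL.
  }
  return { phishingHost, pageUrl, referrer };
}

// Example request, built the way a browser on a cloned page might send it.
const report = parseBeaconReport(
  "/your-alerting-endpoint?l=" +
    encodeURIComponent("https://brand-login.com/signin?next=/account") +
    "&r=" +
    encodeURIComponent("https://mail.example/inbox")
);
console.log(report.phishingHost); // brand-login.com
```

Treat everything in the report as untrusted input: anyone who finds the endpoint can send it arbitrary values.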

You can use a service like Thinkst's Canarytokens to handle the alerting for you, but the presence of the canarytokens.com domain in your code makes it fairly obvious that this is a security feature and not just a normal part of your website. For best results, use your own infrastructure so the beacon blends in more seamlessly.

CSS URL beacons

Depending on how your website is made, a phisher might be able to simply delete all the JavaScript (including your beacon) and still have a genuine-looking phishing page. So it's important to have a couple of different types of beacon to maximise the chance that one will make it into the phishing kit.

Although website cloning tools try to output a completely standalone copy of a website, there's usually a limit to how far in a chain they'll explore. When your website directly includes an image (like your logo: <img src="/logo.png">) the cloning tool will download that image and embed it in the page.

But many tools fail to embed more deeply nested assets, like those requested via CSS:

While cloning tools reliably handle one level of indirection, many fail to properly clone subsequent levels of indirection.

Using this fact, you can create a double-indirected asset by including this snippet in your CSS file:

.some-innocuous-element {
    background-image: url("https://brand.com/your-alerts-endpoint.png")
}

While this snippet itself will get cloned, many tools don't go deeper and clone the image URL referenced by the CSS.

Unlike the JavaScript beacon, this background image will get loaded even on your real login page (so make sure it's either an invisible image or something that won't disrupt the page) but you can tell via the Referer header whether it's being loaded from your legitimate domain or from a clone.
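Because the image is also requested by your genuine page, the endpoint has to return something harmless every time. One possible shape for that response is sketched below, assuming a Node.js server: it returns a commonly used 1x1 transparent GIF and asks the browser not to cache it, so repeat visits to a clone keep reaching your endpoint. The function name buildBeaconResponse is made up for this example.

```javascript
// A commonly used 1x1 transparent GIF, stored as base64.
const TRANSPARENT_GIF = Buffer.from(
  "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7",
  "base64"
);

// Build the response for the beacon URL. The same response is returned
// whether the request came from the real site or from a clone.
function buildBeaconResponse() {
  return {
    status: 200,
    headers: {
      "Content-Type": "image/gif",
      "Content-Length": String(TRANSPARENT_GIF.length),
      // Ask the browser not to store the image, so later page loads
      // request it again instead of reusing a cached copy.
      "Cache-Control": "no-store",
    },
    body: TRANSPARENT_GIF,
  };
}

const response = buildBeaconResponse();
console.log(response.headers["Content-Type"], response.body.length);
```

With Node's built-in http module, you'd send this with res.writeHead(response.status, response.headers) followed by res.end(response.body), after logging the request's Referer header.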

Embed high-entropy strings

Eventually a clever (or lucky) enough phisher will create a clone of your website that avoids triggering any of your beacon assets. For these sites, while you won't be directly alerted to them, you can still make it very easy to detect these clones.

It's very simple to create a string which appears nowhere else on the internet. A randomly generated UUID like F6CAD066-82B8-4A52-ACD4-586251506962 contains 122 bits of entropy, meaning there are 2^122 possible values. With that many possible UUIDs, there's effectively zero chance that someone else will randomly generate the same UUID as you.

Therefore, if you include a random UUID on your login page and you see that UUID appear on another website, it's overwhelmingly likely that it's because that site is a clone of yours.
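As a rough illustration, here's how you might generate such a marker and later check whether a page you've fetched from elsewhere contains it. This is a sketch assuming Node.js; the containsMarker helper is hypothetical, and where you embed the marker in your page (an HTML comment, a data attribute, a CSS class name) is your choice.

```javascript
const { randomUUID } = require("crypto");

// Generate once, embed the value in your login page, and keep a record of it.
const marker = randomUUID();
console.log(marker);

// A version 4 UUID has 122 random bits, so the number of possible values is:
const possibleValues = 2n ** 122n;
console.log(possibleValues.toString()); // 5316911983139663491615228241121378304

// Check whether HTML fetched from some other site contains your marker.
// The comparison ignores case because UUIDs are written both ways.
function containsMarker(html, markerValue) {
  return html.toLowerCase().includes(markerValue.toLowerCase());
}

const suspectHtml = "<form><!-- F6CAD066-82B8-4A52-ACD4-586251506962 --></form>";
console.log(
  containsMarker(suspectHtml, "f6cad066-82b8-4a52-acd4-586251506962")
); // true
```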

Depending on the technologies you use on your website, you may already be including high-entropy strings in your pages. See our guide on detecting phishing sites using high-entropy strings for a full list of sources.

Ensure you include non-generic text

Even if a phisher diligently removes all your beacons and any high-entropy strings you've included on your website, there are still parts of your site they're likely to leave alone. Text like:

  • Your page title
  • Help text on your login form
  • Footer text like "© Brand Ltd 2023" or your company's legal information

These are all important elements that make your login form seem legitimate so a phisher will leave them alone in order to make their clone more believable.

If you have a custom page title like "Login | Brand", you can detect sites with that title which aren't on your domain. In IOK, that rule would be written like:

detection:
  loginTitle:
    title: "Login | Brand"
  realHostname:
    hostname: "login.brand.com"

  condition: loginTitle and not realHostname
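If it helps to see the logic spelled out, the rule's condition boils down to a two-part check. The function below is only an illustration of that logic, not how IOK itself evaluates rules: the name matchesCloneRule and the shape of the page object are assumptions, and it assumes both fields are compared for an exact match.

```javascript
// Mirrors the rule's condition: `loginTitle and not realHostname`.
// `page` is a hypothetical object describing a scanned page.
function matchesCloneRule(page) {
  const loginTitle = page.title === "Login | Brand";
  const realHostname = page.hostname === "login.brand.com";
  return loginTitle && !realHostname;
}

// Same title on a different hostname: flagged as a likely clone.
console.log(
  matchesCloneRule({ title: "Login | Brand", hostname: "brand-login.com" })
); // true

// The genuine login page: not flagged.
console.log(
  matchesCloneRule({ title: "Login | Brand", hostname: "login.brand.com" })
); // false
```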

What to do when anti-cloning techniques stop working

These anti-cloning techniques work very well against lower-skilled phishers, but eventually someone will come along and create a phishing kit which doesn't include any easily identifiable fingerprint.

At this point you might need to switch focus towards:

Ultimately your goal is to make that specific phisher give up and move on to another target.
