Cover image "Cyber Specialists" by Khahn Tran is licensed under CC BY 4.0

Why it's hard to identify who hosts a website

Bradley Kemp

Who hosts that website? Seems a simple question, but it's very hard to consistently get the right answer.

Taking down malicious websites is already a slow process, and if you misidentify who hosts a website, you add even more delay.

A naive process for finding out who hosts a website is this:

  1. Do a WHOIS lookup on the domain, which will tell you who the registrar is (i.e. who the domain name was purchased from).
  2. Do a DNS lookup to find which IP addresses the domain name resolves to.
  3. Do a WHOIS lookup on those IP addresses, which will tell you which ASN (Autonomous System Number, essentially the network) they belong to and, hopefully, an abuse contact.

This process used to work very well but, with the advent of the cloud, it increasingly gives wrong answers.
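To make the three naive steps concrete, here is a minimal Python sketch using only the standard library. It is an illustration under assumptions, not the tooling Phish Report uses: the WHOIS servers are supplied by the caller, and the field names it searches for (`Registrar`, `OriginAS`, `origin`, `OrgAbuseEmail`, `abuse-mailbox`) are examples that appear in some WHOIS formats but vary between registries.

```python
import socket
from typing import List, Optional, Tuple


def whois_query(server: str, query: str, timeout: float = 10.0) -> str:
    """Send one WHOIS query: a single line over TCP port 43, read until close."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(query.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def resolve_ips(domain: str) -> List[str]:
    """Step 2: resolve the domain to IP addresses with the system resolver."""
    infos = socket.getaddrinfo(domain, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def find_field(whois_text: str, names: Tuple[str, ...]) -> Optional[str]:
    """Return the first non-empty value of any 'Name: value' line.

    Field names differ between WHOIS servers, so `names` lists alternatives.
    """
    wanted = {name.lower() for name in names}
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in wanted and value.strip():
            return value.strip()
    return None


def naive_lookup(domain: str, domain_whois_server: str, ip_whois_server: str) -> dict:
    """Run steps 1 to 3. Both server names are caller-supplied assumptions."""
    # Step 1: domain WHOIS -> registrar.
    registrar = find_field(whois_query(domain_whois_server, domain), ("Registrar",))
    networks = []
    for ip in resolve_ips(domain):
        # Step 3: IP WHOIS -> ASN and abuse contact, if the format exposes them.
        text = whois_query(ip_whois_server, ip)
        networks.append({
            "ip": ip,
            "asn": find_field(text, ("OriginAS", "origin")),
            "abuse": find_field(text, ("OrgAbuseEmail", "abuse-mailbox")),
        })
    return {"registrar": registrar, "networks": networks}
```

A call might look like `naive_lookup("example.com", "whois.verisign-grs.com", "whois.arin.net")`, where both servers are only examples: the right WHOIS server depends on the TLD and on which regional registry allocated the IP address.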

The trouble with cloud IPs

If you look up a domain name and find it's hosted on an AWS IP address, what does that mean? It's pretty unlikely the phisher has an AWS account: most phishing is carried out by very low-skilled actors who aren't going to be using a cloud provider directly. Instead, it's overwhelmingly likely that the phisher is a customer of a website hosting service which itself runs on the cloud provider's infrastructure.

While the phishing site is ultimately hosted with a cloud provider, it's the hosting service you should report abuse to.

If you report the malicious website to AWS (or another cloud provider), they're unlikely to take any action themselves. Instead, they'll just forward the abuse report to their customer, the website hosting company. But while that company likely has a standard abuse reporting process, the forwarded report will end up with its engineering team, delaying resolution.

For the fastest results, you need to get your abuse report to the right place with minimal forwarding, and that means identifying the website hosting company itself.

To do so, you need to look at more nuanced parts of the domain name -> IP address process:

  • What are the nameservers for the domain? A common part of the setup process for hosting providers is to configure your domain with the company's nameservers.
  • Is there a CNAME record? If you're not handing complete control of a domain over to a hosting provider, they'll get you to add a CNAME record instead.
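As a rough illustration of those two checks, the Python sketch below matches nameserver and CNAME hostnames against a table of known provider suffixes. It assumes the NS and CNAME records have already been fetched with whatever DNS client you prefer, and the `PROVIDER_SUFFIXES` table is a tiny hypothetical sample, not Phish Report's actual data.

```python
from typing import Dict, Iterable, List, Optional

# Illustrative only: a real table would need far more entries and upkeep.
PROVIDER_SUFFIXES: Dict[str, str] = {
    "ns.cloudflare.com": "Cloudflare",
    "pantheonsite.io": "Pantheon",
    "github.io": "GitHub Pages",
    "netlify.app": "Netlify",
}


def normalise(hostname: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot."""
    return hostname.strip().rstrip(".").lower()


def match_provider(hostname: str) -> Optional[str]:
    """Return the provider whose suffix matches on a label boundary, if any."""
    host = normalise(hostname)
    for suffix, provider in PROVIDER_SUFFIXES.items():
        if host == suffix or host.endswith("." + suffix):
            return provider
    return None


def providers_from_dns(
    nameservers: Iterable[str], cname_target: Optional[str] = None
) -> List[str]:
    """Collect providers hinted at by the CNAME target and the nameservers."""
    candidates = ([cname_target] if cname_target else []) + list(nameservers)
    found: List[str] = []
    for host in candidates:
        provider = match_provider(host)
        if provider and provider not in found:
            found.append(provider)
    return found
```

Matching on a label boundary (an exact match or a `"." + suffix` ending) avoids treating an unrelated name such as `notgithub.io` as a hit for `github.io`.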

No standardisation of abuse contact information

Even when the naive method is correct, it's a painfully manual process. WHOIS server responses aren't standardised, so you either need to read each response yourself or write a parser for each WHOIS format.

RDAP (Registration Data Access Protocol) is an attempt to replace WHOIS, and it's a big improvement: responses are in a standardised JSON format which makes machine parsing easy.
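To show why the JSON format helps, here is a small Python sketch that pulls abuse email addresses out of an RDAP response. It assumes the response has already been fetched and decoded with `json.loads`, and that contacts follow the usual RDAP layout: `entities` with a `roles` list and a jCard `vcardArray`, possibly nested inside other entities. The function name `abuse_emails` is made up for this example.

```python
from typing import Any, Dict, List


def abuse_emails(rdap_object: Dict[str, Any]) -> List[str]:
    """Return email addresses of entities with the 'abuse' role.

    Entities can be nested (for example an abuse contact inside a
    registrar entity), so the search recurses into each entity.
    """
    emails: List[str] = []
    for entity in rdap_object.get("entities", []):
        if "abuse" in entity.get("roles", []):
            vcard = entity.get("vcardArray", [])
            properties = vcard[1] if len(vcard) > 1 else []
            for prop in properties:
                # Each jCard property is [name, parameters, type, value].
                if len(prop) >= 4 and prop[0] == "email":
                    emails.append(prop[3])
        emails.extend(abuse_emails(entity))
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(emails))
```

The same function works for domain and IP network responses, since both carry contacts in `entities`.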

But RDAP adoption is slow. While RDAP support is mandatory for all generic TLDs (like .dev, .site, and the notorious .zip), it's implemented by fewer than a quarter of country-code TLDs.

So falling back to WHOIS is going to be a necessity for the foreseeable future.

How Phish Report attempts to solve this

Accurate abuse contact identification is a vital part of Phish Report. We don't have any magic extra techniques for identifying hosting providers, but we have the advantage of scale.

As an individual, it's not worth implementing all the niche ways of identifying hosting providers. But we analyse thousands of phishing sites every day, so even methods that are only useful one time in a thousand are worth implementing for us.

When we do get something wrong, it only takes one of our users to tell us for it to be swiftly fixed for everyone.

Free hosting provider identification tool

You can use Phish Report's free hosting provider identification tool in three ways.

Our online lookup tool

For quick, ad-hoc lookups you can use our online hosting provider identification tool.

Integrating with our API

For automated lookups, you can use our hosting provider identification API. This gives you a list of all the providers involved in hosting a website, as well as the best way to report abuse to each of them (whether that's an email address or an online form).

With the API you can get all the hosting providers of a site with a single request:

$ curl 'https://phish.report/api/v0/hosting?url=phishy.pantheonsite.io'
[
  {
    "name": "Pantheon",
    "role": "platform",
    "report_uri": "mailto:abuse@pantheon.io"
  }
]
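The same request can be made from a script. The Python sketch below is one possible client, assuming the endpoint and response fields behave exactly as in the sample above; authentication, rate limits, and error responses are not described here, so it does not handle them. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List, Tuple

API_ENDPOINT = "https://phish.report/api/v0/hosting"


def build_request_url(site: str) -> str:
    """Build the lookup URL, percent-encoding the site being queried."""
    return API_ENDPOINT + "?" + urllib.parse.urlencode({"url": site})


def parse_providers(body: str) -> List[Dict[str, str]]:
    """Keep the three fields shown in the sample response."""
    return [
        {key: provider.get(key) for key in ("name", "role", "report_uri")}
        for provider in json.loads(body)
    ]


def report_channel(report_uri: str) -> Tuple[str, str]:
    """Classify a report_uri as an email address or an online form."""
    parts = urllib.parse.urlsplit(report_uri)
    if parts.scheme == "mailto":
        return ("email", parts.path)
    return ("form", report_uri)


def hosting_providers(site: str, timeout: float = 10.0) -> List[Dict[str, str]]:
    """Query the API over the network and return the parsed provider list."""
    with urllib.request.urlopen(build_request_url(site), timeout=timeout) as resp:
        return parse_providers(resp.read().decode("utf-8"))
```

Splitting out `report_channel` lets an automated pipeline decide whether it can send an email itself or needs to hand a form URL to a person.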

Using a Cortex analyser

We also have a Cortex analyser and responder.

Once you've installed the analyser, you'll be able to analyse any URL, domain, or IP observable and see who hosts it.
