(Edited on 15 Sep 2016 – new info at the end)
There are a number of ways to find phishing websites. The best one that I’ve found is detailed below.
Very often, phishing websites have a short life span. Therefore, it’s best to catch them as they’re just being brought to life. I’ve developed a fairly easy way to discover these sites. Admittedly, there’s a bit of manual drudgery in this method, but once you’ve been doing it for a little while, it goes pretty quickly.
There are a number of places where you can find lists of newly registered domains. Domain Punch, link here, is one of the better known. The nice thing about this site is that you can do a lot of filtering. The box on the right-hand side of the screen lets you search for specific phrases in newly registered domain names.
On the left side of the screen, you can then further narrow your search down by zone, or TLD. This is nice and you can find a lot of interesting domains using this interactive method. It does, however, take a bit more time than the method that I’m using.
There’s another website called Whoxy that allows you to download a zip file of all of the domains registered on a given day.
They have 3 days available. Inside of the zip file, there’s a single text file.
So, my process begins by downloading the zip file and extracting the text file. Open the text file in your favorite text editor – mine is Leafpad. Next, open a blank text file. At this point, I have two instances of Leafpad open – on the right is the text file of newly registered domains and on the left is a blank text file.
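For anyone who would rather script the extract step than open the file by hand, here is a minimal sketch using only the Python standard library. It assumes the downloaded archive holds one or more plain-text files with one domain per line; the function name is my own, not anything provided by Whoxy.

```python
import io
import zipfile


def read_domains_from_zip(zip_source):
    """Return the domains listed in the .txt file(s) inside a zip archive.

    zip_source may be a path or a binary file-like object.
    Assumes one domain per line; blank lines are skipped.
    """
    domains = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".txt"):
                continue
            with archive.open(name) as handle:
                # Decode leniently: a stray bad byte should not abort the run.
                text = io.TextIOWrapper(handle, encoding="utf-8", errors="replace")
                for raw_line in text:
                    domain = raw_line.strip().lower()
                    if domain:
                        domains.append(domain)
    return domains
```

Calling `read_domains_from_zip("newly-registered.zip")` (a placeholder file name) would give a list that can be searched directly, without the editor.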
Most frequently, phishers are looking to spoof known brands (Apple, Amazon, PayPal) so that’s what we want to try to find. Call up the Search feature in your text editor and you’re ready to go. First, I’ll search for something like “paypal”. As I go through the search results, I’ll copy any new domains I find that look promising and paste them into the blank text file. I’ll keep hitting F3 to continue through the text file of new domains until I reach the end then go back to the top and start again on the next search term. There are many things you could search for but here is a list of what I’ve found to be some of the most promising search terms:
You could find many more if you had the time – just think of any popular brand and keep searching. Using only this handful of search terms usually gives me between 75 and 200 suspicious domains per day. Once I’ve completed my searching, I move to what was the blank Leafpad instance – which now contains all of the domains to be investigated – and select all of them and copy them into the paste buffer. Next I open a spreadsheet that I maintain to give myself a more permanent record and also to track some information about the domains that I find. Paste the newly found domains into the domains column of the spreadsheet and save it. Now, you can close both of your Leafpad instances.
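The search-and-copy loop described above can be sketched in a few lines of Python. This is an illustration, not a finished tool: the search terms are just the three brands named earlier in this post, standing in for a fuller list, and the function names are hypothetical. Plain substring matching also produces false positives (a domain containing "pineapple" matches "apple"), so the results still need a manual look.

```python
import csv

# Placeholder terms: only the brands mentioned in the post, not a full list.
SEARCH_TERMS = ["paypal", "apple", "amazon"]


def find_suspicious(domains, terms=SEARCH_TERMS):
    """Return (domain, matched_term) pairs, keeping the input order.

    Each domain is reported once, with the first term that matched.
    """
    hits = []
    for domain in domains:
        lowered = domain.lower()
        for term in terms:
            if term in lowered:
                hits.append((domain, term))
                break
    return hits


def write_hits(hits, out_file):
    """Write the hits as CSV so they can be opened in a spreadsheet."""
    writer = csv.writer(out_file)
    writer.writerow(["domain", "matched_term"])
    writer.writerows(hits)
```

Opening the output with `open("suspicious.csv", "w", newline="")` and passing it to `write_hits` produces a file that any spreadsheet program should be able to import.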
The next step is still tedious but it’s more interesting. I have my spreadsheet open on the right side of the screen and a web browser open on the left side. Now, go through each new domain in the spreadsheet and copy it from the spreadsheet then paste it into your browser and see where it leads.
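A first pass over the list can also be scripted, to see which domains answer at all before opening them in a browser. The sketch below uses `urllib` from the standard library; the function name and the `opener` parameter are my own additions, and it assumes plain HTTP on the bare domain. It only fetches the page and does not render it, so it is no substitute for actually looking at what is there.

```python
import urllib.error
import urllib.request


def probe(domain, opener=urllib.request.urlopen, timeout=10):
    """Request http://<domain>/ and summarise the response.

    Returns a dict with the HTTP status and the URL reached after any
    redirects, or status None plus an error message if nothing answered.
    """
    url = "http://" + domain + "/"
    try:
        with opener(url, timeout=timeout) as response:
            return {
                "domain": domain,
                "status": response.status,
                "final_url": response.geturl(),
            }
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        return {"domain": domain, "status": err.code, "final_url": url}
    except OSError as err:
        # Covers DNS failures, refused connections and timeouts.
        return {"domain": domain, "status": None, "error": str(err)}
```

The `opener` argument exists so the function can be exercised without touching the network; in normal use it can be left at its default.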
The spreadsheet is really a work in progress. I track a few things but I primarily just wanted a place that I could refer to that lists the suspicious domains that I find along with the hosting company and what I find when I go to the page.
The image above is a perfect example of what I find. It’s not uncommon to find a dozen or more domains that are masquerading as login pages. Sometimes, I find a directory listing that contains the tools the scammer is using to set up a phishing site (in this case, we’re early to the party).
As I go through the spreadsheet, I mark in bold the domains that are obvious (to me, anyway) scams. Next, I go to a site like Hosting Detector and look up who is hosting the phishing site. Here’s an example of another site from the spreadsheet image (above):
After I’ve finished going through the day’s list, I’ll report my findings to the hosting companies that are giving shelter to these sites. When the hosting information isn’t available, it may still be possible to do a WHOIS lookup and send the information to the registrar instead. Unfortunately, sometimes you just don’t have enough information to do anything.
Most hosting companies and registrars are pretty good about investigating and taking down any domains that are obviously fraudulent. At the same time, if these domains are so easy for the public to find, one has to wonder why domain registrars and hosting companies don’t find them on their own. I mean, can there be any doubt about what will be hosted on a domain named www-securepaypal.com?!
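Hosting lookups generally start from the IP address a domain resolves to, so that part is easy to script. Here is a small sketch using the standard `socket` module; the function names and the injectable `resolver` parameter are illustrative. It does not identify the hosting company by itself: the resulting IP still has to go through a hosting lookup site or a WHOIS query.

```python
import socket


def resolve_ip(domain, resolver=socket.gethostbyname):
    """Return the IPv4 address for a domain, or None if it does not resolve."""
    try:
        return resolver(domain)
    except OSError:
        # socket.gaierror (name not found) is a subclass of OSError.
        return None


def resolve_all(domains, resolver=socket.gethostbyname):
    """Map each domain to its IP address (or None), keeping input order."""
    return {domain: resolve_ip(domain, resolver) for domain in domains}
```

Domains that come back as `None` are the ones for which a WHOIS lookup and a report to the registrar may be the only remaining option.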
While this process works, it’s very tedious and time-consuming. I hope to one day (soon?) automate this process to a large degree. I could envision a full application that would do the collection and parsing of the newly registered sites and then allow you to go through and check them. You could select certain flags for each domain based on your findings and the data would be saved to a database for later use or for statistical purposes. It might even be possible to generate and send emails to the appropriate hosting companies.
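To make the database idea a little more concrete, here is one possible shape for it, sketched with Python’s built-in `sqlite3` module. Everything in it (the table, the column names, the two flags) is an assumption made for illustration rather than a design anyone has settled on.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS domains (
    name       TEXT PRIMARY KEY,
    first_seen TEXT NOT NULL,
    host       TEXT,
    notes      TEXT,
    is_phish   INTEGER NOT NULL DEFAULT 0,
    reported   INTEGER NOT NULL DEFAULT 0
)
"""

EDITABLE = {"host", "notes", "is_phish", "reported"}


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_domain(conn, name, first_seen):
    """Record a newly found domain; repeats of the same name are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO domains (name, first_seen) VALUES (?, ?)",
        (name, first_seen),
    )
    conn.commit()


def flag(conn, name, **fields):
    """Update findings for a domain, e.g. flag(conn, d, is_phish=1)."""
    unknown = set(fields) - EDITABLE
    if unknown:
        raise ValueError("unknown columns: " + ", ".join(sorted(unknown)))
    if not fields:
        return
    # Column names come from the fixed EDITABLE set, values are bound.
    assignments = ", ".join(col + " = ?" for col in fields)
    conn.execute(
        "UPDATE domains SET " + assignments + " WHERE name = ?",
        (*fields.values(), name),
    )
    conn.commit()


def unreported_phish(conn):
    """Domains marked as phishing that have not been reported yet."""
    query = (
        "SELECT name, host FROM domains "
        "WHERE is_phish = 1 AND reported = 0 ORDER BY name"
    )
    return conn.execute(query).fetchall()
```

A report generator could then work from `unreported_phish`, grouping the rows by host and setting `reported=1` once a message has gone out.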
Let me know if you have any other ideas about finding this type of domain (or automating the process).
UPDATE – 15 Sep 2016
Shortly after I posted this blog entry, I received a message on Twitter from Brandon M (his Twitter handle is @0x4445565a). We briefly discussed automating this process. Remarkably, within a few hours, I heard from him again: he had a working tool called sushiphish! A few tweaks later and I was off to the races.
You can find a link to his blog entry with a description of his tool here. I’ve been using it ever since and I can say that it has dramatically cut down the time required to complete this whole process. There is a little glitchiness with the OUT.CSV file but that can be easily fixed as you’re importing (opening) the file in the spreadsheet software of your choice. Simply flag all columns to not be imported except the domain and the IP address. After that – the import works perfectly. Thanks again Brandon!
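The import workaround could also be scripted by trimming the file down to two columns before opening it. The sketch below is generic CSV handling and makes no claim about how sushiphish lays out OUT.CSV: the default column positions (0 for the domain, 1 for the IP address) are an unverified assumption and should be changed to match the real file.

```python
import csv


def keep_columns(in_file, out_file, keep=(0, 1)):
    """Copy only the columns at the given positions from one CSV to another.

    keep=(0, 1) is a guess at where the domain and IP address sit; adjust
    it for the actual file. Short rows are padded with empty strings and
    blank lines are dropped.
    """
    reader = csv.reader(in_file)
    writer = csv.writer(out_file)
    for row in reader:
        if not row:
            continue
        writer.writerow([row[i] if i < len(row) else "" for i in keep])
```

Both files should be opened with `newline=""`, as the `csv` module documentation recommends.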