Effective Subdomain Crawling using Python

Azhar Ghafoor
Aug 24, 2022 · 3 min read

Crawling to discover subdomains of any domain

If you want to test a website’s security and hack into it, you must be able to examine the whole web application, along with all of the features and technologies it exposes, in order to find a flaw that lets you into the system. A vulnerability in something as simple as a search field or a forgotten page can be enough to compromise the whole system. It is therefore essential to gather as much information about the website as possible: its directories, files, subdomains, and so on.

What exactly do we mean by a subdomain?

Subdomains are subsets of the main domain. For instance, “Google.com” is the main domain, and a variety of subdomains give access to other Google services, such as Google Drive, Gmail, Google Meet, Google Maps, etc. We can visit “mail.google.com” to access Google’s email web application (Gmail), “plus.google.com” to access Google’s social network, and so on.

So, if you want to hack into any of these websites, you must first test all of their subdomains, which means you first have to locate them. This is essential because subdomains are often excellent places to find vulnerabilities: they are usually not designed and hardened as carefully as the main website.

Domain Crawler

So, now we will write our own tool to discover the subdomains of a target website. Testing them manually is a very time-consuming task, so we need an automated way of doing this tiresome work for us. We will use the Python programming language to craft a “subdomain crawler” that tries different combinations from a list of subdomain names. The whole process is shown in the figure below.

Flow of Instructions

To determine whether a subdomain exists, we need a way of communicating with the website, much like entering the subdomain’s URL into a browser and checking whether it responds. Since we want to accomplish this with a Python script, we need a way to make web requests automatically. This program will use the “requests” library, which can be installed with the command pip install requests. At runtime, distinct subdomains will be built from different words in a word list, and their responses will be fetched with the requests library.
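As a quick, illustrative sketch (the URL here is only an example, not taken from the script), a single such check with the requests library looks roughly like this:

import requests

# Example only: check whether one subdomain URL answers with an HTTP response
test_url = "http://mail.google.com"

try:
    response = requests.get(test_url, timeout=5)
    print(test_url, "->", response.status_code)
except requests.exceptions.ConnectionError:
    print(test_url, "-> no response, the subdomain probably does not exist")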

Building each candidate URL looks like this:

word = "apt"
domain = "google.com"
link_formation = word + "." + domain
final_link = "http://" + link_formation

So, the final link that is passed to the requests library to get the response looks like this: “http://apt.google.com”.

Steps to Follow

You can get the code from the GitHub repository or the LinkedIn post. These are the simple steps to follow to use the script:

  1. In this simple script, you set the target domain name, such as target.com
  2. The script uses a file words_list.txt that contains a list of words used to build candidate subdomains
  3. Each candidate subdomain is requested and tested for an HTTP 200 response
  4. In the end, it prints the list of all discovered subdomains
  5. To run the script, use the command python sub-domain-crawler.py
import requests

discovered_subdomains = []

url = "cytomate.net"  # main domain to test against


def url_test(url):
    # Send a GET request; return None if the subdomain does not respond at all
    try:
        return requests.get(url)
    except requests.exceptions.ConnectionError:
        return None


with open("words_list.txt", "r") as words:
    for word in words:
        # Build a candidate URL such as http://word.cytomate.net
        test_url = "http://" + word.strip() + "." + url

        response = url_test(test_url)
        if response and response.status_code == 200:
            print("[+] Discovered >", test_url)
            discovered_subdomains.append(test_url)

print(discovered_subdomains)  # to see all discovered subdomains
print("Done")
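For reference, words_list.txt is simply a plain text file with one candidate name per line. The entries below are only an illustration of what such a word list might contain; real word lists are usually much longer:

www
mail
ftp
blog
admin
dev

Place this file in the same folder as sub-domain-crawler.py before running the script.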

I publish posts on a range of cybersecurity topics that you may find useful. Check out my LinkedIn and Medium profiles for more updates.

