Effective Subdomain Crawling using Python

What exactly do we mean by a subdomain?

A subdomain is a named section that sits under a main domain. For instance, “google.com” is the main domain, and a variety of subdomains give access to other Google services, such as Google Drive, Gmail, Google Meet, and Google Maps. We can visit “mail.google.com” to access Google’s email web application (Gmail), “drive.google.com” to access Google Drive, and so on.
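To make the relationship concrete, here is a minimal sketch of a textual check for whether a hostname sits under a given main domain. The helper name is_subdomain is hypothetical and used only for illustration; it compares strings and does not consult DNS.

```python
def is_subdomain(hostname, domain):
    """Return True if hostname is a strict subdomain of domain (textual check only)."""
    hostname = hostname.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    # A subdomain ends with ".<domain>" and has at least one extra label in front
    return hostname.endswith("." + domain) and len(hostname) > len(domain) + 1


print(is_subdomain("mail.google.com", "google.com"))
print(is_subdomain("google.com", "google.com"))
print(is_subdomain("notgoogle.com", "google.com"))
```

Matching on the leading dot is what keeps a lookalike such as “notgoogle.com” from being counted as a subdomain of “google.com”.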

Domain Crawler

We will now write our own tool to discover subdomains of a target website. Testing candidate names by hand is very time-consuming, so we want an automated way of doing this tiresome task. We will use Python to build a “subdomain-crawler” that tries each name from a list of possible subdomain names against the target domain. Note that it can only find subdomains whose names appear in that list. This whole process is shown in the figure below.

Flow of Instructions

To determine whether a subdomain exists, we need a way to communicate with the website. Done manually, that means entering the subdomain’s URL into a browser and seeing whether a page loads. Since we want to do this from a Python script, we need a way to send web requests automatically. This program uses the “requests” library, which can be installed with the command pip install requests. Candidate subdomains are built at runtime from a list of words, and the response for each one is fetched with the requests library.
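As an illustration of how the candidates might be built at runtime, here is a minimal sketch. The function name build_candidates and its scheme parameter are assumptions made for this example, not part of the original script.

```python
def build_candidates(words, domain, scheme="http"):
    """Yield one candidate URL per non-empty word."""
    for word in words:
        label = word.strip()
        if label:  # skip blank lines in the word list
            yield f"{scheme}://{label}.{domain}"


# Lines read from a file keep their trailing newline, hence the strip() above
sample_words = ["mail\n", "www\n", "\n", "dev"]
for candidate in build_candidates(sample_words, "target.com"):
    print(candidate)
```

Each yielded URL can then be passed to requests.get to see whether the host responds.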

Steps to Follow

You can get the code from the GitHub repository or the LinkedIn post. The files involved are the script, sub-domain-crawler.py, and the word list, words_list.txt, which the script opens by relative path. Follow these simple steps to use the script:

  1. In the script, set the url variable to the target domain name, such as target.com
  2. The script reads the file words_list.txt, which contains a list of words used to build candidate subdomains
  3. Each candidate is requested and checked for an HTTP 200 response
  4. At the end, the script prints a list of all discovered subdomains
  5. To run the script, use the command python sub-domain-crawler.py

import requests

discovered_subdomains = []
url = 'cytomate.net'  # main domain

def url_test(url):
    # Return the response, or None if the host cannot be reached
    try:
        return requests.get(url)
    except requests.exceptions.ConnectionError:
        return None

with open('words_list.txt', 'r') as words:
    for word in words:
        test_url = 'http://' + word.strip() + '.' + url
        response = url_test(test_url)
        # Skip unreachable hosts; keep only those that answer with 200
        if response is not None and response.status_code == 200:
            print('[+] Discovered >', test_url)
            discovered_subdomains.append(test_url)

print(discovered_subdomains)  # to see all discovered subdomains
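
The script above calls requests.get without a timeout, so a host that accepts the connection but never answers can stall the whole run. The following is one possible refinement, offered as a sketch rather than as the author's method. It passes the timeout argument of requests.get and catches requests.exceptions.RequestException, the base class of the library's errors. The names fetch_status and find_subdomains, the 5-second default, and the pluggable fetch parameter are all assumptions made for this example.

```python
def fetch_status(url, timeout=5):
    """Return the HTTP status code for url, or None if the request fails."""
    # Imported here so the rest of the sketch can be exercised without the network
    import requests

    try:
        return requests.get(url, timeout=timeout).status_code
    except requests.exceptions.RequestException:
        # Base class of requests' errors: covers connection failures and timeouts
        return None


def find_subdomains(domain, words, fetch=fetch_status):
    """Return the candidate URLs for which fetch() reports status 200."""
    found = []
    for word in words:
        label = word.strip()
        if not label:  # ignore blank lines in the word list
            continue
        test_url = f"http://{label}.{domain}"
        if fetch(test_url) == 200:
            found.append(test_url)
    return found


# Offline demonstration with a stand-in fetcher (no network traffic)
fake_statuses = {"http://mail.target.com": 200, "http://dev.target.com": 404}
print(find_subdomains("target.com", ["mail\n", "dev\n", "ftp\n"], fetch=fake_statuses.get))
```

For a real run, the words would come from words_list.txt and the default fetch_status would be used, for example find_subdomains("target.com", open("words_list.txt")). Keeping the fetch step pluggable also makes the loop easy to check without touching a live site.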

Azhar Ghafoor

Cybersecurity Researcher | Ethical Hacking | Data Analyst