October 3, 2023
How To Brute-Force Directories and File Locations using Python

What Is a Brute Force Attack

A brute force attack, also known as an exhaustive search, is a cryptographic hack that relies on guessing possible combinations of a targeted password until the correct password is discovered. The longer the password, the more combinations need to be tested. A brute force attack can be time-consuming, difficult to perform if methods such as data obfuscation are used, and at times downright impossible. However, if the password is weak, it could take mere seconds with hardly any effort. Weak passwords are like shooting fish in a barrel for attackers, which is why all organizations should enforce a strong password policy across all users and systems.
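To make the "longer password, more combinations" point concrete, here is a minimal Python 3 sketch of the arithmetic. The function names and the guesses-per-second figure are illustrative assumptions, not measurements of any real attack.

```python
def keyspace(alphabet_size, length):
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length


def worst_case_seconds(alphabet_size, length, guesses_per_second):
    """Time to try every candidate at a fixed (assumed) guessing rate."""
    return keyspace(alphabet_size, length) / guesses_per_second


if __name__ == "__main__":
    # Lowercase letters only: each extra character multiplies the work by 26.
    for length in (4, 6, 8):
        print(length, keyspace(26, length))
```

Each added character multiplies the search space by the size of the alphabet, which is why a short password falls quickly and a long one may never fall at all.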

Brute-Forcing Directories and File Locations

The previous example assumed a lot of knowledge about your target. But in many cases where you’re attacking a custom web application or large ecommerce system, you won’t be aware of all of the files accessible on the web server. Generally, you’ll deploy a spider, such as the one included in Burp Suite, to crawl the target website in order to discover as much of the web application as possible. However, in a lot of cases there are configuration files, leftover development files, debugging scripts, and other security breadcrumbs that can provide sensitive information or expose functionality that the software developer did not intend. The only way to discover this content is to use a brute-forcing tool to hunt down common filenames and directories.

We’ll build a simple tool that will accept wordlists from common brute forcers such as the DirBuster project or SVNDigger, and attempt to discover directories and files that are reachable on the target web server. As before, we’ll create a pool of threads to aggressively attempt to discover content.

Let’s start by creating some functionality to create a Queue out of a wordlist file. Open up a new file, name it content_bruter.py, and enter the following code:

import urllib2
import threading
import Queue
import urllib

threads = 50
target_url = "http://testphp.vulnweb.com"
wordlist_file = "/tmp/all.txt" # from SVNDigger
resume = None
user_agent = "Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 Firefox/19.0"

def build_wordlist(wordlist_file):
    # read in the word list
    fd = open(wordlist_file,"rb")
    raw_words = fd.readlines()
    fd.close()

    found_resume = False
    words = Queue.Queue()

    for word in raw_words:
        word = word.rstrip()

        if resume is not None:
            if found_resume:
                # we are past the resume point, so queue the word
                words.put(word)
            else:
                if word == resume:
                    found_resume = True
                    print "Resuming wordlist from: %s" % resume
        else:
            # no resume point set, so queue every word
            words.put(word)

    return words

This helper function is pretty straightforward. We read in a wordlist file and then begin iterating over each line in the file. We have some built-in functionality that allows us to resume a brute-forcing session if our network connectivity is interrupted or the target site goes down. This can be achieved by simply setting the resume variable to the last path that the brute forcer tried. When the entire file has been parsed, we return a Queue full of words to use in our actual brute-forcing function. We will reuse this function later in this chapter.
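The listing above targets Python 2. As a rough sketch of the same resume logic in Python 3, assuming a UTF-8 wordlist and passing the resume point as a parameter instead of a global (both are assumptions for illustration):

```python
import queue


def build_wordlist(wordlist_path, resume=None):
    """Return a Queue of words, skipping everything up to and
    including `resume` when a resume point is given."""
    words = queue.Queue()
    # With no resume point, every word is queued from the start.
    found_resume = resume is None

    with open(wordlist_path, "r", encoding="utf-8", errors="ignore") as handle:
        for line in handle:
            word = line.rstrip()
            if found_resume:
                words.put(word)
            elif word == resume:
                # The resume word itself was already tried, so it is not queued.
                found_resume = True
                print("Resuming wordlist from: %s" % resume)

    return words
```

Calling build_wordlist("/tmp/all.txt", resume="login") would queue only the words that come after login in the file.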

We want some basic functionality to be available to our brute-forcing script. The first is the ability to apply a list of extensions to test for when making requests. In some cases, you want to try not only the /admin directory, for example, but also admin.php, admin.inc, and admin.html.

def dir_bruter(word_queue,extensions=None):
    while not word_queue.empty():
        attempt = word_queue.get()
        attempt_list = []

        # check to see if there is a file extension; if not,
        # it's a directory path we're bruting
        if "." not in attempt:
            attempt_list.append("/%s/" % attempt)
        else:
            attempt_list.append("/%s" % attempt)

Our dir_bruter function accepts a Queue object that is populated with words to use for brute-forcing and an optional list of file extensions to test. We begin by testing to see if there is a file extension in the current word, and if there isn’t, we treat it as a directory that we want to test for on the remote web server.

        # if we want to bruteforce extensions
        if extensions:
            for extension in extensions:
                attempt_list.append("/%s%s" % (attempt,extension))

        # iterate over our list of attempts
        for brute in attempt_list:
            url = "%s%s" % (target_url,urllib.quote(brute))

If there is a list of file extensions passed in, then we take the current word and apply each file extension that we want to test for. It can be useful here to think of using extensions like .orig and .bak on top of the regular programming language extensions.
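To see exactly which paths one word expands into, here is a small Python 3 sketch that pulls the candidate-building logic into standalone functions. The names build_attempts and build_urls are hypothetical helpers, not part of the original script, and urllib.parse.quote stands in for Python 2's urllib.quote.

```python
from urllib.parse import quote


def build_attempts(word, extensions=None):
    """Expand one wordlist entry into the list of paths to request."""
    attempts = []

    # No dot means no file extension, so treat the word as a directory.
    if "." not in word:
        attempts.append("/%s/" % word)
    else:
        attempts.append("/%s" % word)

    # Optionally try the same word with each extension appended.
    if extensions:
        for extension in extensions:
            attempts.append("/%s%s" % (word, extension))

    return attempts


def build_urls(target_url, word, extensions=None):
    """Join each candidate path onto the target, URL-encoding the path."""
    return [target_url + quote(path) for path in build_attempts(word, extensions)]


if __name__ == "__main__":
    for url in build_urls("http://testphp.vulnweb.com", "admin", [".php", ".bak"]):
        print(url)
```

With the extensions .php and .bak, the word admin yields /admin/, /admin.php, and /admin.bak, so every extension you add costs one more request per word.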

            try:
                headers = {}
                headers["User-Agent"] = user_agent
                r = urllib2.Request(url,headers=headers)

                # send the request to the remote web server
                response = urllib2.urlopen(r)

After we build a list of brute-forcing attempts, we set the User-Agent header to something innocuous and test the remote web server.

                if len(response.read()):
                    print "[%d] => %s" % (response.code,url)

            except urllib2.URLError as e:
                # report any HTTP error other than "file not found"
                if hasattr(e, 'code') and e.code != 404:
                    print "!!! %d => %s" % (e.code,url)

If the response code is a 200, we output the URL, and if we receive anything but a 404 we also output it because this could indicate something interesting on the remote web server aside from a “file not found” error.
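For readers on Python 3, the request-and-report step might look roughly like the sketch below. The probe function and its opener parameter are assumptions introduced here so the logic can be exercised without touching a live server; by default it calls urllib.request.urlopen.

```python
import urllib.error
import urllib.request

USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:19.0) Gecko/20100101 Firefox/19.0"


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Request `url` and return a report line, or None if there is
    nothing worth reporting (a 404 or a connection failure)."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})

    try:
        with opener(request, timeout=timeout) as response:
            # A successful response with a non-empty body is a hit.
            if len(response.read()):
                return "[%d] => %s" % (response.status, url)
    except urllib.error.HTTPError as error:
        # Anything other than "file not found" may be interesting.
        if error.code != 404:
            return "!!! %d => %s" % (error.code, url)
    except urllib.error.URLError:
        # DNS failure, refused connection, and so on: nothing to report.
        pass

    return None
```

HTTPError is caught before URLError because it is a subclass of it; reversing the order would swallow the status codes we want to report.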

It’s useful to pay attention to and react to your output because, depending on the configuration of the remote web server, you may have to filter out more HTTP error codes in order to clean up your results. Let’s finish out the script by setting up our wordlist, creating a list of extensions, and spinning up the brute-forcing threads.

word_queue = build_wordlist(wordlist_file)
extensions = [".php",".bak",".orig",".inc"]

for i in range(threads):
    t = threading.Thread(target=dir_bruter,args=(word_queue,extensions,))
    t.start()

The code snippet above is pretty straightforward and should look familiar by now. We get our list of words to brute-force, create a simple list of file extensions to test for, and then spin up a bunch of threads to do the brute-forcing.
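One way to sketch the same thread pool in Python 3 is shown below. It is not the original script: the drain and run_workers helpers are hypothetical, and the worker pulls words with get_nowait() instead of checking empty() first, on the assumption that you want to avoid a thread blocking forever when another thread takes the last word between the check and the get.

```python
import queue
import threading


def drain(word_queue, handle):
    """Worker loop: take words until the queue is exhausted."""
    while True:
        try:
            word = word_queue.get_nowait()
        except queue.Empty:
            return
        handle(word)


def run_workers(word_queue, handle, thread_count=5):
    """Start `thread_count` workers and wait for all of them to finish."""
    workers = [
        threading.Thread(target=drain, args=(word_queue, handle))
        for _ in range(thread_count)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()


if __name__ == "__main__":
    words = queue.Queue()
    for word in ("admin", "login", "backup"):
        words.put(word)
    # Printing stands in for the real request logic here.
    run_workers(words, print, thread_count=2)
```

Joining the threads also gives you a clear point at which the whole wordlist has been processed, which is handy if you want to print a summary at the end.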

Now Let’s Check Our Code

OWASP has a list of online and offline (virtual machines, ISOs, etc.) vulnerable web applications that you can test your tooling against. In this case, the URL that is referenced in the source code points to an intentionally buggy web application hosted by Acunetix. The cool thing is that it shows you how effective brute-forcing a web application can be. I recommend you set the threads variable to something sane such as 5 and run the script. In short order, you should start seeing results such as the ones below:

[200] => http://testphp.vulnweb.com/CVS/
[200] => http://testphp.vulnweb.com/admin/
[200] => http://testphp.vulnweb.com/index.bak
[200] => http://testphp.vulnweb.com/search.php
[200] => http://testphp.vulnweb.com/login.php
[200] => http://testphp.vulnweb.com/images/
[200] => http://testphp.vulnweb.com/index.php
[200] => http://testphp.vulnweb.com/logout.php
[200] => http://testphp.vulnweb.com/categories.php

You can see that we are pulling some interesting results from the remote website. I cannot stress enough the importance of performing content brute-forcing against all of your web application targets.

Also Check How To Map Open Source Web App Installations Using Python.
