Incremental search with Feroxbuster

Feroxbuster is a great forced-browsing / directory-busting tool. In this article, I explore making the search more efficient by scanning at a low depth and filtering the found directories.

Don’t waste time scanning directories you don’t care about

The last time I played HackTheBox, I stumbled across a web server with a directory containing the Apache documentation. Feroxbuster dived deep into the directory, carefully brute-forcing each sub-directory and wasting my time and computational resources.

A simple solution comes to mind: let’s stop the scan with CTRL+C and resume it with the boring directory excluded from the search:

feroxbuster \
	--resume-from ./ferox-https_example_htb_-1681503744.state \
	--dont-scan https://example.htb/manual

There is one issue with this approach: I like to run my recon in the background while working on other stuff, so I would miss the moment to intervene. The solution is simple:

  • Start the search with limited recursion depth (a minimal sketch follows this list).
  • Inspect the results.
  • Remove uninteresting directories and continue further.
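
The first step might look like this, reusing the target URL and wordlist from the complete example at the end of the article:

feroxbuster -u https://example.htb \
	-w ./small \
	--depth 2 \
	--output ferox-0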

Extracting URLs from saved output

Tip

Feroxbuster never deletes the output file; it only appends to it. You can reuse the same output file across multiple consecutive scans.

Default output

This is the default output Feroxbuster saves when you run it with the --output flag.

MSG      0.000 feroxbuster::heuristics detected directory listing: https://example.htb/includes/ (Apache)
200      GET      147l      510w     9301c https://example.htb/
301      GET        9l       28w      321c https://example.htb/includes => https://example.htb/includes/

We can extract the found directories with grep and awk. We match lines that start with a three-digit status code (^[0-9]{3}) and end with a slash (/$). We pipe the matches to awk and print the last field on each line with { print $NF }. Finally, we sort the output and remove duplicates.

grep -E '^[0-9]{3}.*/$' ferox-output \
	| awk '{ print $NF }' \
	| sort -u
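
Applied to the sample default output above, the pipeline skips the MSG line and keeps only the directory URLs:

https://example.htb/
https://example.htb/includes/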

JSON output

JSON output can be enabled with --json. Here’s a snippet:

{
  "type": "response",
  "url": "https://example.htb/index.php",
  "original_url": "https://example.htb/",
  "path": "/index.php",
  "wildcard": false,
  "status": 200,
  "method": "GET",
  "content_length": 9301,
  "line_count": 147,
  "word_count": 510,
  "headers": {
    "vary": "Accept-Encoding",
    "set-cookie": "PHPSESSID=187s7ikdsvkm6e1ennr7b94vbd; path=/",
    "server": "Apache/2.4.54 (Debian)",
    "expires": "Thu, 19 Nov 1981 08:52:00 GMT",
    "date": "Fri, 14 Apr 2023 19:36:16 GMT",
    "transfer-encoding": "chunked",
    "cache-control": "no-store, no-cache, must-revalidate",
    "pragma": "no-cache",
    "content-type": "text/html; charset=UTF-8"
  },
  "extension": ""
}
{
  "type": "statistics",
  "timeouts": 0,
  "requests": 31,
  "expected_per_scan": 8,
  "total_expected": 8,
  "errors": 0,
  "successes": 15,
  "redirects": 0,
  "client_errors": 16,
  ...
}

jq is a command-line tool for parsing JSON. We can use it for selecting the correct field as well as for sorting the output. The -s flag on the second invocation causes the input to be read into an array. We then sort the array and remove duplicate entries with unique. Finally, we extract the strings from the array with .[]. The -r flag stands for raw, which makes jq output raw strings (https://example.htb instead of "https://example.htb").

jq 'select(.type == "response") | .url | select(endswith("/"))' ferox-output.json \
	| jq -rs 'unique | .[]'
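
If you prefer a single invocation, the same result can be obtained by slurping first and filtering inside the array; this sketch is equivalent:

jq -rs 'map(select(.type == "response") | .url | select(endswith("/"))) | unique | .[]' ferox-output.json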

Silent output

The silent output can be enabled by passing the --silent flag.

https://example.htb/
https://example.htb/includes => https://example.htb/includes/

We can use awk again with a slight modification. We still want to print the last field on the line with { print $NF }, but only if it is a directory (i.e., ends with a slash). We can achieve this by checking the last field against a regex: $NF ~ /.*\/$/.

awk '$NF ~ /.*\/$/ { print $NF }' ferox-output.simple | sort -u
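
Run against the silent sample above, both lines end in a directory URL, so both are printed:

https://example.htb/
https://example.htb/includes/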

Complete example

URL=https://example.htb/
WORDLIST=./small

# run the initial scan with the default output
feroxbuster -u "$URL" \
	-w "$WORDLIST" \
	--depth 2 \
	--output ferox-0 \
&& notify-send "Feroxbuster has finished scanning"

# transform the output
grep -E '^[0-9]{3}.*/$' ferox-0 \
	| awk '{print $NF}' \
	| sort -u > ferox-0-filtered

# manually check the directories

# continue the scan without recursion limits
cat ferox-0-filtered \
	| feroxbuster \
		--stdin \
		-w "$WORDLIST" \
		--output ferox \
&& notify-send "Feroxbuster has finished scanning"

Don’t forget about BurpSuite

Use the --burp-replay flag to send found files to BurpSuite and make them appear on the target site-map. We could use a simple --proxy, but some servers don’t respond with a 404 status code for missing pages and return the homepage instead. Feroxbuster can filter those responses out, but BurpSuite can’t. To keep our BurpSuite site-map clean, it’s safer to use --burp-replay.
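
For instance, the initial scan from the complete example could be extended like this (a sketch; it assumes BurpSuite is listening on its default proxy address, 127.0.0.1:8080):

feroxbuster -u "$URL" \
	-w "$WORDLIST" \
	--depth 2 \
	--burp-replay \
	--output ferox-0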