Parsers¶

Amass Parser¶

class pipeline.recon.amass.ParseAmassOutput(*args, **kwargs)¶

Read amass JSON results and sort the entries into categorized ip|subdomain files.

Parameters:

db_location – specifies the path to the database used for storing results. Required by upstream Task.
target_file – specifies the file on disk containing a list of ips or domains. Required by upstream Task.
exempt_list – path to a file providing blacklisted subdomains, one per line. Optional for upstream Task.
results_dir – specifies the directory on disk to which all Task results are written. Required by upstream Task.

output()¶

Returns the target output files for this task.

Returns:	luigi.contrib.sqla.SQLAlchemyTarget

requires()¶

ParseAmassOutput depends on AmassScan to run.

TargetList expects target_file as a parameter. AmassScan accepts exempt_list as an optional parameter.

Returns:	luigi.ExternalTask - TargetList

run()¶

Parse the JSON file produced by AmassScan and categorize the results into ip|subdomain files.

An example (prettified) entry from the JSON file is shown below:

    {
        "Timestamp": "2019-09-22T19:20:13-05:00",
        "name": "beta-partners.tesla.com",
        "domain": "tesla.com",
        "addresses": [
            {
                "ip": "209.133.79.58",
                "cidr": "209.133.79.0/24",
                "asn": 394161,
                "desc": "TESLA - Tesla"
            }
        ],
        "tag": "ext",
        "source": "Previous Enum"
    }
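The categorization step can be sketched roughly as follows. This is a minimal illustration, not the actual ParseAmassOutput code: it assumes the amass results arrive as one JSON object per line, and the function name categorize_amass_entries is hypothetical. Only the "name" and "addresses" keys from the example entry are used.

```python
import ipaddress
import json


def categorize_amass_entries(lines):
    """Split amass JSON records into sets of subdomains, IPv4 and IPv6 addresses.

    Assumes one JSON object per line (an assumption about the input file).
    """
    subdomains, ipv4, ipv6 = set(), set(), set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        entry = json.loads(line)
        subdomains.add(entry["name"])
        for address in entry.get("addresses", []):
            ip = ipaddress.ip_address(address["ip"])
            # ip.version is 4 or 6; pick the matching bucket
            (ipv4 if ip.version == 4 else ipv6).add(str(ip))
    return subdomains, ipv4, ipv6


# The example entry from above, collapsed onto a single line.
sample = [
    '{"Timestamp": "2019-09-22T19:20:13-05:00", "name": "beta-partners.tesla.com", '
    '"domain": "tesla.com", "addresses": [{"ip": "209.133.79.58", '
    '"cidr": "209.133.79.0/24", "asn": 394161, "desc": "TESLA - Tesla"}], '
    '"tag": "ext", "source": "Previous Enum"}'
]
subdomains, ipv4, ipv6 = categorize_amass_entries(sample)
print(sorted(subdomains), sorted(ipv4), sorted(ipv6))
```

Using sets keeps duplicate names and addresses from being recorded twice when the same host appears in several entries.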

Web Targets Parser¶

class pipeline.recon.web.targets.GatherWebTargets(*args, **kwargs)¶

Gather all subdomains as well as any ip addresses known to have a configured web port open.

Parameters:

db_location – specifies the path to the database used for storing results. Required by upstream Task.
exempt_list – path to a file providing blacklisted subdomains, one per line. Optional for upstream Task.
top_ports – scan the top N most popular ports. Required by upstream Task.
ports – specifies the port(s) to be scanned. Required by upstream Task.
interface – use the named raw network interface, such as “eth0”. Required by upstream Task.
rate – desired rate for transmitting packets (packets per second). Required by upstream Task.
target_file – specifies the file on disk containing a list of ips or domains. Required by upstream Task.
results_dir – specifies the directory on disk to which all Task results are written. Required by upstream Task.

output()¶

Returns the target output for this task.

Returns:	luigi.contrib.sqla.SQLAlchemyTarget

requires()¶

GatherWebTargets depends on ParseMasscanOutput and ParseAmassOutput to run.

ParseMasscanOutput expects rate, target_file, interface, and either ports or top_ports as parameters. ParseAmassOutput accepts exempt_list and expects target_file.

Returns:	dict(str: ParseMasscanOutput, str: ParseAmassOutput)

run()¶

Gather all potential web targets and tag them as web in the database.
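The selection rule described for GatherWebTargets (every subdomain, plus any ip address with a configured web port open) can be illustrated with plain Python. This is a sketch under stated assumptions: the real task works against the database, whereas here the inputs are an in-memory set and dict, and both WEB_PORTS and gather_web_targets are hypothetical names.

```python
# Hypothetical list of ports treated as "web" ports; the pipeline's real
# configuration is not shown in this document.
WEB_PORTS = {"80", "443", "8080", "8000", "8443"}


def gather_web_targets(subdomains, open_ports_by_ip, web_ports=WEB_PORTS):
    """Return all subdomains plus every ip with at least one open web port."""
    targets = set(subdomains)  # every subdomain is a potential web target
    for ip, ports in open_ports_by_ip.items():
        # Normalize ports to strings so ints and strs compare equally.
        if web_ports & {str(port) for port in ports}:
            targets.add(ip)
    return targets


targets = gather_web_targets(
    {"beta-partners.tesla.com"},
    {"209.133.79.58": {22, 443}, "209.133.79.59": {22}},
)
print(sorted(targets))
```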

Masscan Parser¶

class pipeline.recon.masscan.ParseMasscanOutput(*args, **kwargs)¶

Read masscan JSON results and create a pickled dictionary of pertinent information for processing.

Parameters:

top_ports – scan the top N most popular ports. Required by upstream Task.
ports – specifies the port(s) to be scanned. Required by upstream Task.
interface – use the named raw network interface, such as “eth0”. Required by upstream Task.
rate – desired rate for transmitting packets (packets per second). Required by upstream Task.
db_location – specifies the path to the database used for storing results. Required by upstream Task.
target_file – specifies the file on disk containing a list of ips or domains. Required by upstream Task.
results_dir – specifies the directory on disk to which all Task results are written. Required by upstream Task.

output()¶

Returns the target output for this task.

Naming convention for the output file is masscan.TARGET_FILE.parsed.pickle.
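As an illustration of the naming convention, the path could be built as shown below. The helper parsed_masscan_path is hypothetical, and two details are assumptions: that TARGET_FILE means the base name of target_file, and that the file sits directly inside results_dir.

```python
from pathlib import Path


def parsed_masscan_path(results_dir, target_file):
    """Build masscan.TARGET_FILE.parsed.pickle inside results_dir.

    Assumes TARGET_FILE is the base name of target_file and that the file
    is written directly under results_dir (both are assumptions).
    """
    target_name = Path(target_file).name
    return Path(results_dir) / f"masscan.{target_name}.parsed.pickle"


example_path = parsed_masscan_path("results", "targets/tesla")
print(example_path)
```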

Returns:	luigi.local_target.LocalTarget

requires()¶

ParseMasscanOutput depends on Masscan to run.

Masscan expects rate, target_file, interface, and either ports or top_ports as parameters.

Returns:	luigi.Task - Masscan

run()¶

Reads masscan JSON results and creates a pickled dictionary of pertinent information for processing.
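A rough sketch of this parse-then-pickle step is shown below. It is not the ParseMasscanOutput implementation: the function name parse_masscan_json and the ip → protocol → ports layout of the dictionary are assumptions. The input is assumed to be a JSON array of records, each with an "ip" key and a "ports" list whose items carry "port" and "proto".

```python
import json
import pickle
from collections import defaultdict


def parse_masscan_json(text):
    """Collapse masscan JSON records into {ip: {protocol: {ports}}}."""
    results = defaultdict(lambda: defaultdict(set))
    for entry in json.loads(text):
        for port_info in entry.get("ports", []):
            results[entry["ip"]][port_info["proto"]].add(str(port_info["port"]))
    # Convert to plain dicts: a defaultdict built from a lambda cannot be pickled.
    return {ip: dict(protocols) for ip, protocols in results.items()}


# Illustrative records; the addresses and ports are made up for this sketch.
sample = json.dumps([
    {"ip": "10.10.10.146", "ports": [{"port": 22, "proto": "tcp", "status": "open"}]},
    {"ip": "10.10.10.146", "ports": [{"port": 80, "proto": "tcp", "status": "open"}]},
])
parsed = parse_masscan_json(sample)
blob = pickle.dumps(parsed)      # bytes that could be written to the output file
restored = pickle.loads(blob)    # what a downstream consumer would read back
print(restored)
```

Grouping by ip and protocol means several records for the same host merge into one entry, which is convenient when a later stage needs the complete set of open ports per address.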