Parsers

Amass Parser

class pipeline.recon.amass.ParseAmassOutput(*args, **kwargs)

Read amass JSON results and create categorized entries into ip|subdomain files.

Parameters:
  • db_location – specifies the path to the database used for storing results. Required by upstream Task.
  • target_file – specifies the file on disk containing a list of ips or domains. Required by upstream Task.
  • exempt_list – path to a file providing blacklisted subdomains, one per line. Optional for upstream Task.
  • results_dir – specifies the directory on disk to which all Task results are written. Required by upstream Task.

output()

Returns the target output files for this task.

Returns: luigi.contrib.sqla.SQLAlchemyTarget

requires()

ParseAmassOutput depends on AmassScan to run.

TargetList expects target_file as a parameter. AmassScan accepts exempt_list as an optional parameter.

Returns: luigi.Task - AmassScan

run()

Parse the json file produced by AmassScan and categorize the results into ip|subdomain files.

An example (prettified) entry from the json file is shown below:

    {
      "Timestamp": "2019-09-22T19:20:13-05:00",
      "name": "beta-partners.tesla.com",
      "domain": "tesla.com",
      "addresses": [
        {
          "ip": "209.133.79.58",
          "cidr": "209.133.79.0/24",
          "asn": 394161,
          "desc": "TESLA - Tesla"
        }
      ],
      "tag": "ext",
      "source": "Previous Enum"
    }
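
The categorization step can be sketched as follows. This is a minimal illustration, not the pipeline's own code: it assumes each line of the amass output holds one JSON object shaped like the entry above, and the function name `categorize_entry` is hypothetical.

```python
import ipaddress
import json


def categorize_entry(line):
    """Split one amass JSON line into (subdomain, ipv4 list, ipv6 list).

    Hypothetical helper: the field names come from the example entry,
    everything else is an assumption made for illustration.
    """
    entry = json.loads(line)
    ipv4, ipv6 = [], []
    for address in entry.get("addresses", []):
        ip = ipaddress.ip_address(address["ip"])
        # Route the address to the matching bucket by IP version.
        (ipv4 if ip.version == 4 else ipv6).append(str(ip))
    return entry.get("name"), ipv4, ipv6


sample = (
    '{"Timestamp": "2019-09-22T19:20:13-05:00", "name": "beta-partners.tesla.com", '
    '"domain": "tesla.com", "addresses": [{"ip": "209.133.79.58", '
    '"cidr": "209.133.79.0/24", "asn": 394161, "desc": "TESLA - Tesla"}], '
    '"tag": "ext", "source": "Previous Enum"}'
)
subdomain, ipv4, ipv6 = categorize_entry(sample)
print(subdomain, ipv4, ipv6)
```

Using `ipaddress.ip_address` rather than string matching lets the standard library decide whether an address is IPv4 or IPv6, and it raises `ValueError` on malformed input.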

Web Targets Parser

class pipeline.recon.web.targets.GatherWebTargets(*args, **kwargs)

Gather all subdomains as well as any ip addresses known to have a configured web port open.

Parameters:
  • db_location – specifies the path to the database used for storing results. Required by upstream Task.
  • exempt_list – path to a file providing blacklisted subdomains, one per line. Optional for upstream Task.
  • top_ports – scan the top N most popular ports. Required by upstream Task.
  • ports – specifies the port(s) to be scanned. Required by upstream Task.
  • interface – use the named raw network interface, such as "eth0". Required by upstream Task.
  • rate – desired rate for transmitting packets (packets per second). Required by upstream Task.
  • target_file – specifies the file on disk containing a list of ips or domains. Required by upstream Task.
  • results_dir – specifies the directory on disk to which all Task results are written. Required by upstream Task.

output()

Returns the target output for this task.

Returns: luigi.contrib.sqla.SQLAlchemyTarget

requires()

GatherWebTargets depends on ParseMasscanOutput and ParseAmassOutput to run.

ParseMasscanOutput expects rate, target_file, interface, and either ports or top_ports as parameters. ParseAmassOutput accepts exempt_list and expects target_file.

Returns: dict(str: ParseMasscanOutput, str: ParseAmassOutput)

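
The parameter hand-off described above can be sketched in plain Python. This is an illustration under stated assumptions, not the Task's real code: the helper names and the dictionary keys `"masscan"` and `"amass"` are hypothetical, and reading "either ports or top_ports" as "exactly one of the two" is an assumption.

```python
def masscan_kwargs(rate, target_file, interface, ports="", top_ports=0):
    """Collect the parameters ParseMasscanOutput is documented to expect.

    Hypothetical helper; treats ports and top_ports as mutually exclusive.
    """
    if bool(ports) == bool(top_ports):
        raise ValueError("supply exactly one of ports or top_ports")
    kwargs = {"rate": rate, "target_file": target_file, "interface": interface}
    if ports:
        kwargs["ports"] = ports
    else:
        kwargs["top_ports"] = top_ports
    return kwargs


def amass_kwargs(target_file, exempt_list=""):
    """Collect the parameters ParseAmassOutput is documented to use."""
    kwargs = {"target_file": target_file}
    if exempt_list:  # exempt_list is optional
        kwargs["exempt_list"] = exempt_list
    return kwargs


# One entry per dependency, mirroring the dict(str: Task, str: Task) shape.
deps = {
    "masscan": masscan_kwargs(rate=1000, target_file="tesla", interface="eth0", top_ports=100),
    "amass": amass_kwargs(target_file="tesla"),
}
print(deps)
```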
run()

Gather all potential web targets and tag them as web in the database.
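
The selection rule (every subdomain, plus any ip address with a web port open) can be sketched without a database. The port set `WEB_PORTS`, the function name, and the input shapes are all hypothetical. The real Task reads from and writes to the database, which this sketch leaves out.

```python
# Hypothetical stand-in for the pipeline's configured web ports.
WEB_PORTS = {"80", "443", "8080", "8443"}


def gather_web_targets(open_ports, subdomains, web_ports=WEB_PORTS):
    """Return all subdomains plus every ip that has a web port open.

    open_ports maps an ip address to the set of its open ports (as strings).
    """
    targets = set(subdomains)
    for ip, ports in open_ports.items():
        if web_ports & set(ports):  # at least one open port is a web port
            targets.add(ip)
    return sorted(targets)


open_ports = {"10.10.10.1": {"22", "80"}, "10.10.10.2": {"22"}}
web_targets = gather_web_targets(open_ports, ["beta-partners.tesla.com"])
print(web_targets)
```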

Masscan Parser

class pipeline.recon.masscan.ParseMasscanOutput(*args, **kwargs)

Read masscan JSON results and create a pickled dictionary of pertinent information for processing.

Parameters:
  • top_ports – scan the top N most popular ports. Required by upstream Task.
  • ports – specifies the port(s) to be scanned. Required by upstream Task.
  • interface – use the named raw network interface, such as "eth0". Required by upstream Task.
  • rate – desired rate for transmitting packets (packets per second). Required by upstream Task.
  • db_location – specifies the path to the database used for storing results. Required by upstream Task.
  • target_file – specifies the file on disk containing a list of ips or domains. Required by upstream Task.
  • results_dir – specifies the directory on disk to which all Task results are written. Required by upstream Task.

output()

Returns the target output for this task.

Naming convention for the output file is masscan.TARGET_FILE.parsed.pickle.

Returns: luigi.local_target.LocalTarget

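
The naming convention above can be sketched with `pathlib`. The helper name is hypothetical. The sketch also assumes that TARGET_FILE stands for the file name of target_file and that the file is written directly under results_dir; the documentation states neither.

```python
from pathlib import Path


def parsed_pickle_path(results_dir, target_file):
    """Build masscan.TARGET_FILE.parsed.pickle under results_dir.

    Hypothetical helper: uses only the final path component of target_file.
    """
    return Path(results_dir) / f"masscan.{Path(target_file).name}.parsed.pickle"


pickle_path = parsed_pickle_path("results", "tesla")
print(pickle_path)
```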
requires()

ParseMasscanOutput depends on Masscan to run.

Masscan expects rate, target_file, interface, and either ports or top_ports as parameters.

Returns: luigi.Task - Masscan

run()

Reads masscan JSON results and creates a pickled dictionary of pertinent information for processing.
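
A minimal sketch of this step, assuming the masscan results are a valid JSON array whose entries carry an `ip` field and a `ports` list with `port` and `proto` keys. The sample data is made up, and the layout of the resulting dictionary (ip, then protocol, then a set of ports) is an assumption, not the documented format.

```python
import json
import pickle


def parse_masscan(json_text):
    """Map each ip to {protocol: set of open ports as strings}."""
    results = {}
    for entry in json.loads(json_text):
        by_proto = results.setdefault(entry["ip"], {})
        for port in entry.get("ports", []):
            by_proto.setdefault(port["proto"], set()).add(str(port["port"]))
    return results


sample = """[
  {"ip": "10.10.10.146", "ports": [{"port": 22, "proto": "tcp", "status": "open"}]},
  {"ip": "10.10.10.146", "ports": [{"port": 80, "proto": "tcp", "status": "open"}]}
]"""

parsed = parse_masscan(sample)
blob = pickle.dumps(parsed)    # the pickled dictionary, as bytes
restored = pickle.loads(blob)  # round trip back to a plain dict
print(restored)
```

Plain dictionaries are used instead of a `defaultdict` with a lambda factory because lambdas cannot be pickled.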