A basic script that uses PDFMiner to decompress the streams in a PDF and then searches inside them for indicators.

Currently it attempts to pull out IPs, hashes, URLs, and hostnames.
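
As a rough illustration of the extraction step, here is a minimal sketch that runs regular expressions over the bytes of an already-decompressed stream. The function name `extract_indicators` and the patterns are assumptions made for this example, not the ones scrape-pdf.py actually uses. Hostname matching is left out because it depends on validating candidates against the TLD list.

```python
import re

# Illustrative patterns only; the real script's patterns may differ.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")
URL_RE = re.compile(r"https?://[^\s<>()\"']+", re.IGNORECASE)
# MD5 (32), SHA-1 (40), or SHA-256 (64) hex digests.
HASH_RE = re.compile(r"\b(?:[a-fA-F0-9]{64}|[a-fA-F0-9]{40}|[a-fA-F0-9]{32})\b")


def extract_indicators(stream_bytes):
    """Return de-duplicated IPs, URLs, and hashes found in a decompressed stream.

    Hypothetical helper: takes raw bytes, decodes them as latin-1 so that
    every byte maps to a character, and applies the patterns above.
    """
    text = stream_bytes.decode("latin-1")
    return {
        "ips": sorted(set(IPV4_RE.findall(text))),
        "urls": sorted(set(URL_RE.findall(text))),
        "hashes": sorted(set(HASH_RE.findall(text))),
    }


if __name__ == "__main__":
    sample = b"(http://example.com/a.exe) 10.0.0.1 d41d8cd98f00b204e9800998ecf8427e"
    print(extract_indicators(sample))
```

Decoding as latin-1 is a deliberate choice here: stream contents are arbitrary bytes, and latin-1 never raises a decode error.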

Requires:

  • pip install dnspython
  • grab uniaccept from here
  • pip install pdfminer

Once those are installed, you'll likely want to fetch the newest TLD list.
Open a Python interpreter and run:

import uniaccept
uniaccept.refreshtlddb("/tmp/tld-list.txt")

Feel free to change the location of the tld-list.txt file, but note that the scrape-pdf.py script expects to find it in the current working directory.