Author Topic: Malicious Content Harvesting with Python, WebKit, and Scapy  (Read 6534 times)


November 29, 2011, 05:42:23 pm

Harvesting malicious files and websites isn't a difficult task these days when you have sites like MalwareDomainList, etc., that let you pull lists of URLs reported as malicious or suspicious. What is more difficult, and more important to us, is obtaining a complete picture of the actions a malicious site is trying to perform. Tools like cURL and wget retrieve only an unrendered version of the page, so exploit code is missed if it lives in an externally sourced file. Libraries such as BeautifulSoup make it simple to find all the external sources in a page and retrieve them manually, but if a malicious site limits the number of new connections per host, or if those external sources in turn load even more external sources, you end up going down a rabbit hole you may never emerge from. Selenium is also a viable option in some circumstances, especially if you add custom scripts to auto-dump data from Firebug. In the end I settled on WebKit and Python, because they gave me what I needed along with some extra flexibility.
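To make the "find all external sources" step concrete, here is a minimal sketch of the idea (not the author's code, and using only the standard library's html.parser in place of BeautifulSoup): walk the start tags of a fetched page and collect every attribute that pulls in an external resource. The tag/attribute table and the sample HTML are illustrative assumptions.

```python
# Sketch: extract externally sourced content from a page, the same
# job BeautifulSoup is used for above, but with stdlib html.parser.
from html.parser import HTMLParser

class ExternalSourceParser(HTMLParser):
    """Collect attribute values that point at external resources."""
    # Tags whose src/href/data attributes commonly pull in external
    # (potentially exploit-bearing) content.
    TAGS = {"script": "src", "iframe": "src", "img": "src",
            "embed": "src", "object": "data", "link": "href"}

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attr = self.TAGS.get(tag)
        if attr:
            for name, value in attrs:
                if name == attr and value:
                    self.sources.append(value)

# Illustrative page body; a real run would feed in fetched HTML.
page = ('<html><script src="http://evil.example/payload.js"></script>'
        '<iframe src="http://evil.example/landing.html"></iframe></html>')
parser = ExternalSourceParser()
parser.feed(page)
print(parser.sources)
```

Each collected URL would in turn need to be fetched and parsed the same way, which is exactly the recursive rabbit hole described above, and why rendering the page in a real engine like WebKit is attractive instead.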
Ruining the bad guy's day