Malware Domain List

Malware Related => Malware Analysis => Topic started by: SysAdMini on November 29, 2011, 05:42:23 pm

Title: Malicious Content Harvesting with Python, WebKit, and Scapy
Post by: SysAdMini on November 29, 2011, 05:42:23 pm
http://dvlabs.tippingpoint.com/blog/2011/11/28/malicious-content-harvesting

Quote
Harvesting malicious files and websites isn't a difficult task these days when sites like MalwareDomainList, jsunpack.jeek.org, etc. let you pull a list of URLs that have been reported as malicious or suspicious. What is more difficult, and most important to us, is obtaining a complete picture of the actions that a malicious site is trying to perform. Tools like cURL, wget, etc. only retrieve an unrendered version of the page, so the exploit code will be missed if it is in an externally sourced file. Libraries such as BeautifulSoup make it simpler to find all external sources in a page and retrieve them manually, but if a malicious site limits the number of new connections for a host, or if these external sources in turn load even more external sources, you end up going down a rabbit hole you may never emerge from. Selenium is also a viable option in some circumstances, especially if you add on some custom scripts to auto-dump data from Firebug. In the end I settled on using WebKit and Python because I felt they gave me what I needed and I gained some extra flexibility.
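
For anyone wondering what the "find all external sources and retrieve them manually" step looks like, here is a minimal sketch. It is not the code from the linked blog post, and it uses Python's standard-library html.parser in place of BeautifulSoup so that it runs without extra packages. The names ExternalSourceParser, find_external_sources, EXTERNAL_ATTRS, and SAMPLE, as well as the example URLs, are made up for illustration. It only lists the first level of external resources for an already-downloaded page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical mapping: tag name -> attribute that references an external resource.
EXTERNAL_ATTRS = {
    "script": "src",
    "iframe": "src",
    "frame": "src",
    "embed": "src",
    "object": "data",
    "link": "href",
}


class ExternalSourceParser(HTMLParser):
    """Collects absolute URLs of externally sourced resources in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        wanted = EXTERNAL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.sources.append(urljoin(self.base_url, value))


def find_external_sources(html, base_url):
    """Return external resource URLs in document order."""
    parser = ExternalSourceParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.sources


# Made-up page content, standing in for what cURL or wget would return.
SAMPLE = """<html><head><script src="/js/a.js"></script></head>
<body><iframe src="http://other.example/frame.html"></iframe>
<script>var x = 1;</script></body></html>"""

if __name__ == "__main__":
    for url in find_external_sources(SAMPLE, "http://site.example/index.html"):
        print(url)
```

Each URL returned would then have to be fetched and parsed the same way, and anything injected by JavaScript at runtime never shows up in the static HTML at all. That is the rabbit hole the quote describes, and presumably part of why the author moved to a real rendering engine.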