Author Topic: Extract javascript from pdf file  (Read 32484 times)


August 27, 2008, 07:03:31 pm

SysAdMini

  • Administrator
  • Hero Member

  • Offline
  • *****

  • 3335
I've received the question

"How can I extract the JavaScript from a PDF exploit?"

several times.

Here is my answer:


Download pdftk from http://www.accesspdf.com/pdftk/.

Run this from the command line:

pdftk yourexploit.pdf output whateveryoulike.pdf uncompress

The uncompressed PDF is a plain text file, and you can copy and paste the JavaScript code
into your favorite JavaScript decoder, for example Malzilla.
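
If you would rather not search the uncompressed file by hand, here is a minimal Python sketch for pulling out the script. It assumes the file has already been uncompressed with pdftk and that the JavaScript is attached through a /JS key, either as a literal string in parentheses or as a reference to another object. The function names are made up for this example, and the parser only handles these two simple cases.

```python
import re


def find_js_literals(pdf_bytes: bytes) -> list[bytes]:
    """Return the raw contents of every `/JS (...)` literal string.

    PDF literal strings may contain balanced parentheses and
    backslash escapes, so we track nesting depth instead of
    stopping at the first ')'.
    """
    results = []
    for match in re.finditer(rb"/JS\s*\(", pdf_bytes):
        i = match.end()
        depth = 1
        out = bytearray()
        while i < len(pdf_bytes):
            c = pdf_bytes[i:i + 1]
            if c == b"\\" and i + 1 < len(pdf_bytes):
                # Keep escape sequences untouched; the decoder can handle them.
                out += pdf_bytes[i:i + 2]
                i += 2
                continue
            if c == b"(":
                depth += 1
            elif c == b")":
                depth -= 1
                if depth == 0:
                    break
            out += c
            i += 1
        results.append(bytes(out))
    return results


def find_js_references(pdf_bytes: bytes) -> list[tuple[int, int]]:
    """Return (object number, generation) for every `/JS n g R` reference."""
    refs = re.findall(rb"/JS\s+(\d+)\s+(\d+)\s+R", pdf_bytes)
    return [(int(obj), int(gen)) for obj, gen in refs]
```

Read the uncompressed file in binary mode and pass the bytes to both functions. A reference such as (7, 0) means the script is in the stream of object "7 0 obj", which you can then look up in the same file.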

 
Ruining the bad guy's day

August 27, 2008, 09:03:33 pm
Reply #1

bobby

  • Special Members
  • Hero Member

  • Offline
  • *

  • 322
    • Malzilla
Some more info about the JavaScript in PDF files:
Sometimes the JavaScript contains shellcode that does not decode its own code at runtime, and sometimes it contains shellcode that is not so easy to decode.

I have explained here how to decode the first kind of shellcode:
http://www.malwaredomainlist.com/forums/index.php?topic=2139.0

If you do not get any plain-text link after doing the UCS2 decoding, then you have shellcode that decodes itself at runtime.
In that case use the second part of my instructions to save the shellcode to a file.
After that use your favorite debugger to debug the shellcode in order to get the URL.
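
For readers who cannot open the linked topic, here is a rough Python sketch of what the UCS2 decoding step amounts to. It is my own reading of the step, not a copy of the linked instructions: it assumes the shellcode sits in a JavaScript unescape() argument written as %uXXXX (and possibly %XX) escapes, and the function names are invented for this example.

```python
import re


def unescape_ucs2(escaped: str) -> bytes:
    """Turn %uXXXX / %XX escapes into raw bytes.

    Each %uXXXX is a 16-bit value stored low byte first, so
    %u7468 becomes the bytes 0x68 0x74 ("ht"). Characters that
    are not part of an escape are ignored in this sketch.
    """
    out = bytearray()
    for m in re.finditer(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})", escaped):
        if m.group(1):
            out += int(m.group(1), 16).to_bytes(2, "little")
        else:
            out.append(int(m.group(2), 16))
    return bytes(out)


def find_urls(data: bytes) -> list[bytes]:
    """Return runs of printable ASCII that start with http:// or https://."""
    return re.findall(rb"https?://[\x21-\x7e]+", data)


def save_shellcode(data: bytes, path: str) -> None:
    """Write the decoded bytes to a file for later debugging."""
    with open(path, "wb") as handle:
        handle.write(data)
```

If find_urls() returns nothing, that matches the second case described above: the shellcode most likely decodes itself at runtime, and the saved file is what you would load into a debugger.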

I'm collecting info on how to build an emulated environment to run these shellcodes, so that we can have an automated decoder for all kinds of shellcode.
There are some working applications for Linux for such tasks, and I'll try to get them working on Windows.
If I get them working, I'll let you know.

August 28, 2008, 06:18:16 am
Reply #2

m1573r

  • Special Access
  • Newbie

  • Offline
  • *

  • 6
bobby, today all shellcodes are simply ripped from milw0rm
or older stuff. All the malware I have found uses the same code;
maybe VXers are too lazy to write something new  :)

August 28, 2008, 03:00:58 pm
Reply #3

bobby

@m1573r
Yes and no. Last week I got shellcode (JS inside a PDF) which I do not recognize at all. It does not look like any encoder from milw0rm. So there is still someone who codes riddles for us to solve.

September 21, 2008, 05:38:28 pm
Reply #4

SysAdMini

This is a video which demonstrates how to analyze shellcode of malicious pdf files.

http://www.honeynor.no/2008/08/24/analysing-malicious-pdf-documents-and-shellcode/

September 24, 2008, 10:45:21 pm
Reply #5

bobby

I've just finished a tool to extract and decompress all the compressed streams from a PDF file.
Please give it a shot and tell me if it works with the samples you have:
http://sourceforge.net/project/showfiles.php?group_id=203466

September 25, 2008, 07:50:32 am
Reply #6

SysAdMini

Thanks Bobby,

it works. I found one file which gives an error, but the javascript part was extracted.
So it doesn't matter.

Code: [Select]
hxxp://v2count.net/in/1/output.pdf


Streams 1 and 3 have zero length.

Code: [Select]
Inflater v1.0 by bobby

Numbers in brackets are the beginning and the end adresses of streams

Found stream nr.1 ( $04EF, $04FB)
Inflated stream nr.1

Found stream nr.2 ( $058F, $097E)
Inflated stream nr.2

Found stream nr.3 ( $0A4C, $0A98)
Failed to inflate stream nr.3

Done


September 25, 2008, 10:36:08 am
Reply #7

bobby

The two problematic streams can't be decompressed by some other tools either, so I think it is something other than standard zlib compression.
The first stream is just a couple of bytes long... strange one.

November 18, 2008, 05:07:31 pm
Reply #8

bobby

A serious bug has been found in PDF_streams_inflater.
Please delete it from your HDD immediately and download the bugfix version from the same link.
If the files are not yet updated on the SourceForge mirrors, please wait until they are.
The file timestamps (dates) should be from today.

November 22, 2008, 07:35:15 pm
Reply #9

bobby

Linux version of PDF_streams_inflater

Please report any bugs you find.

January 18, 2009, 11:08:55 am
Reply #10

DiFor

  • Jr. Member

  • Offline
  • **

  • 19
Sorry, but I cannot download PDF_streams_inflater from any of the SourceForge servers. Could you upload it somewhere else? Thanks.

January 18, 2009, 11:20:38 am
Reply #11

bobby

Hi,

here it is

January 18, 2009, 08:53:33 pm
Reply #12

DiFor

Big thanks to you. Your software is super ;)

January 19, 2009, 05:02:39 pm
Reply #13

SysAdMini

Today I found a PDF file which pdftk was unable to decode.

Code: [Select]
hxxp://suttds.com/spun/pdf.php

Bobby's inflater did the job. :)

July 14, 2009, 09:33:01 am
Reply #14

sgres

  • Newbie

  • Offline
  • *

  • 2
Hi,

There is a PDF referenced by www.milw0rm.com from which I couldn't extract the JavaScript! Can anyone help me with this issue?

Here is the link:

Code: [Select]
http://milw0rm.com/sploits/2009-crashy_the_clown.pdf

Please be careful! The PDF installs a trojan on your machine.

Thanks

sgres  

MysteryFCM: Embedded URL in BBCode tags