Thursday, March 24, 2011

Malware Distribution Server Hostname Similarities by Michael Rash

On the Arbor Networks security blog there was an interesting post
about a Chinese DDoS bot called "JKDDOS" that appears to specifically
target the heavy mining industry. That by itself is noteworthy
considering that China is reducing rare earth exports (see:,
but what I found interesting in the JKDDOS analysis is a link between
the distribution servers for JKDDOS and another malware family called
the Avzhan family. That is, a distribution server for JKDDOS is
(sanitized) and two distribution servers for
the Avzhan family are and So, they are the same to within one
character, and this is most likely not just a coincidence.

Now, if we wanted to take our current malware repository and quickly
determine which distribution hostnames are highly similar, we could
use something like the Perl String::Similarity module, or the
Levenshtein Python extension (see:
Given two strings, each of these will return a number between 0 and 1
that is a measure of how similar they are. Zero implies totally
different, and 1 implies identical. So, for a quick one-liner in Perl
(see below), we can see that the two hostnames mentioned above are
extremely similar, and it would be an interesting result to see this
applied across our entire malware repository - we might discover a
previously unknown relationship between two pieces of malware that is
worth exploring. Of course, just because a malware distribution
server is similar does not (by itself) prove anything - more
investigation would be necessary.

$ perl -e 'use String::Similarity; print similarity($ARGV[0],
$ARGV[1]), "\n"'
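
To apply the same idea across a whole repository in Python, a minimal sketch might look like the following. It assumes the distribution hostnames have already been extracted into a list, and it uses the standard library's difflib.SequenceMatcher as a stand-in for String::Similarity or the Levenshtein extension, so the scores will not match those libraries exactly. The function name similar_pairs, the 0.9 threshold, and the example hostnames are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(hostnames, threshold=0.9):
    """Return (score, a, b) for every hostname pair scoring >= threshold."""
    pairs = []
    # De-duplicate first so identical entries do not report themselves
    for a, b in combinations(sorted(set(hostnames)), 2):
        # ratio() is between 0 (nothing in common) and 1 (identical)
        score = SequenceMatcher(None, a, b).ratio()
        if score >= threshold:
            pairs.append((score, a, b))
    # Highest-scoring pairs first
    return sorted(pairs, reverse=True)


if __name__ == "__main__":
    # Hypothetical hostnames, not taken from any real malware feed
    hosts = ["dl1.example.com", "dl2.example.com", "files.example.org"]
    for score, a, b in similar_pairs(hosts):
        print("%.3f  %s  %s" % (score, a, b))
```

Comparing every pair is quadratic in the number of hostnames, which is fine for a quick look but would likely need bucketing (for example by domain suffix) on a large repository.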

Michael Rash
If you would like to discuss DDOS, or anything else in this post, email Michael at:
michael (dot) rash (at) g2-inc (dot) com

Tuesday, March 15, 2011

Parsing Malware XML - By Riley Porter

AMA's XML parsing started throwing an error two days ago. It turns out that
there were some invalid characters in the URL section of the malware stream. This is nothing
new; I noticed the same sort of thing a few months back. I got around it last time by walking every line
before we processed the XML and rewriting the file to disk, like so:

import string

# Keep only the characters that appear in string.printable
for line in xmlfile.readlines():
    tmpfile.writelines(filter(lambda x: x in string.printable, line))

Why is this breaking again? Continue down the rabbit hole. Here is the
offending XML:

(note that blogger is stripping some of this content but you get the idea)
So the crazy thing is the URL element actually worked! If you were to
paste that into a browser it would have downloaded a file.
Note I have made http:// into hxxp:\\ in order to not make it an auto
link. So what is this link to? It looks like it's a copy of "DarkComet RAT".
DarkComet RAT is a Remote Administration Tool (known as a
RAT in the malware community). It is a zip file too, so it's not super
dangerous inherently. However, there are executables inside the archive that I have
NOT examined or tested. Best not mess with them. However, here is the
link for the RAT tools webpage:

The question still remains: why is our XML parser dying on this URL? As it turns out, the
bytes 0x09 through 0x0d (tab, line feed, vertical tab, form feed and carriage return) are all in string.printable.
We were seeing some sneaky malware authors placing a form feed in the URLs (strictly, form feed is 0x0c; 0x0d is carriage return). When these bytes are pasted
into a browser URL bar they are stripped right out.
However, XML parsers and Python's file.readlines() interpreted the byte as a
new line, which in turn broke the parsing engine! So anyhow, here's the
code should anyone want to see it.

"""A bit of a hack... However it works. Some malware authors
were innovating with the use of non-printable characters a few months back.
Filtering based on string.printable fixed most of this until today. 1-17-2011
It appears 0x0a - 0x0d are "PRINTABLE". What we were seeing is form feed
inserted into our xml url streams. This caused a new line in the xml and then
an invalid xml document as the tag was not closed. This code fixes that issue.
"""
for y in
    if iscntrl(y) and y != "\n":
        y = ""
    tmpfile.write(filter(lambda x: x in string.printable, y))

With that code added to our pre-processor class our xml is now 'clean' enough to be parsed.
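
For anyone who wants something self-contained to experiment with, here is a rough Python 3 sketch of the same idea. It is not the pre-processor class itself: the names is_control and clean_text, the url element, and the example.invalid URL are all made up for illustration. It keeps "\n", drops every other ASCII control character, and drops anything outside string.printable. The demo also feeds the raw and cleaned text to the standard library's expat-based xml.etree.ElementTree parser, which rejects a form feed as an illegal XML character.

```python
import string
import xml.etree.ElementTree as ET

PRINTABLE = set(string.printable)


def is_control(ch):
    """True for ASCII control characters (0x00-0x1f and 0x7f)."""
    return ord(ch) < 0x20 or ord(ch) == 0x7F


def clean_text(text):
    """Drop non-printable characters and all control characters except newline."""
    return "".join(
        ch for ch in text
        if ch == "\n" or (ch in PRINTABLE and not is_control(ch))
    )


if __name__ == "__main__":
    # These whitespace control characters pass a plain string.printable filter
    print([hex(ord(c)) for c in sorted(string.whitespace)])

    # Hypothetical feed fragment with a carriage return and a form feed in the URL
    dirty = "<url>hxxp://example.invalid/a\rb\x0cc.zip</url>\n"
    try:
        ET.fromstring(dirty)
    except ET.ParseError as err:
        print("raw feed rejected:", err)

    # After cleaning, the same fragment parses
    print(ET.fromstring(clean_text(dirty)).text)
```

Note that this also strips tabs, which matches the snippet above (only "\n" is spared); relax is_control if tabs matter in your feed.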

To see a picture of the hex in action go here.


P.S. If you would like to discuss this post further email Riley at riley (dot) porter (at) g2-inc (dot) com.