December 2014

Please note that republishing this article in full or in part is only allowed under the conditions described here.

Hiding Malware in Plain Sight From Online Scanners

There are several sites that offer to scan a URL for malware. One would expect these sites to emulate a real browser well enough that their rating can be trusted. Unfortunately this is not the case.

Based on research I published about 17 months ago on unusual Content-Encoding headers, I had a closer look at the following major online scanners: Virustotal, ZScaler and Comodo Web Inspector.

For testing I compressed the content in the following ways and announced the compression with the Content-Encoding header: gzip, deflate (both raw deflate and zlib), and deflate applied twice.

To simulate an attacker who tries to be as anonymous as possible, I used one of the many sites offering free PHP hosting, because PHP is all that is needed to add custom HTTP headers. All of the tests deliver the harmless EICAR test virus, which should be detected by all virus scanners.

Content-Encoding: gzip

This small PHP page delivers the EICAR test virus compressed with gzip. All major browsers understand this format. The good news is that all of the tested online malware scanners also understand it and detect the virus. Unfortunately, this is most of the good news from this research.

    <?php
    header('HTTP/1.0 200 ok');
    header('Content-type: text/plain');
    header('Content-Encoding: gzip');
    
    // EICAR compressed with gzip and base64 encoded
    echo base64_decode('H4sIAPVklFQAA4sw9VcMUHVwDIg2iQmIijA10QiI0zR3dtY0r1Vx9XR2DNINDnH0c3EMctF19AvxDPMMCg3WDXENDtF18/RxVVTx0PbQAgA8z1FoRAAAAA==');
    exit(0);  // exit explicitly so that the free PHP hoster has no chance to append its own content
    ?>
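A string like the one in the page above can be produced with PHP's zlib extension. The following is only a minimal sketch and not necessarily how the page above was built. The variable names and the placeholder text in `$payload` are assumptions for illustration; the placeholder stands in for the EICAR string.

```php
<?php
// Placeholder payload (an assumption for illustration); the page above
// carries the EICAR test string instead.
$payload = 'harmless placeholder payload';

// gzencode() produces a complete gzip stream (RFC 1952):
// header, deflate data and checksum trailer.
$gzip = gzencode($payload);

// Base64 makes the binary stream safe to paste into a PHP source file.
$encoded = base64_encode($gzip);
echo $encoded, PHP_EOL;

// Round trip: decoding must give back the original payload.
$roundTrip = gzdecode(base64_decode($encoded));
echo $roundTrip === $payload ? 'round trip ok' : 'round trip failed', PHP_EOL;
```

The Base64 text starts with `H4sI`, just like the string in the page above, because every gzip stream begins with the bytes 0x1f 0x8b 0x08.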

Content-Encoding: deflate

These PHP pages use either raw deflate (RFC 1951), as supported by all major browsers, or zlib (RFC 1950), as supported by at least Google Chrome and Firefox. Surprisingly, only Virustotal understands this compression scheme; both ZScaler and Comodo Web Inspector fail to detect the malware.

The reason might be that the scanners look only at the content (the HTTP body) and ignore any information about the Content-Encoding inside the HTTP header. But while gzip compression can be detected from a few magic bytes at the beginning of the file (the gzip header), deflate compression cannot be detected this way: raw deflate has no header at all, and zlib has only a short two-byte header.
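The difference can be illustrated with a small sketch of content sniffing, separate from the test pages. This is an assumption about how a body-only check might work, not the actual logic of any of the scanners. The function name `sniffCompression` and the placeholder payload are made up for this example.

```php
<?php
// Hypothetical helper: guess the compression format from the first bytes
// of the body alone, without looking at any HTTP header.
function sniffCompression(string $body): string
{
    // gzip (RFC 1952) always starts with the magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        return 'gzip';
    }
    // zlib (RFC 1950) has a two-byte header: the low nibble of the first
    // byte is 8 (deflate) and the 16-bit header value is a multiple of 31.
    if (strlen($body) >= 2) {
        $cmf = ord($body[0]);
        $flg = ord($body[1]);
        if (($cmf & 0x0f) === 8 && (($cmf << 8) | $flg) % 31 === 0) {
            return 'zlib';
        }
    }
    // Raw deflate (RFC 1951) has no header, so nothing can be recognized.
    return 'unknown';
}

// Placeholder payload (an assumption); any short text behaves the same way.
$payload = 'harmless placeholder payload';
$samples = [
    'gzip'        => gzencode($payload),
    'zlib'        => gzcompress($payload),
    'raw deflate' => gzdeflate($payload),
];
foreach ($samples as $name => $body) {
    printf("%-11s -> %s\n", $name, sniffCompression($body));
}
```

The zlib check is weak: two bytes of uncompressed content can pass it by chance, so a real implementation should not rely on it alone.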

    <?php
    header('HTTP/1.0 200 ok');
    header('Content-type: text/plain');
    header('Content-Encoding: deflate');
   
    // EICAR compressed with RFC 1951 (raw deflate)
    echo base64_decode('izD1VwxQdXAMiDaJCYiKMDXRCIjTNHd21jSvVXH1dHYM0g0OcfRzcQxy0XX0C/EM8wwKDdYNcQ0O0XXz9HFVVPHQ9tACAA==');
    exit(0);
    ?>

    <?php
    header('HTTP/1.0 200 ok');
    header('Content-type: text/plain');
    header('Content-Encoding: deflate');
   
    // EICAR compressed with RFC 1950 (zlib)
    echo base64_decode('eJyLMPVXDFB1cAyINokJiIowNdEIiNM0d3bWNK9VcfV0dgzSDQ5x9HNxDHLRdfQL8QzzDAoN1g1xDQ7RdfP0cVVU8dD20AIAdFQSDw==');
    exit(0);
    ?>
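A client that wants to match the browser behaviour described above has to accept both formats for `Content-Encoding: deflate`. One possible approach, sketched here under the assumption that trying zlib first and raw deflate second is acceptable, looks like this. The helper name `decodeDeflate` and the placeholder payload are hypothetical.

```php
<?php
// Hypothetical helper: decode a body announced as "Content-Encoding: deflate",
// accepting both zlib (RFC 1950) and raw deflate (RFC 1951).
// Returns null if the body is valid in neither format.
function decodeDeflate(string $body): ?string
{
    // Try the zlib wrapper first; @ silences the warning PHP raises
    // when the data does not match the expected format.
    $decoded = @gzuncompress($body);
    if ($decoded === false) {
        // Fall back to raw deflate without any header or checksum.
        $decoded = @gzinflate($body);
    }
    return $decoded === false ? null : $decoded;
}

// Placeholder payload (an assumption for illustration).
$payload = 'harmless placeholder payload';
var_dump(decodeDeflate(gzcompress($payload)) === $payload); // zlib variant
var_dump(decodeDeflate(gzdeflate($payload)) === $payload);  // raw variant
var_dump(decodeDeflate('not compressed at all'));           // neither format
```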

Content-Encoding: deflate, deflate

In this case the content is compressed twice, i.e. compress(compress(content)). While this looks like a strange feature (it actually makes sense sometimes, when different types of compression are combined), it is in the standard and is supported by at least Google Chrome and Firefox. It is not supported by Microsoft Internet Explorer, which assumes no compression at all in this case. It is also not supported by many Intrusion Detection Systems (see previous research), which assume only a single compression and ignore the rest. Virustotal, ZScaler and Comodo Web Inspector all fail to detect the malware as well.

    <?php
    header('HTTP/1.0 200 ok');
    header('Content-type: text/plain');
    header('Content-Encoding: deflate, deflate');
   
    // EICAR compressed twice with raw deflate 
    echo base64_decode('AUYAuf+LMPVXDFB1cAyINokJiIowNdEIiNM0d3bWNK9VcfV0dgzSDQ5x9HNxDHLRdfQL8QzzDAoN1g1xDQ7RdfP0cVVU8dD20AIACg==');
    exit(0);
    ?>
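Handling such a response means undoing every listed coding, not just one. The HTTP standard lists codings in the order in which they were applied, so a client has to undo them from last to first. The following is a minimal sketch of that idea; the function name `decodeBody`, its error handling and the placeholder payload are assumptions for illustration and do not describe how any browser or scanner is implemented.

```php
<?php
// Hypothetical helper: undo every coding named in a Content-Encoding value.
function decodeBody(string $body, string $contentEncoding): string
{
    // Codings are listed in the order they were applied,
    // so they are undone from last to first.
    $codings = array_reverse(
        array_map('trim', explode(',', strtolower($contentEncoding)))
    );
    foreach ($codings as $coding) {
        switch ($coding) {
            case '':
            case 'identity':
                continue 2; // nothing to undo for this entry
            case 'gzip':
                $decoded = @gzdecode($body);
                break;
            case 'deflate':
                // Accept zlib (RFC 1950) first, then raw deflate (RFC 1951).
                $decoded = @gzuncompress($body);
                if ($decoded === false) {
                    $decoded = @gzinflate($body);
                }
                break;
            default:
                throw new RuntimeException("unsupported coding: $coding");
        }
        if ($decoded === false) {
            throw new RuntimeException("body is not valid $coding data");
        }
        $body = $decoded;
    }
    return $body;
}

// Placeholder payload (an assumption), compressed twice with raw deflate.
$payload = 'harmless placeholder payload';
$twice = gzdeflate(gzdeflate($payload));

// Undoing both layers restores the payload.
var_dump(decodeBody($twice, 'deflate, deflate') === $payload);

// Undoing only one layer leaves data that is still compressed.
var_dump(gzinflate($twice) === gzdeflate($payload));
```

A decoder that stops after the first layer ends up scanning compressed bytes, in which a signature for the original content will not be found.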

Conclusions

While file-based malware scanners already have enough problems of their own in reliably detecting malware, it gets much worse once you add the seemingly simple task of retrieving the content from a web site. Analysis of the logs of my test site indicates that Google and other bots have similar problems. And the requirements for a black hat to mount such an attack are trivial: any of the free PHP hosters or similar sites is enough to create the necessary HTTP responses.

This means that you can trust neither the online scanners nor Google Safe Browsing nor other technologies that are based on scanning the internet for malware.

How to do your own tests?

