(mt):How to uncover malicious code/malware files

  • This page was last modified on November 25, 2011, at 20:04.
The (mt) Community Wiki is a collaborative project. Any (mt) Media Temple customer or employee may contribute. Not all articles and/or content have been tested for accuracy by (mt) Media Temple.

For officially moderated and tested articles, be sure to visit our KnowledgeBase.

From (mt) Community Wiki

Overview

An unfortunate side effect of being online is that you are continually probed for weaknesses by ne'er-do-wells. Be it your computer, your internet provider, or your website, someone is almost always trying to find a way in to further their illicit goals, giving you a pretty massive headache as a result. While this article cannot teach you everything, it will give you a solid base of skills to use when troubleshooting your own sites.

Requirements

This article assumes the user is comfortable with using SSH for navigation of Linux and the editing of files therein.

Your stalwart allies: Find, Grep & Stat

These three commands can easily uncover most kinds of malicious code and can often help point you towards the source of the attack, if they're used properly. I will break down how to use each command separately, and then later how they can be used in concert.

Find

The bitter, Linux manual definition of the command: "The find utility recursively descends the directory tree for each path listed, evaluating an expression in terms of each file in the tree."

A better explanation: "Find lets you search an area to look for files or folders as defined by a number of variables, such as by name, by owner, by time modified, etc."

Let's show some basic examples. You want to search the directory "/home/mywebsite" for a file called foobar.txt. There are a bunch of folders and subfolders inside that directory, and it will take ages to look through each folder one at a time. To get around this, you will run the following command:

find /home/mywebsite -type f -name "foobar.txt"

The command, if entered properly, will come up with the following result:

/home/mywebsite/folder/another-folder/foobar.txt

Et voilà, we have found where that file is located!

Next, I want to find a list of files in that same "/home/mywebsite" directory that have been changed in the last 7 days. To make this happen, you will run:

find /home/mywebsite -type f -ctime -7

The -7 after -ctime matches files whose status changed less than 7 days ago. If the minus is replaced with a plus (+7), it matches files whose status last changed more than 7 days ago.
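The difference between the minus and plus forms can be checked safely before running anything against a real site. The sketch below assumes GNU find and works in a throwaway directory created with mktemp (the directory and the file name fresh.php are made up for illustration). A file created a moment ago is matched by -ctime -7 and ignored by -ctime +7.

```shell
# Hypothetical scratch directory; nothing under /home/mywebsite is touched.
sandbox=$(mktemp -d)
touch "$sandbox/fresh.php"

# Status changed less than 7 days ago: the new file is listed.
recent=$(find "$sandbox" -type f -ctime -7)

# Status changed more than 7 days ago: nothing is listed.
old=$(find "$sandbox" -type f -ctime +7)

echo "recent: $recent"
echo "old:    ${old:-none}"

# Clean up the scratch directory.
rm -rf "$sandbox"
```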

Now, let's do something a little more complex. I want to find a list of all files inside "/home/mywebsite" with the extension .php that have been changed within 30 days:

find /home/mywebsite -type f -name "*.php" -ctime -30

There are a lot more options that 'find' can handle, but these three examples are generally enough to get you going. If you want to know more, just type "man find" via SSH.

Grep

The bitter, Linux manual definition of the command: "Grep searches the named input FILEs (or standard input if no files are named, or the file name - is given) for lines containing a match to the given PATTERN. By default, grep prints the matching lines."

A better explanation: "Grep lets you search files for a matching text pattern."

'Grep' is one of the greatest commands for finding malicious files, but it can also turn up a lot of false-positives. I will cover just the very basics of the command in this article. Once again, we are looking to find something inside "/home/mywebsite". We are looking for the phrase "you just lost", located in a file somewhere inside that directory. Using SSH, run the following command:

grep -R "you just lost" /home/mywebsite

If a file contains that phrase, 'grep' will print the path to the file, followed by the line containing the matching text:

/home/mywebsite/foo/bar/guesswhat.txt:			you just lost the game.

Please note that 'grep' is case-sensitive by default, so you may want to add the "-i" flag to make the search case-insensitive.
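To see the effect of "-i", here is a small sketch that runs both searches in a throwaway directory. The directory comes from mktemp and the file contents are invented for the test; only the flags themselves (-R and -i) are taken from 'grep'.

```shell
# Hypothetical scratch directory with one file whose phrase is capitalised.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/foo/bar"
printf 'You Just Lost the game.\n' > "$sandbox/foo/bar/guesswhat.txt"

# Case-sensitive search: the capitalised phrase is not matched.
# grep exits non-zero when nothing matches, hence the "|| true".
strict=$(grep -R "you just lost" "$sandbox" || true)

# Case-insensitive search: the same phrase is now found.
loose=$(grep -R -i "you just lost" "$sandbox" || true)

echo "case-sensitive:   ${strict:-no match}"
echo "case-insensitive: $loose"

# Clean up the scratch directory.
rm -rf "$sandbox"
```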

There are more functions that 'grep' can perform. To learn more, type "man grep" via SSH.

Stat

The bitter, Linux manual definition of the command: "Display file or filesystem status."

A better explanation: "Display permissions, ownership and various timestamps of a file."

To see an example of the 'stat' command output, let's take a look at that file we found earlier via the 'find' command, "/home/mywebsite/folder/another-folder/foobar.txt":

stat /home/mywebsite/folder/another-folder/foobar.txt

File: `/home/mywebsite/folder/another-folder/foobar.txt'
Size: 19043     	Blocks: 39         IO Block: 32768  regular file
Device: 17h/23d	Inode: 140209072   Links: 1
Access: (0644/-rw-r--r--)  Uid: (841608/mywebsite-user)   Gid: (88432/mywebsite-user)
Access: 2011-10-22 21:10:09.106667057 -0700
Modify: 2011-11-14 15:14:19.493663971 -0800
Change: 2011-11-14 15:14:19.494043373 -0800

It's not quite as complex as it looks, once you know what to look for. The line with "Access/Uid/Gid" simply lays out the read/write/execute permissions of the file, and which user and group own it. These can be important for other administrative duties, but are not what we're concerned with in the scope of this article.

The last three lines deal with the time-stamps of the file. "Access" refers to the last time the contents of the file were read. "Modify" refers to the last time the actual contents of the file were written. "Change" refers to the last time the file's status changed: its permissions, its ownership, a rename (among other things), and also any write to its contents. None of the three records when the file was first created.

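Many malicious scripts are able to reset the access and modify time-stamps to hide their work (the 'touch' command can set both to any date), but the change time-stamp cannot be set that way and will still reflect that someone has been inside the file. This is why the 'find' examples above search by -ctime. If a file has been modified via FTP, all three time-stamps will typically be changed to the same date.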
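The behaviour described above can be reproduced in a scratch directory. This is only a sketch: it assumes GNU coreutils ('touch -d' and 'stat -c') and GNU find, and the file index.php is made up for the test. The modify time is backdated to 2011, yet the change time stays current, so a search by -ctime still finds the file while a search by -mtime does not.

```shell
# Hypothetical scratch directory and file.
sandbox=$(mktemp -d)
target="$sandbox/index.php"
echo '<?php echo "hello"; ?>' > "$target"

# Backdate the access and modify times, as a script hiding its tracks might.
touch -d "2011-01-01 00:00:00" "$target"

# GNU stat format codes: %Y = modify time, %Z = change time (epoch seconds).
modify=$(stat -c %Y "$target")
change=$(stat -c %Z "$target")

# Human-readable versions of the same two time-stamps.
stat -c 'Modify: %y' "$target"
stat -c 'Change: %z' "$target"

# Searching by modify time misses the file; searching by change time finds it.
by_mtime=$(find "$sandbox" -type f -mtime -7)
by_ctime=$(find "$sandbox" -type f -ctime -7)
echo "found by -mtime -7: ${by_mtime:-none}"
echo "found by -ctime -7: $by_ctime"

# Clean up the scratch directory.
rm -rf "$sandbox"
```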

Hunting for malicious code

As I mentioned earlier, I cannot teach you how to find every type of malicious file out there, but with the three commands above, and a little bit of gumshoe detective work, you'll be able to find most types.

Base64

Base64-encoded code is what is seen most often on attacked sites. Typically the attacker's script will inject the code into either the first or last line of a file, and it can sometimes be painfully obvious to find. The 'grep' command is most useful for this, though as mentioned it does turn up a lot of false-positives. When searching for base64, you will normally want to use one of the following text patterns:

  • base64_decode
  • gzinflate(base64_decode
  • eval(gzinflate(base64_decode
  • eval(base64_decode

Listed below is an example of what you might see inside an infected PHP file:

<?php eval(base64_decode("
ZWNobygiTG9yZW0gaXBzdW0gZG9sb3Igc2l0IGFtZXQsIGNvbnNlY3RldHVyIGFkaXBpc2ljaW5n
IGVsaXQsIHNlZCBkbyBlaXVzbW9kIHRlbXBvciBpbmNpZGlkdW50IHV0IGxhYm9yZSBldCBkb2xv
cmUgbWFnbmEgYWxpcXVhLiBVdCBlbmltIGFkIG1pbmltIHZlbmlhbSwgcXVpcyBub3N0cnVkIGV4
ZXJjaXRhdGlvbiB1bGxhbWNvIGxhYm9yaXMgbmlzaSB1dCBhbGlxdWlwIGV4IGVhIGNvbW1vZG8g
Y29uc2VxdWF0LiBEdWlzIGF1dGUgaXJ1cmUgZG9sb3IgaW4gcmVwcmVoZW5kZXJpdCBpbiB2b2x1
cHRhdGUgdmVsaXQgZXNzZSBjaWxsdW0gZG9sb3JlIGV1IGZ1Z2lhdCBudWxsYSBwYXJpYXR1ci4g
RXhjZXB0ZXVyIHNpbnQgb2NjYWVjYXQgY3VwaWRhdGF0IG5vbiBwcm9pZGVudCwgc3VudCBpbiBj
dWxwYSBxdWkgb2ZmaWNpYSBkZXNlcnVudCBtb2xsaXQgYW5pbSBpZCBlc3QgbGFib3J1bS4iKTs=
"));?><?php

echo("Hello world, I am testing my website!");
?>

What you see above is safe; it simply decodes into a PHP echo statement that prints sentences from "lorem ipsum".

However, legitimate uses of base64 can come from plugin authors who are trying to hide their code, people who want to embed an image directly into a CSS file, etc. Due to this, you will need to take your 'grep' results, and compare them against known clean versions of those files.
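One way to make that comparison is with 'diff', which this article has not covered so far. The sketch below is an assumption about workflow, not an official procedure: it pretends that a known clean copy (for example, unpacked from a fresh download or a backup) sits next to the live copy, and both directories and all file contents are invented for the test.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/clean" "$sandbox/live"

# Hypothetical known clean copy.
printf '<?php echo "plugin"; ?>\n' > "$sandbox/clean/plugin.php"
printf 'body { color: black; }\n' > "$sandbox/clean/style.css"

# Hypothetical live copy: plugin.php has gained an injected first line.
printf '<?php eval(base64_decode("ZWNobygiaGkiKTs=")); ?>\n<?php echo "plugin"; ?>\n' > "$sandbox/live/plugin.php"
printf 'body { color: black; }\n' > "$sandbox/live/style.css"

# -r compares the two trees recursively; -q only names the files that differ.
# diff exits non-zero when it finds differences, hence the "|| true".
report=$(diff -r -q "$sandbox/clean" "$sandbox/live" || true)
echo "$report"

# Clean up the scratch directory.
rm -rf "$sandbox"
```

Only plugin.php is reported, so that is the file to inspect; dropping the -q flag shows the differing lines themselves.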

JavaScript

JavaScript injections are most often seen in HTML files or in the header/footer of some PHP files, and they can sometimes be overlooked at first glance. Often the code will link to a URL on a country-code domain (for example, .co.nz, .ru, .br, etc.). Here is a rough example of what malicious JavaScript code can look like:

<script src="http://a.very.bad.website/attack.js"></script>

Because of the broad use of JavaScript for webpages, doing a 'grep' for "<script src" would probably not be the best idea, as it would match nearly every page. If you can hunt down at least one page it is occurring on, find the malicious code on that page and then run a 'grep' command for the referenced URL.
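A minimal sketch of that search, using the made-up host name from the example above and two hypothetical HTML files in a scratch directory. It assumes GNU grep, xargs and stat; the -l, -F and -Z flags and the 'stat -c' format are additions not covered earlier in this article. The second command passes the matching files to 'stat' so their change time-stamps can hint at when the injection happened.

```shell
# Hypothetical scratch directory: one page is injected, one is clean.
sandbox=$(mktemp -d)
printf '<html><script src="http://a.very.bad.website/attack.js"></script></html>\n' > "$sandbox/index.html"
printf '<html><script src="/js/menu.js"></script></html>\n' > "$sandbox/about.html"

# Search for the attacker's host name as literal text (-F), listing files only (-l).
hits=$(grep -R -l -F "a.very.bad.website" "$sandbox" || true)
echo "$hits"

# -Z ends each file name with a zero byte so names containing spaces survive;
# xargs -0 reads that list and -r skips running stat when nothing matched.
grep -R -l -Z -F "a.very.bad.website" "$sandbox" | xargs -0 -r stat -c '%z  %n'

# Clean up the scratch directory.
rm -rf "$sandbox"
```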

Putting it all together

Coming soon!