Document-Based Exploits (Part II)

Several factors make Adobe Reader an attractive target for anyone trying to get malicious code run on a victim’s machine. The first is that the application has many buffers that can be populated simply by loading a document. Adobe Reader can also be thought of as an interpreter, executing whatever valid code is contained within a document, using functions that potentially have vulnerabilities.
The biggest factor is that the software is installed on most desktop computers, which gives the largest number of potential victims. The problem is exacerbated by web browsers that automatically load PDFs in a browser plugin after fetching them from web servers.

Here I’m going to illustrate that with two particular exploits I found in the CRIMEPACK, Blackhole, Eleanor and Phoenix crimeware kits (full write-up on these later).

Collab.getIcon()
Discovered (or publicly disclosed) in March 2009, the Collab.getIcon() vulnerability appears to be specific to Adobe Reader, and the exploit must be implemented as a JavaScript call to this method. According to the advisories, it is a typical stack-based buffer overflow triggered by a malformed call to getIcon(). This allows arbitrary code execution in the usual way, by changing the Instruction Pointer value to the address of some malicious code. An example of this and a copy of the vulnerable application (for Windows users) are available from the Offensive Security exploit database (number 9579). The exploit is also available as a Metasploit module.

We’re looking for two things within a malicious PDF: something that causes an exception, and a payload that executes when the exception occurs. So, if we run the strings utility on an example PDF from SecurityFocus… where the hell is the exploit? Where is the getIcon() request for that matter?
The best place to start is by looking at the file’s structure and layout. PDFs are self-referencing: each section is an object marked by an object number and a generation number, such as 1 0, 2 0, 3 0, etc. The contents of each object can also contain a reference to another object. In the SecurityFocus PDF, one object, 3 0, references some JavaScript in another object, 7 0 R:

js-referenced
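To make the object/reference layout concrete, here is a rough Python sketch that walks the objects in a PDF and reports which ones point at JavaScript held in another object. The sample bytes are hand-written for illustration and are not taken from the SecurityFocus file, and the regular expressions assume the names are not obfuscated, which they usually are in crimeware PDFs.

```python
import re

# Matches a whole indirect object, e.g. b"3 0 obj ... endobj"
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
# Matches a /JS entry that points at another object, e.g. b"/JS 7 0 R"
JS_REF_RE = re.compile(rb"/JS\s+(\d+)\s+(\d+)\s+R")


def find_js_references(pdf_bytes):
    """Return (object_number, referenced_object_number) pairs for /JS entries."""
    refs = []
    for match in OBJ_RE.finditer(pdf_bytes):
        obj_num = int(match.group(1))
        body = match.group(3)
        for ref in JS_REF_RE.finditer(body):
            refs.append((obj_num, int(ref.group(1))))
    return refs


# Hand-written fragment for illustration only (hypothetical content).
SAMPLE = (
    b"3 0 obj\n<< /Type /Action /S /JavaScript /JS 7 0 R >>\nendobj\n"
    b"7 0 obj\n<< /Length 5 >>\nstream\nabcde\nendstream\nendobj\n"
)

print(find_js_references(SAMPLE))
```

Against a real file, the bytes would come from open(path, "rb").read(), and the object numbers returned tell you which stream to look at next.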

By the way, I’m using the SecurityFocus example because the references within the actual crimeware PDF are also obfuscated.
Looking further through the code, at the referenced object, some random characters are found. This, I believe, is the exploit and payload.

referenced-js-code

However, it’s unintelligible because the content of that section is compressed, which also helps the malicious code get past various intrusion detection/prevention methods. For a while it would also have made reverse-engineering tricky because the tools were less readily available. A compressed stream is indicated by the ‘/Filter /FlateDecode‘ string.
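FlateDecode is the PDF name for zlib/deflate compression, so the general idea of inflating a stream can be sketched in a few lines of Python. This is a minimal illustration under the assumption that the stream has a single Flate filter and no encryption; the sample object and its JavaScript are made up, and real samples often stack several filters.

```python
import re
import zlib

# Captures the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\nendstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decompressed contents of every stream that zlib can inflate."""
    results = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            results.append(zlib.decompress(match.group(1)))
        except zlib.error:
            # Not Flate data, or wrapped in additional filters; skip it.
            continue
    return results


# Hypothetical script and object, built here purely for demonstration.
script = b"var payload = unescape('%u9090%u9090');"
compressed = zlib.compress(script)
sample = (
    b"7 0 obj\n<< /Filter /FlateDecode /Length %d >>\nstream\n" % len(compressed)
    + compressed
    + b"\nendstream\nendobj\n"
)

print(inflate_streams(sample))
```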
To uncompress/decode this section, I used qpdf. Since this doesn’t work on the SecurityFocus sample, I ran qpdf on the actual crimeware PDF instead:

$qpdf --stream-data=uncompress geticon.pdf 3rdattempt.pdf

The output file contains the unobfuscated exploit and its payload, and the following section is instantly recognisable as an array buffering the payload:

getIcon-payload

Of course, the payload is unreadable, as the strings utility is attempting to interpret raw shellcode bytes as ASCII text. To read those, the PDF must be opened in a hex editor or something like Bokken.

bokken-pdf-shellcode
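Payloads in JavaScript-based PDF exploits are very often stored as ‘%uXXXX‘ escape sequences passed to unescape(). Assuming that encoding (the source sample may differ), a small Python sketch can turn such a string back into the raw bytes a hex editor or disassembler would show. The input below is a made-up example, not the crimeware payload.

```python
import re
import struct


def unescape_to_bytes(escaped):
    """Convert a '%uXXXX' escaped string into the bytes it represents in memory."""
    out = bytearray()
    for word in re.findall(r"%u([0-9a-fA-F]{4})", escaped):
        # Each %uXXXX is one 16-bit value, stored little-endian on x86.
        out += struct.pack("<H", int(word, 16))
    return bytes(out)


# Hypothetical input: two NOPs followed by the 16-bit value 0x4142.
print(unescape_to_bytes("%u9090%u4142").hex())
```

Note the byte order: 0x4142 comes out as 42 41, which is why shellcode read straight from the JavaScript source looks scrambled compared with a memory dump.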

Unfortunately the shellcode is OS-specific and I wasn’t using a Windows machine, so I didn’t analyse it further. What we do already know is the payload results in the installation of a banking Trojan. The payload was buffered as an array for the exploit code itself, which is seen further down:

decoded-geticon-exploit

util.printf()
Problems with the printf() family of functions in the C programming language are well known. They aren’t necessarily caused by the developers of Adobe Reader; the weakness is native to C, where the function trusts its format string and arguments and doesn’t check the stack boundaries. Here the vulnerability might be in the C code underlying the JavaScript interpreter.
A full description and an exploit attempt are published on the CORE Security site. The exploit works by overwriting the Structured Exception Handler address with that of another location where the shellcode is placed. Again, the malicious code is only executed with the privileges of whoever’s running the PDF reader.

Conclusion
As both exploits were found in three crimeware kits, it’s obvious their authors targeted something common to most desktop computers – Adobe Reader. The versions of CRIMEPACK, Blackhole and Eleanor being examined were all created around the same period, so they were either sold by the same group, or the exploits were proven the most effective for circulation among the crimeware authors.

What’s the worst that can happen? Both exploits have already been out there for five years, and other vulnerabilities like them have been found since, so this post only gives a taste of what to expect in a crimeware kit.
The impact of a successful exploit here depends on the credentials the Adobe Reader application is running under. If it’s a standard user account with limited privileges, the exploit would lead to only that account being initially compromised, although privilege escalation is always possible afterwards. The latter is unlikely, though, as the crimeware has obviously been developed to automate things as much as possible, and the attacker would have many compromised admin accounts. If the user has admin privileges at the time the application is exploited, the payload has full control of the system.
As can be demonstrated using Metasploit, the payload could be anything, including a reverse shell for remote access or code that fetches a malware installer. The people behind the crimeware were counting on some victims being logged into the admin account, or on being able to escalate their privileges after the account was compromised.

Data Retention Directive: Not Quite Defeated

The European Court of Justice has ruled the Data Retention Directive would be a violation of the European Convention on Human Rights, and the Data Retention Directive is therefore invalid.

The ECJ’s line of reasoning is that metadata paints such a detailed picture of a given person’s life that the right to privacy under the Human Rights Act (or the European Convention on Human Rights) would become a joke. It’s also generalised enough that it would conceivably give citizens the feeling they were constantly under surveillance, and applied broadly enough that it no longer qualifies as a valid exception to the ECHR.
Of course, being a ham-fisted power grab, the Data Retention Directive lacked the safeguards to prevent the data being improperly accessed and misused.

Data Retention Directive: Invalid, but not quite dead yet
For readers who are wondering how significant the ECJ ruling actually is, here’s my understanding of how things work: Basically the European Union consists (at least) of the European Council and the European Parliament. A proposition is made by the European Parliament, submitted to the EC, then voted on in the European Parliament. Sometimes laws can originate from the EC itself. In any case, the outcome is often a ‘Directive’, basically an order for the EU ‘states’ to implement it as legislation. In the UK, the Directives are eventually enacted as legislation by Westminster.

The bad news is the ECJ ruling itself wasn’t a defeat of the data retention thing. While the directive was ruled invalid, it would have to be repealed along with each country’s implementation of it – a process that realistically takes several years. In other words, the directive still exists and the Electronic Communications Bill could still be passed through Westminster. Of course, someone might take the matter to court, but how would such a legal challenge be framed?

Neither is data retention specifically illegal. The ECHR itself comes with a huge but very subtle caveat – any part of the ECHR can be nullified for the purpose of ‘preventing crime’. Or data retention could be done rather effectively under the guise of foreign intelligence gathering, knowing the vast majority of our Internet traffic enters and leaves the United Kingdom.

The ECJ’s ruling does, however, set a nice precedent and qualifies our arguments of how increasing surveillance powers can be incompatible with the concept of human rights. For whatever it’s worth.

Linux Process Exploration

It seems my post on static analysis of executables is still very popular, and quite a few readers are even Googling for that specific page. I’m guessing some of you would like a follow-up, this time on more dynamic analysis.

It’s perhaps more important forensically. When it comes to detecting, capturing and analysing malware, the hard drive isn’t so important unless we’re establishing a timeline for how it got there. The kernel and anything that’s running on a system lives in memory, plus it’s where the unencrypted/unobfuscated malware payloads are. System memory is also the one place we’re likely to find cryptographic keys, if we’re really desperate and know roughly where to look.

Where to begin? As with Windows processes, the processes in UNIX-based systems are contiguous data structures, containers created by the kernel for running programs. They contain everything a program might need, including an image of the executable, links to external .so objects (roughly analogous to Windows DLLs), variables, runtime data, etc. A process has a definite logical structure, although that’s not immediately apparent outside textbooks.
The most lucid representations I could find were by Gustavo Duarte and a Lawrence Livermore National Laboratory tutorial, which I’ve merged for clarity:

process-structure

Of course, the stack is also in there, holding the variables passed to functions and the return addresses. According to the LLNL tutorial, threads are written to and from the stack.

procfs
This is important, because system monitoring tools will almost always get their information from something called procfs.
Practically everything in a UNIX system is represented by a file – sometimes files are created as interfaces or pointers to something physical. From the user perspective procfs is the /proc directory, which contains a number of virtual directories and files (existing entirely in system memory) representing low-level stuff such as processes, memory allocations and hardware components. Most of the procfs files can be accessed directly.
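Because procfs is just files, any language can read it. The following Python sketch lists the numeric directories under /proc and reads a process’s command line. The function names are mine, and the proc_root parameter is only there so the code can be pointed at a copy of /proc or a test directory.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the numeric directory names under proc_root as sorted integers."""
    pids = []
    for name in os.listdir(proc_root):
        if name.isdigit() and os.path.isdir(os.path.join(proc_root, name)):
            pids.append(int(name))
    return sorted(pids)


def read_cmdline(pid, proc_root="/proc"):
    """Return the command line of a process as a list of arguments."""
    with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as handle:
        raw = handle.read()
    # Arguments in /proc/[PID]/cmdline are separated by NUL bytes.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


if os.path.isdir("/proc"):
    print(len(list_pids()), "processes visible in /proc")
```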

Reading from procfs in the command line
The common way to find what’s running on a machine is the ‘top‘ command:

top-command-output

But it’s only going to show processes that are active or being awakened, and not suspended/dormant processes. This is where the ‘ps‘ command becomes a handy alternative:
#ps -el

This lists all processes resident in memory:

ps-el
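The state column that ps reports ultimately comes from /proc/[PID]/stat. As a rough sketch of how that line can be parsed in Python (the sample line and the STATE_NAMES table are my own abbreviations, not ps source code):

```python
# Common state letters; the kernel defines a few more than are listed here.
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible)",
    "D": "waiting (uninterruptible)",
    "Z": "zombie",
    "T": "stopped",
}


def parse_stat(stat_line):
    """Extract (pid, command name, state letter) from a /proc/[PID]/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last closing parenthesis.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx])
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


# Hypothetical, truncated stat line for illustration.
pid, comm, state = parse_stat("2990 (Web Content) S 2586 2586 2586 0 -1")
print(pid, comm, STATE_NAMES.get(state, state))
```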

But, as Prof. Andrew Blyth would say, ‘user-space is a lie’, meaning user-space programs can only report what kernel-space tells them. The implication is that a kernel-mode rootkit can hide information about processes and network activity from user-space, and Android OS is a total bitch for doing precisely that (unless the device is rooted).
With typical Linux desktops and servers, there are various ways around this. One of them is using unhide to catch discrepancies between CPU, memory usage and what /bin (or /sbin) executables report, and another method compares system calls with a fixed system call map.
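The discrepancy idea can be illustrated in Python: collect PIDs one way (a directory listing of /proc), probe PIDs another way (signal 0, which tests for existence without delivering anything), and look at what the listing missed. This is only a simplified illustration on a POSIX system, not how unhide itself is implemented, and processes that start or exit between the two checks will produce false positives.

```python
import os


def pid_exists(pid):
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def hidden_pids(listed, probed):
    """Return PIDs that answered a probe but were missing from the listing."""
    return sorted(set(probed) - set(listed))


# Hypothetical data: PID 4242 responds to a probe but is not listed.
print(hidden_pids(listed=[1, 2586, 2990], probed=[1, 2586, 2990, 4242]))
```

In practice the probed list would be built by calling pid_exists() over the whole PID range, and anything suspicious would need re-checking before drawing conclusions.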

Back to the list of processes: Having got the PIDs for active and dormant processes, and having found which programs they might belong to, a considerable amount about a given process can be learned by reading from /proc.
The directories of interest are listed to the left, when using the ‘ls‘ command. The directory names are the PIDs for processes resident in memory, and each contains a number of virtual files.

list-process-directories

The following screenshot shows the contents of /proc/3114, which belongs to Firefox:

process-directory-contents

To get a rough picture of what the process contains, pmap is worth trying:
#pmap 2586

pmap-example

This appears to do pretty much the same thing as:
#cat maps

proc-firefox-maps

Of course, the contents can be dumped to a text file with
#cat maps >> dumpfile.txt

Two values that might be important tell us the start and end addresses of each memory area allocated to a process:
* vm_start – First address within the virtual memory area.
* vm_end – First address outside the virtual memory area.
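Each line of the maps file begins with those two addresses, written as start-end in hex, followed by permissions, offset, device, inode and an optional path. A minimal Python parser for one line might look like this; the Region tuple and the sample libc line are my own illustration rather than output from the machine in the screenshots.

```python
from collections import namedtuple

Region = namedtuple("Region", "start end perms offset dev inode path")


def parse_maps_line(line):
    """Parse one line of /proc/[PID]/maps into a Region."""
    fields = line.split(None, 5)
    # The first field is "start-end" in hex; end is the first address
    # outside the region, matching vm_start and vm_end.
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Region(start, end, fields[1], int(fields[2], 16),
                  fields[3], int(fields[4]), path)


# Hypothetical sample line.
line = "b7521000-b76ca000 r-xp 00000000 08:01 393449     /lib/i386-linux-gnu/libc-2.15.so"
region = parse_maps_line(line)
print(hex(region.start), hex(region.end), region.end - region.start, region.path)
```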

Other files of interest include:
* /proc/[PID]/exe: A symbolic link to the executable binary file.
* /proc/[PID]/limits: Resource limits assigned to the process.
* /proc/[PID]/maps: A memory map of the process.
* /proc/[PID]/sched: Thread scheduling stats for the threads generated by the process.
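As a small example of using one of these files, the exe link can be resolved from Python. On Linux the link target gains a ‘ (deleted)’ suffix when the binary has been removed from disk while the process is still running, which is worth checking for when hunting malware. The function names and the proc_root parameter are my own.

```python
import os


def executable_path(pid, proc_root="/proc"):
    """Return the path that /proc/[PID]/exe points to."""
    return os.readlink(os.path.join(proc_root, str(pid), "exe"))


def runs_deleted_binary(link_target):
    """True if the kernel has marked the executable as removed from disk."""
    return link_target.endswith(" (deleted)")


if os.path.exists("/proc/self/exe"):
    target = executable_path("self")
    print(target, runs_deleted_binary(target))
```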

A couple of other tools
Another useful utility is pstree, which shows whether a PID has any parent or child processes:

pstree
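The same parent/child picture can be rebuilt from procfs, since each /proc/[PID]/status file has a ‘PPid:’ line naming the parent. Assuming those values have already been collected into a dictionary, a rough Python sketch of the tree-building step follows; the PIDs and names are hypothetical.

```python
from collections import defaultdict


def build_tree(parents):
    """Map each parent PID to a sorted list of its child PIDs."""
    children = defaultdict(list)
    for pid, ppid in parents.items():
        children[ppid].append(pid)
    return {ppid: sorted(kids) for ppid, kids in children.items()}


def render(children, pid, names, depth=0):
    """Return indented lines for pid and all of its descendants."""
    lines = ["  " * depth + f"{names.get(pid, '?')} ({pid})"]
    for child in children.get(pid, []):
        lines.extend(render(children, child, names, depth + 1))
    return lines


# Hypothetical data: pid -> parent pid, and pid -> name.
parents = {1: 0, 2586: 1, 2990: 2586, 3114: 2586}
names = {1: "init", 2586: "bash", 2990: "firefox", 3114: "plugin-container"}

print("\n".join(render(build_tree(parents), 1, names)))
```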

The strace utility will output the system calls made by a process. For example, if Firefox has a PID of 2990:
#strace -p 2990

Viewing process memory and the stack
It took some research to find something that enabled me to view the raw contents of a process’ memory space. Eventually I decided on scanmem, a little utility that lists memory sections/segments and dumps them to a file for later analysis.
The first thing to do is point scanmem at the right PID (Firefox has a PID of 2990 here):
0> pid 2990

The ‘lregions‘ command will give a list of sections/segments found, along with the stack address and size. In this case it’s at address 0xbfba6000, and it’s 180224 bytes in length. Dumping this should produce a file of roughly 180KB (176KiB). Remember, the stack of a relatively dynamic program could change while being dumped, so this only gives a snapshot:
0> dump 0xbfba6000 180224 /home/michael/scanmem-stack-dump

stackdump
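Something similar can be done without scanmem by reading /proc/[PID]/mem directly. The Python sketch below finds the region labelled [stack] in a maps listing and copies it to a file. The sample maps line is reconstructed from the address and length quoted above (the end address is simply start plus length), and dump_region() needs ptrace-level access to the target, so it usually has to run as root.

```python
def find_stack(maps_text):
    """Return (start, length) of the region labelled [stack], or None."""
    for line in maps_text.splitlines():
        if line.rstrip().endswith("[stack]"):
            start, end = (int(part, 16) for part in line.split()[0].split("-"))
            return start, end - start
    return None


def dump_region(pid, start, length, out_path):
    """Copy length bytes at address start from a process into out_path."""
    with open(f"/proc/{pid}/mem", "rb") as mem, open(out_path, "wb") as out:
        mem.seek(start)
        out.write(mem.read(length))


# Reconstructed sample line: 0xbfba6000 + 180224 bytes = 0xbfbd2000.
MAPS = "bfba6000-bfbd2000 rw-p 00000000 00:00 0          [stack]\n"
start, length = find_stack(MAPS)
print(hex(start), length)
```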

The dump file is quite readable in a hex editor, but the Bokken reverse engineering tool gives better results (when it doesn’t crash). Notice that bit where it says ‘Write a comment‘ at address 0x8a30: I was logged into Facebook, and the ‘Comment’/’Like’ input fields got pushed onto the stack when I switched to another browser tab.

bokken-stack

Further up the stack, at 0x28ce4, some global variables or parameters are found, including the user name, the hostname, the PID of a crypto provider, session cookies, important filesystem locations, etc. I honestly didn’t expect to find that in the stack, but there it is.

The Kernel
More than the huge chunk of monolithic code, I’m referring to kernel-space, which is the physical memory allocated to the kernel. Generally off limits to users, interaction between user and kernel space is through system calls. When the user double-clicks on that Firefox icon or runs a shell command, the kernel allocates a block of memory for the process, responds to system calls from that process, and manages the execution of that process’ threads on the processor. The kernel also has an interface in procfs.

It’s possible to use ‘cat kcore‘, but that would screw up the terminal session after pressing Ctrl+C, probably because the raw binary output contains bytes that the terminal interprets as control sequences. A better way is to use:
#strings -n 10 kcore
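What strings does with the ‘-n 10‘ option is easy to approximate in Python: pull out every run of at least ten printable ASCII bytes. This is a rough equivalent rather than a reimplementation of the real utility, and for something the size of kcore the data would have to be read in chunks, as root.

```python
import re


def printable_strings(data, min_len=10):
    """Return runs of at least min_len printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Hypothetical memory fragment: one short run and one long enough to report.
blob = b"\x00\x01short\x00this is long enough\xff"
print(printable_strings(blob))
```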

Some users with Clam AV installed might shit themselves when the output produces a list of (Windows) Trojans. Relax – they don’t actually exist. What I think is happening is bytes are continually passing through system memory, and just like that monkeys-with-typewriters thing, kcore will inevitably dump erroneous byte patterns associated with known malware. For this reason, anti-malware programs should never scan kcore.

Ways to Fortify a UNIX Machine

Bruce Schneier made a statement last September to the effect: If the NSA wants in on your computer, it’s in. I’m not so sure. When putting together a report around roughly this issue, I arrived at the conclusion there is indeed a methodical way to bulletproof a UNIX system.

Host security is normally dependent on commercial anti-malware products, patching and various administrative controls, any of which could be a single point of failure. Almost no system has the layered security model I’m attempting to formalise. On top of that, relatively few systems are protected against the unknowns – the truly sophisticated malware and the zero-days.

What I’m proposing is an extremely secure configuration for UNIX installations by combining stuff at a very low level. What’s more, the components to do this are free, and in combination they provide a type of security unattainable by expensive ‘high-tech’ security products exposited in glossy brochures.

At the moment the model looks something like this:

unix-security-layers

The following are just a few notes, until I refine the idea and write something up in more depth.

Stack Protections
What this measure provides is the option to add stack protections when compiling software, specifically using the -fstack-protector option with GCC. I’ve put this near the user level because it’s an option for users and developers, depending on whether the software’s distributed as source. Unfortunately it’s only useful where users are compiling their software from source, on systems where very few proprietary components exist.
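For binaries that are already built, one rough way to tell whether GCC’s stack protector was used is to look for the __stack_chk_fail symbol, which protected code calls when a stack canary has been overwritten. The Python check below is only a heuristic of the kind used by checksec-style scripts: stripped or statically linked binaries can give misleading answers, and the byte strings in the demo are made up.

```python
def has_stack_protector(binary_bytes):
    """Heuristic: protected binaries reference the __stack_chk_fail symbol."""
    return b"__stack_chk_fail" in binary_bytes


def file_has_stack_protector(path):
    """Apply the heuristic to a file on disk."""
    with open(path, "rb") as handle:
        return has_stack_protector(handle.read())


# Hypothetical fragments standing in for real ELF contents.
print(has_stack_protector(b"\x7fELF...puts\x00__stack_chk_fail\x00"))
print(has_stack_protector(b"\x7fELF...puts\x00printf\x00"))
```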

Address Space Layout Randomisation
Again, patching only provides security against the known vulnerabilities, and these days the zero-day stuff is becoming a real concern. The next layer in the model randomises some of the process addresses – an exploit developer must somehow determine where the return address lives, the buffer sizes, the buffer addresses and whatnot. If some of the addresses are unpredictable enough, the task of creating a working exploit becomes extremely awkward.

No add-ons are needed as such. Apparently ASLR has already been included in mainstream Linux since 2005, and it’s definitely native to Ubuntu, Linux Mint and Oracle distributions. The value in /proc/sys/kernel/randomize_va_space indicates the mode it’s being used in. A value of ‘1‘ randomises the positions of the stack, the mmap base and the VDSO page, and a value of ‘2‘ additionally randomises the heap.
The bad news is ASLR is not universally applied, being another compile-time option.
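Checking the mode from a script is straightforward. The sketch below maps the value to the meanings given in the kernel’s sysctl documentation; the function name and wording of the descriptions are my own.

```python
import os

MODES = {
    0: "disabled",
    1: "stack, mmap base and VDSO page randomised",
    2: "as mode 1, plus the heap",
}


def describe_aslr(value):
    """Translate the contents of randomize_va_space into a description."""
    return MODES.get(int(value.strip()), "unknown")


path = "/proc/sys/kernel/randomize_va_space"
if os.path.exists(path):
    with open(path) as handle:
        print(describe_aslr(handle.read()))
```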

xinetd
This is more relevant to UNIX machines operating as servers, but anyone using Linux Mint or Ubuntu can install it from the package repositories. Configured correctly, it can prevent Denial of Service attacks, port scanning, port redirection and allow connections only from predefined IP addresses. Access control and resource usage limiting are the main reasons to configure xinetd on a server, where it is the first line of protection for bastion hosts. More detailed write-up here.

xinetd-package

iptables and netfilter
Ultimately how firewall policies are implemented in a Linux system, netfilter is a kernel module that drops, accepts or forwards incoming packets before the OS does anything with them. The netfilter policies are administered through the iptables command line executable, and the degree of control it allows is highly granular.

iptables-command

Alternatively, netfilter/iptables can be administered through desktop GUI applications such as gufw, if configuring it through the command line proves too much of a learning curve.

Linux Security Modules Framework
A native feature of the Linux kernel since version 2.6, the Linux Security Modules (LSM) framework on its own adds almost no security. It provides an interface for optional modules that intercept system calls to critical kernel functions, in other words implementing various forms of access control for programs running in user space. This is an important distinction from security measures that apply to users.

SELinux
The National Security Agency has been getting a lot of bad press lately as a result of the Snowden/Greenwald drama. What’s less commonly known is the NSA also has an ‘Information Assurance Directorate’ (now the Trusted Systems Research Group), tasked with actually making stuff secure.
One good thing the NSA did produce enhances security by adding a module to the LSM framework – Security Enhanced Linux (SELinux), plus a load of documentation to go with it.
SELinux has been around a while, it’s still actively being maintained, and it’s available from the Ubuntu and Linux Mint repositories.

selinux-repo

The general idea is the SELinux module enforces a kind of role-based Mandatory Access Control (MAC), where programs and daemons are granted the least privileges required to function. Even if applications running with root privileges are compromised through unpatched vulnerabilities, the potential damage is quite limited.

Each process is assigned a user name, role and domain. SELinux determines what processes belonging to a given domain are allowed to access. The role tag is used by SELinux to separate administrative from non-administrative processes, which in turn limits the scope of programs that could be compromised and made to perform administrative actions.
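On a system with SELinux enabled, those labels appear as a colon-separated security context (visible with ‘ps -eZ‘ or in /proc/[PID]/attr/current), where the domain is the type field of a process. A minimal Python sketch for splitting one apart is below; the example context is a typical-looking one I’ve chosen for illustration, not taken from a particular machine.

```python
from collections import namedtuple

Context = namedtuple("Context", "user role type level")


def parse_context(label):
    """Split an SELinux label of the form user:role:type[:level]."""
    parts = label.strip().split(":", 3)
    if len(parts) < 3:
        raise ValueError(f"not an SELinux context: {label!r}")
    # The level (MLS/MCS range) is optional and may itself contain colons.
    level = parts[3] if len(parts) == 4 else ""
    return Context(parts[0], parts[1], parts[2], level)


ctx = parse_context("system_u:system_r:httpd_t:s0")
print(ctx.user, ctx.role, ctx.type, ctx.level)
```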

AppArmor
SELinux received some criticism for being rather tricky to implement, so along came an easier alternative called AppArmor. It is also included or available in mainstream distributions, in particular SUSE/openSUSE.

apparmor-repo

AppArmor provides a security module that enforces security policies according to the profile set for each program on the system, with the idea that programs are then given only the access to resources that their profiles define. This is rather like SELinux, but AppArmor can be configured to operate in ‘learning’ mode.

Anyone with a spare box could install Ubuntu, grab the components I’ve described here from the repos, have a play around with them and have a far more secure installation as a result.

Methods Behind the ATM Malware

What got this in the news last week is the criminal(s) managed to develop Ploutus into something accessible with their cellphones. That, and the fact Microsoft is discontinuing support for Windows XP within a couple of weeks, XP being the common OS for ATMs.
There are two versions of the malware: Ploutus and Ploutus.B, both developed for the NCR systems.

Ploutus v1.0
The story goes that criminals opened up ATMs, accessed their CD-ROM drives and inserted the Ploutus malware from a boot disc. This installed ploutusservice.exe, which on execution loads a few DLLs and starts a service called ‘NCRDRVP‘. Among other things (like rendering a GUI), this sets up a listener on a socket for incoming commands.
It seems that the malware gained system privileges by booting from the CD, rather than using any software vulnerabilities normally associated with malware, and I get the impression it was using native Windows XP and NCR engineering functions after gaining system privileges.

According to the timestamp of the executable that SpiderLabs uploaded to VirusTotal, it was created or finished in late August 2013 – or perhaps that was the date SpiderLabs performed their own installation. Several anti-malware systems identified it as a Trojan or backdoor under various names.

Symantec got hold of Ploutus on 4th September 2013, quite a short time after the malware was apparently created, and classified it as a low-level threat with minimal impact (important point here). Fewer than 49 ATMs were known to be compromised then, which, together with the absence of available samples, suggested the malware was, and still is, in the hands of a small group operating offline. This idea is also supported by the fact Ploutus was developed in .NET, which is relatively easy to reverse engineer, and suggests the creators didn’t intend to trade it with others.

Ploutus.B
Perhaps after field testing Ploutus as proof-of-concept in Mexico, the criminals decided to use a modified version in the United States with a feature that enabled communication with the backdoor through SMS messaging.
This successor, Ploutus.B, is apparently more ‘modular’ than the previous version, but neither Symantec nor SpiderLabs says how exactly. At worst it could potentially be modified to dump any recorded card details and their PINs instead of cash.

What the analysts term ‘Network Packet Monitor’ was added to the malware as a module to listen for incoming commands from the USB port connecting the cellphone to the ATM system. One of the numbers being sent could actually be a series of machine code instructions in denary format.

The Physical Element
When someone mentions ‘ATMs’, the first thing that comes to mind is the cash machines installed around banks, which are physically very secure and strategically placed so that tampering with them would draw attention.
However, in this case the targets are stand-alone systems installed in shops, shopping centres, car parks, alleyways, etc. With cash being the critical asset, usually it’s only the compartment storing it that’s secure. The upper part, where the disc drive and ports are found, is typically protected by a door with a standard tumbler lock, and anyone impersonating a service engineer while using a lock-picking kit has a reasonable chance of access. Maybe not even that – I looked up an old YouTube vid of Barnaby Jack’s research, where he said at least one ATM manufacturer supplies master keys that’ll also open any of its units.
Now, if the people behind Ploutus somehow acquired their own ATM at some point, it would mean they had a master key, and it would explain how they managed to develop working malware in the first place.

atm-open-1

So what picture forms when the details are put together? A lot of security firms (and governments) like to throw the words ‘cyber’ and ‘sophisticated’ around a little too much, whether that’s to hype ‘threats’ or as an excuse when someone screws up.
Here, though, the story is essentially about someone running their own software on a Windows XP machine, having exploited weaknesses in physical security. It’s impressive because machines that were held to be secure were completely owned by a simple (but very clever) hack, and that should be seen in the context of past ATM skimming efforts and suchlike.

Another thing to note is it’s a computer-related crime, but it’s not Internet-based. Since only the larger firms have the malware samples, and it’s only spread from Mexico to the United States over the last six months, we can safely assume the culprits were (and still are) keeping it to themselves as a profitable tool, and we could assume it’s quite a small operation. I’m guessing they’re highly skilled, intelligent and methodical programmers, but they’re not yet experienced criminals with the means to sell malware anonymously.