Data Retention Directive: Not Quite Defeated

The European Court of Justice has ruled that the Data Retention Directive violates the European Convention on Human Rights, and that the Directive is therefore invalid.

The ECJ’s line of reasoning is that metadata paints such a detailed picture of a given person’s life that the right to privacy under the Human Rights Act (or the European Convention on Human Rights) would become a joke. It’s also generalised enough that it would conceivably give citizens the feeling they were constantly under surveillance, and applied broadly enough that it no longer qualifies as a valid exception to the ECHR.
Of course, being a ham-fisted power grab, the Data Retention Directive lacked the safeguards to prevent the data being improperly accessed and misused.

Data Retention Directive: Invalid, but not quite dead yet
For readers who are wondering how significant the ECJ ruling actually is, here’s my understanding of how things work: Basically the European Union consists (at least) of the European Council and the European Parliament. A proposition is made by the European Parliament, submitted to the EC, then voted on in the European Parliament. Sometimes laws can originate from the EC itself. In any case, the outcome is often a ‘Directive’, basically an order for the EU ‘states’ to implement it as legislation. In the UK, the Directives are eventually enacted as legislation by Westminster.

The bad news is the ECJ ruling itself wasn’t a defeat of the data retention thing. While the directive was ruled invalid, it would have to be repealed along with each country’s implementation of it – a process that realistically takes several years. In other words, the directive still exists and the Electronic Communications Bill could still be passed through Westminster. Of course, someone might take the matter to court, but how would such a legal challenge be framed?

Neither is data retention specifically illegal. The ECHR itself comes with a huge but very subtle caveat – any part of the ECHR can be nullified for the purpose of ‘preventing crime’. Or data retention could be done rather effectively under the guise of foreign intelligence gathering, knowing the vast majority of our Internet traffic enters and leaves the United Kingdom.

The ECJ’s ruling does, however, set a nice precedent and supports our arguments about how increasing surveillance powers can be incompatible with the concept of human rights. For whatever it’s worth.

Linux Process Exploration

It seems my post on static analysis of executables is still very popular, and quite a few readers are even Googling for that specific page. I’m guessing some of you would like a follow-up, this time on more dynamic analysis.

It’s perhaps more important forensically. When it comes to detecting, capturing and analysing malware, the hard drive isn’t so important unless we’re establishing a timeline for how it got there. The kernel and anything that’s running on a system lives in memory, plus it’s where the unencrypted/unobfuscated malware payloads are. System memory is also the one place we’re likely to find cryptographic keys, if we’re really desperate and know roughly where to look.

Where to begin? As with Windows processes, the processes in UNIX-based systems are contiguous data structures, containers created by the kernel for running programs. They contain everything a program might need, including an image of the executable, links to external .so objects (roughly analogous to Windows DLLs), variables, runtime data, etc. A process has a definite logical structure, although that’s not immediately apparent outside textbooks.
The most lucid representations I could find were by Gustavo Duarte and a Lawrence Livermore National Laboratory tutorial, which I’ve merged for clarity:

process-structure

Of course, the stack is also in there, which is where the arguments passed to functions and their return addresses are found. According to the LLNL tutorial, threads are written to and from the stack.

procfs
This is important, because system monitoring tools will almost always get their information from something called procfs.
Practically everything in a UNIX system is represented by a file – sometimes files are created as interfaces or pointers to something physical. From the user perspective procfs is the /proc directory, which contains a number of virtual directories and files (existing entirely in system memory) representing low-level stuff such as processes, memory allocations and hardware components. Most of the procfs files can be accessed directly.

Reading from procfs in the command line
The common way to find what’s running on a machine is the ‘top‘ command:

top-command-output

But it’s only going to show processes that are active or being awakened, and not suspended/dormant processes. This is where the ‘ps‘ command becomes a handy alternative:
#ps -el

Lists all processes resident in memory:

ps-el

But, as Prof. Andrew Blyth would say, ‘user-space is a lie’, meaning user-space programs can only report what kernel-space tells them. The implication is that a kernel-mode rootkit can hide information about processes and network activity from user-space, and Android OS is a total bitch for doing precisely that (unless the device is rooted).
With typical Linux desktops and servers, there are various ways around this. One of them is using unhide to catch discrepancies between CPU, memory usage and what /bin (or /sbin) executables report, and another method compares system calls with a fixed system call map.

Back to the list of processes: Having got the PIDs for active and dormant processes, and having found which programs they might belong to, a considerable amount about a given process can be learned by reading from /proc.
The directories of interest are listed to the left, when using the ‘ls‘ command. The directory names are the PIDs for processes resident in memory, and each contains a number of virtual files.

list-process-directories
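The same listing can be done programmatically. The following is only a minimal sketch: the function name list_pids and the proc_root parameter are my own choices for illustration, and it assumes a Linux-style /proc where each process directory is named after its PID.

```python
import os


def list_pids(proc_root="/proc"):
    """Return the PIDs found under proc_root, sorted numerically."""
    pids = []
    for name in os.listdir(proc_root):
        # Process directories are named after the PID, so keep only the
        # purely numeric names that really are directories.
        if name.isdigit() and os.path.isdir(os.path.join(proc_root, name)):
            pids.append(int(name))
    return sorted(pids)


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        print(list_pids())
```

Like ps and top, this only sees what the kernel chooses to expose through procfs, so it’s no defence against a kernel-mode rootkit.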

The following screenshot shows the contents of /proc/3114, which belongs to Firefox:

process-directory-contents

To get a rough picture of what the process contains, pmap is worth trying:
#pmap 2586

pmap-example

This appears to do pretty much the same thing as:
#cat maps

proc-firefox-maps

Of course, the contents can be dumped to a text file with
#cat maps >> dumpfile.txt

Two values that might be important are the start and end addresses of each memory area allocated to a process:
* vm_start – First address within the memory area.
* vm_end – First address outside the memory area.
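To make the start and end addresses concrete, here is a small sketch that parses lines in the format of the maps file. The Region type and the function names are hypothetical, and the sample addresses in the usage note are made up for illustration rather than taken from the screenshots.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int   # first address inside the region
    end: int     # first address outside the region
    perms: str   # e.g. 'r-xp'
    path: str    # backing file, '[stack]', '[heap]' or '' if anonymous


def parse_maps_line(line):
    """Parse one line of the form 'start-end perms offset dev inode path'."""
    fields = line.split(None, 5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Region(int(start_hex, 16), int(end_hex, 16), fields[1], path)


def parse_maps(text):
    """Parse the whole contents of a maps file into a list of Regions."""
    return [parse_maps_line(line) for line in text.splitlines() if line.strip()]
```

Feeding it a line such as ‘bfba6000-bfbd2000 rw-p 00000000 00:00 0 [stack]‘ gives a region whose end minus start is the size of that mapping in bytes.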

Other files of interest include:
* /proc/[PID]/exe: A symbolic link to the executable binary file.
* /proc/[PID]/limits: Resource limits assigned to the process.
* /proc/[PID]/maps: A memory map of the process.
* /proc/[PID]/sched: Thread scheduling stats for the threads generated by the process.

A couple of other tools
Another useful utility is pstree, which shows whether a PID has any parent or child processes:

pstree

The strace utility will output the system calls made by a process. For example, if Firefox has a PID of 2990:
#strace -p 2990

Viewing process memory and the stack
It took some research to find something that enabled me to view the raw contents of a process’ memory space. Eventually I decided on scanmem, a little utility that lists memory sections/segments and dumps them to a file for later analysis.
The first thing to do is point scanmem at the right PID (Firefox has a PID of 2990 here):
0> pid 2990

The ‘lregions‘ command will give a list of sections/segments found, along with the stack address and size. In this case it’s at address 0xbfba6000, and it’s 180224 bytes in length. Dumping this should produce a file of roughly 180KB. Remember, the stack size of a relatively dynamic program could change while being dumped, so this only gives a snapshot:
0> dump 0xbfba6000 180224 /home/michael/scanmem-stack-dump

stackdump
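For comparison, something similar can be sketched without scanmem by reading from /proc/[PID]/mem at the region’s start address. This is not how scanmem itself works as far as I know; the function names are hypothetical, and it assumes the caller is allowed to read the target’s memory (typically root, or a process permitted to ptrace it).

```python
def read_region(mem_path, start, length):
    """Read `length` bytes beginning at offset `start` of a mem-style file."""
    with open(mem_path, "rb") as mem:
        mem.seek(start)
        return mem.read(length)


def dump_region(pid, start, length, out_path):
    """Copy one region of a process's memory to out_path; return bytes written."""
    data = read_region("/proc/%d/mem" % pid, start, length)
    with open(out_path, "wb") as out:
        out.write(data)
    return len(data)
```

With the values from above, the call would be something like dump_region(2990, 0xbfba6000, 180224, "stack-dump"). As with scanmem, the result is only a snapshot of memory that may be changing underneath.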

The dump file is quite readable in a hex editor, but the Bokken reverse engineering tool gives better results (when it doesn’t crash). Notice that bit where it says ‘Write a comment‘ at address 0x8a30: I was logged into Facebook, and the ‘Comment’/’Like’ input fields got pushed onto the stack when I switched to another browser tab.

bokken-stack

Further up the stack, at 0x28ce4, some global variables or parameters are found, including the user name, the hostname, the PID of a crypto provider, session cookies, important filesystem locations, etc. I honestly didn’t expect to find that in the stack, but there it is.

The Kernel
More than the huge chunk of monolithic code, I’m referring to kernel-space, which is the physical memory allocated to the kernel. It’s generally off limits to users, and interaction between user and kernel space is through system calls. When the user double-clicks on that Firefox icon or runs a shell command, the kernel allocates a block of memory for the process, responds to system calls from that process, and manages the execution of that process’ threads on the processor. The kernel also has an interface in procfs.

It’s possible to use ‘cat kcore‘, but that would screw up the terminal session after pressing Ctrl+C, probably because the raw binary output contains bytes that the terminal interprets as control sequences. A better way is to use:
#strings -n 10 kcore
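What ‘strings -n 10‘ does can be approximated in a few lines, which is handy for running over a dump file. This is only a rough sketch: the function name is my own, and it treats bytes 0x20 to 0x7e as printable, which is close to, but not exactly, what GNU strings does.

```python
import re


def printable_strings(data, min_len=10):
    """Return the runs of at least min_len printable ASCII bytes in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

For something the size of kcore, the file would need to be read in chunks rather than loaded whole. For a stack dump of a couple of hundred kilobytes, reading the file into memory and passing the bytes in is fine.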

Some users with Clam AV installed might shit themselves when the output produces a list of (Windows) Trojans. Relax – they don’t actually exist. What I think is happening is bytes are continually passing through system memory, and just like that monkeys-with-typewriters thing, kcore will inevitably dump erroneous byte patterns associated with known malware. For this reason, anti-malware programs should never scan kcore.

Ways to Fortify a UNIX Machine

Bruce Schneier made a statement last September to the effect: If the NSA wants in on your computer, it’s in. I’m not so sure. When putting together a report around roughly this issue, I arrived at the conclusion there is indeed a methodical way to bulletproof a UNIX system.

Host security is normally dependent on commercial anti-malware products, patching and various administrative controls, any of which could be a single point of failure. Almost no system has the layered security model I’m attempting to formalise. On top of that, relatively few systems are protected against the unknowns – the truly sophisticated malware and the zero-days.

What I’m proposing is an extremely secure configuration for UNIX installations by combining stuff at a very low level. What’s more, the components to do this are free, and in combination they provide a type of security unattainable by expensive ‘high-tech’ security products advertised in glossy brochures.

At the moment the model looks something like this:

unix-security-layers

The following are just a few notes, until I refine the idea and write something up in more depth.

Stack Protections
What this measure provides is the option to add stack protections when compiling software, specifically using the -fstack-protector option with GCC. I’ve put this near the user level because it’s an option for users and developers, depending on whether the software’s distributed as source. Unfortunately it’s only useful where users are compiling their software from source, on systems where very few proprietary components exist.

Address Space Layout Randomisation
Again, patching only provides security against the known vulnerabilities, and these days the zero-day stuff is becoming a real concern. The next layer in the model randomises some of the process addresses – an exploit developer must somehow determine where the return address lives, the buffer sizes, the buffer addresses and whatnot. If some of the addresses are unpredictable enough, the task of creating a working exploit becomes extremely awkward.

No add-ons are needed as such. Apparently ASLR has already been included in mainstream Linux since 2005, and it’s definitely native to Ubuntu, Linux Mint and Oracle distributions. The value in /proc/sys/kernel/randomize_va_space indicates the mode it’s being used in. If it has a value of ‘2‘, the positions of the stack, the mmap base (and with it the shared libraries), the vDSO and the heap are randomised.
The bad news is ASLR is not universally applied: a program’s own code and data segments are only randomised if it was built as a position-independent executable, which makes it another compile-time option.
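Checking the mode is easy enough to script. The sketch below reads the value and maps it to a description; the function names are hypothetical, and the descriptions are my paraphrase of the kernel’s sysctl documentation rather than official wording.

```python
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO randomised",
    2: "as mode 1, plus heap (brk) randomised",
}


def describe_aslr(raw_value):
    """Translate the text read from randomize_va_space into a description."""
    mode = int(raw_value.strip())
    return ASLR_MODES.get(mode, "unknown mode %d" % mode)


def read_aslr_mode(path="/proc/sys/kernel/randomize_va_space"):
    """Read the current setting from procfs (Linux only)."""
    with open(path) as handle:
        return describe_aslr(handle.read())
```

On a stock Ubuntu or Mint install I’d expect read_aslr_mode() to report mode 2, but that’s worth verifying rather than assuming.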

xinetd
This is more relevant to UNIX machines operating as servers, but anyone using Linux Mint or Ubuntu can install it from the package repositories. Configured correctly, it can prevent Denial of Service attacks, port scanning, port redirection and allow connections only from predefined IP addresses. Access control and resource usage limiting are the main reasons to configure xinetd on a server, where it acts as the first line of protection for bastion hosts. More detailed write-up here.

xinetd-package

iptables and netfilter
Netfilter is ultimately how firewall policies are implemented in a Linux system: it’s a kernel module that drops, accepts or forwards incoming packets before the OS does anything with them. The netfilter module policies are administered by the iptables command line executable, and the degree of control it allows is highly granular.

iptables-command

Alternatively netfilter/iptables can be administered through desktop GUI applications such as gufw, if configuring it through the command line proves too much of a learning curve.

Linux Security Modules Framework
A native feature of the Linux kernel since version 2.6, the Linux Security Modules (LSM) framework on its own adds almost no security. It provides an interface for optional modules that intercept system calls to critical kernel functions, in other words implementing various forms of access control for programs running in user space. This is an important distinction from security measures that apply to users.

SELinux
The National Security Agency has been getting a lot of bad press lately as a result of the Snowden/Greenwald drama. What’s less commonly known is the NSA also has an ‘Information Assurance Directorate’ (now the Trusted Systems Research Group), tasked with actually making stuff secure.
One good thing the NSA did produce enhances security by adding a module to the LSM framework – Security Enhanced Linux (SELinux), plus a load of documentation to go with it.
SELinux has been around a while, it’s still actively being maintained, and it’s available from the Ubuntu and Linux Mint repositories.

selinux-repo

The general idea is the SELinux module enforces a kind of role-based Mandatory Access Control (MAC), where programs and daemons are granted the least privileges required to function. Even if applications running with root privileges are compromised through unpatched vulnerabilities, the potential damage is quite limited.

Each process is assigned a user name, role and domain. SELinux determines what processes belonging to a given domain are allowed to access. The role tag is used by SELinux to separate administrative from non-administrative processes, which in turn limits the scope of programs that could be compromised and made to perform administrative actions.

AppArmor
SELinux received some criticism for being rather tricky to implement, so along came an easier alternative called AppArmor. It is also included or available in mainstream distributions, in particular SUSE/openSUSE.

apparmor-repo

AppArmor provides a security module that enforces security policies according to the profile set for each program on the system, with the idea that programs are then given only the access to resources that their profiles define. This is rather like SELinux, but AppArmor can be configured to operate in ‘learning’ mode.

Anyone with a spare box could install Ubuntu, grab the components I’ve described here from the repos, have a play around with them and have a far more secure installation as a result.

Methods Behind the ATM Malware

What got this in the news last week is the criminal(s) managed to develop Ploutus into something accessible with their cellphones. That, and the fact Microsoft is discontinuing support for Windows XP within a couple of weeks, XP being the common OS for ATMs.
There are two versions of the malware: Ploutus and Ploutus.B, both developed for the NCR systems.

Ploutus v1.0
The story goes that criminals opened up ATMs, accessed their CD-ROM drives and inserted the Ploutus malware from a boot disc. This installed ploutusservice.exe, which on execution loads a few DLLs and starts a service called ‘NCRDRVP‘. Among other things (like rendering a GUI), this sets up a listener on a socket for incoming commands.
It seems that the malware gained system privileges by booting from the CD, rather than using any software vulnerabilities normally associated with malware, and I get the impression it was using native Windows XP and NCR engineering functions after gaining system privileges.

According to the timestamp of the executable that SpiderLabs uploaded to VirusTotal, it was created or finished in late August 2013 – or perhaps that was the date SpiderLabs performed their own installation. Several anti-malware systems identified it as a Trojan or backdoor under various names.

Symantec got hold of Ploutus on 4th September 2013, quite a short time after the malware was apparently created, and classified it as a low-level threat with minimal impact (important point here). Fewer than 49 ATMs were known to be compromised then, which, together with the absence of available samples, suggested the malware was, and still is, in the hands of a small group operating offline. This idea is also supported by the fact Ploutus was developed in .NET, which is relatively easy to reverse engineer, and suggests the creators didn’t intend to trade it with others.

Ploutus.B
Perhaps after field testing Ploutus as proof-of-concept in Mexico, the criminals decided to use a modified version in the United States with a feature that enabled communication with the backdoor through SMS messaging.
This successor, Ploutus.B, is apparently more ‘modular’ than the previous version, but neither Symantec nor SpiderLabs say how exactly. At worst it could potentially be modified to dump any recorded card details and their PINs instead of cash.

What the analysts term ‘Network Packet Monitor’ was added to the malware as a module to listen for incoming commands from the USB port connecting the cellphone to the ATM system. One of the numbers being sent could actually be a series of machine code instructions in denary format.

The Physical Element
When someone mentions ‘ATMs’, the first thing that comes to mind is the cash machines installed around banks, which are physically very secure and strategically placed so that tampering with them would draw attention.
However, in this case the targets are stand-alone systems installed in shops, shopping centres, car parks, alleyways, etc. With cash being the critical asset, usually it’s only the compartment storing it that’s secure. The upper part, where the disc drive and ports are found, is typically protected by a door with a standard tumbler lock, and anyone impersonating a service engineer while using a lock-picking kit has a reasonable chance of access. Maybe not even that – I looked up an old YouTube vid of Barnaby Jack’s research, where he said at least one ATM manufacturer supplies master keys that’ll also open any of its units.
Now, if the people behind Ploutus somehow acquired their own ATM at some point, it would mean they had a master key, and it would explain how they managed to develop working malware in the first place.

atm-open-1

So what picture forms when the details are put together? A lot of security firms (and governments) like to throw the words ‘cyber’ and ‘sophisticated’ around a little too much, whether that’s to hype ‘threats’ or as an excuse when someone screws up.
Here, though, the story is essentially about someone running their own software on a Windows XP machine, having exploited weaknesses in physical security. It’s impressive because machines that were held to be secure were completely owned by a simple (but very clever) hack, and that should be seen in the context of past ATM skimming efforts and suchlike.

Another thing to note is it’s a computer-related crime, but it’s not Internet-based. Since only the larger firms have the malware samples, and it’s only spread from Mexico to the United States over the last six months, we can safely assume the culprits were (and still are) keeping it to themselves as a profitable tool, and we could assume it’s quite a small operation. I’m guessing they’re highly skilled, intelligent and methodical programmers, but they’re not yet experienced criminals with the means to sell malware anonymously.

Document-Based Exploits

Document files and PDFs as a method of inserting exploits are one of the common features across publicly-disclosed targeted attacks, with the non-targeted incidents generally involving links to web pages hosting malicious code. Here I’m focussing on what appears to be the very first stage of a typical targeted attack (after the recce and intel gathering).

Technical Background
An OS must be capable of identifying different file types, in order to know which application should process a given file – when the user clicks on a .doc file, the OS just knows to open it with LibreOffice or Microsoft Word. What the user then has is a running application and a file loaded into memory – the process either linking to the file’s data structure or loading it into its address space.

The other concept to understand is the role of file extensions. We give a Word document the .doc extension purely to identify it as such, and so the desktop GUI gives it the appropriate icon. Remove the file extension and the OS would identify its true file type. This is one feature that enables a baddy to trick a human user, and usually it works because most of us interact with computers through a GUI.
So how does the OS know what a file actually is, without an extension?

Headers and File Internals
File types are actually defined by an initial sequence of bytes, which are sometimes referred to as the ‘file header’ or ‘magic number’. They can be seen by running the following command on a given file:
$hexdump -n 50 (filename)

Here are a couple of examples for PDFs and JPGs:

jpg-headers

pdf-headers

In fact, this is how digital forensic software can determine whether images are being hidden using a false extension. In our case, the technique can be used to find whether malicious code is masquerading as a document or image.
With executables, the first 125 bytes seem to be an identifier, as a consequence of having a standard data structure and multiple headers.

exe-headers

When the file is opened, the OS determines which program/application should handle it by reading the first several bytes, and then initialises a process for that.
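The header check is simple enough to sketch in a few lines. The table below only covers a handful of well-known signatures (PDF, JPEG, DOS/Windows executables and ELF), and the function names are hypothetical. Real tools such as file(1) use a far larger signature database.

```python
MAGIC_NUMBERS = [
    (b"%PDF-", "PDF document"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"MZ", "DOS/Windows executable"),
    (b"\x7fELF", "ELF executable"),
]


def identify(header):
    """Guess a file type from its leading bytes, ignoring any extension."""
    for magic, label in MAGIC_NUMBERS:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_file(path, count=8):
    """Read the first few bytes of a file and identify them."""
    with open(path, "rb") as handle:
        return identify(handle.read(count))
```

Comparing the result with the file’s extension is a quick way to flag a mismatch, whether that’s an image hidden behind a false extension or something executable posing as a document.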

A Little Experiment
Let’s go beyond the forensics and see how the theory could be put into practice by doing a little magic trick – turning cmd.exe into a ‘PDF’.
The first thing to determine is what the file header is for a valid PDF, by opening two separate documents and isolating the initial bytes that are common to both files – any file with those bytes must be a PDF, right? I did this earlier in the command line, but GHex gives a different output, probably because hexdump by default prints the bytes as two-byte words in the machine’s byte order, which swaps each pair on a little-endian system.

hexcompare

Next step is to open cmd.exe in GHex, prefix its contents with the PDF header bytes and save it as ‘testploit‘ (without an extension).

cmd-edit

And there we go: our edited cmd.exe is now disguised as a PDF even on closer inspection. The file effectively should become a launcher for a Windows command prompt. It even passes itself off as a valid document when viewed in the properties window or scanned by VirusTotal.

testsploit-properties
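From the defender’s side, this particular trick is easy to catch by looking past the first few bytes. The sketch below searches the start of a file for a DOS ‘MZ‘ header whose e_lfanew field (at offset 0x3C) points to a ‘PE‘ signature. The function names and the 4096-byte search window are my own assumptions, and it only covers Windows executables.

```python
import struct


def find_embedded_pe(data, search_window=4096):
    """Return the offset of the first plausible PE executable, or -1."""
    pos = data.find(b"MZ", 0, search_window)
    while pos != -1:
        if pos + 0x40 <= len(data):
            # e_lfanew: 32-bit little-endian offset of the PE header,
            # measured from the start of the MZ header.
            (pe_offset,) = struct.unpack_from("<I", data, pos + 0x3C)
            signature = data[pos + pe_offset:pos + pe_offset + 4]
            if signature == b"PE\x00\x00":
                return pos
        pos = data.find(b"MZ", pos + 1, search_window)
    return -1


def is_disguised_exe(data):
    """True if data starts like a PDF but carries an executable just behind it."""
    return data.startswith(b"%PDF-") and find_embedded_pe(data) > 0
```

A file made the way described above should trip is_disguised_exe(), whereas a genuine PDF that merely happens to contain the letters ‘MZ‘ shouldn’t.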

There’s a much faster way, using Metasploit to create malicious PDFs complete with exploits for Adobe Reader.

Embedded Functions
What I’ve described so far is pretty amateur – a recipient would know something’s up if an actual document fails to materialise, plus it’s obvious if an .exe program is launched. In targeted attacks both the email and the attachment would be carefully tailored, to ensure that both are convincing and innocuous enough not to raise any suspicion.
It doesn’t even have to be that targeted – a fake brochure emailed to someone who attended a major marketing event (such as InfoSecurity Europe) would work, or perhaps a 'mislaid' USB drive containing a PDF with an interesting filename, and I'm guessing that most people don't habitually update their versions of Adobe Reader. The recipient would open the doc, hit the delete button and think nothing of it, by which time the payload would have done its job.

I created a basic PDF document, then used the $strings command to view its structure:

pdf-strings

The /OpenAction string looks most promising. According to Tim Xia at Websense, this field can be used to cause a JavaScript action to run when the file is opened, JavaScript exploits being associated with a ‘heap spraying’ technique that could provide a way around Microsoft’s Address Space Layout Randomisation. The presence of JavaScript doesn’t necessarily mean there’s actually an exploit, though.
In the Websense analysis, ‘this.(function)‘ was placed in the /OpenAction field, with ‘(function)‘ being a call to an object elsewhere in the file. I reckon both could be inserted into a PDF using a hex editor, using the same method I used for changing the file header bytes. The function could be anything – perhaps an exploit for a buffer overflow vulnerability within any of Adobe Reader’s functions, with a payload to fetch a malware installer.
The exploit creators went a couple of steps further, encoding the function and compressing it with zlib, but they still needed to reference it in the /OpenAction field.
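Since the reference has to appear somewhere, a crude first-pass check is to count the relevant names in the raw file. This is only a sketch of the idea: the function name is hypothetical, the list of names is limited to /OpenAction, /JavaScript and /JS, and it will miss names that are hex-escaped or tucked inside compressed object streams.

```python
import re

SUSPICIOUS_NAMES = (b"/OpenAction", b"/JavaScript", b"/JS")


def count_suspicious_names(data):
    """Count occurrences of each suspicious PDF name in the raw bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead stops '/JS' from matching a longer name such as '/JSX'.
        pattern = re.compile(re.escape(name) + rb"(?![A-Za-z0-9])")
        counts[name.decode("ascii")] = len(pattern.findall(data))
    return counts
```

A non-zero count isn’t proof of an exploit, as noted above, but a document that has no business running scripts and still carries an /OpenAction pointing at JavaScript deserves a closer look.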

Solutions
Of course, a policy of ‘don’t click shit!’ is always the first countermeasure that comes to mind, but if a hundred employees of a given organisation were sent a malicious attachment, it’s guaranteed that several of them will open it. Only one successful attempt is needed. I’d also argue that anyone could be made to open a malware-infected document if enough effort went into crafting the attack.

A security plan must take into account that people will open whatever attachments are mailed to them. Security then relies on: 1) Patching and exploit prevention, 2) Malware detection, 3) Preventing traffic between malware and a C&C server, 4) Detection and incident response.

Windows 7 and 8 users are in a relatively good position, as Microsoft works on the assumption that code vulnerabilities will always slip through the net, and decided to mitigate them with things like ASLR and SafeSEH. There are ways around these, but they present an obstacle to getting an exploit to run. Patching Adobe Reader should also be effective, depending on whether the attackers are limited to stock exploits.
The Hong Kong CERT have recommended the use of alternative applications for reading PDFs and Microsoft Office documents, the idea being that users would be unaffected by exploits for Adobe/Microsoft. While it’s a good strategy in the short term, it’s more of a delaying tactic against an APT, and alternative applications would become vectors should they become popular.