Memory scanning leaves attackers nowhere to hide

Credit to Author: Matt Wixey | Date: Thu, 09 Nov 2023 13:46:19 +0000

In the first of our new series of technical thought leadership papers, which aim to give readers an in-depth look under the hood at some of our technologies and research, we provide an overview of our memory scanning protection and how it works.

Memory scanning – searching within a process’s memory (the process image, and/or suspicious modules, threads, and heap regions) for threats – can be achieved in a variety of ways by security products, and at a variety of times. It may occur when a new process has been created, regularly for all or some processes on the system, or in response to a behavioral trigger. For example, one such trigger may be malware calling CreateRemoteThread (or variants thereof) when it attempts to execute a malicious payload which has been injected into a process. Others include suspicious API calls which are commonly used in process injection and related techniques, such as VirtualAllocEx and WriteProcessMemory, which allocate memory and copy payloads, respectively. More sophisticated malware may call undocumented API functions, or eschew API calls altogether in favor of direct syscalls and other techniques; combating these methods requires a slightly different approach to memory scanning. There are various other possible behavioral triggers for a memory scan, including process creation, file reads/writes, or connecting to an IP address.
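As a rough illustration of the behavioral triggers mentioned above, the following Python sketch maps an observed API call to a scan decision. Only the API names come from the text; the `SCAN_TRIGGERS` table, the `should_scan` function, and the idea of comparing caller and target process IDs are hypothetical simplifications, not a description of how any Sophos product works.

```python
# Hypothetical sketch: decide whether an observed API call should trigger a memory scan.
# The API names come from the article; the rest is an illustrative assumption.
SCAN_TRIGGERS = {
    "CreateRemoteThread": "execute a payload injected into another process",
    "VirtualAllocEx": "allocate memory in another process",
    "WriteProcessMemory": "copy a payload into another process",
}


def should_scan(api_name: str, caller_pid: int, target_pid: int) -> bool:
    """Return True when an injection-related API call targets a different process."""
    return api_name in SCAN_TRIGGERS and caller_pid != target_pid
```

A real product would weigh many more signals (reputation, call sequence, memory permissions) before deciding to scan; the point here is only that a scan can be keyed to a specific event rather than run across the whole system.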

For almost a quarter of a century, we’ve devoted a considerable amount of research and effort to developing various forms of memory scanning. This goes right back to the year 2000, when our capabilities included periodic and on-demand scans, evolving to behavior-based memory scans with HIPS (Host-based Intrusion Prevention Systems), and now employing much more sophisticated behavioral technology which evolves as the threat landscape does. In particular, our capabilities do not rely on pattern-matching; instead, they use more complex logic, such as a Turing-complete definition language which takes an algorithmic approach.

Why do we need memory scanning?

The increasing ubiquity of antivirus and endpoint detection solutions means that threat actors are more cautious than ever about dropping malicious files to disk. From their perspective, doing so incurs the risk not only of that particular attack being thwarted, but also of having to retool as their malware is analysed, signatured, and reverse-engineered.

As a result, threat actors are increasingly turning to so-called “fileless” techniques, such as process injection, packers, virtualized code, and crypters, to run malicious payloads. For example, in our recent telemetry, we found that 91% of ransomware samples, and 71% of RAT samples, were either custom-packed or used some kind of code obfuscation.

Crucially, many of these techniques mean that the payload itself, even if it does touch disk, is in an encrypted form, and its true intentions and capabilities are only revealed in memory. This makes it difficult for security solutions to distinguish between clean and malicious files, and countermeasures – such as unpacking packed files by emulating packer instructions – often come at considerable computational cost.

Many of these tools and techniques are available in open-source code repositories, or within commercial frameworks designed for legitimate penetration testing; as a result, it is trivial for threat actors to leverage them during attacks, often in slightly modified forms. (In an upcoming blog series, we’ll walk through multiple different process injection techniques, complete with demonstrations, to show just how simple it is for threat actors to use off-the-shelf solutions.) More advanced attackers, of course, are capable of finding new techniques, or creating novel combinations of, and refinements to, existing methods.

In-memory attacks provide threat actors with a crucial advantage: they can evade detection by running malicious payloads without writing anything incriminating to disk. Some techniques – such as certain forms of process injection – can also complicate post-incident forensics, and enable threat actors to harvest sensitive information like credentials stored in memory, or to escalate their privileges.

However, memory scanning takes advantage of one crucial fact: when it is loaded into memory, malware must reveal itself. It will be unpacked, or deobfuscated, or decrypted, so that it can achieve its end objective. Examining and assessing the region of memory in which this occurs, in real-time, allows us to make a judgment on whether a particular thread or process contains malicious code.
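To make this point concrete, here is a minimal Python sketch in which a payload is XOR-encoded "on disk" and only becomes recognizable once decoded "in memory". The `SIGNATURE` bytes, the single-byte XOR key, and the substring check are all stand-ins chosen for illustration; real packers and real memory scanners are far more sophisticated, and the simple pattern match below is not how the detection logic described in this article works.

```python
# Hypothetical sketch: an encoded payload hides its content until it is decoded in memory.
SIGNATURE = b"EVIL_PAYLOAD"  # stand-in marker for malicious code (illustrative only)


def xor_bytes(data: bytes, key: int) -> bytes:
    """Encode or decode data with a single-byte XOR key (a toy stand-in for a crypter)."""
    return bytes(b ^ key for b in data)


def contains_signature(buffer: bytes) -> bool:
    """Toy detector: a plain substring search over the buffer."""
    return SIGNATURE in buffer


# What a file scanner would see: the encoded form written to disk.
on_disk = xor_bytes(b"header" + SIGNATURE + b"trailer", key=0x5A)

# What a memory scanner would see: the payload after the malware decodes itself.
in_memory = xor_bytes(on_disk, key=0x5A)
```

Here `contains_signature(on_disk)` is false while `contains_signature(in_memory)` is true, which is the asymmetry that memory scanning exploits.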

And while memory scanning has historically been a computationally expensive process, particularly when scanning an entire system’s memory, there are various ways in which we can target memory scans based on contextual cues about a given incident and other factors. This allows us to adapt flexibly to the situation and therefore maximize performance.

Types of memory scan

Scanning an entire system’s memory can present performance challenges. More to the point, it isn’t always necessary. Because memory scanning is one feature within a larger set of detection and prevention tools, we often know where we want to scan, or when, and so we can perform a targeted memory scan against a process (or processes) at the time they exhibit a suspicious behavior.

For example, say we’re alerted to malware hijacking a thread within a running legitimate process (such as the Suspend, Inject, Resume, or SIR, attack), or malware launching a legitimate process and injecting a malicious payload into it (as in various forms of process injection). We can simply scan that thread or process, which both limits the performance overhead and makes it easier to focus resources on assessing that particular region of memory.

An image showing types of memory scanning, arranged as circular diagrams.

Figure 1: An overview of our targeted memory scan types

Targeting by ‘where’

Parent/child

On occasions where a suspicious process spawns another process and injects into it, we can scan both the parent process and the child for malicious code.

Single thread

Attackers often target particular processes for injection, such as lsass.exe (which contains sensitive credentials that can be leveraged for privilege escalation) or explorer.exe. Typically, these processes have hundreds of threads. In such cases, it’s not necessary to scan every single thread within the process to locate a malicious payload; instead, we pinpoint a specific thread via its ID – for example, by identifying threads which are about to be started or resumed via API calls such as CreateRemoteThread – and scan only that one.
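The saving from single-thread targeting can be sketched in a few lines of Python. The `ThreadInfo` record and the `select_threads_to_scan` function are hypothetical; the sketch assumes that some earlier step (for example, observing a call such as CreateRemoteThread) has already supplied the IDs of the threads of interest.

```python
# Hypothetical sketch: scan only the threads flagged by a behavioral event.
from dataclasses import dataclass


@dataclass
class ThreadInfo:
    thread_id: int
    start_address: int


def select_threads_to_scan(threads, flagged_ids):
    """Return only the threads whose IDs were flagged, instead of every thread."""
    return [t for t in threads if t.thread_id in flagged_ids]


# A process with 300 threads, of which only thread 42 is about to be resumed.
all_threads = [ThreadInfo(thread_id=i, start_address=0x1000 * i) for i in range(1, 301)]
targets = select_threads_to_scan(all_threads, flagged_ids={42})
```

In this toy case the scanner examines one thread rather than 300.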

Targeting by ‘when’

Inline

Here, a scan is triggered by a specific behavior, such as process creation; analysts write behavioral rules based on suspicious behaviors which may not in themselves be sufficient to kill the process, but are reason enough to start a scan. We stop the given behavior from completing, and only allow it to continue once the scan has completed and if all appears well.

Asynchronous

An asynchronous scan is for circumstances where we can’t make a decision about a particular behavior until the action has completed and we have more context. We allow the process to continue while we scan it, and continuously update our assessment as the scan proceeds.
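The difference between inline and asynchronous scans can be sketched in Python as follows. The `scan_memory` placeholder, the `"allow"`/`"block"` verdict strings, and the use of a thread pool are assumptions made for illustration, not a description of the product's internals.

```python
# Hypothetical sketch: inline scans hold the behavior; asynchronous scans do not.
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Tuple


def scan_memory(region: bytes) -> bool:
    """Placeholder scanner: True means the region looks malicious."""
    return b"EVIL" in region


def inline_scan(region: bytes) -> str:
    """Hold the behavior until the scan has finished, then allow or block it."""
    return "block" if scan_memory(region) else "allow"


def asynchronous_scan(region: bytes, executor: ThreadPoolExecutor) -> Tuple[str, Future]:
    """Let the behavior continue immediately; the verdict arrives later via the future."""
    return "allow", executor.submit(scan_memory, region)
```

With the asynchronous variant, the caller would act on the future's result when it becomes available, for example by terminating the process if the verdict is malicious.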

Periodic background

Some fileless malware sits idle in memory for some time, either to evade defences or while waiting for C2 responses – sometimes for a few minutes or hours, but sometimes for much longer. To counter this, we can scan memory at regular intervals for malicious behaviors.

Scheduled

Here, the user schedules a scan of all machines for a specific time of day or at particular intervals, so as not to cause a spike in memory consumption.

Post-detection clean-up

If a behavioral rule is triggered and we block a process as a result, we also trigger a memory scan, in order to check for remnants of the malicious process in memory. For example, some malware employs a technique called a ‘watcher thread’, where one thread remains idle and simply monitors the execution of a malicious payload in another. If the primary thread is killed, the watcher thread takes over and resumes the activity. A post-detection clean-up memory scan terminates all associated threads, so that the malware won’t relaunch.
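The watcher-thread problem can be illustrated with a small Python sketch. The `TrackedThread` record and its `owner` tag are hypothetical; the sketch simply assumes the scanner has some way to associate threads with the detection that flagged them.

```python
# Hypothetical sketch: killing one thread leaves the watcher alive; clean-up removes both.
from dataclasses import dataclass


@dataclass
class TrackedThread:
    thread_id: int
    owner: str          # hypothetical tag linking a thread to a detection
    alive: bool = True


def kill_thread(threads, thread_id):
    """Terminate a single thread and return the IDs of the threads still alive."""
    for t in threads:
        if t.thread_id == thread_id:
            t.alive = False
    return [t.thread_id for t in threads if t.alive]


def clean_up(threads, owner):
    """Terminate every thread associated with the detection; return the survivors' IDs."""
    for t in threads:
        if t.owner == owner:
            t.alive = False
    return [t.thread_id for t in threads if t.alive]
```

If thread 1 runs the payload and thread 2 is its watcher, `kill_thread` on thread 1 alone leaves thread 2 free to relaunch the activity, whereas `clean_up` removes both.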

Memory scanning in action

To demonstrate some of the memory scanning types we discuss above, we selected a malware sample and ran it in a lab environment protected by Sophos to capture the behavioral protection details reported after several memory scans. In a real-world environment, the product would block execution as soon as the malware triggered any of the below protections.

The malware we’re using for this test is the Agent Tesla RAT, a prolific and common threat often distributed via malicious spam emails. Threat actors use Agent Tesla to steal credentials through screenshots and keylogging, and more recent versions employ a variety of anti-sandbox and anti-analysis techniques.

For convenience, as we discuss the memory scans and protections which fire when executing Agent Tesla, we’ll also detail the corresponding MITRE ATT&CK techniques.

An image showing five memory protections against the Agent Tesla RAT

Figure 2: An overview of the scans initiated during our laboratory test of an Agent Tesla RAT sample

Evade_7a (T1055.012) (first released June 2019)

This memory scan rule triggers when a suspicious process launches a high-reputation clean process, potentially for process injection. Because the rule is triggered during a ProcessCreate event, the newly-created process hasn’t yet started, so we scan the suspicious process for malicious code. In a real-world environment, Sophos protections would kill the parent and child processes, and remove any associated suspicious files.

Evade_34b (T1055.012) (first released February 2023)

This rule is technique-based, focusing specifically on process hollowing. It extrapolates specific process memory characteristics, and evaluates whether a target process has been hollowed and injected with malicious content. Because this rule is focused on the technique, rather than specific code, it provides additional behavioral protection and assurance.

Exec_14a (T1055.012) (first released October 2019)

Here, a memory scan is triggered by a specific event which occurs when malicious code is injected into a child process, as part of the SIR sequence referenced previously. This event triggers a protection.

A screenshot of computer code, with a memory dump on the left and dnSpy output on the right

Figure 3: The Agent Tesla RAT code which corresponds to part of the SIR workflow, leading to a protection being triggered

The process being scanned is already marked as a suspicious process, since it was launched by another suspicious process (the parent process in the above section). During a typical process injection attack, we want to block the injected process as early as possible, which we achieve by targeting the process shortly after malicious code has been injected. If the parent process didn’t seem to contain any malicious code during the first scan, this scan is the next step; it allows us to check if the malware has unpacked or deobfuscated any malicious code.

C2_1a (T1071.001 and T1095) (first released February 2020)

At this point, Agent Tesla makes an outbound connection to a C2 server.

Figure 4: Part of the Agent Tesla RAT code responsible for making an outbound C2 connection

We report two different techniques here, because we also capture the port number; for ports 80 and 443, we report T1071, and for others, we report T1095. This is primarily an asynchronous scan. We don’t intentionally hold process execution here, unlike the previous two scans, but when the memory detection triggers, the process would be immediately terminated.
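The port-based reporting described above amounts to a simple lookup, sketched here in Python. The function name `attack_technique_for_port` and the `WEB_PORTS` constant are hypothetical; the port numbers and technique IDs are those given in the text.

```python
# Hypothetical sketch: choose the reported MITRE ATT&CK technique from the destination port.
WEB_PORTS = {80, 443}


def attack_technique_for_port(port: int) -> str:
    """Return T1071 for ports 80 and 443, and T1095 for any other port."""
    return "T1071" if port in WEB_PORTS else "T1095"
```

For example, a connection to port 443 would be reported under T1071, and one to port 8080 under T1095.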

Creds_2c (T1555.003) (first released September 2021)

This rule triggers when a process touches files which hold credentials (such as browser credentials) on disk; we scan the responsible process for any suspicious code. Typically, non-browser processes would not touch these files, so that’s immediately suspicious.

Figure 5: The Agent Tesla RAT looks for credentials in local storage
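The logic behind a credential-access trigger of this kind can be sketched in Python. The process and file names below are illustrative examples only, and the `triggers_credential_scan` function is hypothetical; the actual rule is not described in enough detail here to reproduce.

```python
# Hypothetical sketch: flag non-browser processes that touch browser credential stores.
from pathlib import PureWindowsPath

BROWSER_PROCESSES = {"chrome.exe", "msedge.exe", "firefox.exe"}  # illustrative list
CREDENTIAL_FILE_NAMES = {"login data", "logins.json"}            # illustrative list


def triggers_credential_scan(process_name: str, file_path: str) -> bool:
    """Return True when a non-browser process touches a known credential file."""
    touches_credentials = PureWindowsPath(file_path).name.lower() in CREDENTIAL_FILE_NAMES
    is_browser = process_name.lower() in BROWSER_PROCESSES
    return touches_credentials and not is_browser
```

A production rule would identify processes by more than their file name, since malware can trivially name itself after a browser; the sketch only captures the "who is touching this file?" question.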

Memory_1b (first released September 2021)

Finally, this is a periodic background memory scan, which scans all running processes on a system at regular intervals. It provides an extra layer of assurance, ensuring that all processes are scanned even if there are no behavioral triggers.

As shown in this example, having multiple scanning layers for different events and triggers – complemented by periodic scans across the whole system – is a key defence against in-memory threats, providing multiple opportunities to terminate malicious processes.

Conclusion

While memory scanning is not a panacea for all in-memory attacks, it is an important weapon in the continuing battle against increasingly sophisticated malware. As with any form of protection, memory scanning techniques must constantly adapt and respond to real-world developments, as threat actors develop new methods or build on those which already exist.

As we noted earlier, we’ve been doing this for a long time, and as the threat landscape has shifted and evolved, we’ve continued to adapt our technologies in order to protect against threats, while keeping performance overheads to a minimum and ensuring we build redundancy into our various scan types to provide in-depth protection. These are central tenets of Sophos’ memory scanning capabilities, and our current research reflects this.

For example, one area we’re currently researching is using the data and intelligence we’ve gathered across all of our incidents, research, and analysis to statistically identify certain patterns in memory which are suggestive of a particular class of malware. Various ransomware families, for instance, may have very different codebases and approaches to enumerating and encrypting files – but, from an in-memory perspective, there are commonalities across many of them which we can use to build in more generic protections. Similarly, RATs and infostealers may be very distinct in themselves, but they often generate predictable sequences of behavior which, at the memory level, can be a good predictor that a particular thread or process has been hijacked by a RAT or infostealer.
