Introduction
Identity is the new endpoint, yet the largest consumers of files transmitted over the internet are still finding hundreds of millions of malicious files a month, user download folders remain full of weird and wonderful things, and SEO poisoning dominates search engine results to deliver info stealers for initial access brokers. SOC analysts confront this reality every day, whether it's “lowest confidence threshold” machine learning model-based detections, anti-virus signatures triggering on byte sequences, funky browser compromises pushing ads and redirects, or detections of greater criticality.
Fortunately for these same analysts, the industry already has everything it needs to triage an alert and establish both a strong understanding of what has happened and what to do next. To achieve this fully, analysts must pivot into sandboxing tools and techniques to expand on what the detection is telling them, particularly where the detection is an assessment of a file's nature rather than its behaviour.
How Sandboxing exists in the SOC
Sandboxing is a topic of incredible depth: mature teams dedicate entire roles, such as malware analyst, reverse engineer, and purple team positions, to understanding, documenting, and developing detection techniques and material. The average SOC team, and by extension the average analyst, has neither the capacity nor the objectives of those dedicated malware analysis roles.
When looking to perform sandboxing SOC analysts should have two primary goals:
Providing supporting evidence for ideas and assertions you have already arrived at through the source detection tool (EDR, FIM, email protection etc.)
Identifying further investigative avenues to explore the scope of the compromise
Note how I haven't listed “is the file malicious?” because sandbox tools won't tell you that; they will only contribute to the evidence you must assess to arrive at an assertion like that.
Expanding on these two points: an analyst might be triaging an alert for browser data stores being copied to a location outside their respective program directory, and file analysis will support this by articulating the suspect file's capabilities. Alternatively, an analyst triaging a series of alerts across more than two assets is often faced with more than a single binary, because separating functions across files helps adversaries evade detection, yet only one of those binaries will be dropped to new systems in the network. With sandboxing, SOC analysts can establish which binary represents which component of the adversary's procedure set and spend their investigative effort on the telemetry that matters.
To illustrate this, running the following sample results in the additional files below:
https://app.malcore.io/share/65f0eeeea0349ee6c4f1e139/65fc7992cafc33ec9e402404
1. V168_2fe3868764b70dafe5d89d79466c63e3\MSIUpdaterV168.exe
2. MSIUpdaterV168.exe
3. PS_eO1aD5jWGNZ_Jsx42vOi.zip
4. dt4sw_vbexi68svujoy.exe
5. AdobeUpdaterV168.exe
These files are all key to the sample's intended objectives, but which should be of most importance?
Through file analysis we can easily arrive at key pieces of information like:
The nature of item 2 in the execution chain means its hash will change per asset
Hunting for beaconing from any of these files will yield no results, as item 4 has instructions to inject into regasm.exe
Assembly instructions and file entropy are heavily indicative of obfuscation (see the entropy sketch after this list)
Assembly instructions consistent with context modification for a specific process.
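Entropy is one observation you can easily reproduce yourself when a report flags it. Below is a minimal sketch, assuming Python and a local copy of the suspect file ("sample.exe" is a placeholder path); it computes the Shannon entropy that sandboxes use as a packing heuristic.

```python
# Minimal sketch of the entropy heuristic: Shannon entropy approaching
# 8 bits/byte suggests packed or encrypted content.
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

with open("sample.exe", "rb") as f:  # placeholder: any suspect file
    print(f"{shannon_entropy(f.read()):.2f} bits/byte")
# Ordinary code and text tend to sit around 4-6 bits/byte;
# packed or encrypted payloads commonly exceed 7.2.
```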
With confidence we can say this isn't a funky legitimate app, so we can quarantine the file, hunt effectively for further infections, and look for process injection if we didn't get a detection for it.
TL;DR: there are lots of things we can do and might want to do, but when triaging an alert we need to find the most pertinent information, and sandboxes can enable this.
Common Mistakes
As with any tool, it's only as effective as the individual using it; with sandboxing this is exacerbated by the fact that analysts are not hired as malware experts. Let's look at what analysts often get wrong:
Holy Grail Verdict
Within the context of sandboxing, the word “verdict” refers to the conclusion the tool arrives at after completing its analysis of a sample, typically Clean, Suspicious, or Malicious. Analysts often use this verdict as the most significant anchor for their decision-making. This is dangerous because sandboxing tools do not express absolutes; instead they place weighted observations on a scale. For example, a tool might have the following observations:
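As a minimal sketch of how that weighting works, with invented observations, weights, and thresholds (no real product is being quoted here):

```python
# Hypothetical illustration: the observation names, weights, and thresholds
# below are invented for this example, not taken from any real sandbox.
observations = {
    1: ("Binary is unsigned", 0.5),
    2: ("Copies itself to a startup location", 1.0),
    3: ("High-entropy (likely packed) section", 1.0),
}

score = sum(weight for _, weight in observations.values())  # 0.5 + 1.0 + 1.0 = 2.5

if score >= 2.5:        # one vendor's threshold; another may differ
    verdict = "Malicious"
elif score >= 1.0:
    verdict = "Suspicious"
else:
    verdict = "Clean"

print(score, verdict)   # 2.5 Malicious
```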
Here, one tool might consider 2.5 above its malicious threshold and return that verdict; another might have a rule that whenever items 1 and 2 appear together the verdict is always Malicious; a third might place 2.5 below every threshold and return Clean. Everyone reading this will have a differing opinion on exactly what score each observation should carry and where the thresholds should sit.
As an analyst, you must use the context available to you to form your own opinion on whether these observations are significant and whether any or all of them compound each other. You can find this context in:
Established understanding of business processes and infrastructure through exposure to telemetry and documentation
Accounts from a relevant user on what they expect the file to be doing
Open-source intelligence
Established understanding of adversary procedure sets and already developed capabilities
Good development practices are unfortunately rare, and there's a fun resource at https://wtfbins.wtf/ where you can explore examples of files that would push tools towards higher scores but are perfectly legitimate.
No Perfect World
Analysing artefacts, even with the more modest objectives SOC analysts aim for, comes with complexity, both in how malware behaves and in how automated tools work. Malware authors know analysts get nosey and take steps to protect their efforts from low-effort attempts to analyse their files and payloads.
Understanding, at least at a surface level, how authors do this and how your tools might actually impede your investigation is a must, because how you ask your tool to analyse the file, and what type of tool you use, will decide whether you arrive at a false negative followed by a lot of screaming and shouting from your stakeholders.
I won't be exploring every technique authors use, simply because the average analyst isn't expected to know them all. Instead, I will capture the considerations you need.
One of the most common anti-analysis techniques is checking the environment for analysis tooling or virtualization artefacts, so where possible configure your automated tools with the same build parameters as the system you retrieved the file from. Good automated tools will do their best to bypass these checks, but unknowns are scary in triage work, so take steps yourself.
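To make this concrete, here is a minimal sketch of the kind of environment check a sample might run before detonating; the MAC prefixes and environment-variable names are illustrative assumptions, not a complete list of what real malware probes for.

```python
# Sketch of a sandbox/VM check of the sort malware performs before running
# its real payload. Indicator values below are illustrative only.
import os
import uuid

VM_MAC_PREFIXES = ("08:00:27", "00:05:69", "00:0c:29", "00:50:56")  # VirtualBox/VMware

def looks_like_sandbox() -> bool:
    # uuid.getnode() returns the host MAC as a 48-bit integer.
    mac = ":".join(f"{(uuid.getnode() >> s) & 0xff:02x}" for s in range(40, -1, -8))
    if mac.startswith(VM_MAC_PREFIXES):
        return True
    # Analysis tooling sometimes leaks into the environment of spawned processes.
    suspicious = {"SANDBOX", "MALWARE", "VBOX"}
    return any(token in name.upper() for name in os.environ for token in suspicious)

if looks_like_sandbox():
    raise SystemExit(0)  # exit quietly rather than reveal behaviour
```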
Malicious payloads are often staged onto the system after the execution of the initial file; this means your attention needs to extend far beyond the file you picked up from your EDR, and where possible you should recursively analyse everything newly spawned or dropped.
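As a minimal sketch of what “recursively analyse everything” means in practice, assuming a hypothetical submit_and_detonate() wrapper around whatever sandbox API you use:

```python
# Sketch of recursive detonation. submit_and_detonate() is a hypothetical
# stand-in for your sandbox's API, assumed to return the paths of files
# the sample dropped during execution.
from collections import deque

def submit_and_detonate(sample: str) -> list[str]:
    """Placeholder: submit `sample` to your sandbox, return dropped files."""
    return []

def analyse_recursively(initial_sample: str, max_depth: int = 3) -> None:
    seen: set[str] = set()
    queue = deque([(initial_sample, 0)])
    while queue:
        sample, depth = queue.popleft()
        if sample in seen or depth > max_depth:
            continue  # the seen set and depth cap stop re-dropped stages looping forever
        seen.add(sample)
        for dropped in submit_and_detonate(sample):
            queue.append((dropped, depth + 1))

analyse_recursively("initial.exe")  # placeholder for the file your EDR flagged
```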
Hypervisor-based sandboxing can introduce noise that leaks into the tool's reporting. This is particularly dangerous if your tool doesn't give you debug information (almost none do), because it is then impossible to identify the noise beyond logical assumptions. If you think this is happening, switch to an emulator.
If your sandbox gives you an AV signature observation, ensure it continues the analysis and, in general, does not pre-filter samples based on their reputation.
If you're analysing a website, consider common anti-analysis techniques like captive portals and referrer checking. If you suspect these are being deployed, manually perform interactive analysis.
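As a minimal sketch of testing for referrer-based cloaking yourself, assuming the Python requests library and a placeholder URL:

```python
# Fetch a suspect page twice: once with bare headers, once imitating a
# browser arriving from a search result. Differing responses suggest the
# server is cloaking content from analysts and crawlers.
import requests

url = "https://example.com/landing"  # placeholder, not a real sample

bare = requests.get(url, timeout=10)
disguised = requests.get(
    url,
    timeout=10,
    headers={
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Referer": "https://www.google.com/",
    },
)

if bare.text != disguised.text:
    print("Response varies with referrer/user agent: likely cloaking")
```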
In summary, sandboxing is a must-have capability within a SOC, but it comes with complexity that analysts have to take the time to understand and get comfortable with, both to ensure triage is conducted accurately and to fully realize the technology's potential. When reading the above, consider “how does my sandbox fit?”: if it gives verdicts and minimal context, throw it away; if it can't handle the diversity of artefacts you encounter, consider whether you need supplementary tooling; and if you want to grow your team's malware analysis skills, ask whether your tool will eventually become the bottleneck.