lautarolecumberry

This artifact compares executables in memory (RAM) with those on hard disk. This way, RAM injections are detected. This rarely happens legitimately and is mostly used by malware. This check is executed without dumping the memory and works live on the target system(s).

This artifact detects RAM injections by comparing executables in memory with those on disk
@CLAassistant

CLAassistant commented Oct 4, 2025

CLA assistant check
All committers have signed the CLA.

@scudette
Collaborator

scudette commented Oct 4, 2025

Does this take into account relocations? On a quick read it does not, so it is unlikely to work.

@mdenzel

mdenzel commented Oct 4, 2025

Thanks for your feedback!

Which relocations are you referring to? ASLR?

The main RAM relocations we could find were due to 'BaseOfData'. The IgnoreOneByteOffsets parameter takes care of them. In our tests (on standard Windows systems) we did not encounter other relocations that affected the technique.

As for whether it works: Lautaro wrote a master's thesis on it and tested 54 samples (38 malicious / 16 benign). After the thesis we improved the false positive rate due to 'BaseOfData' and retested against 34 malware samples, including three C2 frameworks (sliver, mythic, havoc), and 18 normal programs. All three C2 injections were detected.

Here are the results of the retests:

| | Not-detected | Detected | Total |
|---|---|---|---|
| Non-malware | 33% (17) | 2% (1) | 35% (18) |
| Malware | 19% (10) | 46% (24) | 65% (34) |
| Total | 52% (27) | 48% (25) | 100% (52) |

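Detection rate (share of flagged samples that were malware) is 96.0% (24 of 25)
Sensitivity (share of malware samples that were flagged) is 70.6% (24 of 34)
Accuracy is 78.8% (41 of 52)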
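
The three percentages can be reproduced from the retest table. Below is a minimal Python sketch of the arithmetic. It assumes that "detection rate" means precision (flagged samples that really were malware) and "sensitivity" means recall (malware samples that were flagged); the thread does not spell out these definitions, and the variable names are illustrative.

```python
# Counts taken from the retest table above.
true_negatives = 17   # non-malware, not detected
false_positives = 1   # non-malware, detected
false_negatives = 10  # malware, not detected
true_positives = 24   # malware, detected

total = true_negatives + false_positives + false_negatives + true_positives

# Assumed definitions (not stated explicitly in the thread).
detection_rate = true_positives / (true_positives + false_positives)  # precision
sensitivity = true_positives / (true_positives + false_negatives)     # recall
accuracy = (true_positives + true_negatives) / total

print(f"Detection rate: {detection_rate:.1%}")
print(f"Sensitivity:    {sensitivity:.1%}")
print(f"Accuracy:       {accuracy:.1%}")
```

Under these assumed definitions the single false positive is what keeps the detection rate below 100%, while the ten missed malware samples drive the sensitivity.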

The thesis and original code are here:
https://github.com/lautarolecumberry/DetectingFilelessMalware

We're currently working on a blog post about the improvements. The submitted code already includes the 'BaseOfData' improvements (the thesis does not).

If you specify which relocations you are referring to, we're happy to have a look and improve the technique further 😊

@scudette
Collaborator

scudette commented Oct 4, 2025

I was thinking of the relocations needed when the binary is not built with position independent code (pic).

https://0xrick.github.io/win-internals/pe7/

In that case the binary image in memory will be different from the image on disk due to addresses being relocated by the loader.

Maybe it's not that common to have non-pic binaries any more. Perhaps the artifact should flag this, though, so the analyst knows to ignore the results in that case.

@mgreen27
Collaborator

mgreen27 commented Oct 4, 2025

We could probably also make the PowerShell for CheckOneByteChanges native VQL and do the comparisons in memory, so we do not have to write .mem and .disk as tempfiles. Maybe that's a v2 though :)

@mdenzel

mdenzel commented Oct 5, 2025

Dear @scudette & @mgreen27

Thank you again for the valuable suggestions.

I checked online and in our notes. Documenting the findings here:
(position independent code = pic; position dependent code = pdc)

  • ASLR usually does not change the .text segment, so our technique is usually not affected by it
  • modern compilers/linkers use pic by default (that is probably why it always worked for us on default Windows)
  • from Windows 8 on, ASLR is the standard and requires pdc to have relocation tables

So we could add another column to show if there are relocation tables/pdc.

As for why we went with PowerShell: I tried VQL first but I could not figure out how to iterate over a binary byte by byte. A diff or comparison is possible, but we also need to ignore changes when it is just one byte. Any ideas how to implement this in VQL?
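
To make the requirement concrete, here is a small Python sketch of a comparison that tolerates isolated single-byte differences. It is not the PowerShell or VQL from the PR. It assumes that "just one byte" means a run of differing bytes of length one, and the function names are hypothetical.

```python
def differing_runs(mem: bytes, disk: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for each run of consecutive differing bytes."""
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(mem, disk)):
        if a != b:
            if start is None:
                start = i          # a new run of differences begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:          # run reaches the end of the buffers
        runs.append((start, min(len(mem), len(disk)) - start))
    return runs


def matches_ignoring_single_bytes(mem: bytes, disk: bytes) -> bool:
    """True if the buffers differ only in isolated single bytes."""
    if len(mem) != len(disk):
        return False
    if mem == disk:                # fast path: most segments are identical
        return True
    return all(length == 1 for _, length in differing_runs(mem, disk))
```

The equality fast path matters in practice because most text segments are expected to match exactly, so the byte-by-byte walk only runs for the few that differ.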

SELECT Pid, Path, MemAddress, DiskAddress, Size,
upload(
file=format(format="/%d", args=Pid),
--offset=Address,
Collaborator

Did you mean to comment all these lines?

The upload command does not support offset and length. Beforehand we used a read_file here. We could remove the comments.

Collaborator

You can upload with offset and length using the sparse accessor.

@mdenzel

mdenzel commented Oct 5, 2025

Lautaro and I checked pic vs pdc:

  1. creating a pdc binary

gcc has the options -fno-pic and -fno-pie, but both produce the same executable on Windows with mingw. PE-bear shows that only the timestamp and checksum differ. The same holds for -flinker-output=exec.

So, we are not sure how to even create a binary with pdc for testing. Might it be that modern compilers do not even have the option to create pdc any more?

  2. detecting a pdc binary

Relocation tables exist also for pic binaries. There is a value e_crlc that showed the relocations but it seems to be deprecated (source: https://docs.rs/goblin/latest/goblin/pe/header/struct.DosHeader.html#structfield.relocations).

Also, according to https://stackoverflow.com/questions/73221196/is-there-a-way-to-tell-if-a-windows-binary-is-a-pie people think there is no flag to tell if a windows binary is pic.
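
As a related note, the PE headers do carry two relocation-related flags: IMAGE_FILE_RELOCS_STRIPPED (0x0001) in the COFF Characteristics field and IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (0x0040) in the optional header's DllCharacteristics field. They describe whether the loader can or may rebase the image, not whether the code itself is position independent, so they are at best a hint for the artifact. The Python sketch below reads both flags from raw PE bytes; the function name is hypothetical.

```python
import struct

IMAGE_FILE_RELOCS_STRIPPED = 0x0001             # COFF Characteristics bit
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # DllCharacteristics bit


def relocation_flags(pe: bytes) -> dict:
    """Read the two relocation-related header flags from raw PE bytes."""
    if pe[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    coff = e_lfanew + 4
    # Characteristics is the last 2 bytes of the 20-byte COFF header.
    (characteristics,) = struct.unpack_from("<H", pe, coff + 18)
    optional = coff + 20
    # DllCharacteristics sits at offset 70 of the optional header
    # for both PE32 and PE32+.
    (dll_characteristics,) = struct.unpack_from("<H", pe, optional + 70)
    return {
        "relocs_stripped": bool(characteristics & IMAGE_FILE_RELOCS_STRIPPED),
        "dynamic_base": bool(
            dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE
        ),
    }
```

A binary with `dynamic_base` set and relocations present is one where differences between memory and disk caused by the loader should be expected.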

@scudette
Collaborator

scudette commented Oct 6, 2025

Thanks for starting this discussion - I think this will end up being a very cool artifact.

I did look at it today and played with the VQL to make it faster and more efficient. I also wanted to see how many false positives there were.

This is my improved version

name: Windows.Memory.Mem2Disk
author: Lautaro Lecumberry, Dr. Michael Denzel
description: |
    This artifact compares executables in memory (RAM) with those
    on hard disk. This way, RAM injections are detected. This rarely
    happens legitimately and is mostly used by malware.
    This check is executed without dumping the memory and works live
    on the target system(s).

parameters:
- name: IgnoreOneByteOffsets
  description: Relative Virtual Addresses (RVA) cause an offset in the code in memory of a process.
               This is the case when the field BaseOfData is set to 0x8000. It creates false
               positives and is fairly safe to ignore (1-byte injections are really hard).
  default: True
  type: bool
- name: UploadFindings
  description: Upload all executables where code in memory does not match code on disk. This
               can potentially generate a lot of traffic. Dry-run before enabling this option.
  default: False
  type: bool
- name: ProcessNameFilter
  type: regex
  default: notepad

precondition: SELECT OS From info() where OS = 'windows'

export: |
  // These functions help to resolve the Kernel Device Filenames
  // into a regular filename with drive letter.
  LET DriveReplaceLookup <= SELECT
      split(sep_string="\\", string=Name)[-1] AS Drive,
      upcase(string=SymlinkTarget) AS Target,
      len(list=SymlinkTarget) AS Len
    FROM winobj()
    WHERE Name =~ "^\\\\GLOBAL\\?\\?\\\\.:"
  
  LET _DriveReplace(Path) = SELECT Drive + Path[Len:] AS ResolvedPath
    FROM DriveReplaceLookup
    WHERE upcase(string=Path[:Len]) = Target
  
  LET DriveReplace(Path) = _DriveReplace(Path=Path)[0].ResolvedPath ||
      Path

sources:
- query: |
    -- get all processes
    LET GetPids = SELECT Pid,
                         Name,
                         Username
      FROM pslist()
      WHERE Name =~ ProcessNameFilter
    
    -- get all memory pages for a certain pid
    LET InfoFromVad(Pid) = SELECT Address,
                                  Size,
                                  DriveReplace(Path=MappingName) AS Path
      FROM vad(pid=Pid)
      WHERE MappingName
       AND Protection =~ "xr-"
            AND MappingName =~ "(exe)$"
      LIMIT 1
    
    LET GetTextSegment(Path) = filter(condition="x=>x.Name = '.text'",
                                      list=parse_pe(file=Path).Sections)[0]
    
    -- parse the executable (PE) from memory (specifically, the text segment)
    LET GetMetadata(Pid, Name) = SELECT
        Path,
        str(str=Pid) AS PidFilename,
        Address,
        GetTextSegment(Path=Path) AS TextSegmentData
      FROM InfoFromVad(Pid=Pid)
      WHERE Address != 0
       AND TextSegmentData.FileOffset
    
    LET Hex(X) = format(format="%#x", args=X)
    
    -- read the executable from memory and hard disk
    LET GetContent(Pid, Name) = SELECT *, Address AS MemAddress,
                                       read_file(
                                         accessor="process",
                                         offset=Address,
                                         filename=PidFilename,
                                         length=TextSegmentData.Size) AS MemoryData,
                                       hash(
                                         path=PidFilename,
                                         accessor="process",
                                         hashselect="SHA256").SHA256 AS MemorySHA256,
                                       TextSegmentData.FileOffset AS DiskAddress,
                                       TextSegmentData.Size AS SegmentSize,
                                       read_file(
                                         accessor="file",
                                         offset=TextSegmentData.FileOffset,
                                         filename=Path,
                                         length=TextSegmentData.Size) AS DiskData,
                                       hash(
                                         path=Path,
                                         accessor="file",
                                         hashselect="SHA256").SHA256 AS DiskSHA256
      FROM GetMetadata(
        Name=Name,
        Pid=Pid)
      WHERE MemoryData
       AND log(
             dedup=-1,
             message="Inspecting Pid %v (%v): %#x-%#x vs %#x-%#x",
             args=[Pid, Name, Address, Address + SegmentSize,
               DiskAddress, DiskAddress + SegmentSize])
    
    -- Filter out not needed comparisons early
    LET FilterContent(Pid, Name) = SELECT *, MemoryData = DiskData AS Comparison
      FROM GetContent(Pid=Pid, Name=Name)
      WHERE NOT Comparison
    
    -- Dict stored as query, so it only gets executed once
    LET Tmp <= dict(a=0)
    
    LET Cmp(X, Y) = SELECT X[_value] = Y[_value]  AND X[1] = Y[1] AS Eq
      FROM range(end=len(list=X), step=2)
      WHERE set(item=Tmp,
                field="a",
                value=if(condition=Eq  AND Tmp.a < 2, then=0, else=Tmp.a + 1))
       AND Tmp.a > 2
      LIMIT 1
    
    LET CheckOneByteChanges(X, Y) = (X = Y
         AND log(message="Comparing %v quickly", dedup=-1, args=len(list=X))) OR (
          set(item=Tmp, field="a", value=0)
         && Cmp(X=X, Y=Y))
    
    -- compare the executable from memory and hard disk
    -- only print the ones where they do not match
    LET Compare(Pid, Name) = if(
        condition=log(message="Comparing process %v", args=Pid)
         AND IgnoreOneByteOffsets,
        then={
        SELECT Pid,
               PidFilename,
               Path,
               NOT CheckOneByteChanges(X=MemoryData, Y=DiskData) AS OneByteOffset,
               Comparison,
               MemorySHA256,
               DiskSHA256,
               MemAddress,
               DiskAddress,
               SegmentSize
        FROM FilterContent(Pid=Pid, Name=Name)
        WHERE NOT OneByteOffset
      },
        else={
        SELECT Pid,
               PidFilename,
               Path,
               Comparison,
               MemorySHA256,
               DiskSHA256,
               MemAddress,
               DiskAddress,
               SegmentSize
        FROM FilterContent(Pid=Pid, Name=Name)
      })
    
    -- compare with uploading the suspicious executables
    LET CompareAndUpload(Pid, Name) = SELECT
        Pid,
        Path,
        Hex(X=MemAddress) AS MemAddress,
        Hex(X=DiskAddress) AS DiskAddress,
        Hex(X=SegmentSize) AS SegmentSize,
        upload(
          file=pathspec(DelegateAccessor="process",
                        DelegatePath=PidFilename,
                        Path=[dict(Offset=MemAddress, Length=SegmentSize), ]),
          name=pathspec(parse=format(format="%s.%d.mem", args=[Path, Pid]),
                        path_type="windows"),
          accessor="sparse") AS UploadMem,
        upload(
          file=pathspec(DelegateAccessor="file",
                        DelegatePath=Path,
                        Path=[dict(Offset=DiskAddress, Length=SegmentSize), ]),
          name=pathspec(parse=format(format="%s.%d.disk", args=[Path, Pid]),
                        path_type="windows"),
          accessor="sparse") AS UploadDisk
      FROM Compare(Pid=Pid, Name=Name)
    
    -- for every process, evaluate the memory-harddisk-comparison
    SELECT *
    FROM foreach(row=GetPids,
                 workers=20,
                 query={
        SELECT *
        FROM if(condition=UploadFindings,
                then={
        SELECT *
        FROM CompareAndUpload(Pid=Pid, Name=Name)
      },
                else={
        SELECT *
        FROM Compare(Pid=Pid, Name=Name)
      })
      })

There were a couple of smaller issues:

  1. The first issue is that the MappingName values returned by the VAD plugin are in kernel notation - they need to be converted to a path before we can open the file (for example, \Device\HarddiskVolume3\velociraptor.exe should be C:/velociraptor.exe).

I added the code to convert back to regular paths by inspecting the object directory in the kernel object manager.
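
The conversion amounts to a prefix lookup. The Python sketch below illustrates the idea outside of VQL. The lookup table is hypothetical, standing in for the symlink targets that winobj() reports in the artifact, and the function name is made up for the example.

```python
def resolve_device_path(path: str, drive_map: dict[str, str]) -> str:
    """Map a kernel device path to a drive-letter path.

    drive_map maps a device prefix such as "\\Device\\HarddiskVolume3"
    to a drive such as "C:". Unknown devices are returned unchanged.
    """
    upper = path.upper()
    for device, drive in drive_map.items():
        # Require the path separator after the prefix so that
        # HarddiskVolume30 is not mistaken for HarddiskVolume3.
        if upper.startswith(device.upper() + "\\"):
            return drive + path[len(device):]
    return path


# Hypothetical lookup table; on a real system it would be built from
# the object manager symlinks.
drives = {r"\Device\HarddiskVolume3": "C:"}
print(resolve_device_path(r"\Device\HarddiskVolume3\velociraptor.exe", drives))
```

Returning the input unchanged for unknown devices mirrors the fallback in the artifact's DriveReplace helper, which yields the original path when no mapping matches.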

  2. I also added more logging so we can see exactly what it is trying to do.
  3. Additionally I optimized the code to just check the two strings for equality - most of the time they will be equal so there is no need to fall back to byte by byte comparisons.
  4. I also added proper sparse upload of the regions if they were different.

After playing with the artifact I found some false positives on a clean system. In particular velociraptor.exe was a FP - I uploaded both the mem and disk versions and they were almost identical except for the qword at offset 0x1D41168 (marked with ->):

01D41150   E9 DB F3 2B  FE 90 90 90  90 90 90 90  90 90 90 90  FF FF FF FF  FF FF FF FF->80 1E 7F C3  F7 7F 00 00  50 21 4B C5  F7 7F 00 00  ...+............................P!K.....
01D41178   00 00 00 00  00 00 00 00  FF FF FF FF  FF FF FF FF  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411A0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411C8   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411F0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00                                                                                ................
---  velociraptor.exe.7244.mem       --0x1D411F1/0x1D41200--100%---------------------------------------------------------------------------------------------------------------------
01D41150   E9 DB F3 2B  FE 90 90 90  90 90 90 90  90 90 90 90  FF FF FF FF  FF FF FF FF ->80 1E 08 40  01 00 00 00  50 21 D4 41  01 00 00 00  ...+.......................@....P!.A....
01D41178   00 00 00 00  00 00 00 00  FF FF FF FF  FF FF FF FF  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411A0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411C8   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ........................................
01D411F0   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00                                                                                ................
---  velociraptor.exe.7244.disk       --0x1D411D8/0x1D41200--100%---------

I then inspected the base address of the image in memory and the PE file:

SELECT format(format="%#x", args=Address) AS AddressHex,
       *
FROM vad(pid=getpid())
WHERE MappingName =~ ".exe$"

SELECT
    format(format="%#x",
           args=parse_pe(file="C:/velociraptor.exe").Sections[0].VMA)
FROM scope()

You can see that the VMA (Virtual Memory Address) in the PE header is 0x140001000 and the actual address in memory is 0x7ff7c3771000. Compare the bytes that have changed between the two images:

memory: 7FF7C37F1E80
disk: 000140081E80

and 0x7FF7C37F1E80 - 0x7ff7c3771000 = 0x000140081E80 - 0x140001000 = 0x80E80

So you can see how the address was fixed up from the disk image to the memory image by the loader - this is what I meant by relocations - the loader adjusts the addresses by the ASLR offset.
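
The arithmetic can be checked directly from the numbers quoted in this comment. The Python sketch below decodes the eight bytes after the -> marker in each hex dump as little-endian qwords and compares their difference with the shift between the on-disk VMA and the in-memory address; the variable names are illustrative.

```python
import struct

# Section addresses quoted in the comment above.
section_vma_on_disk = 0x140001000        # VMA of the first section in the PE header
section_addr_in_memory = 0x7FF7C3771000  # where the loader actually mapped it

# The 8 bytes after the "->" marker in each hex dump (little-endian qwords).
mem_qword = struct.unpack("<Q", bytes.fromhex("801E7FC3F77F0000"))[0]
disk_qword = struct.unpack("<Q", bytes.fromhex("801E084001000000"))[0]

# The loader shifts every absolute address by the same amount.
load_delta = section_addr_in_memory - section_vma_on_disk

print(hex(mem_qword), hex(disk_qword), hex(load_delta))
print(mem_qword - disk_qword == load_delta)
```

The qword that follows in the dump (50 21 4B C5 F7 7F 00 00 in memory, 50 21 D4 41 01 00 00 00 on disk) differs by the same amount, which is consistent with both being pointers fixed up by the loader.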

In my testing this is not very common, at least in this binary, but there were two binaries that did have relocations. To properly account for this we need to calculate the relative offset that should have been added (similar to the calculation above) and check that it all adds up.
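
One way to "see if it all adds up" is sketched below in Python. It is an illustration under stated assumptions, not the approach the PR settled on: it assumes 64-bit little-endian pointers, takes the load delta as an input, and does not consult the PE relocation table. The function name is hypothetical. A difference counts as explained when it falls inside an 8-byte window whose memory and disk values differ by exactly the load delta.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def unexplained_differences(mem: bytes, disk: bytes, delta: int) -> list[int]:
    """Offsets where mem and disk differ and no relocated pointer explains it."""
    if len(mem) != len(disk):
        raise ValueError("buffers must have the same length")
    delta &= MASK64
    unexplained = []
    i = 0
    floor = 0  # candidate windows must not overlap an accepted pointer
    while i < len(mem):
        if mem[i] == disk[i]:
            i += 1
            continue
        # The low bytes of a pointer may be unchanged, so the pointer can
        # start up to 7 bytes before the first differing byte.
        first = max(floor, i - 7)
        last = min(i, len(mem) - 8)
        for start in range(first, last + 1):
            m = struct.unpack_from("<Q", mem, start)[0]
            d = struct.unpack_from("<Q", disk, start)[0]
            if (m - d) & MASK64 == delta:
                i = floor = start + 8  # whole pointer is accounted for
                break
        else:
            unexplained.append(i)      # no window matched: suspicious byte
            i += 1
    return unexplained
```

An empty result means every difference is consistent with a loader fix-up. A stricter variant would only accept windows listed in the binary's base relocation table.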

@mdenzel

mdenzel commented Oct 6, 2025

Wow, thanks for all the work!

Hm, I am thinking how to solve this issue without recalculating the entire relocation tables - is checking only the ASLR offset enough? Let's talk offline. I sent you a message on Discord.

@mgreen27
Collaborator

mgreen27 commented Oct 7, 2025

@scudette maybe we should add in the device path conversion exports into the VAD artifact (or another main project artifact) so we can import them easily?

@scudette
Collaborator

scudette commented Oct 7, 2025

Yeah I have a PR with that already - will send soon.

@lautarolecumberry
Author

lautarolecumberry commented Oct 13, 2025

Hi, we updated our query to check for the offsets and add them to a dictionary (as Mike suggested) but it's very slow and it times out. Can you have a look into it?

The problem should be in the following part of the code:

    LET OffsetsTmp <= dict()

    LET CheckOffsetsHelper(X, Y) = SELECT 
        atoi(string=format(format="0x%x", args=substr(str=X, start=_value, end=AddressLength+_value)))
          - atoi(string=format(format="0x%x", args=substr(str=Y, start=_value, end=AddressLength+_value))) AS Difference
      FROM range(end=len(list=X), step=AddressLength)
      WHERE if(condition=Difference!=0,
        then=set(
          item=OffsetsTmp, 
          field=Difference, 
          value=if(
            condition=OffsetsTmp[format(format="%d", args=Difference)], 
            then=1+OffsetsTmp[format(format="%d", args=Difference)], 
            else=1
          )
        )
      )
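
For reference, the counting step the query aims for can be sketched in Python as follows. This is not a VQL fix. It assumes little-endian words aligned to the address length and skips identical chunks before doing any integer conversion; the function name and the default word size of 8 bytes are illustrative.

```python
from collections import Counter


def offset_histogram(mem: bytes, disk: bytes, address_length: int = 8) -> Counter:
    """Count how often each non-zero difference between aligned words occurs."""
    counts = Counter()
    usable = min(len(mem), len(disk))
    usable -= usable % address_length  # ignore a trailing partial word
    for start in range(0, usable, address_length):
        m_chunk = mem[start:start + address_length]
        d_chunk = disk[start:start + address_length]
        if m_chunk == d_chunk:         # cheap check; most chunks are identical
            continue
        difference = (int.from_bytes(m_chunk, "little")
                      - int.from_bytes(d_chunk, "little"))
        counts[difference] += 1
    return counts
```

If the differences are caused by relocation, one value (the load delta) should dominate the histogram, so `counts.most_common(1)` gives a candidate offset to validate.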

PS: The artifact code is not finished yet; we have one TODO remaining. Please don't merge the PR yet.
