The Stack Series: Return Address Spoofing on x64

introduction

A process's stack can give away the true nature of the program running in memory, which is why security solutions monitor it. When a program calls an interesting function such as InternetConnectA, a security product may inspect the stack to find out whether there is anything suspicious about the program. For example: is the base module that initiated the entire call stack floating code in memory? If so, it is highly likely to be suspicious or malicious and requires further inspection.

This is usually done by performing a stack walk. In x86 applications, debuggers usually follow the EBP chain to reconstruct the call stack, whereas x64 does it in a completely different manner, using unwind information stored in the executable itself. In this article we begin with a simpler concept: return address spoofing. It is implemented in many game cheats and in malware. In the context of games, modern titles ship robust anti-cheat engines that prevent external programs from modifying the game's code in memory. The anti-cheat makes sure that any code calling a protected function lies within the bounds of the game’s code in memory. It does this by checking the function's return address: if the return address lies within the boundaries of the game code, the anti-cheat engine assumes the caller is legitimate and lets the function run. Return address spoofing fools the game into believing that the call came from the game's own code.
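
The return address check described above boils down to a range test. The following is a minimal sketch of that idea, not the logic of any particular anti-cheat engine; the `ModuleRange` type, the function name and the addresses are all made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a loaded image: [base, base + size).
struct ModuleRange {
    std::uint64_t base;
    std::uint64_t size;
};

// Returns true when `returnAddress` lies inside the module's mapped image.
bool isReturnAddressInside(const ModuleRange& mod, std::uint64_t returnAddress) {
    return returnAddress >= mod.base && returnAddress - mod.base < mod.size;
}

// Example with made-up addresses.
void demoBoundsCheck() {
    const ModuleRange game{0x140000000ull, 0x200000ull};
    assert(isReturnAddressInside(game, 0x140001234ull));     // call site inside the game image
    assert(!isReturnAddressInside(game, 0x7FF800001000ull)); // call site somewhere else
}
```

A check like this only looks at one value on the stack, which is exactly why replacing that value is enough to satisfy it.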

In the context of malware, we usually see C2 platforms and samples in the wild spoofing the return address to decouple the malicious code from the Win32 API it calls. If a security solution initiates a stack check to identify the caller, it finds a return address that points into some legitimate DLL, which makes it look as if the Win32 API was called from that DLL rather than from the malware. This doesn’t guarantee full evasion; it is one of many tricks malware employs to stay under the radar.

Credit where it's due:

  • IAT hooking code from iredteam.
  • x64 stack spoofer assembly from namazso
  • Implemented in Ace Loader project.

call stack

The image below shows a simple call stack for an InternetConnectA() function call. The following observations can be made from it.

  • The main() function calls SomeFunc1 from some.dll.
  • The SomeFunc2 in some.dll is invoked by SomeFunc1.
  • Finally SomeFunc2 calls InternetConnectA() in wininet.dll.

Execution starts in main and ends in the InternetConnectA API. Any application can retrieve this contextual information by performing a “stack walk”, the process of backtracking through each frame from the most recent function to the least recent one, which gives an overview of the program's call chain. In this post we are more concerned with spoofing the return address than with spoofing the entire stack; that will be a topic for the next post. 🙂

[Image: call stack for the InternetConnectA() call]
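
To make the idea of a stack check concrete, here is a small model of what an inspector might do with the return addresses once a stack walk has produced them. It is only a sketch under stated assumptions: it does not walk a real stack, and the `Module` type, the function names and the "<unbacked>" label are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one loaded module.
struct Module {
    std::string name;
    std::uint64_t base;
    std::uint64_t size;
};

// Maps one return address to the module that contains it, or "<unbacked>"
// when no loaded module covers the address (floating code).
std::string resolveModule(const std::vector<Module>& modules, std::uint64_t address) {
    for (const Module& m : modules) {
        if (address >= m.base && address - m.base < m.size) return m.name;
    }
    return "<unbacked>";
}

// Resolves every return address of a captured call stack, most recent frame first.
std::vector<std::string> describeCallStack(const std::vector<Module>& modules,
                                           const std::vector<std::uint64_t>& returnAddresses) {
    std::vector<std::string> names;
    for (std::uint64_t address : returnAddresses) {
        names.push_back(resolveModule(modules, address));
    }
    return names;
}

// Example with made-up addresses: the last frame is not backed by any module.
void demoDescribe() {
    const std::vector<Module> modules{{"some.dll", 0x1000, 0x1000}, {"app.exe", 0x4000, 0x1000}};
    const std::vector<std::string> names = describeCallStack(modules, {0x1010, 0x4020, 0x9000});
    assert(names[0] == "some.dll" && names[1] == "app.exe" && names[2] == "<unbacked>");
}
```

An inspector that finds "<unbacked>" anywhere in the list has a reason to look closer, which is the situation return address spoofing tries to avoid for the most recent frame.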

Understanding the call stack is essential to implementing a return address spoofer. Regardless of the architecture [x64/x86], when the program executes a CALL instruction, the return address is placed on the stack and the callee’s stack frame starts right after that slot, which is 8 bytes wide on x64. This post covers the x64 architecture. The image below shows a spoofed return address, highlighted in red, that makes kernel32.dll look like the caller of InternetConnectA(). Any application that looks at the return address of InternetConnectA() sees only kernel32.dll, nothing sus!

[Image: call stack with the spoofed return address highlighted in red]

game plan

  • MessageBox() will be our target function, the one whose return address is spoofed.
  • To make the spoofing happen, we need to set a few things up before MessageBox executes. For that reason we hook it.
  • The hook is a simple Import Address Table hook, where the actual address of MessageBox() is replaced with the address of our code.
  • After the hook is placed, we finally call MessageBox().
  • The call to MessageBox lands in our hook, which calls a function SpoofRetAddr. That function in turn calls an assembly procedure that performs the return address spoofing.

iat hooking

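#include <Windows.h>
#include <string>
using PrototypeMessageBox = int (WINAPI*)(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType);
// keep the real MessageBoxA address so the hook can still reach the API
PrototypeMessageBox originalMsgBox = MessageBoxA;
// hook body, defined in the complete listing at the end of the post
int hookedMessageBox(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType);
PVOID SpoofRetAddr(PVOID function, HANDLE module, ULONG size, PVOID a, PVOID b, PVOID c, PVOID d, PVOID e, PVOID f, PVOID g, PVOID h)
{
	//will be discussed later
	return NULL;
}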
int main()
{
	
	LPVOID imageBase = GetModuleHandleA(NULL);
	PIMAGE_DOS_HEADER dosHeaders = (PIMAGE_DOS_HEADER)imageBase;
	PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((DWORD_PTR)imageBase + dosHeaders->e_lfanew);
	PIMAGE_IMPORT_DESCRIPTOR importDescriptor = NULL;
	IMAGE_DATA_DIRECTORY importsDirectory = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
	importDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(importsDirectory.VirtualAddress + (DWORD_PTR)imageBase);
	LPCSTR libraryName = NULL;
	HMODULE library = NULL;
	PIMAGE_IMPORT_BY_NAME functionName = NULL;
	while (importDescriptor->Name != NULL)
	{
		libraryName = (LPCSTR)importDescriptor->Name + (DWORD_PTR)imageBase;
		library = LoadLibraryA(libraryName);
		if (library)
		{
			PIMAGE_THUNK_DATA originalFirstThunk = NULL, firstThunk = NULL;
			originalFirstThunk = (PIMAGE_THUNK_DATA)((DWORD_PTR)imageBase + importDescriptor->OriginalFirstThunk);
			firstThunk = (PIMAGE_THUNK_DATA)((DWORD_PTR)imageBase + importDescriptor->FirstThunk);
			while (originalFirstThunk->u1.AddressOfData != NULL)
			{
				functionName = (PIMAGE_IMPORT_BY_NAME)((DWORD_PTR)imageBase + originalFirstThunk->u1.AddressOfData);
				// find MessageBoxA address
				if (std::string(functionName->Name).compare("MessageBoxA") == 0)
				{
					DWORD oldProtect = 0;
					// make the IAT slot writable before patching it
					VirtualProtect((LPVOID)(&firstThunk->u1.Function), 8, PAGE_READWRITE, &oldProtect);
					// swap MessageBoxA address with address of hookedMessageBox
					firstThunk->u1.Function = (DWORD_PTR)hookedMessageBox;
					// restore the original page protection
					VirtualProtect((LPVOID)(&firstThunk->u1.Function), 8, oldProtect, &oldProtect);
				}
				++originalFirstThunk;
				++firstThunk;
			}
		}
		importDescriptor++;
	}
	// message box after IAT hooking
	MessageBoxA(NULL, "Check_My_Ret_Addr", "Hooked", 0);
	return 0;
}

  • The code above simply parses the executable's PE headers.
  • The import descriptor of each imported module is parsed to locate the import address of the MessageBoxA API. We replace the original address with that of the hookedMessageBox function, which later calls SpoofRetAddr.
  • Note: before hooking, the original address of MessageBox needs to be stored somewhere; in our program it is kept in originalMsgBox.
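
Stripped of the PE parsing, an IAT hook is a pointer swap: callers always go through a table slot, so whoever rewrites the slot decides where the call lands. The sketch below models that with an ordinary function pointer. It is an illustration only; `importSlot`, `savedOriginal`, `realAdd` and `hookedAdd` are hypothetical names standing in for the IAT entry, originalMsgBox, MessageBoxA and hookedMessageBox.

```cpp
#include <cassert>

using BinaryOp = int (*)(int, int);

int realAdd(int a, int b) { return a + b; }  // stand-in for the imported API
int hookedAdd(int a, int b);                 // stand-in for the hook

BinaryOp importSlot = realAdd;     // stand-in for one IAT slot
BinaryOp savedOriginal = nullptr;  // stand-in for the saved original address

int hookedAdd(int a, int b) {
    // Do something extra, then forward to the saved original.
    return savedOriginal(a, b) + 1000;
}

void installHook() {
    if (importSlot == hookedAdd) return;  // already hooked
    savedOriginal = importSlot;           // save the original first
    importSlot = hookedAdd;               // then redirect the slot
}

void demoHook() {
    assert(importSlot(2, 3) == 5);     // before: the slot reaches the real function
    installHook();
    assert(importSlot(2, 3) == 1005);  // after: the same call site reaches the hook
    assert(savedOriginal(2, 3) == 5);  // the original stays reachable
}
```

Saving the original before overwriting the slot matters: once the slot is rewritten, the table no longer contains any way back to the real function.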

return address spoofing

x64 stack primer

  • On x86, EBP is a special-purpose register used to keep track of stack frames, and ESP is the stack pointer [top of the stack]. By following the pointer chain from EBP, we can process the entire call stack of the program. The scene is different on x64, where RBP is treated as a general-purpose register and is usually relieved of its stack duties. RSP acts as both stack pointer and frame pointer.
  • The Windows x64 calling convention resembles fastcall: the first four parameters of a function are passed in RCX/RDX/R8/R9 respectively and the rest are passed on the stack. Arguments are assigned to these locations from left to right.
  • RSP stays fixed throughout a function body on x64. Compilers restrict push and pop to the prologue and epilogue of the function, because using them anywhere else would change the state of RSP. Both local variables and argument values are addressed relative to RSP.
  • A special “home space” [also called shadow space] of 32 bytes is reserved on the stack by the caller. The callee can use it to save the values of the arguments that were passed in registers. [This is not the complete picture; refer to the references to know more.]
  • The callee’s stack frame has space reserved for saving the non-volatile registers.
  • When a call happens, the CALL instruction pushes the return address, which is the address of the instruction that follows the call, onto the stack and jumps to the target function.
  • The stack must be 16-byte aligned at the point of a call. [Refer to the references to know more.]
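
The layout rules above can be turned into a little arithmetic. The sketch below computes where a stack-passed argument sits relative to RSP at the moment the callee is entered; the constant and function names are made up for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kSlot = 8;                           // every stack slot is 8 bytes on x64
constexpr std::size_t kRegisterArgs = 4;                   // RCX, RDX, R8, R9
constexpr std::size_t kHomeSpace = kRegisterArgs * kSlot;  // 32 bytes

// Offset from RSP, at callee entry, of stack-passed argument `n` (1-based, n >= 5).
// [rsp] holds the return address, the home space follows, then the stack arguments.
constexpr std::size_t stackArgOffsetAtEntry(std::size_t n) {
    return kSlot + kHomeSpace + (n - 5) * kSlot;
}

static_assert(kHomeSpace == 32, "home space covers the four register arguments");
static_assert(stackArgOffsetAtEntry(5) == 0x28, "fifth argument sits right above the home space");
static_assert(stackArgOffsetAtEntry(6) == 0x30, "sixth argument is one slot higher");
```

The 0x28 figure is the one to remember for the assembly later in this post: the fifth argument is the first value that lives on the stack rather than in a register.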

The image below is taken from MSDN. It summarizes everything discussed above and shows the stack layout on the x64 platform.

[Image: x64 stack layout, from MSDN]

The image below shows the implementation of “SpoofRetAddr”, which is called from the hooked MessageBox function. Let's focus on the first three parameters:

  • PVOID function: the target Win32 API whose return address needs to be spoofed before invocation.
  • HANDLE module: the module from which the fake return address, our gadget, is taken. When the target function returns to the spoofed address, the gadget executes and hands control back to our assembly code, the fixup code to be specific. We will cover it in the following sections.
  • ULONG size: the size of the module.

The remaining parameters are for the target function.

[Image: implementation of SpoofRetAddr]
  • The FindGadget function fetches the address of the gadget from the module. The gadget we are interested in is “\xFF\x23”, which encodes JMP QWORD PTR [RBX]. In the program code, wininet.dll is the module and MessageBox is the function passed to the SpoofRetAddr function, so our call stack will show the address of the gadget in wininet.dll as the return address.
  • The address of the gadget is stored in a pointer called Trampoline, as shown in the code.
  • The PRM structure glues together all the data [Trampoline, function and a placeholder] we need to perform the spoofing.
[Image]
  • The variable “param” is of type PRM and is initialized with the values Trampoline and function.
  • Finally, the assembly procedure Spoof is called with the arguments required by the target Win32 API plus &param. The extra NULL occupies a second 8-byte slot so that the stack stays 16-byte aligned.
  • The position of &param is very important: it has to come right after the first four arguments.
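
The gadget lookup itself is a plain two-byte pattern search. Here is a minimal, portable sketch of that search over an ordinary buffer; it is not the article's FindGadget, and the function name is hypothetical. The one detail worth noticing is the loop bound, which keeps the second byte of the comparison inside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the offset of the first FF 23 byte pair in [data, data + size),
// or -1 when the pair does not occur.
std::ptrdiff_t findJmpQwordPtrRbx(const std::uint8_t* data, std::size_t size) {
    for (std::size_t i = 0; i + 1 < size; ++i) {  // i + 1 keeps data[i + 1] in bounds
        if (data[i] == 0xFF && data[i + 1] == 0x23) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}

// Example on a made-up byte sequence: nop, jmp qword ptr [rbx], ret.
void demoFindGadget() {
    const std::uint8_t code[] = {0x90, 0xFF, 0x23, 0xC3};
    assert(findJmpQwordPtrRbx(code, sizeof(code)) == 1);
}
```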

assembly magic

[BITS 64]
DEFAULT REL
GLOBAL Spoof
[SECTION .text]
Spoof:
    pop    r11               ;saves the return address in R11
    add    rsp, 8            ;skips callee reserved space
    mov    rax, [rsp + 24]   ;Dereference param [address of gadget]
    mov    r10, [rax]        ;gadget address stored in r10
    mov    [rsp], r10        ;overwrite the return address slot with the gadget address [spoof]
    mov    r10, [rax + 8]    ;store target function addr in r10
    mov    [rax + 8], r11    ;put original ret addr in function member of param
    mov    [rax + 16], rbx   ;save rbx in rbx member of param
    lea    rbx, [fixup]      ;store fixup addr in rbx
    mov    [rax], rbx        ;put fixup in Trampoline member of param
    mov    rbx, rax          ;store updated param in rbx
    jmp    r10               ;jumps to target function addr
fixup:
    sub    rsp, 16           ;Reverts all the RSP modifications done in the beginning 
    mov    rcx, rbx          ;Restore our updated param from rbx
    mov    rbx, [rcx + 16]   ;restores the rbx
    jmp    QWORD [rcx + 8]   ;jumps to original return addr 

The Spoof procedure works as follows:

  • When SpoofRetAddr calls Spoof(), the CALL instruction pushes the return address onto the stack, so RSP points to the original return address, which Spoof needs in order to return safely to our program. We have to save it first, and pop r11 does the job. R11 is a volatile scratch register, so the value is moved into the param structure a few instructions later.
  • Since we are working with an x64 application, as discussed in the previous section, the stack has a shadow space or home space just above the return address, which is 32 bytes in size. We need to skip this callee-reserved space to fetch our stack-based parameter, the “param” structure. The instructions add rsp, 8 / mov rax, [rsp + 24] do exactly that. RAX now holds the address of the param structure, whose first member, Trampoline, contains the gadget address. The instruction mov r10, [rax] loads the gadget address into R10. From here on RAX is used to reference the param struct.
  • The spoofing happens at mov [rsp], r10, which writes the gadget address into the stack slot RSP points to. The target function will treat this slot as its return address.
  • The instruction “mov r10, [rax + 8]” loads the second member, function, which is the MessageBox API address, into R10.
  • The instruction mov [rax + 8], r11 stores the original return address in the function member of the param struct. It is used in fixup to return to the caller after MessageBox has executed.
  • The instruction mov [rax + 16], rbx saves the caller's RBX in the rbx member, because RBX is non-volatile and is about to be overwritten.
  • The instruction lea rbx, [fixup] loads the address of the fixup code into RBX. The instruction mov [rax], rbx then replaces the value of the Trampoline member with that address.
  • The mov rbx, rax instruction stores the address of the updated param struct in RBX.
  • And finally, the instruction jmp r10 jumps to the MessageBox API.
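
The offsets used in the assembly ([rax], [rax + 8], [rax + 16]) only work if the C++ structure is laid out the way the assembly expects. The sketch below restates the article's PRM structure and checks those offsets at compile time, assuming a 64-bit build; the comments summarize what each member holds before and after Spoof has rewritten it.

```cpp
#include <cstddef>

// Mirrors the PRM structure from the listing in this post.
struct PRM {
    const void* trampoline;  // [rax]      gadget address, later overwritten with the fixup address
    void* function;          // [rax + 8]  target API, later overwritten with the original return address
    void* rbx;               // [rax + 16] placeholder that receives the caller's RBX
};

static_assert(sizeof(void*) == 8, "this sketch assumes a 64-bit build");
static_assert(offsetof(PRM, trampoline) == 0, "trampoline is read through [rax]");
static_assert(offsetof(PRM, function) == 8, "function is read through [rax + 8]");
static_assert(offsetof(PRM, rbx) == 16, "rbx is written through [rax + 16]");
```

Reordering the members or adding a field in front of them would silently break the assembly, so a compile-time check like this is a cheap guard.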

How does the fixup code work?

  • Since we changed the return address to our gadget, JMP QWORD PTR [RBX] in wininet.dll, the ret instruction in MessageBox loads the gadget address into RIP and execution resumes there.
  • RBX holds the address of the updated param structure, whose Trampoline member now points to fixup, so the gadget's indirect jump lands in fixup.
  • The instruction sub rsp, 16 in fixup reverts the changes made to RSP and cleans up the stack.
  • The instruction mov rcx, rbx copies the param pointer into RCX, and mov rbx, [rcx + 16] restores the original RBX.
  • The instruction jmp QWORD [rcx + 8] takes us back to the caller, the SpoofRetAddr function, because the function member now holds the original return address.
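
Why sub rsp, 16 is the right amount can be checked by adding up every change to RSP along the spoofed path and comparing the total with what a plain ret from Spoof would produce. The bookkeeping below is a sketch of that arithmetic (function names are made up, and the constexpr bodies need C++14 or later).

```cpp
#include <cstdint>

// RSP relative to its value on entry to Spoof (entry == 0), followed through
// the spoofed path.
constexpr std::int64_t rspAfterSpoofPath() {
    std::int64_t rsp = 0;
    rsp += 8;   // pop r11
    rsp += 8;   // add rsp, 8
    rsp += 8;   // ret inside the target API pops the gadget address
    rsp -= 16;  // sub rsp, 16 in fixup
    return rsp;
}

// A plain ret from Spoof pops only the return address.
constexpr std::int64_t rspAfterPlainReturn() { return 8; }

// RSP at the moment the target API is entered, relative to Spoof's entry.
constexpr std::int64_t rspAtTargetEntry() { return 8 + 8; }

static_assert(rspAfterSpoofPath() == rspAfterPlainReturn(),
              "the caller sees the same RSP as after an ordinary return");
static_assert(rspAtTargetEntry() % 16 == 0,
              "the target is entered with the same alignment Spoof was entered with");
```

The second check is the reason two slots (&param and the NULL) are placed in front of the target's stack arguments: shifting by 16 bytes preserves the alignment the target function expects.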

The overall working of the Spoof procedure is highlighted in the image below.

[Image: overall flow of the Spoof procedure]

#include <iostream>
#include <string>
#include <Windows.h>
#include <winternl.h>
#include <psapi.h>
using PrototypeMessageBox = int (WINAPI*)(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType);
PrototypeMessageBox originalMsgBox = MessageBoxA;
extern "C" PVOID Spoof(PVOID function, HANDLE module, ULONG size, PVOID a, PVOID b, PVOID c, PVOID d, PVOID e, PVOID f, PVOID g, PVOID h);
typedef struct {
	const void* trampoline;     // address of a JMP QWORD PTR [RBX] gadget
	void* function;             // Target Function
	void* rbx;                  // Placeholder
} PRM, * PPRM;
INT compare(PVOID stringA, PVOID stringB, SIZE_T length)
{
	PUCHAR A = (PUCHAR)stringA;
	PUCHAR B = (PUCHAR)stringB;
	do {
		if (*A++ != *B++)
		{
			return(*--A - *--B);
		};
	} while (--length != 0);
	return 0;
}
PVOID FindGadget(LPBYTE module, ULONG size)
{
	// stop one byte early so the two-byte compare never reads past the image
	for (ULONG x = 0; x + 1 < size; x++)
	{
		if (compare(module + x, (PVOID)"\xFF\x23", 2) == 0)
		{
			return (LPVOID)(module + x);
		};
	};
	return NULL;
}
PVOID SpoofRetAddr(PVOID function, HANDLE module, ULONG size, PVOID a, PVOID b, PVOID c, PVOID d, PVOID e, PVOID f, PVOID g, PVOID h)
{
	PVOID Trampoline;
	if (function != NULL)
	{
		Trampoline = FindGadget((LPBYTE)module, size);
		if (Trampoline != NULL)
		{
			PRM param = { Trampoline, function };
			return (
				(
					PVOID(*) (
						PVOID, PVOID, PVOID, PVOID, PPRM,PVOID, PVOID, PVOID, PVOID, PVOID
						)
					)
				(
					(PVOID)Spoof
					)
				)
				(
					a, b, c, d, &param, NULL,e,f,g,h
					);
			
		};
	};
	return NULL;
}
int hookedMessageBox(HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType)
{
	HMODULE hModule = LoadLibraryW(L"Wininet.dll");
	if (hModule == NULL)
		return 0;
	MODULEINFO modInfo = {};
	if (!GetModuleInformation(GetCurrentProcess(), hModule, &modInfo, sizeof(modInfo)))
		return 0;
	ULONG ModuleSize = modInfo.SizeOfImage;
	PVOID msgBox = (PVOID)originalMsgBox;
	SpoofRetAddr(msgBox, hModule, ModuleSize, hWnd, (PVOID)lpText, (PVOID)lpCaption, (PVOID)(ULONG_PTR)uType, NULL, NULL, NULL, NULL);
	return 0;
	
}
int main()
{
	
	LPVOID imageBase = GetModuleHandleA(NULL);
	PIMAGE_DOS_HEADER dosHeaders = (PIMAGE_DOS_HEADER)imageBase;
	PIMAGE_NT_HEADERS ntHeaders = (PIMAGE_NT_HEADERS)((DWORD_PTR)imageBase + dosHeaders->e_lfanew);
	PIMAGE_IMPORT_DESCRIPTOR importDescriptor = NULL;
	IMAGE_DATA_DIRECTORY importsDirectory = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
	importDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(importsDirectory.VirtualAddress + (DWORD_PTR)imageBase);
	LPCSTR libraryName = NULL;
	HMODULE library = NULL;
	PIMAGE_IMPORT_BY_NAME functionName = NULL;
	while (importDescriptor->Name != NULL)
	{
		libraryName = (LPCSTR)importDescriptor->Name + (DWORD_PTR)imageBase;
		library = LoadLibraryA(libraryName);
		if (library)
		{
			PIMAGE_THUNK_DATA originalFirstThunk = NULL, firstThunk = NULL;
			originalFirstThunk = (PIMAGE_THUNK_DATA)((DWORD_PTR)imageBase + importDescriptor->OriginalFirstThunk);
			firstThunk = (PIMAGE_THUNK_DATA)((DWORD_PTR)imageBase + importDescriptor->FirstThunk);
			while (originalFirstThunk->u1.AddressOfData != NULL)
			{
				functionName = (PIMAGE_IMPORT_BY_NAME)((DWORD_PTR)imageBase + originalFirstThunk->u1.AddressOfData);
				// find MessageBoxA address
				if (std::string(functionName->Name).compare("MessageBoxA") == 0)
				{
					DWORD oldProtect = 0;
					// make the IAT slot writable before patching it
					VirtualProtect((LPVOID)(&firstThunk->u1.Function), 8, PAGE_READWRITE, &oldProtect);
					// swap MessageBoxA address with address of hookedMessageBox
					firstThunk->u1.Function = (DWORD_PTR)hookedMessageBox;
					// restore the original page protection
					VirtualProtect((LPVOID)(&firstThunk->u1.Function), 8, oldProtect, &oldProtect);
				}
				++originalFirstThunk;
				++firstThunk;
			}
		}
		importDescriptor++;
	}
	// message box after IAT hooking
	MessageBoxA(NULL, "Check_My_Ret_Addr", "Hooked", 0);
	return 0;
}

Result

[Image: result of running the program]