Malware source code investigation: BlackLotus - part 1
BlackLotus is a UEFI bootkit that targets Windows and is capable of evading security software, persisting once it has infected a system, bypassing Secure Boot on fully patched installations of Windows 11, and executing payloads with the highest level of privileges available in the operating system.
The source code for the BlackLotus UEFI bootkit was published on GitHub on July 12, 2023.
BlackLotus has been for sale on hacking forums since at least October 2022. The malware is offered for $5,000, with payments of $200 per update.
In this short study we examine the BlackLotus source code in detail and highlight its main features.
Architecture
BlackLotus is written in assembly and C and is only 80 KB in size. The malicious code can be configured to avoid infecting systems in countries of the CIS region (at the time of writing, these countries are Armenia, Azerbaijan, Belarus, Kazakhstan, Kyrgyzstan, Moldova, Russia, Tajikistan and Uzbekistan).
Source code structure looks like this:
The software consists of two major components: the Agent, which is installed on the target device, and the Web Interface, which is used by administrators to administer bots. A bot in this context refers to a device with the Agent installed.
Cryptography
First of all, we paid attention to libraries and cryptographic functions:
We first looked at the WinAPI hashing method, which is based on CRC32 and is common in malware development. As you can see, there is nothing out of the ordinary here: a CRC32 implementation with the constant 0xEDB88320L, the reflected form of the standard CRC-32 polynomial. You can learn more about how to use it for hashing when developing malware, for example, here.
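To make the hashing step concrete, here is a minimal sketch of a bitwise CRC-32 built on the 0xEDB88320 constant. The name crc32_hash is hypothetical, and the initial value and final inversion follow the common CRC-32 convention; the BlackLotus implementation may differ in these details (for example, it may be table-driven).

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
   crc32_hash is a hypothetical name, not taken from the BlackLotus source. */
uint32_t crc32_hash(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;
    uint32_t crc = 0xFFFFFFFFu;          /* conventional initial value */

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; fold in the polynomial when the low bit was set. */
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
        }
    }
    return ~crc;                         /* conventional final inversion */
}
```

With this convention, hashing the ASCII string "123456789" yields the standard CRC-32 check value 0xCBF43926.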
The RC4 implementation is also standard; there is nothing complicated about it:
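For reference, textbook RC4 can be sketched as follows. The type and function names (rc4_state, rc4_init, rc4_crypt) are hypothetical and are not the ones used in the BlackLotus source; only the algorithm itself is standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook RC4; names are illustrative, not from the BlackLotus source. */
typedef struct {
    uint8_t s[256];
    uint8_t i, j;
} rc4_state;

/* Key-scheduling algorithm (KSA). key_len must be greater than zero. */
void rc4_init(rc4_state *st, const uint8_t *key, size_t key_len)
{
    uint8_t j = 0;

    for (int n = 0; n < 256; n++)
        st->s[n] = (uint8_t)n;

    for (int n = 0; n < 256; n++) {
        j = (uint8_t)(j + st->s[n] + key[(size_t)n % key_len]);
        uint8_t t = st->s[n];
        st->s[n] = st->s[j];
        st->s[j] = t;
    }
    st->i = 0;
    st->j = 0;
}

/* Pseudo-random generation (PRGA): XORs the keystream into buf in place.
   Encryption and decryption are the same operation. */
void rc4_crypt(rc4_state *st, uint8_t *buf, size_t len)
{
    for (size_t n = 0; n < len; n++) {
        st->i = (uint8_t)(st->i + 1);
        st->j = (uint8_t)(st->j + st->s[st->i]);
        uint8_t t = st->s[st->i];
        st->s[st->i] = st->s[st->j];
        st->s[st->j] = t;
        buf[n] ^= st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
    }
}
```

Because RC4 is a stream cipher, re-initializing the state with the same key and running rc4_crypt over the ciphertext restores the plaintext.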
What about XOR? This code appears to implement a custom type of encryption on a given data buffer. The function CryptXor is applied to the buffer using the specified Key in a Cipher Block Chaining (CBC) style. CBC is a block cipher mode of operation in which each plaintext block is combined with the previous ciphertext block before it is encrypted, so the encryption of each block depends on the blocks before it:
In summary, this function performs a custom type of encryption on the input buffer. It uses XOR operations with a given key and CBC-style chaining, with the possibility to skip over pairs of zero DWORDs.
There is also a function that decrypts via XOR:
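The chaining idea can be sketched as below. This is a simplified illustration under stated assumptions, not the CryptXor routine itself: the function names, the zero initial chaining value, and the DWORD-array interface are all hypothetical, and the optional skipping of zero DWORD pairs is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Encrypt in place: each 32-bit word is XORed with the key and with the
   previous ciphertext word (the chaining step). Hypothetical sketch. */
void xor_chain_encrypt(uint32_t *words, size_t count, uint32_t key)
{
    uint32_t prev = 0;                 /* initial chaining value: an assumption */

    for (size_t i = 0; i < count; i++) {
        uint32_t c = words[i] ^ key ^ prev;
        words[i] = c;
        prev = c;                      /* next word depends on this ciphertext */
    }
}

/* Decrypt in place: undo the XORs, remembering each ciphertext word
   before it is overwritten so it can chain into the next one. */
void xor_chain_decrypt(uint32_t *words, size_t count, uint32_t key)
{
    uint32_t prev = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t c = words[i];
        words[i] = c ^ key ^ prev;
        prev = c;
    }
}
```

The chaining means that identical plaintext words do not, in general, produce identical ciphertext words, which a plain repeating-key XOR would.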
The next interesting thing is files such as ntdll_hash.h, kernel32_hash.h, etc.:
Each of them contains hashes of WinAPI function names and DLL names:
AV evasion tactic
The malware author then simply uses the GetModuleHandleByHash(DWORD Hash) function:
The given C function, GetModuleHandleByHash, is a means of dynamically resolving and obtaining a module handle given a hash of the module name. This is typically seen in malware code, as it helps to avoid static strings (like "kernel32.dll") that could be easily spotted by antivirus heuristic algorithms. This technique increases the difficulty of static analysis.
The function works as follows:
- It begins by reading the Thread Environment Block (TEB) via inline assembly code. This is a structure that Windows maintains per thread to store thread-specific information. The structure of the TEB and the offsets used indicate that it is retrieving the first entry in the InLoadOrderModuleList (reached from the TEB through the Process Environment Block and its loader data), which is a doubly linked list of loaded modules in the order they were loaded. This is a common way to get a list of loaded modules without calling any APIs like EnumProcessModules.
- Once it has the first module, it enters a loop where it processes each module in turn. For each module, it converts the module name to lower case and computes its CRC32 hash (using the Crc32Hash function).
- If the computed hash matches the input hash, it returns the base address of the module (which is effectively the same as the module handle, for the purpose of calling GetProcAddress).
- If the hash does not match, it moves to the next module in the InLoadOrderModuleList and repeats the process.
- If it has checked all the modules and not found a match, it returns NULL.
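The steps above can be sketched in portable C. This is only an illustration of the lookup logic: module_entry is a hypothetical stand-in for the loader's list entries, names are plain char strings rather than the wide strings Windows uses, the list is NULL-terminated rather than circular, and the CRC-32 parameters are the conventional ones rather than ones confirmed from the source.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a loader list entry. */
typedef struct module_entry {
    const char *name;                 /* module file name, e.g. "KERNEL32.DLL" */
    void *base;                       /* base address of the mapped image */
    const struct module_entry *next;  /* NULL ends this simplified list */
} module_entry;

/* CRC-32 (polynomial 0xEDB88320) over the lower-cased module name. */
uint32_t hash_name_lower(const char *name)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (; *name != '\0'; name++) {
        crc ^= (uint8_t)tolower((unsigned char)*name);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
    }
    return ~crc;
}

/* Walk the list and return the base of the first module whose
   lower-cased name hashes to `hash`, or NULL if none matches. */
void *find_module_by_hash(const module_entry *head, uint32_t hash)
{
    for (const module_entry *m = head; m != NULL; m = m->next) {
        if (hash_name_lower(m->name) == hash)
            return m->base;
    }
    return NULL;
}
```

Lower-casing before hashing makes the lookup independent of how the loader happens to spell the module name, so a single precomputed constant per DLL is enough.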
Note that LDR_MODULE and its linked list structures are part of the Windows Native API (also known as the “NT API”), which is an internal API used by Windows itself. It’s not officially documented by Microsoft, so using it can be risky: it can change between different versions or updates of Windows. However, it also provides a way to do things that can’t be done with the standard Windows API, so it’s often used in low-level code like device drivers or, in this case, bootkit malware.
There are also files such as advapi32_functions.h, ntdll_functions.h or user32_functions.h:
These header files define function pointers to Windows API functions such as VirtualAlloc, OpenProcess, and Process32FirstW, as well as NT API structures and functions:
The functions are declared as function pointers, rather than called directly, so that they can be resolved dynamically at runtime. This can be useful in a few scenarios, such as when writing code that needs to run on multiple versions of Windows where not all functions are available, and, in our case, when trying to evade detection by anti-malware tools (since these tools often flag direct imports of certain API functions as suspicious).
The GetProcAddressByHash function in the given code is designed to look up a function in a DLL using the hash of the function’s name, rather than the name itself. This is typically used in malware to make static analysis harder, as it avoids leaving clear text strings (like "CreateProcess") in the binary that can be easily identified:
This code also assumes that it’s running on the same architecture as the DLL it’s examining, i.e., if the code is compiled for a 64-bit target, it assumes the DLL is also 64-bit, and vice versa for 32-bit.
It’s worth noting that manipulating the PE file format and using hashed function names like this is a common technique used in malware and rootkits to make analysis and detection more difficult.
Another interesting file is nzt.h:
As you can see, the function pointer macro API(Function) expands to NzT.Api.p##Function. This is likely used to call function pointers stored in an API_FUNCTIONS structure, which is part of the NzT_T struct.
NzT_T is a structure that bundles together various components of the bot’s functionality, including an API_FUNCTIONS structure for API function pointers, an API_MODULES structure for loaded module information, a CRC type (for checksum calculations), and an INFECTION_TYPE field indicating the infection status of the bot.
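The token-pasting pattern behind API(Function) can be illustrated with a small portable sketch. The names API, NzT, NzT_T and API_FUNCTIONS come from the description above; everything else is an assumption: the two members (pStrLen, pStrCmp), their pointer types, and resolve_api are hypothetical, and standard C library functions stand in for the Windows routines that the real code resolves by hash.

```c
#include <stddef.h>
#include <string.h>

/* Function pointer types, one per resolved routine (hypothetical subset). */
typedef size_t (*ptStrLen)(const char *);
typedef int (*ptStrCmp)(const char *, const char *);

/* Table of resolved function pointers; members are illustrative only. */
typedef struct {
    ptStrLen pStrLen;
    ptStrCmp pStrCmp;
} API_FUNCTIONS;

/* Global context; the real NzT_T holds more fields than shown here. */
typedef struct {
    API_FUNCTIONS Api;
} NzT_T;

NzT_T NzT;

/* Token pasting turns API(StrLen) into NzT.Api.pStrLen. */
#define API(Function) NzT.Api.p##Function

/* The real code fills the table via hash lookups; this sketch assigns
   the addresses directly to keep the example self-contained. */
void resolve_api(void)
{
    NzT.Api.pStrLen = strlen;
    NzT.Api.pStrCmp = strcmp;
}
```

After resolve_api() has run, a call such as API(StrLen)("abc") reads like a normal function call while going through the pointer table, so no import for the function appears at the call site.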
Windows Registry
The registry.c file implements functions for interacting with the Windows Registry:
GetRegistryStartPath(INT Hive) - This function is used to get the start path of the registry hive, based on the hive type passed to it (e.g., HKEY_LOCAL_MACHINE). The path is formatted into the form expected by the Windows kernel functions, which is a bit different from what you might usually see (e.g., "\Registry\Machine" instead of HKEY_LOCAL_MACHINE). The function returns this path as a wide character string (LPWSTR):
RegistryOpenKeyEx(CONST LPWSTR KeyPath, HANDLE RegistryHandle, ACCESS_MASK AccessMask) - This function is used to open a specific key in the registry, given its path, a handle to a pre-existing key (or NULL for the root of the registry), and an access mask specifying what type of access the function caller requires to the key (e.g., KEY_READ, KEY_WRITE). It uses the NtOpenKey API function from the Windows Native API to actually open the key:
RegistryReadValueEx(CONST LPWSTR KeyPath, CONST LPWSTR Name, LPWSTR* Value) - This function reads a value from a given key in the registry. It does this by opening the key with RegistryOpenKeyEx, then querying the value with NtQueryValueKey. The function reads the value’s data into a buffer, which it then returns to the caller. If anything goes wrong (e.g., the key couldn’t be opened, the value couldn’t be queried, there wasn’t enough memory to store the value’s data), the function returns FALSE:
RegistryReadValue(INT Hive, CONST LPWSTR Path, CONST LPWSTR Name, LPWSTR* Value) - This function combines the functionality of the other functions. It reads a value from a specific key in a specific hive of the registry. It constructs the full path to the key by concatenating the start path of the hive (obtained with GetRegistryStartPath) and the rest of the key path passed to the function. It then reads the value from this key with RegistryReadValueEx:
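The path construction can be sketched as follows. This is a simplified, portable illustration: the hive_id enum and both function names are hypothetical, narrow strings replace the wide strings the real code uses, and only the two hives whose native paths are fixed strings are covered.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hive identifiers for this sketch. */
typedef enum { HIVE_MACHINE, HIVE_USERS } hive_id;

/* Map a hive to the prefix used by the native (Nt*) registry functions. */
const char *hive_start_path(hive_id hive)
{
    switch (hive) {
    case HIVE_MACHINE: return "\\Registry\\Machine"; /* HKEY_LOCAL_MACHINE */
    case HIVE_USERS:   return "\\Registry\\User";    /* HKEY_USERS */
    }
    return NULL;
}

/* Join the hive prefix and the sub-key path into `out`.
   Returns 0 on success, -1 on an unknown hive or a too-small buffer. */
int build_native_key_path(hive_id hive, const char *sub_key,
                          char *out, size_t out_size)
{
    const char *start = hive_start_path(hive);
    if (start == NULL)
        return -1;

    int n = snprintf(out, out_size, "%s\\%s", start, sub_key);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For example, HIVE_MACHINE combined with "SOFTWARE\Microsoft" gives "\Registry\Machine\SOFTWARE\Microsoft", the form that NtOpenKey expects in its object attributes.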
There are also two functions, but they are not used anywhere and are commented out:
Filesystem
There are also separate functions for working with files on Windows, in file.c:
which implements functions such as FileGetInfo, FileGetSize, FileOpen, FileWrite, etc.
FileGetInfo(HANDLE FileHandle, PFILE_STANDARD_INFORMATION Info) - This function retrieves standard information about a file. The NtQueryInformationFile function is used to retrieve the information. It takes a handle to an open file and a pointer to a FILE_STANDARD_INFORMATION structure to fill with information. The MemoryZero function is used to clear these structures before use.
The FILE_STANDARD_INFORMATION structure includes several file attributes such as the allocation size of the file, the end of the file, the number of links to the file, and flags to indicate if the file is a directory or if it is deleted. If the operation is successful, the function returns TRUE. If the operation fails, it returns FALSE:
FileGetSize(HANDLE FileHandle, PDWORD FileSize) - This function retrieves the size of a file. It does so by calling FileGetInfo to get the standard information of the file, and then sets the value pointed to by FileSize to the AllocationSize.LowPart of the FILE_STANDARD_INFORMATION structure:
Note that AllocationSize is a LARGE_INTEGER (which is a 64-bit value), and this function only returns the lower 32 bits of it, which may be incorrect for files larger than 4 GB. AllocationSize is also the space allocated for the file on disk, which can be larger than the logical file length reported in EndOfFile.
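The truncation can be shown with a short sketch. The file_sizes struct is a hypothetical stand-in for the two 64-bit size fields of FILE_STANDARD_INFORMATION, and both helper names are invented for this example; it is not the code from file.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the size fields of FILE_STANDARD_INFORMATION. */
typedef struct {
    int64_t allocation_size;  /* bytes reserved on disk for the file */
    int64_t end_of_file;      /* logical length of the file in bytes */
} file_sizes;

/* What taking only LowPart does: keep the low 32 bits, drop the rest. */
uint32_t size_low_part(int64_t size)
{
    return (uint32_t)((uint64_t)size & 0xFFFFFFFFu);
}

/* A safer variant: report the full 64-bit logical length.
   Returns 0 on success, -1 on bad arguments or a negative size. */
int file_size_64(const file_sizes *info, uint64_t *out)
{
    if (info == NULL || out == NULL || info->end_of_file < 0)
        return -1;
    *out = (uint64_t)info->end_of_file;
    return 0;
}
```

For a 5 GiB file (0x140000000 bytes), size_low_part reports 0x40000000, that is 1 GiB, while file_size_64 preserves the full value.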
Injections
Other functions in the malware’s source code implement the injection logic:
For example:
LPVOID InjectData(
HANDLE Process,
LPVOID Data,
DWORD Size
)
Here’s a breakdown of what the function does:
NzT.Api.pVirtualAllocEx(Process, NULL, Size, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE) - It starts by allocating memory within the virtual memory space of a target process. The size of the allocated memory is specified by the Size parameter. The memory is both reserved (MEM_RESERVE) and committed (MEM_COMMIT). The allocated memory has read, write, and execute permissions (PAGE_EXECUTE_READWRITE). The address of the allocated memory is saved in the Address variable. If this operation fails, the function returns NULL.
NzT.Api.pWriteProcessMemory(Process, Address, Data, Size, NULL) - If memory allocation is successful, the function proceeds to write data into the allocated memory within the target process. It does this using the WriteProcessMemory function. This function copies data from a buffer (Data) in the current process to the allocated memory (Address) in the target process. If the operation fails, it frees the allocated memory using VirtualFreeEx and returns NULL.
If both operations are successful, the function returns the address of the allocated memory in the target process. This can then be used for various purposes, such as executing the injected code.
This type of functionality is often seen in malware that injects malicious code into legitimate processes to hide its activities or gain higher privileges.
What about this injection logic?
DWORD InjectCode(
HANDLE Process,
LPVOID Function
)
which is also implemented in this file:
This function appears to inject code into a target process by creating a section of memory, copying the code into this section, performing relocations, and finally mapping this section into the target process.
Once all the tasks are performed, the function cleans up by closing any open handles and unmapping any mapped views. Finally, it returns the address of the injected function in the target process.
As with many other kinds of code injection techniques, this one is also commonly seen in malware.
Pseudo-Random Generator
The file guid.c contains several functions:
These functions are designed to generate a pseudo-random GUID (Globally Unique Identifier). The GUID is built from the values produced by a simple linear congruential generator (LCG), which is a type of pseudorandom number generator.
Here’s what each function does:
GuidRandom(PDWORD Seed) - This is a linear congruential generator (LCG) function that takes a seed as a parameter and generates a pseudorandom number. It’s important to note that this LCG function always produces the same sequence of numbers if the initial seed is the same:
GuidGenerate(GUID * Guid, PDWORD Seed) - This function takes a pointer to a GUID structure and a pointer to a DWORD seed as parameters. It generates a GUID by calling GuidRandom(Seed) to generate pseudorandom numbers and assign them to the four parts of the GUID structure (Data1, Data2, Data3, Data4):
GuidGenerateEx(PDWORD Seed) - This function generates a GUID string. It calls GuidGenerate(&Guid, Seed) to generate a GUID and then converts this GUID to a string format with GuidToString(&Guid). This string is then copied to a newly allocated memory block, and a pointer to this block is returned:
In the context of malware, the generated GUIDs might be used for a variety of purposes, including marking infected systems, communicating with command-and-control (C2) servers, or creating mutexes to avoid multiple instances of the malware. In our case, these functions are used to generate the bot ID.
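A seeded LCG that fills a GUID-shaped structure can be sketched as follows. The multiplier and increment are the widely published Numerical Recipes constants, chosen here as an assumption because the constants used in guid.c are not shown above; bot_guid and all function names are likewise hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* GUID-shaped structure for this sketch (hypothetical name). */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} bot_guid;

/* One LCG step: seed = seed * a + c (mod 2^32).
   Constants are an assumption, not taken from the BlackLotus source. */
uint32_t lcg_next(uint32_t *seed)
{
    *seed = *seed * 1664525u + 1013904223u;
    return *seed;
}

/* Fill every field from successive LCG outputs, using the high bits,
   which are the better-distributed bits of a power-of-two LCG. */
void guid_generate(bot_guid *g, uint32_t *seed)
{
    g->data1 = lcg_next(seed);
    g->data2 = (uint16_t)(lcg_next(seed) >> 16);
    g->data3 = (uint16_t)(lcg_next(seed) >> 16);
    for (int i = 0; i < 8; i++)
        g->data4[i] = (uint8_t)(lcg_next(seed) >> 24);
}

/* Format as the usual 36-character GUID text; out needs 37 bytes. */
void guid_to_string(const bot_guid *g, char out[37])
{
    snprintf(out, 37, "%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X",
             (unsigned)g->data1, (unsigned)g->data2, (unsigned)g->data3,
             (unsigned)g->data4[0], (unsigned)g->data4[1],
             (unsigned)g->data4[2], (unsigned)g->data4[3],
             (unsigned)g->data4[4], (unsigned)g->data4[5],
             (unsigned)g->data4[6], (unsigned)g->data4[7]);
}
```

Because the generator is fully determined by its seed, deriving the seed from a stable machine property would give the same identifier on every run, which is what makes such a value usable as a bot ID.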
Utils
There is also a file with utilities, utils.c, which contains a lot of auxiliary functions:
For example, GetProcessIdByHandle(HANDLE Process):
This function retrieves the unique process ID of a process given a handle to the process.
Or the function GetProcessIdByHash(DWORD Hash):
which returns the Process ID (PID) of a process given its hash. This function scans all running processes on the system and returns the PID of the process whose executable name matches the provided hash.
The function creates a snapshot of all processes currently running on the system by calling the CreateToolhelp32Snapshot function. If the snapshot creation fails, it returns -1 to indicate the failure. It then retrieves the first process in the snapshot using the Process32FirstW function. If this function fails, it closes the snapshot handle and returns -1 to indicate the failure. The function then enters a loop, where it calculates the CRC32 hash of the current process’s executable name (szExeFile). It checks whether this calculated hash is equal to the input hash. If it is, the function breaks out of the loop and returns the Process ID (th32ProcessID) of the current process. If the hash doesn’t match, it proceeds to the next process in the snapshot using the Process32NextW function and repeats the previous steps. After the loop, it closes the snapshot handle and returns the PID of the process with the matching hash. If no matching process was found, it returns -1.
The CreateMutexOfProcess(DWORD ProcessID) function attempts to create a mutex (a synchronization object) with a unique name based on the process ID and the serial number of the disk volume (which is obtained by the GetSerialNumber() function):
A mutex can be used to prevent multiple instances of a malware or application from running at the same time. In this case, the mutex name is generated by concatenating the disk volume’s serial number and the process ID, which should provide a unique mutex for each running instance of the process.
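The name construction can be sketched as below. The exact format used by CreateMutexOfProcess is not shown above, so the hexadecimal layout here is purely an assumption, and build_mutex_name is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Build a name from the volume serial number followed by the process ID.
   The "%08X%08X" layout is an assumption made for this sketch.
   Returns 0 on success, -1 if the buffer is too small. */
int build_mutex_name(uint32_t volume_serial, uint32_t process_id,
                     char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%08X%08X",
                     (unsigned)volume_serial, (unsigned)process_id);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Two processes on the same machine share the serial-number half of the name and differ only in the process-ID half, so each one gets its own mutex.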
There is also interesting logic in the destroyOS() function:
but it is also commented out.
That’s all for today. In the next part we will investigate other modules.
We hope this post raises awareness of these interesting malware techniques among blue teamers, and adds a weapon to the red teamers’ arsenal.
By Cyber Threat Hunters from MSSPLab:
References
https://github.com/ldpreload/BlackLotus
https://malpedia.caad.fkie.fraunhofer.de/details/win.blacklotus
https://twitter.com/threatintel/status/1679906101838356480
https://twitter.com/TheCyberSecHub/status/1680044350820999168
Thanks for your time, happy hacking, and goodbye!
All drawings and screenshots are MSSPLab’s