Business CTF 2022: Defeating modern malware techniques - Mr Abilgate

This blog post will cover the creator's perspective, challenge motives, and the write-up of the Mr Abilgate challenge from 2022's Business CTF.

Shad3, Nov 26, 2022

Challenge summary

The goal of the challenge is to decrypt the encrypted .xls file in order to find out what it is really hiding. To achieve that, the provided executable must be reverse engineered, so that we form a decent understanding of how it operates.

Determining the executable is packed and unpacking

The binary analysis starts with an enumeration phase, in which we collect information about the executable. We first open the binary in DiE (Detect It Easy) to gather initial information. The first interesting observation is that DiE detects that the executable is packed using UPX.

Even though UPX has a built-in unpacking option (upx -d), which means that we can retrieve the unpacked executable with a single command, we'll pretend that it doesn't, so that we can explore some more parameters that help identify whether an executable is packed.

Entropy

Entropy, in computer science, is a measure of the randomness or diversity of the output of a data-generating function. If a block of data has high entropy (close to the maximum value of 8 bits per byte), it is unstructured and possibly random, which is characteristic of encrypted or compressed data. Structured data, including assembly and machine code, has a noticeably lower entropy value. If we treat each file segment as such a block of data, we can figure out which segments are packed and which are not. DiE can again be used to measure this.
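
To make the 0 to 8 scale concrete, here is a minimal sketch of the byte-level Shannon entropy calculation. The function name shannon_entropy is chosen for illustration only, and this is not necessarily how DiE computes its own figures.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # A single repeated byte carries no information at all.
    print(shannon_entropy(b"A" * 4096))
    # Every byte value equally often is the maximum for byte data.
    print(shannon_entropy(bytes(range(256)) * 16))
    # Random data (similar to encrypted or compressed data) lands near the maximum.
    print(shannon_entropy(os.urandom(4096)))
```

Running the same function over the raw bytes of each section gives a per-section figure comparable to the one a tool such as DiE reports.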

Size of the segments

Another indication that an executable is packed is the abnormal size of the "data storing" segments. To be more precise, .data, .rsrc and other custom file segments are used to store static data. It is highly unlikely for an executable that implements complex functionality to have data-storing segments larger than its .text segment (where the compiled code is stored). That is the case here, as the .text segment holds only approximately 13.3% of the total size of the executable.
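
The size comparison can also be scripted. The sketch below reads the section table directly with the standard struct module and reports how much of the on-disk section data belongs to one section. The helper names section_raw_sizes and section_share are illustrative, and the share is computed over the sum of raw section sizes, which may differ slightly from a percentage measured against the whole file.

```python
import struct
import sys


def section_raw_sizes(data: bytes) -> dict:
    """Map each PE section name to its SizeOfRawData (size on disk)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    e_lfanew = struct.unpack_from("<I", data, 0x3C)[0]
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # IMAGE_FILE_HEADER starts right after the 4-byte PE signature.
    num_sections = struct.unpack_from("<H", data, e_lfanew + 6)[0]
    opt_header_size = struct.unpack_from("<H", data, e_lfanew + 20)[0]
    table = e_lfanew + 24 + opt_header_size  # first IMAGE_SECTION_HEADER
    sizes = {}
    for index in range(num_sections):
        offset = table + 40 * index  # each section header is 40 bytes
        name = data[offset:offset + 8].rstrip(b"\x00").decode("ascii", "replace")
        sizes[name] = struct.unpack_from("<I", data, offset + 16)[0]
    return sizes


def section_share(sizes: dict, name: str) -> float:
    """Fraction of the total raw section data held by the named section."""
    total = sum(sizes.values())
    return sizes.get(name, 0) / total if total else 0.0


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as handle:
        raw_sizes = section_raw_sizes(handle.read())
    for section, size in raw_sizes.items():
        print(f"{section:<8} {size:>10} bytes  {section_share(raw_sizes, section):6.1%}")
```

A small code section next to one large data-like section is the pattern described above.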

Note: The rest of the analysis is done using IDA Pro, although it is not a requirement.

Initial analysis

By skimming through the decompiled code and looking at the imports, we observe that the executable imports very few functions, and the ones that are actually imported seem obscure (e.g. some common function calls are missing).

Another observation that we make is that one function is called everywhere - .text:@0x001120, which I arbitrarily named getFuncHashResolve. The reason why will be explained later on. That function calls LoadLibraryA, passing to it as an argument the output of an operation on the first argument. Keep in mind that the first argument of getFuncHashResolve is a static string.

The decompiled output looks like the following:

char *__fastcall getFuncHashResolve(char *encryptedString, __int64 a2) {
...
  heapBuffer = calloc(0xFFu, 1u);
  if ( heapBuffer )
  {
    j = 0;
    v21 = 0;
    encryptedStringPtr = encryptedString - &encryptedStaticPasswordPtr;
    encryptedStaticPasswordPtr = encryptedStaticPassword;
    do
    {
      v7 = &encryptedStaticPasswordPtr + j;
      *v7 ^= 0xDDu;
      decryptedByte = *(&encryptedStaticPasswordPtr + j + encryptedStringPtr) ^ *(&encryptedStaticPasswordPtr + j);
      v7[heapBuffer - &encryptedStaticPasswordPtr] = decryptedByte;
      if ( !decryptedByte )
        break;
      ++j;
    }
    while ( j < 16 );
  }
  lib = LoadLibraryA(heapBuffer);

According to the MSDN documentation LoadLibraryA has the following definition:

HMODULE LoadLibraryA(
  [in] LPCSTR lpLibFileName
);

Thus it is obvious that the buffer, allocated at the start of the function, is filled with the file name of the library to be loaded. Taking that into account, we can safely assume that the operation that's performed on the first argument of getFuncHashResolve is some sort of decryption routine, as the string differs on several cross references of the function, and it seems fairly random.

Decrypting the encrypted strings

The same algorithm also appears in other places inside the binary, but without a call to a separate function. This is because the compiler inlined the function:

In computing, inline expansion, or in-lining, is a manual or compiler optimization that replaces a function call site with the body of the called function. Inline expansion is similar to macro expansion, but occurs during compilation, without changing the source code (the text), while macro expansion occurs prior to compilation, and results in different text that is then processed by the compiler.

Knowing this, we can recreate the decryption routine using a language of our choice in order to decrypt the static strings. For reference, a proof of concept for one string is shown below:

def decryptString(encr):
  pw = b''
  decr = b''
  sbxr = b'\xB2\xBB\xB2\x90\xB8\xAF\xB8\x95\xAE\xBC\x8A\xEE\xB9\xBC\xB5\x8E'[::-1]
  for i in range(len(sbxr)):
    pw  += bytes([sbxr[i] ^ 0xDD])
  print("[+] The password used is : " + pw.decode())
  for i in range(len(encr)):
    decr += bytes([pw[i] ^ encr[i] ])
  return decr
print("[+] The final decrypted string: " + decryptString(b'\x38\x0d\x13\x0a\x56\x3b\x52\x41\x66\x01\x1e\x09\x4d').decode())
...
[+] The password used is : Shad3WasHereMofo
[+] The final decrypted string: kernel32.dll

Defeating function resolution

Looking deeper and setting the correct types on the variables based on the return types that are defined in MSDN, we end up with something like the following:

  lib = LoadLibraryA(heapBuffer);
  if ( !lib )
    return 0;
  if ( *lib != 0x5A4D )
    return 0;
  exportDirectoryRVA = (lib + *(lib + 15));     // libraryBase + dosHeader->e_lfanew
  if ( exportDirectoryRVA->Signature != 17744 )
    return 0;
  if ( (exportDirectoryRVA->FileHeader.Characteristics & 0x2000) == 0 )
    return 0;
  imageExportDirectory = exportDirectoryRVA->OptionalHeader.DataDirectory[0].VirtualAddress;
  if ( !imageExportDirectory )
    return 0;
  if ( !exportDirectoryRVA->OptionalHeader.DataDirectory[0].Size )
    return 0;
  imageExportDirectory_NumberOfFunctions = *(lib + imageExportDirectory + 28);// imageExportDirectory->NumberOfFunctions
  k = 0;
  addressOfNamesRVA = (lib + *(lib + imageExportDirectory + 32));
  addressOfNameOrdinalsRVA = lib + *(lib + imageExportDirectory + 36);
  if ( !imageExportDirectory_NumberOfFunctions )
    return 0;
  while ( 1 )
  {
   /**
   * SDBM hashing implementation
   */
    sdbm = 0;
    functionNameRVA = lib + *addressOfNamesRVA;
    for ( i = *functionNameRVA; *functionNameRVA; i = *functionNameRVA )
    {
      ++functionNameRVA;
      sdbm = i + 65599 * sdbm;
    }
    if ( sdbm == argHash ) // Second Argument 
      break;
    k = (k + 1);
    addressOfNamesRVA = (addressOfNamesRVA + 4);
    if ( k >= imageExportDirectory_NumberOfFunctions )
      return 0;
  }
  return lib + *(lib + 4 * *&addressOfNameOrdinalsRVA[2 * k] + imageExportDirectory_NumberOfFunctions);// functionAddress

This function implements API hashing, a technique commonly used in modern malware to evade detection by antivirus and EDR solutions. The malware resolves every system API call it makes dynamically at run time, which defeats simpler static analysis tools that rely on the import table to see which functions are used.

The implementation of the mechanism is fairly simple - the executable maps into memory the library that exports the desired function, then walks its exported function names, hashing each one and comparing it against a static hash value.

Upon finding a matching hash value, the pointer to the function is returned. The hash used in the getFuncHashResolve function is named SDBM, which has been used by the infamous Emotet malware.
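
As a quick sanity check, the multiply-by-65599 step seen in the decompiled loop is the same operation as the shift-based form in which SDBM is usually written. The sketch below shows both variants and a tiny lookup over candidate names. The 64-bit accumulator width is an assumption, and the helper names (sdbm_mul, sdbm_shift, resolve) are illustrative.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # assumption: the accumulator is a 64-bit integer


def sdbm_mul(name: str) -> int:
    """SDBM as it appears in the decompiled loop: h = c + 65599 * h."""
    h = 0
    for c in name.encode("ascii"):
        h = (c + 65599 * h) & MASK64
    return h


def sdbm_shift(name: str) -> int:
    """SDBM in its usual shift form; 65599 == (1 << 6) + (1 << 16) - 1."""
    h = 0
    for c in name.encode("ascii"):
        h = (c + (h << 6) + (h << 16) - h) & MASK64
    return h


def resolve(target_hash: int, candidate_names):
    """Return the first candidate whose hash matches, or None."""
    for name in candidate_names:
        if sdbm_mul(name) == target_hash:
            return name
    return None


if __name__ == "__main__":
    names = ["CreateFileA", "CloseHandle", "ReadFile"]
    for name in names:
        # Both forms must agree for every input.
        assert sdbm_mul(name) == sdbm_shift(name)
        print(f"{name}: {sdbm_mul(name):#018x}")
    print(resolve(sdbm_mul("ReadFile"), names))
```

Because the hash is one-way, the only practical way to turn a hash back into a name is to hash every exported name of the candidate libraries and compare.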

Knowing this, we can develop an automatic resolver for all functions:

import pefile
import sys
import os

win32ApiHashes = []

def sdbm(ptext):
	'''
	Calculates the sdbm hash that is used in the ransomware
	I-> ptext: plaintext string to be hashed
	O<- hashValue: integer hash value of the input string, truncated to 64 bits
	'''
	sdbmHash = 0
	for pchr in ptext:
		sdbmHash = ord(pchr) + (sdbmHash << 6) + (sdbmHash << 16) - sdbmHash
	return sdbmHash & 0xFFFFFFFFFFFFFFFF

def getDLLExports(filename):
	'''
	Fetches the exported functions of a DLL
	I->filename: filename of a dll under the System32 folder
	O<-exports: list of {'name', 'hash'} dictionaries, one per named export
	'''
	exports = []
	d = [pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_EXPORT"]]
	pe = pefile.PE("C:\\Windows\\System32\\" + filename, fast_load=True)
	pe.parse_data_directories(d)
	for i in pe.DIRECTORY_ENTRY_EXPORT.symbols:
		if i.name is not None:
			exports.append({'name' : i.name.decode(), 'hash' : sdbm(i.name.decode())})
	return exports

def craftDLLexports(dllNames):
	'''
	Retrieves the exported API calls of all the dll files
	named in the dllNames list and stores them in a global array
	I->dllNames: list of dllNames that we want to retrieve the
				 hashes from
	O<-None
	'''
	for dllName in dllNames:
		win32ApiHashes.extend(getDLLExports(dllName))

def resolveHash(hashValue):
	'''
	Attempts to resolve the hash back to the name of the API call
	I->hashValue: Integer value of a hash
	O<-name: the name of the API call that produces that hash, or None if there is no match
	'''
	for apiCall in win32ApiHashes:
		if apiCall['hash'] == hashValue:
			return apiCall['name']
	return None

if len(sys.argv) == 2:
	dllNames = ["kernel32.dll", "advapi32.dll", "shlwapi.dll", 'ntdll.dll']
	craftDLLexports(dllNames)
	print("f'({}) -> {}".format(sys.argv[1], resolveHash(int(sys.argv[1], base=16))))
'''
python calculateFunctionHashes.py 0xECC89E7E1B474400
f'(0xECC89E7E1B474400) -> CloseHandle
'''

Defeating anti-debugging

The challenge uses an anti-debugging technique in the function at 0x140001EB0, the first function that gets called upon starting the challenge.

The function calls CloseHandle(0xDEADBEEF);. When a process is running under a debugger, closing an invalid handle raises an EXCEPTION_INVALID_HANDLE exception; without a debugger, the call simply fails and returns FALSE. Therefore, if control is passed to the exception handler registered through SEH (Structured Exception Handling), a debugger is present. This check can be easily bypassed either by setting the rax register to TRUE (1) upon return or by nop'ing out the check in the calling function that branches the execution flow. It is not necessary to bypass this trick in order to solve the challenge, since the challenge can be solved statically.

Understanding the logic of the malware

Now that we have a way to resolve all the function pointers that are returned from getFuncHashResolve, we can easily understand what the malware does.

The malware starts by traversing the contents of a starting directory:

C:\Users\Administrator\Desktop\ShipSalesThisIsSuperImportantPleaseDontDelete\*

And searches for files that have the following extensions:

.pdf, .docx, .txt, .xls, .ppt, .png, .jpeg, .jpg, .gif, .mp4, .mov, .wmv, .avi, .html
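
To predict which files in a given folder would be picked up, the selection step can be modelled with a short, read-only sketch. It only lists matching files. The non-recursive walk and the case-insensitive comparison are assumptions (the write-up does not state either), the jpeg entry is assumed to mean .jpeg, and find_target_files is a hypothetical helper name.

```python
import os

# Extensions taken from the list above.
TARGET_EXTENSIONS = {
    ".pdf", ".docx", ".txt", ".xls", ".ppt", ".png", ".jpeg",
    ".jpg", ".gif", ".mp4", ".mov", ".wmv", ".avi", ".html",
}


def find_target_files(directory, extensions=TARGET_EXTENSIONS):
    """Return the sorted paths of regular files whose extension is targeted."""
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip sub-directories; only the top level is inspected here.
            if not entry.is_file():
                continue
            extension = os.path.splitext(entry.name)[1].lower()
            if extension in extensions:
                matches.append(entry.path)
    return sorted(matches)


if __name__ == "__main__":
    for path in find_target_files("."):
        print(path)
```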

It then creates an encryption context for AES-256 and, using a static plaintext key stored inside the executable, encrypts every file that has been linked into a global singly linked list.

const BYTE keyBuffer[] = {0xf9, 0x97, 0xb6, 'G', '`' ,0x08 ,
                          0xa7,0xea, 0xfb, '-', 0xbe , 'P',
                          0xe9 ,0x96, 0x94, 0xf6};

The global list of files, as well as its individual entries, is described by the following structures (please note that the names are chosen arbitrarily):

struct FileListEntry {
  struct FileListEntry* next;
  CHAR filePath[MAX_PATH];
  BOOL isEncrypted;
};
typedef struct FileList {
  struct FileListEntry* head;
  UINT32 totalLength;
} FileList;

FileList fileList;

Last but not least, the malware drops a text file in the Desktop folder of Mr. Abilgate with the following note, which is again decrypted using the common algorithm explained above.

PS C:\Users\Administrator\Desktop> type .\YOUAREDOOMED.txt
No one can save you now Mr. Abilgate your important contract is now encrypted.
You have to give all your fortune to charity!
Sent $1.000.000 worth of BTC to the following BTC address and maybe you'll get your files back
BTC address = 1NgiUwkhYVYMy3eoMC9dHcvdHejGxcuaWmo

Decrypting the .xls file

Putting the above knowledge together, recovering the encrypted file is trivial. All we have to do is implement a simple AES-256-CBC decryptor whose key is the SHA-256 digest of the key statically stored in the executable, with an all-zero IV.

from Crypto.Cipher import AES
import os
import hashlib


def getEncryptedBytes(fileName, N):
	'''
	Returns N number of bytes read from a file
	If N = -1 then it returns all the bytes of the file
	'''
	try:
		# The context manager closes the file even if the read fails
		with open(fileName, 'rb') as f:
			# f.read(-1) reads until EOF, so N = -1 needs no special case
			return f.read(N)
	except OSError:
		print("[-] Couldn't open the file :(")
		raise SystemExit(-1)

def getSHA256Hash(data):
	'''
	Produces the SHA256 hash of a data buffer
	'''
	m = hashlib.sha256()
	m.update(data)
	return m.digest()


def decryptFile(fileName,key):
	'''
	Final function that gets called
	attempts to decrypt an AES-256-CBC encrypted file with a crafted
	key, assuming the key is 32 bytes and the IV is all zeroes
	'''
	aes = AES.new(key, AES.MODE_CBC, iv=b'\x00'*16)
	fl = getEncryptedBytes(fileName, -1)
	# Any padding added by the cipher is left in place at the end of the output
	with open("decrypted.xls", "wb") as f:
		f.write(aes.decrypt(fl))

def main():
	fileName = "ImportantAssets.xls.bhtbr"
	print("[+] Decrypting the file")
	initKey = b'\xf9\x97\xb6G`\x08\xa7\xea\xfb-\xbeP\xe9\x96\x94\xf6'
	key = getSHA256Hash(initKey)
	decryptFile(fileName, key)
		

if __name__ == '__main__':
	main()
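
The script above writes the decrypted bytes as they come out of the cipher, so any block padding stays at the end of decrypted.xls. The standard-library sketch below shows the key derivation on its own and an optional clean-up step. PKCS#7 padding is an assumption here (the write-up does not state the padding scheme), and derive_key and pkcs7_unpad are illustrative names.

```python
import hashlib

# The 16-byte static buffer from the executable (keyBuffer above).
INIT_KEY = bytes([0xF9, 0x97, 0xB6, 0x47, 0x60, 0x08, 0xA7, 0xEA,
                  0xFB, 0x2D, 0xBE, 0x50, 0xE9, 0x96, 0x94, 0xF6])


def derive_key(init_key: bytes) -> bytes:
    """SHA-256 turns the 16-byte buffer into a 32-byte AES-256 key."""
    return hashlib.sha256(init_key).digest()


def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Strip PKCS#7 padding (assumed scheme) and validate it while doing so."""
    if not data or len(data) % block_size:
        raise ValueError("data length is not a multiple of the block size")
    pad = data[-1]
    if pad < 1 or pad > block_size or data[-pad:] != bytes([pad]) * pad:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-pad]


if __name__ == "__main__":
    key = derive_key(INIT_KEY)
    print(len(key), key.hex())
    # Example: a 3-byte message padded up to one 16-byte block.
    print(pkcs7_unpad(b"abc" + bytes([13]) * 13))
```

If pkcs7_unpad raises on the decrypted data, the padding assumption does not hold and the raw output should be kept as it is.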
