Data Exfiltration Techniques
Introduction
In this article, we will see how malware encodes or encrypts the data it exfiltrates from infected machines to its Command and Control Server. This is often done using a custom encoding or encryption algorithm.
 
Malware increasingly uses this technique to prevent Security Analysts from understanding the type of data being exchanged between the malware and its Server. Similar algorithms can also be used to randomize artifact details, such as the names of files or registry keys created on the infected machine.
In all such cases, Behavioral Analysis of the malware is not sufficient. Only after analyzing the code used by the malware can these algorithms be understood.
Randomization
Most malware creates certain disk artifacts when it executes. If these disk artifacts keep the same names across multiple executions of the malware, then it becomes easy to discover the presence of the malware on other machines using the indicators gathered during Behavioral Analysis.
To prevent this, malware can use custom algorithms to generate random names for the disk artifacts it creates.
Similarly, most malware will gather some data from the infected machine and send it to an attacker-controlled server. If this communication channel is not encrypted and the data is sent in plain text, then it becomes trivial to understand the intention of the malware and its nature.
Certain Win32 APIs are often used to obtain a varying value, which in turn feeds a custom obfuscation or encryption algorithm that randomizes the disk artifact names or encrypts the communication channel.
Two of these Win32 APIs which are quite commonly used are: GetTickCount() and QueryPerformanceCounter().
In this article, we are going to look into a Custom Encryption and Encoding algorithm that uses QueryPerformanceCounter to generate a 16 byte Random Seed.
For the purpose of completeness, we will look at how the data is gathered from the machine and what type of data it is, followed by the details of the Encryption and Encoding algorithms.
Collection of Data
Once the malware has successfully executed on the machine, it proceeds to gather various details specific to the machine like the MAC Address, Username, Hostname, IP Address, Timestamp and destination domain name.
Below is a high level overview of how these details are gathered and in what format they are captured.
Data is collected by calling Win32 APIs like GetAdaptersAddresses, GetUserNameA, GetCurrentProcessId, gethostname, gethostbyname and GetLocalTime.
The data is gathered using the return values of the above functions and stored at 0041C1D0 as shown below:

Here mark:xxsy is a marker for the data collected.
Then it calls the main Encryption Routine at 00401768 to encrypt this data:

Stack arguments:

0041C1D0 – Pointer to the Data Collected from the machine
0041C360 – Encrypted Data will be stored here
Random Seed Generation
Once the data has been collected from the system, it starts the encryption routine. The first step in the encryption routine is to generate the random seed which will be used in the algorithm.
To generate the random seed, the QueryPerformanceCounter API is used as shown below:

The function prototype of QueryPerformanceCounter() is:
BOOL WINAPI QueryPerformanceCounter(
_Out_ LARGE_INTEGER *lpPerformanceCount
);
It accepts one argument, which is a pointer to a LARGE_INTEGER that receives the Performance Counter. Once the API has executed, the Performance Counter value is written to this memory address. The counter value has a size of 2 DWORDs (8 bytes).
In our case, the algorithm uses only the first DWORD.
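As a rough illustration of what "the first DWORD" means here, the Python sketch below splits an 8 byte counter value into its two DWORDs. The function name `split_large_integer` and the sample value are made up for the example; on little-endian x86 the low-order DWORD is the one that comes first in memory.

```python
import struct

def split_large_integer(raw8: bytes):
    """Split the 8 bytes of a LARGE_INTEGER into (low DWORD, high DWORD)."""
    if len(raw8) != 8:
        raise ValueError("a LARGE_INTEGER is exactly 8 bytes")
    # "<II" = two little-endian unsigned 32-bit values, low DWORD first
    low, high = struct.unpack("<II", raw8)
    return low, high

# Made-up counter value, laid out as it would appear in memory
sample = struct.pack("<q", 0x0000001A2B3C4D5E)
low, high = split_large_integer(sample)
print(hex(low), hex(high))
```

Only `low` would be passed on to the seed generation code in the scenario described here.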
Stack arguments just before the call to QueryPerformanceCounter:

So, the counter value will be stored at the address 0012FD78:

Below is an explanation of the code used to generate the 16 byte Random Seed:
[cpp]
LEA EAX,DWORD PTR SS:[EBP-1C]
PUSH EAX                                ; pointer to Performance Counter
CALL DWORD PTR DS:[<&KERNEL32.QueryPerf>; QueryPerformanceCounter
PUSH DWORD PTR SS:[EBP-1C]          ; Arg1 (1st DWORD of Performance Counter)
CALL sysmgr.00404E15
POP ECX
CALL sysmgr.00404E27            ; Subroutine to modify the first DWORD
MOV BYTE PTR DS:[EDI+ESI],AL        ; Form the Random Seed byte by byte
TEST AL,AL
JNZ SHORT sysmgr.004017B7
MOV BYTE PTR DS:[EDI+ESI],1
INC EDI
CMP EDI,10              ; Total Length of the seed is 0x10 bytes
JL SHORT sysmgr.00401794
[/cpp]
After it retrieves the value of the Performance Counter, the first DWORD is passed to the subroutine at address 00404E27:

This subroutine will modify the value of the DWORD and finally return the lower byte of the high order word in AL. This value will then be written to the new memory location. If AL is zero, the byte 0x01 is written instead, so the seed never contains a zero byte.
In each iteration of the loop, one byte of the random seed is generated. Since the total length of the seed is 16 bytes, there will be 16 invocations of QueryPerformanceCounter, each of which writes a new byte to the memory address where the random seed is stored.
The random seed will be stored at the memory address [EDI+ESI], which is 00922D00 in our case.
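The loop can be approximated in Python as below. This is only a sketch: the article does not identify the subroutines at 00404E15 and 00404E27, so the sketch assumes they behave like an MSVC-style `srand()`/`rand()` pair (a linear congruential generator whose output is taken from the high order word). The function names and the idea of passing the counter samples in as a list are illustrative choices.

```python
def lcg_low_byte(counter_low_dword: int) -> int:
    """Seed an LCG with the counter and return the low byte of its first output.

    ASSUMPTION: MSVC-style constants; the real subroutines may differ.
    """
    state = (counter_low_dword * 214013 + 2531011) & 0xFFFFFFFF
    value = (state >> 16) & 0x7FFF      # rand()-style result in EAX
    return value & 0xFF                 # AL

def make_seed(counter_samples) -> bytes:
    """Produce one seed byte per counter sample, never emitting 0x00."""
    seed = bytearray()
    for sample in counter_samples:
        b = lcg_low_byte(sample & 0xFFFFFFFF)
        # Mirrors TEST AL,AL / MOV BYTE PTR DS:[EDI+ESI],1
        seed.append(b if b != 0 else 1)
    return bytes(seed)

# 16 made-up counter readings, one per loop iteration
print(make_seed(range(1000, 1016)).hex())
```

Because every byte depends on a fresh counter reading, the seed differs from run to run even though the generator itself is deterministic.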
Before the random seed generation:

It contains 0xBAADF00D because it is a new chunk of memory allocated by RtlAllocateHeap.
After the first execution of the loop, the byte 0x27 is written to this location:

After the above loop completes and the complete 16 byte random seed is generated and stored at 00922D00, it will copy the random seed to a new location.

Below is an explanation of the code:
[cpp]
PUSH 10
LEA EAX,DWORD PTR DS:[EBX+1]
PUSH ESI
PUSH EAX
MOV BYTE PTR DS:[EBX],1 ; EBX is 00B30018 where the random seed is copied to.
CALL sysmgr.00403ED0
[/cpp]
EBX points to the buffer that receives the random seed. This memory address is 00B30018 in our case. The first byte of this buffer is fixed at 0x01, and the random seed is copied to the bytes that follow it, starting at EBX+1 (00B30019).
The subroutine at 00403ED0 is used to write the random seed to the new memory location.
[cpp]
MOV AL,BYTE PTR DS:[ESI] ; ESI points to the original location of random seed.
MOV BYTE PTR DS:[EDI],AL ; EDI points to the new location of random seed.
MOV AL,BYTE PTR DS:[ESI+1]
MOV BYTE PTR DS:[EDI+1],AL
MOV AL,BYTE PTR DS:[ESI+2]
SHR ECX,2
MOV BYTE PTR DS:[EDI+2],AL
ADD ESI,3
ADD EDI,3
CMP ECX,8
JB SHORT sysmgr.00403F54
[/cpp]
It copies the random seed this way:
- It first copies 3 bytes of the random seed to the new location, byte by byte.
- It then copies 3 DWORDs of the random seed to the new location, DWORD by DWORD, as shown below.

- It then writes 1 byte to the new location.
After the above subroutine has executed, the new random seed is stored as shown below.
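The 3 byte, 3 DWORD, 1 byte pattern is what one would expect from a copy routine that first aligns the destination to a 4 byte boundary. The sketch below works through that arithmetic for the addresses in this sample and shows the resulting buffer layout. The alignment explanation is an assumption (the article only reports the observed pattern), and `copy_plan` and `build_header` are hypothetical helper names.

```python
def copy_plan(dest_addr: int, length: int):
    """Return (head_bytes, dwords, tail_bytes) for a copy that aligns dest to 4 bytes."""
    head = min((-dest_addr) % 4, length)     # bytes needed to reach alignment
    dwords, tail = divmod(length - head, 4)  # aligned DWORD copies, then leftovers
    return head, dwords, tail

def build_header(seed: bytes) -> bytes:
    """Fixed 0x01 byte followed by the 16 byte random seed."""
    if len(seed) != 16:
        raise ValueError("seed must be 16 bytes")
    return b"\x01" + seed

print(copy_plan(0x00B30019, 0x10))           # destination is EBX+1
header = build_header(bytes(range(1, 17)))
print(hex(0x00B30018 + len(header)))         # first free byte after the header
```

With the header occupying 17 bytes from 00B30018, the next free byte is 00B30029, which is where the encrypted data is written later.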

Encryption Key Formation
Once the random seed is generated and copied to 00B30019, it calls a subroutine at 00401083 to form the Encryption Key.

The stack arguments:

0012ED30 – Location of the new encryption key
00922D00 – Original location of the random seed
00B30019 – New location of the random seed
After we step into the subroutine at 00401083:
[cpp]
MOV ECX,sysmgr.00416CA0 ; 00416CA0 is the location of the private key
PUSH 12 ; The total size of the key is 0x12 DWORDs
XOR EDX,EDX
SUB ECX,EAX ; subtract 12ED30 from 416CA0
POP EDI ; EDI will be used as the outer loop counter
[/cpp]
The malware has the private key used for the encryption stored at the address, 00416CA0. The size of this key is 0x12 DWORDs or 72 bytes. This key along with the random seed will be used to form a new key located at 0012ED30.
72 byte key located at 00416CA0:

Here is the loop used to generate the new key:

Here is the explanation of the code:
[cpp]
XOR ESI,ESI
MOV DWORD PTR SS:[EBP-8],4 ; Initialize the inner loop counter to 4
MOV EBX,DWORD PTR SS:[EBP+C] ; EBX points to the original location of random seed
MOVZX EBX,BYTE PTR DS:[EDX+EBX] ; Read one byte at a time from the random seed
SHL ESI,8
OR ESI,EBX ; ESI will store one DWORD from the random seed
INC EDX
CMP EDX,10 ; Check if all the bytes from the random seed have been read
JL SHORT sysmgr.004010C7
XOR EDX,EDX ; If all the bytes from the random seed are read then reset EDX
DEC DWORD PTR SS:[EBP-8]
JNZ SHORT sysmgr.004010B3
MOV EBX,DWORD PTR DS:[ECX+EAX]
XOR EBX,ESI ; XOR the DWORD from random seed with the private key
MOV DWORD PTR DS:[EAX],EBX ; new Encryption key will be stored at 0012ED30
ADD EAX,4
DEC EDI ; There are a total of 0x12 DWORDs in the key
JNZ SHORT sysmgr.004010AA
[/cpp]
Here is an explanation of the key formation routine:
- It reads one DWORD (byte by byte) from the random seed.
- It XORs the DWORD read from the random seed with the corresponding DWORD read from the private key.
- It stores the result into the location of the new encryption key.
- It reads the bytes from the random seed in a cyclic order. Since the length of the random seed is 0x10 bytes or 4 DWORDs and the length of the private key is 0x48 bytes or 0x12 DWORDs, it wraps around to the start of the random seed once it has read all of its bytes.
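The steps above can be sketched in Python as follows. The function name `form_key` and the placeholder key values are illustrative; the real 0x12 DWORD key at 00416CA0 is not reproduced here. Note that `SHL ESI,8` followed by `OR ESI,EBX` assembles each seed DWORD with the first byte read as the most significant byte.

```python
def form_key(static_key, seed: bytes):
    """XOR each DWORD of the static key with a DWORD read cyclically from the seed."""
    if len(seed) != 0x10:
        raise ValueError("seed must be 0x10 bytes")
    new_key = []
    idx = 0                                   # plays the role of EDX
    for key_dword in static_key:
        seed_dword = 0                        # plays the role of ESI
        for _ in range(4):                    # inner loop counter at [EBP-8]
            seed_dword = ((seed_dword << 8) | seed[idx]) & 0xFFFFFFFF
            idx += 1
            if idx == len(seed):              # wrap around after 0x10 bytes
                idx = 0
        new_key.append(key_dword ^ seed_dword)
    return new_key

# Placeholder key of 0x12 zero DWORDs makes the seed pattern visible
demo = form_key([0] * 0x12, bytes(range(16)))
print([hex(d) for d in demo[:5]])
```

With a zero key the output simply repeats the four seed DWORDs, which makes the cyclic reuse of the seed easy to see.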
Before the key formation routine has executed, the memory at address 0012ED30 is:

Once the above loop has executed, the new encryption key is stored at 0012ED30 as shown below:

Key Modification Routine
Once the new encryption key is formed and stored at 0012ED30, it is modified in the next loop. The loop processes 2 DWORDs at a time and modifies them using a subroutine at 00401040.

Below is an explanation of the code:
[cpp]
MOV ECX,DWORD PTR SS:[EBP+8] ; ECX points to the key
LEA EAX,DWORD PTR SS:[EBP-4] ; This will hold the final modified value of the first DWORD
PUSH EAX ; EAX points to 0012ED00
LEA EBX,DWORD PTR SS:[EBP-8] ; This will hold the final modified value of the second DWORD
CALL sysmgr.00401040
MOV EAX,DWORD PTR SS:[EBP+8] ; EAX points again to the start of the Key, 0012ED30
POP ECX         ;  0012ED00
MOV ECX,DWORD PTR SS:[EBP-4] ; Final DWORD from previous iteration is stored in ECX
MOV DWORD PTR DS:[EAX+ESI*4],ECX ; Modify the first DWORD of the key
MOV ECX,DWORD PTR SS:[EBP-8] ;
MOV DWORD PTR DS:[EAX+ESI*4+4],ECX ; Modify the second DWORD of the key
INC ESI
INC ESI ; Increment ESI two times since we are modifying two DWORDs at a time
CMP ESI,12 ; The total length of the key is 0x12 DWORDs
JL SHORT sysmgr.004010E1
[/cpp]
Before the execution of the above loop, the key at 0012ED30 is:

Once the above subroutine has executed, the key is modified as shown below:

Encryption of Data
Once the encryption key has been formed, the data that was gathered previously from the machine will be encrypted using it.
In the Data Encryption Subroutine, two DWORDs at a time are read from the data and modified using the encryption key. Once this is done, each of these 2 DWORDs is written to the new memory location.
The subroutine at 00401040 is used to encrypt two DWORDs at a time.

It passes 2 parameters:

12FD8C – One of the 2 encrypted DWORDs will be stored here.
41C1D0 – Points to the data to be encrypted
It reads two DWORDs at a time from the data to be encrypted and stores them at addresses 12FD88 and 12FD8C as shown below:

Once the subroutine at 00401040 has executed, these two DWORDs will be encrypted as shown below:

Now these two DWORDs will be written to the new memory location.

Below is an explanation of the encryption subroutine:
[cpp]
MOV EAX,DWORD PTR SS:[EBP+8]   ; EAX holds the data to be encrypted
MOV ECX,DWORD PTR DS:[EAX+EDI*8] ; First DWORD from the data to be encrypted is stored in ECX
MOV EAX,DWORD PTR DS:[EAX+EDI*8+4] ; Second DWORD from the data to be encrypted is stored in EAX
MOV DWORD PTR SS:[EBP-C],EAX  ; Store second DWORD at 0012FD88
LEA EAX,DWORD PTR SS:[EBP-8]
MOV DWORD PTR SS:[EBP-8],ECX ; Store the first DWORD at 0012FD8C
PUSH EAX
LEA EBX,DWORD PTR SS:[EBP-C]
LEA ECX,DWORD PTR SS:[EBP-1064] ; Points to the 0x48 byte key
CALL sysmgr.00401040 ; Modify the first and second DWORDs stored at 0012FD88 and 0012FD8C
PUSH 4
LEA EAX,DWORD PTR SS:[EBP-8]
PUSH EAX
LEA EAX,DWORD PTR DS:[ESI-4]
PUSH EAX
CALL sysmgr.00403ED0 ; Store the second DWORD at new memory address
PUSH 4
MOV EAX,EBX
PUSH EAX
PUSH ESI
CALL sysmgr.00403ED0 ; Store the first DWORD at the new memory address
ADD ESP,1C
INC EDI ; Increment EDI to read the next DWORDs from the data to be encrypted
ADD ESI,8
CMP EDI,DWORD PTR SS:[EBP+10] ; Total of 13 iterations are required to read all data
JL SHORT sysmgr.004017FA
[/cpp]
The subroutine at 00403ED0 will be used to write the DWORD to the memory location.

As can be seen above, the DWORDs at 12FD88 and 12FD8C are swapped and written to the new memory location, 00B30029.
Also, it is important to note that during Random Seed Generation, the 16 byte random seed was written to memory starting at 00B30019, right after the fixed 0x01 byte at 00B30018.
So, the encrypted data is stored after the random seed.
The loop above continues to execute for the entire length of the data.
After the loop has executed completely, the encrypted data is stored as shown below:

Obfuscation of Encrypted Data
Once the data is encrypted and stored at 00B30029, in the next subroutine at 004011E8 it is obfuscated.

The 2 parameters passed to the obfuscation routine are:

00B30018 – Pointer to the random seed and encrypted data
0041C360 – The final obfuscated data will be stored here
If we step into the subroutine at 004011E8, we can see the obfuscation algorithm here:

The inner loop runs 3 times and writes 3 bytes to the new memory location. The outer loop then takes the 3rd byte of the 3 byte sequence just processed, masks it with 0x3F and writes the result as a 4th byte.
The outer loop runs 0x39 times and writes 4 bytes to the new memory location each time.
Below is an explanation of the code:
[cpp]
MOV EDI,EDX
MOV DWORD PTR SS:[EBP-8],2 ; Initialize local variable (this will be incremented in steps of 2)
MOV BYTE PTR SS:[EBP-1],0 ; Initialize local variable
MOV EAX,ESI
MOV DWORD PTR SS:[EBP-C],6 ; Initialize local variable (this will be decremented in steps of 2)
SUB EDI,ESI
MOV DWORD PTR SS:[EBP-10],3 ; Inner loop counter
MOV BL,BYTE PTR DS:[EAX] ; Read a byte from the encrypted data
MOV CL,BYTE PTR SS:[EBP-8]
ADD DWORD PTR SS:[EBP-8],2 ; Increment local variable by 2
SHR BL,CL ; modify BL
MOV ECX,DWORD PTR SS:[EBP-C]
SUB DWORD PTR SS:[EBP-C],2 ; Decrement local variable by 2
OR BL,BYTE PTR SS:[EBP-1] ; Modify BL
MOV BYTE PTR DS:[EDI+EAX],BL ; Write BL to new location
MOV BL,BYTE PTR DS:[EAX]
SHL BL,CL
SHR BL,2
INC EAX
DEC DWORD PTR SS:[EBP-10] ; Decrement inner loop counter
MOV BYTE PTR SS:[EBP-1],BL ; This value will be used in OR operation in next iteration
JNZ SHORT sysmgr.0040122B
MOV AL,BYTE PTR DS:[ESI+2]
AND AL,3F
MOV BYTE PTR DS:[EDX+3],AL
ADD ESI,3
ADD EDX,4
DEC DWORD PTR SS:[EBP-14] ; Decrement outer loop counter
JNZ SHORT sysmgr.0040120C
[/cpp]
As can be seen above, it modifies the bytes of the Random Seed and the Encrypted Data. After every 3 bytes it also adds an extra byte, which is the third byte of the preceding 3 byte sequence masked with 0x3F.
Before the obfuscation of encrypted data:

After the obfuscation of encrypted data:

So, the obfuscated data is larger than the encrypted data: every 3 input bytes become 4 output bytes.
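A Python rendering of the obfuscation loop, following the shift counts in the disassembly, may make the bit manipulation easier to follow. The function name `obfuscate` is illustrative, and the sketch assumes the input length is a multiple of 3, as it is when the outer loop runs 0x39 times over 0xAB bytes.

```python
def obfuscate(data: bytes) -> bytes:
    """Turn every 3 input bytes into 4 output bytes, each holding a 6-bit value."""
    out = bytearray()
    for i in range(0, len(data) - len(data) % 3, 3):   # outer loop
        carry = 0          # local at [EBP-1]
        shr = 2            # local at [EBP-8], grows by 2
        shl = 6            # local at [EBP-C], shrinks by 2
        for b in data[i:i + 3]:                        # inner loop, 3 iterations
            out.append((b >> shr) | carry)
            # SHL BL,CL then SHR BL,2 on an 8-bit register
            carry = ((b << shl) & 0xFF) >> 2
            shr += 2
            shl -= 2
        out.append(data[i + 2] & 0x3F)                 # AND AL,3F
    return bytes(out)

print(list(obfuscate(b"\x01\x02\x03")))
print(hex(len(obfuscate(bytes(0xAB)))))
```

Every output byte ends up in the range 0x00 to 0x3F, which is what the encoding stage that follows relies on.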
Encoding the Obfuscated Data
Once the data is obfuscated and stored at 0041C360, in the next subroutine the malware will encode this data as shown below:

Below is an explanation of the code:
[cpp]
MOV ESI,EAX
SHL ESI,2 ; ESI will be the total length of the obfuscated data above (0xE4 bytes)
XOR EDX,EDX
TEST ESI,ESI
JLE SHORT sysmgr.004012B5
MOV EAX,DWORD PTR SS:[EBP+C]  ; EAX points to the obfuscated data
LEA ECX,DWORD PTR DS:[EDX+EAX] ; EDX is the counter used as an offset into the obfuscated data
MOV AL,BYTE PTR DS:[ECX] ; Read a byte from the obfuscated data
CMP AL,19 ; If the byte is 0x19 or less, add 0x41 to it
JA SHORT sysmgr.00401286
ADD AL,41
JMP SHORT sysmgr.0040129C
CMP AL,1A  ; Otherwise check whether it lies in the range 0x1A to 0x33
JB SHORT sysmgr.00401292
CMP AL,33
JA SHORT sysmgr.00401292
ADD AL,47 ; Add 0x47 if it is between 0x1A and 0x33 (inclusive)
JMP SHORT sysmgr.0040129C
CMP AL,34
JB SHORT sysmgr.004012A0
CMP AL,3D
JA SHORT sysmgr.004012A0
SUB AL,4     ; Subtract 4 if it is between 0x34 and 0x3D (inclusive)
MOV BYTE PTR DS:[ECX],AL ; Write the modified byte back in place
JMP SHORT sysmgr.004012B0
CMP AL,3E ; 0x3E is replaced with 0x2B ('+')
JNZ SHORT sysmgr.004012A9
MOV BYTE PTR DS:[ECX],2B
JMP SHORT sysmgr.004012B0
CMP AL,3F ; 0x3F is replaced with 0x2F ('/')
JNZ SHORT sysmgr.004012B0
MOV BYTE PTR DS:[ECX],2F
INC EDX
CMP EDX,ESI
JL SHORT sysmgr.00401276
[/cpp]
This encoding algorithm checks the value of each byte read from the obfuscated data and modifies it based on the range it falls in. The resulting data consists of readable ASCII characters as shown below:
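The comparisons in the disassembly translate into the small Python function below (the name `encode_byte` is illustrative). Mapping the values 0x00 to 0x3F through it yields A-Z, a-z, 0-9, '+' and '/', which is the standard Base64 alphabet, so the obfuscation and encoding stages taken together appear to be equivalent to Base64 encoding the buffer.

```python
import string

def encode_byte(v: int) -> int:
    """Map a 6-bit value to a printable character code, as in the encoding loop."""
    if v <= 0x19:
        return v + 0x41        # 'A'..'Z'
    if v <= 0x33:
        return v + 0x47        # 'a'..'z'
    if v <= 0x3D:
        return v - 4           # '0'..'9'
    if v == 0x3E:
        return 0x2B            # '+'
    if v == 0x3F:
        return 0x2F            # '/'
    return v                   # values above 0x3F are left untouched

alphabet = bytes(encode_byte(v) for v in range(0x40)).decode("ascii")
print(alphabet)
```

An analyst who recognizes this can decode captured traffic with any standard Base64 decoder instead of reimplementing the two loops.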

Random Seed Transfer
Once the encrypted data is received by the Server, it will use the Decryption algorithm to retrieve the data. However, in order to decrypt, the Server requires the random seed which was generated on the client side and used to form the encryption key.
All other elements used to perform the encryption such as the private key are already available to the Server.
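If we look at the encoded data stored at 0041C360 as shown above, the first byte is always fixed as 0x41, because the fixed 0x01 byte at the start of the buffer obfuscates to 0x00, which the encoding maps to 0x41. The bytes that follow it carry the obfuscated version of the 16 byte random seed.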
In the random seed generation section, we can see that the 16 byte random seed is written to the memory address 00B30019. After it is used to form the encryption key and encrypt the data, in the obfuscation stage, the random seed itself is also obfuscated.
So, the random seed is present in the Header of the Encrypted Data sent to the Server. In this way, the Server now has all the elements required to decrypt and retrieve the data.
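A receiver (or an analyst with a traffic capture) could recover the seed along these lines. This is a sketch under two assumptions: that the obfuscation and encoding stages are equivalent to standard Base64, which the code shown earlier suggests, and that the buffer layout is the fixed 0x01 byte, then the 16 byte seed, then the encrypted data. `split_payload` is a hypothetical helper name, and decryption itself is out of scope because the cipher subroutine is not reproduced in the article.

```python
import base64

def split_payload(encoded: bytes):
    """Return (seed, encrypted_data) from the encoded blob sent by the malware."""
    raw = base64.b64decode(encoded, validate=True)
    if len(raw) < 17 or raw[0] != 0x01:
        raise ValueError("unexpected header")
    return raw[1:17], raw[17:]

# Round trip with made-up values: 1 + 16 + 154 = 0xAB bytes before encoding
demo_seed = bytes(range(1, 17))
demo_body = bytes(154)
blob = base64.b64encode(b"\x01" + demo_seed + demo_body)
seed, body = split_payload(blob)
print(blob[:1], seed.hex(), len(body))
```

The first encoded character is 'A' (0x41), matching the fixed first byte observed at 0041C360.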
Sending the Encrypted Data
Now that the data is encrypted and stored at 0041C360, it is ready to be transferred to the Server. In our case, the malware makes use of the HTTP Protocol to send this data to the Server.
It first forms the HTTP Header field, "Set-Cookie:" as shown below:

Stack arguments:

The subroutine at 00404A40 takes 2 arguments.
00B30018 – Pointer to the Set-Cookie: Header
0041C360 – Pointer to the encrypted data

After this subroutine has executed:

Once this is done, it will add this field to the HTTP Request Headers:

Stack arguments:

00CC000C – Handle returned by HttpOpenRequestA
00B30018 – Pointer to the Set-Cookie field that needs to be added to the HTTP Request Headers
Then it creates a Thread in the Suspended State:

Stack arguments:

It resumes the Thread by calling WaitForSingleObject:


0x1C8 is the handle of the Thread created above.
Once WaitForSingleObject has executed, we break at the Thread Function at 0040196C.
This Thread Function will be used to send the HTTP Request to the Server:

Once HttpSendRequestA has executed, the HTTP request is sent to the Server with the encrypted data carried in the Set-Cookie header field.
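Set-Cookie is normally a response header, so its presence in a request is itself a useful network indicator. The sketch below shows one way to pull it out of a captured request. The function name, the sample host and path, and the sample header value are all made up for illustration, and the exact format of the header value in real traffic is not shown in the text.

```python
def find_set_cookie_in_request(raw_request: str):
    """Return the value of a Set-Cookie header in an HTTP request, or None."""
    head, _, _ = raw_request.partition("\r\n\r\n")   # headers end at the blank line
    for line in head.split("\r\n")[1:]:              # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "set-cookie":
            return value.strip()
    return None

# Hypothetical captured request
captured = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.invalid\r\n"
    "Set-Cookie: AQIDBAUG\r\n"
    "\r\n"
)
print(find_set_cookie_in_request(captured))
```

A value found this way could then be checked for the fixed leading 'A' character before attempting to decode it.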
Conclusion
 
In this way, we can see how malware protects the data exchanged with its servers from Behavioral Analysis. These methods can also be used to randomize the artifact details to prevent the discovery of the malware on other machines.