
Data Exfiltration Techniques

June 7, 2013 by Sudeep Singh

Introduction

In this article we will see how malware encode or encrypt data that's exfiltrated to the Command and Control Server from infected machines. This is often done using a custom encoding or encryption algorithm.


It is becoming increasingly common these days to see malware using this technique to prevent Security Analysts from understanding the type of data that is being exchanged between the malware and its Server. Similarly, these algorithms can also be used for randomizing the artifact details such as names of the files or registry keys created on the infected machine.

In all such cases, Behavioral Analysis of the malware is not sufficient. Only after analyzing the code used by the malware can these algorithms be understood.

Randomization

Most malware create certain disk artifacts once they execute. If these disk artifacts have names that remain the same upon multiple executions of the malware, then it becomes easy to discover the presence of the malware on other machines using the indicators gathered during Behavioral Analysis.

To prevent this, malware can use custom algorithms to generate random names for the disk artifacts it creates.

Similarly, most malware will gather some data from the infected machine and send it to an attacker-controlled server. If this communication channel sends the data in plain text, then it becomes trivial to understand the intention of the malware and its nature.

Certain Win32 APIs are often used to obtain a hard-to-predict value, which is then fed into a custom obfuscation or encryption algorithm to randomize the disk artifact names or encrypt the communication channel.

Two Win32 APIs commonly used for this purpose are GetTickCount() and QueryPerformanceCounter().

In this article, we are going to look into a Custom Encryption and Encoding algorithm that uses QueryPerformanceCounter to generate a 16 byte Random Seed.

For the purpose of completeness, we will look at how the data is gathered from the machine and what type of data it is, followed by the details of the Encryption and Encoding algorithms.

Collection of Data

Once the malware has successfully executed on the machine, it proceeds to gather various details specific to the machine like the MAC Address, Username, Hostname, IP Address, Timestamp and destination domain name.

Below is a high level overview of how these details are gathered and in what format they are captured.

Data is collected by calling Win32 APIs like GetAdaptersAddresses, GetUserNameA, GetCurrentProcessId, gethostname, gethostbyname and GetLocalTime.

The data is gathered using the return values of the above functions and stored at 0041C1D0 as shown below:

Here mark:xxsy is a marker for the data collected.

Then it calls the main Encryption Routine at 00401768 to encrypt this data:

Stack arguments:

0041C1D0 – Pointer to the Data Collected from the machine

0041C360 – Encrypted Data will be stored here

Random Seed Generation

Once the data has been collected from the system, it starts the encryption routine. The first step in the encryption routine is to generate the random seed which will be used in the algorithm.

To generate the random seed, the QueryPerformanceCounter API is used as shown below:

The function prototype of QueryPerformanceCounter() is:

BOOL WINAPI QueryPerformanceCounter(

_Out_ LARGE_INTEGER *lpPerformanceCount

);

It accepts one argument, which is a pointer to a LARGE_INTEGER that receives the Performance Counter. Once the API has executed, the Performance Counter value is written to this memory address; the function itself returns only a success flag. The counter value has a size of 2 DWORDs (64 bits).

In our case, the algorithm uses only the first DWORD, that is, the low-order half of the counter.
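As a rough illustration (Python is used here only for convenience; the sample itself is native code), a LARGE_INTEGER is a 64-bit little-endian value, so the first DWORD read from memory is its low-order half. The helper name `split_large_integer` and the sample counter value are made up for this sketch.

```python
import struct

def split_large_integer(counter):
    """Split a 64-bit counter into (low DWORD, high DWORD) as laid out in memory."""
    raw = struct.pack("<Q", counter & 0xFFFFFFFFFFFFFFFF)  # 8 bytes, little-endian
    low, high = struct.unpack("<II", raw)                  # first DWORD in memory is the low part
    return low, high

# Illustrative counter value, not taken from the sample
low, high = split_large_integer(0x0000012A9F3C5D27)
print(hex(low), hex(high))  # 0x9f3c5d27 0x12a
```

The low-order DWORD is the part of the counter that changes fastest between calls, which is presumably why it is the half being used.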

Stack arguments just before the call to QueryPerformanceCounter:

So, the counter value will be stored at the address 0012FD78:

Below is an explanation of the code used to generate the 16 byte Random Seed:

[cpp]

LEA EAX,DWORD PTR SS:[EBP-1C]

PUSH EAX ; pointer to Performance Counter

CALL DWORD PTR DS:[<&KERNEL32.QueryPerf>; QueryPerformanceCounter

PUSH DWORD PTR SS:[EBP-1C] ; Arg1 (1st DWORD of Performance Counter)

CALL sysmgr.00404E15

POP ECX

CALL sysmgr.00404E27 ; Subroutine to modify the first DWORD

MOV BYTE PTR DS:[EDI+ESI],AL ; Form the Random Seed byte by byte

TEST AL,AL

JNZ SHORT sysmgr.004017B7

MOV BYTE PTR DS:[EDI+ESI],1

INC EDI

CMP EDI,10 ; Total Length of the seed is 0x10 bytes

JL SHORT sysmgr.00401794

[/cpp]

After it retrieves the value of the Performance Counter, the first DWORD is pushed as the argument to the subroutine at address 00404E15, after which the subroutine at address 00404E27 is called:

This subroutine will modify the value of the DWORD and finally store the lower byte of the high order word in AL. This value will then be written to the new memory location.

In each iteration of the loop, one byte of the random seed is generated. A byte that comes out as zero is replaced with 1, so the seed never contains a null byte. Since the total length of the seed is 16 bytes, QueryPerformanceCounter is invoked 16 times, and each invocation writes a new byte to the memory where the random seed is stored.

The random seed is stored byte by byte at [EDI+ESI], where ESI holds the base address (00922D00 in our case) and EDI is the loop counter.
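The loop can be sketched in Python roughly as follows. This is only an approximation: `time.perf_counter_ns` stands in for QueryPerformanceCounter, and `mix_to_byte` is a hypothetical placeholder for the subroutines at 00404E15 and 00404E27, whose bodies are not listed in this article.

```python
import time

def mix_to_byte(low_dword):
    # Hypothetical stand-in for the subroutines at 00404E15 / 00404E27.
    # It scrambles the DWORD and returns the low byte of the high-order word,
    # which follows the description in the text only loosely.
    state = (low_dword * 214013 + 2531011) & 0xFFFFFFFF
    return (state >> 16) & 0xFF

def generate_seed(read_counter=time.perf_counter_ns, mix=mix_to_byte, length=16):
    seed = bytearray()
    for _ in range(length):
        low_dword = read_counter() & 0xFFFFFFFF  # only the first DWORD is used
        value = mix(low_dword) & 0xFF
        seed.append(value if value else 1)       # a zero byte is replaced with 1
    return bytes(seed)

seed = generate_seed()
print(len(seed), 0 in seed)  # 16 False
```

The counter and the mixing function are passed in as parameters so that the loop structure can be exercised with fixed inputs.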

Before the random seed generation:

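It contains 0xBAADF00D because it is a freshly allocated, uninitialized chunk of memory returned by RtlAllocateHeap; the heap fills new allocations with this pattern when the process is running under a debugger.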

After the first execution of the loop, the byte 0x27 is written to this location:

After the above loop completes and the complete 16 byte random seed is generated and stored at 00922D00, it will copy the random seed to a new location.

Below is an explanation of the code:

[cpp]

PUSH 10

LEA EAX,DWORD PTR DS:[EBX+1]

PUSH ESI

PUSH EAX

MOV BYTE PTR DS:[EBX],1 ; EBX is 00B30018; the first byte is set to 0x01 and the seed is copied to EBX+1

CALL sysmgr.00403ED0

[/cpp]

EBX points to the buffer that receives the copy of the random seed. This memory address is 00B30018 in our case. The first byte of this buffer is fixed at 0x01, and the random seed itself is copied to EBX+1, that is, 00B30019.

The subroutine at 00403ED0 is used to write the random seed to the new memory location.

[cpp]

MOV AL,BYTE PTR DS:[ESI] ; ESI points to the original location of random seed.

MOV BYTE PTR DS:[EDI],AL ; EDI points to the new location of random seed.

MOV AL,BYTE PTR DS:[ESI+1]

MOV BYTE PTR DS:[EDI+1],AL

MOV AL,BYTE PTR DS:[ESI+2]

SHR ECX,2

MOV BYTE PTR DS:[EDI+2],AL

ADD ESI,3

ADD EDI,3

CMP ECX,8

JB SHORT sysmgr.00403F54

[/cpp]

It copies the random seed this way:

  1. It first copies 3 bytes of the random seed to the new location, byte by byte.
  2. It then copies 3 DWORDs of the random seed to the new location, DWORD by DWORD.
  3. It then writes the final 1 byte to the new location.

After the above subroutine has executed, the new random seed is stored as shown below.

Encryption Key Formation

Once the random seed is generated and copied to 00B30019, the malware calls a subroutine at 00401083 to form the Encryption Key.

The stack arguments:

0012ED30 – Location of the new encryption key

00922D00 – Original location of the random seed

00B30019 – New location of the random seed

After we step into the subroutine at 00401083:

[cpp]

MOV ECX,sysmgr.00416CA0 ; 00416CA0 is the location of the private key

PUSH 12 ; The total size of the key is 0x12 DWORDs

XOR EDX,EDX

SUB ECX,EAX ; subtract 12ED30 from 416CA0

POP EDI ; EDI will be used as the outer loop counter

[/cpp]

The malware stores the private key used for the encryption at the address 00416CA0; it is embedded in the binary itself. The size of this key is 0x12 DWORDs or 72 bytes. This key, along with the random seed, is used to form a new key located at 0012ED30.

72 byte key located at 00416CA0:

Here is the loop used to generate the new key:

Here is the explanation of the code:

[cpp]

XOR ESI,ESI

MOV DWORD PTR SS:[EBP-8],4 ; Initialize the inner loop counter to 4

MOV EBX,DWORD PTR SS:[EBP+C] ; EBX points to the original location of random seed

MOVZX EBX,BYTE PTR DS:[EDX+EBX] ; Read one byte at a time from the random seed

SHL ESI,8

OR ESI,EBX ; ESI will store one DWORD from the random seed

INC EDX

CMP EDX,10 ; Check if all the bytes from the random seed have been read

JL SHORT sysmgr.004010C7

XOR EDX,EDX ; If all the bytes from the random seed are read then reset EDX

DEC DWORD PTR SS:[EBP-8]

JNZ SHORT sysmgr.004010B3

MOV EBX,DWORD PTR DS:[ECX+EAX]

XOR EBX,ESI ; XOR the DWORD from random seed with the private key

MOV DWORD PTR DS:[EAX],EBX ; new Encryption key will be stored at 0012ED30

ADD EAX,4

DEC EDI ; There are a total of 0x12 DWORDs in the key

JNZ SHORT sysmgr.004010AA

[/cpp]

Here is an explanation of the key formation routine:

  1. It reads one DWORD (byte by byte) from the random seed.
  2. It XORs the DWORD read from the random seed with the DWORD read from private key.
  3. It stores the result into the location of the new encryption key.
  4. It reads the bytes from the random seed in cyclic order. The random seed is 0x10 bytes (4 DWORDs) long while the private key is 0x48 bytes (0x12 DWORDs) long, so once all the seed bytes have been read, reading starts again from the first byte.
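The steps above can be sketched in Python as follows. The names `form_key` and `embedded_key` are made up for this illustration, and the key material in the usage lines is dummy data rather than the 72 bytes found at 00416CA0. Note that `SHL ESI,8` followed by `OR ESI,EBX` places the first seed byte read into the most significant byte of each DWORD.

```python
def form_key(embedded_key, seed):
    """XOR each DWORD of the embedded key with a DWORD built from cycled seed bytes."""
    new_key = []
    pos = 0                                   # plays the role of EDX
    for word in embedded_key:
        chunk = 0                             # plays the role of ESI
        for _ in range(4):                    # inner loop: 4 bytes per DWORD
            chunk = ((chunk << 8) | seed[pos]) & 0xFFFFFFFF
            pos += 1
            if pos >= len(seed):              # wrap around after the last seed byte
                pos = 0
        new_key.append(word ^ chunk)
    return new_key

# Dummy inputs for illustration only
seed = bytes(range(1, 17))                    # 0x01 .. 0x10
key = form_key([0] * 0x12, seed)
print(hex(key[0]), hex(key[3]), hex(key[4]))  # 0x1020304 0xd0e0f10 0x1020304
```

With an all-zero key the output simply shows the seed DWORDs, which makes the wrap-around after four DWORDs easy to see.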

Before the key formation routine executes, the memory at address 0012ED30 is:

Once the above loop has executed, the new encryption key is stored at 0012ED30 as shown below:

Key Modification Routine

Once the new encryption key is formed and stored at 0012ED30, it is modified in the next loop. The loop processes 2 DWORDs of the key at a time, overwriting them with values produced by a subroutine at 00401040.

Below is an explanation of the code:

[cpp]

MOV ECX,DWORD PTR SS:[EBP+8] ; ECX points to the key

LEA EAX,DWORD PTR SS:[EBP-4] ; This will hold the final modified value of the first DWORD

PUSH EAX ; EAX points to 0012ED00

LEA EBX,DWORD PTR SS:[EBP-8] ; This will hold the final modified value of the second DWORD

CALL sysmgr.00401040

MOV EAX,DWORD PTR SS:[EBP+8] ; EAX points again to the start of the Key, 0012ED30

POP ECX ; 0012ED00

MOV ECX,DWORD PTR SS:[EBP-4] ; Final DWORD from previous iteration is stored in ECX

MOV DWORD PTR DS:[EAX+ESI*4],ECX ; Modify the first DWORD of the key

MOV ECX,DWORD PTR SS:[EBP-8] ;

MOV DWORD PTR DS:[EAX+ESI*4+4],ECX ; Modify the second DWORD of the key

INC ESI

INC ESI ; Increment ESI two times since we are modifying two DWORDs at a time

CMP ESI,12 ; The total length of the key is 0x12 DWORDs

JL SHORT sysmgr.004010E1

[/cpp]
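The structure of this loop can be sketched in Python as below. The body of the subroutine at 00401040 is not listed in this article, so `transform_pair` is a purely hypothetical placeholder, and the starting values of the two local DWORDs are assumed to be zero, which the listing does not confirm.

```python
def modify_key(key, transform_pair, first=0, second=0):
    """Overwrite the key two DWORDs at a time with the output of transform_pair.

    transform_pair(key, first, second) stands in for the subroutine at 00401040.
    The pair it returns is carried into the next iteration, as in the listing.
    """
    key = list(key)
    for i in range(0, len(key), 2):           # ESI advances by 2 each pass
        first, second = transform_pair(key, first, second)
        key[i] = first                        # MOV [EAX+ESI*4],ECX
        key[i + 1] = second                   # MOV [EAX+ESI*4+4],ECX
    return key

# Toy transform used only to show the data flow; it is NOT the real algorithm
def toy(key, a, b):
    return (a + 1) & 0xFFFFFFFF, (b + 2) & 0xFFFFFFFF

print(modify_key([0] * 0x12, toy)[:4])  # [1, 2, 2, 4]
```

Because the subroutine receives a pointer to the key on every call, each pair it produces depends on the key entries that earlier iterations have already replaced.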

Before the execution of the above loop, the key at 0012ED30 is:

Once the above subroutine has executed, the key is modified as shown below:

Encryption of Data

Once the encryption key has been formed, the data that was gathered previously from the machine will be encrypted using it.

In the Data Encryption Subroutine, the malware reads two DWORDs at a time from the data and uses the encryption key to modify them. Once this is done, both DWORDs are written to the new memory location.

The subroutine at 00401040 is used to encrypt two DWORDs at a time.

It passes 2 parameters:

12FD8C – One of the 2 encrypted DWORDs will be stored here.

41C1D0 – Points to the data to be encrypted

It reads two DWORDs at a time from the data to be encrypted and stores them at addresses 12FD88 and 12FD8C as shown below:

Once the subroutine at 00401040 has executed, these two DWORDs will be encrypted as shown below:

Now these two DWORDs will be written to the new memory location.

Below is an explanation of the encryption subroutine:

[cpp]

MOV EAX,DWORD PTR SS:[EBP+8] ; EAX holds the data to be encrypted

MOV ECX,DWORD PTR DS:[EAX+EDI*8] ; First DWORD from the data to be encrypted is stored in ECX

MOV EAX,DWORD PTR DS:[EAX+EDI*8+4] ; Second DWORD from the data to be encrypted is stored in EAX

MOV DWORD PTR SS:[EBP-C],EAX ; Store second DWORD at 0012FD88

LEA EAX,DWORD PTR SS:[EBP-8]

MOV DWORD PTR SS:[EBP-8],ECX ; Store the first DWORD at 0012FD8C

PUSH EAX

LEA EBX,DWORD PTR SS:[EBP-C]

LEA ECX,DWORD PTR SS:[EBP-1064] ; Points to the 0x48 byte key

CALL sysmgr.00401040 ; Modify the first and second DWORDs stored at 0012FD88 and 0012FD8C

PUSH 4

LEA EAX,DWORD PTR SS:[EBP-8]

PUSH EAX

LEA EAX,DWORD PTR DS:[ESI-4]

PUSH EAX

CALL sysmgr.00403ED0 ; Store the second DWORD at new memory address

PUSH 4

MOV EAX,EBX

PUSH EAX

PUSH ESI

CALL sysmgr.00403ED0 ; Store the first DWORD at the new memory address

ADD ESP,1C

INC EDI ; Increment EDI to read the next DWORDs from the data to be encrypted

ADD ESI,8

CMP EDI,DWORD PTR SS:[EBP+10] ; Total of 13 iterations are required to read all data

JL SHORT sysmgr.004017FA

[/cpp]

The subroutine at 00403ED0 will be used to write the DWORD to the memory location.

As can be seen above, the DWORDs at 12FD88 and 12FD8C are swapped and written to the new memory location, 00B30029.

Also, it is important to note that during Random Seed Generation, the 16 byte random seed was copied to the memory address 00B30019, directly after the fixed 0x01 byte at 00B30018.

So, the encrypted data is stored after the random seed.
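The resulting buffer layout can be summarised with a small Python sketch. The addresses are the ones observed in this debugging session and will differ between runs; `build_buffer` is a name invented for the illustration.

```python
BUFFER_BASE = 0x00B30018   # address observed in this analysis session
MARKER = 0x01              # fixed first byte
SEED_LEN = 0x10            # 16 byte random seed

def build_buffer(seed, encrypted):
    """Lay out the buffer as: marker byte, random seed, encrypted data."""
    if len(seed) != SEED_LEN:
        raise ValueError("seed must be 16 bytes")
    return bytes([MARKER]) + bytes(seed) + bytes(encrypted)

seed_addr = BUFFER_BASE + 1
data_addr = seed_addr + SEED_LEN
print(hex(seed_addr), hex(data_addr))  # 0xb30019 0xb30029
```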

The loop above continues to execute for the entire length of the data.

After the loop has executed completely, the encrypted data is stored as shown below:

Obfuscation of Encrypted Data

Once the data is encrypted and stored at 00B30029, in the next subroutine at 004011E8 it is obfuscated.

The 2 parameters passed to the obfuscation routine are:

00B30018 – Pointer to the random seed and encrypted data

0041C360 – The final obfuscated data will be stored here

If we step into the subroutine at 004011E8, we can see the obfuscation algorithm here:

The inner loop runs 3 times and writes 3 bytes to the new memory location. After the inner loop finishes, the outer loop takes the third input byte of the current group, keeps only its low six bits (AND AL,3F) and writes the result as a fourth byte.

The outer loop runs 0x39 (57) times; on each pass it consumes 3 bytes and writes 4 bytes to the new memory location, giving 0xE4 (228) bytes of output.

Below is an explanation of the code:

[cpp]

MOV EDI,EDX

MOV DWORD PTR SS:[EBP-8],2 ; Initialize local variable (this will be incremented in steps of 2)

MOV BYTE PTR SS:[EBP-1],0 ; Initialize local variable

MOV EAX,ESI

MOV DWORD PTR SS:[EBP-C],6 ; Initialize local variable (this will be decremented in steps of 2)

SUB EDI,ESI

MOV DWORD PTR SS:[EBP-10],3 ; Inner loop counter

MOV BL,BYTE PTR DS:[EAX] ; Read a byte from the encrypted data

MOV CL,BYTE PTR SS:[EBP-8]

ADD DWORD PTR SS:[EBP-8],2 ; Increment local variable by 2

SHR BL,CL ; modify BL

MOV ECX,DWORD PTR SS:[EBP-C]

SUB DWORD PTR SS:[EBP-C],2 ; Decrement local variable by 2

OR BL,BYTE PTR SS:[EBP-1] ; Modify BL

MOV BYTE PTR DS:[EDI+EAX],BL ; Write BL to new location

MOV BL,BYTE PTR DS:[EAX]

SHL BL,CL

SHR BL,2

INC EAX

DEC DWORD PTR SS:[EBP-10] ; Decrement inner loop counter

MOV BYTE PTR SS:[EBP-1],BL ; This value will be used in OR operation in next iteration

JNZ SHORT sysmgr.0040122B

MOV AL,BYTE PTR DS:[ESI+2]

AND AL,3F

MOV BYTE PTR DS:[EDX+3],AL

ADD ESI,3

ADD EDX,4

DEC DWORD PTR SS:[EBP-14] ; Decrement outer loop counter

JNZ SHORT sysmgr.0040120C

[/cpp]

As can be seen above, it modifies the bytes of Encrypted Data and the Random Seed. It also adds an extra byte after every 3 bytes which is a modification of the third byte in the previous byte sequence.

Before the obfuscation of encrypted data:

After the obfuscation of encrypted data:

So, the obfuscated data is larger than the encrypted data: every 3 input bytes become 4 output bytes.
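A Python sketch of the shift and OR sequence in the listing is given below. The function name `split_six_bit` is invented here, and the sketch only handles whole groups of 3 bytes, since the listing does not show how a trailing partial group would be treated. Each output byte ends up holding six bits of input.

```python
def split_six_bit(data):
    """Expand every 3 input bytes into 4 output bytes of 6 bits each."""
    out = bytearray()
    whole = len(data) - len(data) % 3
    for off in range(0, whole, 3):
        carry = 0                 # [EBP-1]: bits left over from the previous byte
        right, left = 2, 6        # [EBP-8] and [EBP-C]
        for b in data[off:off + 3]:
            out.append((b >> right) | carry)
            carry = ((b << left) & 0xFF) >> 2   # SHL BL,CL then SHR BL,2
            right += 2
            left -= 2
        out.append(data[off + 2] & 0x3F)        # AND AL,3F on the third byte
    return bytes(out)

print(split_six_bit(b"\x01\x27\xff").hex())  # 00121f3f
```

Since every output byte is at most 0x3F, the result is still binary data rather than printable text.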

Encoding the Obfuscated Data

Once the data is obfuscated and stored at 0041C360, in the next subroutine the malware will encode this data as shown below:

Below is an explanation of the code:

[cpp]

MOV ESI,EAX

SHL ESI,2 ; ESI will be the total length of the encrypted data above (0xE4 bytes)

XOR EDX,EDX

TEST ESI,ESI

JLE SHORT sysmgr.004012B5

MOV EAX,DWORD PTR SS:[EBP+C] ; EAX points to encrypted data

LEA ECX,DWORD PTR DS:[EDX+EAX] ; EDX is the counter used as an offset into the encrypted data

MOV AL,BYTE PTR DS:[ECX] ; Read a byte from the encrypted data

CMP AL,19 ; If the byte is 0x19 or less, add 0x41 to it

JA SHORT sysmgr.00401286

ADD AL,41

JMP SHORT sysmgr.0040129C

CMP AL,1A ; Otherwise check whether it lies between 0x1A and 0x33

JB SHORT sysmgr.00401292

CMP AL,33

JA SHORT sysmgr.00401292

ADD AL,47 ; Add 0x47 if it is between 0x1A and 0x33 (inclusive)

JMP SHORT sysmgr.0040129C

CMP AL,34

JB SHORT sysmgr.004012A0

CMP AL,3D

JA SHORT sysmgr.004012A0

SUB AL,4 ; If it is between 0x34 and 0x3D (inclusive), subtract 4

MOV BYTE PTR DS:[ECX],AL ; Write the modified byte into encrypted data

JMP SHORT sysmgr.004012B0

CMP AL,3E

JNZ SHORT sysmgr.004012A9

MOV BYTE PTR DS:[ECX],2B

JMP SHORT sysmgr.004012B0

CMP AL,3F

JNZ SHORT sysmgr.004012B0

MOV BYTE PTR DS:[ECX],2F

INC EDX

CMP EDX,ESI

JL SHORT sysmgr.00401276

[/cpp]

This encoding algorithm checks the value of each byte read from the obfuscated data and modifies it based on a series of range comparisons. The resulting data consists of readable ASCII characters.
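The comparisons in the listing can be written out in Python as follows (`encode_six_bit` is a name chosen for this sketch). Reading the ranges off the listing, the mapping produces A-Z, a-z, 0-9, '+' and '/', which is the same character set as the standard Base64 alphabet; that observation comes from the constants in the code rather than from anything the sample states.

```python
def encode_six_bit(values):
    """Map 6-bit values to printable characters using the ranges in the listing."""
    out = bytearray()
    for v in values:
        if v <= 0x19:
            out.append(v + 0x41)   # 0x00-0x19 -> 'A'-'Z'
        elif v <= 0x33:
            out.append(v + 0x47)   # 0x1A-0x33 -> 'a'-'z'
        elif v <= 0x3D:
            out.append(v - 4)      # 0x34-0x3D -> '0'-'9'
        elif v == 0x3E:
            out.append(0x2B)       # '+'
        elif v == 0x3F:
            out.append(0x2F)       # '/'
        else:
            out.append(v)          # anything else is left unchanged
    return bytes(out)

print(encode_six_bit(bytes([0x00, 0x12, 0x1F, 0x3F])))  # b'ASf/'
```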

Random Seed Transfer

Once the encrypted data is received by the Server, it will use the Decryption algorithm to retrieve the data. However in order to decrypt, the Server requires the random seed which was generated at the client side and used to form the encrypted data.

All other elements used to perform the encryption such as the private key are already available to the Server.

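If we look at the encoded data stored at 0041C360 as shown above, the first byte is always 0x41 ('A'), because the buffer starts with the fixed 0x01 byte. The bytes that follow carry the obfuscated version of the random seed; since obfuscation turns every 3 bytes into 4, the 16 seed bytes are spread over roughly the next 22 bytes.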

In the random seed generation section, we can see that the 16 byte random seed is written to the memory address 00B30019. After it is used to form the encryption key and encrypt the data, in the obfuscation stage, the random seed itself is also obfuscated.

So, the random seed is present in the Header of the Encrypted Data sent to the Server. In this way, the Server now has all the elements required to decrypt and retrieve the data.
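From the receiving side, recovering the seed might look like the Python sketch below. It rests on an assumption drawn from the listings in this article, namely that the obfuscation and encoding stages together behave like standard Base64 over whole 3-byte groups; the Server's actual code was not examined, and `extract_seed` is a hypothetical helper.

```python
import base64

SEED_LEN = 16

def extract_seed(encoded):
    """Return (seed, encrypted_data) from the encoded header value.

    Assumes the encoding is equivalent to standard Base64, which is inferred
    from the client-side listings and not confirmed against the Server.
    """
    raw = base64.b64decode(encoded)
    if not raw or raw[0] != 0x01:
        raise ValueError("missing 0x01 marker byte")
    return raw[1:1 + SEED_LEN], raw[1 + SEED_LEN:]

# Round trip with dummy data (not real traffic)
seed = bytes(range(1, 17))
blob = base64.b64encode(b"\x01" + seed + b"\xAA" * 7)
print(blob[:1], extract_seed(blob)[0] == seed)  # b'A' True
```

With the seed in hand, the Server can repeat the key formation and key modification steps and then decrypt the remaining bytes.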

Sending the Encrypted Data

Now that the data is encrypted, encoded and stored at 0041C360, it is ready to be transferred to the Server. In our case, the malware makes use of the HTTP Protocol to send this data to the Server.

It first forms the HTTP Header field, "Set-Cookie:" as shown below:

Stack arguments:

The subroutine at 00404A40 takes 2 arguments.

00B30018 – Pointer to the Set-Cookie: Header

0041C360 – Pointer to the encrypted data

After this subroutine has executed:

Once this is done, it will add this field to the HTTP Request Headers:

Stack arguments:

00CC000C – Handle returned by HttpOpenRequestA

00B30018 – Pointer to the Set-Cookie field that needs to be added to the HTTP Request Headers

Then it creates a Thread in the Suspended State:

Stack arguments:

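It then calls WaitForSingleObject on the Thread handle. WaitForSingleObject by itself only waits for the thread to finish; a thread created in the Suspended State has to be resumed (normally with ResumeThread) before its function runs: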

0x1C8 is the handle of the Thread created above.

Once WaitForSingleObject has executed, we break at the Thread Function at 0040196C.

This Thread Function will be used to send the HTTP Request to the Server:

Once HttpSendRequestA has executed, the HTTP request is sent to the Server with the encrypted data carried in the Set-Cookie header field.

Conclusion


In this way, we can see how malware protect the data exchanged between them and their servers from behavioral analysis. These methods can also be used to randomize the artifact details to prevent the discovery of malware on other machines.

Sudeep Singh

Sudeep Singh is an Information Security Professional with a strong interest in various areas of Information Security, ranging from Reverse Engineering, Cryptography, Malware Analysis and the latest online threats to Web Application Security and Password Security. His other interests include GPGPU related tasks such as Cryptographic Hashing Algorithm Cracking and Crypto Currencies.