A Case Study of Information Stealers: Part II
Introduction:
In the second part of this analysis, we will be exploring how Pony steals data and how it sends it to the C&C server. We are equally interested in knowing whether Pony does any cleanup before it terminates execution!
In the previous part, we stopped at the decryption of a wordlist, so let's see what the sample does next.
Preparing to steal information:
Before calling the functions responsible for gathering software passwords from disk, Pony calls the WSAStartup function to initiate the use of the Winsock DLL by the process. It then allocates an OLE stream (similar to the ones we've seen in the first part) which will be used to store data. A function at 00410772 is then called with the stream pointer as an argument.
The interesting thing the sample does next is store the two strings "PWDFILE0" and "1.0" in the stream. These two strings serve as a header for the buffer, and with each processing step applied to the buffer (compression, then encryption) the header changes to indicate the state of its content. The buffer "headers" that Pony uses are the following:
- PWDFILE0
- PKDFILE0
- CRYPTED0
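As a minimal sketch of how such a header could precede the data, the following uses a plain growable byte buffer as a stand-in for the OLE stream. The ByteStream type and both helpers are hypothetical names, and writing the two strings back to back with no length prefix is an assumption; the sample's exact layout may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the OLE stream: a growable byte buffer. */
typedef struct {
    unsigned char *data;
    size_t len;
    size_t cap;
} ByteStream;

/* Append n bytes, growing the buffer as needed. Returns 0 on success. */
int stream_write(ByteStream *s, const void *src, size_t n)
{
    if (s->len + n > s->cap) {
        size_t cap = s->cap ? s->cap : 64;
        while (cap < s->len + n)
            cap *= 2;
        unsigned char *p = realloc(s->data, cap);
        if (p == NULL)
            return -1;
        s->data = p;
        s->cap = cap;
    }
    memcpy(s->data + s->len, src, n);
    s->len += n;
    return 0;
}

/* Write the clear-text header followed by the version string
   (assumed layout: no separators, no length prefixes). */
int stream_write_header(ByteStream *s)
{
    if (stream_write(s, "PWDFILE0", 8) != 0)
        return -1;
    return stream_write(s, "1.0", 3);
}
```

Under that assumption, a freshly initialized stream starts with the 11 bytes "PWDFILE01.0", and everything the callbacks write later lands after them.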
After storing the header that will precede the data, Pony is very close to searching the system for passwords. Just before it does that, it checks for a debugger using an inlined version of IsDebuggerPresent.
If a debugger is detected, 00401026 is called and an exception is raised. The technique is very easy to implement and just as easy to bypass, so it won't cause any problems for us.
The sample now loads the address of a callback table into EDI and enters a loop that iterates through the functions, invoking them one by one. The callback table is located at 00417D03 and contains 134 functions.
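To illustrate what an inlined IsDebuggerPresent boils down to, here is a small portable sketch. On 32-bit Windows the API reads the BeingDebugged byte at offset 2 of the Process Environment Block (PEB), whose address is found at fs:[30h]. The PebPrefix struct below models only the first three bytes of the PEB, and debugger_flag is a hypothetical helper name, not a function of the sample.

```c
#include <stddef.h>

/* The first three bytes of the Windows PEB. Only BeingDebugged
   matters for this check; the rest of the structure is omitted. */
typedef struct {
    unsigned char InheritedAddressSpace;
    unsigned char ReadImageFileExecOptions;
    unsigned char BeingDebugged;   /* offset 2 */
} PebPrefix;

/* The whole check: read one byte and test it for non-zero. */
int debugger_flag(const PebPrefix *peb)
{
    return peb->BeingDebugged != 0;
}
```

Since the check is a single byte read, clearing that byte from within the debugger is enough to get past it.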
The first functions in the callback table.
It is also worth mentioning that each of these callbacks is supplied a single argument: the pointer to the stream that was allocated earlier. It is now safe to conclude that each of these functions writes something into that stream. Looking at a random function from this table also shows that these callbacks are the ones actually responsible for stealing information from files on disk and storing it in the stream.
Disassembly of a function from the callback table.
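The dispatch loop described above can be sketched in C as an array of function pointers that all receive the same stream pointer. The Stream type, the two module functions and run_callbacks are hypothetical stand-ins; the real table holds 134 entries and the real callbacks write stolen data rather than bump a counter.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared stream: it only counts calls. */
typedef struct {
    int calls;
} Stream;

/* Every entry in the table has the same shape: one stream argument. */
typedef void (*Callback)(Stream *stream);

void module_one(Stream *s) { s->calls++; }
void module_two(Stream *s) { s->calls++; }

/* Walk the table and invoke each callback with the same stream. */
void run_callbacks(const Callback *table, size_t count, Stream *stream)
{
    for (size_t i = 0; i < count; i++)
        table[i](stream);
}
```

Because every callback shares one signature and one argument, adding support for another application only requires appending an entry to the table.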
As I mentioned in the first part, we won't be able to go through each of these functions. However, we are going to examine the first function in the table, since it stores some interesting information in the stream. This function is at 004044A0.
A look at 004044A0 (Callbacks[0](stream) ):
The routine starts by copying these 8 bytes to the stream (02 00 4D 4F 44 55 01 01). After that, it separately adds another 4 bytes (01 00 EF BE). Next, it calls the APIs GetVersionExA and GetLocaleInfoA, then stores their results in the stream.
After storing information about the Windows version and the locale, it tries to uniquely identify the machine by gathering its hardware ID. This is very useful for keeping track of the machine the information comes from, since something unspecific such as an IP address is unreliable.
To get the hardware ID, the sample first checks whether WinRAR is installed on the machine and then reads the HWID value from the key HKEY_CURRENT_USER\Software\WinRAR, since WinRAR stores the machine's HWID in the registry.
If WinRAR isn't installed on the machine, which might well be the case, the sample gets the path to the AppData\Temp directory for the current user and searches for a file named HWID. If the file is found, the sample simply reads from it what Pony assumes to be a unique hardware ID for the machine. However, on the machine I tested this sample on, the file wasn't there. In that final case, Pony just generates a random unique GUID to identify the machine with; the function used to generate that GUID is CoCreateGuid. Afterwards, the sample copies the HWID (or the GUID if the previous attempts failed) to the stream, which is to be sent to the server once all the stealing is done.
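A closer look at those raw bytes is worthwhile. The sketch below stores both sequences and reads parts of them back; read_le16 is a hypothetical helper. The observations it supports are plain arithmetic (bytes 4D 4F 44 55 are the ASCII characters "MODU", and EF BE read as a little-endian 16-bit word is 0xBEEF), but what the sample intends these fields to mean is an assumption not confirmed here.

```c
#include <stdint.h>

/* The 8 bytes and the 4 bytes written at the start of the routine. */
const unsigned char module_header[8] = {
    0x02, 0x00, 0x4D, 0x4F, 0x44, 0x55, 0x01, 0x01
};
const unsigned char module_trailer[4] = { 0x01, 0x00, 0xEF, 0xBE };

/* Read a 16-bit little-endian value from two consecutive bytes. */
uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

Read this way, the first block is the word 0x0002, the tag "MODU" and two 01 bytes, and the second block is the words 0x0001 and 0xBEEF.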
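The lookup order described above (registry value, then HWID file, then a freshly generated GUID) is a simple fallback chain. The sketch below models it with hypothetical stub sources so it stays portable: the first two stubs always fail, as they did on the test machine, and the third returns a placeholder string where the sample would call CoCreateGuid. None of these function names come from the sample.

```c
#include <stddef.h>
#include <stdio.h>

/* A source fills buf (of size n) and returns 1 on success, 0 otherwise. */
typedef int (*HwidSource)(char *buf, size_t n);

/* Stub: pretend the registry value is missing. */
int hwid_from_registry(char *buf, size_t n) { (void)buf; (void)n; return 0; }

/* Stub: pretend the HWID file does not exist. */
int hwid_from_temp_file(char *buf, size_t n) { (void)buf; (void)n; return 0; }

/* Stub: placeholder value standing in for a newly generated GUID. */
int hwid_from_new_guid(char *buf, size_t n)
{
    snprintf(buf, n, "%s", "{00000000-0000-0000-0000-000000000000}");
    return 1;
}

/* Try each source in order; return the index of the first one
   that succeeds, or -1 if every source fails. */
int resolve_hwid(const HwidSource *sources, size_t count, char *buf, size_t n)
{
    for (size_t i = 0; i < count; i++)
        if (sources[i](buf, n))
            return (int)i;
    return -1;
}
```

With the three stubs in the order given, resolve_hwid returns 2, meaning the identifier came from the last-resort GUID step.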
Stealing Information:
At this stage, the malware does what it was designed to do: steal information. It iterates through the callback table we saw earlier, calling each function until the whole array is exhausted. In general, a callback function in Pony does the following:
- Checking if the software to steal from is installed. Return if it isn't.
- Gathering the information from files, registry keys…
- Writing the information into the stream with details of the application.
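The three steps above give every callback the same overall shape, which can be sketched abstractly as follows. The Stream and Target types and the run_module function are hypothetical; the "installed" flag and the "found" string stand in for the real probing and gathering logic, which differs for each application.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for the OLE stream. */
typedef struct {
    char   data[256];
    size_t len;
} Stream;

/* Hypothetical description of one targeted application. */
typedef struct {
    const char *name;      /* application details written with the data */
    int         installed; /* outcome of the "is it installed?" check   */
    const char *found;     /* outcome of the gathering step, or NULL    */
} Target;

/* Append a string to the stream if it fits. */
void stream_puts(Stream *s, const char *text)
{
    size_t n = strlen(text);
    if (s->len + n <= sizeof s->data) {
        memcpy(s->data + s->len, text, n);
        s->len += n;
    }
}

/* Returns 1 if something was written, 0 if the target was skipped. */
int run_module(const Target *t, Stream *s)
{
    if (!t->installed)        /* step 1: return early if not installed */
        return 0;
    if (t->found == NULL)     /* step 2: nothing was gathered */
        return 0;
    stream_puts(s, t->name);  /* step 3: application details, then data */
    stream_puts(s, t->found);
    return 1;
}
```

The early return in step 1 is what keeps a table of 134 callbacks cheap to run on a machine where most of the targeted software is absent.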
Compressing the data:
After all the callbacks have been called, all of the stolen data (if any) is stored in the global data stream. At this stage, Pony gets the data ready to be sent to the server, and before it sends anything it must complete a list of tasks. First of all, it compresses the data to make it smaller and thus faster to send to the server. To do that, it uses the aPLib v1.01 library, a compression library based on the algorithm used in aPACK, a tool created by the same author as aPLib.
After compressing the data, the sample deletes the unpacked data and writes the packed data into the stream. In the first section of this second part, I mentioned the different headers that precede the data. They are in fact used to tell which type of data is stored in the stream: is it clear text, packed or encrypted data?
The previous header that marked the clear-text data (PWDFILE0) is now part of the compressed data. So Pony inserts another header to mark the data as packed (PKDFILE0), and it looks like this in memory:
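Since each header is a fixed 8-byte tag, telling the three states apart only takes a prefix comparison. The sketch below shows one way to do that; the DataState enum and classify_buffer are hypothetical names, and only the three tag strings come from the analysis.

```c
#include <stddef.h>
#include <string.h>

typedef enum {
    DATA_UNKNOWN,
    DATA_CLEAR,      /* "PWDFILE0": plain collected data */
    DATA_PACKED,     /* "PKDFILE0": compressed data      */
    DATA_ENCRYPTED   /* "CRYPTED0": encrypted data       */
} DataState;

/* Classify a buffer by comparing its first 8 bytes with the known tags. */
DataState classify_buffer(const unsigned char *buf, size_t len)
{
    if (len < 8)
        return DATA_UNKNOWN;
    if (memcmp(buf, "PWDFILE0", 8) == 0)
        return DATA_CLEAR;
    if (memcmp(buf, "PKDFILE0", 8) == 0)
        return DATA_PACKED;
    if (memcmp(buf, "CRYPTED0", 8) == 0)
        return DATA_ENCRYPTED;
    return DATA_UNKNOWN;
}
```

A receiver can apply the same check repeatedly, undoing one layer at a time until it reaches the clear-text header.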
Preparing to encrypt the data:
After data compression is done, it is time for encryption. The routine responsible for all the tasks needed to encrypt the data can be found at 004019E0.
The first thing it does is copy the data stream (starting with the PKDFILE0 header) to a separate, dynamically allocated memory location. For now, Pony doesn't touch the copied data, so we'll get back to it later. What it does instead is calculate the length of a hardcoded string, "babajay@1234". We saw in the first part how Pony decrypts this string from "dcdclc{B3456", and we wondered about its nature.
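Comparing the two strings quoted above character by character, each decoded byte is the encoded byte minus 2 ('d' becomes 'b', '{' becomes 'y', '3' becomes '1', and so on). The sketch below only checks that observation; shift_decode is a hypothetical helper and is not claimed to be the sample's own routine.

```c
#include <stddef.h>

/* Subtract a fixed shift from every byte of a NUL-terminated string.
   out must have room for the input length plus the terminator. */
void shift_decode(const char *in, char *out, int shift)
{
    size_t i;
    for (i = 0; in[i] != '\0'; i++)
        out[i] = (char)(in[i] - shift);
    out[i] = '\0';
}
```

Calling shift_decode on "dcdclc{B3456" with a shift of 2 reproduces "babajay@1234".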
Afterwards, an array of 257 bytes is created and initialized in the following fashion:
UCHAR buffer[257];
buffer[0] = buffer[1] = 0;
for (int i = 2; i < 257; i++)
{
    buffer[i] = (UCHAR)i;
}
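At this stage, the string "babajay@1234" is used to 'encrypt' the byte sequence by doing something like this:
const UCHAR *key = (const UCHAR *)"babajay@1234";
size_t keyLen = strlen((const char *)key);
UCHAR Sum = 0, Old, Character;
size_t j = 0;
int i;
for (i = 2; i < 257; i++)
{
    Old = buffer[i];            /* remember the current element */
    Sum += Old;                 /* running index, wraps modulo 256 */
    Character = key[j];
    Sum += Character;           /* mix in the next key byte */
    buffer[i] = buffer[Sum];    /* swap buffer[i] and buffer[Sum] */
    buffer[Sum] = Old;
    if (++j == keyLen)          /* wrap around at the end of the key */
        j = 0;
}
In the end, this yields the following sequence, which will be used to encrypt the packed data in the stream: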
Byte sequence that will be used to encrypt the stream data.
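The swap loop above closely resembles the key-scheduling algorithm (KSA) of the RC4 stream cipher, and the two leading zero bytes could plausibly hold the cipher's two running indices. That reading is an assumption at this point of the analysis. For comparison only, here is a minimal sketch of the textbook RC4 KSA; rc4_ksa is not taken from the sample, and its indexing may not match the sample's byte for byte.

```c
#include <stddef.h>

/* Textbook RC4 key scheduling: S starts as the identity permutation
   and is shuffled by swapping S[i] with S[j], where j is driven by
   the current S[i] and the key bytes (repeated as needed). */
void rc4_ksa(unsigned char S[256], const unsigned char *key, size_t keylen)
{
    unsigned char j = 0;

    for (int i = 0; i < 256; i++)
        S[i] = (unsigned char)i;

    for (int i = 0; i < 256; i++) {
        j = (unsigned char)(j + S[i] + key[i % keylen]);
        unsigned char tmp = S[i];
        S[i] = S[j];
        S[j] = tmp;
    }
}
```

As in the loop shown earlier, the only operations are additions modulo 256 and swaps, so the result is always a permutation of the values 0 to 255 that depends entirely on the key.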
Conclusion:
In this second part, we saw how Pony prepares to steal information (gathering the locale, the Windows version and a unique machine identifier) and then steals it. We've also uncovered the specific details of how it compresses the data and then prepares to encrypt it.
In the next part we will see how Pony actually encrypts the data, how it sends it and then what tasks it performs after serving its purpose.
Sample [download]
The archive's password is: pony