Understanding variables in C

October 7, 2019 by Srinivas

Variables in C are named storage locations that reserve space in memory. Each variable has a type, and each type occupies a fixed amount of memory that is determined when the program is compiled. A variable's type also governs which operations can be applied to it.

Below, we’ll discuss how to identify variables when analyzing executables. C code snippets are shown, as well as their assembly equivalents and how the stack is used when subroutines are called in a program.

Variables

Depending on where they are declared, variables are of two types: global variables and local variables. The sections below show how each type can be identified with a debugger.

Global variables

The C code snippet below shows the use of a global variable.

#include <stdio.h>

int a = 10;  /* global variable, declared outside any function */

void main()
{
    printf("The value of a is %d\n", a);
}

When compiled, global variables are referenced by a fixed memory address, as shown in the following excerpt from OllyDbg.

PUSH EBP
MOV EBP,ESP
AND ESP,FFFFFFF0
SUB ESP,10
CALL global_v.004015D0
MOV EAX,DWORD PTR DS:[403004]
MOV DWORD PTR SS:[ESP+4],EAX
MOV DWORD PTR SS:[ESP],global_v.00404000 ; ASCII "The value of a is %d"
CALL <JMP.&msvcrt.printf>                ; printf
NOP
LEAVE
RETN

The value is read from address 00403004, whose contents are shown below.

[00403004]=0000000A

0000000A is the hexadecimal representation of 10, the value stored in variable “a” in the C program. This can be verified using a debugger.

The preceding program needs to be compiled to produce a Windows 32-bit binary. Here, the MinGW cross-compiler is used to build a 32-bit Windows executable on a Kali Linux machine, using the command below.

i686-w64-mingw32-gcc global_variables.c -o global_variables.exe

Note: The cross-compiler to produce Windows executables is not preinstalled in Kali Linux.

The executable file is opened in OllyDbg and the program is run in the debugger. Below is the result.

A breakpoint is set at the address 0040151E and the program is run again. When the breakpoint is hit, the value of the EAX register is checked and shows as 00000001, as shown below.

Single-stepping once executes the instruction below.

MOV EAX,DWORD PTR DS:[403004]

The value of EAX is checked again. It should now hold the hex value 0000000A, as shown below.

The figure above shows that global variables are referenced by memory addresses.

Local variables

The C code snippet below shows the use of a local variable. The variable “a” is declared and initialized inside the main function, so it is local to main.

#include <stdio.h>

void main()
{
    int a = 10;  /* local variable, stored on the stack */

    printf("The value of a is %d\n", a);
}

When compiled, local variables are referenced by stack addresses, as the following excerpt from OllyDbg shows.

PUSH EBP
MOV EBP,ESP
AND ESP,FFFFFFF0
SUB ESP,20
CALL local_va.004015D0
MOV DWORD PTR SS:[ESP+1C],0A
MOV EAX,DWORD PTR SS:[ESP+1C]
MOV DWORD PTR SS:[ESP+4],EAX
MOV DWORD PTR SS:[ESP],local_va.00404000 ; ASCII "The value of a is %d"
CALL <JMP.&msvcrt.printf>                ; printf
NOP
LEAVE
RETN

As shown above, the value 0A is first written to [ESP+1C], and the value stored at [ESP+1C] is then moved to EAX. To verify this, the stack location referenced by [ESP+1C] is inspected. 1C in hex is 28 in decimal, so the value 28 bytes above ESP is checked.

The value is 0000000A, which is decimal 10. This value is moved to the EAX register, so once the instruction is executed, EAX should hold the value stored at that stack address.

Calling conventions in C

The snippet below shows how functions are used in C.

#include <stdio.h>

void test_function(int arg1, int arg2);

void main()
{
    test_function(10, 20);
}

void test_function(int arg1, int arg2)
{
    int x = 50;
    int y = 40;
}

A function named test_function is called with two arguments, 10 and 20. Two local variables are initialized inside test_function. The code below is the disassembly of the main function in the binary built from this code.

PUSH EBP
MOV EBP,ESP
AND ESP,FFFFFFF0
SUB ESP,10
CALL function.004015E0
MOV DWORD PTR SS:[ESP+4],14
MOV DWORD PTR SS:[ESP],0A
CALL function.00401535
NOP
LEAVE
RETN

The two MOV instructions immediately before CALL function.00401535 show what happens before a subroutine is called.

Both arguments are written onto the stack. Hex 14 and 0A are decimal 20 and 10, respectively, the arguments passed to test_function. The compiler uses MOV to store them at [ESP+4] and [ESP] instead of PUSH, but the resulting stack layout is the same.

The arguments are placed on the stack in reverse order: the second argument is stored first, followed by the first argument, which ends up at the top of the stack.

The disassembly of test_function is shown below.

PUSH EBP
MOV EBP,ESP
SUB ESP,10
MOV DWORD PTR SS:[EBP-4],32
MOV DWORD PTR SS:[EBP-8],28
NOP
LEAVE
RETN

Once the function is called, its two local variables are stored on the stack at addresses relative to EBP. Hex 32 is decimal 50, the value of variable “x,” which is stored at [EBP-4]. Next, hex 28, which is decimal 40, is stored at [EBP-8] to initialize variable “y.”

Conclusion

Variables in C are named storage locations in memory. Each has a fixed size determined by its type, and its type governs the operations that can be applied to it. Both global and local variables can be recognized in the disassembly of a compiled C program. Global variables are referenced by fixed memory addresses, whereas local variables and function arguments are referenced by offsets from the stack registers (ESP or EBP). Arguments are placed on the stack before a function is called, and the called function reserves stack space for its own local variables.

 


Srinivas is an information security professional with four years of industry experience in web, mobile and infrastructure penetration testing. He is currently a security researcher at Infosec Institute Inc. and holds the Offensive Security Certified Professional (OSCP) certification. He blogs at www.androidpentesting.com. Email: srini0x00@gmail.com