Understanding stack instructions
This article introduces assembly concepts related to the stack. We discuss what a stack is, the registers associated with it and the instructions used when working with it. We also use a debugger to walk through practical examples of how common instructions like PUSH and POP work.
What is a stack?
A stack is a data structure used to save register contents for later restoration, pass parameters into procedures and save addresses so procedures can return to the right place.
A stack operates strictly on the Last-In-First-Out (LIFO) rule, i.e., data pushed onto the stack must be popped off the stack in reverse order.
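As a minimal sketch of the LIFO rule, the following C fragment models a stack as a fixed-size array. The names IntStack, stack_push and stack_pop and the capacity of 16 are assumptions made for this illustration only; they do not correspond to anything in the CPU or in the examples of this article.

```c
#include <assert.h>
#include <stddef.h>

#define CAPACITY 16   /* hypothetical capacity, chosen for illustration */

typedef struct {
    int    items[CAPACITY];
    size_t count;              /* number of items currently stored */
} IntStack;

/* Place a value on top of the stack. */
void stack_push(IntStack *s, int value)
{
    assert(s->count < CAPACITY);   /* overflow guard */
    s->items[s->count++] = value;
}

/* Remove and return the most recently pushed value. */
int stack_pop(IntStack *s)
{
    assert(s->count > 0);          /* underflow guard */
    return s->items[--s->count];
}
```

Pushing 1, 2 and 3 and then popping three times yields 3, 2 and 1. This is why values saved with a series of pushes must be restored with pops in the opposite order.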
Stack pointer
The stack pointer (ESP on 32-bit x86) is a CPU register that keeps track of data on the stack. It always holds the address of the top of the stack.
The preceding figure shows that the ESP register is holding the address 0065FEDC, which is pointing to the top of the stack.
Stack instructions
PUSH and POP are the two most common stack instructions: PUSH places data on top of the stack, and POP removes data from the top of the stack. The following example shows how PUSH and POP are used when a subroutine is called.
push eax
call subroutine
mov edx, eax
pop eax
The instruction PUSH EAX preserves the value of EAX before the subroutine is called. This is necessary because the subroutine places its return value in EAX, overwriting whatever was there. Once the subroutine has executed and control returns to the caller, the return value is copied into the register EDX. Finally, the original value of EAX is popped off the stack and placed back in the EAX register.
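It should be noted that PUSH first decrements ESP by the size of the pushed value (four bytes for a 32-bit register) and then stores the value at the address ESP now holds. The stack therefore grows toward lower memory addresses. POP does the reverse: it reads the value at the address in ESP and then increments ESP by the same size.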
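The behavior of PUSH and POP can be sketched in C with a small simulated machine. This is only a model under stated assumptions: the Machine structure, the helper names (slot, push32, pop32, call_preserving_eax, sample_subroutine), the simulated address window starting at 0065FE00 and the return value 42 are all invented for illustration. The CALL instruction is reduced to an ordinary C function call, so the return address it would push is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE  0x0065FE00u   /* assumed lowest address of the simulated window */
#define STACK_WORDS 128           /* 128 dwords, i.e. 0x200 bytes */

typedef struct {
    uint32_t eax, edx;            /* two general-purpose registers */
    uint32_t esp;                 /* stack pointer */
    uint32_t mem[STACK_WORDS];    /* simulated stack memory */
} Machine;

/* Translate a simulated address into its dword slot. */
static uint32_t *slot(Machine *m, uint32_t addr)
{
    assert(addr >= STACK_BASE && addr % 4 == 0);
    assert((addr - STACK_BASE) / 4 < STACK_WORDS);
    return &m->mem[(addr - STACK_BASE) / 4];
}

/* PUSH: move ESP down by four bytes, then store at the new top. */
void push32(Machine *m, uint32_t value)
{
    m->esp -= 4;
    *slot(m, m->esp) = value;
}

/* POP: read the top of the stack, then move ESP up by four bytes. */
uint32_t pop32(Machine *m)
{
    uint32_t value = *slot(m, m->esp);
    m->esp += 4;
    return value;
}

/* Hypothetical subroutine that returns its result in "EAX". */
uint32_t sample_subroutine(void)
{
    return 42;
}

/* Replays: push eax / call subroutine / mov edx, eax / pop eax. */
void call_preserving_eax(Machine *m, uint32_t (*subroutine)(void))
{
    push32(m, m->eax);        /* push eax: save the caller's value   */
    m->eax = subroutine();    /* call: the return value lands in EAX */
    m->edx = m->eax;          /* mov edx, eax: keep the return value */
    m->eax = pop32(m);        /* pop eax: restore the saved value    */
}
```

Starting with ESP at 0065FEDC, push32 with the value 0065FF68 leaves ESP at 0065FED8, and the matching pop32 returns 0065FF68 and puts ESP back at 0065FEDC. After call_preserving_eax, EAX holds its original value again and EDX holds the value returned by the subroutine.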
Let’s see an example of PUSH instruction using a debugger. The following excerpt is taken from a compiled C program and opened in a debugger.
PUSH EBP ;prologue
MOV EBP,ESP ;prologue
.
.
.
.
.
LEAVE ;epilogue
RETN
Let’s check the status of the registers and the stack before executing the instruction PUSH EBP.
EBP 0065FF68
ESP 0065FEDC
STACK:
ADDRESS DATA
0065FEDC 0040138B
0065FEE0 00000001
0065FEE4 00690DA0
0065FEE8 006914C8
0065FEEC 00000000
0065FEF0 00000000
0065FEF4 00000004
0065FEF8 00690DA0
0065FEFC 00690D5C
0065FF00 FFFFFFFF
0065FF04 F67394D0
0065FF08 C3F41E41
0065FF0C 00000000
Now let’s execute the PUSH EBP instruction and observe the changes on the stack.
0065FED8 /0065FF68
0065FEDC |0040138B RETURN to function.0040138B from function.00401510
0065FEE0 |00000001
0065FEE4 |00690DA0
0065FEE8 |006914C8
0065FEEC |00000000
0065FEF0 |00000000
0065FEF4 |00000004
0065FEF8 |00690DA0
0065FEFC |00690D5C
0065FF00 |FFFFFFFF
0065FF04 |F67394D0
0065FF08 |C3F41E41
After executing the PUSH EBP instruction, two things have changed:
- The value of the EBP register (0065FF68) is now on top of the stack
- The value of the ESP register has changed from 0065FEDC to 0065FED8, a decrease of four bytes
Now let’s step through the remaining lines and pause just before executing the LEAVE instruction. The LEAVE instruction is equivalent to the following two instructions:
MOV ESP,EBP
POP EBP
This means the value of the EBP register should be moved to the register ESP, and then the value residing at the top of the stack should be placed in the register EBP when the LEAVE instruction is executed. Let us observe the status of EBP, ESP and the stack before executing the LEAVE instruction.
EBP 0065FED8
STACK:
ADDRESS DATA
0065FEC0 0000000A
0065FEC4 00000014
0065FEC8 00000026
0065FECC 00690D5C
0065FED0 00000026
0065FED4 00690D5C
0065FED8 /0065FF68
0065FEDC |0040138B RETURN to function.0040138B from function.00401510
0065FEE0 |00000001
0065FEE4 |00690DA0
0065FEE8 |006914C8
0065FEEC |00000000
0065FEF0 |00000000
0065FEF4 |00000004
Now let’s execute the LEAVE instruction and observe the changes on ESP, EBP and the stack.
ESP 0065FEDC
EBP 0065FF68
STACK:
ADDRESS DATA
0065FEDC 0040138B RETURN to function.0040138B from function.00401510
0065FEE0 00000001
0065FEE4 00690DA0
0065FEE8 006914C8
0065FEEC 00000000
0065FEF0 00000000
0065FEF4 00000004
0065FEF8 00690DA0
0065FEFC 00690D5C
0065FF00 FFFFFFFF
0065FF04 F67394D0
0065FF08 C3F41E41
0065FF0C 00000000
0065FF10 00000000
First, the value of EBP (0065FED8) is moved to ESP. Next, the value at the top of the stack (0065FF68) is popped off the stack and placed in EBP. The pop also increments the stack pointer by four bytes, resulting in 0065FEDC being the value of ESP.
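The effect of LEAVE can be sketched in C by treating it as MOV ESP,EBP followed by POP EBP. The Machine structure, the helper names slot and exec_leave, and the simulated address window starting at 0065FE00 are assumptions made for this illustration; the sketch models only the two registers and the stack memory involved.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE  0x0065FE00u   /* assumed lowest address of the simulated window */
#define STACK_WORDS 128           /* 128 dwords, i.e. 0x200 bytes */

typedef struct {
    uint32_t esp;                 /* stack pointer */
    uint32_t ebp;                 /* frame (base) pointer */
    uint32_t mem[STACK_WORDS];    /* simulated stack memory */
} Machine;

/* Translate a simulated address into its dword slot. */
static uint32_t *slot(Machine *m, uint32_t addr)
{
    assert(addr >= STACK_BASE && addr % 4 == 0);
    assert((addr - STACK_BASE) / 4 < STACK_WORDS);
    return &m->mem[(addr - STACK_BASE) / 4];
}

/* LEAVE modeled as MOV ESP,EBP followed by POP EBP. */
void exec_leave(Machine *m)
{
    m->esp = m->ebp;              /* MOV ESP,EBP: drop the local variables  */
    m->ebp = *slot(m, m->esp);    /* POP EBP: reload the saved EBP value... */
    m->esp += 4;                  /* ...and move ESP up past that value     */
}
```

With EBP set to 0065FED8 and the saved value 0065FF68 stored at that address, as in the dump above, exec_leave ends with EBP = 0065FF68 and ESP = 0065FEDC, whatever ESP held beforehand.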
MOV instruction instead of PUSH
While PUSH and POP are used for pushing data onto the stack and popping data from the stack respectively, compilers can choose to use other instructions to perform the same operations. Let’s see an example of how the MOV instruction is used to push data onto the stack.
The following C program is compiled using the MinGW32 cross-compiler and opened in OllyDbg.
void test_function(int arg1, int arg2);
void main()
{
    test_function(10,20);
}
void test_function(int arg1, int arg2){
    int x = 50;
    int y = 40;
}
In the preceding code, a function named test_function is called with two arguments, 10 and 20. The called function initializes two local variables.
The following excerpt is taken from the disassembly of the binary obtained using the preceding code.
PUSH EBP
MOV EBP,ESP
AND ESP,FFFFFFF0
SUB ESP,10
CALL function.004015E0
MOV DWORD PTR SS:[ESP+4],14
MOV DWORD PTR SS:[ESP],0A
CALL function.00401535
NOP
LEAVE
RETN
The two MOV DWORD PTR instructions before CALL function.00401535 demonstrate what happens before a subroutine is called.
Notice how the two arguments, represented in hex, are placed on the stack using the MOV instruction rather than PUSH. Hex 14 and 0A translate to decimal values 20 and 10 respectively. These are the arguments passed to the function named test_function. Note that the arguments are placed on the stack in reverse order: the second argument is written first, at [ESP+4], followed by the first argument at [ESP].
PUSH EBP
MOV EBP,ESP
SUB ESP,10
MOV DWORD PTR SS:[EBP-4],32
MOV DWORD PTR SS:[EBP-8],28
NOP
LEAVE
RETN
Once the function is called, notice how its two local variables are also written to the stack using the MOV instruction. Hex 32 translates to decimal 50, the value used to initialize variable “x,” which is stored at [EBP-4]. Hex 28 translates to decimal 40, the value used to initialize variable “y,” which is stored at [EBP-8].
When reversing a compiled binary, analysts have no control over which instructions the compiler used to place data on the stack. Thus, it is necessary to be aware of the various instructions that may be used when working with the stack.
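To illustrate why the two styles are interchangeable, the following C sketch places the same two arguments on a simulated stack, once with two pushes and once with a single reservation followed by two MOV-style stores. The Machine structure, the helper names (slot, pass_args_with_push, pass_args_with_mov) and the address window are assumptions made for this illustration. The sketch reserves eight bytes so the two variants can be compared directly, whereas the compiled listing reserves 0x10 bytes in the prologue.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BASE  0x0065FE00u   /* assumed lowest address of the simulated window */
#define STACK_WORDS 128           /* 128 dwords, i.e. 0x200 bytes */

typedef struct {
    uint32_t esp;                 /* stack pointer */
    uint32_t mem[STACK_WORDS];    /* simulated stack memory */
} Machine;

/* Translate a simulated address into its dword slot. */
static uint32_t *slot(Machine *m, uint32_t addr)
{
    assert(addr >= STACK_BASE && addr % 4 == 0);
    assert((addr - STACK_BASE) / 4 < STACK_WORDS);
    return &m->mem[(addr - STACK_BASE) / 4];
}

/* Variant 1: two pushes, second argument first. */
void pass_args_with_push(Machine *m, uint32_t arg1, uint32_t arg2)
{
    m->esp -= 4; *slot(m, m->esp) = arg2;   /* PUSH arg2 */
    m->esp -= 4; *slot(m, m->esp) = arg1;   /* PUSH arg1 */
}

/* Variant 2: reserve the space once, then store with MOV.
   ESP does not move while the arguments are written. */
void pass_args_with_mov(Machine *m, uint32_t arg1, uint32_t arg2)
{
    m->esp -= 8;                     /* SUB ESP,8                     */
    *slot(m, m->esp + 4) = arg2;     /* MOV DWORD PTR SS:[ESP+4],arg2 */
    *slot(m, m->esp)     = arg1;     /* MOV DWORD PTR SS:[ESP],arg1   */
}
```

Called with the arguments 10 and 20 from the same starting ESP, both variants end with the same ESP, the first argument at [ESP] and the second at [ESP+4]. The called function therefore sees the same stack layout in either case.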
Conclusion
In this article, we explored various instructions that are used when working with the stack. We saw examples of how PUSH and LEAVE affect the registers and the stack by using a debugger. We also learned that compilers may use different instructions to perform the same operation. As an example, we discussed how the MOV instruction can be used instead of PUSH.