Understanding for and while loops in assembly
In the previous article, we discussed how if statements can be spotted in the disassembly of a binary. We learned that if conditions are translated to conditional jumps in the disassembly.
In this article, we will explore how for and while loops are translated in assembly.
Intro to x86 Disassembly
For loops
Let us begin by taking a simple for loop as an example. The following program iterates through the values 0 to 6 and prints them using the printf statement.
#include <stdio.h>

void main()
{
    int i;
    for (i=0; i<7; i++){
        printf("value of a is %d\n", i);
    }
}
When the preceding program is executed, it prints the values 0 through 6, one per line.
For loops have the following syntax in C.
for (initialization; condition; increment/decrement)
{
    //code to be executed until the condition fails.
}
In our case, the variable i is initialized to 0. The condition checks whether the value of i is less than 7. Finally, i is incremented by 1 after the statements in the body are executed.
Spotting these initialization, condition and increment/decrement blocks in assembly helps us recognize for loops in the disassembly.
Let us open the executable in a debugger and observe what the disassembly looks like.
PUSH EBP
MOV EBP,ESP
AND ESP,FFFFFFF0
SUB ESP,20
CALL for.004015E0
MOV DWORD PTR SS:[ESP+1C],0 ; initialization: i = 0
JMP SHORT for.00401541 ; jump to the condition check (CMP)
MOV EAX,DWORD PTR SS:[ESP+1C] ; | loop body starts here (JLE target)
MOV DWORD PTR SS:[ESP+4],EAX ; |
MOV DWORD PTR SS:[ESP],for.00404000 ; |ASCII "value of a is %d"
CALL <JMP.&msvcrt.printf> ; printf
ADD DWORD PTR SS:[ESP+1C],1 ; increment: i++
CMP DWORD PTR SS:[ESP+1C],6 ; condition: compare i with 6
JLE SHORT for.00401528 ; loop back while i <= 6
NOP
LEAVE
RETN
In the preceding excerpt, the following instruction is used for initialization:
MOV DWORD PTR SS:[ESP+1C],0
In one of the previous articles, we learned that local variables are referenced by stack addresses. Here, the value 0 is written to the stack location [ESP+1C] so that it can be referenced later, which tells us it is a local variable. From the source code, we know that this instruction initializes the variable i.
Next, an unconditional jump (JMP SHORT for.00401541) transfers control to the CMP instruction near the end of the listing. The conditional jump that follows it is taken or not depending on the result of that CMP instruction.
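The control flow described so far can be mirrored in C with goto statements. The following is a minimal sketch, not compiler output: the function name run_loop, the labels body and check, and the out buffer are hypothetical names chosen for illustration, and the buffer stands in for the printf call so that the result can be inspected.

```c
#include <stddef.h>

/* Hypothetical helper: mirrors the jump layout of the disassembly.
 * Stores each value of i in out[] (in place of printf) and returns
 * how many values were stored. Assumes out has room for 7 ints. */
size_t run_loop(int *out)
{
    size_t n = 0;
    int i;

    i = 0;              /* MOV DWORD PTR SS:[ESP+1C],0  (initialization) */
    goto check;         /* JMP SHORT for.00401541                        */
body:
    out[n++] = i;       /* loop body: stands in for the printf call      */
    i += 1;             /* ADD DWORD PTR SS:[ESP+1C],1  (increment)      */
check:
    if (i <= 6)         /* CMP DWORD PTR SS:[ESP+1C],6                   */
        goto body;      /* JLE SHORT for.00401528                        */
    return n;
}
```

Placing the condition check at the bottom means each iteration needs only one conditional jump, and the initial unconditional jump ensures the condition is still tested before the body runs for the first time.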
Note: The following figure can be seen for better readability of assembly instructions and their respective addresses.
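The CMP instruction compares the value referenced by [ESP+1C] with the value 6. Note that the compiler has turned the source condition i<7 into the equivalent check i<=6, which is why the constant is 6 and the jump that follows is JLE. The following figure shows the stack before the CMP instruction is executed, with the value referenced by [ESP+1C] highlighted.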
After the CMP instruction is executed, the following is the status of the flags.
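The JLE instruction checks the O, S and Z flags: it jumps when the Z flag is set or when the S flag differs from the O flag. In this first iteration, CMP computes 0 - 6, which sets the S flag to 1 and leaves the O and Z flags at 0. Since S differs from O, JLE takes the jump to the address specified in the instruction. This completes the condition part of the for loop for the first iteration.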
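The flag logic above can be modelled in a few lines of C. This is an illustrative sketch only: struct flags, cmp32 and jle_taken are hypothetical names, and the model covers just the three flags that JLE reads for a 32-bit comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the flags JLE reads after "CMP a, b". */
struct flags {
    bool zf;  /* Z: the result is zero          */
    bool sf;  /* S: the result is negative      */
    bool of;  /* O: signed overflow occurred    */
};

/* CMP computes a - b, sets the flags, and discards the result. */
struct flags cmp32(int32_t a, int32_t b)
{
    /* Subtract as unsigned so that wraparound is well defined in C. */
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r = ua - ub;
    struct flags f;

    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    /* Overflow: the operands have different signs and the sign of the
     * result differs from the sign of the first operand. */
    f.of = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;
    return f;
}

/* JLE jumps when Z is set or when S differs from O. */
bool jle_taken(struct flags f)
{
    return f.zf || (f.sf != f.of);
}
```

With these definitions, jle_taken(cmp32(0, 6)) is true, matching the first iteration, while jle_taken(cmp32(7, 6)) is false, which corresponds to the point where the loop exits.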
After the conditional jump, the following instructions will be executed to print the value 0.
MOV EAX,DWORD PTR SS:[ESP+1C] ; |
MOV DWORD PTR SS:[ESP+4],EAX ; |
MOV DWORD PTR SS:[ESP],for.00404000 ; |ASCII "value of a is %d"
CALL <JMP.&msvcrt.printf> ; printf
ADD DWORD PTR SS:[ESP+1C],1
Notice the last instruction in the preceding excerpt. It adds 1 to the value referenced by [ESP+1C]. After it executes, the local variable i on the stack holds the value 1.
This completes the increment/decrement part of our for loop for the first iteration. When you can identify these initialization, condition and increment/decrement patterns in assembly, they most likely come from a for loop in the high-level source.
While loops
While loops produce disassembly similar to for loops. The following code is equivalent to the for loop example we have seen earlier but written using the while loop.
#include <stdio.h>

void main()
{
    int i=0;
    while(i<7){
        printf("value of a is %d\n", i);
        i++;
    }
}
When compiled and run, it has the same effect as the previously discussed for loop and prints the values 0 through 6, one per line.
Let’s open the compiled executable using a debugger (OllyDbg in this case) and observe the assembly code.
PUSH EBP
MOV EBP,ESP
AND ESP,FFFFFFF0
SUB ESP,20
CALL while.004015E0
MOV DWORD PTR SS:[ESP+1C],0
JMP SHORT while.00401541
MOV EAX,DWORD PTR SS:[ESP+1C] ; |
MOV DWORD PTR SS:[ESP+4],EAX ; |
MOV DWORD PTR SS:[ESP],while.00404000 ; |ASCII "value of a is %d"
CALL <JMP.&msvcrt.printf> ; printf
ADD DWORD PTR SS:[ESP+1C],1
CMP DWORD PTR SS:[ESP+1C],6
JLE SHORT while.00401528
NOP
LEAVE
RETN
If you compare this with the assembly code we obtained for the for loop example, there is no difference apart from the module name. First, there is an initialization followed by an unconditional jump (MOV DWORD PTR SS:[ESP+1C],0 and JMP). Next, there is a comparison followed by a conditional jump (CMP and JLE). Then the body of the loop is executed for the iteration (the three MOV instructions that set up the arguments and the CALL to printf). Finally, the increment is performed (ADD DWORD PTR SS:[ESP+1C],1).
Note that the preceding code explicitly performs an increment because the source does. If a while loop does not contain an explicit increment operation, we will see only the initialization and condition patterns when analyzing it.
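As an illustration of a while loop with no counter, consider Euclid's algorithm. This is a hypothetical example that does not appear in the binaries analyzed above, and gcd_while is an assumed name. A compiler would be expected to emit a comparison and a conditional jump for the condition, with no ADD of 1 to a counter, although the exact instructions depend on the compiler and its optimization settings.

```c
/* Hypothetical example: a while loop driven only by its condition.
 * Nothing is incremented; the loop ends when b reaches 0. */
unsigned gcd_while(unsigned a, unsigned b)
{
    while (b != 0) {          /* condition: compare b with 0, then jump */
        unsigned r = a % b;   /* body: updates a and b, no counter      */
        a = b;
        b = r;
    }
    return a;
}
```

For example, gcd_while(48, 18) returns 6 after three passes through the body.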
Note: The following figure can be seen for better readability of assembly instructions and their respective addresses.
Conclusion
In this article, we discussed how for and while loops can be identified when exploring the disassembly of an executable. We learned that for loops can be identified in assembly by looking for initialization, condition and increment/decrement patterns. We also learned that while loops produce similar code, except that the increment/decrement operation appears only when the programmer writes it explicitly.
Sources
- x86 Instruction Set Reference, c9x.me
- Assembly - Loops, Tutorialspoint
- Michael Sikorski and Andrew Honig, "Practical Malware Analysis," No Starch Press, February 2012