Overwriting the Stack
In previous articles, we covered the basics of stack-based buffer overflows and changed the execution address at run time by modifying a register value in the debugger.
In this article, we will analyze another simple C program that takes user input and prints it back on the screen. This time we will not change any values through the debugger as we did in the last article; instead, we will learn how to do it through user input alone. Let us have a look at the program.
In the program shown in the above screenshot, there are two functions, marked as 1 and 2. The first is the main function, where program execution starts. It prints a message on the screen and then calls the V1 function. The second is the V1 function, which defines an array of size 10, takes the user input, and prints it back to the output screen.
In the V1 function, we have used the 'gets' function to take the user input. The 'gets' function is vulnerable to buffer overflow because it has no way to check whether the input is longer than the buffer. It takes whatever the user enters and writes it into the buffer. If the input is larger than the buffer, 'gets' writes beyond the end of the buffer and corrupts the rest of the stack. So, let's analyze this with a debugger.
Note: You can download the EXE file here:
[download]
Now, let's run this program normally. We can see that our program runs perfectly and asks us to "Enter the name". When we enter the name and hit the enter key, the program accepts the value without crashing, as the input string is shorter than the buffer we have created.
Now, let us run the program again in the same manner, but this time we will enter a value that is larger than the buffer. For this we enter 35 A's (0x41 is the hexadecimal ASCII code of 'A'). The program throws an error and crashes. We can see the same in the below screenshot.
If we click on "click here" in the window shown in the above screenshot, we can see the following type of window on the screen.
By closely looking into the red box, we can see the offset is written as '41414141', which is four A's in hexadecimal (0x41 is the ASCII code of 'A'). This indicates that our input has overwritten an address the program uses, so the program is vulnerable to buffer overflow.
Note: In many cases, an application crash does not lead to exploitation, but sometimes it does.
So, let us open the program with the debugger, create a break point just before the function 'V1' is called, and note down the return address. The return address is the address of the instruction immediately after the function call. We can see the same in the below screenshot. (We have already mentioned all the basics in previous articles, so we are not going to discuss basics in detail again.) We can create the break point by selecting the address and pressing the F2 key.
Run the program by hitting the F9 key and then press F7, which is Step Into. Since the break point is just before the V1 function call, stepping into the call transfers execution control to the V1 function, and the Top of the Stack now points to the return address into the main function, which can be seen in the screenshot given below.
Now, we will note down the return address position as well as the return address from the Top of the Stack, which is the following.
Table 1
Return address position (Top of the Stack): 0022FF5C
Return address: 004013FC
We will overwrite this return address with the user input data in the next step of the article. If we look at the program screen we can see 'Main Function is called' is being shown on the screen.
Now, hit Step Over until we reach the 'gets' function. This can be done by pressing the F8 key.
As can be seen in the above screenshot, we have reached the gets function, and the program is now running and waiting for input. When we look at the program screen, we can see the program is asking us to enter the name, so enter the name and hit the enter key.
As can be seen, once we enter the name and hit the enter key, execution control moves to the next instruction and the program returns to the Paused state.
Now, hit the Step Over (F8 Key) until we reach the RETN instruction.
If we look into the window 4 now, we can see that the top of the stack is pointing to the return address which is the following.
Table 2
Return address position (Top of the Stack): 0022FF5C
Return address: 004013FC
Now, compare the addresses in Table 1 and Table 2.
They are identical. Nothing has changed the return address so far, because the input we entered fit within the buffer.
So, let us restart the program again and input a very long string value into the program input and analyze the return address when the program execution control reaches the RETN instruction. We can restart the program by pressing CTRL+F2 and input 35 A's into the program.
As can be seen in the above screenshot, we have entered a very long input value in the program. Now hit the F8 key (Step Over) until we reach the RETN instruction.
Now, we will create another table and note down the Top of the Stack values into the table.
Table 3
Return address position (Top of the Stack): 0022FF5C
Return address: 41414141
If we compare the addresses in Table 2 and Table 3, we can see that the return address into the main function has been replaced with 41414141 in Table 3. 0x41 is the hexadecimal ASCII code of 'A'.
So, we can see the return address has been overwritten by the user input value A. Now think: what if we could modify the input value at this position and write a different address, one that points to a location in memory containing our own piece of code? In this way, we can actually change the program flow and make it execute something different. The code that we want to execute after controlling the flow is often referred to as "SHELLCODE". We will discuss shellcode in later articles.
But the string that we entered contains 35 identical A's, so we do not know which of them overwrote the return address. We have to identify the position in the user input that lands on the return address. We can do it by entering a pattern instead of A's. The input pattern could be anything, as long as each part of it is recognizable. We will use the following pattern.
A1B2C3D4E5F6G7H8I9J0K1L2M3N4O5P6Q7R8S9T0U1V2W3X4Y5Z6
In this article, we have created this pattern manually, but in further articles we will use automated Metasploit scripts to generate the pattern.
Now, we need to restart the program again in the debugger and enter the above pattern as an input in the program we created.
As can be seen in the above screenshot, we have entered the pattern when the program asked for the input. Now, press F8 (Step Over) until we reach the RETN instruction.
As we can see in the screenshot, we have reached the RETN instruction, which can be seen in screen 1, and the value at the Top of the Stack has been overwritten by the user input and now holds "O5P6".
So, these 4 bytes are the part of the user input that lands on the return address. Let us verify this by replacing "O5P6" with "BBBB" in our pattern before entering the user input. According to our logic, the return address should now hold "BBBB" when we reach the RETN instruction.
As can be seen in the above screenshot, our B's have been successfully written in the position where the return address should be. So, if we change our B's to an address elsewhere in memory, the program execution would go to that address and execute the instruction found there.
In this way, we can control the program flow and run our own code just by manipulating the input data. So we have understood the following things by completing this exercise: