Exploiting Format Strings: The Stack

July 1, 2016 by

This article covers how to exploit format string vulnerabilities to read arbitrary values from the stack and to write arbitrary values to it.

Overview:

In this article, we will learn what format string vulnerabilities are and how to exploit them to read specific values from the stack. We will also look at how different format specifiers can be used to write arbitrary values to the stack.

NOTE: The memory addresses shown in the commands may be completely different on your setup, so adjust your calculations accordingly.

What is a Format String?

A format string is an ASCII string that uses format specifiers to control how values are rendered. The complete string is passed to a format function such as printf or vprintf, which converts C data types into their string representation. scanf uses similar specifiers in the opposite direction, to parse text into C data types.

Example: Here %s specifies that the next argument picked from the stack should be rendered as a string in the final output.

char *s = "Format String";
printf("%s", s);

But what if the program passes user input directly as the format string, without a fixed format of its own? The format functions do not change their behavior: for every specifier they find in the input, they fetch the next value from the stack, whether or not a matching argument was ever passed. To exploit this property we need to know three basic things:

  • Location of arguments on the stack.
  • Basic understanding of x86 assembly.
  • Familiarity with GNU debugger.
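To make the difference between "input used as a format string" and "input used as data" concrete, here is a minimal sketch. The function names `render_as_format` and `render_as_data` are hypothetical and do not come from the tutorial's program. The input used in the note below contains only `%%`, which consumes no arguments, so the behavior is well defined.

```c
#include <stdio.h>

/* Unsafe pattern: the caller-supplied text is used as the format string,
 * so any '%' sequences inside it are interpreted by snprintf. */
int render_as_format(char *out, size_t out_size, const char *input)
{
    return snprintf(out, out_size, input);
}

/* Safe pattern: the text is passed as data for a fixed "%s" format,
 * so it is copied through unchanged. */
int render_as_data(char *out, size_t out_size, const char *input)
{
    return snprintf(out, out_size, "%s", input);
}
```

Given the five-character input `100%%`, the first function produces the four characters `100%` because `%%` is interpreted as an escaped percent sign, while the second copies all five characters unchanged. An attacker who sends `%x` or `%n` instead relies on that same interpretation step.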

Sample Program

We will be using the following piece of code throughout the tutorial. We also assume that memory protection mechanisms such as ASLR are disabled and that the stack is executable.

Switch to the root user and execute the following commands:

Disable ASLR: echo 0 > /proc/sys/kernel/randomize_va_space

Compile the program with the executable-stack option: gcc -z execstack -o fmt fmt.c
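The original listing is not reproduced in this copy of the article. The sketch below is only a hypothetical stand-in for the kind of program the tutorial assumes, and it is not the author's code. The name `fmt_main` and the usage message are assumptions, and addresses such as breakpoints will not match the ones quoted later.

```c
#include <stdio.h>

/* Hypothetical stand-in for the tutorial's sample program.
 * argv[1] is attacker-controlled and is used directly as the format string. */
int fmt_main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <string>\n", argv[0]);
        return 1;
    }
    printf(argv[1]);   /* vulnerable call: no fixed format string */
    printf("\n");
    return 0;
}
```

Renaming `fmt_main` to `main` and saving the file as fmt.c would give a binary that accepts input such as AAAA-%x on the command line. The fix is to call `printf("%s", argv[1])` instead.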

Exploitation Phase:

In this phase, we will dive into the basics of exploiting the issue discussed so far. Before we start fuzzing the binary with different symbols, we need to understand what these symbols stand for.

The following screenshot is taken directly from the Linux Programmer's Manual entry for the printf function.

As can be seen,

%x is used to convert an unsigned int into unsigned hex

%u is used to convert an unsigned int into unsigned decimal

%n is used to store the number of characters written so far into the integer that the corresponding argument points to.
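The three specifiers above can be seen working together in a well-defined call, where every specifier has a matching argument. This is a minimal sketch; the function name `specifier_demo` and the sample value are assumptions chosen for illustration.

```c
#include <stdio.h>

/* Renders `value` in hex, then a hyphen, then in decimal.
 * The %n between them stores how many characters had been produced
 * up to that point into *written_before_u. */
int specifier_demo(char *out, size_t out_size,
                   unsigned int value, int *written_before_u)
{
    return snprintf(out, out_size, "%x-%n%u",
                    value, written_before_u, value);
}
```

Called with the value 255, this produces the text `ff-255` and stores 3 through the pointer, because `ff-` had been produced when %n was reached. Note that %n does not print anything: it writes to memory, which is what makes it useful to an attacker. Some hardened C libraries restrict or disable %n, so behavior may differ outside a lab setup.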

Since our code does not use a fixed format string, any format specifier we pass as an argument to the program is interpreted by printf, which starts fetching values from the stack and presents them in the specified format. In this case we are using %x, so it fetches some arbitrary data from the stack and presents it in hex format.

Before we go any further, let us analyze the result in the GNU debugger. We first set a breakpoint at main using the command "break main", then run the program with our argument and format specifier using "run AAAA-%x".

We set another breakpoint with "break *0x80484d0", just before the call to printf, to examine the state of the stack before the program hits the segmentation fault and the stack state is lost.

In the above screenshot we can see that our breakpoint is hit. Examining the first ten words from the top of the stack, we can see that our input is located at the 8th argument position from the top of the stack.

To make sure our analysis is correct, we will fetch the 8th argument from the top of the stack directly using the dollar sign (direct parameter access, as in %8$x). Because we are executing these commands in a bash shell, we need to escape the dollar sign with a backslash. As can be seen, we can determine the exact location where our argument is placed.
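When scanning a stack dump for the input, it helps to know what the marker bytes look like as a 32-bit word. The sketch below (the helper name `first_word` is hypothetical) reinterprets the first four bytes of a string as a word, the way they would appear in a hex dump of the stack.

```c
#include <stdint.h>
#include <string.h>

/* Returns the first four bytes of `input` as a 32-bit word.
 * The caller must supply at least four bytes. */
uint32_t first_word(const char *input)
{
    uint32_t word = 0;
    memcpy(&word, input, sizeof word);
    return word;
}
```

For the marker AAAA this yields 0x41414141, since 'A' is 0x41 in ASCII. Using four identical bytes keeps the pattern recognizable regardless of byte order.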
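The dollar-sign syntax is the POSIX numbered-argument form `%n$`, which selects an argument by position instead of consuming arguments in order. Here is a minimal sketch with well-defined arguments. The function name `pick_second` and the sample values are assumptions, and this form is a POSIX extension rather than part of ISO C.

```c
#include <stdio.h>

/* Prints the second argument first and the first argument second,
 * using POSIX numbered-argument conversions. */
int pick_second(char *out, size_t out_size,
                unsigned int first, unsigned int second)
{
    return snprintf(out, out_size, "%2$x-%1$x", first, second);
}
```

Called with 0x41414141 and 0xdeadbeef, this produces `deadbeef-41414141`. In the exploit, the same mechanism lets a short input reach the 8th stack slot without spelling out eight separate %x specifiers.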

We will now use the format specifier %n to write data to a precise memory location. Let us first understand what happens if we pass %n instead of %x. As can be seen, the program crashes with a segmentation fault. Examining the instruction at EIP, we find a mov instruction that stores the value in EDX at the address held in EAX. EDX holds the number of bytes written so far, which is five ("AAAA" plus one hyphen), and EAX holds the value of the argument we passed. Because that value is not a valid writable address, the write faults.

Further in the same screenshot, we have used the format specifier %u to change the value written to the same memory location. Because %u prints a ten-digit decimal value here, the overall result comes to 0x10, i.e., we have written 16 characters so far: five A's, one hyphen, and ten characters from %u. Using this same technique, we will be writing the address of our shellcode to an offset of a function.
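The arithmetic behind the value that %n stores can be checked in a well-defined setting. The sketch below is an illustration, not the exploit itself. The name `count_after_u` and the sample inputs are assumptions, and the prefix is assumed to be short enough to fit the scratch buffer.

```c
#include <stdio.h>

/* Formats `prefix` followed by `value` in decimal and returns the
 * character count that %n records after both have been produced. */
int count_after_u(const char *prefix, unsigned int value)
{
    char scratch[64];
    int count = 0;
    snprintf(scratch, sizeof scratch, "%s%u%n", prefix, value, &count);
    return count;
}
```

With a six-character prefix (five A's and a hyphen) and a ten-digit value such as 3221225472, the recorded count is 6 + 10 = 16, which is 0x10. Changing how many characters are printed before %n is therefore how the attacker chooses the value that gets written.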





Warlock works as an Information Security Professional. He has quite a few global certifications to his name such as CEH, CHFI, OSCP and ISO 27001 Lead Implementer. He has experience in penetration testing, social engineering, password cracking and malware obfuscation. He is also involved with various organizations to help them in strengthening the security of their applications and infrastructure.