Buffer overflow & format string attacks: More basics

August 6, 2015 by Security Ninja

In the previous article we learned the basics of buffer overflow, how attackers exploit this vulnerability, and the defenses that can be put around it, such as canaries and a non-executable stack. In this part of the series, we will learn about a very famous but insidious form of attack known as the format string attack. This attack is also the result of insecure or faulty programming. We will work through an example, tracing the vulnerability on the stack, and finally see what defenses can be put in place against it. With this vulnerability, attackers misuse formatting functions to read and manipulate information in memory. Let's discuss this vulnerability in detail below.

Format string attacks

These vulnerabilities are associated with the 'printf' family of functions. Yes, you read that right: 'printf'. Suppose a programmer is writing code and uses printf to print a string. He writes the following:

printf(output);

instead of

printf("%s", output);

Now you may ask, "What is the difference between these two, since both compile without errors?" Imagine that output is set to "%d" in the first printf. printf will dutifully interpret output as a format string that asks for a decimal integer, and it will go to the stack to fetch one. Since no such argument was supplied, the behavior is undefined in C; in practice printf prints whatever value happens to be there. The call does not fail, because printing a garbage value is a perfectly successful printf as far as the function is concerned.

Let's dance in the stack

To understand this attack better, we will walk through an example and trace it on the stack.

Reading from stack

Consider the following snprintf statement:

snprintf(buffer, sizeof(buffer), input);

This call passes user input where the format string belongs, so it is open to a format string attack. Suppose the attacker enters "%x%x%x" as input. The call then effectively becomes:

snprintf(buffer, sizeof(buffer), "%x%x%x");

The attacker's input is now interpreted as a format string, and snprintf fetches the next three values from the stack and writes them into buffer in hexadecimal. If we now issue:

printf("%s", buffer);

we will see those three values from the stack printed in hexadecimal.

Writing to the stack

So far we have been reading the contents of the stack by passing a format string as user input, but we can also write to memory. Let's see how.

The "%n" format specifier stores the number of characters written so far, up to the point where %n appears, in the variable whose address is passed as the corresponding argument. For example, consider a printf command such as:

printf("hello %n", &test);

This stores the number 6 in the memory location of test. Notice that we have just written to the memory location of the variable 'test' using a printf. Now let's try a more complex example. Suppose there is some value on the stack that the attacker wants to change, and the program is as follows:



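char input[50];    /* attacker-controlled input */
char buffer[50];   /* destination of the formatted output */
int a = 1;

snprintf(buffer, sizeof(buffer), input);   /* vulnerable: input is used as the format string */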


On the corresponding stack, the arguments to snprintf are followed by the integer a and then by buffer.

Now let's say the value to change is at address 0xaffbfca0. The attacker can obtain this address by looking at the source code or by printing the contents of the stack. In this example, suppose the user inputs:

"\xa0\xfc\xfb\xaf%d%n"

so that the snprintf statement becomes:

snprintf(buffer, sizeof(buffer), "\xa0\xfc\xfb\xaf%d%n");

Let's understand this input. Note that "\" is the escape character and "x" indicates a hexadecimal number. Every two hexadecimal digits translate into one byte (one ASCII-sized character), so the input begins with four characters. After these, the attacker enters %d, which means "print a decimal integer". But which integer will it print? The next value on the stack after input is the integer 'a', which is set to 1, so there are now five characters in buffer in total (4 address bytes + 1 decimal digit).

After this, the next thing in the input is '%n'. As stated earlier, this format specifier takes the number of characters written so far, which in this case is 5, and stores it at the memory location given by the next argument. Which memory location, one may ask? snprintf looks for the next argument. None was provided, so it picks the next item on the stack, which is buffer. By now buffer begins with "\xa0\xfc\xfb\xaf", which in memory is interpreted as the address 0xaffbfca0 (because the machine is little endian), and thus the value 5 is written to that location.

So we have seen how we can write a number to a memory location. This memory location can be where the return pointer resides; by overwriting it, attackers can take control of the program.

Format string attack detection & defenses

The following section describes how we can detect and defend against format string attacks.

  • Whenever user input contains escape characters (\) or format specifiers such as %x, %d, or %n, it is likely that a format string attack is underway.
  • The best way to defend against a format string attack is to make sure the programmer always passes an explicit format string in printf, sprintf, fprintf, and snprintf function calls.
  • Deploy all patches whenever applicable.

So in this article we have seen how a small function like printf can lead to serious issues if not handled correctly and securely.
