Solving AVR reverse engineering challenge with radare2 - rhme2 Jumpy (reversing 100)

Jun 07, 2021
Let's take a look at our first reversing challenge in the "reversing desert", called Jumpy. If you haven't seen the last video, you should watch it, because we are about to read a lot of AVR assembly and we need to understand the AVR architecture a little bit. The description says: "We really need access to this lab, protected with an Arduino-based access control system. Our agents got a quick dump of a non-custom engineered sample, but we're having trouble reverse engineering it." Ah! That's great. As we know, the .hex binary files we put on the board are customized, or rather encrypted.
So we can't disassemble them. But this challenge gives us a Jumpy.bin file, which is presumably a non-custom version, so it's not encrypted. I also wonder what the name Jumpy could mean. Will the program jump around a lot to obfuscate what the password is? Or does it mean we have to use a jumper on some of the pins of the Arduino? Well, we'll figure it out eventually. So, let's load the challenge on the board and connect to it with screen. We are presented with an input prompt, and when we submit some data it says, "Better luck next time." Next we take a first look at the .bin file.
With "file" we see that it is not a standard executable file format like ELF, so it is probably just a raw dump of memory. We can also quickly check with the strings command whether the correct password is simply embedded in it. I had a little hope, since it's only 100 points, but it turns out it's not there. The next steps I'm going to present may look like I had a simple plan, but in reality what followed was hours and hours of looking at things all over the place, trying to understand this binary, and there was a lot of frustration.
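The strings check mentioned above boils down to scanning the dump for runs of printable ASCII. Here is a minimal Python sketch of that idea. The `sample` bytes are a made-up stand-in for the real Jumpy.bin, and the function name `ascii_strings` is just for illustration.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for runs of printable ASCII, similar to `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Hypothetical stand-in for a raw firmware dump (not the real Jumpy.bin).
sample = b"\x0c\x94\x34\x00Input: \x00\xff\x01Better luck next time\x00"

found = list(ascii_strings(sample))
for offset, text in found:
    print(f"{offset:#06x}  {text!r}")
```

To try it on a real dump, you would read the file with `open(path, "rb").read()` and pass the bytes in. Prompts and messages show up this way, but a password that is checked arithmetically, as in this challenge, will not.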
It was the first time I had to reverse engineer AVR, and especially low-end hardware. So let me give you a short list of what I set up and explored, and then I'll tell you how I finally solved it. I first looked into how I could debug this, because static analysis is always a little harder if you don't know what to look for. I quickly found simavr, installed it and tried to run the binary. It took me some time to understand how to run it. I had some issues with setting the frequency and a floating point exception that I had to track down.
But in the end I figured out how to use the simduino example, which basically simulates an Arduino board. So that's perfect. To use it, I had to convert the binary file to the Intel HEX format, as the Arduino would expect. And then I could run it. It also starts a gdb server that you can connect to with gdb, but it has to be the special avr-gdb included in the GNU AVR toolchain. With simavr you can also use picocom to talk to the simulated UART serial interface. But there were countless problems. More on that soon. I also checked the binary with IDA. Fortunately my IDA license can disassemble AVR, but if you look at the addresses, they are incremented by one per instruction, although the opcodes are actually 2 bytes, and here you see the actual address.
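To make the Intel HEX conversion step a bit more concrete, here is a small Python sketch of the record format: each line carries a byte count, a 16-bit address, a record type, the data and a checksum. This is only an illustration, not the tool used in the video; in practice a toolchain utility such as objcopy would typically do this job. The function name `to_intel_hex` is made up.

```python
def to_intel_hex(data: bytes, record_len: int = 16) -> str:
    """Encode raw bytes as Intel HEX data records starting at address 0."""
    if len(data) > 0x10000:
        raise ValueError("this sketch only handles images up to 64 KiB")
    lines = []
    for addr in range(0, len(data), record_len):
        chunk = data[addr:addr + record_len]
        # Record body: byte count, 16-bit address, record type 00 (data), payload.
        body = bytes([len(chunk), addr >> 8, addr & 0xFF, 0x00]) + chunk
        checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
        lines.append(":" + (body + bytes([checksum])).hex().upper())
    lines.append(":00000001FF")  # end-of-file record
    return "\n".join(lines) + "\n"

# Four example bytes (one 32-bit AVR jmp instruction).
print(to_intel_hex(bytes([0x0C, 0x94, 0x34, 0x00])), end="")
```

The 64 KiB limit keeps the sketch short: larger images need extended address records, which a small AVR flash dump should not require.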
IDA's word addresses made it quite annoying to cross-reference with debugging in gdb. Since I like Binary Ninja, I also installed a plugin that someone wrote for AVR, because it's not supported by default. That works too, and at least it shows addresses you can work with, but it's a very basic plugin: it doesn't identify many functions and also doesn't really understand the interrupt table at first. By the way, this here is an interrupt table. It's easy to spot because it's just a big list of jumps to somewhere else, and we can read from the datasheet which address belongs to which interrupt.
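A table like that can also be decoded by hand. The Python sketch below assumes each vector is a 4-byte `jmp` instruction, which is how AVR parts with larger flash lay out their table; the example bytes are made up and are not taken from Jumpy.bin. It also shows the word-to-byte address conversion that makes cross-referencing between tools confusing.

```python
import struct

def decode_jmp(image: bytes, byte_addr: int):
    """Return the byte address targeted by a 32-bit AVR JMP, or None."""
    w1, w2 = struct.unpack_from("<HH", image, byte_addr)
    # JMP k is encoded as 1001 010k kkkk 110k, followed by the low 16 bits of k.
    if w1 & 0xFE0E != 0x940C:
        return None
    high = (((w1 >> 4) & 0x1F) << 1) | (w1 & 1)
    word_addr = (high << 16) | w2
    return word_addr * 2  # flash is word-addressed: 2 bytes per word

def vector_table(image: bytes, count: int):
    """Decode `count` consecutive 4-byte vectors starting at address 0."""
    return [decode_jmp(image, 4 * i) for i in range(count)]

# Hypothetical two-entry table, both entries are jmp instructions.
image = bytes.fromhex("0c943400" "0c945100")
for i, target in enumerate(vector_table(image, 2)):
    print(f"vector {i} at {4 * i:#06x} -> {target:#06x}")
```

Which vector number belongs to which interrupt source still has to come from the datasheet of the exact chip.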
So for example, when the board reboots or powers on, execution will start at 0. But other interrupts can be triggered, for example by a timer, and then the CPU would continue execution here and jump to the actual code that should be executed. Very typical of an embedded system. Anyway, although I used all these tools during my exploration phase and experimented with them, in the end I used radare2. I also read that a lot of people use radare2 for AVR. I want to emphasize again, before explaining the solution, that this frustrated me very much. It may look a bit easy, but I don't even know how many hours and days I spent learning AVR and getting things like simavr to run.
So don't get discouraged when you face a new platform or CTF challenge where you get stuck and have no idea what to do. I had no idea either. But the satisfaction at the end of finally figuring it out after working so hard is worth it. This is basically my line of thinking after I started to understand the AVR disassembly and also, in general, the hardware platform. This challenge wants you to enter a password via serial. And in the last video we learned how that works at the low level. This means we want to look for locations where this special address 0xc6 is written or read.
Because this is where the individual characters of the serial interface will be handled. We can open Jumpy with radare2, tell it that it is AVR, and analyze it with aaa. With pD we can print the disassembly for a specific length, so we can simply print the disassembly of the entire binary and use the tilde, which is basically just grep, to filter out lines that do something with 0xc6. And we find here two lines that load that address into register r24. Let's take a look at these two locations. The first one is at 0xea. Then we can print the disassembly of the function to which this address belongs with pdf.
Or we can also show a graph view with VV at that location. But for now I'll stick with the linear view. We can also enable assembler descriptions as comments, which are very useful when reading AVR for the first time. Here you can see that 0xc6 is moved into r24. But wait, that doesn't load from that address, it just loads the value into the register? Here's another weird AVR thing that took me a while to figure out. The value in r24 is moved to register 30. Now that register contains 0xc6. And then whatever value is in register 18 is stored in the memory location referenced by Z.
We can check the ATmega datasheet again and read about the general purpose registers, and we learn that Z simply refers to register 30 and register 31 combined. And 30 contains 0xc6, so whatever is in register 18 will be written to the UART buffer register and sent via serial. This is the location where a character is sent to the computer. These X, Y and Z registers make reading so annoying, until you have memorized which registers refer to what. Anyway, let's take a look at the other location. Here we can basically see the same thing: 0xc6 is loaded into register 24, which is then moved to register 30, which is then used in the indirect load instruction to load the byte that is contained in the serial buffer.
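The search for those two locations can also be done directly on the raw bytes. The Python sketch below assumes the loads are plain `ldi r24, 0xc6` instructions, as the disassembly described above suggests, and builds the matching opcode from the LDI encoding. The example image is made up, and `find_ldi` and `z_pointer` are illustrative names.

```python
import struct

def find_ldi(image: bytes, value: int, reg: int = 24):
    """Return byte offsets of `ldi r<reg>, <value>` in a raw AVR image."""
    if not 16 <= reg <= 31:
        raise ValueError("ldi only targets r16..r31")
    # LDI Rd, K is encoded as 1110 KKKK dddd KKKK with d = reg - 16.
    opcode = 0xE000 | ((value & 0xF0) << 4) | ((reg - 16) << 4) | (value & 0x0F)
    needle = struct.pack("<H", opcode)  # AVR flash words are little-endian
    # Instructions are 16-bit aligned, so only even offsets are checked.
    return [off for off in range(0, len(image) - 1, 2)
            if image[off:off + 2] == needle]

def z_pointer(r30: int, r31: int) -> int:
    """Z is the 16-bit pointer formed by r31 (high byte) and r30 (low byte)."""
    return (r31 << 8) | r30

# Hypothetical image: nop, ldi r24, 0xc6, nop.
image = bytes.fromhex("0000" "86ec" "0000")
print([hex(off) for off in find_ldi(image, 0xC6)])
print(hex(z_pointer(0xC6, 0x00)))
```

A byte-level scan like this can also hit data or the second word of a 32-bit instruction, so every match still needs a look in the disassembler.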
So this is where our input is read. We can also follow the cross reference to the function that called this one and find it here. If we look at the graph, we see that it apparently checks whether the character is a carriage return or a newline. And if not, it will read the next character. Great, so that function seems to read the input line. So the password verification can't be far away, but the problem is that this function has no cross references to it. I have no idea where it is called from or where the execution will go after it returns.
So now I choose to debug this with gdb and simavr. First I start simavr, or to be more precise, the simduino example, and use Jumpy in the converted Intel HEX format. This will also spawn a gdb server. Now I can use avr-gdb to connect to this server and we can see that the execution is still waiting at startup. Now we can set a breakpoint at 0x2ba, which is just before loading the value from the serial buffer. Then I can use picocom to simulate serial input and I type a capital A, which will trigger the breakpoint here.
Slowly stepping forward I see 0x41, which is our character A. So it works. Well, not really. Turns out this is buggy. After reading the first character it will keep reading garbage in an infinite loop and I couldn't figure out why. If you sent a newline immediately, it would read it and continue, but you wouldn't be able to enter any more data. At this point, a big thank you to bvernoux from IRC, who created an issue on simavr and explained it. It turns out that there is a bug where the emulator does not reset the flag indicating that new data has arrived, so it constantly thinks that the serial buffer is full and reads from it.
This patch fixes this. So I modify the simavr sources, compile it again, and now it works like a charm. Now we can enter a longer string. The breakpoint I set here is immediately after reading the next character, so we can see that I entered the alphabet ABCDEF and so on. We can continue until it reads the carriage return from pressing enter. Now we can carefully step forward, pass the check for a carriage return and return from this function. So where do we return to? It turns out that radare2 didn't find this function, but that's not a problem, we can define a new one here.
When we look at the graph view, we can immediately see a loop. And when the loop ends, there is a compare with 0xd. I checked this and it turns out that it's basically counting the number of characters in our input string, so I figured we need 13 characters for our input. If we look at what happens when this comparison succeeds or fails, we see that when it fails it will load a value from SRAM and OR it, which means it will definitely set a 1 there, and then write this value back. If the comparison had succeeded, it wouldn't do that.
I didn't realize that right away when I first looked at this, but then it became pretty clear that this sets an error flag. Basically, if you do something wrong with your input, that flag will be set and it will fail. But although our input will clearly fail eventually, we can take it a step further and see where this function will return. And it turns out that, what a surprise, it is not a function identified by radare2 either. We can immediately see that there is also a comparison that looks very similar to the previous one. If the comparison fails, we set the error flag.
And if we debug this step by step we can see that it loads the letters "H" and "I" from SRAM locations 0x135 and 0x136. Then it adds the ASCII values and compares the result with 0xd3. And this is our first big breakthrough on how the password is verified. If you continue down this path you will find more and more functions like this. These all take two characters, add them, and check the result. There are many of the same functions, just with different character offsets. And when you continue stepping you also find a second type of these, which multiply the ASCII values of two characters.
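Put together, the length check, the error flag and the two kinds of pair checks can be modelled in a few lines of Python. This is only a sketch of the logic as described in the video, not a decompilation: `ERROR_BIT` is an assumed value, all function names are made up, and only the two rules quoted in this video are included (characters 7 and 8 summing to 0xd3, and the product of characters 4 and 5 being 0x122f, which is mentioned later on).

```python
ERROR_BIT = 0x01  # assumption: the video only says a 1 gets ORed into the flag

def check_length(flag: int, text: bytes, expected: int = 0x0D) -> int:
    """Mirror the length compare: leave the flag alone on success."""
    return flag if len(text) == expected else flag | ERROR_BIT

def check_sum(flag: int, text: bytes, i: int, j: int, expected: int) -> int:
    """First type of check: two characters are added and compared."""
    return flag if text[i] + text[j] == expected else flag | ERROR_BIT

def check_product(flag: int, text: bytes, i: int, j: int, expected: int) -> int:
    """Second type of check: two characters are multiplied and compared."""
    return flag if text[i] * text[j] == expected else flag | ERROR_BIT

def run_checks(text: bytes) -> int:
    flag = check_length(0, text)
    if flag:
        # The firmware keeps running its checks; the sketch stops here only
        # to avoid indexing past the end of a short input.
        return flag
    flag = check_sum(flag, text, 7, 8, 0xD3)
    flag = check_product(flag, text, 4, 5, 0x122F)
    return flag

# The alphabet input from the debugging session has the right length
# but fails the pair checks, so the error flag ends up set.
print(run_checks(b"ABCDEFGHIJKLM"))
```

The flag only ever gains bits, which matches the observation that a single failed comparison is enough to make the whole input fail.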
Otherwise it's the same: it checks the result of the multiplication. So the first step is to find all of these blocks, which I did by printing a linear disassembly of the entire binary, copying it into Sublime, and using a huge, ugly regex to find all the functions that have this pattern of loading two values into r24. Then I copy them to another file and start cleaning them up until I have extracted the important pieces. Basically, now we know all the rules. These two characters have to have this sum. These two characters have to have this sum. These two characters multiplied have to have this result, and so on.
A super easy and perfect tool to solve this now is an SMT solver like Z3. I won't go into too much detail since the video is already very long, but I'm basically just writing a couple of constraints that express exactly that logic. The variable values represents our input character array, and I say: the sum of character 7 and character 8 has to be 0xd3. The product of characters 4 and 5 must be 0x122f. I write all of that down and let Z3 solve this puzzle for me, and I get a string that follows all these rules. Give it to me.
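To get a feeling for how much each rule narrows things down, here is a dependency-free Python sketch. Note that this is not Z3: it is plain exhaustive enumeration over printable ASCII, used as a stand-in, and it only covers the two rules quoted above. The assumption that the password consists of printable characters is mine. In Z3 the same rules would be written as constraints over one small bit-vector variable per character.

```python
from itertools import product

PRINTABLE = range(0x20, 0x7F)  # assumption: the password is printable ASCII

def matching_pairs(op, expected):
    """All ordered printable pairs (a, b) for which op(a, b) == expected."""
    return [(a, b) for a, b in product(PRINTABLE, repeat=2)
            if op(a, b) == expected]

# Rule quoted in the video: character 7 + character 8 == 0xd3.
sum_pairs = matching_pairs(lambda a, b: a + b, 0xD3)
# Rule quoted in the video: character 4 * character 5 == 0x122f.
mul_pairs = matching_pairs(lambda a, b: a * b, 0x122F)

print(len(sum_pairs), "candidate pairs for the sum rule")
print([(chr(a), chr(b)) for a, b in mul_pairs])
```

A single sum rule leaves many candidates, while a product rule is far more restrictive because few printable values divide the target. With all rules from the binary combined, a solver only has to pick the one assignment that satisfies every pair at once.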
Wow... I was so happy when that showed up. I connect to the board, submit the password and get the flag. Another 100 points. Damn, this 100 point challenge was hard for me. I don't even want to know how hard a 300 point challenge is.