The Official Radare2 Book, page 61 of 64

[0d34], and it involves the following nodes:

   • [0d34]

   • [0d65]

   • [0d3d]

   • [0d61]

Here are the assembly listings for those blocks. The first one puts 0 into the local variable local_10_4:

And this one compares local_10_4 to 8, and executes a conditional jump based on the result:

It's pretty obvious that local_10_4 is the loop counter, so let's name it accordingly:

:> afvn local_10_4 i

Next block is the actual loop body:

The memory area at 0x6020e0 is treated as an array of dwords (4-byte values), and the code checks whether its ith element is zero. If it is not, the loop simply continues:

If the value is zero, the loop breaks and this block is executed before exiting:

It prints the following message: "Use everything!" As we've established earlier, we are dealing with a virtual machine. In that context, this message probably means that we have to use every available instruction. Whether we executed an instruction or not is stored at 0x6020e0 - so let's flag that memory area:

:> f sym.instr_dirty 4*9 0x6020e0
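The loop described above can be modelled in a few lines of Python. This is only a sketch of the logic, not the program's code: instr_dirty stands for the nine dwords at 0x6020e0, and the function name is made up for illustration.

```python
def all_instructions_used(instr_dirty):
    """Model of the check loop: i runs from 0 to 8 over an array of dwords."""
    i = 0
    while i <= 8:
        if instr_dirty[i] == 0:
            # The real program prints "Use everything!" and exits at this point.
            return False
        i += 1
    return True


print(all_instructions_used([1] * 9))                      # every slot set
print(all_instructions_used([1, 1, 0, 1, 1, 1, 1, 1, 1]))  # slot 2 never set
```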

Assuming we don't break out and the loop completes, we are moving on to some more checks:

This piece of code may look a bit strange if you are not familiar with x86_64-specific stuff. In particular, we are talking about RIP-relative addressing, where offsets are described as displacements from the current instruction pointer, which makes implementing PIE easier. Anyways, r2 is nice enough to display the actual address (0x602104). Got the address, flag it!

:> f sym.good_if_ne_zero 4 0x602104
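For readers new to RIP-relative addressing, here is a small Python sketch of how such an address is resolved: the signed 32-bit displacement is added to the address of the next instruction. The instruction address and length used in the example are hypothetical, since the listing itself is not reproduced here; only the target 0x602104 comes from the text.

```python
def rip_relative_target(insn_addr, insn_len, disp32):
    """Effective address = address of the NEXT instruction + signed displacement."""
    # Interpret the 32-bit displacement as a signed value.
    if disp32 & 0x80000000:
        disp32 -= 1 << 32
    return (insn_addr + insn_len + disp32) & 0xFFFFFFFFFFFFFFFF


# Hypothetical instruction: 6 bytes long, located at 0x400900 (made-up values).
insn_addr, insn_len = 0x400900, 6
disp = 0x602104 - (insn_addr + insn_len)  # displacement that reaches 0x602104
print(hex(disp), hex(rip_relative_target(insn_addr, insn_len, disp)))
```

This is the arithmetic r2 does for us when it shows the resolved address next to the instruction.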

Keep in mind though, that if RIP-relative addressing is used, flags won't appear directly in the disassembly, but r2 displays them as comments:

If sym.good_if_ne_zero is zero, we get a message ("Your getting closer!"), and then the program exits. If it is non-zero, we move to the last check:

Here the program compares a dword at 0x6020f0 (again, RIP-relative addressing) to 9. If it's greater than 9, we get the same "Your getting closer!" message, but if it's less than or equal to 9, we finally reach our destination, and get the flag:

As usual, we should flag 0x6020f0:

:> f sym.good_if_le_9 4 0x6020f0

Well, it seems that we have fully reversed the main function. To summarize it: the program reads bytecode from the standard input, and feeds it to a virtual machine. After VM execution, the program's state has to satisfy these conditions in order to reach the goodboy code:

   • vmloop's return value has to be "*"

   • sym.memory has to contain the string "Such VM! MuCH reV3rse!"

   • all 9 elements of sym.instr_dirty array should not be zero (probably means that all instructions had to be used at least once)

   • sym.good_if_ne_zero should not be zero

   • sym.good_if_le_9 has to be less than or equal to 9
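The five conditions can be written down as a single predicate. The Python sketch below is only a restatement of the list: the function and parameter names are invented, the memory buffer is modelled as a bytes object, "contain" is read literally as a substring test, and the return value is compared against "*" exactly as stated above, without assuming how the program encodes it.

```python
def reaches_goodboy(vm_ret, memory, instr_dirty, good_if_ne_zero, good_if_le_9):
    """Return True when all five conditions from the summary hold."""
    return (
        vm_ret == "*"                              # vmloop's return value
        and b"Such VM! MuCH reV3rse!" in memory    # contents of sym.memory
        and len(instr_dirty) == 9
        and all(v != 0 for v in instr_dirty)       # every instruction was used
        and good_if_ne_zero != 0
        and good_if_le_9 <= 9
    )


# Example state that satisfies every condition (values are illustrative).
print(reaches_goodboy("*", b"Such VM! MuCH reV3rse!\x00", [1] * 9, 1, 9))
```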

This concludes our analysis of the main function. We can now move on to the VM itself.

.vmloop

[offset]> fcn.vmloop

Well, that seems disappointingly short, but no worries, we have plenty to reverse yet. The thing is that this function uses a jump table at 0x00400a74, and r2 can't yet recognize jump tables (Issue 3201), so the analysis of this function is a bit incomplete. This means that we can't really use the graph view now, so either we just use visual mode, or fix those basic blocks. The entire function is just 542 bytes long, so we certainly could reverse it without the aid of the graph mode, but since this writeup aims to include as much r2 wisdom as possible, I'm going to show you how to define basic blocks.

But first, let's analyze what we already have! To begin with, rdi is put into local_3. Since the application is a 64-bit Linux executable, we know that rdi is the first function argument (as you may have recognized, the automatic analysis of arguments and local variables was not entirely correct), and we also know that vmloop's first argument is the bytecode. So let's rename local_3:

:> afvn local_3 bytecode

Next, sym.memory is put into another local variable at rbp-8 that r2 did not recognize. So let's define it!

:> afv 8 memory qword

r2 tip: The afv [idx] [name] [type] command is used to define a local variable at [frame pointer - idx] with the name [name] and type [type]. You can also remove local variables using the afv- [idx] command.

In the next block, the program checks one byte of bytecode, and if it is 0, the function returns with 1.

If that byte is not zero, the program subtracts 0x41 from it, and compares the result to 0x17. If it is above 0x17, we get the dreaded "Wrong!" message, and the function returns with 0. This basically means that valid bytecodes are ASCII characters in the range of "A" (0x41) through "X" (0x41 + 0x17). If the bytecode is valid, we arrive at the code piece that uses the jump table:
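The range check can be sketched in Python as follows. The sketch assumes the comparison is unsigned on a 32-bit register (which is what an "above" condition suggests), so bytes smaller than 0x41 wrap around to huge values and are rejected as well. The function name and the returned labels are made up for illustration.

```python
def classify_bytecode(byte):
    """Model of the checks vmloop performs on one byte of bytecode."""
    if byte == 0:
        return ("end", None)            # function returns 1
    # Assumption: 32-bit unsigned arithmetic, so 0x40 - 0x41 wraps to 0xFFFFFFFF.
    index = (byte - 0x41) & 0xFFFFFFFF
    if index > 0x17:
        return ("wrong", None)          # "Wrong!" is printed, function returns 0
    return ("dispatch", index)          # index into the jump table


for ch in "AX":
    print(ch, classify_bytecode(ord(ch)))
print("Y", classify_bytecode(ord("Y")))
print("@", classify_bytecode(ord("@")))
```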

The jump table's base is at 0x400ec0, so let's define that memory area as a series of qwords:

[0x00400a74]> s 0x00400ec0

[0x00400ec0]> Cd 8 @@=`?s $$ $$+8*0x17 8`

r2 tip: Except for the ?s, all parts of this command should be familiar now, but let's recap it! Cd defines a memory area as data, and 8 is the size of that memory area. @@ is an iterator that makes the preceding command run for every element that @@ holds. In this example it holds a series generated using the ?s command. ?s simply generates a series from the current seek ($$) to current seek + 8*0x17 ($$+8*0x17) with a step of 8.
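To make the series concrete, here is the same sequence of addresses computed in Python. The sketch assumes the end of the range is inclusive, which matches a table with one entry for each of the 0x18 valid bytecode values.

```python
start = 0x400EC0  # current seek ($$) when the command is run

# One address per table entry: from $$ to $$ + 8*0x17 inclusive, step 8.
series = list(range(start, start + 8 * 0x17 + 1, 8))

print(len(series), hex(series[0]), hex(series[-1]))
```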

This is what the disassembly looks like after we add this metadata:

[0x00400ec0]> pd 0x18

; DATA XREF from 0x00400a76 (unk)

0x00400ec0 .qword 0x0000000000400a80

0x00400ec8 .qword 0x0000000000400c04

0x00400ed0 .qword 0x0000000000400b6d

0x00400ed8 .qword 0x0000000000400b17

0x00400ee0 .qword 0x0000000000400c04

0x00400ee8 .qword 0x0000000000400c04

0x00400ef0 .qword 0x0000000000400c04

0x00400ef8 .qword 0x0000000000400c04

0x00400f00 .qword 0x0000000000400aec

0x00400f08 .qword 0x0000000000400bc1

0x00400f10 .qword 0x0000000000400c04

0x00400f18 .qword 0x0000000000400c04

0x00400f20 .qword 0x0000000000400c04

0x00400f28 .qword 0x0000000000400c04

0x00400f30 .qword 0x0000000000400c04

0x00400f38 .qword 0x0000000000400b42

0x00400f40 .qword 0x0000000000400c04

0x00400f48 .qword 0x0000000000400be5

0x00400f50 .qword 0x0000000000400ab6

0x00400f58 .qword 0x0000000000400c04

0x00400f60 .qword 0x0000000000400c04

0x00400f68 .qword 0x0000000000400c04

0x00400f70 .qword 0x0000000000400c04

0x00400f78 .qword 0x0000000000400b99

As we can see, the address 0x400c04 is used a lot, and besides that there are 9 different addresses. Let's look at 0x400c04 first!

We get the message "Wrong!", and the function just returns 0. This means that those are not valid instructions (they are valid bytecode though; they can be parameters, for example). We should flag 0x400c04 accordingly:

[0x00400ec0]> f not_instr @ 0x0000000000400c04

As for the other offsets, they all seem to be doing something meaningful, so we can assume they belong to valid instructions. I'm going to flag them using the instructions' ASCII values:

[0x00400ec0]> f instr_A @ 0x0000000000400a80

[0x00400ec0]> f instr_C @ 0x0000000000400b6d

[0x00400ec0]> f instr_D @ 0x0000000000400b17

[0x00400ec0]> f instr_I @ 0x0000000000400aec

[0x00400ec0]> f instr_J @ 0x0000000000400bc1

[0x00400ec0]> f instr_P @ 0x0000000000400b42

[0x00400ec0]> f instr_R @ 0x0000000000400be5

[0x00400ec0]> f instr_S @ 0x0000000000400ab6

[0x00400ec0]> f instr_X @ 0x0000000000400b99
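The flag names follow mechanically from the table: entry number n belongs to the byte 0x41 + n. As a cross-check, the Python sketch below derives the same flag commands from the 24 table entries listed earlier; the helper name is made up, and the script only prints r2 commands, it does not talk to r2.

```python
NOT_INSTR = 0x400C04  # shared target of all invalid instructions

# The 24 qwords of the jump table at 0x400ec0, in order.
JUMP_TABLE = [
    0x400A80, 0x400C04, 0x400B6D, 0x400B17, 0x400C04, 0x400C04,
    0x400C04, 0x400C04, 0x400AEC, 0x400BC1, 0x400C04, 0x400C04,
    0x400C04, 0x400C04, 0x400C04, 0x400B42, 0x400C04, 0x400BE5,
    0x400AB6, 0x400C04, 0x400C04, 0x400C04, 0x400C04, 0x400B99,
]


def flag_commands(table):
    """Build one 'f instr_<letter> @ <addr>' command per valid instruction."""
    commands = []
    for index, target in enumerate(table):
        if target == NOT_INSTR:
            continue
        letter = chr(0x41 + index)  # table index 0 corresponds to "A"
        commands.append(f"f instr_{letter} @ 0x{target:016x}")
    return commands


for command in flag_commands(JUMP_TABLE):
    print(command)
```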

Ok, so these offsets were not on the graph, so it is time to define basic blocks for them!

r2 tip: You can define basic blocks using the afb+ command. You have to supply what function the block belongs to, where it starts, and what its size is. If the block ends in a jump, you have to specify where it jumps to. If the jump is a conditional jump, the false branch's destination address should be specified too.

We can get the start and end addresses of these basic blocks from the full disasm of vmloop.

As I've mentioned previously, the function itself is pretty short, and easy to read, especially with our annotations. But a promise is a promise, so here is how we can create the missing basic blocks for the instructions:

[0x00400ec0]> afb+ 0x00400a45 0x00400a80 0x00400ab6-0x00400a80 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400ab6 0x00400aec-0x00400ab6 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400aec 0x00400b17-0x00400aec 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400b17 0x00400b42-0x00400b17 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400b42 0x00400b6d-0x00400b42 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400b6d 0x00400b99-0x00400b6d 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400b99 0x00400bc1-0x00400b99 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400bc1 0x00400be5-0x00400bc1 0x400c15

[0x00400ec0]> afb+ 0x00400a45 0x00400be5 0x00400c04-0x00400be5 0x400c15
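The pattern in these commands is regular: each block starts at a handler address, ends where the next handler (or 0x400c04) begins, and jumps to 0x400c15. The Python sketch below regenerates equivalent commands with the sizes already evaluated. The variable and function names are invented, and like the commands above it assumes that each handler occupies the whole gap up to the next one.

```python
FUNC = 0x00400A45      # function the blocks belong to
JUMP_TO = 0x400C15     # common jump destination of every handler
END = 0x400C04         # first address after the last handler (not_instr)

# Handler addresses taken from the jump table, in no particular order.
HANDLERS = [0x400A80, 0x400B6D, 0x400B17, 0x400AEC, 0x400BC1,
            0x400B42, 0x400BE5, 0x400AB6, 0x400B99]


def afb_commands():
    """Build one afb+ command per handler, sized up to the next handler."""
    bounds = sorted(HANDLERS) + [END]
    return [
        f"afb+ 0x{FUNC:08x} 0x{start:08x} 0x{end - start:x} 0x{JUMP_TO:x}"
        for start, end in zip(bounds, bounds[1:])
    ]


for command in afb_commands():
    print(command)
```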

It is also apparent from the disassembly that besides the instructions there are three more basic blocks. Let's create them too!