>
r2 tip: You can save a project using Ps [file], and load one using Po [file]. With the -p option, you can load a project when starting r2.
We can list all the strings r2 found:
[0x00400720]> fs strings
[0x00400720]> f
0x00400e98 7 str.Wrong_
0x00400e9f 27 str.We_are_in_the_outer_space_
0x00400f80 18 str.Size_of_data:__u_n
0x00400f92 23 str.Such_VM__MuCH_reV3rse_
0x00400fa9 16 str.Use_everything_
0x00400fbb 9 str.flag.txt
0x00400fc7 26 str.You_won__The_flag_is:__s_n
0x00400fe1 21 str.Your_getting_closer_
[0x00400720]>
r2 tip: r2 puts so called flags on important/interesting offsets, and organizes these flags into flagspaces (strings, functions, symbols, etc.) You can list all flagspaces using fs, and switch the current one using fs [flagspace] (the default is *, which means all the flagspaces). The command f prints all flags from the currently selected flagspace(s).
OK, the strings looks interesting, especially the one at 0x00400f92. It seems to hint that this crackme is based on a virtual machine. Keep that in mind!
These strings could be a good starting point if we were talking about a real-life application with many-many features. But we are talking about a crackme, and they tend to be small and simple, and focused around the problem to be solved. So I usually just take a look at the entry point(s) and see if I can figure out something from there. Nevertheless, I'll show you how to find where these strings are used:
[0x00400720]> axt @@=`f~[0]`
d 0x400cb5 mov edi, str.Size_of_data:__u_n
d 0x400d1d mov esi, str.Such_VM__MuCH_reV3rse_
d 0x400d4d mov edi, str.Use_everything_
d 0x400d85 mov edi, str.flag.txt
d 0x400db4 mov edi, str.You_won__The_flag_is:__s_n
d 0x400dd2 mov edi, str.Your_getting_closer_
r2 tip: We can list crossreferences to addresses using the axt [addr] command (similarly, we can use axf to list references from the address). The @@ is an iterator, it just runs the command once for every arguments listed.
The argument list in this case comes from the command f~[0]. It lists the strings from the executable with f, and uses the internal grep command ~ to select only the first column ([0]) that contains the strings' addresses.
.main
As I was saying, I usually take a look at the entry point, so let's just do that:
[0x00400720]> s main
[0x00400c63]>
r2 tip: You can go to any offset, flag, expression, etc. in the executable using the s command (seek). You can use references, like $$ (current offset), you can undo (s-) or redo (s+) seeks, search strings (s/ [string]) or hex values (s/x 4142), and a lot of other useful stuff. Make sure to check out s?!
Now that we are at the beginning of the main function, we could use p to show a disassembly (pd, pdf), but r2 can do something much cooler: it has a visual mode, and it can display graphs similar to IDA, but way cooler, since they are ASCII-art graphs :)
r2 tip: The command family p is used to print stuff. For example it can show disassembly (pd), disassembly of the current function (pdf), print strings (ps), hexdump (px), base64 encode/decode data (p6e, p6d), or print raw bytes (pr) so you can for example dump parts of the binary to other files. There are many more functionalities, check ?!
R2 also has a minimap view which is incredibly useful for getting an overall look at a function:
r2 tip: With command V you can enter the so-called visual mode, which has several views. You can switch between them using p and P. The graph view can be displayed by hitting V in visual mode (or using VV at the prompt).
Hitting p in graph view will bring up the minimap. It displays the basic blocks and the connections between them in the current function, and it also shows the disassembly of the currently selected block (marked with @@@@@ on the minimap). You can select the next or the previous block using the ** and the ** keys respectively. You can also select the true or the false branches using the t and the f keys.
It is possible to bring up the prompt in visual mode using the : key, and you can use o to seek.
Lets read main node-by-node! The first block looks like this:
We can see that the program reads a word (2 bytes) into the local variable named local_10_6, and than compares it to 0xbb8. Thats 3000 in decimal:
[0x00400c63]> ? 0xbb8
3000 0xbb8 05670 2.9K 0000:0bb8 3000 10111000 3000.0 0.000000f 0.000000
r2 tip: yep, ? will evaluate expressions, and print the result in various formats.
If the value is greater than 3000, then it will be forced to be 3000:
There are a few things happening in the next block:
First, the "Size of data: " message we saw when we run the program is printed. So now we know that the local variable local_10_6 is the size of the input data - so lets name it accordingly (remember, you can open the r2 shell from visual mode using the : key!):
:> afvn local_10_6 input_size
r2 tip: The af command family is used to analyze functions. This includes manipulating arguments and local variables too, which is accessible via the afv commands. You can list function arguments (afa), local variables (afv), or you can even rename them (afan, afvn). Of course there are lots of other features too - as usual: use the "?", Luke!
After this an input_size bytes long memory chunk is allocated, and filled with data from the standard input. The address of this memory chunk is stored in local_10 - time to use afvn again:
:> afvn local_10 input_data
We've almost finished with this block, there are only two things remained. First, an 512 (0x200) bytes memory chunk is zeroed out at offset 0x00602120. A quick glance at XREFS to this address reveals that this memory is indeed used somewhere in the application:
:> axt 0x00602120
d 0x400cfe mov edi, 0x602120
d 0x400d22 mov edi, 0x602120
d 0x400dde mov edi, 0x602120
d 0x400a51 mov qword [rbp - 8], 0x602120
Since it probably will be important later on, we should label it:
:> f sym.memory 0x200 0x602120
r2 tip: Flags can be managed using the f command family. We've just added the flag sym.memory to a 0x200 bytes long memory area at 0x602120. It is also possible to remove (f-name), rename (fr [old] [new]), add comment (fC [name] [cmt]) or even color (fc [name] [color]) flags.
While we are here, we should also declare that memory chunk as data, so it will show up as a hexdump in disassembly view:
:> Cd 0x200 @ sym.memory
r2 tip: The command family C is used to manage metadata. You can set (CC) or edit (CC) comments, declare memory areas as data (Cd), strings (Cs), etc. These commands can also be issued via a menu in visual mode invoked by pressing d.
The only remaining thing in this block is a function call to 0x400a45 with the input data as an argument. The function's return value is compared to "*", and a conditional jump is executed depending on the result.
Earlier I told you that this crackme is probably based on a virtual machine. Well, with that information in mind, one can guess that this function will be the VM's main loop, and the input data is the instructions the VM will execute. Based on this hunch, I've named this function vmloop, and renamed input_data to bytecode and input_size to bytecode_length. This is not really necessary in a small project like this, but it's a good practice to name stuff according to their purpose (just like when you are writing programs).
:> af vmloop 0x400a45
:> afvn input_size bytecode_length
:> afvn input_data bytecode
r2 tip: The af command is used to analyze a function with a given name at the given address. The other two commands should be familiar from earlier.
After renaming local variables, flagging that memory area, and renaming the VM loop function the disassembly looks like this:
So, back to that conditional jump. If vmloop returns anything else than "*", the program just exits without giving us our flag. Obviously we don't want that, so we follow the false branch.
Now we see that a string in that 512 bytes memory area (sym.memory) gets compared to "Such VM! MuCH reV3rse!". If they are not equal, the program prints the bytecode, and exits:
OK, so now we know that we have to supply a bytecode that will generate that string when executed. As we can see on the minimap, there are still a few more branches ahead, which probably means more conditions to meet. Lets investigate them before we delve into vmloop!
If you take a look at the minimap of the whole function, you can probably recognize that there is some kind of loop starting at block