var int local_38h @ rsp+0x38
var int local_45h @ rsp+0x45
var int local_46h @ rsp+0x46
var int local_47h @ rsp+0x47
var int local_48h @ rsp+0x48
A less commonly used feature, still under heavy development, is the distinction between variables being read and variables being written. You can list the variables that are read with the afvR command and those that are written with the afvW command. Both commands list the addresses where these operations are performed:
[0x00003b92]> afvR
local_48h 0x48ee
local_30h 0x3c93,0x520b,0x52ea,0x532c,0x5400,0x3cfb
local_10h 0x4b53,0x5225,0x53bd,0x50cc
local_8h 0x4d40,0x4d99,0x5221,0x53b9,0x50c8,0x4620
local_28h 0x503a,0x51d8,0x51fa,0x52d3,0x531b
local_38h
local_45h 0x50a1
local_47h
local_46h
local_32h 0x3cb1
[0x00003b92]> afvW
local_48h 0x3adf
local_30h 0x3d3e,0x4868,0x5030
local_10h 0x3d0e,0x5035
local_8h 0x3d13,0x4d39,0x5025
local_28h 0x4d00,0x52dc,0x53af,0x5060,0x507a,0x508b
local_38h 0x486d
local_45h 0x5014,0x5068
local_47h 0x501b
local_46h 0x5083
local_32h
[0x00003b92]>
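The two listings can be combined to spot variables that are written but never read, or read but never written. The following is a minimal sketch in Python, assuming the output has been captured as text. The helper name parse_accesses is hypothetical, and the sample data is a subset of the listings above.

```python
def parse_accesses(output):
    """Map each variable name to the list of addresses that access it."""
    accesses = {}
    for line in output.strip().splitlines():
        # A line is "<name> <addr>,<addr>,..." or just "<name>" when no access exists.
        name, _, addrs = line.strip().partition(" ")
        accesses[name] = [int(a, 16) for a in addrs.split(",") if a]
    return accesses


# Subset of the afvR (read) listing shown above
AFVR = """\
local_48h 0x48ee
local_38h
local_45h 0x50a1
local_47h
local_46h
local_32h 0x3cb1
"""

# Matching subset of the afvW (write) listing
AFVW = """\
local_48h 0x3adf
local_38h 0x486d
local_45h 0x5014,0x5068
local_47h 0x501b
local_46h 0x5083
local_32h
"""

reads = parse_accesses(AFVR)
writes = parse_accesses(AFVW)

# Variables with at least one write and no reads, and the reverse
write_only = sorted(v for v, w in writes.items() if w and not reads.get(v))
read_only = sorted(v for v, r in reads.items() if r and not writes.get(v))

print("written, never read:", write_only)
print("read, never written:", read_only)
```

In this sample local_38h, local_46h and local_47h are only written, while local_32h is only read. Because the feature is still under development, such results are better treated as hints than as proof of dead stores.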
Type inference
Type inference for local variables and arguments is integrated with the afta command.
Let's see an example of this with a simple hello_world binary:
[0x000007aa]> pdf
| ;-- main:
/ (fcn) sym.main 157
| sym.main ();
| ; var int local_20h @ rbp-0x20
| ; var int local_1ch @ rbp-0x1c
| ; var int local_18h @ rbp-0x18
| ; var int local_10h @ rbp-0x10
| ; var int local_8h @ rbp-0x8
| ; DATA XREF from entry0 (0x6bd)
| 0x000007aa push rbp
| 0x000007ab mov rbp, rsp
| 0x000007ae sub rsp, 0x20
| 0x000007b2 lea rax, str.Hello ; 0x8d4 ; "Hello"
| 0x000007b9 mov qword [local_18h], rax
| 0x000007bd lea rax, str.r2_folks ; 0x8da ; " r2-folks"
| 0x000007c4 mov qword [local_10h], rax
| 0x000007c8 mov rax, qword [local_18h]
| 0x000007cc mov rdi, rax
| 0x000007cf call sym.imp.strlen ; size_t strlen(const char *s)
• After applying afta
[0x000007aa]> afta
[0x000007aa]> pdf
| ;-- main:
| ;-- rip:
/ (fcn) sym.main 157
| sym.main ();
| ; var size_t local_20h @ rbp-0x20
| ; var size_t size @ rbp-0x1c
| ; var char *src @ rbp-0x18
| ; var char *s2 @ rbp-0x10
| ; var char *dest @ rbp-0x8
| ; DATA XREF from entry0 (0x6bd)
| 0x000007aa push rbp
| 0x000007ab mov rbp, rsp
| 0x000007ae sub rsp, 0x20
| 0x000007b2 lea rax, str.Hello ; 0x8d4 ; "Hello"
| 0x000007b9 mov qword [src], rax
| 0x000007bd lea rax, str.r2_folks ; 0x8da ; " r2-folks"
| 0x000007c4 mov qword [s2], rax
| 0x000007c8 mov rax, qword [src]
| 0x000007cc mov rdi, rax ; const char *s
| 0x000007cf call sym.imp.strlen ; size_t strlen(const char *s)
It also extracts type information from format strings such as printf ("fmt : %s , %u , %d", ...). The format specifications are read from anal/d/spec.sdb.
You can create a new profile that specifies a set of format characters for a particular library, operating system, or programming language, like this:
win=spec
spec.win.u32=unsigned int
Then switch the default specification to the newly created one with this config variable: e anal.spec = win
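The idea of mapping format characters to types can be sketched in a few lines of Python. This is only an illustration: the SPEC_TYPES table and the infer_arg_types helper are hypothetical and do not reproduce the contents of spec.sdb or the way radare2 parses format strings. Flags, widths, and length modifiers are ignored for brevity.

```python
import re

# Hypothetical profile: conversion character -> C type (not taken from spec.sdb)
SPEC_TYPES = {
    "s": "char *",
    "u": "unsigned int",
    "d": "int",
}


def infer_arg_types(fmt, spec_types=SPEC_TYPES):
    """Return the C type implied by each conversion in a printf-style string."""
    types = []
    for match in re.finditer(r"%([%a-zA-Z])", fmt):
        conv = match.group(1)
        if conv == "%":
            continue  # "%%" is a literal percent sign and consumes no argument
        types.append(spec_types.get(conv, "unknown"))
    return types


print(infer_arg_types("fmt : %s , %u , %d"))
```

Swapping in a different table models the effect of selecting another profile: the same format string yields different inferred types.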
For more information about primitive and user-defined types support in radare2, refer to the types chapter.
Types
Radare2 supports describing data types in C syntax. These types are parsed by a C11-compatible parser and stored in the internal SDB, so they can be inspected with the k command.
Most of the related commands are located in the t namespace:
[0x000051c0]> t?
| Usage: t # cparse types commands
| t List all loaded types
| tj List all loaded types as json
| t
| t* List types info in r2 commands
| t-
| t-* Remove all types
| tail [filename] Output the last part of files
| tc [type.name] List all/given types in C output format
| te[?] List all loaded enums
| td[?]
| tf List all loaded functions signatures
| tk
| tl[?] Show/Link type to an address
| tn[?] [-][addr] manage noreturn function attributes and marks
| to - Open cfg.editor to load types
| to
| toe [type.name] Open cfg.editor to edit types
| tos
| tp
| tpv
| tpx
| ts[?] Print loaded struct types
| tu[?] Print loaded union types
| tx[f?] Type xrefs
| tt[?] List all loaded typedefs
Note that the basic (atomic) types are not those from the C standard: not char, _Bool, or short. Because the size of those types can differ from one platform to another, radare2 uses fixed-width types such as int8_t or uint64_t and converts int to int32_t or int64_t depending on the binary or debuggee platform/compiler.
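To see why fixed-width types are the safer base, it helps to compare the sizes of standard C types with their fixed-width counterparts. The sketch below uses Python's ctypes module and reports sizes for the host running the script, not for any binary under analysis. The dictionary and function names are chosen for this illustration only.

```python
import ctypes

# Standard C types: their size depends on the platform and compiler
PLATFORM_TYPES = {
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
}

# Fixed-width types: their size is the same everywhere
FIXED_TYPES = {
    "int8_t": ctypes.c_int8,
    "int16_t": ctypes.c_int16,
    "int32_t": ctypes.c_int32,
    "int64_t": ctypes.c_int64,
}


def sizes(types):
    """Return {type name: size in bytes} for the given ctypes types."""
    return {name: ctypes.sizeof(ctype) for name, ctype in types.items()}


for name, size in {**sizes(PLATFORM_TYPES), **sizes(FIXED_TYPES)}.items():
    print(f"{name:10} {size} bytes")
```

The size reported for long commonly differs between 64-bit Linux and 64-bit Windows, while the fixed-width rows stay the same on every host.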
Basic types can be listed with the t command. For structured types, use ts for structs, tu for unions, or te for enums:
[0x000051c0]> t
char
char *
int
int16_t
int32_t
int64_t
int8_t
long
long long
...
Loading types
There are three easy ways to define a new type:
• Directly from the string using td command
• From the file using to
• Open an $EDITOR to type the definitions in place using to -
[0x000051c0]> "td struct foo {char* a; int b;}"
[0x000051c0]> cat ~/radare2-regressions/bins/headers/s3.h
struct S1 {
int x[3];
int y[4];
int z;
};
[0x000051c0]> to ~/radare2-regressions/bins/headers/s3.h
[0x000051c0]> ts
foo
S1
Also note that there is a config option that specifies the include directories for type parsing:
[0x00000000]> e??~dir.type
dir.types: Default path to look for cparse type files
[0x00000000]> e dir.types
/usr/include
Printing types
Note that the example below uses the ts command, which converts the C type description (or, to be precise, its SDB representation) into a pf format string. See more about print format.
The tp command uses that pf string to print all the members of a type at the current offset or at a given address:
[0x000051c0]> ts foo
pf zd a b
[0x000051c0]> tp foo
a : 0x000051c0 = 'hello'
b : 0x000051cc = 10
[0x000051c0]> tp foo 0x000053c0
a : 0x000053c0 = 'world'
b : 0x000053cc = 20
You can also supply your own data for the struct and print it using the tpx command:
[0x000051c0]> tpx foo 4141414144141414141442001000000
a : 0x000051c0 = AAAAD.....B
b : 0x000051cc = 16
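The general idea behind tpx, interpreting a caller-supplied byte sequence as a typed structure, can be sketched with Python's struct module. The example uses the S1 layout from s3.h and assumes a 4-byte little-endian int with no padding. The cast_s1 helper and the sample bytes are hypothetical and are not radare2 output.

```python
import struct

# struct S1 { int x[3]; int y[4]; int z; } as a little-endian format string,
# assuming a 4-byte int and no padding between members
S1_FORMAT = "<3i4ii"


def cast_s1(hexpairs):
    """Decode a hex string into the members of S1."""
    raw = bytes.fromhex(hexpairs)
    size = struct.calcsize(S1_FORMAT)
    if len(raw) < size:
        raise ValueError(f"need {size} bytes, got {len(raw)}")
    values = struct.unpack(S1_FORMAT, raw[:size])
    return {"x": list(values[0:3]), "y": list(values[3:7]), "z": values[7]}


# Build a sample byte sequence so the example is self-contained
sample = struct.pack(S1_FORMAT, 1, 2, 3, 4, 5, 6, 7, 8).hex()
decoded = cast_s1(sample)

for member, value in decoded.items():
    print(f"{member} : {value}")
```

As with tp, nothing is stored by this kind of cast: the bytes are simply viewed through the type once.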
Linking Types
The tp command performs only a temporary cast. To link an address or variable with a chosen type, use the tl command, which stores the relationship in SDB.
[0x000051c0]> tl S1 = 0x51cf
[0x000051c0]> tll
(S1)
x : 0x000051cf = [ 2315619660, 1207959810, 34803085 ]
y : 0x000051db = [ 2370306049, 4293315645, 3860201471, 4093649307 ]
z : 0x000051eb = 4464399
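The member addresses in this listing follow directly from the layout of S1: each address is the link address plus the member's offset within the struct. A minimal sketch with Python's ctypes reproduces that arithmetic, assuming a 4-byte int. The member_addresses helper is hypothetical.

```python
import ctypes


class S1(ctypes.Structure):
    # Mirrors struct S1 from s3.h; c_int32 encodes the 4-byte int assumption
    _fields_ = [
        ("x", ctypes.c_int32 * 3),
        ("y", ctypes.c_int32 * 4),
        ("z", ctypes.c_int32),
    ]


def member_addresses(struct_type, base):
    """Return {member: absolute address} for a struct linked at base."""
    return {
        name: base + getattr(struct_type, name).offset
        for name, _ in struct_type._fields_
    }


addrs = member_addresses(S1, 0x51CF)
for name, addr in addrs.items():
    print(f"{name} : 0x{addr:08x}")
```

With the link address 0x51cf this yields 0x000051cf, 0x000051db and 0x000051eb for x, y and z, which matches the addresses printed by tll.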
Moreover, the link will be shown in the disassembly output or visual mode:
[0x000051c0 15% 300 /bin/ls]> pd $r @ entry0
;-- entry0:
0x000051c0 xor ebp, ebp
0x000051c2 mov r9, rdx
0x000051c5 pop rsi
0x000051c6 mov rdx, rsp
0x000051c9 and rsp, 0xfffffffffffffff0
0x000051cd push rax
0x000051ce push rsp
(S1)
x : 0x000051cf = [ 2315619660, 1207959810, 34803085 ]
y : 0x000051db = [ 2370306049, 4293315645, 3860201471, 4093649307 ]
z : 0x000051eb = 4464399
0x000051f0 lea rdi, loc._edata ; 0x21f248
0x000051f7 push rbp
0x000051f8 lea rax, loc._edata ; 0x21f248
0x000051ff cmp rax, rdi
0x00005202 mov rbp, rsp
Once the struct is linked, radare2 tries to propagate the structure offsets within the function at the current offset. To run this analysis on the whole program, or on selected functions, after all structs are linked, use the aat command:
[0x00000000]> aa?
| aat [fcn] Analyze all/given function to convert immediate to linked structure offsets (see tl?)
Note that the emulation is sometimes inaccurate, as in the example below:
|0x000006da push rbp
|0x000006db mov rbp, rsp
|0x000006de sub rsp, 0x10
|0x000006e2 mov edi, 0x20 ; "@"
|0x000006e7 call sym.imp.malloc ; void *malloc(size_t size)
|0x000006ec mov qword [local_8h], rax
|0x000006f0 mov rax, qword [local_8h]
The return value of malloc may differ between two emulations, so you have to set a hint for the return value manually with the ahr command. Run tl or aat after setting the return value hint.
[0x000006da]> ah?
| ahr val set hint for return value of a function
Structure Immediates
There is one more important aspect of using types in radare2: with aht you can change an immediate in an opcode to a structure offset. Let's see a simple example of [R]SI-relative addressing:
[0x000052f0]> pd 1
0x000052f0 mov rax, qword [rsi + 8] ; [0x8:8]=0
Here 8 is an offset in memory, and rsi probably holds a pointer to some structure. Imagine that we have the following structures:
[0x000052f0]> "td struct ms { char b[8]; int member1; int member2; };"
[0x000052f0]> "td struct ms1 { uint64_t a; int member1; };"
[0x000052f0]> "td struct ms2 { uint16_t a; int64_t b; int member1; };"
Now we need to set the proper structure member offset instead of 8 in this instruction. At first, we need to list available types matching this offset:
[0x000052f0]> ahts 8
ms.member1
ms1.member1
Note that ms2 is not listed because it has no member at offset 8. After listing the available options, we can link the chosen one to the instruction at the current address:
[0x000052f0]> aht ms1.member1
[0x000052f0]> pd 1
0x000052f0 488b4608 mov rax, qword [rsi + ms1.member1] ; [0x8:8]=0
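The offset lookup can be modeled by accumulating member sizes for each struct and reporting the members that start at the requested offset. The sketch below assumes members are laid out back to back without padding. That is an inference from the listing above, not a confirmed statement about radare2 internals: with natural C alignment, ms2.b would typically start at offset 8, yet ms2 is not listed. The STRUCTS table and members_at helper are hypothetical.

```python
# Member names and sizes in bytes for the three structs defined above,
# assuming a 4-byte int
STRUCTS = {
    "ms": [("b", 8), ("member1", 4), ("member2", 4)],
    "ms1": [("a", 8), ("member1", 4)],
    "ms2": [("a", 2), ("b", 8), ("member1", 4)],
}


def members_at(offset, structs=STRUCTS):
    """List "struct.member" names whose member starts at the given offset."""
    matches = []
    for struct_name, fields in structs.items():
        position = 0  # running offset, no padding inserted between members
        for field_name, size in fields:
            if position == offset:
                matches.append(f"{struct_name}.{field_name}")
            position += size
    return matches


print(members_at(8))
```

Under this packed layout the members of ms2 start at offsets 0, 2 and 10, so only ms.member1 and ms1.member1 match offset 8.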
Managing enums
• Printing all fields in enum using te command
[0x00000000]> "td enum Foo {COW=1,BAR=2};"
[0x00000000]> te Foo
COW = 0x1
BAR = 0x2
• Finding the matching enum member for a given bitfield, and vice versa
[0x00000000]> te Foo 0x1
COW
[0x00000000]> teb Foo COW
0x1
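The same two-way lookup can be expressed with Python's enum module, which may be convenient when post-processing values outside radare2. This is a sketch that mirrors the Foo enum defined above. The helper names name_of and value_of are hypothetical.

```python
from enum import IntEnum


class Foo(IntEnum):
    # Mirrors: enum Foo {COW=1,BAR=2};
    COW = 1
    BAR = 2


def name_of(value):
    """Value -> member name, similar in spirit to `te Foo 0x1`."""
    return Foo(value).name


def value_of(name):
    """Member name -> value, similar in spirit to `teb Foo COW`."""
    return Foo[name].value


print(name_of(0x1))
print(hex(value_of("COW")))
```

Looking up a value that has no member raises ValueError, and an unknown name raises KeyError, so callers should handle both cases.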