The challenge and its firmware could be accessed in this link. Basically, this challenge is all about password guess against authentication. This firmware is built for arduino boards which run on atmega328p processor.
I do know some people solved it with a great amount of reverse engineering effort; some even mystified the use of z3 solver to distract others from the focus on reverse engineering. That is to say, once one understands how this authentication is implemented, writing a z3 script is rather normal and natural.
Manual analysis
Ghidra shit
Honestly, I got stuck in reverse engineering its auth logic at the beginning since pseudo code shown in ghidra’s decompiler for avr8 architecture is very non-understanable. For example, a function involves checking input_10 * input_11 == 0x2873
, however, ghidra’s decompilation makes things look so painful. Beware that my ghidra version is ghidra_11.0.2_PUBLIC_20240326
.
uint check_10_11(void){
R20 = PHY_CC_CCA;
R21 = 0;
R18 = CCA_THRES;
R19 = 0;
R1R0 = (uint)PHY_CC_CCA * (uint)CCA_THRES;
R25R24._1_1_ = (char)((uint)R1R0 >> 8);
R25R24._0_1_ = (byte)R1R0;
R1R0 = 0;
R25R24._1_1_ = R25R24._1_1_ - (((byte)R25R24 < 0x73) + '(');
if ((byte)R25R24 == 0x73 && R25R24._1_1_ == 00) {
R25R24 = CONCAT11(IRQ_STATUS,IRQ_MASK) | 0x800;
IRQ_STATUS = R25R24._1_1_;
}
else {
IRQ_STATUS = 0;
IRQ_MASK = 0;
}
return R25R24;
}
I don’t know how it looks like in IDA because I have no money on its expensive license. Feel it on your own..
Simavr
Although reverse enginnering all check functions is possible in time, I do not want to sacrifice a great amount of my time that should belong to a cup of coffee and outdoor activities. So, try whatever to get it run. Simavr is an AVR simulator that allows me to run this atmega328p firmware and supports execution trace as well as debugging. In its example
folder, its base simulator run_avr
has been wrapped around with proper configuration to make an arduino emulator simduino
. This configuration involves frequency, processor, and uart pty connection.
user@b09bd05ddefb:~$ ./simavr/examples/board_simduino/obj-x86_64-linux-gnu/simduino.elf -d jumpy/jumpy.hex -t
00ca: ldi r25, 0x00
->> r25=00
00cc: ld r18, (Y+1[030b])=[01] [Stack]
->> r18=0d
00ce: movw ZL:ZH, r24:r25[00c6]
->> ZL=c6 ZH=00
00d0: st (Z+0[00c6]), r18[0d] io:c6
Another shell for picocom that handles uart communication with the emulated firmware.
./picocom/picocom /tmp/simavr-uart0 --echo --omap crcrlf
Input: 1234
Better luck next time!
Note that I made some changes in simavr/examples/board_simduino/simduino.c
to enable instruction trace.
// even if not setup at startup, activate gdb if crashing
avr->gdb_port = 1234;
if (debug) {
avr->state = cpu_Stopped;
avr_gdb_init(avr);
}
// enable instruction trace
if (trace)
avr->trace = 1;
However, I can’t figure out execution flow, e.g., which check function is executed first since the generated log is so messy and length and makes it so hard to determine the order of two adjacent check functions.
Debugging with simavr
Fortunately, simavr offers users a gdb debug portal. I located user inputs in memory and set up watchpoints on them.
Type [C-a] [C-h] to see available commands
Terminal ready
Input: whereismyinput
Turn to gdb and take a look at memory layout. (use info mem
because info proc mappings
is not supported in this target/architecture). Print all strings in the sram section then. Note that avr-gdb
intentionally places sram at 0x8000000. Find more details in this link.
AVR_IMEM_START = 0x00000000, /* INSN memory */
AVR_SMEM_START = 0x00800000, /* SRAM memory */
Now connect to gdbserver and check basic info:
(gdb) target remote localhost:1234
(gdb) c
Continuing.
(gdb) info mem
Using memory regions provided by the target.
Num Enb Low Addr High Addr Attrs
0 y 0x00000000 0x00008000 flash blocksize 0x80 nocache
1 y 0x00800000 0x00800900 rw nocache
(gdb) find /b 0x800100, +1000, 0x77, 0x68, 0x65, 0x72
0x80013e
(gdb) x/10s 0x800100
0x800100: "\r\nFLAG:D0_you_3ven_ROP?"
0x800118: "\r\n"
0x80011b: "\r\nBetter luck next time!\r\n"
0x800136: "Input: "
0x80013e: "whereismyinput"
Put watchpoints on those addresses and continue execution. I only present parts of logs due to too many hittings:
info b
Num Type Disp Enb Address What
2 read watchpoint keep y *(char*)0x80013e
breakpoint already hit 1 time
3 read watchpoint keep y *(char*)(0x80013e+1)
breakpoint already hit 1 time
4 read watchpoint keep y *(char*)(0x80013e+2)
breakpoint already hit 1 time
5 read watchpoint keep y *(char*)(0x80013e+3)
breakpoint already hit 1 time
6 read watchpoint keep y *(char*)(0x80013e+4)
breakpoint already hit 1 time
7 read watchpoint keep y *(char*)(0x80013e+5)
breakpoint already hit 1 time
8 read watchpoint keep y *(char*)(0x80013e+6)
breakpoint already hit 1 time
9 read watchpoint keep y *(char*)(0x80013e+7)
breakpoint already hit 1 time
10 read watchpoint keep y *(char*)(0x80013e+8)
breakpoint already hit 1 time
11 read watchpoint keep y *(char*)(0x80013e+9)
breakpoint already hit 1 time
12 read watchpoint keep y *(char*)(0x80013e+10)
breakpoint already hit 1 time
13 read watchpoint keep y *(char*)(0x80013e+11)
breakpoint already hit 1 time
14 read watchpoint keep y *(char*)(0x80013e+12)
breakpoint already hit 1 time
Value = 49 '1'
0x00000326 in ?? ()
(gdb) info r $pc
pc 0x193 0x326
(gdb) c
Continuing.
Hardware read watchpoint 3: *(char*)(0x80013e+1)
Value = 50 '2'
0x00000326 in ?? ()
(gdb) info r $pc
pc 0x193 0x326
(gdb) c
Continuing.
Hardware read watchpoint 4: *(char*)(0x80013e+2)
Value = 52 '4'
0x00000326 in ?? ()
(gdb) info r $pc
pc 0x193 0x326
(gdb) c
Continuing.
From now on, I could easily idenfity which function deals with some specific characters. Basically, this authentication will first check the length of user inputs. If this is satisfied, then run them through further check functions. Overall, the analyzed ghidra zip file is here.
Write a z3 script
I demonstrate my exploit script straightaway because there is no point to discuss those detals anymore and this post aims for automated analysis. Since I already know the required length of user inputs is 13, I create 13 symbolic variables and apply constraints resolved from each check function on them.
import z3
from z3 import *
input_0 = Real("0")
input_1 = Real("1")
input_2 = Real("2")
input_3 = Real("3")
input_4 = Real("4")
input_5 = Real("5")
input_6 = Real("6")
input_7 = Real("7")
input_8 = Real("8")
input_9 = Real("9")
input_10 = Real("10")
input_11 = Real("11")
input_12 = Real("12")
s = Solver()
s.add(input_0 * input_1 == 0x13b7, input_0 > 0, input_1 > 0)
s.add(input_1 + input_2 == 0xa7, input_1 > 0, input_2 > 0)
s.add(input_2 * input_3 == 0x1782, input_2 >0 , input_3 > 0)
s.add(input_3 + input_4 == 0x92, input_3 > 0, input_4 > 0)
s.add(input_4 * input_5 == 0x122f, input_4 > 0, input_5 > 0)
s.add(input_5 + input_6 == 0xa5, input_5 > 0, input_6 > 0)
s.add(input_6 * input_7 == 0x2b0c, input_6 > 0, input_7 > 0)
s.add(input_7 + input_8 == 0xd3, input_7 > 0, input_8 > 0)
s.add(input_8 * input_9 == 0x15c0, input_8 > 0, input_9 > 0)
s.add(input_9 + input_10 == 0x8f, input_9 > 0, input_10 > 0)
s.add(input_10 * input_11 == 0x2873, input_10 > 0, input_11 > 0)
s.add(input_11 + input_12 == 0xa0, input_11 > 0, input_12 > 0)
s.add(0xd * input_12 == 0x297, input_12 > 0)
if s.check() == sat:
m = s.model()
flag = []
for i in m:
flag.append(chr(m[i].as_long()))
flag.reverse()
print("".join(flag))
Automated analysis
A long journey to cretea an avatar’s target for simduino
The terminology target
in avatar2 is the topmost unit that users use to manipulate targets, such as hardware, emulator, and analysis framework. Creating a new target requires supports of protocols such as gdb protocol, openocd protocol, qmp protocol. Those protocols are designed to be decoupled from target code space in order to keep avatar’s code slim and well-structured. Moreover, protocols, as we have seen similar concepts in network programming, are used to establish connections with endpoints, which ultimately refer to actual targets as mentioned earlier.
In this case, building a new target for simduino
actually underwent a few attempts. At the very beginning, I was planning to incorporate run_avr
this base simulator into the new target because it offers more configuration, such that this target could be compatible with other avr8-based processors. However, I realized that dealing with uart pty connection needs to be done programatically by using uart pty interfaces while extending from run_avr
is pretty cubersome even though I think it is doable. I do not have too much ambition on this project. Much easier a solution is, much sonner I start working on analysis. Thus, the straightaway solution is to incorporate simduino
into a target and I only need to create a file that stores register profiles used for gdb protocol. Moreover, gdb protocol has been supported already, this is why I just need to import register profile to enable programatic control over gdb in avatar2.
Adding AVR as a new arch class of avatar2
Since avr-gdb
uses a pesudo PC register, looking for its register profile is kind of confusing. Furthermore, I am not sure if the index of a register conforms what I see in gdb logs. I did refer to angr
that has already implemented avr8 at the link. Howeverr, I got more confused when noticing the index of pc
as well as no pesudo PC register.
(gdb) info r
...
r31 0x0 0
SREG 0x21 33
SP 0x30a 0x80030a
PC2 0xd2 210
pc 0x69 0xd2
Clearly, I see two pc registers in gdb. The address in the pesudo register pc
refers to file offset while that in PC2
indicates loading offsets.
After researching the gdb implementation of simavr
, I got some useful information:
static int
gdb_write_register(
avr_gdb_t * g,
int regi,
uint8_t * src )
{
switch (regi) {
case 0 ... 31:
g->avr->data[regi] = *src;
return 1;
case 32:
g->avr->data[R_SREG] = *src;
SET_SREG_FROM(g->avr, *src);
return 1;
case 33:
g->avr->data[R_SPL] = src[0];
g->avr->data[R_SPH] = src[1];
return 2;
case 34:
g->avr->pc = src[0] | (src[1] << 8) | (src[2] << 16) | (src[3] << 24);
return 4;
}
return 1;
}
It is interesting that index 34 points to PC2
but there is nothing about the pesudo register pc
in simavr
’s gdb implementation. However, I realize that avr-gdb
maintains this pesudo register on it own somehow rather than acquires values from simavr
. Nevertheless, this is just my assumption since I do not have so much time on researching gdb internals.
Hence, following the order of occurance of registers presented by command info r
, the index of pc
is 35. I already validated the index in avatar2 and successfully read its corresponding value.
Building simduino target upon gdb protocol
Considering what this class needs to do during initialization? It only needs to Popen
a simduino
process with a gdb port open and connects gdb protocol to this port. gdb executable in this case should be avr-gdb
because gdb-multiarch
does not support this architecture. Finally, the target class looks like:
from subprocess import Popen
from distutils.spawn import find_executable as find
from avatar2.protocols.gdb import GDBProtocol
from avatar2.targets import Target, TargetStates
from .target import action_valid_decorator_factory, synchronize_state
from ..watchmen import watch
class SimduinoTarget(Target):
def __init__(
self,
avatar,
simduini_executable,
avr_gdb_executable,
firmware,
gdb_additional_args=None,
gdb_verbose=False,
**kwargs
):
super(SimduinoTarget, self).__init__(avatar, **kwargs)
self.fw = firmware
self.gdb_port = 1234
self.gdb_verbose = gdb_verbose
self.gdb_additional_args = gdb_additional_args if gdb_additional_args else []
if simduini_executable is not None:
isExist = find(simduini_executable)
else:
raise Exception("Simduino executable is not specified")
if not isExist:
raise Exception("Simduino executable is not found")
self.sim_executable = simduini_executable
if avr_gdb_executable is not None:
isExist = find(avr_gdb_executable)
else:
raise Exception("avr-gdb is not specified")
if not isExist:
raise Exception("avr-gdb is not found")
self.avr_gdb_executable = avr_gdb_executable
def shutdown(self):
if self._process is not None:
self._process.terminate()
self._process.wait()
self._process = None
super(SimduinoTarget, self).shutdown()
def init(self):
cmd_line = [self.sim_executable, "-d", self.fw]
with open(
"%s/%s_out.txt" % (self.avatar.output_directory, self.name), "wb"
) as out, open(
"%s/%s_err.txt" % (self.avatar.output_directory, self.name), "wb"
) as err:
self.log.debug("output directory: %s/%s_out.txt" % (self.avatar.output_directory, self.name))
self._process = Popen(cmd_line, stdout=out, stderr=err)
self.log.debug("Simduino command line: %s" % " ".join(cmd_line))
self.log.info("Simduino process running")
self._connect_protocols()
def _connect_protocols(self):
gdb = GDBProtocol(
gdb_executable=self.avr_gdb_executable,
arch=self.avatar.arch,
verbose=self.gdb_verbose,
additional_args=self.gdb_additional_args,
avatar=self.avatar,
origin=self,
)
self.protocols.set_all(gdb)
connect_success = gdb.remote_connect(port=self.gdb_port)
if connect_success:
self.log.info("Connected to remote target")
else:
self.log.warning("Connection to remote target failed")
self.wait()
@watch('TargetCont')
@action_valid_decorator_factory(TargetStates.INITIALIZED, 'execution')
@synchronize_state(TargetStates.RUNNING)
def run(self, blocking=True):
return self.protocols.execution.run()
def cont(self, blocking=True):
if self.state != TargetStates.INITIALIZED:
super(SimduinoTarget, self).cont(blocking=blocking)
else:
self.run()
The following code snippet is to create avr
arch class:
from avatar2.installer.config import GDB_MULTI
from .architecture import Architecture
class AVR(Architecture):
get_gdb_executable = Architecture.resolve(GDB_MULTI)
gdb_name = 'avr'
endian = 'little'
registers = { }
registers.update({'r%d' % i: i for i in range(0, 32)})
pc2_name = 'PC2' # load offset
pc_name = 'pc' # file offset
sr_name = 'SREG'
registers.update({'': 32})
registers.update({'SP': 33})
registers.update({'%s' % pc2_name: 34})
registers.update({'%s' % pc_name: 35})
Run sudo python3 setup.py install
in the topmost folder of avatar2.
Found limitation of simavr and avr-gdb while testing the built target
I manually debugged and confirmed that CPU states are the same between the direct use of avr-gdb
and the new target. Moreover, I Popen
ed picocom to establish an uart pty connection with simduino target to provide inputs and succeeded in locating the address of user inputs, which is the same as I found in avr-gdb
. Notably, I make use of this address to leverage symbolic execution by setting symbolic memory range properly.
Furthermore, in order to gain more information about memory layout, I extended an avatar’s built-in gdb plugin load_memory_mappings
which in turn could resolve memory rannges from info mem
in avr-gdb
. Those information could be useful when initializing an angr target later:
def load_memory_mappings(avatar, target, forward=False, update=True):
if not isinstance(target, GDBTarget) and not isinstance(target, SimduinoTarget):
raise TypeError("The memory mapping can be loaded ony from GDBTargets or SimduinoTarget")
# ...
if avatar.arch == AVR:
raw_data = resp.strip()
mappings = []
avatar.log.debug(raw_data.encode())
while raw_data.count("\n\n") >= 2:
x = raw_data[raw_data.rfind("\n\n"):].split()
if int(x[2], 16) == 0x0:
obj = "flash"
elif int(x[2], 16) == 0x800000:
obj = "sram"
else:
obj = "unknown"
mappings.append(
{
"start": int(x[2], 16),
"end": int(x[3], 16),
"size": int(x[3],16)-int(x[2],16),
"obj": obj
}
)
raw_data = raw_data[:raw_data.rfind("\n\n")-2]
else:
lines = resp.split("objfile")[-1].split("\n")
mappings = [
{
"start": int(x[0], 16),
"end": int(x[1], 16),
"size": int(x[2], 16),
"offset": int(x[3], 16),
"obj": x[4],
}
for x in [y.split() for y in lines if y != ""]
]
# ...
Furthermore, I used this newly built target to write a script that locates the address of user inputs:
from avatar2 import *
import logging
from pwn import *
picocom_cmdline = ["/home/user/picocom/picocom", "/tmp/simavr-uart0", "--echo", "--omap", "crcrlf"]
binary = "/home/user/jumpy/jumpy.hex"
avatar = Avatar(arch=AVR, output_directory="/tmp/jumpy")
gdb = avatar.add_target(SimduinoTarget,
simduini_executable="/home/user/simavr/examples/board_simduino/obj-x86_64-linux-gnu/simduino.elf",
avr_gdb_executable="avr-gdb",
firmware=binary
)
avatar.init_targets()
avatar.load_plugin('gdb_memory_map_loader')
mem_ranges = gdb.load_memory_mappings(update=True)
for interval_obj in mem_ranges.items():
print(interval_obj.data.name)
picocom = process(picocom_cmdline, stdin=PIPE)
gdb.set_breakpoint(0x780)
cur_input = b"1234567"
gdb.cont()
gdb.wait()
print( "PC2: 0x%x" % gdb.read_register("PC2"))
print( "pc: 0x%x" % gdb.read_register("pc"))
serialization = lambda data: "".join([chr(i) for i in data if i != 0])
res = serialization(gdb.rm(0x80013e, 1, 16))
print(res, " Passed: %r" % (res == cur_input.decode()))
sram_base_addr = 0x800000
sram = bytes(gdb.rm(sram_base_addr, 1, 0x900))
print("Found input located at offset 0x%x" % (sram_base_addr + sram.find(cur_input)))
avatar.shutdown()
picocom.kill()
However, it is very annoying to use picocom this way and I am thinking of handling uart output by monitoring writes to its data register UDR0
. I configured a write watchpoint on this register in a script but there was no watchpoint hitting. The assembly snippets of uart_send
as following is about to read a charcter from r18
and send to picocom.
code:000068 20 83 st Z,R18=>UDR0
code:000069 0f 90 pop R0
Instead, I manually set up breakpoints at where an incoming write to UDR0
is about to take place, and confirmed the address of Z
register (a pair of r31
and r30
) points to that of UDR0
. I was stuck for a while. With those doubts, I set up other two watchpoints at 0x8000c6 and 0xc6, respectively. Ran it again and still found no hitting. I modified the value of r31
and r30
, forcing the memory write to somewhere else and setting a watchpoint beforehands. Finally, the watchpoint got triggered. Hence, if a write takes place at register, watchpoints will not be taken over and dealt with.
(gdb) set $r31=0x1
(gdb) set $r30=0x42
(gdb) info r $r31 $r30
r31 0x1 1
r30 0x42 66
(gdb) info r $r18
r18 0x6c 108
(gdb) watch *0x800142
Hardware watchpoint 7: *0x800142
(gdb) x/2i $pc
=> 0xd0: st Z, r18
0xd2: pop r0
(gdb) si
Hardware watchpoint 7: *0x800142
Old value = 0
New value = 108
I had started guessing whether this issue comes from the implementation of memory read in simavr
. By checking its code, I finally found an answer.
static inline void _avr_set_ram(avr_t * avr, uint16_t addr, uint8_t v)
{
if (addr <= avr->ioend)
_avr_set_r(avr, addr, v);
else
avr_core_watch_write(avr, addr, v);
}
void avr_core_watch_write(avr_t *avr, uint16_t addr, uint8_t v)
{
...
if (avr->gdb) {
avr_gdb_handle_watchpoints(avr, addr, AVR_GDB_WATCH_WRITE);
}
avr->data[addr] = v;
...
}
avr_flashaddr_t avr_run_one(avr_t * avr)
{
...
switch (opcode & 0xd008) {
case 0xa000:
case 0x8000: { // LD (LDD) -- Load Indirect using Z -- 10q0 qqsd dddd yqqq
uint16_t v = avr->data[R_ZL] | (avr->data[R_ZH] << 8);
get_d5_q6(opcode);
if (opcode & 0x0200) {
STATE("st (Z+%d[%04x]), %s[%02x] \t%s\n",
q, v+q, AVR_REGNAME(d), avr->data[d], DAS(v + q));
// actual memory access happens here
_avr_set_ram(avr, v+q, avr->data[d]);
...
As above code snippets shown, st
instruction will finally call _avr_set_ram
to perform memory access, which either invokes _avr_set_r
if the address points to any of I/O registers or invokes avr_core_watch_write
if the address is in interal SRAM (from 0x800100). Clearly, only avr_core_watch_write
handles gdb watchpoints. Of course, it won’t be so hard to implement this handling in _avr_set_r
.
static inline void _avr_set_r(avr_t * avr, uint16_t r, uint8_t v)
{
...
if (avr->gdb) {
avr_gdb_handle_watchpoints(avr, r, AVR_GDB_WATCH_WRITE);
}
}
Compile simavr
and see if it works.
0x00000000 in ?? ()
(gdb) info b
Num Type Disp Enb Address What
4 breakpoint keep y 0x000000d0
breakpoint already hit 34 times
5 hw watchpoint keep y *0x8000c6
breakpoint already hit 2 times
6 hw watchpoint keep y *0xc6
(gdb) c
Continuing.
Hardware watchpoint 5: *0x8000c6
Old value = 0
New value = 13
0x000000d2 in ?? ()
Hmmm, it is still weird that this watchpoint only happened once. It did not make sense because there are quite a few characters to be printed. It is supposed to take place a few times. For validation, I modified Z register as 0x145, and it hit everytime.
I even enabled simavr’s logging in a verbose level and its gdb module also sent a correct command to avr-gdb
which had no reaction.
The following is logs from simavr
:
Addr 00c6 found watchpoint 0 size 2 type 4 wanted 4
gdb_send_reply '$T0520:21;21:0a03;22:d0000000;watch:8000c6;#39'
gdb_send_reply '$T0520:21;21:0a03;22:d2000000;hwbreak:;#a7'
Hence, my educated guess is on avr-gdb
, which maybe conducts lazy handling of I/O registers because access to them is very intensive and frequent throughout the entire execution. Nonetheless, I do not wanna spend more of time on just handling UART output. So, just keep using picocom. Few more words, I feel like that sim-avr
is a good simulator, but it does not offer a good support in instrumentation (or called monitoring).
Adding angr target
TODO