Flare-On Challenge 7 - 10 - Break

tags: ctf challenge write-up binary reverse engineer Malware Analysis
by Firpo7

As a reward for making it this far in Flare-On, we’ve decided to give you a break. Welcome to the land of sunshine and rainbows!

The challenge provides with an executable, it was obfuscated using a technique known as nanomites. Nanomites is an anti-debugging and anti-dumping technique that consists of having two or more processes to perform a task together by interacting with each other using ptraces. In this case, there were three processes where the second ptrace-attach to the first, and the third that will ptrace-attach to the second. The goal of the challenge was to insert the correct flag as input.

Starting the analysis from the main program seems simple.

The flag seems to be sunsh1n3_4nd_r41nb0ws@flare-on.com. Let’s test it:

$ ./break
welcome to the land of sunshine and rainbows!
as a reward for getting this far in FLARE-ON, we've decided to make this one soooper easy

please enter a password friend :) sunsh1n3_4nd_r41nb0ws@flare-on.com
sorry, but 'sorry i stole your input :)' is not correct

Obviously no…. It also says to have stolen our input… Where sorry i stole your input :) comes from?

Trying to run ltrace against the program we can see a fork(), looking at the references it was first called in the init1 function.

In the init1 function, after the fork(), if the process continues in the parent then it modifies the process permissions to let children attaching using a ptrace to him and then do a nanosleep and continuing then to the main.

If the process continues to the child instead it starts a whole new function. In this function, the child process will attach to the parent using ptrace. Once attached, using PTRACE_POKEDATA, it modifies the firsts 2 bytes at the address 0x08048cdb with 0x0f0b. That address is the function charged to check the password and it is modifying the firsts 2 bytes by putting the defined undefined instruction (ud2). What will happen when the father enters to that function will be to raise an Illegal Instruction Signal, that will be handled by the child process as it is ptracing him.

But let’s take a step back. After doing this modification the child will spawn another process, that will be discussed later, and then do a PTRACE_SYSEMU.

This particular ptrace request was designed to stop the execution of the ptraced process when it encounter a syscall, to emulate it. Every time the ptraced process will stop, the waitpid in the debugger will receive 2 bytes in the wstatus parameter, to have a full understanding of what these 2 bytes mean I suggest you take a look at the waitstatus.h source code.

What it is mainly useful for this challenge is that when the parent stops the lower byte of wstatus will be set to 0x7f, while the higher one to the signal that makes it stops. When the parent encounters a syscall it will generate a SIGTRAP signal, that will move the control to the child process, that will handle the syscall in a customized way. The syscall called by the parent process is defined by the eax register that the ptracer will take, along with the other registers, using the PTRACE_GETREGS request.

What is said in the block above will be much clear in this image. Child SIG Handling

The eax of the father will be processed as to obfuscate which will be the code that will handle the syscalls. For example, as it can be seen in the image, the pivot_root syscall code (217) will become 0xe8135594 and it will do, in the memory space of the father, something like:

*ebx = ecx

Returning to the father execution, when it is inside the main function it will print some strings. As explained before, EACH syscall will be handled by the second process so when the father prints something the write syscall will be handled by the child.

Write Syscall handler

Fortunately, the “write handler” will do a write without any customization.

But after the prints, the parent will do a read and if you remember the first test of the program it says it has stolen our input.

Read Syscall handler

As we can see the read will be emulated by a fgets inside the second process. It will save our string in a global variable and then it will decode the string sorry i stole your input :) to be inserted in the read buffer of the father.

After the read, the parent will go into the check flag function that the second process has modified before, so what will happen is that the first process will generate an Illegal Instruction Signal that will be handled by the child as follows:

SIGILL Handler

In this piece of code, the second process is going to copy in the memory space of the father the input taken by the fgets, at the same address. Then it substitutes the eip of the father with the address of another function and set as first parameter the address of the buffer where it has just copied our input.

This means that the execution of the parent will go to check our flag in another function, precisely this one:

Real Check Flag

Before analyzing this function I need to introduce the nice function. Looking at the syscalls of Linux 32-bit there is a nice syscall, but if you test this library function in a program, using strace you will discover that it calls only setpriority and getpriority syscalls. What the handlers of these two calls do together in the child process is returning the negated address of a bunch of bytes decoded. In fact, as we can see in the image above, it is passing the negated address of the return value of the nice at the next function.

Considering that all these parts are static and don’t depend on our input we can try to intercept the memcmp to see what it expect as the first part of our input. The problem now is that the first process cannot be debugged, as it has already a process attached to it.

To bypass this problem I hooked the function using LD_PRELOAD.

int memcmp(const void *s1, const void *s2, size_t n) {
     static int (*memcmp_real)(const void *s1, const void *s2, size_t n) = NULL;
     int ret;

     if (!memcmp_real)
         memcmp_real = dlsym(RTLD_NEXT, "memcmp");

     ret = memcmp_real(s1, s2, n);

     fprintf(stderr, "memcmp(\"%s\", \"%s\", %d) = 0x%x\n", s1, s2, n, ret);

     return ret;

Fun fact, as explained above EACH syscall of the parent process is intercepted and so the fprintf in our memcmp too. Fortunately, it does not change anything but using LD_PRELOAD in this context should be done carefully.

$ LD_PRELOAD=./ch10_preload.so ./break
welcome to the land of sunshine and rainbows!
as a reward for getting this far in FLARE-ON, we've decided to make this one soooper easy

please enter a password friend :) sunsh1n3_4nd_r41nb0ws@flare-on.com
memcmp("sunsh1n3_4nd_r41nb0ws@flare-on.com", "w3lc0mE_t0_Th3_l", 16) = 0xffffffff
sorry, but 'sorry i stole your input :)' is not correct

Perfect! The first part of the flag is w3lc0mE_t0_Th3_l!

The second part of the flag will be checked inside the function at 0x08048f05, in the image above named “TF_checkSecondPart”.

Check Second Part

The first thing done by this function is generating 2 constants, starting from the return value of the nice(0xa4), using the function FUN_0804bfed. Then, in the while, it processes 8 bytes at a time where it has stored the second part of our input. The strange thing here is the while loop of 40000 for 32 bytes of the second part of the flag.

Below the function charged to process the bytes.

Decryption Function

Initialize Buffer

In this part of the executable, many library calls will lead to syscalls, so the interactions between the first and the second process will be a lot.

Another technique to pass the execution to the second process used in this part of the execution it’s by calling a pointer NULLed. The variable named GENERATE_TRAP was charged to generate a SIGSEGV and in the child process it will be handled as follows:


It is an implementation of a loop, where it jumps at the address passed as the first argument and will increment the counter until 0x10, which address was passed as the second parameter. The addresses where the two loops iterate are shown in the images above.

At the end of TF_checkSecondPart there is a call to truncate.

Truncate Handler

What it is doing is storing all the 40000 decrypted bytes in a buffer (local_3f50) and checking how many bytes of the flag we guessed by comparing the result against the bytes at 0x081a5100.

Aware of the while masked as SIGSEGV, it’s more clear that this was a decryption function, probably of a known algorithm. Since I didn’t recognize which algorithm it was I decided to patch the executable in order to let angr to the job to find back the bytes of the second part.

What I have done is firstly nop the fork and the other calls in the init, then changing the call to fake check flag to TF_checkSecondPart and then substituting all the library calls, until the truncate, with the code that the child is going to perform. During this part, the second process will pass the execution to the third one.

To explain better how the patch was performed let’s take as an example the call to pivot_root seen before in assembly.

sub esp, 8      ; align the stack
push ecx        ; this will be the ecx of the syscall
push eax        ; ebx of the syscall
call pivot_root
add esp, 0x10

I have substituted all these instructions with:

mov dword [eax], ecx

nopping all the other bytes.

For bigger parts of the code, such as the mlockall, I had to put a call to that part of the code and reconstructing a stack there.

After all these patches I had only one process doing the checks for the second part. Since the decryption was performed on 8 bytes at a time I instrumented angr and patched the executable to input 8 bytes per execution. To provide to angr a successful address and a failure one I had to put also an “if”.

import angr
import claripy

SUCESS_ADDR = 0x08049cf4
FAILITURE_ADDR = 0x08049d11

project = angr.Project("./break_ANGR")
flag_chars = [claripy.BVS('flag_%d' % i, 8) for i in range(FLAG_LEN)]
flag = claripy.Concat( *flag_chars + [claripy.BVV(b'\n')])

state = project.factory.full_init_state(

for i,k in enumerate(flag_chars):
    if i < 8:
        state.solver.add(k >= 0x20)
        state.solver.add(k <= 0x7f)

simulation_manager = project.factory.simulation_manager(state)

if (len(simulation_manager.found) > 0):
        for found in simulation_manager.found:

Running more than one instance at a time, in about 1 hour, it was able to retrieve all the second part of the flag:


As explained above, during the truncate, it will store all the 40000 bytes in local_3f50, the problem is that this buffer has a size of 16000 bytes! So it was going to overflow overwriting the other variables. Since, normally, the second process will not reach a return, the overwriting of the return pointer should be excluded. Immediately after the while, there is a call to the GENERATE_TRAP variable, which is stored above the buffer and it will be overwritten with 0x08053b70.

Overwritten Address

This address’s inside the decrypted bytes at the offset of the GENERATE_TRAP variable. Dumping that address reveals the last part of code, where it checks for the last part.

During this part, there are no ptraces except for one at the start and one at the end, that could be both nopped for the moment. Once inserted those decrypted bytes in the executable and nopped the ptraces, the call to the fake check could be modified again to jump at this new function. This lets us debug this part of the challenge.

What it is doing in this part is handling big numbers. At the start of the function, it will get 4 bignumbers from the memory and uses them to perform an equation against the last part of our input, to finally check the result.

BigIntegers Equation

What it will do is: RESULT_TO_CHECK == bignum_1

And, as shown in the image above:

RESULT_TO_CHECK = ( last_part_input * bignum_2 ) % bignum_4

The last part of the flag could be retrieved by multiplying the expected result (bignum_1) by the modular inverse of bignum_2 modulo bignum_4.


The flag is: