meow.n, running file command on it said: NekoVM bytecode (418 global symbols, 323 global fields, 35212 bytecode ops)
flag_enc.png which was a seemingly encrypted PNG image:
I googled NekoVM which lead me to https://nekovm.org/ which said:
Neko is a high-level dynamically typed programming language. It can be used as an embedded scripting language. It has been designed to provide a common runtime for several different languages. Learning and using Neko is very easy. You can easily extend the language with C libraries. You can also write generators from your own language to Neko and then use the Neko Runtime to compile, run, and access existing libraries.
Okay, something, something, yet again some unknown byte code. I googled for disassembler, but it turned out the official toolchain includes one: https://nekovm.org/doc/tools/#dumping-bytecode , so I run
nekoc -d meow.n
and I got a 620k meow.dump file. Uhh, that’s a lot of code. I searched for xor in the dump as I usually do if I start working with a lot of code, and I found a few interesting strings around the xoring code lines like get_pixel, set_pixel, setSeed, random_set_seed, file_open, file_read, AccInt 13337.
I also found some usage information: Usage: meow INPUT OUTPUT.
As the bytecode was new to me, I googled some description on what these “operators” mean, and I found this Neko Bytecode Explained page which would be a really good resource I think.
BUT I thought: okay, so it reads a PNG and generates some random, uses some magic seeds and contants like 13337 and writes an another PNG. I could probably find out the exact algorithm from the byte code statically but that would take ages, so let’s analyze it dynamically instead.
So I encrypted a full black and white 768x768 PNG (just like the original). Actually I did it both twice and got the exact same result (for the same color), and I got the inverse for the other, so I could be sure that the algorithm kind of static and there is no random stored in the encrypted file.
Black
was encrypted to
White
was encrypted to
Flag (first try)
Okay, the result was similar to our encrypted flag, so I simply xored the black_enc.png with flag_enc.png and I got this:
Wow that is almost a readable flag, but it looks like the columns were mixed.
Then I encrypted images with unique columns and decrypted with my method and result seemed consistent:
Color
was encrypted to
Unique colors
was encrypted to
Flag (final)
So I simply tried to reorganize the columns in the same order as it changed on my test file and I got the original image with the flag actually:
I put the binary into IDA and while I waited for the analysis the finish, I dynamically run the binary on a few input cases.
My initial observations were the followings:
It probably encrypts the input data and the output.txt contains the encrypted flag
It used 8-byte blocks
Seemingly the block’s position mattered, so it was not a simple ECB mode.
01234567 was encrypted to 4e6fa474bbf4eb00 if it was the first input block, but it was encrypted to a1eb35de298e9ecd if it was the second input block
If I changed a byte in the block it did not change any other block, so the previous input / output block did not influence the current one. So it was not CBC, OFB or CFB block cipher mode, but probably CTR or similar
There was padding in place, if I used a 7-byte input then I got a 8-byte output, but if I used a 8-byte input then I got a 16-byte output.
Here is an example which illustrates the situation (the | sign was not part of the input / output, I just put there to make it easier to see the block boundaries):
Meanwhile IDA analyzed the binary and I saw there are a lot of functions with symbol names like graal_attach_thread. Google for this directed me to this page: https://github.com/oracle/graal/blob/master/substratevm/C-API.md. Looks like the binary was created with GraalVM which is yet again some craziness (like the NekoVM) I heard about before, but I never really tried out.
So my teammates tried to play with the GraalVM toolchain, compile an empty app, BinDiff it with the challenge binary and try to figure out which parts of the binary contained framework code and where is the encryption.
Meanwhile I basically binary searched the binary with GDB (+pwndbg), searching for my test input strings / encrypted output in the memory and putting memory read/write watchpoints to these addresses to find out on which part of the code the encryption takes place. Without ASLR, the binary run the same way every execution, every memory address was the same, etc, so this technique worked quite well.
It turned out fast that this part of the binary is the interesting one:
I saw from dumping the memory / registers that it breaks 8-byte input / output blocks into two 4-byte ones. So the whole encryption looked similar to a Feistel cipher to me, so I reversed the encryption the same way as illustrated here:
I got this decryption code which worked great on the first block:
I tested some of my test inputs and it could decrypted them which was a good sign!
I also tried to decrypt the output.txt’s first block and I the following plaintext which was also really good sign :) :
TWCTF{Fa
So tried to find out the padding and block cipher mode, but I did not find anything. But meanwhile I found out that the magicConstants array (which I thought were constants indeed) changes with every new block and it does not depend on the input, so I simply dumped this array out from gdb with the following command x/32wx 0x78e068 for every block :D. This actually worked and I could decrypt the whole output.txt and
I got the flag:
TWCTF{Fat3_Gr4nd_Ord3r_1s_fuck1n6_h07}
This was my complete code to decrypt the flag (C#):
The challange implemented a custom printf function which was called on our two inputs then the program exited. This opened a format string vulnerability, which made it possible to leak out important stack/libc/etc base addresses, but %n was not implemented, so we could not write the memory.
Fortunately for us, it used an uncontrolled alloca aka. sub rsp, rax where the parameter was the predicted output buffer length which could be controlled by us using constructs like %1000d which generated a 1000 byte length buffer on the stack with alloca.
This made possible to set rsp to any lower memory address, and even write into libc’s memory.
Somehow a bunch of pages which included also pointers were not read-only on Ubuntu 19.04 with libc 2.29. This was a weird behavior as the situation was much better on Ubuntu 18.04… I don’t know why they changed it, but whatever, this made the exploitation much easier (possible)!
I tried to overwrite a bunch of targets with OneGadget RCE, but at the end I replaced _IO_cleanup in the libc_atexit array which was called at the exit :D
Last weekend I played on the Google CTF 2018 Quals which was one of the best CTFs I played recently. They separated the easy challenges into a “beginner’s quest” so we got only the medium-hard ones. I think this was a really good choice and made the whole CTF experience more pleasant. Also as with other CTFs, changing to dynamic scoring was also a welcomed change.
I started solving all the web challenges as our web guy was missing on Saturday and nobody else is really into web, and as I am kind of a jack of all trades when comes to CTFs, I looked into the web ones. At the end (I mean at the really end but I will talk about this later) we solved all the web challenges, including BBS which was only solved by 3 teams by the end of the CTF.
Introduction to the challenge
BBS was… well… a BBS (wikipedia article) in a form of really old school website.
The functionality of the website was really simple: you could register (Home > New), you could login (Home > Open), contact the admin (Home > Contact - this menu did not do anything, but more about this later):
After registering and logging in you could post to your board:
You could also modify your profile settings:
For avatar images only PNG files were accepted (the file type mattered, not the extension) but they were resized to 64x64 PNGs (in every case, even if you uploaded a PNG with the exact same size). The resized PNG was saved to https://bbs.web.ctfcompetition.com/avatar/<md5_hash_of_the_resized_png>
Let’s look under the hood!
So after the other web challenges (we solved this after the other 4 web ones) I first checked the headers, especially the Content Security Policy header.
This means if we find an XSS then we can inject <script> tags and execute code (that’s the unsafe-inline), but only on the BBS challenge domain (because of the self) and we cannot communicate with other domains (eg. I could not download files from my domain).
We could find there that although the Contact menu item did not do anything we still have a report functionality which usually means we are talking about an XSS challenge or at least an XSS vulnerability is expected:
We did not have to search too much to find some clues what is expected from us to solve the challenge: you could link posts into other posts and when you hovered mouse above the ‘quote’ you could view the other post’s content in an iframe:
In the iframe the /post?p=/ajax/post/4614877532389376 endpoint was called which executed the following logic:
As the url parameter was an input from the query string we could modify the any other URL!
Let’s change it to our website (eg. kt.gy) and serve some nasty payload!
Oh… wait… we cannot because of the CSP…
((Yes, I know I did not update my Chrome in time, but I promise I will, okay???))
As the BBS server HTML-encoded every post’s response (after making them also l33tsp43k), we had to find some other way to inject our content.
There is the avatar upload of course, but we cannot make a PNG-Base64 polyglot as the code called a base64 decode (atob) before injecting into the site as HTML.
But if we look at the code a little bit more we can see that the query string parsing is using a module (qs) which can create objects if we use query string in the format like this:
This gives us the following object structure as q:
jQuery’s ajax function although can accept an URL as the first parameter, but it can also accept a settings object where we can parameterize various aspects of the request including which URL to download, but we can also modify the request headers with the headers property. And there is this little header called Range where we can tell the server which part of the file we want to download (among others, this makes possible to continue downloading interrupted downloads).
I hope you see where we are going… ;)
That moment when you feel you will solve this challenge soon
Yep, we can actually upload an avatar image and request just the part of it as our payload. :)
First I uploaded a random PNG and wanted to concat my payload character by character by using multiple ranges (which is allowed in the Range header specification), but unfortunately it did not work on the server as I got Internal Server Errors. So I had to upload a PNG file which contained my payload contiguously.
If I could make that I could just call the /post endpoint like below and my payload would have been executed as $('#post').html(atob(text)); would interpret it as HTML and my <script> tag inside the HTML would execute:
Now that we have to URL, we just have to send it to admin, right?
This allows us to only send post IDs, but of course we can call the /report endpoint directly:
Hmmm the server responds to this request with Invalid post response. That’s unfortunate.
On the other hand the /admin endpoint does not look too promising: even reporting a valid post does not seem to be working:
But even if this did work, we could not modify the /post?p= parameter.
Actually the solution is easier than thought, the /report endpoint just checks that the url is in format /admin/<id>, so this URL will bypass the filter:
Okay, now that we put together everything, we spent only 1.5 hours solving the challenge till now, we just need a PNG image which contains our base64 payload. It should be simple, right?
That moment when you feel you won’t solve this challenge any time soon
Well yeah, it should be kind of simple. As mentioned earlier the uploaded PNG is resized every time and all the metadata (eg. comment sections) are stripped, only image data is used.
The image data is compressed with zlib. But if you use a high entropy input the compression engine will use the input bytes literally, so you can ‘bypass’ the compression part and inject your payload directly.
Or can you?
I’ve spent the next 3.5 hours to create a PNG which contained my base64 payload. After creating approximately ~1369 PNG files, I gave up. ;)
The main problem was that the PNG encoder tries to minimalize the file size by applying filtering to every row before sending into zlib. And even if I chose any input filtering method for my payload, the PNG encoder always changed it to something else.
I went so mad, I even thought it must be some custom PNG encoder created only for this challenges just to make our job harder. As there was only 1 solve on the challenge that time so I thought this could be even a valid theory…
So I tried to experiment with various tricks to bypass this ‘tricky converter’ like generating such input which will contain my payload for every possible filter mode and stuff like that.
After the competition I spoke with the challenge author (phiber) and he told me this was of course not true, in the opposite of that, he used the most generic conversion code that can be used:
Of course I did not know this at time of solving the challenge, but…
That moment when you’ve just given up
So it was only 30 minutes left from the Google CTF Quals and it was clear I won’t be able to generate a correct PNG in time, but as a last time effort I looked into the code again…
Also I knew from the latest 3.5 hours of trying that I can inject smaller payload as that won’t be changed, but the smallest base64 payload was not small enough. Maybe a raw JS payload? The smallest one I could create was
But how to execute that?
It’s simple! Just ask jQuery to do this for us by passing dataType: "script" to the $.ajax method!
And it worked!
This image:
(you can find the $(atob(location.hash.slice(1))) between offsets 12326 - 12356)
and this URL finally made it possible to execute my XSS:
The final step was just to send this URL to the admin:
This was the encoded payload, but probably others were working as well (but the location= type redirection is important as CSP blocks almost every other way):
These challenges were solved by and the writeups were written by one of my teammates, @ngg.
Integrity
There was an encryption scheme, we had to calculate encrypt(‘admin’) based on that we could encrypt anything other than the string ‘admin’.
The scheme looked like this:
md5 length is exactly a block size, so
This means that
If we cut the first block out from this then the rest is exactly encrypt('admin') with the iv set to md5(...).
Sending this to the server gave me
OneTimePad1 && OneTimePad2
Both tasks were about breaking a random generator. The hardest part was to understand the ideas and math based on the given python codes.
They were based on finite field arithmetics of GF(2)[x]/P(x).
If you do not know what I’m talking about then visit https://en.wikipedia.org/wiki/Finite_field_arithmetic first.
For OneTimePad1 P(x) = x**256 + x**10 + x**5 + x**2 + 1, which meant that the field is GF(2**256) because P(x) is irreducible over GF(2).
For OneTimePad2 P(x) = x**128 + x**7 + x**2 + x + 1, which is the standard GCM polynomial (the field is GF(2**128)).
In OneTimePad1 the process(m, k) function did the math, it calculated (m+k)**2 where m and k are field elements.
In OneTimePad2 the two helper functions were process1(m, k) calculated m*k where m and k are field elements and
process2(a, b) calculated a*b where a and b are 2x2 matrices over this field.
In both problems the random generated worked like the following pseudo-python code:
OneTimePad1
In OneTimePad1 the calculate function returned (process(next, seed), seed) and the flag and two known strings were one time pad encrypted
with its first 3 outputs respectively. Based on the second and third encrypted string we knew the second and third generated random numbers (r2 and r3), we had to calculate the first (r1).
r3 = process(r2, seed) = (r2+seed)**2, this means that seed = sqrt(r3)+r2.
Calculating square roots over binary fields are easy because squaring is a linear function.
(This might seem strange, but (x+y)**2 = x**2 + 2*x*y + y**2 = x**2 + y**2 so it’s really linear).
Being linear means that there exists a matrix S such that x**2 = x*S for every x.
Calculating S is straightforward, it happens to be invertible so we can calculate square roots as well (every field element has exactly 1 square root).
We now know seed and we also know that r1 = sqrt(r2)+seed so we can decrypt the flag:
OneTimePad2
In OneTimePad2 the calculate function was more complicated.
A and B were two known constants, the function calculated M = [[A,B],[0,1]]**seed and returned M[0][0]*next + M[0][1] and it also changed seed to seed**2.
We can see by induction that [[A,B],[0,1]]**seed = [[A**seed, B*(1+A+A**2+...+A**(seed-1))],[0,1]] = [[A**seed, B*(A**seed - 1)/(A - 1)],[0,1]].
One time pad was used here as well but know we knew the first 4 plaintexts and we had to calculate the following ones.
r2 = A**seed * r1 + B*(A**seed - 1)/(A-1)
To get seed from this we first calculated A**seed as ((A-1)*r2 + B)/((A-1)*r1 + B) and then had to calculate the discrete logarithm
seed = Log(A, A**seed).
This last part was the hardest for me as I couldn’t get Sage to solve it (I tried in lots of different ways but it always threw NotImplementedExceptions…).
After the competition I’ve read hellman’s writeup, he found a way to do this in Sage.
I knew that Mathematica cannot do this either, so I had no better idea than to look for any implementation on the web and read articles on how solve this if I have to implement my own.
Fortunately I found some promising references in Magma’s documentation and I could solve this with the free online version at http://magma.maths.usyd.edu.au/calc/ with the following script:
After this we knew r1 and seed, so I just modified the original code to use these and to encrypt the ciphertext instead (one time pad is an involution).