One of the more interesting aspects of Capture the Flag (CTF) events is the frequent necessity to pick up, learn, and apply various reverse engineering and binary analysis tools to solve difficult challenges. Recently I completed the FireEye FLARE-On 2017 challenges, which required me to add a few tools to my binary analysis VM. I’d like to share those tools in this blog post, and show how they helped me complete the challenges.
PHP Dynamic Analysis Environment Using XDebug
For the tenth challenge of the FLARE-On 2017 challenges, we were presented with this PHP script.
Essentially, a large, base64-encoded blob is decoded, decrypted with a key, and executed as a second-stage PHP function. The key begins with the MD5 hash of the $o_o variable (this is the flag for the level). Appended to that is the MD5 hash of the reversed string, truncated to the length of the flag. An MD5 hex digest is 32 characters long (2 characters to represent each byte), so assuming the flag has a length greater than 0, we can conclude that the key needed is between 33 and 64 characters long.
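The key layout described above can be sketched in a few lines of Python. This is only an illustration of the description, not the challenge's own PHP; the helper name `build_key` and the sample inputs are hypothetical.

```python
import hashlib


def build_key(flag: str) -> str:
    """Hypothetical helper: MD5 of the flag, followed by the MD5 of the
    reversed flag truncated to the flag's length."""
    forward = hashlib.md5(flag.encode()).hexdigest()         # always 32 hex chars
    backward = hashlib.md5(flag[::-1].encode()).hexdigest()  # always 32 hex chars
    return forward + backward[:len(flag)]                    # slice caps at 32 chars


if __name__ == "__main__":
    # Placeholder inputs, not the real flag.
    for sample in ("a", "short", "x" * 40):
        print(len(sample), len(build_key(sample)))
```

A one-character input gives a 33-character key, and any input of 32 characters or more gives a 64-character key, which is where the bounds on the key length come from.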
It’s going to take a bit of manual scripting, a debugger and some intuition to crack the correct key string. So let’s get a development/debugging environment set up, shall we?
I built a development/debugging environment for this challenge in a 64-bit Ubuntu 16.04 LTS virtual machine. First, I installed the PHPStorm IDE using instructions found on the website here. Next, I created a project, pasted the shell.php code into it, and saved it.
I then installed XDebug on the virtual machine using the package manager:
sudo apt-get install php-xdebug
Now that we have a development environment set up and a debugger, we need to be able to have them talk to each other to debug our PHP script. First, we must configure PHP to know about XDebug and its required parameters – namely, a port and host to send debug messages to. Navigate to the /etc/php/7.0/cli directory and edit the php.ini file contained therein to include these lines.
These environment variables tell the php process where the XDebug library is (you may need to edit the zend_extension variable accordingly), enable remote debugging, establish the debugging host and port, and identify the key of the IDE we are using to debug (in this case, PHPSTORM). Save the file and go back to PHPStorm so that we can configure it.
In PHPStorm, go to File -> Settings -> Languages & Frameworks -> Debug and ensure that the “XDebug” settings look like the following:
Next, install the XDebug extension for the browser that you will be navigating to the PHP Script with – this extension will communicate with PHPStorm for debugging. I followed the instructions for Google Chrome, here.
Now, we’re ready to start our PHP server, navigate to the page and debug our PHP script. In PHPStorm, set a breakpoint on the first line of the PHP script and then click Run -> “Start Listening For PHP Debug Connections.” Next, in a terminal, navigate to the PHPStorm project directory and start the server using the following command:
php -S 127.0.0.1:8888
Now, you can navigate to http://127.0.0.1:8888/shell.php and, if all is configured correctly, your breakpoint should be hit. We can now start writing more code and debugging!
The first thing we need to do is to write some crude PHP code to determine the length of the key as well as possible characters. We know that the length of the key string is somewhere between 33 and 64. We also know that the key string consists of only the characters ‘a’ – ‘f’ and ‘0’ – ‘9’ because those characters are the only ones that can appear in a hex-encoded hash digest. We also expect the plaintext to consist only of readable characters because it is PHP code, so it should only contain printable ASCII characters, tab, newline, and carriage return. Using this knowledge, we can write some PHP code to print out all possible lengths and possible characters at each position. My code to do this is shown below.
This code will only display a length if it has found a position that results in only readable ASCII characters in the plaintext. The only key length which gives a possible character for every position within the key is length 64 – therefore the key length is 64. Below are all the possible characters for each character position in the key (tabulated nicely by the challenge author https://www.fireeye.com/content/dam/fireeye-www/global/en/blog/threat-research/Flare-On%202017/Challenge10.pdf). You can try this code out by simply commenting out the shell.php code in PHPStorm, pasting the code in and refreshing your page (this is why development environments are helpful!).
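For readers who prefer Python, the same kind of search can be sketched as follows. This is not the PHP code used for the challenge. It assumes the decryption is a plain repeating-key XOR, which is an assumption because shell.php is not reproduced here, and the function names and demo data are hypothetical.

```python
import hashlib
import string

HEX_CHARS = "0123456789abcdef"
# Bytes accepted as "readable": printable ASCII plus tab, newline, carriage return.
READABLE = {ord(c) for c in string.ascii_letters + string.digits
            + string.punctuation + " \t\r\n"}


def candidates_for_length(ciphertext: bytes, key_len: int):
    """Return a list of candidate key characters per position, or None if
    some position has no candidate (assumes repeating-key XOR)."""
    per_position = []
    for pos in range(key_len):
        column = ciphertext[pos::key_len]  # every byte this key position touches
        good = [k for k in HEX_CHARS
                if all((b ^ ord(k)) in READABLE for b in column)]
        if not good:
            return None
        per_position.append(good)
    return per_position


def plausible_lengths(ciphertext: bytes, shortest: int = 33, longest: int = 64):
    """Map each surviving key length to its per-position candidates."""
    survivors = {}
    for key_len in range(shortest, longest + 1):
        found = candidates_for_length(ciphertext, key_len)
        if found is not None:
            survivors[key_len] = found
    return survivors


if __name__ == "__main__":
    # Synthetic demo data: a made-up key and plaintext, not the challenge blob.
    demo_key = (hashlib.md5(b"demo").hexdigest()
                + hashlib.md5(b"omed").hexdigest()).encode()
    demo_plain = b"<?php echo strrev($name); ?>\r\n" * 40
    demo_cipher = bytes(p ^ demo_key[i % 64] for i, p in enumerate(demo_plain))
    for key_len, cands in plausible_lengths(demo_cipher).items():
        print(key_len, ["".join(c) for c in cands[:4]])
```

Positions with a single candidate can be fixed immediately; positions with several candidates are the ones that need the debugger and some intuition.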
We can begin constructing the key using these values where the possible characters are unique. Then we can use our debugger to inspect the decrypted PHP code. An example of this is shown below:
We can then identify PHP keywords (such as “b–e64” at index 716) in the decrypted output and use the console and debugging features of PHPStorm to reconstruct the entire key through trial and error and a little bit of intuition – we finally get the key string “db6952b84a49b934acb436418ad9d93d237df05769afc796d067bccb379f2cac.” We can then pull the stage 2 PHP code out from the debugger and put it into its own script within the project – I called this “shell2.php,” and it is shown below.
Using the first decrypted blob and breaking at the point before decryption, we can use the debugger console to XOR the ciphertext blob with “<html>” to get the first few bytes of the key – “t_rsaa.” The only mystery remaining is – how do we figure out the exact key length? Well, the answer actually lies in the encrypted blob itself – there is a string of 13 0x0 bytes – indicating that the key and plaintext matched in these places – we can therefore try a key length of 13, and see if we get the correct plaintext. In order to get the plaintext, we pad “t_rsaa” to 13 bytes, use the debugging console and look for keywords in the resultant plaintext string to derive the remaining key bytes.
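The known-plaintext step can be sketched in Python as well. This is a minimal illustration on synthetic data rather than the real blob: the demo plaintext is made up, the 13-byte key is the value reported later in this post, and the helper names are hypothetical.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def recover_key_prefix(ciphertext: bytes, known_plaintext: bytes) -> bytes:
    """XOR the start of the ciphertext with a guessed plaintext prefix."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_plaintext))


def decrypt_with_partial_key(ciphertext: bytes, prefix: bytes, key_len: int) -> str:
    """Decrypt using only the known key bytes; unknown positions show as '?'."""
    out = []
    for i, c in enumerate(ciphertext):
        pos = i % key_len
        out.append(chr(c ^ prefix[pos]) if pos < len(prefix) else "?")
    return "".join(out)


if __name__ == "__main__":
    # Synthetic stand-in for the first blob (made-up HTML, not the real content).
    demo_plain = b"<html><body>placeholder</body></html>"
    demo_cipher = xor_bytes(demo_plain, b"t_rsaat_4froc")
    prefix = recover_key_prefix(demo_cipher, b"<html>")
    print(prefix)
    print(decrypt_with_partial_key(demo_cipher, prefix, 13))
```

Printing the partially decrypted text with placeholders for the unknown key positions makes it easier to spot fragments of familiar words and guess the missing key bytes one at a time.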
In the debugging session, after padding out the key to 13 bytes, familiar strings start to appear. “Ray” and “heckbo” are more than likely part of the string “Raytraced Checkboard.” This allows us to figure out the key bytes for the rest of the first blob. The key bytes for the first blob are “t_rsaat_4froc”. If we repeat this strategy for the other ciphertext blobs, we get “hx__ayowkleno” and “3Oiwa_o3@a-.m”. Using these we can reconstruct the flag by taking one character at a time from each string in turn, which yields “th3_xOr_is_waaaay_too_w34k@flare-on.com.”
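The interleaving of the three key strings is easy to check with a short Python sketch; the function name `interleave` is just a label chosen for this example.

```python
def interleave(*parts: str) -> str:
    """Take one character at a time from each string in turn."""
    return "".join("".join(chars) for chars in zip(*parts))


if __name__ == "__main__":
    # The three 13-byte keys recovered from the three blobs.
    keys = ("t_rsaat_4froc", "hx__ayowkleno", "3Oiwa_o3@a-.m")
    print(interleave(*keys))
```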
Atmel AVR Simulation Using simavr
I am always interested in performing binary analysis on architectures that are different than the standard x86/x64/ARM/AArch64 binaries that run on most modern mobile and desktop machines. In the FLARE-On 2017 challenges, I encountered an Atmel AVR binary – more specifically, a .hex file meant to be run on an Arduino board using an ATMega328p processor. Unfortunately, when I was working on the challenge, I did not have an Arduino handy, so I had to find a tool which would allow me to potentially simulate one.
Challenge 9 of the 2017 FLARE-On challenges presents the analyst with a .hex file along with a description that alludes to an Arduino board. The challenge description provides no other background information. The .hex file was a series of ASCII representations of hexadecimal strings, so I copied those into a text editor and found the string “Flare-On 2017 Adruino UNO Digital Pin state:.” The Arduino UNO uses an 8-bit ATMega328p processor. Knowing this, we can start to build an analysis environment.
For static analysis, the easiest tool I found to use was IDA Pro – a staple of any reverse engineer’s toolkit. IDA Pro is able to ingest the raw .hex file and disassemble it, provided the instruction type is changed to “Atmel AVR” in the “Load a new file” dialog, as shown below.
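Instead of copying the hex strings into a text editor by hand, the same string hunt can be sketched in Python. This is a simplified illustration, assuming the file is in the standard Intel HEX record format; it ignores checksums and address gaps, and the sample record in the demo is made up rather than taken from the challenge file.

```python
import re


def hex_to_bytes(hex_text: str) -> bytes:
    """Concatenate the payloads of Intel HEX data records (type 00).

    Simplifications: records are assumed to be in address order and
    checksums are not verified.
    """
    payload = bytearray()
    for line in hex_text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue
        record = bytes.fromhex(line[1:])
        length, record_type = record[0], record[3]
        if record_type == 0x00:          # data record
            payload += record[4:4 + length]
    return bytes(payload)


def printable_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{" + str(min_len).encode() + rb",}")
    return pattern.findall(data)


if __name__ == "__main__":
    # Made-up sample: one data record holding "Hello", then an end-of-file record.
    sample = ":0500000048656C6C6F07\n:00000001FF\n"
    print(printable_strings(hex_to_bytes(sample)))
```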
After that, IDA will prompt the user to select a processor. My version of IDA Pro didn’t have “ATMega328p” as an option, so I went with “ATmega103_L” – same processor family. After that, the AVR opcodes should show up.
For dynamic analysis, I decided to use simavr. Simavr is an open-source simulator for Atmel AVR processors, written in C. What makes this especially attractive is that it has fully working GDB support – meaning that we can actually debug at the assembly level. The GitHub page for simavr contains installation instructions.
After performing static analysis for a little while in IDA, I came across a function that looked suspiciously like a decryption loop. This is shown below.
A set of 23 static bytes is inserted into an array pointed at by the “Y” register (because the ATMega328p is an 8-bit processor, it uses three special memory-access registers called “X,” “Y,” and “Z,” each formed from a pair of 8-bit registers. The X register is based on r27:r26, the Y register is based on r29:r28 and the Z register is based on r31:r30). These bytes are XOR’d with a key byte that is stored in r24 (Z and Y are pointing to the same array) and then the index of the array is added to the byte value. This is then stored in an array pointed to by the X register. After the decryption routine, we see that the value at memory address 0x576 is compared to the ‘@’ character – indicating a correct decryption. We know that FLARE-On challenge flags are in the form of an email address, so we just need to figure out at what index in the array of size 23 the ‘@’ character is. Then we can write a routine to brute force the constant key byte and then pull the decrypted flag out of memory. We can use simavr with gdb to do this. First, we start simavr with the following command line.
Then, in a separate terminal, we can attach to the simulator with gdb using the following command:
avr-gdb -ex "target remote :1234"
We can then set breakpoints within the AVR code. First, we set a breakpoint at what IDA identifies as “loc_576” but is actually at 0xAEC in memory (IDA shows AVR code addresses in 16-bit words, while GDB uses byte addresses, so 0x576 × 2 = 0xAEC). To do this, we have to use the gdb command br *$pc + 0xaec . For some reason, only setting a breakpoint from a pc-relative offset works with simavr (at the beginning of execution, $pc is at 0x0). We can continue the program and, when it breaks, we can then run an info reg command to figure out what address the X register is pointing to.
The output of the info reg command is shown above. We can see that the X register is pointing to 0x56C in memory (remember the X register encompasses the value of r27 and r26). This means that the ‘@’ character of the flag is at index 0x576 – 0x56C = 0xA (decimal 10). We can now write a brute force method for the key in Python by transcribing the decryption algorithm I outlined above.
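cipher_text_byte = 0xed  # encrypted byte at the index where '@' is expected
test_index = 10          # index of '@' within the 23-byte array
# Try every possible key byte and report the one that decrypts to '@'
for key in range(256):
    if (((cipher_text_byte ^ key) + test_index) & 0xff) == ord('@'):
        print("KEY IS 0x%x" % key)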
Running this code reveals the key byte to be 0xDB. Now, we can run the program again in simavr, set a breakpoint at the beginning of the decryption loop, set $r24 to 0xDB, run the decryption and break at the comparison of the character at index 10 to ‘@.’ Here, we should be able to print the flag string at 0x56C to reveal the decrypted flag in memory. This quick and dirty set of GDB commands should do the trick – you can save this to a text file and run avr-gdb -x cmds.txt to run it as a script.
# Connect to our debugging session
target remote :1234
# Break at main decryption loop
br *$pc + 0xaec
# Set r24 to be the key value we calculated
# Disable the breakpoint so we leave the decryption loop
# Break after the decryption loop
br *$pc + 0x12
# Print the flag
Running this script yields the following:
The flag, “email@example.com” is printed. Alternatively, we could’ve used Python to generalize the decrypt script to the entire array, but where’s the fun in that?
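For completeness, here is a minimal sketch of what that generalized Python version could look like. The 23 encrypted bytes are not listed in this post, so the demo builds its own test data with a made-up plaintext; the function names are hypothetical, and `encrypt` exists only to produce that test data.

```python
def decrypt(cipher: bytes, key: int) -> bytes:
    """plain[i] = ((cipher[i] XOR key) + i) mod 256, as described above."""
    return bytes(((c ^ key) + i) & 0xFF for i, c in enumerate(cipher))


def encrypt(plain: bytes, key: int) -> bytes:
    """Inverse of decrypt, used here only to build demo data."""
    return bytes(((p - i) & 0xFF) ^ key for i, p in enumerate(plain))


def brute_force_key(cipher: bytes, at_index: int = 10):
    """Return every key byte that decrypts cipher[at_index] to '@'."""
    return [k for k in range(256)
            if (((cipher[at_index] ^ k) + at_index) & 0xFF) == ord("@")]


if __name__ == "__main__":
    # Made-up 23-byte plaintext with '@' at index 10, not the real flag.
    demo_plain = b"0123456789@placeholder!"
    demo_cipher = encrypt(demo_plain, 0xDB)
    for key in brute_force_key(demo_cipher):
        print("KEY IS 0x%x" % key, decrypt(demo_cipher, key))
```

With the real 23 bytes pulled from the disassembly in place of the demo data, the same two calls would recover the key byte and the whole flag without running the simulator.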
In this blog post, I shared two very different tools with two very different applications. Acquiring and learning new tools and skills is important for every reverse engineer – from the fledgling to the veteran. I hope that I was able to convey my mindset, motivation and method of using these tools, and was able to provide inspiration for you, the reader, to maybe give them a try for yourself. Happy reversing!