Write-up for the Google CTF 2019

Introduction

This is my second time participating in a CTF contest. This time I teamed up with the awesome people from TUNA and THU CSTA on the Tea Deliverers team. These guys are like superheroes when it comes to problem-solving!

In the end, I fully solved only one problem: poly. The problem and my solution are below.

Problem

nc poly.ctfcompetition.com 1337

It will present this banner:

Give me a bios image that I can load into qemu that will send
"Pull_that_lever1"to 169.254.169.254:80 over tcp. The server can
be delayed a bit, so your payload should retry connecting. And by
the way, it has to work both in x86_64 and on a qemu arm virt-2.8
machine, all in under 300 seconds

To make your life easier, here are the commands the server will run:

/usr/bin/qemu-system-arm -nographic -machine virt-2.8 -net nic -net "user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,guestfwd=tcp:169.254.169.254:80-cmd:netcat 127.0.0.1 $random_port" -bios $tmpfile

/usr/bin/qemu-system-x86_64 -nographic -net nic -net "user,restrict=on,net=169.254.0.0/16,host=169.254.169.253,guestfwd=tcp:169.254.169.254:80-cmd:netcat 127.0.0.1 $random_port" -bios $tmpfile

Boot image (base64 encoded, limit 3MiB encoded):

The challenge obviously wants a polyglot that runs on both ARM and x86.
The first idea that comes to mind is to use some "dual-use" instructions
to build a "loader" that jumps to a separate payload for each platform.
That strategy works beautifully for shellcode, but it will not work here because we are building a BIOS image, and the two machines do not start executing the image at the same place.
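
For reference, the behaviour the banner asks for (keep retrying the connection, then send the string over TCP) can be sketched in ordinary hosted Python. This is only an illustration of the required logic, not the firmware code. The function name send_with_retry, its parameters, and the injectable connect argument are my own assumptions for the sketch.

```python
import socket
import time


def send_with_retry(address, payload, attempts=30, delay=1.0,
                    connect=socket.create_connection):
    """Connect to address and send payload, retrying while the server is down.

    Returns the number of the attempt that succeeded (starting at 1).
    `connect` is injectable so the retry logic can be exercised offline.
    """
    for attempt in range(1, attempts + 1):
        try:
            with connect(address, timeout=5) as conn:
                conn.sendall(payload)
            return attempt
        except OSError:
            # Server not reachable yet: wait a little and try again.
            time.sleep(delay)
    raise ConnectionError(f"gave up after {attempts} attempts")


if __name__ == "__main__":
    # Address and message are taken from the challenge banner.
    send_with_retry(("169.254.169.254", 80), b"Pull_that_lever1")
```

In the real solution the same loop has to live inside the BIOS image, which is why the bootloader's network stack matters so much later on.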

BIOS Loading Process on x86 and ARM

First, some background on how an x86 machine loads its BIOS.
On a real motherboard the BIOS is stored on a flash memory chip. On x86 this memory is mapped to the 0xF0000 to 0x100000 range. The key point is that after reset the CPU does not start at the beginning of the ROM: it starts at the reset vector, 16 bytes below the top of the BIOS area (0xFFFF0 in this map). In other words, the x86 entry point sits at the very end of the BIOS ROM image.

0x00000 .. 0xA0000      DOS Memory Area       RAM
0xA0000 .. 0xC0000      Video Memory          Device Memory
0xC0000 .. 0xE0000      ISA Extension ROM     ROM
0xE0000 .. 0xF0000      BIOS Extension ROM    ROM
0xF0000 .. 0x100000     BIOS Area             ROM
(Quoted from QEMU Wiki)

On ARM machines this is not the case. The address mapping is largely vendor-dependent, but in our case (the QEMU virt-2.8 machine) execution starts at 0x00000000, which is mapped to the beginning of the ROM image. Voila! We just need to concatenate the two images and our polyglot is done!
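
To make the layout concrete, here is a small sketch of where each CPU enters a single image file. It assumes the usual x86 convention that the ROM is mapped so it ends at the 4 GiB boundary and that the reset vector is at 0xFFFFFFF0, 16 bytes below the top. The function names are mine, for illustration only.

```python
FOUR_GIB = 1 << 32
X86_RESET_VECTOR = 0xFFFFFFF0  # 16 bytes below the 4 GiB boundary


def x86_entry_file_offset(image_size):
    """File offset of the first x86 instruction, assuming the image is
    mapped so that its last byte sits just below 4 GiB."""
    rom_base = FOUR_GIB - image_size
    return X86_RESET_VECTOR - rom_base


def arm_entry_file_offset(image_size):
    """On the QEMU virt machine the image is mapped at address 0 and
    execution starts there, so the entry is the first byte of the file."""
    return 0


if __name__ == "__main__":
    size = 4 * 65536
    print(hex(arm_entry_file_offset(size)), hex(x86_entry_file_offset(size)))
```

The two entry points are at opposite ends of the file, so the ARM code can occupy the front and the x86 code the back without either one ever executing the other.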

Solution

Now the problem becomes "how do we create the payloads for ARM and x86?" Well, there is a project called U-Boot (Das U-Boot), known to every embedded developer, which is a bootloader for Linux (and other systems) on almost every platform. We just build U-Boot for both platforms and concatenate the results! The final payload must also be padded to a multiple of 65536 bytes (64 KiB).

And now we have the following snippet:

; multi-arch u-boot

; compile with nasm
bits 32
_start:
    incbin "../u-boot/u-boot.bin"
    times 23812 db 0x90 ; This is calculated padding
    incbin "../u-boot-x86/u-boot.rom"
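
The padding count in the "times" line depends on the sizes of the two U-Boot builds. As a rough sketch, and assuming the only constraint is that the total size be a multiple of 65536 bytes, the same image could be assembled in Python like this. build_polyglot and padding_needed are hypothetical helper names, and the file paths simply mirror the ones in the nasm snippet.

```python
ALIGN = 65536  # the final image must be a multiple of this size


def padding_needed(arm_size, x86_size, align=ALIGN):
    """Number of filler bytes to put between the two images."""
    return -(arm_size + x86_size) % align


def build_polyglot(arm_image, x86_rom, fill=0x90, align=ALIGN):
    """ARM image first (entered at offset 0), x86 ROM last (entered near
    the end of the file), filler bytes in between."""
    pad = padding_needed(len(arm_image), len(x86_rom), align)
    return arm_image + bytes([fill]) * pad + x86_rom


if __name__ == "__main__":
    with open("../u-boot/u-boot.bin", "rb") as f:
        arm = f.read()
    with open("../u-boot-x86/u-boot.rom", "rb") as f:
        x86 = f.read()
    with open("polyglot.bin", "wb") as f:
        f.write(build_polyglot(arm, x86))
```

The filler must go between the images rather than at the end, because the x86 ROM has to stay flush with the end of the file.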

Now the payload already runs on both platforms. The next problem is that the two machines have different Ethernet hardware: e1000 on x86 and virtio-net on ARM. Fortunately, U-Boot has built-in drivers for both cards. We just need to enable them in the build and add some code that actually probes the devices through the U-Boot Driver Model (DM).

int do_demo_net(cmd_tbl_t *cmdtp, int flag, int argc, char * const argv[])
{
	struct udevice *dev;
	int i, ret;

	/* Probe the first virtio device so that its child devices get bound. */
	puts("VIRTIO uclass entries:\n");
	int devnum = 0;
	struct udevice *net_dev;
	ret = uclass_get_device(UCLASS_VIRTIO, devnum, &net_dev);
	printf("getting devices %d\n", ret);
	/* Walk all virtio devices and print them for debugging. */
	for (i = 0, ret = uclass_first_device(UCLASS_VIRTIO, &dev);
	     dev;
	     ret = uclass_next_device(&dev)) {
		printf("entry %d - instance %08x, ops %08x, platdata %08x\n",
		       i++, (uint)map_to_sysmem(dev),
		       (uint)map_to_sysmem(dev->driver->ops),
		       (uint)map_to_sysmem(dev_get_platdata(dev)));
	}

	if (ret == 0) {
		/* Do the same for the Ethernet uclass (e1000 or virtio-net). */
		ret = uclass_get_device(UCLASS_ETH, devnum, &net_dev);

		printf("getting devices %d\n", ret);
		for (i = 0, ret = uclass_first_device(UCLASS_ETH, &dev);
			dev;
			ret = uclass_next_device(&dev)) {
			printf("entry %d - instance %08x, ops %08x, platdata %08x\n",
				i++, (uint)map_to_sysmem(dev),
				(uint)map_to_sysmem(dev->driver->ops),
				(uint)map_to_sysmem(dev_get_platdata(dev)));
		}
	}
	return cmd_process_error(cmdtp, ret);
}

Seems easy, right? Then comes a big surprise: the U-Boot network stack does not support TCP! I applied a TCP patch that I found on the U-Boot mailing list and hacked it up to send the challenge string and receive the reply (which turned out to be unnecessary).

Now we just pwn the server:

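from pwn import *
import socks  # PySocks, provides the SOCKS5 constant

#context.log_level = 'debug'
# Route the connection through a local SOCKS5 proxy (drop this line to connect directly).
context.proxy = (socks.SOCKS5, 'localhost', 1080)
p = remote("poly.ctfcompetition.com", 1337)

# Wait for the upload prompt, then send the base64-encoded image.
p.recvuntil(b"(base64 encoded, limit 3MiB encoded):")
with open("./noptest.b64", "rb") as f:
    p.send(f.read())
p.interactive()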
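
The script sends a pre-encoded file, noptest.b64. A minimal sketch of how such a file could be produced from the raw image follows. It assumes "3MiB encoded" means at most 3 * 1024 * 1024 bytes of base64 text. encode_image and the input file name polyglot.bin are hypothetical.

```python
import base64

ENCODED_LIMIT = 3 * 1024 * 1024  # assumed meaning of "limit 3MiB encoded"


def encode_image(image, limit=ENCODED_LIMIT):
    """Base64-encode a raw BIOS image and check it against the size limit."""
    encoded = base64.b64encode(image)
    if len(encoded) > limit:
        raise ValueError(f"encoded image is {len(encoded)} bytes, limit is {limit}")
    return encoded


if __name__ == "__main__":
    with open("polyglot.bin", "rb") as f:
        raw = f.read()
    with open("noptest.b64", "wb") as f:
        f.write(encode_image(raw))
```

Base64 grows the data by a factor of 4/3, so under this reading the raw image has to stay at or below 2.25 MiB.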

And here be the flag!

P.S. & Acknowledgment

Really exciting competition and a nice learning experience for a 菜鸡 (newbie) like me.

The patched U-Boot with TCP can be found at https://github.com/ProfFan/u-boot.
Check the commit logs to learn how to use it.

The author would like to thank:

  • The U-Boot authors :)
  • Duncan Hare, the author of the TCP patch
  • @riatre for his expertise (waaaaaaay above me XD)
  • @iromise for inviting me to the team
  • TUNA and all people from THU CSTA