[RFC] Support address sanitizers within VMs via memory overcommit #826

Open
Ionic opened this issue Jun 3, 2022 · 0 comments · May be fixed by #827

Comments

Ionic commented Jun 3, 2022

I'm creating this issue as a discussion platform. The PR implementing this is #827.

Original issue

Currently, packages utilizing asan fail to build within OBS, whether built in a VM or directly on the host. Thus, most packages employ hacks to disable asan, as done for meson's test suite (lines 165+ currently; I can't link to specific line numbers on OBS).

Reason

This is because asan tries to opportunistically reserve multiple TB of virtual memory, even when used on a simple hello-world program. That sounds rather weird at first, but the Linux kernel's default heuristic overcommit mode allows such behavior (it can be tuned via kernel parameters such as vm.overcommit_memory), since the kernel doesn't actually allocate the memory a process requests up front. Instead, it just assigns the region to the process and allocates pages on demand, via page faults, when they are first accessed. Most programs requesting huge regions, including asan, never use more than a small fraction of the region, so just allowing overcommitment is fine.
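To make the on-demand behavior concrete, here is a minimal sketch (Linux only, since it reads /proc/self/status). It maps a region, touches only a few pages and compares how much the virtual size and the resident set grew. The helper names and the 64 MiB size are made up for illustration; asan's real reservation is far larger and is not reproduced here.

```python
import mmap


def read_status_kib(field):
    """Return a field such as VmSize or VmRSS from /proc/self/status, in KiB."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)


def reserve_and_touch(size, touch_bytes):
    """Map `size` bytes, write to the first `touch_bytes`, return growth in KiB."""
    vsz_before = read_status_kib("VmSize")
    rss_before = read_status_kib("VmRSS")
    # Private anonymous mapping: address space is assigned, no pages are backed yet.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Writing one byte per page forces the kernel to back the touched pages.
        for offset in range(0, touch_bytes, mmap.PAGESIZE):
            region[offset] = 1
        vsz_growth = read_status_kib("VmSize") - vsz_before
        rss_growth = read_status_kib("VmRSS") - rss_before
    finally:
        region.close()
    return vsz_growth, rss_growth


if __name__ == "__main__":
    vsz, rss = reserve_and_touch(64 * 1024 * 1024, 4 * mmap.PAGESIZE)
    print(f"virtual size grew by {vsz} KiB, resident set by {rss} KiB")
```

The virtual size should grow by roughly the full mapping, while the resident set only grows by what was touched. RLIMIT_AS counts the former, which is why it gets in asan's way.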

Now, that only works if RLIMIT_AS is at least as large as the address space the process ends up with, including the requested region. If it's set to infinity, that's trivially true. However, build actually limits it to (2/3) * (total_mem + total_swap) in stage1, which makes sense, since stage1 runs natively on the host (and does things like setting up the VM if necessary, preinstalling packages and the like).
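As a worked example of that formula, the sketch below computes the stage1 limit for a hypothetical host and checks whether a reservation would fit under it. The host sizes (16 GiB RAM, 2 GiB swap), the 16 TiB request and the function names are illustrative assumptions, not values taken from build.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4


def stage1_limit_bytes(total_mem, total_swap):
    """Limit as described above: (2/3) * (total_mem + total_swap)."""
    return (total_mem + total_swap) * 2 // 3


def fits_under_limit(current_vsz, request, limit):
    """RLIMIT_AS caps the whole address space, so existing mappings count too."""
    return current_vsz + request <= limit


if __name__ == "__main__":
    # Hypothetical host: 16 GiB RAM, 2 GiB swap.
    limit = stage1_limit_bytes(16 * GIB, 2 * GIB)
    print(f"limit: {limit / GIB:.0f} GiB")
    # Illustrative multi-TB reservation of the kind asan makes.
    print("16 TiB fits:", fits_under_limit(100 * 1024 ** 2, 16 * TIB, limit))
    print("1 GiB fits:", fits_under_limit(100 * 1024 ** 2, 1 * GIB, limit))
```

On any realistic builder the limit is a few GiB to a few hundred GiB, so a multi-TB reservation fails regardless of how little of it would ever be touched.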

Then, in stage2, which is executed within the VM, build skips setting the memory limit. At that point it's too late, though, since the parent process that spawned the VM (stage1) already had a limit set, and that limit is inherited by the VM process.
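The inheritance can be sketched with two nested Python processes standing in for stage1 and the VM process. This is only a model of the mechanism, not of build itself. The 2 GiB value and the names are arbitrary, and the sketch assumes a Unix host where the `resource` module is available.

```python
import resource
import subprocess
import sys

# Stand-in for stage1: lower its own soft limit, then spawn a child (the "VM").
STAGE1 = """
import resource, subprocess, sys
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (int(sys.argv[1]), hard))
probe = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"
subprocess.run([sys.executable, "-c", probe], check=True)
"""


def limit_seen_by_child(requested):
    """Return (limit applied in the stand-in stage1, soft limit its child reports)."""
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    # A soft limit may not exceed the hard limit, so clamp if one is set.
    applied = requested if hard == resource.RLIM_INFINITY else min(requested, hard)
    result = subprocess.run(
        [sys.executable, "-c", STAGE1, str(applied)],
        capture_output=True, text=True, check=True,
    )
    return applied, int(result.stdout)


if __name__ == "__main__":
    applied, seen = limit_seen_by_child(2 * 1024 ** 3)
    print(f"stage1 set {applied}, child sees {seen}")
```

The child never calls setrlimit itself, yet it reports the value its parent set. That is the situation stage2 ends up in for locally spawned VMs.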

This issue might not actually affect all VM types, as far as I've seen: types such as ec2 or zvm are spawned remotely and are not influenced by any resource limit set on the builder host machine.

Remedy

We should be able to make that work by removing the limit imposed on VM processes, either via the VM's spawn tools (if they support modifying resource limits natively) or by setting it to unlimited in shell right before spawning the VM. This is essentially what my PR is doing.
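For illustration, lifting the limit could look roughly like the sketch below. The PR does this in shell (presumably something along the lines of `ulimit -v unlimited`, though that is an assumption here). The function name is hypothetical, and the fallback branch only matters for unprivileged callers, whereas build normally runs as root.

```python
import resource


def lift_address_space_limit():
    """Raise RLIMIT_AS as far as this process is allowed to; return the new limits."""
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    try:
        # Works for root (CAP_SYS_RESOURCE) even if the hard limit was lowered.
        resource.setrlimit(
            resource.RLIMIT_AS, (resource.RLIM_INFINITY, resource.RLIM_INFINITY)
        )
    except (ValueError, OSError):
        # Unprivileged: the soft limit can only be raised up to the hard limit.
        resource.setrlimit(resource.RLIMIT_AS, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_AS)


if __name__ == "__main__":
    soft, hard = lift_address_space_limit()
    print("RLIMIT_AS soft/hard:", soft, hard)
```

One detail worth keeping in mind: in bash, `ulimit -v N` without `-S` or `-H` lowers both the soft and the hard limit, so only a privileged process can undo it. Calling something like this in the child right before exec'ing the VM command keeps stage1 itself limited while letting the VM process start unrestricted.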

Impact

That is a bit difficult to tell. I assume that the memory limit was imposed to protect the host system from malicious (or just buggy) code executed within the builders, i.e., so as to not make a builder unresponsive or trigger the OOM killer.

  • For full virtualization, like with qemu(/kvm) or Xen, or even paravirtualization like UVM, limiting RLIMIT_AS on the process that spawns the VM is useless, since we're already spawning the hypervisor with a specific amount of memory, which, by design, can't be exceeded. Allowing memory overcommit within the VM is thus totally safe.
  • Anything not spawned on the build host, but remotely on cloud instances, is not even affected by whatever RLIMIT_AS is set to on the build host. Likewise, there are no security consequences here.
  • For containerized virtualization, like LXC, docker and systemd-nspawn, the situation is different. While memory can be limited for some of them, we usually don't do this, and internally, limiting the memory usage is often done through resource limits anyway. Setting RLIMIT_AS to infinity would allow legitimate usage of asan, and probably of other programs, too. But even as it stands, the current memory limit is not sufficient to provide the protection it is supposed to: it's hardcoded to (2/3) * (total_mem + total_swap), as mentioned above, per build process/worker, and most build hosts spawn num_cpu workers. Hence, you'd already overcommit with two or more workers anyway.
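The arithmetic behind the last point, as a small sketch: each worker is capped at 2/3 of memory plus swap, so the combined cap exceeds the host's total as soon as a second worker exists. The names are illustrative only.

```python
from fractions import Fraction

# Each worker may use up to 2/3 of (total_mem + total_swap).
PER_WORKER_SHARE = Fraction(2, 3)


def combined_share(workers):
    """Sum of all per-worker caps, as a fraction of (total_mem + total_swap)."""
    return workers * PER_WORKER_SHARE


def min_overcommitting_workers():
    """Smallest worker count whose combined cap exceeds the host's total."""
    workers = 1
    while combined_share(workers) <= 1:
        workers += 1
    return workers


if __name__ == "__main__":
    print("workers needed to overcommit:", min_overcommitting_workers())
    print("combined share with 8 workers:", combined_share(8))
```

So on a host with num_cpu workers, the per-process limit bounds a single runaway build but not the sum of them.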

Due to this, I think that just removing the RLIMIT_AS memory limit for VM-type builds is beneficial, and the risk seems to be very low.

Ionic added a commit to Ionic/obs-build that referenced this issue Jun 3, 2022
The "build" script limits PRLIMIT_AS to (2/3) * (mem_size + swap_size),
which breaks asan, which tries to opportunistically allocate multiple TB
of memory, but crucially never even closely uses this.

This commit eliminates the memory limit right before spawning VMs (and
containers) if necessary, so that stage2 can run unrestricted
(notwithstanding the memory limit imposed by hypervisors) and use memory
overcommitment within the build process.

See the linked bug for a discussion of this.

Fixes: openSUSE#826
@Ionic Ionic linked a pull request Jun 3, 2022 that will close this issue