Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lxcfs: tighten policy about write() syscall #630

Merged
merged 4 commits into from
Mar 18, 2024

Conversation

mihalicyn
Copy link
Member

No description provided.

It's just dangerous to allow passthrough of write()
syscall anywhere under emulated sysfs subtree.

Let's forbid it.

Signed-off-by: Alexander Mikhalitsyn <[email protected]>
@mihalicyn mihalicyn requested a review from stgraber March 15, 2024 16:06
Copy link
Contributor

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

@stgraber which scenarios are you thinking that enabling --allow-write-on-cgroup may be useful for btw?

@mihalicyn
Copy link
Member Author

@stgraber which scenarios are you thinking that enabling --allow-write-on-cgroup may be useful for btw?

before cgroup namespace appeared, LXCFS was used to emulate cgroup tree inside the container. So it's usually safe to allow cgroup writes to a cgroup subtree from inside the container (it is something that we allow these days by default when cgroup namespace is used. "cgroup delegation" thing).

We introducing this just as an extra security measure. These days this cgroup emulation code in LXCFS is almost dead.

@stgraber
Copy link
Member

I actually expected a --enable-cgroup as I don't think we should show a cgroup tree at all by default, not even read only.

I just don't want to throw away the code because of embedded platforms running very old kernels and because we may reuse the code to offer a cgroup1 tree on a cgroup2 host system down the line.

@tomponline
Copy link
Contributor

@stgraber which scenarios are you thinking that enabling --allow-write-on-cgroup may be useful for btw?

before cgroup namespace appeared, LXCFS was used to emulate cgroup tree inside the container. So it's usually safe to allow cgroup writes to a cgroup subtree from inside the container (it is something that we allow these days by default when cgroup namespace is used. "cgroup delegation" thing).

We introducing this just as an extra security measure. These days this cgroup emulation code in LXCFS is almost dead.

Thanks. So if LXCFS on newer systems isn;t emulating cgroup tree, and writes are usually safe and allowed from inside the container, then can you help me understand the issue that requires disabling writes by default (and fixing the disabling of cpu).

@mihalicyn
Copy link
Member Author

I actually expected a --enable-cgroup as I don't think we should show a cgroup tree at all by default, not even read only.

It's just empty, if you don't have cgroup-v1 on the host.

@mihalicyn
Copy link
Member Author

Let me rework it a bit, then :)

@tomponline
Copy link
Contributor

I just don't want to throw away the code because of embedded platforms running very old kernels and because we may reuse the code to offer a cgroup1 tree on a cgroup2 host system down the line.

Got you, thanks!

@stgraber
Copy link
Member

Yeah, but even on cgroup1 hosts, I'd prefer we default to no virtual cgroup tree unless --enable-cgroup is passed.

@mihalicyn mihalicyn force-pushed the sys_write_forbid branch 3 times, most recently from 8582928 to 839e0db Compare March 15, 2024 16:48
src/lxcfs.c Outdated Show resolved Hide resolved
src/lxcfs.c Outdated Show resolved Hide resolved
@mihalicyn mihalicyn force-pushed the sys_write_forbid branch 4 times, most recently from 59da42b to ba9788a Compare March 15, 2024 17:22
During our private discussion, Stéphane proposed
to add a new option --enable-cgroup to explicitly
enable old cgroup emulation code

It's worth mentioning that cgroup code in LXCFS
is not widely used, because it was written before
cgroup namespace era and not actual these days.

Signed-off-by: Alexander Mikhalitsyn <[email protected]>
@mihalicyn
Copy link
Member Author

Hm, it looks like something is wrong with ubuntu-22.04 github runners. Let's wait a bit and once it become green we can merge.
I can't see or reproduce any issues with this PR on my machine.

@mihalicyn mihalicyn force-pushed the sys_write_forbid branch 4 times, most recently from 22fab6f to 7b18f1a Compare March 15, 2024 22:39
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
@mihalicyn
Copy link
Member Author

mihalicyn commented Mar 18, 2024

@mihalicyn
Copy link
Member Author

... and this:

[    4.442222] ================================================================================
[    4.444656] UBSAN: array-index-out-of-bounds in /build/linux-azure-6.5-dqcUxs/linux-azure-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1448:41
[    4.448028] index 1 is out of range for type 'vmtransfer_page_range [1]'
[    4.449852] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.5.0-1016-azure #16~22.04.1-Ubuntu
[    4.449856] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[    4.449858] Call Trace:
[    4.449860]  <IRQ>
[    4.449864]  dump_stack_lvl+0x37/0x50
[    4.449874]  dump_stack+0x10/0x20
[    4.449876]  __ubsan_handle_out_of_bounds+0xa2/0xe0
[    4.449881]  ? down_write_failed+0x191/0x1b0
[    4.449886]  ? rndis_filter_receive+0x85/0x190 [hv_netvsc]
[    4.449895]  ? __update_load_avg_se+0x248/0x2c0
[    4.449900]  netvsc_receive+0x418/0x440 [hv_netvsc]
[    4.449906]  ? srso_alias_return_thunk+0x5/0x7f
[    4.449911]  netvsc_poll+0x15c/0x470 [hv_netvsc]
[    4.449918]  __napi_poll+0x33/0x1d0
[    4.449923]  net_rx_action+0x17b/0x2e0
[    4.449926]  ? srso_alias_return_thunk+0x5/0x7f
[    4.449930]  __do_softirq+0xd8/0x2de
[    4.449933]  __irq_exit_rcu+0xb0/0xd0
[    4.449938]  irq_exit_rcu+0xe/0x20
[    4.449942]  sysvec_hyperv_callback+0x80/0x90
[    4.449945]  </IRQ>
[    4.449947]  <TASK>
[    4.449949]  asm_sysvec_hyperv_callback+0x1b/0x20
[    4.449952] RIP: 0010:default_idle+0xb/0x20
[    4.449956] Code: ff eb c9 cc cc cc cc cc cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 eb 07 0f 00 2d 77 9d 23 00 fb f4 <fa> e9 ff 47 01 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90
[    4.449959] RSP: 0018:ffffbc59400a3e88 EFLAGS: 00000216
[    4.449962] RAX: ffff92ae6fcadba8 RBX: ffff92aac038cc80 RCX: ffff92aad4449000
[    4.449964] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000033ddc
[    4.449966] RBP: ffffbc59400a3e90 R08: 0000000108b747a2 R09: 0000000000000001
[    4.449968] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    4.449970] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    4.449974]  ? srso_alias_return_thunk+0x5/0x7f
[    4.449976]  ? arch_cpu_idle+0x9/0x10
[    4.449979]  default_idle_call+0x2c/0xa0
[    4.449983]  cpuidle_idle_call+0x132/0x170
[    4.449986]  do_idle+0x82/0xf0
[    4.449989]  cpu_startup_entry+0x2d/0x30
[    4.449991]  start_secondary+0x120/0x150
[    4.449996]  secondary_startup_64_no_verify+0x17e/0x18b
[    4.450002]  </TASK>
[    4.450004] ================================================================================
[    4.452418] ================================================================================
[    4.454661] UBSAN: array-index-out-of-bounds in /build/linux-azure-6.5-dqcUxs/linux-azure-6.5-6.5.0/drivers/net/hyperv/netvsc.c:1449:41
[    4.457833] index 1 is out of range for type 'vmtransfer_page_range [1]'
[    4.459604] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.5.0-1016-azure #16~22.04.1-Ubuntu
[    4.459607] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[    4.459609] Call Trace:
[    4.459610]  <IRQ>
[    4.459612]  dump_stack_lvl+0x37/0x50
[    4.459617]  dump_stack+0x10/0x20
[    4.459620]  __ubsan_handle_out_of_bounds+0xa2/0xe0
[    4.459624]  ? down_write_failed+0x191/0x1b0
[    4.459628]  ? rndis_filter_receive+0x85/0x190 [hv_netvsc]
[    4.459635]  ? __update_load_avg_se+0x248/0x2c0
[    4.459639]  netvsc_receive+0x401/0x440 [hv_netvsc]
[    4.459645]  ? srso_alias_return_thunk+0x5/0x7f
[    4.459649]  netvsc_poll+0x15c/0x470 [hv_netvsc]
[    4.459656]  __napi_poll+0x33/0x1d0
[    4.459660]  net_rx_action+0x17b/0x2e0
[    4.459663]  ? srso_alias_return_thunk+0x5/0x7f
[    4.459667]  __do_softirq+0xd8/0x2de
[    4.459671]  __irq_exit_rcu+0xb0/0xd0
[    4.459675]  irq_exit_rcu+0xe/0x20
[    4.459678]  sysvec_hyperv_callback+0x80/0x90
[    4.459682]  </IRQ>
[    4.459683]  <TASK>
[    4.459685]  asm_sysvec_hyperv_callback+0x1b/0x20
[    4.459688] RIP: 0010:default_idle+0xb/0x20
[    4.459691] Code: ff eb c9 cc cc cc cc cc cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 eb 07 0f 00 2d 77 9d 23 00 fb f4 <fa> e9 ff 47 01 00 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 90
[    4.459693] RSP: 0018:ffffbc59400a3e88 EFLAGS: 00000216
[    4.459696] RAX: ffff92ae6fcadba8 RBX: ffff92aac038cc80 RCX: ffff92aad4449000
[    4.459698] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000033ddc
[    4.459700] RBP: ffffbc59400a3e90 R08: 0000000108b747a2 R09: 0000000000000001
[    4.459702] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    4.459703] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    4.459707]  ? srso_alias_return_thunk+0x5/0x7f
[    4.459710]  ? arch_cpu_idle+0x9/0x10
[    4.459713]  default_idle_call+0x2c/0xa0
[    4.459716]  cpuidle_idle_call+0x132/0x170
[    4.459719]  do_idle+0x82/0xf0
[    4.459722]  cpu_startup_entry+0x2d/0x30
[    4.459724]  start_secondary+0x120/0x150
[    4.459728]  secondary_startup_64_no_verify+0x17e/0x18b
[    4.459734]  </TASK>
[    4.459735] ================================================================================

@brauner brauner merged commit 889c683 into lxc:main Mar 18, 2024
10 checks passed
Copy link
Contributor

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants