cset error :failed to create shield, hint: do other cpusets exist? #34

Open
redno2 opened this issue Apr 1, 2020 · 14 comments

@redno2

redno2 commented Apr 1, 2020

Hello, I get an error when I try to use shield on Ubuntu.
Maybe this is the same as #26?
Tested with both the current version and 1.6.

root@host:~# cset shield -c 0-1
cset: --> failed to create shield, hint: do other cpusets exist?
cset: **> [Errno 22] Invalid argument

root@host:~# cset --version
cset: Cpuset (cset) 1.5.6

root@host:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.4 LTS
Release:        18.04
Codename:       bionic

I get the same error after manually installing the latest package on 19.10:

root@host2:~# cset shield -c 0-1
cset: --> failed to create shield, hint: do other cpusets exist?
cset: **> [Errno 22] Invalid argument

root@red-desktop:~# cset --version
cset: Cpuset (cset) 1.6

root@red-desktop:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 19.10
Release:        19.10
Codename:       eoan

Thanks.

@ryankurte

I have the same error, failed to create shield, hint: do other cpusets exist?, on Debian buster with cset version 1.5.6.

cset set shows that I have an additional set for docker. I wonder if this is the issue; I am not sure how to select the right cpuset.

cset: 
         Name       CPUs-X    MEMs-X Tasks Subs Path
 ------------ ---------- - ------- - ----- ---- ----------
         root        0-7 y       0 y  1550    2 /
       docker        0-7 n       0 n     0    9 /docker
 machine.slice        0-7 n       0 n     0    0 /machine.slice

@ryankurte

ryankurte commented Apr 23, 2020

It looks like there are some assumptions about existing cpusets. Manually creating a set with cset set -s user -c 0-7 and then invoking shield with cset shield --sysset=root --userset=user -k on -c 6-7, to explicitly specify the appropriate sets, gets a step further but still errors out:

CSET_DEBUG_LEVEL=10 cset shield --sysset=root --userset=user -k on -c 6-7
cset: **> [Errno 13] Permission denied
cset: insufficient permissions, you probably need to be root
Traceback (most recent call last):
  File "/usr/bin/cset", line 47, in <module>
    main()
  File "/usr/lib/python2.7/dist-packages/cpuset/main.py", line 228, in main
    command.func(parser, options, args)
  File "/usr/lib/python2.7/dist-packages/cpuset/commands/shield.py", line 262, in func
    make_shield(options.cpu, options.kthread)
  File "/usr/lib/python2.7/dist-packages/cpuset/commands/shield.py", line 395, in make_shield
    set.modify(SYS_SET, cpuspec_inv, memspec, False, False)
  File "/usr/lib/python2.7/dist-packages/cpuset/commands/set.py", line 411, in modify
    if cpuspec: nset.cpus = cpuspec
  File "/usr/lib/python2.7/dist-packages/cpuset/cset.py", line 186, in setcpus
    f.close()
IOError: [Errno 13] Permission denied

Removing the docker cpuset so that nothing else is there doesn't seem to help either.

(You can also enable debug output with CSET_DEBUG_LEVEL=10.)

@ryankurte

Manually creating and moving processes between sets still works, for example:

  • sudo cset set -s system -c 0-1 -m 0 to create a system cpuset with cores 0-1
  • sudo cset set -s user -c 2-7 -m 0 to create a user cpuset with cores 2-7
  • sudo cset proc -m -s root -t system -k to move tasks from / to /system cpuset
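
When splitting CPUs by hand like this, the user set is simply the complement of the system set. A minimal sketch of that arithmetic, assuming cpuset-style list strings such as "0-1,3-7"; the helper names parse_cpuspec, format_cpuspec and complement are hypothetical and are not part of cset:

```python
def parse_cpuspec(spec):
    """Expand a cpuset-style list such as "0-1,3-7" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty specs and stray commas
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpuspec(cpus):
    """Collapse a set of CPU numbers back into the compact "a-b,c" form."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run
        else:
            ranges.append([cpu, cpu])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def complement(all_cpus, reserved):
    """Return the CPUs in all_cpus that are not in reserved, as a spec string."""
    return format_cpuspec(parse_cpuspec(all_cpus) - parse_cpuspec(reserved))


# Hypothetical helper, not a cset function: CPUs left over for the user set.
print(complement("0-7", "0-1"))  # prints 2-7
```

The result can then be passed to the -c option of the second command above.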

However, I cannot alter the docker cpuset:

➜  ~ cset set -l                           
cset: 
         Name       CPUs-X    MEMs-X Tasks Subs Path
 ------------ ---------- - ------- - ----- ---- ----------
         root        0-7 y       0 y   136    3 /
         vfio        2-7 n       0 n     0    0 /vfio
       docker        0-7 n       0 n     0    8 /docker
       system        0-1 n       0 n  1556    0 /system

➜  ~ sudo cset set -s docker -c 0-1
cset: **> [Errno 16] Device or resource busy

@thiagokokada

Applying this patch (without the python2 part) seems to resolve the issue, at least for me:
https://rokups.github.io/#!pages/gaming-vm-performance.md#Update_1:_cpuset_patch

@ryankurte

Iiiinteresting, I'll have to try this patch.

It's not the root issue, but in case it's useful for others: to control the docker cpuset you need to change Docker's cgroup driver to systemd with something like "exec-opts": ["native.cgroupdriver=systemd"] in /etc/docker/daemon.json.
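
For anyone who already has other settings in daemon.json, here is a rough sketch of merging that option in without losing them. The function name with_systemd_cgroup_driver and the sample log-driver entry are assumptions made for illustration; the sketch only builds the JSON and does not write to /etc/docker/daemon.json:

```python
import json


def with_systemd_cgroup_driver(config):
    """Return a copy of a daemon.json dict with the systemd cgroup driver selected."""
    updated = dict(config)
    # Drop any previous cgroup driver choice, keep all other exec-opts.
    opts = [
        opt for opt in updated.get("exec-opts", [])
        if not opt.startswith("native.cgroupdriver=")
    ]
    opts.append("native.cgroupdriver=systemd")
    updated["exec-opts"] = opts
    return updated


# Hypothetical existing configuration, used only as sample input.
existing = {"log-driver": "json-file"}
print(json.dumps(with_systemd_cgroup_driver(existing), indent=2))
```

Docker reads this file at startup, so the daemon would presumably need a restart before the change takes effect.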

@thiagokokada

@ryankurte Let me know if this patch works for your case; if it does, I can open a PR.

@thiagokokada

Posting a cleaned-up version of the patch here in case someone wants to try it: cpuset.txt

@jonibim

jonibim commented Jan 9, 2021

@thiagokokada The patch fixed the error that I had. I am curious, though, why commenting out two lines fixed the whole problem. I would like to know more about this.

@thiagokokada

> @thiagokokada The patch fixed the error that I had. I am curious, though, why commenting out two lines fixed the whole problem. I would like to know more about this.

I don't understand the code well enough to know why this fixes the problem, but AFAIK it is probably skipping some guards.

My use case is isolating CPUs for a VM in libvirt. Without this patch, libvirt (which AFAIK uses the cpuset syscall internally, not this program) can't migrate the vCPUs to the isolated CPUs, but with this patch it works fine.

It definitely works fine for me, because there are no user-space or kernel threads running in the VM (this VM is highly sensitive to latency, so any thread stealing CPU cycles results in huge latency spikes).

@thiagokokada

I think these are the options that commenting out those lines disables: https://github.com/lpechacek/cpuset/blob/master/cpuset/commands/set.py#L166-L171. They're not exposed in shield, though, so I think shield sets both of those options to true unconditionally.

Maybe a patch exposing those options in shield would do the trick.

@thiagokokada

So we have these calls in the make_shield() function: https://github.com/lpechacek/cpuset/blob/master/cpuset/commands/shield.py#L379-L380

They basically call this function: https://github.com/lpechacek/cpuset/blob/master/cpuset/commands/set.py#L383-L400

So yes, it basically uses modify() with the CPU-exclusive flag set (but not the memory-exclusive one, contrary to what I said before). A better patch would be to simply pass False in those two lines: https://github.com/lpechacek/cpuset/blob/master/cpuset/commands/shield.py#L379-L380.

@thiagokokada

A slightly better patch: cpuset2.txt

@Werkov
Member

Werkov commented Jan 11, 2021

Dropping the CPU exclusivity of the cpuset won't keep the CPUs "shielded" as intended. I don't think this is an approach that would be broadly acceptable.

@thiagokokada Do you also have a cpuset cgroup on your system that intersects the shielded CPUs (as in the original report)? What happens if you exclude the offending CPUs from that cgroup?
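
To answer that question on a given machine, one could compare the CPUs requested for the shield against the CPU lists of the other cpusets. A minimal sketch, assuming the lists are available as strings like the ones cset set -l prints; overlapping_cpusets and parse_cpuspec are hypothetical helpers, and the sample values are taken from the Debian listing earlier in this thread:

```python
def parse_cpuspec(spec):
    """Expand a cpuset-style list such as "0,2,6-8" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # an empty spec means no CPUs are assigned
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overlapping_cpusets(shield_spec, cpusets):
    """Return {name: shared CPUs} for every cpuset that intersects the shield."""
    shield = parse_cpuspec(shield_spec)
    overlaps = {}
    for name, spec in cpusets.items():
        shared = shield & parse_cpuspec(spec)
        if shared:
            overlaps[name] = sorted(shared)
    return overlaps


# Non-root cpusets and their CPU lists, as reported on the Debian buster machine.
cpusets = {"docker": "0-7", "machine.slice": "0-7"}
print(overlapping_cpusets("6-7", cpusets))
```

Any cpuset that shows up in the result intersects the CPUs being shielded and is a candidate for the adjustment asked about above.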

@cout

cout commented Sep 8, 2021

I also had this error:

$ sudo cset shield --cpu 2
cset: --> failed to create shield, hint: do other cpusets exist?
cset: **> [Errno 22] Invalid argument

I see I have no docker containers running:

$ sudo docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

But I do have a cpuset for docker:

$ sudo cset set -l
cset: 
         Name       CPUs-X    MEMs-X Tasks Subs Path
 ------------ ---------- - ------- - ----- ---- ----------
         root       0-63 y     0-1 y  4117    1 /
       docker 0,2,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62 n     0-1 n     0    0 ...

But on another machine, where cset shield works just fine, I have:

$ cset set -l
cset: 
         Name       CPUs-X    MEMs-X Tasks Subs Path
 ------------ ---------- - ------- - ----- ---- ----------
         root        0-7 y       0 y   155    3 /
         user          2 y       0 n     5    0 /user
       docker      ***** n   ***** n     0    0 /docker
       system    0-1,3-7 y       0 n   955    0 /system

So on the first machine I tried removing the cpus assigned to docker:

root# echo > /sys/fs/cgroup/cpuset/docker/cpuset.cpus

Then the list looks like this:

$ cset set -l
cset: 
         Name       CPUs-X    MEMs-X Tasks Subs Path
 ------------ ---------- - ------- - ----- ---- ----------
         root       0-63 y     0-1 y  4112    1 /
       docker      ***** n     0-1 n     0    0 /docker

And I am able to create a shield with no errors:

$ sudo cset shield --cpu 2                        
cset: --> activating shielding:
cset: moving 3370 tasks from root into system cpuset...
[==================================================]%
cset: "system" cpuset of CPUSPEC(0-1,3-63) with 3370 tasks running
cset: "user" cpuset of CPUSPEC(2) with 0 tasks running
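
The same check can be done without cset by reading the cgroup filesystem directly. A rough sketch, assuming the cgroup v1 cpuset hierarchy is mounted at /sys/fs/cgroup/cpuset as in the echo command above; list_child_cpusets is a hypothetical helper name:

```python
import os


def list_child_cpusets(root="/sys/fs/cgroup/cpuset"):
    """Map each direct child cpuset of root to the contents of its cpuset.cpus file."""
    found = {}
    if not os.path.isdir(root):
        return found  # hierarchy not mounted at this path
    for entry in sorted(os.listdir(root)):
        cpus_file = os.path.join(root, entry, "cpuset.cpus")
        if os.path.isfile(cpus_file):
            with open(cpus_file) as handle:
                found[entry] = handle.read().strip()
    return found


# Report every child cpuset; an empty string means no CPUs are assigned.
for name, cpus in list_child_cpusets().items():
    print(f"{name}: {cpus or '(no CPUs assigned)'}")
```

A child that still lists CPUs here (docker, on the first machine) is the kind of leftover cpuset that was in the way of the shield.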
