juju crashdump sometimes tries to use the wrong ip #60

Open
jhobbs opened this issue Jul 28, 2020 · 1 comment

jhobbs commented Jul 28, 2020

Not all IP addresses for machines are routable; sometimes it picks the "internal" one. It should, like juju, try them all until one works.
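A minimal sketch of the fallback being requested, assuming a hypothetical helper (not part of juju-crashdump today): probe each candidate address for the machine and use the first one whose SSH port answers, instead of assuming a single address is routable.

```python
import socket


def first_reachable(addresses, port=22, timeout=5):
    """Return the first address that accepts a TCP connection on `port`.

    Sketch only: tries every candidate address in turn, the way
    `juju ssh` effectively falls back across a machine's addresses,
    and returns None if none of them is reachable.
    """
    for addr in addresses:
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                return addr
        except OSError:
            continue
    return None
```

With this in place the SSH/scp commands would be built from `first_reachable(machine_addresses)` rather than from whichever single address happened to be listed first.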


dasm commented Mar 22, 2021

I just hit exactly the same problem.
I noticed one of my machines was in an error state, so I wanted to investigate what happened.

ubuntu@dasm-bastion:~/.local/share/juju$ juju status
Model         Controller                   Cloud/Region             Version  SLA          Timestamp
dasm  dasm-serverstack  serverstack/serverstack  2.8.9    unsupported  15:44:35Z

App                     Version  Status  Scale  Charm                   Store       Rev  OS      Notes
ceph-mon                15.2.8   active      3  ceph-mon                jujucharms   53  ubuntu
ceph-osd                15.2.8   active      3  ceph-osd                jujucharms  308  ubuntu
ceph-radosgw            15.2.8   active      1  ceph-radosgw            jujucharms  294  ubuntu
cinder                  17.0.1   active      1  cinder                  jujucharms  308  ubuntu  exposed
cinder-ceph             17.0.1   active      1  cinder-ceph             jujucharms  260  ubuntu
cinder-mysql-router     8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
dashboard-mysql-router  8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
glance                  21.0.0   active      1  glance                  jujucharms  303  ubuntu  exposed
glance-mysql-router     8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
keystone                18.0.0   active      1  keystone                jujucharms  321  ubuntu  exposed
keystone-mysql-router   8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
mysql-innodb-cluster    8.0.23   active      3  mysql-innodb-cluster    jujucharms    5  ubuntu
neutron-api             17.0.0   active      1  neutron-api             jujucharms  292  ubuntu  exposed
neutron-api-plugin-ovn  17.0.0   active      1  neutron-api-plugin-ovn  jujucharms    4  ubuntu
neutron-mysql-router    8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
nova-cloud-controller   22.0.1   active      1  nova-cloud-controller   jujucharms  352  ubuntu  exposed
nova-compute            22.0.1   active      3  nova-compute            jujucharms  325  ubuntu
nova-mysql-router       8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
ntp                     3.5      active      3  ntp                     jujucharms   44  ubuntu
openstack-dashboard     18.6.1   active      1  openstack-dashboard     jujucharms  311  ubuntu  exposed
ovn-central             20.03.1  active      3  ovn-central             jujucharms    5  ubuntu
ovn-chassis             20.03.1  active      3  ovn-chassis             jujucharms   10  ubuntu
placement               4.0.0    active      1  placement               jujucharms   17  ubuntu
placement-mysql-router  8.0.23   active      1  mysql-router            jujucharms    6  ubuntu
rabbitmq-server         3.8.2    active      1  rabbitmq-server         jujucharms  108  ubuntu
vault                   1.5.4    error       1  vault                   jujucharms   44  ubuntu
vault-mysql-router      8.0.23   active      1  mysql-router            jujucharms    6  ubuntu

Unit                         Workload  Agent  Machine  Public address  Ports              Message
ceph-mon/0                   active    idle   0        10.5.0.4                           Unit is ready and clustered
ceph-mon/1                   active    idle   1        10.5.0.5                           Unit is ready and clustered
ceph-mon/2*                  active    idle   2        10.5.0.15                          Unit is ready and clustered
ceph-osd/0                   active    idle   3        10.5.0.46                          Unit is ready (1 OSD)
ceph-osd/1*                  active    idle   4        10.5.0.27                          Unit is ready (1 OSD)
ceph-osd/2                   active    idle   5        10.5.0.12                          Unit is ready (1 OSD)
ceph-radosgw/0*              active    idle   6        10.5.0.10       80/tcp             Unit is ready
cinder/0*                    active    idle   7        10.5.0.38       8776/tcp           Unit is ready
  cinder-ceph/0*             active    idle            10.5.0.38                          Unit is ready
  cinder-mysql-router/0*     active    idle            10.5.0.38                          Unit is ready
glance/0*                    active    idle   8        10.5.0.30       9292/tcp           Unit is ready
  glance-mysql-router/0*     active    idle            10.5.0.30                          Unit is ready
keystone/0*                  active    idle   9        10.5.0.28       5000/tcp           Unit is ready
  keystone-mysql-router/0*   active    idle            10.5.0.28                          Unit is ready                                                                                                       
mysql-innodb-cluster/0       active    idle   10       10.5.0.21                          Unit is ready: Mode: R/W
mysql-innodb-cluster/1       active    idle   11       10.5.0.44                          Unit is ready: Mode: R/O
mysql-innodb-cluster/2*      active    idle   12       10.5.0.7                           Unit is ready: Mode: R/O
neutron-api/0*               active    idle   13       10.5.0.9        9696/tcp           Unit is ready
  neutron-api-plugin-ovn/0*  active    idle            10.5.0.9                           Unit is ready  
  neutron-mysql-router/0*    active    idle            10.5.0.9                           Unit is ready
nova-cloud-controller/0*     active    idle   14       10.5.0.18       8774/tcp,8775/tcp  Unit is ready
  nova-mysql-router/0*       active    idle            10.5.0.18                          Unit is ready             
nova-compute/0*              active    idle   15       10.5.0.17                          Unit is ready                   
  ntp/0*                     active    idle            10.5.0.17       123/udp            chrony: Ready
  ovn-chassis/0*             active    idle            10.5.0.17                          Unit is ready
nova-compute/1               active    idle   16       10.5.0.6                           Unit is ready
  ntp/1                      active    idle            10.5.0.6        123/udp            chrony: Ready
  ovn-chassis/1              active    idle            10.5.0.6                           Unit is ready
nova-compute/2               active    idle   17       10.5.0.11                          Unit is ready
  ntp/2                      active    idle            10.5.0.11       123/udp            chrony: Ready
  ovn-chassis/2              active    idle            10.5.0.11                          Unit is ready
openstack-dashboard/0*       active    idle   18       10.5.0.20       80/tcp,443/tcp     Unit is ready
  dashboard-mysql-router/0*  active    idle            10.5.0.20                          Unit is ready 
ovn-central/0*               active    idle   19       10.5.0.45       6641/tcp,6642/tcp  Unit is ready (leader: ovnnb_db, ovnsb_db)
ovn-central/1                active    idle   20       10.5.0.14       6641/tcp,6642/tcp  Unit is ready
ovn-central/2                active    idle   21       10.5.0.24       6641/tcp,6642/tcp  Unit is ready (northd: active)
placement/0*                 active    idle   22       10.5.0.52       8778/tcp           Unit is ready 
  placement-mysql-router/0*  active    idle            10.5.0.52                          Unit is ready
rabbitmq-server/0*           active    idle   23       10.5.0.22       5672/tcp           Unit is ready 
vault/0*                     error     idle   24       10.5.0.8        8200/tcp           hook failed: "update-status"
  vault-mysql-router/0*      active    idle            10.5.0.8                           Unit is ready
                                                                                                        
Machine  State    DNS        Inst id                               Series  AZ    Message       
0        started  10.5.0.4   af868f48-6848-4708-9d9e-f3f9074758d9  focal   nova  ACTIVE        
1        started  10.5.0.5   8013182f-2505-477e-8b43-a3ea656cc728  focal   nova  ACTIVE                 
2        started  10.5.0.15  41127c94-16b8-455a-9500-6922da577990  focal   nova  ACTIVE        
3        started  10.5.0.46  f5405e8f-fb35-4b0d-aa96-4e41d1152904  focal   nova  ACTIVE        
4        started  10.5.0.27  d9e4d129-0a9d-48a1-bf51-17dd00488410  focal   nova  ACTIVE        
5        started  10.5.0.12  aa9c16e9-abfb-4d40-9e0d-061f3bf7140c  focal   nova  ACTIVE                 
6        started  10.5.0.10  42d02804-2ee9-4066-a12d-9ea2e733c3bb  focal   nova  ACTIVE        
7        started  10.5.0.38  52246ec1-5705-4f64-9758-2bf27675a09a  focal   nova  ACTIVE        
8        started  10.5.0.30  16ceb443-623b-41c9-ac57-7bbe8a8eaab6  focal   nova  ACTIVE        
9        started  10.5.0.28  3533b6c9-95b7-4314-953f-3d37e83e66b7  focal   nova  ACTIVE        
10       started  10.5.0.21  d768edb6-a506-4f46-ac3b-7a263b9cd9ec  focal   nova  ACTIVE        
11       started  10.5.0.44  2e6b2b11-eb76-4a85-be96-39cda8e775db  focal   nova  ACTIVE        
12       started  10.5.0.7   0442f43e-f1f8-40a5-a9f4-0fd15eda99a4  focal   nova  ACTIVE        
13       started  10.5.0.9   47c5b116-4121-4d37-bdc2-bc5024db1452  focal   nova  ACTIVE
14       started  10.5.0.18  457cc00f-f6f7-4d24-9721-9f6eb8202e73  focal   nova  ACTIVE          
15       started  10.5.0.17  4718bf65-c570-448f-a119-8ab3f9d4f697  focal   nova  ACTIVE                              
16       started  10.5.0.6   8d6e9f88-1756-4ea4-9afc-5f9328e6256e  focal   nova  ACTIVE                              
17       started  10.5.0.11  f20e619c-86d2-456c-9205-eae0c3e9d9f4  focal   nova  ACTIVE                              
18       started  10.5.0.20  8626ab19-03d2-40f9-ab01-6763043b7f85  focal   nova  ACTIVE                        
19       started  10.5.0.45  217f5495-b46e-4315-9f88-b13503eb0009  focal   nova  ACTIVE                        
20       started  10.5.0.14  ec3c8ae9-b4df-4437-9295-5b775019a16a  focal   nova  ACTIVE                        
21       started  10.5.0.24  2c421863-5626-4e38-9326-63d498d927ce  focal   nova  ACTIVE                
22       started  10.5.0.52  4d623f88-e57d-403f-bd6b-a13cb9fe885a  focal   nova  ACTIVE                
23       started  10.5.0.22  adf28f27-1881-400a-b221-16d5b8efcd8b  focal   nova  ACTIVE                
24       started  10.5.0.8   0e8ce15e-082a-4620-b599-ce05f2e6c02d  focal   nova  ACTIVE                

juju crashdump tried to access data from 10.5.0.42, an address that doesn't exist anywhere in my pool of machines.
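One way to confirm a target address is stale is to cross-check it against the addresses juju itself reports. A sketch, assuming the usual `juju status --format=json` shape with a top-level `machines` map carrying `dns-name` and `ip-addresses` keys (the helper name is hypothetical):

```python
def known_machine_ips(status):
    """Collect every address juju reports for the model's machines.

    `status` is the parsed output of `juju status --format=json`;
    the key names below are assumptions about that schema.
    """
    ips = set()
    for machine in status.get("machines", {}).values():
        ips.update(machine.get("ip-addresses", []))
        if machine.get("dns-name"):
            ips.add(machine["dns-name"])
    return ips
```

An address crashdump is about to SSH to could then be checked with `addr in known_machine_ips(status)` before the 45-second timeout is paid.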

ubuntu@dasm-bastion:~$ juju crashdump -s                                    
2021-03-22 15:36:39,700 - juju-crashdump started.                                      
2021-03-22 15:36:56,854 - Command "timeout 45s ssh -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa [email protected] sudo 'mkdir -p /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output;sud
o netstat -taupn | grep LISTEN 2>/dev/null | sudo tee /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output/listening.txt || true'" failed
2021-03-22 15:36:59,925 - Command "timeout 45s ssh -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa [email protected] sudo 'mkdir -p /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output;sud
o ps aux | sudo tee /tmp/732596b1-0238-4792-8786-ea3743c76897/cmd_output/psaux.txt || true'" failed
2021-03-22 15:37:05,557 - Command "timeout 45s ssh -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa [email protected] sudo 'sudo find /etc/alternatives /etc/ceilometer /etc/ceph /etc/cinder
 /etc/cloud /etc/glance /etc/gnocchi /etc/keystone /etc/netplan /etc/network /etc/neutron /etc/nova /etc/quantum /etc/swift /etc/udev/rules.d /lib/udev/rules.d /opt/nedge/var/log /run/cloud-init /usr/share/
lxc/config /var/lib/charm /var/lib/libvirt/filesystems/plumgrid-data/log /var/lib/libvirt/filesystems/plumgrid/var/log /var/lib/cloud/seed /var/log /var/snap/simplestreams/common/sstream-mirror-glance.log /
var/crash /var/snap/juju-db/common/logs/ /var/lib/mysql/*-mysql-router /tmp/juju-exec*/script.sh /var/lib/lxd/containers/*/rootfs/etc/alternatives /var/lib/lxd/containers/*/rootfs/etc/ceilometer /var/lib/lx
d/containers/*/rootfs/etc/ceph /var/lib/lxd/containers/*/rootfs/etc/cinder /var/lib/lxd/containers/*/rootfs/etc/cloud /var/lib/lxd/containers/*/rootfs/etc/glance /var/lib/lxd/containers/*/rootfs/etc/gnocchi
 /var/lib/lxd/containers/*/rootfs/etc/keystone /var/lib/lxd/containers/*/rootfs/etc/netplan /var/lib/lxd/containers/*/rootfs/etc/network /var/lib/lxd/containers/*/rootfs/etc/neutron /var/lib/lxd/containers/
*/rootfs/etc/nova /var/lib/lxd/containers/*/rootfs/etc/quantum /var/lib/lxd/containers/*/rootfs/etc/swift /var/lib/lxd/containers/*/rootfs/etc/udev/rules.d /var/lib/lxd/containers/*/rootfs/lib/udev/rules.d
/var/lib/lxd/containers/*/rootfs/opt/nedge/var/log /var/lib/lxd/containers/*/rootfs/run/cloud-init /var/lib/lxd/containers/*/rootfs/usr/share/lxc/config /var/lib/lxd/containers/*/rootfs/var/lib/charm /var/l
ib/lxd/containers/*/rootfs/var/lib/libvirt/filesystems/plumgrid-data/log /var/lib/lxd/containers/*/rootfs/var/lib/libvirt/filesystems/plumgrid/var/log /var/lib/lxd/containers/*/rootfs/var/lib/cloud/seed /va
r/lib/lxd/containers/*/rootfs/var/log /var/lib/lxd/containers/*/rootfs/var/snap/simplestreams/common/sstream-mirror-glance.log /var/lib/lxd/containers/*/rootfs/var/crash /var/lib/lxd/containers/*/rootfs/var
/snap/juju-db/common/logs/ /var/lib/lxd/containers/*/rootfs/var/lib/mysql/*-mysql-router /var/lib/lxd/containers/*/rootfs/tmp/juju-exec*/script.sh -mount -type f -size -5000000c -o -size 5000000c 2>/dev/nul
l | sudo tar -pcf /tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar --files-from - 2>/dev/null;sudo tar --append -f /tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar -C /tmp/732596b1-0238-4792-8
786-ea3743c76897/cmd_output . || true;sudo tar --append -f /tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar -C /tmp/732596b1-0238-4792-8786-ea3743c76897/ journalctl || true;sudo tar --append -f /tmp/
juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar -C /tmp/732596b1-0238-4792-8786-ea3743c76897/addon_output . || true'" failed
2021-03-22 15:37:29,334 - Command "scp -o StrictHostKeyChecking=no -i ~/.local/share/juju/ssh/juju_id_rsa [email protected]:/tmp/juju-dump-732596b1-0238-4792-8786-ea3743c76897.tar 20f8e194-7d03-40c1-a727-fa1
29df68be9.tar" failed                            

When I tried to gather extra logs to analyze its behavior, I reran the command with debug logging (juju crashdump -l debug -s), but this time there was no problem and everything worked.

ubuntu@dasm-bastion:~/.local/share/juju$ juju crashdump -l debug -s                                                                                                                                
2021-03-22 15:41:26,123 - juju-crashdump started.                                                                                                                                                             
2021-03-22 15:41:26,125 - Calling juju version                                                                                                                                                                
2021-03-22 15:41:26,233 - Returned from juju version                                                                                                                                                          
2021-03-22 15:41:26,234 - Calling juju switch                                                                                                                                                                 
2021-03-22 15:41:26,325 - Returned from juju switch                                                                                                                                                           
2021-03-22 15:41:26,326 - Calling juju  status --format=yaml                                                                                                                                                  
2021-03-22 15:41:27,240 - Returned from juju  status --format=yaml                                                                                                                                            
2021-03-22 15:41:27,241 - Calling juju  status --format=tabular --relations --storage                                                                                                                         
2021-03-22 15:41:28,113 - Returned from juju  status --format=tabular --relations --storage                                                                                                                   
2021-03-22 15:41:28,113 - Calling juju debug-log --date --replay --no-tail                                                                                                                                    
2021-03-22 15:41:35,741 - Returned from juju debug-log --date --replay --no-tail                                                                                                                              
2021-03-22 15:41:35,745 - Calling juju model-config --format=yaml                                                                                                                                             
2021-03-22 15:41:35,946 - Returned from juju model-config --format=yaml                                                                                                                                       
2021-03-22 15:41:35,946 - Calling juju storage --format=yaml                                                                                                                                                  
2021-03-22 15:41:36,586 - Returned from juju storage --format=yaml                                                                                                                                            
2021-03-22 15:41:36,587 - Calling juju storage-pools --format=yaml                                                                                                                                            
2021-03-22 15:41:37,188 - Returned from juju storage-pools --format=yaml                                                                                                                                      
[...]
2021-03-22 15:44:14,377 - Returned from tar -pacf juju-crashdump-ff274cbe-b25a-40d4-aebc-05cf32c30796.tar.xz * 2>/dev/null                                                                                    
2021-03-22 15:44:14,784 - juju-crashdump finished.                                                                                                                                                            
