[postgresql-repmgr] All commands that require SSH are failing #75579

Open
MarcoColomb0 opened this issue Dec 9, 2024 · 4 comments
Assignees
Labels
postgresql-repmgr tech-issues The user has a technical issue about an application triage Triage is needed

Comments

@MarcoColomb0

MarcoColomb0 commented Dec 9, 2024

Name and Version

postgresql-repmgr:16.4.0-debian-12-r34

What architecture are you using?

amd64

What steps will reproduce the bug?

Hi everyone,
when I try to switch over so that the secondary database becomes the new primary, using the repmgr command repmgr standby switchover, I get an "unable to connect to remote host via SSH" error.
I have also tried other commands, such as repmgr cluster matrix, but every command that has to interact with the partner node appears to fail.

What do you see instead?

I see the following output when trying the switchover:

$ /opt/bitnami/scripts/postgresql-repmgr/entrypoint.sh repmgr -f /opt/bitnami/repmgr/conf/repmgr.conf standby switchover --verbose
postgresql-repmgr 15:49:59.09 INFO  ==>

NOTICE: using provided configuration file "/opt/bitnami/repmgr/conf/repmgr.conf"
NOTICE: executing switchover on node "pg-1" (ID: 1001)
WARNING: unable to connect to remote host "pg-2" via SSH
ERROR: unable to connect via SSH to host "pg-2", user ""

And I see the following when checking the cluster matrix:

$ /opt/bitnami/scripts/postgresql-repmgr/entrypoint.sh repmgr -f /opt/bitnami/repmgr/conf/repmgr.conf cluster matrix --verbose
postgresql-repmgr 15:51:01.15 INFO  ==>

NOTICE: using provided configuration file "/opt/bitnami/repmgr/conf/repmgr.conf"
sh: 1: ssh: not found
 Name | ID   | 1001 | 1002
------+------+------+------
 pg-1 | 1001 | *    | *
 pg-2 | 1002 | ?    | ?
WARNING: following problems detected:
  node 1002 inaccessible via SSH

Looking at the cluster matrix output, I see "ssh: not found."
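
For anyone who wants to confirm the same symptom, here is a minimal sketch of a check that can be run inside the container to see whether an ssh client is on the PATH at all. The function name check_cmd is hypothetical (it is not part of repmgr or the Bitnami scripts), and the check only says whether the binary exists; it does not test connectivity or authentication.

```shell
# check_cmd is a hypothetical helper, not part of repmgr or the Bitnami image.
# It reports whether a command is resolvable on PATH.
# Returns 0 if the command is found, 1 otherwise.
check_cmd() {
    cmd_name=$1
    if cmd_path=$(command -v "$cmd_name"); then
        echo "$cmd_name: found at $cmd_path"
        return 0
    fi
    echo "$cmd_name: not found on PATH"
    return 1
}

# Check for the client that the "sh: 1: ssh: not found" message refers to.
check_cmd ssh || true
```

If this reports that ssh is not found, it would be consistent with the "sh: 1: ssh: not found" line in the cluster matrix output above.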

Additional information

The infrastructure is deployed as follows:
"pg_ha" Docker Swarm encrypted network members:

  • pg-1 node
  • pg-2 node

I have tried building a simple Alpine-based container with the ping package on the same Swarm network, and it works as expected. I can successfully ping both nodes.

Could the problem be caused by the missing SSH package in the container image?
I’ve been trying for a while, but I just can’t get this to work.
Thanks.

@MarcoColomb0 MarcoColomb0 added the tech-issues The user has a technical issue about an application label Dec 9, 2024
@github-actions github-actions bot added the triage Triage is needed label Dec 9, 2024
@carrodher
Member

Hi, the issue may not be directly related to the Bitnami container image/Helm chart, but rather to how the application is being utilized or configured in your specific environment, or tied to a particular scenario that is not easy to reproduce on our side.

If you think that's not the case and want to contribute a solution, we'd like to invite you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Please feel free to contact us if you have any questions or need assistance.

If you have any questions about the application, customizing its content, or technology and infrastructure usage, we highly recommend that you refer to the forums and user guides provided by the project responsible for the application or technology.

With that said, we'll keep this ticket open until the stale bot automatically closes it, in case someone from the community contributes valuable insights.

@MarcoColomb0
Author

Hello,

I would like to understand how this could be related to the way I'm deploying the DBs. If the SSH binary is not present on the container image, how can the SSH-based commands actually work?

I would also like to know if anyone has successfully managed to make commands like "standby switchover" work.

I look forward to your answer. Thanks for your help.

@carrodher
Member

You can try building your own image by modifying the Dockerfile in this repo and adding the ssh package to the list of packages installed at:

RUN install_packages ca-certificates curl libbrotli1 libbsd0 libcom-err2 libcurl4 libedit2 libffi8 libgcc-s1 libgmp10 libgnutls30 libgssapi-krb5-2 libhogweed6 libicu72 libidn2-0 libk5crypto3 libkeyutils1 libkrb5-3 libkrb5support0 libldap-2.5-0 liblz4-1 liblzma5 libmd0 libnettle8 libnghttp2-14 libp11-kit0 libpcre3 libpsl5 libreadline8 librtmp1 libsasl2-2 libsqlite3-0 libssh2-1 libssl3 libstdc++6 libtasn1-6 libtinfo6 libunistring2 libuuid1 libxml2 libxslt1.1 libzstd1 locales procps zlib1g
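
As an alternative to editing the repo's Dockerfile, a derived image could layer the client on top of the published one. The following is only a sketch under several assumptions: the helper name write_ssh_dockerfile is hypothetical, the default base image reference assumes the tag from this issue is published under bitnami/postgresql-repmgr, openssh-client is used as the Debian package that provides the ssh binary, and 1001 is assumed to be the image's default non-root user.

```shell
# write_ssh_dockerfile is a hypothetical helper for illustration only.
# It writes a Dockerfile that adds an SSH client on top of a base image.
#   $1: target directory for the Dockerfile
#   $2: base image reference (optional; the default below is an assumption)
write_ssh_dockerfile() {
    target_dir=$1
    base_image=${2:-bitnami/postgresql-repmgr:16.4.0-debian-12-r34}
    mkdir -p "$target_dir"
    # install_packages is the helper already used in the RUN line quoted above.
    # USER 1001 assumes the image's default non-root user.
    cat > "$target_dir/Dockerfile" <<EOF
FROM ${base_image}
USER root
RUN install_packages openssh-client
USER 1001
EOF
}
```

Calling write_ssh_dockerfile ./pg-ssh and then running docker build -t <your-tag> ./pg-ssh would produce the image. Note that a client binary alone is unlikely to be enough: repmgr's SSH-based commands also need an SSH server reachable on the remote node and passwordless key-based authentication between the nodes, neither of which this sketch sets up.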

@MarcoColomb0
Author

Hello, thanks for the suggestion.
I just tried using a custom-built image on both nodes, but it seems like nothing has changed.
I'm still wondering what you meant by this:

Hi, the issue may not be directly related to the Bitnami container image/Helm chart, but rather to how the application is being utilized or configured in your specific environment, or tied to a particular scenario that is not easy to reproduce on our side.

Could this be fixable with a different type of deployment? Has anyone at Bitnami or any user managed to get these SSH-based commands to actually work?

Thanks.

3 participants