[bitnami/mariadb-galera] Docker swarm mode connection issue with healthchecks #75531

Open
digitaltim-de opened this issue Dec 5, 2024 · 2 comments
Labels: mariadb-galera, tech-issues (The user has a technical issue about an application), triage (Triage is needed)

digitaltim-de commented Dec 5, 2024

Name and Version

bitnami/mariadb-galera

What architecture are you using?

None

What steps will reproduce the bug?

When I run the Bitnami MariaDB Galera Docker image (bitnami/mariadb-galera:11.5.2) in Docker Swarm with the stack file below, I get connection issues whenever health checks are enabled for database2 and database3. The containers fail to start, and the logs show repeated "Connection refused" errors:

WSREP: Failed to establish connection: Connection refused
WSREP: Failed to establish connection: Connection refused
WSREP: (b96c8524-8d0f, 'tcp://0.0.0.0:4567') reconnecting to 13d88779-9afa (tcp://10.0.2.4:4567), attempt 120

However, if I remove the health checks from database2 and database3, the containers start without any issues. Additionally, even when the cluster is running, I see warnings in the logs of database1:

[Warning] Aborted connection ... user: 'root' host: '10.0.1.20' (Got an error reading communication packets)

These warnings occur frequently but do not appear to affect the functionality of the cluster.

This is my docker-stack.yml:

version: "3.4"

services:

  maxscale:
    image: mariadb/maxscale:latest
    volumes:
      - maxscale-data:/var/lib/maxscale
      - ./docker/maxscale/maxscale-prod.cnf:/etc/maxscale.cnf
    deploy:
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
        window: 120s
      resources:
        reservations:
          cpus: "0.5"
          memory: "256M"
        limits:
          cpus: "2.0"
    networks:
      - appnetwork

  database1:
    image: 'bitnami/mariadb-galera:11.5.2'
    environment:
      - MARIADB_ROOT_PASSWORD=root_password
      - MARIADB_GALERA_CLUSTER_BOOTSTRAP=yes
      - MARIADB_GALERA_FORCE_SAFETOBOOTSTRAP=yes
      - MARIADB_GALERA_CLUSTER_NAME=app_galera
      - MARIADB_GALERA_MARIABACKUP_USER=galera
      - MARIADB_GALERA_MARIABACKUP_PASSWORD=mariabackup_password
      - MARIADB_PASSWORD=mariadb_password
      - MARIADB_USER=user
      - MARIADB_DATABASE=db
      - MARIADB_CHARACTER_SET=utf8mb4
      - MARIADB_COLLATE=utf8mb4_unicode_ci
      - MARIADB_UPGRADE=no
    networks:
      - appnetwork
    volumes:
      - maria_data1:/bitnami/mariadb
      - ./docker/database/my_custom.cnf:/opt/bitnami/mariadb/conf/my_custom.cnf
    deploy:
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
        window: 60s
      update_config:
        delay: 30s
        order: start-first
        failure_action: rollback
      resources:
        reservations:
          cpus: "1.0"
          memory: "4G"
        limits:
          cpus: "10.0"
          memory: "32G"
    healthcheck:
      test: ['CMD', '/opt/bitnami/scripts/mariadb-galera/healthcheck.sh']
      interval: 30s
      timeout: 10s
      retries: 10

  database2:
    image: 'bitnami/mariadb-galera:11.5.2'
    environment:
      - MARIADB_ROOT_PASSWORD=root_password
      - MARIADB_GALERA_CLUSTER_NAME=app_galera
      - MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://database1,database2,database3
      - MARIADB_GALERA_MARIABACKUP_USER=galera
      - MARIADB_GALERA_MARIABACKUP_PASSWORD=mariabackup_password
      - MARIADB_PASSWORD=mariadb_password
      - MARIADB_USER=user
      - MARIADB_DATABASE=db
      - MARIADB_CHARACTER_SET=utf8mb4
      - MARIADB_COLLATE=utf8mb4_unicode_ci
      - MARIADB_UPGRADE=no
    networks:
      - appnetwork
    volumes:
      - maria_data2:/bitnami/mariadb
      - ./docker/database/my_custom.cnf:/opt/bitnami/mariadb/conf/my_custom.cnf
    deploy:
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
        window: 60s
      update_config:
        delay: 45s
        order: start-first
        failure_action: rollback
      resources:
        reservations:
          cpus: "0.25"
          memory: "512M"
        limits:
          cpus: "4.0"
          memory: "8G"
    healthcheck:
      test: [ 'CMD', '/opt/bitnami/scripts/mariadb-galera/healthcheck.sh' ]
      interval: 30s
      timeout: 10s
      retries: 10

  database3:
    image: 'bitnami/mariadb-galera:11.5.2'
    environment:
      - MARIADB_ROOT_PASSWORD=root_password
      - MARIADB_GALERA_CLUSTER_NAME=app_galera
      - MARIADB_GALERA_CLUSTER_ADDRESS=gcomm://database1,database2,database3
      - MARIADB_GALERA_MARIABACKUP_USER=galera
      - MARIADB_GALERA_MARIABACKUP_PASSWORD=mariabackup_password
      - MARIADB_PASSWORD=mariadb_password
      - MARIADB_USER=user
      - MARIADB_DATABASE=db
      - MARIADB_CHARACTER_SET=utf8mb4
      - MARIADB_COLLATE=utf8mb4_unicode_ci
      - MARIADB_UPGRADE=no
    networks:
      - appnetwork
    volumes:
      - maria_data3:/bitnami/mariadb
      - ./docker/database/my_custom.cnf:/opt/bitnami/mariadb/conf/my_custom.cnf
    deploy:
      restart_policy:
        condition: on-failure
        delay: 10s
        max_attempts: 3
        window: 60s
      update_config:
        delay: 45s
        order: start-first
        failure_action: rollback
      resources:
        reservations:
          cpus: "0.25"
          memory: "512M"
        limits:
          cpus: "4.0"
          memory: "8G"
    healthcheck:
      test: [ 'CMD', '/opt/bitnami/scripts/mariadb-galera/healthcheck.sh' ]
      interval: 30s
      timeout: 10s
      retries: 10

volumes:
  maxscale-data:
  maria_data1:
  maria_data2:
  maria_data3:

networks:
  appnetwork:
    driver: overlay
    attachable: true

What am I doing wrong?

What is the expected behavior?

The containers start and join the cluster with the health checks enabled.

What do you see instead?

Connection refused errors. Sometimes the cluster does start once quorum is reached, but getting there is very difficult.
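
The workaround from the reproduction steps (running database2 and database3 without health checks) can also be written out explicitly. A minimal sketch, assuming the fragment replaces the existing `healthcheck` block of each joining node in the same stack file; `disable: true` is the Compose file option for switching a health check off, and it should not be combined with `test` in the same service:

```yaml
services:
  database2:
    # Replaces the test/interval/timeout/retries block shown above.
    healthcheck:
      disable: true

  database3:
    # Same change for the second joining node.
    healthcheck:
      disable: true
```

This only makes the reported workaround explicit. It does not explain why the health checks interfere with joining the cluster.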

digitaltim-de added the tech-issues label Dec 5, 2024
github-actions bot added the triage label Dec 5, 2024
carrodher transferred this issue from bitnami/charts Dec 5, 2024
digitaltim-de (Author) commented

After 3 days of testing (about 36 hours), I found that it works if I set the following parameter in my.cnf:

wsrep_sst_donor=database1
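
A sketch of how that setting could fit into the stack file from the report. The joining services already mount my_custom.cnf, so the parameter would presumably go into that file. The `[mysqld]` section header is an assumption, since the comment only says "under my.cnf"; the YAML comments show the assumed file content:

```yaml
services:
  database2:
    image: 'bitnami/mariadb-galera:11.5.2'
    volumes:
      - maria_data2:/bitnami/mariadb
      # Assumed content of ./docker/database/my_custom.cnf
      # (the [mysqld] section header is an assumption, not from the report):
      #
      #   [mysqld]
      #   wsrep_sst_donor=database1
      - ./docker/database/my_custom.cnf:/opt/bitnami/mariadb/conf/my_custom.cnf
```

Galera matches `wsrep_sst_donor` against node names (`wsrep_node_name`), so this relies on the bootstrap node actually being known as database1 inside the cluster.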

carrodher (Member) commented

Hi, the issue may not be directly related to the Bitnami container image/Helm chart, but rather to how the application is being utilized, configured in your specific environment, or tied to a particular scenario that is not easy to reproduce on our side.

If you think that's not the case and want to contribute a solution, we'd like to invite you to create a pull request. The Bitnami team is excited to review your submission and offer feedback. You can find the contributing guidelines here.

Your contribution will greatly benefit the community. Please feel free to contact us if you have any questions or need assistance.

If you have any questions about the application, customizing its content, or technology and infrastructure usage, we highly recommend that you refer to the forums and user guides provided by the project responsible for the application or technology.

With that said, we'll keep this ticket open until the stale bot automatically closes it, in case someone from the community contributes valuable insights.
