
Kubernetes - Redeploying not working when using S3 as default storage backend #590

Open
LunaticZorr opened this issue Sep 23, 2019 · 22 comments
Labels
Persistence Anything to do with external storage or persistence. This is also where we triage things like NFS. S3 Anything to do with S3 object storage

Comments

@LunaticZorr

LunaticZorr commented Sep 23, 2019

Nextcloud version (eg, 12.0.2): 16.0.4
Operating system and version (eg, Ubuntu 17.04): Kubernetes / Docker
Apache or nginx version (eg, Apache 2.4.25): Docker Image nextcloud:16.0.4-apache
PHP version (eg, 7.1): Docker Image nextcloud:16.0.4-apache

The issue you are facing:

  • We are deploying Nextcloud on Kubernetes using the Helm Chart from https://github.com/helm/charts/tree/master/stable/nextcloud .
  • We changed the Docker image to be nextcloud:16.0.4-apache
  • We use the s3.config.php option to store our files on S3.
  • We use the external database option to use a MariaDB server we already have.

We launch Nextcloud the first time and it creates the DB correctly, creates the first user correctly, and starts up as expected. We can login, and create / upload files.
To verify our files are secure and retrievable after a major failure, we re-deploy the Nextcloud deployment (scale to 0, scale to 1).
After this, the startup logs show the following:

Initializing nextcloud 16.0.4.1 ...
Initializing finished
New nextcloud instance
Installing with MySQL database
starting nextcloud installation
The username is already being used
retrying install...
The username is already being used
retrying install...
The username is already being used
retrying install...

This is the first issue. WHY does it try to re-install? The database is still there, and so is the previous user. Why does it not just connect and re-use what is there?

After a couple of minutes the container dies and starts again, this time without failure. BUT, when trying to browse to Nextcloud, we are greeted with the following message:

Error
It looks like you are trying to reinstall your Nextcloud. However the file CAN_INSTALL is missing from your config directory. Please create the file CAN_INSTALL in your config folder to continue.

If I create the CAN_INSTALL file, I am prompted with the installation/setup screen and told that the admin account I want to use already exists.

Is this the first time you've seen this error? (Y/N): Y

The output of your config.php file in /path/to/nextcloud (make sure you remove any identifiable information!):

<?php
$CONFIG = array (
  'debug' => true,
  'htaccess.RewriteBase' => '/',
  'memcache.local' => '\\OC\\Memcache\\APCu',
  'apps_paths' => 
  array (
    0 => 
    array (
      'path' => '/var/www/html/apps',
      'url' => '/apps',
      'writable' => false,
    ),
    1 => 
    array (
      'path' => '/var/www/html/custom_apps',
      'url' => '/custom_apps',
      'writable' => true,
    ),
  ),
  'objectstore' => 
  array (
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => 
    array (
      'bucket' => 'nextcloud-files',
      'autocreate' => true,
      'key' => '**************',
      'secret' => '****************',
      'region' => 'eu-west-1',
      'use_ssl' => true,
    ),
  ),
  'passwordsalt' => '********************',
  'secret' => '******************',
  'trusted_domains' => 
  array (
    0 => 'localhost',
  ),
  'datadirectory' => '/var/www/html/data',
  'dbtype' => 'mysql',
  'version' => '16.0.4.1',
  'overwrite.cli.url' => 'http://localhost',
  'dbname' => 'nextcloud',
  'dbhost' => 'mariadb.mariadb',
  'dbport' => '',
  'dbtableprefix' => 'oc_',
  'mysql.utf8mb4' => true,
  'dbuser' => 'oc_user129',
  'dbpassword' => '*****************',
  'instanceid' => '************',
);

Any idea on how to solve this issue?

@cuihaikuo

Same problem!

@fle108

fle108 commented Nov 5, 2019

Exactly the same problem here, even with manual YAML manifests (not Helm) and PersistentVolumeClaims.
I tried deleting the user in the database; that works for this step, but the next problem is:

Command "maintenance:install" is not defined.

  Did you mean one of these?
      app:install
      maintenance:data-fingerprint
      maintenance:mimetype:update-db
      maintenance:mimetype:update-js
      maintenance:mode
      maintenance:repair
      maintenance:theme:update
      maintenance:update:htaccess


retrying install...

@JasperZ

JasperZ commented Nov 5, 2019

Just out of curiosity, what do you use as a storage backend?
I never got it working with NFS-backed persistent volumes. The rsync happening in the entrypoint.sh didn't fully finish for some reason. It also took a pretty long time until the "install" was finished.
And when I killed the pod, the new one was trying to install Nextcloud again.

@fle108

fle108 commented Nov 5, 2019

Just out of curiosity, what do you use as a storage backend?
I never got it working with NFS-backed persistent volumes. The rsync happening in the entrypoint.sh didn't fully finish for some reason. It also took a pretty long time until the "install" was finished.
And when I killed the pod, the new one was trying to install Nextcloud again.

Hi, I use Azure Files storage, but I have to mount it with specific mount options (uid 33 for www-data) in my PersistentVolume manifest, otherwise it doesn't work:

  mountOptions:
  - dir_mode=0770
  - file_mode=0770
  - uid=33
  - gid=33

Currently I'm battling with initContainers to be able to push a ConfigMap file (.user.ini) that sets PHP options like upload_max_filesize; see the sketch after the sources below.

sources:
https://github.com/rabbitmq/rabbitmq-peer-discovery-k8s/issues/37
https://docs.nextcloud.com/server/13.0.0/admin_manual/configuration_files/big_file_upload_configuration.html
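
For illustration, this is roughly what I'm aiming at with the ConfigMap: a rough, untested sketch where the name nextcloud-user-ini and the mount wiring are my own placeholders, not chart values.

apiVersion: v1
kind: ConfigMap
metadata:
  name: nextcloud-user-ini
data:
  .user.ini: |
    upload_max_filesize=16G
    post_max_size=16G

And in the pod spec of the Nextcloud deployment, mounted via subPath so only .user.ini is overlaid into the webroot:

  volumes:
    - name: user-ini
      configMap:
        name: nextcloud-user-ini
  # in the nextcloud container spec:
  volumeMounts:
    - name: user-ini
      mountPath: /var/www/html/.user.ini
      subPath: .user.ini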

@GoingOffRoading

Anybody figure out a fix to this problem?

I have all data persisted on a NAS and just wiped my Kubernetes host to re-start my containers from scratch.

When I launch Nextcloud, I get the same "It looks like you are trying to reinstall your Nextcloud. However the file CAN_INSTALL is missing from your config directory. Please create the file CAN_INSTALL in your config folder to continue." error.

I cannot find any documentation on this, nor many other threads about it.

@kquinsland

@GoingOffRoading You need to make sure that the instanceid persists across the pod lifecycle. Do this by making sure that /var/www/html/config is on a persistent volume (see the sketch below).

see: nextcloud/docker#1006 (comment)
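
In chart values that roughly means enabling persistence, something like the sketch below (persistence.enabled is the important switch; the size is only an example value, check the chart's README for the exact keys):

persistence:
  # puts /var/www/html (which contains the config dir) on a PVC
  enabled: true
  size: 8Gi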

@robertoschwald

This seems to be a bug when you use S3 as primary storage. We don't want to have any persistent storage at all, but use S3 only.

@agowa

agowa commented Jan 7, 2024

@robertoschwald just get the file out of the config folder and put it into your helm chart as well?

Or use an init container that creates that file from something you stored in S3...

@joshtrichards joshtrichards transferred this issue from nextcloud/docker Jul 11, 2024
@jessebot
Collaborator

For right now, you still need a persistent volume for the config directory as well, even when using S3. That's been my experience, at least. You can set persistence in the helm chart, but we probably still need to separate out the config dir persistence entirely from the data dir. I'll see if I can find the other issue mentioning this and link it back here.

@agowa

agowa commented Jul 12, 2024

you still need a persistent volume for the config directory as well

You don't; a ConfigMap works as well. At least last time I checked, the files within the config folder weren't dynamically updated at runtime by the application itself...

Alternatively, an init container could just bootstrap the config directory using a script or something...

@jessebot jessebot changed the title from "Kubernetes - Redeploying not working" to "Kubernetes - Redeploying not working when using S3 as default store backend" Jul 12, 2024
@jessebot jessebot added the Persistence Anything to do with external storage or persistence. This is also where we triage things like NFS. label Jul 12, 2024
@jessebot
Collaborator

I haven't tested this in about 6 months, but I thought there was something that changed in the config directory that prevented this from working. I can't remember what it was though. Oh, maybe it was the /var/www/html directory itself or something else in the /var/www/html/data directory?

Either way, I haven't had time to test this again in a while, so I'm open to anyone else in the community testing installing the latest version of this helm chart, enabling S3 as the default storage backend via the nextcloud.configs parameter (which should create a ConfigMap), and verifying it's still broken. If it is still broken, we need to know precisely which directory needs to be persisted to fix this and why. From there, we can figure out what needs to be done, including the suggestions you've made, @agowa, to see if there's anything we can mount or script away to solve this. 🙏
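
For anyone who wants to pick this up, this is roughly the values snippet I mean, as a sketch only; the bucket, region, and credentials are placeholders:

nextcloud:
  configs:
    s3.config.php: |-
      <?php
      $CONFIG = array(
        'objectstore' => array(
          'class' => '\\OC\\Files\\ObjectStore\\S3',
          'arguments' => array(
            'bucket'     => 'nextcloud-files',
            'autocreate' => true,
            'key'        => 'ACCESS_KEY',
            'secret'     => 'SECRET_KEY',
            'region'     => 'eu-west-1',
            'use_ssl'    => true,
          ),
        ),
      );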

@jessebot jessebot changed the title from "Kubernetes - Redeploying not working when using S3 as default store backend" to "Kubernetes - Redeploying not working when using S3 as default storage backend" Jul 12, 2024
@jessebot jessebot added the S3 Anything to do with S3 object storage label Jul 26, 2024
@WladyX

WladyX commented Sep 7, 2024

This is still broken; I've installed the latest version and hit this after a few redeploys of the pod.
Any workarounds to get it back running, preferably without wiping the DB and S3 and starting from scratch?

@agowa

agowa commented Sep 11, 2024

@jessebot I stopped using Nextcloud years ago because of this and another S3-backend-related issue. I was just still subscribed to this issue...

@jessebot
Collaborator

jessebot commented Sep 20, 2024

I haven't had a chance to test this again because I was waiting for the following to be merged:

In the meantime, @wrenix have you used s3 as a primary object store and done a restore successfully yet? I plan on testing this again soonish, but not before the above are merged. @provokateurin, @joshtrichards not sure if either of you use s3 either? 🤔

@WladyX

WladyX commented Sep 21, 2024

Maybe the installed version also needs to be persisted, not just the instanceid? I made the instanceid static via nextcloud.config, but it still did not work, because of this:

            # Install
            if [ "$installed_version" = "0.0.0.0" ]; then
                echo "New nextcloud instance"

https://github.com/nextcloud/docker/blob/30b570f0b553736d63dc63cf487ff1e5e5331474/docker-entrypoint.sh#L183

@wrenix
Collaborator

wrenix commented Sep 21, 2024

@jessebot Sorry, I do not currently use S3 in my setup and have no time to build a test setup with S3.

@jessebot
Collaborator

So I think you, @WladyX, and @kquinsland are onto something with the installed_version note. I also took a look at nextcloud/docker#1006 again.

So in the docker-entrypoint.sh script, we're looking for installed_version in /var/www/html/version.php:

        installed_version="0.0.0.0"
        if [ -f /var/www/html/version.php ]; then
            # shellcheck disable=SC2016
            installed_version="$(php -r 'require "/var/www/html/version.php"; echo implode(".", $OC_Version);')"
        fi

Which, as @WladyX pointed out, later hits this conditional:

            if [ "$installed_version" = "0.0.0.0" ]; then
                echo "New nextcloud instance"

I checked on my instance and version.php looks like this:

<?php
$OC_Version = array(29,0,7,1);
$OC_VersionString = '29.0.7';
$OC_Edition = '';
$OC_Channel = 'stable';
$OC_VersionCanBeUpgradedFrom = array (
  'nextcloud' =>
  array (
    '28.0' => true,
    '29.0' => true,
  ),
  'owncloud' =>
  array (
    '10.13' => true,
  ),
);
$OC_Build = '2024-09-12T12:35:46+00:00 873a4d0e1db10a5ae0e50133c7ef39e00750015b';
$vendor = 'nextcloud';

The issue is that I'm not sure how to persist that file without just using our normal PVC setup, since it's not created by nextcloud/helm or nextcloud/docker. I think it's created by nextcloud/server 🤔

Perhaps we can do some sort of check to see if S3 is already enabled? 🤔 Maybe checking if $OBJECTSTORE_S3_BUCKET is set? Open to ideas and suggestions. Will cross-post to the other thread in nextcloud/docker too.
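
Something along these lines is what I'm picturing, purely as an untested sketch on top of the existing entrypoint logic (if I'm reading the entrypoint right, image_version is the variable it already uses for the version bundled in the image; the grep on config.php is just one possible way to detect a prior install):

# untested sketch: if S3 is configured and a previous install already marked
# config.php as installed, skip the reinstall path
if [ "$installed_version" = "0.0.0.0" ] \
    && [ -n "${OBJECTSTORE_S3_BUCKET:-}" ] \
    && [ -f /var/www/html/config/config.php ] \
    && grep -q "'installed' => true" /var/www/html/config/config.php; then
    echo "Existing S3-backed instance detected, skipping new install"
    installed_version="$image_version"
fi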

@provokateurin
Member

The issue is that I'm not sure how to persist that file

The file is part of the source code and not generated at runtime. See https://github.com/nextcloud/server/blob/master/version.php

@jessebot
Collaborator

So then the question is: is there a way to accommodate not having to manage PVCs while using S3? 🤔 Could we maybe add some sort of configmap with a simple php script like:

<?php
$S3_INSTALLED = true;

and then we tweak docker-entrypoint.sh upstream in nextcloud/docker to check there? I'm just throwing out suggestions, as I haven't tested anything on a live system yet, but want to try and help.

@WladyX

WladyX commented Sep 25, 2024

Just thinking out loud:

  • Explain how to generate an instanceid for the helm values and populate the config with that.
  • If an S3_INSTALLED variable (or something similar) is defined, update the upstream docker-entrypoint to check the DB to see whether the instance was installed, instead of looking at the version, for S3 cases.

Or, I think I saw the version in the DB as well; maybe the docker-entrypoint should check the DB instead of the config for the version to decide whether Nextcloud was installed or not.
Thank you for looking into this one; I went with a static PVC for the time being.

@minkbear

minkbear commented Oct 4, 2024

I am facing this problem. How do I solve it?

Steps:

  1. helm install with the values below
  2. the first time, the pod starts and runs fine
  3. log in and upload some files
  4. delete the Nextcloud pod
  5. wait for the pod to start, then go to the home page and see "The Login is already being used"

nextcloud:
  existingSecret:
    enabled: true
    secretName: nextcloud-secret
    usernameKey: nextcloud-username
    passwordKey: nextcloud-password
  objectStore:
    s3:
      enabled: true
      accessKey: "xxxxx"
      secretKey: "xxxxx"
      region: xxxxx
      bucket: "xxxxx"

replicaCount: 1

internalDatabase:
  enabled: false

externalDatabase:
  enabled: true
  existingSecret:
    enabled: true
    secretName: nextcloud-secret
    hostKey: externaldb-host
    databaseKey: externaldb-database
    usernameKey: externaldb-username
    passwordKey: externaldb-password

mariadb:
  enabled: true
  auth:
    rootPassword: test

@joshtrichards
Member

joshtrichards commented Oct 27, 2024

So then the question is: is there a way to accommodate not having to manage PVCs while using S3? 🤔 Could we maybe add some sort of configmap with a simple php script like:
[...]
and then we tweak docker-entrypoint.sh upstream in nextcloud/docker to check there? I'm just throwing out suggestions, as I haven't tested anything on a live system yet, but want to try and help.

Just some Sunday afternoon thoughts...

What problem are we actually trying to solve here? If the aim is to eliminate persistent storage, that's not feasible at this juncture. That's a much larger discussion (that touches on a re-design of the image and/or Nextcloud Server itself).

I guess OP didn't have any persistent storage (see footnotes 1 and 2) for /var/www/html in place? Then this sounds like expected behavior. At the risk of putting my foot in my mouth because I'm coming from the docker repo and am less familiar with the helm side of things, it seems the issue is that it is very important that persistence.enabled be on (so maybe there's room for doc enhancements or examples or something).

But you definitely need to have version.php around. It's part of the app itself, as Kate said. If it's not available, it's not a valid deployment. It means you don't have the persistent storage in place that the image expects to be around for /var/www/html/.

Context

The version check in the entrypoint is used by the image to determine whether there is already a version of Server installed on the container's persistent storage, and then:

  • if not detected, it installs it
  • if detected, see if it needs to be upgraded to match the new version from the image

The key here is that Server doesn't technically run from the image itself. The image installs a version of Server on persistent storage (i.e. the contents of /var/www/html/ within a running container).

This is due to a mixture of how Nextcloud Server functions historically + how the image currently functions. But the bottom line is:

  • S3 Primary Storage + a database alone are not sufficient for a Nextcloud deployment. The former is only for user home directories/etc.

So /var/www/html/ + config (/var/www/html/config) + datadirectory (/var/www/html/data by default) are still expected to be available to any containers that boot the image.

If there are challenges like nextcloud/docker#1006, those need to be tackled directly. The OP in that one may have hit a weird NFS issue or similar. In part that's why I recently did nextcloud/docker#2311 (increasing verbosity of the rsync part of the entrypoint, which is the typical culprit for NFS interaction problems; the locking in the entrypoint to prevent multiple containers from updating simultaneously being another).

Footnotes

  1. https://github.com/nextcloud/helm/tree/main/charts/nextcloud#persistence-configurations

  2. https://github.com/nextcloud/docker/#persistent-data
