In this tutorial we will walk through a version upgrade of DaCHS/Postgres while keeping the services published, persisting the data, and minimizing downtime.
Some of the steps in this tutorial were already covered in the [README], Data Persistence, or Workflow documents, so their details are not repeated here. Please check those documents first, and if something remains unclear, let us know.
Here we will
- start from the ARIHIP example using DaCHS v0.9.6*,
- then detach the data from the main container,
- upgrade the Postgres database to v9.6,
- upgrade the DaCHS server to v1.2,
- finally, check services.
As always, this tutorial is no substitute for the official GAVO/DaCHS documentation; it complements it for the Docker setup. In particular, the upgrade process of DaCHS (and Postgres) is covered by the documents 'Upgrading DaCHS' (and 'upgrading db engine').
We start with the old version of DaCHS and the classic ARIHIP example; the process should be the same when handling multiple data sets (if not, please let us know).
- Instantiate the containers:
(host)$ docker run -dt --name postgres chbrandt/dachs:postgres-9.4
(host)$ docker run -dt --name dachs --link postgres -p 80:80 chbrandt/dachs:server-0.9.6
- Add data:
(host)$ mkdir -p arihip/data
(host)$ curl http://svn.ari.uni-heidelberg.de/svn/gavo/hdinputs/arihip/q.rd -o arihip/q.rd
(host)$ curl http://dc.g-vo.org/arihip/q/cone/static/data.txt.gz -o arihip/data/data.txt.gz
(host)$ docker cp arihip dachs:/var/gavo/inputs/.
(host)$ docker exec -it dachs bash -c 'gavo imp arihip/q && gavo pub arihip/q'
(host)$ docker exec dachs bash -c 'gavo serve reload'
(host)$ rm -rf arihip
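As a quick, optional sanity check of the steps above, we can query the freshly published cone-search service. (The endpoint path and parameter values here are illustrative; they follow the usual DaCHS SCS renderer layout for the ARIHIP example, but check your own service's info page if the URL differs.)

```shell
# Run a small cone search against the new service; a healthy service
# answers with a VOTable document.
curl -s 'http://localhost/arihip/q/cone/scs.xml?RA=0&DEC=0&SR=1' | head
```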
What we want to do now is detach the data from the dachs container. But we cannot attach volumes to a running container; we first have to stop it and then restart it with the volume attached.
- Save the current container state:
(host)$ docker commit --pause dachs dachs_arihip:tmp
(host)$ docker stop dachs
(host)$ docker volume create dachs_data
(host)$ docker run -dt --name dachs_tmp --link postgres -p 80:80 \
--volume dachs_data:/var/gavo/inputs dachs_arihip:tmp
At this point, the service should be running just like before: if we go to http://localhost we should see the DaCHS web page, with the ARIHIP dataset published.
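A minimal way to confirm that from the host, assuming the port mapping used above, is to check the HTTP status of the front page:

```shell
# The front page of a healthy DaCHS instance should answer with 200.
curl -s -o /dev/null -w '%{http_code}\n' http://localhost/
```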
What happened when we attached the `dachs_data` volume was that the data in `/var/gavo/inputs` was copied to the (then empty) volume. For further details, please visit the official Docker documentation on volumes.

Here we have detached (or, if you will, allowed for the persistence of) the content of `/var/gavo/inputs`, simply because that is what we added/modified in the container. But you can attach as many volumes as you want or need. For instance, if you modified the content of your `$GAVOSETTINGS`, or your `/var/gavo/web`, you may want to create a volume for `/var/gavo` instead. A setup that I personally like is to have each data service in its own volume (`/var/gavo/inputs/*`). Just keep a backup of your data, and you are free to play around.
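One way to keep such a backup is to tar the volume's contents from a throwaway container. This is just a sketch: the helper image (`debian:stable`) and the archive name are arbitrary choices, not part of the DaCHS setup.

```shell
# Mount the volume read-only and the current host directory as /backup,
# then archive everything in the volume into a tarball on the host.
docker run --rm \
  --volume dachs_data:/data:ro \
  --volume "$(pwd)":/backup \
  debian:stable tar czf /backup/dachs_data.tar.gz -C /data .
```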
Before upgrading our setup, we had better check that everything is fine:
(host)$ docker exec dachs bash -c 'gavo val -c ALL'
** WARNING: RD __system__/tests: resource directory '/var/gavo/inputs/__tests' does not exist
** WARNING: RD __system__/run: resource directory '/var/gavo/inputs/__tests' does not exist
We should see no errors here; warnings are fine. Now we can start a container from the new DaCHS image, attached to the same data volume:
(host)$ docker run -dt --name dachs_new --link postgres -p 8080:80 \
--volume dachs_data:/var/gavo/inputs chbrandt/dachs:server-1.2
(host)$ docker stop dachs_tmp
(host)$ docker exec -it dachs_new bash
And then, from inside the new container:
(dachs_new)$ gavo serve stop
(dachs_new)$ gavo upgrade
ERROR:
> upgrade ivoa.obscore to obscore 1.1.
...PythonScriptRunner excecuting script update all obscore definitions
Starting ivoa.emptyobscore
Done ivoa.emptyobscore, read 0
Starting dummy
Done dummy, read 0
Making dependent __system__/obscore#create
PythonScriptRunner excecuting script create obscore view
Making dependent __system__/obscore#create
PythonScriptRunner excecuting script create obscore view
ok
> update schemaversion to 13... ok
ok
> Adding column_index column to TAP_SCHEMA.columns"
... ok
> ingesting column_index for TAP-published tables....*X*X* Uncaught exception at toplevel
*** Error: Oops. Unhandled exception ProgrammingError.
Exception payload: column "arraysize" of relation "columns" does not
exist LINE 1: ...mn_name, description, unit, ucd, utype, datatype,
arraysize,...
^
*: DaCHS v0.9.6 is used here because that was, effectively, the version at which DaCHS-on-Docker (DoD) started, and it stayed there until recently, when people began using it and asked for updates, at which point DoD jumped to v1.2. So... for historical reasons, as they say.