Wikibase Suite Deploy

Wikibase Suite (WBS) Deploy is a containerized, production-ready Wikibase system that allows you to self-host a knowledge graph similar to Wikidata. In addition to Wikibase on MediaWiki, WBS Deploy includes the Wikidata Query Service (WDQS), QuickStatements, Elasticsearch, and a Traefik reverse proxy with SSL termination and ACME support. The service orchestration is implemented using Docker Compose.

🔧 This document is for people wanting to self-host the full Wikibase Suite using Wikibase Suite Deploy. If you are looking for individual WBS images, head over to hub.docker.com/u/wikibase.

💡 This document presumes familiarity with basic Linux administration tasks and with Docker and Docker Compose.

What's in the box?

WBS Deploy consists of the following services:

  • Wikibase: MediaWiki packaged with the Wikibase extension and other commonly used extensions.
  • Job Runner: The MediaWiki JobRunner service, which uses the same Wikibase container as above.
  • MariaDB: Database service for MediaWiki and Wikibase.
  • Elasticsearch: Search service used by MediaWiki.
  • WDQS: Wikidata Query Service to process SPARQL queries.
  • WDQS Frontend: Web front end for SPARQL queries.
  • WDQS Proxy: A middle layer for WDQS that filters requests and makes the service more secure.
  • WDQS Updater: Keeps the WDQS data in sync with Wikibase.
  • QuickStatements: A web-based tool to import and manipulate large amounts of data.
  • Traefik: A reverse proxy that handles TLS termination and SSL certificate renewal through ACME.

Quickstart

💡 If you want to run a quick test on a machine that has no public IP address (such as your local machine), check our FAQ entry below.

Requirements

Hardware

  • Network connection with a public IP address
  • AMD64 architecture
  • 8 GB RAM
  • 4 GB free disk space

Software

  • Docker 22.0 (or greater)
  • Docker Compose 2.10 (or greater)
  • git

Domain names

You need three DNS records that resolve to your machine's IP address, one for each user-facing service:

  • Wikibase, e.g., "wikibase.example"
  • QueryService, e.g., "query.example"
  • QuickStatements, e.g., "quickstatements.example"
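Before starting the instance, it can help to confirm that all three names resolve. The following is a minimal sketch, not part of WBS Deploy: `report_unresolved` is a hypothetical helper, and it assumes a Linux host where `getent` is available. It only checks that a name resolves at all, not that it points at this machine.

```shell
# Hypothetical helper: print every host name that does not resolve.
# Assumes getent is available (glibc-based Linux); on other systems,
# a tool such as `host` or `dig` could be substituted.
report_unresolved() {
  status=0
  for name in "$@"; do
    if ! getent hosts "$name" > /dev/null 2>&1; then
      echo "no DNS record found for $name"
      status=1
    fi
  done
  return "$status"
}

# Example, using the placeholder names from the list above:
# report_unresolved wikibase.example query.example quickstatements.example
```

The function returns a non-zero status if any name fails to resolve, so it can gate a start script.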

Initial setup

Download WBS Deploy

Clone the repository from GitHub, change to the deploy subdirectory, and check out the latest stable branch.

git clone https://github.com/wmde/wikibase-release-pipeline
cd wikibase-release-pipeline/deploy
git checkout deploy-3

Initial configuration

Make a copy of the configuration template in the wikibase-release-pipeline/deploy directory.

cp template.env .env

Follow the instructions in the comments in your newly created .env file to set usernames, passwords and domain names.
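A quick way to catch forgotten settings is to check that the host name variables have a value. This is a rough sketch under the assumption that .env uses plain, unquoted KEY=value lines; `env_value` and `report_empty_keys` are hypothetical helpers, and the variable names are the ones this document mentions elsewhere.

```shell
# Hypothetical helper: print the value of KEY from a KEY=value file.
# Prints nothing if the key is absent or has an empty value.
env_value() {
  # $1 = file, $2 = key
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Hypothetical helper: report which of the given keys have no value.
report_empty_keys() {
  file=$1
  shift
  status=0
  for key in "$@"; do
    if [ -z "$(env_value "$file" "$key")" ]; then
      echo "$key is not set in $file"
      status=1
    fi
  done
  return "$status"
}

# Example:
# report_empty_keys .env WIKIBASE_PUBLIC_HOST \
#   WDQS_FRONTEND_PUBLIC_HOST QUICKSTATEMENTS_PUBLIC_HOST
```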

Starting

Run the following command from within wikibase-release-pipeline/deploy:

docker compose up --wait

The first start can take a couple of minutes. Wait for your shell prompt to return.

🎉 Congratulations, your Wikibase Suite instance should now be up and running. Web interfaces are available over HTTPS (port 443) for the domain names you configured for Wikibase, the WDQS front end and QuickStatements.

💡 If anything goes wrong, run docker logs <CONTAINER_NAME> to inspect the error messages of the affected container.

Stopping

To stop, use

docker compose stop

Resetting the configuration

Most values set in .env are written into the respective containers after you run docker compose up for the first time.

If you want to reset the configuration while retaining your existing data:

  1. Make any needed changes to the values in the .env file copied from template.env above. NOTE: Do not change DB_* values unless you are also re-creating the database.
  2. Delete your LocalSettings.php file from the ./config directory.
  3. Remove and re-create containers:
docker compose down
docker compose up --wait

Advanced configuration

On first launch, WBS Deploy will create files in the ./config directory alongside your .env file, the docker-compose.yml and template.env. This is your instance configuration. You own and control those files. Be sure to include them in your backups.

config/LocalSettings.php

This file is generated by the MediaWiki installer script and supplemented by the Wikibase container's entrypoint.sh script on first launch. Once this file has been generated, you own and control it. This means that not only can you make changes to it, you may need to do so for major version updates.

If config/LocalSettings.php is missing, it triggers the Wikibase container to run the MediaWiki installer script. If you need to run the installer again, you can remove the generated LocalSettings.php file (but keep a backup just in case!) and restart your instance.
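Removing the file while keeping a backup can be sketched as follows. `set_aside_local_settings` is a hypothetical helper, not something shipped with WBS Deploy, and the timestamped `.bak` naming is an assumption chosen for illustration.

```shell
# Hypothetical helper: move LocalSettings.php aside under a timestamped
# name so that the installer runs again on the next start.
# Prints the path of the backup copy.
set_aside_local_settings() {
  config_dir=${1:-./config}
  src="$config_dir/LocalSettings.php"
  if [ ! -f "$src" ]; then
    echo "nothing to do: $src not found" >&2
    return 1
  fi
  dest="$src.$(date +%Y%m%d%H%M%S).bak"
  mv "$src" "$dest" && echo "$dest"
}

# Example, run from the deploy directory:
# docker compose down
# set_aside_local_settings ./config
# docker compose up --wait
```

Renaming rather than deleting keeps the previous settings next to the regenerated file, which makes it easier to carry manual changes over.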

config/wikibase-php.ini

This is Wikibase's php.ini override file, a good place for tuning PHP configuration values. It gets loaded by the Wikibase web server's PHP interpreter.

docker-compose.yml

To further customize your instance, you can also make changes to docker-compose.yml. To ease updating to newer versions of WBS Deploy, consider putting your customizations into a new file called docker-compose.override.yml. If you do this, you'll need to start using the following commands to restart your instance:

docker compose -f docker-compose.yml -f docker-compose.override.yml down
docker compose -f docker-compose.yml -f docker-compose.override.yml up --wait

This way, your changes are kept separate from the original WBS Deploy code.
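To avoid retyping both file arguments, the commands above could be wrapped in a small shell function. This is only a convenience sketch; `wbs_compose` is a hypothetical name.

```shell
# Hypothetical wrapper: run any Compose subcommand with the base file
# and the override file applied, in that order.
wbs_compose() {
  docker compose -f docker-compose.yml -f docker-compose.override.yml "$@"
}

# Example:
# wbs_compose down
# wbs_compose up --wait
```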

Managing your data

Besides your configuration, it's your data that makes your instance unique. All instance data is stored in Docker volumes.

  • wikibase-image-data: MediaWiki image and media file uploads
  • mysql-data: MediaWiki/Wikibase MariaDB raw database
  • wdqs-data: Wikidata Query Service raw database
  • elasticsearch-data: Elasticsearch raw database
  • quickstatements-data: generated QuickStatements OAuth binding for this MediaWiki instance
  • traefik-letsencrypt-data: SSL certificates

Back up your data

To back up your data, shut down the instance and dump the contents of all Docker volumes into .tar.gz files.

docker compose down

for v in \
    wbs-deploy_elasticsearch-data \
    wbs-deploy_mysql-data \
    wbs-deploy_quickstatements-data \
    wbs-deploy_traefik-letsencrypt-data \
    wbs-deploy_wdqs-data \
    wbs-deploy_wikibase-image-data \
    ; do
  docker run --rm --volume $v:/backup debian:12-slim tar cz backup > $v.tar.gz
done
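After the loop finishes, it is worth checking that each archive is readable before relying on it. A minimal sketch, assuming the .tar.gz files are in the current directory; `archive_ok` is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the file is non-empty and tar can
# list it completely as a gzip-compressed archive.
archive_ok() {
  [ -s "$1" ] && tar tzf "$1" > /dev/null 2>&1
}

# Example:
# for f in wbs-deploy_*.tar.gz; do
#   archive_ok "$f" || echo "unreadable or empty: $f"
# done
```

Listing the whole archive takes longer than checking its size, but it also catches truncated files.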

Restore from a backup

To restore the volume backups, ensure your instance has been shut down by running docker compose down and populate the Docker volumes with data from your .tar.gz files.

docker compose down

for v in \
    wbs-deploy_elasticsearch-data \
    wbs-deploy_mysql-data \
    wbs-deploy_quickstatements-data \
    wbs-deploy_traefik-letsencrypt-data \
    wbs-deploy_wdqs-data \
    wbs-deploy_wikibase-image-data \
    ; do
  docker volume rm $v 2> /dev/null
  docker volume create $v
  docker run -i --rm --volume $v:/backup debian:12-slim tar xz < $v.tar.gz
done
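Because the restore loop removes each volume before repopulating it, a missing archive would leave that volume empty. A pre-flight check along these lines could be run first; `missing_archives` is a hypothetical helper that uses the volume names from the loop above.

```shell
# Hypothetical helper: print the name of every expected archive that is
# missing from the given directory (default: current directory).
missing_archives() {
  dir=${1:-.}
  status=0
  for v in \
      wbs-deploy_elasticsearch-data \
      wbs-deploy_mysql-data \
      wbs-deploy_quickstatements-data \
      wbs-deploy_traefik-letsencrypt-data \
      wbs-deploy_wdqs-data \
      wbs-deploy_wikibase-image-data \
      ; do
    if [ ! -f "$dir/$v.tar.gz" ]; then
      echo "$v.tar.gz"
      status=1
    fi
  done
  return "$status"
}

# Example: only proceed with the restore if nothing is missing.
# missing_archives . || echo "restore aborted: archives missing"
```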

Updating and versioning

WBS uses semantic versioning. WBS Deploy and all WBS images have individual version numbers.

WBS Deploy always references the latest minor and patch releases of the compatible WBS images' major versions using the images' major version tag.

Example

Let's say the wikibase image version 1.0.0 is the initial version released with WBS Deploy 3.0.0. In that case, the wikibase image carrying the 1.0.0 tag will also carry a 1 tag. When the wikibase image version is bumped to 1.1.0 for a feature release, a new image is released and tagged with 1.1.0. The 1 tag will then be reused and now point to the newly released image 1.1.0.

This way, WBS Deploy can always reference the latest compatible version by using the major version tag. Nothing needs to be updated in WBS Deploy itself. If the wikibase image version gets bumped to 2.0.0, that indicates a breaking change; in this case the new image would not receive the 1 tag. Instead, a new version of WBS Deploy would be released (in this case 4.0.0) and this one would use a new major version tag called 2 to reference the Wikibase image.

WBS Deploy may also receive minor and patch updates, but, as noted above, they are not required to update related WBS images.
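The relationship between a full version and its floating major tag can be illustrated with a one-line function. This is purely an illustration of the tagging scheme described above; `major_tag` is a hypothetical name.

```shell
# Hypothetical helper: derive the major-version tag from a full
# semantic version by dropping everything after the first dot.
major_tag() {
  echo "${1%%.*}"
}

# major_tag 1.0.0   prints 1
# major_tag 1.1.0   prints 1 (same tag, now pointing at the newer image)
# major_tag 2.0.0   prints 2 (breaking change, new tag)
```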

Minor and patch updates for WBS images

Because WBS Deploy always references the latest minor and patch releases of compatible WBS images, non-breaking changes (including security updates) are applied automatically when re-creating Docker containers.

This is always safe to do. Simply run:

docker compose down
docker compose up --wait

💡 In order to prevent new versions of WBS images being pulled on container restart, stop your containers using docker compose stop instead of docker compose down, which will keep the current containers intact. Note: this stops security updates from being applied. It is generally recommended to use docker compose down, which removes the containers and allows updates to be applied.

Minor and patch updates for WBS Deploy

WBS Deploy major versions are tracked in dedicated branches such as deploy-3. Pulling from the major version branch you are currently on will only update minor and patch versions and will never trigger breaking changes.

These updates are always considered safe.

If you did not change docker-compose.yml, you can update simply by running git pull.

git pull

💡 If you have made changes to docker-compose.yml, commit them to a separate branch and merge them with upstream changes as you see fit.

💡 Each major version of WBS Deploy always references exactly one major version of each of the WBS images. Thus, updating WBS Deploy minor and patch versions from a major version's git branch will never lead to breaking changes in WBS service images.

Major upgrades

Major version upgrades are performed by updating WBS Deploy's major version. This is done by changing your git checkout to the new major version branch. This may reference new major versions of WBS images or involve breaking changes. In turn, those may require additional steps as described below.

WBS only supports updating from one major version to the next version in sequence. In order to upgrade from 1.x.x to 3.x.x, you must first upgrade from 1.x.x to 2.x.x and then to 3.x.x.
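The sequence of branches to pass through can be listed with a short loop. A sketch only, assuming the deploy-MAJOR_VERSION branch naming used in this document; `upgrade_path` is a hypothetical helper.

```shell
# Hypothetical helper: print, in order, the branches to check out when
# upgrading from major version $1 to major version $2.
upgrade_path() {
  to=$2
  next=$(($1 + 1))
  while [ "$next" -le "$to" ]; do
    echo "deploy-$next"
    next=$((next + 1))
  done
}

# Example: upgrade_path 1 3 prints deploy-2, then deploy-3.
```

Each branch in the list corresponds to one full pass through the upgrade steps that follow.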

Bring down your instance

docker compose down

Back up your data and config

Create a backup of your data.

Back up your ./config directory as well using:

cp -r ./config ./config-$(date +%Y%m%d%H%M%S)

💡 If you made changes to docker-compose.yml, commit them to a separate branch and merge them as you see fit in the next step.

Pull new version

WBS Deploy major versions are tracked in separate branches called deploy-MAJOR_VERSION, such as deploy-2 or deploy-3. Change your checkout to the new major version branch.

git remote update
git checkout deploy-MAJOR_VERSION
git pull

💡 If you made changes to docker-compose.yml, merge them as you see fit.

Apply any changes to .env

Look for changes in the new template.env that you might want to apply to your .env file.
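One way to spot newly introduced settings is to list the variable names that appear in template.env but not in .env. A minimal sketch, assuming both files use plain KEY=value lines; `env_keys` and `missing_env_keys` are hypothetical helpers.

```shell
# Hypothetical helper: print the variable names defined in a KEY=value file.
env_keys() {
  sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)=.*/\1/p' "$1"
}

# Hypothetical helper: print names defined in the template ($1) that
# have no line in your own file ($2).
missing_env_keys() {
  env_keys "$1" | while read -r key; do
    grep -q "^$key=" "$2" || echo "$key"
  done
}

# Example:
# missing_env_keys template.env .env
```

This only finds added variables. Changed defaults or updated comments still need a manual comparison, for example with diff.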

Apply any migrations for your version

WBS Deploy 2.x.x to 3.x.x (MediaWiki 1.41 to MediaWiki 1.42)

Read the MediaWiki UPGRADE file.

No Wikibase-specific migrations are necessary.

WBS Deploy 1.x.x to 2.x.x (MediaWiki 1.39 to MediaWiki 1.41)

Read the MediaWiki UPGRADE file.

No Wikibase-specific migrations are necessary.

Bring your instance back up

docker compose up --wait

Automatic updates

At the moment, WBS Deploy does not support automatic updates. To automatically deploy minor and patch updates including security fixes to your WBS images, restart your instance on a regular basis with a systemd timer, cron job, or similar.
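A scheduled restart could look roughly like the sketch below. `restart_wbs`, the script path and the schedule are all assumptions for illustration. The function simply runs the down/up sequence described above from the deploy directory.

```shell
# Hypothetical helper: re-create the containers of the instance whose
# deploy directory is given as $1. Runs in a subshell so the caller's
# working directory is left unchanged.
restart_wbs() (
  cd "$1" || exit 1
  docker compose down && docker compose up --wait
)

# To run it from cron, save this as a script that ends with
#   restart_wbs /path/to/wikibase-release-pipeline/deploy
# and add a crontab entry such as (weekly, Sundays at 04:00):
#   0 4 * * 0 /path/to/wbs-restart.sh
```

The instance is unavailable while the containers are re-created, so a low-traffic time slot is a sensible choice.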

Downgrades

Downgrades are not supported. In order to revert an update, restore your data from a backup made prior to the upgrade.

Removing Wikibase Suite completely with all its data

‼️ This will destroy all data! Back up anything you wish to retain.

To reset the configuration and data, remove the Docker containers, Docker volumes and the generated config/LocalSettings.php file.

docker compose down --volumes
rm config/LocalSettings.php

Removing the traefik-letsencrypt-data volume will cause a new certificate to be requested from Let's Encrypt on the next launch of your instance. Certificate generation on Let's Encrypt is rate-limited; eventually you may be blocked from generating new certificates for multiple days. To avoid that outcome, change to the Let's Encrypt staging server by appending the following line to the traefik command stanza of your docker-compose.yml file:

      --certificatesresolvers.letsencrypt.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory

WDQS Frontend

To interact with the WDQS frontend, navigate to the URL defined as WDQS_FRONTEND_PUBLIC_HOST in the .env file. By default, this is set to wdqs-frontend.example.

Alternatively, send GET requests with your SPARQL query to the WDQS frontend endpoint: https://wdqs-frontend.example.com/proxy/wdqs/bigdata/namespace/wdq/sparql?query={SPARQL}
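With curl, such a request might be sent as sketched below. `sparql_get` is a hypothetical helper; it assumes curl is installed and lets curl handle the URL-encoding of the query. The Accept header asks for the standard SPARQL JSON results format.

```shell
# Hypothetical helper: send a SPARQL query as a GET request.
# --get turns the --data-urlencode value into a URL query string.
sparql_get() {
  # $1 = endpoint URL, $2 = SPARQL query
  curl --silent --show-error --get \
    --header 'Accept: application/sparql-results+json' \
    --data-urlencode "query=$2" \
    "$1"
}

# Example (replace the host with your WDQS_FRONTEND_PUBLIC_HOST value):
# sparql_get 'https://wdqs-frontend.example.com/proxy/wdqs/bigdata/namespace/wdq/sparql' \
#   'SELECT * WHERE { ?s ?p ?o } LIMIT 5'
```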

FAQ

Can I host WBS Deploy locally?

Yes, WBS Deploy can be hosted locally for testing purposes by using the example domain names *.example from template.env in your .env file. Add those domain names to your host machine's /etc/hosts file so that your browser (on your host machine) resolves them to 127.0.0.1 and can access the local WBS Deploy instance.

However, due to OAuth requirements, QuickStatements may not function properly without publicly accessible domain names for both the WIKIBASE_PUBLIC_HOST and QUICKSTATEMENTS_PUBLIC_HOST. Also, running locally without publicly accessible addresses will prevent the generation of a valid SSL certificate; to access locally running services, you will need to accept the invalid certificate when loading the page for the first time.
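The /etc/hosts file has no wildcard syntax, so each host name needs its own entry. The lines can be generated as in this sketch; `hosts_lines` is a hypothetical helper, and the names in the example are placeholders to be replaced with the values from your .env file.

```shell
# Hypothetical helper: print one /etc/hosts line per host name, each
# pointing at the loopback address.
hosts_lines() {
  for name in "$@"; do
    printf '127.0.0.1 %s\n' "$name"
  done
}

# Example: review the output first, then append it as root.
# hosts_lines wikibase.example query.example quickstatements.example \
#   | sudo tee -a /etc/hosts
```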

Can I migrate from another Wikibase installation to WBS Deploy?

It is possible to migrate an existing Wikibase installation to WBS Deploy. The general procedure is as follows:

My WDQS Updater keeps crashing, what can I do?

Check out the known issue in the WDQS README. You may find your solution there in the form of a workaround.

Do you recommend any VPS hosting providers?

As of this writing, we can offer no specific recommendations for VPS providers to host Wikibase Suite. The suite has been tested successfully on various providers; as long as the minimum technical requirements are met, it should run as expected.

Where can I get further help?

If you have questions not listed above or need help, use this bug report form to start a conversation with the engineering team.