The Django server for Fragalysis, which uses the Django REST Framework (DRF) for the API, plus loaders for data.
See additional documentation relating to the backend on ReadTheDocs at https://fragalysis-backend.readthedocs.io/en/latest/index.html
The Backend is part of the Stack, which consists of three services: -
- a Postgres database
- a neo4j graph database
- the Fragalysis "stack"
The stack is formed from code resident in a number of repositories; this is one of them.
The stack is deployed as a set of container images to Kubernetes using Ansible playbooks that can be found in the Ansible repository. Additional development and deployment documentation can be found in the informaticsmatters/dls-fragalysis-stack-kubernetes repository.
This project uses the Poetry (https://python-poetry.org/) package management system. Required packages (along with other project settings) are specified in the pyproject.toml file, and all dependencies and their versions in the poetry.lock file. When the repository is first downloaded, create the local virtual environment by running: -

poetry install
New packages are added with: -

poetry add

The package name and version (exact or a range) are given on the command line. Alternatively, the package can be added manually to the pyproject.toml file under the appropriate section.
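For example (the package names and versions here are purely illustrative): -

poetry add requests==2.31.0
poetry add "django>=3.2,<4.0"

An exact pin records a single version in pyproject.toml, while a range lets a later poetry update select any matching release.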
After a package has been added (or just to update packages defined with a range of allowed versions) run: -

poetry update

This resolves all dependencies (and their dependencies), writes the poetry.lock file and installs/updates packages in the local venv. It is equivalent to running poetry lock && poetry install, so if you're not interested in the local environment and just want to update the lockfile, you can run just poetry lock.
The backend is a Docker container image and can be built and deployed locally using docker-compose: -

docker-compose build
To run the application (which will include deployment of the postgres and neo4j databases) run: -
docker-compose up -d
The postgres database is persisted in the data
directory, outside of the repo.
You may need to provide a number of environment variables that are employed in the container image. Fragalysis configuration depends on a large number of variables, and the defaults may not be suitable for your needs.
The typical pattern with docker-compose is to provide these variables in the docker-compose.yml file and adjust their values (especially the sensitive ones) using a local .env file (see environment variables).
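For illustration, a minimal .env file might look like this (both variables are described later in this README; the values are placeholders): -

DEPLOYMENT_MODE=DEVELOPMENT
FRAGALYSIS_BACKEND_SENTRY_DNS=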
The backend API, for example, should be available on port 8080 of your host at http://localhost:8080/api/.
You can visit the /accounts/login endpoint to log in (assuming you have set up the appropriate environment variables for the container). This generates errors relating to the fact that the FE/Webpack can't be found. This looks alarming but you are logged in.
The backend no longer writes .pyc files (the Dockerfile sets the environment variable PYTHONDONTWRITEBYTECODE). This, and the fact that the backend code is mapped into the container, allows you to make "live" changes to the code on your host and see them reflected in the container app without having to rebuild or restart the backend container.
When you want to spin-down the deployment run: -
docker-compose down
When running locally (via docker-compose) Celery tasks are set to run synchronously, like a function call rather than as asynchronous tasks. This is controlled by the CELERY_TASK_ALWAYS_EAGER environment variable that you'll find in the docker-compose.yml file. If asynchronous Celery tasks are needed in local development, they can be launched with an additional compose file: -

docker compose -f docker-compose.yml -f docker-compose.celery.yml up
There is also a convenient bash script that can be used to build and push an image to a repository; you just need to provide the Docker image namespace and a tag. With poetry and docker available, you can run the script: -
export BE_IMAGE_TAG=1187.1
export BE_NAMESPACE=alanbchristie
./build-and-push.sh
With the backend running you should be able to access the REST API. From the command-line you can use curl or httpie. Here, we use http to GET a response from the API root (which does not require authentication)...
http :8080/api/
The response should contain a list of endpoint names and URLs, something like this...
{
"action-type": "http://localhost:8080/api/action-type/",
"cmpdchoice": "http://localhost:8080/api/cmpdchoice/",
"cmpdimg": "http://localhost:8080/api/cmpdimg/",
[...]
"vector3ds": "http://localhost:8080/api/vector3ds/",
"vectors": "http://localhost:8080/api/vectors/",
"viewscene": "http://localhost:8080/api/viewscene/"
}
To use much of the remainder of the API you will need to authenticate. Some endpoints allow you to use a token, obtained from the corresponding Keycloak authentication service. If you are running a local backend a client ID exists that should work for you, assuming you have a Keycloak user identity. With a few variables: -
TOKEN_URL=keycloak.example.com/auth/realms/xchem/protocol/openid-connect/token
CLIENT_ID=fragalysis-local
CLIENT_SECRET=00000000-0000-0000-0000-000000000000
USER=someone
PASSWORD=password123
...you should be able to obtain an API token. Here we're using http and jq: -
TOKEN=$(http --form POST https://$TOKEN_URL/ \
grant_type=password \
client_id=$CLIENT_ID \
client_secret=$CLIENT_SECRET \
username=$USER \
password=$PASSWORD | jq -r '.access_token')
The token should last for at least 15 minutes, depending on the Keycloak configuration. With the Token you should then be able to make authenticated requests to the API on your local backend.
Here's an illustration of how to use the API from the command-line by getting, adding, and deleting a CompoundIdentifierType: -
ENDPOINT=api/compound-identifier-types
http :8080/$ENDPOINT/ "Authorization:Bearer $TOKEN"
RID=$(http post :8080/$ENDPOINT/ "Authorization:Bearer $TOKEN" name="XT345632" | jq -r '.id')
http delete :8080/$ENDPOINT/$RID/ "Authorization:Bearer $TOKEN"
The backend writes log information in the container to /code/logs/backend.log. This is typically persisted between container restarts on Kubernetes with a separate volume mounted at /code/logs.

For local development using the docker-compose.yml file you'll find the logs at ./data/logs/backend.log.
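For example, you can follow the local log from your host with: -

tail -f ./data/logs/backend.log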
The backend configuration is controlled by a number of environment variables.
Variables are typically defined in the project's fragalysis/settings.py
, where you
will also find ALL the dynamically configured variables (those that can be changed
using environment variables in the deployed Pod/Container).
- Not all variables are dynamic. For example, ALLOWED_HOSTS is a static variable that is set in the settings.py file and is not intended to be changed at run-time.
Refer to the documentation in the settings.py
file to understand the environment
and the style guide for new variables that you need to add.
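As a sketch, assuming (as with the other dynamic variables) that docker-compose.yml passes the variable through to the container, a dynamic variable can be overridden for a local deployment by exporting it before launching the stack: -

export DEPLOYMENT_MODE=DEVELOPMENT
docker-compose up -d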
The best approach is to spin-up the development backend (locally) using
docker-compose
with the custom migration compose file and then shell into Django.
For example, to make new migrations called "add_job_request_start_and_finish_times"
for the viewer's model run the following: -
Before starting postgres, remove any pre-existing local database (if one exists) with: -

rm -rf ./data/postgresql
docker-compose -f docker-compose-migrate.yml up -d
Then enter the backend container with: -

docker-compose -f docker-compose-migrate.yml exec backend bash

From within the backend container make the migrations (in this case for the viewer)...

python manage.py makemigrations viewer --name "add_job_request_start_and_finish_times"
Exit the container and tear-down the deployment: -
docker-compose -f docker-compose-migrate.yml down
The migrations will be written to your clone's filesystem, as the project directory is mapped into the container as a volume at /code. You just need to commit the migrations that have been written to the corresponding migrations directory.
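For example, from the root of your clone (the exact path and commit message are illustrative): -

git add viewer/migrations/
git commit -m "Add JobRequest start/finish time migrations"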
Sentry can be used to log errors in the backend container image. In settings.py this is controlled by setting the value of FRAGALYSIS_BACKEND_SENTRY_DNS, which is also exposed in the developer docker-compose file. To enable it, you need to set it to a valid Sentry DNS value.
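For illustration, a fake DSN-style value set in your local .env file might look like: -

FRAGALYSIS_BACKEND_SENTRY_DNS=https://0123456789abcdef@o000001.ingest.sentry.io/0000001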
The stack can be deployed in one of two modes: DEVELOPMENT or PRODUCTION. The mode is controlled by the DEPLOYMENT_MODE environment variable and is used by the backend in order to tailor the behaviour of the application. In PRODUCTION mode the API is typically a little more strict than in DEVELOPMENT mode.
In order to allow error paths of various elements of the stack to be tested, the developer can inject specific errors ("infections"). This is achieved by setting the environment variable INFECTIONS in the docker-compose.yml file or, for kubernetes deployments, using the ansible variable stack_infections.

Known errors are documented in the api/infections.py module. To induce the error (at the appropriate point in the stack) provide the infection name as the value of the INFECTIONS environment variable. You can provide more than one name by separating them with a comma.
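For example, in the docker-compose.yml environment block (these infection names are hypothetical; the real names are documented in api/infections.py): -

INFECTIONS=infection-one,infection-two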
Infections are ignored in PRODUCTION
mode.
Because the documentation uses Sphinx and its autodoc
module, compiling the
documentation needs all the application requirements. As this is often impractical
on the command-line, the most efficient way to build the documentation is from within
the backend container: -
docker-compose up -d
docker-compose exec backend bash
pip install sphinx==5.3.0
pip install importlib-metadata~=4.0
cd docs
sphinx-build -b html source/ build/
The current version of Python used in the Django container image is 3.7, which suffers from an import error relating to celery. This is fixed by using a pre-v5.0 version of importlib-metadata, as illustrated in the above example (see https://stackoverflow.com/questions/73933432/).
The code directory is mounted in the container so the compiled documentation can then be committed from the host machine.
The project uses pre-commit to enforce linting of files prior to committing them to the upstream repository.
As fragalysis is a complex code-base (that's been maintained by a number of key developers) we currently limit the linting to the viewer application (see the .pre-commit-config.yaml file for details). In the future we might extend this to the entire code-base.
To get started, review the pre-commit utility and then set up your local clone by following the Installation and Quick Start sections of the pre-commit documentation.
Ideally from a Python environment...
poetry shell
poetry install --only dev
pre-commit install -t commit-msg -t pre-commit
Now the project's rules will run on every commit and you can check the state of the repository as it stands with...
pre-commit run --all-files
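Alternatively, you can run the checks against particular files without committing (the path here is illustrative): -

pre-commit run --files viewer/views.py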
As the application has evolved several design documents have been written detailing improvements. These may be useful for background reading on why decisions have been made.
The documents are stored in the /design_docs folder in the repo. These include, but are not limited to: -
- Fragalysis Discourse Design
- Fragalysis Tags Design V1.0
- Fragalysis Design #651 Fix Data Download V2.0
- Fragalysis Job Launcher V1.0
- Fragalysis Job Launcher V2.0