An environment for local development and testing of tools to update UCLA's ArchivesSpace records.
To support both AMD and ARM architectures, we need to build the main ArchivesSpace container image locally. To do this, first clone the ArchivesSpace repo: https://github.com/archivesspace/archivesspace/. Navigate to the main archivesspace directory and build the image, tagging it as archivesspace-local:
docker build . -t archivesspace-local
Then, navigate back to this repo's main directory (archivesspace-toolkit) and run the containers with docker compose:
docker compose up -d
Wait until you see the following message in the as_aspace container logs:
Welcome to ArchivesSpace!
You can now point your browser to http://localhost:8080
The staff interface will be available at http://localhost:8080, and the public interface will be at http://localhost:8081. Log in with username and password admin.
- Retrieve the latest production database dump, named ucla.sql.gz, from Box (ask a teammate if you need access). Move the file to your archivesspace-toolkit project directory.
- Start your local system, if not already up: docker compose up -d, and wait for the application to be ready.
- Run the following to load the data. Since database storage is persisted via a volume on the host, this will use about 2.7 GB of local storage on your computer. The load will take several minutes, depending on your computer:
gunzip -dc ucla.sql.gz | docker compose exec -T db mysql -D archivesspace -u root -p123456
# This unzips the database dump to STDOUT, pipes it to mysql running on the db service, loading the data
# into the archivesspace database. The mysql user must be root; password 123456 comes from .docker-compose_db.env.
- Ignore the warning:
mysql: [Warning] Using a password on the command line interface can be insecure.
- Quick verification of data load:
docker compose exec db mysql -D archivesspace -u as -pas123 -e 'select count(*) from repository;'
+----------+
| count(*) |
+----------+
| 3 |
+----------+
- This process is repeatable: the import drops existing tables (official ones, at least), so all official content is replaced each time. (Tables you may have created manually will remain, along with their contents.)
- If you do want to start fresh:
  - Stop the system with docker compose down
  - Remove the database storage: docker volume rm archivesspace-toolkit_db
  - Start the system as usual
After a database refresh from production, the initial local admin/admin account/password will no longer work.
To set the admin password to be the same as the production password for your own user, run this after changing YOUR_ASPACE_USERNAME to the appropriate value:
docker compose exec db mysql -D archivesspace -u as -pas123 \
-e 'update auth_db as t1, (select * from auth_db where username = "YOUR_ASPACE_USERNAME") as t2 set t1.pwhash = t2.pwhash where t1.username = "admin";'
TBD - all I know for now is this is a long-running process (14 hours so far....)
All API access is handled by the main application service, archivesspace, on port 8089. This can be reached from the python container.
A curl example, for now:
# Open bash session on python container
docker compose run python bash
# Authenticate
curl -s -F password="admin" "http://archivesspace:8089/users/admin/login"
# Use the session key for all other API requests
curl -H "X-ArchivesSpace-Session: your_session_key" "http://archivesspace:8089/repositories"
It's possible to access data "live" in the hosted test instance, from the development environment. This requires extra setup, because the APIs are IP-restricted and must be accessed via HTTPS/TLS. General notes are in our internal documentation. For this specific application:
- Add 127.0.0.1 uclalsc-test.lyrasistechnology.org to your local /etc/hosts
- Create a tunnel from your local machine through our jump server. The local port is arbitrary; I've used 9000:
  ssh -NT -L 0.0.0.0:9000:uclalsc-test.lyrasistechnology.org:443 jump
- Connect from your local machine, or from within Docker, using https://uclalsc-test.lyrasistechnology.org:9000/api and appropriate credentials (see the sketch below).
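A quick way to confirm the hosts entry and tunnel are working is to authenticate against the hosted test API from Python. This is a sketch: YOUR_USERNAME and YOUR_PASSWORD are placeholders for your hosted-instance credentials, and the requests library is assumed to be installed.
# Verify the tunneled connection by logging in to the hosted test API (sketch)
import requests

url = "https://uclalsc-test.lyrasistechnology.org:9000/api/users/YOUR_USERNAME/login"
resp = requests.post(url, data={"password": "YOUR_PASSWORD"})
print(resp.status_code, resp.json().get("session"))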
There is a default ArchivesSnake configuration file in python/.archivessnake.yml. This supports access from the running python container to the running archivesspace container, using default (and non-secret) credentials.
For other configurations, copy python/.archivessnake.yml to python/.archivessnake_secret_DEV.yml or python/.archivessnake_secret_TEST.yml, and edit the baseurl, username, and password fields as appropriate. These files must be in the python directory to be available within the container.
These are excluded from the repository, so contact a teammate if you need specific credentials.
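With a configuration file in place, typical use from the python container looks something like the sketch below, using ArchivesSnake's ASnakeClient. This assumes the container is set up so ArchivesSnake finds the configuration file; adjust to your setup.
# Use ArchivesSnake with the configuration described above (sketch)
from asnake.client import ASnakeClient

client = ASnakeClient()   # reads baseurl/username/password from the config file
client.authorize()        # logs in and stores the session token
for repo in client.get("repositories").json():
    print(repo["uri"], repo["name"])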
When running against hosted systems (UCLA's test and production ArchivesSpace instances), some APIs can time out (for example, getting top containers for a resource, where some of our resources have thousands of containers). An alternative is to read data from the database, manipulate it as needed, then call APIs as usual to update data.
The hosted databases are IP-restricted, and must be accessed via tunneled connections. To set up the connections, run one of the following on the support server we use, p-u-exlsupport01.library.ucla.edu:
# Connect to TEST database
### TBD - waiting for vendor to set this up ###
# Connect to PROD database
ssh -i ~/.ssh/id_aspace_ssh -NT -L \
3306:aspace-hosting-production-db-shared-p1.lyrtech.org:3306 \
[email protected]
This tunnel runs in the foreground, so you'll need to open a second connection to run programs which need it.
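With the tunnel running in its own session, database reads can be done over it. Below is a minimal sketch using PyMySQL; the host and port match the tunnel above, while the user, password, database name, and installed driver are assumptions (get actual credentials from a teammate). The top_container table is used here because container lookups are the kind of query that can time out via the API.
# Query the tunneled database with PyMySQL (sketch; credentials are placeholders)
import pymysql

conn = pymysql.connect(
    host="127.0.0.1",            # local end of the ssh tunnel
    port=3306,
    user="YOUR_DB_USERNAME",
    password="YOUR_DB_PASSWORD",
    database="YOUR_DB_NAME",
)
with conn.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM top_container")
    print(cur.fetchone()[0])
conn.close()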