Middleware server for handling access control for ADC v1 HTTP API compliant repositories.
Check out Docker image here
The project runs on Java 11 and follows a (modified) Google Java Style Guide.
Features:
- Support for all of the AIRR ADC API functionalities except for: tsv format on POST endpoints.
- Response fields filtering based on provided token access level.
- Emission of UMA tickets restricted to the fields requested.
You can also check out this simple front-end for testing the access control capabilities of this middleware.
Example deployment for testing in localhost.
- Clone the repository:
git clone https://github.com/Ross65536/adc-middleware
- Load repository test data:
iReceptor Turnkey is used in this example.
Download backend data (for testing):
cd example
curl -L https://github.com/Ross65536/adc-middleware/releases/download/data/repository-data.tar.gz > repository-data.tar.gz
tar -xvzf repository-data.tar.gz -C data/
sudo chown -R $(whoami) data/mongodb/
- Load all components:
(Optional) build middleware docker image locally:
docker build -t ros65536/adc-middleware .
Load components
cd example
# load all components
MIDDLEWARE_UMA_CLIENT_SECRET=12 docker-compose up # MIDDLEWARE_UMA_CLIENT_SECRET is not important in this step but must be set
You can now make requests to http://localhost/airr/v1/. Try http://localhost/airr/v1/info to see if there is a connection to the backend.
On boot the middleware server automatically connects to the DB.
- Configure keycloak:
Keycloak admin login page is now accessible at http://localhost/auth. Regular user login at http://localhost/auth/realms/master/account.
Then see below how to configure keycloak for first use.
Make note of the generated client secret ($MIDDLEWARE_UMA_CLIENT_SECRET) for the created adc-middleware client.
You also need to create an additional client: in Keycloak's Clients side bar tab, load (import) and save the client from the file ./example/config/keycloak/front-end.json.
- Re-Load the server with the client secret
# first stop docker containers, e.g.
docker-compose stop
# then
MIDDLEWARE_UMA_CLIENT_SECRET=<the client secret from step 3> docker-compose up
- Synchronize middleware and Keycloak state with Repository:
# '12345abcd' is the password
curl --location --request POST 'localhost/airr/v1/synchronize' --header 'Authorization: Bearer 12345abcd'
See below for a discussion on when to re-synchronize.
You should now be able to access the frontend on http://localhost, log in, and access resources.
Important: When deploying, make the backend's API unavailable to the public.
Important: You must generate a new password and hash for the app.synchronizePasswordHash property variable, see below how.
Important: The middleware APIs should be under an SSL connection in order not to leak user credentials or the synchronization password. If they are not under SSL you need to disable the SSL requirement in Keycloak (see below).
Important: The host and port used by the middleware to access keycloak and the host and port used by the user to obtain an RPT token must be exactly the same (must have the same Host header), otherwise the solution doesn't work. Check the nginx configuration to see how this was achieved. When deploying, if using the provided nginx config you must update the line marked as VERY IMPORTANT.
- Go to the keycloak login page ($HOSTNAME/auth). Login as admin with admin:admin.
- Go to master's Realm Settings in the sidebar and enable User-Managed Access in the General tab.
- Create a new client in the Clients side bar tab: load (import) and save the client from the file ./example/config/keycloak/adc-middleware.json. Go to the credentials tab in the client and note the generated Secret value, which is the client secret, while adc-middleware is the client ID.
- In the Users tab create a user with username owner, this is the resource owner. Create a user with username user, this is the user that will access resources. For each created user, create the password in the user's Credentials tab. A user can then login ($HOSTNAME/auth/realms/master/account), for example as an owner to grant accesses to users.
You can use different values for these strings, but you would need to update a lot of configuration variables.
If you need to disable Keycloak's SSL for remote connections, for example when accessing a remote server without SSL enabled, you can:
- Disable using admin panel. You need to access the admin panel in the remote server locally (localhost) and go to Realm Settings -> Login -> Require SSL = none
- Disable using SQL script. You can connect to keycloak's database (after it has been loaded by keycloak) and then run:
UPDATE realm set ssl_required = 'NONE' where id = 'master';
You can connect, for example with:
# in remote server shell
psql -U postgres -p 5432 -h localhost
\c keycloak_db
The docker image for the middleware accepts the following environment variables:
- CLIENT_SECRET: The UMA client secret for the middleware.
- DB_PASSWORD: The DB password for the middleware.
- PROPERTIES_PATH: The path for the java properties configuration file.
The remaining configuration is done using java properties (for example see data/config/example.properties, for explanation see below).
If using OpenJDK, use minimum of v11.0.7
You need to set up and configure a keycloak server.
Run:
docker-compose --file docker-compose.dev.yml up keycloak_db
docker-compose --file docker-compose.dev.yml up keycloak
Keycloak is now hosted on http://localhost:8082.
Then see above how to configure keycloak.
./gradlew bootRun
With arguments:
./gradlew bootRun --args='--server.port=9999' # --server.port equivalent to java's -Dserver.port
File dev.properties (you need to update uma.clientSecret):
adc.resourceServerUrl=http://localhost:80/airr/v1
server.port=8080
# password 'master'
app.synchronizePasswordHash=$2a$10$qr81MrxWblqZlMAt5kf/9.xdBPubtDMuoV3DRKgTk2bu.SPazsaTm
# UMA
uma.wellKnownUrl=http://localhost:8082/auth/realms/master/.well-known/uma2-configuration
uma.clientId=adc-middleware
uma.clientSecret=<the generated client secret from keycloak>
uma.resourceOwner=owner
# Postgres
spring.datasource.url=jdbc:postgresql://localhost:5432/middleware_db
spring.datasource.username=postgres
spring.datasource.password=password
spring.datasource.platform=postgres
#redis
spring.cache.type=redis
spring.redis.host=localhost
spring.redis.port=6379
docker-compose --file docker-compose.dev.yml up
./gradlew bootRun --args='--spring.config.location=classpath:/application.properties,./dev.properties'
The jar uses java 11
./gradlew bootJar # jar will be placed in ./build/libs/
To run style checker run:
./gradlew clean
./gradlew checkstyleMain
./gradlew test
Dockerhub has set up a hook to automatically pull and build images from repository commits that are tagged like v1.0.1 using semantic versioning.
git tag -a v<VERSION> -m <MESSAGE> # tag latest commit
git push origin --tags # This should trigger a build in dockerhub
You can set these by either adding a custom properties file (using --spring.config.location to inject the file, see example below) or by passing them as CLI options (with -D<property>=<value>). In the properties files you can use the field names directly as displayed here, for the CLI prepend -D, for the gradle CLI prepend -- (see above example).
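As a small illustration of the two prefixes, the Python sketch below renders the same property names in both forms. The helper names `to_jvm_flags` and `to_gradle_args` are hypothetical and not part of this project; values containing spaces or quotes would need extra shell quoting that is not handled here.

```python
def to_jvm_flags(props):
    """Render properties as -D<property>=<value> flags for the java CLI."""
    return [f"-D{key}={value}" for key, value in props.items()]


def to_gradle_args(props):
    """Render properties as the --<property>=<value> string passed via bootRun --args."""
    return " ".join(f"--{key}={value}" for key, value in props.items())


# Property names are taken from this document; the values are examples.
props = {"server.port": "9999", "uma.clientId": "adc-middleware"}

print(to_jvm_flags(props))
print("./gradlew bootRun --args='" + to_gradle_args(props) + "'")
```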
Required:
- adc.resourceServerUrl: The url to the underlying resource server (ADC backend) including base path (example: http://localhost:80/airr/v1).
- uma.wellKnownUrl: The url to the keycloak server's UMA well known document (example: http://localhost:8082/auth/realms/master/.well-known/uma2-configuration).
- uma.clientId: Client ID for this middleware in keycloak.
- uma.clientSecret: Client Secret for the client ID.
- uma.resourceOwner: The Keycloak username who will be the owner of the created resources.
- spring.datasource.url: The url to the DB.
- spring.datasource.username: DB username.
- spring.datasource.password: DB password.
- spring.datasource.platform: The platform. Omit for H2 DB, set to postgres for PostgreSQL DB.
- app.synchronizePasswordHash: The BCrypt hash of the password protecting the synchronization endpoint. See below how to generate.
Optional:
- server.servlet.context-path: The base path of the middleware API. Defaults to /airr/v1.
- server.port: The middleware server port. Defaults to 80.
- app.adcCsvConfigPath: The path for the CSV config file containing the custom fields configuration. Example: ./field-mapping.csv. Defaults to the file src/main/resources/field-mapping.csv. See below for structure of file.
- app.facetsEnabled: Boolean, indicates whether the resource server supports facets (and by extension ADC filters). Defaults to true.
- app.publicEndpointsEnabled: Boolean, indicates whether the resource server supports the public ADC endpoints (/, /info, /swagger). Defaults to true.
- app.adcFiltersEnabled: Setting this to true will disable POST endpoint's "filters" function which should make oracle attacks infeasible. Defaults to false.
- app.filtersOperatorsBlacklist: A comma separated list of "filters" operators that are disabled. Disabling some operators helps mitigate timing attacks. Defaults to an empty value, which disables this feature.
- app.requestDelaysPoolSize: Positive integer, the delays pool size. Defaults to 10. Set to 0 to disable request delaying when emitting permissions tickets. Mitigates timing attacks.
Optional Dev:
- (H2 only) spring.h2.console.enabled: Will enable H2 web console on http://localhost:8080/airr/v1/h2-console (default with url jdbc:h2:file:./data/db, account sa:password). Defaults to false.
Pay attention to spaces, a space at the end of a property value line will be included in the string
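Because a trailing space silently becomes part of the value, a quick check of the properties file before deployment can help. The following is a minimal Python sketch, not part of this project: the function name `find_trailing_whitespace` is hypothetical, and it only understands the simple `key=value` lines used in this document (not the `:` separator or line continuations that the properties format also allows).

```python
def find_trailing_whitespace(text):
    """Return (line_number, key) for property lines whose value ends in whitespace."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip()
        # Skip blank lines and comments ('#' or '!' start a comment line).
        if not stripped or stripped[0] in "#!":
            continue
        if "=" not in stripped:
            continue
        key, value = stripped.split("=", 1)
        if value != value.rstrip():
            problems.append((number, key.strip()))
    return problems


# The second line ends with a space, so 'owner ' (with the space) would be used.
sample = "server.port=8080\numa.resourceOwner=owner \n# comment \n"
print(find_trailing_whitespace(sample))  # [(2, 'uma.resourceOwner')]
```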
Running with custom properties file (using deployment jar):
# ./config.properties is the custom file, in the current working directory
# MAKE sure to also include the MANDATORY default properties file 'classpath:/application.properties'
java -jar ./build/libs/adc-auth-middleware-0.0.1-SNAPSHOT.jar \
--spring.config.location=classpath:/application.properties,./config.properties
- Example config for H2 DB
spring.datasource.url=jdbc:h2:file:./data/h2/db
spring.datasource.username=sa
spring.datasource.password=password
- Example config for PostgreSQL DB
See data/config/example.properties for an example.
spring.datasource.url=jdbc:postgresql://localhost:5432/postgres
spring.datasource.username=postgres
spring.datasource.password=password
spring.datasource.platform=postgres
- Using Redis as Cache (Optional)
If these values are not set the default spring cache will be used
spring.cache.type=redis
spring.redis.host=localhost
spring.redis.port=6379
To use simple in-memory cache set:
spring.cache.type=simple
Value for the app.adcCsvConfigPath config param. You can use the default provided ./field-mapping.csv or extend it.
The CSV must have header:
- class: Specifies whether the field is a Repertoire or Rearrangement field.
- field: The field. For nested objects separate the levels with a dot (.). Example: subject.age_unit.value or repertoire_id. Supports arrays.
- protection: Whether the field is publicly accessible or protected. Public means any user can access this information, protected means only users that were given access with the specific scope can access the field information. Valid values are public and protected.
- access_scope: The UMA scope required to be able to access the field. Must be blank if protection is public, cannot be blank if it is not. Values can be any user defined string of pattern [\w_]+. IMPORTANT: make sure that you make no typos here, the values used here are the UMA scopes stored in keycloak and used for access control.
- field_type: The type of the field. Used for input validation. Valid values are string, boolean, number, integer, array_string.
- include_fields: Can be one of miairr, airr-core, airr-schema or empty. Specifies to which type the field belongs. A field that belongs to airr-schema belongs also to airr-core and miairr, and a field of airr-core also belongs to miairr. Matches with the ADC API query's include_fields JSON parameter.
The CSV is comma separated. For an example see src/main/resources/field-mapping.csv.
The CSV can include other columns after these which are ignored.
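The column rules above can be checked mechanically before the file is handed to the middleware. The following Python sketch is an illustration only and is not how the middleware itself parses the file; the function names and the sample rows (including the scope names `statistics` and `raw_sequence`) are hypothetical.

```python
import csv
import io
import re

REQUIRED_COLUMNS = ["class", "field", "protection", "access_scope", "field_type", "include_fields"]
CLASSES = {"Repertoire", "Rearrangement"}
FIELD_TYPES = {"string", "boolean", "number", "integer", "array_string"}
INCLUDE_FIELDS = {"", "miairr", "airr-core", "airr-schema"}
SCOPE_PATTERN = re.compile(r"[\w_]+")


def validate_row(raw_row):
    """Return a list of problems for one CSV row (empty list if the row is valid)."""
    # Only the required columns are read; any extra columns are ignored.
    row = {column: (raw_row.get(column) or "").strip() for column in REQUIRED_COLUMNS}
    errors = []
    if row["class"] not in CLASSES:
        errors.append(f"unknown class {row['class']!r}")
    if not row["field"]:
        errors.append("field is blank")
    if row["protection"] == "public":
        if row["access_scope"]:
            errors.append("access_scope must be blank for public fields")
    elif row["protection"] == "protected":
        if not SCOPE_PATTERN.fullmatch(row["access_scope"]):
            errors.append("protected fields need an access_scope matching [\\w_]+")
    else:
        errors.append(f"unknown protection {row['protection']!r}")
    if row["field_type"] not in FIELD_TYPES:
        errors.append(f"unknown field_type {row['field_type']!r}")
    if row["include_fields"] not in INCLUDE_FIELDS:
        errors.append(f"unknown include_fields {row['include_fields']!r}")
    return errors


def validate_csv(text):
    """Validate CSV text and return {line_number: [problems]} for invalid rows."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    report = {}
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        errors = validate_row(row)
        if errors:
            report[line_number] = errors
    return report


# Hypothetical rows; the last one wrongly sets a scope on a public field.
sample = """class,field,protection,access_scope,field_type,include_fields,notes
Repertoire,repertoire_id,public,,string,miairr,extra column is ignored
Repertoire,subject.age_unit.value,protected,statistics,string,miairr,
Rearrangement,sequence_id,public,raw_sequence,string,,scope on a public field
"""
print(validate_csv(sample))
```

The check does not catch a misspelled but well-formed scope name, so the warning about typos in access_scope still applies.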
The middleware needs to synchronize with the backend periodically. No automatic synchronization is performed so you must invoke synchronization when data in the resource server changes, namely when a repertoire or rearrangement is added, deleted or updated (study, repertoire_id and sequence_id fields).
To synchronize you can make the following request to the /airr/v1/synchronize endpoint using the password as Bearer token:
curl --location --request POST "$MIDDLEWARE_HOST/airr/v1/synchronize" --header "Authorization: Bearer $THE_PASSWORD"
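The same request can be issued from a script. Below is a minimal Python sketch using only the standard library; the function name `build_synchronize_request` is hypothetical, and the host and password are the example values used earlier in this document. The sketch only builds the request, sending it is left commented out because it needs a running middleware.

```python
import urllib.request


def build_synchronize_request(middleware_host, password):
    """Build (but do not send) the POST request for the synchronize endpoint."""
    return urllib.request.Request(
        url=f"{middleware_host}/airr/v1/synchronize",
        method="POST",
        headers={"Authorization": f"Bearer {password}"},
    )


request = build_synchronize_request("http://localhost", "12345abcd")
print(request.get_method(), request.full_url)

# To actually send it (requires a running middleware):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```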
You need to hash the password with BCrypt (10 rounds) to use the synchronization endpoint:
sdk install springboot # need https://sdkman.io/ installed
PASSWORD=$(xxd -l 32 -c 100000 -p /dev/urandom) # or use a different password
spring encodepassword -a bcrypt $PASSWORD # $THE_PASSWORD
# example acceptable password: 'master' for '$2a$10$qr81MrxWblqZlMAt5kf/9.xdBPubtDMuoV3DRKgTk2bu.SPazsaTm'
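If xxd is not available, a random password of the same shape (32 random bytes, hex encoded) can be produced with Python's standard library. This sketch covers only the password generation step; the BCrypt hashing still has to be done with a BCrypt tool such as the one shown above. The function name `generate_sync_password` is hypothetical.

```python
import secrets


def generate_sync_password(num_bytes=32):
    """Return a random hex password built from num_bytes of OS-provided randomness."""
    return secrets.token_hex(num_bytes)


password = generate_sync_password()
print(len(password))  # 64 hex characters for 32 bytes
```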
You can use the public endpoint:
curl --location --request GET 'localhost:8080/airr/v1/public_fields'
to obtain the public fields for each class of resources.
To be able to make use of this middleware the backend MUST implement the following AIRR ADC API endpoints (the URL base-path is configurable):
- GET /repertoire/{repertoire_id}
- GET /rearrangement/{sequence_id}
- POST /repertoire
- POST /rearrangement
For endpoints 3. and 4. the backend can discard the other query fields except the "facets" field, which must be processed and returned correctly.
The following public endpoints are not mandatory and access to them can be disabled in the middleware:
- GET /
- GET /info
- GET /swagger
The Repertoire regular responses (1. and 3.) must be of (minimal) format:
{
"Repertoire": [
{ // can put any extra fields in here
"repertoire_id": "123adc", // string type, must be the id in endpoint 1.
"study": {
"study_id": "12", // string type, multiple repertoires can have the same study, in which case the study id MUST be the same
"study_title": "Research thingy" // string type, while this is optional it is used for UI purposes for keycloak
}
}
]
}
The Rearrangement regular responses (2. and 4.) must be of (minimal) format:
{
"Rearrangement": [
{ // can put any extra fields in here
"repertoire_id": "123adc", // string type, must be the id of the repertoire to which this rearrangement belongs to
"sequence_id": "234" // string type, must be the id in endpoint 2.
}
]
}
Any extra fields used for Repertoire or Rearrangement can be used if they are set in the CSV config file.
As mentioned, facets must be supported. Suppose the middleware receives a user query for repertoires like:
{
"filters": {
"op": "=",
"content": {
"field": "repertoire_id",
"value": "5e53dead4d808a03178c7891"
}
},
"from": 5,
"size": 10
}
The middleware will modify the request to the following before sending it to the repository:
{
"filters": {
"op": "=",
"content": {
"field": "repertoire_id",
"value": "5e53dead4d808a03178c7891"
}
},
"from": 5,
"size": 10,
"facets": "study.study_id" // For rearrangements the field is "repertoire_id"
}
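The rewrite shown above amounts to adding one key to the user's query. A minimal Python sketch of that step follows; the names `FACET_FIELD` and `add_facets` are hypothetical and the middleware's real implementation may differ.

```python
import copy

# Facet field per resource class, as described in this document.
FACET_FIELD = {"repertoire": "study.study_id", "rearrangement": "repertoire_id"}


def add_facets(user_query, resource_class):
    """Return a copy of the user's query with the facets field added."""
    query = copy.deepcopy(user_query)  # leave the caller's query untouched
    query["facets"] = FACET_FIELD[resource_class]
    return query


user_query = {
    "filters": {
        "op": "=",
        "content": {"field": "repertoire_id", "value": "5e53dead4d808a03178c7891"},
    },
    "from": 5,
    "size": 10,
}
print(add_facets(user_query, "repertoire")["facets"])  # study.study_id
```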
The middleware expects a minimal facets response like:
{
"Facet": [
{
"study.study_id": "s1", // string. For rearrangements the field is "repertoire_id"
"count": 130 // integer
},
...
]
}
In the example above the filters, from and size parameters can be discarded by the repository.
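To illustrate what is read from such a facets response, the Python sketch below collects the distinct ids it lists. The function name `facet_ids` is hypothetical, and the second facet entry (`s2` with count 4) is made-up sample data.

```python
def facet_ids(response, field):
    """Collect the distinct ids listed in a facets response, in first-seen order."""
    ids = []
    for entry in response.get("Facet", []):
        value = entry[field]
        if value not in ids:
            ids.append(value)
    return ids


response = {
    "Facet": [
        {"study.study_id": "s1", "count": 130},
        {"study.study_id": "s2", "count": 4},
    ]
}
print(facet_ids(response, "study.study_id"))  # ['s1', 's2']
```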
If the filters support is not compliant enough as discussed below, the facets feature MUST be disabled in the middleware to avoid security issues (even though the middleware makes use of the repository's facets function).
The official documentation (ADC v1) does not specify that the parameters from and size can be used with the facets parameter, and they can be safely discarded by the repository. However, in order to improve security (by limiting the scope of the emitted tokens) and avoid UI clutter, these should be supported and should perform the same query along with filters as done in a regular search without facets; that is, the set of field values returned by a regular search and by a search using facets must be the same.
To use facets the repository backend MUST support the ADC filters query feature as described here, otherwise this feature MUST be disabled in the middleware's config.
More specifically the "in" filters operator must be supported, as well as the "and" operator for chaining with user requests. If a user makes a Repertoires search request like:
{
"filters":{
"op":"=",
"content": {
"field": "repertoire_id",
"value": "5e53dead4d808a03178c7891"
}
}
}
The middleware modifies the request and sends:
{
"filters":{
"op": "and",
"content": [
{
"op":"=",
"content": {
"field": "repertoire_id",
"value": "5e53dead4d808a03178c7891"
}
},
{
"op":"in",
"content": {
"field": "study_id",
"value": ["123", "456"] // example values
}
}
]
}
}
Likewise for Rearrangements, but with repertoire_id as the field of the "in" operator.
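The chaining described above can be sketched in Python as follows. The function name `restrict_filters` is hypothetical, and the handling of a query without any user filters (sending the "in" restriction alone) is an assumption that this document does not state.

```python
def restrict_filters(user_query, id_field, allowed_ids):
    """Return a copy of the query whose filters are limited to the allowed ids."""
    restriction = {
        "op": "in",
        "content": {"field": id_field, "value": list(allowed_ids)},
    }
    query = dict(user_query)
    user_filters = user_query.get("filters")
    if user_filters is None:
        # Assumption: with no user filters, only the restriction is sent.
        query["filters"] = restriction
    else:
        # Chain the user's filters with the restriction using "and".
        query["filters"] = {"op": "and", "content": [user_filters, restriction]}
    return query


user_query = {
    "filters": {
        "op": "=",
        "content": {"field": "repertoire_id", "value": "5e53dead4d808a03178c7891"},
    }
}
modified = restrict_filters(user_query, "study_id", ["123", "456"])
print(modified["filters"]["op"])  # and
```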
It is assumed that, like in the AIRR ADC API, an empty "in":
{
"op":"in",
"content": {
"field": "study_id",
"value": []
}
}
would make the backend return an empty Facet response.
If there are values in the array sent in the "in" operator, the ids MUST be matched against the response, otherwise an information leak is created.
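For backend implementers, the expected behaviour of "in" can be illustrated with a small in-memory Python sketch: only records whose field value appears in the list are returned, and an empty list matches nothing. The helper names `get_path` and `apply_in` and the sample records are hypothetical; a real repository would translate the operator into a database query instead.

```python
def get_path(record, dotted_field):
    """Look up a dot-separated field in a nested dict, or return None if absent."""
    value = record
    for key in dotted_field.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def apply_in(records, field, allowed_values):
    """Keep only the records whose field value is one of allowed_values."""
    allowed = list(allowed_values)
    return [record for record in records if get_path(record, field) in allowed]


records = [
    {"repertoire_id": "r1", "study": {"study_id": "123"}},
    {"repertoire_id": "r2", "study": {"study_id": "789"}},
]
print(apply_in(records, "study.study_id", ["123", "456"]))  # only r1 is returned
print(apply_in(records, "study.study_id", []))  # []
```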
TSV format is supported for the POST /v1/rearrangement endpoint.
The user's requested fields cannot be nested documents/objects (in the default CSV configuration no rearrangement fields are nested objects).
TSV support is implemented in the middleware itself by translating JSON to TSV.
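A JSON to TSV translation of this kind can be sketched in Python as follows. This is an illustration under assumptions, not the middleware's actual code: the function name `rearrangements_to_tsv` is hypothetical, missing values are written as empty cells, and nested objects are rejected in line with the restriction above. How the middleware formats booleans or arrays is not covered here.

```python
import csv
import io


def rearrangements_to_tsv(response, fields):
    """Render the Rearrangement list of a JSON response as TSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)  # header row with the requested field names
    for record in response.get("Rearrangement", []):
        row = []
        for field in fields:
            value = record.get(field)
            if isinstance(value, dict):
                raise ValueError(f"field {field!r} is a nested object")
            row.append("" if value is None else value)
        writer.writerow(row)
    return buffer.getvalue()


response = {"Rearrangement": [{"repertoire_id": "123adc", "sequence_id": "234"}]}
print(rearrangements_to_tsv(response, ["sequence_id", "repertoire_id"]))
```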
- Login to keycloak's admin panel.
- Go to Identity Providers in the side bar and add an OpenID Connect provider, set the alias which will be the display name (for example to orcid) and make note of the generated Redirect URI.
- Add keycloak to third party OIDC IdP.
For ORCID: log in to an account, go to developer tools, and add keycloak: set Your website URL to keycloak's host (example http://localhost:8082) and put in Redirect URIs the url generated in keycloak from the previous step (example http://localhost:8082/auth/realms/master/broker/orcid/endpoint). Make note of the Client ID and Client Secret. Save.
For EGI Checkin: In the dashboard from step 2, add generated info from previous step.
For ORCID put https://orcid.org/oauth/authorize in the Authorization URL, https://orcid.org/oauth/token in the token url, set Client Authentication to Client secret sent as post and input the client ID and client secret from the previous step in Client ID and Client Secret. Save.
For EGI Checkin put https://aai-dev.egi.eu/oidc/authorize in the Authorization URL, https://aai-dev.egi.eu/oidc/token in the token url, set Client Authentication to Client secret sent as post and input the client ID and secret. Save.
You can see here a python-like pseudo-code which describes this whole middleware server's working.
You can obtain the Postman folder that you can use to test the middleware solution's ADC features in more depth here