A utility that mirrors datasets from other data portals as external datasets in your Socrata Open Data Portal. It uses the external data portal's data.json dataset catalog for aggregation.
This was set up for Heroku, so that is the suggested platform. Otherwise, you will need to provide leiningen and postgres yourself.
- Install leiningen using its install script (don't use apt).
- Clone this project and cd into its directory.
- Add an application to Heroku.
- Add a Heroku git remote to this project:
heroku git:remote -a yourherokuapp
- Add Postgres to your application:
heroku addons:add heroku-postgresql:dev
- Set the environment variables for the application, substituting your own values for those shown:
heroku config:add SYNC_DATA_JSON_URL=http://some.remotedataportalyourepullingfrom.com/data.json
heroku config:add SYNC_HOST=some.remotedataportalyourepullingfrom.com
heroku config:add SYNC_URL=http://your.socratadataportal.com
heroku config:add SYNC_USERNAME=you@example.com
heroku config:add SYNC_PASSWORD=yourpassword
heroku config:add SYNC_TOKEN=yoursocrataapplicationtoken
heroku config:add SYNC_ATTRIBUTION="Hosted by the Some Awesome Portal"
heroku config:add SYNC_ATTRIBUTION_URL="http://anawesome.remoteportal.org"
- Then set your Postgres details. Find them by visiting your app's Resources tab in Heroku, then clicking the "Heroku Postgres :: Aqua" link. This should take you to a page with your assigned Postgres DB settings.
heroku config:add DATABASE_HOST=yourhost
heroku config:add DATABASE_NAME=yourdatabase
heroku config:add DATABASE_USER=youruser
heroku config:add DATABASE_PASSWORD=yourpassword
- Run the migrations (locally). Replace the all-caps placeholders below with the values you set on your Heroku server in the previous step.
lein ragtime migrate -d "jdbc:postgresql://DB_HOST:5432/DB_NAME?user=DB_USER_NAME&password=DB_USER_PASSWORD&ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory"
- Push to Heroku to deploy the app:
git push heroku master
- Scale down the web dyno. We don't need it.
heroku ps:scale web=0
- Add a clock process, because that's what we actually need.
heroku ps:scale clock=1
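The JDBC URL passed to lein ragtime migrate in the migration step above is built from the same four database values set with heroku config:add. The following is a minimal sketch of assembling it in a shell script rather than by hand. The jdbc_url helper and the HEROKU_SSL_PARAMS variable are hypothetical names introduced for illustration; they are not part of this project. The sketch assumes port 5432, as in the command above.

```shell
#!/bin/sh
# jdbc_url is a hypothetical helper, not part of this project.
# Arguments: $1 host, $2 database name, $3 user, $4 password,
#            $5 optional extra query parameters.
jdbc_url() {
  url="jdbc:postgresql://$1:5432/$2?user=$3&password=$4"
  if [ -n "$5" ]; then
    url="$url&$5"
  fi
  printf '%s\n' "$url"
}

# Extra parameters used for the Heroku-hosted database in the migration step.
HEROKU_SSL_PARAMS='ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory'

# Print the migrate command instead of running it, so it can be checked first.
echo lein ragtime migrate -d "\"$(jdbc_url yourhost yourdatabase youruser yourpassword "$HEROKU_SSL_PARAMS")\""
```

The helper does no escaping, so a password containing characters such as & or ? would need to be percent-encoded before being passed in.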
To set an environment variable, run heroku config:add NAME_OF_VARIABLE=SOMEVALUE in the root of your project.
SYNC_DATA_JSON_URL
- The location of the data.json catalog of the data portal you want to pull from.
SYNC_HOST
- The host of the remote data portal you want to pull from. If the URL is http://your.dataportal.org, the host would be your.dataportal.org.
SYNC_URL
- The URL of your Socrata data portal, where the external datasets are created, e.g. http://your.dataportal.org.
SYNC_USERNAME
- Your Socrata username.
SYNC_PASSWORD
- Your Socrata password.
SYNC_TOKEN
- Your Socrata application token.
DATABASE_HOST
- The postgres host.
DATABASE_NAME
- The name of the database in Postgres that the application will be using.
DATABASE_USER
- The postgres database user to access DATABASE_NAME.
DATABASE_PASSWORD
- The password for DATABASE_USER.
SYNC_ATTRIBUTION
- Attribution to go in the footer of the page for the external dataset.
SYNC_ATTRIBUTION_URL
- Attribution URL to go in the footer of the page for the external dataset.
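Before deploying or running locally, it can help to confirm that the variables listed above are present. The sketch below is a hypothetical pre-flight check, not something this project ships: host_from_url derives a SYNC_HOST value from a portal URL, and missing_vars prints the name of each variable that is unset or empty. Treating every listed variable as required is an assumption; the application may tolerate some of them being absent.

```shell
#!/bin/sh
# Hypothetical helpers; neither function is part of this project.

# Print the host part of a URL,
# e.g. http://your.dataportal.org -> your.dataportal.org
host_from_url() {
  rest="${1#*://}"             # drop the scheme (http://, https://)
  rest="${rest%%/*}"           # drop any path
  printf '%s\n' "${rest%%:*}"  # drop any port
}

# Print the names of listed variables that are unset or empty.
missing_vars() {
  for name in SYNC_DATA_JSON_URL SYNC_HOST SYNC_URL SYNC_USERNAME \
              SYNC_PASSWORD SYNC_TOKEN DATABASE_HOST DATABASE_NAME \
              DATABASE_USER DATABASE_PASSWORD SYNC_ATTRIBUTION \
              SYNC_ATTRIBUTION_URL; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example: derive the host from the URL used in the SYNC_HOST description.
host_from_url "http://your.dataportal.org"   # prints your.dataportal.org
```

Running missing_vars in a shell where the variables have been set prints nothing when all of them are present.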
If you're developing this app locally, you will need the following: postgres, leiningen, and java (OpenJDK 7 appears to work best).
At the root of your project, add a .lein-env file. It contains an edn map of the environment variables and their values. Here is an example:
{
:database-name "somename"
:database-user "someusername"
:database-password "somepassword"
:database-host "localhost"
:test-migrations-path "jdbc:postgresql://localhost:5432/somename?user=someusername&password=somepassword"
:test-url "https://some.dataportal.org"
:test-username "someone@example.com"
:test-password "anotherpassword"
:test-token "somelongtoken"
:sync-host "anawesome.remoteportal.org"
:sync-url "http://some.dataportal.org"
:sync-username "someone@example.com"
:sync-data-json-url "https://anawesome.remoteportal.org/data.json"
:sync-password "anotherpassword"
:sync-token "somelongtoken"
:sync-attribution "Hosted by the Some Awesome Portal"
:sync-attribution-url "http://anawesome.remoteportal.org"
}
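The sync-* and database-* keys in the example mirror the Heroku variable names, lowercased and with underscores replaced by hyphens (SYNC_HOST becomes :sync-host). This is the naming convention of the environ library, which reads .lein-env files; that this project relies on environ is an assumption. The sketch below shows the mapping with two hypothetical helpers, env_to_key and print_lein_env, that are not part of this project. The test-* keys have no Heroku counterpart and still need to be added by hand.

```shell
#!/bin/sh
# Hypothetical helpers; neither function is part of this project.

# Convert an environment variable name to the matching edn keyword,
# e.g. DATABASE_NAME -> :database-name
env_to_key() {
  printf ':%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr '_' '-')"
}

# Print an edn map for the variable names given as arguments,
# taking each value from the current shell environment.
print_lein_env() {
  printf '{\n'
  for name in "$@"; do
    eval "value=\${$name:-}"
    printf ' %s "%s"\n' "$(env_to_key "$name")" "$value"
  done
  printf '}\n'
}

env_to_key SYNC_DATA_JSON_URL   # prints :sync-data-json-url
```

Values are written as-is, so a value containing a double quote or backslash would need escaping before the output is valid edn.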
Ensure you have a .lein-env file and migrate your test database. This database must be different from your application's development database:
lein ragtime migrate -d "jdbc:postgresql://localhost:5432/somename_test?user=someusername&password=somepassword"
Then either run the tests once:
lein midje
or start the test runner, which re-runs them as files change:
lein midje :autotest
This project is licensed under the MIT License. See LICENSE for more details.