Python tool for analyzing geospatial data in cities.
This repo consists of three components that can be run separately or together:
- https server and UI that allow running the whole analysis process automatically and browsing the result maps from the comfort of your favorite web browser
- import scripts that import a variety of open geospatial datasets for your favorite city
- notebook or export script that combines the imported datasets into a single index and creates the result maps
The UI is recommended if you just want to get a fancy map, but you may run the scripts and notebook manually if you want more control over what is included in your analysis.
Requirements:

- Docker
- docker-compose

OR

- Python >= 3.8
- osm2pgsql
- PostGIS accepting connections at localhost:5432
If you wish to use the UI to run analyses with a few clicks, you may start the database and development server locally by typing

```
docker-compose up dev
```
Set up the needed username and password as described in the configuration section below.
If you wish to run the analysis notebook manually, you can bring up the database and notebook by typing

```
docker-compose up notebook
```
The https production server can be started by

```
cp server/flask.subdomain.conf server/swag/nginx/proxy-confs/
docker-compose up serve
```
Do note that the server requires a registered domain to obtain an https certificate. See the configuration section for details.
If you're not running Docker, we recommend creating your own conda environment, a pyenv environment, or a pyenv environment that contains conda wheels. The last option should make installing all the dependencies easiest:

```
pyenv install miniconda3-latest
pyenv local miniconda3-latest
pip install -r requirements.txt
```
You also need to have PostGIS running (see the sketch below for one way to start it). To run the Flask dev server and UI locally, after installing the basic requirements:

```
cd server
pip install -r requirements-serve.txt
flask run
```
Or, if you want to use the notebook instead of the UI, start the notebook server by

```
jupyter notebook
```
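If you don't have a PostGIS instance at hand, one possible way to get one accepting connections at localhost:5432 is the official postgis/postgis Docker image. This is only an example outside the repo's own tooling; adjust the credentials to match your configuration:

```
docker run --name local-postgis \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  -d postgis/postgis
```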
Configuration is done via the .env file or environment variables. You may copy .env.example to a file called .env and fill in your secrets.
To use the UI, set the username and password hash that allow access to the UI in the .env file or the corresponding environment variables (see the sketch below for an example).
If you wish to import data from the Flickr API, fill in your Flickr API key and secret in the .env file or the corresponding environment variables. The API key may also be set in the UI for each import run, but it will not persist to the server environment.
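As a rough sketch of what the filled-in .env might contain (the variable names here are illustrative only; use the names given in .env.example):

```
# illustrative variable names, check .env.example for the real ones
USERNAME=myuser
PASSWORD_HASH=<hash of your password>
FLICKR_API_KEY=<your Flickr API key>
FLICKR_SECRET=<your Flickr API secret>
```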
To get https certificates on AWS EC2, you need to add your own domain and subdomain in .env and your AWS access credentials in server/swag/dns-conf/route53.ini. If you use MFA, you have to create a separate non-MFA role specific to your EC2 instance and instead add role_arn and credential_source=Ec2InstanceMetadata in route53.ini, just like in an AWS config file. This will allow Swag to automatically retrieve and update your certificate. If you are running on a provider other than AWS, read the Swag instructions and change the DNSPLUGIN value at [docker-compose.yml#L20].
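Purely as an illustration of the MFA variant described above (the exact keys expected by the DNS plugin are documented in the Swag and certbot instructions), route53.ini might then look roughly like this:

```
[default]
; non-MFA users would put aws_access_key_id and aws_secret_access_key here instead
role_arn = arn:aws:iam::123456789012:role/your-ec2-certbot-role
credential_source = Ec2InstanceMetadata
```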
Please add your API keys and username/password hash to the configuration first. For development convenience, you may set a clear-text password instead if you are running the dev container. Then, start your local dev server at http://localhost:5000 by

```
docker-compose up dev    # if you are running docker
cd server && flask run   # if you are not running docker
```
or your https server at https://yourdomain.com:443 by

```
cp server/flask.subdomain.conf server/swag/nginx/proxy-confs/
docker-compose up serve
```
You may import the city of your choice in the UI by
- selecting the datasets you desire,
- selecting your city from the autocomplete list,
- (optionally) adjusting the bounding box on the map, if the initial bbox doesn't look suitable for you,
- (optionally) adding a GTFS url for your city, if it is not known by the app already,
- clicking "Import Datasets".
The process will take a while depending on how many and which datasets you are importing. Importing small datasets such as the Ookla and Kontur data is very fast, while importing from the Flickr API is particularly slow and may keep you waiting for a long time.
You may follow how the request is progressing by clicking the "View Log" button. Once the run is finished, click "View Results" to see all your datasets and the combined total index values on the hex map. All old import runs are listed under "Result maps".
From the command line, you may do the import by running

```
docker-compose run notebook ./import.py Helsinki    # if you are running docker
./import.py Helsinki                                # if you are not running docker
```

or any other city. You may import only some of the datasets with the --datasets parameter, e.g.

```
./import.py Helsinki --datasets "access gtfs ookla"
```
Do note that cities in bigger countries may be slow to import if the city is not available as a separate OSM extract. In that case, we will have to download the whole country. All other dataset sizes are determined by the size of the city.
If a city you want to import does not have a GTFS feed URL in the GTFS_DATASETS variable in scripts/import_gtfs.py, you may add the right URL there manually (please make a PR too), or alternatively run the import with the right URL as a parameter, e.g.

```
./import.py Tallinn --gtfs http://www.peatus.ee/gtfs/gtfs.zip
```
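The exact shape of GTFS_DATASETS is defined in scripts/import_gtfs.py; purely as a hypothetical sketch, assuming it is a simple mapping from city name to feed URL, adding a city could look something like this:

```python
# hypothetical sketch, check scripts/import_gtfs.py for the actual structure
GTFS_DATASETS = {
    # ...existing entries...
    "Tallinn": "http://www.peatus.ee/gtfs/gtfs.zip",  # same URL as in the example above
}
```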
The UI calculates the result map automatically if the import is started by the UI. Once the run is finished, your map is available by clicking "View Results" next to your import run.
You may also run the export automatically at the end of the import by calling ./import.py with the --export parameter, or run the export script after the import has finished by

```
docker-compose run notebook ./export.py    # if you are running docker
./export.py                                # if you are not running docker
```
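For example, assuming the flags can be combined, importing selected datasets and exporting the result map in one go might look like this:

```
./import.py Helsinki --datasets "access gtfs ookla" --export
```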
Alternatively, you may calculate the results manually using the export notebook. The notebook is available at localhost:8888/tree/notebooks/export.ipynb.
Open the notebook to adjust
- the datasets you wish to include
- the column/sum/mean value you wish to use from each dataset
- the weight of each dataset in the result map
You may just run the notebook as-is to get the default index that contains representative columns/statistics from each dataset, with equal weights for each dataset. This is the same setup that the export script uses for calculating the results. The resulting map is displayed in the notebook and saved as a standalone HTML map in server/maps/city_name.html.
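Purely as an illustration of the idea (not the repo's actual code, and the column names below are made up), the default index is conceptually a weighted sum of one normalized statistic per dataset on each hex cell:

```python
# Illustrative sketch only; the real logic lives in notebooks/export.ipynb and export.py.
import pandas as pd

def weighted_index(hexes: pd.DataFrame, weights: dict) -> pd.Series:
    """Min-max normalize each dataset column and sum them with the given weights."""
    total = pd.Series(0.0, index=hexes.index)
    for column, weight in weights.items():
        col = hexes[column]
        normalized = (col - col.min()) / (col.max() - col.min())
        total += weight * normalized.fillna(0)
    return total

# toy data: three hex cells, one aggregated column per dataset (made-up names)
hexes = pd.DataFrame({
    "ookla_speed": [10.0, 50.0, 90.0],
    "gtfs_stops": [1, 4, 2],
    "flickr_photos": [0, 20, 5],
})
# equal weights for each dataset, as in the default export
hexes["index"] = weighted_index(hexes, {"ookla_speed": 1, "gtfs_stops": 1, "flickr_photos": 1})
print(hexes)
```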