Documentation: readthedocs
YouTube Series: Kluster Playlist
Development Items: Trello Board
A distributed multibeam processing system built on the Pangeo ecosystem (https://pangeo.io/).
Kluster provides a fully open source hydrographic processing package to produce accessible bathymetry products in support of ocean mapping.
- Kluster does not support 'multifrequency' as seen in the most recent KMALL logged data.
- Kluster .s7k is limited to certain records, see the 'Requirements' section in the documentation.
- Kluster .raw EK80 processing with Power workflow (and not ComplexSamples) has been seen to generate odd results.
- Scalable - uses Dask to provide distributed parallel processing on everything from a laptop to a cloud service (AWS Fargate for example)
- Cloud ready - uses Zarr as a cloud ready storage format for converted multibeam records and processed soundings
- Open - data are presented using Xarray objects for easy interactivity and stored with Zarr, all open formats
- Scriptable - provides a GUI for visualization and processing, but can be run from the command line or scripted easily
- Extensible - from data conversion to sound velocity correction, Kluster is built using modules that can be replaced, enhanced or exchanged as needed.
Kluster has been tested on:
- EK60, EK80 (Using Kluster amplitude detection, see Requirements in Documentation)
- Reson 7125, T20, T51
- EM2040/2040c/2040p
- EM2040 dual tx/dual rx
- EM710/712
- EM3002
- EM302/304
- EM122
- ME70 Bathy Module
Kluster is built from the ground up in Python, and was developed using Python 3.8. Kluster includes modules developed by the hydrographic community such as (see drivers):
- kmall - Kongsberg .kmall file reader
- par3 - Kongsberg .all file reader
- prr3 - Reson .s7k file reader
- raw - Kongsberg .raw file reader
- sbet - POSPac sbet/rms file reader
Kluster is a work in progress that has been in development since November 2019 by a small 'team', and is by no means feature complete. If you are interested in contributing or have questions, please contact Eric Younkin ([email protected])
There are three principal motivations behind Kluster:
The hydrographic community is continuously innovating. Oftentimes, we want to experiment with an algorithm or technique, but the data are inaccessible or rely on intermediate products that are locked within the software. How do you get attitude-corrected beam vectors into a numpy array? How do you test a new gridding algorithm without exporting soundings to text first?
Cloud data storage and processing is quickly becoming a reality, as the advantages of not owning your own infrastructure become apparent. Where does this leave processing software and our traditional workflow? Kluster is designed from the ground up to address this issue by providing processing that can be tailored and deployed in different ways depending on the application. In addition, by using the parallel processing capabilities of Dask, Kluster provides a powerful tool that can compete with existing software packages in terms of performance.
Much of the existing open source software related to multibeam processing has been in development for decades. In the meantime, there has been an explosion of scientific libraries that could benefit the hydrographic community as a whole but have not been seriously evaluated. Kluster relies on the state of the art in Python libraries to provide a sophisticated and modern software package.
We recommend that users run Kluster using the release attached to this GitHub repository; see releases.
Kluster has been tested on Windows 10 and Ubuntu 20.04.
Kluster is not on PyPI, but can be installed using pip alongside the HSTB-drivers and HSTB-shared modules that are required.
(For Windows Users) Download and install Visual Studio Build Tools 2019 (If you have not already): MSVC Build Tools
Download and install conda (If you have not already): conda installation
Download and install git (If you have not already): git installation
Some dependencies need to be installed from the conda-forge channel. The example below shows how to build this environment using conda.
Perform these in order:
conda create -n kluster_test python=3.8.12
conda activate kluster_test
conda install -c conda-forge qgis=3.18.3 vispy=0.9.4 pyside2=5.13.2 gdal=3.3.1 h5py python-geohash
pip install git+https://github.com/noaa-ocs-hydrography/kluster.git#egg=hstb.kluster
Start the GUI by activating the new environment and running Kluster as a module:
(kluster_test) C:>python -m HSTB.kluster
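
If the GUI does not start, a quick check like the following can show which packages are missing from the active environment. This is a minimal sketch using only the Python standard library. `HSTB.kluster` is the module name used in the command above; the other import names (`dask`, `xarray`, `zarr`, `h5py`, `vispy`, `PySide2`, `osgeo` for GDAL) are the usual import names for those packages and are assumed here, not taken from Kluster's own requirements list.

```python
import importlib.util


def is_importable(name):
    """Return True if the module can be located, without importing it fully."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. HSTB) is itself missing.
        return False


# Assumed import names for the packages installed in the steps above.
modules = ["HSTB.kluster", "dask", "xarray", "zarr",
           "h5py", "vispy", "PySide2", "osgeo"]

status = {name: is_importable(name) for name in modules}

for name, found in status.items():
    print(f"{name:<14} {'found' if found else 'MISSING'}")
```

Run it with the `kluster_test` environment activated; any line marked `MISSING` points to an install step that did not complete.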
### Docker
Build the Docker image using the provided Dockerfile
C:\Pydro21_Dev\NOAA\site-packages\Python38\git_repos\hstb_kluster>docker build -t kluster/ubuntu .
C:\Pydro21_Dev\NOAA\site-packages\Python38\git_repos\hstb_kluster>docker run -it kluster/ubuntu
(base) eyou102@faaec62a4c1c:~/kluster$ conda deactivate
eyou102@faaec62a4c1c:~/kluster$ conda activate kluster_test
(kluster_test) eyou102@faaec62a4c1c:~/kluster$ python
See documentation for the new quick start guide
See examples or notebooks for examples of how to use Kluster in the console.