Ordinal hashing of multidimensonal data and geographic coordinates via Morton coding / Z-ordering.
In mathematical analysis and computer science, Z-order, Morton-order, or a Morton-code is a function which maps multidimensional data to one dimension while preserving locality of the data points. It was introduced in 1966 by IBM researcher, G. M. Morton. The z-value of a point in multidimensions is calculated by interleaving the binary representations of its coordinate values. Once the data are sorted into this ordering, any one-dimensional data structure can be used, such as binary search trees, B-trees, skip lists, or hash tables. The resulting ordering can equivalently be described as the order one would achieve from a depth-first traversal of a quadtree,
where {x, y, ..., K}
are combined into a single ordinal value that is easily compared, searched, and indexed against other Morton numbers.
At the highest level, pymorton is split into two logical functions:
-
(de)interleave: encodes/decodes hashes representing two or three dimensionsal integer sets.
{x, y, z ∈ Z}
or{x, y ∈ Z}
, whereZ
represents all integer values. -
(de)interleave_latlng: encodes and decodes hashes representing latitude and longitude.
-
Given a directory of images, sort the images by color (average RGB):
from statistics import mean from glob import glob from PIL import Image import pymorton imgs = [(fname, Image.open(fname)) for fname in glob('imgpath/*.jpg')[:100]] # for each image, generate a tuple of len==3, representing the image's average RGB value avg_rgb_values = [ [int(mean(img.getdata(band))) for band in range(3)] for _, img in imgs] # using the average RGB values, compute the Z-order of each image hashed_imgs = list(zip([fname for fname, _ in imgs], [pymorton.interleave(*avg_rgb) for avg_rgb in avg_rgb_values])) # returns a sorted-by-color list of photos found within the directory return sorted(hashed_imgs, key=lambda img_tuple: img_tuple[1])
While the above use-case is fairly uncommon in the context of Morton-coding, I believe it illustrates the utility of the algorithm quite well. Morton-coding is most commonly used within the realm of geospatial indexing, but its potential applications are infinite!
via pip:
pip install pymorton
via source:
git clone https://github.com/trevorprater/pymorton.git
cd pymorton
python setup.py install
- 3D-hashing
import pymorton as pm
mortoncode = pm.interleave(100, 200, 50) # 5162080
mortoncode = pm.interleave3(100, 200, 50) # 5162080
pm.deinterleave3(mortoncode) # (100, 200, 50)
- 2D-hashing
import pymorton as pm
mortoncode = pm.interleave(100, 200) # 46224
mortoncode = pm.interleave2(100, 200) # 46224
pm.deinterleave2(mortoncode) # (100, 200)
- geo-hashing
import pymorton as pm
geohash = pm.interleave_latlng(40.723471, -73.985361) # '03023211233202130332202203002303'
pm.deinterleave_latlng(geohash) # (40.723470943048596, -73.98536103777587)
-
pymorton.interleave(*args)
- Hashes
x, y
orx, y, z
into a single value. This function wraps interleave2() and interleave3() by supporting variable-length args.
- Hashes
-
pymorton.interleave2(x, y)
- Returns a hash (int) representing
x, y
.
- Returns a hash (int) representing
-
pymorton.interleave3(x, y, z)
- Returns a hash (int) representing
x, y, z
.
- Returns a hash (int) representing
-
pymorton.interleave_latlng(lat, lng)
- Returns a hash (string base-4)
representing
lat, lng
.
- Returns a hash (string base-4)
representing
-
pymorton.deinterleave2(hash)
- Returns a tuple representing the arguments to the corresponding interleave2() call.
-
pymorton.deinterleave3(hash)
- Returns a tuple representing the arguments to the corresponding interleave3() call.
-
pymorton.deinterleave_latlng(hash)
- Returns a tuple representing the arguments to the corresponding interleave_latlng() call.
From the project's root directory, execute nosetests
.
Please feel free to contact [email protected] regarding any questions/comments/issues.
- Z-order curve
- Implementation for the algorithm (1)
- Implementation for the algorithm (2)
- Extended explanation with different algorithms
MIT