Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -159,9 +159,9 @@ write_geotiff(data, 'out.tif', gpu=True) # force GPU compress
write_vrt('mosaic.vrt', ['tile1.tif', 'tile2.tif']) # generate VRT
```

**Compression codecs:** Deflate, LZW (Numba JIT), ZSTD, PackBits, JPEG (Pillow), uncompressed
**Compression codecs:** Deflate, LZW (Numba JIT), ZSTD, PackBits, JPEG (Pillow), JPEG 2000 (glymur), uncompressed

**GPU codecs:** Deflate and ZSTD via nvCOMP batch API; LZW via Numba CUDA kernels
**GPU codecs:** Deflate and ZSTD via nvCOMP batch API; JPEG 2000 via nvJPEG2000; LZW via Numba CUDA kernels

**Features:**
- Tiled, stripped, BigTIFF, multi-band (RGB/RGBA), sub-byte (1/2/4/12-bit)
Expand Down Expand Up @@ -540,7 +540,7 @@ Check out the user guide [here](/examples/user_guide/).

- **Zero GDAL installation hassle.** `pip install xarray-spatial` gets you everything needed to read and write GeoTIFFs, COGs, and VRT files.
- **Pure Python, fully extensible.** All codec, header parsing, and metadata code is readable Python/Numba, not wrapped C/C++.
- **GPU-accelerated reads.** With optional nvCOMP, compressed tiles decompress directly on the GPU via CUDA -- something GDAL cannot do.
- **GPU-accelerated reads.** With optional nvCOMP and nvJPEG2000, compressed tiles decompress directly on the GPU via CUDA -- something GDAL cannot do.

The native reader is pixel-exact against rasterio/GDAL across Landsat 8, Copernicus DEM, USGS 1-arc-second, and USGS 1-meter DEMs. For uncompressed files it reads 5-7x faster than rioxarray; for compressed COGs it is comparable or faster with GPU acceleration.

Expand Down
101 changes: 101 additions & 0 deletions examples/user_guide/35_JPEG2000_Compression.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "yox7s6qx13e",
"source": "# JPEG 2000 compression for GeoTIFFs\n\nThe geotiff package supports JPEG 2000 (J2K) as a compression codec for both reading and writing. This is useful for satellite imagery workflows where J2K is common (Sentinel-2, Landsat, etc.).\n\nTwo acceleration tiers are available:\n- **CPU** via `glymur` (pip install glymur) -- works anywhere OpenJPEG is installed\n- **GPU** via NVIDIA's nvJPEG2000 library -- same optional pattern as nvCOMP for deflate/ZSTD\n\nThis notebook demonstrates write/read roundtrips with JPEG 2000 compression.",
"metadata": {}
},
{
"cell_type": "code",
"id": "kamu534xsm",
"source": "import numpy as np\nimport xarray as xr\nimport matplotlib.pyplot as plt\nimport tempfile\nimport os\n\nfrom xrspatial.geotiff import read_geotiff, write_geotiff",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "w7tlml1cyqj",
"source": "## Generate synthetic elevation data\n\nWe'll create a small terrain-like raster to use as test data.",
"metadata": {}
},
{
"cell_type": "code",
"id": "9fzhnpcn4xq",
"source": "# Create a 256x256 synthetic terrain (uint16, typical for satellite imagery)\nrng = np.random.RandomState(42)\nyy, xx = np.meshgrid(np.linspace(-2, 2, 256), np.linspace(-2, 2, 256), indexing='ij')\nterrain = np.exp(-(xx**2 + yy**2)) * 10000 + rng.normal(0, 100, (256, 256))\nterrain = np.clip(terrain, 0, 65535).astype(np.uint16)\n\nda = xr.DataArray(\n terrain,\n dims=['y', 'x'],\n coords={\n 'y': np.linspace(45.0, 44.0, 256),\n 'x': np.linspace(-120.0, -119.0, 256),\n },\n attrs={'crs': 4326},\n name='elevation',\n)\n\nfig, ax = plt.subplots(figsize=(6, 5))\nda.plot(ax=ax, cmap='terrain')\nax.set_title('Synthetic elevation (uint16)')\nplt.tight_layout()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "8tsuyr3jbay",
"source": "## Write with JPEG 2000 (lossless)\n\nPass `compression='jpeg2000'` to `write_geotiff`. The default is lossless encoding.",
"metadata": {}
},
{
"cell_type": "code",
"id": "ystjp6v30d",
"source": "tmpdir = tempfile.mkdtemp(prefix='j2k_demo_')\n\n# Write with JPEG 2000 compression\nj2k_path = os.path.join(tmpdir, 'elevation_j2k.tif')\nwrite_geotiff(da, j2k_path, compression='jpeg2000')\n\n# Compare file sizes with deflate\ndeflate_path = os.path.join(tmpdir, 'elevation_deflate.tif')\nwrite_geotiff(da, deflate_path, compression='deflate')\n\nnone_path = os.path.join(tmpdir, 'elevation_none.tif')\nwrite_geotiff(da, none_path, compression='none')\n\nj2k_size = os.path.getsize(j2k_path)\ndeflate_size = os.path.getsize(deflate_path)\nnone_size = os.path.getsize(none_path)\n\nprint(f\"Uncompressed: {none_size:>8,} bytes\")\nprint(f\"Deflate: {deflate_size:>8,} bytes ({deflate_size/none_size:.1%} of original)\")\nprint(f\"JPEG 2000: {j2k_size:>8,} bytes ({j2k_size/none_size:.1%} of original)\")",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "89y9zun97nb",
"source": "## Read it back and verify lossless roundtrip\n\n`read_geotiff` auto-detects the compression from the TIFF header. No special arguments needed.",
"metadata": {}
},
{
"cell_type": "code",
"id": "8vf9ljxkx03",
"source": "# Read back and check lossless roundtrip\nda_read = read_geotiff(j2k_path)\n\nprint(f\"Shape: {da_read.shape}\")\nprint(f\"Dtype: {da_read.dtype}\")\nprint(f\"CRS: {da_read.attrs.get('crs')}\")\nprint(f\"Exact match: {np.array_equal(da_read.values, terrain)}\")\n\nfig, axes = plt.subplots(1, 2, figsize=(12, 5))\nda.plot(ax=axes[0], cmap='terrain')\naxes[0].set_title('Original')\nda_read.plot(ax=axes[1], cmap='terrain')\naxes[1].set_title('After JPEG 2000 roundtrip')\nplt.tight_layout()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "gcj96utnd3u",
"source": "## Multi-band example (RGB)\n\nJPEG 2000 also handles multi-band imagery, which is the common case for satellite data.",
"metadata": {}
},
{
"cell_type": "code",
"id": "mgv9xhsrcen",
"source": "# Create a 3-band uint8 image\nrgb = np.zeros((128, 128, 3), dtype=np.uint8)\nrgb[:, :, 0] = np.linspace(0, 255, 128).astype(np.uint8)[None, :] # red gradient\nrgb[:, :, 1] = np.linspace(0, 255, 128).astype(np.uint8)[:, None] # green gradient\nrgb[:, :, 2] = 128 # constant blue\n\nda_rgb = xr.DataArray(\n rgb, dims=['y', 'x', 'band'],\n coords={'y': np.arange(128), 'x': np.arange(128), 'band': [0, 1, 2]},\n)\n\nrgb_path = os.path.join(tmpdir, 'rgb_j2k.tif')\nwrite_geotiff(da_rgb, rgb_path, compression='jpeg2000')\n\nda_rgb_read = read_geotiff(rgb_path)\nprint(f\"RGB shape: {da_rgb_read.shape}, dtype: {da_rgb_read.dtype}\")\nprint(f\"Exact match: {np.array_equal(da_rgb_read.values, rgb)}\")\n\nfig, axes = plt.subplots(1, 2, figsize=(10, 4))\naxes[0].imshow(rgb)\naxes[0].set_title('Original RGB')\naxes[1].imshow(da_rgb_read.values)\naxes[1].set_title('After J2K roundtrip')\nplt.tight_layout()\nplt.show()",
"metadata": {},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"id": "zzga5hc3a99",
"source": "## GPU acceleration\n\nOn systems with nvJPEG2000 installed (CUDA toolkit, RAPIDS environments), pass `gpu=True` to use GPU-accelerated J2K encode/decode. The API is the same -- it falls back to CPU automatically if the library isn't found.\n\n```python\n# GPU write (nvJPEG2000 if available, else CPU fallback)\nwrite_geotiff(cupy_data, \"output.tif\", compression=\"jpeg2000\", gpu=True)\n\n# GPU read (nvJPEG2000 decode if available)\nda = read_geotiff(\"satellite.tif\", gpu=True)\n```",
"metadata": {}
},
{
"cell_type": "code",
"id": "x74nrht8kx",
"source": "# Cleanup temp files\nimport shutil\nshutil.rmtree(tmpdir, ignore_errors=True)",
"metadata": {},
"execution_count": null,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
69 changes: 69 additions & 0 deletions xrspatial/geotiff/_compression.py
Original file line number Diff line number Diff line change
Expand Up @@ -727,13 +727,77 @@ def zstd_compress(data: bytes, level: int = 3) -> bytes:
return _zstd.ZstdCompressor(level=level).compress(data)


# -- JPEG 2000 codec (via glymur) --------------------------------------------

JPEG2000_AVAILABLE = False
try:
import glymur as _glymur
JPEG2000_AVAILABLE = True
except ImportError:
_glymur = None


def jpeg2000_decompress(data: bytes, width: int = 0, height: int = 0,
samples: int = 1) -> bytes:
"""Decompress a JPEG 2000 codestream. Requires ``glymur``."""
if not JPEG2000_AVAILABLE:
raise ImportError(
"glymur is required to read JPEG 2000-compressed TIFFs. "
"Install it with: pip install glymur")
import tempfile
import os
# glymur reads from files, so write the codestream to a temp file
fd, tmp = tempfile.mkstemp(suffix='.j2k')
try:
os.write(fd, data)
os.close(fd)
jp2 = _glymur.Jp2k(tmp)
arr = jp2[:]
return arr.tobytes()
finally:
os.unlink(tmp)


def jpeg2000_compress(data: bytes, width: int, height: int,
samples: int = 1, dtype: np.dtype = np.dtype('uint8'),
lossless: bool = True) -> bytes:
"""Compress raw pixel data as JPEG 2000 codestream. Requires ``glymur``."""
if not JPEG2000_AVAILABLE:
raise ImportError(
"glymur is required to write JPEG 2000-compressed TIFFs. "
"Install it with: pip install glymur")
import math
import tempfile
import os
if samples == 1:
arr = np.frombuffer(data, dtype=dtype).reshape(height, width)
else:
arr = np.frombuffer(data, dtype=dtype).reshape(height, width, samples)
fd, tmp = tempfile.mkstemp(suffix='.j2k')
os.close(fd)
os.unlink(tmp) # glymur needs the file to not exist
try:
cratios = [1] if lossless else [20]
# numres must be <= log2(min_dim) + 1 to avoid OpenJPEG errors
min_dim = max(min(width, height), 1)
numres = min(6, int(math.log2(min_dim)) + 1)
numres = max(numres, 1)
_glymur.Jp2k(tmp, data=arr, cratios=cratios, numres=numres)
with open(tmp, 'rb') as f:
return f.read()
finally:
if os.path.exists(tmp):
os.unlink(tmp)


# -- Dispatch helpers ---------------------------------------------------------

# TIFF compression tag values
COMPRESSION_NONE = 1
COMPRESSION_LZW = 5
COMPRESSION_JPEG = 7
COMPRESSION_DEFLATE = 8
COMPRESSION_JPEG2000 = 34712
COMPRESSION_ZSTD = 50000
COMPRESSION_PACKBITS = 32773
COMPRESSION_ADOBE_DEFLATE = 32946
Expand Down Expand Up @@ -771,6 +835,9 @@ def decompress(data, compression: int, expected_size: int = 0,
dtype=np.uint8)
elif compression == COMPRESSION_ZSTD:
return np.frombuffer(zstd_decompress(data), dtype=np.uint8)
elif compression == COMPRESSION_JPEG2000:
return np.frombuffer(
jpeg2000_decompress(data, width, height, samples), dtype=np.uint8)
else:
raise ValueError(f"Unsupported compression type: {compression}")

Expand Down Expand Up @@ -803,5 +870,7 @@ def compress(data: bytes, compression: int, level: int = 6) -> bytes:
return zstd_compress(data, level)
elif compression == COMPRESSION_JPEG:
raise ValueError("Use jpeg_compress() directly with width/height/samples")
elif compression == COMPRESSION_JPEG2000:
raise ValueError("Use jpeg2000_compress() directly with width/height/samples/dtype")
else:
raise ValueError(f"Unsupported compression type: {compression}")
Loading
Loading