GeoTrellis

GeoTrellis is a geographic data processing library designed to work with large geospatial data sets. It is written in Scala and leverages Apache Spark for distributed computing.

GeoTrellis’ core competency is raster data processing: it enables distributed processing of large geospatial raster data sets using the techniques of map algebra. In addition to raster data, GeoTrellis includes some support for operations on vector and point cloud data. It is designed both to run large batch jobs efficiently on cloud compute services and to back RESTful endpoints that process rasters within the web’s request/response cycle.
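For readers new to map algebra, the idea can be sketched in plain Scala without any GeoTrellis dependency. The names below (`Raster`, `localAdd`, `focalMean`) are hypothetical and exist only for this illustration; they are not GeoTrellis types or methods. A local operation computes each output cell from the same cell in each input, while a focal operation computes it from a neighborhood around the cell.

```scala
// A toy raster: row-major cell values plus dimensions.
// Hypothetical type for illustration only, not a GeoTrellis class.
final case class Raster(cols: Int, rows: Int, cells: Vector[Double]) {
  require(cells.length == cols * rows, "cell count must equal cols * rows")
  def get(col: Int, row: Int): Double = cells(row * cols + col)
}

object MapAlgebraSketch {
  // Local operation: each output cell depends only on the
  // corresponding cell of each input raster.
  def localAdd(a: Raster, b: Raster): Raster = {
    require(a.cols == b.cols && a.rows == b.rows, "rasters must share dimensions")
    Raster(a.cols, a.rows, a.cells.zip(b.cells).map { case (x, y) => x + y })
  }

  // Focal operation: each output cell is the mean of its 3x3
  // neighborhood, clipped at the raster edges.
  def focalMean(r: Raster): Raster = {
    val out = for {
      row <- 0 until r.rows
      col <- 0 until r.cols
    } yield {
      val neighbors = for {
        dr <- -1 to 1
        dc <- -1 to 1
        nr = row + dr
        nc = col + dc
        if nr >= 0 && nr < r.rows && nc >= 0 && nc < r.cols
      } yield r.get(nc, nr)
      neighbors.sum / neighbors.size
    }
    Raster(r.cols, r.rows, out.toVector)
  }
}
```

In a distributed setting the same operations are applied tile by tile, which is why focal operations need access to neighboring tiles at tile boundaries.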

The GeoTrellis 2.0 release shifted the core data storage format of the library to Cloud Optimized GeoTIFFs (COGs). COGs are a building block of the cloud native geospatial ecosystem that enables interoperability between GeoTrellis and a range of libraries and tools, including desktop tools like QGIS and ArcMap.

Core features

  • Working with Geospatial Data on Apache Spark

    • Generic way to represent key value RDDs as layers, where the key represents a coordinate in space based on some uniform grid layout, optionally with a temporal component.
    • Represent spatial or spatiotemporal raster data as an RDD of raster tiles.
    • Generic architecture for saving and loading layer RDD data and metadata to and from various backends, using Spark's IO API with space-filling curve (SFC) indexing to optimize retrieval from storage. Hilbert and Z-order curves are supported. Supported backends are the filesystem, S3, HDFS, Cassandra, and Accumulo.
    • Query architecture that allows for simple querying of layer data by spatial or spatiotemporal bounds.
    • Perform map algebra operations on layers of raster data, including all supported Map Algebra operations mentioned in the geotrellis-raster feature list.
    • Perform seamless reprojection on raster layers, using neighboring tile information in the reprojection to avoid unwanted NoData cells.
    • Pyramid up layers through zoom levels using various resampling methods.
    • Types to reason about tiled raster layouts in various CRSs and tiling schemes.
    • Perform operations on raster RDD layers: crop, filter, join, mask, merge, partition, pyramid, render, resample, split, stitch, and tile.
  • Working with Raster Data

    • GDAL bindings: read every supported raster format directly in GeoTrellis without a separate translation or ingest step.
    • RasterSource API provides a way to read raster data from a variety of formats and sources, similar to GDAL's VRT (virtual raster) feature.
    • Provides types to represent single- and multi-band rasters, supporting Bit, Byte, UByte, Short, UShort, Int, Float, and Double data, with either a constant NoData value (which improves performance) or a user defined NoData value.
    • Treat a tile as a collection of values, by calling "map" and "foreach", along with floating point valued versions of those methods.
    • Combine raster data in generic ways.
    • Render rasters via color ramps and color maps to PNG and JPG images.
    • Read GeoTiffs with DEFLATE, LZW, and PackBits compression, including horizontal and floating point prediction for LZW and DEFLATE.
  • Working with Vector Data

    • Work directly with JTS types: Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection.
    • Provides a Feature type that is the composition of a geometry and a generic data type.
    • Read and write geometries and features to and from GeoJSON.
    • Read and write geometries to and from WKT and WKB.
    • Point cloud support with Voronoi Diagram and Delaunay Triangulation methods.
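
The space-filling curve indexing mentioned in the Spark feature list can be illustrated with a minimal Z-order (Morton) sketch in plain Scala. This is not GeoTrellis' own key index implementation; `ZOrderSketch` and its methods are hypothetical names, and the sketch assumes tile column and row each fit in 16 bits. It interleaves the bits of a tile's column and row into a single number, so tiles that are close in space tend to receive nearby index values and can be stored and fetched as contiguous ranges.

```scala
// Hypothetical helper for illustration only, not a GeoTrellis API.
object ZOrderSketch {
  // Spread the low 16 bits of x so that a zero bit sits
  // between each pair of original bits.
  private def spread(x: Int): Long = {
    var v = x.toLong & 0xFFFFL
    v = (v | (v << 8)) & 0x00FF00FFL
    v = (v | (v << 4)) & 0x0F0F0F0FL
    v = (v | (v << 2)) & 0x33333333L
    v = (v | (v << 1)) & 0x55555555L
    v
  }

  // Interleave bits: column bits land on even positions,
  // row bits on odd positions.
  def index(col: Int, row: Int): Long = spread(col) | (spread(row) << 1)
}
```

For example, the four tiles of a 2x2 block starting at the origin map to indices 0 through 3, so a bounding-box query over that block touches a single contiguous index range.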

Implemented Standards

  • Geographic JSON (GeoJSON)
  • Georeferenced Tagged Image File Format (GeoTIFF)
  • Web Coverage Service (WCS)
  • Web Map Service (WMS)
  • Web Map Tile Service (WMTS)