TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data

TorchGeo is a PyTorch domain library, similar to torchvision, providing datasets, samplers, transforms, and pre-trained models specific to geospatial data.

The goal of this library is to make it simple:

  • for machine learning experts to work with geospatial data, and
  • for remote sensing experts to explore machine learning solutions.

Core features

  • Geospatial Datasets and Samplers

    • 20+ data loaders for satellite imagery and masks
    • Automatic CRS reprojection and resampling
    • Support for multimodal learning and data fusion
  • Benchmark Datasets

    • 50+ data loaders for benchmark datasets
    • Quickly experiment with a variety of models
    • Automatically download most datasets
  • Pre-Trained Weights

    • 40+ models pre-trained on geospatial data
    • Native support for multispectral imagery
    • Enables transfer learning on smaller datasets
  • Reproducibility with Lightning

    • Built-in train/val/test splits and data augmentation
    • Classification, regression, detection, segmentation
    • Command-line interface, support for YAML files

TorchGeo supports sampling image patches (C and D) from geospatial data layers (A and B).
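
The idea behind sampling patches from several layers can be sketched in plain Python. This is a conceptual illustration only, not TorchGeo's API: the `Box` class and `sample_patches` function are hypothetical helpers, and the layer extents are made-up numbers. It assumes both layers already share a coordinate system, intersects their extents, and draws random square windows from the overlap, so each window is covered by both layers.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned extent in map units (hypothetical helper, not a TorchGeo type)."""

    minx: float
    miny: float
    maxx: float
    maxy: float

    def intersect(self, other: "Box") -> "Box":
        """Return the region covered by both extents."""
        box = Box(
            max(self.minx, other.minx),
            max(self.miny, other.miny),
            min(self.maxx, other.maxx),
            min(self.maxy, other.maxy),
        )
        if box.minx >= box.maxx or box.miny >= box.maxy:
            raise ValueError("layers do not overlap")
        return box


def sample_patches(layer_a: Box, layer_b: Box, size: float, count: int, seed: int = 0) -> list[Box]:
    """Draw `count` random square windows of side `size` inside the overlap."""
    overlap = layer_a.intersect(layer_b)
    if overlap.maxx - overlap.minx < size or overlap.maxy - overlap.miny < size:
        raise ValueError("patch is larger than the overlapping region")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    patches = []
    for _ in range(count):
        # Pick the lower-left corner so the whole window stays inside the overlap.
        x = rng.uniform(overlap.minx, overlap.maxx - size)
        y = rng.uniform(overlap.miny, overlap.maxy - size)
        patches.append(Box(x, y, x + size, y + size))
    return patches


# Two layers with different footprints (illustrative values only).
layer_a = Box(0.0, 0.0, 100.0, 100.0)    # e.g. an imagery layer
layer_b = Box(40.0, 30.0, 160.0, 120.0)  # e.g. a mask layer
patches = sample_patches(layer_a, layer_b, size=16.0, count=2)
```

Each returned window could then be used to read the same region from every layer, which is what keeps an image patch and its mask aligned. A real pipeline also has to handle differing CRSs and resolutions, which this sketch deliberately leaves out.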