GDAL

Geospatial Data Abstraction Library

a brief introduction

Overview*


  • about + purpose
  • functionality)
  • installation + basic use
  • resources + tutorials
GDAL logo

*focus on raster functionality

About + Purpose

Very Brief History

  • since 1998 --- intitial release in 2000 (Frank Warmerdam)
  • taken over by the Open Source Geospatial Foundation after v1.3.2 (ca. 2006)
  • previously separate from OGR Simple Features Library, now more tightly coupled
    • OGR refers to the vector data model and functionality
    • often referred to as GDAL/OGR
  • currently v2.3.2 (September 2018)

About

  • X/MIT License (permissive free software license)
  • C/C++ based, bindings for python, R, Java, Perl, C#, Ruby …
  • cross-platform geospatial library (MacOS, Linux/Unix, Windows, Android, iOS)

(Intended) Purpose

  • library to read/write raster and vector geospatial formats
    • more than 200 supported geospatial data formats
  • presents a single abstract data model for both raster and vector data formats, respectively
  • provides various command-line utilities for translation and geospatial processing

Not a Geographic Information System!

Some Supported Raster Data Formats (of over 155)


  • TerraSAR-X Complex SAR Data Product
  • ENVI .hdr Labelled Raster
  • GeoPackage
  • GRASS Raster Format
  • GeoTiff
  • JPEG2000
  • GDAL VRT
  • Geospatial PDF
  • Rasdaman
  • Sentinel 1 SAFE
  • Sentinel 2
  • ArcSDE Raster

GDAL Command-line List Drivers:

gdal_translate --formats

Some Software Utilising GDAL

gdal_software_logos.png

Functionality

Some GDAL Utilities

gdalinfo - Report information about a file.

gdal_translate - Copy a raster file, with control of output format.

gdalwarp - Warp an image into a new coordinate system.

gdaldem - Tools to analyze and visualize DEMs.

gdal_merge - Build a quick mosaic from a set of images.

gdal_rasterize - Rasterize vectors into raster file.

gdal_polygonize - Generate polygons from raster.

gdal_pansharpen - Perform a pansharpen operation.

gdalcompare - Compare two images and report on differences.

GDAL Utilities docs here.

QGIS_Transform.png

gdalwarp -overwrite -t_srs EPSG:4326 -dstnodata 0 -of GTiff inputdata.jpg outputdata.tif

gdalwarp docs here.

Installation

Command-line GDAL Utilities

MacOS: via Homebrew or other binary distributions (e.g. kingchaos)

Linux/Unix: access through various standard linux repositories

Windows: OSGeo4W recommended

Note: If QGIS is installed, GDAL is already installed somewhere...

Python Bindings

  • Recommended: the python virtual environment, package and dependency manager conda
    1. install Anaconda or miniconda (simplified version)
    2. start and check installation
    3. manage environments including different versions of python
    4. search and find different versions of packages
    5. create an environment with GDAL and whatever else you require
    6. activate the environment
    7. Go!

(Tutorial: Getting Started with conda)

Basic Use

Set-up necessary python imports

(the %matplotlib inline is a jupyter notebook command to show matplotlib plots in the output)

In [1]:
%matplotlib inline

import os
import sys

import numpy
import matplotlib.pyplot as plt
from osgeo import gdal

Check Versions

In [2]:
# Check version of python:
print(sys.version)

# We can check which version we're running by printing the "__version__" variable
print("GDAL's version is: " + gdal.__version__)

# This shows where the version of GDAL is located (in this case, in a conda environment)
print(gdal)

# This allows GDAL to throw Python Exceptions
gdal.UseExceptions()
3.7.1 | packaged by conda-forge | (default, Nov 13 2018, 19:01:41) [MSC v.1900 64 bit (AMD64)]
GDAL's version is: 2.3.2
<module 'osgeo.gdal' from 'C:\\Users\\b1041827\\AppData\\Local\\Continuum\\miniconda2\\envs\\jupyter_slides\\lib\\site-packages\\osgeo\\gdal.py'>

Command-line Equivalent:

python --version

gdalinfo --version

Get Info about a dataset

The documentation on the Info() function can be found here and for the InfoOptions() fucntion here.

Landsat 8 band

Sentinel-2: Granule 37SBA, 2 November 2018

In [4]:
info_options = gdal.InfoOptions(computeMinMax=True, stats=True)

print(gdal.Info(LC08band2_path, options=info_options))
Driver: GTiff/GeoTIFF
Files: C:\Users\b1041827\ownCloud\01_work\02_presentations\jupyter_slides\gdal_intro\data\LC08_L1TP_146035_20170201_20170215_01_T1\LC08_L1TP_146035_20170201_20170215_01_T1_B2.TIF
       C:\Users\b1041827\ownCloud\01_work\02_presentations\jupyter_slides\gdal_intro\data\LC08_L1TP_146035_20170201_20170215_01_T1\LC08_L1TP_146035_20170201_20170215_01_T1_MTL.txt
Size is 7761, 7891
Coordinate System is:
PROJCS["WGS 84 / UTM zone 44N",
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AUTHORITY["EPSG","4326"]],
    PROJECTION["Transverse_Mercator"],
    PARAMETER["latitude_of_origin",0],
    PARAMETER["central_meridian",81],
    PARAMETER["scale_factor",0.9996],
    PARAMETER["false_easting",500000],
    PARAMETER["false_northing",0],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AXIS["Easting",EAST],
    AXIS["Northing",NORTH],
    AUTHORITY["EPSG","32644"]]
Origin = (281385.000000000000000,4107315.000000000000000)
Pixel Size = (30.000000000000000,-30.000000000000000)
Metadata:
  AREA_OR_POINT=Point
  METADATATYPE=ODL
Image Structure Metadata:
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  (  281385.000, 4107315.000) ( 78d32'25.46"E, 37d 5'11.93"N)
Lower Left  (  281385.000, 3870585.000) ( 78d36'21.01"E, 34d57'15.22"N)
Upper Right (  514215.000, 4107315.000) ( 81d 9'35.99"E, 37d 6'43.40"N)
Lower Right (  514215.000, 3870585.000) ( 81d 9'20.65"E, 34d58'39.84"N)
Center      (  397800.000, 3988950.000) ( 79d51'55.82"E, 36d 2'23.01"N)
Band 1 Block=7761x1 Type=UInt16, ColorInterp=Gray
    Computed Min/Max=0.000,50699.000
  Minimum=0.000, Maximum=50699.000, Mean=7705.460, StdDev=6648.513
  Metadata:
    STATISTICS_MAXIMUM=50699
    STATISTICS_MEAN=7705.4601707249
    STATISTICS_MINIMUM=0
    STATISTICS_STDDEV=6648.5125029085

Command-line Equivalent:

gdalinfo path_to_file.tif -mm -stats

gdalinfo docs here.

Open and Close a Dataset in Python

The gdal.Open() function creates a GDAL Dataset object based on the given file.

In [5]:
# Open the Landsat 8 Band 2 file as a GDAL Dataset object.
in_ds = gdal.Open(LC08band2_path, gdal.GA_ReadOnly)

# Close GDAL Dataset object.
in_ds = None

Access Metadata from Dataset Object in Python

In [6]:
in_ds = gdal.Open(LC08band2_path, gdal.GA_ReadOnly)

# Find the number of bands in the dataset object.
print("[ NUMBER OF BANDS ] = ", in_ds.RasterCount)
[ NUMBER OF BANDS ] =  1
In [7]:
# Create a GDAL raster band object from the only band in the dataset object.
in_band = in_ds.GetRasterBand(1)

print("[ BAND DATA TYPE ] = ", gdal.GetDataTypeName(in_band.DataType))
print("[ NO DATA VALUE ] = ", in_band.GetNoDataValue())
[ BAND DATA TYPE ] =  UInt16
[ NO DATA VALUE ] =  None
In [9]:
# Compute statistics if needed.
if in_band.GetMinimum() is None or in_band.GetMaximum()is None:
    in_band.ComputeStatistics(0)
    print("Statistics computed.")

print("[ OVERALL STATS ] = ", in_band.GetMetadata())
[ OVERALL STATS ] =  {'STATISTICS_MAXIMUM': '50699', 'STATISTICS_MEAN': '7705.4601707249', 'STATISTICS_MINIMUM': '0', 'STATISTICS_STDDEV': '6648.5125029085'}
In [10]:
print("[ MIN ] = ", in_band.GetMinimum())
print("[ MAX ] = ", in_band.GetMaximum())
print("[ SCALE ] = ", in_band.GetScale())

# Close everything.
in_ds = None
in_band = None
[ MIN ] =  0.0
[ MAX ] =  50699.0
[ SCALE ] =  1.0

Create Copies in Different Formats

CreateCopy() or Translate() functions in python, gdal_translate in the command-line.

In [11]:
# Check contents of Landsat 8 processed data folder.
print(os.listdir(LC08proc_path))
[]
In [12]:
in_ds = gdal.Open(LC08band2_path, gdal.GA_ReadOnly)

# Define path to save copy and with what driver.
dst_filename = os.path.join(LC08proc_path, "LC08_B2.vrt")
driver = gdal.GetDriverByName("VRT")
dst_ds = driver.CreateCopy(dst_filename, in_ds, strict=0)
print("VRT copy of Band 2 has been created.")

dst_filename = os.path.join(LC08proc_path, "LC08_B2.dat")
driver = gdal.GetDriverByName("ENVI")
dst_ds = driver.CreateCopy(dst_filename, in_ds, strict=0)
print("ENVI copy of Band 2 has been created.")

# Close initial and resulting datasets.
dst_ds = None
in_ds = None
VRT copy of Band 2 has been created.
ENVI copy of Band 2 has been created.
In [13]:
print(os.listdir(LC08proc_path))
['LC08_B2.dat', 'LC08_B2.dat.aux.xml', 'LC08_B2.hdr', 'LC08_B2.vrt']

Command-line Equivalent:

gdal_translate -of "VRT" path_to_input.tif LC08_B2.vrt

gdal_translate -of "ENVI" path_to_input.tif LC08_B2.dat

gdal_translate docs here.

Access Sentinel-2 bands as dictionary.

Get Sentinel-2 bands in a sorted order accessible via key-value pairs.

In [14]:
# Create dictionary with key-value pairs for easier access to bands.
S2_bands = {}

for entry in os.listdir(S2_path):
    if (entry.endswith('.jp2')):
            # This makes bands accessible by name (e.g. B01, B8A).
            band_name = entry[-7:-4]
            S2_bands[band_name]=os.path.join(S2_path, entry)

# Using the dictionary, individual bands can be accessed so:
print("Band 8A is called: " + os.path.basename(S2_bands['B8A']))
del band_name
Band 8A is called: T37SBA_20181102T082101_B8A.jp2

Open True Colour Band and view with Matplotlib

In [15]:
img = gdal.Open(S2_bands['TCI'], gdal.GA_ReadOnly)

# Read GDAL Dataset Object as a Numpy Array
RGB_Image = img.ReadAsArray()
print(RGB_Image.shape)

# Get extent and boundary
ulx, xres, xskew, uly, yskew, yres  = img.GetGeoTransform()
lrx = ulx + (img.RasterXSize * xres)
lry = uly + (img.RasterYSize * yres)
print(ulx, lrx, lry, uly)

# Re-order dimensions for matplotlib.
RGB_Image = numpy.dstack(((RGB_Image[0]), (RGB_Image[1]), (RGB_Image[2]))).astype('uint8')
print(RGB_Image.shape)
(3, 10980, 10980)
199980.0 309780.0 3990240.0 4100040.0
(10980, 10980, 3)
In [16]:
# Plot image in Matplotlib.
plt.figure(figsize=(10, 10))
plt.imshow(RGB_Image, extent=(ulx, lrx, lry, uly))
Out[16]:
<matplotlib.image.AxesImage at 0x7c031e0470>