CanoClass Batch

Batch processing functions for use with NAIP imagery and its corresponding QQ data. Configuration for the process is initialized with the Batch class. Each process can be run individually, or all of them can be run together with batch_naip.


class canoclass.Batch(workspace, naip_dir, roi_shp, roi_id, naipqq_clip, naipqq_query, training_raster, training_fit_raster, projection)
__init__(workspace, naip_dir, roi_shp, roi_id, naipqq_clip, naipqq_query, training_raster, training_fit_raster, projection)

Initialize the configuration for batch NAIP processing.

Parameters:
  • workspace (str, path) – Path to the folder that all data is read from and written to. The Data folder, which holds all input and reference data, is located inside the workspace folder.
  • naip_dir (str, path) – Path to directory where NAIP imagery is contained.
  • roi_shp (str, filename) – The region of interest shapefile.
  • roi_id (str) – The field from which to query the ROI.
  • naipqq_clip (str, filename) – The original NAIP QQ shapefile that allows NAIP tiles to be clipped to their QQ extent.
  • naipqq_query (str, filename) – The NAIP QQ shapefile joined with the ROI shapefile. The roi_id will be queried against this shapefile to know which NAIP tiles to process.
  • training_raster (str, filename) – The rasterized training data.
  • training_fit_raster (str, filename) – The vegetation index raster that the rasterized training data will be fit with.
  • projection (str) – The final projection of the output data. Must follow GDAL formatting, e.g. “EPSG:5070”.
config

Dictionary of all config parameters

Type: dict
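
As a rough illustration of how the constructor arguments fit together, the sketch below collects them in a dictionary and defers the import of canoclass to a small helper. Every path, filename, and field name is a hypothetical placeholder, and make_batch is a helper invented for this example, not part of CanoClass.

```python
# Hypothetical values throughout; substitute the paths and
# filenames of a real workspace before using this.
batch_args = {
    "workspace": "/data/canopy",
    "naip_dir": "/data/naip",
    "roi_shp": "roi.shp",
    "roi_id": "PHYSIO_ID",
    "naipqq_clip": "naip_qq.shp",
    "naipqq_query": "naip_qq_roi_join.shp",
    "training_raster": "training.tif",
    "training_fit_raster": "training_arvi.tif",
    "projection": "EPSG:5070",
}


def make_batch(args):
    """Build a Batch object from keyword arguments (example helper only)."""
    # Imported here so the dictionary can be inspected on a machine
    # that does not have CanoClass installed.
    from canoclass import Batch

    return Batch(**args)
```

Calling make_batch(batch_args) should then return a configured object whose config attribute holds these settings, assuming the constructor accepts its parameters as keyword arguments.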
batch_clip_mosaic(pid)

Clips the mosaic to the ROI extent

Parameters: pid (int) – roi_id number for the region to be processed.
batch_clip_reproject(pid)

This function clips all classified tiles to their respective seamlines and reprojects them to the desired projection.

Parameters: pid (int) – roi_id number for the region to be processed.
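
The clip-and-reproject step can be pictured as one gdalwarp call per classified tile. The sketch below only builds such a command. It is an assumption about one way to do the step, not a description of how CanoClass does it internally (it may use the GDAL Python bindings instead). The helper name warp_command and all file names are hypothetical.

```python
def warp_command(src_tif, dst_tif, cutline_shp, where, projection="EPSG:5070"):
    """Return a gdalwarp argument list that clips one tile and reprojects it.

    Hypothetical helper for illustration; not part of CanoClass.
    """
    return [
        "gdalwarp",
        "-t_srs", projection,    # target projection in GDAL format
        "-cutline", cutline_shp, # shapefile holding the seamline polygons
        "-cwhere", where,        # attribute filter selecting this tile's polygon
        "-crop_to_cutline",      # shrink the output extent to the cutline
        src_tif,
        dst_tif,
    ]


# Hypothetical file names and attribute filter.
cmd = warp_command(
    "class_tile.tif",
    "class_tile_clip.tif",
    "naip_qq.shp",
    "FileName = 'tile_0001'",
)
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a system where the GDAL command-line tools are installed.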
batch_et_class(pid, smoothing=True, class_parameters=None)

This function enables batch classification of NAIP imagery using a sklearn Extra Trees supervised classification algorithm.

Parameters:
  • pid (int) – roi_id number for the region to be processed.
  • smoothing (bool, default=True) – Applies a 3x3 median filter to the output classified raster.
  • class_parameters (dict) –

    Arguments for scikit-learn's Extra Trees classifier:

    {"n_estimators": 100, "criterion": "gini", "max_depth": None,
     "min_samples_split": 2, "min_samples_leaf": 1,
     "min_weight_fraction_leaf": 0.0, "max_features": "auto",
     "max_leaf_nodes": None, "min_impurity_decrease": 0.0,
     "min_impurity_split": None, "bootstrap": False, "oob_score": False,
     "n_jobs": None, "random_state": None, "verbose": 0,
     "warm_start": False, "class_weight": None, "ccp_alpha": 0.0,
     "max_samples": None}
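
A common way to prepare class_parameters is to start from the listed values and override only the settings of interest. The sketch below does this with a plain dictionary merge. The override values are arbitrary examples, and whether batch_et_class accepts a partial dictionary or needs every key is not stated here, so treat that as an open assumption.

```python
# A subset of the values listed above, kept short for the example.
et_defaults = {
    "n_estimators": 100,
    "criterion": "gini",
    "max_depth": None,
    "bootstrap": False,
    "n_jobs": None,
    "random_state": None,
}

# Arbitrary example overrides: more trees, all CPU cores, a fixed seed.
overrides = {"n_estimators": 300, "n_jobs": -1, "random_state": 42}

# Later keys win, so the overrides replace the matching defaults.
class_parameters = {**et_defaults, **overrides}

# With a configured Batch object named batch (hypothetical), the call
# would then look like:
# batch.batch_et_class(pid, smoothing=True, class_parameters=class_parameters)
```

Recent scikit-learn releases no longer accept "min_impurity_split" or "max_features": "auto", so it is worth checking the installed version before passing the full list unchanged.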
batch_index(pid, index='ARVI')

This function walks through the input NAIP directory, performs the vegetation index calculation on each NAIP GeoTIFF file, and saves each new index GeoTIFF in the output directory.

Parameters:
  • pid (int) – roi_id number for the region to be processed.
  • index (str, default="ARVI") – Which vegetation index to compute with rindcalc
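
For orientation, the default index can be worked through for a single pixel. The sketch below uses the commonly published form of ARVI (Atmospherically Resistant Vegetation Index) with gamma = 1. The formula rindcalc actually applies may differ in detail, so this is an assumption for illustration rather than a copy of its implementation.

```python
import math


def arvi(nir, red, blue, gamma=1.0):
    """ARVI for one pixel from near-infrared, red, and blue reflectance."""
    # Red band corrected with the blue band to reduce atmospheric effects.
    rb = red - gamma * (blue - red)
    denominator = nir + rb
    if denominator == 0:
        return math.nan  # undefined where the bands cancel out
    return (nir - rb) / denominator


# Arbitrary example reflectances for one vegetated pixel.
value = arvi(nir=0.5, red=0.1, blue=0.05)
```

Values lie between -1 and 1 for non-negative reflectances where the corrected red term is non-negative, with dense vegetation towards the upper end.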
batch_mosaic(pid)

This function mosaics all classified NAIP tiles within a physiographic region using gdal_merge.py

Parameters: pid (int) – roi_id number for the region to be processed.
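
To give a sense of what a gdal_merge.py invocation looks like, the sketch below builds the argument list for merging a set of classified tiles into one GeoTIFF. The helper name merge_command and the file names are hypothetical, and the exact options CanoClass passes are not documented here.

```python
def merge_command(tile_paths, out_path):
    """Return a gdal_merge.py argument list (example helper only)."""
    if not tile_paths:
        raise ValueError("at least one input tile is required")
    return [
        "gdal_merge.py",
        "-o", out_path,  # output mosaic
        "-of", "GTiff",  # write a GeoTIFF
        *tile_paths,     # input tiles, in the order given
    ]


# Hypothetical tile names.
merge_cmd = merge_command(["tile_a.tif", "tile_b.tif"], "mosaic.tif")
```

As with any GDAL utility, the list can be handed to subprocess.run(merge_cmd, check=True) once the GDAL scripts are on the PATH.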
batch_naip(pid, index, alg, smoothing=True, class_parameters=None)

This function is a wrapper that runs every step needed to make a canopy dataset.

Parameters:
  • pid (int) – roi_id number for the region to be processed.
  • index (str) – Which vegetation index to compute with rindcalc, e.g. "ARVI".
  • alg (str) – Which classification algorithm to use. “RF”: Random Forests, “ET”: Extra Trees.
  • smoothing (bool) – Whether or not to apply a 3x3 median filter
  • class_parameters (dict) –

    Parameters to apply to classification.

    Random Forests:

    {"n_estimators": 100, "criterion": "gini", "max_depth": None,
     "min_samples_split": 2, "min_samples_leaf": 1,
     "min_weight_fraction_leaf": 0.0, "max_features": "auto",
     "max_leaf_nodes": None, "min_impurity_decrease": 0.0,
     "min_impurity_split": None, "bootstrap": True, "oob_score": False,
     "n_jobs": None, "random_state": None, "verbose": 0,
     "warm_start": False, "class_weight": None, "ccp_alpha": 0.0,
     "max_samples": None}

    Extra Trees:

    {"n_estimators": 100, "criterion": "gini", "max_depth": None,
     "min_samples_split": 2, "min_samples_leaf": 1,
     "min_weight_fraction_leaf": 0.0, "max_features": "auto",
     "max_leaf_nodes": None, "min_impurity_decrease": 0.0,
     "min_impurity_split": None, "bootstrap": False, "oob_score": False,
     "n_jobs": None, "random_state": None, "verbose": 0,
     "warm_start": False, "class_weight": None, "ccp_alpha": 0.0,
     "max_samples": None}
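
The wrapper behaviour can be sketched as a plain sequence of calls to the individual methods. The step order below is an assumption inferred from the method descriptions (index, classify, clip and reproject, mosaic, clip mosaic), not taken from the CanoClass source. run_pipeline and CallRecorder are names invented for this example.

```python
def run_pipeline(batch, pid, index, alg, smoothing=True, class_parameters=None):
    """Run each batch step in an assumed order on a Batch-like object."""
    if alg not in ("RF", "ET"):
        raise ValueError("alg must be 'RF' or 'ET'")
    batch.batch_index(pid, index)
    if alg == "RF":
        batch.batch_rf_class(pid, smoothing, class_parameters)
    else:
        batch.batch_et_class(pid, smoothing, class_parameters)
    batch.batch_clip_reproject(pid)
    batch.batch_mosaic(pid)
    batch.batch_clip_mosaic(pid)


class CallRecorder:
    """Stand-in for a Batch object that only records method names."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Reached only for attributes that do not exist, i.e. the batch_* methods.
        def record(*args, **kwargs):
            self.calls.append(name)

        return record


recorder = CallRecorder()
run_pipeline(recorder, pid=1, index="ARVI", alg="ET")
print(recorder.calls)
```

The stand-in object makes it possible to check the call order without any imagery. With a real Batch object the single call batch_naip(pid, index, alg) covers the whole sequence.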
batch_rf_class(pid, smoothing=True, class_parameters=None)

This function enables batch classification of NAIP imagery using a sklearn Random Forests supervised classification algorithm.

Parameters:
  • pid (int) – roi_id number for the region to be processed.
  • smoothing (bool, default=True) – Applies a 3x3 median filter to the output classified raster.
  • class_parameters (dict) –

    Arguments for scikit-learn's Random Forests classifier:

    {"n_estimators": 100, "criterion": "gini", "max_depth": None,
     "min_samples_split": 2, "min_samples_leaf": 1,
     "min_weight_fraction_leaf": 0.0, "max_features": "auto",
     "max_leaf_nodes": None, "min_impurity_decrease": 0.0,
     "min_impurity_split": None, "bootstrap": True, "oob_score": False,
     "n_jobs": None, "random_state": None, "verbose": 0,
     "warm_start": False, "class_weight": None, "ccp_alpha": 0.0,
     "max_samples": None}
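
The effect of the smoothing option can be seen on a tiny grid. The sketch below is a pure-Python 3x3 median filter that leaves border cells unchanged. The border handling is an assumption made to keep the example short, and CanoClass may treat edges differently or rely on a library routine.

```python
from statistics import median


def median_filter_3x3(grid):
    """Replace each interior cell with the median of its 3x3 neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so the input is not modified
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [
                grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = median(window)
    return out


# A canopy (1) patch with one isolated non-canopy (0) pixel.
classified = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 1],
]
smoothed = median_filter_3x3(classified)
```

The isolated pixel is replaced by the value of its neighbours, which is the kind of speckle removal the filter is meant to provide on a classified raster.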