About
CanoClass is a Python module created to process large amounts of NAIP imagery and create accurate canopy classifications in an open source framework. The need for an open source classification system arose during the creation of the Georgia canopy dataset, because the tools then in use, ArcMap and Textron’s Feature Analyst, will be phased out within the next few years. An open source approach was also needed because we had little insight into the algorithms that software used to process our data and no real way to tune them to our needs.
At its core, CanoClass is optimized to solve canopy classification problems. It is designed to be data agnostic, and its batch processing functions are written to work with NAIP imagery, because scalable processing of NAIP imagery is necessary.
Example NAIP

Example Output

CanoClass Batch
Batch processing functions for use with NAIP imagery and its corresponding QQ data. Configuration for the process is initialized with the Batch class. Each process can be run individually, or all of them can be run together with batch_naip.
class canoclass.Batch(workspace, naip_dir, roi_shp, roi_id, naipqq_clip, naipqq_query, training_raster, training_fit_raster, projection)
__init__(workspace, naip_dir, roi_shp, roi_id, naipqq_clip, naipqq_query, training_raster, training_fit_raster, projection)
Initialize the configuration for batch NAIP processing.
Parameters: - workspace (str, path) – Path to the folder that all data is read from and written to. The Data folder, which holds all input and reference data, sits inside the workspace folder.
- naip_dir (str, path) – Path to directory where NAIP imagery is contained.
- roi_shp (str, filename) – The region of interest shapefile.
- roi_id (str) – The field from which to query the ROI.
- naipqq_clip (str, filename) – The original NAIP QQ shapefile that allows NAIP tiles to be clipped to their QQ extent.
- naipqq_query (str, filename) – The NAIP QQ shapefile joined with the ROI shapefile. The roi_id will be queried against this shapefile to know which NAIP tiles to process.
- training_raster (str, filename) – The rasterized training data.
- training_fit_raster (str, filename) – The vegetation index raster that the rasterized training data will be fit with.
- projection (str) – The final projection of the output data. Must follow GDAL formatting, e.g. "EPSG:5070".
config
Dictionary of all config parameters.
Type: dict
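The constructor arguments above can be collected in one place before the class is created. The following is a minimal sketch, not taken from the CanoClass source: every path, file name and the roi_id field name is a hypothetical placeholder, and the helper `looks_like_epsg` is an illustrative check that only recognizes the "EPSG:<code>" form shown in the projection description (GDAL accepts other formats that this check would reject).

```python
import re
from pathlib import Path

# Hypothetical workspace location and file names, for illustration only.
workspace = Path("/data/canopy_workspace")

batch_kwargs = {
    "workspace": str(workspace),
    "naip_dir": str(workspace / "naip"),        # assumed location of the NAIP tiles
    "roi_shp": "roi.shp",
    "roi_id": "ROI_ID",                         # assumed field name
    "naipqq_clip": "naip_qq.shp",
    "naipqq_query": "naip_qq_roi_join.shp",
    "training_raster": "training.tif",
    "training_fit_raster": "training_fit.tif",
    "projection": "EPSG:5070",
}


def looks_like_epsg(code: str) -> bool:
    """Return True for strings shaped like 'EPSG:5070' (illustrative helper)."""
    return re.fullmatch(r"EPSG:\d+", code) is not None


if not looks_like_epsg(batch_kwargs["projection"]):
    raise ValueError("projection is not in EPSG:<code> form")

# With CanoClass installed, the configuration would then be created with:
# from canoclass import Batch
# batch = Batch(**batch_kwargs)
```

Keeping the arguments in a dictionary makes it easy to log or reuse the same configuration across regions.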
batch_clip_mosaic(pid)
Clips the mosaic to the ROI extent.
Parameters: pid (int) – roi_id number for the region to be processed.
batch_clip_reproject(pid)
Clips all classified tiles to their respective seamlines and reprojects them to the desired projection.
Parameters: pid (int) – roi_id number for the region to be processed.
batch_et_class(pid, smoothing=True, class_parameters=None)
Enables batch classification of NAIP imagery using the scikit-learn Extra Trees supervised classification algorithm.
Parameters: - pid (int) – roi_id number for the region to be processed.
- smoothing (bool, default=True) – Applies a 3x3 median filter to the output classified raster.
- class_parameters (dict) – Arguments for scikit-learn's Extra Trees classifier:
{"n_estimators": 100, "criterion": 'gini', "max_depth": None, "min_samples_split": 2, "min_samples_leaf": 1, "min_weight_fraction_leaf": 0.0, "max_features": 'auto', "max_leaf_nodes": None, "min_impurity_decrease": 0.0, "min_impurity_split": None, "bootstrap": False, "oob_score": False, "n_jobs": None, "random_state": None, "verbose": 0, "warm_start": False, "class_weight": None, "ccp_alpha": 0.0, "max_samples": None}
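The smoothing option refers to a 3x3 median filter, which replaces each pixel with the median of its neighbourhood and so removes isolated misclassified pixels. The sketch below illustrates the technique in pure Python on a small grid of class labels. It is not the CanoClass implementation: the function name is hypothetical, and the border handling (using only the neighbours that exist) is an assumption.

```python
from statistics import median_low


def median_filter_3x3(grid):
    """Apply a 3x3 median filter to a 2D list of integer class labels.

    Assumption: border pixels use only the neighbours that exist,
    so their windows hold 4 or 6 values instead of 9.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            # median_low always returns a value present in the window,
            # so the result is still a valid class label.
            out[r][c] = median_low(window)
    return out


# A canopy block (1) with a single stray non-canopy pixel (0) in the centre.
classified = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]
smoothed = median_filter_3x3(classified)
print(smoothed[2][2])  # the stray centre pixel now matches its neighbours
```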
batch_index(pid, index='ARVI')
Walks through the input NAIP directory, performs the vegetation index calculation on each NAIP GeoTIFF file and saves each new index GeoTIFF in the output directory.
Parameters: - pid (int) – roi_id number for the region to be processed.
- index (str, default="ARVI") – Which vegetation index to compute with rindcalc.
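For orientation, the default index, ARVI (Atmospherically Resistant Vegetation Index), is commonly defined from the near-infrared, red and blue bands as (NIR - RB) / (NIR + RB), where RB = Red - gamma * (Blue - Red) and gamma is usually 1. The sketch below computes that textbook form for a single pixel. It is an illustration only: the function name is hypothetical, the handling of a zero denominator is an assumption, and rindcalc's own implementation may differ in detail.

```python
def arvi(nir, red, blue, gamma=1.0):
    """Textbook ARVI for one pixel, given reflectance-like band values."""
    rb = red - gamma * (blue - red)
    denominator = nir + rb
    if denominator == 0:
        # Assumption: report 0.0 where the index is undefined.
        return 0.0
    return (nir - rb) / denominator


# Healthy vegetation reflects strongly in NIR and weakly in red and blue,
# so the index is well above zero for this made-up pixel.
value = arvi(nir=0.5, red=0.1, blue=0.05)
print(round(value, 3))
```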
batch_mosaic(pid)
Mosaics all classified NAIP tiles within a physiographic region using gdal_merge.py.
Parameters: pid (int) – roi_id number for the region to be processed.
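As a rough illustration of what a gdal_merge.py call looks like, the sketch below assembles the command line for merging a list of tiles into one output file using the utility's `-o` option. The helper name and the file names are hypothetical, nothing is executed, and the exact options CanoClass passes are not documented here.

```python
import shlex


def build_merge_command(output_path, tile_paths):
    """Build a gdal_merge.py argument list; nothing is executed here."""
    if not tile_paths:
        raise ValueError("at least one input tile is required")
    return ["gdal_merge.py", "-o", output_path, *tile_paths]


# Hypothetical classified tile names for one region.
tiles = ["c_tile_a.tif", "c_tile_b.tif"]
cmd = build_merge_command("mosaic_1.tif", tiles)
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a system where the GDAL utilities are installed.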
batch_naip(pid, index, alg, smoothing=True, class_parameters=None)
Wrapper function that runs every step needed to make a canopy dataset.
Parameters: - pid (int) – roi_id number for the region to be processed.
- index (str, default="ARVI") – Which vegetation index to compute with rindcalc.
- alg (str) – Which classification algorithm to use. "RF": Random Forests, "ET": Extra Trees.
- smoothing (bool) – Whether or not to apply a 3x3 median filter.
- class_parameters (dict) – Parameters to apply to the classification.
Random Forests:
{"n_estimators": 100, "criterion": 'gini', "max_depth": None, "min_samples_split": 2, "min_samples_leaf": 1, "min_weight_fraction_leaf": 0.0, "max_features": 'auto', "max_leaf_nodes": None, "min_impurity_decrease": 0.0, "min_impurity_split": None, "bootstrap": True, "oob_score": False, "n_jobs": None, "random_state": None, "verbose": 0, "warm_start": False, "class_weight": None, "ccp_alpha": 0.0, "max_samples": None}
Extra Trees:
{"n_estimators": 100, "criterion": 'gini', "max_depth": None, "min_samples_split": 2, "min_samples_leaf": 1, "min_weight_fraction_leaf": 0.0, "max_features": 'auto', "max_leaf_nodes": None, "min_impurity_decrease": 0.0, "min_impurity_split": None, "bootstrap": False, "oob_score": False, "n_jobs": None, "random_state": None, "verbose": 0, "warm_start": False, "class_weight": None, "ccp_alpha": 0.0, "max_samples": None}
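To make the wrapper's role concrete, the sketch below shows how a function like batch_naip might select a classifier from the alg argument and chain the individual steps. It uses a hypothetical stand-in class that only records step names. The step order (index, classification, clip and reproject, mosaic, clip mosaic) is an assumption inferred from the method descriptions, not a statement of the actual CanoClass implementation.

```python
class RecordingBatch:
    """Hypothetical stand-in for Batch: records step names, processes nothing."""

    def __init__(self):
        self.calls = []

    def batch_index(self, pid, index="ARVI"):
        self.calls.append("batch_index")

    def batch_rf_class(self, pid, smoothing=True, class_parameters=None):
        self.calls.append("batch_rf_class")

    def batch_et_class(self, pid, smoothing=True, class_parameters=None):
        self.calls.append("batch_et_class")

    def batch_clip_reproject(self, pid):
        self.calls.append("batch_clip_reproject")

    def batch_mosaic(self, pid):
        self.calls.append("batch_mosaic")

    def batch_clip_mosaic(self, pid):
        self.calls.append("batch_clip_mosaic")

    def batch_naip(self, pid, index, alg, smoothing=True, class_parameters=None):
        # Map the documented alg codes to the matching classification step.
        classifiers = {"RF": self.batch_rf_class, "ET": self.batch_et_class}
        if alg not in classifiers:
            raise ValueError(f"alg must be 'RF' or 'ET', got {alg!r}")
        # Assumed order of the steps.
        self.batch_index(pid, index)
        classifiers[alg](pid, smoothing, class_parameters)
        self.batch_clip_reproject(pid)
        self.batch_mosaic(pid)
        self.batch_clip_mosaic(pid)


demo = RecordingBatch()
demo.batch_naip(1, index="ARVI", alg="ET")
print(demo.calls)
```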
batch_rf_class(pid, smoothing=True, class_parameters=None)
Enables batch classification of NAIP imagery using the scikit-learn Random Forests supervised classification algorithm.
Parameters: - pid (int) – roi_id number for the region to be processed.
- smoothing (bool, default=True) – Applies a 3x3 median filter to the output classified raster.
- class_parameters (dict) – Arguments for scikit-learn's Random Forest classifier:
{"n_estimators": 100, "criterion": 'gini', "max_depth": None, "min_samples_split": 2, "min_samples_leaf": 1, "min_weight_fraction_leaf": 0.0, "max_features": 'auto', "max_leaf_nodes": None, "min_impurity_decrease": 0.0, "min_impurity_split": None, "bootstrap": True, "oob_score": False, "n_jobs": None, "random_state": None, "verbose": 0, "warm_start": False, "class_weight": None, "ccp_alpha": 0.0, "max_samples": None}
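One way to change a few classifier settings while keeping the rest is to start from the listed dictionary and override individual keys. The sketch below does this in plain Python. The helper `with_overrides` is hypothetical and not part of CanoClass, and whether CanoClass fills in missing keys itself is not stated here, so the full dictionary is passed. Some of the listed keys (for example min_impurity_split and the 'auto' value of max_features) may be rejected by newer scikit-learn releases, so the dictionary may need trimming for the installed version.

```python
# Random Forests settings as listed in this reference.
RF_DEFAULTS = {
    "n_estimators": 100, "criterion": "gini", "max_depth": None,
    "min_samples_split": 2, "min_samples_leaf": 1,
    "min_weight_fraction_leaf": 0.0, "max_features": "auto",
    "max_leaf_nodes": None, "min_impurity_decrease": 0.0,
    "min_impurity_split": None, "bootstrap": True, "oob_score": False,
    "n_jobs": None, "random_state": None, "verbose": 0,
    "warm_start": False, "class_weight": None, "ccp_alpha": 0.0,
    "max_samples": None,
}


def with_overrides(defaults, **overrides):
    """Return a copy of defaults with selected keys replaced (hypothetical helper)."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown classifier parameters: {sorted(unknown)}")
    return {**defaults, **overrides}


# More trees, all CPU cores, and a fixed seed for repeatable runs.
rf_params = with_overrides(RF_DEFAULTS, n_estimators=300, n_jobs=-1, random_state=0)

# The listed Extra Trees settings differ from Random Forests only in bootstrap.
et_params = with_overrides(RF_DEFAULTS, bootstrap=False)

# With CanoClass installed and a configured Batch instance named batch:
# batch.batch_rf_class(1, smoothing=True, class_parameters=rf_params)
```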