planetsca.train
This module contains functions for training Random Forest models that predict snow-covered area (SCA) in Planet imagery.
- train.data_training_new(labeled_polygons_filepath: str, training_image_filepath: str, training_data_filepath: str | None = None, rasterized_mask_output_filepath: str | None = None, ndvi: bool | None = False)
Creates a new training data DataFrame from labeled polygons and a PlanetScope image
- Parameters:
labeled_polygons_filepath (str) – File path to shapefile or geojson file with labeled polygons
training_image_filepath (str) – File path to PlanetScope image
training_data_filepath (Optional[str]) – File path to output the training data DataFrame as a csv file, defaults to None
rasterized_mask_output_filepath (Optional[str]) – File path to output the rasterized labeled polygons as a geotiff file, defaults to None
ndvi (Optional[bool]) – Set to True to compute the Normalized Difference Vegetation Index (NDVI) and add it to the training data DataFrame, defaults to False
- Returns:
training_data_df – pandas DataFrame of training data
- Return type:
DataFrame
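The NDVI added by the ndvi option is conventionally defined as (NIR - Red) / (NIR + Red). The sketch below shows that formula with NumPy. It is an illustration only: the helper name compute_ndvi and the NaN handling for zero denominators are assumptions, not necessarily how planetsca computes the column.

```python
import numpy as np


def compute_ndvi(red, nir):
    """Return NDVI = (nir - red) / (nir + red), element-wise.

    Hypothetical helper for illustration; not part of planetsca.
    """
    red = np.asarray(red, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    denom = nir + red
    # Pixels where both bands are zero have no defined NDVI; leave them as NaN.
    ndvi = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=ndvi, where=denom != 0)
    return ndvi


# Made-up reflectance values for three pixels.
red = np.array([0.10, 0.30, 0.0])
nir = np.array([0.50, 0.30, 0.0])
ndvi = compute_ndvi(red, nir)
```

Values range from -1 to 1. Here the first pixel gives 0.4 / 0.6, the second gives 0, and the third is left undefined because both bands are zero.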
- train.train_model(df_train: DataFrame, new_model_filepath: str, new_model_score_filepath: str, n_estimators: int = 10, max_depth: int = 10, max_features: int = 4, random_state: int | None = None, n_splits: int = 2, n_repeats: int = 2) -> RandomForestClassifier
Trains a new Random Forest model with custom parameters and saves the model and its score information to file
- Parameters:
df_train (pd.DataFrame) – DataFrame containing training data; must have feature columns ‘blue’, ‘green’, ‘red’, ‘nir’ and target column ‘label’
new_model_filepath (str) – Filepath to save the model as a joblib file
new_model_score_filepath (str) – Filepath to save the model score information as a csv file
n_estimators (int) – Number of trees in the forest, defaults to 10
max_depth (int) – Maximum depth of each tree, defaults to 10
max_features (int) – Number of features to consider when looking for the best split, defaults to 4
random_state (Optional[int]) – Seed to ensure reproducibility, defaults to None
n_splits (int) – Number of folds in the cross-validation, defaults to 2
n_repeats (int) – Number of times the cross-validation is repeated, defaults to 2
- Returns:
model – The newly trained model
- Return type:
RandomForestClassifier
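The parameters above map naturally onto scikit-learn's RandomForestClassifier and a repeated k-fold cross-validator. The following is a minimal sketch of that workflow on synthetic data, not the planetsca implementation: the choice of RepeatedStratifiedKFold, the F1 scoring metric, and the made-up reflectance values are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in for a training DataFrame (values invented for illustration).
rng = np.random.default_rng(0)
n = 200
label = rng.integers(0, 2, size=n)
# Give "snow" pixels (label 1) a brighter reflectance than "no snow" pixels.
base = np.where(label == 1, 0.8, 0.2)
features = ["blue", "green", "red", "nir"]
df_train = pd.DataFrame(
    {band: base + rng.normal(0, 0.05, size=n) for band in features}
)
df_train["label"] = label

X = df_train[features]
y = df_train["label"]

# Same defaults as the documented signature; random_state fixed for repeatability.
model = RandomForestClassifier(
    n_estimators=10, max_depth=10, max_features=4, random_state=0
)

# Assumed cross-validator: n_splits folds, repeated n_repeats times.
cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=2, random_state=0)
scores = cross_val_score(model, X, y, scoring="f1", cv=cv)

# Fit on the full training set after scoring.
model.fit(X, y)
```

With n_splits=2 and n_repeats=2 the cross-validation yields four scores, one per fold per repeat. A fitted model like this could be persisted with joblib.dump(model, path), matching the joblib file that new_model_filepath describes.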
- train.vector_rasterize(labeled_polygons_filepath: str, training_image_filepath: str, rasterized_mask_output_filepath: str = None)
Helper function that converts a vector file of labeled polygons to a raster
- Parameters:
labeled_polygons_filepath (str) – File path to shapefile or geojson file with labeled polygons
training_image_filepath (str) – File path to PlanetScope image
rasterized_mask_output_filepath (Optional[str]) – File path to output the rasterized labeled polygons as a geotiff file, defaults to None
- Returns:
rasterized – Rasterized version of the vector file
- Return type:
np.ndarray