Convert sklearn model to ONNX

For reproducibility of the planetsca models, you might be interested in converting the sklearn model objects into the ONNX format. This allows a model trained with one version of scikit-learn to be used in an environment with a different version of scikit-learn, or with any other ML framework or runtime that supports the ONNX format. You can read more about persistence recommendations for scikit-learn models in the model persistence section of the scikit-learn documentation.

To do this conversion, we can use the skl2onnx package, following the steps below.

[1]:
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

import planetsca as ps

For this example, we are just going to retrieve the planetsca model from the joblib file on Hugging Face.

However, the model variable below can also be a new custom planetsca model that you’ve trained yourself. See the Train and Predict notebook for an example of this.

[2]:
# retrieve planetsca model from Hugging Face
model = ps.download.retrieve_model()

Specify the initial types for the model. Here we need to provide the data type and shape of the model's input data. We name this input variable surface_reflectance and specify that it is a FloatTensorType (for floating point numbers) of shape [None, 4], where None means the number of samples is flexible and 4 corresponds to the four PlanetScope bands (blue, green, red, NIR) for each sample.

[3]:
initial_types = [("surface_reflectance", FloatTensorType([None, 4]))]
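
To illustrate what the [None, 4] shape means in practice, here is a minimal sketch of arranging a 4-band image into one row per pixel. The array layout (bands, rows, cols) and the synthetic values are assumptions for illustration only; your own imagery may be laid out differently.

```python
import numpy as np

# Hypothetical 4-band image laid out as (bands, rows, cols); values are synthetic
bands, rows, cols = 4, 3, 5
image = np.arange(bands * rows * cols, dtype=np.float64).reshape(bands, rows, cols)

# Move the band axis last, then flatten the pixels into rows: (rows * cols, 4)
samples = np.moveaxis(image, 0, -1).reshape(-1, bands)

# FloatTensorType corresponds to 32-bit floats, so cast before passing to ONNX
samples = samples.astype(np.float32)

print(samples.shape, samples.dtype)
```

Each row of samples holds the four band values of one pixel, which matches the 4 in the declared shape, while the number of rows is free to vary.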

Now, convert the model:

[4]:
onnx_model = convert_sklearn(model, initial_types=initial_types)

And save the ONNX model to a new file.

[5]:
with open("random_forest_20240116_binary_174K.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())