Field Variability Measurement#

0. Presentation#

In this use case, we are going to build a simple Variable Rate Zoning application for agriculture, using SpaceSense’s clustering function

For a high-level, non-technical presentation, you can read this blog article

vra

1. Installation#

This installs the SpaceSense SDK in your notebook

[ ]:
!pip install spacesense

2. Import#

Imports the additional librairies needed in this use case

[ ]:
from spacesense import Client, geoutils, Raster, Vector
from google.colab import files
import datetime
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import json
from skimage import exposure

We then import this function as it will be useful later on in the notebook to display some results

[ ]:
def get_rgb(ds, brightness_factor=1):

    def norm(band):
        band_min, band_max = band.min(), band.max()
        return (band - band_min) / (band_max - band_min)

    var_names = ["S2_BLUE", "S2_GREEN", "S2_RED"]
    band_names = ["S2_B02", "S2_B03", "S2_B04"]

    for elem in var_names:
        if elem in ds.data_vars:
            if elem == "S2_BLUE":
                ds = ds.rename_vars({"S2_BLUE": "S2_B02"})
            elif elem == "S2_GREEN":
                ds = ds.rename_vars({"S2_GREEN": "S2_B03"})
            elif elem == "S2_RED":
                ds = ds.rename_vars({"S2_RED": "S2_B04"})

    blue = norm(ds.S2_B02) * brightness_factor
    green = norm(ds.S2_B03) * brightness_factor
    red = norm(ds.S2_B04) * brightness_factor
    rgb = np.dstack([red, green, blue])

    return rgb

3. Authentication#

Insert the API Key that you received by registering with SpaceSense. If you don’t have one, you can register and get one here

[ ]:
if "SS_API_KEY" not in os.environ:
    from getpass import getpass
    api_key = getpass('Enter your api key : ')
    os.environ["SS_API_KEY"] = api_key
Enter your api key : ··········

4. Identify your Area Of Interest#

In the AOI variable you can add your Area Of Interest boundaries using a geojson format. If you want to build one, you can do it here

[ ]:
aoi = {
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {},
      "geometry": {
        "coordinates": [
          [
            [
              -0.518582854662526,
              54.15052308473625
            ],
            [
              -0.5185010341573673,
              54.14867816904507
            ],
            [
              -0.519912437875206,
              54.145167810221125
            ],
            [
              -0.5196260661061558,
              54.14426920636609
            ],
            [
              -0.5201988096442278,
              54.142699598186624
            ],
            [
              -0.5191965084531489,
              54.14242401198479
            ],
            [
              -0.5173146368297523,
              54.142951218769156
            ],
            [
              -0.5162714253861225,
              54.142220316656704
            ],
            [
              -0.5148395665420367,
              54.142232298762536
            ],
            [
              -0.5083348363648099,
              54.14537149105058
            ],
            [
              -0.5133667974453999,
              54.15021161123221
            ],
            [
              -0.5172328163246789,
              54.15017567183108
            ],
            [
              -0.5172328163246789,
              54.15052308473625
            ],
            [
              -0.518582854662526,
              54.15052308473625
            ]
          ]
        ],
        "type": "Polygon"
      }
    }
  ]
}

client = Client(id="Variable Rate Zoning")

5. Search available images#

First add the Time range Of Interest (TOI) for your study. Below the SDK will search Sentinel 2 images available for the AOI and TOI specified above, and then display all the results in a list

[ ]:
# Define the TOI
start_date = "2022-03-15"
end_date = "2022-08-01"
# Call the S2 search function, passing the AOI, and TOI
s2_search_result = client.s2_search(aoi=aoi, start_date=start_date, end_date=end_date, query_filters={"valid_pixel_percentage": {">=": 99}})
# Visualize the Pandas dataframe with the results
s2_search_result.dataframe
id date tile valid_pixel_percentage platform relative_orbit_number product_id datetime swath_coverage_percentage no_data cloud_shadows vegetation not_vegetated water cloud_medium_probability cloud_high_probability thin_cirrus snow
9 S2B_30UXF_20220315_0_L2A 2022-03-15 30UXF 100.0 sentinel-2b 037 S2B_MSIL2A_20220315T112109_N0400_R037_T30UXF_2... 2022-03-15T11:26:00Z 100.0 0.0 0.0 13.42 86.58 0.0 0.0 0.0 0.0 0.0
8 S2A_30UXF_20220317_0_L2A 2022-03-17 30UXF 100.0 sentinel-2a 137 S2A_MSIL2A_20220317T110811_N0400_R137_T30UXF_2... 2022-03-17T11:16:08Z 100.0 0.0 0.0 47.98 52.02 0.0 0.0 0.0 0.0 0.0
7 S2B_30UXF_20220322_0_L2A 2022-03-22 30UXF 100.0 sentinel-2b 137 S2B_MSIL2A_20220322T110639_N0400_R137_T30UXF_2... 2022-03-22T11:16:02Z 100.0 0.0 0.0 19.04 80.96 0.0 0.0 0.0 0.0 0.0
6 S2B_30UXF_20220325_0_L2A 2022-03-25 30UXF 100.0 sentinel-2b 037 S2B_MSIL2A_20220325T112109_N0400_R037_T30UXF_2... 2022-03-25T11:25:59Z 100.0 0.0 0.0 93.12 6.88 0.0 0.0 0.0 0.0 0.0
5 S2B_30UXF_20220421_0_L2A 2022-04-21 30UXF 100.0 sentinel-2b 137 S2B_MSIL2A_20220421T110619_N0400_R137_T30UXF_2... 2022-04-21T11:16:00Z 100.0 0.0 0.0 100.00 0.00 0.0 0.0 0.0 0.0 0.0
4 S2A_30UXF_20220429_0_L2A 2022-04-29 30UXF 100.0 sentinel-2a 037 S2A_MSIL2A_20220429T112121_N0400_R037_T30UXF_2... 2022-04-29T11:26:07Z 100.0 0.0 0.0 100.00 0.00 0.0 0.0 0.0 0.0 0.0
3 S2B_30UXF_20220524_0_L2A 2022-05-24 30UXF 100.0 sentinel-2b 037 S2B_MSIL2A_20220524T112119_N0400_R037_T30UXF_2... 2022-05-24T11:26:02Z 100.0 0.0 0.0 100.00 0.00 0.0 0.0 0.0 0.0 0.0
2 S2B_30UXF_20220620_0_L2A 2022-06-20 30UXF 100.0 sentinel-2b 137 S2B_MSIL2A_20220620T110629_N0400_R137_T30UXF_2... 2022-06-20T11:16:09Z 100.0 0.0 0.0 99.92 0.08 0.0 0.0 0.0 0.0 0.0
1 S2B_30UXF_20220710_0_L2A 2022-07-10 30UXF 100.0 sentinel-2b 137 S2B_MSIL2A_20220710T110629_N0400_R137_T30UXF_2... 2022-07-10T11:16:09Z 100.0 0.0 0.0 100.00 0.00 0.0 0.0 0.0 0.0 0.0
0 S2A_30UXF_20220718_0_L2A 2022-07-18 30UXF 100.0 sentinel-2a 037 S2A_MSIL2A_20220718T112131_N0400_R037_T30UXF_2... 2022-07-18T11:26:14Z 100.0 0.0 0.0 100.00 0.00 0.0 0.0 0.0 0.0 0.0

We define the bands we want to download. This saves processing time and cost, giving us only what we need. The full list is available in the documentation. For this example, we will take the RGB bands for visualisation, as well as some common indices used in Agriculture.

We’ll also remove the duplicates if there are any.

[ ]:
#We remove duplicate dates
s2_search_result.filter_duplicate_dates()

s2_search_result.output_bands = ["B02","B03","B04", "NDVI", "LAI", "NDRE"]

6. Downloading the images#

Through the fuse function we are able to download all images we selected into a single object

[ ]:
# Call the core fusion function, passing our filtered S2 result
fuse_result = client.fuse(
        catalogs_list=[s2_search_result]
    )
# Visualize the Pandas dataframe with the results
fuse_result.dataset
<xarray.Dataset>
Dimensions:  (time: 10, y: 97, x: 76)
Coordinates:
  * time     (time) datetime64[ns] 2022-03-15 2022-03-17 ... 2022-07-18
  * y        (y) float32 54.15 54.15 54.15 54.15 ... 54.14 54.14 54.14 54.14
  * x        (x) float32 -0.5202 -0.5201 -0.5199 ... -0.5087 -0.5085 -0.5083
Data variables:
    S2_B02   (time, y, x) float32 ...
    S2_B03   (time, y, x) float32 ...
    S2_B04   (time, y, x) float32 ...
    S2_NDVI  (time, y, x) float32 ...
    S2_LAI   (time, y, x) float32 ...
    S2_NDRE  (time, y, x) float32 ...
Attributes:
    transform:        [ 1.58336957e-04  0.00000000e+00 -5.20294561e-01  0.000...
    crs:              +init=epsg:4326
    res:              [1.58336957e-04 8.66651535e-05]
    descriptions:     ['B02', 'B03', 'B04', 'NDVI', 'LAI', 'NDRE']
    AREA_OR_POINT:    Area
    _FillValue:       nan
    s2_data_lineage:  {"Data origin": "S3 bucket (ARN=arn:aws:s3:::sentinel-c...
    ulx, uly:         [-0.52029456 54.15055221]

Below we display the images we have over the full season

[ ]:
fuse_result.plot_rgb(all_dates = True, brightness_factor = 1, aspect='equal')
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_19_0.png

We’ll select a date where we already have some clear crop coverage (2022-04-21), and look at the indices we have available for our zoning

[ ]:
current_date = fuse_result.dataset.isel(time=5)

# Create the subplots with a larger figure size
fig, axes = plt.subplots(ncols=4, figsize=(20, 12))

# Display the images with titles
img_before_display = axes[0].imshow(get_rgb(current_date, brightness_factor=1))
axes[0].set_title('RGB')
img_after_display = axes[1].imshow(current_date.S2_NDVI)
axes[1].set_title('NDVI')
img_after_display = axes[2].imshow(current_date.S2_LAI)
axes[2].set_title('LAI')
img_after_display = axes[3].imshow(current_date.S2_NDRE)
axes[3].set_title('NDRE')

# Show the plot
plt.show()
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_21_0.png

7. Creating the zones - Only with satellite imagery#

Now that we have our images, we are able to start creating the zones on the field. The “cluster” function is what makes it possible. There are several options available. In your clustering, you can specify:

  • How many zones you want

  • Which dates to include

  • Which bands to include *Insert custom data (as shown in part 8)

1. Single date, with a single band

[ ]:
#Select the number of clustering classes you want
n_clusters = 6
#What variables are to be taken into account in the clustering?
variables_to_use = ["S2_NDVI"]
#In this example we the date defined in the step above
clustered_dataset_RGB = geoutils.cluster(current_date, n_clusters, variables_to_use)

#Display the cluster
geoutils.plot_clustered_dataset(clustered_dataset_RGB, n_clusters)
/usr/local/lib/python3.10/dist-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_24_1.png

Observation: We clearly see the vertical bands that are visible in the NDVI image

2. Single date, with several bands

[ ]:
#Select the number of clustering classes you want
n_clusters = 6
#What variables are to be taken into account in the clustering?
variables_to_use = ["S2_NDVI", "S2_LAI", "S2_NDRE"]
#In this example we the date defined in the step above
clustered_dataset_RGB = geoutils.cluster(current_date, n_clusters, variables_to_use)

#Display the cluster
geoutils.plot_clustered_dataset(clustered_dataset_RGB, n_clusters)
/usr/local/lib/python3.10/dist-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_27_1.png

Observation: We still see the vertical bands from the NDVI, but we also see an area that appeared in the bottom, which is visible in the LAI band

3. Several dates, with a several bands

[ ]:
#Select the number of clustering classes you want
n_clusters = 6
#What variables are to be taken into account in the clustering?
variables_to_use = [ "S2_NDVI", "S2_LAI", "S2_NDRE"]
#In this example we take all the dates from fuse_result.dataset)
clustered_dataset_RGB = geoutils.cluster(fuse_result.dataset, n_clusters, variables_to_use)

#Display the cluster
geoutils.plot_clustered_dataset(clustered_dataset_RGB, n_clusters)
/usr/local/lib/python3.10/dist-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_30_1.png

Observation: When taking the whole season, we see significantly less noise than with a single date, and it shows the long term differences in the field, and reveals structural variations within the field

8. Creating the zones - Custom file and satellite data#

As a final step, we’ll demonstrate how you can add custom files to this clustering work. These could be soil maps, historical yield map… Anything that you feel would be relevant for your analysis

If you want to download a file adapted for the example presented, you can find it here.

You can also upload vector files, as shown in the documentation

[ ]:
#Uploads the file(s) you want to include
uploaded = files.upload()
Upload widget is only available when the cell has been executed in the current browser session. Please rerun this cell to enable.
Saving test_field_raster.tif to test_field_raster.tif
[ ]:
#Add a raster file
raster_field = Raster('/content/' + list(uploaded.keys())[0])
#If you add several fields, you can add them by updating the index for uploaded.keys(), as shown below)
#raster_field2 = Raster('/content/' + list(uploaded.keys())[1])

We download and fuse the images and the file

[ ]:
fuse_result = client.fuse(
    catalogs_list=[s2_search_result],
    to_fuse=[raster_field],
    additional_info={"Info": "You can put any additional information, such as custom non-raster or non-vector data, in this parameter"}
    )
fuse_result.dataset
/usr/local/lib/python3.10/dist-packages/spacesense/core.py:223: UserWarning: Variable(s) referenced in grid_mapping not in variables: ['spatial_ref']
  dataset = xr.open_dataset(tempfile.name, decode_coords="all", engine="netcdf4", mask_and_scale=False)
<xarray.Dataset>
Dimensions:                  (time: 10, y: 97, x: 76)
Coordinates:
  * time                     (time) datetime64[ns] 2022-03-15 ... 2022-07-18
  * y                        (y) float32 54.15 54.15 54.15 ... 54.14 54.14 54.14
  * x                        (x) float32 -0.5202 -0.5201 ... -0.5085 -0.5083
Data variables:
    S2_B02                   (time, y, x) float32 ...
    S2_B03                   (time, y, x) float32 ...
    S2_B04                   (time, y, x) float32 ...
    S2_NDVI                  (time, y, x) float32 ...
    S2_LAI                   (time, y, x) float32 ...
    S2_NDRE                  (time, y, x) float32 ...
    test_field_raster_BAND1  (y, x) float32 ...
Attributes:
    transform:        [ 1.58336957e-04  0.00000000e+00 -5.20294561e-01  0.000...
    crs:              +init=epsg:4326
    res:              [1.58336957e-04 8.66651535e-05]
    descriptions:     ['B02', 'B03', 'B04', 'NDVI', 'LAI', 'NDRE']
    AREA_OR_POINT:    Area
    _FillValue:       nan
    s2_data_lineage:  {"Data origin": "S3 bucket (ARN=arn:aws:s3:::sentinel-c...
    ulx, uly:         [-0.52029456 54.15055221]
    scale_factor:     1.0
    add_offset:       0.0
    additional_info:  {'Info': 'You can put any additional information, such ...
[ ]:
current_date = fuse_result.dataset.isel(time=5)

We start by displaying the imported file. This is a randomly generated file, just designed for an example. This explains why the image looks like static.

[ ]:
# Plot the image
plt.imshow(current_date["test_field_raster_BAND1"])
plt.axis('off')  # Optional: Turn off the axis labels
plt.show()
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_39_0.png

And below we mix this file with NDVI, LAI and NDRE

[ ]:
#Select the number of clustering classes you want
n_clusters = 6
#What variables are to be taken into account in the clustering?
variables_to_use = ["test_field_raster_BAND1", "S2_NDVI", "S2_LAI", "S2_NDRE"]
#In this example we take all the dates from fuse_result.dataset, but you can select a single date as well)
clustered_dataset_RGB = geoutils.cluster(current_date, n_clusters, variables_to_use)

#Display the cluster
geoutils.plot_clustered_dataset(clustered_dataset_RGB, n_clusters)
/usr/local/lib/python3.10/dist-packages/sklearn/cluster/_kmeans.py:870: FutureWarning: The default value of `n_init` will change from 10 to 'auto' in 1.4. Set the value of `n_init` explicitly to suppress the warning
  warnings.warn(
../_images/notebooks_11_Field_variability_measurement_for_fertiliser_and_seed_input_in_Agriculture_41_1.png

Observation: And this is it! You can see that the file has clearly impacted the result of the clustering. Again, in this case it was random noise, which explains the result above. But it will provide precious information if complemented with actual field information