Time series of custom index and masking#


Short description

This notebook expands on the previous example, creating a custom water or vegetation index, by creating a time series of the index and masking non-target areas.

In this notebook, you will search for, select, and obtain Sentinel-2 data over a time period. We want the selected data needs to be cloud-free in order to better analyze the index. Custom indices can then be calculated over the area. A mask only over areas with specific classifications (water for water indices, vegetation for vegetation indices) will be applied and the resulting data plotted. This example examines several water quality indices over a period of one year.


1 - Import spacesense object(s) and other dependencies#

[1]:
from spacesense import Client
import matplotlib.pyplot as plt
import numpy as np
import json

import os
if "SS_API_KEY" not in os.environ:
    from getpass import getpass
    api_key = getpass('Enter your api key : ')
    os.environ["SS_API_KEY"] = api_key

2 - Define AOI and output options#

The GeoJSON of this AOI, containing a small portion of Lake Matano and coastline in Indonesia, can be downloaded here

[2]:
# Define the AOI
with open("../resources/aois/lake_matano.geojson", mode="r") as file:
    aoi = json.load(file)

# Get an instance of the SpaceSense Client object
client = Client(id="s2_custom_index_ts_mask")

# Enable to save data in local files
client.enable_local_output()

3 - Search S2#

[3]:
start_date = "2021-01-01"
end_date = "2022-01-01"
# Scenes should be cloud-free in order to best analyze the resulting indices
res_S2 = client.s2_search(aoi=aoi, start_date=start_date, end_date=end_date, query_filters = {"valid_pixel_percentage" : {">=": 95}})
res_S2.dataframe
[3]:
id date tile valid_pixel_percentage platform relative_orbit_number product_id datetime swath_coverage_percentage no_data cloud_shadows vegetation not_vegetated water cloud_medium_probability cloud_high_probability thin_cirrus snow
17 S2B_51MUT_20210101_0_L2A 2021-01-01 51MUT 99.12 sentinel-2b 060 S2B_MSIL2A_20210101T021349_N0214_R060_T51MUT_2... 2021-01-01T02:18:17Z 100.0 0.0 0.25 24.38 0.00 73.22 0.53 0.10 0.00 0.0
16 S2B_51MUT_20210111_0_L2A 2021-01-11 51MUT 99.01 sentinel-2b 060 S2B_MSIL2A_20210111T021349_N0214_R060_T51MUT_2... 2021-01-11T02:18:18Z 100.0 0.0 0.09 25.73 0.00 72.62 0.00 0.00 0.90 0.0
15 S2A_51MUT_20210225_0_L2A 2021-02-25 51MUT 97.55 sentinel-2a 060 S2A_MSIL2A_20210225T021341_N0214_R060_T51MUT_2... 2021-02-25T02:18:19Z 100.0 0.0 0.16 21.80 1.04 73.37 0.36 1.93 0.00 0.0
14 S2B_51MUT_20210315_0_L2A 2021-03-15 51MUT 99.98 sentinel-2b 103 S2B_MSIL2A_20210315T022329_N0214_R103_T51MUT_2... 2021-03-15T02:28:12Z 100.0 0.0 0.00 25.61 0.04 73.67 0.02 0.00 0.00 0.0
13 S2A_51MUT_20210317_0_L2A 2021-03-17 51MUT 100.00 sentinel-2a 060 S2A_MSIL2A_20210317T021341_N0214_R060_T51MUT_2... 2021-03-17T02:18:18Z 100.0 0.0 0.00 26.04 0.00 73.09 0.00 0.00 0.00 0.0
12 S2B_51MUT_20210322_0_L2A 2021-03-22 51MUT 100.00 sentinel-2b 060 S2B_MSIL2A_20210322T021349_N0214_R060_T51MUT_2... 2021-03-22T02:18:18Z 100.0 0.0 0.00 25.85 0.00 73.05 0.00 0.00 0.00 0.0
11 S2B_51MUT_20210501_0_L2A 2021-05-01 51MUT 100.00 sentinel-2b 060 S2B_MSIL2A_20210501T021339_N0300_R060_T51MUT_2... 2021-05-01T02:18:15Z 100.0 0.0 0.00 26.02 0.00 72.95 0.00 0.00 0.00 0.0
10 S2B_51MUT_20210603_0_L2A 2021-06-03 51MUT 99.08 sentinel-2b 103 S2B_MSIL2A_20210603T022329_N0300_R103_T51MUT_2... 2021-06-03T02:28:14Z 100.0 0.0 0.00 24.69 0.09 73.29 0.23 0.69 0.00 0.0
9 S2A_51MUT_20210608_0_L2A 2021-06-08 51MUT 100.00 sentinel-2a 103 S2A_MSIL2A_20210608T022331_N0300_R103_T51MUT_2... 2021-06-08T02:28:14Z 100.0 0.0 0.00 26.25 0.00 72.96 0.00 0.00 0.00 0.0
8 S2B_51MUT_20210623_0_L2A 2021-06-23 51MUT 99.95 sentinel-2b 103 S2B_MSIL2A_20210623T022329_N0300_R103_T51MUT_2... 2021-06-23T02:28:14Z 100.0 0.0 0.00 25.81 0.00 72.98 0.00 0.00 0.05 0.0
7 S2A_51MUT_20210725_0_L2A 2021-07-25 51MUT 100.00 sentinel-2a 060 S2A_MSIL2A_20210725T021351_N0301_R060_T51MUT_2... 2021-07-25T02:18:22Z 100.0 0.0 0.00 26.08 0.00 73.27 0.00 0.00 0.00 0.0
6 S2A_51MUT_20210817_0_L2A 2021-08-17 51MUT 99.44 sentinel-2a 103 S2A_MSIL2A_20210817T022331_N0301_R103_T51MUT_2... 2021-08-17T02:28:15Z 100.0 0.0 0.00 24.06 0.75 73.43 0.03 0.53 0.00 0.0
5 S2B_51MUT_20211001_0_L2A 2021-10-01 51MUT 95.10 sentinel-2b 103 S2B_MSIL2A_20211001T022329_N0301_R103_T51MUT_2... 2021-10-01T02:28:13Z 100.0 0.0 0.00 25.74 0.00 68.32 0.60 0.20 4.10 0.0
4 S2A_51MUT_20211003_0_L2A 2021-10-03 51MUT 99.66 sentinel-2a 060 S2A_MSIL2A_20211003T021351_N0301_R060_T51MUT_2... 2021-10-03T02:18:24Z 100.0 0.0 0.02 26.01 0.14 72.64 0.32 0.00 0.00 0.0
3 S2A_51MUT_20211023_0_L2A 2021-10-23 51MUT 98.51 sentinel-2a 060 S2A_MSIL2A_20211023T021351_N0301_R060_T51MUT_2... 2021-10-23T02:18:24Z 100.0 0.0 0.35 23.49 0.47 73.46 0.30 0.84 0.00 0.0
2 S2B_51MUT_20211110_0_L2A 2021-11-10 51MUT 99.97 sentinel-2b 103 S2B_MSIL2A_20211110T022329_N0301_R103_T51MUT_2... 2021-11-10T02:28:13Z 100.0 0.0 0.00 26.09 0.00 72.94 0.00 0.00 0.03 0.0
1 S2B_51MUT_20211127_0_L2A 2021-11-27 51MUT 100.00 sentinel-2b 060 S2B_MSIL2A_20211127T021339_N0301_R060_T51MUT_2... 2021-11-27T02:18:15Z 100.0 0.0 0.00 26.15 0.00 73.05 0.00 0.00 0.00 0.0
0 S2B_51MUT_20211220_0_L2A 2021-12-20 51MUT 100.00 sentinel-2b 103 S2B_MSIL2A_20211220T022319_N0301_R103_T51MUT_2... 2021-12-20T02:28:07Z 100.0 0.0 0.00 25.83 0.00 73.01 0.00 0.00 0.00 0.0

4 - Specify bands#

Only selecting bands from S2 that we are interested in. The water quality indices we want to calculate use bands 1, 2, 3, 4, and 8, thus we only choose those bands plus the scene classification.

[4]:
res_S2.output_bands = ["B01","B02","B03","B04","B08","SCL"]

5 - Obtain S2 data through Fuse function#

[5]:
fusion = client.fuse(
    catalogs_list=[res_S2],
    additional_info={"Info": "You can put any additional information, such as custom non-raster or non-vector data, in this parameter"}
    )
fusion.dataset
2022-11-07 16:30:23,862 INFO spacesense.core : created everything
[5]:
<xarray.Dataset>
Dimensions:  (time: 18, y: 2357, x: 327)
Coordinates:
  * time     (time) datetime64[ns] 2021-01-01 2021-01-11 ... 2021-12-20
  * y        (y) float32 -2.487 -2.487 -2.487 -2.487 ... -2.504 -2.504 -2.504
  * x        (x) float64 121.3 121.3 121.3 121.3 ... 121.3 121.3 121.3 121.3
Data variables:
    S2_B01   (time, y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0388 0.0388 0.0 0.0
    S2_B02   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0397 0.0401 0.0415 0.0
    S2_B03   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0277 0.0267 0.0264 0.0
    S2_B04   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0184 0.0179 0.0179 0.0
    S2_B08   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0126 0.0122 0.0129 0.0
    S2_SCL   (time, y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 6.0 6.0 6.0 6.0 6.0
Attributes:
    crs:              +init=epsg:4326
    transform:        [ 8.98315284e-05  0.00000000e+00  1.21303645e+02  0.000...
    res:              [8.98315284e-05 7.00728646e-06]
    ulx, uly:         [121.30364507  -2.48706115]
    s2_data_lineage:  {"Data origin": "S3 bucket (ARN=arn:aws:s3:::sentinel-c...
    additional_info:  {'Info': 'You can put any additional information, such ...

6 - Postprocessing: Calculating indices#

Various water quality indices are calculated from the resulting S2 band data.

The following quantities are used, and the empirical formulas can be found through the source paper:

  • Concentration of chlorophyll a (Chl_a) (source)

  • Suspended particle matter (SPM) (source)

[6]:
ds = fusion.dataset
ds = ds.assign(CHL_A = (4.23 * ((ds.S2_B03/ds.S2_B01) ** 3.94)))
ds = ds.assign(SPM = 2887 * (ds.S2_B08)**1.223)
### Necessary to replace infinite values with nan
ds['CHL_A'] = ds['CHL_A'].where(np.isinf(ds['CHL_A'])==False)
ds
[6]:
<xarray.Dataset>
Dimensions:  (time: 18, y: 2357, x: 327)
Coordinates:
  * time     (time) datetime64[ns] 2021-01-01 2021-01-11 ... 2021-12-20
  * y        (y) float32 -2.487 -2.487 -2.487 -2.487 ... -2.504 -2.504 -2.504
  * x        (x) float64 121.3 121.3 121.3 121.3 ... 121.3 121.3 121.3 121.3
Data variables:
    S2_B01   (time, y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0388 0.0388 0.0 0.0
    S2_B02   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0397 0.0401 0.0415 0.0
    S2_B03   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0277 0.0267 0.0264 0.0
    S2_B04   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0184 0.0179 0.0179 0.0
    S2_B08   (time, y, x) float32 0.0 0.0 0.0 0.0 ... 0.0126 0.0122 0.0129 0.0
    S2_SCL   (time, y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 6.0 6.0 6.0 6.0 6.0
    CHL_A    (time, y, x) float32 nan nan nan nan nan ... 1.121 0.9701 nan nan
    SPM      (time, y, x) float32 0.0 0.0 0.0 0.0 0.0 ... 13.72 13.18 14.12 0.0
Attributes:
    crs:              +init=epsg:4326
    transform:        [ 8.98315284e-05  0.00000000e+00  1.21303645e+02  0.000...
    res:              [8.98315284e-05 7.00728646e-06]
    ulx, uly:         [121.30364507  -2.48706115]
    s2_data_lineage:  {"Data origin": "S3 bucket (ARN=arn:aws:s3:::sentinel-c...
    additional_info:  {'Info': 'You can put any additional information, such ...

7 - Postprocessing: Masking#

Here we mask out any non-water classifications of the AOI, as definied by the S2 scene classification (SCL) band. This can also be used with custom raster or vector datasets.

[7]:
### The SCL band value 6 is water. This mask looks only at pixels classified as water.
ds['mask'] = ds.S2_SCL.sel(time="2021-06-08")==6

8 - Visualization mask#

Plot the mask that we created according to the S2 SCL band. We are looking to mask water, classified as “6” in the SCL band, represented by the number “1” and colored yellow in this mask.

[8]:
ds.mask.plot()
[8]:
<matplotlib.collections.QuadMesh at 0x7f027c9e4910>
../_images/notebooks_9_index_time_series_and_masking_19_1.png

9 - Create time series of Chl_a and SPM with and without the mask applied#

[9]:
# Take the AOI level mean (average over lat and lon) of the indices and remove N/A
# The "where" function applies the mask we previously created
ts_Chl_a_masked = ds['CHL_A'].where(ds.mask).mean(dim={'y','x'}).dropna(dim='time')
ts_Chl_a = ds['CHL_A'].mean(dim={'y','x'}).dropna(dim='time')

# Apply the same process to SPM
ts_SPM_masked = ds['SPM'].where(ds.mask).mean(dim={'y','x'}).dropna(dim='time')
ts_SPM = ds['SPM'].mean(dim={'y','x'}).dropna(dim='time')

10 - Plot time series#

As we can see, the masking makes a very large difference in the analysis. This is because the AOI includes a large portion of land, which strongly taints the water quality indicies. The reverse scenario is also true when analyzing vegetation indices, the non-vegetated areas in the AOI should be masked.

[10]:
# Set up and plot time series using matplotlib
fig, ax = plt.subplots(figsize=(12,6))
ax2 = ax.twinx()
ts_Chl_a_masked.plot(ax=ax, color='green', label="Chlorophyll-a w/ mask", linestyle="-")
ts_Chl_a.plot(ax=ax, color='green', label="Chlorophyll-a", linestyle="--")
ts_SPM_masked.plot(ax=ax2, color='blue', label="SPM w/ mask", linestyle="-")
ts_SPM.plot(ax=ax2, color='blue', label="SPM", linestyle="--")

fig.legend(loc="upper left")
plt.title("Water Quality Indices Time Series")
[10]:
Text(0.5, 1.0, 'Water Quality Indices Time Series')
../_images/notebooks_9_index_time_series_and_masking_23_1.png