4: IceNet Library Usage#
Context#
Purpose#
This notebook demonstrates the use of the IceNet library for sea-ice forecasting, using models trained on climate reanalysis and observational data.
Description#
IceNet is a Python library that provides the ability to download, process, train and predict from end to end. Users can interact with IceNet either via the Python interface or via a set of command-line interfaces (CLI), which provide a high-level interface covering the same capabilities.
This notebook demonstrates the use of the Python library API to forecast sea ice on a reduced dataset, in order to illustrate its capabilities. The final outputs of interest are maps of sea ice concentration.
Modelling approach#
This modelling approach allows users to immediately utilise the library for producing sea ice concentration forecasts.
Highlights#
The key features of an end-to-end run are:
Setup: this is concerned with setting up the conda environment, which remains the same as in 01.cli_demonstration
1. Introduction to the environment and project structure.
2. Download sea ice concentration data as training data.
3. Process downloaded data, and generate cached datasets to speed up training.
4. Train the neural network and generate checkpoint and model output.
5. Predict for defined dates.
6. Visualisation of the prediction output.
This follows the same structure as the CLI demonstration notebook, so it is easy to follow step by step.
Contributions#
Notebook#
James Byrne (author)
Bryn Noel Ubald (co-author)
Please raise issues in this repository to suggest updates to this notebook!
Contact me at jambyr <at> bas.ac.uk for anything else…
Modelling codebase#
James Byrne (code author), Bryn Noel Ubald (code author), Tom Andersson (science author)
Modelling publications#
Andersson, T.R., Hosking, J.S., Pérez-Ortiz, M. et al. Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nat Commun 12, 5124 (2021). https://doi.org/10.1038/s41467-021-25257-4
Involved organisations#
The Alan Turing Institute and British Antarctic Survey
1. Introduction#
Once installed, the API can be used as easily as the CLI commands, from any Python interpreter. As usual, ensure that you're operating within the conda environment into which you installed the library.
A tip on CLI - API usage#
All of the icenet_* CLI commands implement API activities behind the scenes. By inspecting the setup.py entry points you can locate the module, and thus the code, used by each of them.
In most cases the CLI makes assumptions about what to do and does not necessarily expose every option that changes the behaviour of the library. This is primarily because the CLI entry points are still under development to open up those options, so the CLI operations are intended for introductory use, while API usage is recommended for advanced use cases and pipeline integrations.
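For example, a minimal way to list these CLI-to-module mappings from Python, rather than reading setup.py directly, is sketched below; it uses only the standard library and assumes icenet is installed in the active environment.
# Sketch: list the icenet_* console scripts and the module:function each one calls.
from importlib.metadata import entry_points
for ep in entry_points(group="console_scripts"):
    if ep.name.startswith("icenet_"):
        print(f"{ep.name} -> {ep.value}")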
What we’ll cover#
For the sake of illustration this notebook will display and execute the API code equivalent to the first notebook of this collection, along with some updates that incorporate the visualisations from the third notebook describing the data. To extend our dataset, we'll also work towards widening our original downloads, which covered 2019-12-28 through 2020-04-30, to cover 2020 in its entirety, as well as creating a more complex selection of dates for our dataset and training and predicting with new networks.
import numpy as np
import pandas as pd
import os
import random
# We also set the logging level so that we get some feedback from the API
import logging
logging.basicConfig(level=logging.INFO)
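The numpy and random imports above are available for later use; if you want any random selections in your own experiments to be reproducible, you can optionally seed them here (an optional addition, not something the notebook requires):
# Optional: seed the random number generators imported above for reproducibility.
random.seed(42)
np.random.seed(42)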
2. Download#
The following is preparation of the downloaders, whose instantiation describes the interactions with the upstream APIs/data interfaces used to source various types of data.
from icenet.data.sic.mask import Masks
from icenet.data.interfaces.cds import ERA5Downloader
from icenet.data.sic.osisaf import SICDownloader
Next we download all required data with our extended date range. All downloaders inherit a download method from the Downloader class in icenet.data.producers, which also contains two other data producing classes, Generator (which Masks inherits from) and Processor (used in the next section), each providing abstract implementations that multiple classes derive from.
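As a quick check of the hierarchy described above, we can inspect the classes directly. This is an optional sketch: it imports the producer base classes from the icenet.data.producers module named above and simply prints what it finds.
# Optional: confirm that Masks derives from Generator and that the downloaders
# derive from Downloader, as described above.
import inspect
from icenet.data.producers import Downloader, Generator, Processor
print([cls.__name__ for cls in inspect.getmro(Masks)])
print(issubclass(ERA5Downloader, Downloader), issubclass(SICDownloader, Downloader))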
Mask data#
We start here by generating the masks for training/prediction. These include regions where sea ice does not form, land regions, and the polar hole.
We saw in 01.cli_demonstration that we can generate these using the icenet_data_masks CLI; to do the same in Python, we can do the following:
masks = Masks(north=False, south=True)
masks.generate(save_polarhole_masks=False)
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_01.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_02.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_03.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_04.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_05.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_06.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_07.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_08.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_09.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_10.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_11.npy, already exists
INFO:root:Skipping ./data/masks/south/masks/active_grid_cell_mask_12.npy, already exists
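As a quick sanity check we can load one of the generated monthly mask files listed in the log output above (the path below is taken verbatim from that output):
# Optional: load one of the monthly active grid cell masks generated above.
agcm = np.load("./data/masks/south/masks/active_grid_cell_mask_01.npy")
print(agcm.shape, agcm.dtype, int(agcm.sum()), "active cells")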
Climate and Sea Ice Data#
We saw in 01.cli_demonstration.ipynb that obtaining and preparing data can be achieved using the icenet_data_* commands. To do so, you first need to configure the CDS API token yourself - see here for instructions on how to set this up.
Here we will download ERA5 reanalysis data via the direct API using the icenet.data.interfaces.cds.ERA5Downloader class, and OSISAF sea-ice concentration (SIC) data using the icenet.data.sic.osisaf.SICDownloader class.
era5 = ERA5Downloader(
var_names=["zg", "uas", "vas"], # Name of variables to download
dates=[ # Dates to download the variable data for
pd.to_datetime(date).date()
for date in pd.date_range("2020-01-01", "2020-04-30", freq="D")
],
delete_tempfiles=False, # Whether to delete temporary downloaded files
levels=[[250, 500], None, None], # The levels at which to obtain the variables (e.g. for zg, these are pressure levels)
max_threads=64, # Maximum number of concurrent downloads
north=False, # Boolean: Whether data across the northern hemisphere is required
south=True, # Boolean: Whether data across the southern hemisphere is required
# NOTE: there appears to be a bug with the toolbox API at present (icenet#54)
use_toolbox=False) # Experimental, alternative download method
era5.download() # Start downloading
WARNING:root:!!! Deletions of temp files are switched off: be careful with this, you need to manage your files manually
2025-02-18 17:50:42,158 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
INFO:datapi.legacy_api_client:[2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-02-18 17:50:42,161 WARNING [2024-06-16T00:00:00] CDS API syntax is changed and some keys or parameter names may have also changed. To avoid requests failing, please use the "Show API request code" tool on the dataset Download Form to check you are using the correct syntax for your API request.
WARNING:datapi.legacy_api_client:[2024-06-16T00:00:00] CDS API syntax is changed and some keys or parameter names may have also changed. To avoid requests failing, please use the "Show API request code" tool on the dataset Download Form to check you are using the correct syntax for your API request.
INFO:root:Upping connection limit for max_threads > 10
INFO:root:Building request(s), downloading and daily averaging from ERA5 API
INFO:root:Processing single download for zg @ 250 with 121 dates
INFO:root:Processing single download for zg @ 500 with 121 dates
INFO:root:Processing single download for uas @ None with 121 dates
INFO:root:Processing single download for vas @ None with 121 dates
INFO:root:No requested dates remain, likely already present
INFO:root:No requested dates remain, likely already present
INFO:root:No requested dates remain, likely already present
INFO:root:No requested dates remain, likely already present
INFO:root:0 daily files downloaded
sic = SICDownloader(
dates=[
pd.to_datetime(date).date() # Dates to download the variable data for
for date in pd.date_range("2020-01-01", "2020-04-30", freq="D")
],
delete_tempfiles=False, # Whether to delete temporary downloaded files
north=False, # Boolean: Whether to use mask for this region
south=True, # Boolean: Whether to use mask for this region
parallel_opens=False, # Boolean: Whether to use `dask.delayed` to open and preprocess multiple files in parallel
)
sic.download()
INFO:root:Downloading SIC datafiles to .temp intermediates...
INFO:root:Excluding 121 dates already existing from 121 dates requested.
INFO:root:Opening for interpolation: ['./data/osisaf/south/siconca/2020.nc']
INFO:root:Processing 0 missing dates
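If you want to confirm what the SIC downloader produced, the interpolated file named in the log output above can be opened directly. The sketch below uses xarray (which IceNet itself builds on); the path is taken from the log output.
# Optional: inspect the interpolated SIC file referenced in the log output above.
import xarray as xr
sic_ds = xr.open_dataset("./data/osisaf/south/siconca/2020.nc")
print(sic_ds)
sic_ds.close()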
The ERA5Downloader inherits from ClimateDownloader, from which several implementations derive their functionality. Two particularly useful methods, shown below, allow the downloaded data to be converted to the same grid and orientation as the OSISAF SIC data.
era5.regrid()
era5.rotate_wind_data()
INFO:root:No regrid batches to processing, moving on...
INFO:root:Rotating wind data prior to merging
INFO:root:Rotating wind data in ./data/era5/south/uas ./data/era5/south/vas
INFO:root:0 files for uas
INFO:root:0 files for vas
INFO:root:Rotating wind data in ./data/era5/south/uas ./data/era5/south/vas
INFO:root:0 files for uas
INFO:root:0 files for vas
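Similarly, we can check what the ERA5 downloader has written to disk. The directory below is taken from the log messages above; the exact layout of variable subdirectories may vary between icenet versions.
# Optional: list the ERA5 variable directories produced under ./data/era5/south.
for entry in sorted(os.listdir("./data/era5/south")):
    print(entry)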
It should hopefully be obvious by now that the CLI operations wrap up several API activities for convenience and initial ease of use, but that for experimentation, research and advancing the pipeline, the API offers greater flexibility to manipulate the processing chains as required.
3. Process#
Similarly to the downloaders, each data producer (be it a Downloader or Generator) has a respective Processor that converts the /data/ products into a normalised, preprocessed dataset under /processed/, as per the icenet_process_* commands.
Firstly, to make life a bit easier, we set up some variables that are normally handled via the CLI arguments. In this case we're splitting the validation and test sets out of the 2020 data in a fairly naive manner.
processing_dates = dict(
train=[pd.to_datetime(el) for el in pd.date_range("2020-01-01", "2020-03-31")],
val=[pd.to_datetime(el) for el in pd.date_range("2020-04-03", "2020-04-23")],
test=[pd.to_datetime(el) for el in pd.date_range("2020-04-01", "2020-04-02")],
)
processed_name = "tutorial_api_data"
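A quick check on the naive split defined above confirms the number of dates in each category (these counts should match the processor log output further down):
# Summarise the train/val/test split before processing.
for split, dates in processing_dates.items():
    print(f"{split}: {len(dates)} dates ({dates[0].date()} to {dates[-1].date()})")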
Next, we create the data producers and configure them for the dataset we want to create.
from icenet.data.processors.era5 import IceNetERA5PreProcessor
from icenet.data.processors.meta import IceNetMetaPreProcessor
from icenet.data.processors.osi import IceNetOSIPreProcessor
pp = IceNetERA5PreProcessor(
["uas", "vas"], # Absolute normalised variables
["zg500", "zg250"], # Variables defined as deviations from an aggregated norm
processed_name,
processing_dates["train"],
processing_dates["val"],
processing_dates["test"],
linear_trends=tuple(),
north=False,
south=True
)
osi = IceNetOSIPreProcessor(
["siconca"], # Absolute normalised variables
[], # Variables defined as deviations from an aggregated norm
processed_name,
processing_dates["train"],
processing_dates["val"],
processing_dates["test"],
linear_trends=tuple(),
north=False,
south=True
)
meta = IceNetMetaPreProcessor(
processed_name,
north=False,
south=True
)
INFO:root:Creating path: ./processed/tutorial_api_data/era5
INFO:root:Creating path: ./processed/tutorial_api_data/osisaf
INFO:root:Creating path: ./processed/tutorial_api_data/meta
Next, we initialise the data processors using init_source_data, which scans the data source directories to establish what data is available for processing based on the given parameters. Since we named the processed data "tutorial_api_data" above, this will create a data loader config file, loader.tutorial_api_data.json, in the current directory.
pp.init_source_data(
lag_days=1,
)
pp.process()
osi.init_source_data(
lag_days=1,
)
osi.process()
meta.process()
INFO:root:Processing 91 dates for train category
INFO:root:Including lag of 1 days
INFO:root:Including lead of 93 days
INFO:root:Processing 21 dates for val category
INFO:root:Including lag of 1 days
INFO:root:Including lead of 93 days
INFO:root:Processing 2 dates for test category
INFO:root:Including lag of 1 days
INFO:root:Including lead of 93 days
INFO:root:Got 2 files for psl
INFO:root:Got 2 files for ta500
INFO:root:Got 2 files for tas
INFO:root:Got 2 files for tos
INFO:root:Got 2 files for uas
INFO:root:Got 2 files for vas
INFO:root:Got 2 files for zg250
INFO:root:Got 2 files for zg500
INFO:root:Opening files for uas
INFO:root:Filtered to 731 units long based on configuration requirements
INFO:root:Normalising uas
INFO:root:Opening files for vas
INFO:root:Filtered to 731 units long based on configuration requirements
INFO:root:Normalising vas
INFO:root:Opening files for zg500
INFO:root:Filtered to 731 units long based on configuration requirements
INFO:root:Generating climatology ./processed/tutorial_api_data/era5/south/params/climatology.zg500
WARNING:root:We don't have a full climatology (1,2,3) compared with data (1,2,3,4,5,6,7,8,9,10,11,12)
INFO:root:Normalising zg500
INFO:root:Opening files for zg250
INFO:root:Filtered to 731 units long based on configuration requirements
INFO:root:Generating climatology ./processed/tutorial_api_data/era5/south/params/climatology.zg250
WARNING:root:We don't have a full climatology (1,2,3) compared with data (1,2,3,4,5,6,7,8,9,10,11,12)
INFO:root:Normalising zg250
INFO:root:Writing configuration to ./loader.tutorial_api_data.json
INFO:root:Processing 91 dates for train category
INFO:root:Including lag of 1 days
INFO:root:Including lead of 93 days
INFO:root:No data found for 2019-12-31, outside data boundary perhaps?
INFO:root:Processing 21 dates for val category
INFO:root:Including lag of 1 days
INFO:root:Including lead of 93 days
INFO:root:Processing 2 dates for test category
INFO:root:Including lag of 1 days
INFO:root:Including lead of 93 days
INFO:root:Got 1 files for siconca
INFO:root:Opening files for siconca
INFO:root:Filtered to 121 units long based on configuration requirements
INFO:root:No normalisation for siconca
INFO:root:Loading configuration ./loader.tutorial_api_data.json
INFO:root:Writing configuration to ./loader.tutorial_api_data.json
INFO:root:Loading configuration ./loader.tutorial_api_data.json
INFO:root:Writing configuration to ./loader.tutorial_api_data.json
At this point the preprocessed data is ready, and we can either convert it into a cached network dataset or simply create a configuration for one.
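Before creating the dataset configuration, it can be useful to check that the normalised NetCDF files are where we expect them; the glob pattern below matches the var_files paths that appear in the loader configuration shown later in this section.
# Optional: list the normalised/derived NetCDF outputs written by the preprocessors.
import glob
for path in sorted(glob.glob("./processed/tutorial_api_data/*/south/*/*.nc")):
    print(path)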
Dataset creation#
As with the icenet_dataset_create command, we can create a dataset configuration for training the network. As before, this can include cached data for the network in the form of a TFRecordDataset-compatible set of tfrecords. To achieve this we create the IceNetDataLoader, which can both generate IceNetDataSet configurations (which easily provide the necessary functionality for training and prediction) and produce individual data samples for direct usage.
from icenet.data.loaders import IceNetDataLoaderFactory
implementation = "dask"
loader_config = "loader.tutorial_api_data.json"
dataset_name = "api_dataset"
lag = 1
dl = IceNetDataLoaderFactory().create_data_loader(
implementation,
loader_config,
dataset_name,
lag,
n_forecast_days=7,
north=False,
south=True,
output_batch_size=4,
generate_workers=8)
INFO:root:Creating path: ./network_datasets/api_dataset
INFO:root:Loading configuration loader.tutorial_api_data.json
dl
<icenet.data.loaders.dask.DaskMultiWorkerLoader at 0x7f7f22298c50>
We can see that the loader config contains information about the data sources included, as well as the different dates to use for the training, validation and test sets:
dl._config
{'sources': {'era5': {'name': 'tutorial_api_data',
'implementation': 'IceNetERA5PreProcessor',
'anom': ['zg500', 'zg250'],
'abs': ['uas', 'vas'],
'dates': {'train': ['2020_01_01',
'2020_01_02',
'2020_01_03',
'2020_01_04',
'2020_01_05',
'2020_01_06',
'2020_01_07',
'2020_01_08',
'2020_01_09',
'2020_01_10',
'2020_01_11',
'2020_01_12',
'2020_01_13',
'2020_01_14',
'2020_01_15',
'2020_01_16',
'2020_01_17',
'2020_01_18',
'2020_01_19',
'2020_01_20',
'2020_01_21',
'2020_01_22',
'2020_01_23',
'2020_01_24',
'2020_01_25',
'2020_01_26',
'2020_01_27',
'2020_01_28',
'2020_01_29',
'2020_01_30',
'2020_01_31',
'2020_02_01',
'2020_02_02',
'2020_02_03',
'2020_02_04',
'2020_02_05',
'2020_02_06',
'2020_02_07',
'2020_02_08',
'2020_02_09',
'2020_02_10',
'2020_02_11',
'2020_02_12',
'2020_02_13',
'2020_02_14',
'2020_02_15',
'2020_02_16',
'2020_02_17',
'2020_02_18',
'2020_02_19',
'2020_02_20',
'2020_02_21',
'2020_02_22',
'2020_02_23',
'2020_02_24',
'2020_02_25',
'2020_02_26',
'2020_02_27',
'2020_02_28',
'2020_02_29',
'2020_03_01',
'2020_03_02',
'2020_03_03',
'2020_03_04',
'2020_03_05',
'2020_03_06',
'2020_03_07',
'2020_03_08',
'2020_03_09',
'2020_03_10',
'2020_03_11',
'2020_03_12',
'2020_03_13',
'2020_03_14',
'2020_03_15',
'2020_03_16',
'2020_03_17',
'2020_03_18',
'2020_03_19',
'2020_03_20',
'2020_03_21',
'2020_03_22',
'2020_03_23',
'2020_03_24',
'2020_03_25',
'2020_03_26',
'2020_03_27',
'2020_03_28',
'2020_03_29',
'2020_03_30',
'2020_03_31'],
'val': ['2020_04_03',
'2020_04_04',
'2020_04_05',
'2020_04_06',
'2020_04_07',
'2020_04_08',
'2020_04_09',
'2020_04_10',
'2020_04_11',
'2020_04_12',
'2020_04_13',
'2020_04_14',
'2020_04_15',
'2020_04_16',
'2020_04_17',
'2020_04_18',
'2020_04_19',
'2020_04_20',
'2020_04_21',
'2020_04_22',
'2020_04_23'],
'test': ['2020_04_01', '2020_04_02']},
'linear_trends': [],
'linear_trend_steps': [1, 2, 3, 4, 5, 6, 7],
'meta': [],
'var_files': {'uas': ['./processed/tutorial_api_data/era5/south/uas/uas_abs.nc'],
'vas': ['./processed/tutorial_api_data/era5/south/vas/vas_abs.nc'],
'zg500': ['./processed/tutorial_api_data/era5/south/zg500/zg500_anom.nc'],
'zg250': ['./processed/tutorial_api_data/era5/south/zg250/zg250_anom.nc']}},
'osisaf': {'name': 'tutorial_api_data',
'implementation': 'IceNetOSIPreProcessor',
'anom': [],
'abs': ['siconca'],
'dates': {'train': ['2020_01_01',
'2020_01_02',
'2020_01_03',
'2020_01_04',
'2020_01_05',
'2020_01_06',
'2020_01_07',
'2020_01_08',
'2020_01_09',
'2020_01_10',
'2020_01_11',
'2020_01_12',
'2020_01_13',
'2020_01_14',
'2020_01_15',
'2020_01_16',
'2020_01_17',
'2020_01_18',
'2020_01_19',
'2020_01_20',
'2020_01_21',
'2020_01_22',
'2020_01_23',
'2020_01_24',
'2020_01_25',
'2020_01_26',
'2020_01_27',
'2020_01_28',
'2020_01_29',
'2020_01_30',
'2020_01_31',
'2020_02_01',
'2020_02_02',
'2020_02_03',
'2020_02_04',
'2020_02_05',
'2020_02_06',
'2020_02_07',
'2020_02_08',
'2020_02_09',
'2020_02_10',
'2020_02_11',
'2020_02_12',
'2020_02_13',
'2020_02_14',
'2020_02_15',
'2020_02_16',
'2020_02_17',
'2020_02_18',
'2020_02_19',
'2020_02_20',
'2020_02_21',
'2020_02_22',
'2020_02_23',
'2020_02_24',
'2020_02_25',
'2020_02_26',
'2020_02_27',
'2020_02_28',
'2020_02_29',
'2020_03_01',
'2020_03_02',
'2020_03_03',
'2020_03_04',
'2020_03_05',
'2020_03_06',
'2020_03_07',
'2020_03_08',
'2020_03_09',
'2020_03_10',
'2020_03_11',
'2020_03_12',
'2020_03_13',
'2020_03_14',
'2020_03_15',
'2020_03_16',
'2020_03_17',
'2020_03_18',
'2020_03_19',
'2020_03_20',
'2020_03_21',
'2020_03_22',
'2020_03_23',
'2020_03_24',
'2020_03_25',
'2020_03_26',
'2020_03_27',
'2020_03_28',
'2020_03_29',
'2020_03_30',
'2020_03_31'],
'val': ['2020_04_03',
'2020_04_04',
'2020_04_05',
'2020_04_06',
'2020_04_07',
'2020_04_08',
'2020_04_09',
'2020_04_10',
'2020_04_11',
'2020_04_12',
'2020_04_13',
'2020_04_14',
'2020_04_15',
'2020_04_16',
'2020_04_17',
'2020_04_18',
'2020_04_19',
'2020_04_20',
'2020_04_21',
'2020_04_22',
'2020_04_23'],
'test': ['2020_04_01', '2020_04_02']},
'linear_trends': [],
'linear_trend_steps': [1, 2, 3, 4, 5, 6, 7],
'meta': [],
'var_files': {'siconca': ['./processed/tutorial_api_data/osisaf/south/siconca/siconca_abs.nc']}},
'meta': {'name': 'tutorial_api_data',
'implementation': 'IceNetMetaPreProcessor',
'anom': [],
'abs': [],
'dates': {'train': [], 'val': [], 'test': []},
'linear_trends': [],
'linear_trend_steps': [1, 2, 3, 4, 5, 6, 7],
'meta': ['sin', 'cos', 'land'],
'var_files': {'sin': ['./processed/tutorial_api_data/meta/south/sin/sin.nc'],
'cos': ['./processed/tutorial_api_data/meta/south/cos/cos.nc'],
'land': ['./processed/tutorial_api_data/meta/south/land/land.nc']}}},
'dtype': 'float32',
'shape': [432, 432],
'missing_dates': []}
dl._config.keys()
dict_keys(['sources', 'dtype', 'shape', 'missing_dates'])
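The same configuration is also persisted to disk as loader.tutorial_api_data.json (see the "Writing configuration" log messages above), so it can be inspected without holding a loader instance; a minimal sketch:
# Optional: read the loader configuration straight from the JSON file written above.
import json
with open("loader.tutorial_api_data.json") as fh:
    loader_cfg = json.load(fh)
print(loader_cfg.keys())
print(len(loader_cfg["sources"]["era5"]["dates"]["train"]), "ERA5 training dates")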
At this point we can use either generate or write_dataset_config_only to produce a ready-to-go IceNetDataSet configuration. Both of these will generate a dataset config, dataset_config.api_dataset.json (recall we set the dataset name as api_dataset above).
dl.generate()
INFO:distributed.http.proxy:To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
INFO:distributed.scheduler:State start
INFO:distributed.scheduler: Scheduler at: tcp://127.0.0.1:46424
INFO:distributed.scheduler: dashboard at: http://127.0.0.1:8888/status
INFO:distributed.scheduler:Registering Worker plugin shuffle
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:35032'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:40659'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:41786'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:38892'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:35929'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:33984'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:35942'
INFO:distributed.nanny: Start Nanny at: 'tcp://127.0.0.1:39198'
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:43652 name: 6
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:43652
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42940
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:39165 name: 2
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:39165
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42934
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:36933 name: 0
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:36933
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42942
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:40005 name: 5
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:40005
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42930
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:39324 name: 1
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:39324
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42936
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:38414 name: 4
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:38414
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42944
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:38235 name: 3
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:38235
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42932
INFO:distributed.scheduler:Register worker addr: tcp://127.0.0.1:37202 name: 7
INFO:distributed.scheduler:Starting worker compute stream, tcp://127.0.0.1:37202
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42938
INFO:distributed.scheduler:Receive client connection: Client-0d439a20-ee21-11ef-a3a1-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:42946
INFO:root:Dashboard at localhost:8888
INFO:root:Using dask client <Client: 'tcp://127.0.0.1:46424' processes=8 threads=8, memory=503.20 GiB>
INFO:root:91 train dates to process, generating cache data.
INFO:distributed.scheduler:Receive client connection: Client-worker-1094dc07-ee21-11ef-a737-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43018
INFO:distributed.scheduler:Receive client connection: Client-worker-10949de6-ee21-11ef-a743-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43020
INFO:distributed.scheduler:Receive client connection: Client-worker-1094e581-ee21-11ef-a74a-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43022
INFO:distributed.scheduler:Receive client connection: Client-worker-10966a0c-ee21-11ef-a73b-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43026
INFO:distributed.scheduler:Receive client connection: Client-worker-1096f3b7-ee21-11ef-a735-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43024
INFO:distributed.scheduler:Receive client connection: Client-worker-10976f21-ee21-11ef-a74f-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43028
INFO:distributed.scheduler:Receive client connection: Client-worker-10998830-ee21-11ef-a74b-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43032
INFO:distributed.scheduler:Receive client connection: Client-worker-109a8925-ee21-11ef-a741-c4cbe1af5a66
INFO:distributed.core:Starting established connection to tcp://127.0.0.1:43034
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('astype-8edca25275ac5fe21814770f6825e20b', 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: waiting
old run_spec: Alias(('astype-8edca25275ac5fe21814770f6825e20b', 0, 0)->('getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0))
new run_spec: Alias(('astype-8edca25275ac5fe21814770f6825e20b', 0, 0)->('array-getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0))
old dependencies: {('getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0)}
new dependencies: {('array-getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0)}
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: waiting
old run_spec: Alias(('astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0)->('getitem-astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0))
new run_spec: Alias(('astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0)->('array-getitem-astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0))
old dependencies: {('getitem-astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0)}
new dependencies: {('array-getitem-astype-bbd1124cd4ebdf229756b011d5b455c2', 0, 0)}
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 40, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: processing
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 40, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 40, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 40, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 40, 0, 0)}
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 60, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 60, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 60, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 60, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 60, 0, 0)}
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('astype-8edca25275ac5fe21814770f6825e20b', 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: Alias(('astype-8edca25275ac5fe21814770f6825e20b', 0, 0)->('array-getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0))
new run_spec: Alias(('astype-8edca25275ac5fe21814770f6825e20b', 0, 0)->('getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0))
old dependencies: {('array-getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0)}
new dependencies: {('getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0)}
WARNING:distributed.scheduler:Detected different `run_spec` for key ('astype-ac07afc5678ef766d38010ceed63954d', 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: Alias(('astype-ac07afc5678ef766d38010ceed63954d', 0, 0)->('array-getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0))
new run_spec: Alias(('astype-ac07afc5678ef766d38010ceed63954d', 0, 0)->('getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0))
old dependencies: {('array-getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0)}
new dependencies: {('getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0)}
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 36, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 36, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 36, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 36, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 36, 0, 0)}
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 56, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 56, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 56, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 56, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 56, 0, 0)}
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 57, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 57, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 57, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 57, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 57, 0, 0)}
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 29, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 29, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 29, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 29, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 29, 0, 0)}
WARNING:distributed.scheduler:Detected different `run_spec` for key ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 49, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 49, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 49, 0, 0)->('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 49, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687'}
new dependencies: {('original-open_dataset-siconca_abs-a01747618e200dfad2e64a04882ac687', 49, 0, 0)}
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
WARNING:distributed.scheduler:Detected different `run_spec` for key ('astype-8edca25275ac5fe21814770f6825e20b', 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: Alias(('astype-8edca25275ac5fe21814770f6825e20b', 0, 0)->('array-getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0))
new run_spec: Alias(('astype-8edca25275ac5fe21814770f6825e20b', 0, 0)->('getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0))
old dependencies: {('array-getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0)}
new dependencies: {('getitem-astype-8edca25275ac5fe21814770f6825e20b', 0, 0)}
[... repeated Dask "Sending large graph" UserWarnings and distributed.scheduler run_spec warnings omitted ...]
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000000.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000001.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000002.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000003.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000004.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000005.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000006.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000007.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000008.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000009.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000010.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000011.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000012.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000013.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000014.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000015.tfrecord
[... repeated Dask "Sending large graph" UserWarnings and distributed.scheduler run_spec warnings omitted ...]
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000016.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000017.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000018.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000019.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000020.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000021.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/train/00000022.tfrecord
INFO:root:21 val dates to process, generating cache data.
[... repeated Dask "Sending large graph" UserWarnings and distributed.scheduler run_spec warnings omitted ...]
INFO:root:Finished output ./network_datasets/api_dataset/south/val/00000000.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/val/00000001.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/val/00000002.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/val/00000003.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/val/00000004.tfrecord
INFO:root:Finished output ./network_datasets/api_dataset/south/val/00000005.tfrecord
INFO:root:2 test dates to process, generating cache data.
INFO:root:Finished output ./network_datasets/api_dataset/south/test/00000000.tfrecord
INFO:root:Average sample generation time: 5.241115434127941
INFO:root:Writing configuration to ./dataset_config.api_dataset.json
To generate samples from this, we can use the .generate_sample() method, which returns the network inputs x, the target outputs y and the sample weights sw:
x, y, sw = dl.generate_sample(pd.Timestamp("2020-04-01"))
print(f"type(x): {type(x)}, x.shape: {x.shape}")
print(f"type(y): {type(y)}, y.shape: {y.shape}")
print(f"type(sw): {type(sw)}, sw.shape: {sw.shape}")
type(x): <class 'numpy.ndarray'>, x.shape: (432, 432, 8)
type(y): <class 'numpy.ndarray'>, y.shape: (432, 432, 7, 1)
type(sw): <class 'numpy.ndarray'>, sw.shape: (432, 432, 7, 1)
4. Train#
For single runs we can programmatically call the same method used by the CLI: train_model defines the training process from start to finish. The model-ensembler works outside the API, controlling multiple CLI submissions; an ensemble can be customised by looking at the configuration in the pipeline repository. That said, if workflow system integration (e.g. Airflow) is desired, integrating via this method is the way to go.
from icenet.data.dataset import IceNetDataSet
import tensorflow as tf
dataset_config = "dataset_config.api_dataset.json"
dataset = IceNetDataSet(dataset_config, batch_size=4)
strategy = tf.distribute.get_strategy()
INFO:root:Loading configuration dataset_config.api_dataset.json
INFO:root:Training dataset path: ./network_datasets/api_dataset/south/train
INFO:root:Validation dataset path: ./network_datasets/api_dataset/south/val
INFO:root:Test dataset path: ./network_datasets/api_dataset/south/test
We can view the loaded dataset configuration.
dataset._config
{'identifier': 'api_dataset',
'implementation': 'DaskMultiWorkerLoader',
'channels': ['uas_abs_1',
'vas_abs_1',
'siconca_abs_1',
'zg250_anom_1',
'zg500_anom_1',
'cos_1',
'land_1',
'sin_1'],
'counts': {'train': 91, 'val': 21, 'test': 2},
'dtype': 'float32',
'loader_config': '/data/hpcdata/users/bryald/git/icenet/icenet/icenet-notebooks/loader.tutorial_api_data.json',
'missing_dates': [],
'n_forecast_days': 7,
'north': False,
'num_channels': 8,
'shape': [432, 432],
'south': True,
'dataset_path': './network_datasets/api_dataset',
'generate_workers': 8,
'loss_weight_days': True,
'output_batch_size': 4,
'var_lag': 1,
'var_lag_override': {}}
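Rather than reaching into the private _config dictionary, the values you typically need are also exposed as attributes on the dataset object. A minimal sketch, using only attributes that appear elsewhere in this notebook:
# Commonly used configuration values, via public attributes rather than _config
print(dataset.identifier)       # 'api_dataset'
print(dataset.shape)            # e.g. [432, 432]
print(dataset.num_channels)     # e.g. 8
print(dataset.n_forecast_days)  # e.g. 7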
The path to the loader configuration that was used to create this dataset is available as .loader_config:
dataset.loader_config
'/data/hpcdata/users/bryald/git/icenet/icenet/icenet-notebooks/loader.tutorial_api_data.json'
You can obtain the data loader that was used to create the dataset config via the .get_data_loader()
method:
dataset.get_data_loader()
INFO:root:Loading configuration /data/hpcdata/users/bryald/git/icenet/icenet/icenet-notebooks/loader.tutorial_api_data.json
<icenet.data.loaders.dask.DaskMultiWorkerLoader at 0x7f7eaa607110>
We can use train_model to train. Note that we can pass further arguments, including those that can be set with the icenet_train CLI, for example:
learning_rate
lr_10e_decay_fac
lr_decay_start
lr_decay_end
and several more.
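As an illustration only (we don't execute this variant here, and the values below are arbitrary assumptions rather than recommendations), a call that also tunes the learning-rate schedule might look like this:
# Hedged sketch: passing learning-rate schedule arguments through train_model.
# "api_lr_run" and the numeric values below are illustrative assumptions only.
from icenet.model.train import train_model

trained_path, history = train_model(
    run_name="api_lr_run",
    dataset=dataset,
    epochs=10,
    n_filters_factor=0.6,
    seed=42,
    strategy=strategy,
    learning_rate=1e-4,
    lr_10e_decay_fac=0.5,
    lr_decay_start=5,
    lr_decay_end=10,
)
The run we actually execute below sticks to the basic arguments.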
from icenet.model.train import train_model
trained_path, history = train_model(
run_name="api_test_run",
dataset=dataset,
epochs=10,
n_filters_factor=0.6,
seed=42,
strategy=strategy,
training_verbosity=2,
)
INFO:root:Creating network folder: ./results/networks/api_test_run
INFO:root:Adding tensorboard callback
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 432, 432, 8)] 0 []
conv2d (Conv2D) (None, 432, 432, 38) 2774 ['input_1[0][0]']
conv2d_1 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d[0][0]']
batch_normalization (Batch (None, 432, 432, 38) 152 ['conv2d_1[0][0]']
Normalization)
max_pooling2d (MaxPooling2 (None, 216, 216, 38) 0 ['batch_normalization[0][0]']
D)
conv2d_2 (Conv2D) (None, 216, 216, 76) 26068 ['max_pooling2d[0][0]']
conv2d_3 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_2[0][0]']
batch_normalization_1 (Bat (None, 216, 216, 76) 304 ['conv2d_3[0][0]']
chNormalization)
max_pooling2d_1 (MaxPoolin (None, 108, 108, 76) 0 ['batch_normalization_1[0][0]'
g2D) ]
conv2d_4 (Conv2D) (None, 108, 108, 152) 104120 ['max_pooling2d_1[0][0]']
conv2d_5 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_4[0][0]']
batch_normalization_2 (Bat (None, 108, 108, 152) 608 ['conv2d_5[0][0]']
chNormalization)
max_pooling2d_2 (MaxPoolin (None, 54, 54, 152) 0 ['batch_normalization_2[0][0]'
g2D) ]
conv2d_6 (Conv2D) (None, 54, 54, 152) 208088 ['max_pooling2d_2[0][0]']
conv2d_7 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_6[0][0]']
batch_normalization_3 (Bat (None, 54, 54, 152) 608 ['conv2d_7[0][0]']
chNormalization)
max_pooling2d_3 (MaxPoolin (None, 27, 27, 152) 0 ['batch_normalization_3[0][0]'
g2D) ]
conv2d_8 (Conv2D) (None, 27, 27, 304) 416176 ['max_pooling2d_3[0][0]']
conv2d_9 (Conv2D) (None, 27, 27, 304) 832048 ['conv2d_8[0][0]']
batch_normalization_4 (Bat (None, 27, 27, 304) 1216 ['conv2d_9[0][0]']
chNormalization)
up_sampling2d (UpSampling2 (None, 54, 54, 304) 0 ['batch_normalization_4[0][0]'
D) ]
conv2d_10 (Conv2D) (None, 54, 54, 152) 184984 ['up_sampling2d[0][0]']
concatenate (Concatenate) (None, 54, 54, 304) 0 ['batch_normalization_3[0][0]'
, 'conv2d_10[0][0]']
conv2d_11 (Conv2D) (None, 54, 54, 152) 416024 ['concatenate[0][0]']
conv2d_12 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_11[0][0]']
batch_normalization_5 (Bat (None, 54, 54, 152) 608 ['conv2d_12[0][0]']
chNormalization)
up_sampling2d_1 (UpSamplin (None, 108, 108, 152) 0 ['batch_normalization_5[0][0]'
g2D) ]
conv2d_13 (Conv2D) (None, 108, 108, 152) 92568 ['up_sampling2d_1[0][0]']
concatenate_1 (Concatenate (None, 108, 108, 304) 0 ['batch_normalization_2[0][0]'
) , 'conv2d_13[0][0]']
conv2d_14 (Conv2D) (None, 108, 108, 152) 416024 ['concatenate_1[0][0]']
conv2d_15 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_14[0][0]']
batch_normalization_6 (Bat (None, 108, 108, 152) 608 ['conv2d_15[0][0]']
chNormalization)
up_sampling2d_2 (UpSamplin (None, 216, 216, 152) 0 ['batch_normalization_6[0][0]'
g2D) ]
conv2d_16 (Conv2D) (None, 216, 216, 76) 46284 ['up_sampling2d_2[0][0]']
concatenate_2 (Concatenate (None, 216, 216, 152) 0 ['batch_normalization_1[0][0]'
) , 'conv2d_16[0][0]']
conv2d_17 (Conv2D) (None, 216, 216, 76) 104044 ['concatenate_2[0][0]']
conv2d_18 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_17[0][0]']
batch_normalization_7 (Bat (None, 216, 216, 76) 304 ['conv2d_18[0][0]']
chNormalization)
up_sampling2d_3 (UpSamplin (None, 432, 432, 76) 0 ['batch_normalization_7[0][0]'
g2D) ]
conv2d_19 (Conv2D) (None, 432, 432, 38) 11590 ['up_sampling2d_3[0][0]']
concatenate_3 (Concatenate (None, 432, 432, 76) 0 ['conv2d_1[0][0]',
) 'conv2d_19[0][0]']
conv2d_20 (Conv2D) (None, 432, 432, 38) 26030 ['concatenate_3[0][0]']
conv2d_21 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_20[0][0]']
conv2d_22 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_21[0][0]']
conv2d_23 (Conv2D) (None, 432, 432, 7) 273 ['conv2d_22[0][0]']
==================================================================================================
Total params: 3867077 (14.75 MB)
Trainable params: 3864873 (14.74 MB)
Non-trainable params: 2204 (8.61 KB)
__________________________________________________________________________________________________
INFO:root:Datasets: 23 train, 6 val and 1 test filenames
INFO:root:Reducing datasets to 1.0 of total files
INFO:root:Reduced: 23 train, 6 val and 1 test filenames
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 1/10
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739901260.411547 12285 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
Epoch 1: val_rmse improved from inf to 39.29171, saving model to ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/keras/src/engine/training.py:3103: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
saving_api.save_model(
23/23 - 94s - loss: 194.5975 - binacc: 46.6107 - mae: 26.9723 - rmse: 32.7170 - mse: 1648.0391 - val_loss: 280.6678 - val_binacc: 37.4605 - val_mae: 37.3215 - val_rmse: 39.2917 - val_mse: 1664.6821 - lr: 1.0000e-04 - 94s/epoch - 4s/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 2/10
Epoch 2: val_rmse improved from 39.29171 to 36.75743, saving model to ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
23/23 - 15s - loss: 34.5736 - binacc: 91.5352 - mae: 8.2429 - rmse: 13.7904 - mse: 886.3834 - val_loss: 245.6298 - val_binacc: 38.8811 - val_mae: 34.6514 - val_rmse: 36.7574 - val_mse: 1539.5328 - lr: 1.0000e-04 - 15s/epoch - 657ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 3/10
Epoch 3: val_rmse did not improve from 36.75743
23/23 - 15s - loss: 19.9743 - binacc: 95.2971 - mae: 5.1566 - rmse: 10.4819 - mse: 784.5455 - val_loss: 252.7097 - val_binacc: 37.7841 - val_mae: 35.3270 - val_rmse: 37.2834 - val_mse: 1663.5397 - lr: 1.0000e-04 - 15s/epoch - 635ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 4/10
Epoch 4: val_rmse did not improve from 36.75743
23/23 - 15s - loss: 17.0506 - binacc: 95.6285 - mae: 4.7101 - rmse: 9.6844 - mse: 778.6022 - val_loss: 245.6464 - val_binacc: 37.8139 - val_mae: 34.6528 - val_rmse: 36.7587 - val_mse: 1699.6475 - lr: 1.0000e-04 - 15s/epoch - 633ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 5/10
Epoch 5: val_rmse improved from 36.75743 to 35.42382, saving model to ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
23/23 - 15s - loss: 17.9778 - binacc: 95.6332 - mae: 4.8636 - rmse: 9.9443 - mse: 770.4973 - val_loss: 228.1295 - val_binacc: 37.9954 - val_mae: 33.4161 - val_rmse: 35.4238 - val_mse: 1611.3502 - lr: 1.0000e-04 - 15s/epoch - 647ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 6/10
Epoch 6: val_rmse improved from 35.42382 to 32.70583, saving model to ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
23/23 - 15s - loss: 16.9991 - binacc: 95.9539 - mae: 4.6282 - rmse: 9.6698 - mse: 855.1011 - val_loss: 194.4648 - val_binacc: 39.7058 - val_mae: 30.6694 - val_rmse: 32.7058 - val_mse: 1299.2209 - lr: 1.0000e-04 - 15s/epoch - 653ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 7/10
Epoch 7: val_rmse improved from 32.70583 to 32.09356, saving model to ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
23/23 - 15s - loss: 16.3363 - binacc: 96.2113 - mae: 4.5398 - rmse: 9.4794 - mse: 901.5573 - val_loss: 187.2520 - val_binacc: 38.0056 - val_mae: 30.0285 - val_rmse: 32.0936 - val_mse: 1676.7063 - lr: 1.0000e-04 - 15s/epoch - 664ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 8/10
Epoch 8: val_rmse improved from 32.09356 to 31.41574, saving model to ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
23/23 - 15s - loss: 13.6596 - binacc: 96.1464 - mae: 4.1590 - rmse: 8.6681 - mse: 931.1821 - val_loss: 179.4260 - val_binacc: 38.1194 - val_mae: 28.9165 - val_rmse: 31.4157 - val_mse: 1729.0111 - lr: 1.0000e-04 - 15s/epoch - 652ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 9/10
Epoch 9: val_rmse did not improve from 31.41574
23/23 - 15s - loss: 14.5424 - binacc: 95.9816 - mae: 4.1770 - rmse: 8.9438 - mse: 836.4677 - val_loss: 222.1769 - val_binacc: 37.5008 - val_mae: 31.4880 - val_rmse: 34.9586 - val_mse: 2159.8699 - lr: 1.0000e-04 - 15s/epoch - 641ms/step
INFO:root:
Setting learning rate to: 9.999999747378752e-05
Epoch 10/10
Epoch 10: val_rmse did not improve from 31.41574
23/23 - 15s - loss: 11.9712 - binacc: 96.4809 - mae: 3.8356 - rmse: 8.1147 - mse: 842.3392 - val_loss: 193.7969 - val_binacc: 37.6117 - val_mae: 29.4039 - val_rmse: 32.6496 - val_mse: 1835.7318 - lr: 1.0000e-04 - 15s/epoch - 638ms/step
INFO:root:Saving network to: ./results/networks/api_test_run/api_test_run.network_api_dataset.42.h5
INFO:tensorflow:Assets written to: ./results/networks/api_test_run/api_test_run.model_api_dataset.42/assets
Breaking train_model
apart, one can look at customising the training process itself programmatically. Here, we’ve reduced train_model
to its component parts with some notes about missing items (e.g. callbacks and wandb integration), to give some insight into how the training workflow is architected.
from icenet.data.dataset import IceNetDataSet
from icenet.model.models import unet_batchnorm
import icenet.model.losses as losses
import icenet.model.metrics as metrics
# train_model sets up wandb and attempts seeding here (see icenet#8 for issues around multi-GPU determinism)
seed = 45
os.environ['PYTHONHASHSEED'] = str(seed)
np.random.seed(seed)
random.seed(seed)
tf.random.set_seed(seed)
tf.keras.utils.set_random_seed(seed)
# initialise IceNetDataSet
ds = IceNetDataSet(dataset_config, batch_size=4)
input_shape = (*ds.shape, ds.num_channels)
train_ds, val_ds, test_ds = ds.get_split_datasets()
# train_model handles pickup runs/trained networks
run_name = "custom_run"
network_folder = os.path.join(".", "results", "networks", run_name)
if not os.path.exists(network_folder):
logging.info("Creating network folder: {}".format(network_folder))
os.makedirs(network_folder)
network_path = os.path.join(network_folder,
"{}.network_{}.{}.h5".format(run_name,
ds.identifier,
seed))
callbacks_list = list()
# train_model sets up various callbacks: early stopping, lr scheduler,
# checkpointing, wandb and tensorboard
with strategy.scope():
loss = losses.WeightedMSE()
metrics_list = [
metrics.WeightedMAE(),
metrics.WeightedRMSE(),
losses.WeightedMSE()
]
network = unet_batchnorm(
input_shape=input_shape,
loss=loss,
metrics=metrics_list,
learning_rate=1e-4,
filter_size=3,
n_filters_factor=0.6,
n_forecast_days=ds.n_forecast_days,
)
# train_model loads weights
network.summary()
model_history = network.fit(
train_ds,
epochs=5,
verbose=2,
callbacks=callbacks_list,
validation_data=val_ds,
max_queue_size=10,
)
logging.info("Saving network to: {}".format(network_path))
network.save_weights(network_path)
INFO:root:Loading configuration dataset_config.api_dataset.json
INFO:root:Training dataset path: ./network_datasets/api_dataset/south/train
INFO:root:Validation dataset path: ./network_datasets/api_dataset/south/val
INFO:root:Test dataset path: ./network_datasets/api_dataset/south/test
INFO:root:Datasets: 46 train, 12 val and 2 test filenames
INFO:root:Creating network folder: ./results/networks/custom_run
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) [(None, 432, 432, 8)] 0 []
conv2d_24 (Conv2D) (None, 432, 432, 38) 2774 ['input_2[0][0]']
conv2d_25 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_24[0][0]']
batch_normalization_8 (Bat (None, 432, 432, 38) 152 ['conv2d_25[0][0]']
chNormalization)
max_pooling2d_4 (MaxPoolin (None, 216, 216, 38) 0 ['batch_normalization_8[0][0]'
g2D) ]
conv2d_26 (Conv2D) (None, 216, 216, 76) 26068 ['max_pooling2d_4[0][0]']
conv2d_27 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_26[0][0]']
batch_normalization_9 (Bat (None, 216, 216, 76) 304 ['conv2d_27[0][0]']
chNormalization)
max_pooling2d_5 (MaxPoolin (None, 108, 108, 76) 0 ['batch_normalization_9[0][0]'
g2D) ]
conv2d_28 (Conv2D) (None, 108, 108, 152) 104120 ['max_pooling2d_5[0][0]']
conv2d_29 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_28[0][0]']
batch_normalization_10 (Ba (None, 108, 108, 152) 608 ['conv2d_29[0][0]']
tchNormalization)
max_pooling2d_6 (MaxPoolin (None, 54, 54, 152) 0 ['batch_normalization_10[0][0]
g2D) ']
conv2d_30 (Conv2D) (None, 54, 54, 152) 208088 ['max_pooling2d_6[0][0]']
conv2d_31 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_30[0][0]']
batch_normalization_11 (Ba (None, 54, 54, 152) 608 ['conv2d_31[0][0]']
tchNormalization)
max_pooling2d_7 (MaxPoolin (None, 27, 27, 152) 0 ['batch_normalization_11[0][0]
g2D) ']
conv2d_32 (Conv2D) (None, 27, 27, 304) 416176 ['max_pooling2d_7[0][0]']
conv2d_33 (Conv2D) (None, 27, 27, 304) 832048 ['conv2d_32[0][0]']
batch_normalization_12 (Ba (None, 27, 27, 304) 1216 ['conv2d_33[0][0]']
tchNormalization)
up_sampling2d_4 (UpSamplin (None, 54, 54, 304) 0 ['batch_normalization_12[0][0]
g2D) ']
conv2d_34 (Conv2D) (None, 54, 54, 152) 184984 ['up_sampling2d_4[0][0]']
concatenate_4 (Concatenate (None, 54, 54, 304) 0 ['batch_normalization_11[0][0]
) ',
'conv2d_34[0][0]']
conv2d_35 (Conv2D) (None, 54, 54, 152) 416024 ['concatenate_4[0][0]']
conv2d_36 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_35[0][0]']
batch_normalization_13 (Ba (None, 54, 54, 152) 608 ['conv2d_36[0][0]']
tchNormalization)
up_sampling2d_5 (UpSamplin (None, 108, 108, 152) 0 ['batch_normalization_13[0][0]
g2D) ']
conv2d_37 (Conv2D) (None, 108, 108, 152) 92568 ['up_sampling2d_5[0][0]']
concatenate_5 (Concatenate (None, 108, 108, 304) 0 ['batch_normalization_10[0][0]
) ',
'conv2d_37[0][0]']
conv2d_38 (Conv2D) (None, 108, 108, 152) 416024 ['concatenate_5[0][0]']
conv2d_39 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_38[0][0]']
batch_normalization_14 (Ba (None, 108, 108, 152) 608 ['conv2d_39[0][0]']
tchNormalization)
up_sampling2d_6 (UpSamplin (None, 216, 216, 152) 0 ['batch_normalization_14[0][0]
g2D) ']
conv2d_40 (Conv2D) (None, 216, 216, 76) 46284 ['up_sampling2d_6[0][0]']
concatenate_6 (Concatenate (None, 216, 216, 152) 0 ['batch_normalization_9[0][0]'
) , 'conv2d_40[0][0]']
conv2d_41 (Conv2D) (None, 216, 216, 76) 104044 ['concatenate_6[0][0]']
conv2d_42 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_41[0][0]']
batch_normalization_15 (Ba (None, 216, 216, 76) 304 ['conv2d_42[0][0]']
tchNormalization)
up_sampling2d_7 (UpSamplin (None, 432, 432, 76) 0 ['batch_normalization_15[0][0]
g2D) ']
conv2d_43 (Conv2D) (None, 432, 432, 38) 11590 ['up_sampling2d_7[0][0]']
concatenate_7 (Concatenate (None, 432, 432, 76) 0 ['conv2d_25[0][0]',
) 'conv2d_43[0][0]']
conv2d_44 (Conv2D) (None, 432, 432, 38) 26030 ['concatenate_7[0][0]']
conv2d_45 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_44[0][0]']
conv2d_46 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_45[0][0]']
conv2d_47 (Conv2D) (None, 432, 432, 7) 273 ['conv2d_46[0][0]']
==================================================================================================
Total params: 3867077 (14.75 MB)
Trainable params: 3864873 (14.74 MB)
Non-trainable params: 2204 (8.61 KB)
__________________________________________________________________________________________________
Epoch 1/5
46/46 - 57s - loss: 113.6602 - mae: 16.7237 - rmse: 25.0040 - mse: 1454.2063 - val_loss: 241.6398 - val_mae: 34.9990 - val_rmse: 36.4577 - val_mse: 1608.4023 - 57s/epoch - 1s/step
Epoch 2/5
46/46 - 27s - loss: 23.5520 - mae: 5.3973 - rmse: 11.3820 - mse: 1203.1927 - val_loss: 280.7432 - val_mae: 37.6099 - val_rmse: 39.2970 - val_mse: 1881.7152 - 27s/epoch - 593ms/step
Epoch 3/5
46/46 - 27s - loss: 17.2413 - mae: 4.5673 - rmse: 9.7385 - mse: 1206.6486 - val_loss: 188.9226 - val_mae: 30.6525 - val_rmse: 32.2364 - val_mse: 1439.7086 - 27s/epoch - 593ms/step
Epoch 4/5
46/46 - 27s - loss: 15.6058 - mae: 4.2761 - rmse: 9.2650 - mse: 1110.8224 - val_loss: 158.1187 - val_mae: 27.6734 - val_rmse: 29.4915 - val_mse: 1419.6378 - 27s/epoch - 592ms/step
Epoch 5/5
46/46 - 27s - loss: 12.4503 - mae: 3.7437 - rmse: 8.2755 - mse: 1169.8990 - val_loss: 83.3131 - val_mae: 19.0371 - val_rmse: 21.4073 - val_mse: 1318.4581 - 27s/epoch - 591ms/step
INFO:root:Saving network to: ./results/networks/custom_run/custom_run.network_api_dataset.45.h5
As can be seen, the training workflow is very standard for deep learning networks, with train_model and the CLI wrapping the core training loop together with a lot of customisable ancillary functionality.
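For example, the empty callbacks_list in the manual loop above could be populated with standard Keras callbacks in place of those train_model normally wires up for you. A minimal sketch, not the exact set train_model registers:
# Hedged sketch: plausible callbacks for the manual training loop above.
# train_model configures its own set (early stopping, LR scheduler, checkpointing,
# wandb, tensorboard); these are generic Keras equivalents for illustration.
callbacks_list = [
    tf.keras.callbacks.EarlyStopping(monitor="val_rmse", mode="min", patience=5),
    tf.keras.callbacks.ModelCheckpoint(filepath=network_path,
                                       monitor="val_rmse",
                                       mode="min",
                                       save_best_only=True,
                                       save_weights_only=True),
    tf.keras.callbacks.TensorBoard(log_dir=os.path.join(network_folder, "logs")),
]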
5. Predict#
In much the same manner as with train_model, the predict_forecast method acts as a convenient entry point for workflow system integration and CLI entry, as well as an overridable method upon which to base custom implementations. Using the method directly relies on loading from a prepared (but perhaps not cached) dataset.
Some parameters are fed to predict_forecast
that ideally shouldn’t need to be specified (like seed
and n_filters_factor
) and might seem contextually odd. They’re used to locate the appropriate saved network. This will be cleaned up in a future version.
from icenet.model.predict import predict_forecast
# Follows the naming convention used by the CLI version
output_dir = os.path.join(".", "results", "predict",
"custom_run_forecast",
"{}.{}".format(run_name, "42"))
predict_forecast(
dataset_config=dataset_config,
network_name=run_name,
n_filters_factor=0.6,
output_folder=output_dir,
seed=seed,
start_dates=[pd.to_datetime(el).date()
for el in pd.date_range("2020-04-01", "2020-04-02")],
test_set=True,
)
INFO:root:Loading configuration dataset_config.api_dataset.json
INFO:root:Training dataset path: ./network_datasets/api_dataset/south/train
INFO:root:Validation dataset path: ./network_datasets/api_dataset/south/val
INFO:root:Test dataset path: ./network_datasets/api_dataset/south/test
INFO:root:Loading configuration /data/hpcdata/users/bryald/git/icenet/icenet/icenet-notebooks/loader.tutorial_api_data.json
INFO:root:Loading model from ./results/networks/custom_run/custom_run.network_api_dataset.45.h5...
INFO:root:Datasets: 69 train, 18 val and 3 test filenames
INFO:root:Processing test batch 1, item 0 (date 2020-04-01)
INFO:root:Running prediction 2020-04-01
INFO:root:Saving 2020-04-01 - forecast output (1, 432, 432, 7)
INFO:root:Processing test batch 1, item 1 (date 2020-04-02)
INFO:root:Running prediction 2020-04-02
WARNING:root:./results/predict/custom_run_forecast/custom_run.42 output already exists
INFO:root:Saving 2020-04-02 - forecast output (1, 432, 432, 7)
Persisting and using these results is then up to the user; the three outputs correspond to what the CLI command normally saves to disk as individual files containing the numpy arrays.
The internals of the predict_forecast method are still undergoing some development, but it should be noted that this method can easily be overridden or called as part of a larger workflow. In particular, it's worth noting the importance of the test_set parameter.
Should test_set be true, the cached data generated in network_datasets is never used; instead the preprocessed data in processed is used directly. This actually makes the implementation of predict_forecast extremely simple compared with the alternative, due to some outstanding work to derive dates from the cached batched files.
As before, this is a revised implementation intended to illustrate the "non-test-set" use case, so several modifications are in place for notebook execution:
from icenet.data.dataset import IceNetDataSet
from icenet.model.models import unet_batchnorm
import tensorflow as tf
start_dates = [el.date() for el in pd.date_range("2020-04-01", "2020-04-02")]
# initialise IceNetDataSet and obtain data loader used to generate the dataset config
ds = IceNetDataSet(dataset_config, batch_size=4)
dl = ds.get_data_loader()
logging.info("Generating forecast inputs from processed/ files")
# generate samples for prediction
forecast_inputs, gen_outputs, sample_weights = \
list(zip(*[dl.generate_sample(date, prediction=True) for date in start_dates]))
network_folder = os.path.join(".", "results", "networks", "custom_run")
dataset_name = ds.identifier
network_path = os.path.join(network_folder,
"{}.network_{}.{}.h5".format(run_name,
"api_dataset",
seed))
logging.info("Loading model from {}...".format(network_path))
network = unet_batchnorm(
(*ds.shape, dl.num_channels),
[],
[],
n_filters_factor=0.6,
n_forecast_days=ds.n_forecast_days
)
network.load_weights(network_path)
predictions = []
for i, net_input in enumerate(forecast_inputs):
logging.info("Running prediction {} - {}".format(i, start_dates[i]))
pred = network(tf.convert_to_tensor([net_input]), training=False)
predictions.append(pred)
INFO:root:Loading configuration dataset_config.api_dataset.json
INFO:root:Training dataset path: ./network_datasets/api_dataset/south/train
INFO:root:Validation dataset path: ./network_datasets/api_dataset/south/val
INFO:root:Test dataset path: ./network_datasets/api_dataset/south/test
INFO:root:Loading configuration /data/hpcdata/users/bryald/git/icenet/icenet/icenet-notebooks/loader.tutorial_api_data.json
INFO:root:Generating forecast inputs from processed/ files
INFO:root:Loading model from ./results/networks/custom_run/custom_run.network_api_dataset.45.h5...
INFO:root:Running prediction 0 - 2020-04-01
INFO:root:Running prediction 1 - 2020-04-02
print(f"Predictions: {len(predictions)}, shape {predictions[0].shape}")
print(f"Generated outputs: {len(gen_outputs)}, shape {gen_outputs[0].shape}")
print(f"Sample weights: {len(sample_weights)}, shape {sample_weights[0].shape}")
Predictions: 2, shape (1, 432, 432, 7)
Generated outputs: 2, shape (432, 432, 7, 1)
Sample weights: 2, shape (432, 432, 7, 1)
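If you want to keep these in-memory results, one option is simply to dump each date's arrays to .npy files. A minimal sketch (the directory layout and filenames here are illustrative, not the exact convention used by the CLI):
# Hedged sketch: save the raw arrays per forecast date.
manual_output_dir = os.path.join(".", "results", "predict", "manual_forecast")
os.makedirs(manual_output_dir, exist_ok=True)

for date, pred, obs, weights in zip(start_dates, predictions, gen_outputs, sample_weights):
    np.save(os.path.join(manual_output_dir, "{}_pred.npy".format(date)), np.array(pred))
    np.save(os.path.join(manual_output_dir, "{}_obs.npy".format(date)), obs)
    np.save(os.path.join(manual_output_dir, "{}_weights.npy".format(date)), weights)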
To generate a CF-compliant NetCDF containing the requested forecasts we need to run icenet_output; the resulting file can then be post-processed. As an input to this command, we need to provide a csv file containing the test dates. In this example, we use printf to generate the required file.
!printf "2020-04-01\n2020-04-02" | tee predict_dates.csv
2020-04-01
2020-04-02
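The same file can of course be written from Python if you prefer not to shell out; a trivial equivalent of the printf call above:
# Write the forecast dates file from Python (equivalent to the printf above)
with open("predict_dates.csv", "w") as fh:
    fh.write("\n".join(["2020-04-01", "2020-04-02"]))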
!icenet_output -m -o ./results/predict custom_run_forecast api_dataset predict_dates.csv
[18-02-25 18:01:45 :INFO ] - Loading configuration ./dataset_config.api_dataset.json
[18-02-25 18:01:45 :INFO ] - Training dataset path: ./network_datasets/api_dataset/south/train
[18-02-25 18:01:45 :INFO ] - Validation dataset path: ./network_datasets/api_dataset/south/val
[18-02-25 18:01:45 :INFO ] - Test dataset path: ./network_datasets/api_dataset/south/test
[18-02-25 18:01:46 :INFO ] - Post-processing 2020-04-01
[18-02-25 18:01:46 :INFO ] - Post-processing 2020-04-02
[18-02-25 18:01:46 :INFO ] - Dataset arr shape: (2, 432, 432, 7, 2)
[18-02-25 18:01:46 :INFO ] - Applying active grid cell masks
[18-02-25 18:01:46 :INFO ] - Land masking the forecast output
[18-02-25 18:01:46 :INFO ] - Applying zeros to land mask
[18-02-25 18:01:46 :INFO ] - Saving to ./results/predict/custom_run_forecast.nc
6. Visualisation#
Now that we have a prediction, we can visualise the sea ice concentration forecast using some of the built-in tools in IceNet that utilise cartopy and matplotlib.
(Note: There are also some scripts in the icenet-pipeline repository that enable plotting common results such as produce_op_assets.sh
)
Here, we are loading the prediction netCDF file we’ve just created in the previous step.
We are also using the Masks class from IceNet to obtain a land mask, which is used to mask out land regions in the forecast plot.
from icenet.plotting.video import xarray_to_video as xvid
from icenet.data.sic.mask import Masks
from IPython.display import HTML, display
import xarray as xr, pandas as pd, datetime as dt
def plot_result(file_path):
# Load our output prediction file
ds = xr.open_dataset(file_path)
# Get land mask to mask these regions in final plot
land_mask = Masks(south=True, north=False).get_land_mask()
# We obtain the start date of the forecast we would like to plot
forecast_date = ds.time.values[0]
print(forecast_date)
fc = ds.sic_mean.isel(time=0).drop_vars("time").rename(dict(leadtime="time"))
fc['time'] = [pd.to_datetime(forecast_date) \
+ dt.timedelta(days=int(e)) for e in fc.time.values]
anim = xvid(fc, 15, figsize=(4, 4), mask=land_mask, north=False, south=True)
display(HTML(anim.to_jshtml()))
Now, we can use the built in plotting tool to visualise our forecast.
plot_result("results/predict/custom_run_forecast.nc")
INFO:root:Inspecting data
INFO:root:Initialising plot
2020-04-01T00:00:00.000000000
INFO:root:Animating
INFO:root:Not saving plot, will return animation
INFO:matplotlib.animation:Animation.save using <class 'matplotlib.animation.HTMLWriter'>
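If you just want a quick static view rather than an animation, the same NetCDF can be plotted directly with xarray and matplotlib. A minimal sketch, using the variable names from the file produced above:
import xarray as xr
import matplotlib.pyplot as plt

# Static plot of the first forecast date at the first lead time
with xr.open_dataset("results/predict/custom_run_forecast.nc") as pred_ds:
    sic = pred_ds.sic_mean.isel(time=0, leadtime=0)
    sic.plot()  # xarray renders the 2D field as a pcolormesh with a colourbar
    plt.title("SIC forecast initialised {}".format(str(pred_ds.time.values[0])[:10]))
    plt.show()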
Summary#
This notebook has attempted to illustrate the API equivalents of the CLI workflow, as well as highlight the flexibility of integrating with the library directly. Ultimately, library usage is the way to achieve truly novel and flexible applications; the CLI is a convenience for running existing pipelines without having to manually implement complex scripts.
The key to leveraging the benefits of both interfaces is to consider the following workflow:
Get your environment(s) set up, be they research, development or production
Use the existing CLI implementations to seed the data stores and get baseline networks operational
Start to customise the existing operations via custom calls to the API, for example by downloading new variables or adding extra analysis to training/prediction runs
If researching, consider extending the functionality of the API to include revised or completely new implementations, such as additional data sources (see 05.library_extension.ipynb in icenet-ai/icenet-notebooks)
This last point brings us to the topic of the last of the introductory notebooks.
Version#
IceNet Codebase: v0.2.9