1: IceNet Basic Command-Line Usage#
Context#
Purpose#
The IceNet library provides the ability to download, process, train and predict from end to end. Users can interact with IceNet either via the python interface (see notebook 3: library usage) or via a set of command-line interface (CLI) tools which provide a high-level entry point.
This notebook illustrates the CLI utilities available natively from the library for testing and producing operational forecasts. Via this interface, users can specify data inputs, process data, train models, use them for predictions and process the outputs.
Modelling approach#
This modelling approach allows users to immediately utilise the library for producing sea ice concentration forecasts.
Highlights#
The key stages of an end to end run are:
1. Set up the environment and project structure.
2. Download sea ice concentration data as training data.
3. Process downloaded data, and generate cached datasets to speed up training.
4. Train the neural network and generate checkpoint and model output.
5. Predict for defined dates.
6. Visualise the prediction output.
Contributions#
Notebook#
James Byrne (author)
Bryn Noel Ubald (co-author)
David Wilby (co-author)
Please raise issues in this repository to suggest updates to this notebook!
Contact me at jambyr <at> bas.ac.uk for anything else…
Modelling codebase#
James Byrne (code author), Bryn Noel Ubald (code author), Tom Andersson (science author)
Modelling publications#
Andersson, T.R., Hosking, J.S., Pérez-Ortiz, M. et al. Seasonal Arctic sea ice forecasting with probabilistic deep learning. Nat Commun 12, 5124 (2021). https://doi.org/10.1038/s41467-021-25257-4
Involved organisations#
The Alan Turing Institute and British Antarctic Survey
1. Setup#
Prerequisites#
In order to execute the IceNet CLI tools in this notebook you will need:
- An internet connection for downloading the source data at the beginning of the notebook,
- A suitable place to run this jupyter notebook, such as:
  - running jupyter notebook or jupyter lab on your computer (see the jupyter project page for more),
  - a jupyterhub instance,
  - a development environment such as visual studio code which can run jupyter notebooks (Note: for vscode, you'll need to install ipykernel in your conda environment later on), or
  - a Google colab instance.
- A working installation of conda,
- GPUs for training (due to the size of the network, it is unrealistic to try training on CPU), though they are not required for predictions,
- Knowledge of Git, python and shell (links to Carpentries courses on these topics).
There are a few external facilities that we interface with, which you will need to set up if you haven't already:
- Data sources under Climate and Sea Ice Data, including an account and API token for the Climate Data Store (detailed later),
- Wandb (Weights and Biases), which can optionally be used during training for monitoring.
We’ll assume that you’re running in a local copy of icenet-notebooks for this tutorial, and that one directory up we can deposit other repositories and folders. If you already have some previous IceNet data available (as we do in ../data) then you can symlink to it using ln -s ../data. The reason for this is described further below, as is the creation of this folder if it doesn’t exist.
e.g.
my-icenet-project/
├── data/
└── icenet-notebooks/ <--- we're in here!
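If you're starting from scratch, a minimal shell sketch of this layout (the my-icenet-project name is just the example above; the repository URL is an assumption, so adjust to wherever you obtained the notebooks):
# create the shared project and data directories
mkdir -p my-icenet-project/data
cd my-icenet-project
# clone the notebooks alongside the data directory (URL assumed)
git clone https://github.com/icenet-ai/icenet-notebooks.git
cd icenet-notebooks
# symlink to the shared data directory so downloads are reused
ln -s ../data data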
Installation: Environment Configuration#
Conda install from scratch#
We recommend running IceNet (or any python code) in a virtual environment. Here we will use conda to create a virtual environment containing python and icenet.
First create a conda environment if you don’t have one already. In a shell (not in this notebook), run:
conda create -n icenet python=3.11
which creates an environment named icenet and installs python 3.11 within it. Follow the prompts at your terminal to complete creation of the environment.
Alternatively, the icenet library has an environment.yml file which can be used to create a conda environment with all the necessary packages and their versions installed. To do this, either download the file, or clone the repository and run:
conda env create --file environment.yml -n icenet
Activate your environment:
conda activate icenet
Check that your environment has activated correctly:
which python
should return a path to a python installation corresponding to your new environment (e.g. it should say icenet in it somewhere).
If you do not have netcdf4 installed in your system, run:
conda install -c conda-forge "netcdf4<1.6.1"
Use pip to install icenet from the Python Package Index (PyPI):
pip install icenet
which should install the IceNet package and its dependencies.
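As a quick sanity check (a sketch; assumes the install completed and the icenet environment is active), confirm the package imports and that the CLI entry points are on your PATH:
python -c "import icenet"
which icenet_train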
Commands#
Once the icenet library is installed, you’ll be able to access all commands made available by the library. Some are utilities that won’t be covered, but typing icenet_ and pressing <TAB> for shell completion should show a list that includes (but is not limited to):
icenet_data_cmip
icenet_data_era5
icenet_data_hres
icenet_data_masks
icenet_data_sic
icenet_dataset_create
icenet_output
icenet_predict
icenet_process_cmip
icenet_process_era5
icenet_process_hres
icenet_process_metadata
icenet_process_sic
icenet_train
icenet_plot_forecast
All of these commands are either directly or indirectly (through pipeline shell scripts) used in this notebook…
All commands accept common options, such as -v for turning on verbose logging and -h for printing help about the options they offer.
CLI vs Library vs Pipeline usage#
The IceNet package is designed to support automated runs from end to end by exposing the above CLI operations. These are simple wrappers around the library itself, and any step of this can be undertaken manually or programmatically by inspecting the relevant endpoints.
IceNet can be run in a number of ways: from the command line, the python interface, or as a pipeline.
The rule of thumb to follow:
- Use the pipeline repository if you want to run the end-to-end IceNet processing out of the box.
- Adapt or customise this process using the icenet_* commands described in this notebook and in the scripts contained in the pipeline repo.
- For ultimate customisation, you can interact with the IceNet repository programmatically (which is how the CLI commands operate). For more information look at the IceNet CLI implementations and the library notebook, along with the library documentation.
2. Download#
Now we can get started, with the first step of downloading the data.
Mask data#
IceNet relies on some generated masks for training/prediction, which can be produced automatically using icenet_data_masks {north,south}; this downloads and processes the data required. Once this has been run, it does not need to be run again since the mask files are stored on disk (the masks vary month to month, but the same twelve monthly masks are reused across years).
!icenet_data_masks south
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_01.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_02.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_03.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_04.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_05.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_06.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_07.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_08.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_09.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_10.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_11.npy, already exists
[18-02-25 14:22:08 :INFO ] - Skipping ./data/masks/south/masks/active_grid_cell_mask_12.npy, already exists
This command creates the following directories/files.
Directory structure
icenet-notebooks/
└── data/masks/south/
├── masks/
│ ├── active_grid_cell_mask_01.npy <--- Mask for the active regions to consider for each month (This is for Jan)
│ ├── active_grid_cell_mask_02.npy <--- Mask for Feb
│ ├── ...
│ ├── check.py
│ ├── land_mask.npy <--- This masks the land regions
│ └── masks.params <--- This stores details relating to the "polar hole"
└── siconca/ <--- These are temporarily downloaded data used to generate the above masks
└── 2000/
├── 01/
│ └── ice_conc_sh_ease2-250_cdr-v2p0_200001021200.nc
├── 02/
│ └── ice_conc_sh_ease2-250_cdr-v2p0_200002021200.nc
└── .../
└── .../
Note: The output data structure in its entirety for all parts of the IceNet library is covered in the third notebook (03.data_and_forecasts.ipynb).
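As a quick check that the masks were generated correctly (a sketch; numpy is available as an IceNet dependency, and the path follows the tree above):
# print the shape and dtype of the land mask generated above
python -c "import numpy as np; m = np.load('data/masks/south/masks/land_mask.npy'); print(m.shape, m.dtype)"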
Climate and Sea Ice Data#
Obtaining and preparing data is simply achieved using icenet_data_* commands (you need to configure the CDS API token yourself - see here for some instructions on registering and on how to use the CDS API), which share the common arguments hemisphere, start_date and end_date. There are also implementation-specific options worth reviewing under --help. We specify the variables and levels via these commands.
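For reference, the CDS API client reads its credentials from a ~/.cdsapirc file in your home directory; a minimal sketch is below, though the exact url and key format depend on your CDS account, so check the registration instructions linked above:
# ~/.cdsapirc (values are placeholders)
url: https://cds.climate.copernicus.eu/api
key: <your-personal-access-token>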
Please ignore “NOT IMPLEMENTED YET”; this indicates that the commands do not check before overwriting files.
The -d (--dont-delete) flag keeps the intermediate downloaded files on disk, so they don’t have to be downloaded again each time.
Even small date ranges like this can take a while to retrieve (each variable in this case, for four months, is 3GB, so it may take up to an hour). Please refer to the CDS requests page to monitor ERA5 downloads…
icenet_data_era5 downloads ERA5 (the fifth-generation European Centre for Medium-Range Weather Forecasts reanalysis) data. For more information on the ERA5 data, see the Copernicus page.
We can use the --help flag for the command line tools to print the help text and explanation of the options. Some of these help commands can take up to a minute to run, so don’t worry if you have to wait a moment.
# Please note that on some systems running the help commands can take 30 seconds or more the first time it is run.
!icenet_data_era5 --help
usage: icenet_data_era5 [-h] [-c {cdsapi}] [-w WORKERS] [-po] [-d] [-v]
[--vars VARS] [--levels LEVELS] [-n] [-p]
{north,south} start_date end_date
positional arguments:
{north,south}
start_date
end_date
options:
-h, --help show this help message and exit
-c {cdsapi}, --choice {cdsapi}
-w WORKERS, --workers WORKERS
-po, --parallel-opens
Allow xarray mfdataset to work with parallel opens
-d, --dont-delete
-v, --verbose
--vars VARS Comma separated list of vars
--levels LEVELS Comma separated list of pressures/depths as needed,
use zero length string if None (e.g. ',,500,,,') and
pipes for multiple per var (e.g. ',,250|500,,'
-n, --do-not-download
-p, --do-not-postprocess
The --vars flag is used for specifying the variables from ERA5 that we want, as follows:
- uas: 10 metre U wind component,
- vas: 10 metre V wind component,
- zg: Geopotential height.
These are the three we’ll use here, though others are available.
--levels specifies the levels requested. Here we use the string ,,500|250 to request no level (None) for our first two variables (uas and vas) and the 500 and 250 pressure levels for our zg variable, using the pipe syntax 500|250.
Finally, we pass start and end dates for our query. These are in the format yyyy-mm-dd, though for single digits you can omit the leading 0, so both of these are equivalent and valid: 2020-1-1 or 2020-01-01.
!icenet_data_era5 south -d --vars uas,vas,zg --levels ',,500|250' 2020-1-1 2020-4-30
[18-02-25 14:22:50 :INFO ] - ERA5 Data Downloading
[18-02-25 14:22:50 :WARNING ] - !!! Deletions of temp files are switched off: be careful with this, you need to manage your files manually
2025-02-18 14:22:50,878 INFO [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
[18-02-25 14:22:50 :INFO ] - [2024-09-26T00:00:00] Watch our [Forum](https://forum.ecmwf.int/) for Announcements, news and other discussed topics.
2025-02-18 14:22:50,879 WARNING [2024-06-16T00:00:00] CDS API syntax is changed and some keys or parameter names may have also changed. To avoid requests failing, please use the "Show API request code" tool on the dataset Download Form to check you are using the correct syntax for your API request.
[18-02-25 14:22:50 :WARNING ] - [2024-06-16T00:00:00] CDS API syntax is changed and some keys or parameter names may have also changed. To avoid requests failing, please use the "Show API request code" tool on the dataset Download Form to check you are using the correct syntax for your API request.
[18-02-25 14:22:50 :INFO ] - Building request(s), downloading and daily averaging from ERA5 API
[18-02-25 14:22:50 :INFO ] - Processing single download for uas @ None with 121 dates
[18-02-25 14:22:50 :INFO ] - Processing single download for vas @ None with 121 dates
[18-02-25 14:22:50 :INFO ] - Processing single download for zg @ 500 with 121 dates
[18-02-25 14:22:50 :INFO ] - Processing single download for zg @ 250 with 121 dates
[18-02-25 14:22:51 :INFO ] - No requested dates remain, likely already present
[18-02-25 14:22:51 :INFO ] - No requested dates remain, likely already present
[18-02-25 14:22:51 :INFO ] - No requested dates remain, likely already present
[18-02-25 14:22:51 :INFO ] - No requested dates remain, likely already present
[18-02-25 14:22:51 :INFO ] - 0 daily files downloaded
[18-02-25 14:22:51 :INFO ] - No regrid batches to processing, moving on...
[18-02-25 14:22:51 :INFO ] - Rotating wind data prior to merging
[18-02-25 14:22:52 :INFO ] - Rotating wind data in ./data/era5/south/uas ./data/era5/south/vas
[18-02-25 14:22:52 :INFO ] - 0 files for uas
[18-02-25 14:22:52 :INFO ] - 0 files for vas
icenet_data_sic downloads the Sea Ice Concentration (SIC) data from the Ocean and Sea Ice Satellite Application Facility (OSI SAF).
!icenet_data_sic --help
usage: icenet_data_sic [-h] [-w WORKERS] [-po] [-d] [-v] [-u]
[-c SIC_CHUNKING_SIZE] [-dt DASK_TIMEOUTS]
[-dp DASK_PORT]
{north,south} start_date end_date
positional arguments:
{north,south}
start_date
end_date
options:
-h, --help show this help message and exit
-w WORKERS, --workers WORKERS
-po, --parallel-opens
Allow xarray mfdataset to work with parallel opens
-d, --dont-delete
-v, --verbose
-u, --use-dask
-c SIC_CHUNKING_SIZE, --sic-chunking-size SIC_CHUNKING_SIZE
-dt DASK_TIMEOUTS, --dask-timeouts DASK_TIMEOUTS
-dp DASK_PORT, --dask-port DASK_PORT
To run icenet_data_sic you will need the eccodes package installed. If you have used pip to install IceNet, you will need to use conda to install eccodes by running conda install -c conda-forge eccodes at the command line. Alternatively, ECMWF provide instructions for installing eccodes here.
Here we pass the two positional dates bounding the data range to be downloaded, in YYYY-M-D format, again using the -d option to keep the intermediate files.
!icenet_data_sic south -d 2020-1-1 2020-4-30
[18-02-25 14:54:55 :INFO ] - OSASIF-SIC Data Downloading
[18-02-25 14:54:55 :INFO ] - Downloading SIC datafiles to .temp intermediates...
[18-02-25 14:55:08 :INFO ] - Saving ./data/osisaf/south/siconca/2020.nc
[18-02-25 14:55:13 :INFO ] - Opening for interpolation: ['./data/osisaf/south/siconca/2020.nc']
[18-02-25 14:55:13 :INFO ] - Processing 0 missing dates
By default, the IceNet commands regrid and rotate data as required to align with the OSISAF SIC data, which is used as the output for the dataset. Programmatic usage allows you to avoid this (see 03.library_usage).
The following downloaders are available:
- icenet_data_era5 - downloads ERA5 reanalysis data using either the CDS Toolbox or direct API
- icenet_data_oras5 - downloads ORAS5 reanalysis data (Ocean Reanalysis)
- icenet_data_cmip - downloads the prescribed experiments from CMIP6 for the original IceNet paper runs
- icenet_data_hres - downloads up-to-date forecast data generated via the ECMWF MARS API
- icenet_data_sic - downloads OSISAF sea-ice concentration (SIC) data
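A quick way to confirm what the downloaders have deposited is to list the data tree (a sketch; paths as per the log output above):
ls data/era5/south/            # one directory per ERA5 variable
ls data/osisaf/south/siconca/  # yearly SIC netCDF files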
3. Process#
Processing takes the data made available through the source data store and undertakes the necessary normalisation for use as input channels to the UNet architecture. This intermediary step means that the original source data can be reused numerous times with varying training, validation and test date setups.
Command example#
!icenet_process_era5 --help
usage: icenet_process_era5 [-h] [-ns TRAIN_START] [-ne TRAIN_END]
[-vs VAL_START] [-ve VAL_END] [-ts TEST_START]
[-te TEST_END] [-l LAG] [-f FORECAST] [-po]
[--abs ABS] [--anom ANOM] [--trends TRENDS]
[--trend-lead TREND_LEAD] [-r REF] [-v]
[-u UPDATE_KEY]
name {north,south}
positional arguments:
name
{north,south}
options:
-h, --help show this help message and exit
-ns TRAIN_START, --train_start TRAIN_START
-ne TRAIN_END, --train_end TRAIN_END
-vs VAL_START, --val_start VAL_START
-ve VAL_END, --val_end VAL_END
-ts TEST_START, --test-start TEST_START
-te TEST_END, --test-end TEST_END
-l LAG, --lag LAG
-f FORECAST, --forecast FORECAST
-po, --parallel-opens
Allow xarray mfdataset to work with parallel opens
--abs ABS Comma separated list of abs vars
--anom ANOM Comma separated list of abs vars
--trends TRENDS Comma separated list of abs vars
--trend-lead TREND_LEAD
Time steps in the future for linear trends
-r REF, --ref REF Reference loader for normalisations etc
-v, --verbose
-u UPDATE_KEY, --update-key UPDATE_KEY
Add update key to processor to avoid overwriting
defaultentries in the loader configuration
These commands take the following positional arguments:

| argument | description | value |
|---|---|---|
| NAME | Processed data output name | tutorial_data |
| HEMISPHERE | Hemisphere(s) the processed data covers | south |
This outputs the processed files into the processed/tutorial_data directory and creates a loader file called loader.tutorial_data.json. Each of these process commands updates the loader file with the corresponding information on how the processed data was generated, as a form of data lineage.
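Once the commands below have run, you can peek at the loader file to see this lineage (a sketch using the standard library JSON pretty-printer):
!python -m json.tool loader.tutorial_data.json | head -n 20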
!icenet_process_era5 tutorial_data south \
-ns 2020-1-1 -ne 2020-3-31 -vs 2020-4-3 -ve 2020-4-23 -ts 2020-4-1 -te 2020-4-2 \
-l 1 --abs uas,vas --anom zg500,zg250
!icenet_process_sic tutorial_data south \
-ns 2020-1-1 -ne 2020-3-31 -vs 2020-4-1 -ve 2020-4-20 -ts 2020-4-1 -te 2020-4-2 \
-l 1 --abs siconca
!icenet_process_metadata tutorial_data south
[18-02-25 14:57:22 :INFO ] - Got 91 dates for train
[18-02-25 14:57:22 :INFO ] - Got 21 dates for val
[18-02-25 14:57:22 :INFO ] - Got 2 dates for test
[18-02-25 14:57:22 :INFO ] - Creating path: ./processed/tutorial_data/era5
[18-02-25 14:57:22 :INFO ] - Processing 91 dates for train category
[18-02-25 14:57:22 :INFO ] - Including lag of 1 days
[18-02-25 14:57:22 :INFO ] - Including lead of 93 days
[18-02-25 14:57:22 :INFO ] - Processing 21 dates for val category
[18-02-25 14:57:22 :INFO ] - Including lag of 1 days
[18-02-25 14:57:22 :INFO ] - Including lead of 93 days
[18-02-25 14:57:22 :INFO ] - Processing 2 dates for test category
[18-02-25 14:57:22 :INFO ] - Including lag of 1 days
[18-02-25 14:57:22 :INFO ] - Including lead of 93 days
[18-02-25 14:57:22 :INFO ] - Got 2 files for psl
[18-02-25 14:57:22 :INFO ] - Got 2 files for ta500
[18-02-25 14:57:22 :INFO ] - Got 2 files for tas
[18-02-25 14:57:22 :INFO ] - Got 2 files for tos
[18-02-25 14:57:22 :INFO ] - Got 2 files for uas
[18-02-25 14:57:22 :INFO ] - Got 2 files for vas
[18-02-25 14:57:22 :INFO ] - Got 2 files for zg250
[18-02-25 14:57:22 :INFO ] - Got 2 files for zg500
[18-02-25 14:57:22 :INFO ] - Opening files for uas
[18-02-25 14:57:23 :INFO ] - Filtered to 731 units long based on configuration requirements
[18-02-25 14:57:26 :INFO ] - Normalising uas
[18-02-25 14:57:27 :INFO ] - Opening files for vas
[18-02-25 14:57:27 :INFO ] - Filtered to 731 units long based on configuration requirements
[18-02-25 14:57:29 :INFO ] - Normalising vas
[18-02-25 14:57:30 :INFO ] - Opening files for zg500
[18-02-25 14:57:30 :INFO ] - Filtered to 731 units long based on configuration requirements
[18-02-25 14:57:30 :INFO ] - Generating climatology ./processed/tutorial_data/era5/south/params/climatology.zg500
[18-02-25 14:57:31 :WARNING ] - We don't have a full climatology (1,2,3) compared with data (1,2,3,4,5,6,7,8,9,10,11,12)
[18-02-25 14:57:32 :INFO ] - Normalising zg500
[18-02-25 14:57:34 :INFO ] - Opening files for zg250
[18-02-25 14:57:34 :INFO ] - Filtered to 731 units long based on configuration requirements
[18-02-25 14:57:34 :INFO ] - Generating climatology ./processed/tutorial_data/era5/south/params/climatology.zg250
[18-02-25 14:57:34 :WARNING ] - We don't have a full climatology (1,2,3) compared with data (1,2,3,4,5,6,7,8,9,10,11,12)
[18-02-25 14:57:35 :INFO ] - Normalising zg250
[18-02-25 14:57:37 :INFO ] - Writing configuration to ./loader.tutorial_data.json
[18-02-25 14:57:41 :INFO ] - Got 91 dates for train
[18-02-25 14:57:41 :INFO ] - Got 20 dates for val
[18-02-25 14:57:41 :INFO ] - Got 2 dates for test
[18-02-25 14:57:41 :INFO ] - Creating path: ./processed/tutorial_data/osisaf
[18-02-25 14:57:41 :INFO ] - Processing 91 dates for train category
[18-02-25 14:57:41 :INFO ] - Including lag of 1 days
[18-02-25 14:57:41 :INFO ] - Including lead of 93 days
[18-02-25 14:57:41 :INFO ] - No data found for 2019-12-31, outside data boundary perhaps?
[18-02-25 14:57:41 :INFO ] - Processing 20 dates for val category
[18-02-25 14:57:41 :INFO ] - Including lag of 1 days
[18-02-25 14:57:41 :INFO ] - Including lead of 93 days
[18-02-25 14:57:41 :INFO ] - Processing 2 dates for test category
[18-02-25 14:57:41 :INFO ] - Including lag of 1 days
[18-02-25 14:57:41 :INFO ] - Including lead of 93 days
[18-02-25 14:57:41 :INFO ] - Got 1 files for siconca
[18-02-25 14:57:41 :INFO ] - Opening files for siconca
[18-02-25 14:57:42 :INFO ] - Filtered to 121 units long based on configuration requirements
[18-02-25 14:57:42 :INFO ] - No normalisation for siconca
[18-02-25 14:57:43 :INFO ] - Loading configuration ./loader.tutorial_data.json
[18-02-25 14:57:43 :INFO ] - Writing configuration to ./loader.tutorial_data.json
[18-02-25 14:57:46 :INFO ] - Creating path: ./processed/tutorial_data/meta
[18-02-25 14:57:47 :INFO ] - Loading configuration ./loader.tutorial_data.json
[18-02-25 14:57:47 :INFO ] - Writing configuration to ./loader.tutorial_data.json
Consulting the command options (via --help) will make the above more obvious, and reveal further options, but a few things worth noting:
- The options -ns, -ne, -vs, -ve, -ts and -te, which correspond to the training, validation and test sets, allow comma-delimited ranges, so split (non-contiguous) sets can be produced; for a hypothetical example see the sketch after this list. The single ranges in the above example span the first four months of 2020.
  - -ns specifies the start of the training set, -ne specifies the end.
  - -vs specifies the start of the validation set, -ve specifies the end.
  - -ts specifies the start of the test set, -te specifies the end.
- These date ranges can be randomised and subsampled using -d, though this is still a bit experimental.
- The -l option (which is short for --lag) specifies the number of days back we look at input data variables for the output in question.
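As a hedged illustration of a comma-delimited split (not executed here; it assumes the start/end ranges pair up positionally as described above), a training set covering only January and March might be requested as:
# split training range: Jan and Mar 2020; val and test as before
icenet_process_era5 tutorial_data south \
    -ns 2020-1-1,2020-3-1 -ne 2020-1-31,2020-3-31 \
    -vs 2020-4-3 -ve 2020-4-23 -ts 2020-4-1 -te 2020-4-2 \
    -l 1 --abs uas,vas --anom zg500,zg250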
There are plenty of other options available for preprocessing the data, but it should be noted that whilst this is not strongly coupled to dataset creation, options like the lag specified here might influence the creation of datasets in the next step.
These commands, especially with decadal ranges, can take a long time (12+ hours) to complete depending on the hosts/storage in use.
Dataset creation#
Once the above preprocessing is taken care of, datasets can easily be created as follows. This operation creates a cached dataset in the filesystem that can be fed in for training runs.
The common options used here:
- -fd allows us to specify how far forward to forecast. For this example we're limiting it to 7 days, based on the limited amount of SIC ground truth data we downloaded.
- -l is the lag, as in the preprocessing stage. If experimenting and using full date ranges, creating a dataset with a different lag can save having to reprocess everything.
- -ob is the output batch size for the tfrecords. It is advisable to keep this small, preferably near the expected batch size to be used for training, except where there are seriously large numbers of sets.
- -w specifies the number of worker subprocesses to use for producing the output. Probably advisable to keep this below the number of cores on your host! :)
!icenet_dataset_create tutorial_data south -l 1 -fd 7 -ob 2 -w 4
[18-02-25 14:58:41 :INFO ] - Got 0 dates for train
[18-02-25 14:58:41 :INFO ] - Got 0 dates for val
[18-02-25 14:58:41 :INFO ] - Got 0 dates for test
[18-02-25 14:58:41 :INFO ] - Creating path: ./network_datasets/tutorial_data
[18-02-25 14:58:41 :INFO ] - Loading configuration loader.tutorial_data.json
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/node.py:187: UserWarning: Port 8888 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42138 instead
warnings.warn(
[18-02-25 14:58:42 :INFO ] - Dashboard at localhost:8888
[18-02-25 14:58:42 :INFO ] - Using dask client <Client: 'tcp://127.0.0.1:35074' processes=4 threads=4, memory=503.20 GiB>
[18-02-25 14:58:43 :INFO ] - 91 train dates to process, generating cache data.
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/distributed/client.py:3370: UserWarning: Sending large graph of size 12.13 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
warnings.warn(
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000000.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000001.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000002.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000003.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000004.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000005.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000006.tfrecord
[18-02-25 14:58:56 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000007.tfrecord
2025-02-18 14:58:58,294 - distributed.scheduler - WARNING - Detected different `run_spec` for key ('astype-ac07afc5678ef766d38010ceed63954d', 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: waiting
old run_spec: Alias(('astype-ac07afc5678ef766d38010ceed63954d', 0, 0)->('getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0))
new run_spec: Alias(('astype-ac07afc5678ef766d38010ceed63954d', 0, 0)->('array-getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0))
old dependencies: {('getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0)}
new dependencies: {('array-getitem-astype-ac07afc5678ef766d38010ceed63954d', 0, 0)}
2025-02-18 14:59:00,957 - distributed.scheduler - WARNING - Detected different `run_spec` for key ('open_dataset-siconca_abs-98dd794fa44bea59941852b0de766b31', 18, 0, 0) between two consecutive calls to `update_graph`. This can cause failures and deadlocks down the line. Please ensure unique key names. If you are using a standard dask collections, consider releasing all the data before resubmitting another computation. More details and help can be found at https://github.com/dask/dask/issues/9888.
Debugging information
---------------------
old task state: released
old run_spec: <Task ('open_dataset-siconca_abs-98dd794fa44bea59941852b0de766b31', 18, 0, 0) getter(...)>
new run_spec: Alias(('open_dataset-siconca_abs-98dd794fa44bea59941852b0de766b31', 18, 0, 0)->('original-open_dataset-siconca_abs-98dd794fa44bea59941852b0de766b31', 18, 0, 0))
old dependencies: {'original-open_dataset-siconca_abs-98dd794fa44bea59941852b0de766b31'}
new dependencies: {('original-open_dataset-siconca_abs-98dd794fa44bea59941852b0de766b31', 18, 0, 0)}
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000008.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000009.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000010.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000011.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000012.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000013.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000014.tfrecord
[18-02-25 14:59:09 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000015.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000016.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000017.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000018.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000019.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000020.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000021.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000022.tfrecord
[18-02-25 14:59:24 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000023.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000024.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000025.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000026.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000027.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000028.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000029.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000030.tfrecord
[18-02-25 14:59:37 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000031.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000032.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000033.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000034.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000035.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000036.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000037.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000038.tfrecord
[18-02-25 14:59:48 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000039.tfrecord
[18-02-25 14:59:57 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000040.tfrecord
[18-02-25 14:59:57 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000041.tfrecord
[18-02-25 14:59:57 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000042.tfrecord
[18-02-25 14:59:57 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000043.tfrecord
[18-02-25 14:59:57 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000044.tfrecord
[18-02-25 14:59:57 :INFO ] - Finished output ./network_datasets/tutorial_data/south/train/00000045.tfrecord
[18-02-25 14:59:57 :INFO ] - 23 val dates to process, generating cache data.
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000000.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000001.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000002.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000003.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000004.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000005.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000006.tfrecord
[18-02-25 15:00:10 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000007.tfrecord
[18-02-25 15:00:17 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000008.tfrecord
[18-02-25 15:00:17 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000009.tfrecord
[18-02-25 15:00:17 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000010.tfrecord
[18-02-25 15:00:17 :INFO ] - Finished output ./network_datasets/tutorial_data/south/val/00000011.tfrecord
[18-02-25 15:00:17 :INFO ] - 2 test dates to process, generating cache data.
[18-02-25 15:00:22 :INFO ] - Finished output ./network_datasets/tutorial_data/south/test/00000000.tfrecord
[18-02-25 15:00:22 :INFO ] - Average sample generation time: 4.7196644030768296
[18-02-25 15:00:22 :INFO ] - Writing configuration to ./dataset_config.tutorial_data.json
Config-only operation / Prediction datasets#
Datasets used for prediction don’t benefit from caching, so adding the `-c` option and dropping `-w` and `-ob` will create a configuration for the dataset without writing sets to disk. You can also use this option to create a dataset that is fed directly from the preprocessed data, though bear in mind that, depending on your infrastructure, this requires the batches to be created on the fly, which can have a significant impact on performance. By specifying `-fn` we ensure the dataset is given a different name from the previously cached one above (though this is more commonly used for prediction datasets, where caching isn’t necessary…)
!icenet_dataset_create -fd 7 -l 1 -c -fn tutorial_raw_dataset tutorial_data south
[18-02-25 15:01:37 :INFO ] - Got 0 dates for train
[18-02-25 15:01:37 :INFO ] - Got 0 dates for val
[18-02-25 15:01:37 :INFO ] - Got 0 dates for test
[18-02-25 15:01:37 :INFO ] - Creating path: ./network_datasets/tutorial_raw_dataset
[18-02-25 15:01:37 :INFO ] - Loading configuration loader.tutorial_data.json
[18-02-25 15:01:37 :INFO ] - Writing dataset configuration without data generation
[18-02-25 15:01:37 :INFO ] - 91 train dates in total, NOT generating cache data.
[18-02-25 15:01:37 :INFO ] - 23 val dates in total, NOT generating cache data.
[18-02-25 15:01:37 :INFO ] - 2 test dates in total, NOT generating cache data.
[18-02-25 15:01:37 :INFO ] - Writing configuration to ./dataset_config.tutorial_raw_dataset.json
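If you want to inspect what was written, the result is just a JSON configuration file with no cached data alongside it. A minimal sketch using only the Python standard library (the filename is the one reported in the log line above):

```python
# Peek at the config-only dataset written above; no tfrecords are required on disk.
import json

with open("dataset_config.tutorial_raw_dataset.json") as fh:
    config = json.load(fh)

# The top-level keys describe the dataset; exact contents depend on your IceNet version.
print(sorted(config))
```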
4. Train#
Once the dataset is prepared, running a network is as simple as using `icenet_train` with the appropriate parameters. Some key parameters are illustrated in the following commands:
!icenet_train --help
[18-02-25 15:02:21 :WARNING ] - PyTorch not found - not required if not using PyTorch
usage: icenet_train [-h] [-b BATCH_SIZE] [-ca CHECKPOINT_MODE]
[-cm CHECKPOINT_MONITOR] [-ds [ADDITIONAL ...]]
[-e EPOCHS] [-f FILTER_SIZE]
[--early-stopping EARLY_STOPPING] [-m]
[-n N_FILTERS_FACTOR] [-p PRELOAD] [-pw]
[-qs MAX_QUEUE_SIZE] [-r RATIO]
[-s {default,mirrored,central}] [--shuffle-train]
[--gpus GPUS] [-v] [-w WORKERS] [-nw] [-wo]
[-wp WANDB_PROJECT] [-wu WANDB_USER] [--lr LR]
[--lr_10e_decay_fac LR_10E_DECAY_FAC]
[--lr_decay_start LR_DECAY_START]
[--lr_decay_end LR_DECAY_END]
dataset run_name seed
positional arguments:
dataset
run_name
seed
options:
-h, --help show this help message and exit
-b BATCH_SIZE, --batch-size BATCH_SIZE
-ca CHECKPOINT_MODE, --checkpoint-mode CHECKPOINT_MODE
-cm CHECKPOINT_MONITOR, --checkpoint-monitor CHECKPOINT_MONITOR
-ds [ADDITIONAL ...], --additional-dataset [ADDITIONAL ...]
-e EPOCHS, --epochs EPOCHS
-f FILTER_SIZE, --filter-size FILTER_SIZE
--early-stopping EARLY_STOPPING
-m, --multiprocessing
-n N_FILTERS_FACTOR, --n-filters-factor N_FILTERS_FACTOR
-p PRELOAD, --preload PRELOAD
-pw, --pickup-weights
-qs MAX_QUEUE_SIZE, --max-queue-size MAX_QUEUE_SIZE
-r RATIO, --ratio RATIO
-s {default,mirrored,central}, --strategy {default,mirrored,central}
--shuffle-train Shuffle the training set
--gpus GPUS
-v, --verbose
-w WORKERS, --workers WORKERS
-nw, --no-wandb
-wo, --wandb-offline
-wp WANDB_PROJECT, --wandb-project WANDB_PROJECT
-wu WANDB_USER, --wandb-user WANDB_USER
--lr LR
--lr_10e_decay_fac LR_10E_DECAY_FAC
Factor by which LR is multiplied by every 10 epochs
using exponential decay. E.g. 1 -> no decay (default),
0.5 -> halve every 10 epochs.
--lr_decay_start LR_DECAY_START
--lr_decay_end LR_DECAY_END
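As an aside, the `--lr_10e_decay_fac` help text implies an exponential schedule in which the learning rate is multiplied by the given factor every 10 epochs. A minimal sketch of that implied schedule, purely for illustration (this is not IceNet’s internal implementation):

```python
# Implied schedule from the help text: lr(epoch) = lr0 * fac ** (epoch / 10)
# fac=1.0 -> no decay (default); fac=0.5 -> halved every 10 epochs.
def decayed_lr(lr0: float, fac: float, epoch: int) -> float:
    return lr0 * fac ** (epoch / 10)

for epoch in (0, 10, 20):
    print(epoch, decayed_lr(1e-4, 0.5, epoch))  # 1e-04, 5e-05, 2.5e-05
```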
The following runs demonstrate using the aforementioned dataset with the following options:

- `-b 2`: process batches of 2 samples,
- `-e 10`: run for 10 epochs,
- `-m`: enable multiprocessing, allowing up to 4 worker processes (`-w 4`) to load data at a time into a data queue of length 4 (`-qs 4`),
- `-r`: we could specify a ratio so that only a fraction of the files from the dataset is used, e.g. 0.2x (useful when testing on a low-power machine with a large dataset, but unnecessary with our example here),
- `-n 0.6`: supply a UNet built with 0.6x the number of filters as normal,
- `-nw`: disable Weights and Biases logging for this run.
!icenet_train tutorial_data tutorial_testrun 42 -b 2 -e 10 -m -qs 4 -w 4 -n 0.6 -nw
[18-02-25 15:02:47 :WARNING ] - PyTorch not found - not required if not using PyTorch
[18-02-25 15:02:48 :WARNING ] - Setting seed for best attempt at determinism, value 42
[18-02-25 15:02:48 :INFO ] - Loading configuration dataset_config.tutorial_data.json
[18-02-25 15:02:48 :INFO ] - Training dataset path: ./network_datasets/tutorial_data/south/train
[18-02-25 15:02:48 :INFO ] - Validation dataset path: ./network_datasets/tutorial_data/south/val
[18-02-25 15:02:48 :INFO ] - Test dataset path: ./network_datasets/tutorial_data/south/test
[18-02-25 15:02:48 :WARNING ] - WandB is not available, we will never use it
[18-02-25 15:02:48 :INFO ] - Creating network folder: ./results/networks/tutorial_testrun
[18-02-25 15:02:48 :INFO ] - Adding tensorboard callback
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 432, 432, 8)] 0 []
conv2d (Conv2D) (None, 432, 432, 38) 2774 ['input_1[0][0]']
conv2d_1 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d[0][0]']
batch_normalization (Batch (None, 432, 432, 38) 152 ['conv2d_1[0][0]']
Normalization)
max_pooling2d (MaxPooling2 (None, 216, 216, 38) 0 ['batch_normalization[0][0]']
D)
conv2d_2 (Conv2D) (None, 216, 216, 76) 26068 ['max_pooling2d[0][0]']
conv2d_3 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_2[0][0]']
batch_normalization_1 (Bat (None, 216, 216, 76) 304 ['conv2d_3[0][0]']
chNormalization)
max_pooling2d_1 (MaxPoolin (None, 108, 108, 76) 0 ['batch_normalization_1[0][0]'
g2D) ]
conv2d_4 (Conv2D) (None, 108, 108, 152) 104120 ['max_pooling2d_1[0][0]']
conv2d_5 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_4[0][0]']
batch_normalization_2 (Bat (None, 108, 108, 152) 608 ['conv2d_5[0][0]']
chNormalization)
max_pooling2d_2 (MaxPoolin (None, 54, 54, 152) 0 ['batch_normalization_2[0][0]'
g2D) ]
conv2d_6 (Conv2D) (None, 54, 54, 152) 208088 ['max_pooling2d_2[0][0]']
conv2d_7 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_6[0][0]']
batch_normalization_3 (Bat (None, 54, 54, 152) 608 ['conv2d_7[0][0]']
chNormalization)
max_pooling2d_3 (MaxPoolin (None, 27, 27, 152) 0 ['batch_normalization_3[0][0]'
g2D) ]
conv2d_8 (Conv2D) (None, 27, 27, 304) 416176 ['max_pooling2d_3[0][0]']
conv2d_9 (Conv2D) (None, 27, 27, 304) 832048 ['conv2d_8[0][0]']
batch_normalization_4 (Bat (None, 27, 27, 304) 1216 ['conv2d_9[0][0]']
chNormalization)
up_sampling2d (UpSampling2 (None, 54, 54, 304) 0 ['batch_normalization_4[0][0]'
D) ]
conv2d_10 (Conv2D) (None, 54, 54, 152) 184984 ['up_sampling2d[0][0]']
concatenate (Concatenate) (None, 54, 54, 304) 0 ['batch_normalization_3[0][0]'
, 'conv2d_10[0][0]']
conv2d_11 (Conv2D) (None, 54, 54, 152) 416024 ['concatenate[0][0]']
conv2d_12 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_11[0][0]']
batch_normalization_5 (Bat (None, 54, 54, 152) 608 ['conv2d_12[0][0]']
chNormalization)
up_sampling2d_1 (UpSamplin (None, 108, 108, 152) 0 ['batch_normalization_5[0][0]'
g2D) ]
conv2d_13 (Conv2D) (None, 108, 108, 152) 92568 ['up_sampling2d_1[0][0]']
concatenate_1 (Concatenate (None, 108, 108, 304) 0 ['batch_normalization_2[0][0]'
) , 'conv2d_13[0][0]']
conv2d_14 (Conv2D) (None, 108, 108, 152) 416024 ['concatenate_1[0][0]']
conv2d_15 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_14[0][0]']
batch_normalization_6 (Bat (None, 108, 108, 152) 608 ['conv2d_15[0][0]']
chNormalization)
up_sampling2d_2 (UpSamplin (None, 216, 216, 152) 0 ['batch_normalization_6[0][0]'
g2D) ]
conv2d_16 (Conv2D) (None, 216, 216, 76) 46284 ['up_sampling2d_2[0][0]']
concatenate_2 (Concatenate (None, 216, 216, 152) 0 ['batch_normalization_1[0][0]'
) , 'conv2d_16[0][0]']
conv2d_17 (Conv2D) (None, 216, 216, 76) 104044 ['concatenate_2[0][0]']
conv2d_18 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_17[0][0]']
batch_normalization_7 (Bat (None, 216, 216, 76) 304 ['conv2d_18[0][0]']
chNormalization)
up_sampling2d_3 (UpSamplin (None, 432, 432, 76) 0 ['batch_normalization_7[0][0]'
g2D) ]
conv2d_19 (Conv2D) (None, 432, 432, 38) 11590 ['up_sampling2d_3[0][0]']
concatenate_3 (Concatenate (None, 432, 432, 76) 0 ['conv2d_1[0][0]',
) 'conv2d_19[0][0]']
conv2d_20 (Conv2D) (None, 432, 432, 38) 26030 ['concatenate_3[0][0]']
conv2d_21 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_20[0][0]']
conv2d_22 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_21[0][0]']
conv2d_23 (Conv2D) (None, 432, 432, 7) 273 ['conv2d_22[0][0]']
==================================================================================================
Total params: 3867077 (14.75 MB)
Trainable params: 3864873 (14.74 MB)
Non-trainable params: 2204 (8.61 KB)
__________________________________________________________________________________________________
[18-02-25 15:02:49 :INFO ] - Datasets: 46 train, 12 val and 1 test filenames
[18-02-25 15:02:49 :INFO ] - Reducing datasets to 1.0 of total files
[18-02-25 15:02:49 :INFO ] - Reduced: 46 train, 12 val and 1 test filenames
[18-02-25 15:02:50 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 1/10
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739890982.600440 1264 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
Epoch 1: val_rmse improved from inf to 37.51984, saving model to ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/keras/src/engine/training.py:3103: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
saving_api.save_model(
46/46 - 65s - loss: 175.5103 - binacc: 52.0889 - mae: 24.3542 - rmse: 31.0710 - mse: 1965.7726 - val_loss: 255.9250 - val_binacc: 37.1588 - val_mae: 35.0078 - val_rmse: 37.5198 - val_mse: 2018.9741 - lr: 1.0000e-04 - 65s/epoch - 1s/step
[18-02-25 15:03:56 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 2/10
Epoch 2: val_rmse improved from 37.51984 to 30.97603, saving model to ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
46/46 - 15s - loss: 21.5067 - binacc: 94.2400 - mae: 5.6784 - rmse: 10.8765 - mse: 1230.0433 - val_loss: 174.4384 - val_binacc: 49.8753 - val_mae: 27.6988 - val_rmse: 30.9760 - val_mse: 1732.6095 - lr: 1.0000e-04 - 15s/epoch - 337ms/step
[18-02-25 15:04:11 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 3/10
Epoch 3: val_rmse did not improve from 30.97603
46/46 - 15s - loss: 15.4110 - binacc: 95.7369 - mae: 4.3114 - rmse: 9.2070 - mse: 1185.7988 - val_loss: 188.8904 - val_binacc: 46.7715 - val_mae: 29.3755 - val_rmse: 32.2337 - val_mse: 1704.2366 - lr: 1.0000e-04 - 15s/epoch - 328ms/step
[18-02-25 15:04:26 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 4/10
Epoch 4: val_rmse improved from 30.97603 to 29.73280, saving model to ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
46/46 - 15s - loss: 14.4196 - binacc: 95.9676 - mae: 4.0430 - rmse: 8.9060 - mse: 1185.5770 - val_loss: 160.7172 - val_binacc: 46.5404 - val_mae: 26.8661 - val_rmse: 29.7328 - val_mse: 1716.8209 - lr: 1.0000e-04 - 15s/epoch - 335ms/step
[18-02-25 15:04:42 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 5/10
Epoch 5: val_rmse improved from 29.73280 to 28.68647, saving model to ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
46/46 - 15s - loss: 12.7467 - binacc: 96.3205 - mae: 3.7732 - rmse: 8.3734 - mse: 1163.1246 - val_loss: 149.6046 - val_binacc: 43.8543 - val_mae: 25.9567 - val_rmse: 28.6865 - val_mse: 1735.8658 - lr: 1.0000e-04 - 15s/epoch - 336ms/step
[18-02-25 15:04:57 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 6/10
Epoch 6: val_rmse did not improve from 28.68647
46/46 - 15s - loss: 11.6972 - binacc: 96.5091 - mae: 3.6257 - rmse: 8.0213 - mse: 1170.1283 - val_loss: 150.6583 - val_binacc: 39.9089 - val_mae: 26.0869 - val_rmse: 28.7873 - val_mse: 1753.8884 - lr: 1.0000e-04 - 15s/epoch - 328ms/step
[18-02-25 15:05:12 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 7/10
Epoch 7: val_rmse did not improve from 28.68647
46/46 - 15s - loss: 10.8200 - binacc: 96.6243 - mae: 3.5153 - rmse: 7.7147 - mse: 1099.8267 - val_loss: 185.8114 - val_binacc: 38.5876 - val_mae: 28.2696 - val_rmse: 31.9699 - val_mse: 1843.5582 - lr: 1.0000e-04 - 15s/epoch - 323ms/step
[18-02-25 15:05:27 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 8/10
Epoch 8: val_rmse did not improve from 28.68647
46/46 - 15s - loss: 10.6752 - binacc: 96.7119 - mae: 3.4372 - rmse: 7.6629 - mse: 1104.3926 - val_loss: 216.6756 - val_binacc: 37.9495 - val_mae: 30.0410 - val_rmse: 34.5231 - val_mse: 1986.5660 - lr: 1.0000e-04 - 15s/epoch - 323ms/step
[18-02-25 15:05:42 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 9/10
Epoch 9: val_rmse did not improve from 28.68647
46/46 - 15s - loss: 10.3326 - binacc: 96.7235 - mae: 3.4312 - rmse: 7.5389 - mse: 1108.7585 - val_loss: 244.2825 - val_binacc: 37.6807 - val_mae: 31.3547 - val_rmse: 36.6565 - val_mse: 2268.3660 - lr: 1.0000e-04 - 15s/epoch - 327ms/step
[18-02-25 15:05:57 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 10/10
Epoch 10: val_rmse did not improve from 28.68647
46/46 - 15s - loss: 9.3397 - binacc: 96.9574 - mae: 3.1848 - rmse: 7.1676 - mse: 1130.2780 - val_loss: 251.8369 - val_binacc: 37.5395 - val_mae: 31.7049 - val_rmse: 37.2190 - val_mse: 2064.9258 - lr: 1.0000e-04 - 15s/epoch - 326ms/step
[18-02-25 15:06:12 :INFO ] - Saving network to: ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
[18-02-25 15:06:16 :INFO ] - Running evaluation against test set
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:06:16 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:06:16 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:06:16 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:06:16 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:06:18 :INFO ] - Datasets: 46 train, 12 val and 1 test filenames
[18-02-25 15:06:18 :INFO ] - Reducing datasets to 1.0 of total files
[18-02-25 15:06:18 :INFO ] - Reduced: 46 train, 12 val and 1 test filenames
[18-02-25 15:06:18 :INFO ] - Using test set for validation
[18-02-25 15:06:18 :INFO ] - Metric creation for lead time of 7 days
[18-02-25 15:06:18 :INFO ] - Evaluating...
[18-02-25 15:06:20 :INFO ] - Done in 2.3s
In this second command, we pick up the training from the previous run above and continue for another 2 epochs, as set by the `-e 2` option; the `-p` option points at the previously saved network weights.
This is useful when you want to resume a training run, for example if you later realise you did not train for long enough, or if resource constraints (or any other reason) forced you to stop early.
!icenet_train tutorial_data tutorial_testrun 42 -b 2 -e 2 -m -qs 4 -w 4 -n 0.6 -nw \
-p ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
[18-02-25 15:06:33 :WARNING ] - PyTorch not found - not required if not using PyTorch
[18-02-25 15:06:34 :WARNING ] - Setting seed for best attempt at determinism, value 42
[18-02-25 15:06:34 :INFO ] - Loading configuration dataset_config.tutorial_data.json
[18-02-25 15:06:34 :INFO ] - Training dataset path: ./network_datasets/tutorial_data/south/train
[18-02-25 15:06:34 :INFO ] - Validation dataset path: ./network_datasets/tutorial_data/south/val
[18-02-25 15:06:34 :INFO ] - Test dataset path: ./network_datasets/tutorial_data/south/test
[18-02-25 15:06:34 :WARNING ] - WandB is not available, we will never use it
[18-02-25 15:06:34 :INFO ] - Adding tensorboard callback
[18-02-25 15:06:35 :INFO ] - Loading network weights from ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 432, 432, 8)] 0 []
conv2d (Conv2D) (None, 432, 432, 38) 2774 ['input_1[0][0]']
conv2d_1 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d[0][0]']
batch_normalization (Batch (None, 432, 432, 38) 152 ['conv2d_1[0][0]']
Normalization)
max_pooling2d (MaxPooling2 (None, 216, 216, 38) 0 ['batch_normalization[0][0]']
D)
conv2d_2 (Conv2D) (None, 216, 216, 76) 26068 ['max_pooling2d[0][0]']
conv2d_3 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_2[0][0]']
batch_normalization_1 (Bat (None, 216, 216, 76) 304 ['conv2d_3[0][0]']
chNormalization)
max_pooling2d_1 (MaxPoolin (None, 108, 108, 76) 0 ['batch_normalization_1[0][0]'
g2D) ]
conv2d_4 (Conv2D) (None, 108, 108, 152) 104120 ['max_pooling2d_1[0][0]']
conv2d_5 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_4[0][0]']
batch_normalization_2 (Bat (None, 108, 108, 152) 608 ['conv2d_5[0][0]']
chNormalization)
max_pooling2d_2 (MaxPoolin (None, 54, 54, 152) 0 ['batch_normalization_2[0][0]'
g2D) ]
conv2d_6 (Conv2D) (None, 54, 54, 152) 208088 ['max_pooling2d_2[0][0]']
conv2d_7 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_6[0][0]']
batch_normalization_3 (Bat (None, 54, 54, 152) 608 ['conv2d_7[0][0]']
chNormalization)
max_pooling2d_3 (MaxPoolin (None, 27, 27, 152) 0 ['batch_normalization_3[0][0]'
g2D) ]
conv2d_8 (Conv2D) (None, 27, 27, 304) 416176 ['max_pooling2d_3[0][0]']
conv2d_9 (Conv2D) (None, 27, 27, 304) 832048 ['conv2d_8[0][0]']
batch_normalization_4 (Bat (None, 27, 27, 304) 1216 ['conv2d_9[0][0]']
chNormalization)
up_sampling2d (UpSampling2 (None, 54, 54, 304) 0 ['batch_normalization_4[0][0]'
D) ]
conv2d_10 (Conv2D) (None, 54, 54, 152) 184984 ['up_sampling2d[0][0]']
concatenate (Concatenate) (None, 54, 54, 304) 0 ['batch_normalization_3[0][0]'
, 'conv2d_10[0][0]']
conv2d_11 (Conv2D) (None, 54, 54, 152) 416024 ['concatenate[0][0]']
conv2d_12 (Conv2D) (None, 54, 54, 152) 208088 ['conv2d_11[0][0]']
batch_normalization_5 (Bat (None, 54, 54, 152) 608 ['conv2d_12[0][0]']
chNormalization)
up_sampling2d_1 (UpSamplin (None, 108, 108, 152) 0 ['batch_normalization_5[0][0]'
g2D) ]
conv2d_13 (Conv2D) (None, 108, 108, 152) 92568 ['up_sampling2d_1[0][0]']
concatenate_1 (Concatenate (None, 108, 108, 304) 0 ['batch_normalization_2[0][0]'
) , 'conv2d_13[0][0]']
conv2d_14 (Conv2D) (None, 108, 108, 152) 416024 ['concatenate_1[0][0]']
conv2d_15 (Conv2D) (None, 108, 108, 152) 208088 ['conv2d_14[0][0]']
batch_normalization_6 (Bat (None, 108, 108, 152) 608 ['conv2d_15[0][0]']
chNormalization)
up_sampling2d_2 (UpSamplin (None, 216, 216, 152) 0 ['batch_normalization_6[0][0]'
g2D) ]
conv2d_16 (Conv2D) (None, 216, 216, 76) 46284 ['up_sampling2d_2[0][0]']
concatenate_2 (Concatenate (None, 216, 216, 152) 0 ['batch_normalization_1[0][0]'
) , 'conv2d_16[0][0]']
conv2d_17 (Conv2D) (None, 216, 216, 76) 104044 ['concatenate_2[0][0]']
conv2d_18 (Conv2D) (None, 216, 216, 76) 52060 ['conv2d_17[0][0]']
batch_normalization_7 (Bat (None, 216, 216, 76) 304 ['conv2d_18[0][0]']
chNormalization)
up_sampling2d_3 (UpSamplin (None, 432, 432, 76) 0 ['batch_normalization_7[0][0]'
g2D) ]
conv2d_19 (Conv2D) (None, 432, 432, 38) 11590 ['up_sampling2d_3[0][0]']
concatenate_3 (Concatenate (None, 432, 432, 76) 0 ['conv2d_1[0][0]',
) 'conv2d_19[0][0]']
conv2d_20 (Conv2D) (None, 432, 432, 38) 26030 ['concatenate_3[0][0]']
conv2d_21 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_20[0][0]']
conv2d_22 (Conv2D) (None, 432, 432, 38) 13034 ['conv2d_21[0][0]']
conv2d_23 (Conv2D) (None, 432, 432, 7) 273 ['conv2d_22[0][0]']
==================================================================================================
Total params: 3867077 (14.75 MB)
Trainable params: 3864873 (14.74 MB)
Non-trainable params: 2204 (8.61 KB)
__________________________________________________________________________________________________
[18-02-25 15:06:35 :INFO ] - Datasets: 46 train, 12 val and 1 test filenames
[18-02-25 15:06:35 :INFO ] - Reducing datasets to 1.0 of total files
[18-02-25 15:06:35 :INFO ] - Reduced: 46 train, 12 val and 1 test filenames
[18-02-25 15:06:36 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 1/2
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1739891207.284372 9221 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
Epoch 1: val_rmse improved from inf to 41.99921, saving model to ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
/data/hpcdata/users/bryald/miniconda3/envs/icenet0.2.9_dev/lib/python3.11/site-packages/keras/src/engine/training.py:3103: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
saving_api.save_model(
46/46 - 64s - loss: 7.8740 - binacc: 97.2214 - mae: 2.8776 - rmse: 6.5812 - mse: 1022.4691 - val_loss: 320.6808 - val_binacc: 37.1541 - val_mae: 35.7755 - val_rmse: 41.9992 - val_mse: 2391.8354 - lr: 1.0000e-04 - 64s/epoch - 1s/step
[18-02-25 15:07:40 :INFO ] -
Setting learning rate to: 9.999999747378752e-05
Epoch 2/2
Epoch 2: val_rmse improved from 41.99921 to 36.33954, saving model to ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
46/46 - 15s - loss: 7.6655 - binacc: 97.3340 - mae: 2.7241 - rmse: 6.4934 - mse: 1129.7290 - val_loss: 240.0765 - val_binacc: 40.3385 - val_mae: 30.1900 - val_rmse: 36.3395 - val_mse: 2075.1689 - lr: 1.0000e-04 - 15s/epoch - 333ms/step
[18-02-25 15:07:55 :INFO ] - Saving network to: ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5
[18-02-25 15:08:00 :INFO ] - Running evaluation against test set
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:08:00 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:08:00 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:08:00 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
WARNING:tensorflow:Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:08:00 :WARNING ] - Unable to restore custom metric. Please ensure that the layer implements `get_config` and `from_config` when saving. In addition, please use the `custom_objects` arg when calling `load_model()`.
[18-02-25 15:08:02 :INFO ] - Datasets: 46 train, 12 val and 1 test filenames
[18-02-25 15:08:02 :INFO ] - Reducing datasets to 1.0 of total files
[18-02-25 15:08:02 :INFO ] - Reduced: 46 train, 12 val and 1 test filenames
[18-02-25 15:08:02 :INFO ] - Using test set for validation
[18-02-25 15:08:02 :INFO ] - Metric creation for lead time of 7 days
[18-02-25 15:08:02 :INFO ] - Evaluating...
[18-02-25 15:08:04 :INFO ] - Done in 2.5s
Notes on training and prediction#
There are a few things to note about the `icenet_train` and `icenet_predict` (see the prediction section below) commands and the switches they provide:

Common switches such as `-n` should be applied consistently between training and prediction (in this notebook, `-n 0.6` is used for both).

These commands work with individual network runs (see the next section).
5. Predict#
Running individual sets from the test dataset we produced earlier through the trained network is straightforward. The first step is to create a date file, which can be produced from the configuration created by `icenet_process` in the processing section. This date file is then supplied to the `icenet_predict` command, which produces forecasts using either cached data (useful for test data prepared at the same time as the training and validation sets) or directly from the normalised data (as is the case for nearly all data that isn't part of the training run).
`icenet_predict` takes a file containing the dates to make predictions for. First we make a file, here called `testdates.csv`, to pass to `icenet_predict` in the next step. (Note that the more advanced IceNet Pipeline method uses a more elegant system for providing dates; alternatively, if using the python interface, dates can be passed to the `predict_forecast` function as a list - see 04.library_usage.)
!printf "2020-04-01\n2020-04-02" | tee testdates.csv
2020-04-01
2020-04-02
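Equivalently, the date file can be generated programmatically; here is a minimal sketch using pandas (assuming the dates you want form a contiguous daily range):

import pandas as pd

# Generate a contiguous range of forecast start dates and write them one per
# line in the YYYY-MM-DD format expected by icenet_predict.
dates = pd.date_range("2020-04-01", "2020-04-02", freq="D")
with open("testdates.csv", "w") as fh:
    fh.write("\n".join(d.strftime("%Y-%m-%d") for d in dates))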
!icenet_predict --help
[18-02-25 15:08:11 :WARNING ] - PyTorch not found - not required if not using PyTorch
usage: icenet_predict [-h] [-i IDENT] [-n N_FILTERS_FACTOR] [-l] [-t] [-v]
[-s]
dataset network_name output_name seed datefile
positional arguments:
dataset
network_name
output_name
seed
datefile
options:
-h, --help show this help message and exit
-i IDENT, --train-identifier IDENT
Train dataset identifier
-n N_FILTERS_FACTOR, --n-filters-factor N_FILTERS_FACTOR
-l, --legacy-rounding
Ensure filter number rounding occurs last in channel
number calculations
-t, --testset
-v, --verbose
-s, --save_args
!icenet_predict -n 0.6 -t \
tutorial_data tutorial_testrun example_south_forecast 42 testdates.csv
[18-02-25 15:08:49 :WARNING ] - PyTorch not found - not required if not using PyTorch
[18-02-25 15:08:49 :INFO ] - Loading configuration ./dataset_config.tutorial_data.json
[18-02-25 15:08:49 :INFO ] - Training dataset path: ./network_datasets/tutorial_data/south/train
[18-02-25 15:08:49 :INFO ] - Validation dataset path: ./network_datasets/tutorial_data/south/val
[18-02-25 15:08:49 :INFO ] - Test dataset path: ./network_datasets/tutorial_data/south/test
[18-02-25 15:08:49 :INFO ] - Loading configuration /data/hpcdata/users/bryald/git/icenet/icenet/notebooks/loader.tutorial_data.json
[18-02-25 15:08:49 :INFO ] - Loading model from ./results/networks/tutorial_testrun/tutorial_testrun.network_tutorial_data.42.h5...
[18-02-25 15:08:50 :INFO ] - Datasets: 46 train, 12 val and 1 test filenames
[18-02-25 15:08:50 :INFO ] - Processing test batch 1, item 0 (date 2020-04-01)
[18-02-25 15:08:50 :INFO ] - Running prediction 2020-04-01
[18-02-25 15:08:58 :INFO ] - Saving 2020-04-01 - forecast output (1, 432, 432, 7)
[18-02-25 15:08:58 :INFO ] - Processing test batch 1, item 1 (date 2020-04-02)
[18-02-25 15:08:58 :INFO ] - Running prediction 2020-04-02
[18-02-25 15:08:58 :WARNING ] - ./results/predict/example_south_forecast/tutorial_testrun.42 output already exists
[18-02-25 15:08:58 :INFO ] - Saving 2020-04-02 - forecast output (1, 432, 432, 7)
The example uses the cached test data from the training run, but the process is the same for any other processed data; simply omit the `-t` option, which tells the command to source from the cached test data.
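For example, a run against normalised (non-cached) data might look like the following sketch, where `otherdates.csv` is a hypothetical date file covering dates within the processed data:

!icenet_predict -n 0.6 tutorial_data tutorial_testrun example_south_forecast 42 otherdates.csv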
Outputs#
In the above example, there are three outputs:
forecast: the predicted forecast data from the model output layer
outputs: the outputs from the data loader which would be used for training
weights: the generated sample weights from the data loader for the training sample
The outputs are initially stored as NumPy arrays under the `results` directory:
results/predict/example_south_forecast/tutorial_testrun.42/2020_04_01.npy
results/predict/example_south_forecast/tutorial_testrun.42/2020_04_02.npy
with the associated inputs, outputs and weights stored in subfolders.
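As a quick sanity check, these raw arrays can be loaded directly with NumPy; a minimal sketch, assuming the run above completed and using the shape reported in the prediction log:

import numpy as np

# Load the raw forecast array that icenet_predict saved for one start date.
arr = np.load("results/predict/example_south_forecast/tutorial_testrun.42/2020_04_01.npy")
print(arr.shape)  # expected (1, 432, 432, 7): sample, y, x, 7 lead-time days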
The individual NumPy outputs, samples and sample weights are deposited into `results/predict`. To generate a CF-compliant NetCDF file containing the requested forecasts, we run `icenet_output`; the resulting file can then be post-processed.
!icenet_output example_south_forecast tutorial_data testdates.csv -o results/predict
[18-02-25 15:10:05 :INFO ] - Loading configuration ./dataset_config.tutorial_data.json
[18-02-25 15:10:05 :INFO ] - Training dataset path: ./network_datasets/tutorial_data/south/train
[18-02-25 15:10:05 :INFO ] - Validation dataset path: ./network_datasets/tutorial_data/south/val
[18-02-25 15:10:05 :INFO ] - Test dataset path: ./network_datasets/tutorial_data/south/test
[18-02-25 15:10:05 :INFO ] - Downloading single daily SIC netCDF file for regridding ERA5 data to EASE grid...
--2025-02-18 15:10:05-- ftp://osisaf.met.no/reprocessed/ice/conc/v2p0/1979/01/ice_conc_sh_ease2-250_cdr-v2p0_197901021200.nc
=> ‘./_sicfile/.listing’
Resolving osisaf.met.no (osisaf.met.no)... 157.249.75.10
Connecting to osisaf.met.no (osisaf.met.no)|157.249.75.10|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /reprocessed/ice/conc/v2p0/1979/01 ... done.
==> PASV ... done. ==> LIST ... done.
[ <=> ] 3,239 --.-K/s in 0.03s
2025-02-18 15:10:06 (107 KB/s) - ‘./_sicfile/.listing’ saved [3239]
--2025-02-18 15:10:06-- ftp://osisaf.met.no/reprocessed/ice/conc/v2p0/1979/01/ice_conc_sh_ease2-250_cdr-v2p0_197901021200.nc
=> ‘./_sicfile/ice_conc_sh_ease2-250_cdr-v2p0_197901021200.nc’
==> CWD not required.
==> PASV ... done. ==> RETR ice_conc_sh_ease2-250_cdr-v2p0_197901021200.nc ... done.
Length: 9856141 (9.4M)
100%[======================================>] 9,856,141 1.90MB/s in 5.7s
2025-02-18 15:10:12 (1.64 MB/s) - ‘./_sicfile/ice_conc_sh_ease2-250_cdr-v2p0_197901021200.nc’ saved [9856141]
FINISHED --2025-02-18 15:10:12--
Total wall clock time: 6.4s
Downloaded: 2 files, 9.4M in 5.8s (1.63 MB/s)
[18-02-25 15:10:12 :INFO ] - Child returned: 0
[18-02-25 15:10:13 :INFO ] - Post-processing 2020-04-01
[18-02-25 15:10:13 :INFO ] - Post-processing 2020-04-02
[18-02-25 15:10:13 :INFO ] - Dataset arr shape: (2, 432, 432, 7, 2)
[18-02-25 15:10:14 :INFO ] - Saving to results/predict/example_south_forecast.nc
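Before plotting, it can be helpful to inspect the NetCDF file we have just written; a minimal sketch using xarray (the exact variable names in the file depend on the icenet version, so check ds.data_vars before relying on them):

import xarray as xr

# Open the CF-compliant forecast file produced by icenet_output and summarise
# its dimensions, coordinates and data variables.
ds = xr.open_dataset("results/predict/example_south_forecast.nc")
print(ds)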
6. Visualisation#
Once we have created the netCDF file containing the forecast, we can generate plots using `icenet_plot_forecast`.
!icenet_plot_forecast --help
usage: icenet_plot_forecast [-h] [-o OUTPUT_PATH] [-v] [-r REGION]
[-z REGION_GEOGRAPHIC] [-l LEADTIMES] [-c]
[-f {mp4,png,svg,tiff}] [-g] [-n CMAP_NAME] [-s]
[--crs CRS] [--clip-region]
{north,south} forecast_file forecast_date
positional arguments:
{north,south}
forecast_file
forecast_date
options:
-h, --help show this help message and exit
-o OUTPUT_PATH, --output-path OUTPUT_PATH
-v, --verbose
-r REGION, --region REGION
Region specified x1, y1, x2, y2
-z REGION_GEOGRAPHIC, --region-geographic REGION_GEOGRAPHIC
Geographic region specified as lon and lat min/max:
lon_min, lat_min, lon_max, lat_max
-l LEADTIMES, --leadtimes LEADTIMES
Leadtimes to output, multiple as CSV, range as n..n
-c, --no-coastlines Turn off cartopy integration
-f {mp4,png,svg,tiff}, --format {mp4,png,svg,tiff}
Format to output in
-g, --gridlines Turn on gridlines for plots
-n CMAP_NAME, --cmap-name CMAP_NAME
Color map name if not wanting to use default
-s, --stddev Plot the standard deviation from the ensemble
--crs CRS Coordinate Reference System to use for plotting
--clip-region Whether to clip the data to the region specified by
lon/lat, When enabled, this crops forecast plot to the
bounds, can cause empty pixels across image edges due
to lon/lat curvature. Default is False.
In the following cells, we generate video outputs for the two dates we've forecast for, over a 7 day period:

`-l` defines the start and end lead times to capture in the video.

`-o` defines the output directory for the image/video.

`-f` defines the output file type.

`-g` defines whether to show a lat/lon gridline overlay (the default is not to show it).

`-r` defines the pixel region to show in the plot (between 0 and 432 pixels horizontally and vertically), e.g. `-r 70,155,145,240`.

`-z` defines the lon/lat region to show in the plot, e.g. to plot the Bellingshausen Sea region: `-z=-105,-74,-64,-65`. The `=` is required: if the first value is negative and the `=` is omitted, `argparse` treats it as the start of the next argument.

`--crs` defines the Coordinate Reference System (CRS) used to reproject the plot, e.g. `--crs mercator`. If in doubt, passing an invalid value will produce an error showing the available options (not all have been tested).

Note: video outputs require `ffmpeg` to be installed locally (or you can omit the `-f mp4` flag to output a series of `png` forecast images without `ffmpeg`). Since `v0.2.9` this should be managed internally by icenet, but if you run into issues you can install it manually using:
conda install -c conda-forge ffmpeg
!icenet_plot_forecast south results/predict/example_south_forecast.nc 2020-04-01 -l 1..7 -o outputs -f mp4
!icenet_plot_forecast south results/predict/example_south_forecast.nc 2020-04-02 -l 1..7 -o outputs -f mp4
WARNING:root:No directory at: outputs
INFO:root:Using cmap Blues_r
INFO:root:Inspecting data
INFO:root:Initialising plot
INFO:root:Animating
INFO:root:Saving plot to outputs/example_south_forecast.2020-04-01.20200401.mp4
INFO:root:Using cmap Blues_r
INFO:root:Inspecting data
INFO:root:Initialising plot
INFO:root:Animating
INFO:root:Saving plot to outputs/example_south_forecast.2020-04-02.20200402.mp4
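As an illustration of the `-z` syntax described above, a cropped plot of the Bellingshausen Sea region could be produced with a command like the following (a sketch, not executed in this notebook; note the `=` before the negative longitude):

!icenet_plot_forecast south results/predict/example_south_forecast.nc 2020-04-01 -o outputs -z=-105,-74,-64,-65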
Now the videos can be viewed for the two test dates. (A more automated way of visualising the forecasts from the netCDF output is shown in the next notebook.)
from IPython.display import Video
Video("outputs/example_south_forecast.2020-04-01.20200401.mp4", embed=True, width=800)
Video("outputs/example_south_forecast.2020-04-02.20200402.mp4", embed=True, width=800)
Note
Since we have not done an actual training run with enough data or for long enough, the forecasts are not very good. However, the videos are still useful for demonstration purposes and serve as a reference for what the output should look like.
Summary#
Within this notebook we've attempted to give a full crash course in running the CLI tools manually. This is the first of a series of notebooks covering further information:
Data structure and analysis (`03.data_and_forecasts.ipynb`): understand the structure of the data stores and products created by these workflows, and what tools currently exist in IceNet to look over them.

Library usage (`04.library_usage.ipynb`): understand how to programmatically perform an end-to-end run.
Version#
IceNet Codebase: v0.2.9