hdmf_zarr.utils module

Collection of utility I/O classes for the ZarrIO backend store.

class hdmf_zarr.utils.ZarrIODataChunkIteratorQueue(number_of_jobs: int = 1, max_threads_per_process: None | int = None, multiprocessing_context: None | Literal['fork', 'spawn'] = None)

Bases: deque

Helper class used by ZarrIO to manage writes for DataChunkIterators. Each queue element must be a tuple of two elements: 1) the dataset to write to and 2) the AbstractDataChunkIterator with the data.

Parameters:
  • number_of_jobs (int) – The number of jobs used to write the datasets. The default is 1.

  • max_threads_per_process (int or None) – Limits the number of threads used by each process. The default is None (no limits).

  • multiprocessing_context (str or None) – Context for multiprocessing. It can be None (default), “fork” or “spawn”. Note that “fork” is only available on UNIX systems (not Windows).

exhaust_queue()

Read and write from any queued DataChunkIterators.

With a single job, chunks are written in round-robin fashion across the queued iterators. With multiple jobs, a single dataset is processed at a time.
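The single-job, round-robin behaviour described above can be pictured with a small standard-library sketch. This is not the hdmf_zarr implementation: the function name drain_round_robin, the string labels standing in for datasets, and the log list are all hypothetical, and plain Python iterators take the place of AbstractDataChunkIterator objects.

```python
from collections import deque


def drain_round_robin(queue, log):
    """Take one chunk per queued iterator per pass until all are exhausted."""
    while queue:
        name, chunks = queue.popleft()
        try:
            chunk = next(chunks)
        except StopIteration:
            # This iterator has no more chunks, so it is not re-queued.
            continue
        # Stand-in for writing the chunk to the target dataset.
        log.append((name, chunk))
        # Re-queue the iterator so the other entries get their turn first.
        queue.append((name, chunks))


# Hypothetical stand-ins: labels play the role of datasets,
# plain iterators play the role of data chunk iterators.
queue = deque()
queue.append(("a", iter([1, 2, 3])))
queue.append(("b", iter([10, 20])))

log = []
drain_round_robin(queue, log)
print(log)
```

The log shows the writes interleaving between the two entries until the shorter iterator runs out, after which the remaining one is drained on its own.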

append(dataset, data)

Append a value to the queue.

Parameters:
  • dataset (Zarr array) – The dataset the DataChunkIterator is written to

  • data (AbstractDataChunkIterator) – DataChunkIterator with the data to be written

static initializer_wrapper(operation_to_run: callable, process_initialization: callable, initialization_arguments: Iterable, max_threads_per_process: None | int = None)

Needed as part of a fix for cloud memory leaks discovered by the SpikeInterface team.

The recommended fix is to use global wrappers for the working initializer that limit the threads used per process.

static function_wrapper(args: Tuple[str, str, AbstractDataChunkIterator, Tuple[slice, ...]])

Needed as part of a fix for cloud memory leaks discovered by the SpikeInterface team.

The recommended fix is to use a global wrapper at the executor.map level.

class hdmf_zarr.utils.ZarrSpecWriter(group)

Bases: SpecWriter

Class used to write format specs to Zarr

Parameters:

group (Group) – the Zarr file to write specs to

static stringify(spec)

Converts a spec into a JSON string to write to a dataset
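As a rough illustration of converting a spec to a JSON string, the sketch below serializes a dictionary with the standard json module. The function name stringify_sketch, the example spec contents, and the compact separators are assumptions made for illustration; the actual formatting used by ZarrSpecWriter.stringify may differ.

```python
import json


def stringify_sketch(spec):
    """Serialize a spec-like dictionary to a compact JSON string.

    Hypothetical stand-in, not the hdmf_zarr implementation.
    """
    # Compact separators are an assumption made to keep the string short.
    return json.dumps(spec, separators=(",", ":"))


# Hypothetical spec content used only for this example.
spec = {"data_type_def": "MyType", "doc": "an example type"}
text = stringify_sketch(spec)
print(text)
```

Parsing the string back with json.loads returns an equal dictionary, which is the property a reader of the stored spec relies on.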

write_spec(spec, path)

Write a spec to the given path

write_namespace(namespace, path)

Write a namespace to the given path

class hdmf_zarr.utils.ZarrSpecReader(group, source='.')

Bases: SpecReader

Class to read format specs from Zarr

Parameters:
  • group (Group) – the Zarr file to read specs from

  • source (str) – the path spec files are relative to

read_spec(spec_path)

Read a spec from the given path

read_namespace(ns_path)

Read a namespace from the given path

class hdmf_zarr.utils.ZarrDataIO(data, chunks=None, fillvalue=None, compressor=None, filters=None, link_data=False)

Bases: DataIO

Wrap data arrays for write via ZarrIO to customize I/O behavior, such as compression and chunking for data arrays.

Parameters:
  • data (ndarray or list or tuple or Array or Iterable) – the data to be written. NOTE: If a zarr.Array is used, all settings other than link_data are ignored, because ZarrIO will either link to the dataset or copy it as is.

  • chunks (list or tuple) – Chunk shape

  • fillvalue (None) – Value to be returned when reading uninitialized parts of the dataset

  • compressor (Codec or bool) – Zarr compressor filter to be used. Set to True to use the Zarr default. Set to False to disable compression.

  • filters (list or tuple) – One or more Zarr-supported codecs used to transform data prior to compression.

  • link_data (bool) – If data is a zarr.Array, whether it should be linked to or copied. NOTE: This parameter is only allowed if data is a zarr.Array.

property io_settings

class hdmf_zarr.utils.ZarrReference(source=None, path=None, object_id=None, source_object_id=None)

Bases: dict

Data structure to describe a reference to another container used with the ZarrIO backend

Parameters:
  • source (str) – Source of referenced object. Usually the relative path to the Zarr file containing the referenced object

  • path (str) – Path of referenced object within the source

  • object_id (str) – Object_id of the referenced object (if available)

  • source_object_id (str) – Object_id of the source (should always be available)

property source: str
property path: str
property object_id: str
property source_object_id: str
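Because ZarrReference derives from dict and exposes its fields as properties, the general pattern can be sketched with a small dict subclass. The class ReferenceSketch and the example values below are hypothetical and only model the idea; they are not the hdmf_zarr code.

```python
import json


class ReferenceSketch(dict):
    """Hypothetical dict-backed reference with read-only property accessors."""

    def __init__(self, source=None, path=None, object_id=None, source_object_id=None):
        # Store every field as a regular dictionary entry.
        super().__init__(
            source=source,
            path=path,
            object_id=object_id,
            source_object_id=source_object_id,
        )

    @property
    def source(self):
        return self["source"]

    @property
    def path(self):
        return self["path"]

    @property
    def object_id(self):
        return self["object_id"]

    @property
    def source_object_id(self):
        return self["source_object_id"]


# Hypothetical values used only for this example.
ref = ReferenceSketch(source=".", path="/acquisition/ts", source_object_id="abc-123")
encoded = json.dumps(ref)
print(ref.path, encoded)
```

One reason a dict base is convenient in a pattern like this is that the object can be passed straight to a JSON encoder while still offering attribute-style access to its fields.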