hdmf_zarr.utils module
Collection of utility I/O classes for the ZarrIO backend store.
- class hdmf_zarr.utils.ZarrIODataChunkIteratorQueue(number_of_jobs: int = 1, max_threads_per_process: None | int = None, multiprocessing_context: None | Literal['fork', 'spawn'] = None)
Bases: deque
Helper class used by ZarrIO to manage the write for DataChunkIterators. Each queue element must be a tuple of two elements: 1) the dataset to write to and 2) the AbstractDataChunkIterator with the data.
- Parameters:
number_of_jobs (int) – The number of jobs used to write the datasets. The default is 1.
max_threads_per_process (int or None) – Limits the number of threads used by each process. The default is None (no limits).
multiprocessing_context (str or None) – Context for multiprocessing. It can be None (default), “fork” or “spawn”. Note that “fork” is only available on UNIX systems (not Windows).
- exhaust_queue()
Read and write from any queued DataChunkIterators.
Operates in a round-robin fashion for a single job. Operates on a single dataset at a time with multiple jobs.
- append(dataset, data)
Append a value to the queue
- Parameters:
dataset (Zarr array) – The dataset where the DataChunkIterator is written to
data (AbstractDataChunkIterator) – DataChunkIterator with the data to be written
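The single-job, round-robin behavior described for exhaust_queue() can be illustrated with a small stdlib-only sketch. This is not the hdmf_zarr implementation: the class name ChunkQueueSketch is hypothetical, plain lists stand in for Zarr arrays, and each iterator is assumed to yield (index, value) pairs instead of DataChunk objects.

```python
from collections import deque


class ChunkQueueSketch(deque):
    """Hypothetical stand-in: each element is a (dataset, chunk_iterator) tuple."""

    def append(self, dataset, data):
        # Store the write target and its iterator together as one queue element.
        super().append((dataset, iter(data)))

    def exhaust_queue(self):
        # Round-robin: write one chunk from the front iterator, then move that
        # iterator to the back. Repeat until every iterator is exhausted.
        while len(self) > 0:
            dataset, chunks = self.popleft()
            try:
                index, value = next(chunks)
            except StopIteration:
                continue  # iterator is finished, so it is not re-queued
            dataset[index] = value
            super().append((dataset, chunks))


# Plain lists play the role of the datasets (an assumption for illustration).
first = [None] * 3
second = [None] * 2
queue = ChunkQueueSketch()
queue.append(first, [(0, "a0"), (1, "a1"), (2, "a2")])
queue.append(second, [(0, "b0"), (1, "b1")])
queue.exhaust_queue()
```

With this scheme the chunks of the two iterators are written alternately, so no single iterator has to be fully consumed before the next one starts.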
- static initializer_wrapper(operation_to_run: callable, process_initialization: callable, initialization_arguments: Iterable, max_threads_per_process: int | None = None)
Needed as part of a bug fix for cloud memory leaks discovered by the SpikeInterface team.
The recommended fix is to have global wrappers for the working initializer that limit the threads used per process.
- class hdmf_zarr.utils.ZarrSpecWriter(group)
Bases: SpecWriter
Class used to write format specs to Zarr
- Parameters:
group (Group) – the Zarr file to write specs to
- static stringify(spec)
Converts a spec into a JSON string to write to a dataset
- write_spec(spec, path)
Write a spec to the given path
- write_namespace(namespace, path)
Write a namespace to the given path
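As a rough illustration of the stringify-then-write flow, the sketch below serializes a spec dictionary to JSON and stores it under a path. It is stdlib-only and makes several assumptions: a plain dict stands in for the Zarr group, the compact separators are one possible choice, and the spec contents and path are invented for the example.

```python
import json


def stringify(spec):
    # Serialize the spec to a JSON string; compact separators are an assumption.
    return json.dumps(spec, separators=(",", ":"))


def write_spec(store, spec, path):
    # The dict `store` stands in for a Zarr group (assumption for illustration).
    store[path] = stringify(spec)


store = {}
spec = {"data_type_def": "Example", "doc": "an example type", "attributes": []}
spec_path = "specifications/example/base"  # hypothetical path
write_spec(store, spec, spec_path)

# Reading the spec back is a matter of parsing the stored JSON string.
restored = json.loads(store[spec_path])
```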
- class hdmf_zarr.utils.ZarrSpecReader(group)
Bases: SpecReader
Class to read format specs from Zarr
- Parameters:
group (Group) – the Zarr file to read specs from
- read_spec(spec_path)
Read a spec from the given path
- read_namespace(ns_path)
Read a namespace from the given path
- class hdmf_zarr.utils.ZarrDataIO(data, chunks=None, fillvalue=None, compressor=None, filters=None, link_data=False)
Bases: DataIO
Wrap data arrays for write via ZarrIO to customize I/O behavior, such as compression and chunking for data arrays.
- Parameters:
data (ndarray or list or tuple or Array or Iterable) – the data to be written. NOTE: If a zarr.Array is used, all other settings but link_data will be ignored as the dataset will either be linked to or copied as is in ZarrIO.
fillvalue (None) – Value to be returned when reading uninitialized parts of the dataset
compressor (Codec or bool) – Zarr compressor filter to be used. Set to True to use Zarr default. Set to False to disable compression.
filters (list or tuple) – One or more Zarr-supported codecs used to transform data prior to compression.
link_data (bool) – If data is a zarr.Array, should it be linked to or copied. NOTE: This parameter is only allowed if data is a zarr.Array
- property link_data: bool
Bool indicating whether the data should be linked to or copied. NOTE: Only applies to zarr.Array type data
- static from_h5py_dataset(h5dataset, **kwargs)
Factory method to create a ZarrDataIO instance from an h5py.Dataset. The ZarrDataIO object wraps the h5py.Dataset, and the I/O filter settings are inferred from the filters used in h5py so that the options in Zarr match (if possible) the options used in HDF5.
- Parameters:
h5dataset (h5py.Dataset) – h5py.Dataset object that should be wrapped
kwargs – Other keyword arguments to pass to ZarrDataIO.__init__
- Returns:
ZarrDataIO object wrapping the dataset
- static hdf5_to_zarr_filters(h5dataset) -> list
From the given h5py.Dataset infer the corresponding filters to use in Zarr
- static is_h5py_dataset(obj)
Check if the object is an instance of h5py.Dataset without requiring import of h5py
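One way to check for an h5py.Dataset without importing h5py is to compare the module and class name of the object's type. The sketch below shows that idea; the function name looks_like_h5py_dataset is hypothetical, and whether hdmf_zarr uses exactly this comparison is an assumption.

```python
def looks_like_h5py_dataset(obj):
    # Compare the type's module and name instead of using isinstance(),
    # so h5py never has to be imported.
    cls = type(obj)
    return (cls.__module__, cls.__name__) == ("h5py._hl.dataset", "Dataset")


# A fake class with the same module and name, used only for demonstration.
FakeDataset = type("Dataset", (), {"__module__": "h5py._hl.dataset"})

is_fake_detected = looks_like_h5py_dataset(FakeDataset())
is_list_detected = looks_like_h5py_dataset([1, 2, 3])
```

A name-based check like this keeps h5py an optional dependency, at the cost of not recognizing subclasses of the dataset class.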
- class hdmf_zarr.utils.ZarrReference(source=None, path=None, object_id=None, source_object_id=None)
Bases: dict
Data structure to describe a reference to another container used with the ZarrIO backend
- Parameters:
source (str) – Source of referenced object. Usually the relative path to the Zarr file containing the referenced object
path (str) – Path of referenced object within the source
object_id (str) – Object_id of the referenced object (if available)
source_object_id (str) – Object_id of the source (should always be available)
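Because the reference is a dict, it can be serialized directly, for example to JSON. The following stdlib-only sketch mirrors the four documented fields; the class name ReferenceSketch, the property accessors, and the example values are assumptions made for illustration and are not taken from hdmf_zarr.

```python
import json


class ReferenceSketch(dict):
    """Hypothetical dict-based reference with the four documented fields."""

    def __init__(self, source=None, path=None, object_id=None, source_object_id=None):
        super().__init__(
            source=source,
            path=path,
            object_id=object_id,
            source_object_id=source_object_id,
        )

    @property
    def source(self):
        return self["source"]

    @property
    def path(self):
        return self["path"]


# Example values are invented for the demonstration.
ref = ReferenceSketch(source=".", path="/acquisition/series", source_object_id="1234")
encoded = json.dumps(ref)      # works because the object is a dict
decoded = json.loads(encoded)  # a plain dict with the same keys
```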