TODO

v0.2 - update to linkml-arrays and formal release

NWB schema translation

handle links field in groups
handle compound dtype like in ophys.PlaneSegmentation.pixel_mask
handle compound dtype like in TimeSeriesReferenceVectorData
Create a validator that checks that all the lists in a compound dtype dataset are the same length
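A minimal sketch of the same-length check described above, assuming a compound dtype dataset is represented as a mapping from field name to a list of values. The function name check_equal_lengths and the example column names are hypothetical, not part of nwb_linkml. In the generated models this logic would more likely live inside a pydantic validator.

```python
from typing import Mapping, Sequence


def check_equal_lengths(columns: Mapping[str, Sequence]) -> int:
    """Return the shared length of all columns, or raise ValueError if they differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Compound dtype columns differ in length: {lengths}")
    # An empty mapping is treated as having zero rows.
    return next(iter(lengths.values()), 0)


# Hypothetical columns loosely modeled on pixel_mask (x, y, weight).
pixel_mask = {"x": [1, 2, 3], "y": [4, 5, 6], "weight": [0.5, 0.25, 1.0]}
n_rows = check_equal_lengths(pixel_mask)
```

Returning the shared length rather than a boolean lets the caller reuse it, for example as a row count.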

Cleanup

[ ] Update pydantic generator
[ ] Restore regressions from stripping the generator
[x] Make any_of with array ranges work
[ ] PR upstream equals_string and ifabsent (if the existing PR doesn’t fix it)
[ ] Use the class rather than a string in _get_class_slot_range_origin:
    or inlined_as_list
    or (
        # sv.get_identifier_slot(range_cls.name, use_key=True) is None and
[ ] Make a minimal pydanticgen-only package to slim linkml deps?
[ ] Disambiguate “maps” terminology - split out simple maps from e.g. the dataset mapping classes
[ ] Remove unnecessary imports
- dask
- nptyping
[ ] Adapt the split generation to the new split generator style

Important things that are not implemented yet!

[x] nwb_linkml.adapters.classes.ClassAdapter.handle_dtype() does not yet handle compound dtypes, leaving them as AnyType instead. This is fine for a first draft since they are used rarely within NWB, but we will need to handle them by making slots for each of the dtypes since they typically represent table-like data.
[ ] Need to handle DynamicTables!
- Adding columns?
- Validating eg. all are same length?
- Or do we want to just say “no dynamictables, just subclass and add more slots since it’s super easy to do that.”
- method to return a dataframe
- append rows/this should just be a df basically.
- existing handler is broken, for example, in maps/hdmf
[ ] Handle indirect indexing eg. https://pynwb.readthedocs.io/en/stable/tutorials/general/plot_timeintervals.html#accessing-referenced-timeseries
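As a rough illustration of the indirect indexing item above: a row that references a time series can be thought of as a start index, a count, and a pointer to the referenced data. The sketch below assumes that shape. TimeSeriesRef and its resolve method are hypothetical stand-ins, not the nwb_linkml or PyNWB API, and a plain list stands in for the referenced data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TimeSeriesRef:
    """Hypothetical stand-in for one row that points into a time series."""

    idx_start: int
    count: int
    data: Sequence  # stands in for the referenced time series' data

    def resolve(self) -> Sequence:
        """Return the slice of the referenced data that this row points at."""
        if self.idx_start < 0 or self.count < 0:
            raise ValueError("idx_start and count must be non-negative")
        stop = self.idx_start + self.count
        if stop > len(self.data):
            raise IndexError("reference extends past the end of the data")
        return self.data[self.idx_start:stop]


samples = [10, 11, 12, 13, 14, 15]
ref = TimeSeriesRef(idx_start=2, count=3, data=samples)
selected = ref.resolve()
```

How missing or invalid references should be represented is left open here; this sketch simply rejects negative values.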

Remove monkeypatches/overrides once PRs are closed

[ ] https://github.com/linkml/linkml-runtime/pull/330

Tests

[ ] Ensure schemas and pydantic modules in repos are up to date

Docs TODOs

Todo

Implement reading that skips arrays. They are fast to read with the ArrayProxy class and dask, but there are times when we might want to leave them out of the read entirely. This might be better implemented as a filter on model_dump, but we need to investigate further how best to support reading just metadata, or even a specific field value, or whether we should leave that to other implementations (e.g. once we do SQL export) rather than rigging up a whole query system ourselves.
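One way to picture the filter idea is a post-hoc pass over a dumped model (a nested dict, as model_dump would produce) that drops anything array-like. This is only a sketch under assumptions: is_array_like, drop_arrays, and FakeArray are hypothetical names, and "array-like" is guessed to mean "has shape and dtype attributes".

```python
from typing import Any


def is_array_like(value: Any) -> bool:
    """Assumption: anything exposing `shape` and `dtype` counts as an array."""
    return hasattr(value, "shape") and hasattr(value, "dtype")


def drop_arrays(dumped: Any) -> Any:
    """Recursively copy a dumped model, omitting array-like values."""
    if isinstance(dumped, dict):
        return {k: drop_arrays(v) for k, v in dumped.items() if not is_array_like(v)}
    if isinstance(dumped, list):
        return [drop_arrays(v) for v in dumped if not is_array_like(v)]
    return dumped


class FakeArray:
    """Hypothetical placeholder for a lazily loaded array."""

    shape = (1000, 3)
    dtype = "float64"


dumped = {
    "name": "acquisition",
    "data": FakeArray(),
    "meta": {"rate": 30.0, "trace": FakeArray()},
}
metadata_only = drop_arrays(dumped)
```

A filter like this only trims the output. Skipping the read itself would still need support in the reader.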


Todo

Implement HDF5 writing.

Need to create inverse mappings that can take pydantic models to hdf5 groups and datasets. If more metadata about the generation process needs to be preserved (eg. explicitly notating that something is an attribute, dataset, or group), then we can make use of the LinkML_Meta model. If the model to edit has been loaded from an HDF5 file (rather than freshly created), then the hdf5_path should be populated, making mapping straightforward, but we probably want to generalize that to deterministically get hdf5_path from position in the NWBFile object. I think that might require us to explicitly annotate when something is supposed to be a reference vs. the original in the model representation, or else it’s ambiguous.

Otherwise, it should be a matter of detecting changes from the file if it already exists, and then writing them.
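To make the "path from position" idea concrete, here is a minimal sketch that walks a nested dict (standing in for a dumped model) and derives a slash-separated path for each leaf from the keys that lead to it. The function iter_hdf5_paths and the example structure are hypothetical. The sketch does not distinguish attributes from datasets, and it does not address the reference vs. original ambiguity noted above.

```python
from typing import Any, Iterator, Tuple


def iter_hdf5_paths(node: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs, with paths built from the keys leading to each leaf."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_hdf5_paths(child, f"{prefix}/{key}")
    else:
        # A bare leaf with no enclosing keys maps to the root path.
        yield (prefix or "/", node)


# Hypothetical nested structure standing in for a dumped model.
nwbfile = {
    "general": {"subject": {"species": "Mus musculus"}},
    "acquisition": {"ts": {"rate": 30.0}},
}
paths = dict(iter_hdf5_paths(nwbfile))
```

Because the path depends only on position, the same object always maps to the same location, which is the deterministic property the entry asks for.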


Todo

Test find_references() !


Todo

This is likely deprecated, check usage.


Todo

This is likely deprecated, check usage.


Todo

This is likely deprecated, check usage.


Todo

This is likely deprecated, check usage.


Todo

Document Pydantic model generation


Todo

Document provider usage


Todo

Link to relevant adapter classes
