Dataset

Adapter for NWB datasets to linkml Classes

class DatasetMap

Abstract builder class for dataset elements

abstract classmethod check(cls: Dataset) → bool: Check if this map applies

abstract classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Apply this mapping
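The check/apply pair can be pictured with a small, self-contained sketch. It does not import the real package: `DatasetSpec`, `Result`, `SketchMap`, and `NamedOnlyMap` are hypothetical stand-ins for `Dataset`, `BuildResult`, `DatasetMap`, and a concrete map, and the dataset argument is called `spec` here only to keep it distinct from the classmethod's own `cls`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class DatasetSpec:
    """Hypothetical stand-in for the NWB Dataset model."""
    name: Optional[str] = None
    dtype: Any = None
    doc: str = ""


@dataclass
class Result:
    """Hypothetical stand-in for BuildResult: collected slots and classes."""
    slots: list = field(default_factory=list)
    classes: list = field(default_factory=list)


class SketchMap(ABC):
    """Mirrors the abstract check/apply interface described above."""

    @classmethod
    @abstractmethod
    def check(cls, spec: DatasetSpec) -> bool:
        """Return True if this map applies to the dataset."""

    @classmethod
    @abstractmethod
    def apply(cls, spec: DatasetSpec, res: Optional[Result] = None,
              name: Optional[str] = None) -> Result:
        """Transform the build result for a dataset this map applies to."""


class NamedOnlyMap(SketchMap):
    """Toy map (an assumption for illustration): applies to any named dataset."""

    @classmethod
    def check(cls, spec: DatasetSpec) -> bool:
        return spec.name is not None

    @classmethod
    def apply(cls, spec: DatasetSpec, res: Optional[Result] = None,
              name: Optional[str] = None) -> Result:
        if res is None:
            res = Result()
        res.slots.append({"name": name or spec.name, "description": spec.doc})
        return res
```

A caller would first ask `NamedOnlyMap.check(spec)` and only call `apply` when it returns `True`.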

class MapScalar

Datasets that hold a single value should be a scalar value, not an array of size 1.

Replaces the built class with a slot.

Examples

NWB Schema

datasets:
- name: MyScalar
  doc: A scalar
  dtype: int32
  quantity: '?'

LinkML

slots:
- name: MyScalar
  description: A scalar
  multivalued: false
  range: int32
  required: false

classmethod check(cls: Dataset) → bool

Attr	Value
`neurodata_type_inc`	`None`
`attributes`	`None`
`dims`	`None`
`shape`	`None`
`name`	`str`

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Map to a scalar value
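The table and example above can be sketched roughly as follows. This is a minimal illustration rather than the library's implementation: `DatasetSpec`, `is_scalar`, and `scalar_slot` are hypothetical names, and the rule that a `'?'` or `'*'` quantity makes the slot optional is an assumption drawn from the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DatasetSpec:
    """Hypothetical stand-in for the NWB Dataset model."""
    name: Optional[str] = None
    doc: str = ""
    dtype: Any = None
    quantity: Any = 1
    neurodata_type_inc: Optional[str] = None
    attributes: Optional[list] = None
    dims: Optional[list] = None
    shape: Optional[list] = None


def is_scalar(spec: DatasetSpec) -> bool:
    """Conditions from the table: a name is set, everything else is None."""
    return (
        isinstance(spec.name, str)
        and spec.neurodata_type_inc is None
        and spec.attributes is None
        and spec.dims is None
        and spec.shape is None
    )


def scalar_slot(spec: DatasetSpec) -> dict:
    """Build a slot dict shaped like the LinkML example above."""
    return {
        "name": spec.name,
        "description": spec.doc,
        "multivalued": False,
        "range": spec.dtype,
        # Assumption: '?' and '*' quantities mean the value may be absent.
        "required": spec.quantity not in ("?", "*"),
    }


my_scalar = DatasetSpec(name="MyScalar", doc="A scalar", dtype="int32", quantity="?")
```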

class MapScalarAttributes

A scalar with attributes gets an additional slot “value” that contains the actual scalar value of this field

Examples

NWB Schema

datasets:
- name: starting_time
  dtype: float64
  doc: Timestamp of the first sample in seconds. When timestamps are uniformly
    spaced, the timestamp of the first sample can be specified and all subsequent
    ones calculated from the sampling rate attribute.
  quantity: '?'
  attributes:
  - name: rate
    dtype: float32
    doc: Sampling rate, in Hz.
  - name: unit
    dtype: text
    value: seconds
    doc: Unit of measurement for time, which is fixed to 'seconds'.

LinkML

classes:
- name: starting_time
  description: Timestamp of the first sample in seconds. When timestamps are
    uniformly spaced, the timestamp of the first sample can be specified and all
    subsequent ones calculated from the sampling rate attribute.
  attributes:
    name:
      name: name
      ifabsent: string(starting_time)
      range: string
      required: true
      equals_string: starting_time
    rate:
      name: rate
      description: Sampling rate, in Hz.
      range: float32
    unit:
      name: unit
      description: Unit of measurement for time, which is fixed to 'seconds'.
      range: text
    value:
      name: value
      range: float64
      required: true
  tree_root: true

classmethod check(cls: Dataset) → bool

Attr	Value
`neurodata_type_inc`	`None`
`attributes`	`True`
`dims`	`None`
`shape`	`None`
`name`	`str`

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Map to a scalar attribute with an adjoining “value” slot
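As an illustration of the shape this mapping produces, the following sketch assembles a class dictionary with a fixed `name` slot, one slot per attribute, and a trailing `value` slot. The function `scalar_with_attrs` and its plain-dict inputs are hypothetical; only the output layout is taken from the LinkML example above.

```python
def scalar_with_attrs(name: str, dtype: str, doc: str, attributes: list) -> dict:
    """Build a class dict in which the scalar itself lives in a 'value' slot."""
    attrs = {
        # The class name is pinned to the dataset name, as in the example.
        "name": {
            "name": "name",
            "ifabsent": f"string({name})",
            "range": "string",
            "required": True,
            "equals_string": name,
        }
    }
    for attr in attributes:
        attrs[attr["name"]] = {
            "name": attr["name"],
            "description": attr["doc"],
            "range": attr["dtype"],
        }
    # The actual scalar value of the dataset goes in an extra slot.
    attrs["value"] = {"name": "value", "range": dtype, "required": True}
    return {"name": name, "description": doc, "attributes": attrs, "tree_root": True}


starting_time = scalar_with_attrs(
    name="starting_time",
    dtype="float64",
    doc="Timestamp of the first sample in seconds.",
    attributes=[
        {"name": "rate", "dtype": "float32", "doc": "Sampling rate, in Hz."},
        {"name": "unit", "dtype": "text", "doc": "Unit of measurement for time."},
    ],
)
```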

class MapListlike

Datasets that refer to a list of other datasets.

Used exactly once in the core schema, in ImageReferences, an array of references to other Image datasets. We ignore the usual array structure and unnest the implicit array into a slot named after the target type, rather than after the oddly named num_images dimension, so that the pydantic model ultimately gets a well-behaved single-level list.

Examples

NWB Schema

datasets:
- neurodata_type_def: ImageReferences
  neurodata_type_inc: NWBData
  dtype:
    target_type: Image
    reftype: object
  dims:
  - num_images
  shape:
  - null
  doc: Ordered dataset of references to Image objects.

LinkML

classes:
- name: ImageReferences
  description: Ordered dataset of references to Image objects.
  is_a: NWBData
  attributes:
    name:
      name: name
      range: string
      required: true
    image:
      name: image
      description: Ordered dataset of references to Image objects.
      multivalued: true
      range: Image
      required: true
  tree_root: true

classmethod check(cls: Dataset) → bool

Check if we are a 1D dataset that isn’t a normal datatype

Attr	Value
`is_1d()`	`True`
`dtype`	`Class`

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Map to a list of the given class
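A rough sketch of the unnesting step, assuming the slot name is the snake-cased target type (which yields `image` for `Image`, as in the example). The helpers `camel_to_snake` and `listlike_slot` are hypothetical names, and the dataset is passed as a plain dict rather than the real model.

```python
import re


def camel_to_snake(name: str) -> str:
    """Hypothetical helper: 'TimeSeries' -> 'time_series', 'Image' -> 'image'."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def listlike_slot(dataset: dict) -> dict:
    """Turn a 1D dataset of references into one multivalued slot."""
    target = dataset["dtype"]["target_type"]
    return {
        # Named after the target type, not after the dimension name.
        "name": camel_to_snake(target),
        "description": dataset["doc"],
        "multivalued": True,
        "range": target,
        "required": True,
    }


image_references = {
    "neurodata_type_def": "ImageReferences",
    "dtype": {"target_type": "Image", "reftype": "object"},
    "dims": ["num_images"],
    "shape": [None],
    "doc": "Ordered dataset of references to Image objects.",
}
slot = listlike_slot(image_references)
```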

class MapArraylike

Datasets without any additional attributes don’t create their own subclass, they’re just an array :).

Replace the base class with a slot that defines the array.

Examples

E.g., from image.ImageSeries:

NWB Schema

datasets:
- name: data
  dtype: numeric
  dims:
  - - frame
    - x
    - y
  - - frame
    - x
    - y
    - z
  shape:
  - - null
    - null
    - null
  - - null
    - null
    - null
    - null
  doc: Binary data representing images across frames. If data are stored in an
    external file, this should be an empty 3D array.

LinkML

slots:
- name: data
  description: Binary data representing images across frames. If data are stored in
    an external file, this should be an empty 3D array.
  multivalued: false
  range: numeric
  required: true
  any_of:
  - array:
      dimensions:
      - alias: frame
      - alias: x
      - alias: y
  - array:
      dimensions:
      - alias: frame
      - alias: x
      - alias: y
      - alias: z

classmethod check(cls: Dataset) → bool

Check if we’re a plain array

Attr	Value
`name`	`True`
`dims`	`True`
`shape`	`True`
`has_attrs()`	`False`
`is_compound()`	`False`

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Map to an array class and the adjoining slot
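The translation from NWB `dims` to the `any_of` array alternatives in the example can be sketched as below. The function `array_any_of` is a hypothetical name; it ignores `shape` because every entry in the ImageSeries example is null, and it accepts either a flat list of dimension names or a nested list of alternatives.

```python
def array_any_of(dims: list) -> list:
    """Convert NWB dims into a list of LinkML-style array alternatives."""
    # A flat list such as ['frame', 'x', 'y'] is a single alternative.
    if dims and not isinstance(dims[0], list):
        dims = [dims]
    return [
        {"array": {"dimensions": [{"alias": dim} for dim in alternative]}}
        for alternative in dims
    ]


image_series_dims = [["frame", "x", "y"], ["frame", "x", "y", "z"]]
alternatives = array_any_of(image_series_dims)
```

Each entry in `alternatives` corresponds to one `- array:` item under `any_of` in the LinkML output above.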

class MapArrayLikeAttributes

The most general case - treat everything that isn’t handled by one of the special cases as an array!

Examples

NWB Schema

datasets:
- neurodata_type_def: Image
  neurodata_type_inc: NWBData
  dtype: numeric
  dims:
  - - x
    - y
  - - x
    - y
    - r, g, b
  - - x
    - y
    - r, g, b, a
  shape:
  - - null
    - null
  - - null
    - null
    - 3
  - - null
    - null
    - 4
  doc: An abstract data type for an image. Shape can be 2-D (x, y), or 3-D where the
    third dimension can have three or four elements, e.g. (x, y, (r, g, b)) or
    (x, y, (r, g, b, a)).
  attributes:
  - name: resolution
    dtype: float32
    doc: Pixel resolution of the image, in pixels per centimeter.
    required: false
  - name: description
    dtype: text
    doc: Description of the image.
    required: false

LinkML

classes:
- name: Image
  description: An abstract data type for an image. Shape can be 2-D (x, y), or 3-D
    where the third dimension can have three or four elements, e.g. (x, y, (r, g,
    b)) or (x, y, (r, g, b, a)).
  is_a: NWBData
  attributes:
    name:
      name: name
      range: string
      required: true
    resolution:
      name: resolution
      description: Pixel resolution of the image, in pixels per centimeter.
      range: float32
    description:
      name: description
      description: Description of the image.
      range: text
    array:
      name: array
      range: numeric
      any_of:
      - array:
          dimensions:
          - alias: x
          - alias: y
      - array:
          dimensions:
          - alias: x
          - alias: y
          - alias: r_g_b
            exact_cardinality: 3
      - array:
          dimensions:
          - alias: x
          - alias: y
          - alias: r_g_b_a
            exact_cardinality: 4
  tree_root: true

NEEDS_NAME = True

classmethod check(cls: Dataset) → bool: Check that we’re an array with some additional metadata

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Map to an arraylike class
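In the Image example, dimension names such as `r, g, b` become the alias `r_g_b`, and fixed sizes in `shape` become `exact_cardinality`. A minimal sketch of that pairing follows; `dimension` and `array_spec` are hypothetical helpers, and the alias cleanup rule (collapse runs of non-alphanumeric characters to an underscore) is an assumption inferred from the example.

```python
import re
from typing import Optional


def dimension(dim: str, size: Optional[int]) -> dict:
    """Build one dimension entry from a dim name and its shape entry."""
    # Assumed cleanup rule: 'r, g, b' -> 'r_g_b'.
    alias = re.sub(r"[^0-9A-Za-z]+", "_", dim).strip("_")
    entry = {"alias": alias}
    if size is not None:
        # A null shape entry means the size is unconstrained.
        entry["exact_cardinality"] = size
    return entry


def array_spec(dims: list, shape: list) -> dict:
    """Pair each dim with its shape entry to form one array alternative."""
    return {"array": {"dimensions": [dimension(d, s) for d, s in zip(dims, shape)]}}


rgb = array_spec(["x", "y", "r, g, b"], [None, None, 3])
```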

class MapClassRange

Datasets that are a simple named reference to another type without any additional modification to that type.

classmethod check(cls: Dataset) → bool: Check that we are a dataset with a neurodata_type_inc and a name but nothing else

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Replace the base class with a slot with an annotation that indicates it should use the Named generic when generated to pydantic
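The check described above might look roughly like the sketch below. `DatasetSpec` and `is_class_range` are hypothetical, and which fields count as "nothing else" (here: attributes, dims, shape, and dtype) is an assumption rather than something the documentation spells out.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class DatasetSpec:
    """Hypothetical stand-in for the NWB Dataset model."""
    name: Optional[str] = None
    neurodata_type_inc: Optional[str] = None
    attributes: Optional[list] = None
    dims: Optional[list] = None
    shape: Optional[list] = None
    dtype: Any = None


def is_class_range(spec: DatasetSpec) -> bool:
    """A named reference to another type, with no further modification."""
    return (
        spec.name is not None
        and spec.neurodata_type_inc is not None
        # Assumption: these are the fields that must be unset.
        and not spec.attributes
        and spec.dims is None
        and spec.shape is None
        and spec.dtype is None
    )
```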

class MapVectorClassRange

Map a VectorData class that is a reference to another class as simply a multivalued slot range, rather than an independent class

classmethod check(cls: Dataset) → bool: Check that we are a VectorData object without any additional attributes with a dtype that refers to another class

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Create a slot that replaces the base class just as a list[ClassRef]

class MapNVectors

An unnamed container that indicates an arbitrary quantity of some other neurodata type.

Most commonly: VectorData is subclassed without a name and with a ‘*’ quantity to indicate arbitrary columns.

classmethod check(cls: Dataset) → bool: Check for being an unnamed multivalued vector class

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Return a slot mapping to multiple values of the type

class MapCompoundDtype

A dtype declared as an array of types that function effectively as a row in a table.

We render them just as a class with each of the dtypes as slots - they are typically used by other datasets to create a table.

E.g., base.TimeSeriesReferenceVectorData:

datasets:
- neurodata_type_def: TimeSeriesReferenceVectorData
  neurodata_type_inc: VectorData
  default_name: timeseries
  dtype:
  - name: idx_start
    dtype: int32
    doc: Start index into the TimeSeries 'data' and 'timestamp' datasets of the referenced
      TimeSeries. The first dimension of those arrays is always time.
  - name: count
    dtype: int32
    doc: Number of data samples available in this time series, during this epoch
  - name: timeseries
    dtype:
      target_type: TimeSeries
      reftype: object
    doc: The TimeSeries that this index applies to
  doc: Column storing references to a TimeSeries (rows). For each TimeSeries this
    VectorData column stores the start_index and count to indicate the range in time
    to be selected as well as an object reference to the TimeSeries.

classmethod check(cls: Dataset) → bool: Check that we’re a dataset with a compound dtype

classmethod apply(cls: Dataset, res: BuildResult | None = None, name: str | None = None) → BuildResult: Make a new class for this dtype, using its sub-dtypes as fields, and use it as the range for the parent class
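The page gives no LinkML output for this case, so the following is only a guess at the general shape: each sub-dtype becomes one slot on a new class, and reference dtypes use their target type as the range. `is_compound_dtype` and `compound_class` are hypothetical names, and the output layout is an assumption modeled on the other examples here.

```python
def is_compound_dtype(dtype) -> bool:
    """A compound dtype is declared as a list of named sub-dtypes."""
    return isinstance(dtype, list)


def compound_class(name: str, dtype: list) -> dict:
    """Render each sub-dtype as a slot on a class (assumed output shape)."""
    attributes = {}
    for sub in dtype:
        sub_dtype = sub["dtype"]
        # Reference dtypes are dicts; use the referenced type as the range.
        slot_range = sub_dtype["target_type"] if isinstance(sub_dtype, dict) else sub_dtype
        attributes[sub["name"]] = {
            "name": sub["name"],
            "description": sub["doc"],
            "range": slot_range,
        }
    return {"name": name, "attributes": attributes}


ts_ref_dtype = [
    {"name": "idx_start", "dtype": "int32", "doc": "Start index."},
    {"name": "count", "dtype": "int32", "doc": "Number of data samples."},
    {
        "name": "timeseries",
        "dtype": {"target_type": "TimeSeries", "reftype": "object"},
        "doc": "The TimeSeries that this index applies to",
    },
]
row_class = compound_class("TimeSeriesReferenceVectorData", ts_ref_dtype)
```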

class DatasetAdapter(*, cls: Dataset, parent: ClassAdapter | None = None)

Orchestrator class for datasets - calls the set of applicable mapping classes

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

TYPE: alias of Dataset

cls: Dataset

build() → BuildResult: Build the base result, and then apply the applicable mappings.

match() → Type[DatasetMap] | None

Find the map class that applies to this class

Returns: DatasetMap
Raises: RuntimeError - if more than one map matches
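The matching rule (at most one map may apply, otherwise raise) can be sketched as follows. The three toy map classes and `match_one` are hypothetical and exist only to exercise the rule; the real adapter checks the actual map classes documented on this page.

```python
from typing import Optional, Sequence


class ScalarMap:
    @classmethod
    def check(cls, spec: dict) -> bool:
        return spec.get("dims") is None


class ArrayMap:
    @classmethod
    def check(cls, spec: dict) -> bool:
        return spec.get("dims") is not None


class NamedMap:
    @classmethod
    def check(cls, spec: dict) -> bool:
        return "name" in spec


def match_one(spec: dict, maps: Sequence[type]) -> Optional[type]:
    """Return the single map whose check passes, or None if none applies."""
    matches = [m for m in maps if m.check(spec)]
    if len(matches) > 1:
        names = ", ".join(m.__name__ for m in matches)
        raise RuntimeError(f"Only one map should apply, but several matched: {names}")
    return matches[0] if matches else None
```

Raising on ambiguity, rather than picking the first match, forces the individual `check` methods to stay mutually exclusive.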

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'cls': FieldInfo(annotation=Dataset, required=True), 'parent': FieldInfo(annotation=Union[ClassAdapter, NoneType], required=False, default=None)}

Metadata about the fields defined on the model, a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

is_1d(cls: Dataset) → bool

Check if the values of a dataset are 1-dimensional.

Specifically:

* a single-layer dim/shape list of length 1, or
* a nested dim/shape list where every nested spec is of length 1
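The two cases above can be sketched on a bare list. The function `dims_are_1d` is a hypothetical name; the real function takes a Dataset, whereas this one takes a dims (or shape) list directly and treats an empty or missing list as not 1D, which is an assumption.

```python
from typing import Optional


def dims_are_1d(spec: Optional[list]) -> bool:
    """Check a dims or shape list against the two 1D cases."""
    if not spec:
        # Assumption: no dims at all is not considered 1D.
        return False
    if all(isinstance(item, list) for item in spec):
        # Nested case: every alternative must have exactly one dimension.
        return all(len(item) == 1 for item in spec)
    # Single-layer case: exactly one dimension.
    return len(spec) == 1
```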

is_compound(cls: Dataset) → bool: Check if dataset has a compound dtype

has_attrs(cls: Dataset) → bool: Check whether a dataset has any attributes that lack a default value