core/ndarray-1.0.0
An n-dimensional array.
Description
There are two ways to store the data in an ndarray.
- Inline in the tree: This is recommended only for small arrays. In this case, the entire ndarray tag may be a nested list, in which case the type of the array is inferred from the content. (See the rules for type inference in the inline-data definition below.) The inline data may also be given in the data property, in which case it is possible to explicitly specify the datatype and other properties.
- External to the tree: The data comes from a block within the same ASDF file or an external ASDF file referenced by a URI.
Schema Definitions
This node must validate against any of the following:
This type is an object with the following properties:
source
object The source of the data.
- If an integer: If non-negative, the zero-based index of the block within the same file. If negative, the index from the last block within the same file. For example, a source of -1 corresponds to the last block in the same file.
- If a string, a URI to an external ASDF file containing the block data. Relative URIs and the file: and http: protocols must be supported. Other protocols may be supported by specific library implementations.
The ability to reference block data in an external ASDF file is intentionally limited to the first block in the external ASDF file, and is intended only to support the needs of exploded. For the more general case of referencing data in an external ASDF file, use tree references.
This node must validate against any of the following:
integer
- No length restriction
string
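The source rules above can be sketched in a few lines of Python. This is only an illustration, not the reference implementation: the function name `resolve_source` and the `n_blocks` parameter (the number of blocks in the current file) are hypothetical and do not come from the schema.

```python
def resolve_source(source, n_blocks):
    """Classify a source value as an internal block index or an external URI.

    Returns ("internal", index) or ("external", uri). Hypothetical helper,
    written only to illustrate the rules in the schema description.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(source, bool):
        raise TypeError("source must be an integer or a string")
    if isinstance(source, int):
        # Negative values count back from the last block, so -1 is the last one.
        index = source if source >= 0 else n_blocks + source
        if not 0 <= index < n_blocks:
            raise IndexError(f"block {source} is out of range for {n_blocks} blocks")
        return ("internal", index)
    if isinstance(source, str):
        # An external file can only contribute its first block.
        return ("external", source)
    raise TypeError("source must be an integer or a string")


print(resolve_source(-1, 3))
print(resolve_source("external.asdf", 3))
```

The range check is an assumption; the schema itself does not say how an out-of-range index should be reported.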
data
inline-data The data for the array inline.
If datatype and/or shape are also provided, they must match the data here and can be used as a consistency check. strides, offset and byteorder are meaningless when data is provided.
shape
array No length restriction
The shape of the array.
The first entry may be the string *, indicating that the length of the first index of the array will be automatically determined from the size of the block. This is used for streaming support.
Items in the array must be any of the following types:
integer
Minimum value: 0
- This node has no type definition (unrestricted)
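As a rough illustration of the streaming case, the length of the first axis can be derived from the block size once the remaining dimensions and the element size are known. The helper name `resolve_streamed_shape` and its parameters are hypothetical, and the use of floor division (ignoring a trailing partial row) is an assumption rather than something the schema specifies.

```python
def resolve_streamed_shape(shape, itemsize, block_size):
    """Replace a leading "*" with a length derived from the block size in bytes."""
    if not shape or shape[0] != "*":
        return list(shape)
    # Number of bytes occupied by one entry along the first axis.
    row_bytes = itemsize
    for n in shape[1:]:
        row_bytes *= n
    if row_bytes == 0:
        raise ValueError("cannot infer the first dimension from zero-sized rows")
    # Assumption: an incomplete trailing row is ignored.
    return [block_size // row_bytes] + list(shape[1:])


# Ten rows of 256 float64 values have been written to the block so far.
print(resolve_streamed_shape(["*", 256], 8, 10 * 256 * 8))
```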
datatype
datatype The data format of the array elements.
byteorder
string No length restriction
The byte order (big- or little-endian) of the array data.
Only the following values are valid for this node:
big
little
offset
integer The offset, in bytes, within the data for the start of this view.
Minimum value: 0
Default value: 0
strides
array No length restriction
The number of bytes to skip in each dimension. If not provided, the array is assumed to be contiguous and in C order. If provided, must be the same length as the shape property.
Items in the array must be any of the following types:
integer
Minimum value: 1
integer
Maximum value: -1
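To make the default concrete, here is a small sketch of how contiguous strides follow from a shape and an element size. The function names are hypothetical. In C order the last index varies fastest, and in Fortran order the first index varies fastest.

```python
def c_order_strides(shape, itemsize):
    """Byte strides of a contiguous array whose last index varies fastest."""
    strides = []
    step = itemsize
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return strides[::-1]


def fortran_order_strides(shape, itemsize):
    """Byte strides of a contiguous array whose first index varies fastest."""
    strides = []
    step = itemsize
    for n in shape:
        strides.append(step)
        step *= n
    return strides


# A 1024 x 1024 array of float64 (8 bytes per element).
print(c_order_strides([1024, 1024], 8))        # [8192, 8]
print(fortran_order_strides([1024, 1024], 8))  # [8, 8192]
```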
mask
object Describes how missing values in the array are stored. If a scalar number, that number is used to represent missing values. If an ndarray, the given array provides a mask, where non-zero values represent missing values in this array. The mask array must be broadcastable to the dimensions of this array.
This node must validate against any of the following:
number
This node must validate against all of the following:
- This node has no type definition (unrestricted)
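The scalar form of mask can be illustrated with plain nested lists. The sketch below builds a boolean mask that is true wherever the data equals the sentinel. The helper name `mask_from_sentinel` is hypothetical, and the special handling of a NaN sentinel is an assumption that the schema does not address. The ndarray form of mask, including broadcasting, is not covered here.

```python
import math


def mask_from_sentinel(data, sentinel):
    """Return nested booleans that are True where data equals the sentinel."""
    if isinstance(data, list):
        return [mask_from_sentinel(item, sentinel) for item in data]
    # Assumption: NaN never compares equal to itself, so match it explicitly.
    if isinstance(sentinel, float) and math.isnan(sentinel):
        return isinstance(data, float) and math.isnan(data)
    return data == sentinel


values = [[1.5, -999], [-999, 4.0]]
print(mask_from_sentinel(values, -999))
```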
Examples ¶
An inline array, with implicit data type:
!core/ndarray-1.0.0
[[1, 0, 0],
[0, 1, 0],
[0, 0, 1]]
An inline array, with an explicit data type:
!core/ndarray-1.0.0
datatype: float64
data:
[[1, 0, 0],
[0, 1, 0],
[0, 0, 1]]
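When shape is given alongside inline data, it has to agree with the nesting of the lists. A minimal sketch of such a consistency check follows. The names `infer_shape` and `check_shape` are hypothetical, and rejecting ragged lists is an assumption about what "must match" implies.

```python
def infer_shape(data):
    """Shape of a rectangular nested list; a non-list value has shape []."""
    if not isinstance(data, list):
        return []
    if not data:
        return [0]
    inner = infer_shape(data[0])
    for item in data[1:]:
        if infer_shape(item) != inner:
            raise ValueError("nested lists are not rectangular")
    return [len(data)] + inner


def check_shape(data, declared_shape):
    """Raise ValueError when a declared shape disagrees with the inline data."""
    actual = infer_shape(data)
    if list(declared_shape) != actual:
        raise ValueError(f"declared shape {list(declared_shape)} != actual {actual}")
    return actual


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(check_shape(identity, [3, 3]))
```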
An inline structured array, where the types of each column are automatically detected:
!core/ndarray-1.0.0
[[M110, 110, 205, And],
[ M31, 31, 224, And],
[ M32, 32, 221, And],
[M103, 103, 581, Cas]]
An inline structured array, where the types of each column are explicitly specified:
!core/ndarray-1.0.0
datatype: [['ascii', 4], uint16, uint16, ['ascii', 4]]
data:
[[M110, 110, 205, And],
[ M31, 31, 224, And],
[ M32, 32, 221, And],
[M103, 103, 581, Cas]]
A double-precision array, in contiguous memory in a block within the same file:
!core/ndarray-1.0.0
source: 0
shape: [1024, 1024]
datatype: float64
byteorder: little
A view of a tile in that image:
!core/ndarray-1.0.0
source: 0
shape: [256, 256]
datatype: float64
byteorder: little
strides: [8192, 8]
offset: 2099200
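The offset in this example can be checked by hand. With strides of [8192, 8], an offset of 2099200 bytes is consistent with a tile whose first element sits at row 256, column 256 of the full image, since 256 * 8192 + 256 * 8 = 2099200. The short sketch below reproduces that arithmetic; the function names are hypothetical.

```python
def view_offset(start, strides):
    """Byte offset of the element at index `start` from the start of the data."""
    return sum(i * s for i, s in zip(start, strides))


def element_address(index, strides, offset=0):
    """Byte position of element `index` inside a view that begins at `offset`."""
    return offset + sum(i * s for i, s in zip(index, strides))


strides = [8192, 8]
offset = view_offset((256, 256), strides)
print(offset)
# Element (1, 2) of the tile is one row and two columns further on.
print(element_address((1, 2), strides, offset))
```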
A structured datatype, with nested columns for a coordinate in (ra, dec), and a 3x3 convolution kernel:
!core/ndarray-1.0.0
source: 0
shape: [64]
datatype:
  - name: coordinate
    datatype:
      - name: ra
        datatype: float64
      - name: dec
        datatype: float64
  - name: kernel
    datatype: float32
    shape: [3, 3]
byteorder: little
An array in Fortran order:
!core/ndarray-1.0.0
source: 0
shape: [1024, 1024]
datatype: float64
byteorder: little
strides: [8, 8192]
An array where values of -999 are treated as missing:
!core/ndarray-1.0.0
source: 0
shape: [256, 256]
datatype: float64
byteorder: little
mask: -999
An array where another array is used as a mask:
!core/ndarray-1.0.0
source: 0
shape: [256, 256]
datatype: float64
byteorder: little
mask: !core/ndarray-1.0.0
  source: 1
  shape: [256, 256]
  datatype: bool8
  byteorder: little
An array where the data is stored in the first block in another ASDF file:
!core/ndarray-1.0.0
source: external.asdf
shape: [256, 256]
datatype: float64
byteorder: little
Internal Definitions
scalar-datatype
object |
Describes the type of a single element.
There is a set of numeric types, each with a single identifier:
- int8, int16, int32, int64: Signed integer types, with the given bit size.
- uint8, uint16, uint32, uint64: Unsigned integer types, with the given bit size.
- float32: Single-precision floating-point type or “binary32”, as defined in IEEE 754.
- float64: Double-precision floating-point type or “binary64”, as defined in IEEE 754.
- complex64: Complex number where the real and imaginary parts are each single-precision floating-point (“binary32”) numbers, as defined in IEEE 754.
- complex128: Complex number where the real and imaginary parts are each double-precision floating-point (“binary64”) numbers, as defined in IEEE 754.
There are two distinct fixed-length string types, which must be indicated with a 2-element array where the first element is an identifier for the string type, and the second is a length:
- ascii: A string containing ASCII text (all codepoints < 128), where each character is 1 byte.
- ucs4: A string containing unicode text in the UCS-4 encoding, where each character is always 4 bytes long. Here the number of bytes used is 4 times the given length.
This node must validate against any of the following:
- No length restriction
string
Only the following values are valid for this node:
int8
uint8
int16
uint16
int32
uint32
int64
uint64
float32
float64
complex64
complex128
bool8
- No length restriction
array
The first 2 items in the list must be the following types:
string No length restriction
Only the following values are valid for this node:
ascii
ucs4
integer
Minimum value: 0
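The two accepted forms of a scalar datatype can be checked with a small predicate. This is a sketch that mirrors the rules listed above, not a replacement for schema validation; the name `is_scalar_datatype` is hypothetical.

```python
NUMERIC_TYPES = {
    "int8", "uint8", "int16", "uint16", "int32", "uint32", "int64", "uint64",
    "float32", "float64", "complex64", "complex128", "bool8",
}
STRING_TYPES = {"ascii", "ucs4"}


def is_scalar_datatype(datatype):
    """True for a numeric identifier or a [string-type, length] pair."""
    if isinstance(datatype, str):
        return datatype in NUMERIC_TYPES
    if isinstance(datatype, (list, tuple)) and len(datatype) == 2:
        kind, length = datatype
        return (
            isinstance(kind, str)
            and kind in STRING_TYPES
            # bool is a subclass of int in Python, so exclude it explicitly.
            and isinstance(length, int)
            and not isinstance(length, bool)
            and length >= 0
        )
    return False


print(is_scalar_datatype("float64"))
print(is_scalar_datatype(["ascii", 4]))
print(is_scalar_datatype("ascii"))
```

A bare `ascii` is rejected because the string types are only valid together with a length.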
datatype
object |
The data format of the array elements. May be a single scalar datatype, or may be a nested list of datatypes. When a list, each field may have a name.
This node must validate against any of the following:
- No length restriction
array
Items in the array must be any of the following types:
This type is an object with the following properties:
name
string No length restriction
The name of the field
Must match the following pattern:
[A-Za-z_][A-Za-z0-9_]*
datatype
datatype Required
byteorder
string No length restriction
The byteorder for the field. If not provided, the byteorder of the datatype as a whole will be used.
Only the following values are valid for this node:
big
little
shape
array No length restriction
Items in the array are restricted to the following types:
integer
Minimum value: 0
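To show how a nested datatype maps onto bytes, the sketch below computes the size of one record and the byte offset of each top-level field. It assumes fields are packed back to back with no padding and that bool8 occupies one byte; neither is stated in the schema. The function names and the fallback field names (`f0`, `f1`, ...) are hypothetical.

```python
# Bytes per element. The 1-byte size for bool8 is an assumption.
NUMERIC_SIZES = {
    "int8": 1, "uint8": 1, "bool8": 1,
    "int16": 2, "uint16": 2,
    "int32": 4, "uint32": 4, "float32": 4,
    "int64": 8, "uint64": 8, "float64": 8, "complex64": 8,
    "complex128": 16,
}
STRING_CHAR_SIZES = {"ascii": 1, "ucs4": 4}


def is_string_type(datatype):
    """True for a [string-type, length] pair such as ["ascii", 4]."""
    return (
        isinstance(datatype, list)
        and len(datatype) == 2
        and isinstance(datatype[0], str)
        and datatype[0] in STRING_CHAR_SIZES
        and isinstance(datatype[1], int)
    )


def datatype_size(datatype):
    """Bytes per element, assuming fields are packed without padding."""
    if isinstance(datatype, str):
        return NUMERIC_SIZES[datatype]
    if is_string_type(datatype):
        return STRING_CHAR_SIZES[datatype[0]] * datatype[1]
    return sum(field_size(field) for field in datatype)


def field_size(field):
    """Bytes used by one field, including its optional per-field shape."""
    if isinstance(field, dict):
        count = 1
        for n in field.get("shape", []):
            count *= n
        return count * datatype_size(field["datatype"])
    return datatype_size(field)


def field_offsets(datatype):
    """(name, byte offset) for each top-level field of a structured datatype."""
    offsets, position = [], 0
    for i, field in enumerate(datatype):
        name = field.get("name", f"f{i}") if isinstance(field, dict) else f"f{i}"
        offsets.append((name, position))
        position += field_size(field)
    return offsets


record = [
    {"name": "coordinate", "datatype": [
        {"name": "ra", "datatype": "float64"},
        {"name": "dec", "datatype": "float64"},
    ]},
    {"name": "kernel", "datatype": "float32", "shape": [3, 3]},
]
print(datatype_size(record))   # 16 bytes of coordinate + 36 bytes of kernel
print(field_offsets(record))
```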
inline-data
array |
Inline data is stored in YAML format directly in the tree, rather than referencing a binary block. It is made out of nested lists.
If the datatype of the array is not specified, it is inferred from the array contents. Type inference is supported only for homogeneous arrays, not tables.
- If any of the elements in the array are YAML strings, the datatype of the entire array is ucs4, with the width of the largest string in the column, otherwise...
- If any of the elements in the array are complex numbers, the datatype of the entire column is complex128, otherwise...
- If any of the types in the column are numbers with a decimal point, the datatype of the entire column is float64, otherwise...
- If any of the types in the column are integers, the datatype of the entire column is int64, otherwise...
- The datatype of the entire column is bool8.
Masked values may be included in the array using null. If an explicit mask array is also provided, it takes precedence.
Items in the array must be any of the following types:
number
- No length restriction
string
null
boolean
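The inference cascade described above can be written as a short function over an already-parsed nested list. This is a sketch under a few assumptions: YAML values have been loaded into Python objects, complex numbers arrive as Python `complex` values, and `None` (YAML null) entries are skipped because they mark masked values. The name `infer_datatype` is hypothetical.

```python
def _flatten(data):
    """Yield the leaf values of a nested list."""
    if isinstance(data, list):
        for item in data:
            yield from _flatten(item)
    else:
        yield data


def infer_datatype(data):
    """Apply the string > complex > float > integer > boolean cascade."""
    # Assumption: null entries are masked values and do not affect the type.
    values = [v for v in _flatten(data) if v is not None]
    strings = [v for v in values if isinstance(v, str)]
    if strings:
        return ["ucs4", max(len(s) for s in strings)]
    if any(isinstance(v, complex) for v in values):
        return "complex128"
    if any(isinstance(v, float) for v in values):
        return "float64"
    # bool is a subclass of int in Python, so exclude it from the integer test.
    if any(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "int64"
    return "bool8"


print(infer_datatype([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
print(infer_datatype([1, 2.5, None]))
print(infer_datatype(["M31", "M110"]))
```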
Original Schema
%YAML 1.1
---
$schema: "http://stsci.edu/schemas/yaml-schema/draft-01"
id: "http://stsci.edu/schemas/asdf/core/ndarray-1.0.0"

title: >
  An *n*-dimensional array.

description: |
  There are two ways to store the data in an ndarray.

  - Inline in the tree: This is recommended only for small arrays. In
    this case, the entire ``ndarray`` tag may be a nested list, in
    which case the type of the array is inferred from the content.
    (See the rules for type inference in the ``inline-data``
    definition below.) The inline data may also be given in the
    ``data`` property, in which case it is possible to explicitly
    specify the ``datatype`` and other properties.

  - External to the tree: The data comes from a [block](ref:block)
    within the same ASDF file or an external ASDF file referenced by a
    URI.

examples:
  -
    - An inline array, with implicit data type
    - |
        !core/ndarray-1.0.0
          [[1, 0, 0],
           [0, 1, 0],
           [0, 0, 1]]

  -
    - An inline array, with an explicit data type
    - |
        !core/ndarray-1.0.0
          datatype: float64
          data:
            [[1, 0, 0],
             [0, 1, 0],
             [0, 0, 1]]

  -
    - An inline structured array, where the types of each column are
      automatically detected
    - |
        !core/ndarray-1.0.0
          [[M110, 110, 205, And],
           [ M31,  31, 224, And],
           [ M32,  32, 221, And],
           [M103, 103, 581, Cas]]

  -
    - An inline structured array, where the types of each column are
      explicitly specified
    - |
        !core/ndarray-1.0.0
          datatype: [['ascii', 4], uint16, uint16, ['ascii', 4]]
          data:
            [[M110, 110, 205, And],
             [ M31,  31, 224, And],
             [ M32,  32, 221, And],
             [M103, 103, 581, Cas]]

  -
    - A double-precision array, in contiguous memory in a block within
      the same file
    - |
        !core/ndarray-1.0.0
          source: 0
          shape: [1024, 1024]
          datatype: float64
          byteorder: little

  -
    - A view of a tile in that image
    - |
        !core/ndarray-1.0.0
          source: 0
          shape: [256, 256]
          datatype: float64
          byteorder: little
          strides: [8192, 8]
          offset: 2099200

  -
    - A structured datatype, with nested columns for a coordinate in
      (*ra*, *dec*), and a 3x3 convolution kernel
    - |
        !core/ndarray-1.0.0
          source: 0
          shape: [64]
          datatype:
            - name: coordinate
              datatype:
                - name: ra
                  datatype: float64
                - name: dec
                  datatype: float64
            - name: kernel
              datatype: float32
              shape: [3, 3]
          byteorder: little

  -
    - An array in Fortran order
    - |
        !core/ndarray-1.0.0
          source: 0
          shape: [1024, 1024]
          datatype: float64
          byteorder: little
          strides: [8, 8192]

  -
    - An array where values of -999 are treated as missing
    - |
        !core/ndarray-1.0.0
          source: 0
          shape: [256, 256]
          datatype: float64
          byteorder: little
          mask: -999

  -
    - An array where another array is used as a mask
    - |
        !core/ndarray-1.0.0
          source: 0
          shape: [256, 256]
          datatype: float64
          byteorder: little
          mask: !core/ndarray-1.0.0
            source: 1
            shape: [256, 256]
            datatype: bool8
            byteorder: little

  -
    - An array where the data is stored in the first block in
      another ASDF file.
    - |
        !core/ndarray-1.0.0
          source: external.asdf
          shape: [256, 256]
          datatype: float64
          byteorder: little

definitions:
  scalar-datatype:
    description: |
      Describes the type of a single element.

      There is a set of numeric types, each with a single identifier:

      - `int8`, `int16`, `int32`, `int64`: Signed integer types, with
        the given bit size.

      - `uint8`, `uint16`, `uint32`, `uint64`: Unsigned integer types,
        with the given bit size.

      - `float32`: Single-precision floating-point type or "binary32",
        as defined in IEEE 754.

      - `float64`: Double-precision floating-point type or "binary64",
        as defined in IEEE 754.

      - `complex64`: Complex number where the real and imaginary parts
        are each single-precision floating-point ("binary32") numbers,
        as defined in IEEE 754.

      - `complex128`: Complex number where the real and imaginary
        parts are each double-precision floating-point ("binary64")
        numbers, as defined in IEEE 754.

      There are two distinct fixed-length string types, which must
      be indicated with a 2-element array where the first element is an
      identifier for the string type, and the second is a length:

      - `ascii`: A string containing ASCII text (all codepoints <
        128), where each character is 1 byte.

      - `ucs4`: A string containing unicode text in the UCS-4
        encoding, where each character is always 4 bytes long. Here
        the number of bytes used is 4 times the given length.

    anyOf:
      - type: string
        enum: [int8, uint8, int16, uint16, int32, uint32, int64, uint64,
               float32, float64, complex64, complex128, bool8]
      - type: array
        items:
          - type: string
            enum: [ascii, ucs4]
          - type: integer
            minimum: 0
        minLength: 2
        maxLength: 2

  datatype:
    description: |
      The data format of the array elements. May be a single scalar
      datatype, or may be a nested list of datatypes. When a list, each field
      may have a name.
    anyOf:
      - $ref: "#/definitions/scalar-datatype"
      - type: array
        items:
          anyOf:
            - $ref: "#/definitions/scalar-datatype"
            - type: object
              properties:
                name:
                  type: string
                  pattern: "[A-Za-z_][A-Za-z0-9_]*"
                  description: The name of the field
                datatype:
                  $ref: "#/definitions/datatype"
                byteorder:
                  type: string
                  enum: [big, little]
                  description: |
                    The byteorder for the field. If not provided, the
                    byteorder of the datatype as a whole will be used.
                shape:
                  type: array
                  items:
                    type: integer
                    minimum: 0
              required: [datatype]

  inline-data:
    description: |
      Inline data is stored in YAML format directly in the tree, rather than
      referencing a binary block. It is made out of nested lists.

      If the datatype of the array is not specified, it is inferred from
      the array contents. Type inference is supported only for
      homogeneous arrays, not tables.

      - If any of the elements in the array are YAML strings, the
        `datatype` of the entire array is `ucs4`, with the width of
        the largest string in the column, otherwise...

      - If any of the elements in the array are complex numbers, the
        `datatype` of the entire column is `complex128`, otherwise...

      - If any of the types in the column are numbers with a decimal
        point, the `datatype` of the entire column is `float64`,
        otherwise...

      - If any of the types in the column are integers, the `datatype`
        of the entire column is `int64`, otherwise...

      - The `datatype` of the entire column is `bool8`.

      Masked values may be included in the array using `null`. If an
      explicit mask array is also provided, it takes precedence.
    type: array
    items:
      anyOf:
        - type: number
        - type: string
        - type: "null"
        - $ref: "complex-1.0.0"
        - $ref: "#/definitions/inline-data"
        - type: boolean

anyOf:
  - $ref: "#/definitions/inline-data"
  - type: object
    properties:
      source:
        description: |
          The source of the data.

          - If an integer: If non-negative, the zero-based index of the
            block within the same file. If negative, the index from
            the last block within the same file. For example, a
            source of `-1` corresponds to the last block in the same
            file.

          - If a string, a URI to an external ASDF file containing the
            block data. Relative URIs and ``file:`` and ``http:``
            protocols must be supported. Other protocols may be supported
            by specific library implementations.

          The ability to reference block data in an external ASDF file
          is intentionally limited to the first block in the external
          ASDF file, and is intended only to support the needs of
          [exploded](ref:exploded). For the more general case of
          referencing data in an external ASDF file, use tree
          [references](ref:references).
        anyOf:
          - type: integer
          - type: string
            format: uri

      data:
        description: |
          The data for the array inline.

          If `datatype` and/or `shape` are also provided, they must
          match the data here and can be used as a consistency check.
          `strides`, `offset` and `byteorder` are meaningless when
          `data` is provided.
        $ref: "#/definitions/inline-data"

      shape:
        description: |
          The shape of the array.

          The first entry may be the string `*`, indicating that the
          length of the first index of the array will be automatically
          determined from the size of the block. This is used for
          streaming support.
        type: array
        items:
          anyOf:
            - type: integer
              minimum: 0
            - enum: ['*']

      datatype:
        description: |
          The data format of the array elements.
        $ref: "#/definitions/datatype"

      byteorder:
        description: >
          The byte order (big- or little-endian) of the array data.
        type: string
        enum: [big, little]

      offset:
        description: >
          The offset, in bytes, within the data for the start of this
          view.
        type: integer
        minimum: 0
        default: 0

      strides:
        description: >
          The number of bytes to skip in each dimension. If not provided,
          the array is assumed to be contiguous and in C order. If
          provided, must be the same length as the shape property.
        type: array
        items:
          anyOf:
            - type: integer
              minimum: 1
            - type: integer
              maximum: -1

      mask:
        description: >
          Describes how missing values in the array are stored. If a
          scalar number, that number is used to represent missing values.
          If an ndarray, the given array provides a mask, where non-zero
          values represent missing values in this array. The mask array
          must be broadcastable to the dimensions of this array.
        anyOf:
          - type: number
          - $ref: "complex-1.0.0"
          - allOf:
            - $ref: "ndarray-1.0.0"
            - datatype: bool8

    dependencies:
      source: [shape, datatype, byteorder]

    propertyOrder: [source, data, mask, datatype, byteorder, shape, offset, strides]
...