Key features of **yadg**
------------------------
Units and uncertainties
```````````````````````
One of the key features of yadg is the enforced association of units and uncertainties
with measured properties. This means that all floating-point values are stored in the
format ``{"n": float, "s": float, "u": str}``, where ``"n"`` is the nominal value,
``"s"`` is the uncertainty / error estimate, and ``"u"`` is the unit.
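
As a minimal illustration of this format, the sketch below builds one such value as a
plain Python dictionary and serialises it to JSON. The quantity (a voltage) and the
numbers are made up for the example and are not taken from any yadg output.

.. code-block:: python

    import json

    # A hypothetical measured voltage: 1.234 V with an uncertainty of 0.001 V.
    voltage = {"n": 1.234, "s": 0.001, "u": "V"}

    # The three keys map directly onto nominal value, uncertainty, and unit.
    label = f'{voltage["n"]} ± {voltage["s"]} {voltage["u"]}'

    # The structure is JSON-serialisable without any custom encoder.
    encoded = json.dumps(voltage)
    decoded = json.loads(encoded)
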
Units
+++++
yadg uses the pint package to
validate units in the created `datagrams`. For this, an extended
:class:`pint.UnitRegistry` is exposed in yadg, containing definitions of some
quantities present in the raw data files in addition to pint's standard unit
registry. This :class:`pint.UnitRegistry` should be used in downstream packages
which depend on yadg. An arbitrary unit is denoted as ``" "``. See
:mod:`yadg.dgutils.pintutils` for more info.
Uncertainties
+++++++++++++
In many cases it is possible to define more than one uncertainty: for example,
accuracy, precision, instrument resolution etc. may be available. The convention
in yadg is that when both a measure of within-measurement uncertainty (resolution)
and a cross-measurement error (accuracy) are available, ``"s"`` corresponds to
the instrumental resolution associated with each datapoint, and the accuracy of
the measurement (which is normally a higher value than that of the resolution)
should be noted in the step metadata.
Unless more information is available, when converting :class:`str` data to
:class:`float`, the uncertainty is determined from the last significant digit
specified in the :class:`str`. For this, functionality from the
uncertainties package is used.
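
The idea can be sketched with the standard library alone. The helper below,
``str_to_value``, is a hypothetical name introduced for this illustration and is not
part of yadg; it assumes that the uncertainty equals one unit in the last digit written
in the string, and it does not handle special values such as ``"nan"``.

.. code-block:: python

    from decimal import Decimal


    def str_to_value(text: str) -> tuple:
        """Return (nominal, uncertainty) for a numeric string.

        Assumption: the uncertainty is one unit in the last digit given.
        """
        number = Decimal(text)
        # The exponent of the last written digit, e.g. -3 for "1.234".
        exponent = number.as_tuple().exponent
        # Decimal(1).scaleb(-3) is Decimal("0.001").
        uncertainty = Decimal(1).scaleb(exponent)
        return float(number), float(uncertainty)


    nominal, sigma = str_to_value("1.234")
    coarse = str_to_value("12")

Trailing zeros are significant in this scheme: ``"1.50"`` carries a smaller uncertainty
than ``"1.5"``.
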
When derived data is generated by yadg, errors are propagated using the
linear error propagation implemented in the
uncertainties package.
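
For reference, linear error propagation for a product of two quantities can be written
out by hand. The sketch below uses only the standard library, assumes the two inputs
are uncorrelated, and uses a hypothetical helper name, ``propagate_product``; it shows
the underlying formula, not the code path used inside yadg.

.. code-block:: python

    import math


    def propagate_product(a: dict, b: dict, unit: str) -> dict:
        """Multiply two {"n", "s", "u"} values, propagating uncertainty linearly.

        Assumption: a and b are uncorrelated, so for f = a * b
        s_f = sqrt((df/da * s_a)**2 + (df/db * s_b)**2), with df/da = b, df/db = a.
        The unit of the result is supplied by the caller.
        """
        nominal = a["n"] * b["n"]
        sigma = math.hypot(b["n"] * a["s"], a["n"] * b["s"])
        return {"n": nominal, "s": sigma, "u": unit}


    # Made-up example values: a voltage and a current giving a power.
    voltage = {"n": 2.0, "s": 0.1, "u": "V"}
    current = {"n": 3.0, "s": 0.2, "u": "A"}
    power = propagate_product(voltage, current, unit="W")
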
Timestamping
````````````
Another key feature of yadg is the timestamping of all datapoints. The Unix
timestamp (the number of seconds elapsed since 1970-01-01 00:00:00 UTC) is used, as it
is the natural timestamp in Python and, being expressed in seconds, is easily
converted to minutes or hours.
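
The following sketch shows how such timestamps relate to Python's ``datetime`` objects
and how elapsed time follows from simple arithmetic. The dates are arbitrary examples.

.. code-block:: python

    from datetime import datetime, timezone

    # Two arbitrary, timezone-aware points in time.
    start = datetime(2021, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    end = datetime(2021, 1, 1, 13, 30, 0, tzinfo=timezone.utc)

    # datetime.timestamp() returns the Unix timestamp as a float, in seconds.
    uts_start = start.timestamp()
    uts_end = end.timestamp()

    # Elapsed time in other units is plain division.
    elapsed_minutes = (uts_end - uts_start) / 60
    elapsed_hours = (uts_end - uts_start) / 3600

    # The conversion is reversible.
    restored = datetime.fromtimestamp(uts_start, tz=timezone.utc)
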
Most of the supported file formats contain a timestamp of some kind. However, some
file formats do not define both the date and the time of each datapoint, and some define
neither. For such cases, yadg includes an "external date" interface; see
:func:`yadg.dgutils.dateutils.complete_timestamps`.
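
To illustrate the underlying idea only, the sketch below completes time-of-day strings
with a date supplied from outside the file. The helper ``combine_external_date`` is
hypothetical and does not reflect the signature or behaviour of
``complete_timestamps``; it assumes UTC and that all datapoints fall on the same day.

.. code-block:: python

    from datetime import date, datetime, timezone


    def combine_external_date(day: date, time_strings, fmt: str = "%H:%M:%S") -> list:
        """Turn time-of-day strings into Unix timestamps using an external date.

        Assumptions: times are in UTC and do not roll over midnight.
        """
        timestamps = []
        for text in time_strings:
            time_of_day = datetime.strptime(text, fmt).time()
            moment = datetime.combine(day, time_of_day, tzinfo=timezone.utc)
            timestamps.append(moment.timestamp())
        return timestamps


    # A file that stores only the time of each datapoint, with the date given externally.
    uts = combine_external_date(date(2021, 1, 1), ["12:00:00", "12:00:30"])
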
Object validation
`````````````````
Additionally, yadg provides `dataschema` and `datagram` validation functionality.
The validation of `dataschema` is handled using a
Pydantic model implemented in the
:mod:`dgbowl_schemas.yadg_dataschema` package, developed in lockstep with yadg.
This Pydantic-based validator class should be used to ensure that the incoming
`dataschema` is valid.
The validation of the created `datagram` is handled by :mod:`yadg.core.validators`.
By default, yadg checks that the `datagram` conforms to the specification. Among
other things, the validator ensures that provenance data is included for every
operation, and that uncertainties and units are specified for each measurement.
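
The kind of check described above can be sketched for a single value as follows. The
function ``check_quantity`` is a hypothetical, simplified stand-in written for this
example; it is not the implementation in :mod:`yadg.core.validators`, and it only
verifies that the nominal value, uncertainty, and unit are present and of the expected
types.

.. code-block:: python

    def check_quantity(value) -> None:
        """Raise ValueError unless value looks like {"n": float, "s": float, "u": str}."""
        if not isinstance(value, dict):
            raise ValueError("a measured value must be a dictionary")
        missing = {"n", "s", "u"} - set(value)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
        for key in ("n", "s"):
            if not isinstance(value[key], float):
                raise ValueError(f"'{key}' must be a float")
        if not isinstance(value["u"], str):
            raise ValueError("'u' must be a string")


    def is_valid(value) -> bool:
        """Convenience wrapper returning True or False instead of raising."""
        try:
            check_quantity(value)
        except ValueError:
            return False
        return True


    good = is_valid({"n": 1.234, "s": 0.001, "u": "V"})
    no_unit = is_valid({"n": 1.234, "s": 0.001})
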