yadg features
Units and uncertainties
One of the key features of yadg is the enforced association of units and
uncertainties with measured properties. This means that all floating-point values
are stored in the format {"n": float, "s": float, "u": str}, where "n" is the
nominal value, "s" is the uncertainty, and "u" is the unit.
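As a small illustration of this format, the sketch below writes one such value as a plain Python dict and round-trips it through JSON. The name temperature and the numbers are assumptions made up for the example; they are not taken from any yadg output.

```python
import json

# One measured value in the {"n", "s", "u"} format described above.
# The key "temperature" and the numbers are illustrative assumptions.
temperature = {"n": 298.15, "s": 0.01, "u": "K"}

# The three fields travel together, so the nominal value is never
# separated from its uncertainty and unit.
text = json.dumps({"temperature": temperature})
restored = json.loads(text)["temperature"]

print(f'{restored["n"]} +/- {restored["s"]} {restored["u"]}')
```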
yadg uses the pint package to validate units in the created datagrams, if
requested. For this, an extended pint.UnitRegistry is defined in yadg, containing
definitions of some quantities present in the raw data files; this
pint.UnitRegistry should be used in downstream packages depending on yadg. An
arbitrary unit is denoted as " ". See yadg.dgutils.pintutils for more info.
In many cases it is possible to define more than one uncertainty: for example,
accuracy, precision, resolution etc. may be available. The convention in yadg
is that if both a measure of within-measurement uncertainty (resolution) and a
cross-measurement error (accuracy) are available, "s"
corresponds to resolution
associated with each datapoint, and the accuracy of the measurement (which is
normally a higher value than that of the resolution) should be noted in the step
metadata.
Unless more information is available, when converting str data to float, the
uncertainty is determined from the last significant digit specified in the str.
For this, the functionality from within the uncertainties package is used.
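The last-digit rule can be sketched with the standard library alone. The helper below is hypothetical: it is not yadg's code and does not call the uncertainties package. It only assumes that the uncertainty equals one unit in the last digit written in the string.

```python
from decimal import Decimal


def float_with_uncertainty(text):
    """Return (value, uncertainty), taking 1 in the last written digit."""
    number = Decimal(text)
    if not number.is_finite():
        raise ValueError(f"cannot assign an uncertainty to {text!r}")
    # The exponent of the parsed decimal is the position of the last digit
    # that was written: "1.234" -> -3, "12" -> 0, "1.20e3" -> 1.
    exponent = number.as_tuple().exponent
    uncertainty = float(Decimal(1).scaleb(exponent))
    return float(number), uncertainty


value, sigma = float_with_uncertainty("1.234")
```

Trailing zeros count as written digits here, so "1.20e3" gets an uncertainty of 10 rather than 100.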
When derived data is generated by yadg, error propagation is handled using the linear error propagation functionality as implemented in the uncertainties package.
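To make linear error propagation concrete, here is a minimal standalone sketch. It assumes uncorrelated inputs and estimates the partial derivatives numerically. The function propagate_linear and the example numbers are hypothetical; this is not how yadg or the uncertainties package implement it.

```python
import math


def propagate_linear(func, values, sigmas, rel_step=1e-6):
    """First-order propagation of uncorrelated uncertainties through func."""
    nominal = func(*values)
    variance = 0.0
    for i, (value, sigma) in enumerate(zip(values, sigmas)):
        # Central finite difference for the partial derivative df/dx_i.
        step = abs(value) * rel_step or rel_step
        upper = list(values)
        lower = list(values)
        upper[i] = value + step
        lower[i] = value - step
        derivative = (func(*upper) - func(*lower)) / (2 * step)
        # Each input contributes (df/dx_i * sigma_i)**2 to the variance.
        variance += (derivative * sigma) ** 2
    return nominal, math.sqrt(variance)


# Example: charge Q = I * t with I = 2.0 +/- 0.1 and t = 3.0 +/- 0.2.
charge, charge_sigma = propagate_linear(lambda i, t: i * t, [2.0, 3.0], [0.1, 0.2])
```

For this product the propagated uncertainty is sqrt((3.0 * 0.1)**2 + (2.0 * 0.2)**2), which is approximately 0.5.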
Timestamping
Another key feature in yadg is the timestamping of all datapoints. The Unix timestamp is used, as it’s the natural timestamp for Python, and because it is expressed in seconds, it can be easily converted to minutes or hours.
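For illustration, the snippet below uses only the Python standard library to produce Unix timestamps and convert an elapsed time to minutes and hours. The dates are arbitrary example values, and UTC is assumed.

```python
from datetime import datetime, timezone

# Two datapoints recorded 90 minutes apart (illustrative values, UTC).
start = datetime(2021, 1, 1, 12, 0, 0, tzinfo=timezone.utc).timestamp()
end = datetime(2021, 1, 1, 13, 30, 0, tzinfo=timezone.utc).timestamp()

# Unix timestamps count seconds, so other units are a simple division away.
elapsed_s = end - start
elapsed_min = elapsed_s / 60
elapsed_h = elapsed_s / 3600
```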
Most of the supported file formats contain a timestamp of some kind. However, several
file formats may not define both date and time of each datapoint, or may define
neither. That is why yadg includes a powerful “external date” interface, see
yadg.dgutils.dateutils.complete_timestamps().
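The idea behind an external date can be sketched as follows: when a file records only the time of day, a date supplied from elsewhere is combined with each time to form a full timestamp. The helper attach_date is hypothetical and does not reproduce the signature or behaviour of complete_timestamps(). It assumes ISO-formatted times in UTC and does not handle measurements that run past midnight.

```python
from datetime import date, datetime, time, timezone


def attach_date(time_strings, external_date):
    """Combine time-of-day strings with an externally supplied date."""
    stamps = []
    for text in time_strings:
        time_of_day = time.fromisoformat(text)
        # Assumption: the times are in UTC; a real file may need a timezone.
        moment = datetime.combine(external_date, time_of_day, tzinfo=timezone.utc)
        stamps.append(moment.timestamp())
    return stamps


stamps = attach_date(["00:00:00", "00:01:30"], date(1970, 1, 2))
```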
Object validation
Additionally, yadg provides schema and datagram validation functionality, which
should be used to ensure that the incoming schema is valid and that the processed
datagram conforms to the specification. This functionality is documented in
yadg.core.validators. Among other things, the validators ensure
that provenance data is included for every operation, and that uncertainties and
units are specified for each measurement.
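As a rough picture of what the units-and-uncertainties check involves, the sketch below tests that a single value carries all three fields with the expected types. The function check_value is a hypothetical stand-in written for this example; the actual validators live in yadg.core.validators and may work differently.

```python
def check_value(name, value):
    """Raise if value is not a {"n": float, "s": float, "u": str} mapping."""
    expected = {"n": float, "s": float, "u": str}
    if not isinstance(value, dict):
        raise TypeError(f"{name}: expected a dict, got {type(value).__name__}")
    # Every measurement must carry a nominal value, uncertainty and unit.
    missing = sorted(expected.keys() - value.keys())
    if missing:
        raise ValueError(f"{name}: missing keys {missing}")
    for key, kind in expected.items():
        if not isinstance(value[key], kind):
            raise TypeError(f"{name}[{key!r}]: expected {kind.__name__}")
    return value


checked = check_value("T", {"n": 298.15, "s": 0.01, "u": "K"})
```

A value without an uncertainty, such as {"n": 298.15, "u": "K"}, raises ValueError in this sketch.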