**dgpost** features
-------------------
.. note::

    For an overview of the data-processing features within dgpost, see the
    documentation of the :mod:`dgpost.transform` module.

Pandas compatibility
````````````````````
One of the design goals of dgpost was to develop a library that can be used with
`datagrams`, with the :class:`pd.DataFrame` objects created by dgpost, and with any
other :class:`pd.DataFrame`, created e.g. by parsing an ``xlsx`` or ``csv``
file.
This is achieved by placing a few requirements on the functions in the
:mod:`dgpost.transform` module. The key requirements are:

- the function must process :class:`pint.Quantity` objects,
- the function must return data in a :class:`dict[str, pint.Quantity]` format.

If these requirements are met, the decorator function
:func:`~dgpost.transform.helpers.load_data` can be used to either extract data
from the supplied :class:`pd.DataFrame`, or wrap directly supplied data into
:class:`pint.Quantity` objects, and pass those to the called ``transform``
function transparently to the user.
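
The two requirements above can be illustrated with a minimal sketch. The function
``molar_rate`` and its arguments are hypothetical and not part of dgpost, the
:func:`~dgpost.transform.helpers.load_data` decorator is deliberately left out, and
a plain :class:`pint.UnitRegistry` is assumed instead of the registry used by dgpost:

.. code-block:: python

    import pint

    # Assumption: a fresh registry, used here only for illustration.
    ureg = pint.UnitRegistry()


    def molar_rate(
        flow: pint.Quantity,
        conc: pint.Quantity,
        output: str = "rate",
    ) -> dict[str, pint.Quantity]:
        """Hypothetical transform: molar rate from flow and concentration."""
        # The inputs are pint.Quantity objects, so the units are checked
        # and converted by pint.
        rate = (flow * conc).to("mol/s")
        # The result is returned as a dict[str, pint.Quantity].
        return {output: rate}


    result = molar_rate(ureg.Quantity(2.0, "l/min"), ureg.Quantity(0.5, "mol/l"))
    print(result["rate"].to("mol/min"))

A function written in this way does not need to know whether its inputs came from
a table or were supplied directly.
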
Units and uncertainties
```````````````````````
Another key objective of dgpost is to allow and encourage annotating data with units
as well as error estimates (uncertainties). The design philosophy is that building
unit and uncertainty awareness into the toolchain encourages users to use it and,
in the case of uncertainties, to be more thoughtful about the limitations
of their data.
As discussed in the documentation of yadg,
when experimental data is loaded from `datagrams`, it is annotated with units by
default. In dgpost, the units for the data in each column in each table are stored
as a :class:`dict[str, str]` in the ``"units"`` key of the ``df.attrs`` attribute,
and they are extracted and exported appropriately when the table is saved.
If the ``df.attrs`` attribute does not contain the ``"units"`` entry, dgpost assumes
the underlying data is unitless, and the default units selected for each function in
the :mod:`dgpost.transform` library by its developers are applied to the data.
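
The storage of units described above can be sketched with plain pandas. The helper
``column_unit`` is hypothetical and only mimics the fallback to a default unit when
the ``"units"`` entry is missing; it is not the lookup code used by dgpost:

.. code-block:: python

    import pandas as pd

    # A table annotated with units, stored as dict[str, str] in df.attrs["units"].
    df = pd.DataFrame({"T": [300.0, 310.0], "p": [1.0, 1.2]})
    df.attrs["units"] = {"T": "K", "p": "bar"}

    # A table without any unit annotation.
    plain = pd.DataFrame({"T": [300.0, 310.0]})


    def column_unit(table: pd.DataFrame, column: str, default: str) -> str:
        """Hypothetical helper: look up the unit of a column."""
        if "units" not in table.attrs:
            # No "units" entry: fall back to the default unit of the function.
            return default
        return table.attrs["units"][column]


    print(column_unit(df, "p", "Pa"))
    print(column_unit(plain, "T", "K"))
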
Internally, all units are handled using yadg's custom :class:`pint.UnitRegistry`,
via the ``pint`` library.
Uncertainties are handled using the linear uncertainty propagation library,
``uncertainties``. As the input data for
the functions in the :mod:`dgpost.transform` module is passed using
:class:`pint.Quantity` objects, which support :class:`uncertainties.unumpy`
arrays, uncertainty handling is generally transparent to both user and developer.
The notable exceptions here are transformations using fitting functions from the
``scipy`` library, where arrays containing
:class:`float` values are expected; this has to be handled explicitly by the developer.
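
One possible way of handling the fitting case explicitly is sketched below. It is an
illustration, not the approach taken in dgpost: the model ``line`` and the data are
made up, and for a :class:`pint.Quantity` the ``magnitude`` would have to be taken
first. The nominal values and standard deviations are split into two plain
:class:`float` arrays before calling ``scipy.optimize.curve_fit``:

.. code-block:: python

    import numpy as np
    from scipy.optimize import curve_fit
    from uncertainties import unumpy as unp


    def line(x, slope, intercept):
        """Hypothetical model used for the fit."""
        return slope * x + intercept


    x = np.array([0.0, 1.0, 2.0, 3.0])
    # Data with uncertainties: nominal values and standard deviations.
    y = unp.uarray([1.0, 3.0, 5.0, 7.0], [0.1, 0.1, 0.1, 0.1])

    # curve_fit expects float arrays, so the two parts are separated first.
    y_nominal = unp.nominal_values(y)
    y_sigma = unp.std_devs(y)

    # The standard deviations are passed as absolute weights of the points.
    popt, pcov = curve_fit(line, x, y_nominal, sigma=y_sigma, absolute_sigma=True)
    slope, intercept = popt
    print(slope, intercept)

The covariance matrix ``pcov`` can then be used to attach uncertainties to the
fitted parameters, if they are needed downstream.
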
When saving tables created in dgpost, the units are appended to the column
names (``csv/xlsx``) or stored in the table (``pkl/json``), while the uncertainties
may be optionally dropped from the exported table; see :mod:`dgpost.utils.save`.
Provenance
``````````
Provenance tracking is implemented in dgpost using the ``"meta"`` entry of the
``df.attrs`` attribute of the created :class:`pd.DataFrame`. This entry is exported
when the :class:`pd.DataFrame` is saved as ``pkl/json``, and contains dgpost version
information as well as a copy of the `recipe` used to create the saved object.
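
As a rough sketch of this mechanism, the ``"meta"`` entry is an ordinary item in the
``df.attrs`` dictionary. The key names inside the entry and the layout of the
`recipe` shown below are placeholders chosen for illustration, not the structure
written by dgpost:

.. code-block:: python

    import pandas as pd

    df = pd.DataFrame({"x": [1, 2, 3]})

    # Assumption: hypothetical key names and placeholder values.
    df.attrs["meta"] = {
        "version": "0.0.0",  # placeholder version string
        "recipe": {"load": [], "transform": []},  # placeholder copy of a recipe
    }

    # The entry travels with the table and can be inspected like any dict.
    print(df.attrs["meta"]["version"])
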