table: utilities for operations with tables

Code author: Peter Kraus

Provides convenience functions for working with tables, including combine_namespaces() for combining ->-separated namespaces of values or chemicals into a single namespace; combine_columns() for combining individual columns into a single column; set_uncertainty() for stripping or replacing the uncertainties of data; and apply_linear() and apply_inverse() for applying linear corrections to columns or namespaces.

Functions

`sum_namespace`(namespace[, output, fillnan, _inp])	Sums all entries within the provided namespace into one column, defined by `output`.
`combine_namespaces`(a, b[, conflicts, ...])	Combines two namespaces into one, either summing or merging entries.
`combine_columns`(a, b[, conflicts, fillnan, ...])	Combines two columns into one, by summing or replacing missing elements.
`set_uncertainty`([namespace, column, abs, ...])	Allows for stripping or replacing uncertainties using absolute and relative values.
`apply_linear`([namespace, column, slope, ...])	Allows for applying linear functions / corrections to columns and namespaces.
`apply_inverse`([namespace, column, slope, ...])	Allows for applying inverse linear functions / corrections to columns and namespaces.

dgpost.transform.table.sum_namespace(namespace: dict[str, Quantity], output: str = 'output', fillnan: bool = True, _inp: dict = {}) → dict[str, Quantity]

Sums all entries within the provided namespace into one column, defined by output.

Parameters:

namespace – Namespace to be summed.
fillnan – Toggle whether NaN values within the columns ought to be treated as zeroes (when True) or as NaN. Default is True.
output – Namespace of the returned dictionary. Defaults to "output".
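
The effect of the fillnan toggle can be illustrated with a minimal sketch in plain Python. This is not the dgpost implementation: the helper name sum_columns is hypothetical, and columns are modelled as lists of floats without units or uncertainties.

```python
import math


def sum_columns(columns, fillnan=True):
    """Element-wise sum of equally long columns (hypothetical helper)."""
    length = len(next(iter(columns.values())))
    total = []
    for i in range(length):
        values = [col[i] for col in columns.values()]
        if fillnan:
            # Treat missing entries as zero contributions.
            values = [0.0 if math.isnan(v) else v for v in values]
        # Without fillnan, a single NaN makes the whole row NaN.
        total.append(sum(values))
    return total


namespace = {"CO": [1.0, math.nan], "CO2": [2.0, 3.0]}
summed_filled = sum_columns(namespace, fillnan=True)
summed_strict = sum_columns(namespace, fillnan=False)
```

With fillnan=True the second row sums to 3.0; with fillnan=False it stays NaN.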

dgpost.transform.table.combine_namespaces(a: dict[str, Quantity], b: dict[str, Quantity], conflicts: str = 'sum', output: str = None, fillnan: bool = True, chemicals: bool = False, _inp: dict = {}) → dict[str, Quantity]

Combines two namespaces into one, either summing or merging entries.

Unit checks are performed, with the resulting units corresponding to the units in namespace a. By default, the output namespace is set to a. Optionally, the keys in each namespace can be treated as chemicals instead of strings (i.e. “C2H6” and “ethane” would be summed / merged).

Parameters:

a – Namespace a.
b – Namespace b.
conflicts – Conflict resolution scheme. Can be either "sum", where conflicting values are summed, or "replace", where conflicting values in a are overwritten by those in b.
fillnan – Toggle whether NaN values within the columns ought to be treated as zeroes or as NaN. Default is True.
chemicals – Treat keys within a and b as chemicals, and combine them accordingly. Default is False.
output – Namespace of the returned dictionary. Defaults to the namespace of a.
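
The two conflict schemes can be sketched on plain dictionaries. This is only an illustration under simplifying assumptions: the helper combine is hypothetical, values are scalars rather than Quantity columns, and unit checks, NaN handling and chemical-name matching are left out.

```python
def combine(a, b, conflicts="sum"):
    """Merge two key -> value mappings (hypothetical helper)."""
    if conflicts not in {"sum", "replace"}:
        raise ValueError(f"unknown conflict scheme: {conflicts!r}")
    out = dict(a)
    for key, value in b.items():
        if key in out and conflicts == "sum":
            # Key present in both namespaces: add the entries.
            out[key] = out[key] + value
        else:
            # New key, or "replace": the entry from b wins.
            out[key] = value
    return out


a = {"C2H6": 1.0, "CH4": 2.0}
b = {"C2H6": 0.5, "H2": 4.0}
summed = combine(a, b, conflicts="sum")
replaced = combine(a, b, conflicts="replace")
```

Keys that appear in only one namespace are carried over unchanged in both schemes.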

dgpost.transform.table.combine_columns(a: Quantity, b: Quantity, conflicts: str = 'sum', fillnan: bool = True, output: str = None, _inp: dict = {}) → dict[str, Quantity]

Combines two columns into one, by summing or replacing missing elements.

Unit checks are performed, with the resulting units corresponding to the units in column a. By default, the output column is set to a.

Parameters:

a – Column a.
b – Column b.
conflicts – Conflict resolution scheme. Can be either "sum", where the two columns are summed, or "replace", where values in column a are overwritten by non-np.nan values in column b.
fillnan – Toggle whether NaN values within the columns ought to be treated as zeroes or as NaN. Default is True. Note that conflicts="replace" and fillnan=True will lead to all NaNs in column a being set to zero.
output – Name of the output column. Defaults to the name of column a.
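
A rough sketch of how conflicts and fillnan might interact for a pair of columns is shown below. The helper combine_values is hypothetical, columns are plain lists of floats, and the handling of fillnan in the "replace" branch is one possible reading of the note above, not a statement about the dgpost source.

```python
import math


def combine_values(a, b, conflicts="sum", fillnan=True):
    """Combine two equally long columns element-wise (hypothetical helper)."""
    out = []
    for x, y in zip(a, b):
        if conflicts == "sum":
            if fillnan:
                x = 0.0 if math.isnan(x) else x
                y = 0.0 if math.isnan(y) else y
            out.append(x + y)
        elif conflicts == "replace":
            # Only non-NaN entries of b overwrite entries of a.
            value = x if math.isnan(y) else y
            # Assumption: NaNs that survive the replacement are zeroed.
            if fillnan and math.isnan(value):
                value = 0.0
            out.append(value)
        else:
            raise ValueError(f"unknown conflict scheme: {conflicts!r}")
    return out


col_a = [1.0, math.nan, math.nan]
col_b = [math.nan, 2.0, math.nan]
summed = combine_values(col_a, col_b, conflicts="sum", fillnan=True)
replaced_keep = combine_values(col_a, col_b, conflicts="replace", fillnan=False)
replaced_fill = combine_values(col_a, col_b, conflicts="replace", fillnan=True)
```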

dgpost.transform.table.set_uncertainty(namespace: dict[str, Quantity] = None, column: Quantity = None, abs: Quantity | float = None, rel: Quantity | float = None, _inp: dict = {}) → dict[str, Quantity]

Allows for stripping or replacing uncertainties using absolute and relative values. Can target either namespaces or individual columns. If both abs and rel uncertainties are provided, the higher of the two values is set. If neither abs nor rel is provided, the uncertainties are stripped.

Parameters:

namespace – The prefix of the namespace for which uncertainties are to be replaced or stripped. Cannot be supplied along with column.
column – The name of the column for which uncertainties are to be replaced or stripped. Cannot be supplied along with namespace.
abs – The absolute value of the uncertainty. If units are not supplied, the units of the column/namespace will be used. If both abs and rel are None, the existing uncertainties will be stripped.
rel – The relative value of the uncertainty, should be in dimensionless units. If both abs and rel are None, the existing uncertainties will be stripped.
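
The "higher of the two values" rule can be sketched for a single scalar value. The helper pick_uncertainty is hypothetical. It assumes that the relative uncertainty scales with the magnitude of the value and that stripping corresponds to an uncertainty of zero; units are ignored.

```python
def pick_uncertainty(value, abs_unc=None, rel_unc=None):
    """Return the uncertainty to attach to value (hypothetical helper)."""
    candidates = []
    if abs_unc is not None:
        candidates.append(abs_unc)
    if rel_unc is not None:
        # Assumption: relative uncertainty scales with |value|.
        candidates.append(abs(value) * rel_unc)
    # Neither supplied: strip, modelled here as zero uncertainty.
    return max(candidates) if candidates else 0.0


large = pick_uncertainty(10.0, abs_unc=0.1, rel_unc=0.05)
small = pick_uncertainty(1.0, abs_unc=0.1, rel_unc=0.05)
stripped = pick_uncertainty(10.0)
```

For the larger value the relative term dominates; for the smaller value the absolute term is used.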

dgpost.transform.table.apply_linear(namespace: dict[str, Quantity] = None, column: Quantity = None, slope: Quantity | float = None, intercept: Quantity | float = None, nonzero_only: bool = True, minimum: Quantity | float = None, maximum: Quantity | float = None, output: str = 'output', _inp: dict = {}) → dict[str, Quantity]

Allows for applying linear functions / corrections to columns and namespaces.

Given the linear formula, \(y = m \times x + c\), this function returns \(y\) calculated from the input, where \(m\) is the provided slope and \(c\) is the provided intercept.

The arguments slope and intercept can be provided without units, in which case the units of column or namespace are preserved. If units of slope and intercept are provided, they have to be dimensionally consistent.

The arguments slope and intercept can be provided with uncertainties, in which case the uncertainty of the output value is calculated from linear uncertainty propagation.
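
As an illustration of first-order (linear) uncertainty propagation for \(y = m \times x + c\), the sketch below assumes that \(x\), \(m\) and \(c\) are uncorrelated. The helper linear_uncertainty is hypothetical and works on plain floats; it is not taken from dgpost.

```python
import math


def linear_uncertainty(x, sx, slope, s_slope, s_intercept):
    """Standard deviation of y = slope * x + intercept (hypothetical helper).

    Partial derivatives: dy/dm = x, dy/dx = m, dy/dc = 1.
    """
    return math.sqrt((x * s_slope) ** 2 + (slope * sx) ** 2 + s_intercept ** 2)


sigma_y = linear_uncertainty(x=2.0, sx=0.1, slope=3.0, s_slope=0.2, s_intercept=0.5)
```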

Parameters:

namespace – The prefix of the namespace to which the linear function is to be applied. Cannot be supplied along with column.
column – The name of the column to which the linear function is to be applied. Cannot be supplied along with namespace.
slope – The slope \(m\).
intercept – The intercept \(c\).
nonzero_only – Whether the linear function should be applied to nonzero values of \(x\) only, defaults to True.
minimum – The minimum of the returned value. If the calculated value is below the minimum, the minimum is returned.
maximum – The maximum of the returned value. If the calculated value is above the maximum, the maximum is returned.
output – Name of the output column or the namespace of the output columns.
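
The interplay of slope, intercept, nonzero_only, minimum and maximum can be sketched for a single scalar. The helper linear is hypothetical. It assumes that zero inputs are passed through unchanged when nonzero_only is True and that clamping is applied after the linear function; units and uncertainties are omitted.

```python
def linear(x, slope=1.0, intercept=0.0, nonzero_only=True,
           minimum=None, maximum=None):
    """Apply y = slope * x + intercept with optional clamping (hypothetical)."""
    if nonzero_only and x == 0:
        # Assumption: zeros are left untouched.
        return x
    y = slope * x + intercept
    if minimum is not None:
        y = max(y, minimum)
    if maximum is not None:
        y = min(y, maximum)
    return y


plain = linear(2.0, slope=3.0, intercept=1.0)
zero_skipped = linear(0.0, slope=3.0, intercept=1.0)
zero_applied = linear(0.0, slope=3.0, intercept=1.0, nonzero_only=False)
clamped = linear(2.0, slope=3.0, intercept=1.0, maximum=5.0)
```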

dgpost.transform.table.apply_inverse(namespace: dict[str, Quantity] = None, column: Quantity = None, slope: Quantity | float = None, intercept: Quantity | float = None, nonzero_only: bool = True, minimum: Quantity | float = None, maximum: Quantity | float = None, output: str = 'output', _inp: dict = {}) → dict[str, Quantity]

Allows for applying inverse linear functions / corrections to columns and namespaces.

Given the linear formula, \(y = m \times x + c\), this function returns \(x\), calculated from the input, i.e. \(x = (y - c) / m\), where \(m\) is the provided slope and \(c\) is the provided intercept.

The arguments slope and intercept can be provided without units, in which case the units of column or namespace are preserved. If units of slope and intercept are provided, they have to be dimensionally consistent.

The arguments slope and intercept can be provided with uncertainties, in which case the uncertainty of the output value is calculated from linear uncertainty propagation.
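
For the inverse relation \(x = (y - c) / m\), first-order propagation with uncorrelated \(y\), \(m\) and \(c\) can be sketched as follows. The helper inverse_uncertainty is hypothetical and operates on plain floats; it is not taken from dgpost.

```python
import math


def inverse_uncertainty(y, sy, slope, s_slope, intercept, s_intercept):
    """Standard deviation of x = (y - intercept) / slope (hypothetical helper).

    Partial derivatives: dx/dy = 1/m, dx/dc = -1/m, dx/dm = -x/m.
    """
    x = (y - intercept) / slope
    return math.sqrt(sy ** 2 + s_intercept ** 2 + (x * s_slope) ** 2) / abs(slope)


sigma_x = inverse_uncertainty(y=7.0, sy=0.3, slope=3.0, s_slope=0.0,
                              intercept=1.0, s_intercept=0.0)
```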

Parameters:

namespace – The prefix of the namespace to which the inverse linear function is to be applied. Cannot be supplied along with column.
column – The name of the column to which the inverse linear function is to be applied. Cannot be supplied along with namespace.
slope – The slope \(m\).
intercept – The intercept \(c\).
nonzero_only – Whether the linear function should be applied to nonzero values of \(x\) only, defaults to True.
minimum – The minimum of the returned value. If the calculated value is below the minimum, the minimum is returned.
maximum – The maximum of the returned value. If the calculated value is above the maximum, the maximum is returned.
output – Name of the output column or the namespace of the output columns.
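
The relation between the forward and inverse formulas can be checked with a short round trip. Both helpers are hypothetical stand-ins on plain floats; the nonzero_only, minimum and maximum options are left out of this sketch.

```python
def forward(x, slope, intercept):
    """y = slope * x + intercept (hypothetical helper)."""
    return slope * x + intercept


def inverse(y, slope, intercept):
    """x = (y - intercept) / slope (hypothetical helper)."""
    return (y - intercept) / slope


x_recovered = inverse(7.0, slope=3.0, intercept=1.0)
round_trip = inverse(forward(2.0, slope=3.0, intercept=1.0), slope=3.0, intercept=1.0)
```

A slope of zero has no inverse; in this sketch it raises ZeroDivisionError.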