dgpost.transform package

catalysis: common calculations in catalytic testing

Code author: Peter Kraus

Includes functions to calculate the reactant- and product-based conversion(), atom-based selectivity(), catalytic_yield(), and atom_balance().

Names of compounds within the specified mixtures are parsed to SMILES, and the cross-matching of the feestock, internal standard, and the components of xin and xout (or rin and rout) is performed using these SMILES.

Note

This module assumes that the provided inlet and outlet compositions, whether mole fractions or molar rates, contain all species. This implies that the atomic balance of the inlet and outlet is near unity. If multiple inlet/outlet streams are present in the experiment, they need to be appropriately combined into a single namespace, e.g. using functions in the dgpost.transform.rates before using the functions in this module.

dgpost.transform.catalysis.conversion(feedstock, xin=None, xout=None, rin=None, rout=None, element=None, product=True, standard='N2', output=None)

Calculates the conversion of feedstock \(f\) into other products \(p\). The feedstock must be a species present in the inlet. By default, a product-based carbon conversion (\(\text{el} = \text{C}\)), using data from the outlet, is computed.

The product-based conversion \(X_p\) is calculated using the outlet mole fractions \(x_\text{out}(s)\) of species \(s\), or the outlet molar rates \(\dot{n}_\text{out}(s)\). It is calculated on an elemental basis by considering the number of atoms of a certain element, \(n_\text{el}\), as:

\[X_{p, \text{el}} = \frac{\sum_{s} n_\text{el}(s) x_\text{out}(s) - n_\text{el}(f) x_\text{out}(f)} {\sum_{s} n_\text{el}(s) x_\text{out}(s)} = \frac{\sum_{s} n_\text{el}(s) \dot{n}_\text{out}(s) - n_\text{el}(f) \dot{n}_\text{out}(f)} {\sum_{s} n_\text{el}(s) \dot{n}_\text{out}(s)}\]

Which requires the feedstock \(f\) to be quantified in the outlet composition. If the outlet composition of \(f\) is found to be zero at all timesteps, its value in the the inlet composition is used instead, and a “mixed” conversion \(X_m\) is calculated instead:

\[X_{m, \text{el}} = \frac{\sum_{p}^{p\ne f} n_\text{el}(p) x_\text{out}(p) f_e} {n_\text{el}(f) x_\text{in}(f)} = \frac{\sum_{p}^{p\ne f} n_\text{el}(p) \dot{n}_\text{out}(s)} {n_\text{el}(f) \dot{n}_\text{in}(f)}\]

Here, the expansion factor \(f_e\) is introduced, which is defined as the expansion ratio of the internal standard in the mixture: \(f_e = x_\text{out}(\text{i.s.}) / x_\text{in}(\text{i.s.})\). The internal standard is set to N2 by default. When molar rates \(\dot{n}\) are used, accounting for expansion in this way is not necessary.

Note

Calculating product-based conversion \(X_m\) using inlet mole fraction of feedstock is not ideal and should be avoided, as the value is strongly convoluted with the atom_balance() of the mixtures. A warning will be raised by the program.

Finally, the calculation of reactant-based (or feedstock-based) conversion, \(X_r\), proceeds as follows:

\[X_{r} = \frac{x_\text{in}(f) - x_\text{out}(f) f_e}{x_\text{in}(f)} = \frac{\dot{n}_\text{in}(f) - \dot{n}_\text{out}(f)}{\dot{n}_\text{in}(f)}\]

The selection of the element \(\text{el}\) is meaningless for the reactant -based conversion.

Warning

Note that the calculated values of \(X_p\), \(X_m\), and \(X_r\) will differ from one another in cases where the atomic balances of the inlet and outlet mixtures are different from unity!

Parameters
  • feedstock (str) – Name of the feedstock. Parsed into SMILES and matched against xin and xout.

  • xin (Optional[dict[str, Quantity]]) – A dictionary containing the composition of the inlet mixture with the names of the chemicals as str keys. Has to be supplied with xout. Cannot be supplied with rin or rout.

  • xout (Optional[dict[str, Quantity]]) – A dictionary containing the composition of the outlet mixture with the names of the chemicals as str keys. Has to be supplied with xin. Cannot be supplied with rin or rout.

  • rin (Optional[dict[str, Quantity]]) – A dictionary containing the molar rates of species in the inlet mixture with the names of the chemicals as str keys. Has to be supplied with rout. Cannot be supplied with xin or xout.

  • rout (Optional[dict[str, Quantity]]) – A dictionary containing the molar rates of species in the outlet mixture with the names of the chemicals as str keys. Has to be supplied with rin. Cannot be supplied with xin or xout.

  • element (Optional[str]) – Name of the element for determining conversion. If not specified, set to the highest-priority (C > H > O) element in feedstock.

  • product (bool) – Product-based conversion toggle. Default is True.

  • standard (str) – Internal standard for normalizing the compositions. By default set to “N2”. Only used when xin and xout are supplied.

  • output (Optional[str]) – A str name of the output variable.

Returns

ret – A dict containing the calculated conversion with output as its key.

Return type

dict[str, pint.Quantity]

dgpost.transform.catalysis.selectivity(feedstock, xout=None, rout=None, element=None, output=None)

Calculates product-based atomic selectivities of all species \(s\), excluding the feedstock \(f\). The sum of selectivities is normalised to unity. Works with both mole fractions \(x\) as well as molar rates \(\dot{n}\):

\[S_\text{el}(p) = \frac{\sum_{p}^{p \ne f} n_\text{el}(p) x_\text{out}(p)} {\sum_{s} n_\text{el}(s) x_\text{out}(s)} = \frac{\sum_{p}^{p \ne f} n_\text{el}(p) \dot{n}_\text{out}(p)} {\sum_{s} n_\text{el}(s) \dot{n}_\text{out}(s)}\]

Only the outlet fractions/rates are considered. The sum on in the numerator runs over all products \(p\) (i.e. \(p \ne f\)), while the sum in the denominator runs over all species \(s\). The value \(n_\text{el}(s)\) is the number of atoms of element \(\text{el}\) in species \(s\).

Note

The selectivity calculation assumes that all products have been determined; it provides no information about the mass or atomic balance.

Parameters
  • feedstock (str) – Name of the feedstock. Parsed into SMILES and matched against the specified inlet and outlet species

  • xout (Optional[dict[str, Quantity]]) – Prefix of the columns determining the outlet composition.

  • rout (Optional[dict[str, Quantity]]) – Prefix of the columns determining the outlet molar rates.

  • element (Optional[str]) – The element for determining conversion. If not specified, set from feedstock using dgpost.transform.chemhelpers.default_element().

  • output (Optional[str]) – A str prefix for the output variables.

Returns

ret – A dict containing the calculated selectivities using output as the prefix for each key.

Return type

dict[str, pint.Quantity]

dgpost.transform.catalysis.catalytic_yield(feedstock, xin=None, xout=None, rin=None, rout=None, element=None, standard='N2', output=None)

Calculates the catalytic yield \(Y_p\), defined as the product of conversion and selectivity. Uses product-based conversion of feedstock for an internal consistency with selectivity. The sum of all yields is equal to the conversion. Implicitly runs conversion() and selectivity() on the pd.DataFrame:

\[Y_{p, \text{el}}(s) = X_{p, \text{el}}(s) \times S_{p, \text{el}}(s)\]

Where \(s\) is the product species, and the subscript \(p\) denotes it is a product-based quantity, with respect to element \(\text{el}\).

Parameters
  • xin (Optional[dict[str, Quantity]]) – Prefix of the columns determining the inlet composition.

  • xout (Optional[dict[str, Quantity]]) – Prefix of the columns determining the outlet composition.

  • feedstock (str) – Name of the feedstock. Parsed into SMILES and matched against xin and xout.

  • element (Optional[str]) – The element for determining conversion. If not specified, set from feedstock using dgpost.transform.chemhelpers.default_element().

  • standard (str) – Internal standard for normalizing the compositions. By default set to “N2”.

  • output (Optional[str]) – A str prefix for the output variables.

Returns

ret – A dict containing the calculated yields using output as the prefix for each key.

Return type

dict[str, pint.Quantity]

dgpost.transform.catalysis.atom_balance(xin=None, xout=None, rin=None, rout=None, element='C', standard='N2', output=None)

Calculates the atom balance of an element \(\text{el}\) between the inlet and outlet mixtures. The total number of atoms of the specified element is compared between between the inlet and outlet, normalized by an internal standard if necessary:

\[\text{atbal}_\text{el} = \frac{\sum_s n_\text{el}(s) x_\text{out}(s) f_e} {\sum_s n_\text{el}(s) x_\text{in}(s)} = \frac{\sum_s n_\text{el}(s) \dot{n}_\text{out}(s)} {\sum_s n_\text{el}(s) \dot{n}_\text{in}(s)}\]

Here the sum runs over all species \(s\), \(n_\text{el}(s)\) is the number of atoms of element \(\text{el}\) in \(s\), and \(f_e\) is the expansion factor calculated using the internal standard as \(f_e = x_\text{out}(\text{i.s.}) / x_\text{in}(\text{i.s.})\).

Parameters
  • xin (Optional[dict[str, Quantity]]) – Prefix of the columns determining the inlet composition as a mole fraction.

  • xout (Optional[dict[str, Quantity]]) – Prefix of the columns determining the outlet composition as a mole fraction.

  • rin (Optional[dict[str, Quantity]]) – Prefix of the columns determining the inlet composition as a molar rate.

  • rout (Optional[dict[str, Quantity]]) – Prefix of the columns determining the outlet composition as a molar rate.

  • element (str) – The element for determining conversion. If not specified, set to "C".

  • standard (str) – Internal standard for normalizing flows. By default set to "N2".

  • output (Optional[str]) – A str prefix for the output variables.

Returns

ret – A dict containing the calculated atomic balance using output as the key.

Return type

dict[str, pint.Quantity]

electrochemistry: calculations relevant in electrochemistry

Code author: Peter Kraus

Includes functions for calculating applied voltage correction via the nernst() equation, calculation of the Faradaic efficiency (fe()), as well as the calculation of total charge(), and the average_current() from the total charge and timestamps.

dgpost.transform.electrochemistry.nernst(Ewe, R=<Quantity(0.0, 'ohm')>, I=<Quantity(0.0, 'ampere')>, Eref=<Quantity(0.0, 'volt')>, T=<Quantity(298.15, 'kelvin')>, n=None, Q=None, pH=None, output='Eapp')

Correct measured voltage to calculate the applied voltage, using corrections for Ohmic drop, reference potential vs RHE, and ionic concentration using the Nernst equation, either by providing \(Q\) and \(n\), or by providing the \(pH\).

This function corrects the measured voltage \(E_\text{we}\) to applied voltage \(E\) by implementing the following equation:

\[\begin{split}E &= E_\text{we} + E_\text{ref} + RI + E_N \\ E_N &= - \frac{\overline{R} T}{nF} ln(Q) = \frac{\overline{R} T}{F} ln(10) \text{pH}\end{split}\]

where \(E_\text{ref}\) is the potential of the reference electrode vs RHE, \(RI\) is the product of cell resistance and applied current, \(\overline{R}\) is the molar gas constant, \(T\) is the temperature, \(n\) is the number of electrones transferred, \(F\) is the Faraday constant, \(Q\) is the reaction quotient, and \(\text{pH}\) is the pH of the electrolyte.

Parameters
  • Ewe (Quantity) – The measured working potential with respect to the reference electrode. By default in V.

  • R (Quantity) – The resistance of the cell. By default in Ω.

  • I (Quantity) – The applied current. By default in A.

  • Eref (Quantity) – The potential of the reference electrode with respect to RHE. By default in V.

  • T (Quantity) – Temperature of the working electrode, used in the Nernst equation. By default 298.15 K.

  • n (Optional[int]) – Number of electrons transferred in the process described by the Nernst equation. Must be specified along Q, cannot be specified with pH.

  • Q (Optional[float]) – The reaction quotient of the components of the process described by the Nernst equation. Must be specified along n, cannot be specified with pH.

  • pH (Optional[float]) – The pH of the solution. This assumes the modelled process is the reduction of \(\text{H}^+ \rightarrow \text{H}\), i.e. \(n = 1\) and \(\text{ln}(Q) = -\text{ln}(10)\text{pH}\)

  • output (str) – Name of the output variable. Defaults to Eapp.

Returns

dict(output, E) – Returns the calculated applied current in V.

Return type

dict[str, pint.Quantity]

dgpost.transform.electrochemistry.fe(rate, I, charges, output=None)

Calculate the Faradaic efficiency \(\eta_F\) from a set of molar rates \(\dot{n}\) corresponding to a single mixture, and the applied current \(I\) required to produce that mixture from a source mixture. A set of formal atomic charges corresponding to those in the species comprising the source mixture has to be supplied.

This function implements the following equation to calculate \(\eta_F\):

\[\eta_F(x) = \frac{n_\text{el}(x)\dot{n}(x)}{I}\]

where \(x\) is a species in the mixture, \(n_\text{el}(x)\) is the number of electrons required to produce \(x\) from the ions specified in the source mixture, and \(\dot{n}(x)\) is the molar rate (production or flow) of species \(x\).

Parameters
  • rate (dict[str, Quantity]) – A dict of molar flow or production rates of species in a mixture. By default in mol/l.

  • I (Quantity) – The applied current. By default in A.

  • charges (dict[str, int]) – A dict of formal atomic/ionic charges of atoms in the source mixture.

  • output (Optional[str]) – Name of the prefix for the species in the output variable. Defaults to fe.

Returns

dict(output, E) – Returns the calculated Faradaic efficiencies.

Return type

dict[str, pint.Quantity]

dgpost.transform.electrochemistry.charge(time, I, t0=None, output='Q')

Calculate the total charge \(Q\) as a time-integral of the electric current:

\[Q_n = \int_{t_0}^{t_n} I(t) dt = \sum_0^n I_n (t_n - t_{n-1})\]

where \(I_n\) is the instantaneous current and \(t_n\) is the time at the \(n\)-th datapoint.

Parameters
  • time (Quantity) – An array of timestamps at which the instantaneous current was measured.

  • I (Quantity) – Values of the instantaneous current.

  • t0 (Optional[Quantity]) – An optional timestamp representing the time at which charge was zero. If not supplied, the first value in the time array will be used, with the charge at that timestamp set to zero.

  • output (str) – Prefix of the columns where the calculated rate will be stored.

Returns

dict(output, Q) – Returns the integrated electrical charge.

Return type

dict[str, pint.Quantity]

dgpost.transform.electrochemistry.average_current(time, Q, t0=None, output='<I>')

Calculate the average current \(<I>\) from a set of timestamped values of electical charge,

\[<I> = \frac{dQ}{dt} = \frac{Q_n - Q_{n-1}}{t_n - t_{n-1}}\]

where \(Q_n\) is the charge and \(t_n\) is the time at the \(n\)-th datapoint.

Parameters
  • time (Quantity) – An array of timestamps at which the instantaneous current was measured.

  • I – Values of the instantaneous current.

  • t0 (Optional[Quantity]) – An optional timestamp representing the time at which charge was zero. If not supplied, the first value in the time array will be used, with the charge at that timestamp set to zero.

  • output (str) – Prefix of the columns where the calculated rate will be stored.

Returns

dict(output, I) – Returns the average electrical current.

Return type

dict[str, pint.Quantity]

impedance: fitting electrochemical impedance spectra

Code author: Ueli Sauter, Peter Kraus

Including functions relevant for Electrochemical Impedance Spectroscopy (EIS): evaluation of an equivalent circuit as well as the fitting of the circuit to data (calc_circuit() and fit_circuit()), and an interpolation function to find the lowest_real_impedance() in the data.

Note

The functions in this module expect the whole EIS trace as input - the \(\text{Re}(Z)\), \(-\text{Im}(Z)\) and \(f\) are expected to be pint.Quantity containing an np.ndarray (or similar list-like object), which is then processed to a (set of) scalar values. This means that for processing time resolved data, the functions in this module have to be called on each timestep individually.

dgpost.transform.impedance.fit_circuit(real, imag, freq, circuit, initial_values, fit_bounds=None, fit_constants=None, ignore_neg_res=True, upper_freq=inf, lower_freq=0, repeat=1, output='fit_circuit')

Fits an equivalent circuit to the frequency-resolved EIS data.

For the fitting an equivalent circuit is needed, defined as a str. The circuit may be composed of multiple circuit elements. To combine elements in a series a dash (-) is used. Elements in parallel are wrapped by p( , ). An element is defined by an identifier (usually letters) followed by a digit. Already implemented elements are located in the circuit_utils.circuit_components module:

Name

Symbol

Parameters

Bounds

Units

Resistor

R

R

(1e-6, 1e6)

Ω

Capacitance

C

C

(1e-20, 1)

F

Constant Phase Element

CPE

CPE_Q

(1e-20, 1)

Ω⁻¹sᵃ

CPE_a

(0, 1)

Warburg element

W

W

(0, 1e10)

Ω⁻¹s¹ᐟ²

Warburg short element

Ws

Ws_R

(0, 1e10)

Ω

Ws_T

(1e-10, 1e10)

s

Warburg open element

Wo

Wo_R

(0, 1e10)

Ω

Wo_T

(1e-10, 1e10)

s

Additionally an initial guess for the fitting parameters is needed. The initial guess is given as a dict where each key is the parameter name and the corresponding value is the initial value for the circuit.

The bounds of each parameter can be customized by the fit_bounds parameter. This parameter is a dict, where each key is the parameter name and the value consists of a tuple for the lower and upper bound (lb, ub).

To hold a parameter constant, add the name of the parameter to a list and pass it as fit_constants

Parameters
  • real (Quantity) – A pint.Quantity object containing the real part of the impedance data, \(\text{Re}(Z)\). The unit of the provided gets converted to ‘Ω’.

  • imag (Quantity) – A pint.Quantity object containing the imaginary part of the impedance data, \(-\text{Im}(Z)\). The unit of the provided gets converted to ‘Ω’.

  • freq (Quantity) – A pint.Quantity object containing the frequency \(f\) of the impedance data. The unit of the provided data should gets converted to ‘Hz’.

  • circuit (str) – A str description of the equivalent circuit.

  • initial_values (dict[str, float]) – A dict with the initial (guess) values. Structure: {“param name”: value, … }

  • output (str) – A str prefix

  • fit_bounds (Optional[dict[str, tuple[float, float]]]) – Custom bounds for a parameter if default bounds are not wanted Structure: {“param name”: (lower bound, upper bound), …} Default is ‘’None’’

  • fit_constants (Optional[list[str]]) – list of parameters which should stay constant during fitting Structure: [“param name”, …] Default is ‘’None’’

  • ignore_neg_res (bool) – ignores impedance values with a negative real part

  • upper_freq (float) – upper frequency bound to be considered for fitting

  • lower_freq (float) – lower frequency bound to be considered for fitting

  • repeat (int) – how many times fit_routine gets called

Returns

A tuple containing two dicts.
  • parameters, contains all the values of the parameters accessible by their names

  • units, contains all the units of the parameters accessible by their names

Return type

(parameters, units)

dgpost.transform.impedance.calc_circuit(freq, circuit, parameters, output='calc_circuit')

Calculates the complex impedance \(\text{Re}(Z)\) and \(-\text{Im}(Z)\) of the prescribed equivalent circuit as a function of frequency \(f\).

Parameters
  • freq (Quantity) – A pint.Quantity containing the frequencies \(f\) at which the equivalent circuit is to be evaluated. The provided data should be in “Hz”.

  • circuit (str) – A str description of the equivalent circuit. For more details see fit_circuit().

  • parameters (dict[str, float]) – A dict containing the values defining the equivalent circuit. Structure: {“param name”: value, … }

  • output (str) – A str prefix for the Re(Z) and -Im(Z) values calculated in this function. Defaults to "calc_circuit".

Returns

retvals – A dictionary containing the pint.Quantity arrays with the output-prefixed \(\text{Re}(Z)\) and \(-\text{Im}(Z)\) as keys.

Return type

dict[str, pint.Quantity]

dgpost.transform.impedance.lowest_real_impedance(real, imag, threshold=0.0, output='min Re(Z)')

A function that finds and interpolates the lowest \(\text{Re}(Z)\) value at which the complex impedance \(Z = \text{Re}(Z) - j \text{Im}(Z)\) is a real number (i.e. \(\text{Im}(Z) \approx 0\)). If the impedance does not cross the real-zero axis, the \(\text{Re}(Z)\) at which the \(\text{abs}(\text{Im}(Z))\) is the smallest is returned.

Parameters
  • real (Quantity) – A pint.Quantity object containing the real part of the impedance data, \(\text{Re}(Z)\). The units of real and imag are assumed to match.

  • imag (Quantity) – A pint.Quantity object containing the imaginary part of the impedance data, \(-\text{Im}(Z)\). The units of real and imag are assumed to match.

  • output (str) – A str name for the output column, defaults to "min Re(Z)".

Returns

retvals – A dictionary containing the pint.Quantity values of the real impedances with the output as keys.

Return type

dict[str, pint.Quantity]

rates: determining molar rates of chemical species

Code author: Peter Kraus

Includes functions to convert mixture compositions (concentration, mol fraction) from instantaneous flow data or continuous batch data to rates (dimension of [quantity]/[time]). The flow_to_molar() function is useful for converting gas-phase or liquid flows, while batch_to_molar() can be used to determine formation rates from the concentration profile of a batch mixture.

dgpost.transform.rates.flow_to_molar(flow, c=None, x=None, Tref=<Quantity(273.15, 'kelvin')>, pref=<Quantity(1, 'standard_atmosphere')>, output='rate')

Calculates a molar rate of species from specified flow and composition. The units of the rate have to be either dimensionless (for unit-naive dataframes) or in dimensions of [substance]/[time].

Currently, three combinations of units are supported:

  • Dimensionless flow and dimensionless comp, as is the case for unit-naive dataframes. In this case, the molar flow rates of species \(r_s\) are calculated by a simple multiplication:

    \[r_s = \text{flow} \times \text{comp}_s\]
  • Volumetric flow \(\dot{V}\) and composition comp as a concentration, as is often the case in liquid flows. In this case, the molar rates of species \(r_s\) are also a simple multiplication (accounting for unit conversion):

    \[r_s = \dot{V} \times c_s\]
  • Volumetric flow \(\dot{V}\) and composition comp as a dimensionless molar fraction \(x_s\), in which case the flow is assumed to be gas-phase. In this case, the flow has to be converted to molar units using the ideal gas law:

    \[r_s = \dot{V} \times \frac{p_\text{ref}}{RT_\text{ref}} \times x_s\]

    The pressure \(p_\text{ref}\) and temperature \(T_\text{ref}\) are specifying the state at which the flow \(\dot{V}\) has been measured.

Parameters
  • flow (Quantity) – The total flow of the mixture.

  • c (Optional[dict[str, Quantity]]) – A dictionary containing the composition of the mixture as concentration. Cannot be supplied at the same time as x.

  • x (Optional[dict[str, Quantity]]) – A dictionary containing the composition of the mixture as mole fraction. Assuming ideal gas-phase flow. Cannot be supplied at the same time as c.

  • Tref (Quantity) – Reference temperature of the flow measurement, used when composition is specified using a mol fraction. By default set to 273.15 K.

  • pref (Quantity) – Reference pressure of the flow measurement, used when composition is specified using a mol fraction. By default set to 1 atm.

  • output (str) – Prefix of the keys of the returned rate dictionary.

Return type

dict[str, Quantity]

dgpost.transform.rates.batch_to_molar(time, c, V, t0=None, output='rate')

Calculates a molar rate of species from specified volume and composition at the specified timesteps. The units of the rate have to be either dimensionless (for unit-naive dataframes) or in dimensions of [substance]/[time].

First, the \(\delta t\) and \(\delta c(x)\) at each timestep \(n\) is calculated:

\[\delta t_n = t_n - t_{n-1} \delta c(x)_n = c(x)_n - c(x)_{n-1}\]

Then, the formation rate is calculated using the volume:

Parameters
  • time (Quantity) – An array of timestamps at which the concentrations and volumes are measured.

  • c (dict[str, Quantity]) – A dictionary containing concentrations of species at the specified timestamps.

  • V (Quantity) – Volume of the batch at the timestamps.

  • t0 (Optional[Quantity]) – An optional timestamp representing the initial time where all concentrations are zero. If not supplied, the calculation will use the first datapoint as reference with its rates set to zero.

  • output (str) – Prefix of the columns where the calculated rate will be stored.

Return type

dict[str, Quantity]

helpers: helper functions for the transform package

Code author: Peter Kraus, Ueli Sauter

dgpost.transform.helpers.element_from_formula(f, el)

Given a chemical formula f, returns the number of atoms of element el in that formula.

Return type

int

dgpost.transform.helpers.default_element(f)

Given a formula f, return the default element for calculating conversion. The priority list is ["C", "O", "H"].

Return type

str

dgpost.transform.helpers.name_to_chem(name)
Return type

str

dgpost.transform.helpers.columns_to_smiles(**kwargs)

Creates a dictionary with a SMILES representation of all chemicals present among the keys in the kwargs, storing the returned chemicals.ChemicalMetadata as well as the full name within args.

Parameters

kwargs (dict[str, dict[str, Any]]) – A dict containing dict[str, Any] values. The str keys of the inner dicts are parsed to SMILES.

Returns

smiles – A new dict[str, dict] containing the SMILES of all prefixed chemicals as str keys, and the metadata and column specification as the dict values.

Return type

dict

dgpost.transform.helpers.electrons_from_smiles(smiles, ions=None)
Return type

float

dgpost.transform.helpers.pQ(df, col)

Unit-aware dataframe accessor function.

Given a dataframe in df and a column name in col, the function looks through the units stored in df.attrs["units"] and returns a unit-annotated pint.Quantity containing the column data.

Note

If df.attrs has no units, or col is not in df.attrs["units"], the returned pint.Quantity is dimensionless.

Parameters
  • df (DataFrame) – A pd.DataFrame, optionally annotated with units in df.attrs.

  • col (str) – The str name of the column to be loaded from the df.

Returns

Quantity – Unit-aware ping.Quantity object containing the data from df[col].

Return type

pint.Quantity

dgpost.transform.helpers.separate_data(data, unit=None)

Separates the data into values, errors and units

Parameters
  • data (Quantity) – A pint.Quantity object containing the data points. Can be either float or uc.ufloat.

  • unit (Optional[str]) – When specified, converts the data to this unit.

Returns

Converted nominal values and errors, and the original unit of the data.

Return type

(values, errors, old_unit)

dgpost.transform.helpers.load_data(*cols)

Decorator factory for data loading.

Creates a decorator that will load the columns specified in cols and calls the wrapped function func as appropriate. The func has to accept pint.Quantity objects, return a dict[str, pint.Quantity], and handle an optional parameter "output" which prefixes (or assigns) the output data in the returned dict appropriately.

The argument of the decorator is a list[tuple], with each element being a are tuple[str, str, type]. The first field in this tuple is the str name of the argument of the decorated func, the second str field denotes the default units for that argument (or None for a unitless quantity), and the type field allows the use of the decorator with functions that expect list of points in the argument (such as trace-processing functions) or dict of pint.Quantity objects (such as functions operating on chemical compositions).

The decorator handles the following cases:

  • the decorated func is launched directly, either with kwargs or with a mixture of args and kwargs:

    • the args are assigned into kwargs using their position in the args and cols array as provided to the decorator

    • all elements in kwargs that match the argument names in the cols list provided to the decorator are converted to pint.Quantity objects, assigning the default units using the data from the cols list, unless they are a pint.Quantity already.

  • decorated func is launched with a pd.DataFrame as the args and other parameters in kwargs:

    • the data for the arguments listed in cols is sourced from the columns of the pd.DataFrame, using the provided str arguments to find the appropriate columns

    • if pd.Index is provided as the data type, and no column name is provided by the user, the index of the pd.DataFrame is passed into the called function

    • data from unit-aware pd.DataFrame objects is loaded using the pQ() accessor accordingly

    • data from unit-naive pd.DataFrame objects are coerced into pint.Quantity objects using the default units as specified in the cols list

Parameters

cols (tuple[str, str, type]) – A list[tuple[str, str, type]] containing the column names used to call the func.

Returns

loading – A wrapped version of the decorated func.

Return type

Callable

Subpackages