helpers: helper functions for the transform package
Code author: Peter Kraus, Ueli Sauter
- dgpost.utils.helpers.element_from_formula(f: str, el: str) -> int
Given a chemical formula f, returns the number of atoms of element el in that formula.
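To illustrate the behaviour described above, the sketch below counts atoms in a simple formula with a regular expression. It is not the dgpost implementation: the name count_element is hypothetical, and the sketch assumes flat formulas without parentheses, charges or hydrates.

```python
import re

# Matches an element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer count, e.g. "C2", "H", "Cl3".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def count_element(f: str, el: str) -> int:
    """Hypothetical stand-in: count atoms of ``el`` in a flat formula ``f``."""
    total = 0
    for symbol, count in _TOKEN.findall(f):
        if symbol == el:
            # A missing count means one atom, as in the "O" of "CO".
            total += int(count) if count else 1
    return total


print(count_element("C2H5OH", "H"))  # 6
print(count_element("CCl4", "C"))  # 1
```

Matching whole symbols rather than single characters keeps "Cl" from being counted as carbon.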
- dgpost.utils.helpers.default_element(f: str) -> str
Given a formula f, return the default element for calculating conversion. The priority list is ["C", "O", "H"].
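The priority rule can be sketched as follows. The name pick_default_element is hypothetical, the formula parsing is a simplification, and raising an error when none of the priority elements is present is an assumption, since that case is not described here.

```python
import re

PRIORITY = ["C", "O", "H"]  # priority list quoted in the description above


def pick_default_element(f: str) -> str:
    """Hypothetical stand-in: first priority element present in ``f``."""
    # Collect whole element symbols so that "Cl" is not mistaken for "C".
    present = set(re.findall(r"[A-Z][a-z]?", f))
    for el in PRIORITY:
        if el in present:
            return el
    # Assumption: the behaviour for formulas without C, O or H is not
    # documented here, so this sketch simply raises.
    raise ValueError(f"no default element found in {f!r}")


print(pick_default_element("CH3OH"))  # C
print(pick_default_element("H2O"))  # O
print(pick_default_element("NH3"))  # H
```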
- dgpost.utils.helpers.name_to_chem(name: str) -> str
- dgpost.utils.helpers.columns_to_smiles(**kwargs: dict[str, dict[str, Any]]) -> dict
Creates a dictionary with a SMILES representation of all chemicals present among the keys in the kwargs, storing the returned chemicals.ChemicalMetadata as well as the full name within args.
- Parameters:
kwargs – A dict containing dict[str, Any] values. The str keys of the inner dicts are parsed to SMILES.
- Returns:
smiles – A new dict[str, dict] containing the SMILES of all prefixed chemicals as str keys, and the metadata and column specification as the dict values.
- Return type:
dict
- dgpost.utils.helpers.electrons_from_smiles(smiles: str, ions: dict | None = None) -> float
- dgpost.utils.helpers.pQ(df: DataFrame, col: str | tuple[str], unit: str | None = None) -> Quantity
Unit-aware dataframe accessor function.
Given a dataframe in df and a column name in col, the function looks through the units stored in df.attrs["units"] and returns a unit-annotated ureg.Quantity containing the column data. Alternatively, the data in df[col] can be annotated by the provided unit.
Note: If df.attrs has no units, or col is not in df.attrs["units"], the returned ureg.Quantity is dimensionless.
- Parameters:
df – A pd.DataFrame, optionally annotated with units in df.attrs.
col – The str name of the column to be loaded from the df.
unit – Optional override for units.
- Returns:
Quantity – Unit-aware pint.Quantity object containing the data from df[col].
- Return type:
ureg.Quantity
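The unit lookup and the dimensionless fallback can be sketched without pandas or pint. Here a plain dict stands in for df.attrs, and resolve_unit is a hypothetical helper, not part of dgpost. Giving the explicit unit precedence over a stored unit is an assumption; the description only says the provided unit is an alternative.

```python
from typing import Optional


def resolve_unit(attrs: dict, col: str, unit: Optional[str] = None) -> str:
    """Hypothetical stand-in for the unit lookup of a unit-aware accessor."""
    if unit is not None:
        # Assumption: an explicit override wins over any stored unit.
        return unit
    # A missing "units" entry and a missing column both fall back to
    # "dimensionless", mirroring the note above.
    return attrs.get("units", {}).get(col, "dimensionless")


attrs = {"units": {"T": "K", "p": "bar"}}
print(resolve_unit(attrs, "T"))  # K
print(resolve_unit(attrs, "x"))  # dimensionless
print(resolve_unit({}, "T"))  # dimensionless
print(resolve_unit(attrs, "T", unit="degC"))  # degC
```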
- dgpost.utils.helpers.separate_data(data: Quantity, unit: str | None = None) -> tuple[ndarray, ndarray, str]
Separates the data into values, errors and units.
- Parameters:
data – A ureg.Quantity object containing the data points. Can be either float or uc.ufloat.
unit – When specified, converts the data to this unit.
- Returns:
Converted nominal values and errors, and the original unit of the data.
- Return type:
(values, errors, old_unit)
- dgpost.utils.helpers.load_data(*cols: tuple[str, str, type])
Decorator factory for data loading.
Creates a decorator that will load the columns specified in cols and call the wrapped function func as appropriate. The func has to accept ureg.Quantity objects, return a dict[str, ureg.Quantity], and handle an optional parameter "output" which prefixes (or assigns) the output data in the returned dict appropriately.
The argument of the decorator is a list[tuple], with each element being a tuple[str, str, type]. The first field in this tuple is the str name of the argument of the decorated func, the second str field denotes the default units for that argument (or None for a unitless quantity), and the type field allows the use of the decorator with functions that expect a list of points in the argument (such as trace-processing functions) or a dict of ureg.Quantity objects (such as functions operating on chemical compositions).
The decorator handles the following cases:
  - the decorated func is launched directly, either with kwargs or with a mixture of args and kwargs:
    - the args are assigned into kwargs using their position in the args and cols array as provided to the decorator
    - all elements in kwargs that match the argument names in the cols list provided to the decorator are converted to ureg.Quantity objects, assigning the default units using the data from the cols list, unless they are a ureg.Quantity already.
  - the decorated func is launched with a pd.DataFrame as the args and other parameters in kwargs:
    - the data for the arguments listed in cols is sourced from the columns of the pd.DataFrame, using the provided str arguments to find the appropriate columns
    - if pd.Index is provided as the data type, and no column name is provided by the user, the index of the pd.DataFrame is passed into the called function
    - data from unit-aware pd.DataFrame objects is loaded using the pQ() accessor accordingly
    - data from unit-naive pd.DataFrame objects is coerced into ureg.Quantity objects using the default units as specified in the cols list
- Parameters:
cols – A list[tuple[str, str, type]] containing the column names used to call the func.
- Returns:
loading – A wrapped version of the decorated func.
- Return type:
Callable
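The direct-call case described above can be sketched as a small decorator factory. This is an illustration only, not the dgpost code: load_args and the Quantity tuple are hypothetical stand-ins (the latter for ureg.Quantity), and the pd.DataFrame case is left out entirely.

```python
import functools
from typing import Any, Callable, NamedTuple, Optional


class Quantity(NamedTuple):
    """Minimal stand-in for ureg.Quantity: a magnitude and a unit string."""

    magnitude: Any
    unit: Optional[str]


def load_args(*cols: tuple) -> Callable:
    """Hypothetical decorator factory covering only the direct-call case."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> dict:
            # Positional args are assigned to kwargs by their position in cols.
            for (name, _unit, _type), value in zip(cols, args):
                kwargs[name] = value
            # Matching kwargs get the default unit unless already a Quantity.
            for name, unit, _type in cols:
                value = kwargs.get(name)
                if value is not None and not isinstance(value, Quantity):
                    kwargs[name] = Quantity(value, unit)
            return func(**kwargs)

        return wrapper

    return decorator


@load_args(("length", "m", float), ("time", "s", float))
def speed(length: Quantity, time: Quantity, output: str = "v") -> dict:
    # The wrapped function only ever sees Quantity objects.
    value = length.magnitude / time.magnitude
    return {output: Quantity(value, f"{length.unit}/{time.unit}")}


print(speed(10.0, time=Quantity(4.0, "s")))
```

In the call above, the positional 10.0 is mapped to length and given the default unit "m", while time is passed through unchanged because it already is a Quantity.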
- dgpost.utils.helpers.kwarg_to_quantity(*kws: str)
A decorator that converts kwargs passed as str to transform functions into pint.Quantity objects.
- dgpost.utils.helpers.merge_units(a: dict, b: dict)
- dgpost.utils.helpers.combine_tables(a: DataFrame, b: DataFrame) -> DataFrame
Combine two pd.DataFrames into a new pd.DataFrame.
Assumes the pd.DataFrames contain a pd.MultiIndex. Automatically pads the pd.MultiIndex to match the higher number of levels, if necessary. Merges units.
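The padding step can be pictured on plain tuples standing in for pd.MultiIndex column keys. The helper names are hypothetical, and both the fill value and padding on the right are assumptions; the description above only states that the index is padded to the higher number of levels.

```python
def pad_columns(columns: list, depth: int, fill: str = "") -> list:
    """Pad each column tuple on the right up to ``depth`` levels."""
    # Assumption: the real fill value used by dgpost may differ.
    return [col + (fill,) * (depth - len(col)) for col in columns]


def combine_columns(a: list, b: list, fill: str = "") -> list:
    """Bring both sets of column keys to the same depth and join them."""
    depth = max(len(col) for col in a + b)
    return pad_columns(a, depth, fill) + pad_columns(b, depth, fill)


a = [("T",), ("p",)]
b = [("xin", "CO2"), ("xin", "H2")]
print(combine_columns(a, b))
# [('T', ''), ('p', ''), ('xin', 'CO2'), ('xin', 'H2')]
```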
- dgpost.utils.helpers.arrow_to_multiindex(df: DataFrame, warn: bool = True) -> DataFrame
Convert the provided pd.DataFrame to a dgpost-compatible format.
  - converts tables with pd.Index into pd.MultiIndex,
  - converts "->"-separated namespaces into pd.MultiIndex,
  - processes units into nested dicts.
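The conversion steps listed above can be sketched on plain Python containers. The helpers to_tuples and nest_units are hypothetical, and the exact layout of the nested units dict in dgpost is assumed rather than taken from its source.

```python
def to_tuples(columns: list) -> list:
    """Split "->"-separated column names into tuples of levels."""
    return [tuple(col.split("->")) for col in columns]


def nest_units(units: dict) -> dict:
    """Turn {"a->b": unit} into nested dicts {"a": {"b": unit}}."""
    nested: dict = {}
    for name, unit in units.items():
        # All levels but the last are namespaces; the last one holds the unit.
        *parents, leaf = name.split("->")
        node = nested
        for level in parents:
            node = node.setdefault(level, {})
        node[leaf] = unit
    return nested


columns = ["xin->CO2", "xin->H2", "T"]
units = {"xin->CO2": "percent", "xin->H2": "percent", "T": "K"}
print(to_tuples(columns))
# [('xin', 'CO2'), ('xin', 'H2'), ('T',)]
print(nest_units(units))
# {'xin': {'CO2': 'percent', 'H2': 'percent'}, 'T': 'K'}
```

The tuples produced here have unequal lengths; building an actual pd.MultiIndex from them would additionally require padding, which this sketch does not attempt.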
- dgpost.utils.helpers.keys_in_df(key: str | tuple, df: DataFrame) -> set[tuple]
Find all columns in the provided pd.DataFrame that match key.
Returns a set of all columns in the df which are matched by key. Assumes the provided pd.DataFrame contains a pd.MultiIndex.
- dgpost.utils.helpers.key_to_tuple(key: str | tuple) -> tuple
Convert a provided key to a tuple for use with pd.DataFrames containing a pd.MultiIndex.
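Key normalisation and column matching can be sketched together on a list of column tuples. Both helpers are hypothetical, and two behaviours are assumed rather than documented here: a str key is wrapped as a one-element tuple, and a key matches every column whose leading levels equal it.

```python
def as_tuple(key) -> tuple:
    """Hypothetical stand-in: wrap a str key, pass tuples through."""
    return (key,) if isinstance(key, str) else tuple(key)


def matching_columns(key, columns) -> set:
    """Return all column tuples whose leading levels equal ``key``."""
    # Assumption: matching is by prefix on the column levels.
    prefix = as_tuple(key)
    return {col for col in columns if col[: len(prefix)] == prefix}


columns = [("flow", "in"), ("flow", "out"), ("xin", "CO2")]
print(matching_columns("flow", columns) == {("flow", "in"), ("flow", "out")})
print(matching_columns(("xin", "CO2"), columns))  # {('xin', 'CO2')}
print(matching_columns("p", columns))  # set()
```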
- dgpost.utils.helpers.get_units(key: str | Sequence, df: DataFrame) -> str | None
Given a key corresponding to a column in the df, return the units. The provided key can be either a str for a df with pd.Index, or any other Sequence for a df with pd.MultiIndex.
- dgpost.utils.helpers.set_units(key: str | Sequence, unit: str | None, target: dict | DataFrame) -> None
Set the units of key to unit in the target object, which can be either a dict or a pd.DataFrame. See also get_units().
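Reading and writing units by key can be sketched on a nested dict. The helpers read_unit and write_unit are hypothetical stand-ins, and the nested layout (one dict level per key level) is an assumption about how the units are stored.

```python
from typing import Optional, Sequence, Union

Key = Union[str, Sequence]


def read_unit(key: Key, units: dict) -> Optional[str]:
    """Walk the nested dict along ``key``; return None if no unit is stored."""
    levels = (key,) if isinstance(key, str) else tuple(key)
    node = units
    for level in levels:
        if not isinstance(node, dict) or level not in node:
            return None
        node = node[level]
    # A dict at this point means ``key`` names a namespace, not a column.
    return node if isinstance(node, str) else None


def write_unit(key: Key, unit: Optional[str], units: dict) -> None:
    """Store ``unit`` under ``key``, creating intermediate dicts as needed."""
    levels = (key,) if isinstance(key, str) else tuple(key)
    node = units
    for level in levels[:-1]:
        # Any non-dict entry on the path is replaced by a new namespace.
        if not isinstance(node.get(level), dict):
            node[level] = {}
        node = node[level]
    node[levels[-1]] = unit


units: dict = {}
write_unit(("xin", "CO2"), "percent", units)
write_unit("T", "K", units)
print(units)  # {'xin': {'CO2': 'percent'}, 'T': 'K'}
print(read_unit(("xin", "CO2"), units))  # percent
print(read_unit("p", units))  # None
```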
- dgpost.utils.helpers.fill_nans(data: Quantity, fillmag: float = 0.0) -> Quantity