helpers: helper functions for the transform package
Code author: Peter Kraus, Ueli Sauter
- dgpost.utils.helpers.element_from_formula(f, el)
Given a chemical formula f, returns the number of atoms of element el in that formula.
- Return type
int
- dgpost.utils.helpers.default_element(f)
Given a formula f, return the default element for calculating conversion. The priority list is ["C", "O", "H"].
- Return type
str
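The behaviour described for the two helpers above can be illustrated with a minimal, standard-library-only sketch. This is not the dgpost implementation: the names parse_formula, count_element and pick_default_element are hypothetical, the regex parser assumes a flat formula without parentheses or charges, and raising an error when no priority element is present is an assumption made here for illustration.

```python
import re
from collections import Counter

# An element symbol (capital letter plus optional lowercase letter)
# followed by an optional integer count, e.g. "C", "H4", "Cl2".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(f: str) -> Counter:
    """Count atoms in a flat formula such as 'C2H6O' (no parentheses)."""
    counts = Counter()
    for symbol, number in _TOKEN.findall(f):
        counts[symbol] += int(number) if number else 1
    return counts


def count_element(f: str, el: str) -> int:
    """Hypothetical analogue of element_from_formula(f, el)."""
    return parse_formula(f)[el]


def pick_default_element(f: str, priority=("C", "O", "H")) -> str:
    """Hypothetical analogue of default_element(f): first hit in priority order."""
    counts = parse_formula(f)
    for el in priority:
        if counts[el] > 0:
            return el
    # Assumption: the source does not say what happens without a match.
    raise ValueError(f"no element from {priority} found in {f!r}")
```

With this sketch, count_element("C2H6O", "H") gives 6, and pick_default_element("H2O") gives "O" because carbon is absent and oxygen precedes hydrogen in the priority list.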
- dgpost.utils.helpers.name_to_chem(name)
- Return type
str
- dgpost.utils.helpers.columns_to_smiles(**kwargs)
Creates a dictionary with a SMILES representation of all chemicals present among the keys in the kwargs, storing the returned chemicals.ChemicalMetadata as well as the full name within args.
- Parameters
kwargs (dict[str, dict[str, Any]]) – A dict containing dict[str, Any] values. The str keys of the inner dicts are parsed to SMILES.
- Returns
smiles – A new dict[str, dict] containing the SMILES of all prefixed chemicals as str keys, and the metadata and column specification as the dict values.
- Return type
dict
- dgpost.utils.helpers.electrons_from_smiles(smiles, ions=None)
- Return type
float
- dgpost.utils.helpers.pQ(df, col, unit=None)
Unit-aware dataframe accessor function.
Given a dataframe in df and a column name in col, the function looks through the units stored in df.attrs["units"] and returns a unit-annotated pint.Quantity containing the column data. Alternatively, the data in df[col] can be annotated by the provided unit.
Note
If df.attrs has no units, or col is not in df.attrs["units"], the returned pint.Quantity is dimensionless.
- Parameters
df (DataFrame) – A pd.DataFrame, optionally annotated with units in df.attrs.
col (Union[str, tuple[str]]) – The str name of the column to be loaded from the df.
unit (Optional[str]) – Optional override for units.
- Returns
Quantity – Unit-aware pint.Quantity object containing the data from df[col].
- Return type
pint.Quantity
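The unit-selection rule described for pQ() can be sketched without pandas or pint. The function resolve_unit below is hypothetical and only models which unit string would be attached to the column. It assumes a flat units dictionary keyed by column name, and that an explicit unit takes precedence over the stored one, as the "override" wording suggests.

```python
from typing import Optional


def resolve_unit(attrs: dict, col: str, unit: Optional[str] = None) -> str:
    """Pick the unit string that would annotate column `col`.

    Assumed priority: explicit override, then attrs["units"][col],
    then "dimensionless" when no unit information is available.
    """
    if unit is not None:
        return unit
    units = attrs.get("units", {})
    return units.get(col, "dimensionless")


# Stand-in for df.attrs of a unit-annotated table.
attrs = {"units": {"T": "kelvin", "p": "bar"}}
stored = resolve_unit(attrs, "T")              # unit found in attrs
missing = resolve_unit(attrs, "x")             # column without a unit
overridden = resolve_unit(attrs, "p", "pascal")  # explicit override
```

In the real accessor, the resulting unit would be combined with the column data into a pint.Quantity; that step is left out here to keep the sketch dependency-free.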
- dgpost.utils.helpers.separate_data(data, unit=None)
Separates the data into values, errors and units
- Parameters
data (Quantity) – A pint.Quantity object containing the data points. Can be either float or uc.ufloat.
unit (Optional[str]) – When specified, converts the data to this unit.
- Returns
Converted nominal values and errors, and the original unit of the data.
- Return type
(values, errors, old_unit)
- dgpost.utils.helpers.load_data(*cols)
Decorator factory for data loading.
Creates a decorator that will load the columns specified in cols and calls the wrapped function func as appropriate. The func has to accept pint.Quantity objects, return a dict[str, pint.Quantity], and handle an optional parameter "output" which prefixes (or assigns) the output data in the returned dict appropriately.
The argument of the decorator is a list[tuple], with each element being a tuple[str, str, type]. The first field in this tuple is the str name of the argument of the decorated func, the second str field denotes the default units for that argument (or None for a unitless quantity), and the type field allows the use of the decorator with functions that expect a list of points in the argument (such as trace-processing functions) or a dict of pint.Quantity objects (such as functions operating on chemical compositions).
The decorator handles the following cases:
  - the decorated func is launched directly, either with kwargs or with a mixture of args and kwargs:
    - the args are assigned into kwargs using their position in the args and cols array as provided to the decorator
    - all elements in kwargs that match the argument names in the cols list provided to the decorator are converted to pint.Quantity objects, assigning the default units using the data from the cols list, unless they are a pint.Quantity already.
  - the decorated func is launched with a pd.DataFrame as the args and other parameters in kwargs:
    - the data for the arguments listed in cols is sourced from the columns of the pd.DataFrame, using the provided str arguments to find the appropriate columns
    - if pd.Index is provided as the data type, and no column name is provided by the user, the index of the pd.DataFrame is passed into the called function
    - data from unit-aware pd.DataFrame objects is loaded using the pQ() accessor accordingly
    - data from unit-naive pd.DataFrame objects is coerced into pint.Quantity objects using the default units as specified in the cols list
- Parameters
cols (tuple[str, str, type]) – A list[tuple[str, str, type]] containing the column names used to call the func.
- Returns
loading – A wrapped version of the decorated func.
- Return type
Callable
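The direct-call case of the decorator can be sketched in a few lines of standard-library Python. This is a simplified illustration rather than the dgpost code: load_args is a hypothetical name, the Q class is a minimal stand-in for pint.Quantity, the type field of each tuple is ignored, and the pd.DataFrame case is left out entirely.

```python
import functools
from typing import Any, Callable, NamedTuple, Optional


class Q(NamedTuple):
    """Minimal stand-in for pint.Quantity: a magnitude and a unit string."""
    magnitude: Any
    unit: Optional[str]


def load_args(*cols: tuple) -> Callable:
    """Hypothetical, simplified analogue of load_data for direct calls only."""
    names = [name for name, _unit, _type in cols]
    defaults = {name: unit for name, unit, _type in cols}

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Positional args are assigned to kwargs by their position in cols.
            for name, value in zip(names, args):
                kwargs[name] = value
            # Bare values named in cols get the default unit; Q objects pass through.
            for name, value in list(kwargs.items()):
                if name in defaults and not isinstance(value, Q):
                    kwargs[name] = Q(value, defaults[name])
            return func(**kwargs)
        return wrapper
    return decorator


@load_args(("mass", "kg", float), ("volume", "m**3", float))
def density(mass: Q, volume: Q, output: str = "rho") -> dict:
    """Toy function: returns its result under the key given by `output`."""
    unit = f"{mass.unit}/{volume.unit}"
    return {output: Q(mass.magnitude / volume.magnitude, unit)}
```

Calling density(2.0, 4.0) wraps both bare numbers with their default units before the function body runs, while density(mass=Q(2000.0, "g"), volume=1.0, output="d") leaves the already-annotated mass untouched and stores the result under "d".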
- dgpost.utils.helpers.combine_tables(a, b)
Combine two pd.DataFrames into a new pd.DataFrame.
Assumes the pd.DataFrames contain a pd.MultiIndex. Automatically pads the pd.MultiIndex to match the higher number of levels, if necessary. Merges units.
- Return type
DataFrame
- dgpost.utils.helpers.arrow_to_multiindex(df, warn=True)
Convert the provided pd.DataFrame to a dgpost-compatible format:
  - converts tables with pd.Index into pd.MultiIndex,
  - converts ->-separated namespaces into pd.MultiIndex,
  - processes units into nested dicts.
- Return type
DataFrame
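The namespace and unit conversions listed above can be illustrated on plain Python containers. The sketch below is not the dgpost implementation: split_namespaces and nest_units are hypothetical names, and padding shorter names with the string "None" is an assumption chosen only so that all tuples have equal length, as a pd.MultiIndex requires.

```python
def split_namespaces(columns: list[str], sep: str = "->") -> list[tuple]:
    """Split each column name on `sep` and pad all tuples to a common depth."""
    parts = [tuple(name.split(sep)) for name in columns]
    depth = max(len(p) for p in parts)
    # Assumption: the padding value "None" is illustrative only.
    return [p + ("None",) * (depth - len(p)) for p in parts]


def nest_units(units: dict[str, str], sep: str = "->") -> dict:
    """Turn flat {'a->b': unit} entries into nested {'a': {'b': unit}} dicts."""
    nested: dict = {}
    for name, unit in units.items():
        *parents, leaf = name.split(sep)
        node = nested
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = unit
    return nested
```

The tuples returned by split_namespaces are in the shape accepted by pd.MultiIndex.from_tuples, and the nested dictionary mirrors the column hierarchy level by level.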
- dgpost.utils.helpers.keys_in_df(key, df)
Find all columns in the provided pd.DataFrame that match key.
Returns a set of all columns in the df which are matched by key. Assumes the provided pd.DataFrame contains a pd.MultiIndex.
- Return type
set[tuple]
- dgpost.utils.helpers.key_to_tuple(key)
Convert a provided key to a tuple for use with pd.DataFrames containing a pd.MultiIndex.
- Return type
tuple
- dgpost.utils.helpers.get_units(key, df)
Given a key corresponding to a column in the df, return the units. The provided key can be either a str for a df with pd.Index, or any other Sequence for a df with pd.MultiIndex.
- Return type
Optional[str]
- dgpost.utils.helpers.set_units(key, unit, target)
Set the units of key to unit in the target object, which can be either a dict or a pd.DataFrame. See also get_units().
- Return type
None
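How a str key and a Sequence key might both address the same unit store can be sketched with a nested dictionary. The functions lookup_unit and store_unit are hypothetical and operate on a plain dict only. The nested layout of the units dictionary, and returning None for a missing or non-leaf key, are assumptions made for this illustration.

```python
from typing import Optional, Sequence, Union

Key = Union[str, Sequence[str]]


def _as_path(key: Key) -> tuple:
    """Treat a str as a one-level path and any other sequence as a multi-level path."""
    return (key,) if isinstance(key, str) else tuple(key)


def lookup_unit(key: Key, units: dict) -> Optional[str]:
    """Walk the nested units dict along the key; None if nothing is stored."""
    node = units
    for part in _as_path(key):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node if isinstance(node, str) else None


def store_unit(key: Key, unit: str, units: dict) -> None:
    """Store `unit` under the key, creating intermediate levels as needed."""
    path = _as_path(key)
    node = units
    for part in path[:-1]:
        node = node.setdefault(part, {})
    node[path[-1]] = unit
```

After store_unit(("xin", "CO2"), "percent", units), the call lookup_unit(("xin", "CO2"), units) returns "percent", whereas lookup_unit("xin", units) returns None because "xin" names a namespace rather than a column.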