chromatography: chromatographic trace postprocessing library
Code author: Peter Kraus
Includes functions to integrate a chromatographic trace, and post-process an integrated trace using calibration.
Functions
integrate_trace | Chromatographic trace integration.
apply_calibration | Apply calibration to an integrated chromatographic trace.
- dgpost.transform.chromatography.integrate_trace(time: Quantity, signal: Quantity, species: dict[str, dict], polyorder: int = 3, window: int = 7, prominence: float = 0.0001, threshold: float = 1.0, output: str = 'trace') → dict[str, float]
Chromatographic trace integration.
Function which integrates peaks found in the chromatographic trace, which is itself defined as a set of time, signal arrays. The procedure is as follows:
  - The signal is smoothed using the Savitzky-Golay filter, via scipy.signal.savgol_filter(). For this, the arguments polyorder and window are used.
  - Peak maxima are found using scipy.signal.find_peaks(). For this, the argument prominence is used, scaled by max(abs(signal)).
  - Peak edges of every found peak are determined. The peak ends are either determined from the nearest minima, or from the nearest inflection point at which the gradient is below the threshold.
  - Peak maxima are matched against known peaks, provided in the species argument. A peak is considered to match a species when its maximum lies between the left and right limits defined in species.
  - A baseline is constructed by copying the signal data and interpolating between the ends of all matched peaks. If consecutive peaks are found, the interpolation spans the whole domain.
  - The baseline is subtracted from the signal and the peak areas are integrated using the numpy.trapezoid() function.
  - The peak height is taken from the original signal data.
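The procedure above can be sketched with plain NumPy arrays. This is a simplified illustration under stated assumptions, not the dgpost implementation: the helper name integrate_peaks is hypothetical, units are ignored, peak edges are taken only at the nearest minima of the smoothed signal (the gradient threshold test is left out), species matching is omitted, and scipy.integrate.trapezoid stands in for numpy.trapezoid() so that the sketch also runs on older NumPy versions.

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.signal import find_peaks, savgol_filter


def integrate_peaks(time, signal, polyorder=3, window=7, prominence=1e-4):
    """Return a list of (t_max, area, height) tuples, one per detected peak.

    Hypothetical helper: edges are the nearest minima of the smoothed
    signal; the gradient threshold test is not implemented here.
    """
    # Smooth the signal with a Savitzky-Golay filter.
    smooth = savgol_filter(signal, window_length=window, polyorder=polyorder)
    # Pick peak maxima; the prominence is scaled by max(abs(signal)).
    maxima, _ = find_peaks(smooth, prominence=prominence * np.max(np.abs(signal)))
    results = []
    for imax in maxima:
        # Walk downhill from the maximum to the nearest minimum on each side.
        left = imax
        while left > 0 and smooth[left - 1] < smooth[left]:
            left -= 1
        right = imax
        while right < len(smooth) - 1 and smooth[right + 1] < smooth[right]:
            right += 1
        # Straight-line baseline between the two peak ends.
        t = time[left:right + 1]
        y = signal[left:right + 1]
        baseline = np.interp(t, [t[0], t[-1]], [y[0], y[-1]])
        # Trapezoidal integration of the baseline-subtracted signal.
        area = trapezoid(y - baseline, t)
        # The height is read from the original, unsmoothed signal.
        results.append((float(time[imax]), float(area), float(signal[imax])))
    return results


# Synthetic, noise-free trace: two Gaussian peaks on a constant offset of 0.5.
t = np.linspace(0.0, 60.0, 601)
trace = (0.5
         + 2.0 * np.exp(-0.5 * ((t - 20.0) / 1.5) ** 2)
         + 1.0 * np.exp(-0.5 * ((t - 45.0) / 1.0) ** 2))
peaks = integrate_peaks(t, trace)
```

For a Gaussian peak of amplitude a and width sigma, the integrated area should be close to a * sigma * sqrt(2 * pi), which gives a quick sanity check on the baseline subtraction. On noisy data the default prominence would likely pick up many spurious maxima, so it would need tuning.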
The format of the species specification, used for peak matching, is as follows:
    "{{ species_name }}" :
        l: pint.Quantity
        r: pint.Quantity
with the keys "l" and "r" corresponding to the left and right limit for the maximum of the peak. The limits can be either pint.Quantity, or str with the same dimensionality as time, or a float, in which case the units of time are assumed.
- Parameters:
  time – A pint.Quantity array object determining the X-axis of the trace. By default in seconds.
  signal – A pint.Quantity array object containing the Y-axis of the trace. By default dimensionless.
  species – A dict[str, dict], where the keys are species names and the values define the left and right limits for matching the peak maximum.
  polyorder – An int defining the order of the polynomial for the Savitzky-Golay filter. Defaults to 3. The polyorder must be less than window.
  window – An int defining the smoothing window for the Savitzky-Golay filter. Defaults to 7. Must be odd. The polyorder must be less than window.
  prominence – A float used to calculate the prominence of the peaks in signal by scaling max(abs(signal)). Used in the peak picking process. Defaults to 0.0001.
  threshold – A float used to find ends of peaks by comparing to the gradient of signal at the nearest inflection points. Defaults to 1.0.
  output – A str prefix for the output namespace. The results are collated in the f"{output}->area" namespace for peak areas and the f"{output}->height" namespace for peak heights. Defaults to 'trace'.
- Returns:
  retvals – A dictionary containing the peak areas and peak heights of matched peaks stored in namespaced pint.Quantities.
- Return type:
  dict[str, dict[str, pint.Quantity]]
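As an illustration of the species specification and the matching rule described above, the following sketch assigns peak maxima to species using plain float limits, i.e. limits assumed to be in the units of time. The helper match_species and the retention windows are hypothetical, and treating the limits as inclusive is an assumption rather than documented behaviour.

```python
def match_species(t_max, species):
    """Return the name of the first species whose window contains t_max, else None."""
    for name, limits in species.items():
        # Assumption: both limits are inclusive.
        if limits["l"] <= t_max <= limits["r"]:
            return name
    return None


# Hypothetical retention windows, given as floats in the units of `time`.
species = {
    "H2": {"l": 15.0, "r": 25.0},
    "CH4": {"l": 40.0, "r": 50.0},
}

# Peak maxima at 20 and 45 fall inside a window; the one at 55 matches nothing.
assigned = {t: match_species(t, species) for t in (20.0, 45.0, 55.0)}
```

Peaks whose maxima fall outside every window are left unassigned here, which mirrors the statement that only matched peaks appear in the returned areas and heights.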
- dgpost.transform.chromatography.apply_calibration(areas: Quantity, calibration: dict[str, dict], output: str = 'x') → dict[str, float]
Apply calibration to an integrated chromatographic trace.
Function which applies calibration information, provided in a dict, to an integrated chromatographic trace. Elements in the calibration dict are treated as chemicals, matched against the chromatographic data using SMILES.
The format of the calibration is as follows:
    "{{ species_name }}" :
        function: Literal["inverse", "linear"]
        m: float
        c: Optional[float]
Two calibration functions are provided. Either the output value x is calculated as $x = (A - c) / m$, i.e. an “inverse” relationship, or using $x = m \times A + c$, a “linear” relationship. The offset $c$ is optional. Both $m$ and $c$ are internally converted to pint.Quantity, therefore they can be specified with uncertainty, but have to be annotated by appropriate units to convert the units of the peak areas to the desired output.
- Parameters:
  areas – A dict containing a namespace of pint.Quantity containing the integrated peak areas $A$, with their keys corresponding to chemicals.
  calibration – A dict containing the calibration information for processing the above peak areas into the resulting pint.Quantity.
  output – The str prefix for the output namespace.
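The two calibration relationships can be sketched on unitless floats as follows. This is an illustration only: the helper name apply_calibration_sketch and all numeric values are hypothetical, species are matched by plain key equality rather than by SMILES, a missing offset $c$ is assumed to mean zero, and neither units nor uncertainties are handled.

```python
def apply_calibration_sketch(areas, calibration):
    """Convert peak areas A to output values x, species by species."""
    result = {}
    for name, spec in calibration.items():
        if name not in areas:
            # No integrated peak for this species; skip it.
            continue
        area = areas[name]
        m = spec["m"]
        # Assumption: the optional offset c is zero when absent or None.
        c = spec.get("c") or 0.0
        if spec["function"] == "inverse":
            result[name] = (area - c) / m   # x = (A - c) / m
        elif spec["function"] == "linear":
            result[name] = m * area + c     # x = m * A + c
        else:
            raise ValueError(f"unknown calibration function: {spec['function']!r}")
    return result


# Hypothetical peak areas and calibration entries.
areas = {"H2": 10.0, "CH4": 4.0}
calibration = {
    "H2": {"function": "inverse", "m": 2.0, "c": 1.0},
    "CH4": {"function": "linear", "m": 0.5},
}
x = apply_calibration_sketch(areas, calibration)
```

With these inputs, H2 follows the inverse form, (10.0 - 1.0) / 2.0, and CH4 the linear form with no offset, 0.5 * 4.0.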