chromdata: Post-processed chromatography data parser
This module handles the reading of post-processed chromatography data, i.e. files containing peak areas, concentrations, or mole fractions.
chromdata loads the processed chromatographic data from the specified file, including the peak heights, areas, and retention times, as well as the concentrations and mole fractions (normalised, unitless concentrations).
Note
To parse trace data as present in raw chromatograms, use the chromtrace parser.
Usage
Available since yadg-4.2. The parser supports the following parameters:
- pydantic model dgbowl_schemas.yadg.dataschema_4_2.step.ChromData.Params
JSON schema:
{
  "title": "Params",
  "type": "object",
  "properties": {
    "filetype": {
      "title": "Filetype",
      "default": "fusion.json",
      "enum": [
        "fusion.json",
        "fusion.zip",
        "fusion.csv",
        "empalc.csv",
        "empalc.xlsx"
      ],
      "type": "string"
    }
  },
  "additionalProperties": false
}
- field filetype: Literal['fusion.json', 'fusion.zip', 'fusion.csv', 'empalc.csv', 'empalc.xlsx'] = 'fusion.json'
Formats
The filetypes currently supported by the parser are:
- Inficon Fusion JSON format (fusion.json): see fusionjson
- Inficon Fusion zip archive (fusion.zip): see fusionzip
- Inficon Fusion csv export (fusion.csv): see fusioncsv
- Empa’s Agilent LC csv export (empalc.csv): see empalccsv
- Empa’s Agilent LC excel export (empalc.xlsx): see empalcxlsx
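The correspondence between the filetype values and the submodules listed above can be written as a lookup table. This is a minimal sketch for orientation only: the names `SUBMODULES` and `submodule_for` are hypothetical and are not part of yadg, and the actual dispatch code in the package may look different.

```python
# Illustrative lookup only: maps each documented filetype to the name of the
# chromdata submodule that handles it. This is not yadg's own dispatch code.
SUBMODULES = {
    "fusion.json": "fusionjson",
    "fusion.zip": "fusionzip",
    "fusion.csv": "fusioncsv",
    "empalc.csv": "empalccsv",
    "empalc.xlsx": "empalcxlsx",
}


def submodule_for(filetype: str = "fusion.json") -> str:
    """Return the dotted path of the submodule documented for a filetype."""
    if filetype not in SUBMODULES:
        raise ValueError(f"unsupported filetype: {filetype!r}")
    return f"yadg.parsers.chromdata.{SUBMODULES[filetype]}"
```

The default argument mirrors the default value of the filetype field, which is 'fusion.json'.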
Provides
This raw data is stored, for each timestep, using the following format:
- uts: !!float
  fn: !!str
  raw:
    sampleid: !!str               # sample name or valve ID
    height:                       # heights of the peak maxima
      "{{ species_name }}":
        {n: !!float, s: !!float, u: !!str}
    area:                         # integrated areas of the peaks
      "{{ species_name }}":
        {n: !!float, s: !!float, u: !!str}
    concentration:
      "{{ species_name }}":
        {n: !!float, s: !!float, u: !!str}
    xout:                         # mole fractions (normalised concentrations)
      "{{ species_name }}":
        {n: !!float, s: !!float, u: " "}
    retention time:
      "{{ species_name }}":
        {n: !!float, s: !!float, u: " "}
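To make the layout concrete, the following sketch builds one timestep by hand and reads values back from it. It assumes that the keys n, s, and u hold the value, its uncertainty, and its unit. Every number, unit, species name, and the filename is a made-up placeholder, and the helpers `entry` and `nominal_values` are hypothetical, not yadg functions.

```python
def entry(n: float, s: float, u: str) -> dict:
    """Build one {n, s, u} record (assumed: value, uncertainty, unit)."""
    return {"n": n, "s": s, "u": u}


# A hand-written timestep following the documented layout; placeholder data.
timestep = {
    "uts": 1650000000.0,
    "fn": "example.json",
    "raw": {
        "sampleid": "valve 1",
        "height": {"CO2": entry(12.0, 0.1, "a.u."), "N2": entry(36.0, 0.1, "a.u.")},
        "area": {"CO2": entry(150.0, 1.0, "a.u."), "N2": entry(450.0, 1.0, "a.u.")},
        "concentration": {"CO2": entry(25.0, 0.5, "%"), "N2": entry(75.0, 0.5, "%")},
        "xout": {"CO2": entry(0.25, 0.005, " "), "N2": entry(0.75, 0.005, " ")},
        "retention time": {"CO2": entry(35.2, 0.1, " "), "N2": entry(12.8, 0.1, " ")},
    },
}


def nominal_values(step: dict, quantity: str) -> dict:
    """Collect the "n" value of every species for one quantity of a timestep."""
    return {species: rec["n"] for species, rec in step["raw"][quantity].items()}


mole_fracs = nominal_values(timestep, "xout")
```

Because every quantity is keyed by species name, the same helper works for height, area, concentration, xout, and retention time.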
Note
The mole fractions in xout always sum up to unity. If there is more than one outlet stream, or if some analytes remain unidentified, the values in xout will not be accurate.
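The effect of an unidentified analyte can be illustrated with a small calculation. The sketch below assumes that each mole fraction is the concentration of a species divided by the sum of the concentrations of all identified species. The function `mole_fractions` and all numbers are hypothetical and only illustrate the arithmetic.

```python
def mole_fractions(concentrations: dict) -> dict:
    """Normalise concentrations so that the returned fractions sum to one."""
    total = sum(concentrations.values())
    if total <= 0:
        raise ValueError("the total concentration must be positive")
    return {species: c / total for species, c in concentrations.items()}


# Made-up concentrations of the identified species, in identical units.
identified = {"A": 2.0, "B": 6.0}
reported = mole_fractions(identified)

# The same stream, including an analyte "C" that was not identified.
actual = mole_fractions({**identified, "C": 2.0})

print(reported["A"], actual["A"])
```

In this example the reported fraction of A is 0.25, while its fraction in the full mixture is 0.2: the normalisation over identified species alone overestimates every reported value.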
Submodules
empalccsv: Processing Empa’s online LC exported data (csv)
This is a structured format produced by the export from Agilent’s Online LC device at Empa. It contains three sections:
metadata section,
table containing sampling information,
table containing analysed chromatography data.
Exposed metadata:
params:
  method: !!str
  username: !!str
  version: !!int
  datafile: !!str
Code author: Peter Kraus
- yadg.parsers.chromdata.empalccsv.process(fn, encoding, timezone)
Empa LC csv export format.
Multiple chromatograms per file, with multiple detectors.
- Parameters
fn (str) – Filename to process.
encoding (str) – Encoding used to open the file.
timezone (str) – Timezone information. This should be "localtime".
- Returns
([chrom], metadata, fulldate) – Standard timesteps, metadata, and date tuple.
- Return type
tuple[list, dict, bool]
empalcxlsx: Processing Empa’s online LC exported data (xlsx)
This is a structured format produced by the export from Agilent’s Online LC device at Empa. It contains three sections:
metadata section,
table containing sampling information,
table containing analysed chromatography data.
Exposed metadata:
params:
  method: !!str
  username: !!str
  version: !!int
  datafile: !!str
Code author: Peter Kraus
- yadg.parsers.chromdata.empalcxlsx.process(fn, encoding, timezone)
Empa LC xlsx export format.
Multiple chromatograms per file, with multiple detectors.
- Parameters
fn (str) – Filename to process.
encoding (str) – Encoding used to open the file.
timezone (str) – Timezone information. This should be "localtime".
- Returns
([chrom], metadata, fulldate) – Standard timesteps, metadata, and date tuple.
- Return type
tuple[list, dict, bool]
fusioncsv: Processing Inficon Fusion csv export format (csv).
This is a tabulated format, including the concentrations, mole fractions, peak areas, and retention times. The retention times are ignored by this parser.
Warning
As also mentioned in the csv files themselves, the use of this filetype is discouraged, and the json files (or a zipped archive of them) should be parsed instead.
Exposed metadata:
params:
  method: !!str
  username: None
  version: None
  datafile: None
Code author: Peter Kraus
- yadg.parsers.chromdata.fusioncsv.process(fn, encoding, timezone)
Fusion csv export format.
Multiple chromatograms per file, with multiple detectors.
- Parameters
fn (str) – Filename to process.
encoding (str) – Encoding used to open the file.
timezone (str) – Timezone information. This should be "localtime".
- Returns
([chrom], metadata, fulldate) – Standard timesteps, metadata, and date tuple.
- Return type
tuple[list, dict, bool]
fusionjson: Processing Inficon Fusion json data format (json).
This is a fairly detailed data format, including the traces, the calibration applied, and also the integrated peak areas and other processed information, which are parsed by this module.
Note
To parse the raw trace data, use the chromtrace module.
Warning
The detectors in the json files are not necessarily in a consistent order. To avoid inconsistent parsing of species which appear in both detectors, the detector keys are sorted. Species present in both detectors will be overwritten by the last detector in alphabetical order.
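The overwriting behaviour described in this warning can be sketched as follows. This is an illustration of the described rule, not the parser's own code: the function `merge_detectors`, the detector names, and the values are all hypothetical.

```python
def merge_detectors(detectors: dict) -> dict:
    """Merge per-detector species into one mapping, in sorted detector order.

    A species reported by several detectors keeps the value from the
    detector whose key comes last alphabetically.
    """
    merged = {}
    for name in sorted(detectors):
        merged.update(detectors[name])
    return merged


# Hypothetical input, deliberately listed out of alphabetical order.
detectors = {
    "TCD2": {"CO2": 2.0, "H2": 5.0},
    "TCD1": {"CO2": 1.0, "N2": 3.0},
}

merged = merge_detectors(detectors)
```

Here CO2 appears in both detectors, so the merged result keeps the value 2.0 from "TCD2", regardless of the order in which the detectors appear in the input.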
Exposed metadata:
params:
  method: !!str
  username: None
  version: !!str
  datafile: !!str
Code author: Peter Kraus
- yadg.parsers.chromdata.fusionjson.process(fn, encoding, timezone)
Fusion json format.
One chromatogram per file with multiple traces, and pre-analysed results. Only a subset of the metadata is retained, including the method name, detector names, and information about assigned peaks.
- Parameters
fn (str) – Filename to process.
encoding (str) – Encoding used to open the file.
timezone (str) – Timezone information. This should be "localtime".
- Returns
([chrom], metadata, fulldate) – Standard timesteps, metadata, and date tuple.
- Return type
tuple[list, dict, bool]
fusionzip: Processing Inficon Fusion zipped data format (zip).
This is a wrapper parser which unzips the provided zip file, and then uses the yadg.parsers.chromdata.fusionjson parser to parse every data file present in the archive.
Exposed metadata:
params:
  method: !!str
  username: None
  version: !!str
  datafile: !!str
Code author: Peter Kraus
- yadg.parsers.chromdata.fusionzip.process(fn, encoding, timezone)
Fusion zip file format.
The Fusion GCs can export their json files as a zip archive of a folder of json files. This parser parses the zip archive directly, without the user having to unzip and move the data.
- Parameters
fn (str) – Filename to process.
encoding (str) – Not used as the file is binary.
timezone (str) – Timezone information. This should be "localtime".
- Returns
(chroms, metadata) – Standard timesteps & metadata tuple.
- Return type
tuple[list, dict]
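The wrapper pattern described for this submodule (unzip, then parse every file) can be sketched with the standard library alone. The function `parse_json_file` is a hypothetical stand-in for the per-file fusionjson parser, and `parse_zip` is likewise an illustrative name. The sketch assumes every file in the archive is a json file and does not reproduce yadg's handling of metadata or timestamps.

```python
import json
import tempfile
import zipfile
from pathlib import Path


def parse_json_file(path: Path) -> dict:
    """Hypothetical stand-in for the per-file json parser."""
    with path.open("r", encoding="utf-8") as f:
        return {"fn": path.name, "raw": json.load(f)}


def parse_zip(fn: str) -> list:
    """Unzip fn into a temporary folder and parse every file found in it."""
    timesteps = []
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(fn) as archive:
            archive.extractall(tmp)
        # Sorted so that the result does not depend on the archive ordering.
        for path in sorted(Path(tmp).rglob("*")):
            if path.is_file():
                timesteps.append(parse_json_file(path))
    return timesteps
```

Extracting into a temporary directory means the archive contents never have to be unpacked next to the original data, and the folder is removed once parsing finishes.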
- yadg.parsers.chromdata.main.process(fn, encoding='utf-8', timezone='localtime', parameters=None)
Unified chromatographic data parser.
- Parameters
fn (str) – The file containing the trace(s) to parse.
encoding (str) – Encoding of fn, by default “utf-8”.
timezone (str) – A string description of the timezone. Default is “localtime”.
parameters (Optional[BaseModel]) – Parameters for ChromData.
- Returns
(data, metadata, fulldate) – Tuple containing the timesteps, metadata, and full date tag. All currently supported file formats return full date.
- Return type
tuple[list, dict, bool]