flowdata: Flow data parser

This parser handles the reading and processing of flow meter data. Files parsed through this parser are guaranteed to contain the "flow" entry in the derived keys.

Usage

The use of flowdata can be requested by supplying flowdata as the parser keyword in the dataschema. The following list of parameters is supported by the parser:

pydantic model dgbowl_schemas.yadg.dataschema_4_1.step.FlowData.Params

{
   "title": "Params",
   "type": "object",
   "properties": {
      "filetype": {
         "title": "Filetype",
         "default": "drycal.csv",
         "enum": [
            "drycal.csv",
            "drycal.rtf",
            "drycal.txt"
         ],
         "type": "string"
      },
      "convert": {
         "title": "Convert"
      },
      "calfile": {
         "title": "Calfile",
         "type": "string"
      }
   },
   "additionalProperties": false
}

field filetype: Literal['drycal.csv', 'drycal.rtf', 'drycal.txt'] = 'drycal.csv'

field convert: Optional[Any] = PydanticUndefined

field calfile: Optional[str] = PydanticUndefined
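As an illustration of requesting the parser, the following minimal sketch builds a dataschema step that selects flowdata and checks its parameters against the constraints in the JSON schema above. Only the parser keyword and the three parameter names come from this page. The surrounding keys ("metadata", "steps", "import"), the file name, and the check_params helper are assumptions made for the example and may not match the real dataschema layout.

```python
# Assumed dataschema layout; only "flowdata" and the parameter names
# ("filetype", "convert", "calfile") are taken from the documentation.
schema = {
    "metadata": {"schema_version": "4.1", "provenance": {"type": "manual"}},
    "steps": [
        {
            "parser": "flowdata",
            "import": {"files": ["20220101-flow.csv"]},  # hypothetical file name
            "parameters": {"filetype": "drycal.csv"},
        }
    ],
}

ALLOWED_KEYS = {"filetype", "convert", "calfile"}
FILETYPES = {"drycal.csv", "drycal.rtf", "drycal.txt"}


def check_params(params):
    """Hypothetical helper mirroring the JSON schema: no extra keys are
    allowed, and filetype must be one of the enum values (default drycal.csv)."""
    extra = set(params) - ALLOWED_KEYS
    if extra:
        raise ValueError(f"unsupported parameters: {sorted(extra)}")
    filetype = params.get("filetype", "drycal.csv")
    if filetype not in FILETYPES:
        raise ValueError(f"unsupported filetype: {filetype}")
    return filetype


selected = check_params(schema["steps"][0]["parameters"])
```

In practice the pydantic model performs this validation; the helper only makes the constraints explicit.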

Formats

The formats currently supported by the parser are:

DryCal log file text output (txt): drycal

DryCal log file tabulated output (csv): drycal

DryCal log file document file (rtf): drycal

Provides

The parser extracts all tabular data in the input file. Additionally, it automatically assigns a best-guess value as flow in the derived entry. This behaviour can be modified by supplying the calfile parameter, the convert parameter, or both.

This parser processes additional calibration information analogously to basiccsv.
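To make the idea of a convert specification more concrete, here is a simplified sketch of how such a mapping could turn raw columns into a derived entry. The structure, where each top-level key names a derived quantity and the inner keys are either column headers or "unit", follows the description of convert on this page. The per-header "slope" and "intercept" keys, the column names, and the apply_convert helper are assumptions for illustration and do not reproduce the calibration format actually used by yadg.

```python
# Hypothetical convert specification: "flow" is the derived quantity,
# "DryCal" is an assumed column header, "unit" is the unit of the result.
convert = {
    "flow": {
        "DryCal": {"slope": 1.0, "intercept": 0.0},  # assumed linear calibration
        "unit": "ml/min",
    }
}


def apply_convert(row, convert):
    """Sum the linearly calibrated header columns for each derived entry."""
    derived = {}
    for name, spec in convert.items():
        total = 0.0
        for header, calib in spec.items():
            if header == "unit":
                continue  # "unit" labels the result, it is not a column
            slope = calib.get("slope", 1.0)
            intercept = calib.get("intercept", 0.0)
            total += slope * row[header] + intercept
        derived[name] = {"value": total, "unit": spec.get("unit")}
    return derived


row = {"DryCal": 12.5, "Temp.": 21.3}  # one parsed table row, values as floats
derived = apply_convert(row, convert)
```

With an identity calibration the derived flow equals the raw column value, which is roughly what a best-guess default would do.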

Metadata

The metadata section currently stores all metadata available from the raw flow data files, including information about the measuring device.

Submodules

drycal: File parser for DryCal log files.

This module includes functions for parsing converted documents (rtf) and tabulated exports (txt, csv).

DryCal files contain only the time of day of each datapoint, not the date. The date therefore has to be supplied using the date argument in parameters; otherwise it is parsed from the prefix of the filename.
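The combination of a date taken from the filename with a time of day taken from the file can be sketched as follows. This is a minimal illustration, assuming the filename starts with an eight-digit YYYYMMDD prefix, the times are formatted as HH:MM:SS, and everything is in UTC. Both helper functions are hypothetical and are not part of yadg.

```python
import os
from datetime import datetime, timezone


def date_from_prefix(fn):
    """Hypothetical helper: read a YYYYMMDD date from the start of the file name."""
    prefix = os.path.basename(fn)[:8]
    return datetime.strptime(prefix, "%Y%m%d").date()


def full_timestamp(fn, time_str):
    """Combine the date from the file name with a time of day from the file
    and return a Unix timestamp, treating both as UTC."""
    day = date_from_prefix(fn)
    time_of_day = datetime.strptime(time_str, "%H:%M:%S").time()
    return datetime.combine(day, time_of_day, tzinfo=timezone.utc).timestamp()


uts = full_timestamp("20220101-flow.csv", "00:00:10")
```

A measurement that runs past midnight would need extra handling, since the time of day then wraps around while the date from the filename stays fixed.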

pydantic model yadg.parsers.flowdata.drycal.TimeDate

Bases: pydantic.main.BaseModel

{
   "title": "TimeDate",
   "type": "object",
   "properties": {
      "date": {
         "$ref": "#/definitions/TimestampSpec"
      },
      "time": {
         "$ref": "#/definitions/TimestampSpec"
      }
   },
   "definitions": {
      "TimestampSpec": {
         "title": "TimestampSpec",
         "type": "object",
         "properties": {
            "index": {
               "title": "Index",
               "type": "integer"
            },
            "format": {
               "title": "Format",
               "type": "string"
            }
         },
         "additionalProperties": false
      }
   }
}

pydantic model TimestampSpec

Bases: pydantic.main.BaseModel

{
   "title": "TimestampSpec",
   "type": "object",
   "properties": {
      "index": {
         "title": "Index",
         "type": "integer"
      },
      "format": {
         "title": "Format",
         "type": "string"
      }
   },
   "additionalProperties": false
}

field index: Optional[int] = PydanticUndefined

field format: Optional[str] = PydanticUndefined

field date: Optional[yadg.parsers.flowdata.drycal.TimeDate.TimestampSpec] = PydanticUndefined

field time: Optional[yadg.parsers.flowdata.drycal.TimeDate.TimestampSpec] = PydanticUndefined
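A TimeDate specification might be used along the following lines. The sketch assumes that index is the position of a column within a split row and that format is a strptime-style format string. Neither meaning is stated on this page, so treat both as assumptions. The parse_row helper is hypothetical.

```python
from datetime import datetime

# Hypothetical TimeDate-like specification; the meaning of "index" and
# "format" is assumed, not taken from the documentation.
timedate = {
    "date": {"index": 0, "format": "%Y-%m-%d"},
    "time": {"index": 1, "format": "%H:%M:%S"},
}


def parse_row(row, spec):
    """Parse the date and/or time columns of a row according to the spec."""
    parsed = {}
    for key in ("date", "time"):
        if key in spec:
            column = row[spec[key]["index"]]
            parsed[key] = datetime.strptime(column, spec[key]["format"])
    return parsed


parsed = parse_row(["2022-01-01", "12:30:00", "5.0"], timedate)
```

Since both fields are optional, a specification containing only time leaves the date to be supplied by other means.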

yadg.parsers.flowdata.drycal.drycal_table(lines, sep=',')

DryCal table-processing function.

Given a table with headers and units in the first line, and data in the following lines, this function returns the headers, units, and data extracted from the table. The returned values are always of type (str); any post-processing is done in the calling routine.

Parameters

lines (list) – A list containing the lines to be parsed
sep (str) – The separator string used to split each line into individual items

Returns

(headers, units, data) – A tuple of a list of the stripped headers, dictionary of header-unit key-value pairs, and a list of lists containing the rows of the table.

Return type

tuple[list, dict, list]
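The behaviour described for drycal_table can be approximated with the sketch below. It is not the yadg implementation: it assumes, purely for illustration, that each header cell carries its unit in trailing parentheses, and real DryCal exports may format header cells differently. The function name table_sketch and the sample lines are made up for the example.

```python
def table_sketch(lines, sep=","):
    """Split a header line and data lines into (headers, units, data).

    All returned values are strings; no numeric conversion is attempted.
    """
    headers = []
    units = {}
    for cell in lines[0].split(sep):
        cell = cell.strip()
        if cell.endswith(")") and "(" in cell:
            # Assumed header layout: "Name (unit)"
            name, unit = cell[:-1].rsplit("(", 1)
            name, unit = name.strip(), unit.strip()
        else:
            name, unit = cell, ""
        headers.append(name)
        units[name] = unit
    # Skip blank lines and strip whitespace around every item
    data = [
        [item.strip() for item in line.split(sep)]
        for line in lines[1:]
        if line.strip()
    ]
    return headers, units, data


lines = [
    "Sample, Time, Flow (ml/min)",
    "1, 12:00:01, 10.2",
    "2, 12:00:02, 10.4",
]
headers, units, data = table_sketch(lines)
```

Keeping every value as a string leaves conversion to floats and timestamps to the calling routine, as the description above states.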

yadg.parsers.flowdata.drycal.rtf(fn, encoding='utf-8', timezone='UTC', calib={})

RTF version of the drycal parser.

This is intended to parse legacy drycal DOC files, which have been converted to RTF using other means.

Parameters

fn (str) – Filename to parse.
encoding (str) – Encoding to use for parsing fn.
timezone (str) – A string description of the timezone. Default is "UTC".
calib (dict) – A calibration spec.

Returns

(timesteps, metadata, None) – A standard data - metadata - common data output tuple.

Return type

tuple[list, dict, None]

yadg.parsers.flowdata.drycal.sep(fn, sep, encoding='utf-8', timezone='UTC', calib={})

Generic drycal parser, using sep as separator string.

This is intended to parse other export formats from DryCal, such as txt and csv files.

Parameters

fn (str) – Filename to parse.
date – A unix timestamp float corresponding to the day (or other offset) to be added to each line in the measurement table.
sep (str) – The separator character used to split lines in fn.
encoding (str) – Encoding to use for parsing fn.
timezone (str) – A string description of the timezone. Default is "UTC".
calib (dict) – A calibration spec.

Returns

(timesteps, metadata, None) – A standard data - metadata - common data output tuple.

Return type

tuple[list, dict, None]

yadg.parsers.flowdata.main.process(fn, encoding='utf-8', timezone='localtime', parameters=None)

Flow meter data processor

This parser processes flow meter data.

Parameters

fn (str) – File to process
encoding (str) – Encoding of fn, by default “utf-8”.
timezone (str) – A string description of the timezone. Default is “localtime”.
filetype – Whether an rtf, csv, or txt file is to be expected. When None, the suffix of the file is used to determine the file type.
convert – Specification for column conversion. The key of each entry will form a new datapoint in the "derived" (dict) of a timestep. The elements within each entry must either be one of the "header" fields, or "unit" (str) specification. See processing convert for more info.
calfile – convert-like functionality specified in a json file.
date – An optional date argument, required for parsing DryCal files. Otherwise the date is parsed from fn.

Returns

(data, metadata, fulldate) – Tuple containing the timesteps, metadata, and full date tag. Whether full date is returned depends on the file parser.

Return type

tuple[list, dict, bool]
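The fallback described for the filetype parameter, where the file suffix is used when no filetype is given, could look roughly like this. The guess_filetype helper is hypothetical, and the mapping of a suffix to a "drycal.<suffix>" value is an assumption based on the filetype values listed in the parameter schema.

```python
import os

SUFFIXES = {"csv", "rtf", "txt"}


def guess_filetype(fn, filetype=None):
    """Return the explicit filetype, or derive one from the suffix of fn."""
    if filetype is not None:
        return filetype
    suffix = os.path.splitext(fn)[1].lower().lstrip(".")
    if suffix not in SUFFIXES:
        raise ValueError(f"cannot determine filetype from suffix: {suffix!r}")
    return f"drycal.{suffix}"


guessed = guess_filetype("20220101-flow.RTF")
```

An explicit filetype always wins, which allows, for example, a comma-separated export saved with a .txt suffix to be read as drycal.csv.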