yadg.parsers package

Submodules

yadg.parsers.basiccsv module

yadg.parsers.basiccsv.process(fn, encoding='utf-8', timezone='localtime', sep=',', units=None, timestamp=None, convert=None, calfile=None)

A basic csv parser.

This parser processes a csv file. The header of the csv file consists of one or two lines, with the column headers in the first line and the units in the second. The parser also attempts to parse column names to produce a timestamp, and saves all other columns as floats or strings.
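The header handling described above can be illustrated with a minimal, self-contained sketch. It does not use yadg; `split_header` and `to_float_or_str` are hypothetical helpers, and the rule that a supplied `units` mapping turns the second line into data is taken from the `units` parameter description of this function.

```python
import csv
import io


def split_header(text, sep=",", units=None):
    """Split csv text into headers, a units mapping, and data rows.

    Hypothetical helper for illustration only; not part of yadg.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    headers = [h.strip() for h in rows[0]]
    if units is None:
        # No units supplied: the second line is read as the units line.
        units = dict(zip(headers, (u.strip() for u in rows[1])))
        data = rows[2:]
    else:
        # Units supplied (even an empty dict): the second line is data.
        data = rows[1:]
    return headers, units, data


def to_float_or_str(item):
    """Store a cell as a float where possible, otherwise as a string."""
    try:
        return float(item)
    except ValueError:
        return item.strip()


two_line = "time,T,flow\ns,K,ml/min\n0.0,298.15,10\n"
headers, units, data = split_header(two_line)
values = [to_float_or_str(item) for item in data[0]]
```

With the two-line header above, `units` maps each column name to the entry in the second line, and `data` holds only the remaining row.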

Parameters
  • fn (str) – File to process

  • encoding (str) – Encoding of fn, by default “utf-8”.

  • timezone (str) – A string description of the timezone. Default is “localtime”.

  • sep (str) – Separator to use. Default is “,” for csv.

  • units (Optional[dict]) – Column-specific unit specification. If present, even if empty, the second line is treated as data. If omitted, the second line is treated as units.

  • timestamp (Optional[dict]) – Specification for timestamping. Allowed keys are "date", "time", "timestamp", "uts". The entries can be "index" (list[int]), containing the column indices, and "format" (str) with the format string to be used to parse the date. See yadg.dgutils.dateutils.infer_timestamp_from() for more info.

  • convert (Optional[dict]) – Specification for column conversion. The key of each entry will form a new datapoint in the "derived" (dict) of a timestep, including the option to specify linear combinations. See yadg.parsers.basiccsv.process_row() for more info.

  • calfile (Optional[str]) – convert-like functionality specified in a json file.

Returns

(data, metadata, fulldate) – Tuple containing the timesteps, metadata, and full date tag. No metadata is returned by the basiccsv parser. The full date might not be returned, e.g. when only the time is specified in the columns.

Return type

tuple[list, dict, bool]
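As an illustration of the timestamp parameter, the sketch below turns a specification with "date" and "time" entries into a list of column indices and a function that computes a Unix timestamp. It is not the yadg implementation: `make_datefunc` is a hypothetical helper, the format strings are assumed to follow `datetime.strptime` conventions, and the values are interpreted as UTC to keep the example deterministic. The actual inference is done by yadg.dgutils.dateutils.infer_timestamp_from().

```python
from datetime import datetime, timezone


def make_datefunc(spec):
    """Build (datecolumns, datefunc) from a spec with "date" and "time" keys.

    Hypothetical helper; only the "date" + "time" combination is handled.
    """
    date_spec, time_spec = spec["date"], spec["time"]
    datecolumns = date_spec["index"] + time_spec["index"]
    fmt = date_spec["format"] + " " + time_spec["format"]

    def datefunc(values):
        # Join the selected cells and parse them with the combined format.
        stamp = datetime.strptime(" ".join(values), fmt)
        # Assumption: treat the parsed value as UTC.
        return stamp.replace(tzinfo=timezone.utc).timestamp()

    return datecolumns, datefunc


spec = {
    "date": {"index": [0], "format": "%Y-%m-%d"},
    "time": {"index": [1], "format": "%H:%M:%S"},
}
datecolumns, datefunc = make_datefunc(spec)
row = ["2021-01-01", "00:00:10", "298.15"]
uts = datefunc([row[i] for i in datecolumns])
```

If only a "time" entry were given, a sketch like this could not recover the calendar date, which corresponds to the case where the full date tag is not returned.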

yadg.parsers.basiccsv.process_row(headers, items, units, datefunc, datecolumns, calib={})

A function that processes a row of a table.

This is the main worker function of basiccsv, but can be re-used by any other parser that needs to process tabular data.

This function processes the "calib" parameter, which should be a (dict) in the following format:

- new_name:     !!str    # derived entry name
  - old_name:   !!str    # raw header name
    - calib: {}          # calibration specification
    fraction:   !!float  # coefficient for linear combinations of old_name
  unit:         !!str    # unit of new_name

The syntax of the calibration specification is detailed in yadg.dgutils.calib.calib_handler().
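One possible reading of the layout above, written as a Python dict, is sketched below together with a small function that evaluates the linear combination. This is an assumption-laden illustration: `combine` is a hypothetical helper, the column names are invented, and an empty "calib" entry is treated as the identity, which may differ from what yadg.dgutils.calib.calib_handler() does.

```python
# new_name -> {old_name -> {"calib": ..., "fraction": ...}, "unit": ...}
calib = {
    "T_mean": {
        "T_in": {"calib": {}, "fraction": 0.5},
        "T_out": {"calib": {}, "fraction": 0.5},
        "unit": "K",
    }
}


def combine(raw, spec):
    """Sum fraction * value over all old_name entries of one derived entry.

    Hypothetical helper; the "calib" entries are ignored (identity assumed).
    """
    total = 0.0
    for old_name, entry in spec.items():
        if old_name == "unit":
            continue  # "unit" describes the derived entry, not a raw column
        total += entry.get("fraction", 1.0) * raw[old_name]
    return total, spec.get("unit")


raw = {"T_in": 300.0, "T_out": 310.0}
value, unit = combine(raw, calib["T_mean"])
```

Here the derived entry "T_mean" is the average of the two raw columns, reported in the unit given at the level of the derived entry.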

Parameters
  • headers (list) – A list of headers of the table.

  • items (list) – A list of values corresponding to the headers. Must be the same length as headers.

  • units (dict) – A dict for looking up the units corresponding to a certain header.

  • datefunc (Callable) – A function that will generate uts given a list of values.

  • datecolumns (list) – Column indices that need to be passed to datefunc to generate uts.

  • calib (dict) – Specification for converting raw data in headers and items to other quantities. Arbitrary linear combinations of headers are possible. See the above section for the specification.

Returns

element – A result dictionary, containing the keys "uts" with a timestamp, "raw" for all raw data present in the headers, and "derived" for any data processed via calib.

Return type

dict
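The shape of the returned element can be illustrated with a simplified stand-in that does not call yadg. `sketch_process_row` is hypothetical; the layout of each "raw" entry (a dict with "value" and "unit") and the choice to leave the date columns out of "raw" are assumptions, and calibration is omitted, so "derived" stays empty.

```python
def sketch_process_row(headers, items, units, datefunc, datecolumns):
    """Simplified stand-in for a row processor; not the yadg implementation."""
    if len(headers) != len(items):
        raise ValueError("headers and items must have the same length")
    element = {
        # datefunc receives the values of the date columns as a list.
        "uts": datefunc([items[i] for i in datecolumns]),
        "raw": {},
        "derived": {},  # would be filled from calib; left empty here
    }
    for index, (header, item) in enumerate(zip(headers, items)):
        if index in datecolumns:
            continue  # assumption: date columns are not repeated in "raw"
        try:
            value = float(item)
        except ValueError:
            value = item  # keep non-numeric cells as strings
        element["raw"][header] = {"value": value, "unit": units.get(header)}
    return element


element = sketch_process_row(
    headers=["t", "T", "note"],
    items=["10", "298.15", "ok"],
    units={"t": "s", "T": "K"},
    datefunc=lambda values: float(values[0]),
    datecolumns=[0],
)
```

Because the function only needs headers, items, and a date function, the same pattern applies to any parser that handles tabular data.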

yadg.parsers.dummy module

yadg.parsers.dummy.process(fn, encoding='utf-8', timezone='localtime', **kwargs)

A dummy parser.

This parser simply returns the current time, the filename provided, and any kwargs passed.

Parameters
  • fn (str) – Filename to process

  • encoding (str) – Not used.

  • timezone (str) – Not used.

Returns

(data, metadata, fulldate) – Tuple containing the timesteps, metadata, and full date tag. No metadata is returned by the dummy parser. The full date is always returned.

Return type

tuple[list, dict, bool]
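The behaviour described for the dummy parser can be mimicked in a few lines. The sketch below is not the yadg source: `sketch_dummy` and the key names "fn" and "kwargs" are hypothetical, and an empty dict stands in for the absent metadata.

```python
import time


def sketch_dummy(fn, encoding="utf-8", timezone="localtime", **kwargs):
    """Return one timestep holding the current time, the filename, and kwargs."""
    # encoding and timezone are accepted but not used, as in the description.
    timestep = {"uts": time.time(), "fn": str(fn), "kwargs": kwargs}
    metadata = {}    # assumption: empty dict when no metadata is returned
    fulldate = True  # the current time always carries a full date
    return [timestep], metadata, fulldate


data, metadata, fulldate = sketch_dummy("example.csv", marker="test")
```

A parser of this kind is mainly useful for checking that keyword arguments are passed through unchanged.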

yadg.parsers.meascsv module

yadg.parsers.meascsv.process(fn, encoding='utf-8', timezone='localtime', convert=None, calfile=None)

Legacy MCPT measurement log parser.

This parser is included to maintain parity with older schemas and datagrams. It is essentially a wrapper around yadg.parsers.basiccsv.process_row(). For new applications, please use the basiccsv parser.

Parameters
  • fn (str) – File to process

  • encoding (str) – Encoding of fn, by default “utf-8”.

  • timezone (str) – A string description of the timezone. Default is “localtime”.

  • convert (Optional[dict]) – Specification for column conversion. See yadg.parsers.basiccsv.process_row() for syntax and further details.

  • calfile (Optional[str]) – convert-like functionality specified in a json file.

Returns

(data, metadata, fulldate) – Tuple containing the timesteps, metadata, and full date tag. No metadata is returned. The full date is always provided in meascsv-compatible files.

Return type

tuple[list, dict, bool]