flowdata: Flow data parser

Handles the reading and processing of flow controller or flow meter data.

Usage

Available since yadg-4.0. The parser supports the following parameters:

pydantic model dgbowl_schemas.yadg.dataschema_5_0.step.FlowData

Parser for flow controller/meter data.

JSON schema:
{
   "title": "FlowData",
   "description": "Parser for flow controller/meter data.",
   "type": "object",
   "properties": {
      "tag": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Tag"
      },
      "parser": {
         "const": "flowdata",
         "title": "Parser"
      },
      "input": {
         "$ref": "#/$defs/Input"
      },
      "extractor": {
         "discriminator": {
            "mapping": {
               "drycal.csv": "#/$defs/Drycal_csv",
               "drycal.rtf": "#/$defs/Drycal_rtf",
               "drycal.txt": "#/$defs/Drycal_txt"
            },
            "propertyName": "filetype"
         },
         "oneOf": [
            {
               "$ref": "#/$defs/Drycal_csv"
            },
            {
               "$ref": "#/$defs/Drycal_rtf"
            },
            {
               "$ref": "#/$defs/Drycal_txt"
            }
         ],
         "title": "Extractor"
      },
      "parameters": {
         "anyOf": [
            {
               "$ref": "#/$defs/Parameters"
            },
            {
               "type": "null"
            }
         ],
         "default": null
      },
      "externaldate": {
         "anyOf": [
            {
               "$ref": "#/$defs/ExternalDate"
            },
            {
               "type": "null"
            }
         ],
         "default": null
      }
   },
   "$defs": {
      "Drycal_csv": {
         "additionalProperties": false,
         "properties": {
            "filetype": {
               "const": "drycal.csv",
               "title": "Filetype"
            },
            "timezone": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Timezone"
            },
            "locale": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Locale"
            },
            "encoding": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Encoding"
            }
         },
         "required": [
            "filetype"
         ],
         "title": "Drycal_csv",
         "type": "object"
      },
      "Drycal_rtf": {
         "additionalProperties": false,
         "properties": {
            "filetype": {
               "const": "drycal.rtf",
               "title": "Filetype"
            },
            "timezone": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Timezone"
            },
            "locale": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Locale"
            },
            "encoding": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Encoding"
            }
         },
         "required": [
            "filetype"
         ],
         "title": "Drycal_rtf",
         "type": "object"
      },
      "Drycal_txt": {
         "additionalProperties": false,
         "properties": {
            "filetype": {
               "const": "drycal.txt",
               "title": "Filetype"
            },
            "timezone": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Timezone"
            },
            "locale": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Locale"
            },
            "encoding": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Encoding"
            }
         },
         "required": [
            "filetype"
         ],
         "title": "Drycal_txt",
         "type": "object"
      },
      "ExternalDate": {
         "additionalProperties": false,
         "description": "Supply timestamping information that are external to the processed file.",
         "properties": {
            "using": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/ExternalDateFile"
                  },
                  {
                     "$ref": "#/$defs/ExternalDateFilename"
                  },
                  {
                     "$ref": "#/$defs/ExternalDateISOString"
                  },
                  {
                     "$ref": "#/$defs/ExternalDateUTSOffset"
                  }
               ],
               "title": "Using"
            },
            "mode": {
               "default": "add",
               "enum": [
                  "add",
                  "replace"
               ],
               "title": "Mode",
               "type": "string"
            }
         },
         "required": [
            "using"
         ],
         "title": "ExternalDate",
         "type": "object"
      },
      "ExternalDateFile": {
         "additionalProperties": false,
         "description": "Read external date information from file.",
         "properties": {
            "file": {
               "$ref": "#/$defs/dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFile__Content"
            }
         },
         "required": [
            "file"
         ],
         "title": "ExternalDateFile",
         "type": "object"
      },
      "ExternalDateFilename": {
         "additionalProperties": false,
         "description": "Read external date information from the file name.",
         "properties": {
            "filename": {
               "$ref": "#/$defs/dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFilename__Content"
            }
         },
         "required": [
            "filename"
         ],
         "title": "ExternalDateFilename",
         "type": "object"
      },
      "ExternalDateISOString": {
         "additionalProperties": false,
         "description": "Read a constant external date using an ISO-formatted string.",
         "properties": {
            "isostring": {
               "title": "Isostring",
               "type": "string"
            }
         },
         "required": [
            "isostring"
         ],
         "title": "ExternalDateISOString",
         "type": "object"
      },
      "ExternalDateUTSOffset": {
         "additionalProperties": false,
         "description": "Read a constant external date using a Unix timestamp offset.",
         "properties": {
            "utsoffset": {
               "title": "Utsoffset",
               "type": "number"
            }
         },
         "required": [
            "utsoffset"
         ],
         "title": "ExternalDateUTSOffset",
         "type": "object"
      },
      "Input": {
         "additionalProperties": false,
         "description": "Specification of input files/folders to be processed by the :class:`Step`.",
         "properties": {
            "folders": {
               "items": {
                  "type": "string"
               },
               "title": "Folders",
               "type": "array"
            },
            "prefix": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Prefix"
            },
            "suffix": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Suffix"
            },
            "contains": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Contains"
            },
            "exclude": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Exclude"
            }
         },
         "required": [
            "folders"
         ],
         "title": "Input",
         "type": "object"
      },
      "Parameters": {
         "additionalProperties": false,
         "description": "Empty parameters specification with no extras allowed.",
         "properties": {},
         "title": "Parameters",
         "type": "object"
      },
      "dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFile__Content": {
         "additionalProperties": false,
         "properties": {
            "path": {
               "title": "Path",
               "type": "string"
            },
            "type": {
               "title": "Type",
               "type": "string"
            },
            "match": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Match"
            }
         },
         "required": [
            "path",
            "type"
         ],
         "title": "Content",
         "type": "object"
      },
      "dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFilename__Content": {
         "additionalProperties": false,
         "properties": {
            "format": {
               "title": "Format",
               "type": "string"
            },
            "len": {
               "title": "Len",
               "type": "integer"
            }
         },
         "required": [
            "format",
            "len"
         ],
         "title": "Content",
         "type": "object"
      }
   },
   "additionalProperties": false,
   "required": [
      "parser",
      "input",
      "extractor"
   ]
}

Config:
  • extra: str = forbid

field parser: Literal['flowdata'] [Required]
field extractor: Drycal_csv | Drycal_rtf | Drycal_txt [Required]
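
As an illustration of how the schema above translates into a step definition, the following sketch builds one flowdata step as a plain Python dictionary and runs a few basic checks on it. The key names and allowed values are taken from the JSON schema; the helper `check_flowdata_step`, the folder name, and the timezone value are hypothetical and are not part of yadg or dgbowl_schemas. Real validation is done by the pydantic model, which checks considerably more than this.

```python
# Key names and filetypes are copied from the JSON schema above.
# The helper below is a hypothetical stand-in, not the pydantic validator.
REQUIRED_KEYS = {"parser", "input", "extractor"}
OPTIONAL_KEYS = {"tag", "parameters", "externaldate"}
FILETYPES = {"drycal.csv", "drycal.rtf", "drycal.txt"}


def check_flowdata_step(step: dict) -> list:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    for key in sorted(REQUIRED_KEYS - set(step)):
        problems.append(f"missing required key: {key}")
    for key in sorted(set(step) - REQUIRED_KEYS - OPTIONAL_KEYS):
        problems.append(f"unexpected key: {key}")
    if step.get("parser") != "flowdata":
        problems.append("parser must be 'flowdata'")
    filetype = step.get("extractor", {}).get("filetype")
    if filetype not in FILETYPES:
        problems.append(f"unsupported filetype: {filetype!r}")
    if "folders" not in step.get("input", {}):
        problems.append("input.folders is required")
    return problems


# Example step; the folder, suffix and timezone values are made up.
step = {
    "tag": "flow",
    "parser": "flowdata",
    "input": {"folders": ["data/flow"], "suffix": "csv"},
    "extractor": {
        "filetype": "drycal.csv",
        "timezone": "Europe/Berlin",
        "encoding": "utf-8",
    },
    "externaldate": {
        "using": {"filename": {"format": "%Y%m%d", "len": 8}},
        "mode": "add",
    },
}

print(check_flowdata_step(step))
```

The extractor is selected by its filetype value, which is why that key is the only required entry in each of the three Drycal definitions.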

Formats

The filetypes currently supported by the parser are:

  • DryCal log file text output (drycal.txt), see drycal

  • DryCal log file tabulated output (drycal.csv), see drycal

  • DryCal log file converted to an RTF document (drycal.rtf), see drycal

Schema

The parser is used to extract all tabular data in the input file. This parser processes additional calibration information analogously to basiccsv.

Module Functions

yadg.parsers.flowdata.process(*, fn, filetype, encoding, timezone, **kwargs)

Flow meter data processor

This parser processes flow meter data.

Parameters:
  • fn (str) – File to process

  • encoding (str) – Encoding of fn, by default “utf-8”.

  • timezone (ZoneInfo) – A string description of the timezone. Default is “localtime”.

  • parameters – Parameters for FlowData.

Return type:

xarray.Dataset

Submodules

drycal: File parser for DryCal log files.

This module includes functions for parsing converted documents (rtf) and tabulated exports (txt, csv).

The DryCal files contain only the time of day of each datapoint, not the date. The date therefore has to be either supplied using the date argument in parameters, or parsed from the prefix of the filename.
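
The following is a minimal sketch of how a date taken from a filename prefix can be combined with a time of day read from the file to give a Unix timestamp. It uses only the Python standard library and is not the yadg implementation; the function names, the example filename, and the time format are assumptions made for illustration. The `fmt` and `length` arguments mirror the format and len entries of ExternalDateFilename in the schema.

```python
from datetime import datetime, timezone


def date_from_filename(filename: str, fmt: str, length: int) -> datetime:
    """Parse the first `length` characters of `filename` with strptime format `fmt`."""
    return datetime.strptime(filename[:length], fmt)


def combine(day: datetime, time_of_day: str, time_fmt: str = "%H:%M:%S",
            tz=timezone.utc) -> float:
    """Attach a time of day to the date of `day` and return a Unix timestamp."""
    t = datetime.strptime(time_of_day, time_fmt).time()
    return datetime.combine(day.date(), t, tzinfo=tz).timestamp()


# Hypothetical file name whose first 8 characters encode the date.
day = date_from_filename("20211011-flow.csv", "%Y%m%d", 8)
uts = combine(day, "14:05:30")
print(uts)
```

UTC is used here only to keep the sketch independent of the local timezone database; in practice the timezone would come from the extractor settings.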

Code author: Peter Kraus

pydantic model yadg.parsers.flowdata.drycal.TimeDate

Bases: BaseModel

JSON schema:
{
   "title": "TimeDate",
   "type": "object",
   "properties": {
      "date": {
         "anyOf": [
            {
               "$ref": "#/$defs/TimestampSpec"
            },
            {
               "type": "null"
            }
         ],
         "default": null
      },
      "time": {
         "anyOf": [
            {
               "$ref": "#/$defs/TimestampSpec"
            },
            {
               "type": "null"
            }
         ],
         "default": null
      }
   },
   "$defs": {
      "TimestampSpec": {
         "additionalProperties": false,
         "properties": {
            "index": {
               "anyOf": [
                  {
                     "type": "integer"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Index"
            },
            "format": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Format"
            }
         },
         "title": "TimestampSpec",
         "type": "object"
      }
   }
}

pydantic model TimestampSpec

Bases: BaseModel

JSON schema:
{
   "title": "TimestampSpec",
   "type": "object",
   "properties": {
      "index": {
         "anyOf": [
            {
               "type": "integer"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Index"
      },
      "format": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Format"
      }
   },
   "additionalProperties": false
}

Config:
  • extra: str = forbid

field index: int | None = None
field format: str | None = None
field date: TimestampSpec | None = None
field time: TimestampSpec | None = None
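
One plausible reading of the two TimestampSpec fields is that index selects a column in a row of the table and format is a strptime-style format string for the value found there. The sketch below illustrates that reading with a standard-library dataclass standing in for the pydantic model; the class name `TimestampSpecSketch`, the helper `parse_column`, and the example row are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TimestampSpecSketch:
    """Stand-in for TimestampSpec with the same two optional fields."""
    index: Optional[int] = None
    format: Optional[str] = None


def parse_column(row: list, spec: TimestampSpecSketch) -> datetime:
    """Parse the column selected by spec.index using spec.format."""
    if spec.index is None or spec.format is None:
        raise ValueError("this sketch needs both index and format to be set")
    return datetime.strptime(row[spec.index], spec.format)


# Made-up data row: sample number, flow, temperature, time of day.
row = ["1", "10.52", "23.1", "14:05:30"]
time_spec = TimestampSpecSketch(index=3, format="%H:%M:%S")
parsed = parse_column(row, time_spec)
print(parsed.time())
```

Under this reading, a TimeDate instance would carry one such specification for the date column and one for the time column.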

yadg.parsers.flowdata.drycal.rtf(fn, encoding, timezone)

RTF version of the drycal parser.

This is intended to parse legacy drycal DOC files, which have been converted to RTF using other means.

Parameters:
  • fn (str) – Filename to parse.

  • encoding (str) – Encoding to use for parsing fn.

  • timezone – Timezone of the timestamps in fn.

Returns:

(timesteps, metadata, None) – A standard data - metadata - common data output tuple.

Return type:

tuple[list, dict, None]

yadg.parsers.flowdata.drycal.sep(fn, sep, encoding, timezone)

Generic drycal parser, using sep as separator string.

This is intended to parse other export formats from DryCal, such as txt and csv files.

Parameters:
  • fn (str) – Filename to parse.

  • date – A unix timestamp float corresponding to the day (or other offset) to be added to each line in the measurement table.

  • sep (str) – The separator character used to split lines in fn.

  • encoding (str) – Encoding to use for parsing fn.

  • timezone – Timezone of the timestamps in fn.

Returns:

(timesteps, metadata, None) – A standard data - metadata - common data output tuple.

Return type:

tuple[list, dict, None]

yadg.parsers.flowdata.drycal.drycal_table(lines, sep=',')

DryCal table-processing function.

Given a table with headers and units in the first line, and data in the following lines, this function returns the headers, units, and data extracted from the table. The returned values are always of type str; any post-processing is done in the calling routine.

Parameters:
  • lines (list) – A list containing the lines to be parsed

  • sep (str) – The separator string used to split each line into individual items

Returns:

(headers, units, data) – A tuple of a list of the stripped headers, a dictionary of header-unit key-value pairs, and a list of lists containing the rows of the table.

Return type:

tuple[list, dict, list]
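
The sketch below shows what a table-processing function with this interface might look like. It is not the yadg implementation: the function name `split_table` is hypothetical, and it assumes that each header cell has the form "name unit", with the unit being the text after the last space. The real DryCal header layout may differ. All returned values stay as strings, as described above.

```python
def split_table(lines, sep=","):
    """Split a header line and data lines into (headers, units, data).

    Assumes header cells look like "name unit"; cells without a space
    are treated as headers with no unit.
    """
    headers = []
    units = {}
    for cell in (c.strip() for c in lines[0].split(sep)):
        name, _, unit = cell.rpartition(" ")
        if name:
            # A space was found: text before it is the header, after it the unit.
            headers.append(name)
            units[name] = unit
        else:
            # No space in the cell: rpartition returns ("", "", cell).
            headers.append(cell)
    # Keep every item as a stripped string; skip empty lines.
    data = [
        [item.strip() for item in line.split(sep)]
        for line in lines[1:]
        if line.strip()
    ]
    return headers, units, data


# Made-up table for illustration.
lines = [
    "Sample,Flow smL/min,Temp. degC,Time",
    "1,10.52,23.1,14:05:30",
    "2,10.48,23.2,14:05:40",
]
headers, units, data = split_table(lines)
print(headers, units, data)
```

Leaving type conversion to the caller keeps the function independent of locale-specific number formats such as decimal commas.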

yadg.parsers.flowdata.main.process(*, fn, filetype, encoding, timezone, **kwargs)

Flow meter data processor

This parser processes flow meter data.

Parameters:
  • fn (str) – File to process

  • encoding (str) – Encoding of fn, by default “utf-8”.

  • timezone (ZoneInfo) – A string description of the timezone. Default is “localtime”.

  • parameters – Parameters for FlowData.

Return type:

xarray.Dataset