electrochem: Electrochemistry data parser

This module handles the reading and processing of files containing electrochemical data, including BioLogic’s EC-Lab file formats. The basic function of the parser is to:

  1. Read in the technique data and create timesteps.

  2. Collect metadata, such as the measurement settings and the loops contained in a given file.

  3. Collect data describing the technique parameter sequences.

Usage

Available since yadg-4.0. The parser supports the following parameters:

pydantic model dgbowl_schemas.yadg.dataschema_5_0.step.ElectroChem

Parser for electrochemistry files.

JSON schema:
{
   "title": "ElectroChem",
   "description": "Parser for electrochemistry files.",
   "type": "object",
   "properties": {
      "tag": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Tag"
      },
      "parser": {
         "const": "electrochem",
         "title": "Parser"
      },
      "input": {
         "$ref": "#/$defs/Input"
      },
      "extractor": {
         "discriminator": {
            "mapping": {
               "eclab.mpr": "#/$defs/EClab_mpr",
               "eclab.mpt": "#/$defs/EClab_mpt",
               "marda:biologic-mpr": "#/$defs/EClab_mpr",
               "marda:biologic-mpt": "#/$defs/EClab_mpt",
               "tomato.json": "#/$defs/Tomato_json"
            },
            "propertyName": "filetype"
         },
         "oneOf": [
            {
               "$ref": "#/$defs/EClab_mpr"
            },
            {
               "$ref": "#/$defs/EClab_mpt"
            },
            {
               "$ref": "#/$defs/Tomato_json"
            }
         ],
         "title": "Extractor"
      },
      "parameters": {
         "anyOf": [
            {
               "$ref": "#/$defs/Parameters"
            },
            {
               "type": "null"
            }
         ],
         "default": null
      },
      "externaldate": {
         "anyOf": [
            {
               "$ref": "#/$defs/ExternalDate"
            },
            {
               "type": "null"
            }
         ],
         "default": null
      }
   },
   "$defs": {
      "EClab_mpr": {
         "additionalProperties": false,
         "properties": {
            "filetype": {
               "enum": [
                  "eclab.mpr",
                  "marda:biologic-mpr"
               ],
               "title": "Filetype",
               "type": "string"
            },
            "timezone": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Timezone"
            },
            "locale": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Locale"
            },
            "encoding": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Encoding"
            }
         },
         "required": [
            "filetype"
         ],
         "title": "EClab_mpr",
         "type": "object"
      },
      "EClab_mpt": {
         "additionalProperties": false,
         "properties": {
            "filetype": {
               "enum": [
                  "eclab.mpt",
                  "marda:biologic-mpt"
               ],
               "title": "Filetype",
               "type": "string"
            },
            "timezone": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Timezone"
            },
            "locale": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Locale"
            },
            "encoding": {
               "default": "windows-1252",
               "title": "Encoding",
               "type": "string"
            }
         },
         "required": [
            "filetype"
         ],
         "title": "EClab_mpt",
         "type": "object"
      },
      "ExternalDate": {
         "additionalProperties": false,
         "description": "Supply timestamping information that are external to the processed file.",
         "properties": {
            "using": {
               "anyOf": [
                  {
                     "$ref": "#/$defs/ExternalDateFile"
                  },
                  {
                     "$ref": "#/$defs/ExternalDateFilename"
                  },
                  {
                     "$ref": "#/$defs/ExternalDateISOString"
                  },
                  {
                     "$ref": "#/$defs/ExternalDateUTSOffset"
                  }
               ],
               "title": "Using"
            },
            "mode": {
               "default": "add",
               "enum": [
                  "add",
                  "replace"
               ],
               "title": "Mode",
               "type": "string"
            }
         },
         "required": [
            "using"
         ],
         "title": "ExternalDate",
         "type": "object"
      },
      "ExternalDateFile": {
         "additionalProperties": false,
         "description": "Read external date information from file.",
         "properties": {
            "file": {
               "$ref": "#/$defs/dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFile__Content"
            }
         },
         "required": [
            "file"
         ],
         "title": "ExternalDateFile",
         "type": "object"
      },
      "ExternalDateFilename": {
         "additionalProperties": false,
         "description": "Read external date information from the file name.",
         "properties": {
            "filename": {
               "$ref": "#/$defs/dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFilename__Content"
            }
         },
         "required": [
            "filename"
         ],
         "title": "ExternalDateFilename",
         "type": "object"
      },
      "ExternalDateISOString": {
         "additionalProperties": false,
         "description": "Read a constant external date using an ISO-formatted string.",
         "properties": {
            "isostring": {
               "title": "Isostring",
               "type": "string"
            }
         },
         "required": [
            "isostring"
         ],
         "title": "ExternalDateISOString",
         "type": "object"
      },
      "ExternalDateUTSOffset": {
         "additionalProperties": false,
         "description": "Read a constant external date using a Unix timestamp offset.",
         "properties": {
            "utsoffset": {
               "title": "Utsoffset",
               "type": "number"
            }
         },
         "required": [
            "utsoffset"
         ],
         "title": "ExternalDateUTSOffset",
         "type": "object"
      },
      "Input": {
         "additionalProperties": false,
         "description": "Specification of input files/folders to be processed by the :class:`Step`.",
         "properties": {
            "folders": {
               "items": {
                  "type": "string"
               },
               "title": "Folders",
               "type": "array"
            },
            "prefix": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Prefix"
            },
            "suffix": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Suffix"
            },
            "contains": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Contains"
            },
            "exclude": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Exclude"
            }
         },
         "required": [
            "folders"
         ],
         "title": "Input",
         "type": "object"
      },
      "Parameters": {
         "additionalProperties": false,
         "description": "Empty parameters specification with no extras allowed.",
         "properties": {},
         "title": "Parameters",
         "type": "object"
      },
      "Tomato_json": {
         "additionalProperties": false,
         "properties": {
            "filetype": {
               "const": "tomato.json",
               "title": "Filetype"
            },
            "timezone": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Timezone"
            },
            "locale": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Locale"
            },
            "encoding": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Encoding"
            }
         },
         "required": [
            "filetype"
         ],
         "title": "Tomato_json",
         "type": "object"
      },
      "dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFile__Content": {
         "additionalProperties": false,
         "properties": {
            "path": {
               "title": "Path",
               "type": "string"
            },
            "type": {
               "title": "Type",
               "type": "string"
            },
            "match": {
               "anyOf": [
                  {
                     "type": "string"
                  },
                  {
                     "type": "null"
                  }
               ],
               "default": null,
               "title": "Match"
            }
         },
         "required": [
            "path",
            "type"
         ],
         "title": "Content",
         "type": "object"
      },
      "dgbowl_schemas__yadg__dataschema_5_0__externaldate__ExternalDateFilename__Content": {
         "additionalProperties": false,
         "properties": {
            "format": {
               "title": "Format",
               "type": "string"
            },
            "len": {
               "title": "Len",
               "type": "integer"
            }
         },
         "required": [
            "format",
            "len"
         ],
         "title": "Content",
         "type": "object"
      }
   },
   "additionalProperties": false,
   "required": [
      "parser",
      "input",
      "extractor"
   ]
}

Config:
  • extra: str = forbid

field parser: Literal['electrochem'] [Required]
field extractor: EClab_mpr | EClab_mpt | Tomato_json [Required]
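
As an illustration, a step for this parser could be written as the mapping below. This is a minimal sketch derived from the JSON schema above: the folder name, suffix and timezone are placeholder values, and check_step is a hypothetical helper that only looks at the required keys and the filetype. It is not part of yadg or dgbowl_schemas and does not replace validation by the pydantic model.

```python
# Filetypes accepted by the extractor discriminator in the schema above.
ALLOWED_FILETYPES = {
    "eclab.mpr",
    "eclab.mpt",
    "marda:biologic-mpr",
    "marda:biologic-mpt",
    "tomato.json",
}

# A step definition; "data", ".mpr" and the timezone are placeholder values.
step = {
    "parser": "electrochem",
    "input": {"folders": ["data"], "suffix": ".mpr"},
    "extractor": {"filetype": "eclab.mpr", "timezone": "Europe/Zurich"},
}


def check_step(step: dict) -> bool:
    """Hypothetical helper: check required keys only, not the full schema."""
    required = {"parser", "input", "extractor"}
    if not required.issubset(step):
        return False
    if step["parser"] != "electrochem":
        return False
    if "folders" not in step["input"]:
        return False
    return step["extractor"].get("filetype") in ALLOWED_FILETYPES


print(check_step(step))
```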

Formats

The filetypes currently supported by the parser are:

  • EC-Lab raw data binary file and parameter settings (eclab.mpr), see eclabmpr

  • EC-Lab human-readable text export of data (eclab.mpt), see eclabmpt

  • tomato’s structured json output (tomato.json), see tomatojson

Schema

Depending on the filetype, the output xarray.Dataset may contain multiple derived values. However, all filetypes will report at least the following:

xr.Dataset:
  coords:
    uts:      !!float
  data_vars:
    Ewe:      (uts)    # Potential of the working electrode (V)
    Ece:      (uts)    # Potential of the counter electrode (V)
    I:        (uts)    # Applied current (A)

In some cases, average values (i.e. <Ewe> or <I>) may be reported instead of the instantaneous data.

Warning

In previous versions of yadg, the electrochem parser optionally transposed data from impedance spectroscopy, grouping the datapoints in each scan into a single “trace”. This behaviour has been removed in yadg-5.0.

Module Functions

yadg.parsers.electrochem.process(*, filetype, **kwargs)

Unified parser for electrochemistry data. Forwards kwargs to the worker functions based on the supplied filetype.

Parameters:

filetype (str) – Discriminator used to select the appropriate worker function.

Return type:

xarray.Dataset
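
The forwarding described above can be pictured as a lookup table from filetype to worker function. The following is only a sketch of that pattern, not the actual yadg code: the worker functions are hypothetical stand-ins that return strings instead of datasets.

```python
from typing import Any, Callable


def _process_mpr(**kwargs: Any) -> str:
    # Stand-in for a worker function; a real one would return a dataset.
    return f"mpr:{kwargs['fn']}"


def _process_mpt(**kwargs: Any) -> str:
    # Stand-in for a second worker function.
    return f"mpt:{kwargs['fn']}"


# Map each filetype discriminator to its (hypothetical) worker.
WORKERS: dict[str, Callable[..., str]] = {
    "eclab.mpr": _process_mpr,
    "marda:biologic-mpr": _process_mpr,
    "eclab.mpt": _process_mpt,
    "marda:biologic-mpt": _process_mpt,
}


def process(*, filetype: str, **kwargs: Any) -> str:
    """Forward all keyword arguments to the worker chosen by filetype."""
    try:
        worker = WORKERS[filetype]
    except KeyError:
        raise ValueError(f"unsupported filetype: {filetype!r}") from None
    return worker(**kwargs)


print(process(filetype="eclab.mpr", fn="cell.mpr"))
```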

Subpackages

Submodules

eclabmpr: Processing of BioLogic’s EC-Lab binary modular files.

.mpr files are structured in a set of “modules”, one concerning settings, one for actual data, one for logs, and an optional loops module. The parameter sequences can be found in the settings module.

This code is partly an adaptation of the galvani module by Chris Kerr, and builds on the work done by the previous civilian service member working on the project, Jonas Krieger.

These are the implemented techniques for which the technique parameter sequences can be parsed:

CA

Chronoamperometry / Chronocoulometry

CP

Chronopotentiometry

CV

Cyclic Voltammetry

GCPL

Galvanostatic Cycling with Potential Limitation

GEIS

Galvano Electrochemical Impedance Spectroscopy

LOOP

Loop

LSV

Linear Sweep Voltammetry

MB

Modulo Bat

OCV

Open Circuit Voltage

PEIS

Potentio Electrochemical Impedance Spectroscopy

WAIT

Wait

ZIR

IR compensation (PEIS)

Note

.mpt files can contain more data than the corresponding binary .mpr file.

File Structure of .mpr Files

At the top level, .mpr files are made up of a number of modules, each introduced by the MODULE keyword. In all the files I have seen, the first module is the settings module, followed by the data module, the log module and then an optional loop module.

0x0000 BIO-LOGIC MODULAR FILE  # File magic.
0x0034 MODULE                  # Module magic.
...                            # Module 1.
0x???? MODULE                  # Module magic.
...                            # Module 2.
0x???? MODULE                  # Module magic.
...                            # Module 3.
0x???? MODULE                  # Module magic.
...                            # Module 4.
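
The layout above suggests a simple way to separate the modules. The sketch below works on a synthetic byte string and makes one simplifying assumption: the MODULE keyword never occurs inside module data. The function name split_modules is hypothetical.

```python
FILE_MAGIC = b"BIO-LOGIC MODULAR FILE"
MODULE_MAGIC = b"MODULE"
FIRST_MODULE_OFFSET = 0x34  # Offset of the first module magic, as listed above.


def split_modules(raw: bytes) -> list[bytes]:
    """Return the bytes of each module, without the module magic."""
    if not raw.startswith(FILE_MAGIC):
        raise ValueError("not a BioLogic modular file")
    contents = raw[FIRST_MODULE_OFFSET:]
    # The first element of the split is empty, since contents starts with the magic.
    return contents.split(MODULE_MAGIC)[1:]


# Synthetic example: file magic padded to 0x34 bytes, then two fake modules.
raw = FILE_MAGIC.ljust(FIRST_MODULE_OFFSET, b"\x00") + b"MODULEaaaa" + b"MODULEbbbb"
print(split_modules(raw))
```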

After splitting the entire file on MODULE, each module starts with a header that is structured like this (offsets from start of module):

0x0000 short_name  # Short name, e.g. VMP Set.
0x000A long_name   # Longer name, e.g. VMP settings.
0x0023 length      # Number of bytes in module data.
0x0027 version     # Module version.
0x002B date        # Acquisition date in ASCII, e.g. 08/10/21.
...                # Module data.
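
The header offsets above map directly onto a fixed-size struct. The sketch below assumes little-endian unsigned 32-bit integers for length and version and space-padded ASCII names; neither is stated in the table, so treat both as assumptions. The function name parse_module_header is hypothetical.

```python
import struct

# Field widths follow the offsets in the table above: 10 + 25 + 4 + 4 + 8 bytes.
# Byte order and integer types are assumptions.
HEADER = struct.Struct("<10s25sII8s")


def parse_module_header(module: bytes) -> dict:
    """Split a module (without the MODULE magic) into header fields and data."""
    short, long_, length, version, date = HEADER.unpack_from(module, 0)
    data = module[HEADER.size : HEADER.size + length]
    return {
        "short_name": short.decode("ascii").strip(),
        "long_name": long_.decode("ascii").strip(),
        "length": length,
        "version": version,
        "date": date.decode("ascii"),
        "data": data,
    }


# Synthetic module with four bytes of module data.
module = HEADER.pack(
    b"VMP Set".ljust(10), b"VMP settings".ljust(25), 4, 0, b"08/10/21"
) + b"\x01\x02\x03\x04"
header = parse_module_header(module)
print(header["short_name"], header["length"], header["date"])
```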

The contents of each module’s data vary wildly depending on the technique used, the module, and perhaps the software version, the settings in EC-Lab, etc. Here is a quick overview (offsets from start of module data).

Settings Module

0x0000 technique_id           # Unique technique ID.
...                           # ???
0x0007 comments               # Pascal string.
...                           # Zero padding.
# Cell Characteristics.
0x0107 active_material_mass   # Mass of active material
0x010B at_x                   # at x =
0x010F molecular_weight       # Molecular weight of active material
0x0113 atomic_weight          # Atomic weight of intercalated ion
0x0117 acquisition_start      # Acquisition started at: xo =
0x011B e_transferred          # Number of e- transferred
0x011E electrode_material     # Pascal string.
...                           # Zero Padding
0x01C0 electrolyte            # Pascal string.
...                           # Zero Padding, ???.
0x0211 electrode_area         # Electrode surface area
0x0215 reference_electrode    # Pascal string
...                           # Zero padding
0x024C characteristic_mass    # Characteristic mass
...                           # ???
0x025C battery_capacity       # Battery capacity C =
0x0260 battery_capacity_unit  # Unit of the battery capacity.
...                           # ???
# Technique parameters can randomly be found at 0x0572, 0x1845 or
# 0x1846. All you can do is guess and try until it fits.
0x1845 ns                     # Number of sequences.
0x1847 n_params               # Number of technique parameters.
0x1849 params                 # ns sets of n_params parameters.
...                           # ???
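
The "guess and try" approach for locating the technique parameters can be sketched as a loop over the candidate offsets. The plausibility test below (small, non-zero counts) and the unsigned 16-bit little-endian types are assumptions made for this illustration, and find_params_offset is a hypothetical name; the checks actually performed by the parser may differ.

```python
import struct

# Candidate offsets listed in the settings module layout above.
CANDIDATE_OFFSETS = (0x0572, 0x1845, 0x1846)


def find_params_offset(data: bytes, max_ns: int = 100, max_params: int = 100) -> int:
    """Return the first candidate offset with plausible ns and n_params values."""
    for offset in CANDIDATE_OFFSETS:
        if offset + 4 > len(data):
            continue  # Not enough bytes left to read both counters.
        ns, n_params = struct.unpack_from("<HH", data, offset)
        if 0 < ns <= max_ns and 0 < n_params <= max_params:
            return offset
    raise ValueError("no plausible technique parameter offset found")


# Synthetic settings data: zeros everywhere except ns=3, n_params=12 at 0x1845.
buf = bytearray(0x1900)
struct.pack_into("<HH", buf, 0x1845, 3, 12)
print(hex(find_params_offset(bytes(buf))))
```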

Data Module

0x0000 n_datapoints  # Number of datapoints.
0x0004 n_columns     # Number of values per datapoint.
0x0005 column_ids    # n_columns unique column IDs.
...
# Depending on module version, datapoints start 0x0195 or 0x0196.
# Length of each datapoint depends on number and IDs of columns.
0x0195 datapoints    # n_datapoints points of data.
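
Reading the start of the data module could look like the sketch below. The field widths follow from the offsets above (four bytes for n_datapoints, one byte for n_columns); that each column ID is an unsigned 16-bit little-endian integer is an assumption, and the column IDs in the synthetic example are arbitrary placeholders.

```python
import struct


def parse_data_header(data: bytes) -> tuple[int, list[int]]:
    """Read the number of datapoints and the column IDs of a data module."""
    n_datapoints, n_columns = struct.unpack_from("<IB", data, 0)
    # Column IDs start at offset 0x0005; two bytes per ID is an assumption.
    column_ids = list(struct.unpack_from(f"<{n_columns}H", data, 5))
    return n_datapoints, column_ids


# Synthetic header: 2 datapoints, 3 columns with placeholder IDs 4, 6 and 8.
data = struct.pack("<IB3H", 2, 3, 4, 6, 8)
print(parse_data_header(data))
```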

Log Module

...                         # ???
0x0009 channel_number       # Zero-based channel number.
...                         # ???
0x00AB channel_sn           # Channel serial number.
...                         # ???
0x01F8 Ewe_ctrl_min         # Ewe ctrl range min.
0x01FC Ewe_ctrl_max         # Ewe ctrl range max.
...                         # ???
0x0249 ole_timestamp        # Timestamp in OLE format.
0x0251 filename             # Pascal String.
...                         # Zero padding, ???.
0x0351 host                 # IP address of host, Pascal string.
...                         # Zero padding.
0x0384 address              # IP address / COM port of potentiostat.
...                         # Zero padding.
0x03B7 ec_lab_version       # EC-Lab version (software)
...                         # Zero padding.
0x03BE server_version       # Internet server version (firmware)
...                         # Zero padding.
0x03C5 interpreter_version  # Command interpreter version (firmware)
...                         # Zero padding.
0x03CF device_sn            # Device serial number.
...                         # Zero padding.
0x0922 averaging_points     # Smooth data on ... points.
...                         # ???
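
Several fields in the settings and log modules are Pascal strings, i.e. a length prefix followed by that many characters. A minimal reader is sketched below; the one-byte length prefix and the windows-1252 encoding are assumptions, and read_pascal_string is a hypothetical helper name.

```python
def read_pascal_string(data: bytes, offset: int, encoding: str = "windows-1252") -> str:
    """Read a length-prefixed string whose length byte sits at offset."""
    length = data[offset]
    return data[offset + 1 : offset + 1 + length].decode(encoding)


# Synthetic log module data with a filename stored at 0x0251.
name = b"cell.mpr"
buf = bytearray(0x300)
buf[0x251] = len(name)
buf[0x252 : 0x252 + len(name)] = name
print(read_pascal_string(bytes(buf), 0x251))
```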

Loop Module

0x0000 n_indexes  # Number of loop indexes.
0x0004 indexes    # n_indexes indexes at which loops start in data.
...               # ???
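
A sketch of reading the loop indexes follows. The four-byte width of n_indexes is implied by the offsets above; reading each index as an unsigned 32-bit little-endian integer is an assumption, and parse_loop_indexes is a hypothetical name.

```python
import struct


def parse_loop_indexes(data: bytes) -> list[int]:
    """Return the datapoint indexes at which loops start."""
    (n_indexes,) = struct.unpack_from("<I", data, 0)
    return list(struct.unpack_from(f"<{n_indexes}I", data, 4))


# Synthetic loop module data with three loops.
data = struct.pack("<I3I", 3, 0, 120, 240)
print(parse_loop_indexes(data))
```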

Metadata

The metadata will contain the information from the Settings module. This should include information about the technique, as well as any explicitly parsed cell characteristics data specified in EC-Lab.

TODO

https://github.com/dgbowl/yadg/issues/12

The mapping of metadata parameters between .mpr and .mpt files is not yet complete. In .mpr files, some technique parameters in the settings module correspond to entries in drop-down lists in EC-Lab. These values are stored as single-byte values in .mpr files.

The metadata also contains the information from the Log module, which holds more general parameters, like software, firmware and server versions, channel number, host address and an acquisition start timestamp in Microsoft OLE format.
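
An OLE timestamp counts days, including fractions of a day, since 1899-12-30. Converting it to a Unix timestamp can be sketched as below. Interpreting the value as UTC is an assumption made to keep the example short; in practice the timezone supplied to the extractor would have to be taken into account. The name ole_to_uts is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Day zero of the OLE automation date format, taken as UTC here (assumption).
OLE_EPOCH = datetime(1899, 12, 30, tzinfo=timezone.utc)


def ole_to_uts(ole: float) -> float:
    """Convert an OLE timestamp (days since 1899-12-30) to a Unix timestamp."""
    return (OLE_EPOCH + timedelta(days=ole)).timestamp()


print(ole_to_uts(25569.0))  # 1970-01-01 00:00, i.e. 0.0
print(ole_to_uts(25570.5))  # One and a half days later, i.e. 129600.0
```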

Note

If the .mpr file contains an ExtDev module (containing parameters of any external sensors plugged into the device), the log is usually not present and therefore the full timestamp cannot be calculated.

Code author: Nicolas Vetsch

yadg.parsers.electrochem.eclabmpr.process_settings(data)

Processes the contents of settings modules.

Parameters:

data (bytes) – The data to parse through.

Returns:

The parsed settings.

Return type:

dict

yadg.parsers.electrochem.eclabmpr.parse_columns(column_ids)

Puts together column info from a list of data column IDs.

Note

The binary layout of the data in the .mpr file is described by a sequence of column IDs. Some column IDs refer to flags, which are all packed into a single byte.

Parameters:

column_ids (list[int]) – A list of column IDs.

Returns:

The column names, dtypes, units and a dictionary of flag names and bitmasks.

Return type:

tuple[list, list, list, dict]
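
To illustrate how a dictionary of flag names and bitmasks can be applied to the shared flag byte, consider the sketch below. The flag names and masks are hypothetical and chosen only for the example; they are not taken from the parser.

```python
# Hypothetical flag names and bitmasks, for illustration only.
FLAGS = {"mode": 0b0000_0011, "ox_red": 0b0000_0100, "error": 0b0000_1000}


def unpack_flags(byte: int, flags: dict[str, int]) -> dict[str, int]:
    """Extract each flag value from a single byte using its bitmask."""
    out = {}
    for name, mask in flags.items():
        shift = (mask & -mask).bit_length() - 1  # Position of the lowest set bit.
        out[name] = (byte & mask) >> shift
    return out


print(unpack_flags(0b0000_0110, FLAGS))
```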

yadg.parsers.electrochem.eclabmpr.process_data(data, version, Eranges, Iranges, controls)

Processes the contents of data modules.

Parameters:
  • data (bytes) – The data to parse through.

  • version (int) – Module version from the data module header.

Returns:

Processed data ([{column -> value}, …, {column -> value}]). If the column unit is set to None, the value is an int. Otherwise, the value is a dict with value (“n”), sigma (“s”), and unit (“u”).

Return type:

list[dict]

yadg.parsers.electrochem.eclabmpr.process_log(data)

Processes the contents of log modules.

Parameters:

data (bytes) – The data to parse through.

Returns:

The parsed log.

Return type:

dict

yadg.parsers.electrochem.eclabmpr.process_loop(data)

Processes the contents of loop modules.

Parameters:

data (bytes) – The data to parse through.

Returns:

The parsed loops.

Return type:

dict

yadg.parsers.electrochem.eclabmpr.process_ext(data)

Processes the contents of external device modules.

Parameters:

data (bytes) – The data to parse through.

Returns:

The parsed external device module.

Return type:

dict

yadg.parsers.electrochem.eclabmpr.process_modules(contents)

Handles the processing of all modules.

Parameters:

contents (bytes) – The contents of an .mpr file, minus the file magic.

Returns:

The processed settings, data, log, and loop modules. Any module that is not present in the provided contents is returned as None.

Return type:

tuple[dict, list, dict, dict]

yadg.parsers.electrochem.eclabmpr.process(*, fn, timezone, **kwargs)

Processes EC-Lab raw data binary files.

Parameters:
  • fn (str) – The file containing the data to parse.

  • encoding – Encoding of fn, by default “windows-1252”.

  • timezone (ZoneInfo) – A string description of the timezone. Default is “localtime”.

Returns:

The full date is specified only if the “LOG” module is present.

Return type:

xarray.Dataset

eclabmpt: Processing of BioLogic’s EC-Lab ASCII export files.

.mpt files are made up of a header portion (with the technique parameter sequences and an optional loops section) and a tab-separated data table.

A list of techniques supported by this parser is shown in the techniques table.

File Structure of .mpt Files

These human-readable files are sectioned into headerlines and datalines. The header part of the .mpt files is made up of information that can be found in the settings, log and loop modules of the binary .mpr file.

If no header is present, the timestamps will instead be calculated from the file’s mtime().
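
The split into header lines and data lines can be sketched as follows. The example assumes that the second line of the file carries the "Nb header lines" count, that this count includes the first two lines as well as the column names, and that numbers use "." as the decimal separator. The file contents are synthetic and split_mpt is a hypothetical helper.

```python
def split_mpt(lines: list[str]) -> tuple[list[str], list[str]]:
    """Split the lines of an .mpt file into header lines and data lines."""
    if len(lines) > 1 and lines[1].startswith("Nb header lines"):
        n_header = int(lines[1].split(":")[1])
        return lines[:n_header], lines[n_header:]
    # No header present: every line belongs to the data table.
    return [], lines


# Synthetic file contents, already split into lines.
lines = [
    "EC-Lab ASCII FILE",
    "Nb header lines : 4",
    "",
    "time/s\tEwe/V",
    "0.0\t3.21",
    "1.0\t3.22",
]
header, data = split_mpt(lines)
# The data table is tab-separated.
rows = [[float(v) for v in line.split("\t")] for line in data]
print(len(header), rows)
```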

Metadata

The metadata will contain the information from the header of the file.

Note

The mapping of metadata parameters between .mpr and .mpt files is not yet complete.

Code author: Nicolas Vetsch

yadg.parsers.electrochem.eclabmpt.process_header(lines, timezone)

Processes the header lines.

Parameters:

lines (list[str]) – The header lines, starting at line 3 (which is an empty line), right after the “Nb header lines : ” line.

Returns:

A dictionary containing the settings (and the technique parameters) and a dictionary containing the loop indexes.

Return type:

tuple[dict, dict]

yadg.parsers.electrochem.eclabmpt.process_data(lines, Eranges, Iranges, controls)

Processes the data lines.

Parameters:

lines (list[str]) – The data lines, starting right after the last header section. The first line is an empty line, the column names can be found on the second line.

Returns:

A dictionary containing the datapoints in the format ([{column -> value}, …, {column -> value}]). If the column unit is set to None, the value is an int. Otherwise, the value is a dict with value (“n”), sigma (“s”), and unit (“u”).

Return type:

dict

yadg.parsers.electrochem.eclabmpt.process(*, fn, encoding, locale, timezone, **kwargs)

Processes EC-Lab human-readable text export files.

Parameters:
  • fn (str) – The file containing the data to parse.

  • encoding (str) – Encoding of fn, by default “windows-1252”.

  • timezone (ZoneInfo) – A string description of the timezone. Default is “UTC”.

Returns:

The full date may not be specified if header is not present.

Return type:

xarray.Dataset

tomatojson: Processing of tomato electrochemistry outputs.

This module parses the electrochemistry json files generated by tomato.

Warning

This parser is brand-new in yadg-4.1 and the interface is unstable.

Four sections are expected in each tomato data file:

  • technique section, describing the current technique,

  • previous section, containing status information of the previous file,

  • current section, containing status information of the current file,

  • data section, containing the timesteps.

Both previous and current are required because the device status is recorded at the time of data polling, which means the values in current might be invalid (after the run has finished) or not in sync with the data (if a technique change happened). However, previous may not be present in the first data file of an experiment.

To determine the measurement errors, the values from the BioLogic manual are used: for measured voltages (\(E_{\text{we}}\) and \(E_{\text{ce}}\)) this corresponds to a constant uncertainty of 0.004% of the applied E-range with a maximum of 75 uV, while for currents (\(I\)) this is a constant uncertainty of 0.0015% of the applied I-range with a maximum of 0.76 uA.
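
As a worked example of the uncertainty rule above, the sketch below takes the wording literally and treats the quoted 75 uV and 0.76 uA figures as upper bounds on the percentage-based value. This is one interpretation of the sentence and may not match how the parser combines the two numbers; the function names are hypothetical.

```python
def voltage_sigma(e_range: float) -> float:
    """Uncertainty in V: 0.004 % of the E-range, capped at 75 uV (literal reading)."""
    return min(e_range * 0.004 / 100, 75e-6)


def current_sigma(i_range: float) -> float:
    """Uncertainty in A: 0.0015 % of the I-range, capped at 0.76 uA (literal reading)."""
    return min(i_range * 0.0015 / 100, 0.76e-6)


# A 1 V range stays below the cap, a 10 V range hits it.
print(voltage_sigma(1.0), voltage_sigma(10.0))
# A 10 mA range stays below the cap, a 1 A range hits it.
print(current_sigma(0.01), current_sigma(1.0))
```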

Code author: Peter Kraus

yadg.parsers.electrochem.tomatojson.process(*, fn, **kwargs)
Return type:

Dataset