yadg.core package

Submodules

yadg.core.process module

yadg.core.process.process_schema(schema)

Main worker function of yadg.

Takes a validated schema as its argument and returns a single annotated datagram created from that schema. Supplying a validated schema is the responsibility of the user.

Parameters

schema (Union[list, tuple]) – A fully validated schema. Use the function yadg.core.validators.validate_schema() to validate your schema.

Returns

datagram – An unvalidated datagram. The parsers included in yadg should return a valid datagram; any custom parsers might not do so. Use the function yadg.core.validators.validate_datagram() to validate the resulting datagram.

Return type

dict
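The validate, process, validate sequence described above can be sketched as follows. This is only an illustration: run_pipeline is a hypothetical helper, not part of yadg, and it receives the three yadg functions as arguments so that the sketch does not depend on yadg being installed.

```python
def run_pipeline(schema, validate_schema, process_schema, validate_datagram):
    """Validate the schema, process it, then validate the resulting datagram.

    Hypothetical helper for illustration; the three callables are expected
    to behave like the yadg functions of the same names.
    """
    # Both validators return True or raise an AssertionError.
    validate_schema(schema)
    datagram = process_schema(schema)
    # Custom parsers might not return a valid datagram, so check the result.
    validate_datagram(datagram)
    return datagram


# Intended use with yadg installed (module paths as documented in this section):
#   from yadg.core.process import process_schema
#   from yadg.core.validators import validate_schema, validate_datagram
#   datagram = run_pipeline(schema, validate_schema, process_schema, validate_datagram)
```

An AssertionError raised by either validator propagates to the caller, so a failed validation stops the pipeline before any further work is done.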

yadg.core.spec_datagram module

yadg.core.spec_schema module

yadg.core.validators module

yadg.core.validators.validate_datagram(datagram)

Datagram validator.

Checks the overall datagram format against the datagram spec, and ensures that each floating-point value is accompanied by standard deviation and unit.

The current datagram specification is:

  • The datagram must be a (dict) with two entries:

    • "metadata" (dict): A top-level entry containing metadata.

    • "steps" (list[dict]): List corresponding to a sequence of steps.

  • The "metadata" entry has to contain information about the "provenance" of the datagram, the creation date in ISO 8601 format in the "date" (str) entry, a full copy of the input schema in the "input_schema" entry, and version information in the "datagram_version" (str) entry.

  • Each element in "steps" corresponds to a single step from the schema.

  • Each step in "steps" has to contain a "metadata" (dict) entry, and a "data" (list); an optional "common" (dict) data block can be provided.

  • Each timestep in the "data" list has to specify a timestamp in the Unix timestamp format in the "uts" (float) entry, and the original filename in the "fn" (str) entry. The raw data present in this original file is stored as sub-entries within the "raw" (dict) entry. Any derived data, such as that obtained via calibration, integration, or fitting, has to be stored in the "derived" (dict) entry.

Note

A floating-point entry should always have its standard deviation specified. Internal processing of this data is always carried out using the (ufloat) type, which ought to be exported as a {"n": value, "s": std_dev, "u": "-"} keypair.

Note

Most numerical data should have associated units. The validator expects all floating-point entries to be in a {"n": value, "s": std_dev} format for properties with an arbitrary unit, and {"n": value, "s": std_dev, "u": unit} for properties with a defined unit.
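The entry format from the two notes above can be illustrated with a small sketch. The helpers float_entry and is_float_entry are hypothetical and not part of yadg. The sketch follows the second note by omitting "u" when no unit is given, and it uses plain floats rather than the (ufloat) type so that it runs without extra dependencies.

```python
def float_entry(value, std_dev, unit=None):
    """Build a {"n", "s"} or {"n", "s", "u"} entry (hypothetical helper)."""
    entry = {"n": float(value), "s": float(std_dev)}
    if unit is not None:
        entry["u"] = unit
    return entry


def is_float_entry(entry):
    """Return True if entry has the shape described in the notes above."""
    if not isinstance(entry, dict):
        return False
    keys = set(entry)
    # "n" and "s" are required; "u" is the only other permitted key.
    if not ({"n", "s"} <= keys <= {"n", "s", "u"}):
        return False
    return isinstance(entry["n"], float) and isinstance(entry["s"], float)


temperature = float_entry(298.15, 0.05, "K")   # property with a defined unit
ratio = float_entry(0.5, 0.01)                 # property with an arbitrary unit
```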

Parameters

datagram (dict) – The datagram to be validated.

Returns

True – If the datagram passes all assertions, returns True. Else, an AssertionError is raised.

Return type

bool
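As an illustration of the datagram specification above, the following sketch builds a minimal datagram by hand and checks the required timestep keys. All values (provenance string, version, tag, filename, measured quantity) are placeholders chosen for this example, and missing_timestep_keys is a hypothetical helper rather than a yadg function; it covers only a small part of what validate_datagram() is described as checking.

```python
from datetime import datetime, timezone

uts = 1609459200.0  # Unix timestamp of the (made-up) measurement

datagram = {
    "metadata": {
        "provenance": "manual example",                      # placeholder
        "date": datetime.fromtimestamp(uts, tz=timezone.utc).isoformat(),
        "input_schema": {"metadata": {}, "steps": []},       # placeholder copy
        "datagram_version": "0.0",                           # placeholder
    },
    "steps": [
        {
            "metadata": {"tag": "run01"},                    # placeholder
            "data": [
                {
                    "uts": uts,
                    "fn": "data/run01/a.csv",                # placeholder
                    "raw": {"T": {"n": 298.15, "s": 0.05, "u": "K"}},
                }
            ],
        }
    ],
}


def missing_timestep_keys(datagram):
    """List (step index, timestep index, key) for each missing required key."""
    missing = []
    for i, step in enumerate(datagram["steps"]):
        for j, timestep in enumerate(step["data"]):
            for key in ("uts", "fn", "raw"):
                if key not in timestep:
                    missing.append((i, j, key))
    return missing
```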

yadg.core.validators.validate_schema(schema, strictfiles=True, strictfolders=True)

Schema validator.

Checks the overall schema format, checks every step of the schema for required entries, and checks whether required parameters for each parser are provided. The validator additionally fills in optional parameters, where necessary for a valid schema.

The specification is:

  • The schema has to be a (dict) with top-level "metadata" (dict) and "steps" (list) entries.

  • The "metadata" entry has to specify the "provenance" of the schema, as well as the "schema_version" (str). Other entries may be specified.

  • Each element within the "steps" list is a step, of type (dict).

  • Each step has to have the "parser" and "import" entries:

    • The "parser" is a (str) entry that has to contain the name of the requested parser module. This entry is processed in the yadg.core.main._infer_datagram_handler() function in the core module.

    • The "import" is a (dict) entry that has to contain:

      • Exactly one entry out of "files" or "folders". This entry must be a (list) even if only one element is provided.

      • Any combination of "prefix", "suffix", "contains" entries. They must be of type (str). These entries specify how files within the folders are matched.

  • The only other allowed entries are:

    • "tag" (str): for defining a tag for each step; by default assigned the numerical index of the step within the schema.

    • "export" (str): for exporting a single step; specifies whether the processed step should be exported as a JSON file. This file is kept available for other steps, but will be removed at the end of schema processing.

    • "parameters" (dict): for specifying other parameters for each of the parsers.

  • No other entries are permitted.
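A schema following the specification above might look like the sketch below. The parser name, provenance, version, and paths are placeholders chosen for this example, and step_problems is a hypothetical helper that mirrors only the per-step rules listed above; it is not the implementation of validate_schema().

```python
schema = {
    "metadata": {
        "provenance": "manual",      # placeholder value
        "schema_version": "0.0",     # placeholder value
    },
    "steps": [
        {
            "parser": "dummy",       # placeholder parser name
            "import": {"folders": ["data/run01"], "suffix": "csv"},
            "tag": "run01",
            "parameters": {},
        }
    ],
}

REQUIRED_STEP_KEYS = {"parser", "import"}
OPTIONAL_STEP_KEYS = {"tag", "export", "parameters"}


def step_problems(step):
    """Return a list of human-readable problems found in a single step."""
    problems = []
    keys = set(step)
    for key in sorted(REQUIRED_STEP_KEYS - keys):
        problems.append(f"missing required entry {key!r}")
    for key in sorted(keys - REQUIRED_STEP_KEYS - OPTIONAL_STEP_KEYS):
        problems.append(f"entry {key!r} is not permitted")
    # Exactly one of "files" or "folders" must be present, and it must be a list.
    imp = step.get("import", {})
    sources = [k for k in ("files", "folders") if k in imp]
    if len(sources) != 1:
        problems.append('"import" needs exactly one of "files" or "folders"')
    elif not isinstance(imp[sources[0]], list):
        problems.append(f"{sources[0]!r} must be a list")
    return problems
```

Returning a list of problems instead of raising on the first failure is a choice made for this sketch only; the validator documented here raises an AssertionError.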

Parameters
  • schema (Union[list, tuple]) – The schema to be validated.

  • strictfiles (bool) – When False, any files specified using the "files" option will not be checked for existence. Note that folders (specified via "folders") are always checked.

Returns

True – When the schema is valid and passes all assertions, True is returned. Otherwise, an AssertionError is raised.

Return type

bool

yadg.core.validators.validator(item, spec)

Worker validator function.

This function checks that item matches the specification supplied in spec. The spec (dict) can have the following entries:

  • "type" (type), a required entry, defining the type of item,

  • "all" (dict) defining a set of required keywords and their respective spec,

  • "any" (dict) defining a set of optional keywords and their respective spec,

  • "one" (dict) defining a set of mutually exclusive keywords and their respective spec,

  • "each" (dict) providing the spec for any keywords not listed in "all", "any", or "one",

  • "allow" (bool), a switch controlling whether unspecified keys are allowed.

To extend the existing datagram and schema specs, look into yadg.core.spec_datagram and yadg.core.spec_schema, respectively.

Parameters
  • item (Union[list, dict, str]) – The item to be validated.

  • spec (dict) – The spec with which to validate the item.

Returns

True – If the item matches the spec, returns True. Otherwise, an AssertionError is raised.

Return type

bool
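The spec entries listed above can be illustrated with a minimal recursive checker. This is a hypothetical re-implementation written for this example, not the code of yadg.core.validators.validator(). It assumes that "one" means exactly one of the listed keys must be present, that unspecified keys are rejected unless "allow" is set, and that "each" is also applied to the elements of a list.

```python
def check(item, spec):
    """Return True if item matches spec; raise AssertionError otherwise."""
    assert isinstance(item, spec["type"]), f"{item!r} is not {spec['type']}"
    if isinstance(item, dict):
        known = set()
        # "all": every listed key is required.
        for key, sub in spec.get("all", {}).items():
            assert key in item, f"missing required key {key!r}"
            check(item[key], sub)
            known.add(key)
        # "any": listed keys are optional, but checked when present.
        for key, sub in spec.get("any", {}).items():
            if key in item:
                check(item[key], sub)
            known.add(key)
        # "one": mutually exclusive keys (assumed: exactly one is present).
        one = spec.get("one", {})
        if one:
            present = [key for key in one if key in item]
            assert len(present) == 1, f"need exactly one of {sorted(one)}"
            check(item[present[0]], one[present[0]])
            known.update(one)
        # Remaining keys: use "each" if given, otherwise require "allow".
        for key in item:
            if key in known:
                continue
            if "each" in spec:
                check(item[key], spec["each"])
            else:
                assert spec.get("allow", False), f"unexpected key {key!r}"
    elif isinstance(item, (list, tuple)) and "each" in spec:
        for element in item:
            check(element, spec["each"])
    return True


# Example spec modelled on the "import" entry of a schema step.
str_list = {"type": list, "each": {"type": str}}
import_spec = {
    "type": dict,
    "one": {"files": str_list, "folders": str_list},
    "any": {"prefix": {"type": str}, "suffix": {"type": str}, "contains": {"type": str}},
}

ok = check({"files": ["a.csv"], "suffix": "csv"}, import_spec)
```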