phi: For Phi XPS data

An extractor for data files from PHI X-ray photoelectron spectrometers.

yadg.extractors.phi.spe module

Processing of ULVAC PHI Multipak XPS traces.

The IGOR .spe import script by jjweimer was pretty helpful for writing this extractor.

Usage

Available since yadg-4.0.

pydantic model dgbowl_schemas.yadg.dataschema_6_0.filetype.Phi_spe

Config:

extra: str = forbid

Validators:

field filetype: Literal['phi.spe'] [Required]

Schema

xarray.DataTree:
  {{ trace_name }}:
    coords:
      E:            !!float               # Binding energies
    data_vars:
      y:            (E)                   # Signal data

Metadata

The following metadata is extracted:

software_id: ID of the software used to generate the file.

version: Version of the software used to generate the file.

username: User name used to generate the file.

Additionally, the processed header data is stored in the metadata under file_header.

Notes on file structure

These binary files actually contain an ASCII file header, delimited by “SOFH” and “EOFH”.
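The header split described above can be sketched as follows. This is a minimal illustration, not the extractor's actual code: the function name `split_file_header` is hypothetical, the markers are assumed to appear as the plain byte strings `SOFH` and `EOFH`, and the sample bytes (including the `KeyA`/`KeyB` entries) are made up.

```python
def split_file_header(raw: bytes) -> tuple[str, bytes]:
    """Split raw file contents into the ASCII header text and the remaining bytes."""
    start_marker, end_marker = b"SOFH", b"EOFH"  # assumed byte form of the markers
    start = raw.find(start_marker)
    if start < 0:
        raise ValueError("start-of-header marker not found")
    end = raw.find(end_marker, start + len(start_marker))
    if end < 0:
        raise ValueError("end-of-header marker not found")
    header = raw[start + len(start_marker):end].decode("ascii", errors="replace").strip()
    # Any line terminator that follows the end marker is left in the remainder.
    rest = raw[end + len(end_marker):]
    return header, rest


# Synthetic input: two made-up header entries followed by four binary bytes.
sample = b"SOFH\nKeyA: 1\nKeyB: 2\nEOFH\n\x01\x00\x00\x00"
header, rest = split_file_header(sample)
```

The caller still has to decide how to treat whitespace between the end marker and the first byte of the binary part.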

The binding energies corresponding to the datapoints in the later part of the file can be found from the “SpectralRegDef” entries in this header. Each of these entries looks something like:

2 2 F1s 9 161 -0.1250 695.0 675.0 695.0 680.0    0.160000 29.35 AREA

This maps as follows:

2           trace_number
2           trace_number (again?)
F1s         name
9           atomic_number
161         num_datapoints
-0.1250     step
695.0       start
675.0       stop
695.0       ?
680.0       ?
0.160000    dwell_time
29.35       e_pass
AREA        description (?)
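As an illustration of this mapping, the sketch below splits the example entry on whitespace and rebuilds the binding energy axis from start, step and num_datapoints. The names `SpectralRegion`, `parse_spectral_reg_def` and `energy_axis` are hypothetical, the two fields marked "?" are ignored, and building the axis as start + i * step is an assumption rather than a confirmed detail of the extractor.

```python
from dataclasses import dataclass


@dataclass
class SpectralRegion:
    trace_number: int
    name: str
    atomic_number: int
    num_datapoints: int
    step: float
    start: float
    stop: float
    dwell_time: float
    e_pass: float


def parse_spectral_reg_def(entry: str) -> SpectralRegion:
    """Map the whitespace-separated fields of one entry by position."""
    parts = entry.split()
    return SpectralRegion(
        trace_number=int(parts[0]),
        name=parts[2],
        atomic_number=int(parts[3]),
        num_datapoints=int(parts[4]),
        step=float(parts[5]),
        start=float(parts[6]),
        stop=float(parts[7]),
        # parts[8] and parts[9] have unknown meaning and are skipped.
        dwell_time=float(parts[10]),
        e_pass=float(parts[11]),
    )


def energy_axis(region: SpectralRegion) -> list[float]:
    """Linearly spaced energies (assumed form: start + i * step)."""
    return [region.start + i * region.step for i in range(region.num_datapoints)]


entry = "2 2 F1s 9 161 -0.1250 695.0 675.0 695.0 680.0    0.160000 29.35 AREA"
region = parse_spectral_reg_def(entry)
energies = energy_axis(region)
```

For the example entry, 161 points with a step of -0.1250 starting at 695.0 end exactly at the stop value of 675.0, which is consistent with the field assignment above.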

After the file header, the binary part starts with a short data header (offsets given from start of data header):

0x0000 group                # Data group number.
0x0004 num_traces           # Number of traces in file
0x0008 trace_header_size    # Combined lengths of all trace headers.
0x000c data_header_size     # Length of this data header.
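A minimal sketch of reading this data header is shown below. The 4-byte spacing of the offsets suggests 32-bit fields; treating them as little-endian unsigned integers is an assumption, and the names `DataHeader` and `read_data_header` as well as the sample values are hypothetical.

```python
import struct
from typing import NamedTuple


class DataHeader(NamedTuple):
    group: int
    num_traces: int
    trace_header_size: int
    data_header_size: int


# Assumption: four little-endian unsigned 32-bit integers, 16 bytes in total.
DATA_HEADER = struct.Struct("<4I")


def read_data_header(buf: bytes, offset: int = 0) -> DataHeader:
    """Unpack the four data header fields starting at the given offset."""
    return DataHeader(*DATA_HEADER.unpack_from(buf, offset))


# Synthetic header: group 1, two traces, 192 bytes of trace headers, 16-byte data header.
sample = struct.pack("<4I", 1, 2, 192, 16)
data_header = read_data_header(sample)
```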

After this follow num_traces trace headers that are each structured something like this:

0x0000 trace_number          # Number of the trace.
0x0004 bool_01               # ???
0x0008 bool_02               # ???
0x000c trace_number_again    # Number of the trace. Again?
0x0010 bool_03               # ???
0x0014 num_datapoints        # Number of datapoints in trace.
0x0018 bool_04               # ???
0x001c bool_05               # ???
0x0020 string_01             # ???
0x0024 string_02             # ???
0x0028 string_03             # ???
0x002c int_02                # ???
0x0030 string_04             # ???
0x0034 string_05             # ???
0x0038 y_unit                # The unit of the datapoints.
0x003c int_05                # ???
0x0040 int_06                # ???
0x0044 int_07                # ???
0x0048 data_dtype            # Data type for datapoints (f4 / f8).
0x004c num_data_bytes        # Unsure about this one.
0x0050 num_datapoints_tot    # This one as well.
0x0054 int_10                # ???
0x0058 int_11                # ???
0x005c end_of_data           # Byte offset of the end-of-data.
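The sketch below picks out only the fields of a trace header whose meaning is reasonably clear and skips those marked "???". It rests on several assumptions: each trace header is 0x60 bytes long (the last listed offset plus four bytes), integers are little-endian unsigned 32-bit values, and data_dtype is stored as NUL-padded ASCII. The name `read_trace_header` and the sample values are hypothetical.

```python
import struct

TRACE_HEADER_SIZE = 0x60  # assumption: last listed field at 0x5c plus 4 bytes

# Offsets of the integer fields read by this sketch, relative to the header start.
INT_FIELD_OFFSETS = {
    "trace_number": 0x00,
    "num_datapoints": 0x14,
    "end_of_data": 0x5C,
}


def read_trace_header(buf: bytes, offset: int = 0) -> dict:
    """Read selected fields of one trace header starting at the given offset."""
    fields = {
        name: struct.unpack_from("<I", buf, offset + rel)[0]
        for name, rel in INT_FIELD_OFFSETS.items()
    }
    # Assumption: the data type code ("f4" or "f8") is NUL-padded ASCII.
    raw_dtype = buf[offset + 0x48:offset + 0x4C]
    fields["data_dtype"] = raw_dtype.rstrip(b"\x00").decode("ascii")
    return fields


# Synthetic trace header with made-up values.
sample = bytearray(TRACE_HEADER_SIZE)
struct.pack_into("<I", sample, 0x00, 2)
struct.pack_into("<I", sample, 0x14, 161)
sample[0x48:0x4C] = b"f4\x00\x00"
struct.pack_into("<I", sample, 0x5C, 644)
trace_header = read_trace_header(bytes(sample))
```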

After the trace headers follow the datapoints. After the datapoints of a trace, there is a single 32-bit float that repeats the trace’s dwell time.
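Reading one block of datapoints could look like the sketch below. It assumes little-endian floats, that the f4 / f8 codes correspond to 4-byte and 8-byte IEEE floats, and that the trailing 32-bit float directly follows the datapoints of each trace. The name `read_trace_data` and the sample values are hypothetical.

```python
import struct

# Assumed mapping from the data type codes to struct format characters.
STRUCT_CODES = {"f4": "f", "f8": "d"}


def read_trace_data(buf: bytes, offset: int, num_datapoints: int, data_dtype: str = "f4"):
    """Return the datapoints, the trailing dwell time, and the offset after both."""
    fmt = f"<{num_datapoints}{STRUCT_CODES[data_dtype]}"
    values = list(struct.unpack_from(fmt, buf, offset))
    offset += struct.calcsize(fmt)
    # A single 32-bit float follows the datapoints.
    (dwell_time,) = struct.unpack_from("<f", buf, offset)
    return values, dwell_time, offset + 4


# Synthetic block: three f4 datapoints followed by a made-up dwell time of 0.5.
sample = struct.pack("<3f", 12.5, 25.0, 37.5) + struct.pack("<f", 0.5)
values, dwell_time, next_offset = read_trace_data(sample, 0, 3)
```

Returning the offset after the dwell time lets a caller read consecutive traces in a loop, if the blocks are indeed stored back to back.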

Uncertainties

The uncertainties of "E" are taken as the step-width of the linearly spaced energy values.

The uncertainties "s" of "y" are currently set to a constant value of 12.5 counts per second, as all the signals in the files seen so far only seem to take on values that are multiples of this step.
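The two rules above can be written down in a few lines. This is only an illustrative sketch with hypothetical names (`assign_uncertainties`, `is_quantised`) and made-up signal values. The second helper checks the observation that motivates the constant value.

```python
Y_STEP = 12.5  # counts per second, the constant uncertainty quoted above


def assign_uncertainties(step: float, y: list[float]) -> tuple[float, list[float]]:
    """Uncertainty of E is the step width; uncertainty of y is a constant."""
    e_uncertainty = abs(step)
    y_uncertainty = [Y_STEP] * len(y)
    return e_uncertainty, y_uncertainty


def is_quantised(y: list[float], quantum: float = Y_STEP) -> bool:
    """Check whether every signal value is an integer multiple of the quantum."""
    return all((value / quantum).is_integer() for value in y)


signal = [12.5, 25.0, 37.5, 0.0]
e_unc, y_unc = assign_uncertainties(-0.1250, signal)
quantised = is_quantised(signal)
```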

TODO

https://github.com/dgbowl/yadg/issues/13

Determining the uncertainty of the counts per second signal in XPS traces from the phispe parser should be done in a better way.

Code author: Nicolas Vetsch

yadg.extractors.phi.spe.camel_to_snake(s: str) → str

Converts CamelCase strings to snake_case.

From https://stackoverflow.com/a/1176023

Parameters: s – The CamelCase input string.
Returns: The snake_case equivalent of s.
Return type: str
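For reference, the two-pass regular expression approach commonly given for this conversion looks roughly like the sketch below. The function is named `camel_to_snake_sketch` to make clear that it is an illustration and may differ in detail from the implementation in yadg.

```python
import re


def camel_to_snake_sketch(s: str) -> str:
    """Convert a CamelCase string to snake_case using two regex passes."""
    # First pass: split before a capitalised word, e.g. "lReg" -> "l_Reg".
    s = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", s)
    # Second pass: split between a lowercase letter or digit and an uppercase letter.
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    return s.lower()


converted = camel_to_snake_sketch("SpectralRegDef")
```

Applied to a header key such as SpectralRegDef, this yields spectral_reg_def.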

yadg.extractors.phi.spe.extract_from_path(source: Path, **kwargs: dict) → DataTree