Usage

The tomato package consists of two user-facing utilities:

  • the state daemon management utility, tomato.tomato, executed using tomato,

  • the job and queue management app, tomato.ketchup, executed using ketchup.

Starting tomato.daemon

Note

For instructions on how to set tomato up for a first run, see the Quick start guide.

Provided a settings file exists, the daemon process tomato-daemon can be started on the default port using:

>>> tomato start

The daemon keeps track of pipelines configured in the device file, and schedules jobs from the queue onto them. See the concepts flowchart for a more detailed overview.

Note

Multiple instances of the tomato.daemon can be running on a single PC, provided a different port is specified for each instance using the --port argument to tomato and ketchup.
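
For example, assuming the hypothetical port 1234 is free, a second daemon instance could be started, and its queue queried, using:

>>> tomato start --port 1234
>>> ketchup status --port 1234

Here, the placement of the --port argument after the ketchup subcommand is an assumption, mirroring the tomato start example shown below.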

Using tomato

The tomato.tomato executable is used to configure, start, and manage the tomato daemon, as well as to load samples into and eject samples from pipelines, and to mark pipelines as ready.

  1. To configure the tomato daemon by creating a default settings file, run:

    >>> tomato init
    
  2. To start the tomato daemon on the default port, run:

    >>> tomato start
    

    This will read the settings file, and parse the device file listed within. To start the daemon on an alternative port, run:

    >>> tomato start --port <int>
    

    Warning

    All tomato and ketchup commands intended to interact with the tomato daemon running on an alternative port will have to be executed with the same --port <int> argument.

  3. To stop the tomato daemon, run:

    >>> tomato stop
    

    Note

    The daemon will only stop if there are no running jobs; however, a snapshot of the daemon state will be generated in any case. There is currently no way to cleanly stop the tomato daemon while jobs are running.

  4. To reload settings of a running tomato daemon, run:

    >>> tomato reload
    

    Currently, reloading driver settings from the settings file, as well as managing pipelines and/or devices from the device file, is supported. Any component not present in a pipeline is automatically removed.

    Note

    The daemon will only remove pipelines, devices and components if they are not used by any running job.

  5. To manage individual pipelines of a running tomato daemon, the following commands are available; a combined example is shown after this list:

    • For loading a sample into a pipeline:

      >>> tomato pipeline load <pipeline> <sampleid>
      

      This will only succeed on pipelines that are empty and have no jobs running.

    • To eject any sample from a pipeline:

      >>> tomato pipeline eject <pipeline>
      

      This will also succeed if the pipeline was already empty. It will fail if the pipeline has a job running.

      Note

      As a precaution, ejecting a sample from any pipeline will always mark the pipeline as not ready.

    • To mark a pipeline as ready:

      >>> tomato pipeline ready <pipeline>
      

      This will also succeed if the pipeline was already ready.
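
As a combined example of the above commands, a sample could be loaded into a pipeline and the pipeline marked as ready using (both names are purely illustrative):

>>> tomato pipeline load pip-1 sample-1
>>> tomato pipeline ready pip-1

Once any job executed on the pipeline has finished, the sample can be ejected again using:

>>> tomato pipeline eject pip-1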

Using ketchup

The tomato.ketchup executable is used to submit payloads to the daemon, to check the status of jobs in the queue, and to cancel them; a combined example is shown after the list below.

  1. To submit a job using a payload contained in a Payload file, run:

    >>> ketchup submit <payload>
    

    The job will enter the queue and wait for a suitable pipeline to begin execution.

    Note

    For more information about how jobs are matched against pipelines, see the documentation of the daemon module.

  2. To check the status of one or several jobs with known jobids, run:

    >>> ketchup status <jobids>
    

    When executed without arguments, the status of the whole queue is returned. The possible job statuses are:

    Status   Meaning
    ------   --------------------------------------------------------
    q        Job has entered the queue.
    qw       Job is in the queue, waiting for a pipeline to be ready.
    r        Job is running.
    rd       Job has been marked for cancellation.
    c        Job has completed successfully.
    ce       Job has completed with an error.
    cd       Job has been cancelled.

  3. To cancel one or more submitted jobs with known jobids, run:

    >>> ketchup cancel <jobids>
    

    This will mark the jobs for cancellation by setting their status to rd. The tomato.daemon will then proceed with cancelling each job.
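
As a combined example of the above commands, a job defined in a hypothetical payload file could be submitted, queried, and cancelled using (the filename and the jobid are purely illustrative):

>>> ketchup submit payload.yml
>>> ketchup status 1
>>> ketchup cancel 1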

Jobs submitted to the queue will remain in the queue until a pipeline meets all of the following criteria:

  • A pipeline whose capabilities match all of the technique_names specified in the Tasks of the payload must exist. Once the tomato.daemon finds such a pipeline, the status of the job changes to qw, indicating that a suitable pipeline exists.

  • The matching pipeline must contain a sample with a samplename that matches the name specified in the payload.

  • The matching pipeline must be marked as ready.

Note

Further information about ketchup is available in the documentation of the ketchup module.

Machine-readable output

As of tomato-1.0, the output of the tomato and ketchup utilities can be requested as yaml-formatted text by passing the --yaml (or -y) command-line parameter to the executables.
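
This makes it possible to drive both utilities from scripts. The following is a minimal sketch of parsing such output in Python, assuming pyyaml is installed; the jobid, the placement of the --yaml argument after the subcommand, and the keys present in the parsed output are all assumptions:

    import subprocess

    import yaml

    # Ask ketchup for the status of the (hypothetical) job 1 as
    # yaml-formatted text; the placement of --yaml after the
    # subcommand is an assumption.
    proc = subprocess.run(
        ["ketchup", "status", "1", "--yaml"],
        capture_output=True,
        text=True,
        check=True,
    )

    # Parse the yaml text into Python objects and inspect them.
    data = yaml.safe_load(proc.stdout)
    print(data)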

Accessing output data

Each job stores its data and logs in its own job folder, which is a subfolder of the jobs.storage folder specified in the settings file.

Warning

While “live” job data is available in the job folder in pickled form, accessing those files directly is not supported and may lead to race conditions and crashes. If you need an up-to-date data archive, request a snapshot. If you need the current status of a device, probe the responsible driver process.

Note that a pipeline dashboard functionality is planned for a future version of tomato.

Final job data and metadata

By default, all data in the job folder is processed to create a NetCDF file. The NetCDF files can be read using xarray.open_datatree(), returning an xarray.DataTree.

In the root node of the DataTree, the attrs dictionary contains all tomato-relevant metadata. This currently includes:

  • tomato_version which is the version of tomato used to create the NetCDF file,

  • tomato_Job which is the job object serialised as a json str, containing the full payload, sample information, as well as the job submission/execution/completion times.

The child nodes of the DataTree contain:

  • the actual data from each pipeline component, unit-annotated using the CF Metadata Conventions. The node names correspond to the role that component fulfils in a pipeline.

  • a tomato_Component entry in the attrs object, which is the component object serialised as a json str, containing information about the device address and channel that define the component, the driver and device names, as well as the component capabilities.

Note

The tomato_Job and tomato_Component entries can be converted back to the source objects using tomato.models.Job.model_validate_json() and tomato.models.Component.model_validate_json(), respectively.

Note

Unlike tomato-0.2, tomato-1.0 currently does not output measurement uncertainties.

Unless specified within the payload, the default location where these output files will be placed is the current working directory in which the ketchup submit command was executed; the default filename of the returned file is results.<jobid>.nc.
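
As an illustration of the above, the following sketch reads the final data of a hypothetical job with jobid 1, assuming the default filename results.1.nc and an xarray version that provides open_datatree(); the fields available on the recovered Job and Component objects are best inspected interactively:

    import xarray as xr

    from tomato.models import Component, Job

    dt = xr.open_datatree("results.1.nc")

    # The attrs of the root node carry the tomato-relevant metadata:
    print(dt.attrs["tomato_version"])
    job = Job.model_validate_json(dt.attrs["tomato_Job"])

    # Each child node holds the unit-annotated data of one pipeline
    # component; the node name is the role of that component.
    for role, node in dt.children.items():
        component = Component.model_validate_json(node.attrs["tomato_Component"])
        print(role, sorted(node.ds.data_vars))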

Data snapshotting

While the job is running, access to an up-to-date snapshot of the data is provided by ketchup:

>>> ketchup snapshot <jobid>

This will create an up-to-date snapshot.<jobid>.nc file in the current working directory. The file is overwritten on subsequent invocations of ketchup snapshot. Automated, periodic snapshotting to a custom location can be further configured within the payload of the job.