dgpost usage
Note
A detailed, interactive usage manual has been prepared as part of the Lock, stock, and barrel manuscript. The Binder-ready data archive, as well as the direct Binder link are below:
dgpost is intended for use in two modes:
- with a recipe, as an executable: dgpost <recipe.yml>
- as a Python library: import dgpost.utils and import dgpost.transform
These two usage modes are described below.
dgpost as an executable
This is the main user-focused way of using dgpost. The user should craft a
recipe, written in YAML, which prescribes the steps to be performed by
dgpost. Currently supported steps are:
- load: load a NetCDF or JSON datagram, or a pd.DataFrame
- extract: extract and interpolate data from the loaded files into a table
- pivot: reorganise the data in the table using columns as indices for pivoting, grouping several rows of the original table into a single row
- transform: use a function from the dgpost.transform library to process the data in the table, creating new columns
- plot: plot parts of the table using our custom wrapper around matplotlib
- save: export the created table for further use
Each of the above keywords can be specified only once in the recipe; however, more
than one command can be specified under each keyword (i.e. it is possible to
load multiple files, apply several transform functions, etc.).
Detailed syntax of each of the above steps is further described in the documentation of
each of the keywords within the dgpost.utils module.
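The rule that each keyword appears once but may hold several commands can be illustrated with a small Python sketch. It is not part of dgpost: it assumes the recipe has already been parsed into a dictionary and that each keyword maps to a list of commands. The helper count_commands and the file names are hypothetical; the authoritative syntax is in the dgpost.utils documentation.

```python
# Step keywords as listed in this section.
SUPPORTED_STEPS = ("load", "extract", "pivot", "transform", "plot", "save")


def count_commands(recipe):
    """Count the commands under each step keyword present in the recipe.

    Assumes (for illustration only) that each keyword maps to a list.
    """
    return {step: len(recipe[step]) for step in SUPPORTED_STEPS if step in recipe}


# Hypothetical parsed recipe: one "load" keyword holding two commands,
# and one "save" keyword holding a single command. File names are made up.
recipe = {
    "load": [{"path": "first.nc"}, {"path": "second.json"}],
    "save": [{"as": "table.pkl"}],
}

print(count_commands(recipe))
```

Each keyword is a single dictionary key, so it cannot be repeated, while the list under it can hold as many commands as needed.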
dgpost can be used to process multiple datasets, in a batch-like mode, using the
--patch argument. For this, the recipe should contain either $patch or
$PATCH in the str defining the path in the load
step. The same string can be included in the as argument of the save
and/or the save->as argument of the plot steps. Then, dgpost
can be executed using
dgpost --patch <patchname> <yamlfile>
and the paths in the provided yamlfile containing the recipe will be patched
using patchname.
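The batch mode can be sketched in Python as follows. The first helper only illustrates the effect of patching on a path string. How dgpost performs the substitution internally is not described here, so plain string replacement is an assumption. The second helper builds the command line shown above for several patch names. Both helper names, the patch names and the file names are hypothetical.

```python
import shlex


def patch_path(path, patchname):
    """Illustrate patching: replace $patch or $PATCH in a path with patchname."""
    return path.replace("$patch", patchname).replace("$PATCH", patchname)


def batch_commands(patchnames, yamlfile):
    """Build one 'dgpost --patch <patchname> <yamlfile>' command per patch name."""
    return [["dgpost", "--patch", name, yamlfile] for name in patchnames]


# Hypothetical dataset names and recipe file.
print(patch_path("data/$patch.nc", "run_01"))
for cmd in batch_commands(["run_01", "run_02"], "recipe.yml"):
    print(shlex.join(cmd))
```

To execute the commands rather than print them, each list can be passed to subprocess.run(cmd, check=True), provided the dgpost executable is on the PATH.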
dgpost as a Python library
For advanced users and more complex workflows, dgpost can also be imported as a
standard Python library with import dgpost. The functions within dgpost are
split into two key modules:
- the dgpost.utils module, containing the ancillary data management functions of dgpost, and
- the dgpost.transform module, containing all “scientific” data transformation functions.
See the documentation of the two respective modules for details. The functionality of dgpost can be used from within e.g. Jupyter notebooks in this way.
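As a starting point in a notebook, the two modules can be imported and their public names listed before consulting the documentation. The sketch below uses only the standard library. The helper public_names is hypothetical and makes no assumption about which functions the modules contain; it reports a missing installation instead of failing.

```python
import importlib


def public_names(module_name):
    """Return the sorted public names of a module, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return sorted(name for name in dir(module) if not name.startswith("_"))


for module_name in ("dgpost.utils", "dgpost.transform"):
    names = public_names(module_name)
    if names is None:
        print(f"{module_name}: not installed")
    else:
        print(f"{module_name}: {', '.join(names)}")
```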