dgpost usage
Note
A detailed, interactive usage manual has been prepared as part of the Lock, stock, and barrel manuscript. The Binder-ready data archive, as well as the direct Binder link are below:
dgpost is intended for use in two modes:
with a recipe as an executable: dgpost <recipe.yml>
as a Python library: import dgpost.utils and import dgpost.transform
These two usage modes are described below:
dgpost as an executable
This is the main user-focused way of using dgpost. The user should craft a recipe, written in yaml, which includes a prescription of the steps to be performed by dgpost. Currently supported steps are:
load: load a NetCDF or JSON datagram, or a pd.DataFrame
extract: extract and interpolate data from the loaded files into a table
pivot: reorganise the data in the table using columns as indices for pivoting, grouping several rows of the original table into a single row
transform: use a function from the dgpost.transform library to process the data in the table, creating new columns
plot: plot parts of the table using our custom wrapper around matplotlib
save: export the created table for further use
Each of the above keywords can be specified only once in the recipe; however, more than one command can be specified under each keyword (e.g. it’s possible to load multiple files, apply several transform functions, etc.).
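The "one keyword, several commands" layout described above can be pictured as a mapping from each step name to a list of commands. The following is a minimal sketch in Python, assuming the parsed recipe is a plain dict; the helper count_commands and the file names are hypothetical and not part of dgpost, and each command shows only the path (load) or as (save) argument, omitting whatever other arguments a real recipe needs.

```python
# Illustrative only: a recipe pictured as a plain dict, roughly as a YAML
# parser might return it. This is NOT dgpost's own data model.
SUPPORTED_STEPS = ("load", "extract", "pivot", "transform", "plot", "save")


def count_commands(recipe):
    """Return {step: number of commands} for each supported step in the recipe."""
    counts = {}
    for step in SUPPORTED_STEPS:
        if step in recipe:
            commands = recipe[step]
            if not isinstance(commands, list):
                raise TypeError(f"{step}: expected a list of commands")
            counts[step] = len(commands)
    return counts


# Each keyword appears once; "load" carries two commands (hypothetical files).
recipe = {
    "load": [
        {"path": "first.nc"},
        {"path": "second.json"},
    ],
    "save": [
        {"as": "table.pkl"},
    ],
}

print(count_commands(recipe))
```

A dict cannot hold the same key twice, which mirrors the rule that each keyword is specified only once, while the list under each key holds the individual commands.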
Detailed syntax of each of the above steps is further described in the documentation of each of the keywords within the dgpost.utils module.
dgpost can be used to process multiple datasets, in a batch-like mode, using the --patch argument. For this, the recipe should contain either $patch or $PATCH in the str defining the path in the load step. The same string can be included in the as argument of the save and/or the save->as argument of the plot steps. Then, dgpost can be executed using
dgpost --patch <patchname> <yamlfile>
and the paths in the provided yamlfile containing the recipe will be patched using patchname.
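To make the batch mode more concrete, the sketch below does two things: it mimics the placeholder substitution on a single path string, and it assembles one dgpost --patch command per dataset. It is only an illustration under stated assumptions: patch_path, build_commands, run_all, and the file name recipe.yml are hypothetical, and the plain string replacement is a stand-in, not dgpost's actual implementation.

```python
import subprocess


def patch_path(path, patchname):
    """Mimic the substitution: replace $patch or $PATCH in a path with patchname."""
    return path.replace("$patch", patchname).replace("$PATCH", patchname)


def build_commands(patchnames, recipe="recipe.yml"):
    """Build one 'dgpost --patch <patchname> <yamlfile>' command per patch name."""
    return [["dgpost", "--patch", name, recipe] for name in patchnames]


def run_all(patchnames, recipe="recipe.yml"):
    """Run dgpost once per patch name; stops at the first failing run."""
    for command in build_commands(patchnames, recipe):
        subprocess.run(command, check=True)


# Inspect what would be run, without executing anything.
for command in build_commands(["run1", "run2"]):
    print(" ".join(command))
print(patch_path("data/$patch.nc", "run1"))
```

Calling run_all(["run1", "run2"]) would then process both datasets with the same recipe, provided the dgpost executable is on the PATH.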
dgpost as a Python library
For advanced users and more complex workflows, dgpost can also be imported as a standard Python library with import dgpost. The functions within dgpost are split into two key modules:
the dgpost.utils module, containing the ancillary data management functions of dgpost, and
the dgpost.transform module, containing all “scientific” data transformation functions.
See the documentation of the two respective modules for details. The functionality of dgpost can be used in this way from within e.g. Jupyter notebooks.
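As a starting point in a notebook, the two modules can be imported and inspected before consulting their documentation. The sketch below assumes nothing about the functions they contain: it only imports dgpost.utils and dgpost.transform if they are installed and counts their public names. The helper try_import is hypothetical and not part of dgpost.

```python
import importlib


def try_import(name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# The two modules named above; each value is None if dgpost is not installed.
modules = {name: try_import(name) for name in ("dgpost.utils", "dgpost.transform")}

for name, module in modules.items():
    if module is None:
        print(f"{name}: not available in this environment")
    else:
        public = [n for n in dir(module) if not n.startswith("_")]
        print(f"{name}: {len(public)} public names")
```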