Module dagfile

Machinery for reading, editing, and writing Condor DAG files.

When running DAGs on Condor compute clusters, one often wishes to re-run a portion of a DAG. This can be done by marking all jobs except the ones to be re-run as "DONE". Unfortunately, the Condor software suite lacks an I/O library for reading and writing Condor DAG files, so there is no easy way to edit DAG files except by playing games with sed, awk, or one-off Python or Perl scripts. That's where this module comes in. It reads a DAG file into an in-memory representation that is easily edited, and allows the file to be written to disk again.

Example:

>>> from glue import dagfile
>>> dag = dagfile.DAG.parse(open("pipeline.dag"))
>>> dag.write(open("pipeline.dag", "w"))
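The re-run use case described above amounts to a graph traversal: choose the node to re-run, collect everything downstream of it, and flag every other node as done. The following is a minimal sketch of that idea only. The Node class is a hypothetical stand-in and not this module's node class, and the .done attribute name is an assumption chosen to mirror the DONE keyword; only the .parents and .children sets follow the description in this documentation.

```python
from collections import deque


class Node:
    """Hypothetical stand-in for a DAG node; not the glue.dagfile class."""

    def __init__(self, name):
        self.name = name
        self.parents = set()    # parent node objects, not names
        self.children = set()   # child node objects, not names
        self.done = False       # assumed flag mirroring the DONE keyword


def link(parent, child):
    """Record the edge on both ends so the two sets stay consistent."""
    parent.children.add(child)
    child.parents.add(parent)


def descendants(start):
    """Return start plus every node reachable through .children."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in node.children:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def mark_all_done_except(nodes, rerun):
    """Flag every node as done unless it belongs to the rerun set."""
    for node in nodes.values():
        node.done = node not in rerun


# Toy pipeline: A -> B -> D and A -> C -> D
nodes = {name: Node(name) for name in "ABCD"}
link(nodes["A"], nodes["B"])
link(nodes["A"], nodes["C"])
link(nodes["B"], nodes["D"])
link(nodes["C"], nodes["D"])

# Re-run B and everything downstream of it; all other nodes are marked done.
rerun = descendants(nodes["B"])
mark_all_done_except(nodes, rerun)
done_names = sorted(name for name, node in nodes.items() if node.done)
print(done_names)
```

With a DAG loaded by this module, the same traversal would start from an entry in the .nodes dictionary; how a node is actually marked as done should be checked against the node classes listed below.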

Although it is possible to machine-generate an original DAG file using this module and write it to disk, this module does not provide the tools required for any of the other tasks associated with pipeline construction. For example, there is no facility here to generate or manage submit files, data files, or any other files that are associated with a full pipeline. Only the DAG file itself is considered here. For general pipeline construction, see the pipeline module. The focus of this module is on editing existing DAG files.

Developers should also consider doing any new pipeline development using DAX files as the fundamental workflow description, instead of DAGs. See http://pegasus.isi.edu for more information.

A DAG file is loaded using the .parse() class method of the DAG class. This parses the file-like object passed to it and returns an instance of the DAG class representing the file's contents. Once loaded, the nodes in the DAG can all be found in the .nodes dictionary, whose keys are the node names and whose values are the corresponding node objects. Each node object's attributes include the sets .children and .parents, which hold references to that node's child and parent node objects (not their names). Note that every node must be listed as a parent of each of its children, and as a child of each of its parents. The other attributes of a DAG instance contain information about the DAG as a whole, for example the CONFIG file or the DOT file. All of the data for each node in the DAG, for example the node's VARS value or its initial working directory, can be found in the attributes of the nodes themselves. A DAG is written to a file using the .write() method of the DAG object.
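Because parent and child links are stored on both ends of every edge, hand edits to .parents or .children can leave the two sets out of step. The sketch below shows one way to detect such half-recorded edges. The Node class and the find_asymmetric_edges() helper are hypothetical and written for illustration; they are not part of this module, and only the name-to-node dictionary and the two sets follow the description above.

```python
class Node:
    """Hypothetical stand-in for a DAG node; not the glue.dagfile class."""

    def __init__(self, name):
        self.name = name
        self.parents = set()    # parent node objects, not names
        self.children = set()   # child node objects, not names


def find_asymmetric_edges(nodes):
    """Return sorted (parent_name, child_name) pairs recorded on one end only."""
    broken = set()
    for node in nodes.values():
        for child in node.children:
            if node not in child.parents:
                broken.add((node.name, child.name))
        for parent in node.parents:
            if node not in parent.children:
                broken.add((parent.name, node.name))
    return sorted(broken)


nodes = {name: Node(name) for name in ("A", "B", "C")}

# Properly linked edge A -> B: recorded on both nodes.
nodes["A"].children.add(nodes["B"])
nodes["B"].parents.add(nodes["A"])

# Half-linked edge B -> C: C does not list B among its parents.
nodes["B"].children.add(nodes["C"])

problems = find_asymmetric_edges(nodes)
print(problems)
```

Running a check of this kind before writing an edited DAG back to disk is a cheap way to catch an edit that updated only one side of a relationship.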

Classes
  DAG
Representation of the contents of a Condor DAG file.
  DATA
Representation of a Stork DATA node in a Condor DAG.
  JOB
Representation of a JOB node in a Condor DAG.
  SPLICE
Representation of a SPLICE node in a Condor DAG.
  SUBDAG_EXTERNAL
Representation of a SUBDAG EXTERNAL node in a Condor DAG.
  nofile
Object providing a no-op .write() method to fake a file.
  progress_wrapper
Progress report wrapper.
Variables
  __package__ = 'glue'