
Class Executor

All calls of external programs should be done via this class or subclasses.

Executor gets the necessary information about a program (binary, environment variables, etc.) from ExeConfigCache, creates an input file or pipe from a template (if available) or an existing file, wraps the program call into ssh and nice (if necessary), spawns an external process via subprocess.Popen, communicates the input file or string, waits for completion, collects the output file or string, and cleans up temporary files.

There are two ways of using Executor:

  • (recommended) Create a subclass of Executor for a certain program call. Methods to override would be the hooks marked "override" in the method list: prepare(), postProcess(), cleanup(), fail(), finish() and isFailed(). Additionally, you should provide a simple program configuration file in biskit/external/defaults/. See Biskit.ExeConfig for details and examples!
  • Use Executor directly. An example is given in the __main__ section of this module. You first have to create an Executor instance with all the parameters, then call its run() method and collect the result.

    In the most simple cases this can be combined into one line:
    >>> out, error, returncode = Executor('ls', strict=0).run()
    
    strict=0 means that ExeConfig does not insist on an existing exe_ls.dat file and instead looks for a program called 'ls' in the search path.

    Templates

    Templates are files or strings that contain place holders such as:
       file_in=%(f_in)s
       file_out=%(f_out)s      
    
    At run time, Executor will create an input file or pipe from the template by replacing all place holders with values from its own fields. Let's assume the above example is put into a file 'in.template'.
    >>> x = Executor( 'ls', template='in.template', f_in='in.dat')
    
    ... will then pass the following input to the ls program:
       file_in=in.dat
       file_out=/tmp/tmp1HYOvO
    
    However, the following input template will raise an error:
       file_in=%(f_in)s
       seed=%(seed)i
    
    ...because Executor doesn't have a 'seed' field. You could provide one by overriding Executor.__init__. Alternatively, you can pass seed as a keyword to the original Executor.__init__:
    >>> x = Executor('ls', template='in.template',f_in='in.dat', seed=1.5)
    
    This works because Executor.__init__ puts all unknown key=value pairs into the object's name space and passes them on to the template.
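The completion step can be reproduced with plain Python %-formatting. The following is a minimal sketch, not Executor's own code: the `fields` dictionary is an assumption that stands in for the object's name space, which holds many more entries in practice.

```python
# Sketch of template completion with plain Python %-formatting.
# `fields` stands in for the Executor instance's name space (an assumption
# for illustration only).
template = "file_in=%(f_in)s\nseed=%(seed)i\n"

fields = {"f_in": "in.dat", "f_out": "/tmp/out.dat"}

missing = None
try:
    template % fields            # fails: there is no 'seed' entry yet
except KeyError as why:
    missing = why.args[0]        # name of the unknown place holder

fields["seed"] = 1.5             # extra keyword, as in the example above
completed = template % fields    # %i truncates 1.5 to 1
```

Entries that the template does not mention (here `f_out`) are simply ignored, which is why all unknown keywords can safely be passed on.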

    Communicating Input

    Programs often expect scripts, commands or additional parameters from StdIn or from input files. Executor tries to support many scenarios -- which one is chosen mainly depends on the ExeConfig `pipes` setting in exe_<program>.dat and on the `template` parameter given to Executor.__init__. (Note: Executor loads the ExeConfig instance for the given program `name` into its `self.exe` field.)

    Here is an overview of the different scenarios and how to activate them:
  • no input (default behaviour)

    The program only needs command line parameters

    Condition:
    • template == None
  • input pipe from STDIN (== ``echo 'some input string' | myprogram``)

    Condition:
    • exe.pipes == 1 / True
    • template != None (or f_in points to an existing file)
    Setup:
  • `template` points to an existing file:

    Executor reads the template file, completes it in memory, and pushes it directly to the program.
  • `template` points to string that doesn't look like a file name:

    Executor completes the string in memory (using `self.template % self.__dict__`) and pushes it directly to the program. This is the fastest option as it avoids file access altogether.
  • `template` == None but f_in points to an *existing* file:

    Executor will read this file and push it unmodified to the program via StdIn. (This is something of an exception: if used at all, f_in usually points to a *non-existing* file that will receive the completed input.)
  • input from file (== ``myprogram < input_file``)

    Condition:
    • exe.pipes == 0 / False
    • template != None
    • push_inp == 1 / True (default)
    Setup:
  • `template` points to an existing file:

    Executor reads the template file, completes it in memory, saves the completed file to disc (creating or overwriting self.f_in), opens the file and passes the file handle to the program (instead of STDIN).
  • `template` points to string that doesn't look like a file name:

    Same as for a template file, except that the template is not read from disc but taken directly from memory (as in the pipe scenario with a string template).
  • input from file passed as argument to the program (== ``myprogram input_file``)

    Condition:
    • exe.pipes == 0 / False

    For this it is up to you to provide the correct program argument.

    Setup:
  • Use template completion:

    The best option would be to set an explicit file name for `f_in` and include this file name in `args`. Example:
     exe = ExeConfigCache.get('myprogram')
     assert not exe.pipes 
    
     x = Executor( 'myprogram', args='input.in', f_in='input.in',
               template='/somewhere/input.template', cwd='/tmp' )
    
    Executor creates your input file on the fly, which is then passed as the first argument.
  • Without template completion:

    Similar, just that you don't give a template:
     x = Executor( 'myprogram', args='input.in', f_in='input.in',
               cwd='/tmp' )
    
    It would then be up to you to provide the correct input file in `/tmp/input.in`. You could override the prepare() hook method for creating it.
  • There are other ways of doing the same thing. Look at generateInp() to see what is actually going on.
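The three input routes listed above (pipe, file handle in place of STDIN, file name as argument) can be compared side by side with subprocess alone. This is a sketch under stated assumptions: the "program" is a hypothetical Python one-liner that echoes its input in upper case, and the helper `run_prog` is invented for illustration; neither is part of Biskit.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in "program" (hypothetical): reads the file given as first argument,
# otherwise reads STDIN, and echoes the content in upper case.
PROG = ("import sys\n"
        "src = open(sys.argv[1]) if len(sys.argv) > 1 else sys.stdin\n"
        "sys.stdout.write(src.read().upper())\n")
cmd = [sys.executable, "-c", PROG]

inp = "file_in=%(f_in)s\n" % {"f_in": "in.dat"}   # completed template


def run_prog(cmd, text=None, stdin=None):
    """Spawn cmd, optionally push text through a pipe, return its STDOUT."""
    if text is not None:
        stdin = subprocess.PIPE
    p = subprocess.Popen(cmd, stdin=stdin, stdout=subprocess.PIPE,
                         universal_newlines=True)
    out, _ = p.communicate(text)
    return out


# input pipe: the completed string goes straight to STDIN
out_pipe = run_prog(cmd, text=inp)

# input from file: save the completed input, pass the open handle as STDIN
fd, f_in = tempfile.mkstemp(suffix=".in")
with os.fdopen(fd, "w") as f:
    f.write(inp)
with open(f_in) as handle:
    out_file = run_prog(cmd, stdin=handle)

# input file as argument: the file name is appended to the command
out_arg = run_prog(cmd + [f_in])

os.remove(f_in)   # clean up the temporary input file
```

All three calls deliver the same content to the program; they differ only in whether the completed input ever touches the disc and in who opens the file.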


    Instance Methods
      __init__(self, name, args='', template=None, f_in=None, f_out=None, f_err=None, strict=1, catch_out=1, push_inp=1, catch_err=0, node=None, nice=0, cwd=None, log=None, debug=0, verbose=None, **kw)
    Create Executor.
    str version(self)
    Version of class (at creation).
    str, str communicate(self, cmd, inp, bufsize=-1, executable=None, stdin=None, stdout=None, stderr=None, shell=0, env=None, cwd=None)
    Start and communicate with the new process.
    int execute(self, inp=None)
    Run external command and block until it is finished.
    any run(self, inp_mirror=None)
    Run the calculation.
    str command(self)
    Compose command string from binary, arguments, nice, and node.
    dict OR None environment(self)
    Setup the environment for the process.
      prepare(self)
    called before running external program, override!
      postProcess(self)
    called directly after running the external program, override!
      cleanup(self)
    Clean up after external program has finished (failed or not).
      fail(self)
    Called if external program failed, override!
      finish(self)
    Called if external program finished successfully, override!
      isFailed(self)
    Detect whether external program failed, override!
    str fillTemplate(self)
    Create complete input string from template with place holders.
    str convertInput(self, inp)
    Convert the input to a format used by the selected execution method.
    str generateInp(self)
    Prepare the program input (file or string) from a template (if present, file or string).

    Instance Variables
      f_in
    will be overridden by self.convertInput()
      log
    Log object for own program messages
      runTime
    time needed for last run
      output
    STDOUT returned by process
      error
    STDERR returned by process
      returncode
    int status returned by process
      pid
    process ID
      result
    set by self.finish()

    Method Details

    __init__(self, name, args='', template=None, f_in=None, f_out=None, f_err=None, strict=1, catch_out=1, push_inp=1, catch_err=0, node=None, nice=0, cwd=None, log=None, debug=0, verbose=None, **kw)
    (Constructor)

    Create Executor. *name* must point to an existing program configuration unless *strict*=0. Executor will create a program input from the template and its own fields and put it into f_in. If f_in but no template is given, the unchanged f_in is used as input. If neither is given, the program is called without input. If a node is given, the process is wrapped in an ssh call. If *nice* != 0, the process is preceded by nice. *cwd* specifies the working directory. By default, this setting is taken from the configuration file, which defaults to the current working directory.
    Parameters:
    • name (str) - program name (configured in .biskit/exe_name.dat)
    • args (str) - command line arguments
    • template (str) - template for input file -- this can be the template itself or the path to a file containing it (default: None)
    • f_in (str) - target for completed input file (default: None, discard)
    • f_out (str) - target file for program output (default: None, discard)
    • f_err (str) - target file for error messages (default: None, discard)
    • strict (1|0) - strict check of environment and configuration file (default: 1)
    • catch_out (1|0) - catch output in file (f_out or temporary) (default: 1)
    • catch_err (1|0) - catch errors in file (f_err or temporary) (default: 0)
    • push_inp (1|0) - push input file to process via stdin ('< f_in') (default: 1)
    • node (str) - host for calculation (None->no ssh) (default: None)
    • nice (int) - nice level (default: 0)
    • cwd (str) - working directory, overwrites ExeConfig.cwd (default: None)
    • log (Biskit.LogFile) - execution log (None->STDOUT) (default: None)
    • debug (0|1) - keep all temporary files (default: 0)
    • verbose (0|1) - print progress messages to log (default: log != STDOUT)
    • kw (key=value) - key=value pairs with values for template file

    version(self)

    Version of class (at creation).
    Returns: str
    version

    communicate(self, cmd, inp, bufsize=-1, executable=None, stdin=None, stdout=None, stderr=None, shell=0, env=None, cwd=None)

    Start and communicate with the new process. Called by execute(). See subprocess.Popen() for a detailed description of the parameters! This method should work for pretty much any purpose but may fail for very long pipes (more than 100000 lines).
    Parameters:
    • inp (str) - (for pipes) input sequence
    • cmd (str) - command
    • bufsize (int) - see subprocess.Popen() (default: -1)
    • executable (str) - see subprocess.Popen() (default: None)
    • stdin (int|file|None) - subprocess.PIPE or file handle or None (default: None)
    • stdout (int|file|None) - subprocess.PIPE or file handle or None (default: None)
    • stderr (int|file|None) - subprocess.PIPE or file handle or None (default: None)
    • shell (1|0) - wrap process in shell; see subprocess.Popen() (default: 0, use exe_*.dat configuration)
    • env ({str:str}) - environment variables (default: None, use exe_*.dat config)
    • cwd (str) - working directory (default: None, means self.cwd)
    Returns: str, str
    output and error output
    Raises:
    • RunError - if OSError occurs during Popen or Popen.communicate
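The core of such a method can be sketched with subprocess.Popen and Popen.communicate. This is an illustrative approximation, not the Biskit source: the `RunError` class below is a local stand-in, the function takes fewer parameters than the method documented above, and it returns the exit status in addition to the two output strings.

```python
import subprocess


class RunError(Exception):
    """Local stand-in for the RunError named above (not Biskit's class)."""


def communicate(cmd, inp=None, shell=False, env=None, cwd=None):
    """Start cmd, send inp to its STDIN, return (output, error, returncode)."""
    try:
        p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                             shell=shell, env=env, cwd=cwd,
                             universal_newlines=True)
        # communicate() writes inp, closes STDIN, and reads both output
        # streams until the process ends
        out, err = p.communicate(inp)
    except OSError as why:
        raise RunError("Couldn't run or communicate with %r: %s" % (cmd, why))
    return out, err, p.returncode
```

Because Popen.communicate buffers the complete output in memory, very large outputs are better redirected to files by passing file handles for stdout and stderr.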

    execute(self, inp=None)

    Run external command and block until it is finished. Called by run().
    Parameters:
    • inp (str) - input to be communicated via STDIN pipe (default: None)
    Returns: int
    execution time in seconds

    run(self, inp_mirror=None)

    Run the calculation. This calls (in that order):
    Parameters:
    • inp_mirror (str) - file name for formatted copy of inp file (default: None) [not implemented]
    Returns: any
    calculation result

    command(self)

    Compose command string from binary, arguments, nice, and node. Override (perhaps).
    Returns: str
    the command to execute
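How binary, arguments, nice and node might be combined into one string can be sketched as follows. The exact layout produced by Executor is not documented here, so the function name `compose_command` and the order "ssh, then nice, then program" are assumptions made for illustration.

```python
def compose_command(binary, args="", nice=0, node=None):
    """Build a shell command string; the layout is an illustrative guess."""
    cmd = binary
    if args:
        cmd += " " + args
    if nice != 0:
        cmd = "nice -n %i %s" % (nice, cmd)   # lower the scheduling priority
    if node is not None:
        cmd = "ssh %s %s" % (node, cmd)       # run on a remote host
    return cmd


local = compose_command("ls", "-l")
remote = compose_command("ls", "-l", nice=10, node="node01")
```

Here `local` is the plain call and `remote` is the same call wrapped first in nice and then in ssh; the host name `node01` is a placeholder.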

    environment(self)

    Setup the environment for the process. Override if needed.
    Returns: dict OR None
    environment dictionary

    prepare(self)

    called before running external program, override!

    postProcess(self)

    called directly after running the external program, override!

    cleanup(self)

    Clean up after external program has finished (failed or not). Override, but call in child method!

    fail(self)

    Called if external program failed, override!

    finish(self)

    Called if external program finished successfully, override!

    isFailed(self)

    Detect whether external program failed, override!
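The hook methods above follow the template-method pattern. The toy class below illustrates how they could fit together; the call order is inferred from the hook descriptions (before, directly after, on failure, on success, always) and may differ from what run() does in detail. `MiniExecutor` is a hypothetical name and does not start any process.

```python
class MiniExecutor(object):
    """Toy model of the hook sequence; not Biskit's implementation."""

    def __init__(self, returncode=0):
        self.calls = []               # records the order in which hooks fire
        self.returncode = returncode  # pretend exit status of the program
        self.result = None

    def _note(self, name):
        self.calls.append(name)

    def prepare(self):
        self._note("prepare")

    def execute(self):
        self._note("execute")         # a real class would spawn the process

    def postProcess(self):
        self._note("postProcess")

    def isFailed(self):
        self._note("isFailed")
        return self.returncode != 0

    def fail(self):
        self._note("fail")

    def finish(self):
        self._note("finish")
        self.result = "done"

    def cleanup(self):
        self._note("cleanup")

    def run(self):
        try:
            self.prepare()
            self.execute()
            self.postProcess()
            if self.isFailed():
                self.fail()
            else:
                self.finish()
        finally:
            self.cleanup()            # runs whether the program failed or not
        return self.result


good = MiniExecutor()
good.run()
bad = MiniExecutor(returncode=1)
bad.run()
```

A subclass for a concrete program would typically override only the hooks it needs, for example isFailed() to inspect the output and finish() to parse it into `self.result`.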

    fillTemplate(self)

    Create complete input string from template with place holders.
    Returns: str
    input
    Raises:
    • TemplateError - if unknown option/place holder in template file

    convertInput(self, inp)

    Convert the input to a format used by the selected execution method.
    Parameters:
    • inp (str) - path to existing input file or string with input
    Returns: str
    input string if self.exe.pipes; file name otherwise
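The documented contract (a string for pipes, a file name otherwise) can be sketched as a standalone function. This is a guess at the behaviour based only on the parameter and return descriptions above; the name `convert_input`, the explicit `pipes` argument and the use of a temporary file are assumptions, whereas the real method works on `self.exe.pipes` and `self.f_in`.

```python
import os
import tempfile


def convert_input(inp, pipes):
    """Return the input content if pipes is true, else a file name holding it."""
    is_file = os.path.isfile(inp)
    if pipes:
        if is_file:
            with open(inp) as f:      # existing file: read it into memory
                return f.read()
        return inp                    # already an input string
    if is_file:
        return inp                    # existing file: hand over its name
    # input string but no pipe: store it in a (temporary) file
    fd, fname = tempfile.mkstemp(suffix=".inp")
    with os.fdopen(fd, "w") as f:
        f.write(inp)
    return fname
```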

    generateInp(self)

    Prepare the program input (file or string) from a template (if present, file or string).
    Returns: str
    input file name OR (if pipes=1) content of input file
