job_manager

Calculation manager.

This module provides classes for managing collections of calculations. For persistent information it is best to interact via a JobCache instance.

Parameters to class constructors are also names of attributes of that class unless otherwise noted.

exception job_manager.UserError

Bases: exceptions.Exception

Raised due to an error in user input.

exception job_manager.LockException

Bases: exceptions.Exception

Raised if a lock cannot be acquired.

class job_manager.JobStatus

enum-esque class for specifying the status of a job.

Defined statuses:

unknown = 'unknown'
held = 'held'
queueing = 'queueing'
running = 'running'
finished = 'finished'
analysed = 'analysed'
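
To make the "enum-esque" description concrete, the following is a minimal sketch of such a class together with a validity check. It is a stand-in written for illustration, not the module's source; the helper is_valid_status is hypothetical and is not part of job_manager.

```python
class JobStatus(object):
    """Stand-in that mirrors the statuses listed above."""
    unknown = 'unknown'
    held = 'held'
    queueing = 'queueing'
    running = 'running'
    finished = 'finished'
    analysed = 'analysed'


def is_valid_status(status):
    """Return True if status names a public attribute of JobStatus.

    Hypothetical helper: each status attribute holds its own name, so a
    valid status is a string that maps back onto itself.
    """
    return (isinstance(status, str)
            and not status.startswith('_')
            and getattr(JobStatus, status, None) == status)
```

With this stand-in, is_valid_status('running') is true while is_valid_status('paused') is false.
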
class job_manager.Job(job_id, program, path, input_fname=None, output_fname=None, status=None, submit=None, comment=None)

Store of information regarding a calculation job.

Parameters:
  • job_id (string or integer) – job id (e.g. pid or from queueing system)
  • program (string) – program being executed
  • path (string) – path to job directory
  • input_fname (string) – input file name
  • output_fname (string) – output file name
  • status (string) – current status of job. See JobStatus for defined statuses. This must be an attribute of JobStatus.
  • submit (string) – submit script file name.
  • comment (string) – further information regarding the job.

Only job_id, program and path are required. All other attributes are optional. Not all attributes are always applicable.

mtime()

Inspect the timestamp of the job.

Return type:integer
Returns:the last time (in seconds since the epoch) that the job was modified.
auto_update()

Update job status attribute automatically.

This inspects the output from ps and any queueing system to discover if the status of the job has changed (e.g. if the job has started or has finished). Note that this assumes the job is running on the local computer. Warning: if this condition is not met, then the job status will be incorrectly updated to finished.

Currently only aware of the PBS and LoadLeveler queueing systems.

Only jobs which are currently held, queueing or running are updated.
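
The warning above can be illustrated with a small sketch. It assumes the set of job ids visible to ps or the queueing system has already been collected into visible_ids; the function refresh_status and that argument are hypothetical, and the sketch models only the case of a job disappearing, not transitions such as queueing to running.

```python
ACTIVE_STATUSES = ('held', 'queueing', 'running')


def refresh_status(status, job_id, visible_ids):
    """Return the new status of a job given the ids currently visible.

    visible_ids is a set of strings, standing in for whatever ps or the
    queueing system reports on the local computer (an assumption).
    """
    if status not in ACTIVE_STATUSES:
        # Only held, queueing or running jobs are ever updated.
        return status
    if str(job_id) not in visible_ids:
        # A job that cannot be seen locally is taken to have finished,
        # which is wrong if it is actually running on another computer.
        return 'finished'
    return status
```
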

modify(job_spec)

Modify the job description.

Parameters:job_spec (dictionary) – dictionary with Job attributes as keys associated with new values. All attributes set at initialisation can be changed. Keys with null values are ignored. See output from job_spec() for an example format.
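
The rule that keys with null values are ignored can be sketched on plain dictionaries. This is an illustration only: apply_job_spec is a hypothetical function, the example values are made up, and how the real method treats keys that are not Job attributes is not covered here.

```python
def apply_job_spec(current, job_spec):
    """Return a copy of current updated with the non-None entries of job_spec."""
    updated = dict(current)
    for key, value in job_spec.items():
        if value is None:
            continue  # keys with null values leave the attribute untouched
        updated[key] = value
    return updated


# Illustrative job description; all values are invented for the example.
job = {'job_id': 1234, 'program': 'my_program', 'path': '/tmp/run1',
       'status': 'queueing', 'comment': None}
changed = apply_job_spec(job, {'status': 'running', 'path': None})
```

Here changed['status'] becomes 'running' while changed['path'] keeps its original value because the supplied path is None.
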
match(pattern)

Test to see if the job description matches the supplied pattern.

Parameters:pattern (string) – regular expression. All public attributes of the Job instance are tested (using re.search) against the pattern.
Return type:boolean
Returns:True if any attribute matches the pattern.
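
A minimal sketch of this kind of matching, using a dictionary in place of a Job instance. Converting each value to a string and skipping attributes that are None are assumptions made for the sketch; the function name matches is hypothetical.

```python
import re


def matches(attributes, pattern):
    """Return True if re.search finds pattern in any public attribute value."""
    for name, value in attributes.items():
        if name.startswith('_') or value is None:
            continue  # skip private names and unset attributes (assumption)
        if re.search(pattern, str(value)):
            return True
    return False


# Illustrative job description; all values are invented for the example.
job = {'job_id': 4021, 'program': 'my_program', 'path': '/home/user/calc',
       'status': 'running', 'comment': None}
```

Because re.search is used rather than re.match, the pattern may match anywhere inside a value, so both '^run' and 'user' match this job.
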
job_spec()

Inspect the job.

Return type:dictionary
Returns:dictionary (a job spec) of the job attributes.
class job_manager.JobServer(hostname='localhost')

Store set of Job instances running on a server/computer.

Parameters:hostname (string) – name of computer running the jobs. Default: localhost. localhost is treated as a special case and is assumed to be the computer on which this instance of job_manager is running.
jobs

list of Job instances which are running on the server.

add(job_spec)

Add a Job to the list of jobs running on the server.

Parameters:job_spec (dictionary) – job to be added. See Job and Job.job_spec() for possible fields and format.
auto_update()

Automatically update the job status of all jobs.

Only performed on the localhost JobServer. See also Job.auto_update().

select(pattern)

Select a subset of jobs from the server which match the supplied pattern.

Parameters:pattern (string) – regular expression. All jobs are tested (using Job.match()) against the pattern and the list of matching jobs is returned. If pattern is None then all jobs are returned.
delete(indices=None, pattern=None)

Delete a selected subset of jobs.

Parameters:
  • indices (iterable of integers) – indices of Job instances in the jobs list to delete. Not used if None or empty.
  • pattern (string) – regular expression. The jobs which match the pattern (found using select()) are deleted. Not used if None.
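
Selection and deletion can be sketched on a list of dictionaries. Two points here are assumptions rather than documented behaviour: indices and pattern are combined as a union, and this stand-in select returns indices instead of the matching jobs to keep the example short.

```python
import re


def select(jobs, pattern=None):
    """Return the indices of jobs with any non-None field matching pattern."""
    if pattern is None:
        return list(range(len(jobs)))
    return [i for i, job in enumerate(jobs)
            if any(re.search(pattern, str(value))
                   for value in job.values() if value is not None)]


def delete(jobs, indices=None, pattern=None):
    """Remove the jobs picked by index and/or pattern from the list in place."""
    doomed = set(indices or ())
    if pattern is not None:
        doomed.update(select(jobs, pattern))
    # Delete from the end so earlier removals do not shift later indices.
    for i in sorted(doomed, reverse=True):
        del jobs[i]
    return jobs
```

Deleting in descending index order is the main design point: removing index 0 first would otherwise change which job sits at index 2.
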
modify(job_spec, indices=None, pattern=None)

Modify a selected subset of jobs using Job.modify().

Parameters:
  • job_spec (dictionary) – fields of job to be modified. See Job and Job.job_spec() for possible fields and format.
  • indices (iterable of integers) – indices of Job instances in the jobs list to modify. Not used if None or empty.
  • pattern (string) – regular expression. The jobs which match the pattern (found using select()) are modified. Not used if None.
merge(other)

Merge jobs from another JobServer.

If a Job in the other JobServer has the same job_id as a Job in jobs and has a later modification time, then it is copied across. If a job in the other JobServer does not have the same job_id as any Job in jobs, then it is copied across.

Note that the hostname of the other JobServer is not checked.

The job_id of the Job instance is treated as a unique identifier. This is usually true on a given queueing system but is not guaranteed with running local jobs (where the job_id is taken from ps). Care should thus be taken when merging jobs from a localhost JobServer and when merging jobs from two different JobServer instances (which should instead be grouped together using a JobCache instance).

Parameters:other (JobServer) – another instance of JobServer.
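
The merge rule described above can be sketched with dictionaries that carry a job_id and a modification time. The function merge_jobs and the 'mtime' key are hypothetical stand-ins for Job instances and Job.mtime().

```python
def merge_jobs(jobs, other_jobs):
    """Merge other_jobs into jobs in place, keyed on job_id."""
    position = {job['job_id']: i for i, job in enumerate(jobs)}
    for other in other_jobs:
        i = position.get(other['job_id'])
        if i is None:
            # Unknown job_id: copy the job across.
            jobs.append(dict(other))
            position[other['job_id']] = len(jobs) - 1
        elif other['mtime'] > jobs[i]['mtime']:
            # Same job_id but modified more recently: replace our copy.
            jobs[i] = dict(other)
    return jobs


mine = [{'job_id': 1, 'mtime': 100, 'status': 'running'},
        {'job_id': 2, 'mtime': 200, 'status': 'running'}]
theirs = [{'job_id': 1, 'mtime': 150, 'status': 'finished'},
          {'job_id': 2, 'mtime': 50, 'status': 'queueing'},
          {'job_id': 3, 'mtime': 10, 'status': 'held'}]
merge_jobs(mine, theirs)
```

After the merge, job 1 takes the newer 'finished' entry, job 2 keeps the local entry because the other copy is older, and job 3 is appended.
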
class job_manager.JobCache(cache, load=False)

Store, manipulate, load and save multiple JobServer instances.

By default a new (empty) JobServer instance is created on localhost.

Parameters:
  • cache (string) – path to a file in which the job data can be stored and retrieved. Only one instance can manipulate job data stored in a cache at a time, so a lock is acquired when a cache is read and released only when the data is dumped back out to the cache file. The directory for the cache file is created if it doesn’t already exist.
  • load (boolean) – load data from an existing cache file if true. Not an attribute.
job_servers

List of JobServer instances.

dump()

Dump job_servers data to the cache file.

Also releases the lock and resets the job_servers to be an empty JobServer instance on the localhost.

load()

Read in the job_servers data from the cache file.

Also acquires the lock.
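
The lock held between load() and dump() can be pictured with the sketch below. How job_manager actually implements its lock is not stated here, so everything in the sketch is an assumption: the lock is modelled as a file created atomically next to the cache, the '.lock' suffix is invented, and acquire_lock and release_lock are hypothetical names.

```python
import os
import tempfile


class LockException(Exception):
    """Stand-in for job_manager.LockException."""


def acquire_lock(cache):
    """Create a lock file beside cache, or raise LockException if it exists."""
    cache_dir = os.path.dirname(os.path.abspath(cache))
    if not os.path.isdir(cache_dir):
        os.makedirs(cache_dir)  # the cache directory is created on demand
    lock = cache + '.lock'  # hypothetical naming scheme
    try:
        # O_CREAT | O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockException('cache is locked: %s' % lock)
    os.close(fd)
    return lock


def release_lock(lock):
    """Remove the lock file so that another instance can load the cache."""
    os.remove(lock)


# Usage: a second acquisition fails until the first lock is released.
workdir = tempfile.mkdtemp()
cache_file = os.path.join(workdir, 'cache', 'jobs')
lock = acquire_lock(cache_file)
try:
    acquire_lock(cache_file)
    second_succeeded = True
except LockException:
    second_succeeded = False
release_lock(lock)
```
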

add_server(hostname)

Add a new JobServer instance.

Parameters:hostname (string) – name of server.
auto_update()

Auto-update the status of the jobs on the localhost JobServer.

See also JobServer.auto_update().

merge(other, other_hostname)

Merge data from another JobCache.

Each JobServer in the other JobCache is merged with the corresponding JobServer in the current instance. JobServer instances are matched by hostname. If a JobServer exists in both the other JobCache and the current instance, then they are merged using JobServer.merge(); otherwise it is simply copied to the current instance. Note that the localhost hostname of the other JobCache is replaced with other_hostname to avoid unintended name clashes.

Parameters:
  • other (JobCache) – another instance of JobCache.
  • other_hostname (string) – hostname of the other JobCache. Used instead of localhost when transferring the localhost JobServer from the other JobCache to the current instance.
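
The hostname handling can be sketched with each cache modelled as a dictionary mapping hostname to a list of job dictionaries. The names merge_caches and _merge_jobs and the 'mtime' key are hypothetical; the per-server step follows the rule given for JobServer.merge().

```python
def _merge_jobs(jobs, other_jobs):
    """Per-server merge: copy unknown jobs and newer versions of known jobs."""
    position = {job['job_id']: i for i, job in enumerate(jobs)}
    for other in other_jobs:
        i = position.get(other['job_id'])
        if i is None:
            jobs.append(dict(other))
            position[other['job_id']] = len(jobs) - 1
        elif other['mtime'] > jobs[i]['mtime']:
            jobs[i] = dict(other)


def merge_caches(servers, other_servers, other_hostname):
    """Merge other_servers into servers, renaming the other localhost."""
    for hostname, other_jobs in other_servers.items():
        # The other cache's localhost is a different computer from ours.
        target = other_hostname if hostname == 'localhost' else hostname
        if target in servers:
            _merge_jobs(servers[target], other_jobs)
        else:
            servers[target] = [dict(job) for job in other_jobs]
    return servers
```

Without the rename, the two localhost entries would be merged by job_id even though they describe jobs on different computers.
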
pretty_print(hosts=None, pattern=None, short=False)

Print out job_servers.

Parameters:
  • hosts (list of strings) – list of hostnames. If specified, print out only jobs on the specified servers.
  • pattern (string) – regular expression. Only jobs which match the supplied pattern are printed. If pattern is None then all jobs are printed.
  • short (boolean) – print just the hostname, index, job_id and status.