tracked

class anadama2.tracked.Base[source]

The Dependency object is the tool for specifying parent-child relationships between tasks. Dependency objects can be used in either the targets or depends arguments of anadama2.workflow.Workflow.add_task(). Often, these targets or dependencies are specified as strings or anadama2.Task objects and instantiated into the appropriate Base subclass with anadama2.tracked.auto(); this behavior depends on the arguments to anadama2.workflow.Workflow.add_task().

A dependency of the same name will be defined multiple times in the normal use of AnADAMA. To ensure that repeated calls of a dependency constructor, or repeated use of the same argument to anadama2.tracked.auto(), return the same dependency instance, a little bit of black magic involving anadama2.tracked.Base.__new__() is required. The price for this magic is that subclasses of Base must define the init method to initialize the dependency, instead of the more commonly used __init__ method. The init method is only called once per dependency, while __init__ is called every time any Base subclass is instantiated. Base subclasses must also define a key() staticmethod. The key() staticmethod is used to look up already existing instances of that dependency; return values from the key() method must be unique to that dependency.

Unrelated to the quasi-singleton magic above, subclasses of Base must define a compare() method. See anadama2.tracked.Base.compare() for more documentation.
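
The quasi-singleton behavior described above can be sketched in plain Python. This is a minimal illustration of the pattern, not the AnADAMA2 source: the class names QuasiSingleton and NamedThing, the _instances cache, and the key normalization are all hypothetical.

```python
class QuasiSingleton(object):
    """Hypothetical stand-in for a Base-like class; not AnADAMA2 code."""

    _instances = {}  # (class, key) -> the one shared instance

    def __new__(cls, *args, **kwargs):
        cache_key = (cls, cls.key(*args, **kwargs))
        instance = cls._instances.get(cache_key)
        if instance is None:
            instance = super(QuasiSingleton, cls).__new__(cls)
            instance.init(*args, **kwargs)  # runs once per distinct key
            cls._instances[cache_key] = instance
        return instance

    def __init__(self, *args, **kwargs):
        # __init__ runs on every instantiation, so it is deliberately a no-op.
        pass

    @staticmethod
    def key(name):
        raise NotImplementedError

    def init(self, name):
        raise NotImplementedError


class NamedThing(QuasiSingleton):
    @staticmethod
    def key(name):
        # Assumed normalization, for illustration only.
        return name.strip().lower()

    def init(self, name):
        self.name = name
        self.init_calls = getattr(self, "init_calls", 0) + 1


a = NamedThing("Reads.fastq")
b = NamedThing("reads.fastq ")  # same key, so the cached instance is returned
```

Here a and b are the same object and init has run only once, which is why per-dependency setup belongs in init rather than __init__.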

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice per task: once on each of the task’s dependencies before the task’s actions are executed, and once on each of the task’s targets after all of the task’s actions have been executed.

Returns:iterator
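
One reason to return an iterator rather than a single value is that cheap checks can be yielded first and expensive ones only computed if needed. The sketch below illustrates that idea under stated assumptions: compare_sketch, has_changed, and work_log are hypothetical names, and how AnADAMA2 actually consumes the iterator is not documented here.

```python
import hashlib
import itertools

work_log = []  # records which comparison steps were actually computed


def compare_sketch(payload):
    """Yield a cheap value first, then an expensive one (hypothetical)."""
    work_log.append("size")
    yield len(payload)
    work_log.append("checksum")
    yield hashlib.sha256(payload).hexdigest()


def has_changed(saved_values, current_iter):
    """Stop at the first differing pair; later values are never computed."""
    missing = object()
    pairs = itertools.zip_longest(saved_values, current_iter, fillvalue=missing)
    for old, new in pairs:
        if old != new:
            return True
    return False


saved = list(compare_sketch(b"abc"))  # values stored after a previous run
del work_log[:]
changed = has_changed(saved, compare_sketch(b"abcd"))
steps_for_changed = list(work_log)    # only the cheap step ran

del work_log[:]
unchanged = has_changed(saved, compare_sketch(b"abc"))
steps_for_unchanged = list(work_log)  # both steps ran
```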
exists()[source]

Return whether the thing that this object represents exists. Examples include whether a file exists, or whether a row in a database exists. This method is used when the workflow is not in strict mode and the user tries to depend on a Tracked object that isn’t the target of another task.

Returns:bool
init(key)[source]

Initialize the dependency. Only run once for each new dependency object with this key().

static key(the_key)[source]

Returns the unique key for retrieving this dependency from the storage backend and for comparing against other dependencies of the same type.

Return type:str
must_preexist = True
class anadama2.tracked.Container(namespace=None, **kwds)[source]

Track a collection of small strings. This is useful for rerunning tasks based on whether a flag has changed when running a command. Using anadama2.tracked.TrackedString for small strings can run into collisions between workflows that use the same backend. Consider using logging = TrackedString("debug") in script_a.py and logging = TrackedString("warning") in script_b.py. If you change script_a.py to logging = TrackedString("warning") after running script_b.py, script_a.py won’t rerun tasks that depend on the TrackedString assigned to logging.

This class solves TrackedString collisions by prepending a user-provided (or auto-generated) namespace to each key-value pair in the collection. The auto-generated namespace is unique to the script or module creating the Container, so expect to have a bunch of tasks rerun if you rename a module or script between runs.

>>> import anadama2.tracked
>>> conf = anadama2.tracked.Container(alpha="5", beta=2)
>>> conf.alpha
<anadama2.tracked.TrackedVariable object at 0x7f66445fa490> 
>>> str(conf.alpha)
'5'
>>> str(conf['beta'])
'2'
>>> conf.beta = 7
>>> conf.beta
<anadama2.tracked.TrackedVariable object at 0x7f66445fa4d0>
>>> str(conf.beta)
'7'
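
The collision that Container is meant to avoid can be illustrated with a plain dictionary standing in for the shared backend. This is a sketch under assumptions: backend, changed_since_last_run, and namespaced_key are hypothetical, and the real key layout used by AnADAMA2 may differ.

```python
backend = {}  # hypothetical shared store: key -> last saved value


def changed_since_last_run(key, value):
    """Report whether value differs from what was saved under key, then save it."""
    changed = backend.get(key) != value
    backend[key] = value
    return changed


# Without a namespace, the key is assumed to be the string itself.
changed_since_last_run("debug", "debug")      # script_a.py, first run
changed_since_last_run("warning", "warning")  # script_b.py, first run
# script_a.py is edited to use "warning"; the key already exists, so no change is seen.
collision = changed_since_last_run("warning", "warning")


def namespaced_key(namespace, name):
    return "{}.{}".format(namespace, name)


backend.clear()
changed_since_last_run(namespaced_key("script_a", "logging"), "debug")
changed_since_last_run(namespaced_key("script_b", "logging"), "warning")
# The same edit is now detected because each script has its own key.
detected = changed_since_last_run(namespaced_key("script_a", "logging"), "warning")
```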
compare()[source]
items()[source]
static key(namespace=None)[source]
class anadama2.tracked.DependencyIndex[source]

Keeps track of what dependencies belong to what class and provides efficient lookups of what task produces what dependency. Use DependencyIndex[dependency_obj] to get the task that makes that dependency.

Link a dependency to a task. Used later for lookups with __getitem__.

Parameters:
  • dep (subclass of anadama2.tracked.Base) – The dependency to track
  • task_or_none (anadama2.Task or None) – The task that’s supposed to create this dependency. Use None if the dependency isn’t created by a task but exists prior to any tasks running.
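
The lookup described above amounts to a mapping from dependency to producing task. The following is a rough sketch only: TinyIndex and its link method are hypothetical stand-ins, and plain strings are used in place of real dependency and task objects.

```python
class TinyIndex(object):
    """Hypothetical stand-in; not the AnADAMA2 DependencyIndex."""

    def __init__(self):
        self._producer = {}  # dependency -> task that creates it, or None

    def link(self, dep, task_or_none):
        self._producer[dep] = task_or_none

    def __getitem__(self, dep):
        return self._producer[dep]


index = TinyIndex()
index.link("out.txt", "task 1")  # created by a task
index.link("in.txt", None)       # exists before any task runs

producer = index["out.txt"]
preexisting = index["in.txt"]
```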
class anadama2.tracked.HugeTrackedFile[source]

Track a large file. Large being large enough that you don’t want to read through the entire file to create a checksum of its contents.

This dependency class is the fastest option for tracking file changes for large files. Speed comes at the cost of safety; only the size and the modification time are used to determine freshness.
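
The trade-off can be seen in a small sketch that fingerprints a file by size and modification time only. The helper cheap_fingerprint and the file name are hypothetical; the point is that an edit which preserves both values goes unnoticed.

```python
import os
import tempfile


def cheap_fingerprint(path):
    """Size and modification time only; file contents are never read."""
    info = os.stat(path)
    return (info.st_size, info.st_mtime_ns)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "big.bin")
    with open(path, "wb") as handle:
        handle.write(b"AAAA")
    before = cheap_fingerprint(path)

    # Same-size edit with the old timestamp restored: not detected.
    with open(path, "wb") as handle:
        handle.write(b"BBBB")
    os.utime(path, ns=(before[1], before[1]))
    missed_edit = cheap_fingerprint(path) == before

    # An edit that changes the size is detected.
    with open(path, "ab") as handle:
        handle.write(b"CC")
    caught_edit = cheap_fingerprint(path) != before
```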

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
class anadama2.tracked.TrackedDirectory[source]

Track a directory. A directory is considered changed if it’s removed, the modify time has changed, the list of files within the directory has changed, or if any of the files within the directory have changed size or modify times.

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
exists()[source]

Return whether the thing that this object represents exists. Examples include whether a file exists, or whether a row in a database exists. This method is used when the workflow is not in strict mode and the user tries to depend on a Tracked object that isn’t the target of another task.

Returns:bool
class anadama2.tracked.TrackedExecutable[source]

Track a script or binary executable.

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
exists()[source]

Return whether the thing that this object represents exists. Examples include whether a file exists, or whether a row in a database exists. This method is used when the workflow is not in strict mode and the user tries to depend on a Tracked object that isn’t the target of another task.

Returns:bool
init(name, version_command='{} --version')[source]

Initialize the dependency.

Parameters:
  • name (str) – Name of a script on the shell $PATH or name of the file to track
  • version_command (str) – Command to get the executable’s version
static key(name)[source]
version()[source]
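
The default version_command of '{} --version' is a format string with a placeholder for the executable name. A minimal sketch of how such a template could be filled in and run follows; version_of is a hypothetical helper, and the way AnADAMA2 actually executes the command is an assumption.

```python
import shlex
import subprocess
import sys


def version_of(name, version_command="{} --version"):
    """Fill the template with the executable name and capture its output."""
    command = version_command.format(shlex.quote(name))
    result = subprocess.run(
        shlex.split(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some tools print their version to stderr
        universal_newlines=True,
    )
    return result.stdout.strip()


# The running Python interpreter is used here only as a convenient executable.
reported = version_of(sys.executable)
```

For tools that report their version differently, a template such as "{} -V" could be passed instead.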
class anadama2.tracked.TrackedFile[source]

Track a small file. Small being small enough that you don’t mind that the entire file is read to create a checksum of its contents.

This dependency class is the safest option for tracking file changes. Safety comes at the cost of disk IO; all file contents are read to create a checksum.
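
A content checksum catches edits that leave size and modification time untouched, at the price of reading every byte. The sketch below shows the general technique; content_checksum is a hypothetical helper and the choice of SHA-256 is an assumption, not necessarily the hash AnADAMA2 uses.

```python
import hashlib
import os
import tempfile


def content_checksum(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return a hex digest of its contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "small.txt")
    with open(path, "wb") as handle:
        handle.write(b"AAAA")
    before = content_checksum(path)

    # Same size, different contents: the checksum still changes.
    with open(path, "wb") as handle:
        handle.write(b"BBBB")
    after = content_checksum(path)

    edit_detected = before != after
```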

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
exists()[source]

Return whether the thing that this object represents exists. Examples include whether a file exists, or whether a row in a database exists. This method is used when the workflow is not in strict mode and the user tries to depend on a Tracked object that isn’t the target of another task.

Returns:bool
init(name)[source]

Initialize the dependency.

Parameters:name (str) – The filename to keep track of

static key(name)[source]
class anadama2.tracked.TrackedFilePattern[source]

Track several files according to a bash-style globbing pattern. Uses glob.glob() under the hood. A glob is considered changed if the names of the matched files change, or if any of the matched files change in size or modify time.
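
The change rule for a pattern can be sketched as a fingerprint built from glob.glob() results. The helper pattern_fingerprint and the file names are hypothetical; this illustrates the rule, not the AnADAMA2 implementation.

```python
import glob
import os
import tempfile


def pattern_fingerprint(pattern):
    """Names, sizes, and modification times of every file matching the pattern."""
    fingerprint = []
    for name in sorted(glob.glob(pattern)):
        info = os.stat(name)
        fingerprint.append((name, info.st_size, info.st_mtime_ns))
    return fingerprint


def write_file(path, data):
    with open(path, "wb") as handle:
        handle.write(data)


with tempfile.TemporaryDirectory() as workdir:
    pattern = os.path.join(workdir, "*.txt")
    write_file(os.path.join(workdir, "a.txt"), b"one")
    write_file(os.path.join(workdir, "b.txt"), b"two")
    before = pattern_fingerprint(pattern)

    # A file that does not match the pattern leaves the fingerprint alone.
    write_file(os.path.join(workdir, "notes.log"), b"ignored")
    unaffected = pattern_fingerprint(pattern) == before

    # A new matching file changes the list of names.
    write_file(os.path.join(workdir, "c.txt"), b"three")
    changed = pattern_fingerprint(pattern) != before
```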

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
exists()[source]

Return whether the thing that this object represents exists. Examples include whether a file exists, or whether a row in a database exists. This method is used when the workflow is not in strict mode and the user tries to depend on a Tracked object that isn’t the target of another task.

Returns:bool
class anadama2.tracked.TrackedFunction[source]

Useful for things like database lookups or API calls. The function must return a hashable type. For a tiered comparison method like that seen in anadama2.tracked.TrackedFile, it’s best to create your own subclass of Base and override the compare() method.

compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
init(key, fn)[source]

Initialize the dependency. Only run once for each new dependency object with this key().

static key(key)[source]
must_preexist = False
class anadama2.tracked.TrackedString[source]
compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
init(s)[source]

Initialize the dependency.

Parameters:s (str or unicode) – The string to keep track of
static key(s)[source]
must_preexist = False
class anadama2.tracked.TrackedVariable[source]
compare()[source]

Produce the iterator that is used to determine if this dependency has changed. This method is called twice: once before the task’s actions have been executed for all of a task’s dependencies and once after all of a task’s actions are executed for each of the task’s targets.

Returns:iterator
init(namespace, k, v)[source]

Initialize the dependency. Only run once for each new dependency object with this key().

static key(ns, k, v)[source]
must_preexist = False
anadama2.tracked.any_different(ds, backend)[source]

Determine whether any dependencies have changed since last save.

Parameters:
anadama2.tracked.auto(x)[source]

Translate a string, function or task into the appropriate subclass of anadama2.tracked.Base. Tildes and shell variables are expanded using os.path.expanduser() and os.path.expandvars(). If that’s not your game, use anadama2.tracked.TrackedDirectory or anadama2.tracked.HugeTrackedFile as appropriate. The current mapping is as follows:

Parameters:x – The object to be translated into a dependency object
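
The expansion step mentioned above uses two standard library functions. The sketch below shows what they do to a path string; the environment variable DEMO_DATA_DIR and the file names are hypothetical, and the order in which auto() applies the two calls is an assumption.

```python
import os

# Hypothetical variable, set here only so the example is self-contained.
os.environ["DEMO_DATA_DIR"] = "/tmp/demo"

raw = "$DEMO_DATA_DIR/reads.fastq"
expanded = os.path.expanduser(os.path.expandvars(raw))

# A leading tilde is replaced with the current user's home directory.
home_relative = os.path.expanduser("~/reads.fastq")
```

A file name that contains a literal "$" or a leading "~" would be altered by this step, which is one reason to construct the tracked class directly in such cases.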