A framework for out-of-core and parallel computing.
Here is an example DAG:
    using Dagger

    p = delayed(f; options...)(42)
    q = delayed(g)(p)
    r = delayed(h)(53)
    s = delayed(combine)(p, q, r)
The connections between the nodes p, q, r and s form a dependency graph: p feeds q and s, q feeds s, and r feeds s.
delayed(f; options...) returns a function which, when called, creates a Thunk object representing a call to the function f with the given arguments. If it is called with other thunks as input, they form a graph with the input nodes directed at the output. The function f gets the results of evaluating the input thunks.
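The behavior described above can be illustrated with a toy model in plain Julia. This is only a sketch, not Dagger's implementation: the names ToyThunk, toy_delayed, and evaluate are hypothetical, and the anonymous functions stand in for f, g, h, and combine.

```julia
# Toy model only: `ToyThunk`, `toy_delayed`, and `evaluate` are hypothetical
# names used for illustration; they are not part of Dagger.
struct ToyThunk
    f::Function
    inputs::Tuple
end

# Returns a function; calling it records the call instead of running `f`.
toy_delayed(f) = (args...) -> ToyThunk(f, args)

# Plain values are passed through unchanged.
evaluate(x) = x
# Thunks evaluate their inputs first, then call `f` on the results.
evaluate(t::ToyThunk) = t.f(map(evaluate, t.inputs)...)

p = toy_delayed(x -> x + 1)(42)
q = toy_delayed(x -> 2x)(p)
r = toy_delayed(x -> x - 3)(53)
s = toy_delayed(+)(p, q, r)

println(evaluate(s))  # 43 + 86 + 50
```

Nothing runs until evaluate is called on s. Unlike a real scheduler, this sketch re-evaluates p for each node that depends on it and does no distribution across workers.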
To compute and fetch the result of a thunk (say s), you can call collect(s). collect will fetch the result of the computation to the master process. Alternatively, if you want to compute but not fetch the result, you can call compute on the thunk. This will return a Chunk object which references the result. If you pass a Chunk object as an input to a delayed function, the function will be executed with the value of the Chunk. This evaluation will likely happen where the input chunks are, to reduce communication.
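As a concrete sketch of the two calls, the example DAG can be run with placeholder functions. The definitions of f, g, h, and combine below are assumptions chosen so the result is easy to check by hand, and the one-argument forms collect(s) and compute(s) are assumed to be available in the installed Dagger version.

```julia
using Dagger

# Placeholder functions (assumptions for illustration only).
f(x) = x + 1
g(x) = 2x
h(x) = x - 3
combine(xs...) = sum(xs)

p = delayed(f)(42)
q = delayed(g)(p)
r = delayed(h)(53)
s = delayed(combine)(p, q, r)

# Compute the whole graph and fetch the value to the calling process.
result = collect(s)   # 43 + 86 + 50

# Compute without fetching: `c` references the result instead of holding it.
c = compute(s)

# A Chunk can be an input to another delayed call; the function sees its value.
t = delayed(x -> x + 1)(c)
final = collect(t)
```

Here collect(s) runs p, q, and r before combine, while compute(s) leaves the result wherever it was produced until something asks for it.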
delayed accepts the following options:
get_result::Bool -- return the actual result to the scheduler instead of Chunk objects. Used when f explicitly constructs a Chunk or when the return value is small (e.g. in the case of reduce).
meta::Bool -- pass the input Chunk objects themselves to f, not the values contained in them. This is always run on the master process.
persist::Bool -- the result of this Thunk should not be released after it becomes unused in the DAG.
cache::Bool -- cache the result of this Thunk so that if the thunk is evaluated again, the cached value can be reused. If it has been removed from the cache, the value is recomputed.
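The cache semantics (reuse while cached, recompute after removal) can be modeled in a few lines of plain Julia. This is a toy sketch, not Dagger's caching code: CachedThunk, cached_thunk, evaluate!, and evict! are hypothetical names.

```julia
# Toy model only: all names here are hypothetical, not part of Dagger.
mutable struct CachedThunk
    f::Function
    args::Tuple
    cache::Union{Some,Nothing}   # `nothing` means "not in the cache"
    evaluations::Int             # how many times `f` actually ran
end

cached_thunk(f, args...) = CachedThunk(f, args, nothing, 0)

function evaluate!(t::CachedThunk)
    # Reuse the cached value when one is present.
    if t.cache !== nothing
        return something(t.cache)
    end
    # Otherwise run `f`, count the evaluation, and store the result.
    t.evaluations += 1
    value = t.f(t.args...)
    t.cache = Some(value)
    return value
end

# Simulates the value being removed from the cache.
evict!(t::CachedThunk) = (t.cache = nothing; t)

t = cached_thunk(x -> x^2, 7)
evaluate!(t)   # runs f
evaluate!(t)   # served from the cache
evict!(t)
evaluate!(t)   # runs f again
println(t.evaluations)
```

The second call does not run f; only the call after eviction does.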
When a worker finishes a thunk, it returns a Chunk object to the scheduler. In the example above, if a worker finishes p, it will be given q, since it will already have the result of p, which is input to q.
We thank DARPA, Intel, and the NIH for supporting this work at MIT.