The GraphLab framework provides a collection of built-in basic schedules as well as some more exotic specialized dynamic schedules. Here, we provide a brief description of each scheduler as well as the options available for each scheduler.
There are many ways to provide options to the scheduler:
core.sched_options().add_option("optionname", VALUE)where VALUE can be any type. (sched_options will cast to the right type as needed automatically).
The static schedulers ignore add_task calls, and typically repeat a predefined sequence until termination.
The chromatic scheduler supports only a single update function and applies it in sweeps over the graph. The chromatic reads the color of each vertex from the graphlab::graph::color() function. The scheduler then executes all vertices of the same color before proceeding to the next color. Within a color, processors may execute update functions on multiple vertices simultaneously. However, the chromatic scheduler ensures that at no point are vertices of different colors executed simultaneously.
The color scheduler can be used to obtain a Gauss-Seidel parallel execution of a sequential algorithm by first computing a coloring of the underlying model using graphlab::graph::compute_coloring(). As long as the update function does not modify the data on neighboring vertices, the parallel execution will be identical to all sequential executions in which the vertices are updated in order of color.
This scheduler executes repeated cycles through all vertices. The add_task() functions can be used to associate different update functions with each vertex.
This is experimental and should not be used. It is not part of the official release.
The dynamic schedulers rely on the graphlab::icallback in the update functions to receive new tasks (and potentially task priorities). These tasks are then incorporated into the execution schedule. These schedulers are called dynamic schedulers because they rely on the computation in the update functions to determine what to execute and when.
All of the dynamic schedulers have an internal task de-duplication mechanism that prevents the same task from occurring more than once in the queue. Schedulers that use priorities typically take the maximum priority of all duplicate entries and assign that to a single instance in the queue. There are quite a few dynamic schedulers in GraphLab and below is a list of several of the most popular:
The fifo scheduler executes tasks in the classical first in first out order. When update functions generate tasks they are added to the back of the fifo queue. Because there is a single central queue the fifo scheduler can become a synchronizing bottleneck for algorithms with relatively light update functions.
The Multiqueue fifo scheduler is like the fifo scheduler but instead of using a single queue multiple queues (2 x ncpus) are used and tasks are added to queues using a randomized balancing strategy. Each processor only draws from its own pair of queue.
The priority scheduler maintains a single priority scheduling queue. The task with highest priority is always executed next. If add_task() is invoked on an already present task the existing task's priority is set to the max of the two tasks.
Same as the priority scheduler except multiple queues are maintained.
The clustered priority schedule maintains a priority queue over blocks of vertices. This schedule begins by partitioning the graph using the partitioning method into blocks which each contain vert_per_part vertices. The priorities are maintained over blocks but within each block a sweep schedule is run.
This scheduler loops over vertices executing a task if one is associated with the vertex. Each vertex maintains its own local queue. This scheduler has the least possible overhead.
We currently only provide the Splash scheduler which grows small spanning trees for each cpu and then sequential executes the update function in a forward backward fashion. This scheduler is loosely based on the Splash BP algorithm by Gonzalez et al. AISTATS 2009.