
graphlab::distributed_core< VertexType, EdgeType > Class Template Reference

A GraphLab core is the base (or core) data structure in GraphLab.

#include <distributed_core.hpp>


Public Types

typedef graphlab::distributed_types< graphlab::distributed_graph< VertexType, EdgeType > > distributed_types

Public Member Functions

 distributed_core (distributed_control &dc, std::string atomindex)
 ~distributed_core ()
distributed_types::distributed_graph & graph ()
const distributed_types::distributed_graph & graph () const
void set_scheduler_type (const std::string &scheduler_type)
 Set the type of scheduler. The engine must not be constructed yet. All machines must call simultaneously.
void set_scope_type (const std::string &scope_type)
 Set the scope consistency model used in this engine.
void set_engine_type (const std::string &engine_type)
 Set the engine type.
void set_metrics_type (const std::string &metrics_type)
 Sets the output format of any recorded metrics. This function is parallel.
void set_ncpus (size_t ncpus)
 Set the number of cpus that the engine will use.
distributed_types::iengine & engine ()
bool build_engine ()
 Constructs the engine using the currently defined options. Once an engine is constructed, options cannot be modified. All machines must call simultaneously.
void set_engine_options (const engine_options &opts)
 Set the engine options by passing in an engine options object. The engine must not be constructed yet. All machines must call simultaneously.
imetrics_reporter & get_reporter ()
 Gets the reporter. This function is parallel.
const engine_options & get_engine_options () const
 Returns the engine options. This function is parallel.
scheduler_options & sched_options ()
 Returns a modifiable reference to the scheduler options.
const scheduler_options & sched_options () const
 Returns a constant reference to the scheduler options. This function is parallel.
bool parse_engine_options (int argc, char **argv)
 Set the engine options by simply parsing the command line arguments. The engine must not be constructed yet. All machines must call simultaneously.
double start ()
 Run the engine until a termination condition is reached or there are no more tasks remaining to execute. This function will call build_engine() internally if the engine has not yet been constructed. All machines must call simultaneously.
void add_task (vertex_id_t vertex, typename distributed_types::update_function func, double priority)
 Add a single update function to a single vertex. This function is parallel. Engine must have been constructed using build_engine() prior to calling this function.
void add_task (typename distributed_types::update_task task, double priority)
 Add a single task with a fixed priority. This function is parallel. Engine must have been constructed using build_engine() prior to calling this function.
void add_tasks (const std::vector< vertex_id_t > &vertices, typename distributed_types::update_function func, double priority)
 Add the update function to all the vertices in the provided vector with the given priority. This function is parallel. Engine must have been constructed using build_engine() prior to calling this function.
void add_task_to_all (typename distributed_types::update_function func, double priority)
 Add the given function to all vertices using the given priority. This function is parallel. Engine must have been constructed using build_engine() prior to calling this function.
size_t last_update_count ()
 Get the number of updates executed by the engine. This function is parallel. Engine must have been constructed using build_engine() prior to calling this function.
void fill_metrics ()
 Fills the metrics with the engine options. This function is parallel.
void reset_metrics ()
 Clears all recorded metrics. This function is parallel.
void report_metrics ()
 Outputs the recorded metrics. This function is parallel.
void set_sync (distributed_glshared_base &shared, typename distributed_types::iengine::sync_function_type sync, glshared_base::apply_function_type apply, const any &zero, size_t sync_interval, typename distributed_types::iengine::merge_function_type merge, vertex_id_t rangelow=0, vertex_id_t rangehigh=-1)
 Registers a sync with the engine.
void sync_now (glshared_base &shared)

Detailed Description

template<typename VertexType, typename EdgeType>
class graphlab::distributed_core< VertexType, EdgeType >

A GraphLab core is the base (or core) data structure in GraphLab.

This is like graphlab::core but for the distributed setting.

The core is templatized over VertexType and EdgeType. However, by using the ref types typedef, one can create a core as follows:

     gl::distributed_core glcore(dc, atomindex);

The core contains the

The core also manages the engine and scheduler construction parameters.

The distributed core is more limited than the shared-memory graphlab::core. In particular, engine construction must be executed manually through build_engine(), and the engine options and scheduler options cannot be modified after engine construction.

Also, some functions must be called by all machines simultaneously, while others are "parallel", allowing any machine to call the function separately. This behavior is documented for each function. The user must take care to obey this requirement; violating it may result in unexpected behavior.
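The ordering constraint described above (configure, then build, then run) can be illustrated with a small stand-in class. This is a sketch only and does not use the GraphLab headers: `core_lifecycle_sketch`, its accessors, and the choice to throw `std::logic_error` on a late option change are assumptions made for illustration. The documentation states only that options cannot be modified after engine construction, not how a violation is reported.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the option lifecycle of a core; not GraphLab code.
class core_lifecycle_sketch {
 public:
  // Options may only be changed while no engine has been built.
  void set_ncpus(std::size_t ncpus) {
    require_not_built("set_ncpus");
    ncpus_ = ncpus;
  }
  void set_scope_type(const std::string& scope_type) {
    require_not_built("set_scope_type");
    scope_type_ = scope_type;
  }

  // Freezes the current options. Returns true on success.
  bool build_engine() {
    built_ = true;
    return true;
  }

  // Mirrors the documented behavior: start() builds the engine if needed.
  double start() {
    if (!built_) build_engine();
    return 0.0;  // placeholder value; the real return value is not described here
  }

  bool engine_built() const { return built_; }
  std::size_t ncpus() const { return ncpus_; }
  const std::string& scope_type() const { return scope_type_; }

 private:
  // Assumption: a late change is reported by throwing.
  void require_not_built(const char* what) const {
    if (built_) {
      throw std::logic_error(std::string(what) + " called after build_engine()");
    }
  }

  bool built_ = false;
  std::size_t ncpus_ = 1;
  std::string scope_type_;
};

// Returns true if changing an option after start() is rejected
// and the previously configured value is left untouched.
bool late_option_change_is_rejected() {
  core_lifecycle_sketch core;
  core.set_ncpus(4);
  core.set_scope_type("edge");
  core.start();  // builds the engine implicitly
  assert(core.engine_built());
  try {
    core.set_ncpus(8);
  } catch (const std::logic_error&) {
    return core.ncpus() == 4;
  }
  return false;
}
```

In a real distributed program every machine would have to perform the configuration and build steps together, which this single-process sketch does not model.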

Definition at line 86 of file distributed_core.hpp.


Constructor & Destructor Documentation

template<typename VertexType , typename EdgeType >
graphlab::distributed_core< VertexType, EdgeType >::distributed_core ( distributed_control &  dc,
std::string  atomindex 
) [inline]

Constructor. The graph is constructed using the atom index. All machines must construct simultaneously.

Definition at line 94 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
graphlab::distributed_core< VertexType, EdgeType >::~distributed_core (  )  [inline]

Destructor. All machines must call simultaneously.

Definition at line 111 of file distributed_core.hpp.


Member Function Documentation

template<typename VertexType , typename EdgeType >
distributed_types::iengine& graphlab::distributed_core< VertexType, EdgeType >::engine (  )  [inline]

Get a reference to the active engine. build_engine() must be called prior to this. This function is parallel.

Definition at line 225 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
distributed_types::distributed_graph& graphlab::distributed_core< VertexType, EdgeType >::graph (  )  [inline]

Get a modifiable reference to the graph associated with this core. This function is parallel.

Definition at line 124 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
const distributed_types::distributed_graph& graphlab::distributed_core< VertexType, EdgeType >::graph (  )  const [inline]

Get a constant reference to the graph associated with this core. This function is parallel.

Definition at line 129 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
scheduler_options& graphlab::distributed_core< VertexType, EdgeType >::sched_options (  )  [inline]

Returns a modifiable reference to the scheduler options.

This function is parallel, but any modifications to the options must be made the same way across all machines.

Definition at line 290 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
void graphlab::distributed_core< VertexType, EdgeType >::set_engine_type ( const std::string &  engine_type  )  [inline]

Set the engine type.

The engine must not be constructed yet. All machines must call simultaneously.

  • "dist_locking" Distributed engine with consistency ensured through locking
  • "dist_chromatic" Distributed engine with consistency ensured through coloring

Definition at line 175 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
void graphlab::distributed_core< VertexType, EdgeType >::set_metrics_type ( const std::string &  metrics_type  )  [inline]

Sets the output format of any recorded metrics. This function is parallel.

  • "none" No reporting
  • "basic" Outputs to screen
  • "file" Outputs to a text file graphlab_metrics.txt
  • "html" Outputs to a html file graphlab_metrics.html

Definition at line 190 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
void graphlab::distributed_core< VertexType, EdgeType >::set_ncpus ( size_t  ncpus  )  [inline]

Set the number of cpus that the engine will use.

The engine must not be constructed yet. All machines must call simultaneously.

Definition at line 214 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
void graphlab::distributed_core< VertexType, EdgeType >::set_scope_type ( const std::string &  scope_type  )  [inline]

Set the scope consistency model used in this engine.

The engine must not be constructed yet. All machines must call simultaneously. The available scopes are:

  • "full" This ensures full data consistency within the scope
  • "edge" This ensures data consistency with just the vertex and edges
  • "vertex" This ensures that a vertex cannot be updated by two processors simultaneously

See Scopes for details.
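The three scope names can be read as increasing levels of protection around the vertex being updated. The sketch below encodes that reading as a lookup. It is not GraphLab code: `scope_coverage` and `coverage_for` are hypothetical names, the interpretation of "full" as also covering neighboring vertices is an assumption, and throwing on an unknown name is a choice made for this example only.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical summary of which data an update may touch consistently.
struct scope_coverage {
  bool center_vertex;      // the vertex being updated
  bool adjacent_edges;     // edges incident to that vertex
  bool adjacent_vertices;  // neighboring vertices (assumed for "full")
};

// Maps a scope name from the list above to the data it is assumed to cover.
scope_coverage coverage_for(const std::string& scope_type) {
  if (scope_type == "vertex") return {true, false, false};
  if (scope_type == "edge") return {true, true, false};
  if (scope_type == "full") return {true, true, true};
  throw std::invalid_argument("unknown scope type: " + scope_type);
}

// Small usage check: "edge" covers the vertex and its edges only.
bool edge_scope_excludes_neighbors() {
  const scope_coverage c = coverage_for("edge");
  assert(c.center_vertex);
  return c.adjacent_edges && !c.adjacent_vertices;
}
```

A stronger scope generally permits fewer updates to run at the same time, so the weakest scope that keeps the update function correct is usually the one to choose.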

Definition at line 157 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
void graphlab::distributed_core< VertexType, EdgeType >::set_sync ( distributed_glshared_base &  shared,
typename distributed_types::iengine::sync_function_type  sync,
glshared_base::apply_function_type  apply,
const any &  zero,
size_t  sync_interval,
typename distributed_types::iengine::merge_function_type  merge,
vertex_id_t  rangelow = 0,
vertex_id_t  rangehigh = -1 
) [inline]

Registers a sync with the engine.

All machines must call simultaneously.

The sync will be performed approximately every sync_interval updates and will perform a reduction over all vertices from rangelow to rangehigh inclusive. The merge function may be NULL, in which case it will not be used. However, providing a merge function is highly recommended, since it allows the sync operation to be parallelized.

The sync operation is guaranteed to be strictly sequentially consistent with all other execution.

Parameters:
shared The shared variable to synchronize
sync The reduction function
apply The final apply function which writes to the shared value
zero The initial zero value passed to the reduction
sync_interval Frequency at which the sync is initiated. Corresponds approximately to the number of update function calls before the sync is reevaluated. If 0, the sync will only be evaluated once at engine start, and will never be evaluated again.
merge Combines intermediate reduction values. Required.
rangelow The lower end of the vertex id range to sync. The range is inclusive, i.e. the vertex with id 'rangelow' and the vertex with id 'rangehigh' will both be included. Defaults to 0.
rangehigh The upper end of the vertex id range to sync. The range is inclusive, i.e. the vertex with id 'rangelow' and the vertex with id 'rangehigh' will both be included. Defaults to infinity (the signature default is -1).
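The roles of the sync, merge, and apply functions and of the zero value can be sketched as a plain fold over an inclusive vertex range. This is a minimal single-process illustration, not the GraphLab implementation: `accumulator`, `sync_fn`, `merge_fn`, `apply_fn`, and `run_sync` are hypothetical names, the vertex data is a plain vector, the zero value is assumed to be an identity for the merge, and vertex ids are assumed to be unsigned 32-bit so that a default of -1 wraps to the largest id.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: vertex ids are unsigned, so -1 converts to the maximum id.
using vertex_id = std::uint32_t;

// Hypothetical reduction state: running sum and element count.
struct accumulator {
  double sum = 0.0;
  std::size_t count = 0;
};

// sync: folds one vertex value into an accumulator.
void sync_fn(double vertex_value, accumulator& acc) {
  acc.sum += vertex_value;
  ++acc.count;
}

// merge: combines two partial accumulators.
void merge_fn(accumulator& into, const accumulator& other) {
  into.sum += other.sum;
  into.count += other.count;
}

// apply: writes the final result (here, the mean) to the shared value.
void apply_fn(double& shared, const accumulator& acc) {
  shared = acc.count ? acc.sum / static_cast<double>(acc.count) : 0.0;
}

// Reduces vertices in [rangelow, rangehigh] inclusive. The range is split into
// `parts` chunks that are folded independently and then merged, which is the
// step a merge function makes possible.
double run_sync(const std::vector<double>& vertex_data, const accumulator& zero,
                vertex_id rangelow = 0,
                vertex_id rangehigh = static_cast<vertex_id>(-1),
                std::size_t parts = 2) {
  double shared = 0.0;
  if (vertex_data.empty() || rangelow >= vertex_data.size() ||
      rangelow > rangehigh) {
    apply_fn(shared, zero);  // empty range: apply the zero value
    return shared;
  }
  // Clamp the inclusive upper bound to the last existing vertex.
  const std::size_t last =
      std::min<std::size_t>(rangehigh, vertex_data.size() - 1);
  const std::size_t n = last - rangelow + 1;
  parts = std::max<std::size_t>(1, std::min(parts, n));

  std::vector<accumulator> partial(parts, zero);
  for (std::size_t p = 0; p < parts; ++p) {
    const std::size_t begin = rangelow + p * n / parts;
    const std::size_t end = rangelow + (p + 1) * n / parts;
    for (std::size_t v = begin; v < end; ++v) sync_fn(vertex_data[v], partial[p]);
  }

  accumulator total = partial[0];
  for (std::size_t p = 1; p < parts; ++p) merge_fn(total, partial[p]);
  apply_fn(shared, total);
  return shared;
}
```

Because each chunk starts from its own copy of the zero value, the result does not depend on how many chunks are used, provided the merge is associative and the zero value is neutral for it.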

Definition at line 458 of file distributed_core.hpp.

template<typename VertexType , typename EdgeType >
void graphlab::distributed_core< VertexType, EdgeType >::sync_now ( glshared_base &  shared  ) 

Performs a sync immediately. This function requires that the shared variable already be registered with the engine. Not implemented.


The documentation for this class was generated from the following file: