overview.rst 6.1 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136
  1. .. Copyright (C) 2004-2008 The Trustees of Indiana University.
  2. Use, modification and distribution is subject to the Boost Software
  3. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  4. http://www.boost.org/LICENSE_1_0.txt)
  5. ===============================================
  6. An Overview of the Parallel Boost Graph Library
  7. ===============================================
  8. .. image:: ../graph.png
  9. :width: 206
  10. :height: 184
  11. :alt: An example graph
  12. :align: right
  13. The Parallel Boost Graph Library (Parallel BGL) is a C++ library for
  14. parallel, distributed computation on graphs. The Parallel BGL contains
  15. distributed graph data structures, distributed graph algorithms,
  16. abstractions over the communication medium (such as MPI), and
  17. supporting data structures. A graph (also called a *network*) consists
  18. of a set of *vertices* and a set of relationships between vertices,
  19. called *edges*. The edges may be *undirected*, meaning that the
  20. relationship between vertices is mutual, e.g., "X is related to Y", or
  21. they can be *directed*, meaning that the relationship goes only one
  22. way, e.g., "X is the child of Y". The following figure illustrates a
  23. typical directed graph, where *a-i* are the vertices and the arrows
  24. represent edges.
  25. .. image:: ../distributed-graph.png
  26. :width: 229
  27. :height: 199
  28. :alt: A distributed graph
  29. :align: right
  30. The Parallel BGL is primarily concerned with *distributed*
  31. graphs. Distributed graphs are conceptually graphs, but their storage
  32. is spread across multiple processors. The following figure
  33. demonstrates a distributed version of the graph above, where the graph
  34. has been divided among three processors (represented by the grey
  35. rectangles). Edges in the graph may be either local (with both
  36. endpoints stored on the same processor) or remote (the target of the
  37. edge is stored on a different processor).
  38. The Parallel BGL is a generic library. At its core are *generic*
  39. distributed graph algorithms, which can operate on any distributed
  40. graph data structure provided that data structure meets certain
  41. requirements. For instance, the algorithm may need to enumerate the
  42. set of vertices stored on the current processor, enumerate the set of
  43. outgoing edges from a particular vertex, and determine on which
  44. processor the target of each edge resides. The graph algorithms in the
  45. Parallel BGL are also generic with respect to the *properties*
  46. attached to edges and vertices in a graph; for instance, the weight of
  47. each edge can be stored as part of the graph or allocated in a
  48. completely separate data structure.
  49. The genericity available in the algorithms of the Parallel BGL allows
  50. them to be applied to existing graph data structures. However, most
  51. users will instead be writing new code that takes advantage of the
  52. Parallel BGL. The Parallel BGL provides distributed graph data
  53. structures that meet the requirements of the Parallel BGL
  54. algorithms. The primary data structure is the `distributed adjacency
  55. list`_, which allows storage and manipulation of a (distributed)
  56. graph. The vertices in the graph are divided among the various
  57. processors, and each of the edges outgoing from a vertex are stored on
  58. the processor that "owns" (stores) that vertex. The following figure
  59. illustrates the distributed adjacency list representation.
  60. .. image:: ../dist-adjlist.png
  61. :width: 446
  62. :height: 154
  63. :alt: A distributed adjacency list
  64. :align: center
  65. .. image:: ../dist-pmap.png
  66. :width: 271
  67. :height: 175
  68. :alt: A distributed property map
  69. :align: right
  70. The `distributed adjacency list`_ distributes the structure of a graph
  71. over multiple processors. While graph structure is in important part
  72. of many graph problems, there are typically other properties attached
  73. to the vertices and edges, such as edge weights or the position of
  74. vertices within a grid. These properties are stored in *property
  75. maps*, which associate a single piece of data with each edge or vertex
  76. in a graph. Distributed property maps extend this notion to
  77. distributed computing, where properties are stored on the same
  78. processor as the vertex or edge. The following figure illustrates the
  79. distribution of a property map storing colors (white, gray, black) for
  80. each vertex. In addition to the storage for each vertex, the
  81. processors store some "ghost cells" that cache values actually stored
  82. on other processors, represented by the dashed boxes.
  83. Tying together all of the distributed data structures of the Parallel
  84. BGL are its process groups and distributed graph algorithms. Process
  85. groups coordinate the interactions between multiple processes and
  86. distributed data structures by abstracting the communication
  87. mechanism. The algorithms are typically written using the SPMD model
  88. (Single Program, Multiple Data) and interact with both the distributed
  89. data structures and the process group itself. At various points in the
  90. algorithm's execution, all processes execute a synchronization point,
  91. which allows all of the distributed data structures to ensure an
  92. appropriate degree of consistency across processes. The following
  93. diagram illustrates the communication patterns within the the
  94. execution of a distributed algorithm in the Parallel BGL. In
  95. particular, the diagram illustrates the distributed data structures
  96. used in a distributed breadth-first search, from the top-left and
  97. proceeding clockwise:
  98. - a user-defined property map that tracks the distance from the
  99. source vertex to all other vertices,
  100. - an automatically-generated property map that tracks the "color"
  101. of vertices in the (distributed) graph, to determine which
  102. vertices have been seen before,
  103. - a distributed queue, which coordinates the breadth-first search
  104. and distributes new vertices to search, and
  105. - a distributed graph, on which the breadth-first search is
  106. operating.
  107. .. image:: ../architecture.png
  108. :width: 485
  109. :height: 410
  110. :alt: Parallel Boost Graph Library architecture
  111. :align: center
  112. ----------------------------------------------------------------------------
  113. Copyright (C) 2005 The Trustees of Indiana University.
  114. Authors: Douglas Gregor and Andrew Lumsdaine
  115. .. _Distributed adjacency list: distributed_adjacency_list.html
  116. .. _Process groups: