<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 1st March 2004), see www.w3.org" />
<meta name="GENERATOR" content="Quanta Plus" />
<meta http-equiv="Content-Type" content=
"text/html; charset=us-ascii" />
<link rel="stylesheet" href="../../../../boost.css" type="text/css"/>
<link rel="stylesheet" href="ublas.css" type="text/css" />
<script type="text/javascript" src="js/jquery-1.3.2.min.js" async="async" ></script>
<script type="text/javascript" src="js/jquery.toc-gw.js" async="async" ></script>
<title>uBLAS operations overview</title>
</head>
<body>
<h1><img src="../../../../boost.png" align="middle" />Overview of Tensor, Matrix and Vector Operations</h1>
<div class="toc" id="toc"></div>
<dl>
<dt>Contents:</dt>
<dd><a href="#blas">Basic Linear Algebra</a></dd>
<dd><a href="#advanced">Advanced Functions</a></dd>
<dd><a href="#sub">Submatrices, Subvectors</a></dd>
<dd><a href="#speed">Speed Improvements</a></dd>
</dl>
<h2>Definitions</h2>
<table style="" summary="notation">
<tr><td><code>X, Y, Z</code></td>
<td>are tensors</td></tr>
<tr><td><code>A, B, C</code></td>
<td>are matrices</td></tr>
<tr><td><code>u, v, w</code></td>
<td>are vectors</td></tr>
<tr><td><code>i, j, k</code></td>
<td>are integer values</td></tr>
<tr><td><code>t, t1, t2</code></td>
<td>are scalar values</td></tr>
<tr><td><code>r, r1, r2</code></td>
<td>are <a href="range.html">ranges</a>, e.g. <code>range(0, 3)</code></td></tr>
<tr><td><code>s, s1, s2</code></td>
<td>are <a href="range.html#slice">slices</a>, e.g. <code>slice(0, 1, 3)</code></td></tr>
</table>
<h2><a name="blas">Basic Linear Algebra</a></h2>
<h3>standard operations: addition, subtraction, multiplication by a
scalar</h3>
<pre><code>
X = Y + Z; X = Y - Z; X = -Y;
C = A + B; C = A - B; C = -A;
w = u + v; w = u - v; w = -u;
X = t * Y; Y = X * t; X = Y / t;
C = t * A; C = A * t; C = A / t;
w = t * u; w = u * t; w = u / t;
</code></pre>
<h3>computed assignments</h3>
<pre><code>
X += Y; X -= Y;
C += A; C -= A;
w += u; w -= u;
X *= t; X /= t;
C *= t; C /= t;
w *= t; w /= t;
</code></pre>
<h3>inner, outer and other products</h3>
<pre><code>
t = inner_prod(u, v);
C = outer_prod(u, v);
w = prod(A, u); w = prod(u, A); w = prec_prod(A, u); w = prec_prod(u, A);
C = prod(A, B); C = prec_prod(A, B);
w = element_prod(u, v); w = element_div(u, v);
C = element_prod(A, B); C = element_div(A, B);
</code></pre>
<h3>tensor products</h3>
<pre><code>
Z = prod(X, v, t);
Z = prod(X, A, t);
Z = prod(X, Y, p);
Z = prod(X, Y, pa, pb);
t = inner_prod(X, Y);
Z = outer_prod(X, Y);
</code></pre>
<h3>transformations</h3>
<pre><code>
w = conj(u); w = real(u); w = imag(u);
C = trans(A); C = conj(A); C = herm(A); C = real(A); C = imag(A);
Z = trans(X); Z = conj(X); Z = real(X); Z = imag(X);
</code></pre>
<h2><a name="advanced">Advanced functions</a></h2>
<h3>norms</h3>
<pre><code>
t = norm_inf(v); i = index_norm_inf(v);
t = norm_1(v); t = norm_2(v);
t = norm_2_square(v);
t = norm_inf(A); i = index_norm_inf(A);
t = norm_1(A); t = norm_frobenius(A);
t = norm(X);
</code></pre>
<h3>products</h3>
<pre><code>
axpy_prod(A, u, w, true); // w = A * u
axpy_prod(A, u, w, false); // w += A * u
axpy_prod(u, A, w, true); // w = trans(A) * u
axpy_prod(u, A, w, false); // w += trans(A) * u
axpy_prod(A, B, C, true); // C = A * B
axpy_prod(A, B, C, false); // C += A * B
</code></pre>
<p><em>Note:</em> The last argument (<code>bool init</code>) of
<code>axpy_prod</code> is optional. Currently it defaults to
<code>true</code>, but this may change in the future. Setting
<code>init</code> to <code>true</code> is equivalent to calling
<code>w.clear()</code> before <code>axpy_prod</code>.
There are some specialisations for products of compressed matrices that give a
large speedup compared to <code>prod</code>.</p>
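<p>The effect of the <code>init</code> flag can be pictured with a plain C++ sketch.
The function <code>axpy_prod_sketch</code> below is hypothetical and works on
<code>std::vector</code> rather than uBLAS types. It only mirrors the semantics described
in the note (zero the result first, then accumulate) and is not the library's implementation.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense row-major matrix stored as a vector of rows (illustration only).
using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Hypothetical stand-in for axpy_prod(A, u, w, init):
//   init == true   computes w  = A * u
//   init == false  computes w += A * u
void axpy_prod_sketch(const Matrix& A, const Vector& u, Vector& w, bool init = true) {
    assert(A.size() == w.size());
    if (init) {
        // Same effect as zeroing w before accumulating.
        for (double& x : w) x = 0.0;
    }
    for (std::size_t i = 0; i < A.size(); ++i) {
        assert(A[i].size() == u.size());
        for (std::size_t j = 0; j < u.size(); ++j) {
            w[i] += A[i][j] * u[j];
        }
    }
}
```

<p>Calling the sketch twice on the same <code>w</code>, first with <code>true</code> and then with
<code>false</code>, leaves <code>w</code> holding twice the product, which is the accumulate behaviour
the comments in the listing describe.</p>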
<pre><code>
w = block_prod&lt;matrix_type, 64&gt; (A, u); // w = A * u
w = block_prod&lt;matrix_type, 64&gt; (u, A); // w = trans(A) * u
C = block_prod&lt;matrix_type, 64&gt; (A, B); // C = A * B
</code></pre>
<p><em>Note:</em> The blocksize can be any integer. However, the
actual speed depends very significantly on the combination of blocksize,
CPU and compiler. The function <code>block_prod</code> is designed
for large dense matrices.</p>
<h3>rank-k updates</h3>
<pre><code>
opb_prod(A, B, C, true); // C = A * B
opb_prod(A, B, C, false); // C += A * B
</code></pre>
<p><em>Note:</em> The last argument (<code>bool init</code>) of
<code>opb_prod</code> is optional. Currently it defaults to
<code>true</code>, but this may change in the future. This function
may give a speedup if <code>A</code> has fewer columns than rows,
because the product is computed as a sum of outer products.</p>
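<p>The phrase "sum of outer products" can be sketched in plain C++ as follows.
<code>opb_prod_sketch</code> is a hypothetical helper on <code>std::vector</code> storage, written
under the assumption that one rank-1 update is added per column of <code>A</code>; it is not the
uBLAS implementation.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical sketch of C (+)= A * B computed as a sum of outer products:
// for each k, add outer_prod(column k of A, row k of B) to C.
void opb_prod_sketch(const Matrix& A, const Matrix& B, Matrix& C, bool init = true) {
    const std::size_t m = A.size();
    const std::size_t inner = B.size();             // columns of A == rows of B
    const std::size_t n = inner ? B[0].size() : 0;
    assert(C.size() == m);
    if (init) {
        for (auto& row : C) row.assign(n, 0.0);     // start from a zero matrix
    }
    for (std::size_t k = 0; k < inner; ++k) {       // one rank-1 update per k
        for (std::size_t i = 0; i < m; ++i) {
            assert(A[i].size() == inner && C[i].size() == n);
            for (std::size_t j = 0; j < n; ++j) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
}
```

<p>When <code>A</code> has few columns, the outer loop runs only a few times and each pass updates
the whole of <code>C</code>, which is the situation the note singles out.</p>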
<h2><a name="sub">Submatrices, Subvectors</a></h2>
<p>Accessing submatrices and subvectors via <b>proxies</b> using <code>project</code> functions:</p>
<pre><code>
w = project(u, r); // the subvector of u specified by the index range r
w = project(u, s); // the subvector of u specified by the index slice s
C = project(A, r1, r2); // the submatrix of A specified by the two index ranges r1 and r2
C = project(A, s1, s2); // the submatrix of A specified by the two index slices s1 and s2
w = row(A, i); w = column(A, j); // a row or column of matrix as a vector
</code></pre>
<p>Assigning to submatrices and subvectors via <b>proxies</b> using <code>project</code> functions:</p>
<pre><code>
project(u, r) = w; // assign the subvector of u specified by the index range r
project(u, s) = w; // assign the subvector of u specified by the index slice s
project(A, r1, r2) = C; // assign the submatrix of A specified by the two index ranges r1 and r2
project(A, s1, s2) = C; // assign the submatrix of A specified by the two index slices s1 and s2
row(A, i) = w; column(A, j) = w; // assign a row or column of matrix as a vector
</code></pre>
<p><em>Note:</em> A range <code>r = range(start, stop)</code>
contains all indices <code>i</code> with <code>start &lt;= i &lt;
stop</code>. A slice is something more general. The slice
<code>s = slice(start, stride, size)</code> contains the indices
<code>start, start+stride, ..., start+(size-1)*stride</code>. The
stride can be 0 or negative! If <code>start &gt;= stop</code> for a range
or <code>size == 0</code> for a slice then it contains no elements.</p>
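<p>The index sets described in the note can be enumerated with two small helpers.
<code>range_indices</code> and <code>slice_indices</code> are hypothetical names used only for this
illustration; they are not part of uBLAS and simply restate the definitions above in code.</p>

```cpp
#include <cstddef>
#include <vector>

// Indices in range(start, stop): start <= i < stop; empty if start >= stop.
std::vector<std::ptrdiff_t> range_indices(std::ptrdiff_t start, std::ptrdiff_t stop) {
    std::vector<std::ptrdiff_t> out;
    for (std::ptrdiff_t i = start; i < stop; ++i) out.push_back(i);
    return out;
}

// Indices in slice(start, stride, size):
// start, start + stride, ..., start + (size - 1) * stride.
std::vector<std::ptrdiff_t> slice_indices(std::ptrdiff_t start, std::ptrdiff_t stride,
                                          std::size_t size) {
    std::vector<std::ptrdiff_t> out;
    for (std::size_t k = 0; k < size; ++k) {
        out.push_back(start + static_cast<std::ptrdiff_t>(k) * stride);
    }
    return out;
}
```

<p>For instance, <code>slice_indices(4, -2, 3)</code> yields 4, 2, 0 and
<code>slice_indices(1, 0, 2)</code> yields 1, 1, which shows the negative and zero strides
mentioned in the note.</p>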
<p>Sub-ranges and sub-slices of vectors and matrices can be created directly with the <code>subrange</code> and <code>subslice</code> functions:</p>
<pre><code>
w = subrange(u, 0, 2); // the 2 element subvector of u
w = subslice(u, 0, 1, 2); // the 2 element subvector of u
C = subrange(A, 0,2, 0,3); // the 2x3 element submatrix of A
C = subslice(A, 0,1,2, 0,1,3); // the 2x3 element submatrix of A
subrange(u, 0, 2) = w; // assign the 2 element subvector of u
subslice(u, 0, 1, 2) = w; // assign the 2 element subvector of u
subrange(A, 0,2, 0,3) = C; // assign the 2x3 element submatrix of A
subslice(A, 0,1,2, 0,1,3) = C; // assign the 2x3 element submatrix of A
</code></pre>
<p>There are two more ways to access some matrix elements as a
vector:</p>
<pre><code>matrix_vector_range&lt;matrix_type&gt; (A, r1, r2);
matrix_vector_slice&lt;matrix_type&gt; (A, s1, s2);
</code></pre>
<p><em>Note:</em> These matrix proxies take a sequence of elements
of a matrix and allow you to access these as a vector. In
particular <code>matrix_vector_slice</code> can do this in a very
general way. <code>matrix_vector_range</code> is less useful as the
elements must lie along a diagonal.</p>
<p><em>Example:</em> To access the first two elements of a
sub-column of a matrix we access the row with a slice with stride 1 and
the column with a slice with stride 0 thus:<br />
<code>matrix_vector_slice&lt;matrix_type&gt; (A, slice(0,1,2),
slice(0,0,2));
</code></p>
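<p>The pairing of the two slices can be made explicit with a short sketch. It assumes that the
k-th element of the proxy is <code>A(s1(k), s2(k))</code>, which is what the example above suggests;
the <code>Slice</code> struct and <code>paired_indices</code> are hypothetical and not uBLAS types.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A slice described by (start, stride, size), as in slice(start, stride, size).
struct Slice {
    std::ptrdiff_t start;
    std::ptrdiff_t stride;
    std::size_t size;
};

// (row, column) pairs visited when a row slice and a column slice of equal
// size are walked in lockstep.
std::vector<std::pair<std::ptrdiff_t, std::ptrdiff_t>>
paired_indices(const Slice& rows, const Slice& cols) {
    assert(rows.size == cols.size);
    std::vector<std::pair<std::ptrdiff_t, std::ptrdiff_t>> out;
    for (std::size_t k = 0; k < rows.size; ++k) {
        const std::ptrdiff_t step = static_cast<std::ptrdiff_t>(k);
        out.emplace_back(rows.start + step * rows.stride,
                         cols.start + step * cols.stride);
    }
    return out;
}
```

<p>With <code>Slice{0, 1, 2}</code> for the rows and <code>Slice{0, 0, 2}</code> for the columns the
pairs are (0,0) and (1,0), that is, the first two elements of column 0. Equal strides of 1 in both
slices walk along a diagonal instead.</p>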
<h2><a name="speed">Speed improvements</a></h2>
<h3><a name='noalias'>Matrix / Vector assignment</a></h3>
<p>If you know for sure that the left hand expression and the right
hand expression have no common storage, then assignment has
no <em>aliasing</em>. A more efficient assignment can be specified
in this case:</p>
<pre><code>noalias(C) = prod(A, B);
</code></pre>
<p>This avoids the creation of a temporary matrix that is required in a normal assignment.
'noalias' assignment requires that the left and right hand side be size conformant.</p>
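<p>Why aliasing matters can be seen in a plain C++ sketch that does not use uBLAS at all.
The two hypothetical functions below contrast writing a matrix-vector product straight into the
destination (the <code>noalias</code> style) with evaluating into a temporary first (the normal
assignment style).</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Writes A * x straight into y with no temporary.
// Correct only if y and x do not share storage.
void prod_into(const Matrix& A, const Vector& x, Vector& y) {
    assert(y.size() == A.size());
    for (std::size_t i = 0; i < A.size(); ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) sum += A[i][j] * x[j];
        y[i] = sum;  // if y aliases x, later rows read this overwritten value
    }
}

// Safe under aliasing: evaluate into a temporary, then copy.
void prod_with_temporary(const Matrix& A, const Vector& x, Vector& y) {
    Vector tmp(A.size());
    prod_into(A, x, tmp);
    y = tmp;
}
```

<p>With the 2x2 swap matrix and <code>v = (1, 2)</code>, calling
<code>prod_with_temporary(A, v, v)</code> gives the expected <code>(2, 1)</code>, whereas
<code>prod_into(A, v, v)</code> gives <code>(2, 2)</code> because the first element is overwritten
before the second row reads it. The direct version is only safe when source and destination are
distinct.</p>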
<h3>Sparse element access</h3>
<p>The matrix element access function <code>A(i1,i2)</code> or the equivalent vector
element access functions (<code>v(i)</code> or <code>v[i]</code>) usually create 'sparse element proxies'
when applied to a sparse matrix or vector. These <em>proxies</em> allow access to elements
without having to worry about nasty C++ issues where references are invalidated.</p>
<p>These 'sparse element proxies' can be implemented more efficiently when applied to <code>const</code>
objects.
Sadly in C++ there is no way to distinguish between an element access on the left and right hand side of
an assignment. Most often elements on the right hand side will not be changed and therefore it would
be better to use the <code>const</code> proxies. We can do this by making the matrix or vector
<code>const</code> before accessing its elements. For example:</p>
<pre><code>value = const_cast&lt;const VEC&amp;&gt;(v)[i]; // VEC is the type of v
</code></pre>
<p>If more than one element needs to be accessed, <code>const_iterator</code>s should be used
in preference to <code>iterator</code>s for the same reason. For the more daring, 'sparse element proxies'
can be completely turned off in uBLAS by defining the configuration macro <code>BOOST_UBLAS_NO_ELEMENT_PROXIES</code>.
</p>
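<p>The difference between const and non-const element access can be illustrated with a toy
container. <code>ToySparseVector</code> and <code>read_only</code> are hypothetical and much
simpler than the uBLAS sparse types: here the non-const access inserts a missing entry, whereas
uBLAS hands out an element proxy, but the overload selection through a const reference works the
same way.</p>

```cpp
#include <cstddef>
#include <map>

// Toy sparse vector (not a uBLAS type): only touched entries are stored.
class ToySparseVector {
public:
    // Non-const access must hand out something writable, so it creates
    // the entry if it is missing.
    double& operator[](std::size_t i) { return data_[i]; }

    // Const access can simply look the value up and return a copy.
    double operator[](std::size_t i) const {
        auto it = data_.find(i);
        return it == data_.end() ? 0.0 : it->second;
    }

    // Number of entries currently stored.
    std::size_t stored() const { return data_.size(); }

private:
    std::map<std::size_t, double> data_;
};

// Read element i through a const reference so the const overload is chosen.
inline double read_only(ToySparseVector& v, std::size_t i) {
    return const_cast<const ToySparseVector&>(v)[i];
}
```

<p>Reading a missing element through <code>read_only</code> leaves <code>stored()</code> unchanged,
while a plain <code>v[i]</code> on the non-const object adds an entry even though the value is only
read.</p>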
<h3>Controlling the complexity of nested products</h3>
<p>What is the complexity (the number of add and multiply operations) required to compute the following?
</p>
<pre>
R = prod(A, prod(B,C));
</pre>
<p>Firstly, the complexity depends on matrix size. Also, since prod is associative (though not commutative),
the product can be bracketed in different ways, and the bracket order affects the complexity.
</p>
<p>uBLAS evaluates expressions without matrix or vector temporaries and honours
the bracketing structure. However, avoiding temporaries for nested products unnecessarily increases the complexity.
Conversely, by explicitly using temporary matrices the complexity of a nested product can be reduced.
</p>
<p>uBLAS provides 3 alternative syntaxes for this purpose:
</p>
<pre>
temp_type T = prod(B,C); R = prod(A,T); // Preferable if T is preallocated
</pre>
<pre>
R = prod(A, temp_type(prod(B,C)));
</pre>
<pre>
R = prod(A, prod&lt;temp_type&gt;(B,C));
</pre>
<p>The choice of <code>temp_type</code> is important. If A, B and C are all of the same type, say
matrix&lt;float&gt;, the choice is easy. However, if the value_type is mixed (int with float or double)
or the matrix type is mixed (sparse with symmetric), the best solution is not so obvious. It is up to you! It
depends on the numerical properties of A and of the result of prod(B,C).
</p>
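<p>To make the cost difference concrete, the following sketch counts scalar multiplications for
<code>R = prod(A, prod(B,C))</code> with <code>A</code> of size m x k, <code>B</code> of size k x l
and <code>C</code> of size l x n. It assumes a straightforward element-wise evaluation in which
every access to an element of <code>prod(B,C)</code> recomputes it; the function names are
hypothetical and the counts are a model, not a measurement of uBLAS.</p>

```cpp
#include <cstddef>

// Nested expression without a temporary: each of the m*n*k reads of
// (B*C)(p, j) recomputes an inner product of length l, followed by one
// multiplication with A(i, p).
constexpr std::size_t mults_without_temporary(std::size_t m, std::size_t k,
                                              std::size_t l, std::size_t n) {
    return m * n * k * (l + 1);
}

// With an explicit temporary T = B * C (k x n), followed by R = A * T.
constexpr std::size_t mults_with_temporary(std::size_t m, std::size_t k,
                                           std::size_t l, std::size_t n) {
    return k * n * l + m * n * k;
}
```

<p>For four 100 x 100 operands this model gives 101,000,000 multiplications without a temporary
and 2,000,000 with one, which is the kind of gap the temporary-based syntaxes are meant to close.</p>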
<hr />
<p>Copyright (&copy;) 2000-2007 Joerg Walter, Mathias Koch, Gunter
Winkler, Michael Stevens<br />
Use, modification and distribution are subject to the
Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt
or copy at <a href="http://www.boost.org/LICENSE_1_0.txt">
http://www.boost.org/LICENSE_1_0.txt
</a>).
</p>
<script type="text/javascript">
(function($) {
$('#toc').toc();
})(jQuery);
</script>
</body>
</html>