introduction.qbk 7.0 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126
  1. [/============================================================================
  2. Boost.Geometry Index
  3. Copyright (c) 2011-2013 Adam Wulkiewicz.
  4. Use, modification and distribution is subject to the Boost Software License,
  5. Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  6. http://www.boost.org/LICENSE_1_0.txt)
  7. =============================================================================/]
  8. [section Introduction]
  9. The __boost_geometry_index__ is intended to gather data structures called spatial
  10. indexes which may be used to accelerate searching for objects in space. In general,
  11. spatial indexes stores geometric objects' representations and allows searching for
  12. objects occupying some space or close to some point in space.
  13. Currently, only one spatial index is implemented - __rtree__.
  14. [heading __rtree__]
  15. __rtree__ is a tree data structure used for spatial searching. It was proposed by
  16. Antonin Guttman in 1984 [footnote Guttman, A. (1984). /R-Trees: A Dynamic Index Structure for Spatial Searching/]
  17. as an expansion of B-tree for multi-dimensional data. It may be used to store points or volumetric data in order to
  18. perform a spatial query. This query may for example return objects that are inside some area or are close to some point in space
  19. [footnote Cheung, K.; Fu, A. (1998). /Enhanced Nearest Neighbour Search on the R-tree/].
  20. It's possible to insert new objects or to remove the ones already stored.
  21. The __rtree__ structure is presented on the image below. Each __rtree__'s node store a box describing the space occupied by
  22. its children nodes. At the bottom of the structure, there are leaf-nodes which contains values
  23. (geometric objects representations).
  24. [$img/index/rtree/rstar.png]
  25. The __rtree__ is a self-balanced data structure. The key part of balancing algorithm is node splitting algorithm
  26. [footnote Greene, D. (1989). /An implementation and performance analysis of spatial data access methods/]
  27. [footnote Beckmann, N.; Kriegel, H. P.; Schneider, R.; Seeger, B. (1990). /The R*-tree: an efficient and robust access method for points and rectangles/].
  28. Each algorithm produces different splits so the internal structure of a tree may be different for each one of them.
  29. In general, more complex algorithms analyses elements better and produces less overlapping nodes. In the searching process less nodes must be traversed
  30. in order to find desired objects. On the other hand more complex analysis takes more time. In general faster inserting will result in slower searching
  31. and vice versa. The performance of the R-tree depends on balancing algorithm, parameters and data inserted into the container.
  32. Additionally there are also algorithms creating R-tree containing some, number of objects. This technique is called bulk loading and is
  33. done by use of packing algorithm
  34. [footnote Leutenegger, Scott T.; Edgington, Jeffrey M.; Lopez, Mario A. (1997). /STR: A Simple and Efficient Algorithm for R-Tree Packing/]
  35. [footnote Garcia, Yvan J.; Lopez, Mario A.; Leutenegger, Scott T. (1997). /A Greedy Algorithm for Bulk Loading R-trees/].
  36. This method is faster and results in R-trees with better internal structure. This means that the query performance is increased.
  37. The examples of structures of trees created by use of different algorithms and exemplary operations times are presented below.
  38. [table
  39. [[] [Linear algorithm] [Quadratic algorithm] [R*-tree] [Packing algorithm]]
  40. [[*Example structure*] [[$img/index/rtree/linear.png]] [[$img/index/rtree/quadratic.png]] [[$img/index/rtree/rstar.png]] [[$img/index/rtree/bulk.png]]]
  41. [[*1M Values inserts*] [1.76s] [2.47s] [6.19s] [0.64s]]
  42. [[*100k spatial queries*] [2.21s] [0.51s] [0.12s] [0.07s]]
  43. [[*100k knn queries*] [6.37s] [2.09s] [0.64s] [0.52s]]
  44. ]
  45. The configuration of the machine used for testing was: /Intel(R) Core(TM) i7 870 @ 2.93GHz, 8GB RAM, MS Windows 7 x64/.
  46. The code was compiled with optimization for speed (`O2`).
  47. The performance of the R-tree for different values of Max parameter and Min=0.5*Max is presented in the table below.
  48. In the two upper figures you can see the performance of the __rtree__ storing random, relatively small, non-overlapping, 2d boxes.
  49. In the lower ones, the performance of the __rtree__ also storing random, 2d boxes, but this time quite big and possibly overlapping.
  50. As you can see, the __rtree__ performance is different in both cases.
  51. [table
  52. [[] [building] [querying]]
  53. [[*non overlapping*] [[$img/index/rtree/build_non_ovl.png]] [[$img/index/rtree/query_non_ovl.png]]]
  54. [[*overlapping*] [[$img/index/rtree/build_ovl.png]] [[$img/index/rtree/query_ovl.png]]]
  55. ]
  56. [heading Implementation details]
  57. Key features of this implementation of the __rtree__ are:
  58. * capable to store arbitrary __value__ type,
  59. * three different balancing algorithms - linear, quadratic or rstar,
  60. * creation using packing algorithm,
  61. * parameters (including maximal and minimal number of elements) may be passed as compile- or run-time parameters, in compile-time
  62. version nodes elements are stored in static-size containers,
  63. * advanced queries, e.g. search for 5 nearest Values to some point and intersecting some Geometry but not within the other one,
  64. * iterative queries by use of iterators,
  65. * C++11 conformant - move semantics, stateful allocators,
  66. * capable to store __value__ type with no default constructor,
  67. * in-memory storage by use of the default std::allocator<>,
  68. * other storage options - shared memory and mapped file by use of Boost.Interprocess allocators.
  69. [/
  70. [heading Planned features]
  71. Below you can find features that will (or probably will) be added in the future releases:
  72. /]
  73. [/ Done
  74. * rstar optimization (planned for release in Boost 1.55),
  75. * bulk loading (planned for release in Boost 1.55),
  76. * 'reversed' spatial predicates or additional spatial predicates like contains(),
  77. * iterative queries - query iterators / type-erased query iterators,
  78. /]
  79. [/
  80. * path/ray query predicate - search for Values along Segment or LineString, closest to the starting point,
  81. * user-defined distance calculation in nearest() predicate,
  82. * serialization,
  83. * persistent storage.
  84. /]
  85. [/ Maybe
  86. * other geometries as Indexables, e.g. NSpheres. Rings would probably require using move semantics instead of copying
  87. * bounding tree - rtree variation capable to use other Geometries as bounds, e.g. NSpheres, Rings/convex polygons/ (moving required), Capsules, Elipses, Variants etc.
  88. * moving instead of copying + optimizations for movable/nonthrowing/trivialy copied elements
  89. * passing more than one nearest/path predicate - "returned value is one of k1 nearest values to p1 and ... and one of kN nearest values to pN"
  90. /]
  91. [heading Dependencies]
  92. R-tree depends on Boost.Container, Boost.Core, Boost.Move, Boost.MPL, Boost.Range, Boost.Tuple.
  93. [heading Contributors]
  94. The spatial index was originally started by Federico J. Fernandez during the Google Summer of Code 2008 program, mentored by Hartmut Kaiser.
  95. [heading Spatial thanks]
  96. I'd like to thank Barend Gehrels, Bruno Lalande, Mateusz Łoskot, Lucanus J. Simonson for their support and ideas.
  97. [endsect]