policy_tutorial.qbk 20 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535
  1. [section:pol_tutorial Policy Tutorial]
  2. [section:what_is_a_policy So Just What is a Policy Anyway?]
  3. A policy is a compile-time mechanism for customising the behaviour of a
  4. special function, or a statistical distribution. With Policies you can
  5. control:
  6. * What action to take when an error occurs.
  7. * What happens when you call a function that is mathematically undefined
  8. (for example, if you ask for the mean of a Cauchy distribution).
  9. * What happens when you ask for a quantile of a discrete distribution.
  10. * Whether the library is allowed to internally promote `float` to `double`
  11. and `double` to `long double` in order to improve precision.
  12. * What precision to use when calculating the result.
  13. Some of these policies could arguably be run-time variables, but then we couldn't
  14. use compile-time dispatch internally to select the best evaluation method
  15. for the given policies.
  16. For this reason a Policy is a /type/: in fact it's an instance of the
  17. class template `boost::math::policies::policy<>`. This class is just a
  18. compile-time-container of user-selected policies (sometimes called a type-list).
  19. Over a dozen __policy_defaults are provided, so most of the time you can ignore the policy framework,
  20. but you can overwrite the defaults with your own policies to give detailed control, for example:
  21. using namespace boost::math::policies;
  22. // Define a policy that sets ::errno on overflow,
  23. // and does not promote double to long double internally,
  24. // and only aims for precision of only 3 decimal digits,
  25. // to an error-handling policy, usually to trade precision for speed:
  26. typedef policy
  27. <
  28. domain_error<errno_on_error>,
  29. promote_double<false>,
  30. digits10<3>
  31. > my_policy;
  32. [endsect] [/section:what_is_a_policy So Just What is a Policy Anyway?]
  33. [section:policy_tut_defaults Policies Have Sensible Defaults]
  34. Most of the time you can just ignore the policy framework.
  35. ['*The defaults for the various policies are as follows,
  36. if these work OK for you then you can stop reading now!]
  37. [variablelist
  38. [[Domain Error][Throws a `std::domain_error` exception.]]
  39. [[Pole Error][Occurs when a function is evaluated at a pole: throws a `std::domain_error` exception.]]
  40. [[Overflow Error][Throws a `std::overflow_error` exception.]]
  41. [[Underflow][Ignores the underflow, and returns zero.]]
  42. [[Denormalised Result][Ignores the fact that the result is denormalised, and returns it.]]
  43. [[Rounding Error][Throws a `boost::math::rounding_error` exception.]]
  44. [[Internal Evaluation Error][Throws a `boost::math::evaluation_error` exception.]]
  45. [[Indeterminate Result Error][Returns a result that depends on the function where the error occurred.]]
  46. [[Promotion of float to double][Does occur by default - gives full float precision results.]]
  47. [[Promotion of double to long double][Does occur by default if long double offers
  48. more precision than double.]]
  49. [[Precision of Approximation Used][By default uses an approximation that
  50. will result in the lowest level of error for the type of the result.]]
  51. [[Behaviour of Discrete Quantiles]
  52. [
  53. The quantile function will by default return an integer result that has been
  54. /rounded outwards/. That is to say lower quantiles (where the probability is
  55. less than 0.5) are rounded downward, and upper quantiles (where the probability
  56. is greater than 0.5) are rounded upwards. This behaviour
  57. ensures that if an X% quantile is requested, then /at least/ the requested
  58. coverage will be present in the central region, and /no more than/
  59. the requested coverage will be present in the tails.
  60. This behaviour can be changed so that the quantile functions are rounded
  61. differently, or even return a real-valued result using
  62. [link math_toolkit.pol_overview Policies]. It is strongly
  63. recommended that you read the tutorial
  64. [link math_toolkit.pol_tutorial.understand_dis_quant
  65. Understanding Quantiles of Discrete Distributions] before
  66. using the quantile function on a discrete distribution. The
  67. [link math_toolkit.pol_ref.discrete_quant_ref reference docs]
  68. describe how to change the rounding policy
  69. for these distributions.
  70. ]]
  71. ]
  72. What's more, if you define your own policy type, then it automatically
  73. inherits the defaults for any policies not explicitly set, so given:
  74. using namespace boost::math::policies;
  75. //
  76. // Define a policy that sets ::errno on overflow, and does
  77. // not promote double to long double internally:
  78. typedef policy
  79. <
  80. domain_error<errno_on_error>,
  81. promote_double<false>
  82. > my_policy;
  83. then `my_policy` defines a policy where only the overflow error handling and
  84. `double`-promotion policies differ from the defaults.
  85. We can also add a desired precision, for example, 9 bits or 3 decimal digits,
  86. to an error-handling policy, usually to trade precision for speed:
  87. typedef policy<domain_error<errno_on_error>, digit2<9> > my_policy;
  88. Or if you want to further modify an existing user policy, use `normalise`:
  89. using boost::math::policies::normalise;
  90. typedef normalise<my_policy, digits2<9>>::type my_policy_9; // errno on error, and limited precision.
  91. [endsect] [/section:policy_tut_defaults Policies Have Sensible Defaults]
  92. [section:policy_usage So How are Policies Used Anyway?]
  93. The details follow later, but basically policies can be set by either:
  94. * Defining some macros that change the default behaviour: [*this is the
  95. recommended method for setting installation-wide policies].
  96. * By instantiating a statistical distribution object with an explicit policy:
  97. this is mainly reserved for ad hoc policy changes.
  98. * By passing a policy to a special function as an optional final argument:
  99. this is mainly reserved for ad hoc policy changes.
  100. * By using some helper macros to define a set of functions or distributions
  101. in the current namespace that use a specific policy: [*this is the
  102. recommended method for setting policies on a project- or translation-unit-wide
  103. basis].
  104. The following sections introduce these methods in more detail.
  105. [endsect] [/section:policy_usage So How are Policies Used Anyway?]
  106. [section:changing_policy_defaults Changing the Policy Defaults]
  107. The default policies used by the library are changed by the usual
  108. configuration macro method.
  109. For example, passing `-DBOOST_MATH_DOMAIN_ERROR_POLICY=errno_on_error` to
  110. your compiler will cause domain errors to set `::errno` and return a __NaN
  111. rather than the usual default behaviour of throwing a `std::domain_error`
  112. exception.
  113. [tip For Microsoft Visual Studio,you can add to the Project Property Page,
  114. C/C++, Preprocessor, Preprocessor definitions like:
  115. ``BOOST_MATH_ASSERT_UNDEFINED_POLICY=0
  116. BOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error``
  117. This may be helpful to avoid complications with pre-compiled headers
  118. that may mean that the equivalent definitions in source code:
  119. ``#define BOOST_MATH_ASSERT_UNDEFINED_POLICY false
  120. #define BOOST_MATH_OVERFLOW_ERROR_POLICY errno_on_error``
  121. *may be ignored*.
  122. The compiler command line shows:
  123. ``/D "BOOST_MATH_ASSERT_UNDEFINED_POLICY=0"
  124. /D "BOOST_MATH_OVERFLOW_ERROR_POLICY=errno_on_error"``
  125. ] [/MSVC tip]
  126. There is however a very important caveat to this:
  127. [important
  128. [*['Default policies changed by setting configuration macros must be changed
  129. uniformly in every translation unit in the program.]]
  130. Failure to follow this rule may result in violations of the "One
  131. Definition Rule (ODR)" and result in unpredictable program behaviour.]
  132. That means there are only two safe ways to use these macros:
  133. * Edit them in [@../../../../boost/math/tools/user.hpp boost/math/tools/user.hpp],
  134. so that the defaults are set on an installation-wide basis.
  135. Unfortunately this may not be convenient if
  136. you are using a pre-installed Boost distribution (on Linux for example).
  137. * Set the defines in your project's Makefile or build environment, so that they
  138. are set uniformly across all translation units.
  139. What you should *not* do is:
  140. * Set the defines in the source file using `#define` as doing so
  141. almost certainly will break your program, unless you're absolutely
  142. certain that the program is restricted to a single translation unit.
  143. And, yes, you will find examples in our test programs where we break this
  144. rule: but only because we know there will always be a single
  145. translation unit only: ['don't say that you weren't warned!]
  146. [import ../../example/error_handling_example.cpp]
  147. [error_handling_example]
  148. [endsect] [/section:changing_policy_defaults Changing the Policy Defaults]
  149. [section:ad_hoc_dist_policies Setting Policies for Distributions on an Ad Hoc Basis]
  150. All of the statistical distributions in this library are class templates
  151. that accept two template parameters:
  152. real type (float, double ...) and policy (how to handle exceptional events),
  153. both with sensible defaults, for example:
  154. namespace boost{ namespace math{
  155. template <class RealType = double, class Policy = policies::policy<> >
  156. class fisher_f_distribution;
  157. typedef fisher_f_distribution<> fisher_f;
  158. }}
  159. This policy gets used by all the accessor functions that accept
  160. a distribution as an argument, and forwarded to all the functions called
  161. by these. So if you use the shorthand-typedef for the distribution, then you get
  162. `double` precision arithmetic and all the default policies.
  163. However, say for example we wanted to evaluate the quantile
  164. of the binomial distribution at float precision, without internal
  165. promotion to double, and with the result rounded to the /nearest/
  166. integer, then here's how it can be done:
  167. [import ../../example/policy_eg_3.cpp]
  168. [policy_eg_3]
  169. Which outputs:
  170. [pre quantile is: 40]
  171. [endsect] [/section:ad_hoc_dist_policies Setting Policies for Distributions on an Ad Hoc Basis]
  172. [section:ad_hoc_sf_policies Changing the Policy on an Ad Hoc Basis for the Special Functions]
  173. All of the special functions in this library come in two overloaded forms,
  174. one with a final "policy" parameter, and one without. For example:
  175. namespace boost{ namespace math{
  176. template <class RealType, class Policy>
  177. RealType tgamma(RealType, const Policy&);
  178. template <class RealType>
  179. RealType tgamma(RealType);
  180. }} // namespaces
  181. Normally, the second version is just a forwarding wrapper to the first
  182. like this:
  183. template <class RealType>
  184. inline RealType tgamma(RealType x)
  185. {
  186. return tgamma(x, policies::policy<>());
  187. }
  188. So calling a special function with a specific policy
  189. is just a matter of defining the policy type to use
  190. and passing it as the final parameter. For example,
  191. suppose we want `tgamma` to behave in a C-compatible
  192. fashion and set `::errno` when an error occurs, and never
  193. throw an exception:
  194. [import ../../example/policy_eg_1.cpp]
  195. [policy_eg_1]
  196. which outputs:
  197. [pre
  198. Result of tgamma(30000) is: 1.#INF
  199. errno = 34
  200. Result of tgamma(-10) is: 1.#QNAN
  201. errno = 33
  202. ]
  203. Alternatively, for ad hoc use, we can use the `make_policy`
  204. helper function to create a policy for us: this usage is more
  205. verbose, so is probably only preferred when a policy is going
  206. to be used once only:
  207. [import ../../example/policy_eg_2.cpp]
  208. [policy_eg_2]
  209. [endsect] [/section:ad_hoc_sf_policies Changing the Policy on an Ad Hoc Basis for the Special Functions]
  210. [section:namespace_policies Setting Policies at Namespace or Translation Unit Scope]
  211. Sometimes what you want to do is just change a set of policies within
  212. the current scope: *the one thing you should not do in this situation
  213. is use the configuration macros*, as this can lead to "One Definition
  214. Rule" violations. Instead this library provides a pair of macros
  215. especially for this purpose.
  216. Let's consider the special functions first: we can declare a set of
  217. forwarding functions that all use a specific policy using the
  218. macro BOOST_MATH_DECLARE_SPECIAL_FUNCTIONS(['Policy]). This
  219. macro should be used either inside a unique namespace set aside for the
  220. purpose (for example, a C namespace for a C-style policy),
  221. or an unnamed namespace if you just want the functions
  222. visible in global scope for the current file only.
  223. [import ../../example/policy_eg_4.cpp]
  224. [policy_eg_4]
  225. The same mechanism works well at file scope as well, by using an unnamed
  226. namespace, we can ensure that these declarations don't conflict with any
  227. alternate policies present in other translation units:
  228. [import ../../example/policy_eg_5.cpp]
  229. [policy_eg_5]
  230. Handling policies for the statistical distributions is very similar except that now
  231. the macro BOOST_MATH_DECLARE_DISTRIBUTIONS accepts two parameters: the
  232. floating point type to use, and the policy type to apply. For example:
  233. BOOST_MATH_DECLARE_DISTRIBUTIONS(double, my_policy)
  234. Results a set of typedefs being defined like this:
  235. typedef boost::math::normal_distribution<double, my_policy> normal;
  236. The name of each typedef is the same as the name of the distribution
  237. class template, but without the "_distribution" suffix.
  238. [import ../../example/policy_eg_6.cpp]
  239. [policy_eg_6]
  240. [note
  241. There is an important limitation to note: you can *not use the macros
  242. BOOST_MATH_DECLARE_DISTRIBUTIONS and BOOST_MATH_DECLARE_SPECIAL_FUNCTIONS
  243. ['in the same namespace]*, as doing so creates ambiguities between functions
  244. and distributions of the same name.
  245. ]
  246. As before, the same mechanism works well at file scope as well: by using an unnamed
  247. namespace, we can ensure that these declarations don't conflict with any
  248. alternate policies present in other translation units:
  249. [import ../../example/policy_eg_7.cpp]
  250. [policy_eg_7]
  251. [endsect][/section:namespace_policies Setting Policies at Namespace or Translation Unit Scope]
  252. [section:user_def_err_pol Calling User Defined Error Handlers]
  253. [import ../../example/policy_eg_8.cpp]
  254. [policy_eg_8]
  255. [import ../../example/policy_eg_9.cpp]
  256. [policy_eg_9]
  257. [endsect] [/section:user_def_err_pol Calling User Defined Error Handlers]
  258. [section:understand_dis_quant Understanding Quantiles of Discrete Distributions]
  259. Discrete distributions present us with a problem when calculating the
  260. quantile: we are starting from a continuous real-valued variable - the
  261. probability - but the result (the value of the random variable)
  262. should really be discrete.
  263. Consider for example a Binomial distribution, with a sample size of
  264. 50, and a success fraction of 0.5. There are a variety of ways
  265. we can plot a discrete distribution, but if we plot the PDF
  266. as a step-function then it looks something like this:
  267. [$../graphs/binomial_pdf.png]
  268. Now lets suppose that the user asks for a the quantile that corresponds
  269. to a probability of 0.05, if we zoom in on the CDF for that region here's
  270. what we see:
  271. [$../graphs/binomial_quantile_1.png]
  272. As can be seen there is no random variable that corresponds to
  273. a probability of exactly 0.05, so we're left with two choices as
  274. shown in the figure:
  275. * We could round the result down to 18.
  276. * We could round the result up to 19.
  277. In fact there's actually a third choice as well: we could "pretend" that the
  278. distribution was continuous and return a real valued result: in this case we
  279. would calculate a result of approximately 18.701 (this accurately
  280. reflects the fact that the result is nearer to 19 than 18).
  281. By using policies we can offer any of the above as options, but that
  282. still leaves the question: ['What is actually the right thing to do?]
  283. And in particular: ['What policy should we use by default?]
  284. In coming to an answer we should realise that:
  285. * Calculating an integer result is often much faster than
  286. calculating a real-valued result: in fact in our tests it
  287. was up to 20 times faster.
  288. * Normally people calculate quantiles so that they can perform
  289. a test of some kind: ['"If the random variable is less than N
  290. then we can reject our null-hypothesis with 90% confidence."]
  291. So there is a genuine benefit to calculating an integer result
  292. as well as it being "the right thing to do" from a philosophical
  293. point of view. What's more if someone asks for a quantile at 0.05,
  294. then we can normally assume that they are asking for
  295. ['[*at least] 95% of the probability to the right of the value chosen,
  296. and [*no more than] 5% of the probability to the left of the value chosen.]
  297. In the above binomial example we would therefore round the result down to 18.
  298. The converse applies to upper-quantiles: If the probability is greater than
  299. 0.5 we would want to round the quantile up, ['so that [*at least] the requested
  300. probability is to the left of the value returned, and [*no more than] 1 - the
  301. requested probability is to the right of the value returned.]
  302. Likewise for two-sided intervals, we would round lower quantiles down,
  303. and upper quantiles up. This ensures that we have ['at least the requested
  304. probability in the central region] and ['no more than 1 minus the requested
  305. probability in the tail areas.]
  306. For example, taking our 50 sample binomial distribution with a success fraction
  307. of 0.5, if we wanted a two sided 90% confidence interval, then we would ask
  308. for the 0.05 and 0.95 quantiles with the results ['rounded outwards] so that
  309. ['at least 90% of the probability] is in the central area:
  310. [$../graphs/binomial_pdf_3.png]
  311. So far so good, but there is in fact a trap waiting for the unwary here:
  312. quantile(binomial(50, 0.5), 0.05);
  313. returns 18 as the result, which is what we would expect from the graph above,
  314. and indeed there is no x greater than 18 for which:
  315. cdf(binomial(50, 0.5), x) <= 0.05;
  316. However:
  317. quantile(binomial(50, 0.5), 0.95);
  318. returns 31, and indeed while there is no x less than 31 for which:
  319. cdf(binomial(50, 0.5), x) >= 0.95;
  320. We might naively expect that for this symmetrical distribution the result
  321. would be 32 (since 32 = 50 - 18), but we need to remember that the cdf of
  322. the binomial is /inclusive/ of the random variable. So while the left tail
  323. area /includes/ the quantile returned, the right tail area always excludes
  324. an upper quantile value: since that "belongs" to the central area.
  325. Look at the graph above to see what's going on here: the lower quantile
  326. of 18 belongs to the left tail, so any value <= 18 is in the left tail.
  327. The upper quantile of 31 on the other hand belongs to the central area,
  328. so the tail area actually starts at 32, so any value > 31 is in the
  329. right tail.
  330. Therefore if U and L are the upper and lower quantiles respectively, then
  331. a random variable X is in the tail area - where we would reject the null
  332. hypothesis if:
  333. X <= L || X > U
  334. And the a variable X is inside the central region if:
  335. L < X <= U
  336. The moral here is to ['always be very careful with your comparisons
  337. when dealing with a discrete distribution], and if in doubt,
  338. ['base your comparisons on CDF's instead].
  339. [heading Other Rounding Policies are Available]
  340. As you would expect from a section on policies, you won't be surprised
  341. to know that other rounding options are available:
  342. [variablelist
  343. [[integer_round_outwards]
  344. [This is the default policy as described above: lower quantiles
  345. are rounded down (probability < 0.5), and upper quantiles
  346. (probability > 0.5) are rounded up.
  347. This gives /no more than/ the requested probability
  348. in the tails, and /at least/ the requested probability
  349. in the central area.]]
  350. [[integer_round_inwards]
  351. [This is the exact opposite of the default policy:
  352. lower quantiles
  353. are rounded up (probability < 0.5),
  354. and upper quantiles (probability > 0.5) are rounded down.
  355. This gives /at least/ the requested probability
  356. in the tails, and /no more than/ the requested probability
  357. in the central area.]]
  358. [[integer_round_down][This policy will always round the result down
  359. no matter whether it is an upper or lower quantile]]
  360. [[integer_round_up][This policy will always round the result up
  361. no matter whether it is an upper or lower quantile]]
  362. [[integer_round_nearest][This policy will always round the result
  363. to the nearest integer
  364. no matter whether it is an upper or lower quantile]]
  365. [[real][This policy will return a real valued result
  366. for the quantile of a discrete distribution: this is
  367. generally much slower than finding an integer result
  368. but does allow for more sophisticated rounding policies.]]
  369. ]
  370. [import ../../example/policy_eg_10.cpp]
  371. [policy_eg_10]
  372. [endsect] [/section:understand_dis_quant Understanding Quantiles of Discrete Distributions]
  373. [endsect] [/section:pol_Tutorial Policy Tutorial]
  374. [/ math.qbk
  375. Copyright 2007, 2013 John Maddock and Paul A. Bristow.
  376. Distributed under the Boost Software License, Version 1.0.
  377. (See accompanying file LICENSE_1_0.txt or copy at
  378. http://www.boost.org/LICENSE_1_0.txt).
  379. ]