Commit

Merge branch 'release/v3.2'
HDembinski committed Aug 24, 2018
2 parents 31d4204 + 06596ef commit 1e36e99
Showing 72 changed files with 3,060 additions and 3,076 deletions.
5 changes: 3 additions & 2 deletions .clang-format
@@ -34,8 +34,9 @@ BraceWrapping:
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Attach
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
ColumnLimit: 78
BreakConstructorInitializersBeforeComma: true
# BreakInheritanceListBeforeComma: true
ColumnLimit: 90
CommentPragmas: '^ IWYU pragma:'
ConstructorInitializerAllOnOneLineOrOnePerLine: true
ConstructorInitializerIndentWidth: 4
2 changes: 0 additions & 2 deletions .travis.yml
@@ -33,8 +33,6 @@ matrix:
- os: osx # minimum osx Xcode 8.3
osx_image: xcode8.3
env: PY=OFF NUMPY=OFF SERIAL=OFF
allow_failures:
- os: osx

git:
depth: 10
3 changes: 2 additions & 1 deletion README.md
@@ -12,7 +12,7 @@ develop | [![Build Status Travis](https://travis-ci.org/HDembinski/histogram.svg
3. Visual Studio 14 2015


This `C++11` library provides a multi-dimensional [histogram](https://en.wikipedia.org/wiki/Histogram) class for your statistics needs. The library is **header-only**, if you don't need the Python module.
This `C++11` open-source library provides a state-of-the-art multi-dimensional [histogram](https://en.wikipedia.org/wiki/Histogram) class for the professional statistician and everyone who needs to count things. The library is **header-only**, if you don't need the Python module. Check out the [full documentation](http://hdembinski.github.io/histogram/doc/html/).

The histogram is very customisable through policy classes, but the default policies were carefully designed so that most users don't need to customize anything. In the standard configuration, this library offers a unique safety guarantee not found elsewhere: bin counts *cannot overflow* or *be capped*. While being safe to use, the library also has a convenient interface, is memory conserving, and faster than other libraries (see benchmarks).

@@ -40,6 +40,7 @@ Check out the [full documentation](http://hdembinski.github.io/histogram/doc/htm
* Support for under-/overflow bins (can be disabled individually for each dimension)
* Support for variance tracking (++)
* Support for addition and scaling of histograms
* Support for custom allocators
* Optional serialization based on [Boost.Serialization](https://www.boost.org/doc/libs/release/libs/serialization/)
* Optional Python-bindings that work with [Python-2.7 to 3.6](http://www.python.org) with [Boost.Python](https://www.boost.org/doc/libs/release/libs/python/)
* Optional [Numpy](http://www.numpy.org) support
2 changes: 1 addition & 1 deletion build/CMakeLists.txt
@@ -168,7 +168,7 @@ foreach(SRC IN ITEMS ${TEST_SOURCES})
else()
target_link_libraries(${BASENAME} ${LIBRARIES})
endif()
if (BASENAME MATCHES "fail")
if (BASENAME MATCHES "fail_")
if (DEFINED PYTHON_EXECUTABLE)
add_test(NAME ${BASENAME} COMMAND ${PYTHON_EXECUTABLE}
../test/pass_on_fail.py ${BASENAME})
12 changes: 6 additions & 6 deletions doc/benchmarks.qbk
@@ -9,14 +9,14 @@ The following plot shows results of a benchmark on a 9 GHz Macbook Pro. Random n
[variablelist Plot legend:
[[root] [[@https://root.cern.ch ROOT classes] (`TH1I` for 1D, `TH3I` for 3D and `THnI` for 6D)]]
[[py:numpy] [numpy functions ([python]`numpy.histogram` for 1D, `numpy.histogramdd` for 2D, 3D, and 6D)]]
[[py:hd_sd] [[classref boost::histogram::dynamic_histogram] with [classref boost::histogram::adaptive_storage], called from Python]]
[[hs_ss] [[classref boost::histogram::static_histogram] with [classref boost::histogram::array_storage<int>]]]
[[hs_sd] [[classref boost::histogram::static_histogram] with [classref boost::histogram::adaptive_storage]]]
[[hd_ss] [[classref boost::histogram::dynamic_histogram] with [classref boost::histogram::array_storage<int>]]]
[[hd_sd] [[classref boost::histogram::dynamic_histogram] with [classref boost::histogram::adaptive_storage]]]
[[py:hd_sd] [[funcref boost::histogram::make_dynamic_histogram dynamic histogram] with [classref boost::histogram::adaptive_storage], called from Python]]
[[hs_ss] [[funcref boost::histogram::make_static_histogram static histogram] with [classref boost::histogram::array_storage<int>]]]
[[hs_sd] [[funcref boost::histogram::make_static_histogram static histogram] with [classref boost::histogram::adaptive_storage]]]
[[hd_ss] [[funcref boost::histogram::make_dynamic_histogram dynamic histogram] with [classref boost::histogram::array_storage<int>]]]
[[hd_sd] [[funcref boost::histogram::make_dynamic_histogram dynamic histogram] with [classref boost::histogram::adaptive_storage]]]
]

[classref boost::histogram::static_histogram] is always faster than [classref boost::histogram::dynamic_histogram] and safer to use, as more checks are done at compile time. It is recommended when working in C++ only. [classref boost::histogram::adaptive_storage] is faster than [classref boost::histogram::array_storage] for histograms with many bins, because it uses the cache more effectively due to its smaller memory consumption per bin. If the number of bins is small, it is slower because of overhead of handling memory in a dynamic way.
A [classref boost::histogram::make_static_histogram static histogram] is always faster than [classref boost::histogram::make_dynamic_histogram dynamic histogram] and safer to use, as more checks are done at compile time. It is recommended when working in C++ only. [classref boost::histogram::adaptive_storage] is faster than [classref boost::histogram::array_storage] for histograms with many bins, because it uses the cache more effectively due to its smaller memory consumption per bin. If the number of bins is small, it is slower because of overhead of handling memory in a dynamic way.

The histograms in this library are mostly faster than the competition, in some cases by a factor of 2. Simultaneously they are more flexible, since binning strategies can be customised. The Python-wrapped histogram is slower than numpy's own specialized function for 1D, but beats numpy's multi-dimensional histogramming function by a factor 2 to 3.

5 changes: 5 additions & 0 deletions doc/changelog.qbk
@@ -2,6 +2,11 @@

[master]

[heading 3.2 (not in boost)]

* Allocator support everywhere
* Internal refactoring

[heading 3.1 (not in boost)]

* Renamed `bincount` method to `size`
3 changes: 2 additions & 1 deletion doc/concepts.qbk
@@ -8,7 +8,7 @@ An `axis_type` converts input values into bin indices.

An `axis_type` is required to:

* derive publically from [classref boost::histogram::axis::axis_base] or [classref boost::histogram::axis::axis_base_uoflow]
* derive publically from [classref boost::histogram::axis::labeled_base] and [classref boost::histogram::axis::iterator_mixin]
* be default/copy/move constructable
* be copy/move assignable
* be equal comparable
@@ -37,6 +37,7 @@ A `storage_type` is required to:
* be default/copy/move constructable
* be copy/move assignable
* be equal comparable
* have a nested type `allocator_type`
* have a nested type `element_type`, which represent the bin count
* have a nested type `const_reference`, its const reference version
* have a constructor `storage_type(std::size_t n)`, which prepares the storage of `n` bins.
16 changes: 8 additions & 8 deletions doc/guide.qbk
@@ -18,13 +18,13 @@ The term /histogram/ is usually strictly used for something with bins over discr

[section Static or dynamic histogram]

The histogram class comes in two variants with a common interface, see the [link histogram.rationale.histogram_types rationale] for more information. Using a [classref boost::histogram::static_histogram static histogram] is recommended. You need a [classref boost::histogram::dynamic_histogram dynamic histogram] instead, if:
The histogram host class can store axis objects in a static or dynamic container, see the [link histogram.rationale.histogram_host rationale] for details. Use the factory functions [funcref boost::histogram::make_static_histogram make_static_histogram] and [funcref boost::histogram::make_dynamic_histogram make_dynamic_histogram] to make the corresponding histograms. Using static histograms is recommended, because they are faster and usage errors are caught at compile-time. Use dynamic histogram, if:

* you only know the histogram configurations at runtime, not at compile-time
* You only know the axis configurations at runtime, not at compile-time.

* you want to write C++ code that interoperates with the Python module included in the library
* You want to write C++ code that interoperates with the Python module included in the library.

Use the factory function [funcref boost::histogram::make_static_histogram make_static_histogram] (or [funcref boost::histogram::make_dynamic_histogram make_dynamic_histogram], respectively) to make histograms with the default storage policy. The default storage policy makes sure that counting is safe, fast, and memory efficient. If you are curious about trying another storage policy or using your own, have a look at the section [link histogram.guide.expert Advanced Usage].
These factory functions create histograms with the default storage type [classref boost::histogram::adaptive_storage], which provides safe counting, is fast and memory efficient. If you think you need another storage type or if you want to create your own, have a look at the section [link histogram.guide.expert Advanced Usage].

Here is an example on how to use [funcref boost::histogram::make_static_histogram make_static_histogram]. You pass one or several axis instances, which define the layout of the histogram.

@@ -40,7 +40,7 @@ When you work with dynamic histograms, you can also create a sequence of axes at

[funcref boost::histogram::make_static_histogram make_static_histogram] cannot handle this case because a static histogram can only be constructed when the number and types of all axes are known already at compile time. While strictly speaking that is also true in this example, you could have filled the vector also at run-time, based on run-time user input.

[note Memory for bin counters is allocated lazily, because if the default storage policy [classref boost::histogram::adaptive_storage adaptive_storage] is used. Allocation is deferred to the first time, when input values are passed to the histogram. Therefore memory allocation exceptions are not thrown when the histogram is created, but possibly later. This gives you a chance to check how much memory the histogram will allocate and possibly give a warning if that amount is excessively large. Use the method `histogram::size()` to see how many bins your axis layout requires. At the first fill, that many bytes will be allocated. The allocated amount of memory may grow further later when the capacity of the bin counters needs to grow.]
[note Memory for bin counters is allocated lazily, if the default storage policy [classref boost::histogram::adaptive_storage adaptive_storage] is used. Allocation is delayed to the first time, when input values are passed to the histogram. Therefore memory allocation exceptions are not thrown when the histogram is created, but possibly later. This gives you a chance to check how much memory the histogram will allocate and possibly give a warning if that amount is excessively large. Use the method `histogram::size()` to see how many bins your axis layout requires. At the first fill, that many bytes will be allocated. The allocated amount of memory may grow further later when the capacity of the bin counters needs to grow.]

[endsect]

@@ -61,7 +61,7 @@ In addition to the required parameters for an axis, you can assign an optional l

Without the labels it would be difficult to remember which axis was covering which quantity, because they look the same otherwise. Labels are the only axis property that can be changed later. Axes objects with different label do not compare equal with `operator==`.

By default, additional under- and overflow bins are added automatically for each axis where that makes sense. If you create an axis with 20 bins, the histogram will actually have 22 bins along that axis. The two extra bins are generally very good to have, as explained in [link histogram.rationale.uoflow the rationale]. If you are certain that the input cannot exceed the axis range, you can disable the extra bins to save memory. This is done by passing the enum value [enumref boost::histogram::axis::uoflow uoflow::off] to the axis constructor:
By default, additional under- and overflow bins are added automatically for each axis where that makes sense. If you create an axis with 20 bins, the histogram will actually have 22 bins along that axis. The two extra bins are generally very good to have, as explained in [link histogram.rationale.uoflow the rationale]. If you are certain that the input cannot exceed the axis range, you can disable the extra bins to save memory. This is done by passing the enum value [enumref boost::histogram::axis::uoflow_type uoflow_type::off] to the axis constructor:

[import ../examples/guide_axis_with_uoflow_off.cpp]
[guide_axis_with_uoflow_off]
@@ -95,7 +95,7 @@ Why weighted increments are sometimes useful, especially in a scientific context

After the histogram has been filled, you want to access the counts per bin at some point. You may want to visualize the counts, or compute some quantities like the mean from the data distribution approximated by the histogram.

To access each bin, you use a multi-dimensional index, which consists of a sequence of bin indices for each axes in order. You can use this index to access the value for each and the variance estimate, using the method `histogram::bin(...)`. It accepts integral indices, one for each axis of the histogram, and returns the associated bin counter type. The bin counter type then allows you to access the count value and its variance.
To access each bin, you use a multi-dimensional index, which consists of a sequence of bin indices for each axes in order. You can use this index to access the value for each and the variance estimate, using the method `histogram::at(...)` (in analogy to `std::vector::at`). It accepts integral indices, one for each axis of the histogram, and returns the associated bin counter type. The bin counter type then allows you to access the count value and its variance.

The calls are demonstrated in the next example.

@@ -140,7 +140,7 @@ Here is an example which demonstrates the supported operators.

When you have a high-dimensional histogram, sometimes you want to remove some axes and look the equivalent lower-dimensional version obtained by summing over the counts along the removed axes. Perhaps you found out that there is no interesting structure along an axis, so it is not worth keeping that axies around, or you want to visualize 1d or 2d projections of a high-dimensional histogram.

For this purpose use the `histogram::reduce_to(...)` method, which returns a new reduced histogram with fewer axes. The method accepts indices (one or more), which indicate the axes that are kept. The static histogram only accepts compile-time numbers, while the dynamic histogram also accepts runtime numbers and iterators over numbers.
For this purpose use the `histogram::reduce_to(...)` method, which returns a new reduced histogram with fewer axes. The method accepts indices (one or more), which indicate the axes that are kept. The static histogram only accepts compile-time numbers, while the dynamic histogram also accepts runtime numbers and iterators over numbers.

Here is an example to illustrates this.
