Replies: 3 comments
-
Here is an example of the unified "event loop" this would enable, using AwkwardArray and Vector. Today, we can do this:

    import numba as nb
    import numpy as np
    import vector

    @nb.njit
    def compute_masses(awkarray):
        # One output slot per event; this is the large intermediate allocation.
        out = np.empty(len(awkarray), np.float64)
        for i, event in enumerate(awkarray):
            total = vector.obj(px=0.0, py=0.0, pz=0.0, E=0.0)
            for vec in event:
                total = total + vec
            out[i] = total.mass
        return out

    out = compute_masses(awkarray)
    hist.fill(out)

What we'd like to be able to write is this:

    @nb.njit
    def compute_masses(hist, awkarray):
        for event in awkarray:
            total = vector.obj(px=0.0, py=0.0, pz=0.0, E=0.0)
            for vec in event:
                total = total + vec
            # Fill directly from inside the compiled loop; no intermediate array.
            hist.fill(total.mass)

    compute_masses(hist, awkarray)

While the fill will be a little slower than the optimized batched lookup in a vectorized fill, you will be able to avoid a large memory allocation and write significantly more natural code in your event loop.
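The trade-off described above can be illustrated without Numba, Awkward Array, or boost-histogram. The following is only a plain-Python sketch under assumed names: `invariant_mass`, `fill_one`, `batched`, and `streaming` are hypothetical helpers, events are modeled as lists of `(px, py, pz, E)` tuples, and the histogram is a bare list of fixed-width bins starting at zero. It shows that both paths produce the same counts, while only the batched path holds one value per event in memory.

```python
import math


def invariant_mass(event):
    """Mass of the summed four-vectors in one event; each vector is (px, py, pz, E)."""
    px = sum(v[0] for v in event)
    py = sum(v[1] for v in event)
    pz = sum(v[2] for v in event)
    e = sum(v[3] for v in event)
    m2 = e * e - (px * px + py * py + pz * pz)
    # Clamping negative m2 to zero is a simplification made for this sketch.
    return math.sqrt(max(m2, 0.0))


def fill_one(counts, width, x):
    """Increment the fixed-width bin containing x; values outside the bins are dropped."""
    i = int(x // width)
    if 0 <= i < len(counts):
        counts[i] += 1


def batched(events, nbins, width):
    """Compute every mass first, then fill: needs one stored float per event."""
    masses = [invariant_mass(ev) for ev in events]
    counts = [0] * nbins
    for m in masses:
        fill_one(counts, width, m)
    return counts


def streaming(events, nbins, width):
    """Fill inside the event loop: nothing is stored per event."""
    counts = [0] * nbins
    for ev in events:
        fill_one(counts, width, invariant_mass(ev))
    return counts


events = [
    [(3.0, 0.0, 0.0, 5.0), (-3.0, 0.0, 0.0, 5.0)],  # total mass 10
    [(0.0, 0.0, 4.0, 5.0)],                          # mass 3
    [],                                              # empty event, mass 0
]
counts_batched = batched(events, nbins=4, width=5.0)
counts_streaming = streaming(events, nbins=4, width=5.0)
```

The streaming version does one bin lookup per event rather than one vectorized pass, which is where the small slowdown mentioned above would come from.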
-
One good deliverable for the project will be a "Minimal Working Example", similar to the result here for vector, which goes over "from scratch" creation of an ultra-simple histogram-like object in Numba. This would be a good way to become familiar with Numba, and it would give future developers a high-level idea of how this is implemented.
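As a starting point for such an example, it may help to pin down what "ultra-simple histogram-like object" could mean before any Numba machinery is involved. The sketch below is plain Python and purely illustrative: `MiniHist` is a hypothetical class, not the boost-histogram API, and the layout (one regular axis, a flat list of counts with an underflow slot first and an overflow slot last) is an assumption. Exposing an object like this to compiled code, for instance through Numba's extension API or `numba.experimental.jitclass`, is the part the deliverable would actually cover and is not shown here.

```python
class MiniHist:
    """Ultra-simple 1D histogram: one regular axis and a flat list of counts.

    Hypothetical illustration only; expects finite float inputs.
    """

    def __init__(self, nbins, lo, hi):
        self.nbins = nbins
        self.lo = lo
        self.hi = hi
        # Slot 0 is underflow, slots 1..nbins are regular bins, slot nbins + 1 is overflow.
        self.counts = [0] * (nbins + 2)

    def index(self, x):
        """Map a value to its slot in the flat counts list."""
        if x < self.lo:
            return 0
        if x >= self.hi:
            return self.nbins + 1
        i = int((x - self.lo) / (self.hi - self.lo) * self.nbins)
        # Guard against rounding pushing a value just below hi past the last bin.
        return 1 + min(i, self.nbins - 1)

    def fill(self, x):
        """Add one entry; this is the per-value operation a compiled loop would call."""
        self.counts[self.index(x)] += 1

    def view(self, flow=False):
        """Return the counts, with or without the underflow/overflow slots."""
        return list(self.counts) if flow else self.counts[1:-1]


h = MiniHist(nbins=4, lo=0.0, hi=2.0)
for x in [-1.0, 0.1, 0.6, 0.6, 1.9, 2.0, 5.0]:
    h.fill(x)
```

Everything in `index` and `fill` is scalar arithmetic plus one indexed increment on a flat buffer, which is why a per-entry fill looks like a reasonable candidate for compiled code.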
-
Just to note here that we also have a strong use case for this (accumulating calibration constants is one, but there are probably many others). Specifically, we'd be interested in a setup like this:

    import boost_histogram as bh
    import numba

    n_gains = 2
    n_pixels = 1855
    n_capacitors = 4096
    n_samples = 40

    baseline = bh.Histogram(
        bh.axis.Integer(0, n_gains, underflow=False, overflow=False),
        bh.axis.Integer(0, n_pixels, underflow=False, overflow=False),
        bh.axis.Integer(0, n_capacitors, underflow=False, overflow=False),
        storage=bh.storage.Mean(),
    )
    spike_height = bh.Histogram(
        bh.axis.Integer(0, n_gains, underflow=False, overflow=False),
        bh.axis.Integer(0, n_pixels, underflow=False, overflow=False),
        bh.axis.Integer(0, n_capacitors, underflow=False, overflow=False),
        storage=bh.storage.Mean(),
    )

    @numba.njit
    def fill(baseline, waveform, first_capacitor):
        for gain in range(n_gains):
            for pixel in range(n_pixels):
                for sample in range(n_samples):
                    # some expensive computation already done in numba
                    if not is_spike():
                        baseline.fill(gain, pixel, first_capacitor + sample, sample=waveform[gain, pixel, sample])
                    else:
                        spike_height.fill(gain, pixel, first_capacitor + sample, sample=waveform[gain, pixel, sample])

Right now, we have a custom
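For a mean-type storage like the one requested above, a per-entry fill has to update more than a count in each bin. The sketch below shows one common way to do that sample by sample, Welford's online update, on flat parallel lists indexed by `(gain, pixel, capacitor)`. It is plain Python with hypothetical names (`flat_index`, `fill_mean`, `variance`) and deliberately tiny dimensions; it is not a claim about how `bh.storage.Mean` is implemented internally.

```python
def flat_index(gain, pixel, capacitor, n_pixels, n_capacitors):
    """Row-major position of (gain, pixel, capacitor) in a flat buffer."""
    return (gain * n_pixels + pixel) * n_capacitors + capacitor


def fill_mean(counts, means, m2s, i, sample):
    """Welford update of the running count, mean, and sum of squared deviations in bin i."""
    counts[i] += 1
    delta = sample - means[i]
    means[i] += delta / counts[i]
    m2s[i] += delta * (sample - means[i])


def variance(counts, m2s, i):
    """Sample variance of bin i; undefined (NaN) with fewer than two entries."""
    return m2s[i] / (counts[i] - 1) if counts[i] > 1 else float("nan")


# Tiny dimensions chosen only for illustration.
n_gains, n_pixels, n_capacitors = 2, 3, 4
size = n_gains * n_pixels * n_capacitors
counts = [0] * size
means = [0.0] * size
m2s = [0.0] * size

i = flat_index(1, 2, 3, n_pixels, n_capacitors)
for sample in [1.0, 2.0, 3.0]:
    fill_mean(counts, means, m2s, i, sample)
```

Because each update touches only three scalars in one bin, nothing needs to be buffered between samples, which matches the accumulate-as-you-go pattern in the loop above.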
-
This is a discussion around Numba support, which might get an IRIS-HEP fellow this summer; project listed here. Summary from the proposal:
Recent developments in Scikit-HEP libraries have enabled fast, efficient histogramming powered by boost-histogram using the hist library and fitting into a larger ecosystem of plotters and users. One key feature in enabling a fully Numba-enabled event loop for analyses is the histogramming step: most loops read (awkward) data, perform operations (including with vectors), and then fill a histogram. The awkward and vector portions are developed or mostly developed, leaving the histogramming step as the one element missing from a fully Numba-enabled event loop. This project will investigate ways to enable a fill from inside the LLVM Numba loop without stepping through Python, with the goal of providing first-class Numba support for boost-histogram.