Examples: Prefix Sum Compute Example #29940

Open
wants to merge 9 commits into
base: dev

cmhhelgeson
Contributor

@cmhhelgeson cmhhelgeson commented Nov 21, 2024

Related issue: #XXXX

Description

Creates an example demonstrating various prefix sum algorithms written using the TSL node system, and measures their performance against each other. Ideally, this becomes an expanding example to which additional, increasingly performant versions of the prefix sum can be added over time, in a syntax that is hopefully much more familiar and accessible to JavaScript programmers than equivalent examples in CUDA or other GPGPU languages.

In the visualization, elements of the prefix sum that validate correctly are highlighted in green, while incorrect elements are highlighted in red. The image below compares a correct implementation of a Sklansky prefix sum against the reverse of the expected elements (i.e., maxElements down to 1 instead of 1 up to maxElements).
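For readers unfamiliar with the algorithm, here is a minimal CPU-side sketch in plain JavaScript of a Sklansky-style inclusive prefix sum and of the per-element validation described above. It is not the TSL compute shader from this PR: the function names are hypothetical, the passes run sequentially rather than across workgroups, and the input is assumed to be all ones so that the expected result is 1 to numElements.

```javascript
// Sequential inclusive prefix sum, used as the reference result.
function inclusiveScanReference( input ) {

	const out = new Uint32Array( input.length );
	let running = 0;

	for ( let i = 0; i < input.length; i ++ ) {

		running += input[ i ];
		out[ i ] = running;

	}

	return out;

}

// Sklansky-style schedule run on the CPU (hypothetical helper, not the PR's shader).
// Each pass doubles the block size; every element in the upper half of a block
// adds the last element of that block's lower half. On a GPU, the inner loop
// would be one thread per element with a barrier between passes.
function inclusiveScanSklansky( input ) {

	const data = Uint32Array.from( input );

	for ( let stride = 1; stride < data.length; stride *= 2 ) {

		for ( let i = 0; i < data.length; i ++ ) {

			// Lower-half elements are left unchanged in this pass.
			if ( ( i & stride ) === 0 ) continue;

			const blockStart = i - ( i % ( 2 * stride ) );
			data[ i ] += data[ blockStart + stride - 1 ];

		}

	}

	return data;

}

// Per-element validation: true would be drawn green, false red.
function validate( actual, expected ) {

	return Array.from( actual, ( value, i ) => value === expected[ i ] );

}

const numElements = 16384;
const ones = new Uint32Array( numElements ).fill( 1 ); // assumed input
const result = inclusiveScanSklansky( ones );
const flags = validate( result, inclusiveScanReference( ones ) );

console.log( flags.every( Boolean ) ); // prints true
```

Validating against a sequential reference rather than a closed-form expectation keeps the check usable for inputs other than all ones.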

Currently, I'd like to implement a few more algorithms before this is pulled in.

image

@@ -0,0 +1,325 @@
import * as THREE from 'three';
import { storageObject, If, vec3, uniform, uv, uint, float, Fn, vec2, uvec2, floor, instanceIndex, workgroupBarrier, atomicAdd, atomicStore, workgroupId, storage } from 'three/tsl';

CodeQL notice (test): Unused variable, import, function or class. Unused imports atomicAdd, atomicStore, storageObject, workgroupId.
const numElements = 16384;


const computePrefixSklanskyFn = Fn( ( currentElements, uniformStorage ) => {

CodeQL notice (test): Unused variable, import, function or class. Unused variable computePrefixSklanskyFn.
gui.add( effectController, 'Right Display Algo', algorithms );

// Allow Workgroup Array Swaps
init( false, 'Fake Prefix' );

CodeQL warning (test): Superfluous trailing arguments. Superfluous argument passed to function init.
init( false, 'Fake Prefix' );

// Global Swaps Only
init( true, 'Incorrect' );

CodeQL warning (test): Superfluous trailing arguments. Superfluous argument passed to function init.
const currentElementsBuffer = new THREE.StorageInstancedBufferAttribute( array, 1 );
const currentElementsStorage = storage( currentElementsBuffer, 'uint', currentElementsBuffer.count ).label( 'Elements' );
const infoBuffer = new THREE.StorageInstancedBufferAttribute( infoArray, 1 );
const infoStorage = storage( infoBuffer, 'uint', infoBuffer.count );

CodeQL notice (test): Unused variable, import, function or class. Unused variable infoStorage.

const LoopThroughWorkgroup = ( callback ) => {

const WORKGROUP_SIZE = uint( 64 ).toVar( 'WORKGROUP_SIZE' );

CodeQL notice: Unused variable, import, function or class. Unused variable WORKGROUP_SIZE.

case 'Validate': {

const currentElements = new Uint32Array( await renderer.getArrayBufferAsync( currentElementsBuffer ) );

CodeQL notice: Unused variable, import, function or class. Unused variable currentElements.

github-actions bot commented Nov 26, 2024

📦 Bundle size

Full ESM build, minified and gzipped.

| | Before | After | Diff |
|---|---|---|---|
| WebGL | 339.14 / 78.99 | 339.14 / 78.99 | +0 B / +0 B |
| WebGPU | 483.59 / 134.17 | 483.62 / 134.18 | +34 B / +8 B |
| WebGPU Nodes | 483.06 / 134.07 | 483.09 / 134.08 | +34 B / +8 B |

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

| | Before | After | Diff |
|---|---|---|---|
| WebGL | 464.62 / 111.98 | 464.62 / 111.98 | +0 B / +0 B |
| WebGPU | 552.72 / 149.52 | 552.75 / 149.53 | +34 B / +7 B |
| WebGPU Nodes | 508.6 / 139.24 | 508.63 / 139.25 | +34 B / +7 B |

@cmhhelgeson
Contributor Author

I think there's an issue with how performance is being measured. Using the renderer.info approach from the bitonic sort and storage buffer examples, the measured performance doesn't seem to change at all based on the selected compute shader. Is there a way to use WebGPU timestamp queries to measure performance more accurately, or is compute.info already using that functionality?

Additionally, I'm not sure whether there's a way to revert my commit so that it excludes the build files that were erroneously pushed.

@mrdoob
Owner

mrdoob commented Nov 26, 2024

/ping @RenaudRohlinger

@RenaudRohlinger
Collaborator

RenaudRohlinger commented Nov 26, 2024

Thanks for this super useful example! It helped me identify a very tricky issue with timestamp queries.

I’ve submitted a PR (#29970) that should resolve the issues you encountered. I tested it using your example, and the logs now appear accurate, providing both render and compute timestamp information as expected.

PS: When using timestamp queries, please make sure to use await computeAsync and await renderAsync, since timestamp queries are asynchronous operations. Without this, the reported values will always be incorrect or 0. /cc @cmhhelgeson
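As a rough illustration of the advice above, the following sketch awaits each compute dispatch before reading its timing. It rests on assumptions not confirmed in this thread: that the renderer was created with timestamp tracking enabled (for example `new THREE.WebGPURenderer( { trackTimestamp: true } )`) and that the three.js release in use exposes the resolved time as `renderer.info.compute.timestamp`. The helper names `measureComputeMs` and `averageComputeMs` are hypothetical.

```javascript
// `renderer` is assumed to be a WebGPURenderer with timestamp tracking enabled;
// `computeNode` is any compute node built with TSL.
async function measureComputeMs( renderer, computeNode ) {

	// Awaiting matters: the timestamp query resolves asynchronously, so reading
	// the value after a non-awaited compute() may yield a stale value or 0.
	await renderer.computeAsync( computeNode );

	// Assumption: this release stores the resolved compute time here.
	return renderer.info.compute.timestamp;

}

// Averages several dispatches to smooth out run-to-run noise.
async function averageComputeMs( renderer, computeNode, runs = 10 ) {

	let total = 0;

	for ( let i = 0; i < runs; i ++ ) {

		total += await measureComputeMs( renderer, computeNode );

	}

	return total / runs;

}
```

Calling `averageComputeMs( renderer, computeNode )` once per algorithm would give comparable figures; the same pattern would apply to `await renderer.renderAsync( scene, camera )` for render timings.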

image

@cmhhelgeson cmhhelgeson marked this pull request as ready for review November 27, 2024 00:30