Final project for the course GPU Architectures and Computing @ TU Wien.
The convex hull of a set of points P in R² is the smallest subset V of P such that the set of all convex combinations of points in V is the smallest convex set containing every point of P.
Execution of the implemented algorithm used to compute the convex hull.
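To make the definition concrete, the following minimal sketch checks the containment property: given a candidate hull, every input point must lie inside or on the polygon it spans. The names `Point`, `cross`, `inside_hull` and `contains_all` are hypothetical and not taken from the project sources, and the sketch assumes the hull vertices are listed in counter-clockwise order.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type, used for illustration only.
struct Point { double x, y; };

// Twice the signed area of the triangle (a, b, c).
// Positive if c lies to the left of the directed line a -> b.
double cross(const Point& a, const Point& b, const Point& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// True if p lies inside or on the convex polygon `hull`.
// Assumption: the vertices of `hull` are in counter-clockwise order.
bool inside_hull(const std::vector<Point>& hull, const Point& p) {
    for (std::size_t i = 0; i < hull.size(); ++i) {
        const Point& a = hull[i];
        const Point& b = hull[(i + 1) % hull.size()];
        if (cross(a, b, p) < 0) return false;  // p is strictly right of edge a -> b
    }
    return true;
}

// True if every point of `points` is contained in the polygon spanned by `hull`.
bool contains_all(const std::vector<Point>& hull, const std::vector<Point>& points) {
    for (const Point& p : points) {
        if (!inside_hull(hull, p)) return false;
    }
    return true;
}
```

A check of this kind only verifies containment. It does not verify that the vertex set is minimal.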
- Implement the quickhull algorithm, used to compute the convex hull of a set of points, in C++ using the GPU acceleration provided by the CUDA SDK.
- Test the implementations for correctness.
- Benchmark the different implementations in terms of scalability and efficiency compared to a full CPU implementation.
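For reference, a sequential quickhull can be sketched in a few dozen lines of C++. This is an illustrative sketch and not the project's implementation: `Point`, `cross`, `find_hull` and `quickhull` are hypothetical names, the output order (counter-clockwise, starting from the leftmost point) is a choice made here, and degenerate inputs (all points identical, or exact ties for the farthest point) are not treated specially.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical 2D point type, used for illustration only.
struct Point { double x, y; };

// Positive if c is left of the directed line a -> b, negative if right.
double cross(const Point& a, const Point& b, const Point& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// `pts` holds points strictly to the right of a -> b. Appends the hull
// vertices between a and b (both exclusive) to `out`, in order from a to b.
void find_hull(const std::vector<Point>& pts, const Point& a, const Point& b,
               std::vector<Point>& out) {
    if (pts.empty()) return;
    // The point farthest to the right of a -> b has the most negative cross product.
    const Point f = *std::min_element(
        pts.begin(), pts.end(), [&](const Point& p, const Point& q) {
            return cross(a, b, p) < cross(a, b, q);
        });
    // Points inside the triangle (a, f, b) cannot be on the hull and are dropped.
    std::vector<Point> right_af, right_fb;
    for (const Point& p : pts) {
        if (cross(a, f, p) < 0) right_af.push_back(p);
        else if (cross(f, b, p) < 0) right_fb.push_back(p);
    }
    find_hull(right_af, a, f, out);
    out.push_back(f);
    find_hull(right_fb, f, b, out);
}

// Returns the hull vertices in counter-clockwise order.
std::vector<Point> quickhull(const std::vector<Point>& pts) {
    if (pts.size() < 3) return pts;
    auto lex_less = [](const Point& p, const Point& q) {
        return p.x < q.x || (p.x == q.x && p.y < q.y);
    };
    const Point a = *std::min_element(pts.begin(), pts.end(), lex_less);  // leftmost
    const Point b = *std::max_element(pts.begin(), pts.end(), lex_less);  // rightmost
    std::vector<Point> lower, upper;
    for (const Point& p : pts) {
        const double c = cross(a, b, p);
        if (c < 0) lower.push_back(p);       // right of a -> b
        else if (c > 0) upper.push_back(p);  // right of b -> a
    }
    std::vector<Point> hull;
    hull.push_back(a);
    find_hull(lower, a, b, hull);
    hull.push_back(b);
    find_hull(upper, b, a, hull);
    return hull;
}
```

The two steps that dominate the work, finding the farthest point and partitioning the remaining points, are a reduction and a filter over the point set, which is what makes the algorithm a natural candidate for a GPU implementation.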
The project has been divided into independent modules to implement and compare:
- A parametrized generator that can generate the required input for the algorithm.
- A sequential version, which will serve as a baseline for comparison.
- The parallel algorithm in CUDA using synchronous programming, tried with different memory management schemes:
  - Standard memory transfer
  - Pinned memory
  - Unified memory
  - Zero-copy memory
- A version of the parallel algorithm in CUDA using the Thrust library.
- Test the different implementations by generating connected random sets of points and checking that all implementations produce consistent results.
- Perform an extensive performance analysis comparing the different implementations and the scalability of these algorithms with different numbers of points and dimensions.
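The generator and the consistency test from the list above could look roughly like the sketch below. It rests on assumptions that are not confirmed by the project: the function names `generate_points` and `same_hull` are hypothetical, points are drawn uniformly from a square (the project's generator may use a different distribution or parameters), and two hulls are considered consistent when they contain exactly the same vertices, regardless of order.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical 2D point type, used for illustration only.
struct Point { double x, y; };

// Generates n points with coordinates drawn uniformly from [lo, hi).
// A fixed seed makes runs reproducible, so every implementation can be
// fed the same input.
std::vector<Point> generate_points(std::size_t n, unsigned seed, double lo, double hi) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> dist(lo, hi);
    std::vector<Point> pts(n);
    for (Point& p : pts) {
        p.x = dist(rng);
        p.y = dist(rng);
    }
    return pts;
}

// True if both hulls contain exactly the same vertices, in any order.
// Exact comparison is used on the assumption that hull vertices are
// copied unchanged from the input rather than recomputed.
bool same_hull(std::vector<Point> a, std::vector<Point> b) {
    auto lex_less = [](const Point& p, const Point& q) {
        return p.x < q.x || (p.x == q.x && p.y < q.y);
    };
    auto equal = [](const Point& p, const Point& q) {
        return p.x == q.x && p.y == q.y;
    };
    std::sort(a.begin(), a.end(), lex_less);
    std::sort(b.begin(), b.end(), lex_less);
    return a.size() == b.size() && std::equal(a.begin(), a.end(), b.begin(), equal);
}
```

Sorting both vertex lists before comparing makes the check independent of the starting vertex and of the traversal direction each implementation happens to use.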
A Makefile is provided in the main directory. Just type `make` to compile and build the executables, which will be placed in the `bin` directory created by the Makefile.
In the `bin` directory there will be several executables:
- `serial_quickhull.x`, which executes the serial implementation
- `cuda_quickhull.x`, which executes the CUDA implementation with standard memory transfer
- `pinned_quickhull.x`, which executes the CUDA implementation with pinned memory transfer
- `zero_quickhull.x`, which executes the CUDA implementation with zero-copy memory transfer
- `unified_quickhull.x`, which executes the CUDA implementation with unified memory
- `thrust_quickhull.x`, which executes the CUDA implementation using the Thrust library
To compile the code, NVIDIA's NVCC compiler must be installed on the system. To execute the code, a CUDA-compatible device must also be present (you can check this list).
Project carried out by Cimador Gabriele, Eremia Andreea-Evelina, Stabile Marco and Tamborrino Michele.
Repository licensed under the MIT license. See the LICENSE file for rights and limitations.