|
- Parallel & Distributed Computer Systems HW3
-
- January, 2025
-
- Write a program that sorts $N$ integers in ascending order, using CUDA.
-
- The program must perform the following tasks:
-
- - The user specifies a positive integer $q$.
-
- - Start a process with an array of $N = 2^q$ random integers.
-
- - Sort all $N$ elements in ascending order.
-
- - Check the correctness of the final result.
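
One possible way to set up the input and check the result is sketched below. The helper names `makeInput` and `matchesSerialSort`, the fixed seed, and the choice of `std::sort` as the reference are assumptions made for illustration, not part of the assignment.

```cuda
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: build N = 2^q random integers on the host.
// A fixed seed keeps runs repeatable while debugging.
std::vector<int> makeInput(unsigned q, unsigned seed = 42) {
    assert(q < 31);  // keeps N representable in the unsigned indices used on the GPU
    std::vector<int> v(std::size_t{1} << q);
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist;  // default range is [0, INT_MAX]
    for (int &x : v) x = dist(gen);
    return v;
}

// Hypothetical helper: copy the device result back to the host and compare it
// with a serial std::sort of the original input. Equality with the reference
// checks both the ordering and that no element was lost or duplicated.
bool matchesSerialSort(const int *d_sorted, std::vector<int> input) {
    std::vector<int> result(input.size());
    cudaError_t err = cudaMemcpy(result.data(), d_sorted,
                                 input.size() * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) return false;
    std::sort(input.begin(), input.end());
    return result == input;
}
```

A cheaper check is a single pass verifying that each element is not smaller than its predecessor, but that alone would not detect lost or duplicated values.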
-
- Your implementation should be based on the following steps:
-
- V0. A kernel where each thread only compares and exchanges. This "eliminates" the 1:n innermost loop. Easy to write, but too many function calls and global synchronizations.
-
- V1. Include the k inner loop in the kernel function. How do we handle the synchronization? Fewer calls, fewer global synchronizations. Faster than V0!
-
- V2. Modify the kernel of V1 to work with local memory instead of global.
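
The three steps above can be sketched as follows, assuming the underlying algorithm is the classic bitonic sorting network with three nested loops: stage size `k`, stride `j`, and element index `i`. These loop names, the block size `THREADS`, and all function names are assumptions and may differ from the course's pseudocode. The sketch also assumes that "local memory" in V2 refers to CUDA shared memory. Error checking is omitted for brevity.

```cuda
#include <cassert>
#include <cuda_runtime.h>

constexpr unsigned THREADS = 256;  // assumed block size; must be a power of two

// Put the pair (x, y) in ascending or descending order.
__device__ void compareExchange(int &x, int &y, bool ascending) {
    if (ascending ? (x > y) : (x < y)) { int t = x; x = y; y = t; }
}

// V0: one thread per element, one (k, j) step per launch.
__global__ void stepV0(int *a, unsigned j, unsigned k) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned l = i ^ j;  // partner index; only the lower index of a pair works
    if (l > i) compareExchange(a[i], a[l], (i & k) == 0);
}

// V1: all strides jStart, jStart/2, ..., 1 in one launch. Requires
// jStart < blockDim.x so every partner lies in the same block, which lets
// __syncthreads() replace the global synchronization between launches.
__global__ void stepsV1(int *a, unsigned jStart, unsigned k) {
    unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    for (unsigned j = jStart; j > 0; j >>= 1) {
        unsigned l = i ^ j;
        if (l > i) compareExchange(a[i], a[l], (i & k) == 0);
        __syncthreads();
    }
}

// V2: same as V1, but the block's chunk is staged in shared memory.
__global__ void stepsV2(int *a, unsigned jStart, unsigned k) {
    __shared__ int tile[THREADS];
    unsigned t = threadIdx.x;
    unsigned i = blockIdx.x * blockDim.x + t;
    tile[t] = a[i];
    __syncthreads();
    for (unsigned j = jStart; j > 0; j >>= 1) {
        unsigned l = t ^ j;  // partner inside the tile
        if (l > t) compareExchange(tile[t], tile[l], (i & k) == 0);
        __syncthreads();
    }
    a[i] = tile[t];
}

// Host driver. d_a is a device pointer, n a power of two below 2^31,
// version is 0, 1 or 2.
void bitonicSort(int *d_a, unsigned n, int version) {
    assert(n != 0 && (n & (n - 1)) == 0);
    unsigned threads = n < THREADS ? n : THREADS;
    unsigned blocks = n / threads;
    for (unsigned k = 2; k <= n; k <<= 1) {
        unsigned j = k >> 1;
        // Strides that cross block boundaries (all strides in V0): one launch each.
        for (; j > 0 && (version == 0 || j >= threads); j >>= 1)
            stepV0<<<blocks, threads>>>(d_a, j, k);
        // Remaining strides fit inside one block: a single launch covers them.
        if (j > 0) {
            if (version == 1) stepsV1<<<blocks, threads>>>(d_a, j, k);
            else              stepsV2<<<blocks, threads>>>(d_a, j, k);
        }
    }
    cudaDeviceSynchronize();
}
```

In this sketch V1 and V2 still fall back to one launch per step for strides of at least `threads`, because threads in different blocks cannot be synchronized with `__syncthreads()`. Having each thread handle more than one element is one way to shrink that fallback further.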
-
- You must deliver:
- - A report (about 3-4 pages) that describes your parallel algorithm and implementation.
-
- - Your comments on the speed of your parallel program compared to the serial sort, after trying your program on aristotelis for $q = [20:27]$.
-
- - The source code of your program uploaded online.
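
For the speed comparison, one way to collect timings is sketched below. It assumes `std::sort` as the serial baseline (the assignment does not name one) and uses CUDA events for the GPU side. The names `gpuMillis` and `serialSortMillis` are hypothetical, and `launch` stands for any callable that enqueues the sorting kernels on the default stream.

```cuda
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>
#include <cuda_runtime.h>

// Time the GPU work enqueued by `launch` on the default stream, in milliseconds.
template <typename F>
float gpuMillis(F launch) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    launch();                    // e.g. a lambda that calls the sorting routine
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);  // wait until all enqueued work has finished
    float ms = 0.0f;
    cudaError_t err = cudaEventElapsedTime(&ms, start, stop);
    assert(err == cudaSuccess);
    (void)err;                   // silences the unused warning under NDEBUG
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

// Serial baseline: time std::sort on a private copy of the data, in milliseconds.
double serialSortMillis(std::vector<int> data) {
    auto t0 = std::chrono::steady_clock::now();
    std::sort(data.begin(), data.end());
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```

Whether host-to-device and device-to-host transfers are included in the GPU time changes the comparison noticeably, so the report should state which convention was used.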
-
- Ethics: If you use code found on the web or generated by an LLM, you should mention your source and the changes you made. You may work in pairs; both partners must submit a single report with both names.
- Deadline: 2 February 2025.
|