Project Milestone Report

Summary

To reiterate our project goals, we are building several parallel implementations of Conway's Game of Life in 3D. We will use the NVIDIA GPUs on the GHC machines for our CUDA implementation and the PSC machines for our OpenMP implementation.

So far, we have completed the sequential implementation and the OpenGL debug renderer, and we have finished the setup for the CUDA implementation. Our sequential implementation takes in a variety of input files and rule-sets and outputs the correct results, which we can then visualize with our renderer for easier debugging.

Goals and Deliverables

We discuss our progress on each of our four originally planned deliverables and the feasibility of our stretch goals. We still believe that we can produce all four planned deliverables by the poster session.

Sequential Implementation

We have completed and rigorously tested a sequential implementation in C++ that takes in a starting state and rule-set, generates output for a user-specified number of frames, and prints the time taken for computation. To test it, we created input files that demonstrate the correctness of the implementation.
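To make the description above concrete, here is a minimal sketch of what one generation step and the timing loop could look like. It is an illustration under stated assumptions, not the project's actual code: the cubic grid of side n stored as a flat vector, the 26-cell neighborhood with out-of-bounds cells treated as dead, and the names Rules, cellIndex, countNeighbors, step, and simulate are all hypothetical.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rule-set: survive[k] / birth[k] say whether a live / dead cell
// with k live neighbors is alive in the next frame (k ranges over 0..26).
struct Rules {
    std::array<bool, 27> survive{};
    std::array<bool, 27> birth{};
};

using Grid = std::vector<std::uint8_t>;  // n*n*n cells, 1 = alive, 0 = dead

// Maps 3D coordinates to an index in the flattened grid (assumed layout).
inline std::size_t cellIndex(int x, int y, int z, int n) {
    return (static_cast<std::size_t>(z) * n + y) * n + x;
}

// Counts live cells among the up-to-26 surrounding cells.
// Assumption: cells outside the cube count as dead (no wrap-around).
int countNeighbors(const Grid& g, int n, int x, int y, int z) {
    int count = 0;
    for (int dz = -1; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
                if (dx == 0 && dy == 0 && dz == 0) continue;
                int nx = x + dx, ny = y + dy, nz = z + dz;
                if (nx < 0 || ny < 0 || nz < 0 || nx >= n || ny >= n || nz >= n) continue;
                count += g[cellIndex(nx, ny, nz, n)];
            }
    return count;
}

// Computes the next frame from the current one without modifying the input.
Grid step(const Grid& cur, int n, const Rules& rules) {
    assert(cur.size() == static_cast<std::size_t>(n) * n * n);
    Grid next(cur.size(), 0);
    for (int z = 0; z < n; ++z)
        for (int y = 0; y < n; ++y)
            for (int x = 0; x < n; ++x) {
                int k = countNeighbors(cur, n, x, y, z);
                std::size_t i = cellIndex(x, y, z, n);
                bool alive = cur[i] != 0;
                next[i] = (alive ? rules.survive[k] : rules.birth[k]) ? 1 : 0;
            }
    return next;
}

// Advances the grid by `frames` generations and returns the elapsed seconds.
double simulate(Grid& grid, int n, const Rules& rules, int frames) {
    auto start = std::chrono::steady_clock::now();
    for (int f = 0; f < frames; ++f) grid = step(grid, n, rules);
    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Writing each frame into a separate buffer keeps every cell update dependent only on the previous frame, which is also the property a parallel version would rely on.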

Parallel CUDA Implementation

We have planned out our approach for the CUDA implementation and have finished the compilation setup and file outlines for this deliverable. Creating our own CUDA-compatible Makefiles and foundational CUDA code took longer than expected and put us slightly behind schedule on this goal, but we are now ready to start on the algorithm implementation as our next step.

Parallel OpenMP Implementation

We have a general idea of our approach and are currently working on the setup for this deliverable. We expect the setup to go fairly quickly, since we now have experience from setting up CUDA with CMake.
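As a rough illustration of the kind of approach OpenMP allows, the sketch below parallelizes the outer loops of a single generation step. It is not a description of our final design. It assumes a cubic grid of side n stored as a flat vector, a 26-cell neighborhood with out-of-bounds cells treated as dead, and a hypothetical Rules struct and stepParallel function; none of these are fixed by this report.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rule-set: lookup tables indexed by live-neighbor count (0..26).
struct Rules {
    std::array<bool, 27> survive{};
    std::array<bool, 27> birth{};
};

// Computes the next frame, splitting the (z, y) rows across OpenMP threads.
std::vector<std::uint8_t> stepParallel(const std::vector<std::uint8_t>& cur,
                                       int n, const Rules& rules) {
    assert(cur.size() == static_cast<std::size_t>(n) * n * n);
    std::vector<std::uint8_t> next(cur.size(), 0);

    // Threads only read `cur` and each writes a disjoint set of cells in `next`,
    // so no locking is needed. Compiled without -fopenmp, the pragma is ignored
    // and the loop simply runs sequentially.
    #pragma omp parallel for collapse(2) schedule(static)
    for (int z = 0; z < n; ++z) {
        for (int y = 0; y < n; ++y) {
            for (int x = 0; x < n; ++x) {
                int count = 0;
                for (int dz = -1; dz <= 1; ++dz)
                    for (int dy = -1; dy <= 1; ++dy)
                        for (int dx = -1; dx <= 1; ++dx) {
                            if (dx == 0 && dy == 0 && dz == 0) continue;
                            int nx = x + dx, ny = y + dy, nz = z + dz;
                            if (nx < 0 || ny < 0 || nz < 0 ||
                                nx >= n || ny >= n || nz >= n) continue;
                            count += cur[(static_cast<std::size_t>(nz) * n + ny) * n + nx];
                        }
                std::size_t i = (static_cast<std::size_t>(z) * n + y) * n + x;
                bool alive = cur[i] != 0;
                next[i] = (alive ? rules.survive[count] : rules.birth[count]) ? 1 : 0;
            }
        }
    }
    return next;
}
```

With GCC or Clang this would be built with -fopenmp. The static schedule is only a starting point; if live cells cluster in one region and per-cell work becomes uneven, a dynamic schedule may be worth comparing.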

Visualizations for Debugging Purposes in OpenGL

We have created an interactive visualizer for our output files. Using the output generated by our sequential implementation, we can view the results frame by frame, step the current frame forward or back by one, and move the camera around in 3D space. This has sped up debugging and testing, since we can now easily see how the output looks in 3D space. For example, some of our test cases produce a repeating pattern that is easy to recognize. The renderer does slow down significantly on larger, denser test cases (for example, a mostly filled cube with a side length greater than 100), but it consistently runs above 60 frames per second on smaller cases and remains usable, if a bit slow, for larger ones. However, when playback is paused on a single frame, rendering is very fast, so we can still view the output quickly.

Stretch Goals

While developing the sequential algorithm and OpenGL visualizer, we realized that two of our stretch goals were fairly easy additions. Thus, we have already added the ability to experiment with various rulesets by including the ruleset in the input file and extending our input parsing accordingly. Additionally, we made the visualizer interactive: the user can manually change the current frame and move the camera with mouse and/or keyboard controls.
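As an illustration of ruleset parsing, the sketch below reads a ruleset from a single line of text. The line format ("survive counts/birth counts", for example "4,5/5") is purely an assumption made for this example and is not the project's actual input format; the Rules struct and the parseCounts and parseRules functions are likewise hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>

// Hypothetical rule-set: lookup tables indexed by live-neighbor count (0..26).
struct Rules {
    std::array<bool, 27> survive{};
    std::array<bool, 27> birth{};
};

// Parses a comma-separated list of neighbor counts such as "4,5" into a table.
// Returns false on an empty, non-numeric, or out-of-range entry.
bool parseCounts(const std::string& list, std::array<bool, 27>& table) {
    std::istringstream in(list);
    std::string token;
    while (std::getline(in, token, ',')) {
        if (token.empty()) return false;
        std::size_t used = 0;
        int k = 0;
        try {
            k = std::stoi(token, &used);
        } catch (const std::exception&) {
            return false;  // not a number, or too large for int
        }
        if (used != token.size() || k < 0 || k > 26) return false;
        table[static_cast<std::size_t>(k)] = true;
    }
    return true;
}

// Assumed format: "<survive counts>/<birth counts>", e.g. "4,5/5".
// `out` is only modified when the whole line parses successfully.
bool parseRules(const std::string& line, Rules& out) {
    std::size_t slash = line.find('/');
    if (slash == std::string::npos) return false;
    Rules parsed;
    if (!parseCounts(line.substr(0, slash), parsed.survive)) return false;
    if (!parseCounts(line.substr(slash + 1), parsed.birth)) return false;
    out = parsed;
    return true;
}
```

Storing the rules as lookup tables indexed by neighbor count means the simulation loop does not need to know how the ruleset was written in the file, so new rulesets only require a new input line.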

We will likely not be able to complete the other two stretch goals (an OpenMPI implementation and integration of our rendering code with CUDA) as they are both significantly more involved, but we will keep the option open if our planned deliverables are completed ahead of schedule.

Planned Demo

Our demo will have two parts: an analysis section with speedup graphs, and video demos of interesting results captured with our renderer.

Speedup Analysis

This part of the poster session will showcase the relative speedup of our three implementations on a variety of test cases. Furthermore, it will include a detailed analysis of these results and discuss the strengths and weaknesses of each implementation.
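For reference, speedup here means the sequential running time divided by the parallel running time on the same test case. A minimal sketch of that calculation follows; the function name speedups and its parameters are hypothetical, and no measured timings are implied.

```cpp
#include <cassert>
#include <vector>

// Returns sequentialSeconds / t for each parallel timing t.
// A value above 1.0 means the parallel run was faster than the baseline.
std::vector<double> speedups(double sequentialSeconds,
                             const std::vector<double>& parallelSeconds) {
    assert(sequentialSeconds > 0.0);
    std::vector<double> result;
    result.reserve(parallelSeconds.size());
    for (double t : parallelSeconds) {
        assert(t > 0.0);  // a zero or negative timing indicates a measurement error
        result.push_back(sequentialSeconds / t);
    }
    return result;
}
```

Using the same sequential baseline for both the CUDA and OpenMP timings keeps the resulting graphs directly comparable.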

Video Demo

We will show off our most fun and interesting test cases in a compilation video. Each test case in the video will be rendered using our OpenGL renderer. We will also keep the option of a live demo (using one of our own computers) open, since it would be fairly easy to render precomputed scenes.

Issues

The main issue is that setting up the Makefiles and boilerplate code for CUDA and OpenMP has been much more difficult than we expected. A variety of errors, stemming from GHC machine compatibility, OpenGL integration, and similar problems, pushed back our project timeline slightly. We still have to get OpenMP up and running, but we hope that it will go more smoothly now that we have some experience with creating these files.

Updated Schedule

Date	Goal	Assigned To
April 19	Milestone report due	All
April 20-23	Set up input and output for CUDA	Kate
April 20-23	Set up input and output for OpenMP	Katie
April 24-27	Finish basic CUDA kernel algorithm	Kate
April 24-27	Finish basic OpenMP idea	Katie
April 28-30	Finish CUDA algorithm	Kate
April 28-30	Finish OpenMP	Katie
May 1-2	Wrap up code and start report	All
May 3-4	Complete final report and presentation	All