Fully Programmable GPU Pipelines
Summary of Project Results
Real-time rendering is important not only in the entertainment industry but in all types of visual applications, such as medical data visualization, simulation, education, and architecture. The real-time rendering pipeline is typically tied to efficient hardware, i.e., the graphics processing unit (GPU). However, while tight coupling to a hardware architecture yields high performance and power efficiency, it sacrifices flexibility. It is therefore not surprising that real-time rendering pipelines have hardly changed since the introduction of programmable shading. Although programmability added flexibility within certain stages of the pipeline, the pipeline itself remains rigid. It consists of a fixed number of stages with strictly defined input and output characteristics. The order of stages cannot be changed, and several of the prescribed stages are available only as fixed-function units with minimal configuration flexibility. This inflexible design restricts research and development of new rendering architectures. Any novel rendering approach that does not follow the predefined pipeline is doomed to fail, as it can never compete in performance with approaches that fit the hardware pipeline.
In this project, we showed that modern manycore processors, such as the general-purpose programmable cores on the GPU, are capable of running rendering pipelines entirely in software. To reach this result, efficient dynamic scheduling of the pipeline stages is essential, as the number of primitives may grow by several orders of magnitude during execution. Scheduling must therefore balance multiple trade-offs: generating parallelism versus keeping data compact, allowing execution on arbitrary processing cores versus keeping data local for faster access, and streamlining the processing versus supporting dynamic decision making.
Considering these trade-offs and deriving general solutions to multiple scheduling problems, we developed a complete streaming rendering pipeline whose performance is reasonably close to that of the hardware pipeline while offering unprecedented flexibility. With this approach, we showed that novel rendering algorithms can easily be derived and that small changes to the pipeline can have a lasting impact on rendering quality and rendering speed. Our pipeline can be used not only to alter the rendering pipeline but also to experiment with alternative hardware designs and completely new rendering pipelines. For example, we presented a novel rendering pipeline for vector graphics, which are typically used for font rendering and on websites. Our hierarchical rasterizer for vector graphics rendering not only outperforms state-of-the-art hardware-supported rendering methods but also achieves significantly better quality. Another example of the advantages of our approach is its application to virtual reality rendering on head-mounted displays. Here, we significantly reduced the perceived latency and thus tackled one of the major issues with head-mounted displays: motion sickness.
Finally, we applied the scheduling solutions devised for rendering pipelines to other domains, including sparse linear algebra operations (typically found in material simulation), dynamic graph processing (for example, of large social networks), and mesh processing (the creation and manipulation of three-dimensional models). In all domains, we outperformed the previous state of the art, including handcrafted solutions from both research and industry. These surprising results show that advanced, adaptive scheduling strategies have the potential to transform multiple domains that rely on efficient computation on manycore processors.
Project-Related Publications (Selection)
-
“Dynamic Scheduling for Efficient Hierarchical Sparse Matrix Operations on the GPU”. In: Proceedings of the International Conference on Supercomputing. ICS ’17. Chicago, Illinois: ACM, 2017, 7:1–7:10. ISBN: 978-1-4503-5020-4
Andreas Derler, Rhaleb Zayer, Hans-Peter Seidel, and Markus Steinberger
-
“Effective Static Bin Patterns for Sort-middle Rendering”. In: Proceedings of High Performance Graphics. HPG ’17. Los Angeles, California: ACM, 2017, 14:1–14:10. ISBN: 978-1-4503-5101-0
Bernhard Kerbl, Michael Kenzel, Dieter Schmalstieg, and Markus Steinberger
-
“A High-Performance Software Graphics Pipeline Architecture for the GPU”. In: ACM Trans. Graph. 37.4 (Nov. 2018)
Michael Kenzel, Bernhard Kerbl, Dieter Schmalstieg, and Markus Steinberger
-
“faimGraph: High Performance Management of Fully-Dynamic Graphs under tight Memory Constraints on the GPU”. In: High Performance Computing, Networking, Storage and Analysis. SC ’18. Dallas, Texas, USA, 2018. ISBN: 978-1-5386-8384-2/18
Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, and Markus Steinberger
-
“On-the-fly Vertex Reuse for Massively-Parallel Software Geometry Processing”. In: Proc. ACM Comput. Graph. Interact. Tech. 1.2 (Aug. 2018)
Michael Kenzel, Bernhard Kerbl, Wolfgang Tatzgern, Elena Ivanchenko, Dieter Schmalstieg, and Markus Steinberger
-
“The Broker Queue: A Fast, Linearizable FIFO Queue for Fine-Granular Work Distribution on the GPU”. In: Proceedings of the International Conference on Supercomputing. ICS ’18. Beijing, China, 2018
Bernhard Kerbl, Michael Kenzel, Joerg H Mueller, Dieter Schmalstieg, and Markus Steinberger
-
“Hierarchical Rasterization of Curved Primitives for Vector Graphics Rendering on the GPU”. In: Computer Graphics Forum 38.2 (2019), pp. 93–103
Mark Dokter, Jozef Hladky, Mathias Parger, Dieter Schmalstieg, Hans-Peter Seidel, and Markus Steinberger
-
“The Camera Offset Space: Real-time Potentially Visible Set Computations for Streaming Rendering”. In: ACM Trans. Graph. 38.6 (Nov. 2019), 231:1–231:14
Jozef Hladky, Hans-Peter Seidel, and Markus Steinberger
-
“Subdivision-Specialized Linear Algebra Kernels for Static and Dynamic Mesh Connectivity on the GPU”. In: Computer Graphics Forum (2020)
Daniel Mlakar, Martin Winter, Pascal Stadlbauer, Hans-Peter Seidel, Markus Steinberger, and Rhaleb Zayer