Memory interference and performance prediction in GPU-accelerated heterogeneous systems

Masola, Alessio

Applications such as automated factories, autonomous vehicles, and Cyber-Physical Systems (CPS) are growing rapidly. They pose a diverse range of challenges: real-time management and visualization of a factory's current state through a 3D digital twin, trajectory calculation in autonomous vehicles, rendering of Human Machine Interfaces (HMI), and traffic management in smart cities equipped with cameras, IoT devices, and their associated features. To address these problems, a broad array of \emph{heterogeneous devices} with various \emph{hardware accelerators} is used. In such applications, \emph{power consumption} and \emph{task execution latency} are key concerns, so it becomes necessary to investigate approaches that reduce power consumption while still fully exploiting the computational power the devices provide. Modern devices include processors that accelerate highly parallel, data-hungry workloads; a widely known example is the \emph{Graphics Processing Unit} (GPU), a hardware peripheral traditionally used for graphics rendering and now also used as a general-purpose compute accelerator. This thesis first analyzes the state of the art of techniques that can be employed to \emph{optimize power consumption} and task execution latency. It then examines two types of latency/interference that tasks can experience: (i) latencies arising when tasks are scheduled concurrently on the same acceleration unit, i.e., on a partitioned GPU, and (ii) latencies experienced by tasks running on embedded boards, specifically GPU-embedded systems, under a high computational load on the CPU side. Methods are proposed to understand and derive predictive models of latency for both types of interference.
Furthermore, this thesis concludes with a comparative study of two GPU memory management methodologies: explicit copies versus unified virtual memory.

Memory interference and performance prediction in GPU-accelerated heterogeneous systems / Masola, A.. - (2024).