Pronto: A Low Overhead Message Passing System for High Performance Many-Core Processors
Many-core processors provide the raw computation power required by modern high-performance multimedia and signal processing workloads. The conversion of this computation power into execution performance is often constrained by the overheads of communication between concurrent tasks. This paper presents Pronto, a low-overhead message passing system that simplifies the semantics of data movement between communicating tasks by performing buffer management, message synchronization, and address translation directly in hardware. Integrating these functions into hardware results in transfer latencies up to 30% shorter than those of state-of-the-art MPI derivatives. The communication overheads with Pronto in an 18-core processor array are under 5% for 64-word burst transfers, and less than 0.5% of total execution time for workloads such as the JPEG decoder and FIR filter. This paper also studies the effect of task mapping and interconnect traffic on the predictability of data block arrival times, and provides insight into where interconnect contention can be tolerated.