Pipelined Execution of Windowed Image Computations
Many image processing operations manipulate an individual pixel using the values of other pixels in the neighborhood. Such operations are called windowed operations. The size of the windowed operation is a measure of the size of the given pixel's neighborhood. A windowed computation applies a windowed operation on all pixels of the image. An image processing application is typically a sequence of windowed computations. While windowed computations admit high parallelism, the cost of inputting and outputting the image often restricts the computation to a few computational units.
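As an illustration of these definitions, the following is a minimal Python sketch of a windowed computation. It assumes that "size w" means a w×w window centered on the pixel (with w odd) and that border pixels are handled by clamping coordinates to the image edge; the abstract fixes neither convention. The names `windowed_computation`, `mean` and `run_sequence` are hypothetical.

```python
def windowed_computation(image, w, op):
    """Apply the windowed operation `op` to every pixel of `image`.

    Assumption: the window is w x w, centered on the pixel, with w odd.
    Assumption: coordinates outside the image are clamped to the border.
    """
    rows, cols = len(image), len(image[0])
    r = w // 2  # window radius
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Gather the values of the pixel's neighborhood.
            window = [
                image[min(max(i + di, 0), rows - 1)][min(max(j + dj, 0), cols - 1)]
                for di in range(-r, r + 1)
                for dj in range(-r, r + 1)
            ]
            out[i][j] = op(window)
    return out


def mean(window):
    """Example windowed operation: average of the neighborhood."""
    return sum(window) / len(window)


def run_sequence(image, w, ops):
    """An application as a sequence of windowed computations, run one after another."""
    for op in ops:
        image = windowed_computation(image, w, op)
    return image
```

For example, `run_sequence(img, 3, [mean, max])` smooths a list-of-lists image with a 3×3 mean and then applies a 3×3 maximum. Every output pixel depends only on its own neighborhood, which is the source of the high parallelism mentioned above.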
In this paper we analytically study the running of a sequence of z windowed computations, each of size w, on a z-stage pipelined computational model. For an N×N image and n×n input/output bandwidth per stage, we show that the sequence of windowed computations can be run in at most (N²/n²)(1+δ) steps, where δ = n/N + 6n³/(wN²) + zw/N + zn²/N². This yields a speed-up of z/(1+δ) over a single stage; the overhead δ is quite small. We also show that the memory requirement per stage is O(wN + n²). With values of N, n and w that reflect the current state of the art, over 20 pipeline stages can be sustained with less than 5% overhead for a 10M-pixel image. Each of these stages would require less than 128 Kbytes of storage.
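The bound can be evaluated numerically with a short Python sketch. The last term of δ is read here as z·n²/N², which is an assumption about the garbled source, and the parameter values in the usage note are illustrative only, not the values used in the paper. The function names are hypothetical.

```python
def overhead(N, n, w, z):
    """Overhead term delta for an N x N image, n x n I/O bandwidth per stage,
    window size w and z pipeline stages.

    Assumption: the last term is z * n^2 / N^2.
    """
    return n / N + 6 * n**3 / (w * N**2) + z * w / N + z * n**2 / N**2


def pipeline_steps(N, n, w, z):
    """Upper bound on the number of steps: (N^2 / n^2) * (1 + delta)."""
    return (N**2 / n**2) * (1 + overhead(N, n, w, z))


def speedup(N, n, w, z):
    """Speed-up over a single stage: z / (1 + delta)."""
    return z / (1 + overhead(N, n, w, z))
```

With the illustrative values N = 3200 (roughly a 10M-pixel image), n = 4, w = 5 and z = 20, `overhead(3200, 4, 5, 20)` comes out below 0.05 and `speedup(3200, 4, 5, 20)` is above 19. The z·w/N term dominates for these values, so the overhead grows roughly linearly with the number of stages and the window size.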