We have proposed a systematic method for mapping systolic algorithms onto hardware. The method addresses partitioning, data input/output, and efficient use of the processing elements. Code generation is an important part of it: the starting point is the sequential code, every transformation in the method is applied as a code transformation, and the final result is the parallel code for each processing element of the systolic processor.
Miguel Valero-García, Juan J. Navarro, José M. Llabería, Mateo Valero, and Tomás Lang. Mapping QR Decomposition of Bounded Matrix on a 1D Systolic Array, pp. 25-38. Elsevier Publishers, 1992.
Alvaro Suárez, José M. Llabería, and Agustín Fernández. Scheduling Partitionings in Systolic Algorithms. In Application Specific Array Processors (ASAP'92), pp. 619-633, Berkeley (USA), August 1992.
Alvaro Suárez. Ordenación de la ejecución de particiones en algoritmos sistólicos. PhD thesis, Universitat Politècnica de Catalunya (UPC), November 1993. Advisor: José M. Llabería.
Miguel Valero-García, Juan J. Navarro, José M. Llabería, and Mateo Valero. Implementation of Systolic Algorithms Using Pipelined Functional Units. In Application Specific Array Processors (ASAP'90), pp. 272-283, Princeton, NJ (USA), September 1990.
On the other hand, a method has been developed for efficiently executing parallel algorithms with a hypercube communication topology (e.g. FFT algorithms, all-to-all personalized communication, and Jacobi methods for singular value decomposition and eigenvalue computation) on torus multicomputers. A new embedding of hypercubes onto torus multicomputers was proposed and evaluated, and it was shown to be optimal for rings in terms of execution time. In addition, some new techniques for reducing the communication cost of hypercube algorithms have been studied.
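To make the embedding problem concrete, the following minimal sketch shows the communication pattern of a hypercube algorithm and what it costs on a ring under a naive placement. It assumes that hypercube node i is simply placed at ring position i; this identity placement is only an illustrative baseline, not the embedding proposed in the publications listed here. The function names are hypothetical.

```python
def hypercube_partner(node: int, dim: int) -> int:
    """Partner of `node` across dimension `dim`: the label with bit `dim` flipped."""
    return node ^ (1 << dim)


def ring_distance(a: int, b: int, size: int) -> int:
    """Shortest hop count between positions a and b on a bidirectional ring."""
    d = abs(a - b) % size
    return min(d, size - d)


def identity_embedding_cost(n_dims: int) -> list:
    """Worst-case ring hops per hypercube dimension.

    Assumption (illustrative only): hypercube node i sits at ring position i.
    """
    size = 1 << n_dims
    return [
        max(ring_distance(i, hypercube_partner(i, d), size) for i in range(size))
        for d in range(n_dims)
    ]


if __name__ == "__main__":
    # 4-dimensional hypercube (16 nodes) placed on a 16-node ring.
    print(identity_embedding_cost(4))  # [1, 2, 4, 8]
```

Under this naive placement, an exchange across dimension d travels 2^d hops on the ring, so the higher dimensions dominate the communication time. The choice of embedding, and techniques such as overlapping communication with computation, aim to reduce that cost.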
Luis Díaz de Cerio, Miguel Valero-García, and Antonio González. Overlapping Communication and Computation in Hypercubes. In 2nd International Euro-Par Conference (Euro-Par'96), pp. 253-257, Lyon (France), August 1996. Lecture Notes in Computer Science #1123.
Antonio González, Miguel Valero-García, and Luis Díaz de Cerio. Executing Algorithms with Hypercube Topology on Torus Multicomputers. IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 8, pp. 803-814, August 1995.
Luis Díaz de Cerio, Miguel Valero-García, and Antonio González. A Study of the Communication Cost of the FFT on Torus Multicomputers. In 1st IEEE International Conference on Algorithms and Architectures for Parallel Processing, pp. 131-139, Brisbane (Australia), April 1995.
Luis Díaz de Cerio, Miguel Valero-García, and Antonio González. Efficient FFT on Torus Multicomputers: A Performance Study. In 2nd Austrian-Hungarian Workshop on Transputer Applications, pp. 233-242, Budapest (Hungary), September 1994.
Miguel Valero-García and Antonio González. FFT on Massively Parallel Processors. In COST 229 Workshop on Massively Parallel Computing, pp. 1-9, Madeira (Portugal), April 1993.
Antonio González and Miguel Valero-García. The Xor Embedding: An Embedding of Hypercubes onto Rings and Toruses. In Application Specific Array Processors (ASAP'93), pp. 15-28, Venice (Italy), October 1993.